# Project Milestone 4: Cleaning/Formatting API Source

### Load Necessary Packages

In [23]:
import pandas as pd
import requests

import warnings
warnings.filterwarnings('ignore')

In [24]:
# in practice the key would not be stored in the notebook; it would be kept in a secure location
API_KEY = '01a79733bf6abffde9e83685b6d4ac04'
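
One common way to keep the key out of the notebook is to read it from an environment variable. The following is a minimal sketch of that idea, not what this notebook does; the variable name `OWM_API_KEY` and the helper `load_api_key` are hypothetical names chosen for illustration.

```python
import os


def load_api_key(env_var="OWM_API_KEY"):
    """Return the API key stored in an environment variable.

    `OWM_API_KEY` is a hypothetical name used only in this sketch.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            "Set the {} environment variable before running.".format(env_var)
        )
    return key
```

With the variable set in the shell, the cell above could then read `API_KEY = load_api_key()`.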

## Prepare and View Data

Weather data from the API is accessed by latitude and longitude. Therefore, we need the latitude and longitude of each city examined in this study. We will import the CSV data, which contains these coordinates for each city, in order to produce this list.

**Note:** Due to the large size of the dataset containing longitudes and latitudes, along with the API call limit for the chosen weather API, we need to reduce the size of the dataset. Because we are interested in differing weather patterns, there is no need to examine two cities that are right next to each other and have a similar weather pattern. Therefore, we will sort by latitude and longitude and select the top 2, bottom 2, and middle 2 cities from each country. This gives a variety of weather patterns in each country while reducing the size of the dataset. After this step, we will still have at least 1,000 rows of data to examine.
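
The selection rule described in the note can be sketched in plain Python on made-up data. The helper `pick_representative` and the toy cities are illustrative assumptions; the notebook itself performs the same steps with pandas in the cells that follow.

```python
from collections import defaultdict


def pick_representative(cities, n=2):
    """Pick the top n, middle n, and bottom n cities per country.

    `cities` is a list of (city, country, latitude, longitude) tuples.
    """
    by_country = defaultdict(list)
    for city in cities:
        by_country[city[1]].append(city)

    selected = []
    for rows in by_country.values():
        # Sort by latitude, then longitude, both descending.
        rows = sorted(rows, key=lambda c: (c[2], c[3]), reverse=True)
        mid = len(rows) // 2
        picks = rows[:n] + rows[mid:mid + n] + rows[-n:]
        # Small countries produce overlapping slices, so drop repeats in order.
        seen = set()
        for city in picks:
            if city not in seen:
                seen.add(city)
                selected.append(city)
    return selected


# Made-up data: country "A" has seven cities, country "B" has one.
toy = [("a{}".format(i), "A", float(lat), 0.0)
       for i, lat in enumerate([70, 60, 50, 40, 30, 20, 10])]
toy.append(("b0", "B", 5.0, 5.0))
chosen = pick_representative(toy)
```

For country "A" this keeps six of the seven cities (the one at latitude 50 is skipped), and the single city in "B" is kept once rather than three times.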

In [37]:
# import csv data
df_csv = pd.read_csv('csv_data.csv', index_col = 0)

In [38]:
# determine the number of cities in each country and save as dataframe
counts = pd.DataFrame(df_csv.groupby(['Country'])['Latitude'].count())
# set country as column instead of index and rename column to count
counts.reset_index(inplace = True)
counts = counts.rename(columns = {"Latitude": "Counts"})

In [39]:
# create dataframe that contains list of available countries and the number of rows for each
countries = pd.DataFrame(df_csv.Country.unique(), columns = ['Country'])
countries = countries.merge(counts, on = 'Country', how = 'left')

In [40]:
# selects top 2 cities per lon and lat for each country
df_top = df_csv.sort_values(by = ['Latitude', 'Longitude'], 
                                ascending = False).groupby('Country').head(2)

In [41]:
# selects bottom 2 cities per lon and lat for each country
df_bottom = df_csv.sort_values(by = ['Latitude', 'Longitude'], 
                                ascending = False).groupby('Country').tail(2)

In [42]:
# create empty list for holding middle cities per lon and lat
cols = ['City', 'Population', 'Latitude', 'Longitude', 'Country']
mid_rows = []

# iterates through available countries
for i, r in countries.iterrows():
    # sorts the cities of the given country by lat and lon
    country_sorted = df_csv.loc[df_csv['Country'] == r['Country']].sort_values(
        by = ['Latitude', 'Longitude'], ascending = False)
    # determines middle row
    mid = int(r["Counts"] / 2)
    # selects middle row for given country
    row1 = country_sorted.iloc[mid].tolist()

    # selects a second row for given country if it exists, otherwise continue
    try:
        row2 = country_sorted.iloc[mid + 1].tolist()
    except IndexError:
        continue

    # adds rows to middle list
    mid_rows.extend([row1, row2])

# builds middle dataframe from the collected rows
# (DataFrame.append was removed in pandas 2.0, so rows are collected in a list instead)
df_mid = pd.DataFrame(mid_rows, columns = cols)

In [43]:
# combines top, bottom, and middle cities from each country
df_csv = pd.concat([df_top, df_bottom, df_mid])

In [44]:
# if a country had only 1 city.. top, middle, and bottom will be the same, so drop dups 
df_csv = df_csv.drop_duplicates(subset = ['Latitude', 'Longitude', 'Country'])
# reset index after dropping rows
df_csv = df_csv.reset_index()
# drop newly created index column
df_csv = df_csv.drop(columns = ['index'])

In [45]:
# extract longitudes and latitudes from reduced list
lon_lat = df_csv[['Latitude', 'Longitude']].copy()

In [46]:
# view longitude and latitude dataframe
lon_lat.head()

Unnamed: 0,Latitude,Longitude
0,78.933333,11.95
1,78.216667,15.633333
2,77.795278,-70.755833
3,77.509722,-66.647778
4,73.506944,80.546389


In [47]:
# we will walk through the first example to determine what weather data we want to extract
# once we determine this, the rest of the rows will be completed in a loop
lat = lon_lat['Latitude'].iloc[0]    # get latitude of first row
lon = lon_lat['Longitude'].iloc[0]   # get longitude of first row

# request url
url = 'https://history.openweathermap.org/data/2.5/aggregated/month?month={}&lat={}&lon={}&appid={}'

# GET request from endpoint for January
r = requests.get(url.format(1, lat, lon, API_KEY))

# save response (we will not view entire thing, as it's very long)
json = r.json()

# view part of response
# this statistical info is available for temp, pressure, humidity, wind, precipitation, clouds
json['result']['temp']

{'record_min': 234.15,
 'record_max': 278.42,
 'average_min': 240.69,
 'average_max': 274.05,
 'median': 262.31,
 'mean': 260.6,
 'p25': 254.42,
 'p75': 268.11,
 'st_dev': 8.92,
 'num': 8184}
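
Since each metric carries the same set of statistics, pulling one statistic per metric can be done in a single pass. The following is a small sketch, assuming the other metrics are nested under `result` in the same way as `temp`; the helper `extract_means` is a hypothetical name, and missing metrics are mapped to `None` rather than raising an error.

```python
METRICS = ["temp", "pressure", "humidity", "wind", "precipitation", "clouds"]


def extract_means(payload, metrics=METRICS):
    """Return {metric: mean} from a response shaped like the one above.

    Metrics missing from the payload map to None instead of raising KeyError.
    """
    result = payload.get("result", {})
    return {m: result.get(m, {}).get("mean") for m in metrics}


# Trimmed example built from the temperature values printed above; a full
# response would carry the other five metrics as well.
sample = {"result": {"temp": {"median": 262.31, "mean": 260.6, "st_dev": 8.92}}}
means = extract_means(sample)
```

Here `means["temp"]` holds the January mean temperature in Kelvin, while the metrics absent from the trimmed sample come back as `None`.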

Retrieving data for every month at every latitude and longitude would take a long time and require many API calls. Instead, we will approximate the seasonal weather pattern of each location using two months, March and October. This limits the number of API calls while still giving a reasonable picture of the weather at each location.
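
To make the saving concrete, the sketch below builds the request URLs for one location and compares the number of calls needed for twelve months against two. It makes no network requests; the helper names, the placeholder key, and the round figure of 1,000 locations are assumptions for illustration only.

```python
URL = ("https://history.openweathermap.org/data/2.5/aggregated/month"
       "?month={}&lat={}&lon={}&appid={}")


def build_urls(lat, lon, months, api_key):
    """Return one request URL per requested month for a single location."""
    return [URL.format(month, lat, lon, api_key) for month in months]


def api_calls_needed(n_locations, months):
    """Each location costs one call per month requested."""
    return n_locations * len(months)


sampled_months = [3, 10]
# "YOUR_KEY" is a placeholder, not a working key.
urls = build_urls(78.933333, 11.95, sampled_months, "YOUR_KEY")

n_locations = 1000  # illustrative round number, not the exact row count
full_year_calls = api_calls_needed(n_locations, range(1, 13))
sampled_calls = api_calls_needed(n_locations, sampled_months)
```

Under these assumptions, sampling two months cuts the request count to one sixth of what a full year would need.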

In [75]:
columns = ['Latitude', 'Longitude', 'Temp3', 'Pressure3', 'Humidity3',
           'Wind3', 'Precipitation3', 'Clouds3', 'Temp10', 'Pressure10',
           'Humidity10', 'Wind10', 'Precipitation10', 'Clouds10']
# metrics are listed in the same order as the columns for each month
metrics = ['temp', 'pressure', 'humidity', 'wind', 'precipitation', 'clouds']
months = [3, 10]

# collect one row per location, then build the dataframe once at the end
# (DataFrame.append was removed in pandas 2.0)
rows = []
for i, coords in lon_lat.iterrows():
    lat = coords['Latitude']
    lon = coords['Longitude']

    row = [lat, lon]
    for month in months:
        # GET request for the given month and location
        response = requests.get(url.format(month, lat, lon, API_KEY))
        result = response.json()['result']
        # keep only the mean of each metric
        row.extend(result[m]['mean'] for m in metrics)

    rows.append(row)
    print('Run {}'.format(i))

weather_data = pd.DataFrame(rows, columns = columns)

Run 0
Run 1
Run 2
...
Run 1143
Run 1144
...

In [76]:
# view data
weather_data.head()

Unnamed: 0,Latitude,Longitude,Temp3,Pressure3,Humidity3,Wind3,Precipitation3,Clouds3,Temp10,Pressure10,Humidity10,Wind10,Precipitation10,Clouds10
0,78.933333,11.95,267.48,1005.31,77.51,3.05,0,49.72,274.13,1009.4,87.6,2.72,0.03,64.98
1,78.216667,15.633333,267.48,1005.31,77.51,3.05,0,49.72,274.13,1009.4,87.6,2.72,0.03,64.98
2,77.795278,-70.755833,267.48,1005.31,77.51,3.05,0,49.72,274.13,1009.4,87.6,2.72,0.03,64.98
3,77.509722,-66.647778,267.48,1005.31,77.51,3.05,0,49.72,274.13,1009.4,87.6,2.72,0.03,64.98
4,73.506944,80.546389,256.07,1006.12,85.57,4.95,0,64.29,266.3,1007.05,88.74,4.61,0.02,68.38


## Step 1: Replace Column Names

The current column names made it simple to append the API data to the dataframe, but they can be improved: the metric names can be abbreviated, and the month numbers can be replaced with month names to make each column more specific.
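
Because the new names follow a regular pattern (abbreviated metric, underscore, month name), the rename mapping could also be generated rather than typed out. This is an optional sketch; the dictionary names are illustrative, and the result is the same mapping that is written by hand in the next cell.

```python
# Abbreviations and month labels matching the hand-written mapping.
METRIC_ABBREV = {"Temp": "Temp", "Pressure": "Pres", "Humidity": "Humi",
                 "Wind": "Wind", "Precipitation": "Prec", "Clouds": "Clou"}
MONTH_LABEL = {3: "March", 10: "Oct"}

# e.g. "Pressure3" -> "Pres_March", "Clouds10" -> "Clou_Oct"
rename_map = {
    "{}{}".format(metric, month): "{}_{}".format(abbrev, label)
    for month, label in MONTH_LABEL.items()
    for metric, abbrev in METRIC_ABBREV.items()
}
```

The resulting `rename_map` could be passed directly as `weather_data.rename(columns = rename_map)`.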

In [77]:
# rename api columns
weather_data = weather_data.rename(columns = {'Temp3' : 'Temp_March', 'Pressure3': "Pres_March", 
                                              'Humidity3': "Humi_March", 'Wind3' : 'Wind_March', 
                                              'Precipitation3' : "Prec_March", "Clouds3": "Clou_March",
                                              'Temp10' : 'Temp_Oct', 'Pressure10': "Pres_Oct", 
                                              'Humidity10': "Humi_Oct", 'Wind10' : 'Wind_Oct',
                                              'Precipitation10' : "Prec_Oct", "Clouds10" : "Clou_Oct"})

In [78]:
# view new columns (march)
weather_data.iloc[:,:8].head()

Unnamed: 0,Latitude,Longitude,Temp_March,Pres_March,Humi_March,Wind_March,Prec_March,Clou_March
0,78.933333,11.95,267.48,1005.31,77.51,3.05,0,49.72
1,78.216667,15.633333,267.48,1005.31,77.51,3.05,0,49.72
2,77.795278,-70.755833,267.48,1005.31,77.51,3.05,0,49.72
3,77.509722,-66.647778,267.48,1005.31,77.51,3.05,0,49.72
4,73.506944,80.546389,256.07,1006.12,85.57,4.95,0,64.29


In [79]:
# view new columns (october)
weather_data.iloc[:,8:].head()

Unnamed: 0,Temp_Oct,Pres_Oct,Humi_Oct,Wind_Oct,Prec_Oct,Clou_Oct
0,274.13,1009.4,87.6,2.72,0.03,64.98
1,274.13,1009.4,87.6,2.72,0.03,64.98
2,274.13,1009.4,87.6,2.72,0.03,64.98
3,274.13,1009.4,87.6,2.72,0.03,64.98
4,266.3,1007.05,88.74,4.61,0.02,68.38


## Step 2: Convert Kelvin to Fahrenheit

To make the temperature values more interpretable to a typical end-user, we will convert them to Fahrenheit.
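
The conversion subtracts 273.15 to get Celsius, then applies the usual Celsius-to-Fahrenheit scaling. A standalone sketch of the formula, using a hypothetical helper name and the first March value from the table above:

```python
def kelvin_to_fahrenheit(kelvin):
    """Convert Kelvin to Fahrenheit: (K - 273.15) * 9/5 + 32."""
    return (kelvin - 273.15) * 9 / 5 + 32


# First March temperature from the data above (267.48 K).
march_f = kelvin_to_fahrenheit(267.48)
```

This gives approximately 21.794 °F, and the freezing point of water (273.15 K) maps to 32 °F as expected.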

In [80]:
# overwrite column with conversion
weather_data = weather_data.assign(Temp_March = lambda x: (x['Temp_March'] - 273.15) * 9/5 + 32 )
weather_data = weather_data.assign(Temp_Oct = lambda x: (x['Temp_Oct'] - 273.15) * 9/5 + 32 )

In [81]:
# view new columns (march)
weather_data.iloc[:,:8].head()

Unnamed: 0,Latitude,Longitude,Temp_March,Pres_March,Humi_March,Wind_March,Prec_March,Clou_March
0,78.933333,11.95,21.794,1005.31,77.51,3.05,0,49.72
1,78.216667,15.633333,21.794,1005.31,77.51,3.05,0,49.72
2,77.795278,-70.755833,21.794,1005.31,77.51,3.05,0,49.72
3,77.509722,-66.647778,21.794,1005.31,77.51,3.05,0,49.72
4,73.506944,80.546389,1.256,1006.12,85.57,4.95,0,64.29


In [82]:
# view new columns (october)
weather_data.iloc[:,8:].head()

Unnamed: 0,Temp_Oct,Pres_Oct,Humi_Oct,Wind_Oct,Prec_Oct,Clou_Oct
0,33.764,1009.4,87.6,2.72,0.03,64.98
1,33.764,1009.4,87.6,2.72,0.03,64.98
2,33.764,1009.4,87.6,2.72,0.03,64.98
3,33.764,1009.4,87.6,2.72,0.03,64.98
4,19.67,1007.05,88.74,4.61,0.02,68.38


## Step 3: Convert Meters per Second to Feet per Second

To make the wind values more interpretable to a typical end-user, we will convert them to feet per second.
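
The conversion is a single multiplication by the number of feet in a meter. A standalone sketch, using a hypothetical helper name and the same factor as the cell below:

```python
FEET_PER_METER = 3.28084  # same factor used in the notebook


def mps_to_fps(meters_per_second):
    """Convert a speed from meters per second to feet per second."""
    return meters_per_second * FEET_PER_METER


# First March wind speed from the data above (3.05 m/s).
march_wind_fps = mps_to_fps(3.05)
```

This gives approximately 10.006562 ft/s for the first row.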

In [83]:
# overwrite column with conversion
weather_data = weather_data.assign(Wind_March = lambda x: x['Wind_March'] * 3.28084)
weather_data = weather_data.assign(Wind_Oct = lambda x: x['Wind_Oct'] * 3.28084)

In [84]:
# view new columns (march)
weather_data.iloc[:,:8].head()

Unnamed: 0,Latitude,Longitude,Temp_March,Pres_March,Humi_March,Wind_March,Prec_March,Clou_March
0,78.933333,11.95,21.794,1005.31,77.51,10.006562,0,49.72
1,78.216667,15.633333,21.794,1005.31,77.51,10.006562,0,49.72
2,77.795278,-70.755833,21.794,1005.31,77.51,10.006562,0,49.72
3,77.509722,-66.647778,21.794,1005.31,77.51,10.006562,0,49.72
4,73.506944,80.546389,1.256,1006.12,85.57,16.240158,0,64.29


In [85]:
# view new columns (october)
weather_data.iloc[:,8:].head()

Unnamed: 0,Temp_Oct,Pres_Oct,Humi_Oct,Wind_Oct,Prec_Oct,Clou_Oct
0,33.764,1009.4,87.6,8.923885,0.03,64.98
1,33.764,1009.4,87.6,8.923885,0.03,64.98
2,33.764,1009.4,87.6,8.923885,0.03,64.98
3,33.764,1009.4,87.6,8.923885,0.03,64.98
4,19.67,1007.05,88.74,15.124672,0.02,68.38


## Step 4: Convert Millimeters to Inches

To make the precipitation values more interpretable to a typical end-user, we will convert them to inches.
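
One inch is exactly 25.4 mm, so the exact factor is 1/25.4 (about 0.03937 inches per millimeter). The sketch below, with hypothetical helper names, compares the exact conversion against the rounded factor of 0.039 used in the next cell.

```python
MM_PER_INCH = 25.4  # exact by definition of the international inch


def mm_to_inches(mm):
    """Convert millimeters to inches using the exact definition."""
    return mm / MM_PER_INCH


# First October precipitation value from the data above (0.03 mm).
exact = mm_to_inches(0.03)
approx = 0.03 * 0.039
# Fraction by which the rounded factor underestimates the exact value.
relative_gap = (exact - approx) / exact
```

The rounded factor underestimates the exact value by just under 1%, which is small relative to the precision of the precipitation figures themselves.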

In [86]:
# overwrite column with conversion (0.039 in/mm is a rounded approximation of 1/25.4)
weather_data = weather_data.assign(Prec_March = lambda x: x['Prec_March'] * 0.039)
weather_data = weather_data.assign(Prec_Oct = lambda x: x['Prec_Oct'] * 0.039)

In [87]:
# view new columns (march)
weather_data.iloc[:,:8].head()

Unnamed: 0,Latitude,Longitude,Temp_March,Pres_March,Humi_March,Wind_March,Prec_March,Clou_March
0,78.933333,11.95,21.794,1005.31,77.51,10.006562,0.0,49.72
1,78.216667,15.633333,21.794,1005.31,77.51,10.006562,0.0,49.72
2,77.795278,-70.755833,21.794,1005.31,77.51,10.006562,0.0,49.72
3,77.509722,-66.647778,21.794,1005.31,77.51,10.006562,0.0,49.72
4,73.506944,80.546389,1.256,1006.12,85.57,16.240158,0.0,64.29


In [88]:
# view new columns (october)
weather_data.iloc[:,8:].head()

Unnamed: 0,Temp_Oct,Pres_Oct,Humi_Oct,Wind_Oct,Prec_Oct,Clou_Oct
0,33.764,1009.4,87.6,8.923885,0.00117,64.98
1,33.764,1009.4,87.6,8.923885,0.00117,64.98
2,33.764,1009.4,87.6,8.923885,0.00117,64.98
3,33.764,1009.4,87.6,8.923885,0.00117,64.98
4,19.67,1007.05,88.74,15.124672,0.00078,68.38


## Step 5: Create Column for Yearly Averages

Having a column that summarizes the weather patterns for a location could be helpful, rather than having to look at individual months. Therefore, we will create a column for each metric that averages the March and October values. Because only two months are available, this serves as a rough proxy for a yearly average.
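
For a single location, the calculation is simply the mean of the two monthly values. A standalone sketch with a hypothetical helper name, using the converted temperatures of the first row:

```python
from statistics import mean


def two_month_average(march_value, october_value):
    """Average the March and October values of one metric."""
    return mean([march_value, october_value])


# First-row temperatures in Fahrenheit from the data above.
avg_temp = two_month_average(21.794, 33.764)
```

This gives approximately 27.779 °F, which is what the row-wise `.mean(axis = 1)` in the next cell computes for every location at once.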

In [89]:
# calculate average of each metric
weather_data['Temp_Yearly_Av'] = weather_data[['Temp_March', 'Temp_Oct']].mean(axis = 1)
weather_data['Pres_Yearly_Av'] = weather_data[['Pres_March', 'Pres_Oct']].mean(axis = 1)
weather_data['Humi_Yearly_Av'] = weather_data[['Humi_March', 'Humi_Oct']].mean(axis = 1)
weather_data['Wind_Yearly_Av'] = weather_data[['Wind_March', 'Wind_Oct']].mean(axis = 1)
weather_data['Prec_Yearly_Av'] = weather_data[['Prec_March', 'Prec_Oct']].mean(axis = 1)
weather_data['Clou_Yearly_Av'] = weather_data[['Clou_March', 'Clou_Oct']].mean(axis = 1)

In [90]:
# view new columns
weather_data[['Temp_Yearly_Av', 'Pres_Yearly_Av', 'Humi_Yearly_Av', 'Wind_Yearly_Av',
             'Prec_Yearly_Av', 'Clou_Yearly_Av']].head()

Unnamed: 0,Temp_Yearly_Av,Pres_Yearly_Av,Humi_Yearly_Av,Wind_Yearly_Av,Prec_Yearly_Av,Clou_Yearly_Av
0,27.779,1007.355,82.555,9.465223,0.000585,57.35
1,27.779,1007.355,82.555,9.465223,0.000585,57.35
2,27.779,1007.355,82.555,9.465223,0.000585,57.35
3,27.779,1007.355,82.555,9.465223,0.000585,57.35
4,10.463,1006.585,87.155,15.682415,0.00039,66.335


### View Final Data

In [91]:
# view first 10 rows, first 12 columns
weather_data.iloc[:,:12].head(10)

Unnamed: 0,Latitude,Longitude,Temp_March,Pres_March,Humi_March,Wind_March,Prec_March,Clou_March,Temp_Oct,Pres_Oct,Humi_Oct,Wind_Oct
0,78.933333,11.95,21.794,1005.31,77.51,10.006562,0.0,49.72,33.764,1009.4,87.6,8.923885
1,78.216667,15.633333,21.794,1005.31,77.51,10.006562,0.0,49.72,33.764,1009.4,87.6,8.923885
2,77.795278,-70.755833,21.794,1005.31,77.51,10.006562,0.0,49.72,33.764,1009.4,87.6,8.923885
3,77.509722,-66.647778,21.794,1005.31,77.51,10.006562,0.0,49.72,33.764,1009.4,87.6,8.923885
4,73.506944,80.546389,1.256,1006.12,85.57,16.240158,0.0,64.29,19.67,1007.05,88.74,15.124672
5,71.966667,102.5,1.256,1006.12,85.57,16.240158,0.0,64.29,19.67,1007.05,88.74,15.124672
6,71.290556,-156.788611,25.646,1009.71,70.27,7.939633,0.00078,45.41,37.526,1005.12,81.68,8.103675
7,71.033333,27.85,21.794,1005.31,77.51,10.006562,0.0,49.72,33.764,1009.4,87.6,8.923885
8,71.000556,24.696111,21.794,1005.31,77.51,10.006562,0.0,49.72,33.764,1009.4,87.6,8.923885
9,70.45,-68.566667,20.318,1012.63,74.74,14.173229,0.00078,67.36,43.7,1011.47,80.08,11.220473


In [92]:
# view first 10 rows, last 8 columns
weather_data.iloc[:,12:].head(10)

Unnamed: 0,Prec_Oct,Clou_Oct,Temp_Yearly_Av,Pres_Yearly_Av,Humi_Yearly_Av,Wind_Yearly_Av,Prec_Yearly_Av,Clou_Yearly_Av
0,0.00117,64.98,27.779,1007.355,82.555,9.465223,0.000585,57.35
1,0.00117,64.98,27.779,1007.355,82.555,9.465223,0.000585,57.35
2,0.00117,64.98,27.779,1007.355,82.555,9.465223,0.000585,57.35
3,0.00117,64.98,27.779,1007.355,82.555,9.465223,0.000585,57.35
4,0.00078,68.38,10.463,1006.585,87.155,15.682415,0.00039,66.335
5,0.00078,68.38,10.463,1006.585,87.155,15.682415,0.00039,66.335
6,0.00897,54.83,31.586,1007.415,75.975,8.021654,0.004875,50.12
7,0.00117,64.98,27.779,1007.355,82.555,9.465223,0.000585,57.35
8,0.00117,64.98,27.779,1007.355,82.555,9.465223,0.000585,57.35
9,0.00312,66.66,32.009,1012.05,77.41,12.696851,0.00195,67.01


In [93]:
# view last 10 rows, first 12 columns
weather_data.iloc[:,:12].tail(10)

Unnamed: 0,Latitude,Longitude,Temp_March,Pres_March,Humi_March,Wind_March,Prec_March,Clou_March,Temp_Oct,Pres_Oct,Humi_Oct,Wind_Oct
1150,13.966667,44.183333,65.912,957.08,53.91,5.643045,0.00117,17.49,65.858,937.63,54.55,4.72441
1151,13.916667,44.15,65.912,957.08,53.91,5.643045,0.00117,17.49,65.858,937.63,54.55,4.72441
1152,-12.787222,45.102778,82.652,1009.81,79.4,9.84252,0.00429,40.36,81.356,1014.86,69.43,13.156168
1153,-12.833333,45.110556,82.652,1009.81,79.4,9.84252,0.00429,40.36,81.356,1014.86,69.43,13.156168
1154,-28.085614,27.138143,67.802,956.79,63.6,10.433071,0.00468,33.46,68.054,930.68,43.87,12.893701
1155,-28.10391,26.86593,67.802,956.79,63.6,10.433071,0.00468,33.46,68.054,930.68,43.87,12.893701
1156,-13.6,24.2,68.072,954.65,87.72,6.36483,0.01014,61.44,73.976,930.03,56.03,8.038058
1157,-13.616667,29.4,68.828,956.98,84.95,7.841208,0.0078,53.16,76.424,929.57,39.04,11.646982
1158,-18.533333,32.116667,68.792,969.41,80.72,5.675853,0.00468,40.36,70.16,949.96,61.46,7.874016
1159,-18.916667,29.816667,68.648,957.86,70.5,12.664042,0.00507,31.61,72.086,934.51,45.08,13.648294


In [94]:
# view last 10 rows, last 8 columns
weather_data.iloc[:,12:].tail(10)

Unnamed: 0,Prec_Oct,Clou_Oct,Temp_Yearly_Av,Pres_Yearly_Av,Humi_Yearly_Av,Wind_Yearly_Av,Prec_Yearly_Av,Clou_Yearly_Av
1150,0.00117,21.4,65.885,947.355,54.23,5.183727,0.00117,19.445
1151,0.00117,21.4,65.885,947.355,54.23,5.183727,0.00117,19.445
1152,0.00039,20.86,82.004,1012.335,74.415,11.499344,0.00234,30.61
1153,0.00039,20.86,82.004,1012.335,74.415,11.499344,0.00234,30.61
1154,0.00156,27.48,67.928,943.735,53.735,11.663386,0.00312,30.47
1155,0.00156,27.48,67.928,943.735,53.735,11.663386,0.00312,30.47
1156,0.00234,44.32,71.024,942.34,71.875,7.201444,0.00624,52.88
1157,0.00078,30.2,72.626,943.275,61.995,9.744095,0.00429,41.68
1158,0.00117,21.71,69.476,959.685,71.09,6.774935,0.002925,31.035
1159,0.00234,23.91,70.367,946.185,57.79,13.156168,0.003705,27.76


## Ethical Implications

Ethical concerns almost always arise when data is summarized. For this dataset, those concerns have less to do with stakeholders and more to do with the quality and accuracy of the data being presented. Weather patterns often vary drastically over the year, whether between a wet and a dry season or across a typical winter, spring, summer, and fall. Because of the size of the dataset and the API call limit, not every month could be retrieved, so two months were chosen as representative samples. Although these two months likely give a reasonable picture of the weather in each location, they may miss crucial variation, which brings data quality into question. Given that the end goal of this analysis is to predict suicide rates, uncertainty about data quality could meaningfully affect the results.

In [96]:
weather_data.to_csv('api_data.csv')