# Assignment - What's the Weather Like?

## Background

Whether financial, political, or social -- data's true power lies in its ability to answer questions definitively. So let's take what you've learned about Python requests, APIs, and JSON traversals to answer a fundamental question: "What's the weather like as we approach the equator?"

Now, we know what you may be thinking: _"Duh. It gets hotter..."_

But, if pressed, how would you **provide evidence**?

## WeatherPy

In this example, you'll create a Python script to visualize the weather of 500+ cities around the world at varying distances from the equator. To accomplish this, you'll use a [simple Python library](https://pypi.python.org/pypi/citipy), the [OpenWeatherMap API](https://openweathermap.org/api), and a little common sense to create a representative model of weather across world cities.

Your objective is to build a series of scatter plots to showcase the following relationships:

* Temperature (F) vs. Latitude
* Humidity (%) vs. Latitude
* Cloudiness (%) vs. Latitude
* Wind Speed (mph) vs. Latitude

Your final notebook must:

* Randomly select **at least** 500 unique (non-repeat) cities based on latitude and longitude.
* Perform a weather check on each of the cities using a series of successive API calls.
* OPTIONAL: Include a print log of each city as it's being processed with the city number and city name.
* OPTIONAL: Save both a CSV of all data retrieved and png images for each scatter plot.

As final considerations:

* You must complete your analysis using a Jupyter notebook.
* You must use the Matplotlib, Seaborn, or Pandas plotting libraries.
* You must include a written description of three observable trends based on the data.
* You must use proper labeling of your plots, including aspects like: Plot Titles (with date of analysis) and Axes Labels.

## Hints and Considerations

* The city data is generated based on random coordinates; as such, your outputs will not be an exact match to the provided starter notebook.

* You may want to start this assignment by refreshing yourself on the [geographic coordinate system](http://desktop.arcgis.com/en/arcmap/10.3/guide-books/map-projections/about-geographic-coordinate-systems.htm).

* Next, spend the time needed to study the OpenWeatherMap API. Based on your initial study, you should be able to answer basic questions about the API: Where do you request the API key? Which Weather API in particular will you need? What URL endpoints does it expect? What JSON structure does it respond with? Before you write a line of code, you should aim to have a crystal clear understanding of your intended outcome.

* Starter code for citipy has been provided. However, if you're craving an extra challenge, push yourself to learn how it works: [citipy Python library](https://pypi.python.org/pypi/citipy).

* Lastly, remember -- this is a challenging activity. Push yourself! If you complete this task, you can safely say that you've gained a strong mastery of the core foundations of data analytics, and it will only get better from here. Good luck!

#### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.
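
As a starting point for the API study suggested in the hints, here is a minimal sketch of how a current-weather request URL could be assembled. It assumes the `/data/2.5/weather` endpoint used later in this notebook; the helper name `build_weather_url`, the lowercase `appid` parameter spelling, and the `YOUR_API_KEY` placeholder are illustrative choices, not part of the starter code.

```python
from urllib.parse import urlencode

# Current-weather endpoint (assumed; check the OpenWeatherMap docs for your plan).
BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(city, api_key, units="imperial"):
    """Return the request URL for one city's current weather."""
    params = {"q": city, "units": units, "appid": api_key}
    # urlencode escapes spaces and other special characters in city names.
    return f"{BASE_URL}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder, not a real key.
print(build_weather_url("George Town", "YOUR_API_KEY"))
```

Building the query string with `urlencode` rather than string concatenation keeps multi-word city names such as "George Town" valid in the URL.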

In [28]:
# Don't modify this cell.
# Dependencies and Setup
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import requests
import time
from pprint import pprint

# Import API key
import api_keys

# Incorporated citipy to determine city based on latitude and longitude
from citipy import citipy

# Range of latitudes and longitudes
lat_range = (-90, 90)
lng_range = (-180, 180)

## Generate Cities List

In [29]:
# Don't modify this cell
# List for holding lat_lngs and cities
lat_lngs = []
cities = []

# Create a set of random lat and lng combinations
lats = np.random.uniform(low=-90.000, high=90.000, size=1500)
lngs = np.random.uniform(low=-180.000, high=180.000, size=1500)
lat_lngs = zip(lats, lngs)

# Identify nearest city for each lat, lng combination
for lat_lng in lat_lngs:
    city = citipy.nearest_city(lat_lng[0], lat_lng[1]).city_name
    
    # If the city is unique, then add it to our cities list
    if city not in cities:
        cities.append(city)

# Print the city count to confirm sufficient count
len(cities)

603

A lot of starter code has been generated for you. Use the cell below to play with the data to ensure you know what's happening. 

### Perform API Calls
* Perform a weather check on each city in `cities` using a series of successive API calls.
* OPTIONAL: Include a print log of each city as it's being processed (with the city number and city name).
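
One possible shape for the optional print log is sketched below. The names `fetch_all` and `fake_fetch` are hypothetical; `fake_fetch` stands in for a real `requests.get(...).json()` call so the sketch runs offline.

```python
def fetch_all(cities, fetch_one, log=print):
    """Call fetch_one(city) for each city, logging progress as it goes."""
    results = []
    for number, city in enumerate(cities, start=1):
        log(f"Processing city {number} of {len(cities)}: {city}")
        results.append(fetch_one(city))
    return results


# Stand-in for a real API request (hypothetical helper, no network access).
def fake_fetch(city):
    return {"name": city, "cod": 200}


demo = fetch_all(["rikitea", "castro"], fake_fetch)
```

Passing the fetch function in as an argument makes it easy to try the loop on a handful of cities before spending real API calls.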


In [30]:
# OpenWeatherMap API Key
api_key = api_keys.api_key

# base url for getting api data
base_url = "http://api.openweathermap.org/data/2.5/weather?units=Imperial&APPID=" + api_key

# example request
#req = requests.get(base_url + f'&q={cities[0]}').json()

# Your code here. A loop maybe? 
#NOTE: API calls can be slow. They can also be limited. 
#Once you have a list try not to rerun all API calls very often. 

weather = []
cities = cities[:500]

for city in cities: 
    req = requests.get(base_url + f'&q={city}').json()
    time.sleep(1)
    
    weather.append(req)
    
    



In [31]:
pprint(weather[0]['main']['temp'])

82.81
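
Because API calls are slow and rate-limited, it can help to save the raw responses once and reload them on later runs. The sketch below shows one way to do that with a JSON file; `load_or_fetch` and the file name in the usage note are hypothetical and not part of the starter code.

```python
import json
import os


def load_or_fetch(cache_path, fetch):
    """Return cached JSON if cache_path exists; otherwise call fetch() and save it."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    data = fetch()
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data
```

In this notebook the call might look like `weather = load_or_fetch('weather_raw.json', lambda: [requests.get(base_url + f'&q={city}').json() for city in cities])`. Delete the file whenever fresh data is needed.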


In [32]:
# Create a dictionary with one list for every variable we are plotting against
my_dict = {'name': [], 'clouds': [], 'country': [], 'dt': [], 'humidity': [],
           'lat': [], 'lon': [], 'temp_max': [], 'speed': []}

for i in weather:
    if i['cod'] == 200:
        my_dict['name'].append(i['name'])
        my_dict['clouds'].append(i['clouds']['all'])
        my_dict['country'].append(i['sys']['country'])
        my_dict['dt'].append(i['dt'])
        my_dict['humidity'].append(i['main']['humidity'])
        my_dict['lat'].append(i['coord']['lat'])
        my_dict['lon'].append(i['coord']['lon'])
        my_dict['temp_max'].append(i['main']['temp_max'])
        my_dict['speed'].append(i['wind']['speed'])
    else:
        print(i)
    


{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city not found'}
{'cod': '404', 'message': 'city 

In [33]:
pprint(weather[1])

{'base': 'stations',
 'clouds': {'all': 16},
 'cod': 200,
 'coord': {'lat': -23.12, 'lon': -134.97},
 'dt': 1600902864,
 'id': 4030556,
 'main': {'feels_like': 71.22,
          'grnd_level': 1018,
          'humidity': 72,
          'pressure': 1021,
          'sea_level': 1021,
          'temp': 72.86,
          'temp_max': 72.86,
          'temp_min': 72.86},
 'name': 'Rikitea',
 'sys': {'country': 'PF', 'sunrise': 1600872463, 'sunset': 1600916178},
 'timezone': -32400,
 'visibility': 10000,
 'weather': [{'description': 'few clouds',
              'icon': '02d',
              'id': 801,
              'main': 'Clouds'}],
 'wind': {'deg': 74, 'speed': 11.01}}
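
The nested response above can also be flattened one record at a time. The following is a minimal sketch, with `flatten_record` as a hypothetical helper name. It compares `cod` as a string because the failed lookups logged earlier report `'404'` as a string, while successful responses use the integer `200`.

```python
def flatten_record(resp):
    """Return a flat row for one API response, or None if the lookup failed."""
    # 'cod' may arrive as an int (200) or a string ('404'), so compare as strings.
    if str(resp.get("cod")) != "200":
        return None
    return {
        "city": resp["name"],
        "country": resp["sys"]["country"],
        "dt": resp["dt"],
        "lat": resp["coord"]["lat"],
        "lon": resp["coord"]["lon"],
        "temp_max": resp["main"]["temp_max"],
        "humidity": resp["main"]["humidity"],
        "clouds": resp["clouds"]["all"],
        "speed": resp["wind"]["speed"],
    }


# Trimmed copy of the Rikitea response printed above.
sample = {
    "cod": 200, "name": "Rikitea", "dt": 1600902864,
    "coord": {"lat": -23.12, "lon": -134.97},
    "main": {"temp_max": 72.86, "humidity": 72},
    "clouds": {"all": 16}, "sys": {"country": "PF"},
    "wind": {"deg": 74, "speed": 11.01},
}
not_found = {"cod": "404", "message": "city not found"}

# Keep only the responses that flattened successfully.
rows = [row for row in map(flatten_record, [sample, not_found]) if row]
print(rows)
```

A list of row dictionaries like `rows` can be passed straight to `pd.DataFrame(rows)`.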


In [34]:
# Build a DataFrame from the collected data and export it to CSV
df_cities_weather = pd.DataFrame(my_dict)
df_cities_weather.to_csv('df_cities_weather.csv')

# Read the CSV back in to confirm the export worked
cities_weather = pd.read_csv('df_cities_weather.csv')

# Rename the 'name' column to 'city' for readability
df_cities_weather.rename(columns={'name': 'city'}, inplace=True)
df_cities_weather



Unnamed: 0,city,clouds,country,dt,humidity,lat,lon,temp_max,speed
0,Butaritari,100,KI,1600903141,72,3.07,172.79,82.81,14.16
1,Rikitea,16,PF,1600902864,72,-23.12,-134.97,72.86,11.01
2,George Town,40,MY,1600903012,94,5.41,100.34,77.00,5.82
3,Kandrian,0,PG,1600903146,71,-6.22,149.55,83.08,4.92
4,Castro,19,BR,1600903147,92,-24.79,-50.01,56.10,6.73
...,...,...,...,...,...,...,...,...,...
460,Waipawa,3,NZ,1600903784,77,-41.41,175.52,60.80,25.28
461,Ngunguru,81,NZ,1600903785,79,-35.62,174.50,64.99,10.00
462,Griffith,1,US,1600903786,53,41.53,-87.42,73.99,5.82
463,Uturoa,1,PF,1600903787,74,-16.73,-151.43,80.01,16.44


In [35]:
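# Reload the exported CSV, using its first column as the index
cities_weather = pd.read_csv('df_cities_weather.csv', index_col=0)

# The CSV was exported before the rename, so its column is still called 'name'
cities_weather['City'] = cities_weather['name']

cities_weather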

### Plotting the Data
* Use proper labeling of the plots using plot titles (including date of analysis) and axes labels.
* Use matplotlib
* OPTIONAL: Save the plotted figures as .pngs.
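
Rather than typing the date of analysis into every title by hand, it could be derived from the `dt` timestamps returned by the API. This is a sketch under assumptions: `analysis_date` and `plot_title` are hypothetical helpers, and the date is formatted in UTC, which may differ by a day from your local date.

```python
from datetime import datetime, timezone


def analysis_date(timestamps):
    """Format the latest Unix timestamp as MM/DD/YYYY (UTC)."""
    latest = datetime.fromtimestamp(max(timestamps), tz=timezone.utc)
    return latest.strftime("%m/%d/%Y")


def plot_title(y_label, timestamps):
    """Build a title such as 'City Latitude vs. Humidity (MM/DD/YYYY)'."""
    return f"City Latitude vs. {y_label} ({analysis_date(timestamps)})"


# The Unix epoch (0) is used here only as a neutral example input.
print(plot_title("Humidity", [0]))  # City Latitude vs. Humidity (01/01/1970)
```

In the plotting cells this could be used as `ax.set_title(plot_title('Humidity', my_dict['dt']))`.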

#### Latitude vs. Temperature Plot

In [None]:
fig, ax = plt.subplots()

ax.scatter(my_dict['lat'], my_dict['temp_max'], color = 'hotpink')
ax.set_xlabel('Latitude')
ax.set_ylabel('Max Temperature (F)')
ax.set_title('City Latitude vs. Max Temperature (09/22/2020)')
ax.grid()
plt.show()
fig.savefig('lat_temp.png')


#### Latitude vs. Humidity Plot

In [None]:
fig, ax = plt.subplots()

ax.scatter(my_dict['lat'], my_dict['humidity'], color = 'lightblue')
ax.set_xlabel('Latitude')
ax.set_ylabel('Humidity (%)')
ax.set_title('City Latitude vs. Humidity (09/22/2020)')
ax.grid()
plt.show()
fig.savefig('lat_humidity.png')

#### Latitude vs. Cloudiness Plot

In [None]:
fig, ax = plt.subplots()

ax.scatter(my_dict['lat'], my_dict['clouds'], color = 'orange')
ax.set_xlabel('Latitude')
ax.set_ylabel('Cloudiness (%)')
ax.set_title('City Latitude vs. Cloudiness (09/22/2020)')
ax.grid()
plt.show()
fig.savefig('lat_cloudiness.png')

#### Latitude vs. Wind Speed Plot

In [None]:
fig, ax = plt.subplots()

ax.scatter(my_dict['lat'], my_dict['speed'], color = 'purple')
ax.set_xlabel('Latitude')
ax.set_ylabel('Wind Speed (mph)')
ax.set(title ='City Latitude vs. Wind Speed (09/22/2020)')
ax.grid()
plt.show()
fig.savefig('lat_windspeed.png')

### Use the Seaborn library to re-create 2-4 of the above plots. 
* Use the same data just make a different plot.
* Note the differences in amount of code required to generate a similar plot.

In [None]:
import seaborn as sns

# Set the style before plotting so it applies to this figure
sns.set_style('whitegrid')
cloud_lat = sns.scatterplot(x='lat', y='clouds', data=my_dict, hue='clouds')
cloud_lat.set_title('City Latitude vs. Cloudiness (09/22/2020)')
cloud_lat.set(xlabel='Latitude', ylabel='Cloudiness (%)')
# Save after the title and labels are set so they appear in the image
plt.savefig('lat_clouds_sns.png')

In [None]:
maxtemp_lat = sns.scatterplot(x='lat', y='temp_max', data = my_dict, hue='temp_max')
sns.set_style('whitegrid')
maxtemp_lat.set_title('City Latitude vs. Max Temperature (09/22/2020)')
maxtemp_lat.set(xlabel='Latitude', ylabel='Max Temperature (F)')
plt.savefig('lat_maxtemp_sns.png')

In [None]:
humidity_lat = sns.scatterplot(x='lat', y='humidity', data=my_dict, hue='humidity')
sns.set_style('whitegrid')
humidity_lat.set_title('City Latitude vs. Humidity (09/22/2020)')
humidity_lat.set(xlabel='Latitude', ylabel='Humidity (%)')
plt.savefig('lat_humidity_sns.png')

In [None]:
speed_lat = sns.scatterplot(x='lat', y='speed', data=my_dict, hue='speed')
sns.set_style('whitegrid')
speed_lat.set_title('City Latitude vs. Wind Speed (09/22/2020)')
speed_lat.set(xlabel='Latitude', ylabel='Wind Speed (mph)')
plt.savefig('lat_windspeed_sns.png')

### As a Data Scientist:
In addition to generating, munging, and plotting data, you will also be responsible for *interpreting* data.
* Provide a written description of three observable trends based on the data.

In [None]:
#Max Temp + Latitude
#On average, the temperature is higher near the equator and falls the farther north you go. Temperatures remain similar to one another between latitudes -20 and 20.

In [None]:
#WindSpeed
#There is no clear correlation between wind speed and latitude.

#Humidity does not vary much near the equator, but it does vary significantly as you move away from the equator.
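
To back a written trend with a number, a correlation coefficient is one option. The sketch below computes Pearson's r between distance from the equator (absolute latitude) and max temperature, using only the nine rows visible in the DataFrame preview earlier in this notebook. The helper `pearson_r` is written out by hand for illustration. With so few points the result is only indicative, so the full dataset should be used for any real conclusion.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# The nine rows visible in the DataFrame preview (lat, temp_max).
lats = [3.07, -23.12, 5.41, -6.22, -24.79, -41.41, -35.62, 41.53, -16.73]
temps = [82.81, 72.86, 77.00, 83.08, 56.10, 60.80, 64.99, 73.99, 80.01]

# Distance from the equator is the absolute value of the latitude.
r = pearson_r([abs(lat) for lat in lats], temps)
print(f"r(|latitude|, max temp) = {r:.2f}")
```

A negative r here would be consistent with the first trend above: temperatures drop as cities sit farther from the equator.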

### OPTIONAL Homework Problem: 
* Use a **different** API endpoint such as `Hourly Forecast 4 days` to get data. 
* Other weather API endpoints are documented [here](https://openweathermap.org/api). 
* You will have to change the URL parameters to get the data you want. 
* Get the data into a data structure of your choice. (Pandas, dicts, lists etc...)
* Use the plotting library of your choice to make 1-4 plots of your choice. 
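
As one possible starting point, the sketch below targets the 5 day / 3 hour forecast endpoint (`/data/2.5/forecast`) rather than the hourly one, on the assumption that it is the forecast endpoint most likely to be available to a free key. The helper names are hypothetical, and `sample_payload` is a hand-written stand-in with made-up values shaped like a forecast response, not real API output.

```python
from urllib.parse import urlencode

# 5 day / 3 hour forecast endpoint (assumed; confirm in the API docs).
FORECAST_URL = "https://api.openweathermap.org/data/2.5/forecast"


def forecast_url(city, api_key, units="imperial"):
    """Return the forecast request URL for one city."""
    params = {"q": city, "units": units, "appid": api_key}
    return f"{FORECAST_URL}?{urlencode(params)}"


def forecast_temps(payload):
    """Return (timestamp text, temperature) pairs from a forecast response."""
    return [(item["dt_txt"], item["main"]["temp"]) for item in payload["list"]]


# Hypothetical payload with made-up values, used so the sketch runs offline.
sample_payload = {
    "list": [
        {"dt_txt": "2020-09-24 00:00:00", "main": {"temp": 70.0}},
        {"dt_txt": "2020-09-24 03:00:00", "main": {"temp": 68.5}},
    ]
}
print(forecast_temps(sample_payload))
```

The resulting list of pairs can be loaded into a DataFrame or plotted directly as a time series.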