# Unit 6 | Assignment - What's the Weather Like?


Background

Whether financial, political, or social -- data's true power lies in its ability to answer questions definitively. So let's take what you've learned about Python requests, APIs, and JSON traversals to answer a fundamental question: "What's the weather like as we approach the equator?"

Now, we know what you may be thinking: "Duh. It gets hotter..."

But, if pressed, how would you prove it?




WeatherPy

In this example, you'll be creating a Python script to visualize the weather of 500+ cities across the world of varying distance from the equator. To accomplish this, you'll be utilizing a simple Python library, the OpenWeatherMap API, and a little common sense to create a representative model of weather across world cities.

Your objective is to build a series of scatter plots to showcase the following relationships:


Temperature (F) vs. Latitude
Humidity (%) vs. Latitude
Cloudiness (%) vs. Latitude
Wind Speed (mph) vs. Latitude


Your final notebook must:


Randomly select at least 500 unique (non-repeat) cities based on latitude and longitude.
Perform a weather check on each of the cities using a series of successive API calls.
Include a print log of each city as it's being processed with the city number and city name.
Save both a CSV of all data retrieved and png images for each scatter plot.


As final considerations:


You must complete your analysis using a Jupyter notebook.
You must use the Matplotlib or Pandas plotting libraries.
You must include a written description of three observable trends based on the data.
You must use proper labeling of your plots, including aspects like: Plot Titles (with date of analysis) and Axes Labels.
See Example Solution for a reference on expected format.



Hints and Considerations


You may want to start this assignment by refreshing yourself on the geographic coordinate system.
Next, spend the requisite time necessary to study the OpenWeatherMap API. Based on your initial study, you should be able to answer  basic questions about the API: Where do you request the API key? Which Weather API in particular will you need? What URL endpoints does it expect? What JSON structure does it respond with? Before you write a line of code, you should be aiming to have a crystal clear understanding of your intended outcome.
Though we've never worked with the citipy Python library, push yourself to decipher how it works, and why it might be relevant. Before you try to incorporate the library into your analysis, start by creating simple test cases outside your main script to confirm that you are using it correctly. Too often, when introduced to a new library, students get bogged down by the most minor of errors -- spending hours investigating their entire code -- when, in fact, a simple and focused test would have shown their basic utilization of the library was wrong from the start. Don't let this be you!
Part of our expectation in this challenge is that you will use critical thinking skills to understand how and why we're recommending the tools we are. What is Citipy for? Why would you use it in conjunction with the OpenWeatherMap API? How would you do so?
In building your script, pay attention to the cities you are using in your query pool. Are you getting coverage of the full gamut of latitudes and longitudes? Or are you simply choosing 500 cities concentrated in one region of the world? Even if you were a geographic genius, simply rattling 500 cities based on your human selection would create a biased dataset. Be thinking of how you should counter this. (Hint: Consider the full range of latitudes).
Lastly, remember -- this is a challenging activity. Push yourself! If you complete this task, then you can safely say that you've gained a strong mastery of the core foundations of data analytics and it will only go better from here. Good luck!



Copyright

Data Boot Camp © 2018. All Rights Reserved.

In [1]:
import pandas as pd
from citipy import citipy
import numpy as np
import pandas as pd
import requests
import json
import seaborn as sns
import matplotlib.pyplot as plt
from config import api_keys
%matplotlib inline
api_keys

'25bc90a1196e6f153eece0bc0b0fc9eb'

In [2]:
#generate random list for latitudes and longitudes 
latitude=[]
longitude=[]
for each_lat in np.random.randint(-90,90,2000):
    latitude.append(each_lat)
for each_lon in np.random.randint(-180,180,2000):
    longitude.append(each_lon)


In [3]:
#Create and variable called latitude_and_Longitude and store all random latitudes and longitudes generated
latitude_and_Longitude=tuple(zip(latitude,longitude))
#latitude_and_Longitude

In [4]:
#use citipy library to find the nearest city for all latitudes and longitudes
cities=[]
countries=[]
for lat,lon in latitude_and_Longitude:
    city=citipy.nearest_city(lat,lon)
    cities.append(city.city_name)
    countries.append(city.country_code)
#unit_city=set(cities)
#unit_city

In [5]:
#Print out how many the different cities were generated
print(len(set(cities)))
print('The above script has generated {} different cities using random numbers.'.format(len(set(cities))))
#print('The above script has generated {} different cities using random numbers.'.format(len(set(cities))))

753
The above script has generated 753 different cities using random numbers.


In [6]:
#create a dataframe to store all the cities and countries generated with citipy

city_dic={"City":cities,
          "Country":countries}
df_countries=pd.DataFrame(city_dic)
#add additional blank columns to store information from openweathermap api

df_countries["Temperature (F)"]=""
df_countries["Humidity (%)"]=""
df_countries["Cloudiness (%)"]=""
df_countries["Wind Speed (mph)"]=""
df_countries.head()

Unnamed: 0,City,Country,Temperature (F),Humidity (%),Cloudiness (%),Wind Speed (mph)
0,kieta,pg,,,,
1,attawapiskat,ca,,,,
2,ushuaia,ar,,,,
3,punta arenas,cl,,,,
4,haldia,in,,,,


In [7]:
#drop all duplicates in column Cities.
#df_countries.shape
df_countries=df_countries.drop_duplicates(subset=['City'],keep='first')
df_countries_1=df_countries.head(5)
df_countries_1.head()

Unnamed: 0,City,Country,Temperature (F),Humidity (%),Cloudiness (%),Wind Speed (mph)
0,kieta,pg,,,,
1,attawapiskat,ca,,,,
2,ushuaia,ar,,,,
3,punta arenas,cl,,,,
4,haldia,in,,,,


In [8]:
from pprint import pprint
city_name="London"
Country_id="uk"
api_url = "http://api.openweathermap.org/data/2.5/forecast" \
          "?q={},{}&units=metric&mode=json&appid={}c".format(city_name,Country_id,api_keys)
response=requests.get(api_url).json()
pprint(response)
#print(api_url)
#df_countries["Temperature (F)"].append(response['list'][3])

{'cod': 401,
 'message': 'Invalid API key. Please see '
            'http://openweathermap.org/faq#error401 for more info.'}


In [None]:
#loop through all rows and fill in values for blank columns in dataframe
for index,row in df_countries_1.iterrows():
    city_name=row['City']
    Country_id=row['Country']

    api_url = "http://api.openweathermap.org/data/2.5/forecast" \
          "?q={},{}&units=IMPERIAL&mode=json&APPID={}".format(city_name,Country_id,api_keys)
    response=requests.get(api_url).json()
    print(api_url)
    try:
        df_countries_1.set_value(index,'Latitude',response['city']['coord']['lat'])
        df_countries_1.set_value(index,'Longitude',response['city']['coord']['lon'])
        df_countries_1.set_value(index,'Temperature (F)',response['list'][0]['main']['temp'])
        df_countries_1.set_value(index,'Humidity (%)',response['list'][0]['main']['humidity'])
        df_countries_1.set_value(index,'Cloudiness (%)',response['list'][0]['clouds']['all'])
        df_countries_1.set_value(index,'Wind Speed (mph)',response['list'][0]['wind']['speed'])
        
    except KeyError:
        df_countries_1.set_value(index,'Latitude',np.nan)
        df_countries_1.set_value(index,'Longitude',np.nan)
        df_countries_1.set_value(index,'Temperature (F)',np.nan)
        df_countries_1.set_value(index,'Humidity (%)',np.nan)
        df_countries_1.set_value(index,'Cloudiness (%)',np.nan)
        df_countries_1.set_value(index,'Wind Speed (mph)',np.nan)
        print('Weather information is missing just..skip')

In [None]:
from pprint import pprint
city_name="London"
Country_id="uk"
api_url = "http://api.openweathermap.org/data/2.5/forecast" \
          "?q={},{}&units=metric&mode=json&APPID={}c".format(city_name,Country_id,api_keys)
response=requests.get(api_url).json()
pprint(response)

#df_countries["Temperature (F)"].append(response['list'][3])

In [None]:



df_countries.set_value(index,'Latitude',response['city']['coord']['lat'])



In [None]:
#loop through all rows and fill in values for blank columns in dataframe
for index,row in df_countries.iterrows():
    city_name=row['City']
    Country_id=row['Country']

    api_url = "http://api.openweathermap.org/data/2.5/forecast" \
          "?q={},{}&units=metric&mode=json&APPID={}".format(city_name,Country_id,api_keys)
    response=requests.get(api_url).json()
    print(api_url)
   
        df_countries["Temperature (F)"].append(response[])

In [None]:
#change all data recieved from openweathermap api to numerical data


In [None]:
#display dataframe df_countries after openweathermap api calls 


In [None]:
#print for the following cities associated with this dataframe ,{} cities did not contain weather information. These cities
#\t will be dropped from this dataframe.'.format(missing_weather_info)


In [None]:
#save out put file into csv file
df_countries = df_countries.dropna()
df_countries.to_csv(path_or_buf='df_countries.csv')
df_countries

In [None]:
df_countries_table = df_countries.copy()
df_countries_table['Latitude'] = pd.qcut(df_countries['Latitude'],11,precision=0)

In [None]:
cm = sns.light_palette('pink',as_cmap=True)

df_countries_table.groupby(['Latitude'])['Temperature (F)'].mean().reset_index().style.background_gradient(cmap=cm)

In [None]:
#plot the following plots Temperature (F) vs. Lat, Humidity (%) vs. Lat, Cloudiness (%) vs. Lat, and 
#Wind Speed (mph) vs. Lat

