![Equator](equatorsign.png)

In this example, you'll be creating a Python script to visualize the weather of 500+ cities across the world of varying distance from the equator. To accomplish this, you'll be utilizing a [simple Python library](https://pypi.python.org/pypi/citipy), the [OpenWeatherMap API](https://openweathermap.org/api), and a little common sense to create a representative model of weather across world cities.

Your objective is to build a series of scatter plots to showcase the following relationships:

* Temperature (F) vs. Latitude
* Humidity (%) vs. Latitude
* Cloudiness (%) vs. Latitude
* Wind Speed (mph) vs. Latitude

Your final notebook must:

* Randomly select **at least** 500 unique (non-repeat) cities based on latitude and longitude.
* Perform a weather check on each of the cities using a series of successive API calls. 
* Include a print log of each city as it's being processed with the city number, city name, and requested URL.
* Save both a CSV of all data retrieved and png images for each scatter plot.

As final considerations:

* You must use the Matplotlib and Seaborn libraries.
* You must include a written description of three observable trends based on the data. 
* You must use proper labeling of your plots, including aspects like: Plot Titles (with date of analysis) and Axes Labels.
* You must include an exported markdown version of your Notebook called  `README.md` in your GitHub repository.  
* See [Example Solution](WeatherPy_Example.pdf) for a reference on expected format. 

## Hints and Considerations

* You may want to start this assignment by refreshing yourself on 4th grade geography, in particular, the [geographic coordinate system](http://desktop.arcgis.com/en/arcmap/10.3/guide-books/map-projections/about-geographic-coordinate-systems.htm). 

* Next, spend the requisite time necessary to study the OpenWeatherMap API. Based on your initial study, you should be able to answer  basic questions about the API: Where do you request the API key? Which Weather API in particular will you need? What URL endpoints does it expect? What JSON structure does it respond with? Before you write a line of code, you should be aiming to have a crystal clear understanding of your intended outcome.

* Though we've never worked with the [citipy Python library](https://pypi.python.org/pypi/citipy), push yourself to decipher how it works, and why it might be relevant. Before you try to incorporate the library into your analysis, start by creating simple test cases outside your main script to confirm that you are using it correctly. Too often, when introduced to a new library, students get bogged down by the most minor of errors -- spending hours investigating their entire code -- when, in fact, a simple and focused test would have shown their basic utilization of the library was wrong from the start. Don't let this be you!

* Part of our expectation in this challenge is that you will use critical thinking skills to understand how and why we're recommending the tools we are. What is Citipy for? Why would you use it in conjunction with the OpenWeatherMap API? How would you do so?

* In building your script, pay attention to the cities you are using in your query pool. Are you getting coverage of the full gamut of latitudes and longitudes? Or are you simply choosing 500 cities concentrated in one region of the world? Even if you were a geographic genius, simply rattling 500 cities based on your human selection would create a biased dataset. Be thinking of how you should counter this. (Hint: Consider the full range of latitudes).

* Lastly, remember -- this is a challenging activity. Push yourself! If you complete this task, then you can safely say that you've gained a strong mastery of the core foundations of data analytics and it will only go better from here. Good luck!

In [41]:
%matplotlib notebook

In [42]:
import csv
import numpy as np
import pandas as pd

import logging

import matplotlib.pyplot as plt
import requests
import pandas as pd
from citipy import citipy
import json
from config import weather_api_key
import time
from datetime import datetime
import seaborn as sns

sns.set()

np.random.seed()

In [43]:
logging.basicConfig(filename ="api_calls.log", 
                    level = logging.INFO, 
                    format = '%(asctime)s %(levelname)-8s %(message)s' )

In [44]:
# Output File (CSV)
output_data_file = "output_data/cities.csv"

# Range of latitudes and longitudes
lat_range = (-90, 90)
lng_range = (-180, 180)

In [45]:
np.random.seed()

#creat lat_lngs and cities list
data = pd.DataFrame(columns =['Lat',"Lng","City","Temperature","Humidity","Clouds","Wind Speed","Date"])
cities = []
lat_lngs = []


#lats (-90, 90) lngs (-180, 180)
lats = np.random.randint(-90, 90, size=700)
lngs = np.random.randint(-180, 180, size=700)
lat_lngs = (lats, lngs)

data['Lat']=lats
data['Lng']=lngs
data.head()

# coords = pd.DataFrame({
#     "latitude": lats,
#     "longitude": lngs
# })


Unnamed: 0,Lat,Lng,City,Temperature,Humidity,Clouds,Wind Speed,Date
0,39,-155,,,,,,
1,84,36,,,,,,
2,-43,97,,,,,,
3,-3,-44,,,,,,
4,0,146,,,,,,


In [46]:
#find the cities related to the coordinates
cities = []
for index, row in data.iterrows():
    city=citipy.nearest_city(row["Lat"],row["Lng"])
    cities.append(city.city_name)
data['City']=cities
data.head()

Unnamed: 0,Lat,Lng,City,Temperature,Humidity,Clouds,Wind Speed,Date
0,39,-155,kapaa,,,,,
1,84,36,vardo,,,,,
2,-43,97,busselton,,,,,
3,-3,-44,rosario,,,,,
4,0,146,lorengau,,,,,


In [47]:
#check for and remove duplicates and make sure there are atleast 500 unique cities to test
new_data = data.drop_duplicates("City",keep="first")
len(new_data)

287

In [48]:
# Create an "extracts" object to get the temperature, humidity, cloudiness and wind speed
latitude = []
longitude = []
temperature = []
humidity = []
cloudiness = []
wind_speed = []
dates = []


counter = 0
url = "http://api.openweathermap.org/data/2.5/weather?"
units = "Imperial"

# Build partial query URL
query_url = f"{url}appid={weather_api_key}&units={units}&q="

In [54]:
#set variable for city_data to hold all data from the API Request

#loop through cities and request weather information for each city
for index, row in new_data.iterrows():
    counter +=1
    city = row["City"]
    time.sleep(0.25)
    response = requests.get(f'{city_url}{city}').json()
    # Some of the cities we generate don't have data in openweathermap, so set their values to numpy's NaN
    try:
        temperature.append(response['main']['temp_max'])
        latitude.append(response['coord']['lat'])
        longitude.append(response['coord']['lon'])
        humidity.append(response['main']['humidity'])
        wind_speed.append(response['wind']['speed'])
        dates.append(response['dt'])
    except KeyError:
        temperature.append(np.nan)
        latitude.append(np.nan)
        longitude.append(np.nan)
        humidity.append(np.nan)
        wind_speed.append(np.nan)
        dates.append(np.nan)
    
    # Sometimes it's not cloudy! Then 'clouds' does not exist, so set it to zero.
    try:
        cloudiness.append(response['clouds']['all'])
    except KeyError:
        cloudiness.append(0)
#     city_name = row["City"]
#     city_lat = response["coord"]["lat"]
#     city_lng = response["coord"]["Lng"]
#     city_max_temp = response["main"]["temp_max"]
#     city_humidity = response["main"]["humidity"]
#     city_cloud = response["clouds"]["all"]
#     city_wind = response["wind"]["speed"]
        
#         #append dictionary to city_data
#         city_data.append({
#             "City": city_name, 
#             "Lat": city_lat, 
#             "Lng": city_lng, 
#             "Max Temp": city_max_temp, 
#             "Humidity": city_humidity,
#             "Wind Speed": city_wind
#         })
# Assemble everything into a data frame
weather_df = pd.DataFrame({"City": cities,
                           "Latitude": latitude,
                           "Longitude": longitude,
                           "Humidity": humidity,
                           "Max Temp": temperature,
                           "Cloudiness": cloudiness,
                           "Wind Speed": wind_speed,
                           "Date": dates,
                          })
weath_df.head()

ValueError: arrays must all be same length

In [55]:
# Remove any cities that have NaN values
weather_df = weather_df.dropna(how='any')

print(f"The data frame contains {len(weather_df['City'])} unique cities.")

The data frame contains 0 unique cities.


In [None]:
data.head()


In [None]:
#check size of city_data
len(city_data)