Hints and Considerations

* The city data you generate is based on random coordinates as well as different query times; as such, your outputs will not be an exact match to the provided starter notebook.

* You may want to start this assignment by refreshing yourself on the [geographic coordinate system](http://desktop.arcgis.com/en/arcmap/10.3/guide-books/map-projections/about-geographic-coordinate-systems.htm).

* Next, spend the time necessary to study the OpenWeatherMap API. Based on your initial study, you should be able to answer basic questions about the API: Where do you request the API key? Which Weather API in particular will you need? What URL endpoints does it expect? Which JSON structure does it respond with? Before you write a line of code, you should aim for a crystal clear understanding of your intended outcome.

* Starter code for Citipy has been provided. However, if you're craving an extra challenge, push yourself to learn how it works: [citipy Python library](https://pypi.python.org/pypi/citipy). Before you try to incorporate the library into your analysis, start by creating simple test cases outside your main script to confirm that you are using it correctly. Too often, when introduced to a new library, students get bogged down by the most minor of errors -- spending hours investigating their entire code -- when, in fact, a simple and focused test would have shown their basic utilization of the library was wrong from the start. Don't let this be you!

* Part of our expectation in this challenge is that you will use critical thinking skills to understand how and why we're recommending the tools we are. What is Citipy for? Why would you use it in conjunction with the OpenWeatherMap API? How would you do so?

* In building your script, pay attention to the cities you are using in your query pool. Are you getting coverage of the full gamut of latitudes and longitudes? Or are you simply choosing 500 cities concentrated in one region of the world? Even if you were a geographic genius, simply rattling off 500 cities based on your human selection would create a biased dataset. Think about how you should counter this. (Hint: Consider the full range of latitudes.)

* Remember that each coordinate will trigger a separate call to the Google API. If you're creating your own criteria to plan your vacation, try to reduce the results in your DataFrame to 10 or fewer cities.

* Lastly, remember -- this is a challenging activity. Push yourself! If you complete this task, then you can safely say that you've gained a strong mastery of the core foundations of data analytics and it will only go better from here. Good luck!

Your final notebook must:

* Randomly select **at least** 500 unique (non-repeat) cities based on latitude and longitude.
* Obtain the weather from each city using the OpenWeatherMap API.
* Include a print log of each city as it's being processed, with the city number and city name.
* Save a CSV of all retrieved data and a PNG image for each scatter plot.

In [1]:
# Dependencies and Setup
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import requests
import time
from scipy.stats import linregress
import json

# Import API key
from config import weather_api_key

# Incorporated citipy to determine city based on latitude and longitude
from citipy import citipy

# Output File (CSV)
output_data_file = "output_data/cities.csv"

# Range of latitudes and longitudes
lat_range = (-90, 90)
lng_range = (-180, 180)

Generate Cities List

In [2]:
# List for holding lat_lngs and cities
lat_lngs = []
cities = []

# Create a set of random lat and lng combinations
lats = np.random.uniform(lat_range[0], lat_range[1], size=1500)
lngs = np.random.uniform(lng_range[0], lng_range[1], size=1500)
lat_lngs = zip(lats, lngs)

# Identify nearest city for each lat, lng combination
for lat_lng in lat_lngs:
    city = citipy.nearest_city(lat_lng[0], lat_lng[1]).city_name
    
    # If the city is unique, then add it to our cities list
    if city not in cities:
        cities.append(city)

# Print the city count to confirm sufficient count
len(cities)
#print(cities)
#set(zip(lats, lngs))


604
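
The hint about covering the full range of latitudes can be checked directly. The following is a minimal sketch, not part of the starter code: it counts how many sampled latitudes fall into each 30-degree band. The helper name `latitude_band_counts` and the band width are assumptions made for illustration, and the sample uses the standard library's `random.uniform` in place of `np.random.uniform`.

```python
import random
from collections import Counter


def latitude_band_counts(lats, band_width=30):
    """Count how many latitudes fall in each band of `band_width` degrees."""
    n_bands = 180 // band_width
    counts = Counter()
    for lat in lats:
        # Shift [-90, 90] to [0, 180]; clamp so lat == 90 lands in the last band
        index = min(int((lat + 90) // band_width), n_bands - 1)
        low = -90 + index * band_width
        counts[(low, low + band_width)] += 1
    return dict(sorted(counts.items()))


random.seed(42)  # fixed seed only so the sketch is repeatable
sample_lats = [random.uniform(-90, 90) for _ in range(1500)]
band_counts = latitude_band_counts(sample_lats)

for (low, high), count in band_counts.items():
    print(f"{low:>4} to {high:>3}: {count}")
```

The same function could be applied to the latitudes returned by the weather API, since the nearest city to a random coordinate may sit in a different band than the coordinate itself.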

Perform API Calls
Perform a weather check on each city using a series of successive API calls.
Include a print log of each city as it's being processed (with the city number and city name).
Note: the readings will be taken at different local times of day, depending on each city's longitude. The response does include a timezone field. Historical data (e.g. for the last 5 days) would give a fairer comparison, but it is a lot more data, and longer-term history costs money. One alternative would be to group the calls by timezone and query each group at, say, local noon until the set is complete.
OpenWeatherMap uses Unix time and the UTC/GMT time zone for all API calls, including current weather, forecast and historical data.
Endpoint: api.openweathermap.org/data/2.5/weather?q={city name}&appid={your api key}
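
City names returned by citipy can contain spaces, so it may help to URL-encode the query parameters rather than concatenate raw strings. The sketch below assumes a hypothetical helper named `build_query_url` and a placeholder key `YOUR_API_KEY`; it only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.openweathermap.org/data/2.5/weather"


def build_query_url(city, api_key, units="metric"):
    """Return the current-weather URL for one city, with parameters URL-encoded."""
    params = {"q": city, "appid": api_key, "units": units}
    return f"{BASE_URL}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder, not a real key
print(build_query_url("punta arenas", "YOUR_API_KEY"))
```

An equivalent option is to pass the same dictionary to `requests.get` through its `params` argument, which performs the encoding itself.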

In [3]:
url = "http://api.openweathermap.org/data/2.5/weather?q="

In [21]:
#test_weather_data = []
test_cities = {"Bremen", "Boston", "Bilbao", "Quito", "Auckland"}

lon = []
lat = []
#weather = []
#weather_desc = []
temp_C = []
#apparent_C = []
#temp_min_C = []
#temp_max_C = []
#humidity = []
#wind_speed = []
#wind_gust = []
#rain_1h = []
#clouds = []
#country = []
#timezone = []

units = "metric"
for city in test_cities:
    query_url = url + city + "&appid=" + weather_api_key + "&units=" + units
    current_weather = requests.get(query_url).json()
    lat.append(current_weather['coord']['lat'])
    lon.append(current_weather['coord']['lon'])
    # 'weather' is a list of condition objects, so index its first entry
    #weather.append(current_weather['weather'][0]['main'])
    #weather_desc.append(current_weather['weather'][0]['description'])
    temp_C.append(current_weather['main']['temp'])
    #apparent_C.append(current_weather['main']['feels_like'])
    #temp_min_C.append(current_weather['main']['temp_min'])
    #temp_max_C.append(current_weather['main']['temp_max'])
    #humidity.append(current_weather['main']['humidity'])
    #wind_speed.append(current_weather['wind']['speed'])
    # 'gust' and 'rain' may be missing from some responses; guard with .get() before enabling
    #wind_gust.append(current_weather['wind']['gust'])
    #rain_1h.append(current_weather['rain']['1h'])
    #clouds.append(current_weather['clouds']['all'])
    #country.append(current_weather['sys']['country'])
    #timezone.append(current_weather['timezone'])
    #print(current_weather)
    #test_weather_data.append(current_weather)
print(lon)

[174.77, 8.81, -78.52, -2.93, -71.06]
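
Some cities will not be found, and some fields are not present in every response, so direct indexing can stop the loop partway through. The sketch below shows one possible way to flatten a response and print the required log line. The helper `extract_record` and both sample payloads are hypothetical (the values are made up for illustration); the key names follow the ones already used in the cell above.

```python
def extract_record(payload):
    """Flatten one response dict into a record, or return None if it has no coordinates."""
    if "coord" not in payload:
        return None
    main = payload.get("main", {})
    wind = payload.get("wind", {})
    return {
        "city": payload.get("name"),
        "lat": payload["coord"]["lat"],
        "lon": payload["coord"]["lon"],
        "temp_C": main.get("temp"),
        "humidity": main.get("humidity"),
        "clouds": payload.get("clouds", {}).get("all"),
        "wind_speed": wind.get("speed"),
        "wind_gust": wind.get("gust"),  # None when no gust is reported
    }


# Hypothetical payloads: one successful response, one "not found" response
sample_payloads = [
    {"name": "Bremen", "coord": {"lat": 53.08, "lon": 8.81},
     "main": {"temp": 12.3, "humidity": 71},
     "clouds": {"all": 40}, "wind": {"speed": 4.1}},
    {"cod": "404", "message": "city not found"},
]

records = []
for number, payload in enumerate(sample_payloads, start=1):
    record = extract_record(payload)
    if record is None:
        print(f"Record {number}: no data returned, skipping")
        continue
    print(f"Record {number}: {record['city']}")
    records.append(record)
```

Collecting one dictionary per city, instead of many parallel lists, keeps every column the same length even when a city is skipped.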


Convert Raw Data to DataFrame
Export the city data into a .csv.
Display the DataFrame

In [None]:
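# Each column needs one entry per queried city, so the lists must be equal in length.
# The loop above currently runs over test_cities; switch to cities for the full run.
weather_dict = {
    "city": list(test_cities),
    "lat": lat,
    "temp": temp_C
}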
weather_data = pd.DataFrame(weather_dict)
weather_data.head()

Inspect the data and remove the cities where the humidity > 100%.
Skip this step if there are no cities that have humidity > 100%.
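
The humidity check amounts to a simple filter. Here is a minimal sketch on plain Python records; the function name `split_by_humidity`, the `humidity` key and the sample values are all assumptions for illustration, not data from the API.

```python
def split_by_humidity(records, limit=100):
    """Return (kept, removed_cities): records at or below `limit` percent, and the names of the rest."""
    kept = [r for r in records if r["humidity"] <= limit]
    removed_cities = [r["city"] for r in records if r["humidity"] > limit]
    return kept, removed_cities


# Hypothetical records; "Testville" is an invented city with an impossible reading
sample_records = [
    {"city": "Quito", "humidity": 82},
    {"city": "Bilbao", "humidity": 100},
    {"city": "Testville", "humidity": 104},
]

kept, removed_cities = split_by_humidity(sample_records)
print(f"Kept {len(kept)} cities, removed: {removed_cities}")
```

On a DataFrame with a `humidity` column, the equivalent would be a boolean mask such as `weather_data[weather_data["humidity"] <= 100]`.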

Plotting the Data
Use proper labeling of the plots using plot titles (including date of analysis) and axes labels.
Save the plotted figures as .pngs.

Temperature (F) vs. Latitude (Max temp? No, the API call just supplies the current temperature)

Humidity (%) vs. Latitude

Cloudiness (%) vs. Latitude

Wind Speed (mph) vs. Latitude

Linear Regression
#OPTIONAL: Create a function to create Linear Regression plots
**Optional:** Since you're creating multiple linear regression plots, you could do this in a function. To optimize 
your code, write a function that creates the linear regression plots based on parameters you provide. 
Again, this step is **optional**. 

#Create Northern and Southern Hemisphere DataFrames

#After each pair of plots (i.e., northern and southern hemispheres) explain what the linear regression is modeling, 
comment on any relationships you notice, and include any other analysis you may have.
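
The optional regression helper and the hemisphere split could look roughly like the sketch below. It computes the least-squares line by hand so that the arithmetic is visible; in the notebook, `scipy.stats.linregress` (already imported) returns the same slope, intercept and r value. The function name `linear_fit` and the observation pairs are hypothetical.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Return (slope, intercept, r) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical (latitude, temperature) pairs, not real API data
observations = [(-40.0, 12.0), (-20.0, 22.0), (-5.0, 27.0),
                (10.0, 28.0), (35.0, 18.0), (60.0, 3.0)]

# Split at the equator: latitude >= 0 is treated as northern
northern = [(lat, temp) for lat, temp in observations if lat >= 0]
southern = [(lat, temp) for lat, temp in observations if lat < 0]

for label, group in (("Northern", northern), ("Southern", southern)):
    lats, temps = zip(*group)
    slope, intercept, r = linear_fit(lats, temps)
    print(f"{label}: temp = {slope:.2f} * lat + {intercept:.2f} (r = {r:.2f})")
```

A slope of opposite sign in the two hemispheres is what one would expect if temperature falls with distance from the equator, and r indicates how closely the points follow the line.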

Northern Hemisphere - Max Temp (F) vs. Latitude Linear Regression

Southern Hemisphere - Max Temp (F) vs. Latitude Linear Regression

Northern Hemisphere - Humidity (%) vs. Latitude Linear Regression

Southern Hemisphere - Humidity (%) vs. Latitude Linear Regression

Northern Hemisphere - Cloudiness (%) vs. Latitude Linear Regression

Southern Hemisphere - Cloudiness (%) vs. Latitude Linear Regression

Northern Hemisphere - Wind Speed (mph) vs. Latitude Linear Regression

Southern Hemisphere - Wind Speed (mph) vs. Latitude Linear Regression