# WeatherPy
----
## Observations

Latitude appears to correlate with temperature: the closer a city is to the equator, the higher the temperature. As you move away from the equator - either north or south - the temperature decreases.

Latitude does not appear to affect cloudiness. Across the full spectrum of latitudes, some cities report 100% cloudiness while others report 0%.

The majority of cities have wind speeds below 20 mph, and wind speed does not appear to increase or decrease as one moves north or south in latitude.

There are two outliers with humidity above 100% (Puerto Maldonado, Peru and Paita, Peru); every other city reports 100% or less. Humidity is generally highest (between 50% and 100%) at far northern and southern latitudes and near the equator, and lower around 20 degrees south and between 20 and 40 degrees north.

----
#### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies
import numpy as np
import pandas as pd
import requests
import time
from scipy.stats import linregress
from matplotlib import pyplot as plt
from api_keys import weather_api_key
from citipy import citipy


# Output File (CSV)
output_data_file = "output_data/cities.csv"

# Range of latitudes and longitudes
lat_range = (-90, 90)
lng_range = (-180, 180)


## Generate Cities List

In [None]:
# List for holding lat_lngs and cities
lat_lngs = []
cities = []

# Create a set of random lat and lng combinations
lats = np.random.uniform(low=-90.000, high=90.000, size=1000)
lngs = np.random.uniform(low=-180.000, high=180.000, size=1000)
lat_lngs = zip(lats, lngs) 

# Identify nearest city for each lat, lng combination
for lat_lng in lat_lngs:
    city = citipy.nearest_city(lat_lng[0], lat_lng[1]).city_name
    
    # Encode spaces as %20 so the URL doesn't break, then add the city if it is not already in our list
    city = city.replace(" ", "%20")
    if city not in cities:
        cities.append(city)

# Print the city count to confirm sufficient count
len(cities)
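
The manual space replacement in the cell above can also be handled by the standard library. The following is a minimal sketch rather than the notebook's own method: `encode_city` and `unique_encoded` are hypothetical helper names, the sample names are only illustrative, and it assumes city names need nothing more than percent-encoding before being appended to the query URL.

```python
from urllib.parse import quote


def encode_city(name):
    """Percent-encode a city name for use in a URL query (a space becomes %20)."""
    return quote(name)


def unique_encoded(names):
    """Return encoded names with duplicates removed, keeping first-seen order."""
    # dict.fromkeys preserves insertion order, so it works as an ordered set
    return list(dict.fromkeys(encode_city(n) for n in names))


# Illustrative input only; the notebook builds its list from citipy lookups
sample = ["puerto ayora", "dunedin", "puerto ayora", "east london"]
print(unique_encoded(sample))
```

Because each name is encoded before the duplicate check, a multi-word city cannot be added to the list twice.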

### Perform API Calls
* Perform a weather check on each city using a series of successive API calls.
* Include a print log of each city as it's being processed (with the city number and city name).


In [30]:
# Save config information.
url = "http://api.openweathermap.org/data/2.5/weather?"
units = "imperial"  # "imperial" returns Fahrenheit and mph

# Build partial query URL
query_url = f"{url}appid={weather_api_key}&units={units}&q="

print(query_url+cities[0])
# Set up response info
city_name = []
cloudiness = []
country = []
date = []
humidity = []
lat = []
lng = []
max_temp = []
wind_speed = []

record = 1

print('Beginning Data Retrieval')
print('-----------------------------')
# Loop through the list of cities and perform a request for data on each
for city in cities:
    try:
        response = requests.get(f'{query_url}{city}').json()
        city_name.append(response["name"])
        cloudiness.append(response["clouds"]["all"])
        country.append(response["sys"]["country"])
        date.append(response["dt"])
        humidity.append(response["main"]["humidity"])
        lat.append(response["coord"]["lat"])
        lng.append(response["coord"]["lon"])
        max_temp.append(response["main"]["temp_max"])
        wind_speed.append(response["wind"]["speed"])
        print(f'Processing Record {record} | {city}')
        record = record + 1
    except KeyError:
        print('City not found. Skipping...')
    time.sleep(1)

http://api.openweathermap.org/data/2.5/weather?appid=b99de8080f7598e3d09f80123d76496c&units=imperical&q=imbituba
Beginning Data Retrieval
-----------------------------
Processing Record 1 | imbituba
Processing Record 2 | asau
Processing Record 3 | gornopravdinsk
Processing Record 4 | puerto%20ayora
Processing Record 5 | stokmarknes
Processing Record 6 | atuona
Processing Record 7 | qaanaaq
City not found. Skipping...
Processing Record 8 | gigmoto
Processing Record 9 | takaka
Processing Record 10 | narsaq
Processing Record 11 | butaritari
Processing Record 12 | berga
Processing Record 13 | yellowknife
Processing Record 14 | nouadhibou
Processing Record 15 | busselton
Processing Record 16 | tsimlyansk
Processing Record 17 | srednekolymsk
Processing Record 18 | adrar
Processing Record 19 | hithadhoo
Processing Record 20 | provideniya
Processing Record 21 | jamestown
Processing Record 22 | arraial%20do%20cabo
Processing Record 23 | east%20london
Processing Record 24 | dunedin
City not foun
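
The loop above appends to nine lists inside a single `try` block, so a response that lacks one of the later fields could leave the lists with different lengths. One way to avoid that, sketched here under stated assumptions, is to extract every field first and only record the row once all of them are present. `extract_record` is a hypothetical helper, both sample payloads are made-up values for illustration rather than real API output, and the shape of the not-found payload is an assumption.

```python
def extract_record(payload):
    """Return a flat dict of the fields used in this notebook, or None if any is missing."""
    try:
        return {
            "City": payload["name"],
            "Cloudiness": payload["clouds"]["all"],
            "Country": payload["sys"]["country"],
            "Date": payload["dt"],
            "Humidity": payload["main"]["humidity"],
            "Lat": payload["coord"]["lat"],
            "Lng": payload["coord"]["lon"],
            "Max Temp": payload["main"]["temp_max"],
            "Wind Speed": payload["wind"]["speed"],
        }
    except (KeyError, TypeError):
        # A missing key or an unexpected structure means the row is skipped whole
        return None


# Made-up payloads for illustration only
found = {
    "name": "Example City",
    "dt": 1612000000,
    "clouds": {"all": 40},
    "sys": {"country": "XX"},
    "main": {"humidity": 70, "temp_max": 55.4},
    "coord": {"lat": -10.0, "lon": 20.0},
    "wind": {"speed": 5.8},
}
not_found = {"cod": "404", "message": "city not found"}  # assumed error shape

rows = [r for r in (extract_record(p) for p in (found, not_found)) if r is not None]
print(len(rows), rows[0]["City"])
```

A list of dicts like `rows` can be passed directly to `pd.DataFrame(rows)`, which keeps every column the same length by construction.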

### Convert Raw Data to DataFrame
* Export the city data into a .csv.
* Display the DataFrame

In [None]:
city_data = pd.DataFrame({"City": city_name,
                         "Cloudiness": cloudiness,
                         "Country": country, 
                         "Date": date,
                         "Humidity": humidity,
                         "Lat": lat,
                         "Lng": lng,
                         "Max Temp": max_temp,
                         "Wind Speed": wind_speed})

city_data.to_csv("city_weather_data.csv")
city_data.count()

In [None]:
city_data.head()

## Inspect the data and remove the cities where the humidity > 100%.
----
Skip this step if there are no cities that have humidity > 100%. 

In [None]:
# Keep cities with humidity up to and including 100%; only values above 100% are outliers
city_data_H_under100 = city_data.loc[city_data["Humidity"] <= 100]
city_data_H_under100

In [None]:
#  Get the indices of cities that have humidity over 100%.
city_data_H_above100 = city_data.loc[city_data["Humidity"]>100]
city_data_H_above100.index


In [None]:
# Make a new DataFrame from the city data by dropping all humidity outliers by index.
# Passing "inplace=False" returns a copy and leaves the city_data DataFrame unchanged; we call the copy "clean_city_data".
clean_city_data = city_data.drop(city_data[city_data["Humidity"] > 100].index, inplace=False)
clean_city_data

## Plotting the Data
* Use proper labeling of the plots using plot titles (including date of analysis) and axes labels.
* Save the plotted figures as .pngs.
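
Rather than typing the analysis date into each plot title by hand, the date could be derived from the `dt` field that the API returns. This is a minimal sketch, assuming `dt` is a Unix timestamp in seconds and that a UTC date is acceptable; `analysis_date` is a hypothetical helper and the timestamp below is an arbitrary example value, not one taken from the retrieved data.

```python
from datetime import datetime, timezone


def analysis_date(unix_seconds):
    """Format a Unix timestamp (seconds) as a DD/MM/YYYY date in UTC."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).strftime("%d/%m/%Y")


# Arbitrary example timestamp; in the notebook this could come from the "Date" column
example_dt = 1612000000
title = f"City Latitude vs. Max Temperature ({analysis_date(example_dt)})"
print(title)
```

Building every title from the same helper keeps the date format and year consistent across all four plots.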

## Latitude vs. Temperature Plot

In [None]:
# Plotting the data for Latitude vs Max Temp
plt.figure(figsize=(10,8))
plt.scatter(clean_city_data['Lat'], clean_city_data['Max Temp'], edgecolor="black", linewidth=.75)
plt.title("City Latitude vs. Max Temperature (30/01/2021)", fontsize=16)
plt.xlabel("Latitude", fontsize=14)
plt.ylabel("Max Temperature (F)", fontsize=14)
plt.ylim(0, 120)
plt.savefig("Lat vs Temp.png")

plt.show()

## Latitude vs. Humidity Plot

In [None]:
plt.scatter(clean_city_data["Lat"], clean_city_data["Humidity"], edgecolor="black", linewidth = .75, color="mediumpurple")
plt.title("City Latitude vs Humidity (30/01/2021)")
plt.xlabel("Latitude")
plt.ylabel("Humidity (%)")
plt.grid()

plt.savefig("Lat vs Humidity.png")
plt.show()

## Latitude vs. Cloudiness Plot

In [None]:
plt.scatter(clean_city_data["Lat"], clean_city_data["Cloudiness"], edgecolor="black", linewidth = .75, color="salmon")
plt.title("City Latitude vs Cloudiness (30/01/2021)")
plt.xlabel("Latitude")
plt.ylabel("Cloudiness (%)")
plt.grid()

plt.savefig("Lat vs Cloudiness.png")
plt.show()

## Latitude vs. Wind Speed Plot

In [None]:
plt.scatter(clean_city_data["Lat"], clean_city_data["Wind Speed"], edgecolor="black", linewidth = .75, color="mediumaquamarine")
plt.title("City Latitude vs Wind Speed (30/01/2021)")
plt.xlabel("Latitude")
plt.ylabel("Wind Speed (mph)")
plt.grid()

plt.savefig("Lat vs Wind.png")
plt.show()

## Linear Regression
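
The cells in this section rely on `scipy.stats.linregress`. As a rough illustration of what an ordinary least-squares fit of that kind computes, here is a dependency-free sketch. `simple_linregress` is a hypothetical name, the input points are toy values rather than notebook data, and it returns only the slope, intercept, and correlation coefficient.

```python
import math


def simple_linregress(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, plus Pearson's r."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of squared deviations and of cross-deviations from the means
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Toy data: temperature falling steadily as latitude rises
slope, intercept, r = simple_linregress([0, 10, 20, 30], [80, 70, 60, 50])
line_eq = f"y = {slope:.2f}x {intercept:+.2f}"
print(line_eq)
print(f"The r-squared is: {r ** 2:.2f}")
```

The `+` in the format spec prints the intercept's sign explicitly, so a negative intercept does not render as `+ -3.2` in the annotation.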

In [None]:
# Split the data into Northern and Southern Hemisphere subsets using latitude
clean_city_data_Nth = clean_city_data.loc[clean_city_data["Lat"] > 0]
clean_city_data_Sth = clean_city_data.loc[clean_city_data["Lat"] < 0]

x_values = clean_city_data_Nth['Lat']
y_values = clean_city_data_Nth['Max Temp']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(30,30),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Max Temperature (F)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Northern Hemisphere - Max Temp vs. Latitude Linear Regression

In [None]:
x_values = clean_city_data_Sth['Lat']
y_values = clean_city_data_Sth['Max Temp']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(-40,20),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Max Temperature (F)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Southern Hemisphere - Max Temp vs. Latitude Linear Regression

In [None]:
x_values = clean_city_data_Nth['Lat']
y_values = clean_city_data_Nth['Humidity']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(0,50),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Humidity (%)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Northern Hemisphere - Humidity (%) vs. Latitude Linear Regression

In [None]:
x_values = clean_city_data_Sth['Lat']
y_values = clean_city_data_Sth['Humidity']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(-45,65),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Humidity (%)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Southern Hemisphere - Humidity (%) vs. Latitude Linear Regression

In [None]:
x_values = clean_city_data_Nth['Lat']
y_values = clean_city_data_Nth['Cloudiness']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(20,50),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Cloudiness (%)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Northern Hemisphere - Cloudiness (%) vs. Latitude Linear Regression

In [None]:
x_values = clean_city_data_Sth['Lat']
y_values = clean_city_data_Sth['Cloudiness']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(-50,50),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Cloudiness (%)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Southern Hemisphere - Cloudiness (%) vs. Latitude Linear Regression

In [None]:
x_values = clean_city_data_Nth['Lat']
y_values = clean_city_data_Nth['Wind Speed']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(5,5),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Wind Speed (mph)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Northern Hemisphere - Wind Speed (mph) vs. Latitude Linear Regression

In [None]:
x_values = clean_city_data_Sth['Lat']
y_values = clean_city_data_Sth['Wind Speed']
(slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values,y_values)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(-40,8),fontsize=15,color="red")
plt.xlabel('Latitude')
plt.ylabel('Wind Speed (mph)')
print(f"The r-squared is: {rvalue**2}")
plt.show()

####  Southern Hemisphere - Wind Speed (mph) vs. Latitude Linear Regression