# WeatherPy
### Observations and Trends:
- The warmest temperatures can be found in the latitude band of -10&deg; to 40&deg;, relative to the equatorial line. There is an inverse, linear relationship between a city's latitude and its temperature. This is supported by the regression line, which shows temperature increasing as latitude decreases. The R-squared analysis shows only a moderate correlation. However, this is good enough, as ocean currents play a big part in weather patterns and are not accounted for in the analysis.
- The plot for Humidity vs. Latitude shows no correlation between latitude and humidity. However, in the Northern Hemisphere there is a weak relationship, with humidity decreasing the further a city is from the equator.
- Cities along the equatorial band (latitudes -20&deg; to 40&deg;) show a pattern of having the greatest measure of cloudiness.  
- There seems to be no discernible pattern when trying to correlate wind speed and latitude.

In [None]:
# Dependencies and Setup
import pandas as pd
import requests
import matplotlib.pyplot as plt
import numpy as np
import time
from scipy.stats import linregress
from datetime import datetime

# Determine city from latitude and longitude
from citipy import citipy

# Get api key from config file, an untracked file in gitignore
from config import api_key

# output_files = 'Output/cities.csv'

# Define range of latitudes and longitudes
lat_r=(-90, 90)
lng_r=(-180, 180)

#### Generate List of Cities

In [None]:
# Variable with an empty list to hold cities and lat_long
cities = []
lat_long = []

# Randomize creation of latitude & longitude. Sample size is set to 2000; change if needed
lat = np.random.uniform(low=-45.00, high=45.00, size=2000)
lng = np.random.uniform(low=-180, high = 180, size=2000)
lat_long = zip(lat, lng)

# Find nearest city with geo coordinates
for lat_lng in lat_long:
    city = citipy.nearest_city(lat_lng[0], lat_lng[1]).city_name
    if city not in cities:
        cities.append(city)

# Validate sufficient number of cities generated        
len(cities)
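
The membership test `city not in cities` scans the whole list on every iteration. As a minimal sketch of an alternative, assuming only that the city names are plain strings, a dictionary can drop duplicates while keeping first-seen order. The helper name `unique_in_order` and the sample names are hypothetical and are not part of the notebook or of citipy.

```python
def unique_in_order(names):
    """Return the names with duplicates removed, keeping first-seen order."""
    # dict preserves insertion order (Python 3.7+); a repeated key keeps its first position.
    return list(dict.fromkeys(names))


# Made-up city names standing in for the citipy results.
sample = ["lima", "hilo", "lima", "apia", "hilo"]
print(unique_in_order(sample))
```

With 2000 sampled coordinates the difference in speed is small, so this is mainly a readability choice.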

#### Perform API Calls

In [None]:
# Set up api endpoints
base_url = 'http://api.openweathermap.org/data/2.5/weather?'
units = 'imperial'

# Setup URL query
query_url = f'{base_url}appid={api_key}&units={units}&q='

# Dictionary of variables where data from api calls will be stored
city_data = {'City_Name':[], 'City_Lat':[], 'City_Long':[], 'Temperature':[], 'Humidity':[],
             'Cloudiness':[], 'Wind_Speed':[], 'Country':[], 'Date':[]}

In [None]:
# Loop over the cities, calling the OpenWeather API and storing each result in city_data.
city_ttls = len(cities)

print('Retrieving weather data')
print('-' * 30)

for r, city in enumerate(cities, start=1):
    print(f'Retrieving {city}, number {r} of {city_ttls}.')
    try:
        weather = requests.get(query_url + city).json()
        # Read every field first so a missing key cannot leave the lists with unequal lengths.
        record = {
            'City_Name': weather['name'],
            'City_Lat': weather['coord']['lat'],
            'City_Long': weather['coord']['lon'],
            'Temperature': weather['main']['temp'],
            'Humidity': weather['main']['humidity'],
            'Cloudiness': weather['clouds']['all'],
            'Wind_Speed': weather['wind']['speed'],
            'Country': weather['sys']['country'],
            'Date': weather['dt'],
        }
    except (KeyError, ValueError, requests.RequestException):
        print(f'Incomplete record for {city}. Skipping {city}.')
        continue

    for key, value in record.items():
        city_data[key].append(value)

    # Use timer to delay request to not exceed query limits.
    time.sleep(0.005)

print(f'Data retrieval completed.')
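
The field extraction in the loop above can be checked without calling the API. The following is a minimal sketch, assuming the response has the nested keys that the loop reads. The function name `parse_weather` and both sample payloads are hypothetical, and the values are made up for illustration.

```python
def parse_weather(payload):
    """Return one flat record from a weather payload, or None if a field is missing."""
    try:
        return {
            "City_Name": payload["name"],
            "City_Lat": payload["coord"]["lat"],
            "City_Long": payload["coord"]["lon"],
            "Temperature": payload["main"]["temp"],
            "Humidity": payload["main"]["humidity"],
            "Cloudiness": payload["clouds"]["all"],
            "Wind_Speed": payload["wind"]["speed"],
            "Country": payload["sys"]["country"],
            "Date": payload["dt"],
        }
    except (KeyError, TypeError):
        # A missing or malformed field means the record is incomplete.
        return None


# Hypothetical payload containing only the fields the loop reads.
sample = {
    "name": "Hilo",
    "coord": {"lat": 19.73, "lon": -155.09},
    "main": {"temp": 75.2, "humidity": 80},
    "clouds": {"all": 40},
    "wind": {"speed": 5.8},
    "sys": {"country": "US"},
    "dt": 1600000000,
}
record = parse_weather(sample)
print(record["City_Name"], record["Temperature"])

# Hypothetical error payload without weather fields, so it is skipped.
print(parse_weather({"cod": "404", "message": "city not found"}))
```

Building the whole record before storing it means an incomplete response adds nothing, which keeps every column the same length.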

In [None]:
# Create New Dataframe From City Data
city_data_df = pd.DataFrame(city_data)
# Convert Unix timestamps (seconds since the epoch) to datetime
city_data_df['Date'] = pd.to_datetime(city_data_df['Date'], unit = 's')
city_data_df.to_csv('Output/city_data.csv')
city_data_df.head()

### Plotting City Data
The code analyzes the relationship between city latitude and a weather variable. Data is taken from city_data_df, using 'o' as a marker to plot the city latitude and the selected weather variable as x,y coordinates. Matplotlib's colormap is used to map colors to values on a gradient, with the lowest values in blue and the highest in red. A color bar has been added to help match each color to its numerical range.
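
The four scatter plots in this section share the same structure, so the pattern can be sketched as one function. This is only an illustration under stated assumptions: `plot_vs_latitude` is a hypothetical helper that is not used elsewhere in the notebook, it takes plain sequences rather than DataFrame columns, and the sample numbers are made up.

```python
import matplotlib.pyplot as plt


def plot_vs_latitude(latitudes, values, label, cmap="coolwarm"):
    """Scatter one weather variable against latitude, colored by its own value."""
    fig, ax = plt.subplots(figsize=(8, 6))
    points = ax.scatter(latitudes, values, c=values, s=15, cmap=cmap, marker="o")
    ax.set_title(f"{label} vs. City Latitude", fontsize=14)
    ax.set_xlabel("Latitude", fontsize=12)
    ax.set_ylabel(label, fontsize=12)
    ax.grid(True)
    # The color bar maps each color back to the numeric range of the variable.
    fig.colorbar(points, ax=ax).set_label(label)
    return fig, ax


# Made-up latitudes and temperatures, for illustration only.
fig, ax = plot_vs_latitude([-10, 0, 25], [70.0, 85.0, 78.0], "Temperature (F)")
```

Returning the figure and axes leaves saving (`fig.savefig`) and showing the plot to the caller.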

#### Temperature vs City Latitude:
The plot shows that cities with the warmest temperatures are in the latitude range of 10&deg; to 40&deg;, while cities outside the latitude range of -20&deg; to 40&deg; are generally cooler.

In [None]:
# Plot relationship of Temperature (F) vs. Latitude
plt.figure(figsize=(8, 6))
plt.scatter(city_data_df['City_Lat'], city_data_df['Temperature'], 
           c =city_data_df['Temperature'], s=15, cmap= 'coolwarm', marker='o')

plt.title(f'Temperature vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Temperature (F)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

# Insert colorbar to indicate what info is being displayed
cbar= plt.colorbar()
cbar.set_label("temperature (F)", labelpad=+1)
plt.grid()
plt.savefig('Images/temperature-vs-latitude.png')
plt.show()

#### Humidity vs City Latitude:
The code analyzes the relationship between the city latitude and humidity. There doesn't seem to be a discernible pattern.

In [None]:
# Plot relationship of Humidity (%) vs. Latitude
plt.figure(figsize=(8, 6))
plt.scatter(city_data_df['City_Lat'], city_data_df['Humidity'], 
           c =city_data_df['Humidity'], s=15, cmap= 'coolwarm', marker='o')

plt.title('Humidity vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14)
plt.ylabel('Humidity (%)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)
cbar= plt.colorbar()
cbar.set_label("humidity (%)", labelpad=+1)
plt.grid(True)

plt.savefig('Images/humidity-vs-latitude.png')
plt.show()

#### Cloudiness vs City Latitude:
The code analyzes the relationship between the city latitude and cloudiness. There doesn't seem to be a discernible pattern.

In [None]:
# Plot relationship of Cloudiness (%) vs. Latitude
plt.figure(figsize=(8, 6))
plt.scatter(city_data_df['City_Lat'], city_data_df['Cloudiness'], 
           c =city_data_df['Cloudiness'], s=15, cmap= 'coolwarm', marker='o')

plt.title('Cloudiness (%) vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14)
plt.ylabel('Cloudiness (%)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)
cbar= plt.colorbar()
cbar.set_label("cloudiness (%)", labelpad=+1)
plt.grid()

plt.savefig('Images/cloudiness-vs-latitude.png')
plt.show()

#### Wind Speed vs. Latitude
The code analyzes the relationship between the city latitude and wind speed. There doesn't seem to be any discernible pattern.

In [None]:
# Plot relationship of Wind Speed (mph) vs. Latitude
plt.figure(figsize=(8, 6))
plt.scatter(city_data_df['City_Lat'], city_data_df['Wind_Speed'], 
           c =city_data_df['Wind_Speed'], s=15, cmap= 'coolwarm', marker='o')

plt.title('Wind Speed (mph) vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14)
plt.ylabel('Wind Speed (mph)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)
cbar= plt.colorbar()
cbar.set_label("wind speed (mph)", labelpad=+1)
plt.grid()

plt.savefig('Images/wind_speed-vs-latitude.png')
plt.show()

### Linear Regression


In [None]:
# Linear Regression plot function to be called in linear regression plots
def linear_regres(x, y):
    slope, intercept, rvalue, pvalue, stderror = linregress(x, y)
    print(f'R Squared is: {rvalue**2}')
    
    # R-squared is never negative, so the direction of the correlation comes from the sign of rvalue.
    if abs(rvalue) < 0.05:
        print('The r-value indicates a neutral correlation')
    elif rvalue < 0:
        print('The r-value indicates a negative correlation')
    else:
        print('The r-value indicates a positive correlation')
    
    regression_values = slope * x + intercept
    
    # Linear regression equation
    linear_eq = 'y = ' + str(round(slope,2)) + 'x + ' + str(round(intercept,2))
                                                          
    # Plot linear regression
    plt.plot(x, regression_values, color='maroon')
    
    # End linear regression function and return results
    return linear_eq
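
To make the slope, intercept and r-value used above concrete, here is a minimal sketch of an ordinary least-squares fit written with the standard library only. `least_squares` is a hypothetical helper, not SciPy's implementation, and it assumes that `x` has at least two distinct values and that `y` is not constant. The sample points are made up.

```python
from math import sqrt


def least_squares(x, y):
    """Return (slope, intercept, r) for a simple linear fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Made-up points lying exactly on y = -2x + 10.
slope, intercept, r = least_squares([0, 1, 2, 3], [10, 8, 6, 4])
print(slope, intercept, r ** 2)
```

The sign of `r` gives the direction of the relationship, while `r ** 2` only measures how closely the points follow the line. A perfectly fitting downward line therefore has an R-squared of 1 and a negative `r`.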

In [None]:
# Split the data into Northern (latitude >= 0) and Southern (latitude < 0) Hemisphere dataframes
northern_df = city_data_df.loc[city_data_df['City_Lat'] >= 0, :]
southern_df = city_data_df.loc[city_data_df['City_Lat'] < 0, :]

n_latitude = northern_df['City_Lat']
n_temperature = northern_df['Temperature']
n_humidity = northern_df['Humidity']
n_cloudiness = northern_df['Cloudiness']
n_windspeed = northern_df['Wind_Speed']

s_latitude = southern_df['City_Lat']
s_temperature = southern_df['Temperature']
s_humidity = southern_df['Humidity']
s_cloudiness = southern_df['Cloudiness']
s_windspeed = southern_df['Wind_Speed']
# print(n_latitude)  -- test if dict is valid

##### Northern Hemisphere - Temperature (F) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(n_latitude, n_temperature, 
           c =northern_df['Temperature'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(n_latitude, n_temperature)

plt.title(f'N. Hemisphere, Temp vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Temperature (F)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

plt.annotate(linear_eq,(5,50),fontsize=16, color='maroon')
cbar= plt.colorbar()
cbar.set_label("temperature (F)", labelpad=+1)
plt.grid()
plt.savefig('Images/NH_lin_regres_temp-vs-lat.png')
print("The regression line shows little correlation between city temperature and latitude, and the R-squared value shows that the data points do not fit the line well. However, the plot shows temperatures decreasing somewhat the further a city is from the equator in the Northern Hemisphere.")
plt.show()

##### Southern Hemisphere - Temperature (F) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(s_latitude, s_temperature, 
           c =southern_df['Temperature'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(s_latitude, s_temperature)

plt.title(f'S. Hemisphere, Temp vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Temperature (F)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

# Annotate linear regression equation on plot
plt.annotate(linear_eq,(-45,40),fontsize=16, color='maroon')

# Insert colorbar to indicate what info is being displayed
cbar= plt.colorbar()
cbar.set_label('temperature (F)', labelpad=+1)
plt.grid()
plt.savefig('Images/SH_lin_regres_temp-vs-lat.png')
print('In the Southern Hemisphere, the regression line shows a stronger correlation than in the Northern Hemisphere, with city temperatures increasing the closer they are to the equator. The R-squared value, at 55%, shows there is a relationship.')
plt.show()

##### Northern Hemisphere - Humidity (%) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(n_latitude, n_humidity, 
           c =northern_df['Humidity'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(n_latitude, n_humidity)

plt.title(f'N. Hemisphere, Humidity vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Humidity (%)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)


plt.annotate(linear_eq,(5,20),fontsize=16, color='maroon')
cbar= plt.colorbar()
cbar.set_label('humidity (%)', labelpad=+1)
plt.grid()
plt.savefig('Images/NH_lin_regres_humidity-vs-lat.png')
print('The regression line shows that there is a small relationship between humidity and city latitude. The further a city is from the equatorial line, the less humid it is.')
plt.show()

##### Southern Hemisphere - Humidity (%) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(s_latitude, s_humidity, 
           c =southern_df['Humidity'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(s_latitude, s_humidity)

plt.title(f'S. Hemisphere, Humidity vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Humidity (%)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

plt.annotate(linear_eq,(-45,45),fontsize=16, color='maroon')
cbar= plt.colorbar()
cbar.set_label('humidity (%)', labelpad=+1)
plt.grid()
plt.savefig('Images/SH_lin_regres_humidity-vs-lat.png')
print('The regression line shows that there is no relationship between humidity and latitude.')
plt.show()

##### Northern Hemisphere - Cloudiness (%) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(n_latitude, n_cloudiness, 
           c =northern_df['Cloudiness'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(n_latitude, n_cloudiness)

plt.title(f'N. Hemisphere, Cloudiness vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Cloudiness (%)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

plt.annotate(linear_eq,(5,20),fontsize=16, color='maroon')
cbar= plt.colorbar()
cbar.set_label('cloudiness (%)', labelpad=+1)
plt.grid()
plt.savefig('Images/NH_lin_regres_cloudiness-vs-lat.png')
print('The regression line shows that the further cities are from the equator, the less cloudy it is. The R-squared value, at 52%, also shows that there is a correlation.')
plt.show()

##### Southern Hemisphere - Cloudiness (%) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(s_latitude, s_cloudiness, 
           c =southern_df['Cloudiness'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(s_latitude, s_cloudiness)

plt.title(f'S. Hemisphere, Cloudiness vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Cloudiness (%)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

plt.annotate(linear_eq,(-45,20),fontsize=16, color='maroon')
cbar= plt.colorbar()
cbar.set_label('cloudiness (%)', labelpad=+1)
plt.grid()
plt.savefig('Images/SH_lin_regres_cloudiness-vs-lat.png')
print('The regression line shows that there is no relationship between cloudiness and latitude.')
plt.show()

##### Northern Hemisphere - Wind Speed (mph) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(n_latitude, n_windspeed, 
           c =northern_df['Wind_Speed'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(n_latitude, n_windspeed)

plt.title(f'N. Hemisphere, Wind Speed vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Wind Speed (mph)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

plt.annotate(linear_eq,(5,25),fontsize=16, color='maroon')
cbar= plt.colorbar()
cbar.set_label('wind speed (mph)', labelpad=+1)
plt.grid()
plt.savefig('Images/NH_lin_regres_wind_speed-vs-lat.png')
print('The regression line does not show a discernible relationship between wind speed and city latitude in the Northern Hemisphere.')
plt.show()

##### Southern Hemisphere - Wind Speed (mph) vs. Latitude

In [None]:
plt.figure(figsize=(8, 6))
plt.scatter(s_latitude, s_windspeed, 
           c =southern_df['Wind_Speed'], s=15, cmap= 'coolwarm', marker='o')

# Run linear regression plot function
linear_eq = linear_regres(s_latitude, s_windspeed)

plt.title(f'S. Hemisphere, Wind Speed vs. City Latitude  ' + time.strftime('%m/%d/%Y'), fontsize=14 )
plt.ylabel('Wind Speed (mph)', fontsize=12)
plt.xlabel('Latitude', fontsize=12)

plt.annotate(linear_eq,(-45,20),fontsize=16, color='maroon')
cbar= plt.colorbar()
cbar.set_label('wind speed (mph)', labelpad=+1)
plt.grid()
plt.savefig('Images/SH_lin_regres_wind_speed-vs-lat.png')
print('The regression line shows no relationship between wind speed and city latitude, and the R-squared of zero supports this.')
plt.show()