# WeatherPy
----
## Analysis
- Max temperature is clearly related to latitude in the northern hemisphere: the further north, the colder it gets. The southern hemisphere shows only a slight correlation, with the warmest temperatures around the equator.

- I found no correlation between latitude and humidity. Humidity did appear to drop between 0 and 40 degrees latitude before rising again, perhaps due to weather systems.

- I found no correlation between latitude and either wind speed or cloudiness.

In [1]:
# Dependencies and Setup
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import requests
import datetime
import time
from scipy.stats import linregress

# Import API key
from api_keys import weather_api_key

# Incorporated citipy to determine city based on latitude and longitude
from citipy import citipy

# Output File (CSV)
output_data_file = "output_data/cities.csv"

# Range of latitudes and longitudes
lat_range = (-90, 90)
lng_range = (-180, 180)

## Generate Cities List

In [2]:
# List for holding lat_lngs and cities
lat_lngs = []
cities = []

# Create a set of random lat and lng combinations
lats = np.random.uniform(lat_range[0], lat_range[1], size=3200)
lngs = np.random.uniform(lng_range[0], lng_range[1], size=3200)
lat_lngs = zip(lats, lngs)

# Identify nearest city for each lat, lng combination
for lat_lng in lat_lngs:
    city = citipy.nearest_city(lat_lng[0], lat_lng[1]).city_name
    
    # If the city is unique, then add it to our cities list
    if city not in cities:
        cities.append(city)

# Print the city count to confirm sufficient count
len(cities)

1060
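
The `city not in cities` check above rescans the whole list for every candidate. A minimal sketch of an order-preserving alternative follows; `sample_names` is invented for illustration and stands in for the names that `citipy` would return, and `unique_in_order` is a hypothetical helper, not part of the notebook.

```python
# Hypothetical stand-in for the names returned by citipy.nearest_city(...).city_name
sample_names = ["hermanus", "faya", "hermanus", "gambela", "faya"]


def unique_in_order(names):
    """Return names with duplicates removed, keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so each membership check is a hash lookup instead of a list scan
    return list(dict.fromkeys(names))


unique_cities = unique_in_order(sample_names)
print(unique_cities)
```

For a few thousand coordinates the difference is small, so this is a tidiness choice more than a performance fix.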

### Perform API Calls
* Perform a weather check on each city using a series of successive API calls.
* Include a print log of each city as it's being processed (with the city number and city name).


In [3]:
# url = api.openweathermap.org/data/2.5/weather?q={city name}&appid={API key}&units=imperial
base_url = 'http://api.openweathermap.org/data/2.5/weather?&appid='+weather_api_key+'&units=imperial&q=' 

# use variable to track count of cities 
cityCount = 1

# use variables to store values from api
lat_list = []
lon_list = []
temp_list = []
humidity_list = []
clouds_list = []
wind_list = []
country_list = []
date_list = []
city_list = []


print('Begin Data Retrieval')
print('------------------')

# for each city in the list attempt to get weather data
for city in cities:
    
    # attempt to get response data from open weather map api
    try:
        
        print(f'Processing Record {cityCount} | {city}')
        cityCount += 1
        url = base_url+city
        response = requests.get(url).json()

        lat = response['coord']['lat']
        lon = response['coord']['lon']
        temp = response['main']['temp_max']
        humidity = response['main']['humidity']
        clouds = response['clouds']['all']
        wind = response['wind']['speed']
        country = response['sys']['country']
        date = response['dt']
        
        #store data
        lat_list.append(lat)
        lon_list.append(lon)
        temp_list.append(temp)
        humidity_list.append(humidity)
        clouds_list.append(clouds)
        wind_list.append(wind)
        country_list.append(country)
        date_list.append(date)
        city_list.append(city)
    
    # skip city if the request fails or the response lacks an expected field
    except (KeyError, ValueError, requests.RequestException):
        print(f'{city} not found. Skipping...')
    
    time.sleep(2) # Sleep for 2 seconds
    
print("-----------------------------")
print("Data Retrieval Complete      ")
print("-----------------------------")

Begin Data Retrevial
------------------
Processing Record 1 | esmeralda
esmeralda not found. Skipping...
Processing Record 2 | kencong
kencong not found. Skipping...
Processing Record 3 | faya
faya not found. Skipping...
Processing Record 4 | hermanus
hermanus not found. Skipping...
Processing Record 5 | gambela
gambela not found. Skipping...


KeyboardInterrupt: 
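
In the log above every city is reported as not found, and the message alone cannot tell a missing city apart from a rejected API key. The sketch below is one way to surface the reason. It assumes that an error response from the API carries `cod` and `message` keys; `extract_weather` is a hypothetical helper and both sample payloads are invented for illustration.

```python
def extract_weather(payload):
    """Pull the fields used in this notebook out of one decoded JSON response.

    Returns (record, None) on success or (None, reason) on failure.
    Assumption: error responses include 'cod' and 'message' keys.
    """
    try:
        record = {
            "Lat": payload["coord"]["lat"],
            "Lng": payload["coord"]["lon"],
            "Max Temp": payload["main"]["temp_max"],
            "Humidity": payload["main"]["humidity"],
            "Cloudiness": payload["clouds"]["all"],
            "Wind Speed": payload["wind"]["speed"],
            "Country": payload["sys"]["country"],
            "Date": payload["dt"],
        }
    except KeyError as missing:
        # Prefer the server's own explanation when it sent one
        reason = payload.get("message", f"missing field {missing}")
        return None, f"cod={payload.get('cod')}: {reason}"
    return record, None


# Invented payloads, shaped like the fields the notebook reads
ok_payload = {
    "coord": {"lat": -34.42, "lon": 19.23},
    "main": {"temp_max": 61.0, "humidity": 80},
    "clouds": {"all": 20},
    "wind": {"speed": 5.0},
    "sys": {"country": "ZA"},
    "dt": 1600000000,
}
bad_payload = {"cod": 401, "message": "Invalid API key"}

print(extract_weather(ok_payload))
print(extract_weather(bad_payload))
```

Printing the returned reason in place of a fixed "not found" message would show whether the failures come from the city names or from the request itself.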

### Convert Raw Data to DataFrame
* Export the city data into a .csv.
* Display the DataFrame

In [None]:
city_df = pd.DataFrame({'City': city_list,
                        'Lat': lat_list,
                        'Lng': lon_list,
                        'Max Temp': temp_list,
                        'Humidity': humidity_list,
                        'Cloudiness': clouds_list,
                        'Wind Speed': wind_list,
                        'Country': country_list,
                        'Date': date_list
                       })

city_df.head()

In [None]:
city_df.describe()

## Inspect the data and remove the cities where the humidity > 100%.
----
Skip this step if there are no cities that have humidity > 100%. 

In [None]:
#  Get the indices of cities that have humidity over 100%.
bad_data = city_df.loc[city_df['Humidity'] > 100].index

In [None]:
# Make a new DataFrame from city_df with all humidity outliers dropped by index.
# Passing "inplace=False" returns a copy of city_df, which we call "clean_city_data".
clean_city_data = city_df.drop(bad_data, inplace=False)
clean_city_data.head()
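
The same filter can be written as a single boolean mask. The sketch below uses a small invented DataFrame (`sample_df`) in place of `city_df`, and is offered as an alternative, not as the notebook's method.

```python
import pandas as pd

# Invented rows standing in for city_df
sample_df = pd.DataFrame({
    "City": ["a", "b", "c"],
    "Humidity": [85, 104, 100],
})

# Keep rows whose humidity is at most 100%; .copy() avoids chained-assignment warnings later
clean_sample = sample_df[sample_df["Humidity"] <= 100].copy()
print(clean_sample)
```

One difference to keep in mind: a row with missing humidity fails the `<= 100` test and is dropped here, whereas the index-based `drop` above would keep it.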

## Plotting the Data
* Use proper labeling of the plots using plot titles (including date of analysis) and axes labels.
* Save the plotted figures as .pngs.

## Latitude vs. Temperature Plot

In [None]:
# Get today's date
# using now() to get current time  
current_time = datetime.datetime.now()  
year = current_time.year
month = current_time.month
day = current_time.day
date = "-".join([str(month),str(day),str(year)])
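
The month-day-year string can also be built with `strftime`. A minimal sketch follows; `format_analysis_date` is a hypothetical helper and the date passed to it is an arbitrary example.

```python
import datetime


def format_analysis_date(moment):
    """Format a datetime as MM-DD-YYYY for use in plot titles."""
    return moment.strftime("%m-%d-%Y")


example = format_analysis_date(datetime.datetime(2020, 7, 4))
print(example)  # 07-04-2020
```

Unlike the `join` version above, this pads the month and day with zeros, so the two are not character-for-character identical.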

In [None]:
# latitude vs. temperature scatter plot
plt.scatter(clean_city_data['Lat'], 
            clean_city_data['Max Temp'],
            edgecolor="black", linewidths=1, marker="o")

# add labels, titles, etc.
plt.title(f"City Latitude vs. Max Temperature ({date})")
plt.ylabel("Max Temperature (F)")
plt.xlabel("Latitude")
plt.grid(True)

# Save the plot
plt.savefig("output_data/latVStemp.png")

# Show plot
plt.show()

This plot shows the relationship between latitude and the highest temperature on a single day.
In this case it shows that it is colder in the north (positive latitude) than in the south (negative latitude).

## Latitude vs. Humidity Plot

In [None]:
# latitude vs. humidity scatter plot
plt.scatter(clean_city_data['Lat'], 
            clean_city_data['Humidity'],
            edgecolor="black", linewidths=1, marker="o")

# add labels, titles, etc.
plt.title(f"City Latitude vs. Humidity ({date})")
plt.ylabel("Humidity (%)")
plt.xlabel("Latitude")
plt.grid(True)

# Save the plot
plt.savefig("output_data/lat_VS_humidity.png")

# Show plot
plt.show()

This plot shows the relationship between latitude and humidity on a single day.
In this case there does not appear to be a strong correlation between latitude and humidity.

## Latitude vs. Cloudiness Plot

In [None]:
# latitude vs. cloudiness scatter plot
plt.scatter(clean_city_data['Lat'], 
            clean_city_data['Cloudiness'],
            edgecolor="black", linewidths=1, marker="o")

# add labels, titles, etc.
plt.title(f"City Latitude vs. Cloudiness ({date})")
plt.ylabel("Cloudiness (%)")
plt.xlabel("Latitude")
plt.grid(True)

# Save the plot
plt.savefig("output_data/lat_VS_cloudiness.png")

# Show plot
plt.show()

This plot shows the relationship between latitude and cloudiness on a single day. In this case there does not appear to be a strong correlation between latitude and cloudiness.

## Latitude vs. Wind Speed Plot

In [None]:
# latitude vs. wind speed scatter plot
plt.scatter(clean_city_data['Lat'], 
            clean_city_data['Wind Speed'],
            edgecolor="black", linewidths=1, marker="o")

# add labels, titles, etc.
plt.title(f"City Latitude vs. Wind Speed ({date})")
plt.ylabel("Wind Speed (mph)")
plt.xlabel("Latitude")
plt.grid(True)

# Save the plot
plt.savefig("output_data/lat_VS_windSpeed.png")

# Show plot
plt.show()

This plot shows the relationship between latitude and wind speed on a single day. In this case there does not appear to be a strong correlation between latitude and wind speed.

## Linear Regression

In [None]:
# linear regression calculation & plot function
def lin_regress(x_values, y_values, ylabel, text_coordinates):
    
    # calculate regression value
    (slope, intercept, rvalue, pvalue, stderr) = linregress(x_values, y_values)
    regression_values = x_values * slope + intercept
    line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))

    # create scatter plot
    plt.scatter(x_values,y_values)
    
    # create regression line & display equation
    plt.plot(x_values,regression_values,"r-")
    plt.annotate(line_eq,text_coordinates,fontsize=15,color="red")
    
    # add axis & plot titles
    plt.xlabel('Latitude')
    plt.ylabel(ylabel)
    
    # print r value (its sign gives the direction of the correlation)
    print(f'The r-value is: {rvalue}')
    plt.show()

In [None]:
north_hem = clean_city_data.loc[(clean_city_data["Lat"] >= 0)]
south_hem = clean_city_data.loc[(clean_city_data["Lat"] < 0)]

####  Northern Hemisphere - Max Temp vs. Latitude Linear Regression

In [None]:
lin_regress(north_hem['Lat'], north_hem['Max Temp'], 'Max Temp', (5,-20))

The high r-value indicates a strong correlation between max temperature and latitude. The correlation in this case is negative: the further north, the lower the temperature.

####  Southern Hemisphere - Max Temp vs. Latitude Linear Regression

In [None]:
lin_regress(south_hem['Lat'], south_hem['Max Temp'], 'Max Temp', (-50,42))

The low r-value indicates only a weak correlation between max temperature and latitude.

####  Northern Hemisphere - Humidity (%) vs. Latitude Linear Regression

In [None]:
lin_regress(north_hem['Lat'], north_hem['Humidity'], 'Humidity', (45,20))

The low r-value shows little or no correlation between humidity and latitude.

####  Southern Hemisphere - Humidity (%) vs. Latitude Linear Regression

In [None]:
lin_regress(south_hem['Lat'], south_hem['Humidity'], 'Humidity', (-30,20))

The low r-value suggests a weak correlation between humidity and latitude.

####  Northern Hemisphere - Cloudiness (%) vs. Latitude Linear Regression

In [None]:
lin_regress(north_hem['Lat'], north_hem['Cloudiness'], 'Cloudiness', (45,25))

The low r-value suggests a weak correlation between cloudiness and latitude.

####  Southern Hemisphere - Cloudiness (%) vs. Latitude Linear Regression

In [None]:
lin_regress(south_hem['Lat'], south_hem['Cloudiness'], 'Cloudiness', (-55,5))

The low r-value suggests a weak correlation between cloudiness and latitude.

####  Northern Hemisphere - Wind Speed (mph) vs. Latitude Linear Regression

In [None]:
lin_regress(north_hem['Lat'], north_hem['Wind Speed'], 'Wind Speed', (5,30))

The low r-value suggests a weak correlation between wind speed and latitude.

####  Southern Hemisphere - Wind Speed (mph) vs. Latitude Linear Regression

In [None]:
lin_regress(south_hem['Lat'], south_hem['Wind Speed'], 'Wind Speed', (-45,26))

The low r-value suggests a weak correlation between wind speed and latitude.

In [None]:
# Export the City_Data into a csv
clean_city_data.to_csv(output_data_file, index_label="City_ID")

In [None]:
clean_city_data