# WeatherPy

---

## Starter Code to Generate Random Geographic Coordinates and a List of Cities

In [1]:
# Dependencies and Setup
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import requests
import time
from scipy.stats import linregress

# Impor the OpenWeatherMap API key
from api_keys import weather_api_key

# Import citipy to determine the cities based on latitude and longitude
from citipy import citipy

### Generate the Cities List by Using the `citipy` Library

In [2]:
# Empty list for holding the latitude and longitude combinations
lat_lngs = []

# Empty list for holding the cities names
cities = []

# Range of latitudes and longitudes
lat_range = (-90, 90)
lng_range = (-180, 180)

# Create a set of random lat and lng combinations
lats = np.random.uniform(lat_range[0], lat_range[1], size=1550)
lngs = np.random.uniform(lng_range[0], lng_range[1], size=1550)
lat_lngs = zip(lats, lngs)

# Identify nearest city for each lat, lng combination
for lat_lng in lat_lngs:
    city = citipy.nearest_city(lat_lng[0], lat_lng[1]).city_name
    
    # If the city is unique, then add it to a our cities list
    if city not in cities:
        cities.append(city)

# Print the city count to confirm sufficient count
print(f"Number of cities in the list: {len(cities)}")

Number of cities in the list: 597


## Requirement 1: Create Plots to Showcase the Relationship Between Weather Variables and Latitude

### Use the OpenWeatherMap API to retrieve weather data from the cities list generated in the started code

In [3]:
# Set the API base URL
url = "http://api.openweathermap.org/data/2.5/weather?"
units = "metric"
# Define an empty list to fetch the weather data for each city
city_data = []

# Print to logger
print("Beginning Data Retrieval     ")
print("-----------------------------")

# Create counters
record_count = 1
set_count = 1

# Loop through all the cities in our list to fetch weather data
for i, city in enumerate(cities):
        
    # Group cities in sets of 50 for logging purposes
    if (i % 50 == 0 and i >= 50):
        set_count += 1
        record_count = 0

    # Create endpoint URL with each city
    city_url = f"{url}appid={weather_api_key}&units={units}&q= + {city}"
    
    # Log the url, record, and set numbers
    print("Processing Record %s of Set %s | %s" % (record_count, set_count, city))

    # Add 1 to the record count
    record_count += 1
    
    # Run an API request for each of the cities
    try:
        # Parse the JSON and retrieve data
        city_weather = requests.get(city_url).json()

        # Parse out latitude, longitude, max temp, humidity, cloudiness, wind speed, country, and date
        city_lat = city_weather['coord']['lat']
        city_lng = city_weather['coord']['lon']
        city_max_temp = city_weather['main']['temp_max']
        city_humidity = city_weather['main']['humidity']
        city_clouds = city_weather['clouds']['all']
        city_wind = city_weather['wind']['speed']
        city_country = city_weather['sys']['country']
        city_date = city_weather['dt']

        # Append the City information into city_data list
        city_data.append({"City": city, 
                          "Lat": city_lat, 
                          "Lng": city_lng, 
                          "Max Temp": city_max_temp,
                          "Humidity": city_humidity,
                          "Cloudiness": city_clouds,
                          "Wind Speed": city_wind,
                          "Country": city_country,
                          "Date": city_date})

    # If an error is experienced, skip the city
    except:
        print("City not found. Skipping...")
        pass
              
# Indicate that Data Loading is complete 
print("-----------------------------")
print("Data Retrieval Complete      ")
print("-----------------------------")

Beginning Data Retrieval     
-----------------------------
Processing Record 1 of Set 1 | adamstown
Processing Record 2 of Set 1 | ancud
Processing Record 3 of Set 1 | sumbe
Processing Record 4 of Set 1 | bethel
Processing Record 5 of Set 1 | kodiak
Processing Record 6 of Set 1 | bredasdorp
Processing Record 7 of Set 1 | new norfolk
Processing Record 8 of Set 1 | atafu village
Processing Record 9 of Set 1 | udachny
City not found. Skipping...
Processing Record 10 of Set 1 | olonkinbyen
Processing Record 11 of Set 1 | san jose del cabo
Processing Record 12 of Set 1 | la passe
City not found. Skipping...
Processing Record 13 of Set 1 | pangoa
Processing Record 14 of Set 1 | yellowknife
Processing Record 15 of Set 1 | kerikeri
Processing Record 16 of Set 1 | bilibino
Processing Record 17 of Set 1 | leh
Processing Record 18 of Set 1 | hamilton
Processing Record 19 of Set 1 | kota kinabalu
Processing Record 20 of Set 1 | port-aux-francais
Processing Record 21 of Set 1 | kasane
Processing R

In [17]:
city_data_df = pd.DataFrame(city_data)
# Show Record Count
city_data_df.count()

City          524
Lat           524
Lng           524
Max Temp      524
Humidity      524
Cloudiness    524
Wind Speed    524
Country       524
Date          524
dtype: int64

In [61]:
# Convert the cities weather data into a Pandas DataFrame
city_data_df = pd.DataFrame({ "City": cities,
                              "Lat": city_lat, 
                              "Lng": city_lng, 
                              "Max Temp": city_max_temp,
                              "Humidity": city_humidity,
                              "Cloudiness": city_clouds,
                              "Wind Speed": city_wind,
                              "Country": city_country,
                              "Date": city_date
                            })

# Convert unix numbers in date column to readable datetime
city_data_df["Date"] = pd.to_datetime(city_data_df["Date"] , unit = "s")

# Display sample data
city_data_df.head()


Unnamed: 0,City,Lat,Lng,Max Temp,Humidity,Cloudiness,Wind Speed,Country,Date
0,adamstown,39.9755,-111.7852,7.87,61,100,3.13,US,2024-05-06 19:56:26
1,ancud,39.9755,-111.7852,7.87,61,100,3.13,US,2024-05-06 19:56:26
2,sumbe,39.9755,-111.7852,7.87,61,100,3.13,US,2024-05-06 19:56:26
3,bethel,39.9755,-111.7852,7.87,61,100,3.13,US,2024-05-06 19:56:26
4,kodiak,39.9755,-111.7852,7.87,61,100,3.13,US,2024-05-06 19:56:26


In [None]:
# Export the City_Data into a csv
city_data_df.to_csv("output_data/cities.csv", index_label="City_ID")

In [None]:
# Read saved data
city_data_df = pd.read_csv("output_data/cities.csv", index_col="City_ID")

# Display sample data
city_data_df.head()

### Create the Scatter Plots Requested

#### Latitude Vs. Temperature

In [None]:
# Build scatter plot for latitude vs. temperature
# YOUR CODE HERE
plt.scatter(city_data_df["Lat"], city_data_df["Max Temp"], marker="o", facecolors="steelblue", edgecolor="black" , alpha=0.75)
# Incorporate the other graph properties
# YOUR CODE HERE
plt.title("City Max Latitude vs. Tempurature (May 8, 2024)")
plt.xlabel("Latitude")
plt.ylabel("Max Temperatures (C)")
plt.grid(True)

# Save the figure
plt.savefig("output_data/Fig1.png")

# Show plot
plt.show()

#### Latitude Vs. Humidity

In [None]:
# Build the scatter plots for latitude vs. humidity
# YOUR CODE HERE
plt.scatter(city_data_df["Lat"], city_data_df["Humidity"], marker="o", facecolors="steelblue", edgecolor="black" , alpha=0.75)
# Incorporate the other graph properties
# YOUR CODE HERE
plt.title("City Latitude vs. Humidity (May 8, 2024)")
plt.xlabel("City Latitudes")
plt.ylabel("Humidity (%)")
plt.grid(True)
# Save the figure
plt.savefig("output_data/Fig2.png")

# Show plot
plt.show()

#### Latitude Vs. Cloudiness

In [None]:
# Build the scatter plots for latitude vs. cloudiness
# YOUR CODE HERE
plt.scatter(city_data_df["Lat"], city_data_df["Cloudiness"], marker="o", facecolors="steelblue", edgecolor="black" , alpha=0.75)
# Incorporate the other graph properties
# YOUR CODE HERE
plt.title("City Latitude vs. Cloudiness (May 8, 2024)")
plt.xlabel("Latitude")
plt.ylabel("Cloudiness (%)")
plt.grid(True)
# Save the figure
plt.savefig("output_data/Fig3.png")

# Show plot
plt.show()

#### Latitude vs. Wind Speed Plot

In [None]:
# Build the scatter plots for latitude vs. wind speed
# YOUR CODE HERE
plt.scatter(city_data_df["Lat"], city_data_df["Wind Speed"], marker="o", facecolors="steelblue", edgecolor="black" , alpha=0.75)
# Incorporate the other graph properties
# YOUR CODE HERE
plt.title("City Latitude vs. Wind Speed (May 8, 2024)")
plt.xlabel("Latitude")
plt.ylabel("Wind Speed (m/s)")
plt.grid(True)
# Save the figure
plt.savefig("output_data/Fig4.png")

# Show plot
plt.show()

---

## Requirement 2: Compute Linear Regression for Each Relationship


In [None]:
# Define a function to create Linear Regression plots
# YOUR CODE HERE
def lr_plot(x_values, y_values, title, text_coordinates):
    (slope, intercept, rvalue, pvalue, stderr) = linregress(x_value, y_value)
    regression = x_value * slope + intercept
    
    
    line_eq = ' y = ' + str(round(slope,2)) + ' x + ' + str(round(intercept,2)) #format on how equation writtten
    
    plt.scatter(x_value, y_value, marker='o') #, edgecolors='black')
    plt.plot(x_value, regression, color='red', linewidth=3) #plot line
    plt.annotate(line_eq, text_coordinates, fontsize=16, color="red") #is the equation text
    plt.grid(True)
    print(f"The r-value is: {rvalue}")


In [None]:
# Create a DataFrame with the Northern Hemisphere data (Latitude >= 0)
# YOUR CODE HERE
northern_hemi_df = city_data_df[(city_data_df["Lat"] >= 0)]
# Display sample data
northern_hemi_df.head()

In [None]:
# Create a DataFrame with the Southern Hemisphere data (Latitude < 0)
# YOUR CODE HERE
southern_hemi_df = city_data_df[(city_data_df["Lat"] < 0)]
# Display sample data
southern_hemi_df.head()

###  Temperature vs. Latitude Linear Regression Plot

In [None]:
# Linear regression on Northern Hemisphere
# YOUR CODE HERE
x_value = northern_hemi_df['Lat']
y_value = northern_hemi_df['Max Temp']
xc = 0
yc = 1

lr_plot(x_value , y_value , "Max Temp" , (6,-10))
plt.title("Northern Temperature vs. Latitude")
#plt.xlabel("Latitude")
#plt.ylabel("Max Temp")
#lt.show()

In [None]:
# Linear regression on Southern Hemisphere
# YOUR CODE HERE
x_value = southern_hemi_df['Lat']
y_value = southern_hemi_df['Max Temp']

lr_plot(x_value , y_value , "Max Temp" , (-30,5))

plt.title("Southern Temperature vs. Latitude")
#plt.xlabel("Latitude")
#plt.ylabel("Max Temp")
#plt.show()


**Discussion about the linear relationship:** YOUR RESPONSE HERE
The analysis of both charts reveals a compelling correlation between latitude and temperature. As we approach the equator (0° latitude), there's a noticeable uptick in temperature, indicating a direct relationship. This observation strongly aligns with the hypothesis asserting that regions closer to the equator experience higher temperatures compared to those near the poles. This correlation is bolstered by a robust r-value, typically hovering around 0.7, indicating a significant statistical relationship between latitude and temperature.

### Humidity vs. Latitude Linear Regression Plot

In [None]:
# Northern Hemisphere
# YOUR CODE HERE
x_value = northern_hemi_df['Lat']
y_value = northern_hemi_df['Humidity']
lr_plot(x_value , y_value , "Humidity" , (45,9))

plt.title("Northern Humidity vs. Latitude")
#plt.xlabel("Latitude")
#plt.ylabel("Humidity")
#plt.show()

In [None]:
# Southern Hemisphere
# YOUR CODE HERE
x_value = southern_hemi_df['Lat']
y_value = southern_hemi_df['Humidity']
lr_plot(x_value , y_value , "Humidity" , (-55,30))
plt.title("Southern Humidity vs. Latitude")

**Discussion about the linear relationship:** YOUR RESPONSE HERE
After carefully examining both charts, it becomes apparent that there's essentially no discernible correlation between humidity and latitude. Regardless of whether we're looking at Northern or Southern latitudes, the linear regression remains slightly flat, indicating a consistent level of humidity across different regions. This consistency is reinforced by the low r-values observed in both charts, suggesting that latitude has little influence on humidity levels. In essence, latitude seems to have minimal impact on humidity, with variations in other factors likely playing a more significant role in determining moisture levels across different latitudes.

### Cloudiness vs. Latitude Linear Regression Plot

In [None]:
# Northern Hemisphere
# YOUR CODE HERE
x_value = northern_hemi_df['Lat']
y_value = northern_hemi_df['Cloudiness']
lr_plot(x_value , y_value , "Cloudiness" , (50,25))
plt.title("Northern Cloudiness vs. Latitude")

In [None]:
# Southern Hemisphere
# YOUR CODE HERE
x_value = southern_hemi_df['Lat']
y_value = southern_hemi_df['Cloudiness']

lr_plot(x_value , y_value , "Cloudiness" , (-30,5))
plt.title("Southern Cloudiness vs. Latitude")

**Discussion about the linear relationship:** YOUR RESPONSE HERE
Upon close examination, it becomes evident that there's essentially no discernible relationship between cloud cover and latitude. Regardless of the latitude considered, cloud cover appears to occur across the spectrum, varying from complete absence to full coverage (0 to 100). The flat linear regression line further reinforces this observation, indicating that latitude doesn't significantly influence cloud cover. This conclusion is supported by the r-value, which hovers close to zero, suggesting that changes in latitude have little impact on the occurrence or extent of cloud cover. In essence, cloud cover seems to be influenced by factors other than latitude, with diverse atmospheric dynamics at play across different regions.

### Wind Speed vs. Latitude Linear Regression Plot

In [None]:
# Northern Hemisphere
# YOUR CODE HERE
x_value = northern_hemi_df['Lat']
y_value = northern_hemi_df['Wind Speed']
lr_plot(x_value , y_value , "Wind Speed" , (0,12))
plt.title("Northern Wind Speed vs. Latitude")

In [None]:
# Southern Hemisphere
# YOUR CODE HERE
x_value = southern_hemi_df['Lat']
y_value = southern_hemi_df['Wind Speed']

lr_plot(x_value , y_value , "Wind sSpeed" , (-40,12))
plt.title("Southern Wind Speed vs. Latitude")


**Discussion about the linear relationship:** YOUR RESPONSE HERE
Exploring the connection between wind speed and latitude unveils a somewhat intricate scenario, differing between the hemispheres. In the Northern Hemisphere, the linear regression line remains flat, indicating no significant correlation between wind speed and latitude. Conversely, in the Southern Hemisphere, a subtle trend emerges, with wind speed exhibiting a slight decrease as it approaches the equator. 
However, this decline is minimal, typically hovering around 7-8 f/s, and is accompanied by a low r-value, indicating a weak correlation at best. In essence, while there's a nuanced pattern in the Southern Hemisphere, the overall relationship between wind speed and latitude appears to be relatively weak and influenced by various other factors.