# Lab Exercise: Using OpenWeather API to Collect and Export Data

## Objective
- Fetch weather data for multiple cities using the OpenWeather API.
- Process and clean the retrieved data.
- Store the data in a pandas DataFrame.
- Export the data to a CSV file.
- Plot the data.

## 1: Sign Up for OpenWeather API and Obtain an API Key

1. Go to the [OpenWeather website](https://home.openweathermap.org/users/sign_up) and sign up for an account.
2. After logging in, navigate to the API Keys section of your account dashboard.
3. Copy your unique API key, which will be used to authenticate your requests.

## 2: Install Necessary Python Libraries

1. Install the required Python libraries if they are not already installed. You will need:
   - `requests` for making HTTP requests.
   - `pandas` for data manipulation and analysis.
   - `matplotlib` for plotting graphs.

More info on API keys: https://docs.openweather.co.uk/appid

In [None]:
! pip install...

In [1]:
import requests  # library for making HTTP requests
import pandas as pd  # data manipulation library
import matplotlib.pyplot as plt  # graph plotting library
import json  # used to pretty-print the JSON response
from api_keys import open_weather_key  # local module that holds your API key

## 4: Get the base URL and define your endpoints 

See the class example for reference, and refer to the Python `requests` library docs for examples.

***Hint:*** search for "python requests params". Passing a `params` dictionary lets you make a request without writing the query string by hand.
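As a rough illustration of what a `params` dictionary saves you from writing, the sketch below builds the query string by hand with the standard library's `urlencode`. `requests` does something along these lines internally when you pass `params`. The value `YOUR_API_KEY` is a placeholder, not a real key.

```python
from urllib.parse import urlencode

base_url = "https://api.openweathermap.org/data/2.5/weather"

# Placeholder values for illustration only
query = {"q": "London,uk", "appid": "YOUR_API_KEY"}

# urlencode escapes special characters (the comma becomes %2C)
query_string = urlencode(query)
full_url = f"{base_url}?{query_string}"

print(full_url)
# prints https://api.openweathermap.org/data/2.5/weather?q=London%2Cuk&appid=YOUR_API_KEY
```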

Get data for London - this will be your query. 

Remember you need your API key!

If you get a 4xx code, something is wrong with your request. For example, 401 usually means your API key is wrong or not active yet, and 404 means the city was not found.
If you get 200, it worked!
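If it helps while debugging, a small lookup like the one below can turn a status code into a readable hint. This is only a sketch: `describe_status` and `STATUS_HINTS` are hypothetical names, and the meanings follow general HTTP conventions, so check the OpenWeather docs for the exact behaviour of your plan.

```python
# Hypothetical helper: the hints follow general HTTP conventions
STATUS_HINTS = {
    200: "OK: the request worked",
    400: "Bad request: check the query parameters",
    401: "Unauthorized: API key missing, wrong, or not active yet",
    404: "Not found: check the city name",
    429: "Too many requests: rate limit exceeded",
}


def describe_status(status_code):
    """Return a short, human-readable hint for an HTTP status code."""
    return STATUS_HINTS.get(status_code, f"Unexpected status code {status_code}")


print(describe_status(401))
```

In the notebook you could call `describe_status(response.status_code)` straight after making the request.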

In [2]:
url = "https://api.openweathermap.org/data/2.5/weather"
location = "London,uk"
api_id = open_weather_key
params = {"q": location, "appid": api_id}  # requests builds the query string from this dictionary

response = requests.get(url, params=params, timeout=10)  # Make a request to the API
response.status_code

200

## Parse the response as JSON and store it in a variable

Then print it to see what it looks like.

In [3]:
response_json = response.json()

In [4]:
# Option 1 to view json

response_json

{'coord': {'lon': -0.1257, 'lat': 51.5085},
 'weather': [{'id': 803,
   'main': 'Clouds',
   'description': 'broken clouds',
   'icon': '04d'}],
 'base': 'stations',
 'main': {'temp': 289.84,
  'feels_like': 288.74,
  'temp_min': 288.4,
  'temp_max': 291.03,
  'pressure': 1029,
  'humidity': 45,
  'sea_level': 1029,
  'grnd_level': 1024},
 'visibility': 10000,
 'wind': {'speed': 3.6, 'deg': 340},
 'clouds': {'all': 67},
 'dt': 1726242200,
 'sys': {'type': 2,
  'id': 268730,
  'country': 'GB',
  'sunrise': 1726205567,
  'sunset': 1726251608},
 'timezone': 3600,
 'id': 2643743,
 'name': 'London',
 'cod': 200}

In [5]:
# Option 2 to view json

for key, value in response_json.items():
        print(key, ":", value)

coord : {'lon': -0.1257, 'lat': 51.5085}
weather : [{'id': 803, 'main': 'Clouds', 'description': 'broken clouds', 'icon': '04d'}]
base : stations
main : {'temp': 289.84, 'feels_like': 288.74, 'temp_min': 288.4, 'temp_max': 291.03, 'pressure': 1029, 'humidity': 45, 'sea_level': 1029, 'grnd_level': 1024}
visibility : 10000
wind : {'speed': 3.6, 'deg': 340}
clouds : {'all': 67}
dt : 1726242200
sys : {'type': 2, 'id': 268730, 'country': 'GB', 'sunrise': 1726205567, 'sunset': 1726251608}
timezone : 3600
id : 2643743
name : London
cod : 200


In [6]:
# Option 3 to view json

pretty = json.dumps(response_json, indent=4)
print(pretty)

{
    "coord": {
        "lon": -0.1257,
        "lat": 51.5085
    },
    "weather": [
        {
            "id": 803,
            "main": "Clouds",
            "description": "broken clouds",
            "icon": "04d"
        }
    ],
    "base": "stations",
    "main": {
        "temp": 289.84,
        "feels_like": 288.74,
        "temp_min": 288.4,
        "temp_max": 291.03,
        "pressure": 1029,
        "humidity": 45,
        "sea_level": 1029,
        "grnd_level": 1024
    },
    "visibility": 10000,
    "wind": {
        "speed": 3.6,
        "deg": 340
    },
    "clouds": {
        "all": 67
    },
    "dt": 1726242200,
    "sys": {
        "type": 2,
        "id": 268730,
        "country": "GB",
        "sunrise": 1726205567,
        "sunset": 1726251608
    },
    "timezone": 3600,
    "id": 2643743,
    "name": "London",
    "cod": 200
}


## 5: Target specific data

You should now have the weather for London.

Analyse the JSON data, extract the city name, temperature, humidity, weather description, and wind speed, and store each in its own variable.

In [7]:
response_json['coord']['lon']

-0.1257

In [12]:
city_name = response_json['name']
temperature = response_json['main']['temp']
humidity = response_json['main']['humidity']
weather_description = response_json['weather'][0]['description']
wind_speed = response_json['wind']['speed']

# Temperatures are in Kelvin unless you add a units parameter to the request
print(f"City: {city_name}, Temperature: {temperature} K, Humidity: {humidity}%, Weather: {weather_description}, Wind Speed: {wind_speed} m/s")

City: London, Temperature: 289.84 K, Humidity: 45%, Weather: broken clouds, Wind Speed: 3.6 m/s
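By default the API returns temperatures in Kelvin, so a value like 289.84 is roughly 16.7 °C. A minimal sketch of the conversion is shown below; `kelvin_to_celsius` is a hypothetical helper name, not part of any library.

```python
def kelvin_to_celsius(kelvin):
    """Convert a temperature from Kelvin to degrees Celsius."""
    return kelvin - 273.15


temperature_k = 289.84  # value taken from the London response
temperature_c = round(kelvin_to_celsius(temperature_k), 2)

print(f"{temperature_k} K is {temperature_c} °C")
```

Alternatively, OpenWeather accepts a `units` query parameter (for example `units=metric`), in which case the API returns Celsius directly and no conversion is needed.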


## 6: Create a data frame using pandas

Create columns and give the data frame the data you've collected. 

In [13]:
df = pd.DataFrame(columns=["city_name", "temperature", "humidity", "weather_description", "wind_speed"], data=[[city_name, temperature, humidity, weather_description, wind_speed]])

df

  city_name  temperature  humidity weather_description  wind_speed
0    London       289.84        45       broken clouds         3.6


## 7: Get data for more cities

Nice work! You've got data for London and made a data frame. Now let's get data for more cities so we can compare.

Start by making a list of the cities you want to get data for. You'll be making one API request per city and the API limit is 60 requests per minute, so stay below that. 5-10 cities is plenty.
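With only 5-10 cities you will not hit the limit, but if your list grows, one simple approach is to pause between requests. The sketch below is an assumption about how you might do that: `throttled` is a hypothetical helper that yields items one at a time and sleeps between them so the rate stays at or below the limit.

```python
import time


def throttled(items, max_per_minute=60, sleep=time.sleep):
    """Yield items one at a time, pausing between them to respect a rate limit."""
    delay = 60.0 / max_per_minute  # seconds to wait between items
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay)  # wait before every item except the first
        yield item


# Example: with a limit of 120 per minute the pause is 0.5 seconds
for city in throttled(["london,uk", "manchester,uk"], max_per_minute=120):
    print(city)
```

The `sleep` argument is there so the pause can be swapped out when testing; in normal use you can leave it at its default.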

In [14]:
cities = ["london,uk", "manchester,uk", "edinburgh,uk", "birmingham,uk", "bristol,uk"]

## 8: Use a for loop to make multiple requests

Make a for loop to create a request for each city. Add the data you've collected to the `weather_data_list` list.

If you're struggling then pair up with a group mate, but basically what you're doing is most of the previous steps within a single loop.
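One way to keep the loop body short is to wrap the extraction steps from section 5 in a function. The sketch below assumes the response has the same shape as the London example; `extract_weather` is a hypothetical helper name, and `sample` is a trimmed copy of that response so the code runs without a network call.

```python
def extract_weather(payload):
    """Pick the fields used in this lab out of one parsed API response."""
    return {
        "city_name": payload["name"],
        "temperature": payload["main"]["temp"],
        "humidity": payload["main"]["humidity"],
        "weather_description": payload["weather"][0]["description"],
        "wind_speed": payload["wind"]["speed"],
    }


# Trimmed copy of the London response shown earlier
sample = {
    "name": "London",
    "main": {"temp": 289.84, "humidity": 45},
    "weather": [{"description": "broken clouds"}],
    "wind": {"speed": 3.6},
}

print(extract_weather(sample))
```

Inside the loop you could then append `extract_weather(response.json())` to your list for each successful request.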

In [16]:
weather_data_list = []

for city in cities:

    # Make a request to the API for this city
    response = requests.get(url, params={"q": city, "appid": api_id}, timeout=10)

    if response.status_code == 200:
        response_json = response.json()
        city_name = response_json['name']
        temperature = response_json['main']['temp']
        humidity = response_json['main']['humidity']
        weather_description = response_json['weather'][0]['description']
        wind_speed = response_json['wind']['speed']

        # create a dictionary with the extracted data
        extracted_data = {
            "city_name": city_name,
            "temperature": temperature,
            "humidity": humidity,
            "weather_description": weather_description,
            "wind_speed": wind_speed,
        }

        # append the dictionary to the list
        weather_data_list.append(extracted_data)
    else:
        # report the failure and carry on with the next city
        print(f"Request for {city} failed with status code {response.status_code}")

weather_data_list

SSLError: HTTPSConnectionPool(host='api.openweathermap.org', port=443): Max retries exceeded with url: /data/2.5/weather?q=london,uk,&APPID=<API_KEY> (Caused by SSLError(SSLEOFError(8, '[SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1000)')))
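An `SSLError` like the one above is usually a transient network problem rather than a bug in the loop, so retrying the request often helps. The sketch below is one possible approach, not the only one: `call_with_retries` is a hypothetical helper that retries any zero-argument function when it raises `OSError`. The exceptions raised by `requests` (including `SSLError`) derive from `OSError`, so they are covered.

```python
import time


def call_with_retries(fetch, retries=3, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the allowed attempts run out."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except OSError as error:  # requests' exceptions derive from OSError
            last_error = error
            if attempt < retries:
                sleep(delay_seconds * attempt)  # wait a little longer each time
    raise last_error


# Possible usage inside the loop (needs requests and a working connection):
# response = call_with_retries(
#     lambda: requests.get(url, params={"q": city, "appid": api_id}, timeout=10)
# )
```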

## 9: Create a data frame with all your cities.

You should be able to pass your `weather_data_list` list straight to the `pd.DataFrame` function.

In [21]:
weather_data_df = pd.DataFrame(weather_data_list)
weather_data_df

## 10: Organise the data by temperature

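Organise the dataframe by the temperature column, and reset the index.

In [20]:
# sort first, then reset the index so it matches the new row order
weather_data_df = weather_data_df.sort_values(by='temperature', ascending=False).reset_index(drop=True)
weather_data_df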

KeyError: 'temperature'

## 11: Export the data as a CSV

Look for the pandas function that writes a data frame to a CSV file.

In [None]:
weather_data_df.to_csv('weather_data.csv', index=False)  # index=False leaves out the row index column

## 12: Plot your data on a graph using matplotlib

Pick a data point and plot it on a graph. Experiment with graph types and formatting. Try plotting more than one data point for each city, e.g. temperature and humidity.

In [22]:
import matplotlib.pyplot as plt

In [None]:
# Bar chart of the temperature for each city


plt.figure(figsize=(10, 6))
plt.bar(weather_data_df['city_name'], weather_data_df['temperature'], color='skyblue')
plt.xticks(rotation=90)
plt.ylabel("Temperature (K)")
plt.title("Temperature by City")
plt.show()

## Extra Challenges 
- Modify your script to handle additional cities dynamically by reading from an external CSV file containing a list of cities. You could find a data set on Kaggle or create a dataframe and export it as CSV.
- Implement error handling for network issues, API rate limits, and incorrect city names.
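As a starting point for the first challenge, here is a minimal sketch that reads city names from a CSV file using only the standard library. It assumes the file has a header row with a column called `city`; both that column name and the `read_cities` helper are hypothetical, so adapt them to your data set. The example uses an in-memory file so it runs without anything on disk.

```python
import csv
import io


def read_cities(csv_file, column="city"):
    """Return the non-empty values of one column from an open CSV file."""
    cities = []
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if value:
            cities.append(value)
    return cities


# In-memory stand-in for a real file; values are quoted because they contain commas
example_file = io.StringIO('city\n"london,uk"\n"leeds,uk"\n')
print(read_cities(example_file))
```

With a real file you would pass an open file object instead, for example `open("cities.csv", newline="")`, where `cities.csv` is a file name you choose.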