# OpenWeatherMap API Data Collection

Start a new notebook and check whether you can make a successful API call to the OpenWeatherMap forecast API: https://openweathermap.org/forecast5
* The menu on the right-hand side of that page lists the options for the API. Useful entries include "How to make an API call", "Weather fields in API response", "Built-in API request by city name" and "Units of measurement". Explore them to see how you can customise the API call to your needs.
* If you struggle to make a successful API call, please contact me or Yanish.
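
As a rough illustration of how the request options fit together, the sketch below builds a forecast URL from the same endpoint and query parameters (`q`, `appid`, `units`) that the notebook uses further down. The helper name `build_forecast_url` and the placeholder key are assumptions made for this example only; no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openweathermap.org/data/2.5/forecast"


def build_forecast_url(city, api_key, units="metric"):
    """Return the forecast URL for one city (hypothetical helper)."""
    # urlencode escapes spaces and special characters in the city name
    query = urlencode({"q": city, "appid": api_key, "units": units})
    return f"{BASE_URL}?{query}"


# "YOUR_API_KEY" is a placeholder, not a real key
print(build_forecast_url("Frankfurt am Main", "YOUR_API_KEY"))
```

Switching `units` to another value described under "Units of measurement" only changes the query string, not the rest of the code.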

Once you've got the JSON returned, investigate which parts of the information you want. Try accessing these pieces of information one at a time.
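
The snippet below sketches what "one at a time" can look like. It uses a hand-written dictionary that only mimics the parts of the forecast response read later in this notebook (`city`, `list`, `dt_txt`, `main`, `wind`, `weather`); a real response contains many more fields.

```python
# Hand-written sample with the same nesting as the fields used in this notebook
sample_response = {
    "city": {"name": "Berlin", "country": "DE"},
    "list": [
        {
            "dt_txt": "2023-07-10 18:00:00",
            "main": {"temp": 26.04, "humidity": 56},
            "wind": {"speed": 5.05},
            "weather": [{"description": "light rain"}],
        }
    ],
}

# Each forecast slot is one entry of the "list" key
first_slot = sample_response["list"][0]

city_name = sample_response["city"]["name"]
temperature = first_slot["main"]["temp"]
# "weather" is a list of dictionaries, so index it before reading the key
outlook = first_slot["weather"][0]["description"]

print(city_name, temperature, outlook)
```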


Once you've learnt how to access each piece of information, use a loop and a dictionary to bring the information together, and then make a DataFrame (another option is `pd.json_normalize()` where the structure allows it).
Clean the DataFrame: remove any unexpected characters, correct the data types, and so on.
When you've completed all of this, bring your code together in a function. A good input for this function would be a list of cities, so you can see the weather for as many cities as you choose.
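
As an alternative to the loop-and-dictionary approach, here is a minimal sketch of the `pd.json_normalize()` route. It runs on a hand-written sample with the same nesting as the forecast response, not on a live API call, and the column renames are simply chosen to match the column names used in this notebook.

```python
import pandas as pd

# Hand-written sample mimicking two forecast slots for one city
sample_response = {
    "city": {"name": "Berlin", "country": "DE"},
    "list": [
        {"dt_txt": "2023-07-10 18:00:00", "main": {"temp": 26.04, "humidity": 56},
         "wind": {"speed": 5.05}, "weather": [{"description": "light rain"}]},
        {"dt_txt": "2023-07-10 21:00:00", "main": {"temp": 22.83, "humidity": 57},
         "wind": {"speed": 3.38}, "weather": [{"description": "broken clouds"}]},
    ],
}

# Nested dictionaries become dotted column names such as "main.temp"
flat = pd.json_normalize(sample_response["list"])

# The "weather" column still holds a list per row, so unpack it by hand
flat["outlook"] = flat["weather"].apply(lambda w: w[0]["description"])
flat["city_name"] = sample_response["city"]["name"]
flat["country_code"] = sample_response["city"]["country"]

flat = flat.rename(columns={"dt_txt": "date_time",
                            "main.temp": "temperature",
                            "main.humidity": "humidity",
                            "wind.speed": "wind_speed"})

# Correct the data type of the timestamp column
flat["date_time"] = pd.to_datetime(flat["date_time"])

forecast_df = flat[["city_name", "country_code", "date_time",
                    "temperature", "wind_speed", "humidity", "outlook"]]
print(forecast_df)
```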

# Necessary Imports

In [1]:
import pandas as pd
import requests
import json
from datetime import datetime
import pytz
import marcus_keys

# Functions for API Call & Data Processing

In [5]:
def get_weather_data(cities):
    """Fetch the forecast for each city and return all rows in one DataFrame."""
    api_key = marcus_keys.open_weather_key
    tz = pytz.timezone("Europe/Berlin")
    # One collection timestamp (Berlin time) shared by every row of this run
    collected_at = datetime.now(tz).strftime('%Y-%m-%d %H:%M:%S')

    weather_data = {"city_name": [],
                    "country_code": [],
                    "date_time": [],
                    "temperature": [],
                    "wind_speed": [],
                    "humidity": [],
                    "outlook": [],
                    "data_collection_from": []}

    for city in cities:
        # Pass the query as params so requests URL-encodes the city name
        weather = requests.get("https://api.openweathermap.org/data/2.5/forecast",
                               params={"q": city, "appid": api_key, "units": "metric"},
                               timeout=10)
        weather.raise_for_status()  # stop early on an invalid key or unknown city
        weather_json = weather.json()

        # Each entry of "list" is one forecast slot; dt_txt is given in UTC
        for i in weather_json["list"]:
            weather_data["city_name"].append(weather_json["city"]["name"])
            weather_data["country_code"].append(weather_json["city"]["country"])
            weather_data["date_time"].append(i["dt_txt"])
            weather_data["temperature"].append(i["main"]["temp"])
            weather_data["wind_speed"].append(i["wind"]["speed"])
            weather_data["humidity"].append(i["main"]["humidity"])
            weather_data["outlook"].append(i["weather"][0]["description"])
            weather_data["data_collection_from"].append(collected_at)

    return pd.DataFrame(weather_data)

def process_weather_data(weather_data_raw_df):
    weather_data_raw_df["data_collection_from"] = pd.to_datetime(weather_data_raw_df["data_collection_from"])
    weather_data_raw_df["date_time"] = pd.to_datetime(weather_data_raw_df["date_time"])
    
    return weather_data_raw_df

In [6]:
cities = ["Berlin", "Hamburg", "Stuttgart", "Duesseldorf"]

In [7]:
weather_data_raw_df = get_weather_data(cities)

In [8]:
weather_data_df = process_weather_data(weather_data_raw_df)

In [9]:
weather_data_df

| | city_name | country_code | date_time | temperature | wind_speed | humidity | outlook | data_collection_from |
|---|---|---|---|---|---|---|---|---|
| 0 | Berlin | DE | 2023-07-10 18:00:00 | 26.04 | 5.05 | 56 | light rain | 2023-07-10 17:32:26 |
| 1 | Berlin | DE | 2023-07-10 21:00:00 | 22.83 | 3.38 | 57 | broken clouds | 2023-07-10 17:32:26 |
| 2 | Berlin | DE | 2023-07-11 00:00:00 | 18.06 | 2.49 | 66 | broken clouds | 2023-07-10 17:32:26 |
| 3 | Berlin | DE | 2023-07-11 03:00:00 | 16.47 | 1.84 | 71 | scattered clouds | 2023-07-10 17:32:26 |
| 4 | Berlin | DE | 2023-07-11 06:00:00 | 20.11 | 1.08 | 56 | scattered clouds | 2023-07-10 17:32:26 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 155 | Düsseldorf | DE | 2023-07-15 03:00:00 | 20.51 | 2.99 | 46 | overcast clouds | 2023-07-10 17:32:26 |
| 156 | Düsseldorf | DE | 2023-07-15 06:00:00 | 17.92 | 2.13 | 74 | light rain | 2023-07-10 17:32:26 |
| 157 | Düsseldorf | DE | 2023-07-15 09:00:00 | 25.35 | 5.00 | 38 | light rain | 2023-07-10 17:32:26 |
| 158 | Düsseldorf | DE | 2023-07-15 12:00:00 | 28.75 | 6.42 | 40 | overcast clouds | 2023-07-10 17:32:26 |


# Local MySQL Connection

In [38]:
schema="p5_gans_database"
host="127.0.0.1"
user="root"
password=marcus_keys.my_sql_key
port=3306
con = f'mysql+pymysql://{user}:{password}@{host}:{port}/{schema}'

# AWS RDS MySQL Connection

In [8]:
schema="aws_p5_gans_database"
host="wbs-cs-p5-db.cjdcbdhnueky.eu-north-1.rds.amazonaws.com"
user="mkadmin"
password=marcus_keys.aws_rds_key
port=3306
con = f'mysql+pymysql://{user}:{password}@{host}:{port}/{schema}'
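
One thing to watch with connection strings built by f-string: a password containing characters such as `@`, `:` or `/` would break the URL. The sketch below shows one possible way around this, percent-encoding the credentials with the standard library. The helper name `build_mysql_url` and the example password are made up for illustration and are not part of the notebook.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, schema):
    """Return a mysql+pymysql URL with percent-encoded credentials (hypothetical helper)."""
    return (f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/{schema}")


# Made-up password containing characters that are special inside a URL
example_url = build_mysql_url("root", "p@ss:word/1", "127.0.0.1", 3306, "p5_gans_database")
print(example_url)
```

If the real passwords only contain letters and digits, the plain f-string above behaves the same way.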

# Uploading the DataFrame into the database

In [9]:
weather_data_df.to_sql('weather_table', 
                        if_exists='append', 
                        con=con, 
                        index=False)

160
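
Because the upload uses `if_exists='append'`, running the cell again adds the same rows a second time. The following sketch demonstrates that behaviour with an in-memory SQLite database standing in for MySQL (an assumption made purely so the example runs without a database server); the tiny `demo_df` is likewise made up.

```python
import sqlite3

import pandas as pd

# Small stand-in for weather_data_df
demo_df = pd.DataFrame({"city_name": ["Berlin", "Berlin"],
                        "temperature": [26.04, 22.83]})

# In-memory SQLite database used here instead of the MySQL connection
demo_con = sqlite3.connect(":memory:")

# Uploading twice with if_exists="append" stores every row twice
demo_df.to_sql("weather_table", con=demo_con, if_exists="append", index=False)
demo_df.to_sql("weather_table", con=demo_con, if_exists="append", index=False)

row_count = pd.read_sql("SELECT COUNT(*) AS n FROM weather_table", demo_con)["n"].iloc[0]
demo_con.close()

print(row_count)
```

Since each run carries its own `data_collection_from` timestamp, repeated uploads can still be told apart afterwards.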