#### Objectives
- Learn how to fetch current weather and 5-day forecast data using the OpenWeatherMap API.
- Understand how to process and save the fetched data.
- Review the structure and content of the raw and processed data.
- Calculate daily weather statistics from forecast data.

#### 1. Introduction to Data Fetching
In this session, we will fetch current weather and 5-day forecast data from the OpenWeatherMap API and process it for analysis.

##### Import Required Libraries

In [3]:
import requests
import pandas as pd

##### Set Up API Key and City
Replace the placeholder with your own OpenWeatherMap API key.

In [5]:
# Replace the placeholder with your own OpenWeatherMap API key.
# Avoid committing a real key to a shared notebook or repository.
API_KEY = 'YOUR_API_KEY'
city = 'Guangzhou'
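
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative is to read the key from an environment variable. The variable name `OPENWEATHER_API_KEY` and the helper `load_api_key` are assumptions made for this example and are not part of the original session.

```python
import os

def load_api_key(env_var="OPENWEATHER_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

With the variable exported in the shell that launches Jupyter, the setup cell could then use `API_KEY = load_api_key()`.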

#### 2. Fetching Weather Forecast Data
We will define a helper function that requests a URL and returns the parsed JSON, then use it to fetch the current weather and the 5-day forecast for the specified city.

##### Define Function to Fetch Data

In [7]:
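# Function to fetch data from the given URL and return the parsed JSON,
# or None if the request fails
def fetch_data(url):
    try:
        response = requests.get(url, timeout=10)
    except requests.RequestException as exc:
        print(f"Error fetching data: {exc}")
        return None
    if response.status_code != 200:
        # An error body is not guaranteed to be JSON, so report it as text
        print(f"Error fetching data: {response.status_code} {response.text}")
        return None
    return response.json()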

##### Fetch Data

In [9]:
# Fetch current weather data
current_url = f'https://api.openweathermap.org/data/2.5/weather?q={city}&appid={API_KEY}'
current_data = fetch_data(current_url)

# Fetch 5-day forecast data
forecast_url = f'https://api.openweathermap.org/data/2.5/forecast?q={city}&appid={API_KEY}'
forecast_data = fetch_data(forecast_url)
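
The URLs above interpolate the city name directly into the query string, which can break for names that contain spaces or non-ASCII characters. A sketch of building the same URLs with percent-encoding from the standard library follows. `build_url` and `BASE_URL` are hypothetical names introduced here. The optional `units` argument relies on the OpenWeatherMap `units` query parameter, where `metric` returns temperatures in Celsius instead of Kelvin.

```python
from urllib.parse import quote, urlencode

# Same host and path as the URLs used above
BASE_URL = "https://api.openweathermap.org/data/2.5"

def build_url(endpoint, city, api_key, units=None):
    """Build a request URL for an endpoint such as 'weather' or 'forecast'."""
    params = {"q": city, "appid": api_key}
    if units is not None:
        params["units"] = units
    # quote_via=quote encodes spaces as %20 rather than '+'
    return f"{BASE_URL}/{endpoint}?{urlencode(params, quote_via=quote)}"

print(build_url("forecast", "New York", "YOUR_API_KEY"))
```

If `units="metric"` is passed, the Kelvin subtraction in the parsing step would no longer be needed, so the two choices should not be combined.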

#### 3. Processing and Saving Data
We will process the fetched data to extract relevant information and save it in CSV format.

##### Parse Weather Data

In [11]:
# Function to parse a single weather entry into a dictionary
def parse_weather(entry):
    if entry is None:
        return None
    return {
        'Datetime': pd.to_datetime(entry['dt'], unit='s'),
        'Temperature (C)': entry['main']['temp'] - 273.15,
        'Humidity (%)': entry['main']['humidity'],
        'Wind Speed (m/s)': entry['wind']['speed'],
        'Weather': entry['weather'][0]['description']
    }

##### Extract and Process Data

In [13]:
# Process current weather data
current_weather = parse_weather(current_data)

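# Process forecast weather data (a failed fetch leaves forecast_data as None)
forecast_entries = forecast_data['list'] if forecast_data else []
forecast_weather = [parse_weather(entry) for entry in forecast_entries]
print(forecast_weather)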

[{'Datetime': Timestamp('2024-06-07 03:00:00'), 'Temperature (C)': 24.970000000000027, 'Humidity (%)': 90, 'Wind Speed (m/s)': 1.96, 'Weather': 'light rain'}, {'Datetime': Timestamp('2024-06-07 06:00:00'), 'Temperature (C)': 24.5, 'Humidity (%)': 91, 'Wind Speed (m/s)': 2.07, 'Weather': 'light rain'}, {'Datetime': Timestamp('2024-06-07 09:00:00'), 'Temperature (C)': 24.29000000000002, 'Humidity (%)': 91, 'Wind Speed (m/s)': 2.07, 'Weather': 'light rain'}, {'Datetime': Timestamp('2024-06-07 12:00:00'), 'Temperature (C)': 23.720000000000027, 'Humidity (%)': 93, 'Wind Speed (m/s)': 1.75, 'Weather': 'overcast clouds'}, {'Datetime': Timestamp('2024-06-07 15:00:00'), 'Temperature (C)': 23.78000000000003, 'Humidity (%)': 93, 'Wind Speed (m/s)': 1.26, 'Weather': 'overcast clouds'}, {'Datetime': Timestamp('2024-06-07 18:00:00'), 'Temperature (C)': 23.670000000000016, 'Humidity (%)': 94, 'Wind Speed (m/s)': 1.38, 'Weather': 'light rain'}, {'Datetime': Timestamp('2024-06-07 21:00:00'), 'Temperatu
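
Values such as `24.970000000000027` in the output above are floating-point artifacts of subtracting 273.15 from the Kelvin reading. A small sketch of rounding during the conversion is shown here. `kelvin_to_celsius` is a hypothetical helper, and two decimal places is an assumption chosen to match the precision of the values shown.

```python
def kelvin_to_celsius(kelvin, ndigits=2):
    """Convert Kelvin to Celsius and round away floating-point noise."""
    return round(kelvin - 273.15, ndigits)

# 298.12 K corresponds to roughly 24.97 C
print(kelvin_to_celsius(298.12))
```

Inside `parse_weather`, the temperature entry could then read `'Temperature (C)': kelvin_to_celsius(entry['main']['temp'])`.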

##### Save Data to CSV

In [15]:
# Function to save a list of dictionaries to a CSV file
def save_to_csv(data, filename):
    if data is None or not data:
        print(f"No data to save for {filename}")
        return
    df = pd.DataFrame(data)
    df.to_csv(filename, index=False)
    print(f"{filename} saved")

# Save current weather data to CSV
save_to_csv([current_weather] if current_weather else [], '../data/processed/current_weather_data.csv')

# Save forecast weather data to CSV
save_to_csv(forecast_weather, '../data/processed/hourly_weather_data.csv')

../data/processed/current_weather_data.csv saved
../data/processed/hourly_weather_data.csv saved


##### Calculate Daily Statistics

In [17]:
# Redefine save_to_csv so that it accepts a DataFrame instead of a list of dictionaries
def save_to_csv(data, filename):
    if data is None or data.empty:
        print(f"No data to save for {filename}")
        return
    data.to_csv(filename, index=False)
    print(f"Data saved to {filename}")


# Function to calculate daily min and max stats from forecast data
def calculate_daily_stats(forecast_data):
    # Covers both None and an empty list
    if not forecast_data:
        return None
    
    # Convert the list of dictionaries to a DataFrame
    df_forecast = pd.DataFrame(forecast_data)
    if not pd.api.types.is_datetime64_any_dtype(df_forecast['Datetime']):
        df_forecast['Datetime'] = pd.to_datetime(df_forecast['Datetime'])
    # Extract the date from the 'Datetime' column and create a new 'Date' column
    df_forecast['Date'] = df_forecast['Datetime'].dt.date

    # Group the data by the 'Date' column and calculate the min and max for each group
    daily_stats = df_forecast.groupby('Date').agg({
        'Temperature (C)': ['min', 'max'],
        'Humidity (%)': ['min', 'max'],
        'Wind Speed (m/s)': ['min', 'max']
    })

    # Flatten the MultiIndex columns
    daily_stats.columns = ['Min Temperature (C)', 'Max Temperature (C)', 
                           'Min Humidity (%)', 'Max Humidity (%)', 
                           'Min Wind Speed (m/s)', 'Max Wind Speed (m/s)']
    
    # Reset the index to turn the 'Date' back into a column
    daily_stats.reset_index(inplace=True)
    
    return daily_stats

# Calculate and save daily statistics to CSV
daily_stats = calculate_daily_stats(forecast_weather)
save_to_csv(daily_stats, '../data/processed/daily_weather_stats.csv')

Data saved to ../data/processed/daily_weather_stats.csv
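
The grouping step can also be written with pandas named aggregation, which sets the output column names in the same call instead of flattening the MultiIndex afterwards. This is a sketch on a small hand-made sample (the values are illustrative, not fetched data), and `daily_min_max` is a hypothetical helper name.

```python
import pandas as pd

# Illustrative sample with the same column names as the parsed forecast
sample = pd.DataFrame({
    'Datetime': pd.to_datetime([
        '2024-06-07 03:00', '2024-06-07 06:00',
        '2024-06-08 03:00', '2024-06-08 06:00',
    ]),
    'Temperature (C)': [24.97, 24.5, 26.1, 29.91],
    'Humidity (%)': [90, 91, 80, 69],
})

def daily_min_max(df, columns):
    """Return one row per date with 'Min <col>' and 'Max <col>' for each column."""
    named = {}
    for col in columns:
        named[f'Min {col}'] = (col, 'min')
        named[f'Max {col}'] = (col, 'max')
    return (
        df.assign(Date=df['Datetime'].dt.date)
          .groupby('Date')
          .agg(**named)
          .reset_index()
    )

stats = daily_min_max(sample, ['Temperature (C)', 'Humidity (%)'])
print(stats)
```

Because the output names are derived from the input column names, adding another parameter to the statistics only requires extending the `columns` list.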


#### 4. Reviewing the Data
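We will review the structure and content of the saved files to check that the data was fetched and processed correctly. Here, "raw" refers to the hourly forecast records in `hourly_weather_data.csv` and "processed" refers to the daily statistics in `daily_weather_stats.csv`.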

##### Display the First Few Rows of Raw Data

In [19]:
df_raw = pd.read_csv('../data/processed/hourly_weather_data.csv')
df_raw.head()

| | Datetime | Temperature (C) | Humidity (%) | Wind Speed (m/s) | Weather |
|---|---|---|---|---|---|
| 0 | 2024-06-07 03:00:00 | 24.97 | 90 | 1.96 | light rain |
| 1 | 2024-06-07 06:00:00 | 24.5 | 91 | 2.07 | light rain |
| 2 | 2024-06-07 09:00:00 | 24.29 | 91 | 2.07 | light rain |
| 3 | 2024-06-07 12:00:00 | 23.72 | 93 | 1.75 | overcast clouds |
| 4 | 2024-06-07 15:00:00 | 23.78 | 93 | 1.26 | overcast clouds |
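
Reading the CSV back with `pd.read_csv` leaves the `Datetime` column as plain strings unless it is parsed explicitly, which matters for any later date-based grouping or plotting. A sketch of the difference follows, using an in-memory copy of the first two rows shown above rather than the file on disk.

```python
import io
import pandas as pd

# Two rows in the same layout as hourly_weather_data.csv
csv_text = """Datetime,Temperature (C),Humidity (%),Wind Speed (m/s),Weather
2024-06-07 03:00:00,24.97,90,1.96,light rain
2024-06-07 06:00:00,24.5,91,2.07,light rain
"""

# Default read: Datetime is kept as text
df_plain = pd.read_csv(io.StringIO(csv_text))

# parse_dates converts the named column to a datetime dtype
df_dates = pd.read_csv(io.StringIO(csv_text), parse_dates=["Datetime"])

print(df_plain["Datetime"].dtype)
print(df_dates["Datetime"].dtype)
```

The same `parse_dates=["Datetime"]` argument can be passed when reading `'../data/processed/hourly_weather_data.csv'`.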


##### Display the First Few Rows of Processed Data

In [24]:
df_processed = pd.read_csv('../data/processed/daily_weather_stats.csv')
df_processed.head()

| | Date | Min Temperature (C) | Max Temperature (C) | Min Humidity (%) | Max Humidity (%) | Min Wind Speed (m/s) | Max Wind Speed (m/s) |
|---|---|---|---|---|---|---|---|
| 0 | 2024-06-07 | 23.67 | 24.97 | 90 | 94 | 1.26 | 2.07 |
| 1 | 2024-06-08 | 24.97 | 29.91 | 69 | 91 | 1.53 | 3.83 |
| 2 | 2024-06-09 | 25.92 | 32.26 | 59 | 90 | 1.51 | 3.46 |
| 3 | 2024-06-10 | 26.05 | 31.13 | 64 | 91 | 0.8 | 3.06 |
| 4 | 2024-06-11 | 26.39 | 31.97 | 62 | 88 | 0.95 | 4.09 |


#### Homework
- Experiment with fetching data for different cities and review the structure of the fetched data.
- Extend the data processing to include additional weather parameters if available.

#### Summary
In this session, we learned how to fetch current weather and 5-day forecast data using the OpenWeatherMap API. We converted temperature values from Kelvin to Celsius and saved the current, hourly, and daily data to CSV files. The daily minimum and maximum statistics prepare us for further analysis and visualization in upcoming sessions.