# **NB01 - Data Collection**

**OBJECTIVE:**
Collect weather data from the Open-Meteo API and save it to a JSON file. The data covers ten cities, including London, and includes the following for each city:
- The annual precipitation in mm in 2003 and 2023
- The number of days of rainfall in 2003 and 2023

**AUTHOR:** 
@nadiabegic on GitHub

**LAST EDITED:**
1-Nov-2024

-----------------------
**Imports**:

In [2]:
import requests
import os
import json
from datetime import datetime 
import pandas as pd

# 1. Preparation and defining reusable functions

1.1 Read the CSV file of world_cities to access the country codes and city names

In [3]:
world_cities = pd.read_csv('../data/world_cities.csv')

In [None]:
# test
get_rain_sum('GB', 'London', '2023-01-01', '2023-12-31', world_cities)

KeyError: 'GB'

In [7]:
world_cities

Unnamed: 0,country,name,lat,lng
0,AD,El Tarter,42.57952,1.65362
1,AD,Sant Julià de Lòria,42.46372,1.49129
2,AD,Pas de la Casa,42.54277,1.73361
3,AD,Ordino,42.55623,1.53319
4,AD,les Escaldes,42.50729,1.53414
...,...,...,...,...
149832,ZW,Beitbridge,-22.21667,30.00000
149833,ZW,Beatrice,-18.25283,30.84730
149834,ZW,Banket,-17.38333,30.40000
149835,ZW,Epworth,-17.89000,31.14750


In [13]:
valid_rows = (world_cities['country']=='GB') & (world_cities['name']=='London')
world_cities[valid_rows]

Unnamed: 0,country,name,lat,lng
56726,GB,London,51.50853,-0.12574


1.2 Define a reusable function to obtain the latitude and longitude of a city

In [18]:
def get_lat_long(country_code, city_name, world_cities):
    """
    Retrieves the latitude and longitude of a given city in a specific country.

    Parameters:
        country_code (str): The country code of the city.
        city_name (str): The name of the city.
        world_cities (pandas.DataFrame): City data with 'country', 'name', 'lat' and 'lng' columns.

    Returns:
        tuple: two floats, the latitude and longitude of the city.
    """
    
    valid_rows = (world_cities['country']==country_code) & (world_cities['name']==city_name)
    city_data = world_cities[valid_rows]
    return city_data['lat'].iloc[0], city_data['lng'].iloc[0]

In [19]:
latitude, longitude = get_lat_long('GB', 'London', world_cities)
print(latitude, longitude)

51.50853 -0.12574


1.3 Define a reusable function to obtain the daily amount of rain in a given time period

In [23]:
def get_rain_sum(country_code, city_name, start_date, end_date, world_cities):
    """
    Retrieves the daily rain sum for a given city over a given date range.

    Parameters:
        country_code (str): The country code of the location.
        city_name (str): The name of the city.
        start_date (str): The start date of the historical data in the format 'YYYY-MM-DD'.
        end_date (str): The end date of the historical data in the format 'YYYY-MM-DD'.
        world_cities (pandas.DataFrame): City data with 'country', 'name', 'lat' and 'lng' columns.

    Returns:
        list: the daily rain_sum values (in mm) for the given date range.
    """

    latitude, longitude = get_lat_long(country_code, city_name, world_cities)

    base_historical_url = "https://archive-api.open-meteo.com/v1/archive?"
    params_lat_long     = "latitude=" + str(latitude) + "&longitude="  + str(longitude)
    params_date         = "&start_date=" + str(start_date) + "&end_date=" + str(end_date)
    params_others       = "&daily=rain_sum"

    final_url = base_historical_url + params_lat_long + params_date + params_others

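    print(final_url)
    response = requests.get(final_url)
    print(response.status_code)
    # Stop early with a clear error if the API did not return a success status
    response.raise_for_status()
    # The daily values are returned under the 'daily' key, one value per day
    rain_data = response.json()
    rain_sum = rain_data['daily']['rain_sum']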
   
    return rain_sum
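
The query string that `get_rain_sum` assembles by hand can also be built with the standard library's `urllib.parse.urlencode`, which handles escaping automatically. The sketch below is an illustration only: `build_archive_url` is a hypothetical helper name, not part of the notebook, and it assumes the same endpoint and parameters used above.

```python
from urllib.parse import urlencode

# Same endpoint as in get_rain_sum above
BASE_HISTORICAL_URL = "https://archive-api.open-meteo.com/v1/archive?"


def build_archive_url(latitude, longitude, start_date, end_date):
    """Hypothetical helper: build the archive request URL for daily rain sums."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,
        "end_date": end_date,
        "daily": "rain_sum",
    }
    # urlencode joins the key/value pairs with '&' and escapes unsafe characters
    return BASE_HISTORICAL_URL + urlencode(params)


print(build_archive_url(51.50853, -0.12574, "2023-01-01", "2023-12-31"))
```

For the London coordinates this should produce the same URL that `get_rain_sum` prints, since the parameters are emitted in the order they appear in the dictionary.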
    


In [25]:
get_rain_sum('GB', 'London', '2023-01-01', '2023-12-31', world_cities)

https://archive-api.open-meteo.com/v1/archive?latitude=51.50853&longitude=-0.12574&start_date=2023-01-01&end_date=2023-12-31&daily=rain_sum
200


[4.0,
 0.2,
 3.2,
 0.9,
 0.1,
 1.2,
 5.0,
 1.8,
 0.3,
 6.5,
 2.8,
 5.4,
 0.0,
 11.4,
 0.0,
 12.3,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 1.7,
 0.3,
 0.0,
 0.0,
 0.0,
 0.0,
 0.1,
 0.0,
 0.2,
 0.0,
 0.5,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 3.9,
 0.1,
 0.7,
 0.0,
 0.0,
 1.2,
 2.9,
 1.7,
 0.0,
 0.0,
 0.0,
 0.0,
 0.7,
 0.6,
 0.0,
 0.4,
 0.0,
 0.3,
 1.6,
 4.1,
 5.7,
 5.9,
 4.0,
 0.9,
 1.3,
 0.8,
 4.6,
 2.2,
 1.9,
 1.9,
 4.0,
 1.7,
 3.3,
 1.4,
 2.6,
 8.5,
 1.7,
 0.5,
 8.1,
 0.0,
 1.9,
 3.3,
 3.9,
 12.4,
 1.0,
 0.0,
 0.0,
 0.0,
 1.4,
 4.5,
 0.2,
 0.0,
 0.0,
 9.1,
 7.5,
 3.2,
 1.0,
 9.3,
 1.1,
 0.0,
 0.5,
 0.2,
 0.0,
 2.3,
 3.3,
 1.5,
 3.9,
 13.8,
 0.0,
 0.0,
 6.7,
 0.7,
 0.0,
 0.0,
 1.0,
 0.0,
 0.0,
 0.9,
 0.7,
 5.6,
 0.7,
 5.0,
 7.7,
 6.2,
 3.7,
 1.9,
 0.0,
 0.7,
 0.5,
 0.2,
 0.4,
 0.2,
 1.1,
 0.1,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.1,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 5.7,
 0.0,
 0.0,
 0.0,


# 2. Collect the number of days of rainfall in 2023

2.1 Define the function to obtain the number of days of rainfall in a given time period

In [78]:
def num_days_rain(country_code, city_name, start_date, end_date, world_cities):
    """
    Retrieves the number of days it rained in a given city over a given date range.
    
    Parameters:
        country_code (str): The country code of the location.
        city_name (str): The name of the city.
        start_date (str): The start date of the historical data in the format 'YYYY-MM-DD'.
        end_date (str): The end date of the historical data in the format 'YYYY-MM-DD'.
        world_cities (pandas.DataFrame): City data with 'country', 'name', 'lat' and 'lng' columns.

    Returns:
        int: the number of days it rained in the given city in a given time period.
    """
        
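    # Fetch the daily rain totals (in mm) for the requested period
    daily_rain = get_rain_sum(country_code, city_name, start_date, end_date, world_cities)

    # Count the days with a non-zero rain total, skipping missing (None) values
    days_of_rain = 0
    for rain_sum in daily_rain:
        if rain_sum is not None and rain_sum > 0:
            days_of_rain += 1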

    return days_of_rain
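
The counting step can be tried without calling the API by passing a list of daily totals directly. The sketch below is a minimal illustration: `count_rainy_days` and its `threshold` parameter are hypothetical names, not part of the notebook, and treating missing values (`None`) as dry days is an assumption.

```python
def count_rainy_days(daily_rain, threshold=0.0):
    """Hypothetical helper: count days whose rain total (mm) exceeds threshold."""
    return sum(
        1
        for value in daily_rain
        if value is not None and value > threshold  # assumption: None is not a rainy day
    )


# First seven daily values from the London 2023 output shown earlier
sample = [4.0, 0.2, 3.2, 0.9, 0.1, 1.2, 5.0]
print(count_rainy_days(sample))
print(count_rainy_days(sample, threshold=1.0))
```

A threshold above zero may be useful if very light rain should not count as a rainy day; which cut-off is appropriate depends on the analysis.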

In [70]:
# test
num_days_rain('GB', 'London', '2023-01-01', '2023-12-31', world_cities)

# 3. Calculate the annual precipitation in 2023
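
Given the daily values returned by `get_rain_sum`, an annual total is the sum over the year. The sketch below is one possible approach, not the notebook's final implementation: `annual_total_mm` is a hypothetical name, and skipping missing values (`None`) is an assumption. Note that `rain_sum` measures rain only; if snowfall should count towards precipitation, Open-Meteo's `precipitation_sum` daily variable may be the more appropriate choice.

```python
import math


def annual_total_mm(daily_values):
    """Hypothetical helper: sum daily totals (mm), ignoring missing values."""
    # math.fsum reduces floating-point rounding error over many small values
    return math.fsum(v for v in daily_values if v is not None)


# First five daily values from the London 2023 output shown earlier
print(round(annual_total_mm([4.0, 0.2, 3.2, 0.9, 0.1]), 1))
```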

# 4. Collect the number of days of rainfall in 2003

# 5. Calculate the annual precipitation in 2003
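
Once the totals and rain-day counts for both years are available, they can be written to a JSON file as the objective describes. The sketch below shows one way to do this with the standard `json` module. The function name `save_results`, the output path and the structure of the `results` dictionary are all assumptions, and the values are placeholders rather than real measurements.

```python
import json
import os


def save_results(results, path):
    """Hypothetical helper: write the collected results to a JSON file."""
    # Create the target directory if it does not exist yet
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2, ensure_ascii=False)


# Placeholder structure: one entry per city, keyed by year (values are not real data)
example_results = {
    "London": {
        "2003": {"annual_precipitation_mm": None, "days_of_rain": None},
        "2023": {"annual_precipitation_mm": None, "days_of_rain": None},
    }
}
```

A call such as `save_results(example_results, "../data/rain_data.json")` would then write the file next to `world_cities.csv`; the file name here is illustrative only.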