# **NB01 - Data Collection**

**OBJECTIVE:**
Extract historical precipitation data for ten cities, including London, and do the following:


1. Load city metadata (coordinates) from `world_cities.csv`
2. Define helper functions for API calls (imported from `functions.py`)
3. Query the Open-Meteo API for:
    - The number of days of rainfall in 2003 and 2023
    - The total precipitation in mm in 2003 and 2023
    - The daily precipitation in mm in 2003 and 2023
4. Validate and save responses into `data/`

> **Note:** API responses may vary slightly depending on changes in the Open-Meteo service. For reproducibility, raw JSON outputs are saved locally.


**AUTHOR:** 
@nadiabegic on GitHub

**LAST EDITED:**
3-Dec-2025

-----------------------
**Imports**:

In [1]:
import json
import pandas as pd
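# Import only the helpers this notebook calls, rather than every name in functions.py
from functions import num_days_rain, total_precipitation, get_rain_sum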
from dotenv import *
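
`functions.py` is not shown in this notebook, so the following is only a sketch of how a helper might talk to Open-Meteo. The endpoint in `ARCHIVE_URL`, the `precipitation_sum` daily variable, and the names `build_params` and `parse_daily` are assumptions for illustration, not the author's implementation. The payload is hand-written in the shape the API is assumed to return, so the block runs without a network call.

```python
from typing import Any, Dict, Optional

# Assumed endpoint for Open-Meteo's historical archive; functions.py may differ.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_params(lat: float, lng: float, start_date: str, end_date: str) -> Dict[str, Any]:
    """Query parameters for one city's daily precipitation (hypothetical helper)."""
    return {
        "latitude": lat,
        "longitude": lng,
        "start_date": start_date,
        "end_date": end_date,
        "daily": "precipitation_sum",
        "timezone": "auto",
    }


def parse_daily(payload: Dict[str, Any], variable: str = "precipitation_sum") -> Dict[str, Optional[float]]:
    """Map each date in an Open-Meteo-style payload to its daily value."""
    daily = payload["daily"]
    return dict(zip(daily["time"], daily[variable]))


# Hand-written payload for illustration only; these are not real measurements.
sample_payload = {
    "daily": {
        "time": ["2023-01-01", "2023-01-02", "2023-01-03"],
        "precipitation_sum": [0.0, 4.2, None],
    }
}

# Approximate coordinates, used only to show the shape of the parameters.
params = build_params(51.5, -0.13, "2023-01-01", "2023-12-31")
parsed = parse_daily(sample_payload)
```

Keeping the parameter building and the parsing separate from the HTTP call makes both steps easy to check offline.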

# 1. Collect the number of days of rainfall in 2023

1.1 Read _world_cities.csv_ to access the country codes, city names, and coordinates

In [2]:
world_cities = pd.read_csv('../data/world_cities.csv')

In [13]:
world_cities.head()

| Unnamed: 0 | country | name | lat | lng |
|---|---|---|---|---|
| 0 | AD | El Tarter | 42.57952 | 1.65362 |
| 1 | AD | Sant Julià de Lòria | 42.46372 | 1.49129 |
| 2 | AD | Pas de la Casa | 42.54277 | 1.73361 |
| 3 | AD | Ordino | 42.55623 | 1.53319 |
| 4 | AD | les Escaldes | 42.50729 | 1.53414 |
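
Each API helper receives `world_cities`, presumably to turn a country code and city name into coordinates. A minimal sketch of such a lookup follows; `get_coordinates` is a hypothetical name, and taking the first matching row is an assumption, since the same name can appear more than once.

```python
import pandas as pd


def get_coordinates(country_code, city_name, cities_df):
    """Return (lat, lng) for the first row matching country and name (hypothetical helper)."""
    match = cities_df[(cities_df["country"] == country_code) & (cities_df["name"] == city_name)]
    if match.empty:
        raise ValueError(f"No entry for {city_name} ({country_code})")
    row = match.iloc[0]
    return float(row["lat"]), float(row["lng"])


# Two rows copied from the preview above, standing in for the full file.
demo_cities = pd.DataFrame(
    {
        "country": ["AD", "AD"],
        "name": ["El Tarter", "Ordino"],
        "lat": [42.57952, 42.55623],
        "lng": [1.65362, 1.53319],
    }
)
coords = get_coordinates("AD", "Ordino", demo_cities)
```

Filtering on the country code as well as the name avoids picking up a same-named city in another country.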


1.2 Obtain the number of days of rainfall in 2023 for the ten cities and store the results in a dictionary

In [4]:
cities = [
    ("GB", "London"),
    ("GB", "Edinburgh"),
    ("BA", "Sarajevo"),
    ("NL", "Amsterdam"),
    ("FR", "Paris"),
    ("ES", "Madrid"),
    ("SY", "Damascus"),
    ("US", "New York City"),
    ("US", "Los Angeles"),
    ("AE", "Dubai")
]
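
Before querying the API, it can help to confirm that every `(country, name)` pair in `cities` actually appears in the metadata. The check below is a sketch; `missing_cities` is a hypothetical helper and the small frame is a stand-in for `world_cities`.

```python
import pandas as pd


def missing_cities(cities, cities_df):
    """Return the (country, name) pairs that have no matching row in cities_df."""
    known = set(zip(cities_df["country"], cities_df["name"]))
    return [pair for pair in cities if pair not in known]


# Toy metadata for illustration; the real file contains far more rows.
toy_metadata = pd.DataFrame(
    {
        "country": ["AD", "AD"],
        "name": ["El Tarter", "Ordino"],
        "lat": [42.57952, 42.55623],
        "lng": [1.65362, 1.53319],
    }
)
not_found = missing_cities([("AD", "Ordino"), ("AD", "Atlantis")], toy_metadata)
```

An empty list means every requested city can be resolved to coordinates.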

In [6]:
days_rain_2023 = {}

start_date = "2023-01-01"
end_date = "2023-12-31"

for country_code, city_name in cities: 
    days_rain = num_days_rain(country_code, city_name, start_date, end_date, world_cities)
    days_rain_2023[city_name] = days_rain
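
The body of `num_days_rain` lives in `functions.py`. As a rough sketch, a rain-day count over a daily series could look like the following. The name `count_rain_days`, the "greater than zero" definition of a rain day, and the handling of missing values are all assumptions.

```python
def count_rain_days(daily_values, threshold_mm=0.0):
    """Count days whose precipitation exceeds threshold_mm, skipping missing values."""
    return sum(1 for value in daily_values if value is not None and value > threshold_mm)


# Made-up daily values in mm, for illustration only.
sample = [0.0, 1.2, None, 0.0, 3.4, 0.1]
rain_days = count_rain_days(sample)
wet_days = count_rain_days(sample, threshold_mm=1.0)
```

The threshold matters: counting any trace of precipitation gives a different number from counting only days above, say, 1 mm.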

1.3 Save the data to a JSON file

In [10]:
with open('../data/days_rain_2023.json', 'w') as file:
    json.dump(days_rain_2023, file)
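
The objective mentions validating responses before saving them. One possible shape for that step is sketched below, assuming the helper returns a non-negative integer per city. `validate_counts` and `save_json` are hypothetical names, and the counts are invented for the demonstration.

```python
import json
import tempfile
from pathlib import Path


def validate_counts(counts, expected_cities):
    """Raise ValueError if a city is missing or has a non-integer or negative count."""
    missing = [city for city in expected_cities if city not in counts]
    if missing:
        raise ValueError(f"Missing cities: {missing}")
    bad = {c: v for c, v in counts.items() if not isinstance(v, int) or v < 0}
    if bad:
        raise ValueError(f"Invalid counts: {bad}")


def save_json(data, path):
    """Write data as JSON and read it back to confirm the file round-trips."""
    path = Path(path)
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return json.loads(path.read_text(encoding="utf-8"))


# Made-up counts for illustration only; these are not results from the API.
demo_counts = {"London": 150, "Dubai": 12}
validate_counts(demo_counts, ["London", "Dubai"])
with tempfile.TemporaryDirectory() as tmp:
    reloaded = save_json(demo_counts, Path(tmp) / "days_rain_demo.json")
```

Reading the file back immediately catches values that are not JSON-serialisable before a later notebook depends on them.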

# 2. Obtain the total precipitation in 2023

2.1 Obtain the total precipitation in 2023 for the ten cities and store the results in a dictionary

In [7]:
total_precipitation_2023 = {}

start_date = "2023-01-01"
end_date = "2023-12-31"

for country_code, city_name in cities: 
    precipitation = total_precipitation(country_code, city_name, start_date, end_date, world_cities)
    total_precipitation_2023[city_name] = precipitation
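
How `total_precipitation` aggregates is defined in `functions.py`. A plausible sketch, assuming it sums a daily series in mm and ignores missing days, is shown here. `sum_precipitation` and the rounding to one decimal place are illustrative choices.

```python
def sum_precipitation(daily_values, ndigits=1):
    """Sum daily precipitation in mm, ignoring missing (None) entries."""
    present = [value for value in daily_values if value is not None]
    return round(sum(present), ndigits)


# Made-up daily values in mm, for illustration only.
total = sum_precipitation([0.0, 4.2, None, 1.3])
```

Skipping missing days silently lowers the total, so it is worth recording how many values were absent alongside the sum.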

2.2 Save the data to a JSON file

In [12]:
with open('../data/total_precipitation_2023.json', 'w') as file:
    json.dump(total_precipitation_2023, file)

# 3. Obtain the daily precipitation in 2023

3.1 Obtain the daily precipitation in 2023 for the ten cities and store the results in a dictionary

In [8]:
daily_precipitation_2023 = {}

start_date = "2023-01-01"
end_date = "2023-12-31"

for country_code, city_name in cities: 
    daily_precipitation = get_rain_sum(country_code, city_name, start_date, end_date, world_cities)
    daily_precipitation_2023[city_name] = daily_precipitation
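
A simple completeness check for a daily series is to compare its length with the number of calendar days requested. The sketch below assumes `get_rain_sum` returns one value per day; `expected_days` and `is_complete` are hypothetical helpers.

```python
from datetime import date


def expected_days(start_date, end_date):
    """Number of calendar days in the inclusive range [start_date, end_date]."""
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    return (end - start).days + 1


def is_complete(daily_series, start_date, end_date):
    """True when the series has exactly one entry per calendar day in the range."""
    return len(daily_series) == expected_days(start_date, end_date)


n_2023 = expected_days("2023-01-01", "2023-12-31")
n_2003 = expected_days("2003-01-01", "2003-12-31")
```

Neither 2003 nor 2023 is a leap year, so both series should have the same length, which keeps day-by-day comparisons aligned.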

3.2 Save the data to a JSON file

In [14]:
with open('../data/daily_precipitation_2023.json', 'w') as file:
    json.dump(daily_precipitation_2023, file)

# 4. Collect the number of days of rainfall in 2003

4.1 Obtain the number of days of rainfall in 2003 for the ten cities and store the results in a dictionary

In [9]:
days_rain_2003 = {}

start_date = "2003-01-01"
end_date = "2003-12-31"

for country_code, city_name in cities: 
    days_rain = num_days_rain(country_code, city_name, start_date, end_date, world_cities)
    days_rain_2003[city_name] = days_rain

4.2 Save the data to a JSON file

In [16]:
with open('../data/days_rain_2003.json', 'w') as file:
    json.dump(days_rain_2003, file)

# 5. Obtain the total precipitation in 2003

5.1 Obtain the total precipitation in 2003 for the ten cities and store the results in a dictionary

In [10]:
total_precipitation_2003 = {}

start_date = "2003-01-01"
end_date = "2003-12-31"

for country_code, city_name in cities: 
    precipitation = total_precipitation(country_code, city_name, start_date, end_date, world_cities)
    total_precipitation_2003[city_name] = precipitation

5.2 Save the data to a JSON file

In [18]:
with open('../data/total_precipitation_2003.json', 'w') as file:
    json.dump(total_precipitation_2003, file)

# 6. Obtain the daily precipitation in 2003

6.1 Obtain the daily precipitation in 2003 for the ten cities and store the results in a dictionary

In [11]:
daily_precipitation_2003 = {}

start_date = "2003-01-01"
end_date = "2003-12-31"

for country_code, city_name in cities: 
    daily_precipitation = get_rain_sum(country_code, city_name, start_date, end_date, world_cities)
    daily_precipitation_2003[city_name] = daily_precipitation

6.2 Save the data to a JSON file

In [20]:
with open('../data/daily_precipitation_2003.json', 'w') as file:
    json.dump(daily_precipitation_2003, file)
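
The six sections above repeat one pattern with a different year and helper. One way to express that pattern once is sketched below. `collect_by_year` is a hypothetical function, and `fake_fetch` is a stand-in that records its arguments instead of calling the API.

```python
def collect_by_year(cities, years, fetch):
    """Call fetch(country_code, city_name, start_date, end_date) for every city and year."""
    results = {}
    for year in years:
        start_date, end_date = f"{year}-01-01", f"{year}-12-31"
        results[year] = {
            city_name: fetch(country_code, city_name, start_date, end_date)
            for country_code, city_name in cities
        }
    return results


def fake_fetch(country_code, city_name, start_date, end_date):
    """Stand-in for a real helper: returns its arguments as a string."""
    return f"{country_code}/{city_name}/{start_date}/{end_date}"


demo = collect_by_year([("GB", "London"), ("FR", "Paris")], [2003, 2023], fake_fetch)
```

With the notebook's own helpers, `fetch` could be a small wrapper such as `lambda cc, name, s, e: num_days_rain(cc, name, s, e, world_cities)`, assuming the signatures used in the cells above.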