# City Raininess

##### This notebook lets a user enter a city and country, then looks up the latitude and longitude of the most populated city matching those parameters, if there is one, using the world_cities CSV. Based on that latitude and longitude, historical and forecast precipitation data are retrieved from the Open-Meteo API. Finally, a raininess index is defined that weights recent rain more heavily than older data.

***

In [102]:
# Importing
import pandas as pd
import requests
import datetime
import pprint
import math

***

### 1. Converting City and Country to Latitude and Longitude

In [2]:
def get_city_latlon(city, country):
    '''
    Get the latitude and longitude of a city in a country, derived from the world_cities CSV file.
    Params:
        city: str, the name of the city
        country: str, the name of the country
    Returns:
        a tuple of the latitude and longitude
    '''
    # Read the world_cities CSV file
    world_cities = pd.read_csv('world_cities.csv')

    # Filter the dataframe based on the city and country
    filtered_cities = world_cities[(world_cities['name'] == city) & (world_cities['country'] == country)]

    # Check if any matching cities are found
    if len(filtered_cities) > 0:
        # Get the longitude of the first matching city
        longitude = filtered_cities.iloc[0]['lng']
        # Get the latitude of the first matching city
        latitude = filtered_cities.iloc[0]['lat']
        # Return the latitude and longitude as a tuple
        return (latitude, longitude)
    else:
        # Return None if no matching city is found
        return None

### <center>Testing</center>

In [3]:
# Testing the above function

# Test case 1: Mumbai, India
city = 'Mumbai'
country = 'IN'
print(f'The latitude and longitude of {city}, {country} is {get_city_latlon(city, country)}')

# Test case 2: New York City, United States
city = 'New York City'
country = 'US'
print(f'The latitude and longitude of {city}, {country} is {get_city_latlon(city, country)}')

# Test case 3: Tokyo, Japan
city = 'Tokyo'
country = 'JP'
print(f'The latitude and longitude of {city}, {country} is {get_city_latlon(city, country)}')

# Test case 4: Walnut Creek, United States
city = 'Walnut Creek'
country = 'US'
print(f'The latitude and longitude of {city}, {country} is {get_city_latlon(city, country)}')

The latitude and longitude of Mumbai, IN is (19.07283, 72.88261)
The latitude and longitude of New York City, US is (40.71427, -74.00597)
The latitude and longitude of Tokyo, JP is (35.6895, 139.69171)
The latitude and longitude of Walnut Creek, US is (37.90631, -122.06496)
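
`get_city_latlon` returns the first matching row, so it yields the most populated city only if the CSV happens to be ordered by population. A minimal sketch of making that choice explicit is shown here. It assumes the file has a `population` column (a hypothetical name, not confirmed by this notebook) and uses made-up rows in place of the real CSV.

```python
import csv
import io

# Illustrative rows only; the real world_cities.csv may have different columns.
# The 'population' column name is an assumption.
SAMPLE_CSV = """name,country,lat,lng,population
Springfield,US,10.0,-20.0,1000
Springfield,US,30.0,-40.0,5000
Springfield,CA,50.0,-60.0,9000
"""

def most_populated_latlon(rows, city, country):
    """Return (lat, lng) of the most populated matching row, or None if nothing matches."""
    matches = [r for r in rows if r["name"] == city and r["country"] == country]
    if not matches:
        return None
    best = max(matches, key=lambda r: int(r["population"]))
    return (float(best["lat"]), float(best["lng"]))

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(most_populated_latlon(rows, "Springfield", "US"))
print(most_populated_latlon(rows, "Atlantis", "US"))
```

Because a missing city yields `None`, code that unpacks the result directly (`latitude, longitude = ...`) would raise a `TypeError`; checking for `None` first avoids that.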


***

### 2. Collecting Historical Data (Last Year)

In [33]:
def get_historical_precipitation(latitude: float, longitude: float) -> dict:
    '''
    Get the daily rain totals and daily hours of precipitation for the past year
    Params:
        latitude: float - latitude of the location
        longitude: float - longitude of the location
    Returns:
        a dictionary of the daily rain sum (mm) and the daily hours of precipitation for the past year
    '''
    
    # Get the historical weather data

    base_historical_url = "https://archive-api.open-meteo.com/v1/era5?"
    params_lat_long_ = "latitude=" + str(latitude) + "&longitude=" + str(longitude)

    # Get the current date
    current_date = datetime.date.today()
    
    # Calculate the date one year (365 days) ago
    one_year_ago = current_date - datetime.timedelta(days=365)
    
    # Format the date as YYYY-MM-DD
    formatted_date = one_year_ago.strftime("%Y-%m-%d")
    
    params_dates = "&start_date=" + formatted_date + "&end_date=" + str(current_date)
    param_other = "&daily=rain_sum&daily=precipitation_hours"
    total_url = base_historical_url + params_lat_long_ + params_dates + param_other

    response = requests.get(total_url)
    historical_data = response.json()
    
    historical_rain_sum = historical_data['daily']['rain_sum']
    historical_precipitation_hours = historical_data['daily']['precipitation_hours']
    
    return {
        "Historical Rain Sum": historical_rain_sum, 
        "Historical Precipitation Hours": historical_precipitation_hours
            }
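
The URL above is assembled by string concatenation. As a hedged alternative, the same query can be kept in a dictionary and encoded in one step. The helper name `build_archive_query`, the example coordinates, and the fixed date are illustrative, and the comma-separated `daily` value is an assumption about how the API accepts several variables at once.

```python
import datetime
from urllib.parse import urlencode

ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/era5"  # endpoint used above

def build_archive_query(latitude, longitude, today=None, days_back=365):
    """Build the query parameters for roughly one year of daily rain data."""
    today = today or datetime.date.today()
    start = today - datetime.timedelta(days=days_back)
    return {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start.isoformat(),  # YYYY-MM-DD
        "end_date": today.isoformat(),
        # Assumption: several daily variables can be requested as one comma-separated value
        "daily": "rain_sum,precipitation_hours",
    }

# Example coordinates and date are placeholders for illustration
params = build_archive_query(51.5, -0.12, today=datetime.date(2023, 6, 15))
print(ARCHIVE_URL + "?" + urlencode(params, safe=","))
```

The same dictionary can also be handed to `requests.get(ARCHIVE_URL, params=params, timeout=10)`, which performs the encoding itself.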

### <center>Testing</center>

In [34]:
# Testing historical precipitation using the get_historical_precipitation function and the get_city_latlon function

# Test 1: San Francisco
print("Test 1: San Francisco")
city = 'San Francisco'
country = 'US'
latitude, longitude = get_city_latlon(city, country)
pprint.pp(get_historical_precipitation(latitude, longitude))

print("\n")
print("-" * 20)
print("\n")

# Test 2: New York
print("Test 2: New York City")
city = 'New York City'
country = 'US'
latitude, longitude = get_city_latlon(city, country)
pprint.pp(get_historical_precipitation(latitude, longitude))

print("\n")
print("-" * 20)
print("\n")

# Test 3: London
print("Test 3: London")
city = 'London'
country = 'GB'
latitude, longitude = get_city_latlon(city, country)
pprint.pp(get_historical_precipitation(latitude, longitude))

Test 1: San Francisco
{'Historical Rain Sum': [0.0,
                         0.0,
                         2.1,
                         5.7,
                         0.9,
                         0.0,
                         0.0,
                         0.1,
                         0.0,
                         0.0,
                         0.0,
                         10.3,
                         0.0,
                         13.9,
                         5.9,
                         27.1,
                         1.2,
                         0.0,
                         0.0,
                         0.0,
                         0.0,
                         0.0,
                         0.0,
                         0.0,
                         0.0,
                         0.0,
                         11.4,
                         0.0,
                         0.0,
                         3.5,
                         1.5,
                         0.0,
              

***

### 3. Collecting Forecast Data (Next 7 Days)

In [25]:
def get_forecast_precipitation(latitude: float, longitude: float) -> list:
    '''
    Get the daily forecast rain sum (mm) for a given latitude and longitude for the next 7 days
    Params: 
        latitude: float - latitude of the location
        longitude: float - longitude of the location
    Returns:
        list - forecast rain sum (mm) for each day
    '''

    # Building the base URL
    base_forecast_url = "https://api.open-meteo.com/v1/forecast?"
    params_lat_long = "latitude=" + str(latitude) + "&longitude="  + str(longitude)
    params_others = "&daily=rain_sum"

    final_url = base_forecast_url + params_lat_long + params_others

    # Getting the forecast data
    response = requests.get(final_url)

    # Extracting the forecast precipitation
    forecast_data = response.json()
    forecast_precipitation = forecast_data['daily']['rain_sum']
    return forecast_precipitation
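
Both fetch functions index straight into `['daily']`, which raises a bare `KeyError` if the request failed. A small sketch of a more readable failure follows. The helper `extract_daily` is hypothetical, and it assumes a failed request returns JSON carrying a `reason` message; the payloads below are hand-written stand-ins for `response.json()`.

```python
def extract_daily(payload: dict, field: str) -> list:
    """Return payload['daily'][field], raising a readable error if it is missing."""
    # Assumption: a failed request returns JSON with a 'reason' message
    if "daily" not in payload:
        reason = payload.get("reason", "no 'daily' block in response")
        raise ValueError(f"Weather request failed: {reason}")
    if field not in payload["daily"]:
        raise ValueError(f"Field '{field}' missing from daily data")
    return payload["daily"][field]

# Hand-written payloads standing in for response.json()
ok = {"daily": {"rain_sum": [0.0, 1.2]}}
bad = {"error": True, "reason": "example failure message"}

print(extract_daily(ok, "rain_sum"))
try:
    extract_daily(bad, "rain_sum")
except ValueError as exc:
    print(exc)
```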

### <center>Testing</center>

In [26]:
# Testing the get_forecast_precipitation function using the get_city_latlon function

# Test 1: San Francisco
city = 'San Francisco'
country = 'US'
latitude, longitude = get_city_latlon(city, country)
print(get_forecast_precipitation(latitude, longitude))

# Test 2: New York
city = 'New York City'
country = 'US'
latitude, longitude = get_city_latlon(city, country)
print(get_forecast_precipitation(latitude, longitude))

# Test 3: London
city = 'London'
country = 'GB'
latitude, longitude = get_city_latlon(city, country)
print(get_forecast_precipitation(latitude, longitude))

# Test 4: Mumbai
city = 'Mumbai'
country = 'IN'
latitude, longitude = get_city_latlon(city, country)
print(get_forecast_precipitation(latitude, longitude))

# Test 5: Tokyo
city = 'Tokyo'
country = 'JP'
latitude, longitude = get_city_latlon(city, country)
print(get_forecast_precipitation(latitude, longitude))

[4.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
[29.1, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0]


***

### 4. Defining the Raininess Index

##### The raininess index combines the daily historical data (amount of rain in mm and hours of precipitation) with the daily forecast rain totals

#### Collecting London data:

##### To collect data for a particular city (in this case, London), we can call the functions defined above to gather both historical and forecast data

In [41]:
# Collecting historical and forecast precipitation data for London, GB
city = 'London'
country = 'GB'
latitude, longitude = get_city_latlon(city, country)

# Fetch each dataset once and reuse it for both the printout and the stored record
london_historical = get_historical_precipitation(latitude, longitude)
london_forecast = get_forecast_precipitation(latitude, longitude)

print(f"Weather data for {city}, {country}:")
print(f"   Historical precipitation: {london_historical}")
print(f"   Forecast precipitation: {london_forecast}")

# Storing the above data in a variable
London = {
    "City": city,
    "Country": country,
    "Historical Precipitation": london_historical,
    "Forecast Precipitation": london_forecast
}

Weather data for London, GB:
   Historical precipitation: {'Historical Rain Sum': [5.7, 5.5, 0.1, 0.0, 0.1, 3.3, 0.1, 1.7, 0.0, 2.0, 12.5, 6.1, 0.0, 3.0, 1.3, 5.4, 0.0, 3.0, 0.7, 0.0, 0.2, 0.1, 0.0, 4.9, 5.1, 0.0, 0.0, 0.0, 0.0, 0.0, 6.2, 7.8, 2.5, 0.0, 5.7, 0.0, 7.9, 2.1, 0.0, 7.3, 0.5, 0.9, 0.0, 0.0, 0.0, 0.0, 12.7, 0.0, 0.3, 0.3, 0.0, 1.2, 3.2, 0.2, 6.3, 0.4, 1.5, 3.6, 7.6, 10.5, 7.4, 2.3, 30.3, 2.1, 0.1, 0.0, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 1.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.2, 5.1, 2.9, 0.1, 0.1, 1.6, 0.0, 0.0, 0.0, 0.0, 0.5, 0.1, 0.0, 0.7, 0.1, 0.0, 8.1, 10.0, 16.5, 7.8, 2.1, 7.3, 0.0, 2.6, 2.6, 0.7, 2.0, 1.7, 12.2, 2.2, 0.1, 3.5, 11.5, 1.5, 1.6, 3.6, 0.7, 0.5, 1.3, 8.3, 6.5, 2.2, 0.0, 1.1, 1.8, 0.0, 0.3, 0.0, 0.4, 8.6, 1.9, 13.3, 0.5, 1.2, 1.3, 1.3, 6.4, 0.2, 0.2, 0.8, 0.0, 0.4, 1.1, 0.0, 0.9, 11.9, 2.9, 10.2, 5.0, 0.0, 2.0, 4.3, 4.3, 0.7, 3.6, 5.4, 0.1, 0.0, 1.7, 0.5, 1.7, 0.4, 0.0, 0.0, 0.0, 4.6, 4.4, 0.5, 1.1, 1.0, 0.2, 0.0, 1.0, 1.5, 0.0, 4.2, 0.0, 9.2, 12.3, 0.3, 0.4, 0.1, 

#### Converting London Data to Raininess Index

##### Now, a raininess index will be created, which weights more recent weather more heavily to determine how "rainy" a city is

##### The weighting works as follows:

The most emphasis is placed on the past two weeks of rain, with the least recent historical data being weighted the least. 

As forecast data is often not very accurate (e.g., many very rainy cities often show no rain in the forecast), forecast data will not **decrease** the raininess index of a city. However, if there is a substantial amount of rain in the forecast, this will **increase** the raininess index.
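
The recency weighting described above can be sketched as follows. The two-week full-weight window comes from the text; the exponential decay, the 90-day half-life, and the function name are illustrative assumptions, not values taken from this notebook. The sketch also assumes the daily values are ordered oldest first.

```python
def recency_weighted_sum(daily_values, recent_days=14, half_life_days=90):
    """Sum daily values (ordered oldest first), down-weighting older days."""
    n = len(daily_values)
    total = 0.0
    for i, value in enumerate(daily_values):
        if value is None:  # skip days with no recorded data
            continue
        age = n - 1 - i  # 0 for the most recent day
        if age < recent_days:
            weight = 1.0  # full weight inside the recent window
        else:
            # Hypothetical decay: the weight halves every half_life_days beyond the window
            weight = 0.5 ** ((age - recent_days) / half_life_days)
        total += weight * value
    return total

# The same 10 mm of rain counts for more when it fell yesterday than 200 days ago
old_rain = [10.0] + [0.0] * 200
new_rain = [0.0] * 200 + [10.0]
print(recency_weighted_sum(old_rain), recency_weighted_sum(new_rain))
```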

In [113]:
# The raininess index
def raininess(data: dict) -> float:
    '''
    Calculate the raininess of a city based on the historical and forecast precipitation data
    Params:
        data: dict - a dictionary containing historical and forecast precipitation data
    Returns:
        float - the raininess index of the city (log-scaled)
    '''

    historical_rain_sum = data['Historical Precipitation']['Historical Rain Sum']
    historical_precipitation_hours = data['Historical Precipitation']['Historical Precipitation Hours']
    forecast_rain_sum = data['Forecast Precipitation']

    # Remove None values from the historical metrics (will be None when data isn't recorded yet)
    historical_rain_sum = [day for day in historical_rain_sum if day is not None]
    historical_precipitation_hours = [hour for hour in historical_precipitation_hours if hour is not None]

    # Ensure the lists are the same length
    min_length = min(len(historical_rain_sum), len(historical_precipitation_hours))
    historical_rain_sum = historical_rain_sum[:min_length]
    historical_precipitation_hours = historical_precipitation_hours[:min_length]
    
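    # Raw index: total rain (mm) plus 0.3 times the total hours of precipitation.
    # Note: every historical day contributes equally to these sums.
    raininess_index = sum(historical_rain_sum) + 0.3 * sum(historical_precipitation_hours)

    # Add a small boost if there's significant rain in the forecast
    forecast_threshold = 5
    if any(rain >= forecast_threshold for rain in forecast_rain_sum):
        raininess_index += 10  # Fixed boost when any forecast day reaches the threshold (mm)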

    # Normalize the raininess index using a logarithmic scale
    log_base = 4
    multiplier = 10
    normalized_raininess_index = multiplier * math.log(raininess_index + 1, log_base)

    return normalized_raininess_index
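
To make the logarithmic normalization easier to interpret, the sketch below isolates it and adds its inverse. The base and multiplier are the ones used in `raininess`; the function names `normalize` and `denormalize` are introduced here for illustration. With base 4 and multiplier 10, each fourfold increase in (raw index + 1) adds 10 points to the score.

```python
import math

LOG_BASE = 4     # same base as in raininess()
MULTIPLIER = 10  # same multiplier as in raininess()

def normalize(raw_index):
    """Map a raw index (>= 0) onto the logarithmic scale."""
    return MULTIPLIER * math.log(raw_index + 1, LOG_BASE)

def denormalize(score):
    """Recover the raw index from a normalized score."""
    return LOG_BASE ** (score / MULTIPLIER) - 1

for raw in (0, 3, 15, 255):
    print(raw, round(normalize(raw), 2))
```

Applying `denormalize` to a printed raininess value gives back the underlying raw sum, which can help when comparing cities whose scores look close on the log scale.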


### <center>Testing</center>

In [114]:
# Testing the raininess function

# Test 1: London
print("Raininess of London:")
print(raininess(London))

print("-" * 20)

# Test 2: San Francisco
print("Raininess of San Francisco:")

city = 'San Francisco'
country = 'US'
latitude, longitude = get_city_latlon(city, country)

San_Francisco = {
    "City": 'San Francisco',
    "Country": 'US',
    "Historical Precipitation": get_historical_precipitation(latitude, longitude),
    "Forecast Precipitation": get_forecast_precipitation(latitude, longitude)
}

print(raininess(San_Francisco))

print("-" * 20)

# Test 3: New York

print("Raininess of New York City:")
city = 'New York City'
country = 'US'
latitude, longitude = get_city_latlon(city, country)

New_York = {
    "City": 'New York City',
    "Country": 'US',
    "Historical Precipitation": get_historical_precipitation(latitude, longitude),
    "Forecast Precipitation": get_forecast_precipitation(latitude, longitude)
}

print(raininess(New_York))

print("-" * 20)

# Test 4: Mumbai
print("Raininess of Mumbai:")
city = 'Mumbai'
country = 'IN'
latitude, longitude = get_city_latlon(city, country)

Mumbai = {
    "City": 'Mumbai',
    "Country": 'IN',
    "Historical Precipitation": get_historical_precipitation(latitude, longitude),
    "Forecast Precipitation": get_forecast_precipitation(latitude, longitude)
}

print(raininess(Mumbai))



Raininess of London:
52.39384809737882
--------------------
Raininess of San Francisco:
49.98095497544067
--------------------
Raininess of New York City:
53.666771703069145
--------------------
Raininess of Mumbai:
58.35018276878996


#### For now, the raininess index is just a number. 

We are next going to run a number of cities through these functions in order to create a distribution of raininess; from this distribution, the bottom 25% of cities (in terms of raininess index) will be deemed "Not rainy", while the top 25% will be deemed "Very rainy". Our goal here is to see where London falls in this distribution.
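
The quartile labelling described above might look like the sketch below. It reuses the four index values printed earlier (rounded) purely as placeholder input; a real distribution would need many more cities. The middle label "In between" is a hypothetical name, since only the bottom and top quartiles are named in the text.

```python
import statistics

def label_by_quartile(scores: dict) -> dict:
    """Label each city by where its score falls relative to the quartile cut points."""
    q1, _, q3 = statistics.quantiles(scores.values(), n=4)
    labels = {}
    for city, score in scores.items():
        if score < q1:
            labels[city] = "Not rainy"    # bottom 25%
        elif score > q3:
            labels[city] = "Very rainy"   # top 25%
        else:
            labels[city] = "In between"   # hypothetical label for the middle 50%
    return labels

# Rounded index values from the test output above, used only as sample input
scores = {
    "London": 52.39,
    "San Francisco": 49.98,
    "New York City": 53.67,
    "Mumbai": 58.35,
}
labels = label_by_quartile(scores)
for city, label in labels.items():
    print(f"{city}: {label}")
```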

***

### 5. Comparing London to Other Cities