## <b> NB01 - City Selection </b>

Selection criteria
| What | Why and How |
| :--: | :--: |
| In the Northern Hemisphere | To roughly standardize climate over my chosen time period (i.e., the seasons fall in roughly the same months in all of my chosen cities) |
| Have a reputation for being rainy | Decided intuitively and through general searches for famous rainy movie scenes (e.g., "It rains nine months a year in Seattle", Sleepless in Seattle) |
| 5 cities in total | A sample large enough to compare cities meaningfully, yet small enough not to be overwhelming |

Based on my criteria, these are the cities I have chosen:
1) London, UK 🇬🇧
2) Oslo, Norway 🇳🇴
3) Seattle, USA 🇺🇸
4) Munich, Germany 🇩🇪
5) Kyoto, Japan 🇯🇵

The movies that prominently feature rain in these cities respectively are:
1) Notting Hill/Bridget Jones's Diary/Four Weddings and a Funeral 📔💐
2) Oslo, August 31st (<u> NOTE: </u> There is mostly no rain in the movie, but it rains a lot in Oslo/Norway, which is why it's on the list!) 🚲🗓
3) Sleepless in Seattle ☎️📻
4) Suspiria 💃🩸
5) Rashomon ⚔️💧

In [1]:
## Importing the packages that I'll be using throughout this assignment
import json
import os

import numpy as np
import pandas as pd
import requests

#!pip install geopandas
import geopandas as gpd

#!pip install geodatasets
import geodatasets

from lets_plot.geo_data import *
from lets_plot import *
LetsPlot.setup_html()

from functions import *

from IPython.display import *

The geodata is provided by © OpenStreetMap contributors and is made available here under the Open Database License (ODbL).


I have created a function that prints the output of "get_lat_lon", so I have imported it and will use it to efficiently get the latitude and longitude of all five of my cities!
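
The `functions` module itself is not shown in this notebook, so the following is only a minimal sketch of how such a wrapper might look. `lookup_lat_lon` is a hypothetical stand-in for "get_lat_lon" (it reads from a hard-coded dictionary instead of calling an API), and `format_location_lat_lon` is likewise an illustrative name rather than the actual helper.

```python
# Hypothetical stand-in for get_lat_lon: a real version would query a geocoding API.
# The coordinates are the ones printed in the cells of this notebook.
KNOWN_LOCATIONS = {
    ("GB", "London"): (51.50853, -0.12574),
    ("NO", "Oslo"): (59.91273, 10.74609),
}


def lookup_lat_lon(country_code, city_name):
    """Return a (latitude, longitude) tuple for a known city."""
    return KNOWN_LOCATIONS[(country_code, city_name)]


def format_location_lat_lon(country_code, city_name):
    """Build the sentence that the printing helper would display."""
    lat_lon = lookup_lat_lon(country_code, city_name)
    return f"The latitude and longitude of {country_code}, {city_name} is {lat_lon}"


print(format_location_lat_lon("GB", "London"))
```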

In [2]:
print_location_lat_lon('GB', 'London')

The latitude and longitude of GB, London is (51.50853, -0.12574)


In [3]:
print_location_lat_lon('NO', 'Oslo')

The latitude and longitude of NO, Oslo is (59.91273, 10.74609)


In [4]:
print_location_lat_lon('US', 'Seattle')

The latitude and longitude of US, Seattle is (47.60621, -122.33207)


In [5]:
print_location_lat_lon('DE', 'Munich')

The latitude and longitude of DE, Munich is (48.13743, 11.57549)


In [6]:
print_location_lat_lon('JP', 'Kyoto')

The latitude and longitude of JP, Kyoto is (35.02107, 135.75385)
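
With these coordinates, the first selection criterion can be checked directly: a city lies in the Northern Hemisphere when its latitude is positive. Below is a small sketch using the values printed above; the dictionary and variable names are illustrative.

```python
# (latitude, longitude) pairs copied from the outputs above
city_coordinates = {
    "London": (51.50853, -0.12574),
    "Oslo": (59.91273, 10.74609),
    "Seattle": (47.60621, -122.33207),
    "Munich": (48.13743, 11.57549),
    "Kyoto": (35.02107, 135.75385),
}

# A positive latitude means the city is north of the equator
northern = {city: lat > 0 for city, (lat, lon) in city_coordinates.items()}
all_northern = all(northern.values())

print(all_northern)
```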


I am creating a CSV file to hold city data between January 01, 2021 and January 01, 2024. I am doing this to keep a neater working environment and to limit my API calls, which have a daily cap. It's also more convenient to work with data stored locally.

In [7]:
## Creating a list with my chosen country codes and cities
cities = [
    ("GB", "London"),
    ("JP", "Kyoto"),
    ("DE", "Munich"),
    ("US", "Seattle"),
    ("NO", "Oslo")
]

## Defining start and end dates
start_date = '2021-01-01'
end_date = '2024-01-01'

## Creating an empty list to hold the data
all_city_data = []

## Looping over the country codes and city names, pulling the dates and rain sum data out of each JSON response and storing them in a dataframe
for country_code, city_name in cities:
    json_data = get_historical_data(country_code, city_name, start_date, end_date)
    
    city_data = {
        "country": country_code,
        "city": city_name,
        "date": json_data['daily']['time'],
        "rain_sum": json_data['daily']['rain_sum']
    }
    
    city_df = pd.DataFrame(city_data)
    all_city_data.append(city_df)

## Combining city rain sum data into a dataframe
final_df = pd.concat(all_city_data, ignore_index=True)
## Saving dataframe into the data file as CSV
final_df.to_csv('../data/historical_city_rain_data.csv', index=False)
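
The loop above calls the API once per city every time the cell is run. Since the aim is to limit API calls, one option is to skip the request whenever a cached copy already exists on disk. The following is a sketch of that idea using only the standard library; `load_or_fetch_json` and its `fetch` argument are illustrative names that are not part of `functions`. In this notebook, `fetch` would wrap a call to `get_historical_data`.

```python
import json
from pathlib import Path


def load_or_fetch_json(cache_path, fetch):
    """Return the JSON cached at cache_path if it exists;
    otherwise call fetch(), save its result there, and return it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open() as f:
            return json.load(f)

    data = fetch()  # Only reached when there is no cached file yet
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w") as f:
        json.dump(data, f)
    return data
```

For example, `load_or_fetch_json('../data/raw/GB_London.json', lambda: get_historical_data('GB', 'London', start_date, end_date))` would only hit the API on the first run; the `../data/raw/` location is an assumed path, not one used elsewhere in this project.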

## Map graph

In [9]:
data = pd.read_csv('../data/mean_raininess_per_city.csv')

centroids = geocode_cities(data["city"]).get_centroids()

p = ggplot() + ggsize(800, 500)

## Latitude 0 (the equator), drawn below as a dashed reference line
lats = [0]

plot = (
    p + 
    geom_livemap(zoom=2) +  
    geom_hline(aes(yintercept=lats), color='#e0218a', linetype=2, size=1) +  
    geom_point(aes(color='city'), 
               data=centroids,  
               size=5,  
               show_legend=True,  
               tooltips=layer_tooltips().title("@city"))  
)

plot

In [None]:
data = pd.read_csv('/files/ds105a-2024-w06-summative-deyavuz/data/mean_raininess_per_city.csv')

countries_data = gpd.read_file('/files/ds105a-2024-w06-summative-deyavuz/data/world_cities.csv')

selected_countries = ['US', 'GB', 'JP', 'NO', 'DE']
countries_data_filtered = countries_data[countries_data['country'].isin(selected_countries)]

centroids = geocode_cities(data["city"]).get_centroids()

p = ggplot() + ggsize(800, 500)

# Latitude 0 (the equator), drawn below as a dashed reference line
lats = [0]

plot = (
    p + 
    geom_livemap(zoom=2) +  
    geom_polygon(aes(fill='country', group='country'),  # Map your fill variable
                 data=countries_data_filtered,
                 alpha=0.5,   # Adjust transparency
                 color='black') +  # Country borders
    geom_hline(aes(yintercept=lats), color='#e0218a', linetype=2, size=1) +  
    geom_point(aes(color='city'), 
               data=centroids,  
               size=5,  
               show_legend=True,  
               tooltips=layer_tooltips().title("@city"))  
)

ggsave(plot, filename='cities_map.html', path='/files/ds105a-2024-w06-summative-deyavuz/figures', w=8, h=5, unit='in', dpi=300)

In [None]:
with open('/files/ds105a-2024-w06-summative-deyavuz/figures/cities_map.html', 'w') as f:
    f.write(plot.to_html())  

HTML(filename='/files/ds105a-2024-w06-summative-deyavuz/figures/cities_map.html') 

[Interactive Map](https://github.com/lse-ds105/ds105a-2024-w06-summative-deyavuz/blob/main/figures/cities_map.html)

In [10]:
data = pd.read_csv('/files/ds105a-2024-w06-summative-deyavuz/data/mean_raininess_per_city.csv')

centroids = geocode_cities(data["city"]).get_centroids()

p = ggplot() + ggsize(800, 500)

## Latitude 0 (the equator), drawn below as a dashed reference line
lats = [0]

plot = (
    p + 
    geom_livemap(zoom=2) +  
    geom_hline(aes(yintercept=lats), color='#e0218a', linetype=2, size=1) +  
    geom_point(aes(color='city'), 
               data=centroids,  
               size=5,  
               show_legend=True,  
               tooltips=layer_tooltips().title("@city"))  
)

plot