---
title: "Digital Transformation in Organisation - KIPA Project"
authors:
- 
date: Fall-2023

abstract: " "

format: 
  html:  
    code-fold: true
    standalone: true
    embed-resources: true
    number-sections: true # numbering the header
    toc: true
    toc-depth: 4
---


**Introduction to the notebook:**

This notebook retrieves hourly weather data (air temperature and wind speed) from the free Open-Meteo weather API at <https://open-meteo.com/>.
The request parameters are defined as follows:


|**Parameter**             |**Definition** [^1]                                                               |**Format**         |
|--------------------------|----------------------------------------------------------------------------------|-------------------|
|`latitude` & `longitude`  |Geographical WGS84 coordinate of the location                                     |Floating point     |
|`start_date` & `end_date` |The time interval to get weather data. A day must be specified as an ISO8601 date.|String (yyyy-mm-dd)|
|`hourly`                  |A list of weather variables which should be returned.                             |String array       |
|`timezone`                |If timezone is set, all timestamps are returned as local-time and data is returned starting at 00:00 local-time.|String             |

**temperature_unit** is *celsius* and **windspeed_unit** is *kmh* (kilometers per hour).

**temperature_2m:** Air temperature at 2 meters above ground.

**windspeed_10m & windspeed_100m:** Wind speed at 10 or 100 meters above ground. Wind speed at 10 meters is the standard level.
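
As a minimal sketch of how the parameters in the table combine into one request, the snippet below builds the query URL with the standard library only and does not contact the API. The endpoint is the archive URL used in the code cells of this notebook. The helper name `build_query_url` and the example coordinates are illustrative assumptions, and the weather variables are joined with commas, which is the form that appears in the documentation link.

```python
from urllib.parse import urlencode

ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_query_url(latitude, longitude, start_date, end_date, hourly, timezone):
    """Return the full request URL for one location (hypothetical helper)."""
    query = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,      # ISO8601 date, yyyy-mm-dd
        "end_date": end_date,          # ISO8601 date, yyyy-mm-dd
        "hourly": ",".join(hourly),    # weather variables as one comma-separated value
        "timezone": timezone,
    }
    # urlencode percent-encodes reserved characters such as "," and "/".
    return f"{ARCHIVE_URL}?{urlencode(query)}"


url = build_query_url(
    53.5507, 9.993, "2020-09-01", "2023-09-05",
    ["temperature_2m", "windspeed_10m"], "Europe/Berlin",
)
print(url)
```

Printing the URL before sending any request is a cheap way to check that every parameter has the format listed in the table.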


[^1]: The definitions above are taken from the [API Documentation](https://open-meteo.com/en/docs/historical-weather-api#latitude=51.1657&longitude=10.4515&start_date=2020-01-01&end_date=2023-09-12&hourly=temperature_2m,windspeed_10m,windspeed_100m,winddirection_10m,winddirection_100m&daily=&timezone=Europe%2FBerlin).

In [1]:
import requests
import json
import pandas as pd
from datetime import datetime

In [2]:
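def weekly_data(df):
    '''
    Reduce the size of the dataframe by averaging the hourly values into one row per week.
    Each row is labelled with the first day (Monday) of its week.
    '''
    # Group the rows into calendar weeks (ending on Sunday) and average every column.
    df = df.groupby(pd.Grouper(key='time', freq='W')).mean().reset_index()
    # Relabel each week by its start date instead of its end date.
    df['time'] = df['time'].dt.to_period('W').dt.start_time
    return df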
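
To make the effect of the weekly aggregation concrete, here is a small self-contained sketch that applies the same two steps to synthetic data. The function name `weekly_mean`, the column values, and the dates are made up for illustration: two full weeks of hourly readings, with a constant temperature in each week.

```python
import pandas as pd


def weekly_mean(df, time_col="time"):
    """Average all columns per calendar week and label each week by its Monday."""
    out = df.groupby(pd.Grouper(key=time_col, freq="W")).mean().reset_index()
    out[time_col] = out[time_col].dt.to_period("W").dt.start_time
    return out


# 14 days of hourly rows starting on Monday 2023-01-02 (synthetic values).
hourly = pd.DataFrame({
    "time": pd.date_range("2023-01-02", periods=14 * 24, freq="h"),
    "temperature_2m": [1.0] * (7 * 24) + [3.0] * (7 * 24),
})

weekly = weekly_mean(hourly)
print(weekly)
```

The 336 hourly rows collapse into two rows, one per week, which is the size reduction the function is meant to provide.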

In [3]:
base_URL = "https://archive-api.open-meteo.com/v1/archive"

In [4]:
params = {
    'start_date': '2020-09-01',
    'end_date': '2023-09-05',
    'hourly': ['temperature_2m', 'windspeed_10m'],  # a list keeps the variable order stable
    'timezone': 'Europe/Berlin',
}
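
Since the API expects both dates as ISO8601 strings, it can help to validate them before sending a request. The following is only a sketch using the standard library; `check_date_range` is a hypothetical helper and not part of the original notebook.

```python
from datetime import date


def check_date_range(start_date, end_date):
    """Parse two yyyy-mm-dd strings and make sure they are in order (hypothetical helper)."""
    start = date.fromisoformat(start_date)  # raises ValueError on malformed input
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError(f"start_date {start_date} is after end_date {end_date}")
    return start, end


start, end = check_date_range("2020-09-01", "2023-09-05")
n_days = (end - start).days + 1  # both endpoints are included in the interval
print(n_days)
```

The day count also gives a rough idea of the response size, because each day contributes 24 hourly rows per location.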

In [5]:
df_20212223 = {}
coordinates_20212223 = [{'latitude':53.5507 , 'longitude':9.993},
                        {'latitude':53.5067 , 'longitude':9.9871},
                        {'latitude':52.5952 , 'longitude':8.8349},
                        {'latitude':53.517 , 'longitude':10.2488},
                        {'latitude':53.3264 , 'longitude':9.8681},
                        {'latitude':53.5934 , 'longitude':9.4763},
                        {'latitude':53.8236 , 'longitude':9.2854},
                        {'latitude':53.8333 , 'longitude':9.1333},
                        {'latitude':53.8689 , 'longitude':10.6873},
                        {'latitude':54.1624 , 'longitude':10.4233}]

for i, coordinates in enumerate(coordinates_20212223):
    # Merge the shared parameters with this location's coordinates without modifying params.
    response = requests.get(base_URL, params={**params, **coordinates})
    if response.status_code == 200:
        data = response.json()
        df_name = f'df_{i}'
        # One dataframe per location, with the ISO8601 timestamps parsed to datetimes.
        df_20212223[df_name] = pd.DataFrame(data['hourly'])
        df_20212223[df_name]['time'] = pd.to_datetime(df_20212223[df_name]['time'], format='%Y-%m-%dT%H:%M')
    else:
        print("Error: Status code {} for location {}".format(response.status_code, i))
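
The loop above turns the `hourly` part of each response into a dataframe. The sketch below reproduces that step offline on a hand-written payload, so the conversion can be inspected without a network call. The payload only mimics the shape that the loop relies on (an `hourly` object holding parallel lists); its values and the helper name `hourly_to_frame` are assumptions for illustration.

```python
import pandas as pd

# Hand-written stand-in for response.json(); the numbers are made up.
sample_payload = {
    "hourly": {
        "time": ["2020-09-01T00:00", "2020-09-01T01:00"],
        "temperature_2m": [13.9, 13.3],
        "windspeed_10m": [8.8, 8.0],
    }
}


def hourly_to_frame(payload):
    """Convert the "hourly" block of a payload into a dataframe with parsed timestamps."""
    frame = pd.DataFrame(payload["hourly"])
    frame["time"] = pd.to_datetime(frame["time"], format="%Y-%m-%dT%H:%M")
    return frame


frame = hourly_to_frame(sample_payload)
print(frame.dtypes)
```

Parsing `time` into a datetime column matters for the later steps, because both the per-timestamp average and the weekly grouping operate on that column.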


In [6]:
# Stack the per-location dataframes, then average all locations for each timestamp.
combined_20212223 = pd.concat(df_20212223)
average_20212223 = combined_20212223.groupby('time').mean().reset_index()
average_20212223.head()

|   |time               |temperature_2m|windspeed_10m|
|---|-------------------|--------------|-------------|
|0  |2020-09-01 00:00:00|13.85         |8.81         |
|1  |2020-09-01 01:00:00|13.29         |8.02         |
|2  |2020-09-01 02:00:00|12.76         |7.33         |
|3  |2020-09-01 03:00:00|12.49         |7.33         |
|4  |2020-09-01 04:00:00|12.08         |7.48         |
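
The regional average is produced by stacking the per-location dataframes and grouping by timestamp. A minimal sketch with two synthetic locations shows what that combination does; the temperatures are invented so that the result is easy to check by hand.

```python
import pandas as pd

times = pd.to_datetime(["2020-09-01 00:00", "2020-09-01 01:00"])
frames = {
    "df_0": pd.DataFrame({"time": times, "temperature_2m": [10.0, 12.0]}),
    "df_1": pd.DataFrame({"time": times, "temperature_2m": [14.0, 16.0]}),
}

# Passing a dict to concat stacks the rows; the dict keys become the outer index level.
combined = pd.concat(frames)

# One row per timestamp, holding the mean over all locations.
average = combined.groupby("time").mean().reset_index()
print(average)
```

Each timestamp ends up with the mean of the two locations (12.0 and 14.0 here). A location whose request failed is simply missing from the dictionary, so the mean is then taken over the remaining locations.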


In [7]:
df_2425 = {}
coordinates_2425 = [{'latitude':54.3213 , 'longitude':10.1349},
                        {'latitude':54.4 , 'longitude':10.1333},
                        {'latitude':54.4 , 'longitude':9.9833},
                        {'latitude':54.4685 , 'longitude':9.8382},
                        {'latitude':54.0748 , 'longitude':9.9819},
                        {'latitude':53.7903 , 'longitude':10.0054},
                        {'latitude':53.8329 , 'longitude':9.9581},
                        {'latitude':53.9183 , 'longitude':9.8842},
                        {'latitude':54.1674 , 'longitude':9.8544},
                        {'latitude':54.0889 , 'longitude':9.6536},
                        {'latitude':54.0167 , 'longitude':10.0333},
                        {'latitude':54.0399 , 'longitude':10.215},
                        {'latitude':54.3066 , 'longitude':9.6631},
                        {'latitude':54.3184 , 'longitude':9.673},
                        {'latitude':54.7843 , 'longitude':9.4396},
                        {'latitude':53.6877 , 'longitude':9.6639},
                        {'latitude':53.7 , 'longitude':9.7167},
                        {'latitude':53.6667 , 'longitude':9.6667},
                        {'latitude':53.7079 , 'longitude':9.681},
                        {'latitude':53.7 , 'longitude':9.65},
                        {'latitude':53.6833 , 'longitude':9.6167},
                        {'latitude':54.35 , 'longitude':8.7667},
                        {'latitude':54.4858 , 'longitude':9.0524},
                        {'latitude':54.3971 , 'longitude':9.1865}]

for i, coordinates in enumerate(coordinates_2425):
    # Merge the shared parameters with this location's coordinates without modifying params.
    response = requests.get(base_URL, params={**params, **coordinates})
    if response.status_code == 200:
        data = response.json()
        df_name = f'df_{i}'
        # One dataframe per location, with the ISO8601 timestamps parsed to datetimes.
        df_2425[df_name] = pd.DataFrame(data['hourly'])
        df_2425[df_name]['time'] = pd.to_datetime(df_2425[df_name]['time'], format='%Y-%m-%dT%H:%M')
    else:
        print("Error: Status code {} for location {}".format(response.status_code, i))


In [8]:
combined_2425 = pd.concat(df_2425)
average_2425 = combined_2425.groupby('time').mean().reset_index()
average_2425.head()

|   |time               |temperature_2m|windspeed_10m|
|---|-------------------|--------------|-------------|
|0  |2020-09-01 00:00:00|13.979167     |9.545833     |
|1  |2020-09-01 01:00:00|13.4125       |8.904167     |
|2  |2020-09-01 02:00:00|12.841667     |8.2375       |
|3  |2020-09-01 03:00:00|12.458333     |8.216667     |
|4  |2020-09-01 04:00:00|12.020833     |7.629167     |


In [14]:
df_262728 = {}
coordinates_262728 = [{'latitude':53.1833 , 'longitude':8},
                        {'latitude':53.53 , 'longitude':8.1125},
                        {'latitude':53.4692 , 'longitude':7.4823},
                        {'latitude':53.2167 , 'longitude':7.8},
                        {'latitude':53.2316 , 'longitude':7.461},
                        {'latitude':53.3114 , 'longitude':7.423},
                        {'latitude':53.1333 , 'longitude':7.6167},
                        {'latitude':53.3 , 'longitude':7.6},
                        {'latitude':53.2667 , 'longitude':7.3833},
                        {'latitude':53.2375 , 'longitude':8.4566},
                        {'latitude':53.566 , 'longitude':9.6111},
                        {'latitude':53.5998 , 'longitude':9.5522},
                        {'latitude':53.0758 , 'longitude':8.8072}]

for i, coordinates in enumerate(coordinates_262728):
    # Merge the shared parameters with this location's coordinates without modifying params.
    response = requests.get(base_URL, params={**params, **coordinates})
    if response.status_code == 200:
        data = response.json()
        df_name = f'df_{i}'
        # One dataframe per location, with the ISO8601 timestamps parsed to datetimes.
        df_262728[df_name] = pd.DataFrame(data['hourly'])
        df_262728[df_name]['time'] = pd.to_datetime(df_262728[df_name]['time'], format='%Y-%m-%dT%H:%M')
    else:
        print("Error: Status code {} for location {}".format(response.status_code, i))
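
The same download loop is written out once per region. One possible way to avoid the repetition, sketched here under stated assumptions, is a single function that receives the coordinate list and a fetch callable. Both `collect_region` and `fake_fetch` are hypothetical names that do not appear in the original notebook, and `fake_fetch` replaces the HTTP request so that the example runs offline.

```python
import pandas as pd


def collect_region(coordinates, fetch_hourly):
    """Fetch every location of one region and average them per timestamp.

    fetch_hourly is any callable taking (latitude, longitude) and returning a
    dict of parallel lists with a "time" entry, or None when the request failed.
    """
    frames, failed = {}, []
    for i, point in enumerate(coordinates):
        hourly = fetch_hourly(point["latitude"], point["longitude"])
        if hourly is None:
            failed.append(point)  # remember which locations are missing
            continue
        frame = pd.DataFrame(hourly)
        frame["time"] = pd.to_datetime(frame["time"], format="%Y-%m-%dT%H:%M")
        frames[f"df_{i}"] = frame
    average = pd.concat(frames).groupby("time").mean().reset_index()
    return average, failed


def fake_fetch(latitude, longitude):
    """Offline stand-in for the API call; returns invented temperatures."""
    if latitude > 54:  # pretend that this location could not be downloaded
        return None
    return {
        "time": ["2020-09-01T00:00", "2020-09-01T01:00"],
        "temperature_2m": [latitude - 40.0, latitude - 41.0],
    }


points = [
    {"latitude": 53.0, "longitude": 8.0},
    {"latitude": 53.5, "longitude": 8.1},
    {"latitude": 54.5, "longitude": 9.0},
]
average, failed = collect_region(points, fake_fetch)
print(average)
print(failed)
```

Returning the failed locations alongside the average makes it visible when a regional mean is based on fewer points than intended. Note that `pd.concat` raises a `ValueError` if every location fails, so a real implementation would need to handle that case.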

In [15]:
combined_262728 = pd.concat(df_262728)
average_262728 = combined_262728.groupby('time').mean().reset_index()
average_262728.head()

|   |time               |temperature_2m|windspeed_10m|
|---|-------------------|--------------|-------------|
|0  |2020-09-01 00:00:00|12.353846     |6.115385     |
|1  |2020-09-01 01:00:00|11.753846     |5.546154     |
|2  |2020-09-01 02:00:00|11.269231     |4.776923     |
|3  |2020-09-01 03:00:00|10.953846     |5.346154     |
|4  |2020-09-01 04:00:00|10.761538     |5.930769     |


In [20]:
# Aggregate the hourly regional average to weekly means and save it without the row index.
w_average_20212223 = weekly_data(average_20212223)
file_path = 'w_average_20212223.csv'
w_average_20212223.to_csv(file_path, index=False)

In [21]:
w_average_2425 = weekly_data(average_2425)
file_path = 'w_average_2425.csv'
w_average_2425.to_csv(file_path, index=False)

In [22]:
w_average_262728 = weekly_data(average_262728)
file_path = 'w_average_262728.csv'
w_average_262728.to_csv(file_path,index=False)
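
When the saved files are loaded again later, the `time` column needs to be parsed back into datetimes. The sketch below checks such a round trip on a small synthetic weekly table written to a temporary directory; the file name and the values are illustrative and unrelated to the files created above.

```python
import os
import tempfile

import pandas as pd

# Synthetic weekly table with the same columns as the saved files.
weekly = pd.DataFrame({
    "time": pd.to_datetime(["2020-08-31", "2020-09-07"]),
    "temperature_2m": [14.2, 15.1],
    "windspeed_10m": [9.3, 11.0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "w_average_example.csv")
    weekly.to_csv(file_path, index=False)
    # Without parse_dates the "time" column would come back as plain strings.
    restored = pd.read_csv(file_path, parse_dates=["time"])

print(restored.dtypes)
```

Writing with `index=False` keeps the CSV free of an extra unnamed index column, so the restored table has exactly the original three columns.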