# Timetable

This script collects the scheduled arrival times of buses and trams at every stop in Warsaw. Running it on a given day returns the timetable for that day only. The data here was collected on 16 January 2023, so the resulting timetable is valid only for that date.

Importing the needed libraries

In [3]:
import urllib.request, json
import pandas as pd
import requests

Data on all stops in Warsaw is retrieved from the city's API (https://api.um.warszawa.pl/) and stored in the df_loc DataFrame. The essential fields extracted from the API are:
1. Stop number ('zespol')
2. Stop label ('slupek')
3. Longitude ('dlug_geo')
4. Latitude ('szer_geo')
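
Each entry in the API response is a list of key/value pairs rather than a flat record. Below is a minimal sketch of how such entries can be pivoted into DataFrame rows. The sample records and the helper name `records_to_frame` are illustrative assumptions, not values returned by the API; only the field names come from the list above.

```python
import pandas as pd

# Hypothetical sample mimicking the shape of data['result']; the values are made up.
sample_result = [
    {"values": [
        {"key": "zespol", "value": "1001"},
        {"key": "slupek", "value": "01"},
        {"key": "szer_geo", "value": "52.25"},
        {"key": "dlug_geo", "value": "21.04"},
    ]},
    {"values": [
        {"key": "zespol", "value": "1001"},
        {"key": "slupek", "value": "02"},
        {"key": "szer_geo", "value": "52.26"},
        {"key": "dlug_geo", "value": "21.05"},
    ]},
]


def records_to_frame(result):
    """Turn a list of key/value records into a DataFrame with one row per record."""
    rows = [{pair["key"]: pair["value"] for pair in record["values"]} for record in result]
    return pd.DataFrame(rows)


df_sample = records_to_frame(sample_result)
print(df_sample)
```

This sketch builds the rows with a dict comprehension instead of the set_index/transpose approach used in the notebook; both produce one row per stop.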

In [4]:
df_loc = []
params = {
        "id": "ab75c33d-3a26-4342-b36a-6e5fef0a3ac3",
        "apikey": "0cecd684-f566-4629-9dc2-c4dd5f49748f"
    }
response = requests.get("https://api.um.warszawa.pl/api/action/dbstore_get/", params=params)
data = response.json()

# Each record is a list of {'key': ..., 'value': ...} pairs; pivot it into a one-row DataFrame.
rows = []
for record in data['result']:
    row = pd.DataFrame(record['values']).set_index('key')['value'].rename_axis(None)
    rows.append(pd.DataFrame(row).T)
# Concatenate once instead of growing df_loc inside the loop.
df_loc = pd.concat(rows).reset_index()

At this stage, we collect the arrival times. For each stop number and stop label, we first request the lines serving that stop, then request the timetable for each combination of stop number, stop label, and line number. From the responses we build the timetable DataFrame, which includes:
1. Fleet number ('brygada')
2. Time ('czas')
3. Stop name ('stop_name')
4. Stop number ('stop_id')
5. Stop label ('bus_stop_nr')
6. Line number ('line')
7. Longitude ('Lon')
8. Latitude ('Lat')

Finally, we save this DataFrame as a 'timetable.csv' file, which is available in the repository. This file will be required for the next phase of the project.
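
The assembly step described above can be sketched offline as follows. The sample departures, the stop metadata, and the helper `build_timetable_frame` are hypothetical stand-ins for the API responses; only the column names are taken from the notebook.

```python
import pandas as pd


def build_timetable_frame(departures, stop_id, stop_nr, line, stop_meta):
    """Flatten key/value departure records and attach stop metadata to every row."""
    rows = [{p["key"]: p["value"] for p in rec["values"]} for rec in departures]
    if not rows:
        return None  # no departures for this stop/line combination
    df = pd.DataFrame(rows, columns=["brygada", "czas"])
    df["stop_name"] = stop_meta["nazwa_zespolu"]
    df["stop_id"] = stop_id
    df["bus_stop_nr"] = stop_nr
    df["line"] = line
    df["Lat"] = stop_meta["szer_geo"]
    df["Lon"] = stop_meta["dlug_geo"]
    return df


# Made-up data in the shape the notebook expects.
departures = [
    {"values": [{"key": "brygada", "value": "1"}, {"key": "czas", "value": "05:12:00"}]},
    {"values": [{"key": "brygada", "value": "3"}, {"key": "czas", "value": "05:27:00"}]},
]
stop_meta = {"nazwa_zespolu": "Example stop", "szer_geo": "52.25", "dlug_geo": "21.04"}

frames = []
for line in ["123", "456"]:
    frame = build_timetable_frame(departures, "1001", "01", line, stop_meta)
    if frame is not None:
        frames.append(frame)

# Concatenate once at the end rather than growing the DataFrame inside the loop.
timetable_sample = pd.concat(frames, ignore_index=True)
print(timetable_sample)
```

Collecting the per-line frames in a list and concatenating once is a design choice of this sketch; it avoids repeatedly copying a growing DataFrame when there are many stops and lines.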

In [7]:
timetable = []
stops = list(df_loc['zespol'].unique())

for x in stops:
    bus_stop_nrs = list(df_loc[df_loc['zespol'] == str(x)]['slupek'].unique())
    for y in bus_stop_nrs:
        with urllib.request.urlopen("https://api.um.warszawa.pl/api/action/dbtimetable_get/?id=88cd555f-6f31-43ca-9de4-66c479ad5942&busstopId={0}&busstopNr={1}&apikey=0cecd684-f566-4629-9dc2-c4dd5f49748f".format(str(x), str(y))) as url:
            data = json.loads(url.read().decode())['result']
        lines = []
        for i in data:
            lines.append(i['values'][0]['value'])
        for z in lines:
            params = {"id": "e923fa0e-d96c-43f9-ae6e-60518c9f3238",
                      "busstopId": str(x),
                      "busstopNr": str(y),
                      "line": str(z),
                      "apikey": "0cecd684-f566-4629-9dc2-c4dd5f49748f"}
            with requests.get("https://api.um.warszawa.pl/api/action/dbtimetable_get/", params=params) as response:
                res = response.json()
            df = []
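            # Pivot each key/value record into a one-row DataFrame and stack the rows.
            for idx, j in enumerate(res['result']):
                df_append = pd.DataFrame(j['values'])
                df_append = df_append.set_index('key')['value'].rename_axis(None)
                df_append = pd.DataFrame(df_append).T
                # Compare positions, not contents: a later record identical to the first must not reset df.
                if idx == 0:
                    df = df_append
                else:
                    df = pd.concat([df, df_append])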
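            if isinstance(df, list):
                # No departures for this line at this stop: skip it, but keep processing the remaining lines.
                continue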
            df = df.reset_index()[['brygada', 'czas']]
            # Look up the stop's metadata once instead of filtering df_loc for every column.
            stop_row = df_loc[(df_loc['slupek'] == str(y)) & (df_loc['zespol'] == str(x))].iloc[0]
            df['stop_name'] = stop_row['nazwa_zespolu']
            df['stop_id'] = x
            df['bus_stop_nr'] = y
            df['line'] = z
            df['Lat'] = stop_row['szer_geo']
            df['Lon'] = stop_row['dlug_geo']
            if type(timetable) == list:
                timetable = df
            else:
                timetable = pd.concat([timetable, df])

timetable.to_csv('timetable.csv')
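
For the next phase, the saved file has to be read back. Below is a minimal sketch of one way to do that, using an in-memory stand-in for timetable.csv with made-up rows. It assumes 'czas' holds HH:MM:SS strings and that hours of 24 or more may appear for departures after midnight; that is an assumption about the feed, not something this notebook verifies. The helper name `czas_to_seconds` is also hypothetical.

```python
import io

import pandas as pd

# Stand-in for timetable.csv; the rows are made up, the columns follow the notebook.
csv_text = """,brygada,czas,stop_name,stop_id,bus_stop_nr,line,Lat,Lon
0,1,05:12:00,Example stop,1001,01,123,52.25,21.04
1,3,24:07:00,Example stop,1001,01,N21,52.25,21.04
"""

# dtype=str keeps leading zeros in identifiers such as bus_stop_nr ("01").
loaded = pd.read_csv(io.StringIO(csv_text), index_col=0, dtype=str)


def czas_to_seconds(czas):
    """Convert HH:MM:SS strings to seconds since the start of the service day."""
    parts = czas.str.split(":", expand=True).astype(int)
    return parts[0] * 3600 + parts[1] * 60 + parts[2]


loaded["seconds"] = czas_to_seconds(loaded["czas"])
# Coordinates are stored as text in the CSV; convert them for numeric use.
loaded[["Lat", "Lon"]] = loaded[["Lat", "Lon"]].astype(float)
print(loaded[["line", "czas", "seconds"]])
```

Parsing the time by hand rather than as a clock time keeps the sketch valid even if an hour value exceeds 23.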