# Train experiments

A look at what kind of data is available from the open APIs of `rata.digitraffic.fi` ([the site's own documentation](https://www.digitraffic.fi/rautatieliikenne)).

In [1]:
import sqlite3
from pathlib import Path
import json

import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
def print_dict(d):
    for key, value in d.items():
        if isinstance(value, (str, bool, int, float)):
            print(f"{key}: {value} (type {type(value)})")
        else:
            print(f"{key}: {len(value)} values (type {type(value)})")

In [3]:
# stations (and maybe track sections?) can be saved if desired
db_path = "data/db.db"

def save_df_to_db(df, table_name):
    with sqlite3.connect(db_path) as conn:
        try:
            df.to_sql(name=table_name, con=conn, if_exists="fail", index=False)
            conn.commit()
        except ValueError:
            print(f"Table {table_name} already exists")
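
As an alternative to catching the `ValueError`, the existence of a table can be checked before writing. The following is a minimal sketch using only the standard library. The helper name `table_exists` is hypothetical and not part of the notebook, and an in-memory database is used purely for illustration.

```python
import sqlite3


def table_exists(conn, table_name):
    """Return True if a table with the given name exists in the SQLite database."""
    cur = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    )
    return cur.fetchone() is not None


# In-memory database used only for illustration.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE stations (stationShortCode TEXT)")

print(table_exists(demo_conn, "stations"))
print(table_exists(demo_conn, "missing"))
```

A check like this could be called inside `save_df_to_db` before `df.to_sql`, if an explicit branch is preferred over exception handling.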

## /trains

In [4]:
url_start = "https://rata.digitraffic.fi/api/v1/trains/"

date = "2023-03-14"
train_num = 1

url = f"{url_start}{date}/{train_num}"

In [5]:
req = requests.get(url)
req.status_code

200

In [6]:
data = req.json()[0]

In [7]:
len(data)
type(data)

dict

In [8]:
print_dict(data)

trainNumber: 1 (type <class 'int'>)
departureDate: 2023-03-14 (type <class 'str'>)
operatorUICCode: 10 (type <class 'int'>)
operatorShortCode: vr (type <class 'str'>)
trainType: IC (type <class 'str'>)
trainCategory: Long-distance (type <class 'str'>)
commuterLineID:  (type <class 'str'>)
runningCurrently: False (type <class 'bool'>)
cancelled: False (type <class 'bool'>)
version: 285114991349 (type <class 'int'>)
timetableType: REGULAR (type <class 'str'>)
timetableAcceptanceDate: 2022-11-03T06:29:25.000Z (type <class 'str'>)
timeTableRows: 134 values (type <class 'list'>)


In [9]:
timetable = data["timeTableRows"]

In [10]:
timetable[0]

{'stationShortCode': 'HKI',
 'stationUICCode': 1,
 'countryCode': 'FI',
 'type': 'DEPARTURE',
 'trainStopping': True,
 'commercialStop': True,
 'commercialTrack': '6',
 'cancelled': False,
 'scheduledTime': '2023-03-14T04:57:00.000Z',
 'actualTime': '2023-03-14T04:57:49.000Z',
 'differenceInMinutes': 1,
 'causes': [],
 'trainNumber': 1,
 'trainReady': {'source': 'KUPLA',
  'accepted': True,
  'timestamp': '2023-03-14T04:46:40.000Z'}}
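
The row above carries both `scheduledTime` and `actualTime`, so the delay can also be computed directly from the timestamps. Below is a minimal sketch using only the standard library. The helper names `parse_utc` and `delay_seconds` are hypothetical, and returning `None` when `actualTime` is missing is an assumption about how incomplete rows might look.

```python
from datetime import datetime, timezone

# Sample row copied from the output above (only the fields used here).
row = {
    "stationShortCode": "HKI",
    "type": "DEPARTURE",
    "scheduledTime": "2023-03-14T04:57:00.000Z",
    "actualTime": "2023-03-14T04:57:49.000Z",
    "differenceInMinutes": 1,
}


def parse_utc(ts):
    """Parse a timestamp such as '2023-03-14T04:57:00.000Z' as an aware UTC datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


def delay_seconds(row):
    """Return actualTime - scheduledTime in seconds, or None if actualTime is absent."""
    # Assumption: rows without a recorded actual time simply lack the key.
    if "actualTime" not in row:
        return None
    delta = parse_utc(row["actualTime"]) - parse_utc(row["scheduledTime"])
    return int(delta.total_seconds())


print(delay_seconds(row))
```

For this row the timestamps differ by 49 seconds, while `differenceInMinutes` is reported as 1.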

### /trains in GTFS format

Protobuf (protocol buffers)? Not explored here.

### Old trains as zip archives

In [11]:
zip_url = "https://rata.digitraffic.fi/api/v1/trains/dumps/list.html"

### Train version history

The history is kept for 14 days.
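
Since the retention period is limited, it can be useful to check a date before requesting its history. This is a rough sketch only: the helper name `within_history_window` is hypothetical, and both the inclusive boundary and counting the window from the departure date are assumptions, not something confirmed by the API documentation.

```python
from datetime import date, timedelta

HISTORY_DAYS = 14  # retention period noted above


def within_history_window(departure_date, today):
    """Return True if departure_date is at most HISTORY_DAYS days before today."""
    # Assumption: the boundary day itself still counts as inside the window.
    return timedelta(0) <= today - departure_date <= timedelta(days=HISTORY_DAYS)


print(within_history_window(date(2023, 5, 5), date(2023, 5, 15)))
print(within_history_window(date(2023, 4, 30), date(2023, 5, 15)))
```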

In [12]:
url_start = "https://rata.digitraffic.fi/api/v1/trains/history/"

date = "2023-04-30"


url = f"{url_start}{date}/{train_num}"

In [13]:
req = requests.get(url)
req.status_code

200

In [14]:
# req.json()

## /live-trains

In [15]:
url_start = "https://rata.digitraffic.fi/api/v1/live-trains/"

station_code = "JY"

url = f"{url_start}station/{station_code}"

In [16]:
req = requests.get(url)
req.status_code

200

In [17]:
len(req.json())

18

In [18]:
# req.json()[0]

In [19]:
print_dict(req.json()[0])

trainNumber: 429 (type <class 'int'>)
departureDate: 2023-05-15 (type <class 'str'>)
operatorUICCode: 10 (type <class 'int'>)
operatorShortCode: vr (type <class 'str'>)
trainType: HDM (type <class 'str'>)
trainCategory: Long-distance (type <class 'str'>)
commuterLineID:  (type <class 'str'>)
runningCurrently: False (type <class 'bool'>)
cancelled: False (type <class 'bool'>)
version: 285579092246 (type <class 'int'>)
timetableType: REGULAR (type <class 'str'>)
timetableAcceptanceDate: 2023-02-16T07:07:26.000Z (type <class 'str'>)
timeTableRows: 30 values (type <class 'list'>)


### Route-based live-trains

In [20]:
url_start = "https://rata.digitraffic.fi/api/v1/live-trains/"

depart_station = "HKI"
arrive_station = "TPE"

url = f"{url_start}station/{depart_station}/{arrive_station}"

In [21]:
date = "2023-03-14"
url = f"{url}?departure_date={date}"
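
The two cells above build the route URL in two steps. The same thing can be wrapped into one function, sketched below with the standard library only. The function name `build_route_url` and the constant `BASE_URL` are hypothetical names introduced for illustration; the URL layout and the `departure_date` parameter are taken from the cells above.

```python
from urllib.parse import urlencode

BASE_URL = "https://rata.digitraffic.fi/api/v1/live-trains/"


def build_route_url(depart_station, arrive_station, departure_date=None):
    """Build the route-based live-trains URL, optionally with a departure_date query."""
    url = f"{BASE_URL}station/{depart_station}/{arrive_station}"
    if departure_date:
        query = urlencode({"departure_date": departure_date})
        url = f"{url}?{query}"
    return url


print(build_route_url("HKI", "TPE", "2023-03-14"))
```

Using `urlencode` keeps the query string correctly escaped if more parameters are added later.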

In [22]:
req = requests.get(url)
req.status_code

200

In [23]:
len(req.json())

41

In [24]:
print_dict(req.json()[0])

trainNumber: 35 (type <class 'int'>)
departureDate: 2023-03-14 (type <class 'str'>)
operatorUICCode: 10 (type <class 'int'>)
operatorShortCode: vr (type <class 'str'>)
trainType: S (type <class 'str'>)
trainCategory: Long-distance (type <class 'str'>)
commuterLineID:  (type <class 'str'>)
runningCurrently: False (type <class 'bool'>)
cancelled: False (type <class 'bool'>)
version: 285114942962 (type <class 'int'>)
timetableType: REGULAR (type <class 'str'>)
timetableAcceptanceDate: 2022-11-03T06:29:25.000Z (type <class 'str'>)
timeTableRows: 202 values (type <class 'list'>)


## /train-locations

In [25]:
url_start = "https://rata.digitraffic.fi/api/v1/train-locations/"

url = f"{url_start}latest/"

In [26]:
req = requests.get(url)
req.status_code

200

In [27]:
len(req.json())

123

### Location of a single train

In [28]:
url_start = "https://rata.digitraffic.fi/api/v1/train-locations/"

date = "2023-03-14"
train_num = 1

url = f"{url_start}{date}/{train_num}"

In [29]:
req = requests.get(url)
req.status_code

200

In [30]:
len(req.json())

2472

### Old locations as zip archives

In [31]:
url = "https://rata.digitraffic.fi/api/v1/train-locations/dumps/list.html"

## /train-tracking

In [32]:
url_start = "https://rata.digitraffic.fi/api/v1/train-tracking/"

In [33]:
date = "2023-03-14"
train_num = 1

url = f"{url_start}{date}/{train_num}"

In [34]:
# req = requests.get(url_start)
req = requests.get(url)
req.status_code

200

In [35]:
len(req.json())

1086

In [36]:
req.json()[0]

{'id': 2377182201,
 'version': 285114889714,
 'trainNumber': '1',
 'departureDate': '2023-03-14',
 'timestamp': '2023-03-14T09:33:33.000Z',
 'trackSection': 'B',
 'station': 'NTH',
 'nextStation': 'SUL',
 'previousStation': 'HSL',
 'type': 'RELEASE'}

In [37]:
req.json()[-1]

{'id': 2377060494,
 'version': 285112971514,
 'trainNumber': '1',
 'departureDate': '2023-03-14',
 'timestamp': '2023-03-14T04:52:00.000Z',
 'trackSection': '006a',
 'station': 'HKI',
 'nextStation': 'PSL',
 'previousStation': 'START',
 'type': 'OCCUPY'}
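
In the two outputs above the first record has a later timestamp than the last one, which suggests (from these two records only) that the response is ordered newest first. A minimal sketch for putting the events into chronological order, with trimmed copies of those two records as sample data:

```python
# Two records copied from the outputs above (first and last element, fields trimmed).
events = [
    {"timestamp": "2023-03-14T09:33:33.000Z", "station": "NTH",
     "trackSection": "B", "type": "RELEASE"},
    {"timestamp": "2023-03-14T04:52:00.000Z", "station": "HKI",
     "trackSection": "006a", "type": "OCCUPY"},
]

# Timestamps in one fixed ISO 8601 format and time zone sort correctly as plain strings.
chronological = sorted(events, key=lambda e: e["timestamp"])

for e in chronological:
    print(e["timestamp"], e["station"], e["trackSection"], e["type"])
```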

### Station tracking

In [38]:
url_start = "https://rata.digitraffic.fi/api/v1/train-tracking/"

station_code = "JY"
track_section = ""
date = "2023-03-14"

url = f"{url_start}station/{station_code}/{date}"

if track_section:
    url = f"{url_start}station/{station_code}/{date}/{track_section}"

In [39]:
req = requests.get(url)
req.status_code

200

In [40]:
len(req.json())

3044

## /compositions

In [41]:
url_start = "https://rata.digitraffic.fi/api/v1/compositions/"

date = "2023-03-14"
train_num = 1

url = f"{url_start}{date}/{train_num}"

In [42]:
req = requests.get(url)
req.status_code

200

In [43]:
len(req.json())

8

In [44]:
junakokoonpano = req.json()

In [45]:
print_dict(junakokoonpano)

trainNumber: 1 (type <class 'int'>)
departureDate: 2023-03-14 (type <class 'str'>)
operatorUICCode: 10 (type <class 'int'>)
operatorShortCode: vr (type <class 'str'>)
trainCategory: Long-distance (type <class 'str'>)
trainType: IC (type <class 'str'>)
version: 285114936734 (type <class 'int'>)
journeySections: 1 values (type <class 'list'>)


In [46]:
print_dict(junakokoonpano["journeySections"][0])

beginTimeTableRow: 5 values (type <class 'dict'>)
endTimeTableRow: 5 values (type <class 'dict'>)
locomotives: 1 values (type <class 'list'>)
wagons: 5 values (type <class 'list'>)
totalLength: 152 (type <class 'int'>)
maximumSpeed: 200 (type <class 'int'>)
attapId: 293065915 (type <class 'int'>)
saapAttapId: 293072665 (type <class 'int'>)
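
The headline figures of a journey section can be collected into a flat dict, for example for later use as a DataFrame row. This is a sketch under assumptions: `summarize_section` is a hypothetical helper, and the sample `section` below is a stand-in whose list contents are empty placeholders. Only the list lengths and the scalar values come from the output above.

```python
# Hypothetical stand-in for junakokoonpano["journeySections"][0]: the list
# contents are placeholders, only lengths and scalars match the output above.
section = {
    "locomotives": [{}],
    "wagons": [{}, {}, {}, {}, {}],
    "totalLength": 152,
    "maximumSpeed": 200,
}


def summarize_section(section):
    """Collect the headline figures of one journey section into a flat dict."""
    return {
        "locomotives": len(section["locomotives"]),
        "wagons": len(section["wagons"]),
        "totalLength": section["totalLength"],
        "maximumSpeed": section["maximumSpeed"],
    }


print(summarize_section(section))
```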


### Old compositions as zip archives

In [47]:
url = "https://rata.digitraffic.fi/api/v1/compositions/dumps/list.html"

## /routesets

Probably not of interest right now...

## Stations

In [48]:
url = "https://rata.digitraffic.fi/api/v1/metadata/stations"

In [49]:
req = requests.get(url)
req.status_code

200

In [50]:
stations = req.json()
len(stations)

554

In [51]:
def find_station(station_name):
    return list(filter(lambda d: station_name in d["stationName"], stations))

In [52]:
stations[0]

{'passengerTraffic': False,
 'type': 'STATION',
 'stationName': 'Ahonpää',
 'stationShortCode': 'AHO',
 'stationUICCode': 1343,
 'countryCode': 'FI',
 'longitude': 25.006783,
 'latitude': 64.537118}

In [53]:
station_name = "Jyväskylä"

In [54]:
find_station(station_name)

[{'passengerTraffic': True,
  'type': 'STATION',
  'stationName': 'Jyväskylä',
  'stationShortCode': 'JY',
  'stationUICCode': 240,
  'countryCode': 'FI',
  'longitude': 25.754984,
  'latitude': 62.241455}]

In [55]:
station_dict = list(filter(lambda d: station_name in d["stationName"], stations))[0]

In [56]:
station_dict

{'passengerTraffic': True,
 'type': 'STATION',
 'stationName': 'Jyväskylä',
 'stationShortCode': 'JY',
 'stationUICCode': 240,
 'countryCode': 'FI',
 'longitude': 25.754984,
 'latitude': 62.241455}

In [57]:
find_station("Tampere")

[{'passengerTraffic': True,
  'type': 'STATION',
  'stationName': 'Tampere asema',
  'stationShortCode': 'TPE',
  'stationUICCode': 160,
  'countryCode': 'FI',
  'longitude': 23.773454,
  'latitude': 61.498236},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Tampere Järvensivu',
  'stationShortCode': 'JVS',
  'stationUICCode': 1272,
  'countryCode': 'FI',
  'longitude': 23.785189,
  'latitude': 61.491461},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Tampere tavara',
  'stationShortCode': 'TPET',
  'stationUICCode': 1273,
  'countryCode': 'FI',
  'longitude': 23.763974,
  'latitude': 61.469888},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Tampere Viinikka',
  'stationShortCode': 'VKA',
  'stationUICCode': 1274,
  'countryCode': 'FI',
  'longitude': 23.773542,
  'latitude': 61.480331}]

In [58]:
find_station("Oulu")

[{'passengerTraffic': True,
  'type': 'STATION',
  'stationName': 'Oulu asema',
  'stationShortCode': 'OL',
  'stationUICCode': 370,
  'countryCode': 'FI',
  'longitude': 25.486121,
  'latitude': 65.012409},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Oulu Nokela',
  'stationShortCode': 'NOK',
  'stationUICCode': 1195,
  'countryCode': 'FI',
  'longitude': 25.475726,
  'latitude': 64.989283},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Oulu Oritkari',
  'stationShortCode': 'ORI',
  'stationUICCode': 1196,
  'countryCode': 'FI',
  'longitude': 25.452448,
  'latitude': 64.98925},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Oulu tavara',
  'stationShortCode': 'OLT',
  'stationUICCode': 1197,
  'countryCode': 'FI',
  'longitude': 25.470291,
  'latitude': 65.000913},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Oulu Tuira',
  'stationShortCode': 'TUA',
  'stationUICCode': 339,
  'countryCode': 'FI

In [59]:
find_station("Kemi")

[{'passengerTraffic': True,
  'type': 'STATION',
  'stationName': 'Kemi asema',
  'stationShortCode': 'KEM',
  'stationUICCode': 347,
  'countryCode': 'FI',
  'longitude': 24.574339,
  'latitude': 65.736749},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Kemi Lautiosaari',
  'stationShortCode': 'LI',
  'stationUICCode': 829,
  'countryCode': 'FI',
  'longitude': 24.565183,
  'latitude': 65.779047},
 {'passengerTraffic': False,
  'type': 'STATION',
  'stationName': 'Kemi Sahansaari',
  'stationShortCode': 'SHS',
  'stationUICCode': 1363,
  'countryCode': 'FI',
  'longitude': 24.550275,
  'latitude': 65.757307},
 {'passengerTraffic': True,
  'type': 'STATION',
  'stationName': 'Kemijärvi',
  'stationShortCode': 'KJÄ',
  'stationUICCode': 367,
  'countryCode': 'FI',
  'longitude': 27.403715,
  'latitude': 66.724273}]
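
Because `find_station` does a substring match, searching for "Kemi" also returns Kemijärvi. Two possible refinements are sketched below: an exact lookup by `stationShortCode`, and a substring search restricted to stations with passenger traffic. The names `by_short_code` and `find_passenger_stations` are hypothetical, and the sample list contains trimmed copies of records shown in the outputs above.

```python
# A few records copied from the outputs above (fields trimmed for brevity).
sample_stations = [
    {"stationName": "Kemi asema", "stationShortCode": "KEM", "passengerTraffic": True},
    {"stationName": "Kemi Lautiosaari", "stationShortCode": "LI", "passengerTraffic": False},
    {"stationName": "Kemijärvi", "stationShortCode": "KJÄ", "passengerTraffic": True},
    {"stationName": "Jyväskylä", "stationShortCode": "JY", "passengerTraffic": True},
]

# Index by short code for exact lookups.
by_short_code = {s["stationShortCode"]: s for s in sample_stations}


def find_passenger_stations(name_part, stations):
    """Substring search like find_station, limited to stations with passenger traffic."""
    return [
        s for s in stations
        if name_part in s["stationName"] and s["passengerTraffic"]
    ]


print(by_short_code["JY"]["stationName"])
print([s["stationName"] for s in find_passenger_stations("Kemi", sample_stations)])
```

The passenger-traffic filter narrows the result but does not remove the Kemijärvi match, so the short-code lookup is the safer option when one specific station is wanted.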

In [60]:
station_df = pd.DataFrame(stations)
station_df.head()

|   | passengerTraffic | type | stationName | stationShortCode | stationUICCode | countryCode | longitude | latitude |
|---|---|---|---|---|---|---|---|---|
| 0 | False | STATION | Ahonpää | AHO | 1343 | FI | 25.006783 | 64.537118 |
| 1 | False | STATION | Ahvenus | AHV | 1000 | FI | 22.498185 | 61.291923 |
| 2 | True | STOPPING_POINT | Ainola | AIN | 628 | FI | 25.101494 | 60.456863 |
| 3 | False | STATION | Airaksela | ARL | 869 | FI | 27.4295 | 62.724396 |
| 4 | False | STATION | Aittaluoto | ATL | 676 | FI | 21.84537 | 61.476933 |


In [61]:
# is there any need to save the stations?
# save_df_to_db(station_df, "juna-asemat")

## Track sections

In [62]:
url = "https://rata.digitraffic.fi/api/v1/metadata/track-sections"

In [63]:
req = requests.get(url)
req.status_code

200

In [64]:
raideosuudet = req.json()

In [65]:
len(raideosuudet)

13138

In [66]:
raideosuudet[0]

{'id': 323476,
 'station': 'TPE',
 'trackSectionCode': 'TPE_230',
 'ranges': [{'id': 317350,
   'startLocation': {'track': '003', 'kilometres': 185, 'metres': 175},
   'endLocation': {'track': '003', 'kilometres': 185, 'metres': 695}}]}

In [67]:
raideosuudet[-1]

{'id': 348907,
 'station': 'KOK',
 'trackSectionCode': 'V567CD',
 'ranges': [{'id': 343454,
   'startLocation': {'track': '008', 'kilometres': 551, 'metres': 117},
   'endLocation': {'track': '008', 'kilometres': 551, 'metres': 117}}]}

In [68]:
useita_raiteita = [raide for raide in raideosuudet if len(raide["ranges"]) > 1]
len(useita_raiteita)

586

In [69]:
len([raide for raide in raideosuudet if len(raide["ranges"]) > 2])

174

In [70]:
len([raide for raide in raideosuudet if len(raide["ranges"]) > 3])

72

In [71]:
len([raide for raide in raideosuudet if len(raide["ranges"]) > 5])

72
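
The repeated threshold checks above can be replaced by a single tally of how many track sections have each number of ranges. A minimal sketch with `collections.Counter` follows. The `sample_sections` list is a hypothetical miniature of the response in which the range dicts are left empty, since only their count matters here.

```python
from collections import Counter

# Hypothetical miniature of the track-sections response: only the number of
# entries in "ranges" matters here, so the range dicts are left empty.
sample_sections = [
    {"trackSectionCode": "A", "ranges": [{}]},
    {"trackSectionCode": "B", "ranges": [{}]},
    {"trackSectionCode": "C", "ranges": [{}, {}]},
    {"trackSectionCode": "D", "ranges": [{}, {}, {}, {}, {}, {}]},
]

# Map "number of ranges" -> "number of track sections with that many ranges".
range_counts = Counter(len(section["ranges"]) for section in sample_sections)

for n_ranges, n_sections in sorted(range_counts.items()):
    print(n_ranges, n_sections)
```

Applied to `raideosuudet`, the same expression would give the full distribution in one pass.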

In [72]:
[raide for raide in raideosuudet if len(raide["ranges"]) > 5][0]

{'id': 324871,
 'station': 'SK',
 'trackSectionCode': 'SK_853',
 'ranges': [{'id': 328654,
   'startLocation': {'track': '066', 'kilometres': 417, 'metres': 716},
   'endLocation': {'track': '441', 'kilometres': 418, 'metres': 379}},
  {'id': 328655,
   'startLocation': {'track': '003', 'kilometres': 346, 'metres': 800},
   'endLocation': {'track': '441', 'kilometres': 418, 'metres': 379}},
  {'id': 328651,
   'startLocation': {'track': '003', 'kilometres': 346, 'metres': 800},
   'endLocation': {'track': '008', 'kilometres': 418, 'metres': 379}},
  {'id': 328653,
   'startLocation': {'track': '066', 'kilometres': 417, 'metres': 716},
   'endLocation': {'track': '431', 'kilometres': 418, 'metres': 379}},
  {'id': 328652,
   'startLocation': {'track': '066', 'kilometres': 417, 'metres': 716},
   'endLocation': {'track': '008', 'kilometres': 418, 'metres': 379}},
  {'id': 328650,
   'startLocation': {'track': '003', 'kilometres': 346, 'metres': 800},
   'endLocation': {'track': '431', 'k

In [73]:
useita_raiteita[0]

{'id': 323538,
 'station': 'VAR',
 'trackSectionCode': 'VAR_V135',
 'ranges': [{'id': 317866,
   'startLocation': {'track': '024', 'kilometres': 424, 'metres': 296},
   'endLocation': {'track': '024', 'kilometres': 424, 'metres': 296}},
  {'id': 317867,
   'startLocation': {'track': 'VAR 034', 'kilometres': 424, 'metres': 390},
   'endLocation': {'track': 'VAR 034', 'kilometres': 424, 'metres': 390}}]}

In [74]:
# for track in useita_raiteita:
    # l = track["ranges"]
    # for item in track["ranges"]:
        

In [75]:
def get_track_row(main_id, station, sectioncode, range_d):
    # result = {key, value for key, value in d.items() if key != "ranges"}
    # inner_d = d["ranges"]
    result = {
        "id": main_id,
        "station": station,
        "trackSectionCode": sectioncode,
    }
    result["track_id"] = range_d["id"]
    result["start_track"] = range_d["startLocation"]["track"]
    result["start_location"] = range_d["startLocation"]["kilometres"] * 1000 + range_d["startLocation"]["metres"]
    result["end_track"] = range_d["endLocation"]["track"]
    result["end_location"] = range_d["endLocation"]["kilometres"] * 1000 + range_d["endLocation"]["metres"]
    return result

In [76]:
track_sections = []
for track in raideosuudet:
    for sub_track in track["ranges"]:
        track_sections.append(get_track_row(track["id"], track["station"], track["trackSectionCode"], sub_track))

In [77]:
track_df = pd.DataFrame(track_sections)
track_df.head()

|   | id | station | trackSectionCode | track_id | start_track | start_location | end_track | end_location |
|---|---|---|---|---|---|---|---|---|
| 0 | 323476 | TPE | TPE_230 | 317350 | 3 | 185175 | 3 | 185695 |
| 1 | 323477 | TPE | TPE_280 | 317355 | 9 | 186200 | 9 | 186540 |
| 2 | 323478 | TPE | TPE_452 | 317371 | 3 | 182455 | 3 | 183140 |
| 3 | 323479 | TPE | TPE_896 | 317394 | 3 | 182935 | 3 | 183230 |
| 4 | 323480 | TPE | TPE_897 | 317395 | 3 | 182995 | 3 | 183100 |


In [78]:
len(track_df)

14114
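
Each entry in `ranges` becomes its own row, which is why the frame has more rows than there are track sections. With start and end expressed in metres, a nominal length per range follows by subtraction. The sketch below uses hypothetical helpers `to_metres` and `range_length`. Returning `None` when the start and end are on different tracks is an assumption, on the reasoning that positions on different tracks may not be comparable.

```python
def to_metres(location):
    """Convert a {'kilometres', 'metres'} location into metres, as get_track_row does."""
    return location["kilometres"] * 1000 + location["metres"]


def range_length(range_d):
    """Return end - start in metres, or None when start and end are on different tracks."""
    start, end = range_d["startLocation"], range_d["endLocation"]
    # Assumption: positions on different tracks are not directly comparable.
    if start["track"] != end["track"]:
        return None
    return to_metres(end) - to_metres(start)


# The single range of raideosuudet[0], copied from the output above.
tpe_230 = {
    "id": 317350,
    "startLocation": {"track": "003", "kilometres": 185, "metres": 175},
    "endLocation": {"track": "003", "kilometres": 185, "metres": 695},
}

print(range_length(tpe_230))
```

For `TPE_230` this gives 185695 - 185175 = 520 metres.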

In [79]:
# is there any need to save the track sections?
# save_df_to_db(track_df, "raideosuudet")