# Analysis of travel times in Europe

## Table of Contents
* [Introduction](#Introduction)
    * [Why is this topic important?](#Why-is-this-topic-important)
    * [Questions](#Questions)
* [Data gathering](#Data)
    * [Data sources](#Data-Sources)
    * [Car travel data](#Car-travel-data)
    * [Train travel data](#Train-travel-data)
        * [Duration estimation](#Train-Duration-Estimation)
    * [Air travel data](#Air-travel-data)
        * [Duration estimation](#Air-Duration-Estimation)
* [Analysis](#chapter3)
    * [Section 3.1](#section_3_1)
        * [Sub Section 3.1.1](#sub_section_3_1_1)
        * [Sub Section 3.1.2](#sub_section_3_1_2)
    * [Section 3.2](#section_3_2)
        * [Sub Section 3.2.1](#sub_section_3_2_1)
## Introduction
This notebook analyzes travel data within Europe between cities that have, or are close to, major airports.
### Why is this topic important
Analyzing travel times in Europe holds significance in optimizing the overall travel experience, ensuring efficiency, and promoting accessibility to diverse destinations. This analysis informs strategic infrastructure development, enabling authorities to address bottlenecks and prioritize key projects for economic growth. Efficient travel times contribute to sustainability, reducing fuel consumption and emissions, while also influencing urban planning for more livable cities. By streamlining transportation, Europe can foster economic development, attract investments, and enhance the interconnectedness of its regions. In essence, the examination of travel times is pivotal for creating a well-connected, environmentally conscious, and economically vibrant European transportation landscape.

This report targets:
- Urban planners, who can use this data to help create well-connected cities with better accessibility
- Environmental advocates, who can use this data to promote more efficient travel options
- Travelers, who need to find the fastest way between two cities
### Questions
Based on why we think the topic is important, we decided to answer the following questions:
- How much faster is it to travel by plane than by car or train?
- Are there routes on which rail leads to shorter journey times than road and air travel? 
- On which routes is the travel duration above average?
- Which is the most well-connected city in Europe in terms of minimising travel times to other cities? 

## Data
Since we were unable to find datasets containing all the data we needed, we decided to build our own. This notebook does not contain all of the code needed to gather and process the data, as the process takes multiple hours and requires 2 different API keys. The code that generates the data, together with additional instructions, is freely available on [GitHub](https://github.com/custibor99/DOPP_GROUP25/blob/main/docs/data_docs.md). However, this notebook does contain descriptions of the data sources and a simplified overview of some processing methods.
### Data Sources
The following data sources were used in the project:
- [List of largest European airports](https://airmundo.com/en/blog/airport-codes-european-airports/) according to the AirMundo travel agency.
- [Python's countryinfo package](https://github.com/porimol/countryinfo), used for getting the neighboring countries of European countries.
- [Chronotrains API](https://www.chronotrains.com/en), used for obtaining train travel times between cities.
- [Google Maps Geocode API](https://developers.google.com/maps/documentation/geocoding/overview), used for obtaining the latitude and longitude coordinates of cities.
- [Google Maps Distance Matrix API](https://developers.google.com/maps/documentation/distance-matrix/overview), used for obtaining driving distances and durations between cities.
- Airlabs [Airports](https://airlabs.co/docs/airports) and [Schedules](https://airlabs.co/docs/schedules) APIs, used for getting airport coordinates and snapshots of flight schedules.

### Car travel data
Car travel data was obtained using Google's [Google Maps Geocode API](https://developers.google.com/maps/documentation/geocoding/overview) and [Google Maps Distance Matrix API](https://developers.google.com/maps/documentation/distance-matrix/overview). Below is sample code that shows how a geocoding request can be made in Python.
```python
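import urllib.parse

import requests

city = "LJUBLJANA"
key = "KEY"  # replace with your own Google Maps API key
url = f"https://maps.googleapis.com/maps/api/geocode/json?address={city}&key={key}"
# Percent-encode unsafe characters while keeping the URL delimiters intact
url_parsed = urllib.parse.quote(url, safe="://=?&")
response = requests.get(url_parsed)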
```
The response is a JSON object containing, among other fields, the coordinates of the city.
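As a hedged sketch of how such responses might be processed: the Geocoding API reports coordinates under `results[0].geometry.location`, and the Distance Matrix API reports `distance.value` (metres) and `duration.value` (seconds) for each origin/destination pair under `rows[i].elements[j]`. The sample payloads below are hand-written, heavily abbreviated stand-ins with illustrative values, not real API output, and the helper names `extract_coordinates` and `extract_drive` are hypothetical.

```python
from typing import Optional, Tuple


def extract_coordinates(geocode_json: dict) -> Optional[Tuple[float, float]]:
    """Return (lat, lng) of the first geocoding result, or None if there is none."""
    if geocode_json.get("status") != "OK" or not geocode_json.get("results"):
        return None
    location = geocode_json["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def extract_drive(matrix_json: dict) -> Optional[Tuple[float, float]]:
    """Return (distance in km, duration in hours) for the first origin/destination pair."""
    element = matrix_json["rows"][0]["elements"][0]
    if element.get("status") != "OK":
        return None
    return element["distance"]["value"] / 1000, element["duration"]["value"] / 3600


# Hand-written sample payloads (illustrative values only, heavily abbreviated).
sample_geocode = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 46.05, "lng": 14.51}}}],
}
sample_matrix = {
    "status": "OK",
    "rows": [{"elements": [{
        "status": "OK",
        "distance": {"value": 380000},  # metres
        "duration": {"value": 14400},   # seconds
    }]}],
}

coords = extract_coordinates(sample_geocode)
drive = extract_drive(sample_matrix)
print(coords, drive)
```

Returning `None` for a missing or failed result lets the caller decide whether to skip the city pair or retry, rather than failing partway through a long data-gathering run.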
### Train travel data 
The train travel data was obtained using the [Chronotrains API](https://www.chronotrains.com/en). The website uses estimates from Deutsche Bahn. There are 2 available endpoints that can be used without an API key. Their usage is shown below.


```python
import requests
import urllib.parse

url = "https://www.chronotrains.com/api/search/Ljubljana"
url_parsed = urllib.parse.quote(url, safe="://")
response = requests.get(url_parsed)
response.json()[0]
```

```
{'name': 'Ljubljana',
 'aliases': [3196359],
 'i18nNames': {'cs': 'Lublaň',
  'da': 'Ljubljana',
  'de': 'Laibach',
  'en': 'Ljubljana',
  'es': 'Liubliana',
  'fr': 'Ljubljana',
  'it': 'Lubiana',
  'nb': 'Ljubljana',
  'nl': 'Ljubljana',
  'pl': 'Lublana',
  'ro': 'Ljubljana',
  'sv': 'Ljubljana'},
 'countryCode': 'SI'}
```

The `aliases` field holds the ID of the train station, which can be used to look up travel times between stations.
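A minimal sketch of picking the station ID out of a search response, assuming the response is a list of hits shaped like the one shown above. The helper name `first_station_id` is hypothetical, and the sample list is an abbreviated copy of that output.

```python
from typing import Optional


def first_station_id(search_results: list) -> Optional[int]:
    """Return the first station ID found among the search hits, or None."""
    for hit in search_results:
        aliases = hit.get("aliases") or []
        if aliases:
            return aliases[0]
    return None


# Abbreviated copy of the search result shown above.
search_results = [{"name": "Ljubljana", "aliases": [3196359], "countryCode": "SI"}]
station_id = first_station_id(search_results)
print(station_id)
```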

```python
url = "https://www.chronotrains.com/api/trip/3196359/724443"
url_parsed = urllib.parse.quote(url, safe="://")
response = requests.get(url_parsed)
response.json()["suggestion"]
```

```
{'originId': 3196359,
 'destinationId': 724443,
 'duration': 681,
 'changes': 2,
 'estimatedPrice': None,
 'estimatedPriceCurrency': None,
 'journey': [3196359, 7900279, 5500008, 724443],
 'geom': 'gjaxGa`pwAkPi}C~fAdwAhvJwqJoq@g}C~{@p\\dqG_wRycC_wGxeCctFg|@syVpb@oyHtjE}lEzKe`Fw~Hg`J{{AadYguAyAkmD{|Nit@fwCsqClO{h@tdKebE`wDwrCskGipDboCor@crFgcJ_DejEghFvyBop_@{fHrXwM{yGobD}zCmdBgrK}sIkVs~FyjSrCyi_@ynD_|HnwD_zt@mvCm}@g_MjwDw}A}kIq|Ef~EihO}cGogChsF{tIjxBc}OwxAaqAscB{{Aet[??e`Dsts@}{DoxV||DcpYctFgkFwWccFqeG__IchHetEcv@qlLusG}qTgyF}Cs`Hr|Cq_I{uBleEyqN|tB{c[bfF}~Mn@ciEmgN_r[rbCol@oe@gfJp|BgqKawAgsNrgC_`Csq@upL{{GinWlU_fEw`EupEprCydJsYsiaAgqJ_{WkhPa|SgvHqkTodFyzY_`IdkB??s_Dkb[oaBnp@{nAcmJajFqtJm`GxPlTuwVo`EkeCitCokKtsCgkXa`EeeKa|B}x@x_AsoImlC}yh@}vE}mUquFapnAg_\\{f`Ae_DamCydKszAw{Qp{FdjAecOw|T{aIq~Xyzb@qz_@}uRokQcnPi~V~bI'}
```
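The following sketch shows one way the `duration` value might be read from such a response. Two points are assumptions, not facts from the API documentation: that a failed lookup has a missing or empty `suggestion`, and that `duration` is expressed in minutes. The helper names are hypothetical.

```python
from typing import Optional


def trip_duration(trip_json: dict) -> Optional[int]:
    """Return the suggested trip duration, or None when no suggestion is present."""
    suggestion = trip_json.get("suggestion")
    if not suggestion:
        return None
    return suggestion.get("duration")


def format_minutes(minutes: int) -> str:
    """Format a duration as 'Hh MMm', ASSUMING the value is in minutes."""
    hours, rest = divmod(minutes, 60)
    return f"{hours}h {rest:02d}m"


# Abbreviated copy of the response shown above.
trip_json = {"suggestion": {"originId": 3196359, "destinationId": 724443,
                            "duration": 681, "changes": 2}}
duration = trip_duration(trip_json)
print(format_minutes(duration))
```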

#### Train Duration Estimation 
The provided estimates were mostly reasonable for journeys within a country and between neighboring countries, but the API returned no results for long routes crossing multiple countries. We therefore decided to estimate these train travel durations ourselves. The idea of the algorithm is to collect the travel times between cities in the same country and in neighboring countries and connect them in a graph. A shortest-path algorithm can then estimate the duration of any route, as long as a connection exists in the graph. The pseudocode for the graph creation is provided below.
```
Create an empty graph G
Add all cities to G as nodes
For each country:
    For each city in the country:
        get the travel durations to all other cities in the country
        add the travel durations to G as weighted edges

        get the travel durations to all cities in neighboring countries
        add the travel durations to G as weighted edges
```
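The graph-based estimate can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's actual implementation: the durations are made-up values in minutes, edges are treated as undirected, transfer waiting times are ignored, and Dijkstra's algorithm stands in for the shortest-path step, since the text does not say which algorithm was used.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical direct durations in minutes between pairs of cities
# (illustrative values only, not taken from the API).
direct_durations: List[Tuple[str, str, int]] = [
    ("Ljubljana", "Maribor", 120),
    ("Maribor", "Graz", 60),
    ("Graz", "Vienna", 155),
    ("Ljubljana", "Vienna", 370),
]


def build_graph(edges: List[Tuple[str, str, int]]) -> Dict[str, Dict[str, int]]:
    """Build an undirected weighted graph, keeping the shortest duration per pair."""
    graph: Dict[str, Dict[str, int]] = defaultdict(dict)
    for a, b, minutes in edges:
        best = min(minutes, graph[a].get(b, minutes))
        graph[a][b] = best
        graph[b][a] = best
    return graph


def shortest_duration(graph: Dict[str, Dict[str, int]],
                      source: str, target: str) -> Optional[int]:
    """Dijkstra's algorithm; returns None if the target is unreachable."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, city = heapq.heappop(queue)
        if city == target:
            return dist
        if dist > best.get(city, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbour, minutes in graph.get(city, {}).items():
            candidate = dist + minutes
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None


graph = build_graph(direct_durations)
estimate = shortest_duration(graph, "Ljubljana", "Vienna")
print(estimate)
```

With these sample values the route via Maribor and Graz (120 + 60 + 155 minutes) is shorter than the direct edge, which shows how chaining short connections can yield an estimate even when a direct query is unavailable or slower.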

### Air travel data
#### Air Duration Estimation