# Sailing Traffic in Greece & Route Inefficiencies

Multiple data sources collected by the Greek administrative authorities have been made [publicly available online](https://data.gov.gr/). 

Some of them offer insights that can **support decision-making**.

This notebook attempts to detect **inefficiencies in ferry routes**.

Many places in Greece experience **overtourism**, while **other places remain hard to reach**. 

By redistributing our sailing routes we can:
1. Protect **overcrowded** places from the adverse effects of overtourism (e.g. pollution) 
2. Improve the **experience** of tourists
3. Distribute the **income** earned from tourism over a wider area

In this specific notebook, we examine the **number of arrivals** of ships 
- in **Myrina, Limnos** 
- for **June - August of 2021**

and detect other ports with **more arrivals** but **fewer passengers AND fewer vehicles** per arrival.


The analysis is easily **reproducible**, and the variables under examination can be changed.

More specifically, it can be rerun with a different:
- Port of Examination (another place in Greece)
- Dates of Examination (e.g. fetching 3 years of data and examining only the winter months)

In [28]:
#Port under examination
port_under_examination = "MYR" #for this initial approach it is MYR for Myrina

# go to https://data.gov.gr/token/
# and apply for an api token
# it's instantly delivered
# make sure you check your spam mail
my_api_key = "YOUR_API_TOKEN" #paste your own token here and avoid publishing it

#Dates for which data will be requested
start_date = "2021-06-01" # minimum start date --> 2017-01-03
end_date = "2021-08-31"

#These variables determine the months for which the investigation will take place
starting_month_under_examination = 6
ending_month_under_examination = 8

#for example, if you want to include observations from June to August, for years 2020 & 2021
#you will need:
#start_date = "2020-06-01" 
#end_date = "2021-08-31" 
#starting_month_under_examination = 6
#ending_month_under_examination = 8

#Color of the port markers on the final map
points_color_on_map = "orange"

In [29]:
#Libraries
import requests
import pandas as pd
import numpy as np
import folium
from folium import Choropleth, Circle, Marker
from folium.plugins import HeatMap, MarkerCluster

In [30]:
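#Functions that fetch public data
from io import StringIO

def fetch_sail_data(my_api_key,start_date,end_date):
    
    # Query the sailing_traffic endpoint for the requested date range
    url = 'https://data.gov.gr/api/v1/query/sailing_traffic'
    params = {'date_from': start_date, 'date_to': end_date}
    
    headers = {'Authorization':'Token {}'.format(my_api_key)}
    
    response = requests.get(url, headers=headers, params=params)
    
    # Fail early on HTTP errors (e.g. an invalid token) instead of parsing an error body
    response.raise_for_status()
    
    # Wrap the payload in StringIO, since recent pandas versions deprecate raw JSON strings here
    df = pd.read_json(StringIO(response.text), orient='records')
    
    return df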

def fetch_coordinates():
    
    # Fetches the coordinates of each port
    # from the United Nations (UN/LOCODE) listing
    # note that they are approximate, not exact
    
    coordinates_link = "https://unece.org/fileadmin/DAM/cefact/locode/gr.htm"
    coordinates_list = pd.read_html(coordinates_link)
    coordinates_df = coordinates_list[2][coordinates_list[2][1].str.startswith("GR")]
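    # Drop the leading country code by position: lstrip('GR ') strips a set of characters,
    # so it would also remove a leading G or R of the port code itself (e.g. "RHO" -> "HO")
    a = coordinates_df[1].map(lambda x: x[2:].strip())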
    coordinates_df = pd.DataFrame({"Code":a,"Coordinates":coordinates_df[9]})
    
    return coordinates_df
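
When trimming the country code from a LOCODE entry, keep in mind that `str.lstrip` removes a set of characters rather than a literal prefix. The sketch below illustrates the difference; it assumes entries look like `"GR RHO"` (the exact cell layout on the UNECE page is an assumption), and `strip_country_prefix` is a hypothetical helper, not part of the notebook.

```python
def strip_country_prefix(locode: str, country: str = "GR") -> str:
    """Return the port part of a LOCODE entry such as 'GR RHO'."""
    text = locode.strip()
    if text.startswith(country):
        # Remove the prefix by length, not by character set
        text = text[len(country):]
    return text.strip()

# Character-set stripping also eats the leading 'R' of the port code
print("GR RHO".lstrip("GR "))
# The prefix-aware version keeps the full port code
print(strip_country_prefix("GR RHO"))
```

Port codes that begin with `G` or `R` are the ones most likely to be affected, so they are worth checking whenever a merge on the port code leaves coordinates empty.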

In [31]:
#Whole dataset
df = fetch_sail_data(my_api_key,start_date,end_date)
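
Before sending a request, it can help to validate the date range and inspect the URL that will be queried. The following is a minimal sketch using only the standard library; the endpoint is the one used in this notebook, while `build_query_url` is a hypothetical helper introduced for illustration.

```python
from datetime import date
from urllib.parse import urlencode

# Endpoint taken from the notebook; the helper name is hypothetical
BASE_URL = "https://data.gov.gr/api/v1/query/sailing_traffic"

def build_query_url(start_date: str, end_date: str) -> str:
    """Validate an ISO date range and return the full query URL."""
    start = date.fromisoformat(start_date)  # raises ValueError on malformed input
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    query = urlencode({"date_from": start.isoformat(), "date_to": end.isoformat()})
    return f"{BASE_URL}?{query}"

print(build_query_url("2021-06-01", "2021-08-31"))
```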

In [32]:
#Subset that includes some months only
summer_df = df[(df.date.dt.month >= starting_month_under_examination) & (df.date.dt.month <= ending_month_under_examination)]
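
The month filter works for windows inside one calendar year (e.g. 6 to 8). A winter window such as December to February wraps around the year end, so `month >= 12` and `month <= 2` can never both be true. A minimal sketch of a wrap-aware check follows; `month_in_window` is a hypothetical helper, not part of the notebook.

```python
def month_in_window(month: int, start: int, end: int) -> bool:
    """True if `month` lies in the inclusive window start..end, wrapping over December."""
    if start <= end:
        return start <= month <= end
    # Wrapping window, e.g. start=12, end=2 covers December, January, February
    return month >= start or month <= end

summer = [m for m in range(1, 13) if month_in_window(m, 6, 8)]
winter = [m for m in range(1, 13) if month_in_window(m, 12, 2)]
print(summer)
print(winter)
```

With pandas, the same check could be applied as a boolean mask, for example `df[df.date.dt.month.map(lambda m: month_in_window(m, 12, 2))]`.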

In [33]:
#Average number of Passengers and Vehicles per arrival of Ship
mean_summer_traffic = (summer_df.groupby(["arrivalport","arrivalportname"])[["vehiclecount","passengercount"]].mean()).sort_values(by = "passengercount",ascending = False).reset_index()

#Absolute number of Arrivals
no_of_arrivals = summer_df.groupby(["arrivalport"]).count().rename(columns = {"date":"Number_of_Arrivals"})["Number_of_Arrivals"]

#Average and Absolute numbers for each port merged
mean_summer_traffic_full = pd.merge(mean_summer_traffic,no_of_arrivals, left_on = "arrivalport", right_index = True, how = "left")

#Numbers of port under examination
benchmark_data = mean_summer_traffic_full[mean_summer_traffic_full.arrivalport == port_under_examination].reset_index()

#Keep only ports of interest: at least as many arrivals as the benchmark port,
#but no more passengers and no more vehicles per arrival
less_efficient_routes = mean_summer_traffic_full[
    (mean_summer_traffic_full.Number_of_Arrivals >= benchmark_data.Number_of_Arrivals[0])
    & (mean_summer_traffic_full.passengercount <= benchmark_data.passengercount[0])
    & (mean_summer_traffic_full.vehiclecount <= benchmark_data.vehiclecount[0])
]
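
The comparison logic can be checked on a small synthetic table. The sketch below uses made-up port codes and counts (not real data) and computes the per-port means and the number of arrivals in a single `groupby` with named aggregation, which is one possible alternative to merging two separate results.

```python
import pandas as pd

# Synthetic sailings (made-up numbers, for illustration only)
sailings = pd.DataFrame({
    "arrivalport": ["AAA", "AAA", "BBB", "BBB", "BBB", "CCC", "CCC", "CCC"],
    "passengercount": [100, 80, 40, 50, 60, 200, 220, 180],
    "vehiclecount": [30, 20, 5, 10, 15, 50, 60, 40],
})

# One pass: mean load per arrival plus the number of arrivals per port
per_port = sailings.groupby("arrivalport").agg(
    passengercount=("passengercount", "mean"),
    vehiclecount=("vehiclecount", "mean"),
    Number_of_Arrivals=("passengercount", "size"),
)

# "AAA" plays the role of the port under examination
benchmark = per_port.loc["AAA"]
less_efficient = per_port[
    (per_port.Number_of_Arrivals >= benchmark.Number_of_Arrivals)
    & (per_port.passengercount <= benchmark.passengercount)
    & (per_port.vehiclecount <= benchmark.vehiclecount)
].drop(index="AAA")

print(less_efficient)
```

Here only `BBB` remains: it has more arrivals than `AAA` while carrying fewer passengers and vehicles per arrival, whereas `CCC` carries more of both.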

In [34]:
#Bring coordinates for each port 
coordinates_df = fetch_coordinates()

#Add coordinates
less_efficient_routes = pd.merge(less_efficient_routes,coordinates_df, left_on = "arrivalport", right_on = "Code", how = "left")

#Some coordinates were missing from the United Nations dataset,
#and were therefore added manually here

conditions = [
less_efficient_routes.arrivalport == "SKO",
less_efficient_routes.arrivalport == "RHO",
less_efficient_routes.arrivalport == "PTR",
less_efficient_routes.arrivalport == "ROU"
]

choices = ["3905N 02342E","3604N 02802E","3750N 02345E", "3522N 02396E"]

less_efficient_routes["Alt_Coord"] = np.select(conditions, choices, default = less_efficient_routes["Coordinates"])

#Note: this reads "DDMM" as "DD.MM", which only approximates decimal degrees
#(an exact conversion would divide the minutes by 60)
less_efficient_routes["latitude"] = pd.to_numeric(less_efficient_routes["Alt_Coord"].str.slice(0,2) + "." + less_efficient_routes["Alt_Coord"].str.slice(2,4))
less_efficient_routes["longitude"] = pd.to_numeric(less_efficient_routes["Alt_Coord"].str.slice(7,9) + "." + less_efficient_routes["Alt_Coord"].str.slice(9,11))

less_efficient_routes["color"] = points_color_on_map
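
UN/LOCODE coordinates are written as degrees and minutes (`DDMMN DDDMME`), so an exact conversion to decimal degrees divides the minutes by 60. The following is a minimal sketch of such a conversion; `locode_to_decimal` is a hypothetical helper and the sample value is only illustrative.

```python
import re
from typing import Tuple

# Latitude: 2-digit degrees + 2-digit minutes + N/S; longitude: 3-digit degrees + 2-digit minutes + E/W
_LOCODE_COORD = re.compile(r"^(\d{2})(\d{2})([NS]) (\d{3})(\d{2})([EW])$")

def locode_to_decimal(coord: str) -> Tuple[float, float]:
    """Convert e.g. '3952N 02504E' to (latitude, longitude) in decimal degrees."""
    match = _LOCODE_COORD.match(coord.strip())
    if match is None:
        raise ValueError(f"unexpected coordinate format: {coord!r}")
    lat_deg, lat_min, ns, lon_deg, lon_min, ew = match.groups()
    if int(lat_min) > 59 or int(lon_min) > 59:
        raise ValueError(f"minutes must be below 60: {coord!r}")
    lat = int(lat_deg) + int(lat_min) / 60
    lon = int(lon_deg) + int(lon_min) / 60
    # Southern and western hemispheres are negative
    return (lat if ns == "N" else -lat, lon if ew == "E" else -lon)

print(locode_to_decimal("3952N 02504E"))
```

The helper rejects entries whose minutes exceed 59, which makes it easy to spot manually entered values that were typed as decimal degrees instead.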

Number of ports with less traffic per arrival but at least as many arrivals as Myrina's port:

In [35]:
print(less_efficient_routes.shape[0] - 1) #minus 1 to exclude the port under examination itself

34


Ports with less traffic per arrival but at least as many arrivals as Myrina's port:

In [36]:
(
    less_efficient_routes[["arrivalportname","vehiclecount","passengercount","Number_of_Arrivals"]]
    .rename({"arrivalportname":"Name_of_Port","vehiclecount":"Average_Vehicles_Transported", "passengercount":"Average_Passengers_Transported"}, axis = 1)
    .assign(Period_under_Examination = start_date + " - " + end_date)
)

| | Name_of_Port | Average_Vehicles_Transported | Average_Passengers_Transported | Number_of_Arrivals | Period_under_Examination |
|---|---|---|---|---|---|
| 0 | Μύρινα | 25.735192 | 79.226481 | 287 | 2021-06-01 - 2021-08-31 |
| 1 | Σκιάθος | 13.790361 | 75.662651 | 415 | 2021-06-01 - 2021-08-31 |
| 2 | Θήρα | 6.936747 | 75.576807 | 1328 | 2021-06-01 - 2021-08-31 |
| 3 | Νάξος | 11.270062 | 69.550697 | 2081 | 2021-06-01 - 2021-08-31 |
| 4 | Μύκονος | 11.539935 | 64.267165 | 2141 | 2021-06-01 - 2021-08-31 |
| 5 | Πισαετός Ιθάκης | 20.547677 | 58.151589 | 409 | 2021-06-01 - 2021-08-31 |
| 6 | Λουτρό Χανίων | 0.198091 | 55.389021 | 419 | 2021-06-01 - 2021-08-31 |
| 7 | Σκόπελος | 9.89 | 54.643333 | 300 | 2021-06-01 - 2021-08-31 |
| 8 | Ύδρα | 0.0 | 46.806452 | 682 | 2021-06-01 - 2021-08-31 |
| 9 | Ύδρα | 0.0 | 46.806452 | 682 | 2021-06-01 - 2021-08-31 |


Located at: 

In [None]:
#Create the basis of our map
ports_map = folium.Map(location = [37,24.5], 
                 tiles='cartodbpositron', 
                #tiles = "stamenterrain",
                 zoom_start=6.49999)

for idx, row in less_efficient_routes.iterrows():
    
    #Build the HTML string that folium.Popup renders as the popup of each port
    popup_html = (
        "Name: " + row["arrivalportname"] + "<br>"
        + "Number_of_Ship_Arrivals: " + str(round(row["Number_of_Arrivals"])) + "<br>"
        + "Average Passenger Count: " + str(round(row["passengercount"],2)) + "<br>"
        + "Average Vehicle Count: " + str(round(row["vehiclecount"],2))
    )
    
    #Place an anchor-shaped marker at the port's coordinates
    Marker(location=[row['latitude'], row['longitude']],
           popup=folium.Popup(popup_html),
           icon=folium.Icon(color=row["color"], prefix="fa", icon = "anchor" )
          ).add_to(ports_map)
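
The popup text can also be assembled in a small function, which keeps the loop short and makes the formatting easy to test. The sketch below is one possible variant, not the notebook's own code: `build_popup_html` is a hypothetical helper, and escaping the port name is an added precaution so that special characters cannot break the markup.

```python
from html import escape

def build_popup_html(name: str, arrivals: int, passengers: float, vehicles: float) -> str:
    """Return the popup body as HTML, one piece of information per line."""
    lines = [
        f"Name: {escape(name)}",
        f"Number_of_Ship_Arrivals: {arrivals}",
        f"Average Passenger Count: {passengers:.2f}",
        f"Average Vehicle Count: {vehicles:.2f}",
    ]
    return "<br>".join(lines)

print(build_popup_html("Μύρινα", 287, 79.226481, 25.735192))
```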

In [40]:
ports_map

In [38]:
name_of_export_file = "Ports_Map.html"
ports_map.save(name_of_export_file)

In [39]:
import os
print("Your file:", name_of_export_file , "is located at" ,os.getcwd())

Your file: Ports_Map.html is located at C:\Users\voulk\Downloads
