# Distance, driving distance and duration between two places
Various Python implementations for computing the distance over the Earth's surface, the travelling distance by road, and the duration of such a journey.

We will use a public Kaggle dataset containing the locations of all the world capitals - https://www.kaggle.com/nikitagrec/world-capitals-gps.

In [2]:
import pandas as pd
from geopy import distance

In [3]:
# load the dataframe with capitals
df = pd.read_csv("concap.csv")

# rename so that the column names are shorter and comply with PEP-8
df.rename(columns={"CountryName": "Country", "CapitalName": "capital", "CapitalLatitude": "lat", "CapitalLongitude": "lon", "CountryCode": "code", "ContinentName": "continent"}, inplace=True)
df.head(3)

Unnamed: 0,Country,capital,lat,lon,code,continent
0,Somaliland,Hargeisa,9.55,44.05,,Africa
1,South Georgia and South Sandwich Islands,King Edward Point,-54.283333,-36.5,GS,Antarctica
2,French Southern and Antarctic Lands,Port-aux-Français,-49.35,70.216667,TF,Antarctica


The naming convention for variables is described in PEP-8: https://www.python.org/dev/peps/pep-0008/#function-and-variable-names

There is some discussion about whether it should also apply to pandas column names, but I would suggest doing so - https://stackoverflow.com/questions/58584570/pep8-guidance-for-column-names-in-pandas-dataframe
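
As a minimal sketch of what the convention means for column names, the original CamelCase headers of the dataset could be converted to snake_case mechanically. The helper `to_snake_case` is a hypothetical name introduced here for illustration; it is not part of pandas.

```python
import re


def to_snake_case(name):
    """Convert a CamelCase name such as 'CapitalLatitude' to snake_case."""
    # insert an underscore before every capital letter except the first, then lowercase
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# original column names of the capitals dataset
original_columns = ["CountryName", "CapitalName", "CapitalLatitude",
                    "CapitalLongitude", "CountryCode", "ContinentName"]
snake_columns = {col: to_snake_case(col) for col in original_columns}
print(snake_columns)
```

The resulting dictionary can be passed to `df.rename(columns=snake_columns)`. This notebook uses shorter hand-picked names instead.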

In [47]:
# to start with let's filter only 2 capitals. Rome and Paris.
cities = df[df["capital"].isin(["Rome","Paris"])].reset_index()
cities

Unnamed: 0,index,Country,capital,lat,lon,code,continent
0,81,France,Paris,48.866667,2.333333,FR,Europe
1,110,Italy,Rome,41.9,12.483333,IT,Europe


## Calculating the distance
The first obvious method is to use the shortest distance over the surface of the Earth. You can use various approximations:

* Great-circle distance on the surface of a sphere - https://en.wikipedia.org/wiki/Great-circle_distance
* Geodesic distance, since the Earth is better approximated as an oblate ellipsoid - https://en.wikipedia.org/wiki/Geodesics_on_an_ellipsoid
* Haversine formula, a way of computing the great-circle distance - https://en.wikipedia.org/wiki/Haversine_formula, https://towardsdatascience.com/calculating-distance-between-two-geolocations-in-python-26ad3afe287b
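
To make the haversine formula concrete, here is a minimal sketch using only the standard library. The function name `haversine_km` and the mean Earth radius of 6371.009 km are assumptions made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.009  # assumed mean Earth radius for this sketch


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # haversine of the central angle between the two points
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Paris and Rome, coordinates as listed in the capitals dataset
paris_rome_km = haversine_km(48.866667, 2.333333, 41.9, 12.483333)
print(round(paris_rome_km, 1))
```

With the radius assumed here, the Paris to Rome result should come out at a little over 1100 km. A spherical model like this ignores the flattening of the Earth, which is what the geodesic methods account for.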

You don't have to invent or even reproduce this math. The `geopy.distance` module already implements these distance calculations and returns the values in kilometers (km), miles (mi), nautical miles (nm) or feet (ft).
* `distance((latitude_point_1, longitude_point_1), (lat_2, lon_2))` - using geodesic on `WGS-84` ellipsoid
* `geodesic((latitude_point_1, longitude_point_1), (lat_2, lon_2))`
* `great_circle((latitude_point_1, longitude_point_1), (lat_2, lon_2))`

More info about geopy.distance https://geopy.readthedocs.io/en/stable/#module-geopy.distance

In [48]:
d = distance.distance((cities.loc[0, "lat"], cities.loc[0, "lon"]), (cities.loc[1, "lat"], cities.loc[1, "lon"]))
d, d.km, d.miles

(Distance(1107.8818760940028), 1107.8818760940028, 688.4058822066647)

In [49]:
getattr(d, "km")

1107.8818760940028

In [50]:
results = []
for f in [distance.distance, distance.great_circle, distance.geodesic]:
    for mes in ["kilometers","km","miles","mi","nautical","nm","feet","ft"]:
        d = f((cities.loc[0, "lat"], cities.loc[0, "lon"]), (cities.loc[1, "lat"], cities.loc[1, "lon"]))
        results.append({"method": f.__name__, "measurement": mes, "value": getattr(d, mes)})

# show as dataframe
results_df = pd.DataFrame(results)
results_df.pivot_table(index="method", columns="measurement", values="value")

method,feet,ft,kilometers,km,mi,miles,nautical,nm
geodesic,3634783.0,3634783.0,1107.881876,1107.881876,688.405882,688.405882,598.208356,598.208356
great_circle,3630457.0,3630457.0,1106.563205,1106.563205,687.586498,687.586498,597.496331,597.496331


`distance.distance` is an alias for `distance.geodesic`, so both report the name `geodesic` and their values collapse into one row in the results.

In [51]:
# the distance for various ellipsoids
for ellipsoid in distance.ELLIPSOIDS:
    for mes in ["kilometers","km","miles","mi","nautical","nm","feet","ft"]:
        d = distance.geodesic((cities.loc[0, "lat"], cities.loc[0, "lon"]), (cities.loc[1, "lat"], cities.loc[1, "lon"]), ellipsoid=ellipsoid)
        results.append({"method": f"geodesic: {ellipsoid}", "measurement": mes, "value": getattr(d, mes)})

# show as dataframe
results_df = pd.DataFrame(results)
results_df.pivot_table(index="method", columns="measurement", values="value")

method,feet,ft,kilometers,km,mi,miles,nautical,nm
geodesic,3634783.0,3634783.0,1107.881876,1107.881876,688.405882,688.405882,598.208356,598.208356
geodesic: Airy (1830),3634455.0,3634455.0,1107.781964,1107.781964,688.3438,688.3438,598.154408,598.154408
geodesic: Clarke (1880),3634851.0,3634851.0,1107.902624,1107.902624,688.418774,688.418774,598.219559,598.219559
geodesic: GRS-67,3634796.0,3634796.0,1107.885873,1107.885873,688.408366,688.408366,598.210515,598.210515
geodesic: GRS-80,3634783.0,3634783.0,1107.881876,1107.881876,688.405882,688.405882,598.208356,598.208356
geodesic: Intl 1924,3634927.0,3634927.0,1107.925804,1107.925804,688.433178,688.433178,598.232075,598.232075
geodesic: WGS-84,3634783.0,3634783.0,1107.881876,1107.881876,688.405882,688.405882,598.208356,598.208356
great_circle,3630457.0,3630457.0,1106.563205,1106.563205,687.586498,687.586498,597.496331,597.496331
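
To put the table above in perspective, the following sketch works out how much the choice of model matters for Paris to Rome. The kilometre values are copied by hand from the table; the variable names are made up for this illustration.

```python
# distances in km copied from the results table above
geodesic_km = {
    "Airy (1830)": 1107.781964,
    "Clarke (1880)": 1107.902624,
    "GRS-67": 1107.885873,
    "GRS-80": 1107.881876,
    "Intl 1924": 1107.925804,
    "WGS-84": 1107.881876,
}
great_circle_km = 1106.563205

# spread between the largest and smallest ellipsoid result, in metres
spread_m = (max(geodesic_km.values()) - min(geodesic_km.values())) * 1000

# gap between the spherical model and the default WGS-84 ellipsoid
gap_km = geodesic_km["WGS-84"] - great_circle_km
gap_pct = 100 * gap_km / geodesic_km["WGS-84"]

print(f"ellipsoid spread: {spread_m:.0f} m")
print(f"sphere vs WGS-84: {gap_km:.2f} km ({gap_pct:.2f} %)")
```

For this pair of cities the ellipsoids differ from one another by well under a kilometre, while the spherical great-circle model is a bit over a kilometre shorter than the WGS-84 geodesic.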


## Driving distance
Two cities can be quite close on the surface, yet natural obstacles such as a sea or mountains can make the driving distance much longer.

In [54]:
cities = df[df["capital"].isin(["Helsinki","Stockholm"])].reset_index()
d = distance.distance((cities.loc[0, "lat"], cities.loc[0, "lon"]), (cities.loc[1, "lat"], cities.loc[1, "lon"]))
d.km

397.7633096859937

Even though the distance between Helsinki, the capital of Finland, and Stockholm in Sweden is less than 400 km, if you decide to drive it is more than 1750 km and 20 hours. Even if you take ferries, you will still drive almost 500 km. Paris is only 1107 km from Rome, but the roads connecting these cities are at least 1420 km long.
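
One way to express this gap is a detour factor, the road distance divided by the geodesic distance. The sketch below uses only the figures quoted in the paragraph above; the road distances are the approximate lower bounds given there, not measured values.

```python
# figures quoted in the paragraph above: (geodesic km, approximate road km)
routes = {
    "Helsinki-Stockholm": (397.76, 1750),
    "Paris-Rome": (1107.88, 1420),
}

# detour factor = road distance / geodesic distance
detour = {name: road_km / direct_km
          for name, (direct_km, road_km) in routes.items()}

for name, factor in detour.items():
    print(f"{name}: road distance is about {factor:.1f}x the geodesic distance")
```

The Helsinki to Stockholm drive is more than four times the geodesic distance, while for Paris to Rome the factor is roughly 1.3.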


That's why for many applications you want to know the real travel distance, which no mathematical function can return. You need to call a map service API, e.g. Google routes or the OSRM route service (http://project-osrm.org/docs/v5.5.1/api/#route-service).
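
A minimal sketch of such a call against the OSRM route service, using only the standard library. It assumes the public demo server at `router.project-osrm.org`, which you would replace with your own instance for anything beyond light testing. The function names are hypothetical. Note that OSRM expects coordinates as longitude,latitude, the reverse of the order geopy uses.

```python
import json
import urllib.request

# Assumption: the public OSRM demo server; point this at your own instance for real use
OSRM_BASE_URL = "http://router.project-osrm.org"


def build_route_url(origin, destination, base_url=OSRM_BASE_URL):
    """Build a route request URL; points are (lat, lon), OSRM wants lon,lat."""
    (lat1, lon1), (lat2, lon2) = origin, destination
    return f"{base_url}/route/v1/driving/{lon1},{lat1};{lon2},{lat2}?overview=false"


def parse_route(response):
    """Return (distance in km, duration in hours) of the first route."""
    if response.get("code") != "Ok" or not response.get("routes"):
        raise ValueError(f"no route found: {response.get('code')}")
    route = response["routes"][0]
    # OSRM reports distance in metres and duration in seconds
    return route["distance"] / 1000, route["duration"] / 3600


def osrm_route(origin, destination, timeout=10):
    """Query the route service over HTTP (needs network access)."""
    url = build_route_url(origin, destination)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_route(json.load(resp))
```

Calling `osrm_route((48.866667, 2.333333), (41.9, 12.483333))` would query the driving distance and duration between Paris and Rome. The result depends on the server's map data and routing profile.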