# Visualising Geographic Data

Latitude and longitude values describe points on a sphere. To plot them on a two-dimensional plane, we need to project them onto a map, that is, convert each latitude and longitude pair into Cartesian coordinates.

From http://dataquest.io: "When projecting down to the two-dimensional plane, some properties are distorted. Each map projection makes trade-offs in what properties to preserve and you can read about the different trade-offs here. We'll use the Mercator projection, because it is commonly used by popular mapping software."
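
To make the conversion to Cartesian coordinates concrete, here is a minimal sketch of the textbook spherical Mercator formula. The function name `mercator_xy`, the constant `EARTH_RADIUS_M` and the sample coordinates are assumptions made for illustration. A mapping library's own output may differ in radius and origin, so treat this as an illustration of the idea rather than a description of any library's internals.

```python
import math

# Assumed sphere radius in metres (the WGS84 equatorial radius); illustrative only.
EARTH_RADIUS_M = 6378137.0

def mercator_xy(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to spherical Mercator x, y in metres."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    # y grows without bound as latitude approaches +/-90 degrees
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y

# Approximate coordinates, for illustration only
x, y = mercator_xy(7.14, 50.87)
print(x, y)
```

Because `y` diverges towards the poles, Mercator maps are usually cut off at some latitude short of 90 degrees, which is why the maps in this notebook stop at 80 degrees north and south.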

In [None]:
# Warnings can be ignored here...
import warnings
warnings.filterwarnings('ignore')

The datasets come from the OpenFlights website, https://openflights.org/data.html, where they are published as `.dat` files. The copies used here are stored as CSV files.

Overview of most important columns in the following files:
* airlines.csv - data on each airline.

    * country - where the airline is headquartered.
    * active - if the airline is still active.
* airports.csv - data on each airport.

    * name - name of the airport.
    * city - city where the airport is located.
    * country - country where the airport is located.
    * code - unique airport code.
    * latitude - latitude value.
    * longitude - longitude value.
* routes.csv - data on each flight route.

    * airline - airline for the route.
    * source - code of the departure airport for the route.
    * dest - code of the destination airport for the route.


In [None]:
import pandas as pd
airlines = pd.read_csv('data/airlines.csv')
airports = pd.read_csv('data/airports.csv')
routes = pd.read_csv('data/routes.csv')
print(airlines.head(3))
print(airports.head(3))
print(routes.head(3))

## Basemap 

Basemap is an extension to Matplotlib that makes it easier to work with geographic data.

Installing Basemap with Anaconda (a recent version avoids the deprecation warnings about Matplotlib's `ishold()` and `hold` methods):

`conda install -c conda-forge basemap`

To create a new instance of the Basemap class, we call the `Basemap` constructor with the following parameters:

* projection: the map projection.
* llcrnrlat: **lat**itude of the **l**ower **l**eft hand **c**o**rn**e**r** of the desired map domain.
* urcrnrlat: **lat**itude of the **u**pper **r**ight hand **c**o**rn**e**r** of the desired map domain.
* llcrnrlon: **lon**gitude of the **l**ower **l**eft hand **c**o**rn**e**r** of the desired map domain.
* urcrnrlon: **lon**gitude of the **u**pper **r**ight hand **c**o**rn**e**r** of the desired map domain.

As an example, let's visualise the locations of airports from our data set:

In [None]:
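import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

# Mercator map of almost the whole world; latitude is cut off at +/-80 degrees
m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
longitudes = airports['longitude'].tolist()
latitudes = airports['latitude'].tolist()
# Calling the Basemap instance converts longitude/latitude to map x/y coordinates
x, y = m(longitudes, latitudes)
m.drawcoastlines()
# One small marker (s=1) per airport
m.scatter(x, y, s=1)
plt.show()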

The file `geo_routes.csv` contains the latitude and longitude values of the source and destination airports for each route. From http://dataquest.io: "To plot great circles, we need the source longitude, source latitude, destination longitude, and the destination latitude for each route. While the routes dataframe contains the source and destination airports for each route, the latitude and longitude values for each airport are in a separate dataframe (airports)."
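
How `geo_routes.csv` was actually produced is not stated here. As a rough sketch, a table with the same kind of columns could be built by joining the routes to the airports twice, once on the source code and once on the destination code. The frames `airports_demo` and `routes_demo` below are tiny made-up stand-ins with approximate coordinates, and the airline code `XX` is a placeholder.

```python
import pandas as pd

# Made-up miniature versions of airports.csv and routes.csv (illustrative values only)
airports_demo = pd.DataFrame({
    "code": ["CGN", "JFK"],
    "latitude": [50.87, 40.64],
    "longitude": [7.14, -73.78],
})
routes_demo = pd.DataFrame({"airline": ["XX"], "source": ["CGN"], "dest": ["JFK"]})

coords = airports_demo[["code", "latitude", "longitude"]]

# One renamed copy of the coordinates per end of the route
start = coords.rename(columns={"code": "source", "latitude": "start_lat", "longitude": "start_lon"})
end = coords.rename(columns={"code": "dest", "latitude": "end_lat", "longitude": "end_lon"})

# Inner joins drop routes whose airport code has no match in the airports table
geo_sketch = routes_demo.merge(start, on="source", how="inner").merge(end, on="dest", how="inner")
print(geo_sketch)
```

Using inner joins is a design choice for this sketch: a route with an unknown airport cannot be plotted anyway, so it is simply left out.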

In [None]:

geo_routes = pd.read_csv('data/geo_routes.csv')
print(geo_routes.info())
print(geo_routes.head(5))


Basemap's `drawgreatcircle()` method cannot draw a great circle properly when the path leaves the map boundaries and re-enters on the other side, so such routes need to be filtered out before plotting.
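
One simple way to spot the problematic routes, sketched here under the assumption that longitudes are given in the range -180 to 180 degrees, is to check whether the shorter way around the globe passes the 180-degree meridian, which is the left and right edge of the world map used in this notebook. The helper name `crosses_antimeridian` is hypothetical and the coordinates are approximate.

```python
def crosses_antimeridian(start_lon, end_lon):
    """Return True if the shorter arc between two longitudes crosses the +/-180 degree meridian.

    Assumes both longitudes are in degrees within [-180, 180].
    """
    return abs(end_lon - start_lon) > 180

# Approximate longitudes, for illustration only
print(crosses_antimeridian(7.14, -73.78))     # Cologne Bonn -> New York: False
print(crosses_antimeridian(140.39, -118.41))  # Tokyo -> Los Angeles: True
```

A route for which this check returns `True` would run off one side of the map and come back in on the other, which is the case the drawing method cannot handle.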

Let us visualise the flights from Cologne Bonn Airport (Konrad Adenauer Airport, 'CGN'):

In [None]:
def create_great_circles(df):
    # World map in Mercator projection, drawn on the current axes
    m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
    m.drawcoastlines()
    for index, row in df.iterrows():
        start_lat = row['start_lat']
        end_lat = row['end_lat']
        start_lon = row['start_lon']
        end_lon = row['end_lon']
        # Skip routes that would wrap around the edge of the map
        if abs(end_lat - start_lat) < 180 and abs(end_lon - start_lon) < 180:
            m.drawgreatcircle(start_lon, start_lat, end_lon, end_lat)

# Routes departing from CGN
cgn_routes = geo_routes[geo_routes['source'] == 'CGN']
fig, ax = plt.subplots(figsize=(15, 20))

create_great_circles(cgn_routes)

plt.show()

## Useful links to plotting tools

* Creating 3D plots using Plotly: https://plot.ly/python/3d-scatter-plots/
* Creating interactive visualizations using bokeh: http://bokeh.pydata.org/en/latest/
* Creating interactive map visualizations using folium: https://folium.readthedocs.io/en/latest/

The art and science of data visualization (books by Edward Tufte):
* The Visual Display of Quantitative Information
* Visual Explanations: Images and Quantities, Evidence and Narrative