## Intro to Geographic Data

From scientific fields like meteorology and climatology to smartphone software like Google Maps and Facebook check-ins, geographic data is part of our everyday lives. Raw geographic data such as latitude and longitude values is difficult to interpret with the charts and plots we've discussed so far. To explore this kind of data, you'll need to learn how to visualize it on maps.

__Goal:__ We'll explore the fundamentals of geographic coordinate systems and learn how to use the Basemap library to plot geographic data points on maps.

__Data:__ We'll be working with flight data from the <a href='http://openflights.org/data.html'>openflights website</a>.

Here's a breakdown of the files we'll be working with and the most pertinent columns from each dataset:
- `airlines.csv` - data on each airline.
    - `country` - where the airline is headquartered.
    - `active` - if the airline is still active.
- `airports.csv` - data on each airport.
    - `name` - name of the airport.
    - `city` - city where the airport is located.
    - `country` - country where the airport is located.
    - `code` - unique airport code.
    - `latitude` - latitude value.
    - `longitude` - longitude value.
- `routes.csv` - data on each flight route.
    - `airline` - airline for the route.
    - `source` - code of the airport where the route starts.
    - `dest` - code of the airport where the route ends.

In [1]:
# Imports
import pandas as pd

In [2]:
airlines = pd.read_csv('data/airlines.csv')
airports = pd.read_csv('data/airports.csv')
routes = pd.read_csv('data/routes.csv')

In [4]:
# Display the 1st row of Airlines DF
print(airlines.iloc[0])
# Display the 1st row of Airports DF
print(airports.iloc[0])
# Display the 1st row of Routes DF
print(routes.iloc[0])

id                       1
name        Private flight
alias                   \N
iata                     -
icao                   NaN
callsign               NaN
country                NaN
active                   Y
Name: 0, dtype: object
id                              1
name                       Goroka
city                       Goroka
country          Papua New Guinea
code                          GKA
icao                         AYGA
latitude                 -6.08169
longitude                 145.392
altitude                     5282
offset                         10
dst                             U
timezone     Pacific/Port_Moresby
Name: 0, dtype: object
airline         2B
airline_id     410
source         AER
source_id     2965
dest           KZN
dest_id       2990
codeshare      NaN
stops            0
equipment      CR2
Name: 0, dtype: object


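- The best way to link the data from the three datasets is geographically: the `source` and `dest` codes in `routes` match the `code` column in `airports`, which gives each route endpoint a latitude and longitude that we can place on a map.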
- The latitude and longitude values are floats.
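As a rough sketch of that link, the snippet below joins a route to the coordinates of its two airports with `DataFrame.merge`. The tiny dataframes are stand-ins for the real files, the coordinates are approximate values chosen for illustration, and the `source_lat`, `source_lon`, `dest_lat`, and `dest_lon` column names are assumptions, not columns in the original data.

```python
import pandas as pd

# Tiny stand-ins for the real files; coordinates are approximate, for illustration only.
airports = pd.DataFrame({
    "code": ["GKA", "AER", "KZN"],
    "latitude": [-6.08, 43.45, 55.61],
    "longitude": [145.39, 39.96, 49.28],
})
routes = pd.DataFrame({
    "airline": ["2B"],
    "source": ["AER"],
    "dest": ["KZN"],
})

coords = airports[["code", "latitude", "longitude"]]

# Look up the coordinates of each route's starting airport.
linked = routes.merge(
    coords.rename(columns={"code": "source",
                           "latitude": "source_lat",
                           "longitude": "source_lon"}),
    on="source",
    how="left",
)

# Repeat the lookup for the destination airport.
linked = linked.merge(
    coords.rename(columns={"code": "dest",
                           "latitude": "dest_lat",
                           "longitude": "dest_lon"}),
    on="dest",
    how="left",
)

print(linked)
```

A left join keeps every route even when an airport code has no match, in which case the coordinate columns for that endpoint are `NaN`.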

### Geographic Coordinate Systems

A geographic coordinate system allows us to locate any point on Earth using latitude and longitude coordinates.
<img src='_images/latitude_longitude.png' />

- We want to visualize latitude and longitude points on two-dimensional maps. Two-dimensional maps are faster to render, easier to view and distribute on a computer, and closer to what people already know from popular mapping software like Google Maps.
- Latitude and longitude values describe points on the surface of a sphere, which is a three-dimensional object. To plot the values on a two-dimensional plane, we need to convert them to Cartesian (x, y) coordinates using a __map projection__.
    - A map projection transforms points on a sphere to a two-dimensional plane. No projection can flatten a sphere without distortion, so some properties (such as area, shape, or distance) are distorted in the result.
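To make the idea of a projection concrete, here is a minimal sketch of the spherical Mercator formulas in plain Python. The `mercator` function and the `EARTH_RADIUS_M` constant are assumptions made for this illustration; a mapping library's own output will generally differ, since it may use a different origin and Earth model.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters; an assumption for this sketch


def mercator(longitude_deg, latitude_deg, radius=EARTH_RADIUS_M):
    """Project a (longitude, latitude) pair in degrees to Mercator (x, y) in meters."""
    lon = math.radians(longitude_deg)
    lat = math.radians(latitude_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# The y coordinate grows faster and faster toward the poles.
for lat in (0, 45, 80, 89):
    _, y = mercator(0, lat)
    print(f"latitude {lat:>2}: y = {y / 1000:,.0f} km")
```

Equal steps in latitude map to larger and larger steps in `y` as we move away from the equator, which is the distortion mentioned above, and `y` is undefined at the poles themselves. This is one reason Mercator maps are usually cut off before reaching ±90° latitude.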

### Installing Basemap

Basemap is an extension to Matplotlib that makes it easier to work with geographic data. It handles the conversion from spherical coordinates (latitudes and longitudes) to the Cartesian coordinates of a map projection. Here, we'll use the Mercator projection.

In [5]:
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

### Workflow With Basemap

The general workflow will look like the following when working with two-dimensional maps:
- Create a new basemap instance with the specific map projection we want to use and how much of the map we want included.
- Convert spherical coordinates to Cartesian coordinates using the basemap instance.
- Use the matplotlib and basemap methods to customize the map.
- Display the map.

Let's focus on the first step and create a new basemap instance. To create a new instance of the `Basemap` class, we call the `Basemap` constructor and pass in values for the following parameters:
- `projection`: the map projection.
- `llcrnrlat`: latitude of the lower-left corner of the desired map domain.
- `urcrnrlat`: latitude of the upper-right corner of the desired map domain.
- `llcrnrlon`: longitude of the lower-left corner of the desired map domain.
- `urcrnrlon`: longitude of the upper-right corner of the desired map domain.

In [6]:
m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
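The four corner values describe a latitude/longitude bounding box. As a rough illustration of what that box includes, the sketch below checks whether a point falls inside it. `in_map_domain` is a hypothetical helper written for this example, not part of Basemap, and it ignores domains that wrap across the ±180° meridian.

```python
def in_map_domain(latitude, longitude, llcrnrlat=-80, urcrnrlat=80,
                  llcrnrlon=-180, urcrnrlon=180):
    """Return True if the point lies inside the lower-left/upper-right box."""
    return (llcrnrlat <= latitude <= urcrnrlat
            and llcrnrlon <= longitude <= urcrnrlon)


points = {
    "Goroka (GKA)": (-6.08169, 145.392),         # from the airports sample row
    "Hypothetical polar station": (85.0, 10.0),  # made-up point for illustration
}

for name, (lat, lon) in points.items():
    print(name, in_map_domain(lat, lon))
```

With the defaults above, which mirror the values passed to the constructor, any point north of 80° or south of -80° latitude falls outside the map domain.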

### Converting From Spherical to Cartesian Coordinates

- We can pass lists of longitude and latitude values to the basemap instance, and it will return lists of the corresponding x and y coordinates under the chosen projection.
- We'll pass the values to the instance as lists, so we'll need to use `Series.tolist()` to convert the `longitude` and `latitude` columns from the `airports` dataframe.

Then, we pass them to the basemap instance, with the longitude values first and the latitude values second:
```
x, y = m(longitudes, latitudes)
```
The basemap instance returns two lists, which we assign to `x` and `y`.

In [7]:
# Longitudes go first (they map to x), then latitudes (they map to y)
x, y = m(airports['longitude'].tolist(), airports['latitude'].tolist())

### Generating A Scatter Plot

A scatter plot is the simplest way to plot points on a map, where each point is represented as an (x, y) coordinate pair. To create a scatter plot from lists of x and y coordinates, we use the `Basemap.scatter()` method on our basemap instance.
```
m.scatter(x,y)
```

In [None]:
# s=1 sets a small marker size so densely packed airports stay distinguishable
m.scatter(x, y, s=1)
plt.show()