# Lesson: Visualizing Geographic Data
- In this lesson, we'll explore the fundamentals of geographic coordinate systems and how to work with the basemap library to plot geographic data points on maps. We'll be working with flight data from the [openflights website](https://openflights.org/data.html).
- We will be working with the following files. Below are the most pertinent columns from each dataset:

  - airlines.csv - data on each airline.

    - `country` - where the airline is headquartered.
    - `active` - if the airline is still active.
  
  - airports.csv - data on each airport.

    - `name` - name of the airport.
    - `city` - city where the airport is located.
    - `country` - country where the airport is located.
    - `code` - unique airport code.
    - `latitude` - latitude value.
    - `longitude` - longitude value.  
  
  - routes.csv - data on each flight route.

    - `airline` - airline for the route.
    - `source` - code of the starting airport for the route.
    - `dest` - code of the destination airport for the route.

In [1]:
import pandas as pd
airlines=pd.read_csv('airlines.csv')
airports=pd.read_csv('airports.csv')
routes=pd.read_csv('routes.csv')

In [2]:
print(airlines.iloc[0])
print(airlines.info())


id                       1
name        Private flight
alias                   \N
iata                     -
icao                   NaN
callsign               NaN
country                NaN
active                   Y
Name: 0, dtype: object
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6048 entries, 0 to 6047
Data columns (total 8 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   id        6048 non-null   int64 
 1   name      6048 non-null   object
 2   alias     5614 non-null   object
 3   iata      1461 non-null   object
 4   icao      5961 non-null   object
 5   callsign  5305 non-null   object
 6   country   6033 non-null   object
 7   active    6048 non-null   object
dtypes: int64(1), object(7)
memory usage: 378.1+ KB
None


In [3]:
print(airports.iloc[0])
print(airports.info())


id                              1
name                       Goroka
city                       Goroka
country          Papua New Guinea
code                          GKA
icao                         AYGA
latitude                 -6.08169
longitude                 145.392
altitude                     5282
offset                         10
dst                             U
timezone     Pacific/Port_Moresby
Name: 0, dtype: object
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8107 entries, 0 to 8106
Data columns (total 12 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   id         8107 non-null   int64  
 1   name       8107 non-null   object 
 2   city       8107 non-null   object 
 3   country    8107 non-null   object 
 4   code       5880 non-null   object 
 5   icao       8043 non-null   object 
 6   latitude   8107 non-null   float64
 7   longitude  8107 non-null   float64
 8   altitude   8107 non-null   int64  
 9   offset     8107 n

In [4]:
print(routes.iloc[0])
print(routes.info())


airline         2B
airline_id     410
source         AER
source_id     2965
dest           KZN
dest_id       2990
codeshare      NaN
stops            0
equipment      CR2
Name: 0, dtype: object
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 67663 entries, 0 to 67662
Data columns (total 9 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   airline     67663 non-null  object
 1   airline_id  67663 non-null  object
 2   source      67663 non-null  object
 3   source_id   67663 non-null  object
 4   dest        67663 non-null  object
 5   dest_id     67663 non-null  object
 6   codeshare   14597 non-null  object
 7   stops       67663 non-null  int64 
 8   equipment   67645 non-null  object
dtypes: int64(1), object(8)
memory usage: 4.6+ MB
None


### Install Basemap from Conda
- The easiest way to install Basemap is through Anaconda. If you're new to Anaconda, we recommend checking out the [installation](https://matplotlib.org/basemap/) documentation:
`conda install basemap`
If the above command does not work for you, you can install Basemap from the conda-forge channel through the command line:
`conda install -c conda-forge basemap`
The Basemap library has some external dependencies, and Anaconda handles their installation. To test the installation, run the following import code:
`from mpl_toolkits.basemap import Basemap`

In [5]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from mpl_toolkits.basemap import Basemap


KeyError: 'PROJ_LIB'
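
The `KeyError: 'PROJ_LIB'` above is raised when the import looks up the `PROJ_LIB` environment variable and does not find it. A possible workaround, sketched here under the assumption that the PROJ data files sit in `share/proj` under the active environment (typical for conda on Linux and macOS; the location may differ on Windows or in other setups), is to set the variable before importing Basemap:

```python
import os
import sys

# Assumption: the PROJ data directory is <environment prefix>/share/proj.
# Adjust this path if your installation keeps the files somewhere else.
assumed_proj_dir = os.path.join(sys.prefix, "share", "proj")

# Only set the variable when it is missing, so an existing value is kept.
if "PROJ_LIB" not in os.environ:
    os.environ["PROJ_LIB"] = assumed_proj_dir
```

After running this, re-run `from mpl_toolkits.basemap import Basemap` in the same session.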

## Workflow with Basemap

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
m=Basemap(llcrnrlat=-80,urcrnrlat=80,llcrnrlon=-180,urcrnrlon=180,projection='merc')


## Converting from Spherical to Cartesian Coordinates

In [None]:
m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
longitudes=airports['longitude'].tolist()
latitudes=airports['latitude'].tolist()
x,y=m(longitudes,latitudes)
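
To give a rough idea of what the call `m(longitudes, latitudes)` does, here is a minimal sketch of a spherical Mercator projection in plain Python. It is an illustration only, not Basemap's implementation: the function name `spherical_mercator` and the radius value are assumptions, and Basemap additionally shifts the result so that the lower-left corner of the map is the origin.

```python
import math

# Assumption: a spherical Earth with this radius in meters.
EARTH_RADIUS_M = 6370997.0

def spherical_mercator(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project one longitude/latitude pair (degrees) to x/y (meters)."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = radius * lon
    # y grows without bound toward the poles, which is why the map
    # above is limited to latitudes between -80 and 80.
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

# Convert a few sample points, including the first airport in airports.csv.
sample_lons = [-180.0, 0.0, 145.392]
sample_lats = [0.0, 0.0, -6.08169]
points = [spherical_mercator(lon, lat) for lon, lat in zip(sample_lons, sample_lats)]
for (lon, lat), (x, y) in zip(zip(sample_lons, sample_lats), points):
    print(f"lon={lon}, lat={lat} -> x={x:.0f}, y={y:.0f}")
```

The x value depends only on longitude and the y value only on latitude, which is what lets a scatter plot treat the projected values as ordinary Cartesian coordinates.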

## Generating a Scatter Plot

In [None]:
m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
x, y = m(longitudes, latitudes)
m.scatter(x,y,s=1)
plt.show()

## Customizing the Map using Basemap

In [None]:
m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
longitudes = airports["longitude"].tolist()
latitudes = airports["latitude"].tolist()
x, y = m(longitudes, latitudes)
m.scatter(x, y, s=1)
m.drawcoastlines()

## Customizing the Map using Matplotlib

In [None]:
# Create the figure and axes before creating the Basemap instance.
fig, ax = plt.subplots(figsize=(15,20))
plt.title("Scaled Up Earth With Coastlines")
m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
longitudes = airports["longitude"].tolist()
latitudes = airports["latitude"].tolist()
x, y = m(longitudes, latitudes)
m.scatter(x, y, s=1)
m.drawcoastlines()
plt.show()

## Introduction to Great Circles
We use the `Basemap.drawgreatcircle()` method to display a great circle between two points. The method requires four parameters in the following order:

- `lon1` - longitude of the starting point.
- `lat1` - latitude of the starting point.
- `lon2` - longitude of the ending point.
- `lat2` - latitude of the ending point.
```
m.drawgreatcircle(39.956589, 43.449928, 49.278728, 55.606186)
m.drawgreatcircle(48.006278, 46.283333, 49.278728, 55.606186)
m.drawgreatcircle(39.956589, 43.449928, 43.081889 , 44.225072)
```
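
A great circle is the shortest path between two points on a sphere. As a rough illustration of the idea, the sketch below computes the length of that path with the haversine formula. The function name `great_circle_distance_km` and the Earth radius are assumptions made for this example; Basemap itself only needs the four coordinates and does the drawing.

```python
import math

def great_circle_distance_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Haversine distance between two points given in degrees.

    The argument order matches drawgreatcircle(): lon1, lat1, lon2, lat2.
    Assumption: a spherical Earth with a mean radius of 6371 km.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))

# Length of the first route drawn above.
print(great_circle_distance_km(39.956589, 43.449928, 49.278728, 55.606186))
```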

In [None]:
geo_routes=pd.read_csv('geo_routes.csv')
geo_routes.info()
print(geo_routes[:5])

Unfortunately, basemap struggles to create great circles for routes where the absolute difference between the latitude values or between the longitude values is larger than 180 degrees. This is because the `Basemap.drawgreatcircle()` method can't create great circles properly when they go outside the map boundaries. The documentation for the method mentions this briefly:

Note: Cannot handle situations in which the great circle intersects the edge of the map projection domain, and then re-enters the domain.
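
The 180-degree rule can be expressed as a small check. This is a sketch with a hypothetical helper name, `is_drawable`, and made-up endpoints chosen only to show one route that crosses the edge of the map and one that does not:

```python
def is_drawable(start_lon, start_lat, end_lon, end_lat, limit=180.0):
    """Return True when both coordinate differences are below `limit` degrees."""
    return abs(end_lat - start_lat) < limit and abs(end_lon - start_lon) < limit

# Hypothetical route across the Pacific: the longitude difference is 260 degrees.
print(is_drawable(140.0, 35.0, -120.0, 34.0))  # False
# Hypothetical route across the Atlantic: the longitude difference is 96.5 degrees.
print(is_drawable(-97.0, 33.0, -0.5, 51.5))    # True
```

Since latitudes range from -90 to 90, it is the longitude check that rejects routes in practice.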


Write a function named `create_great_circles()` that draws a great circle for each route whose absolute differences in latitude and in longitude are both less than 180. This function should:

- Accept a dataframe as the sole parameter.
- Iterate over the rows in the dataframe using `DataFrame.iterrows()`.
- For each row, draw a great circle using the four geographic coordinates only if:
  - The absolute difference between the latitude values (`end_lat` and `start_lat`) is less than 180.
  - The absolute difference between the longitude values (`end_lon` and `start_lon`) is less than 180.

Then:

- Create a filtered dataframe containing just the routes that start at the DFW airport.
  - Select only the rows in `geo_routes` where the value for the `source` column equals `"DFW"`.
  - Assign the resulting dataframe to `dfw`.
- Pass `dfw` into `create_great_circles()` and display the plot using the `pyplot.show()` function.



In [None]:
fig, ax = plt.subplots(figsize=(15,20))
m = Basemap(projection='merc', llcrnrlat=-80, urcrnrlat=80, llcrnrlon=-180, urcrnrlon=180)
m.drawcoastlines()

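# Start writing your solution below this line
def create_great_circles(df):
    # Draw each route on the Basemap instance `m` created above.
    for index, row in df.iterrows():
        end_lat, start_lat = row['end_lat'], row['start_lat']
        end_lon, start_lon = row['end_lon'], row['start_lon']
        # Skip routes whose latitude or longitude difference is 180 degrees or more.
        if abs(end_lat - start_lat) < 180 and abs(end_lon - start_lon) < 180:
            m.drawgreatcircle(start_lon, start_lat, end_lon, end_lat)

# Keep only the routes that start at the DFW airport.
dfw = geo_routes[geo_routes["source"] == 'DFW']
create_great_circles(dfw)
plt.show()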

## What to Do Next

Plotting tools:

- Creating 3D plots using Plotly
- Creating interactive visualizations using Bokeh
- Creating interactive map visualizations using Folium

The art and science of data visualization:

- The Visual Display of Quantitative Information
- Visual Explanations: Images and Quantities, Evidence and Narrative