# Geospatial Plotting

In the previous notebook, we looked at basic geospatial plotting.
Here we are using GeoPandas, but there are numerous other ways to plot geospatial data.
In the R notebooks, we have seen the use of mapping services, such as Google Maps, for rendering maps.
This functionality is available in Python as well.

In this notebook, we are going to examine layering data on top of maps for visual presentations.

## Load data from PostGIS

In [None]:
import matplotlib.pyplot as plt
import geopandas as gpd
import psycopg2

con = psycopg2.connect(database="dsa_ro", user="dsa_ro_user", password="readonly", host="dbase")

# Pull area, region, and subregion from the table as well
sql = "select name, lon, lat, area, region, subregion, pop2005, the_geom from geospatial.country_borders"

countries = gpd.GeoDataFrame.from_postgis(sql, con, geom_col='the_geom')


## Map base

In [None]:
%matplotlib inline
countries.plot(color='white', figsize=(15,15))

## Overplotting 

Overplotting allows us to stack visual variables on top of the map.
In this first example, we overplot each country's center point, sizing the marker by its 2005 population in millions.


In [None]:
%matplotlib inline
countries.plot(color='white', figsize=(15,15))

# Millions of People
size = countries['pop2005']/1000000

plt.scatter(x=countries['lon'], y=countries['lat'], c='r', s=size)
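
The `s` argument of `plt.scatter` is a marker area in points squared, so dividing by one million leaves the smallest countries with markers that are nearly invisible. The sketch below shows one possible way to put a floor under the marker size. The helper `marker_areas` and its parameters are hypothetical names used for illustration, and the population figures are made-up sample values, not rows from the table.

```python
def marker_areas(populations, people_per_point=1_000_000, min_area=1.0):
    """Convert raw population counts into scatter marker areas (points^2)."""
    areas = []
    for pop in populations:
        # Same scaling as the cell above: one unit of area per million people
        area = pop / people_per_point
        # Keep very small populations visible by enforcing a minimum area
        areas.append(max(area, min_area))
    return areas


# Made-up sample populations: large, medium, and very small
sizes = marker_areas([1_300_000_000, 5_000_000, 20_000])
print(sizes)
```

Assuming the `countries` frame from above, the result of `marker_areas(countries['pop2005'])` could be passed as `s=` in place of `size`.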

## Combining data sets

In this example, we pull King County, Washington, into a GeoPandas GeoDataFrame.
Then we load a secondary data set with geospatial features and plot it as an additional layer (channel) of data.

In [None]:
# Second-order administrative division (county)
sql = "SELECT iso,name_1, name_2,the_geom "
sql+= " FROM geospatial.gadm_admin_borders "
sql+= " WHERE iso IN ('USA') and name_1 = 'Washington' and name_2 = 'King'"

fourth = gpd.GeoDataFrame.from_postgis(sql,con,geom_col='the_geom' )
fourth.plot(figsize=(15,15), color='white');

In [None]:
import pandas as pd
kc_house_data = pd.read_csv("/dsa/data/all_datasets/house_sales_in_king_county/kc_house_data.csv")
kc_house_data.describe()

In [None]:
fourth.plot(figsize=(15,15), color='white');
plt.scatter(x=kc_house_data['long'], y=kc_house_data['lat']
            ,  alpha=0.15, c='blue'
            , s=20
           )
# Zoom in a little
plt.xlim(-122.6,-121.6)
plt.show()
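
The zoom limits above are typed in by hand. As an alternative, the limits can be derived from the point layer itself. The following is a minimal sketch under that assumption. `padded_bounds` and `pad_fraction` are hypothetical names, and the coordinates are sample values rather than rows from the CSV.

```python
def padded_bounds(xs, ys, pad_fraction=0.05):
    """Return ((x_min, x_max), (y_min, y_max)) widened by a fraction of each range."""
    xs = list(xs)
    ys = list(ys)
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Pad each axis in proportion to its own extent
    x_pad = (x_max - x_min) * pad_fraction
    y_pad = (y_max - y_min) * pad_fraction
    return (x_min - x_pad, x_max + x_pad), (y_min - y_pad, y_max + y_pad)


# Sample longitude/latitude pairs (illustrative only)
sample_lon = [-122.4, -122.1, -121.8]
sample_lat = [47.3, 47.5, 47.7]
xlim, ylim = padded_bounds(sample_lon, sample_lat)
print(xlim, ylim)
```

With the house data loaded, the same call on `kc_house_data['long']` and `kc_house_data['lat']` would give tuples that can be unpacked into `plt.xlim(*xlim)` and `plt.ylim(*ylim)`.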

## Classification / Nominal Labeling

It is possible to use the choropleth capability to show classifications.
However, when the number of labels exceeds the supported levels, you will have to revert to more programmatic techniques.

In the next few examples, the subregion is used to color the countries.
First, look at the region and subregion fields, noticing that they are numeric.

In [None]:
countries.head()

The following visualization then maps the range of values into buckets using the Fisher-Jenks method.

In [None]:
# Classification schemes rely on PySAL (mapclassify in newer GeoPandas releases)
import pysal as ps
countries.plot(column='subregion',      # Which column has the measures
               scheme='fisher_jenks',   # How to partition the measure into color buckets
               k=6,                     # Number of color buckets
               cmap='YlGnBu',           # From the Color Map Options
               figsize=(15,15))         # Figure size in inches
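
To see why numeric bucketing blurs nominal codes, it helps to look at what the classifier optimizes. Fisher-Jenks picks class boundaries so that the values inside each bucket stay as close as possible to their bucket mean. The block below is a small pure-Python sketch of that optimization using dynamic programming. It is not the implementation GeoPandas calls, `fisher_jenks_breaks` is a hypothetical name, and the input codes are illustrative values.

```python
def fisher_jenks_breaks(values, k):
    """Upper bound of each of k classes minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums let us compute the squared deviation of any slice in O(1)
    prefix = [0.0]
    prefix_sq = [0.0]
    for v in data:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def ssd(i, j):
        # Sum of squared deviations from the mean for data[i:j]
        total = prefix[j] - prefix[i]
        total_sq = prefix_sq[j] - prefix_sq[i]
        return total_sq - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best total deviation when data[:j] is split into c classes
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    split = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    split[c][j] = i

    # Walk the split table backwards to recover each class's largest value
    breaks = []
    j = n
    for c in range(k, 0, -1):
        breaks.append(data[j - 1])
        j = split[c][j]
    return breaks[::-1]


# Illustrative numeric codes (not read from the table)
sample_codes = [5, 11, 13, 14, 15, 17, 18, 21, 29, 30, 34, 35]
print(fisher_jenks_breaks(sample_codes, 6))
```

With twelve distinct codes and six buckets, several codes necessarily share a color, which is the limitation the next cell addresses.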

However, GeoPandas can treat and visualize a measure as categorical when you pass `categorical=True`.

In [None]:
countries.plot(column='subregion',      # Which column has the measures
               categorical=True,        # Categories
               cmap='YlGnBu',           # Colors
               figsize=(15,15))         # Figure size in inches
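
Conceptually, a categorical plot gives every distinct code its own color instead of grouping codes by numeric range. The sketch below illustrates that idea by spreading the distinct values evenly across the 0 to 1 range of a color map. It is an assumption about the general approach, not a description of GeoPandas internals, and `category_positions` is a hypothetical helper with sample codes as input.

```python
def category_positions(values):
    """Map each distinct value to an evenly spaced position in [0, 1]."""
    categories = sorted(set(values))
    if len(categories) == 1:
        # A single category has nothing to be spaced against
        return {categories[0]: 0.0}
    step = 1.0 / (len(categories) - 1)
    return {cat: index * step for index, cat in enumerate(categories)}


# Sample codes: the numeric gaps between them do not affect the spacing
positions = category_positions([2, 19, 2, 142])
print(positions)
```

Because only the rank of each code matters, codes that are numerically far apart still receive equally spaced colors.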

## <span style="background:yellow">YOUR TURN</span>

Experiment with a few variations on the parameters: try different `cmap` values and use **region** versus **subregion**.  
Leave your favorite in place for submission.

In [None]:
# Add your code below this comment
# ----------------------------------





# SAVE YOUR NOTEBOOK, then File > "Close and Halt"