## Assignment

### Background
So far we have used Python with the `pandas` library to learn about `DataFrame`s and with `geopandas` to learn about `GeoDataFrame`s by following some basic tutorials. This assignment uses the toy `nybb` dataset, which is popular in the open source software world, to work with `geopandas` in a more exploratory way. More specifically, we will reproduce some output that we previously produced in the QGIS Tutorials.

Before starting:

- Read the GeoPandas [tutorial](https://geopandas.org/getting_started/introduction.html).
- Browse the GeoPandas [user guide](https://geopandas.org/docs/user_guide.html).

#### Geospatial libraries and notes on documentation 
Geospatial Library Reference:
- [Shapely docs](https://shapely.readthedocs.io/en/stable/manual.html)
- [GeoPandas docs](http://geopandas.org/)
- [Pandas docs](https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html#user-guide)

Documentation quality varies. The docs for the libraries above range from "ok" to "good". A working knowledge of `pandas` helps, and the `pandas` docs are quite good both for learning and as a reference. `geopandas` uses the `shapely` model for geometries; `shapely` must be installed, but we rarely interact with it directly beyond accessing the `shapely` geometries stored in a `GeoDataFrame`.
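
As a small illustration of that relationship, the sketch below builds a two-row `GeoDataFrame` by hand and checks that each entry in its geometry column is an ordinary `shapely` object. The names and coordinates are invented for the example and have nothing to do with the assignment data.

```python
import geopandas
from shapely.geometry import Point, Polygon

# Hypothetical toy data: one 2 x 2 square and one point (coordinates invented).
toy = geopandas.GeoDataFrame(
    {"name": ["square", "dot"]},
    geometry=[Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]), Point(1, 1)],
)

# Each geometry is a plain shapely object, so shapely attributes are available.
first_geom = toy.geometry.iloc[0]
print(type(first_geom))
print(first_geom.area)  # area of the square

# shapely predicates work too: is the point inside the square?
print(toy.geometry.iloc[1].within(first_geom))
```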

### Objective
The objective of this lab is to reproduce one of the QGIS Tutorials you did previously:
- [Performing spatial joins](http://www.qgistutorials.com/en/docs/3/performing_spatial_joins.html)


## Deliverables
An open Pull Request from a branch named `geopandas` to be merged with `master` containing the following files:
- `spatial_join.py`
- `spatial_join.png`

## Prep your codespace python environment
`geopandas` is not installed by default, so we will use `pip` to install it along with a few other libraries we need.

The cell below executes the `pip` command in the shell (the leading `!` passes the line to the shell) and installs those libraries.

In [None]:
!pip install geopandas descartes mapclassify folium branca rtree pygeos

## Performing spatial joins
Review [Performing spatial joins](http://www.qgistutorials.com/en/docs/3/performing_spatial_joins.html) to see our objective. 

### Download and extract the data

- [NY Boros](http://www.qgistutorials.com/downloads/nybb_19a.zip)
- [Pavement Ratings](http://www.qgistutorials.com/downloads/V_SSS_SEGMENTRATING_1.zip)

Meanwhile, we are going to record all of these Python commands in a separate file named `spatial_join.py`. This file should be executable outside of the Jupyter notebook environment and run from start to finish.


First, download the data we will use. I have added some patterns to a `.gitignore` file so these will not be saved to the repo. We don't need to save large zip files to a git repo. 

In [None]:
from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile
zipurl = 'http://www.qgistutorials.com/downloads/nybb_19a.zip'
with urlopen(zipurl) as zipresp:
    with ZipFile(BytesIO(zipresp.read())) as zfile:
        zfile.extractall('./data/nybb')
zipurl = 'http://www.qgistutorials.com/downloads/V_SSS_SEGMENTRATING_1.zip'
with urlopen(zipurl) as zipresp:
    with ZipFile(BytesIO(zipresp.read())) as zfile:
        zfile.extractall('./data/vss')
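
The two download blocks above differ only in the URL and the target folder, so they could be folded into one function. The sketch below is one way to do that; `download_and_extract` is a hypothetical helper name, not something the assignment requires, and the "skip if the folder already has files" check is an added assumption that makes re-running the script cheaper.

```python
from io import BytesIO
from pathlib import Path
from urllib.request import urlopen
from zipfile import ZipFile


def download_and_extract(zipurl, dest):
    """Download the zip archive at `zipurl` and extract it into `dest`.

    Hypothetical helper: it skips the download when `dest` already
    contains files, so repeated runs do not fetch the archive again.
    """
    dest = Path(dest)
    if dest.exists() and any(dest.iterdir()):
        return dest  # already extracted on an earlier run
    with urlopen(zipurl) as zipresp:
        # Read the whole archive into memory, then unpack it to disk.
        with ZipFile(BytesIO(zipresp.read())) as zfile:
            zfile.extractall(dest)
    return dest
```

Calling it once per archive, for example `download_and_extract(zipurl, './data/nybb')`, would replace each of the nested `with` blocks.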

### Load the data in python
We are going to use `geopandas` and `descartes` in this lab, so import them:

In [None]:
import geopandas
import descartes


`geopandas` can read shapefiles. Pass it the path to the `nybb` shapefile we downloaded:

In [None]:
nybb = geopandas.read_file('./data/nybb/nybb_19a/nybb.shp')

Next, do some basic exploration of the data and the Python structures made from it:

In [None]:

print(nybb)
print(type(nybb))
nybb.head()


Note that the `.head()` method acts _on_ the `GeoDataFrame` and returns its first 5 rows, which the notebook then displays. Next, plot it. The `.plot()` method also acts on the `GeoDataFrame` and draws it using the `matplotlib` library:

In [None]:
nybb.plot()

Do the same for the street pavement rating:

In [None]:
vss = geopandas.read_file('./data/vss/dot_V_SSS_SEGMENTRATING_1_20190129.shp')

And explore:

In [None]:
print(vss)
print(type(vss))
print(vss.head())
vss.plot()

### Subset the Streets data
`geopandas` lets us subset the data with boolean indexing, just as `pandas` does. Let's construct a filter that drops the rows where `RatingWord` is `NR` and keeps all the others.

View the data:

In [None]:
vss['RatingWord']

Construct a boolean `Series` with the same length as `vss`, in which the value is `True` if `RatingWord` is not equal to `NR` and `False` otherwise:

In [None]:
vss['RatingWord'] != 'NR'

Now we can subset `vss`, keeping only the rows for which that expression is `True`:

In [None]:
vss_sub = vss[vss['RatingWord'] != 'NR']

Now we have a new `GeoDataFrame` named `vss_sub` with fewer rows. Take a look at its shape and compare to `vss` to confirm:

In [None]:
print(vss_sub.shape)
print(vss.shape)
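
The same mask-then-subset pattern can be tried on a tiny table first. The sketch below uses a plain `pandas` `DataFrame` as a stand-in for `vss`; only the column names mirror the real data, and every value is invented.

```python
import pandas as pd

# Hypothetical stand-in for `vss`: column names match, values are invented.
streets = pd.DataFrame(
    {
        "RatingWord": ["GOOD", "NR", "FAIR", "NR", "POOR"],
        "Rating_B": [8, 0, 5, 0, 2],
    }
)

mask = streets["RatingWord"] != "NR"  # boolean Series, one value per row
rated = streets[mask]                 # keeps only the rows where mask is True

print(mask.tolist())
print(streets.shape, rated.shape)
```

Because the mask is built from the same frame it filters, the two always line up row by row.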

### Perform spatial join between boros and streets
`geopandas` has an `sjoin` function ([doc](http://geopandas.org/reference/geopandas.sjoin.html)) to perform spatial joins:

In [None]:
nybb_with_vss = geopandas.sjoin(nybb, vss_sub)

Take a look:

In [None]:
print(type(nybb_with_vss))
nybb_with_vss.head()
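
To see what `sjoin` does without the full datasets, here is a minimal sketch on made-up geometries: two square "boros" and three short "street" segments. The layer names, coordinates, and the choice of EPSG:2263 are all assumptions for illustration. The CRS comparison is a sanity check worth running on the real layers as well, since a spatial join only makes sense when both layers use the same coordinate reference system.

```python
import geopandas
from shapely.geometry import LineString, Polygon

# Hypothetical data: two separate squares and three short line segments.
# EPSG:2263 is an assumed CRS, chosen only so that both layers match.
boros = geopandas.GeoDataFrame(
    {"BoroName": ["West", "East"]},
    geometry=[
        Polygon([(0, 0), (10, 0), (10, 10), (0, 10)]),
        Polygon([(20, 0), (30, 0), (30, 10), (20, 10)]),
    ],
    crs="EPSG:2263",
)
streets = geopandas.GeoDataFrame(
    {"Rating_B": [8, 4, 6]},
    geometry=[
        LineString([(1, 1), (2, 2)]),    # lies inside West
        LineString([(3, 3), (4, 4)]),    # lies inside West
        LineString([(21, 1), (22, 2)]),  # lies inside East
    ],
    crs="EPSG:2263",
)

# Both layers must share a CRS for the join to be meaningful.
assert boros.crs == streets.crs

# Default behaviour: inner join on geometries that intersect.
joined = geopandas.sjoin(boros, streets)
print(joined[["BoroName", "Rating_B"]])
```

The result has one row per boro and street pair, so a boro's attributes and geometry repeat once for every street segment that intersects it.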

### Summarize stats
We have successfully attached to each street segment the attributes of the boro it falls in. Now let's summarize. This functionality is inherited from the `pandas` library:

In [None]:
mean_rating_by_boro = nybb_with_vss.groupby(['BoroCode'])['Rating_B'].mean()

This creates a `pandas` `Series` containing the mean of `Rating_B` for each `BoroCode` in the `nybb_with_vss` dataframe, which we can inspect:

In [None]:
print(type(mean_rating_by_boro))
mean_rating_by_boro.head()
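
The group-and-average step can be checked by hand on a few rows. The sketch below uses an invented table with the same two column names; selecting a single column before calling `.mean()` is what makes the result a `Series` rather than a `DataFrame`.

```python
import pandas as pd

# Hypothetical joined table: the codes and ratings are invented.
joined = pd.DataFrame(
    {
        "BoroCode": [1, 1, 2, 2, 2],
        "Rating_B": [8, 6, 3, 5, 7],
    }
)

# Selecting one column before aggregating returns a Series indexed by BoroCode.
mean_by_code = joined.groupby("BoroCode")["Rating_B"].mean()
print(type(mean_by_code))
print(mean_by_code)

# reset_index() turns it into a two-column DataFrame if a table is preferred.
mean_table = mean_by_code.reset_index()
print(mean_table)
```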

### Join pavement summary stats to boros
The above is just a table. To give those attributes to the original boros data we need to do a table join, which in `pandas` parlance (inherited by `geopandas`) is `.merge()`:

In [None]:
nybb_with_mean_ratings = nybb.merge(mean_rating_by_boro, on='BoroCode')
nybb_with_mean_ratings.head()
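
One detail worth knowing is that `.merge()` defaults to an inner join, so any row without a match on the key is silently dropped. The sketch below shows the difference between the default and `how="left"` on invented data; the boro names and means are hypothetical, and the named `Series` plays the role of `mean_rating_by_boro`.

```python
import pandas as pd

# Hypothetical boro table and per-boro means; all values are invented.
boros = pd.DataFrame({"BoroCode": [1, 2, 3], "BoroName": ["A", "B", "C"]})
means = pd.Series(
    [7.0, 5.0],
    index=pd.Index([1, 2], name="BoroCode"),
    name="Rating_B",
)

# Default inner merge: boro 3 has no mean, so it is dropped.
inner = boros.merge(means, on="BoroCode")

# how="left" keeps every boro and fills the missing mean with NaN.
left = boros.merge(means, on="BoroCode", how="left")

print(inner)
print(left)
```

In the assignment every boro should have rated streets, so the default is fine there; comparing the row counts before and after the merge is a quick way to confirm nothing was lost.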

Save your final Python file as `spatial_join.py` in this repository. Additionally, take a screenshot showing the results of the final step in the assignment and save it as `spatial_join.png`.