<a href="https://colab.research.google.com/github/IDCE-MSGIS/lab-4-zoning-maps-pandas-jstrzempko/blob/main/Jess_Strzempko_Zoning_maps_pandas_IDCE30274Lab4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Jess Strzempko  
Created 11.06.20  
Python version 3.6.9  
IDCE 30274 Computer Programming for GIS  

## Lab 4 - Zoning Maps with Pandas

This notebook uses data on Planned Unit Developments (PUDs) and affordable housing from DC's Open Data portal to demonstrate how to work with geographic shapefiles in a Jupyter notebook. It reads data that was uploaded to Google Drive, joins the datasets spatially, and conducts basic analysis. A zoning crosswalk assigns each PUD a clearer zoning category. Preliminary maps are created for visualization purposes, but the full map built from the shapefile exported by this notebook can be viewed through the Carto link on the GitHub repo.

**Inputs:**  

* Packages installed  
  + pandas  
  + geopandas  
  + shapely.geometry (Point, Polygon, MultiPolygon)  
  + shapely.wkt (well-known text)  
  + rtree  
* Planned_Unit_Development__PUDs_.shp (PUDs in DC as a shapefile)
* Affordable_Housing.csv (affordable housing in DC as a csv file)
* zoning_crosswalk.csv (zoning crosswalk as a csv)

**Outputs:**  
* puds_info.shp (shapefile containing joined PUD & AH info)
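
The crosswalk step mentioned in the overview can be sketched on toy data. This is a minimal illustration, not the notebook's actual data: the `pud_id` column and the zoning codes are made up, and only the `PUD_ZONING`, `Zone`, and `Zone_Cat` column names come from the analysis code.

```python
import pandas as pd

# Toy PUD table; pud_id and the zoning codes are invented for illustration
puds = pd.DataFrame({
    "pud_id": [1, 2, 3],
    "PUD_ZONING": ["R-5-B", "C-2-A", "XX-9"],
})

# Toy crosswalk mapping each zoning code to a broad category
crosswalk = pd.DataFrame({
    "Zone": ["R-5-B", "C-2-A"],
    "Zone_Cat": ["Residential", "Commercial"],
})

# A left merge keeps every PUD; codes absent from the crosswalk get NaN
labeled = puds.merge(crosswalk, how="left", left_on="PUD_ZONING", right_on="Zone")

# List the zoning codes that found no match in the crosswalk
unmatched = labeled.loc[labeled["Zone_Cat"].isna(), "PUD_ZONING"].tolist()
print(labeled[["pud_id", "PUD_ZONING", "Zone_Cat"]])
print("Codes missing from crosswalk:", unmatched)
```

Checking for unmatched codes after a left merge is a quick way to spot zoning values the crosswalk does not cover.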

In [None]:
# Set-up Code Block

# The pandas library is already installed in the Google Colab environment,
# but geopandas and rtree may not be, so we install them here

# Add GIS packages to Colab environment
!pip install geopandas
!apt-get install -y libspatialindex-dev
!pip install rtree

# Import packages
# Use "import [package] as [name]" to specify an easier name to call the package for future reference
# Comment code provides information on purpose of each package
import pandas as pd  # provides interface for interacting with tabular data
import geopandas as gpd  # combines the capabilities of pandas and shapely for geospatial operations
from shapely.geometry import Point, Polygon, MultiPolygon  # for manipulating text data into geospatial shapes
from shapely import wkt  # stands for "well known text," allows for interchange across GIS programs
import rtree  # supports geospatial join

# Import drive into Colab environment
from google.colab import drive
# Connect Colab to Google Drive
drive.mount('/content/gdrive') 
# Set root path to folder where data was uploaded from OpenData DC
root_path = 'gdrive/My Drive/Colab_Notebooks/IDCE30274/Lab4'
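
Before running the analysis cell, it may help to confirm that the three input files are where the notebook expects them. The following is a small sketch under the assumption that inputs live in an `inputs` subfolder of the root path, as the read calls suggest; the helper name `missing_inputs` is hypothetical.

```python
from pathlib import Path

# File names are taken from the Inputs list; the helper name is hypothetical
EXPECTED_INPUTS = [
    "Planned_Unit_Development__PUDs_.shp",
    "Affordable_Housing.csv",
    "zoning_crosswalk.csv",
]

def missing_inputs(root, names=EXPECTED_INPUTS):
    """Return the expected input files that are not present under root/inputs."""
    inputs_dir = Path(root) / "inputs"
    return [name for name in names if not (inputs_dir / name).is_file()]
```

Calling `missing_inputs(root_path)` returns an empty list when every input is present, and otherwise names the files that still need to be uploaded.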

In [None]:
# Analysis Code Block

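# Read in PUDs file as a GeoDataFrame; the coordinate reference system (CRS) is read from the shapefile's .prj file
# (if the file uses a CRS other than EPSG:4326, reproject afterwards with puds.to_crs('EPSG:4326'))
puds = gpd.read_file(root_path+'/inputs/Planned_Unit_Development__PUDs_.shp')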
# Read in two csv files as standard pandas DataFrames
aff = pd.read_csv(root_path+'/inputs/Affordable_Housing.csv')
crosswalk = pd.read_csv(root_path+'/inputs/zoning_crosswalk.csv')

# Commented-out lines of code provide visualization of data

# Running .sample(3) returns 3 randomly chosen rows from puds
# puds.sample(3)
# Summarize what is within the Affordable housing .csv with .info()
# aff.info()
# Same can be done for puds
# puds.info()

# Create a geometry column in the affordable housing dataframe
# by wrapping the longitude ('X') and latitude ('Y') in a Shapely Point object
aff['geometry'] = aff.apply(lambda row: Point(row.X, row.Y), axis=1)
# Create a GeoDataFrame from the new aff data, declaring its CRS as WGS 84 (EPSG:4326)
aff = gpd.GeoDataFrame(aff, geometry='geometry', crs='EPSG:4326')

# Now when we sample one random row, we can see the addition of the geometry column
# aff.sample(1) 

# Use geospatial join to identify which PUDs include affordable housing projects
# Merge the datasets based on their geographic intersection
# Note: geopandas 0.10+ names this argument `predicate`; older releases called it `op`
puds_aff = gpd.sjoin(puds, aff, predicate='intersects', how='left')

# Check that the Merge was performed successfully
# puds_aff.info()

# Merge dataframe with zoning categories crosswalk
# This will categorize zoning exempted buildings as Commercial, Residential, or Other/Mixed Use.
puds_info = puds_aff.merge(crosswalk[['Zone', 'Zone_Cat']], how='left', left_on='PUD_ZONING', right_on='Zone')

# Use print statements to show us the total number of PUDs
# and how many offer affordable housing
# print(f"Total count of PUDs: {puds_info.shape[0]}")
# print(f"Count PUDs offering Affordable Housing: {puds_info.loc[~puds_info.PROJECT_NAME.isna()].shape[0]}")

# Create a quick map of PUDs by Zoning Category
# puds_info.plot(column='Zone_Cat', legend=True, figsize=(16,8));

# Create a map of just the PUDs that provide Affordable Housing
# (a fixed color is used here; passing both column= and color= makes geopandas ignore the column)
# puds_info[puds_info['TOTAL_AFFORDABLE_UNITS']>0].plot(color='grey', figsize=(16,8));

# Export GeoDataFrame as a shapefile within Colab environment
puds_info.to_file('puds_info.shp')

# Copy shapefile components from Colab environment to Google Drive using shell commands
# Check Google Drive to confirm that the copy was successful
!cp puds_info.cpg 'gdrive/My Drive/Colab_Notebooks/IDCE30274/Lab4/outputs'
!cp puds_info.dbf 'gdrive/My Drive/Colab_Notebooks/IDCE30274/Lab4/outputs'
!cp puds_info.prj 'gdrive/My Drive/Colab_Notebooks/IDCE30274/Lab4/outputs'
!cp puds_info.shp 'gdrive/My Drive/Colab_Notebooks/IDCE30274/Lab4/outputs'
!cp puds_info.shx 'gdrive/My Drive/Colab_Notebooks/IDCE30274/Lab4/outputs'
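
The five `cp` commands above could also be expressed in Python, which avoids listing each shapefile component by hand. This is only a sketch: the helper name `copy_shapefile_parts` is hypothetical, and it assumes every file sharing the shapefile's stem should be copied.

```python
import shutil
from pathlib import Path

def copy_shapefile_parts(stem, src_dir, dest_dir):
    """Copy every file named <stem>.<ext> from src_dir to dest_dir.

    Returns the sorted list of file names that were copied.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the output folder if needed
    copied = []
    for part in sorted(Path(src_dir).glob(f"{stem}.*")):
        if part.is_file():
            shutil.copy2(part, dest / part.name)
            copied.append(part.name)
    return copied
```

In this notebook the call would look like `copy_shapefile_parts('puds_info', '.', root_path + '/outputs')`, and the returned list shows which components were copied.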