<a href="https://colab.research.google.com/github/ReidelVichot/DSTEP23/blob/main/week_8/dstep23_geospatial_intro_part2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **DSTEP23 // Introduction to Geospatial Data: Part 2**

*October 19, 2023*

This notebook introduces tools for working with geospatial data in Python, using zipcode shapes and MapPLUTO data for New York City.

---

#### **Loading and visualizing geospatial data with python**

The core package for working with geospatial data in Python is `geopandas`:

In [None]:
import numpy as np
import pandas as pd
import geopandas as gp
import matplotlib.pyplot as plt

In [None]:
# -- load the *shapes* of the NYC zipcodes
zname = "/content/drive/Shareddrives/dstep23/data/geos/nyc/zipcode_shapes/ZIP_CODE_040114.shp"
zshps = gp.read_file(zname)

In [None]:
# -- note the coordinate reference system (CRS)
zshps.crs
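
The CRS determines what the numbers in the geometry column mean. As a minimal sketch (using two toy points rather than the zipcode file, and with EPSG:2263 chosen only as an example of a projected system), the object returned by `.crs` can be queried for its EPSG code and axis units:

```python
import geopandas as gp
from shapely.geometry import Point

# -- two toy GeoSeries holding one point each, tagged with different systems
#    (EPSG:2263 is only an example; zshps.crs reports the file's actual system)
feet = gp.GeoSeries([Point(0, 0)], crs="EPSG:2263")
degs = gp.GeoSeries([Point(0, 0)], crs="EPSG:4326")

for gs in (feet, degs):
    crs = gs.crs
    # -- EPSG code, whether coordinates are angles, and the unit of the first axis
    print(crs.to_epsg(), crs.is_geographic, crs.axis_info[0].unit_name)
```

The units matter later on: areas and distances computed from the shapes are expressed in the units of whichever CRS the GeoDataFrame carries.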

In [None]:
# -- a simple plot
fig, ax = plt.subplots(figsize=(5, 5))
zshps.plot(ax=ax)
fig.show()

In [None]:
# -- highlight zipcodes in Queens
queens = zshps[zshps["COUNTY"] == "Queens"]
others = zshps[zshps["COUNTY"] != "Queens"]

fig, ax = plt.subplots(figsize=(5, 5))
queens.plot(facecolor="crimson", ax=ax)
others.plot(facecolor="steelblue", ax=ax)
fig.show()

In [None]:
# -- plot categorical variables
fig, ax = plt.subplots(figsize=(5, 5))
zshps.plot("COUNTY", legend=True, legend_kwds={"loc":"upper left"}, ax=ax)
fig.show()

In [None]:
# -- make a color-coded "map" of numerical variables
fig, ax = plt.subplots(figsize=(7, 5))
zshps.plot("POPULATION", cmap="viridis", legend=True, ax=ax)
fig.text(0.9, 0.5, "number of residents", rotation=-90, va="center")
fig.show()

In [None]:
# -- make a histogram of the population values
fig, ax = plt.subplots(figsize=(7, 3))
zshps.hist("POPULATION", bins=20, ax=ax)
ax.set_xlabel("number of persons")
ax.set_ylabel("number of zipcodes")
ax.set_title("")
fig.show()

#### **Accessing and working with values, attributes, and methods in GeoDataFrames**

The values in a GeoDataFrame can be accessed and used exactly as in a DataFrame:

In [None]:
# -- display the "POPULATION" column
zshps["POPULATION"]

In [None]:
# -- calculate summary statistics of the various numerical columns
zshps.describe()

Many of the attributes and methods of GeoDataFrames are the same as DataFrames,

In [None]:
# -- print the columns attribute
print(zshps.columns)

In [None]:
# -- access the 5th row (position 4, since Python counts from 0)
zshps.iloc[4]

In [None]:
# -- print the unique values of a categorical variable
print(zshps["COUNTY"].unique())

but some are **unique** to GeoDataFrames,

In [None]:
# -- print the centroid attribute of the GeoDataFrame
print(zshps.centroid)

In [None]:
# -- print the area attribute (but note this data set contains the same info in a column)
print(zshps.area)

In [None]:
# -- display just the geometry of a single row
geo5 = zshps.geometry.iloc[4]

geo5
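
The GeoDataFrame-specific attributes and methods (`centroid`, `area`, and `distance`) are easiest to check on shapes whose answers are known by hand. The sketch below uses two hypothetical unit squares rather than the zipcode data; the names `toy`, `sq0`, `sq1`, and `geo0` are made up for illustration.

```python
import geopandas as gp
from shapely.geometry import Polygon

# -- two hypothetical unit squares, 2 units apart along x (toy data, not the NYC zipcodes)
sq0 = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
sq1 = Polygon([(3, 0), (4, 0), (4, 1), (3, 1)])
toy = gp.GeoDataFrame({"name": ["a", "b"]}, geometry=[sq0, sq1])

# -- centroid and area are computed per row from the geometry column
print(toy.centroid)
print(toy.area)

# -- distance from every row to a single shape
geo0 = toy.geometry.iloc[0]
print(toy.distance(geo0))
```

Note that `distance` measures the shortest separation between the shapes themselves, not between their centroids: the second square is 2 units from the first even though the centroids are 3 units apart, and a shape's distance to itself is 0.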

In [None]:
# -- for each zipcode, find the distance to the 5th zipcode using the distance method
zshps.distance(geo5)

Let's demonstrate the distance calculation with a plot,

In [None]:
# -- add the distance to the 5th zipcode as a column in the GeoDataFrame
zshps["dist5"] = zshps.distance(geo5)

In [None]:
# -- set the zipcode number
zcode5 = zshps["ZIPCODE"].iloc[4]  # assumes the zipcode column is named "ZIPCODE"

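# -- make a choropleth color-coded by that distance
fig, ax = plt.subplots(figsize=(9, 7))
zshps.plot("dist5", cmap="viridis", legend=True, ax=ax)
ax.set_title(f"distance to zipcode {zcode5}")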

# -- add a star where the 5th zipcode is (at its centroid)
geo5x = geo5.centroid.x
geo5y = geo5.centroid.y
ax.scatter([geo5x], [geo5y], marker="*", s=200, color="r")

fig.show()

#### **Coordinate transforms and matching**

Let's load the MapPLUTO data for Manhattan,

In [None]:
# -- load MapPLUTO data
mname = "/content/drive/Shareddrives/dstep23/data/nycdcp/mappluto/mn/MapPLUTO_MN.shp"
mnpl = gp.read_file(mname)

In [None]:
# -- display the MapPLUTO data
mnpl.head()

In [None]:
# -- plot the shapes
fig, ax = plt.subplots(figsize=(5, 5))
mnpl.plot(ax=ax)
fig.show()

Suppose we want to zoom in on the Flatiron Building, which we know is at a latitude/longitude of $(40.740947^\circ, -73.989645^\circ)$. To use those values, the shapes first need to be expressed in decimal degrees,

In [None]:
# -- convert to decimal degrees
mnpl_deg = mnpl.to_crs(epsg=4326)
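
To see what a coordinate transform does to a single location, here is a small sketch that converts the Flatiron point itself rather than the full MapPLUTO table. EPSG:2263 is an assumption used only as an example of a projected system in feet; `mnpl.crs` reports the system the file actually uses. The names `flat`, `flat_ft`, and `flat_deg` are made up for illustration.

```python
import geopandas as gp
from shapely.geometry import Point

# -- the Flatiron coordinates from the text, ordered as (longitude, latitude)
#    because x is longitude and y is latitude in EPSG:4326
flat = gp.GeoSeries([Point(-73.989645, 40.740947)], crs="EPSG:4326")

# -- convert to a projected system (EPSG:2263 is an assumed example)
flat_ft = flat.to_crs(epsg=2263)
print(flat_ft.iloc[0])

# -- converting back recovers the original longitude and latitude
flat_deg = flat_ft.to_crs(epsg=4326)
print(flat_deg.iloc[0])
```

The same `to_crs` call works on a whole GeoDataFrame, transforming every shape in the geometry column and returning a new object while leaving the original unchanged.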

In [None]:
# -- plot the shapes in the EPSG:4326 coordinate system
fig, ax = plt.subplots(figsize=(5, 5))
mnpl_deg.plot(ax=ax)
fig.show()

In [None]:
# -- zoom in on the Flatiron building
cen = (-73.989645, 40.740947)
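wid = 0.002  # window width in degrees (illustrative value)
xlim = (cen[0] - wid / 2, cen[0] + wid / 2)
ylim = (cen[1] - wid / 2, cen[1] + wid / 2)

fig, ax = plt.subplots(figsize=(5, 5))
mnpl_deg.plot(ax=ax)
ax.set(xlim=xlim, ylim=ylim)
fig.show()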


#### **Grouping and merging GeoDataFrames**

Let's try to visualize an answer to the following urban planning policy question:

***What is the average number of floors of buildings in a given NYC zipcode?***
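
One way to approach this is to group the lot-level records by zipcode, average the number of floors within each group, and merge the result onto the zipcode table. The sketch below uses small hypothetical tables in place of the real data; the column names `ZipCode`, `NumFloors`, and `ZIPCODE`, and the mismatch in key types, are assumptions made for illustration.

```python
import pandas as pd

# -- hypothetical lot-level table; "ZipCode" and "NumFloors" are assumed column names
lots = pd.DataFrame({
    "ZipCode": [10001, 10001, 10002, 10002, 10002],
    "NumFloors": [4.0, 6.0, 2.0, 3.0, 7.0],
})

# -- hypothetical zipcode-level table standing in for the zipcode shapes
zips = pd.DataFrame({"ZIPCODE": ["10001", "10002", "10003"]})

# -- group the lots by zipcode and average the number of floors
avg = lots.groupby("ZipCode")["NumFloors"].mean().reset_index()
avg = avg.rename(columns={"ZipCode": "ZIPCODE", "NumFloors": "avg_floors"})

# -- the merge keys must share a type, so convert the integer zipcodes to strings
avg["ZIPCODE"] = avg["ZIPCODE"].astype(str)

# -- a left merge keeps every zipcode; those without any lots get NaN
zips_floors = zips.merge(avg, on="ZIPCODE", how="left")
print(zips_floors)
```

The same `groupby` and `merge` calls apply when the left-hand table is a GeoDataFrame, in which case the merged result keeps its geometry column and can be drawn as a choropleth of the new column.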