# Query and Download Specified Shapefiles and Tables

In this exercise, you will learn how to query spatial data that is in the form of ESRI Shapefiles. In particular, we will work with Census boundary files in the TIGER (Topologically Integrated Geographic Encoding and Referencing) system.

We will be using the GeoPandas library to read and query these Shapefiles.

In [None]:
import geopandas as gpd
import pandas
pandas.set_option('display.max_rows', None)

## Query the state shapefile to determine the STATEFP value for Indiana

We have already provided the state boundaries shapefile along with this notebook. Next, we will load this shapefile and look up the **STATEFP** field for the state of Indiana.

In [None]:
zipfile = "zip://./tl_2024_us_state.zip"
states = gpd.read_file(zipfile)

## Inspecting the data in the shapefile

Before running the query, let's first inspect the data available in the shapefile. Here we are inspecting the GeoDataFrame that GeoPandas creates when we open the shapefile with the _read_file_ function.

In [None]:
states

Were you able to find the row for Indiana and the value for STATEFP? Let's try and do the same with code now:

In [None]:
states.loc[states.STUSPS == 'IN', 'STATEFP'].values[0]

Let's try to filter down to a few relevant columns rather than print the entire table. Does that make it easier to find the one you are interested in?

In [None]:
states[['STUSPS','NAME','STATEFP']]
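
To make the filtering steps above easier to follow, here is a minimal sketch that breaks the one-line query into three steps. It uses a small hand-built table (`toy_states`, a made-up stand-in with three rows) rather than the real shapefile, so the variable names are illustrative only.

```python
import pandas as pd

# Toy stand-in for the states table (not the real shapefile).
toy_states = pd.DataFrame({
    "STUSPS": ["IL", "IN", "OH"],
    "NAME": ["Illinois", "Indiana", "Ohio"],
    "STATEFP": ["17", "18", "39"],
})

# Step 1: comparing a column to a value gives one True/False per row.
is_indiana = toy_states["STUSPS"] == "IN"

# Step 2: indexing with that mask keeps only the rows marked True.
indiana_rows = toy_states[is_indiana]

# Step 3: pick the column and pull out the single value.
indiana_fp = indiana_rows["STATEFP"].values[0]
print(indiana_fp)
```

Note that STATEFP is stored as text, so the result is the string `'18'` rather than the number 18.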

## Exercise 1

Download the zip file for the County boundaries, upload it to Jupyter, and repeat the steps above to load the file into a GeoPandas GeoDataFrame.

In [None]:
zipfile = "zip:///home/jovyan/<county zip file name>"
counties = None  # Replace None with a gpd.read_file(...) call, as done for the states file above

Next, can you write code to query the counties data frame to find Hamilton County in Indiana?

**Hint:** What if you filtered by the _STATEFP_ value for Indiana that you identified previously?

**Hint:** Try to only retrieve the columns you need (e.g. COUNTYFP, NAME)

Your code should look something like this: 

``
counties[<condition>][[<list of columns>]]
``

In [None]:
# Enter your code here

It is also possible to combine multiple conditions in a single query on the dataframe. Try typing the following into the next cell:

``
counties.loc[(counties['STATEFP'] == '18') & (counties['NAME'] == 'Hamilton')]['COUNTYFP'].values[0]
``

In [None]:
# Enter your code here
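
The following sketch illustrates why both conditions are needed and how `&` combines them. It uses a hand-built table (`toy_counties`, a made-up stand-in with three rows), not the real county shapefile. County names are not unique across states, so filtering on NAME alone can return more than one row.

```python
import pandas as pd

# Toy stand-in for the counties table (not the real shapefile).
toy_counties = pd.DataFrame({
    "STATEFP": ["18", "18", "39"],
    "COUNTYFP": ["057", "097", "061"],
    "NAME": ["Hamilton", "Marion", "Hamilton"],
})

# Filtering on the name alone matches a county in each of two states.
named_hamilton = toy_counties["NAME"] == "Hamilton"
print(named_hamilton.sum())

# & combines two masks row by row. Each comparison needs its own
# parentheses because & binds more tightly than ==.
both = (toy_counties["STATEFP"] == "18") & (toy_counties["NAME"] == "Hamilton")

# .loc takes the row mask and the list of columns in one step.
matches = toy_counties.loc[both, ["COUNTYFP", "NAME"]]
print(matches)
```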

## Exercise 2

Download the zip file for the roads in Hamilton County. Next, upload it to Jupyter and read the file using GeoPandas.

**Hint:** Use the STATEFP and COUNTYFP codes to identify the right zip file to download. 
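
As a rough illustration of the hint, the sketch below joins the two codes into a file name. The helper `roads_zip_name` is hypothetical, and the naming pattern is an assumption inferred from the file name used in this notebook; check it against the actual download listing before relying on it.

```python
def roads_zip_name(statefp: str, countyfp: str, year: int = 2024) -> str:
    """Build a county roads zip file name from FIPS codes.

    Assumes the pattern tl_<year>_<STATEFP><COUNTYFP>_roads.zip.
    """
    # Keep the codes as strings so leading zeros (e.g. '057') survive.
    return f"tl_{year}_{statefp}{countyfp}_roads.zip"


print(roads_zip_name("18", "057"))
```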

In [None]:
zipfile = "zip:///home/jovyan/tl_2024_18057_roads.zip"
roads = gpd.read_file(zipfile) 

Inspect the data frame, but make sure to restrict the number of rows displayed.

**Hint:** Reset display.max_rows to 50.

In [None]:
# Enter your code here
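
For reference, here is a small sketch of two ways to control the row limit in pandas: `set_option` changes the setting for the rest of the session, while `option_context` changes it only inside a `with` block. The value 5 is an arbitrary choice for illustration.

```python
import pandas as pd

# Change the limit for the rest of the session.
pd.set_option("display.max_rows", 50)

# Change the limit temporarily; it reverts when the block ends.
with pd.option_context("display.max_rows", 5):
    inside = pd.get_option("display.max_rows")

after = pd.get_option("display.max_rows")
print(inside, after)
```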

This shapefile comprises roads of different types. Let's try to figure out what these types are. 

**Hint:** You can call the unique() method on a column of a dataframe to get an array of the unique values in that column.

In [None]:
# Enter your code here
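
To show what `unique()` returns, here is a minimal sketch on a hand-built table (`toy_roads`, with made-up road names), not the real roads shapefile. The values come back in order of first appearance, with duplicates removed.

```python
import pandas as pd

# Toy stand-in for the roads table; the road names are made up.
toy_roads = pd.DataFrame({
    "FULLNAME": ["Main St", "I- 69", "State Rd 32", "Oak Dr", "I- 465"],
    "RTTYP": ["M", "I", "S", "M", "I"],
})

# unique() drops repeated values and keeps first-appearance order.
route_types = toy_roads["RTTYP"].unique()
print(route_types)
```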

If you want to figure out what these various road types refer to, take a look at the Route Types code list here: https://www.census.gov/library/reference/code-lists.html

## Exercise 3

We will now plot the roads of Hamilton County on a map. However, we will only plot roads of a particular type (e.g. interstates).

In [None]:
import folium

# Create the base map; folium expects location as [latitude, longitude].
m = folium.Map(location=[40, -85], zoom_start=8, tiles="CartoDB positron")

# Keep only interstates (route type code 'I').
interstates = roads[roads.RTTYP == 'I']
for _, r in interstates.iterrows():
    # Wrap each road's geometry in a GeoSeries so that it can be
    # serialized to GeoJSON for folium.
    sim_geo = gpd.GeoSeries(r["geometry"])
    geo_j = sim_geo.to_json()
    geo_j = folium.GeoJson(data=geo_j)
    # Show the road name in a popup; str() guards against missing names.
    folium.Popup(str(r["FULLNAME"])).add_to(geo_j)
    geo_j.add_to(m)
m
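
The loop above hands each road to folium as GeoJSON text. As a rough picture of what that text contains, the sketch below builds a similar structure by hand using only the standard library. The coordinates are made up for illustration. One detail worth noticing is that GeoJSON stores each point as [longitude, latitude], while the `location` argument of `folium.Map` is [latitude, longitude].

```python
import json

# A hand-written GeoJSON FeatureCollection with one line (made-up coordinates).
feature_collection = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "LineString",
            # GeoJSON order: [longitude, latitude]
            "coordinates": [[-86.05, 40.00], [-86.00, 40.05]],
        },
    }],
}

# Serialize to the kind of text that is passed as `data` above.
geojson_text = json.dumps(feature_collection)

# Swap the order to get a [latitude, longitude] pair for a map center.
first_lon, first_lat = feature_collection["features"][0]["geometry"]["coordinates"][0]
map_location = [first_lat, first_lon]
print(map_location)
```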