## Week 1 Class activities
This notebook is a starting point for the exercises and activities that we'll do in class.

Before you attempt any of these activities, make sure to watch the Week 1 video lectures.

### Using the `requests` library to query an API
Here's the code from the video lecture that queries the BART API for real-time departures.

In [29]:
import json
import pandas as pd
import requests

APIkey = 'MW9S-E7SL-26DU-VV8V'  # the key posted on BART's website
station = '12TH'
requestString = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig={}&json=y&key={}'.format(station, APIkey)
r = requests.get(requestString)
d = json.loads(r.text)
etd = d['root']['station'][0]['etd']
print('Trains from {} to {}'.format(station, etd[0]['destination']))
df = pd.DataFrame(etd[0]['estimate'])
df

KeyError: 'etd'

In [11]:
etd = d['root']['station'][0]['etd']
print('Trains from {} to {}'.format(station, etd[0]['destination']))
df = pd.DataFrame(etd[0]['estimate'])
df

KeyError: 'etd'

In [5]:
print(d['root']['station'][0]['etd'])

KeyError: 'etd'

<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Explore the different objects. What are <strong>r</strong>, <strong>d</strong>, and <strong>etd</strong>? What can you do with them?
</div>

Hint: Use `type()` to find out the type of an object (e.g. `type(r)`), and `?` to pull up the help screen (e.g. `r?`).

You can also tab autocomplete to discover an object's attributes and methods (e.g. `r.` and then `TAB`). 
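As a minimal sketch of this kind of exploration, the snippet below parses a small hand-written payload instead of calling the API. The payload is hypothetical and only mimics the nesting used in the lecture code (`root`, `station`, `etd`). It also shows one way to guard against a station record that has no `etd` key, which is a plausible cause of a `KeyError: 'etd'`; that explanation is an assumption, so inspect the actual response to confirm it.

```python
import json

# Hypothetical payload that mimics the nesting used in the lecture code.
sample_text = '''
{"root": {"station": [
    {"abbr": "12TH",
     "etd": [{"destination": "Richmond",
              "estimate": [{"minutes": "5", "platform": "1"}]}]},
    {"abbr": "CIVC"}
]}}
'''

d = json.loads(sample_text)
print(type(d))                   # the parsed JSON is a plain dict
print(list(d['root'].keys()))    # inspect the keys before indexing


def get_etd(station_record):
    """Return the list of departures, or an empty list if 'etd' is missing."""
    return station_record.get('etd', [])


for station_record in d['root']['station']:
    etd = get_etd(station_record)
    print(station_record['abbr'], type(etd), len(etd))
```

Checking the keys first, or using `.get()` with a default, lets the rest of the cell run even when a record is missing the field you expected.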

In [16]:
# your code here
type(r)
r?

Now let's explore the other options and API commands that BART offers. 

The API documentation for the `etd` (real-time information) command is [here](https://api.bart.gov/docs/etd/etd.aspx). 

<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Write a command to retrieve real-time departures for southbound trains at Civic Center station (code: CIVC). Hint: You'll need to add another <strong>&</strong> to <strong>requestString</strong>.
</div>

In [6]:
# your code here
APIkey = 'MW9S-E7SL-26DU-VV8V'  # the key posted on BART's website
station = 'CIVC'
direction = 's'  # southbound; named 'direction' to avoid shadowing the built-in dir()
requestString = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig={}&json=y&key={}&dir={}'.format(station, APIkey, direction)
r = requests.get(requestString)
d = json.loads(r.text)
etd = d['root']['station'][0]['etd']
print('Trains from {} to {}'.format(station, etd[0]['destination']))
df = pd.DataFrame(etd[0]['estimate'])
df

KeyError: 'etd'

<div class="alert alert-block alert-info">
    <strong>Exercise:</strong> Use the <strong>elev</strong> command to obtain the elevator status at each station, and put it in a dataframe. Optional extension: pass the parameters as a dictionary to requests, as we saw in the video lecture.
</div>

See the API docs [here](https://api.bart.gov/docs/bsa/elev.aspx) for details of that command.
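
One possible starting point, as a sketch: `requests.get` accepts a `params` dictionary and builds the query string for you. The `bsa.aspx` endpoint with `cmd=elev` is used on the assumption that it matches the linked documentation, and the helper names `elev_params` and `fetch_elevator_status` are made up for this example. The `requests` import is deferred into the fetch function so that the parameter dictionary can be inspected without a network connection.

```python
from urllib.parse import urlencode

BART_BSA_URL = 'http://api.bart.gov/api/bsa.aspx'  # assumed endpoint; check the docs


def elev_params(api_key):
    """Build the query parameters for the elevator status command."""
    return {'cmd': 'elev', 'key': api_key, 'json': 'y'}


def fetch_elevator_status(api_key):
    """Send the request and return the parsed JSON (needs network access)."""
    import requests  # deferred so elev_params() can be used offline
    r = requests.get(BART_BSA_URL, params=elev_params(api_key))
    r.raise_for_status()  # stop early on an HTTP error
    return r.json()


params = elev_params('MW9S-E7SL-26DU-VV8V')
# Preview of the query string built from the dictionary
print(BART_BSA_URL + '?' + urlencode(params))
```

Explore the dictionary returned by `fetch_elevator_status` as before to find the list of records, then pass that list to `pd.DataFrame`.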

In [11]:
# your code here
requestString = 'http://api.bart.gov/api/etd.aspx?cmd=etd&orig=RICH&key={}&json=y'.format(APIkey)
r2 = requests.get(requestString)
d2 = json.loads(r2.text)  # parse the new response (r2), not the earlier one (r)
type(r2)

requests.models.Response

### Accessing census data

Recall that we have seen two ways to access census data:
* The Census Bureau API
* The `cenpy` library

Let's try them both and map patterns of race for Los Angeles County. 

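Here's the relevant code that we saw in the video lecture to get the 5-year ACS estimates for the variable `B00001_001E`. Note that `B00001_001E` is the unweighted sample count of the population; the total population estimate is `B01003_001E`.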

In [18]:
import json
import requests
import pandas as pd

r = requests.get('https://api.census.gov/data/2015/acs/acs5?get=B00001_001E&for=county')

censusdata = r.json()
df = pd.DataFrame(censusdata[1:], columns=censusdata[0])
df.head()

| | B00001_001E | state | county |
|---|---|---|---|
| 0 | 592 | 48 | 75 |
| 1 | 7108 | 48 | 91 |
| 2 | 2401 | 48 | 225 |
| 3 | 5409 | 48 | 349 |
| 4 | 1502 | 48 | 415 |


<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Retrieve population data at the census tract level for LA County, and put it in a pandas dataframe. (You can use the 5-year ACS if you like.)
</div>

Some examples are given [here](https://api.census.gov/data/2015/acs/acs5/examples.html). 

Note that you don't need the API key for a small number of queries, so you can delete `&key=YOUR_KEY_GOES_HERE` from the examples. 

The FIPS code for California is `06` and for Los Angeles County `037`.
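
As a sketch of how such a query could be assembled, the hypothetical helper `tract_url` below fills in the state and county codes. It assumes the `for=tract:*` form combined with `in=state:...&in=county:...`, as in the linked examples page, and it keeps the FIPS codes as zero-padded strings because they are identifiers rather than numbers.

```python
ACS5_URL = 'https://api.census.gov/data/2015/acs/acs5'


def tract_url(state_fips, county_fips, variables=('NAME', 'B00001_001E')):
    """Build a tract-level query URL for one county."""
    state = str(state_fips).zfill(2)     # e.g. 6 -> '06'
    county = str(county_fips).zfill(3)   # e.g. 37 -> '037'
    return '{}?get={}&for=tract:*&in=state:{}&in=county:{}'.format(
        ACS5_URL, ','.join(variables), state, county)


url = tract_url('06', '037')
print(url)
```

Passing this URL to `requests.get` and building the dataframe then works the same way as in the county example above.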

In [42]:
# your code here

r = requests.get('https://api.census.gov/data/2015/acs/acs5?get=NAME,B00001_001E&for=county:037&in=state:06')
censusdata = r.json()
df = pd.DataFrame(censusdata[1:], columns=censusdata[0])
df.head()

| | NAME | B00001_001E | state | county |
|---|---|---|---|---|
| 0 | Childress County, Texas | 592 | 48 | 75 |
| 1 | Comal County, Texas | 7108 | 48 | 91 |
| 2 | Houston County, Texas | 2401 | 48 | 225 |
| 3 | Navarro County, Texas | 5409 | 48 | 349 |
| 4 | Scurry County, Texas | 1502 | 48 | 415 |


In [1]:
# note the inplace keyword changes the dataframe in place, rather than returning a copy
df.rename(columns = {'B00001_001E':'population'}, inplace=True)
df

NameError: name 'df' is not defined

<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Retrieve the census data for race/ethnicity for tracts in Los Angeles county, and put it in a pandas dataframe. 
</div>

Hints:
* The list of tables is [here](https://api.census.gov/data/2015/acs/acs5/variables.html).
* The data is cross-tabulated by race, age, and gender. If you just want race/ethnicity, then look at the `Estimate!!Total:` tables. For example, `B01001H_001E` gives the total number of non-Hispanic white people, without further disaggregating by gender and age. 
* Start with the simplest measure of race. For example, you could calculate the proportion of Black people or non-Hispanic white people in each census tract, by dividing the relevant table by the total population (which you already retrieved above).
* You can request multiple tables at once - just separate them with commas. For example, `get=NAME,B00001_001E,B01001H_001E`.
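
The proportion calculation from the hints can be sketched as follows. The rows mimic the layout of the API's JSON (a header row followed by data rows), but the tract names and numbers are invented for illustration, and `B01003_001E` is assumed to be the total-population variable. The values are written as strings on the assumption that the API returns them that way, so they are converted before dividing.

```python
# Invented rows in the same layout as the API response: header first, then data.
censusdata = [
    ['NAME', 'B01001H_001E', 'B01003_001E', 'state', 'county', 'tract'],
    ['Tract A (hypothetical)', '1200', '4000', '06', '037', '000001'],
    ['Tract B (hypothetical)', '0', '0', '06', '037', '000002'],
]


def share(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    numerator, denominator = float(numerator), float(denominator)
    if denominator == 0:
        return None  # avoid dividing by zero for unpopulated tracts
    return numerator / denominator


header, rows = censusdata[0], censusdata[1:]
records = [dict(zip(header, row)) for row in rows]
for rec in records:
    rec['pct_nh_white'] = share(rec['B01001H_001E'], rec['B01003_001E'])
    print(rec['NAME'], rec['pct_nh_white'])
```

In pandas, the same conversion can be done with `pd.to_numeric` on the relevant columns before dividing one column by the other.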


In [50]:
# your code here
r = requests.get('https://api.census.gov/data/2015/acs/acs5?get=NAME,B01001H_001E,B00001_001E&for=county:037&in=state:06')
censusdata = r.json()
df = pd.DataFrame(censusdata[1:], columns=censusdata[0])
df.head()



| | NAME | B01001H_001E | B00001_001E | state | county |
|---|---|---|---|---|---|
| 0 | Los Angeles County, California | 2703547 | 783732 | 6 | 37 |


<div class="alert alert-block alert-info">
    <strong>Exercise:</strong> Now do the same using <strong>cenpy</strong>.
</div>

Here's the relevant example from the lecture. Note if you want multiple variables, you can pass them as a list. For example: `variables=['B25035_001E','B01001H_001E']`.


In [1]:
import cenpy
from cenpy import products

# create a connection to the American Community Survey
acs = cenpy.products.ACS()
riverside = products.ACS(2017).from_county('Riverside, CA', level='tract',
                                        variables='B25035_001E')
riverside.head()



| | GEOID | geometry | B25035_001E | NAME | state | county | tract |
|---|---|---|---|---|---|---|---|
| 0 | 6065041904 | POLYGON ((-13099280.410 4011347.460, -13099235... | 1980.0 | Census Tract 419.04, Riverside County, California | 6 | 65 | 41904 |
| 1 | 6065041806 | POLYGON ((-13090448.870 4008279.130, -13090446... | 1995.0 | Census Tract 418.06, Riverside County, California | 6 | 65 | 41806 |
| 2 | 6065040808 | POLYGON ((-13089707.820 4014799.500, -13089688... | 1987.0 | Census Tract 408.08, Riverside County, California | 6 | 65 | 40808 |
| 3 | 6065046601 | POLYGON ((-13089190.190 4017589.480, -13089189... | 1995.0 | Census Tract 466.01, Riverside County, California | 6 | 65 | 46601 |
| 4 | 6065040816 | POLYGON ((-13084376.060 4015407.040, -13084371... | 1995.0 | Census Tract 408.16, Riverside County, California | 6 | 65 | 40816 |


In [2]:
# your code here
acs = cenpy.products.ACS()
riverside = products.ACS(2017).from_county('Riverside, CA', level='tract',
                                        variables=['B25035_001E','B01001H_001E'])
riverside.head()



| | GEOID | geometry | B01001H_001E | B25035_001E | NAME | state | county | tract |
|---|---|---|---|---|---|---|---|---|
| 0 | 6065041904 | POLYGON ((-13099280.410 4011347.460, -13099235... | 2454.0 | 1980.0 | Census Tract 419.04, Riverside County, California | 6 | 65 | 41904 |
| 1 | 6065041806 | POLYGON ((-13090448.870 4008279.130, -13090446... | 2259.0 | 1995.0 | Census Tract 418.06, Riverside County, California | 6 | 65 | 41806 |
| 2 | 6065040808 | POLYGON ((-13089707.820 4014799.500, -13089688... | 2163.0 | 1987.0 | Census Tract 408.08, Riverside County, California | 6 | 65 | 40808 |
| 3 | 6065046601 | POLYGON ((-13089190.190 4017589.480, -13089189... | 1018.0 | 1995.0 | Census Tract 466.01, Riverside County, California | 6 | 65 | 46601 |
| 4 | 6065040816 | POLYGON ((-13084376.060 4015407.040, -13084371... | 1102.0 | 1995.0 | Census Tract 408.16, Riverside County, California | 6 | 65 | 40816 |


<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Can you write a function that retrieves population by race for all census tracts in a specified county? (Or a simplified measure of race, such as the proportion of Black people.) 
</div>

Hint: use the code you wrote, but replace the county FIPS code `037` with a variable. Your function can take a single argument, e.g. `countyFIPS`.
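
A minimal sketch of such a function, assuming the Census API route rather than `cenpy`. The helper names `build_query` and `get_tract_data`, the default state code, and the default variable list are choices made for this example. The `requests` and `pandas` imports sit inside the fetching function so that the query-building step can be checked without a network connection.

```python
ACS5_URL = 'https://api.census.gov/data/2015/acs/acs5'


def build_query(countyFIPS, stateFIPS='06', variables='NAME,B01001H_001E'):
    """Return the tract-level query URL for the given county."""
    return '{}?get={}&for=tract:*&in=state:{}&in=county:{}'.format(
        ACS5_URL, variables, stateFIPS, countyFIPS)


def get_tract_data(countyFIPS, stateFIPS='06', variables='NAME,B01001H_001E'):
    """Fetch the variables for every tract in a county as a dataframe."""
    import pandas as pd   # deferred imports: only needed when fetching
    import requests
    r = requests.get(build_query(countyFIPS, stateFIPS, variables))
    r.raise_for_status()     # stop early on an HTTP error
    censusdata = r.json()    # header row followed by data rows
    return pd.DataFrame(censusdata[1:], columns=censusdata[0])


print(build_query('037'))
```

Calling `get_tract_data('037')` would then request the Los Angeles County tracts, and `get_tract_data('065')` those of Riverside County.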

In [3]:
# your code here
acs = cenpy.products.ACS()
la = products.ACS(2017).from_county('Los Angeles, CA', level='tract',
                                        variables=['B25035_001E','B01001H_001E'])
la.head()



| | GEOID | geometry | B01001H_001E | B25035_001E | NAME | state | county | tract |
|---|---|---|---|---|---|---|---|---|
| 0 | 6037670328 | POLYGON ((-13183928.750 3998590.110, -13183871... | 3253.0 | 1965.0 | Census Tract 6703.28, Los Angeles County, Cali... | 6 | 37 | 670328 |
| 1 | 6037990200 | POLYGON ((-13206496.550 4033174.070, -13206119... | 0.0 | | Census Tract 9902, Los Angeles County, California | 6 | 37 | 990200 |
| 2 | 6037670326 | POLYGON ((-13182706.680 4000184.830, -13182626... | 1968.0 | 1965.0 | Census Tract 6703.26, Los Angeles County, Cali... | 6 | 37 | 670326 |
| 3 | 6037980013 | POLYGON ((-13179812.820 4019143.240, -13179813... | 0.0 | | Census Tract 9800.13, Los Angeles County, Cali... | 6 | 37 | 980013 |
| 4 | 6037602302 | POLYGON ((-13177861.170 4015210.820, -13177860... | 2121.0 | 1957.0 | Census Tract 6023.02, Los Angeles County, Cali... | 6 | 37 | 602302 |


<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Map your results!
</div>
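
One way to start, as a sketch: a geodataframe such as the `la` object returned by `cenpy` above has a `plot` method that can color each tract by the values in a column. The helper name `plot_choropleth` and the styling choices (color map, figure size) are assumptions for illustration.

```python
def plot_choropleth(gdf, column, cmap='viridis'):
    """Draw a choropleth of one column and return the matplotlib axes."""
    ax = gdf.plot(column=column, cmap=cmap, legend=True, figsize=(8, 8))
    ax.set_axis_off()       # hide the coordinate ticks
    ax.set_title(column)    # label the map with the variable name
    return ax
```

For example, `plot_choropleth(la, 'B01001H_001E')` would map the number of non-Hispanic white residents per tract.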

In [None]:
# your code here

### Using Socrata

Here's the example that we saw in the lecture.

In [None]:
import geopandas as gpd
url = 'https://data.lacity.org/resource/mymu-zi3s.geojson'
gdf = gpd.read_file(url)
gdf.plot()

<div class="alert alert-block alert-info">
<strong>Exercise:</strong> Choose another dataset on Socrata, download it using the API, and map the results. 
</div>

The City of Los Angeles datasets are [here](https://data.lacity.org). Feel free to choose another city or county if you prefer.

Some possible datasets of planning-related interest:
* [DACA/DAPA workshops](https://data.lacity.org/Community-Economic-Development/Map2-DACA-DAPA-Workshops/icwt-9z3e) (seems a bit dated)
* [Solar PV permits](https://data.lacity.org/A-Prosperous-City/Solar-PV-Permits-in-LA/bdt7-w2xr)
* [Parks](https://data.lacity.org/Community-Economic-Development/Department-of-Recreation-and-Parks-Facility-and-Pa/ax8j-dhzm)
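
To adapt the lecture example to another dataset, the main thing that changes is the dataset identifier at the end of the URL. The sketch below builds such a URL. The helper name `socrata_geojson_url` is made up for this example, and the `$limit` parameter is included on the assumption that you want more rows than the API returns by default. Whether a given dataset actually contains geometry is something to check on its page.

```python
def socrata_geojson_url(dataset_id, domain='data.lacity.org', limit=5000):
    """Build the GeoJSON endpoint URL for a Socrata dataset."""
    return 'https://{}/resource/{}.geojson?$limit={}'.format(
        domain, dataset_id, limit)


# 'bdt7-w2xr' is the identifier at the end of the Solar PV permits link above
url = socrata_geojson_url('bdt7-w2xr')
print(url)
```

The resulting URL should be usable with `gpd.read_file(url)` in the same way as the lecture example.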

In [None]:
# your code here

<div class="alert alert-block alert-info">
<h3>What you should have learned</h3>
<ul>
  <li>Confidence in experimenting with code: exploring different objects, writing functions, and so on.</li>
  <li>How to read API documentation and adapt the examples to create your own queries.</li>
  <li>Confidence in mapping the results. We'll practice this much more throughout the quarter.</li>
</ul>
</div>