In [1]:
import pandas as pd

# 1. What fraction of Boulder's Indigenous residents are unhoused?
* To make this calculation, we use:
    * 2019 [Census Bureau estimates](https://www.census.gov/quickfacts/fact/table/bouldercountycolorado,bouldercitycolorado/RHI825219#RHI825219) for the City of Boulder's American Indian and Alaska Native (AIAN) population. Throughout this report, we use AIAN and Indigenous interchangeably.
    * 2019 and 2020 data from [Metro Denver Homeless Initiative's Point in Time Survey](https://www.mdhi.org/pit). The Point in Time data available online is reported at the granularity of Boulder County. We requested and were provided with survey data specifically for the [City of Boulder for 2019 and 2020](https://github.com/nlaberge/boulder-community-data/blob/main/data/boulder_city_pit.xlsx), for a fair comparison with the Census Bureau's City of Boulder estimates. The Point in Time Survey is a known undercount, so our calculation is a lower bound on the proportion of unhoused Indigenous residents.
* Please write to us with any questions about this data or our analysis.
       - Sam Becker (samhbecker@gmail.com)
       - Nick LaBerge (labergenick@gmail.com)

#### First, we distribute "missing" race responses from the PIT survey across races, in proportion to each group's share of the responses with a recorded race
* "Missing" responses occur only in 2019
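
The proportional redistribution can be sketched on its own, outside the per-year loop. This is a minimal illustration, not the notebook's exact code: the helper name `redistribute_missing` and the toy counts are hypothetical, and only the `race` and `population` column names are taken from the PIT spreadsheet.

```python
import pandas as pd

def redistribute_missing(df, missing_label="Missing"):
    """Spread the 'missing' count across observed groups, proportionally."""
    observed = df[df["race"] != missing_label].copy()
    num_missing = df.loc[df["race"] == missing_label, "population"].sum()
    # share of each group among responses with a recorded race
    observed["frac"] = observed["population"] / observed["population"].sum()
    # each group absorbs its share of the missing responses
    observed["population"] = observed["population"] + observed["frac"] * num_missing
    return observed

# Toy numbers for illustration only (not PIT data)
toy = pd.DataFrame({
    "race": ["A", "B", "Missing"],
    "population": [30.0, 70.0, 10.0],
})
adjusted = redistribute_missing(toy)
print(adjusted)
```

Because each group receives exactly its share of the missing responses, the adjusted counts add up to the original survey total, and each group's share is unchanged.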

In [2]:
df_census = pd.read_excel('../data/boulder_city_census.xlsx').drop(columns = ['source','coverage'])
df_pit = pd.read_excel('../data/boulder_city_pit.xlsx').drop(columns = ['source','coverage'])

df_pit = df_pit.set_index('year')

for year in [2019, 2020]:
    pit_year = df_pit.loc[(df_pit.index == year)]
    num_missing = df_pit.loc[(df_pit.index == year) & (df_pit.race == 'Missing'), 'population'].iloc[0]
    print(num_missing, 'missing responses in the', year,'PIT survey.')

    #calculate the fraction of unhoused people that each group represents, excluding the people with "missing" race
    pit_year_pop_not_missing = pit_year.population.sum() - num_missing
    df_pit.loc[(df_pit.index == year),'frac'] = pit_year.population / pit_year_pop_not_missing

    #distribute the "missing" race people across races, in proportion to each group's demographic representation
    pit_year = df_pit.loc[(df_pit.index == year)]
    df_pit.loc[(df_pit.index == year),'population'] += pit_year.frac * num_missing

# remove the "missing" rows, now that we have distributed them across observed races
df_pit = df_pit[df_pit['race']!='Missing']

22 missing responses in the 2019 PIT survey.
0.0 missing responses in the 2020 PIT survey.


#### This gives us a lower bound on the number of Boulder's unhoused Indigenous residents for both years: 
* at least 33 in 2019
* at least 21 in 2020

In [3]:
df_pit[df_pit.race == 'AIAN']

year,race,population,type,frac
2019,AIAN,33.64486,unhoused,0.074766
2020,AIAN,21.0,unhoused,0.059155


#### The Census Bureau estimates that there are 211 Indigenous residents in the City of Boulder overall:

In [4]:
df_census[df_census.race == 'AIAN']['population']

3    211.346
Name: population, dtype: float64

#### Now, we can estimate the fraction of Boulder's Indigenous residents who are unhoused:
* at least 15.9 percent in 2019
* at least 9.9 percent in 2020

In [5]:
# number of unhoused aian residents divided by total number of aian residents, for 2019 and 2020
frac_unhoused_aian = df_pit[df_pit.race == 'AIAN']['population'] / df_census[df_census.race == 'AIAN']['population'].iloc[0]
frac_unhoused_aian

year
2019    0.159193
2020    0.099363
Name: population, dtype: float64

# What fraction of Boulder's White residents are unhoused?
* We use the same data and methodology as in the calculation for Indigenous residents

#### Lower bound on the number of Boulder's unhoused White residents for both years:
* at least 307 in 2019
* at least 251 in 2020

In [6]:
df_pit[df_pit.race == 'White']

year,race,population,type,frac
2019,White,307.009346,unhoused,0.682243
2020,White,251.0,unhoused,0.707042


#### The Census Bureau estimates that there are 92,358 White residents in the City of Boulder overall:

In [7]:
df_census[df_census.race == 'White']['population']

0    92358.202
Name: population, dtype: float64

#### Now, we can estimate the fraction of Boulder's White residents who are unhoused:
* at least 0.33 percent in 2019
* at least 0.27 percent in 2020

In [8]:
# number of unhoused white residents divided by total number of white residents, for 2019 and 2020
frac_unhoused_white = df_pit[df_pit.race == 'White']['population'] / df_census[df_census.race == 'White']['population'].iloc[0]
frac_unhoused_white

year
2019    0.003324
2020    0.002718
Name: population, dtype: float64

## Using these estimates, we find that <font color='red'>Boulder's Indigenous residents are on average 42.2x more likely to be unhoused than Boulder's White residents</font>. 
* 47.9x more likely in 2019
* 36.6x more likely in 2020
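
As a rough sketch of the arithmetic behind these ratios (the symbols are our own shorthand, not notation from the data sources): let $U_{g,y}$ be the redistributed PIT count for group $g$ in year $y$, and $N_g$ the 2019 Census estimate for that group.

$$
R_y = \frac{U_{\mathrm{AIAN},y} / N_{\mathrm{AIAN}}}{U_{\mathrm{White},y} / N_{\mathrm{White}}},
\qquad
R_{2019} \approx \frac{33.64 / 211.35}{307.01 / 92358.20} \approx \frac{0.1592}{0.003324} \approx 47.9
$$

The 42.2x figure is the unweighted mean of $R_{2019}$ and $R_{2020}$.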

In [9]:
frac_unhoused_aian / frac_unhoused_white

year
2019    47.890411
2020    36.561753
Name: population, dtype: float64

In [10]:
(frac_unhoused_aian / frac_unhoused_white).mean()

42.22608197347596

***
# 2. To what extent are unhoused Indigenous residents ticketed for non-violent offences, relative to housed Indigenous residents?
* To make this calculation, we use:
    * 2019 [Census Bureau estimates](https://www.census.gov/quickfacts/fact/table/bouldercountycolorado,bouldercitycolorado/RHI825219#RHI825219) for the City of Boulder's American Indian and Alaska Native (AIAN) population, as before
    * 2019 and 2020 data from [Metro Denver Homeless Initiative's Point in Time Survey](https://www.mdhi.org/pit), as before
    * A comprehensive [dataset](https://github.com/nlaberge/boulder-community-data/blob/main/data/becker_data_request_indigenous_citations_boulder.csv) of tickets written to Indigenous people for non-violent offences that occurred in the City of Boulder in 2019 and 2020, broken down by race and housing status. This data was requested from and provided by the City of Boulder. Per the City of Boulder Court Administrator, there is a high correlation between providing no address while being ticketed and being unhoused, so that is how we define whether someone is unhoused.

   

#### First, we clean up the citations data that was provided by the City of Boulder

In [11]:
df_citations = pd.read_csv('../data/becker_data_request_indigenous_citations_boulder.csv')

#rename race to match census...
df_citations['RACE'] = df_citations['RACE'].replace({'Indian':'AIAN'})

#add a year column, based on the VIOLATION_DATE
df_citations['VIOLATION_DATE'] = pd.to_datetime(df_citations['VIOLATION_DATE'])
df_citations['year'] = df_citations['VIOLATION_DATE'].dt.year

#look at first three rows of the cleaned data
df_citations.head(3)

Unnamed: 0,COURT_CASE,PARTY_ID,VIOLATION_DATE,HOUSED?,STATUTE_DESC,RACE,year
0,CR-2018-0000372-BI,232509,2018-01-13 18:20:00,YES,Careless Driving,AIAN,2018
1,CR-2018-0001449-TI,233281,2018-02-23 19:18:00,YES,Defective Headlight,AIAN,2018
2,CR-2018-0002518-TI,233976,2018-03-21 10:00:00,YES,Drove Unsafe (Defective) Vehicle,AIAN,2018


#### Now, we can calculate the number of citations made to housed and unhoused Indigenous residents
* In 2019:
    - 21 citations to unhoused Indigenous residents
    - 22 citations to housed Indigenous residents
* In 2020:
    - 17 citations to unhoused Indigenous residents
    - 6 citations to housed Indigenous residents
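
The same kind of tally can also be laid out as a year-by-status table. Here is a minimal sketch using `pd.crosstab` on a made-up frame: the rows are hypothetical, not the City of Boulder data, and only the `year` and `HOUSED?` column names come from the notebook.

```python
import pandas as pd

# Hypothetical citations, one row per ticket (illustration only)
toy_citations = pd.DataFrame({
    "year": [2019, 2019, 2019, 2020, 2020],
    "HOUSED?": ["NO", "YES", "YES", "NO", "NO"],
})

# Rows are years, columns are housing status; combinations with no tickets show as 0
counts = pd.crosstab(toy_citations["year"], toy_citations["HOUSED?"])
print(counts)
```

One possible advantage of this layout is that a year with no citations for one status still appears with an explicit zero instead of being dropped.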

In [12]:
citations_by_year_and_status = df_citations.groupby(
    by=['year','HOUSED?']).count()['RACE'][[2019, 2020]]

citations_by_year_and_status

year  HOUSED?
2019  NO         21
      YES        22
2020  NO         17
      YES         6
Name: RACE, dtype: int64

#### Using the citation counts from above, we calculate the number of citations per person in each reference group (housed vs. unhoused).

In [13]:
citations_per_person = {2019:{'YES':0,'NO':0}, # to store results
                        2020:{'YES':0,'NO':0}}
for year in [2019,2020]:
    # get num unhoused from PiT data (known underestimate)
    num_unhoused = df_pit[df_pit.race == 'AIAN'].loc[year]['population']
    
    # get num housed by subtracting num_unhoused from census estimate, which
    # is intended to include all people, both housed and unhoused
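    # .iloc[0] extracts the scalar; calling float() on a one-element Series is deprecated in recent pandas
    num_housed = df_census[df_census.race == 'AIAN']['population'].iloc[0] - num_unhoused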
    
    for housed_status in ['YES','NO']:
        if housed_status == 'YES': 
            num_people = num_housed
            print('Estimated number of housed AIAN residents in', year,':',num_people)
        else: 
            num_people = num_unhoused
            print('Estimated number of unhoused AIAN residents in', year,':',num_people)
        num_citations = citations_by_year_and_status[year][housed_status]
        num_citations_per_person = num_citations/num_people
        print('\t Number of citations:',num_citations)
        print('\t Citations per person:',num_citations_per_person,'\n')
        citations_per_person[year][housed_status] = num_citations_per_person

Estimated number of housed AIAN residents in 2019 : 177.7011401869159
	 Number of citations: 22
	 Citations per person: 0.12380336995507842 

Estimated number of unhoused AIAN residents in 2019 : 33.64485981308411
	 Number of citations: 21
	 Citations per person: 0.6241666666666666 

Estimated number of housed AIAN residents in 2020 : 190.346
	 Number of citations: 6
	 Citations per person: 0.03152154497599109 

Estimated number of unhoused AIAN residents in 2020 : 21.0
	 Number of citations: 17
	 Citations per person: 0.8095238095238095 



# Using these estimates, we find that unhoused Indigenous residents are cited up to 15.4x more per person than housed Indigenous residents, on average. 
- In 2019, unhoused Indigenous residents received up to 5.04 times more citations per capita than their housed neighbors
- In 2020, unhoused Indigenous residents received up to 25.68 times more citations per capita than their housed neighbors

\* We note that these estimates are likely upper bounds: the PIT survey likely undercounts the number of unhoused AIAN people, which inflates the citations-per-person figure for the unhoused group.
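
To see why an undercount pushes the ratio up, here is a small sensitivity sketch. The helper `citation_ratio` is our own illustrative function, and the larger unhoused counts (30 and 40) are hypothetical values chosen only to show the direction of the effect. The Census estimate, the PIT count of 21, and the citation counts are the 2020 figures reported in this notebook.

```python
def citation_ratio(num_unhoused, total_pop, cites_unhoused, cites_housed):
    """Ratio of citations per unhoused person to citations per housed person."""
    num_housed = total_pop - num_unhoused
    unhoused_rate = cites_unhoused / num_unhoused
    housed_rate = cites_housed / num_housed
    return unhoused_rate / housed_rate

# 2020 values reported in this notebook
total_aian = 211.346
ratio_at_pit_count = citation_ratio(21.0, total_aian, 17, 6)

# Hypothetical larger unhoused counts (assumptions, not survey results)
for assumed_unhoused in [21.0, 30.0, 40.0]:
    ratio = citation_ratio(assumed_unhoused, total_aian, 17, 6)
    print(assumed_unhoused, round(ratio, 2))
```

As the assumed number of unhoused residents grows, the unhoused rate falls and the housed rate rises, so the ratio shrinks. The PIT-based figure therefore sits at the high end of the plausible range.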

In [14]:
unhoused_over_housed_citations = [] #to store results

for year in [2019,2020]:
    citations_per_unhoused_person = citations_per_person[year]['NO']
    citations_per_housed_person = citations_per_person[year]['YES']
    
    print('In '+str(year)+', unhoused Indigenous residents received up to '+\
          str(round(citations_per_unhoused_person /citations_per_housed_person, 2))\
          +' times more citations per capita than their housed neighbors')
    
    unhoused_over_housed_citations += [citations_per_unhoused_person /citations_per_housed_person]


In 2019, unhoused Indigenous residents received up to 5.04 times more citations per capita than their housed neighbors
In 2020, unhoused Indigenous residents received up to 25.68 times more citations per capita than their housed neighbors


In [15]:
sum(unhoused_over_housed_citations)/2

15.361599958513708