In [1]:
import glob
import os
import re

import numpy as np
import pandas as pd

## Recent arrests by Continuum of Care

Because many CoCs run full counts of unhoused people only every other year, and each count happens only once a year, I use an average of two years of data to analyze arrests and populations of unhoused people.

In [2]:
recent_arrests = pd.read_csv('../04_outputs/a01_aggregated_arrests_by_demographic.csv',
                             keep_default_na=False)

In [3]:
recent_arrests.loc[recent_arrests['_census_race'] ==
                   'Hispanic or Latino']

Unnamed: 0,_city,_housing_status,_census_race,_census_ethnicity,arrests
94,Portland,housed,Hispanic or Latino,Hispanic or Latino (of any race),2669
101,Portland,no information,Hispanic or Latino,Hispanic or Latino (of any race),134
108,Portland,unhoused,Hispanic or Latino,Hispanic or Latino (of any race),1741
115,Portland,unknown,Hispanic or Latino,Hispanic or Latino (of any race),92


For the purpose of analyzing demographics against U.S. Census data, "Hispanic or Latino" is orthogonal to race. For every arrest that specified it as a race, I assign the data to ethnicity and mark race as unknown.

In [4]:
recent_arrests.loc[recent_arrests['_census_race'] ==
                   'Hispanic or Latino', '_census_race'] = 'Other/Unknown'

In [5]:
recent_arrests.loc[recent_arrests['_census_race']
                   == 'Unknown', '_census_race'] = 'Other/Unknown'

In [6]:
recent_arrests['_census_race'].unique()

array(['American Indian and Alaska Native', 'Asian',
       'Black or African American',
       'Native Hawaiian and Other Pacific Islander', 'Other/Unknown',
       'White'], dtype=object)

In [7]:
recent_arrests['CoC Number'] = recent_arrests['_city'].replace({'Sacramento': 'CA-503',
                                                                'Los Angeles': 'CA-600',
                                                                'Oakland': 'CA-502',
                                                                'San Diego': 'CA-601',
                                                                'Portland': 'OR-501',
                                                                'Seattle': 'WA-500'
                                                                })

### By Continuum of Care alone

In [8]:
recent_arrests_by_coc = (
    recent_arrests.groupby(
        [
            'CoC Number',
        ],
        dropna=False,
    ).agg(arrests_coc_total=('arrests', 'sum'))
    .reset_index()
)

### By CoC and housing status

In [9]:
recent_arrests_by_housing = (
    recent_arrests.groupby(
        [
            'CoC Number',
            '_housing_status'
        ],
        dropna=False,
    ).agg(arrests_by_housing=('arrests', 'sum'))
    .reset_index()
)

### By housing status and demographic

#### Race

In [10]:
recent_arrests.columns

Index(['_city', '_housing_status', '_census_race', '_census_ethnicity',
       'arrests', 'CoC Number'],
      dtype='object')

In [11]:
recent_arrests_by_housing_and_race = (
    recent_arrests.groupby(
        [
            'CoC Number',
            '_housing_status',
            '_census_race'
        ],
        dropna=False,
    ).agg(arrests=('arrests', 'sum'))
    .reset_index()
)

##### Issues with race and ethnicity

Both the US Census Bureau and HUD differentiate between race and ethnicity with respect to Latino people. However, most arrest data has a single 'race' field, in which agencies record that an arrestee was "Hispanic" or "Latino."

Another complication is that HUD data does not provide a race x ethnicity cross-tabulation, only separate aggregates for each. Because many "White" unhoused people may be Latino, I produce an additional "White alone" category, duplicating the data under "White", for comparison with census data. This provides a more conservative estimate of racial disparities in both homelessness and arrests.

**Example**

This is a toy example, but its proportions closely match those of the 2019 San Diego data.

| Variable        | Count | `White` population | `White alone` population |
|:----------------|-------|--------------------|--------------------------|
| arrests         |    25 |                700 |                      450 |
| unhoused people |    16 |                700 |                      450 |



Using `White alone` as the estimate for rate comparisons provides a more conservative analysis of racial disparities in arrests.


| rate / 100 people       | `White` population | `White alone` population |
|:------------------------|--------------------|--------------------------|
| arrests                 |                  4 |                        6 |
| unhoused people         |                  2 |                        4 |
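
The rates in the second table follow directly from the counts in the first. A minimal sketch of that arithmetic, using only the toy numbers above (the variable names are illustrative and do not appear elsewhere in this notebook):

```python
# Toy counts and population denominators copied from the tables above.
counts = {'arrests': 25, 'unhoused people': 16}
populations = {'White': 700, 'White alone': 450}

# Rate per 100 people for each count under each denominator,
# rounded to the nearest whole number as in the table.
rates_per_100 = {
    variable: {
        label: round(count / population * 100)
        for label, population in populations.items()
    }
    for variable, count in counts.items()
}

for variable, by_population in rates_per_100.items():
    print(variable, by_population)
```

The smaller `White alone` denominator produces the higher rate for the comparison group, which is why it narrows any measured disparity.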



This is unnecessary because I already segmented out white and white alone.

In [12]:
recent_arrests_by_housing_and_race['_census_race'].unique()

array(['American Indian and Alaska Native', 'Asian',
       'Black or African American',
       'Native Hawaiian and Other Pacific Islander', 'Other/Unknown',
       'White'], dtype=object)

In [13]:
recent_arrests_by_housing_and_race[
    'PITC Demographic'
] = recent_arrests_by_housing_and_race['_census_race'].str.replace(' and ', ' or ')

In [14]:
recent_arrests_by_housing_and_race.rename(
    columns={'_census_race': 'ACS Demographic'}, inplace=True)

In [15]:
white_alone_recent_arrests_by_housing_and_race = (
    recent_arrests[
        (recent_arrests['_census_race'] == 'White')
        & (recent_arrests['_census_ethnicity'] == 'Not Hispanic or Latino')
    ]
    .groupby(
        ['CoC Number', '_housing_status', '_census_race'],
        dropna=False,
    )
    .agg(arrests=('arrests', 'sum'))
    .reset_index()
)

In [16]:
white_alone_recent_arrests_by_housing_and_race['ACS Demographic'] = 'Non-Hispanic White'
white_alone_recent_arrests_by_housing_and_race['PITC Demographic'] = 'White'

In [17]:
white_alone_recent_arrests_by_housing_and_race

Unnamed: 0,CoC Number,_housing_status,_census_race,arrests,ACS Demographic,PITC Demographic
0,CA-502,housed,White,1918,Non-Hispanic White,White
1,CA-502,no information,White,41,Non-Hispanic White,White
2,CA-502,unhoused,White,149,Non-Hispanic White,White
3,CA-502,unknown,White,18,Non-Hispanic White,White
4,CA-503,housed,White,6355,Non-Hispanic White,White
5,CA-503,no information,White,26,Non-Hispanic White,White
6,CA-503,unhoused,White,6348,Non-Hispanic White,White
7,CA-503,unknown,White,16,Non-Hispanic White,White
8,CA-600,housed,White,22060,Non-Hispanic White,White
9,CA-600,no information,White,13,Non-Hispanic White,White


In [18]:
final_recent_arrests_by_housing_and_race = pd.concat(
    [white_alone_recent_arrests_by_housing_and_race, recent_arrests_by_housing_and_race])

In [19]:
final_recent_arrests_by_housing_and_race.drop(
    labels=['_census_race'], axis=1, inplace=True)

In [20]:
final_recent_arrests_by_housing_and_race.sort_values(
    by=['CoC Number', '_housing_status', 'PITC Demographic', 'ACS Demographic'],
    inplace=True,
    ignore_index=True,
)

In [21]:
final_recent_arrests_by_housing_and_race[final_recent_arrests_by_housing_and_race['PITC Demographic'] == 'White']

Unnamed: 0,CoC Number,_housing_status,arrests,ACS Demographic,PITC Demographic
5,CA-502,housed,1918,Non-Hispanic White,White
6,CA-502,housed,2192,White,White
12,CA-502,no information,41,Non-Hispanic White,White
13,CA-502,no information,46,White,White
19,CA-502,unhoused,149,Non-Hispanic White,White
20,CA-502,unhoused,187,White,White
26,CA-502,unknown,18,Non-Hispanic White,White
27,CA-502,unknown,23,White,White
33,CA-503,housed,6355,Non-Hispanic White,White
34,CA-503,housed,12399,White,White


#### Ethnicity

In [22]:
final_recent_arrests_by_housing_and_ethn = (
    recent_arrests.groupby(
        [
            'CoC Number',
            '_housing_status',
            '_census_ethnicity',
        ],
        dropna=False,
    ).agg(arrests=('arrests', 'sum'))
    .reset_index()
)

In [23]:
final_recent_arrests_by_housing_and_ethn.rename(
    columns={'_census_ethnicity': 'ACS Demographic'}, inplace=True)

In [24]:
final_recent_arrests_by_housing_and_ethn[
    'PITC Demographic'
] = final_recent_arrests_by_housing_and_ethn['ACS Demographic'].replace(
    {
        'Not Hispanic or Latino': 'Non-Hispanic/Non-Latino',
        'Hispanic or Latino (of any race)': 'Hispanic/Latino',
    }
)

In [25]:
final_recent_arrests_by_housing_and_ethn

Unnamed: 0,CoC Number,_housing_status,ACS Demographic,arrests,PITC Demographic
0,CA-502,housed,Hispanic or Latino (of any race),5692,Hispanic/Latino
1,CA-502,housed,Not Hispanic or Latino,15365,Non-Hispanic/Non-Latino
2,CA-502,housed,Unknown,1335,Unknown
3,CA-502,no information,Hispanic or Latino (of any race),81,Hispanic/Latino
4,CA-502,no information,Not Hispanic or Latino,354,Non-Hispanic/Non-Latino
5,CA-502,no information,Unknown,23,Unknown
6,CA-502,unhoused,Hispanic or Latino (of any race),322,Hispanic/Latino
7,CA-502,unhoused,Not Hispanic or Latino,1175,Non-Hispanic/Non-Latino
8,CA-502,unhoused,Unknown,156,Unknown
9,CA-502,unknown,Hispanic or Latino (of any race),60,Hispanic/Latino


### By demographic alone

#### Race

In [26]:
recent_arrests

Unnamed: 0,_city,_housing_status,_census_race,_census_ethnicity,arrests,CoC Number
0,Los Angeles,housed,American Indian and Alaska Native,Not Hispanic or Latino,17,CA-600
1,Los Angeles,housed,Asian,Not Hispanic or Latino,254,CA-600
2,Los Angeles,housed,Black or African American,Hispanic or Latino (of any race),4,CA-600
3,Los Angeles,housed,Black or African American,Not Hispanic or Latino,51045,CA-600
4,Los Angeles,housed,Native Hawaiian and Other Pacific Islander,Not Hispanic or Latino,21,CA-600
...,...,...,...,...,...,...
170,San Diego,unknown,Black or African American,Not Hispanic or Latino,261,CA-601
171,San Diego,unknown,Native Hawaiian and Other Pacific Islander,Not Hispanic or Latino,4,CA-601
172,San Diego,unknown,Other/Unknown,Hispanic or Latino (of any race),163,CA-601
173,San Diego,unknown,Other/Unknown,Not Hispanic or Latino,8,CA-601


In [27]:
recent_arrests_by_race = (
    recent_arrests.groupby(['CoC Number', '_census_race'], dropna=False)
    .agg(arrests_demographic=('arrests', 'sum'))
    .reset_index()
)

In [28]:
recent_arrests_by_race_and_ethnicity = (
    recent_arrests.groupby(
        ['CoC Number', '_census_race', '_census_ethnicity'], dropna=False)
    .agg(arrests_demographic=('arrests', 'sum'))
    .reset_index()
)

In [29]:
white_alone_recent_arrests_by_race = recent_arrests_by_race_and_ethnicity[
    (recent_arrests_by_race_and_ethnicity['_census_race'] == 'White') & (
        recent_arrests_by_race_and_ethnicity['_census_ethnicity'] == 'Not Hispanic or Latino')
].copy()

In [30]:
white_alone_recent_arrests_by_race

Unnamed: 0,CoC Number,_census_race,_census_ethnicity,arrests_demographic
16,CA-502,White,Not Hispanic or Latino,2126
28,CA-503,White,Not Hispanic or Latino,12745
38,CA-600,White,Not Hispanic or Latino,35105
45,CA-601,White,Not Hispanic or Latino,31112
52,OR-501,White,Not Hispanic or Latino,43663


In [31]:
white_alone_recent_arrests_by_race['_census_race'] = 'Non-Hispanic White'

In [32]:
final_recent_arrests_by_race = pd.concat(
    [white_alone_recent_arrests_by_race, recent_arrests_by_race], ignore_index=True)

In [33]:
final_recent_arrests_by_race.rename(
    columns={'_census_race': 'ACS Demographic'}, inplace=True)

In [34]:
final_recent_arrests_by_race['ACS Demographic'].unique()

array(['Non-Hispanic White', 'American Indian and Alaska Native', 'Asian',
       'Black or African American',
       'Native Hawaiian and Other Pacific Islander', 'Other/Unknown',
       'White'], dtype=object)

In [35]:
acs_to_pitc = {
    'Non-Hispanic White': 'White',
    'American Indian and Alaska Native': 'American Indian or Alaska Native',
    'Native Hawaiian and Other Pacific Islander': 'Native Hawaiian or Other Pacific Islander',
}

In [36]:
final_recent_arrests_by_race['PITC Demographic'] = final_recent_arrests_by_race['ACS Demographic'].replace(
    acs_to_pitc)

In [37]:
final_recent_arrests_by_race.drop(
    labels=['_census_ethnicity'], axis=1, inplace=True)

In [38]:
final_recent_arrests_by_race

Unnamed: 0,CoC Number,ACS Demographic,arrests_demographic,PITC Demographic
0,CA-502,Non-Hispanic White,2126,White
1,CA-503,Non-Hispanic White,12745,White
2,CA-600,Non-Hispanic White,35105,White
3,CA-601,Non-Hispanic White,31112,White
4,OR-501,Non-Hispanic White,43663,White
5,CA-502,American Indian and Alaska Native,67,American Indian or Alaska Native
6,CA-502,Asian,1137,Asian
7,CA-502,Black or African American,14510,Black or African American
8,CA-502,Native Hawaiian and Other Pacific Islander,347,Native Hawaiian or Other Pacific Islander
9,CA-502,Other/Unknown,6272,Other/Unknown


#### Ethnicity

In [39]:
final_recent_arrests_by_ethn = (
    recent_arrests.groupby(['CoC Number', '_census_ethnicity'], dropna=False)
    .agg(arrests_demographic=('arrests', 'sum'))
    .reset_index()
)

In [40]:
final_recent_arrests_by_ethn.rename(
    columns={'_census_ethnicity': 'ACS Demographic'}, inplace=True)

In [42]:
final_recent_arrests_by_ethn['PITC Demographic'] = final_recent_arrests_by_ethn['ACS Demographic'].replace(
    {
        'Not Hispanic or Latino': 'Non-Hispanic/Non-Latino',
        'Hispanic or Latino (of any race)': 'Hispanic/Latino',
    }
)

### Merging

#### Race and race by housing

In [43]:
final_recent_arrests_by_race

Unnamed: 0,CoC Number,ACS Demographic,arrests_demographic,PITC Demographic
0,CA-502,Non-Hispanic White,2126,White
1,CA-503,Non-Hispanic White,12745,White
2,CA-600,Non-Hispanic White,35105,White
3,CA-601,Non-Hispanic White,31112,White
4,OR-501,Non-Hispanic White,43663,White
5,CA-502,American Indian and Alaska Native,67,American Indian or Alaska Native
6,CA-502,Asian,1137,Asian
7,CA-502,Black or African American,14510,Black or African American
8,CA-502,Native Hawaiian and Other Pacific Islander,347,Native Hawaiian or Other Pacific Islander
9,CA-502,Other/Unknown,6272,Other/Unknown


In [44]:
race_df = pd.merge(final_recent_arrests_by_race, final_recent_arrests_by_housing_and_race,
                   on=['CoC Number', 'ACS Demographic', 'PITC Demographic'])

In [45]:
race_df

Unnamed: 0,CoC Number,ACS Demographic,arrests_demographic,PITC Demographic,_housing_status,arrests
0,CA-502,Non-Hispanic White,2126,White,housed,1918
1,CA-502,Non-Hispanic White,2126,White,no information,41
2,CA-502,Non-Hispanic White,2126,White,unhoused,149
3,CA-502,Non-Hispanic White,2126,White,unknown,18
4,CA-503,Non-Hispanic White,12745,White,housed,6355
...,...,...,...,...,...,...
120,OR-501,Other/Unknown,4931,Other/Unknown,unknown,97
121,OR-501,White,43663,White,housed,18871
122,OR-501,White,43663,White,no information,932
123,OR-501,White,43663,White,unhoused,23334


#### Race, race by housing, housing

In [46]:
race_housing_df = pd.merge(race_df, recent_arrests_by_housing, on=[
                           'CoC Number', '_housing_status'])

##### Race, etc. by CoC recent

In [47]:
race_housing_coc_df = pd.merge(race_housing_df, recent_arrests_by_coc)
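
Each merge in this section attaches a coarser total to finer-grained rows, so every detailed row should match exactly one row of totals. A hedged sketch of how the `validate` argument of `pd.merge` can assert that, on toy frames whose names and values are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-ins for the detailed rows and the per-CoC totals.
detail = pd.DataFrame({
    'CoC Number': ['CA-502', 'CA-502', 'OR-501'],
    '_housing_status': ['housed', 'unhoused', 'housed'],
    'arrests': [10, 5, 7],
})
totals = pd.DataFrame({
    'CoC Number': ['CA-502', 'OR-501'],
    'arrests_coc_total': [15, 7],
})

# validate='many_to_one' raises pandas.errors.MergeError if 'CoC Number'
# is duplicated in `totals`, which would otherwise multiply rows silently.
merged = pd.merge(detail, totals, on='CoC Number', validate='many_to_one')
print(merged)
```

Naming the key with `on=` also makes the join explicit instead of relying on whichever columns the two frames happen to share.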

#### Ethnicity, ethnicity by housing

In [48]:
ethn_df = pd.merge(final_recent_arrests_by_ethn, final_recent_arrests_by_housing_and_ethn,
                   on=['CoC Number', 'ACS Demographic', 'PITC Demographic'])

#### Ethnicity, ethnicity by housing, housing

In [49]:
ethn_housing_df = pd.merge(ethn_df, recent_arrests_by_housing, on=[
                           'CoC Number', '_housing_status'])

##### Ethnicity, etc. by CoC

In [50]:
ethn_housing_coc_df = pd.merge(ethn_housing_df, recent_arrests_by_coc)

In [51]:
ethn_housing_coc_df.head()

Unnamed: 0,CoC Number,ACS Demographic,arrests_demographic,PITC Demographic,_housing_status,arrests,arrests_by_housing,arrests_coc_total
0,CA-502,Hispanic or Latino (of any race),6155,Hispanic/Latino,housed,5692,22392,24781
1,CA-502,Not Hispanic or Latino,17092,Non-Hispanic/Non-Latino,housed,15365,22392,24781
2,CA-502,Unknown,1534,Unknown,housed,1335,22392,24781
3,CA-502,Hispanic or Latino (of any race),6155,Hispanic/Latino,no information,81,458,24781
4,CA-502,Not Hispanic or Latino,17092,Non-Hispanic/Non-Latino,no information,354,458,24781


#### Ethnicity and race

In [52]:
final_recent_arrest_df = pd.concat([race_housing_coc_df, ethn_housing_coc_df])

In [53]:
final_recent_arrest_df

Unnamed: 0,CoC Number,ACS Demographic,arrests_demographic,PITC Demographic,_housing_status,arrests,arrests_by_housing,arrests_coc_total
0,CA-502,Non-Hispanic White,2126,White,housed,1918,22392,24781
1,CA-502,American Indian and Alaska Native,67,American Indian or Alaska Native,housed,50,22392,24781
2,CA-502,Asian,1137,Asian,housed,1050,22392,24781
3,CA-502,Black or African American,14510,Black or African American,housed,12949,22392,24781
4,CA-502,Native Hawaiian and Other Pacific Islander,347,Native Hawaiian or Other Pacific Islander,housed,331,22392,24781
...,...,...,...,...,...,...,...,...
49,OR-501,Not Hispanic or Latino,62081,Non-Hispanic/Non-Latino,unhoused,31772,33599,67012
50,OR-501,Unknown,295,Unknown,unhoused,86,33599,67012
51,OR-501,Hispanic or Latino (of any race),4636,Hispanic/Latino,unknown,92,922,67012
52,OR-501,Not Hispanic or Latino,62081,Non-Hispanic/Non-Latino,unknown,825,922,67012


### Calculations

In [54]:
final_recent_arrest_df['housing_status_and_demographic_pct_of_all_arrests'] = (
    final_recent_arrest_df['arrests'] /
    final_recent_arrest_df['arrests_coc_total']
)

In [55]:
final_recent_arrest_df['demographic_pct_of_housing_status_arrests'] = (
    final_recent_arrest_df['arrests'] /
    final_recent_arrest_df['arrests_by_housing']
)

In [56]:
final_recent_arrest_df['demographic_pct_of_all_arrests'] = (
    final_recent_arrest_df['arrests_demographic'] /
    final_recent_arrest_df['arrests_coc_total']
)

In [57]:
final_recent_arrest_df['housing_status_pct_of_all_arrests'] = (
    final_recent_arrest_df['arrests_by_housing'] /
    final_recent_arrest_df['arrests_coc_total']
)

In [58]:
final_recent_arrest_df['housing_status_pct_of_demographic_arrests'] = (
    final_recent_arrest_df['arrests'] /
    final_recent_arrest_df['arrests_demographic']
)
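
As a worked check of the five ratios, the sketch below recomputes them by hand for a single row (CA-502, Non-Hispanic White, housed), using the counts shown in the `final_recent_arrest_df` output above. The dictionary is only an illustration; its keys mirror the column names used in the cells above.

```python
# Counts for one row of final_recent_arrest_df
# (CA-502, Non-Hispanic White, housed).
row = {
    'arrests': 1918,
    'arrests_demographic': 2126,
    'arrests_by_housing': 22392,
    'arrests_coc_total': 24781,
}

# Same numerators and denominators as the five assignments above.
shares = {
    'housing_status_and_demographic_pct_of_all_arrests':
        row['arrests'] / row['arrests_coc_total'],
    'demographic_pct_of_housing_status_arrests':
        row['arrests'] / row['arrests_by_housing'],
    'demographic_pct_of_all_arrests':
        row['arrests_demographic'] / row['arrests_coc_total'],
    'housing_status_pct_of_all_arrests':
        row['arrests_by_housing'] / row['arrests_coc_total'],
    'housing_status_pct_of_demographic_arrests':
        row['arrests'] / row['arrests_demographic'],
}

for name, value in shares.items():
    print(f'{name}: {value:.6f}')
```

Despite the `pct` in the column names, the stored values are proportions between 0 and 1, not percentages.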

In [59]:
final_recent_arrest_df[['ACS Demographic',
                        'PITC Demographic']].drop_duplicates()

Unnamed: 0,ACS Demographic,PITC Demographic
0,Non-Hispanic White,White
1,American Indian and Alaska Native,American Indian or Alaska Native
2,Asian,Asian
3,Black or African American,Black or African American
4,Native Hawaiian and Other Pacific Islander,Native Hawaiian or Other Pacific Islander
5,Other/Unknown,Other/Unknown
6,White,White
0,Hispanic or Latino (of any race),Hispanic/Latino
1,Not Hispanic or Latino,Non-Hispanic/Non-Latino
2,Unknown,Unknown


Arrest data is averaged over 2017 and 2019 because those are the years in which:

- CoCs performed counts of unsheltered people (odd years only)
- I have Portland data (available from 2017 through part of 2021)
- CoCs recorded race and ethnicity (2015 onward)
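
Averaging two count years amounts to a group-wise mean over the year column. A minimal sketch with invented shares (the CoC codes and numbers are placeholders, not real count data):

```python
import pandas as pd

# Placeholder data: one demographic share per CoC per count year.
toy_counts = pd.DataFrame({
    'CoC Number': ['XX-000', 'XX-000', 'XX-001', 'XX-001'],
    'PITC Year': ['2017', '2019', '2017', '2019'],
    'demographic pct of all homeless': [0.25, 0.75, 0.5, 1.0],
})

# Keep the two years of interest, then average within each CoC.
two_year_avg = (
    toy_counts[toy_counts['PITC Year'].isin(['2017', '2019'])]
    .groupby('CoC Number')['demographic pct of all homeless']
    .mean()
    .reset_index()
)
print(two_year_avg)
```

A simple mean like this weights both years equally, regardless of how many people each count found.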

## Point in time count and ACS data

In [60]:
pitc_acs = pd.read_csv(
    '../04_outputs/c02_acs_pitc.csv',
    dtype={'county_fips': str, 'ACS Year': str, 'PITC Year': str},
)

In [61]:
pitc_acs['ACS Year'].unique()

array(['2019'], dtype=object)

In [62]:
pitc_acs['variable']

0      ethnicity
1      ethnicity
2      ethnicity
3      ethnicity
4      ethnicity
         ...    
139     one race
140     one race
141     one race
142     one race
143     one race
Name: variable, Length: 144, dtype: object

In [63]:
pitc_acs_avg = (
    pitc_acs[
        (pitc_acs['PITC Year'].isin(['2017', '2019']))
        & (pitc_acs['CoC Number'] != 'WA-500')
    ]
    .groupby(
        ['CoC Number', 'CoC Name', 'PITC Demographic', 'ACS Demographic', ], dropna=False
    )[
        [
            'ACS Percent',
            'demographic pct of unsheltered',
            'demographic pct of all homeless',
        ]
    ]
    .mean()
    .reset_index()
)

### Merge arrest and pitc/coc census data

In [64]:
arrest_and_pitc_df = pd.merge(final_recent_arrest_df, pitc_acs_avg)
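
With no `on` or `how` given, `pd.merge` performs an inner join on every column the two frames share, so any arrest row whose labels have no counterpart in the count data drops out silently. A sketch of how `indicator=True` could surface such rows, using made-up frames (whether any rows are actually dropped in the real data is not shown here):

```python
import pandas as pd

# Hypothetical arrest rows and count rows sharing two key columns.
arrests_toy = pd.DataFrame({
    'CoC Number': ['CA-502', 'CA-502'],
    'PITC Demographic': ['White', 'Other/Unknown'],
    'arrests': [100, 20],
})
counts_toy = pd.DataFrame({
    'CoC Number': ['CA-502'],
    'PITC Demographic': ['White'],
    'demographic pct of all homeless': [0.3],
})

# A left join keeps every arrest row; the '_merge' column records
# whether a matching count row was found.
checked = pd.merge(arrests_toy, counts_toy, how='left', indicator=True)
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched[['CoC Number', 'PITC Demographic']])
```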

In [65]:
arrest_and_pitc_df.rename(
    columns={'ACS Percent': 'ACS Percent, CoC (one race)'}, inplace=True)

## City ACS data

In [66]:
city_acs_df = pd.read_csv(
    '../04_outputs/c01_USCB-ACS5Y-DP05-City.csv',
    dtype={'ACS Year': str},
)

In [67]:
city_acs_df[['variable', 'ACS Demographic']].drop_duplicates()

Unnamed: 0,variable,ACS Demographic
0,ethnicity,Hispanic or Latino (of any race)
1,ethnicity,Not Hispanic or Latino
2,race alone or in combination,American Indian and Alaska Native
3,race alone or in combination,Asian
4,race alone or in combination,Black or African American
5,race alone or in combination,Native Hawaiian and Other Pacific Islander
6,race alone or in combination,Some other race
7,race alone or in combination,White
8,race alone or in combination,Non-Hispanic White
9,one race,American Indian and Alaska Native


In [69]:
city_acs_df.rename(columns={
    'geography': 'city',
    'ACS Percent': 'ACS Percent, city'}, inplace=True)

In [70]:
city_acs_df['city'].unique()

array(['Los Angeles', 'Oakland', 'Portland', 'Sacramento', 'San Diego',
       'Seattle'], dtype=object)

## Annual arrests by city

In [71]:
coc_to_city_dict = {'CA-600': 'Los Angeles',
                    'CA-502': 'Oakland',
                    'OR-501': 'Portland',
                    'CA-503': 'Sacramento',
                    'CA-601': 'San Diego',
                    'WA-500': 'Seattle'}

In [72]:
# coc_to_geo_dict = {v: k for k, v in geo_to_coc_dict.items()}

## Merging city census data with arrest data

In [73]:
arrest_and_pitc_df.head()

Unnamed: 0,CoC Number,ACS Demographic,arrests_demographic,PITC Demographic,_housing_status,arrests,arrests_by_housing,arrests_coc_total,housing_status_and_demographic_pct_of_all_arrests,demographic_pct_of_housing_status_arrests,demographic_pct_of_all_arrests,housing_status_pct_of_all_arrests,housing_status_pct_of_demographic_arrests,CoC Name,"ACS Percent, CoC (one race)",demographic pct of unsheltered,demographic pct of all homeless
0,CA-502,Non-Hispanic White,2126,White,housed,1918,22392,24781,0.077398,0.085656,0.085792,0.903595,0.902164,"Oakland, Berkeley/Alameda County CoC",0.314,0.327639,0.308711
1,CA-502,Non-Hispanic White,2126,White,no information,41,458,24781,0.001654,0.08952,0.085792,0.018482,0.019285,"Oakland, Berkeley/Alameda County CoC",0.314,0.327639,0.308711
2,CA-502,Non-Hispanic White,2126,White,unhoused,149,1653,24781,0.006013,0.090139,0.085792,0.066704,0.070085,"Oakland, Berkeley/Alameda County CoC",0.314,0.327639,0.308711
3,CA-502,Non-Hispanic White,2126,White,unknown,18,278,24781,0.000726,0.064748,0.085792,0.011218,0.008467,"Oakland, Berkeley/Alameda County CoC",0.314,0.327639,0.308711
4,CA-502,American Indian and Alaska Native,67,American Indian or Alaska Native,housed,50,22392,24781,0.002018,0.002233,0.002704,0.903595,0.746269,"Oakland, Berkeley/Alameda County CoC",0.007,0.036514,0.033684


In [74]:
city_acs_df.columns.intersection(arrest_and_pitc_df.columns)

Index(['ACS Demographic', 'CoC Number'], dtype='object')

In [75]:
final_df = pd.merge(
    arrest_and_pitc_df,
    city_acs_df[city_acs_df['variable'].isin(
        ['ethnicity', 'race alone or in combination'])],
    on=['CoC Number', 'ACS Demographic'],
    suffixes=(' CoC', ' city'),
)

In [76]:
final_df.loc[
    (final_df['city'] == 'Oakland')
    & (final_df['ACS Demographic'] == 'Black or African American'),
    'demographic pct of unsheltered, city',
] = np.nan

In [77]:
final_df.loc[
    (final_df['city'] == 'Oakland')
    & (final_df['ACS Demographic'] == 'Black or African American'),
    'demographic pct of all homeless, city',
] = 0.69

In [78]:
final_df.loc[
    (final_df['city'] == 'Oakland')
    & (final_df['ACS Demographic'] == 'American Indian and Alaska Native'),
    'demographic pct of all homeless, city',
] = 0.03

In [79]:
final_df.loc[
    (final_df['city'] == 'Oakland')
    & (final_df['ACS Demographic'] == 'Hispanic or Latino (of any race)'),
    'demographic pct of all homeless, city',
] = 0.13

Black or African American share of all homeless people in the city of Los Angeles, from the homeless count methodology reports:

[2019](https://www.documentcloud.org/documents/22029257-2019-los-angeles-continuum-of-care-homeless-count-methodology-report#document/p83/a2110003): 48.2%

[2017](https://www.documentcloud.org/documents/22029256-2017-los-angeles-continuum-of-care-homeless-count-methodology-report#document/p75/a2110001): 56.9%

In [80]:
(.482+.569)/2

0.5255

In [81]:
final_df.loc[
    (final_df['city'] == 'Los Angeles')
    & (final_df['ACS Demographic'] == 'Black or African American'),
    'demographic pct of all homeless, city',
] = 0.526

In [82]:
(.022+.031)/2

0.0265

In [83]:
final_df.loc[
    (final_df['city'] == 'Los Angeles')
    & (final_df['ACS Demographic'] == 'American Indian and Alaska Native'),
    'demographic pct of all homeless, city',
] = 0.027

In [84]:
(0.348+.307)/2

0.3275

In [85]:
final_df.loc[
    (final_df['city'] == 'Los Angeles')
    & (final_df['ACS Demographic'] == 'Hispanic or Latino (of any race)'),
    'demographic pct of all homeless, city',
] = 0.328
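
The three Los Angeles values assigned above are two-year means rounded to three decimals. Because binary floats cannot represent a value such as 0.5255 exactly, calling `round()` on the float is not guaranteed to round the trailing 5 upward. A sketch using the standard `decimal` module with explicit half-up rounding reproduces the hand-entered figures; the helper name `two_year_mean` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def two_year_mean(first, second, places='0.001'):
    """Mean of two shares given as strings, rounded half-up."""
    mean = (Decimal(first) + Decimal(second)) / 2
    return mean.quantize(Decimal(places), rounding=ROUND_HALF_UP)


# Input pairs copied from the cells above.
black_share = two_year_mean('0.482', '0.569')
aian_share = two_year_mean('0.022', '0.031')
latino_share = two_year_mean('0.348', '0.307')
print(black_share, aian_share, latino_share)
```

Passing the shares as strings keeps the arithmetic exact in base 10, so the rounding rule is the only thing that decides the last digit.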

<a id="fc"></a>
Fact-checking note: The Black share of each city's population is always higher than the Black share of the population of the county to which it belongs.

In [86]:
final_df[final_df['ACS Demographic'] == 'Black or African American'][
    [
        'ACS Demographic',
        'CoC Name',
        'ACS Percent, CoC (one race)',
        'ACS Percent, city',
        'demographic pct of all homeless',
        'demographic pct of all homeless, city',
    ]
].applymap(
    lambda x: round(x * 100) if isinstance(x, (float, np.floating)) else x,
    na_action='ignore',
).drop_duplicates()

Unnamed: 0,ACS Demographic,CoC Name,"ACS Percent, CoC (one race)","ACS Percent, city",demographic pct of all homeless,"demographic pct of all homeless, city"
12,Black or African American,"Oakland, Berkeley/Alameda County CoC",11,27,48,69.0
33,Black or African American,Sacramento City & County CoC,10,16,32,
53,Black or African American,Los Angeles City & County CoC,7,10,47,53.0
70,Black or African American,San Diego City and County CoC,5,8,23,
94,Black or African American,"Portland, Gresham/Multnomah County CoC",6,8,14,
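
The fact-checking note can also be verified mechanically. The sketch below hard-codes the rounded percentages from the output above (CoC names shortened for brevity) and checks that each city's Black share exceeds that of its CoC:

```python
import pandas as pd

# Rounded percentages copied from the output table above.
black_share = pd.DataFrame({
    'CoC': ['Oakland', 'Sacramento', 'Los Angeles', 'San Diego', 'Portland'],
    'ACS Percent, CoC (one race)': [11, 10, 7, 5, 6],
    'ACS Percent, city': [27, 16, 10, 8, 8],
})

# True for a row when the city share is strictly higher than the CoC share.
city_higher = (
    black_share['ACS Percent, city']
    > black_share['ACS Percent, CoC (one race)']
)
print(city_higher.all())
```

Note that the CoC column is the "one race" figure, while the city column comes from the "race alone or in combination" rows selected in the merge, so part of the gap may reflect the definition rather than geography.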


In [87]:
final_df.to_csv('../04_outputs/a03_arrests_and_demographics.csv', index=False)