# Census tract to Chicago community area aggregation by Hispanic or Latino origin by race

<b>Methodology: </b>I aggregate tract-level race and ethnicity data to the community area level using CMAP's tract-to-community-area crosswalk, whose ratios reflect the share of each tract's block-level population that resides in a given community area. For the eight split tracts, I distribute the total population according to the geographic distribution of the block-level population, then estimate race/ethnicity totals for each split portion by applying the tract's overall race/ethnicity breakdown.

In [1]:
# configs
import pandas as pd

Downloaded 2023 5-year ACS data for B03002, Hispanic or Latino Origin by Race, by census tract for Cook County

In [73]:
# load census data
acs = pd.read_csv('../census-data/ACSDT5Y2023.B03002_2025-07-10T125648/ACSDT5Y2023.B03002-Data.csv', skiprows=[1],
                  header=0,
                  usecols=['GEO_ID',
                           'B03002_001E', # total
                           'B03002_012E', # hispanic (any race)
                           'B03002_003E', # white alone, not hispanic
                           'B03002_004E', # black alone, not hispanic
                           'B03002_006E']) # asian alone, not hispanic

In [74]:
acs.head(2)

Unnamed: 0,GEO_ID,B03002_001E,B03002_003E,B03002_004E,B03002_006E,B03002_012E
0,1400000US17031010100,3726,1297,1376,137,809
1,1400000US17031010201,7588,1406,2301,376,2622


In [75]:
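# rename cols by name (read_csv returns columns in file order, not usecols order)
acs = acs.rename(columns={'GEO_ID': 'geoid', 'B03002_001E': 'total', 'B03002_003E': 'white_nonhispanic',
                          'B03002_004E': 'black_nonhispanic', 'B03002_006E': 'asian_nonhispanic',
                          'B03002_012E': 'hispanic'})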
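
`pd.read_csv` ignores the order of the `usecols` list and returns columns in file order, so assigning names by position only works while the file layout stays the same. A minimal sketch on a toy CSV (the single row is copied from the output above, with the column set reduced for brevity) shows a name-based rename that does not depend on order:

```python
import io
import pandas as pd

# Toy CSV in file order: total, white alone (003E), then Hispanic (012E).
raw = io.StringIO(
    "GEO_ID,B03002_001E,B03002_003E,B03002_012E\n"
    "1400000US17031010100,3726,1297,809\n"
)

# usecols lists the Hispanic column before the white column,
# but read_csv still returns the columns in file order.
toy = pd.read_csv(raw, usecols=["GEO_ID", "B03002_001E", "B03002_012E", "B03002_003E"])
print(list(toy.columns))

# Renaming by name is independent of column order.
toy = toy.rename(columns={
    "GEO_ID": "geoid",
    "B03002_001E": "total",
    "B03002_003E": "white_nonhispanic",
    "B03002_012E": "hispanic",
})
print(toy)
```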

In [76]:
# clean id to match crosswalk tract geoid
acs['geoid'] = acs['geoid'].astype(str)
acs['geoid_clean'] = acs['geoid'].str.replace('1400000US', '')
acs['geoid_clean'] = acs['geoid_clean'].astype(int)
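
As an optional safeguard, the cleaned IDs can be validated before casting. The sketch below assumes every tract GEOID should be exactly 11 digits (2 for state, 3 for county, 6 for tract); the `toy` frame is illustrative only:

```python
import pandas as pd

toy = pd.DataFrame({"geoid": ["1400000US17031010100", "1400000US17031843900"]})

# Strip the literal prefix; regex=False treats it as plain text, not a pattern.
tract = toy["geoid"].str.replace("1400000US", "", regex=False)

# Fail early if any ID is not exactly 11 digits.
assert tract.str.fullmatch(r"\d{11}").all()

toy["geoid_clean"] = tract.astype("int64")
print(toy)
```

Casting to an integer is safe here because Illinois's state code (17) has no leading zero; for states with a code below 10, keeping the ID as a string would avoid dropping the leading zero.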

In [77]:
# add pcts 
acs['pct_hispanic'] = acs['hispanic']/acs['total']
acs['pct_white_nonhispanic'] = acs['white_nonhispanic']/acs['total']
acs['pct_black_nonhispanic'] = acs['black_nonhispanic']/acs['total']
acs['pct_asian_nonhispanic'] = acs['asian_nonhispanic']/acs['total']

In [78]:
# inspect
acs[acs['geoid_clean'] == 17031843900]

Unnamed: 0,geoid,total,white_nonhispanic,black_nonhispanic,asian_nonhispanic,hispanic,geoid_clean,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic
1326,1400000US17031843900,3878,428,3281,29,116,17031843900,0.029912,0.110366,0.846055,0.007478


In [79]:
# load cmap crosswalk
crosswalk = pd.read_csv('../census-data/Crosswalk_TR_to_CCA_2020 (1).csv')
crosswalk.head(2)

Unnamed: 0,TRACT,GEOID,CCA,TR_POP_RAT,TR_HH_RAT,TR_HU_RAT
0,17031010100,1,Rogers Park,1.0,1.0,1.0
1,17031010201,1,Rogers Park,1.0,1.0,1.0


The CMAP crosswalks use ratios based on block-level population, household, and housing unit data to assign tracts to community areas. Some tracts fall only partially within a community area, so they appear twice in the crosswalk, once for each community area they overlap. There are 8 split tracts.

In [80]:
crosswalk[crosswalk.duplicated('TRACT', keep=False)]

Unnamed: 0,TRACT,GEOID,CCA,TR_POP_RAT,TR_HH_RAT,TR_HU_RAT
504,17031520500,52,East Side,0.994118,0.997356,0.997459
505,17031520500,55,Hegewisch,0.005882,0.002644,0.002541
506,17031520600,52,East Side,0.980632,0.99262,0.992883
507,17031520600,55,Hegewisch,0.019368,0.00738,0.007117
694,17031831000,22,Logan Square,0.57387,0.526779,0.530892
695,17031831000,24,West Town,0.42613,0.473221,0.469108
719,17031834300,44,Chatham,0.005449,0.007218,0.006678
720,17031834300,45,Avalon Park,0.994551,0.992782,0.993322
763,17031840000,34,Armour Square,0.014995,0.013501,0.012605
764,17031840000,60,Bridgeport,0.985005,0.986499,0.987395
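
A quick sanity check on the split tracts is to count them and confirm that each tract's population ratios sum to roughly 1. The sketch below runs on a toy frame built from rows shown in the crosswalk output above; the tolerance of `1e-4` is an assumption, not a CMAP specification:

```python
import pandas as pd

# Toy crosswalk: two split tracts and one whole tract.
toy = pd.DataFrame({
    "TRACT": [17031520500, 17031520500, 17031831000, 17031831000, 17031010100],
    "CCA": ["East Side", "Hegewisch", "Logan Square", "West Town", "Rogers Park"],
    "TR_POP_RAT": [0.994118, 0.005882, 0.57387, 0.42613, 1.0],
})

# A tract is split if its ID appears on more than one row.
is_split = toy.duplicated("TRACT", keep=False)
n_split = toy.loc[is_split, "TRACT"].nunique()

# Each tract's population ratios should add up to about 1.
ratio_sums = toy.groupby("TRACT")["TR_POP_RAT"].sum()
ratios_ok = ((ratio_sums - 1).abs() < 1e-4).all()
print(n_split, ratios_ok)
```

On the full crosswalk, tracts with no population (ratios of 0) would need to be excluded from the sum-to-1 check.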


To assign population totals by community area, I multiply the tract population ratio <b>TR_POP_RAT</b> by the tract's total population estimate to get the estimated population of the portion of the tract that lies in a given community area. <br>

Then, I assume the distribution of race/ethnicity is geographically uniform across the tract and apply the tract's race/ethnicity percentages to each estimated population portion of a split tract.
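
Written out, the two steps look like this. The symbols are notation introduced here for illustration and do not come from CMAP's documentation: $P_t$ is the tract total, $P_{g,t}$ is the count for group $g$ in tract $t$, and $r_{t,a}$ is the <b>TR_POP_RAT</b> value for tract $t$ in community area $a$.

$$
\hat{P}_{t,a} = r_{t,a}\,P_t, \qquad
\hat{P}_{g,t,a} = \frac{P_{g,t}}{P_t}\,\hat{P}_{t,a} = r_{t,a}\,P_{g,t}, \qquad
\hat{P}_{g,a} = \sum_{t} \hat{P}_{g,t,a}
$$

For example, tract 17031843900 has a total of 3,878 people, 116 of them Hispanic. Its Woodlawn portion has a ratio of about 0.385979, so it receives roughly 3878 × 0.385979 ≈ 1,496.8 people, of whom roughly 116 × 0.385979 ≈ 44.8 are estimated to be Hispanic.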

In [81]:
# left merge crosswalk with census data so duplicate keys have same census info
merged = pd.merge(crosswalk, acs, left_on='TRACT', right_on='geoid_clean', how='left', indicator=True)

In [82]:
# check
merged[merged['TRACT'] == 17031843900]

Unnamed: 0,TRACT,GEOID,CCA,TR_POP_RAT,TR_HH_RAT,TR_HU_RAT,geoid,total,white_nonhispanic,black_nonhispanic,asian_nonhispanic,hispanic,geoid_clean,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,_merge
799,17031843900,42,Woodlawn,0.385979,0.312902,0.314274,1400000US17031843900,3878.0,428.0,3281.0,29.0,116.0,17031840000.0,0.029912,0.110366,0.846055,0.007478,both
800,17031843900,43,South Shore,0.614021,0.687098,0.685726,1400000US17031843900,3878.0,428.0,3281.0,29.0,116.0,17031840000.0,0.029912,0.110366,0.846055,0.007478,both


In [83]:
merged['_merge'].value_counts()

_merge
both          805
left_only       2
right_only      0
Name: count, dtype: int64

In [54]:
# note: the two O'Hare tracts with a 17043 (DuPage County) prefix are not in the Cook County ACS extract; both have TR_POP_RAT of 0
merged[merged['_merge'] == 'left_only']

Unnamed: 0,TRACT,GEOID,CCA,TR_POP_RAT,TR_HH_RAT,TR_HU_RAT,geoid,total,hispanic,white_nonhispanic,black_nonhispanic,asian_nonhispanic,geoid_clean,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,_merge
805,17043840000,76,O'Hare,0.0,0.0,0.0,,,,,,,,,,,,left_only
806,17043840801,76,O'Hare,0.0,0.0,0.0,,,,,,,,,,,,left_only
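
Unmatched crosswalk rows are harmless only if they carry no population weight. A minimal sketch of that check on toy frames (the two rows are taken from outputs above; the `_toy` names are illustrative):

```python
import pandas as pd

crosswalk_toy = pd.DataFrame({
    "TRACT": [17031010100, 17043840000],
    "CCA": ["Rogers Park", "O'Hare"],
    "TR_POP_RAT": [1.0, 0.0],
})
acs_toy = pd.DataFrame({"geoid_clean": [17031010100], "total": [3726]})

merged_toy = pd.merge(crosswalk_toy, acs_toy, left_on="TRACT",
                      right_on="geoid_clean", how="left", indicator=True)

# Rows present in the crosswalk but missing from the ACS extract.
unmatched = merged_toy[merged_toy["_merge"] == "left_only"]
unmatched_weight = unmatched["TR_POP_RAT"].sum()

# NaN totals propagate through the multiplication and are skipped by sum().
est_total = (merged_toy["total"] * merged_toy["TR_POP_RAT"]).sum()
print(unmatched_weight, est_total)
```

If `unmatched_weight` were above zero, some population would be silently dropped from the community area totals.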


In [84]:
# multiply by population ratio
merged['est_total'] = merged['total'] * merged['TR_POP_RAT']

In [85]:
# multiply est total by racial breakdowns
merged['est_hispanic'] = merged['pct_hispanic'] * merged['est_total']
merged['est_white_nonhispanic'] = merged['pct_white_nonhispanic'] * merged['est_total']
merged['est_black_nonhispanic'] = merged['pct_black_nonhispanic'] * merged['est_total']
merged['est_asian_nonhispanic'] = merged['pct_asian_nonhispanic'] * merged['est_total']

In [86]:
# check
merged[merged['TRACT'] == 17031843900]

Unnamed: 0,TRACT,GEOID,CCA,TR_POP_RAT,TR_HH_RAT,TR_HU_RAT,geoid,total,white_nonhispanic,black_nonhispanic,...,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,_merge,est_total,est_hispanic,est_white_nonhispanic,est_black_nonhispanic,est_asian_nonhispanic
799,17031843900,42,Woodlawn,0.385979,0.312902,0.314274,1400000US17031843900,3878.0,428.0,3281.0,...,0.029912,0.110366,0.846055,0.007478,both,1496.827605,44.773595,165.199127,1266.397981,11.193399
800,17031843900,43,South Shore,0.614021,0.687098,0.685726,1400000US17031843900,3878.0,428.0,3281.0,...,0.029912,0.110366,0.846055,0.007478,both,2381.172395,71.226405,262.800873,2014.602019,17.806601


Next, I aggregate to the community area level by summing the estimated race/ethnicity counts and the estimated total population.

In [87]:
agg = merged.groupby('CCA')[['est_total',
                             'est_hispanic',
                             'est_white_nonhispanic',
                             'est_black_nonhispanic',
                             'est_asian_nonhispanic']].sum().reset_index()

agg

Unnamed: 0,CCA,est_total,est_hispanic,est_white_nonhispanic,est_black_nonhispanic,est_asian_nonhispanic
0,Albany Park,46620.000000,20723.000000,15530.000000,1761.000000,7014.000000
1,Archer Heights,14021.000000,11414.000000,1706.000000,152.000000,670.000000
2,Armour Square,14246.256914,725.252916,2219.600467,1732.314895,9063.503832
3,Ashburn,42079.000000,19507.000000,3577.000000,18114.000000,291.000000
4,Auburn Gresham,45049.000000,1785.000000,374.000000,41894.000000,323.000000
...,...,...,...,...,...,...
72,West Lawn,32649.000000,28344.000000,3157.000000,639.000000,450.000000
73,West Pullman,24470.000000,1850.000000,296.000000,21774.000000,18.000000
74,West Ridge,78227.000000,17106.000000,30298.000000,9155.000000,17031.000000
75,West Town,86427.297759,17108.669199,54611.385492,4598.863274,5297.516521
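
The apportion-then-sum logic can be sketched end to end on a toy frame. All values here are made up for illustration: tract IDs 1 and 2 and areas "A" and "B" are placeholders, with tract 2 split 25/75 between the two areas.

```python
import pandas as pd

# One whole tract plus the two pieces of a split tract.
toy = pd.DataFrame({
    "TRACT": [1, 2, 2],
    "CCA": ["A", "A", "B"],
    "TR_POP_RAT": [1.0, 0.25, 0.75],
    "total": [1000, 400, 400],
    "hispanic": [100, 200, 200],
})

# Apportion the tract total, then apply the tract-wide Hispanic share.
toy["est_total"] = toy["total"] * toy["TR_POP_RAT"]
toy["est_hispanic"] = toy["hispanic"] / toy["total"] * toy["est_total"]

# Sum the pieces by community area and recompute the share.
agg_toy = toy.groupby("CCA")[["est_total", "est_hispanic"]].sum().reset_index()
agg_toy["pct_hispanic"] = agg_toy["est_hispanic"] / agg_toy["est_total"]
print(agg_toy)
```

Because each tract's ratios sum to 1, the aggregated totals add back up to the sum of the tract totals (1,400 in this toy case).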


In [109]:
# add pcts
agg['pct_hispanic'] = agg['est_hispanic']/agg['est_total']
agg['pct_white_nonhispanic'] = agg['est_white_nonhispanic']/agg['est_total']
agg['pct_black_nonhispanic'] = agg['est_black_nonhispanic']/agg['est_total']
agg['pct_asian_nonhispanic'] = agg['est_asian_nonhispanic']/agg['est_total']
agg['pct_minority'] = (agg['est_total'] - agg['est_white_nonhispanic'])/agg['est_total']

In [110]:
agg

Unnamed: 0,CCA,est_total,est_hispanic,est_white_nonhispanic,est_black_nonhispanic,est_asian_nonhispanic,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,pct_minority
0,Albany Park,46620.000000,20723.000000,15530.000000,1761.000000,7014.000000,0.444509,0.333119,0.037773,0.150450,0.666881
1,Archer Heights,14021.000000,11414.000000,1706.000000,152.000000,670.000000,0.814065,0.121675,0.010841,0.047785,0.878325
2,Armour Square,14246.256914,725.252916,2219.600467,1732.314895,9063.503832,0.050908,0.155802,0.121598,0.636202,0.844198
3,Ashburn,42079.000000,19507.000000,3577.000000,18114.000000,291.000000,0.463580,0.085007,0.430476,0.006916,0.914993
4,Auburn Gresham,45049.000000,1785.000000,374.000000,41894.000000,323.000000,0.039624,0.008302,0.929965,0.007170,0.991698
...,...,...,...,...,...,...,...,...,...,...,...
72,West Lawn,32649.000000,28344.000000,3157.000000,639.000000,450.000000,0.868143,0.096695,0.019572,0.013783,0.903305
73,West Pullman,24470.000000,1850.000000,296.000000,21774.000000,18.000000,0.075603,0.012096,0.889824,0.000736,0.987904
74,West Ridge,78227.000000,17106.000000,30298.000000,9155.000000,17031.000000,0.218671,0.387309,0.117031,0.217713,0.612691
75,West Town,86427.297759,17108.669199,54611.385492,4598.863274,5297.516521,0.197954,0.631877,0.053211,0.061294,0.368123


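Check against CMAP's community area values from the 2024 Community Data Snapshots, which are based on 2022 5-year ACS data. Because my estimates use the 2023 5-year ACS, small differences are expected.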

In [95]:
# load cmap
cmap = pd.read_csv('../census-data/Community_Data_Snapshots_2024.csv', header=0,
                  usecols=['GEOID',
                          'GEOG',
                          'TOT_POP',
                          'WHITE',
                          'HISP',
                          'BLACK',
                          'ASIAN',
                          'MEDINC'])
cmap.head()

Unnamed: 0,GEOID,GEOG,TOT_POP,WHITE,HISP,BLACK,ASIAN,MEDINC
0,1,Rogers Park,55711.0,25004.0,10836.0,13510.0,2822.0,57590.882862
1,2,West Ridge,79265.0,31506.0,17540.0,9093.0,16577.0,68091.372913
2,77,Edgewater,56099.0,29925.0,8695.0,7079.0,7394.0,67795.667447
3,3,Uptown,57464.0,30392.0,8084.0,11160.0,5507.0,66870.910173
4,4,Lincoln Square,42271.0,25830.0,8326.0,1538.0,4360.0,90568.923328


In [96]:
# make percents
cmap['pct_white'] = cmap['WHITE']/cmap['TOT_POP']
cmap['pct_hisp'] = cmap['HISP']/cmap['TOT_POP']
cmap['pct_black'] = cmap['BLACK']/cmap['TOT_POP']
cmap['pct_asian'] = cmap['ASIAN']/cmap['TOT_POP']

Check totals

In [97]:
# merge together
check = pd.merge(agg, cmap, left_on='CCA', right_on='GEOG', how='outer', indicator=True)  # outer so unmatched names show up in _merge
check['_merge'].value_counts()

_merge
both          77
left_only      0
right_only     0
Name: count, dtype: int64
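
One caveat about this kind of check: with pandas' default inner join, the `_merge` indicator can only ever be `both`, so a name mismatch would drop rows silently. The sketch below illustrates this on toy frames. The "Loop" versus "The Loop" mismatch is hypothetical, and the `TOT_POP` values are placeholders rather than CMAP figures.

```python
import pandas as pd

agg_toy = pd.DataFrame({"CCA": ["Albany Park", "The Loop"],
                        "est_total": [46620.0, 42181.0]})
# Placeholder totals; "Loop" is a hypothetical spelling mismatch.
cmap_toy = pd.DataFrame({"GEOG": ["Albany Park", "Loop"],
                         "TOT_POP": [100.0, 200.0]})

inner = pd.merge(agg_toy, cmap_toy, left_on="CCA", right_on="GEOG", indicator=True)
outer = pd.merge(agg_toy, cmap_toy, left_on="CCA", right_on="GEOG",
                 how="outer", indicator=True)

inner_counts = inner["_merge"].value_counts()
outer_counts = outer["_merge"].value_counts()
print(inner_counts)
print(outer_counts)
```

The inner merge reports no unmatched rows even though one community area is missing from each side; the outer merge surfaces both.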

In [106]:
# create delta cols
check['hisp_delta'] = check['pct_hispanic'] - check['pct_hisp']
check['black_delta'] = check['pct_black_nonhispanic'] - check['pct_black']
check['white_delta'] = check['pct_white_nonhispanic'] - check['pct_white']
check['asian_delta'] = check['pct_asian_nonhispanic'] - check['pct_asian']

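Comparing my aggregation with CMAP's, nearly all race/ethnicity shares in the rows shown below differ by less than 5 percentage points. The one exception is Grand Boulevard's Black share, which is about 5.3 points lower in my estimate. Note that the tables below sort by signed delta, so they list the largest positive differences first.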
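
Sorting by signed delta does not surface large negative gaps. One way to find the biggest differences in either direction is to rank by absolute value. This is a sketch on a toy frame whose deltas are copied from rows shown in this notebook; `max_abs_delta` and `over_5pts` are names introduced here for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "CCA": ["Hegewisch", "Grand Boulevard", "South Chicago", "North Park"],
    "hisp_delta": [0.043521, 0.012946, 0.041624, -0.000386],
    "black_delta": [-0.006159, -0.053414, -0.049465, 0.023432],
    "white_delta": [-0.03782, 0.017877, 0.012059, -0.027374],
})

delta_cols = ["hisp_delta", "black_delta", "white_delta"]

# Largest absolute gap per community area, across all groups.
toy["max_abs_delta"] = toy[delta_cols].abs().max(axis=1)
worst = toy.sort_values("max_abs_delta", ascending=False)
print(worst[["CCA", "max_abs_delta"]])

# Community areas where any group differs by more than 5 percentage points.
over_5pts = worst.loc[worst["max_abs_delta"] > 0.05, "CCA"].tolist()
print(over_5pts)
```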

In [102]:
check.sort_values('hisp_delta', ascending=False).head(5)

Unnamed: 0,CCA,est_total,est_hispanic,est_white_nonhispanic,est_black_nonhispanic,est_asian_nonhispanic,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,...,ASIAN,MEDINC,pct_white,pct_hisp,pct_black,pct_asian,_merge,hisp_delta,black_delta,white_delta
30,Hegewisch,9051.924436,4826.277633,2934.099744,1175.082353,79.270588,0.533177,0.324141,0.129816,0.008757,...,60.879137,58647.459697,0.361961,0.489656,0.135975,0.006692,both,0.043521,-0.006159,-0.03782
16,Clearing,24924.0,16059.0,8065.0,356.0,125.0,0.644319,0.323584,0.014283,0.005015,...,169.0,73778.040142,0.361493,0.601828,0.018643,0.006834,both,0.042491,-0.004359,-0.037909
61,South Chicago,29381.0,6789.0,1339.0,20588.0,71.0,0.231068,0.045574,0.700725,0.002417,...,157.0,43935.950413,0.033515,0.189444,0.75019,0.005179,both,0.041624,-0.049465,0.012059
62,South Deering,14210.0,4983.0,672.0,8426.0,15.0,0.350669,0.047291,0.592963,0.001056,...,1.0,34814.241486,0.045275,0.312787,0.640426,6.6e-05,both,0.037881,-0.047463,0.002015
15,Chicago Lawn,53460.0,33000.0,1129.0,17835.0,290.0,0.617284,0.021119,0.333614,0.005425,...,253.0,43293.492696,0.02192,0.582209,0.368691,0.004822,both,0.035075,-0.035077,-0.000801


In [103]:
check.sort_values('black_delta', ascending=False).head(5)

Unnamed: 0,CCA,est_total,est_hispanic,est_white_nonhispanic,est_black_nonhispanic,est_asian_nonhispanic,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,...,ASIAN,MEDINC,pct_white,pct_hisp,pct_black,pct_asian,_merge,hisp_delta,black_delta,white_delta
52,North Park,18742.0,3300.0,7828.0,958.0,5994.0,0.176075,0.417672,0.051115,0.319816,...,6097.0,74648.760331,0.445046,0.176461,0.027683,0.314311,both,-0.000386,0.023432,-0.027374
46,Near North Side,104712.0,7831.0,69339.0,9069.0,14097.0,0.074786,0.662188,0.086609,0.134626,...,14952.0,121561.582445,0.678237,0.069985,0.064708,0.147735,both,0.004801,0.021901,-0.016049
26,Gage Park,35691.0,32366.0,768.0,2116.0,284.0,0.906839,0.021518,0.059287,0.007957,...,224.0,50111.867704,0.028631,0.923221,0.039583,0.006439,both,-0.016381,0.019704,-0.007113
76,Woodlawn,24185.827605,652.773595,2470.199127,19120.397981,803.193399,0.02699,0.102134,0.790562,0.033209,...,859.0,29969.405594,0.098471,0.03084,0.772344,0.035994,both,-0.00385,0.018218,0.003664
12,Burnside,2148.0,111.0,0.0,1972.0,0.0,0.051676,0.0,0.918063,0.0,...,0.0,46710.526316,0.017809,0.048531,0.900267,0.0,both,0.003145,0.017796,-0.017809


In [104]:
check.sort_values('white_delta', ascending=False).head(5)

Unnamed: 0,CCA,est_total,est_hispanic,est_white_nonhispanic,est_black_nonhispanic,est_asian_nonhispanic,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,...,ASIAN,MEDINC,pct_white,pct_hisp,pct_black,pct_asian,_merge,hisp_delta,black_delta,white_delta
28,Grand Boulevard,26345.0,1200.0,1847.0,21622.0,117.0,0.045549,0.070108,0.820725,0.004441,...,191.0,43047.619048,0.052231,0.032604,0.874139,0.007698,both,0.012946,-0.053414,0.017877
56,Portage Park,61793.0,25560.0,29898.0,850.0,3340.0,0.413639,0.483841,0.013756,0.054051,...,3697.0,85473.40596,0.466603,0.428667,0.017177,0.057574,both,-0.015028,-0.003422,0.017238
55,Oakland,6946.0,366.0,410.0,5853.0,34.0,0.052692,0.059027,0.842643,0.004895,...,67.0,27693.370166,0.041995,0.037695,0.885051,0.009603,both,0.014997,-0.042408,0.017032
61,South Chicago,29381.0,6789.0,1339.0,20588.0,71.0,0.231068,0.045574,0.700725,0.002417,...,157.0,43935.950413,0.033515,0.189444,0.75019,0.005179,both,0.041624,-0.049465,0.012059
60,Roseland,36700.0,880.0,943.0,33872.0,73.0,0.023978,0.025695,0.922943,0.001989,...,75.0,49808.153477,0.014039,0.018612,0.943765,0.001994,both,0.005366,-0.020822,0.011656


In [107]:
check.sort_values('asian_delta', ascending=False).head(5)

Unnamed: 0,CCA,est_total,est_hispanic,est_white_nonhispanic,est_black_nonhispanic,est_asian_nonhispanic,pct_hispanic,pct_white_nonhispanic,pct_black_nonhispanic,pct_asian_nonhispanic,...,MEDINC,pct_white,pct_hisp,pct_black,pct_asian,_merge,hisp_delta,black_delta,white_delta,asian_delta
47,Near South Side,29174.0,2116.0,14654.0,6160.0,4981.0,0.07253,0.502297,0.211147,0.170734,...,124558.359621,0.506876,0.053161,0.238021,0.151829,both,0.019369,-0.026874,-0.004579,0.018905
65,The Loop,42181.0,5093.0,21467.0,3940.0,9967.0,0.120742,0.508926,0.093407,0.236291,...,120174.618023,0.538576,0.115212,0.08104,0.221689,both,0.00553,0.012367,-0.02965,0.014602
57,Pullman,6741.0,521.0,900.0,4966.0,88.0,0.077288,0.133511,0.736686,0.013054,...,54755.244755,0.123687,0.081534,0.767357,0.000729,both,-0.004246,-0.030671,0.009824,0.012325
11,Brighton Park,42062.0,33094.0,2953.0,915.0,4827.0,0.786791,0.070206,0.021754,0.114759,...,52144.588045,0.075255,0.799872,0.014393,0.103591,both,-0.013081,0.007361,-0.005049,0.011168
33,Hyde Park,29591.0,2365.0,11889.0,8292.0,4581.0,0.079923,0.401778,0.28022,0.154811,...,61004.273504,0.449034,0.070638,0.26283,0.143916,both,0.009285,0.01739,-0.047257,0.010895


Also, check that the summed total population is close to the citywide total the Census Bureau reports for Chicago in table B03002.

In [108]:
# should be close to 2,707,648
# https://data.census.gov/table/ACSDT5Y2023.B03002?q=B03002:+Hispanic+or+Latino+Origin+by+Race&g=160XX00US1714000
agg['est_total'].sum()

np.float64(2707267.1336591267)
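
To put the gap in context, the difference can be expressed relative to the citywide figure. Both numbers below are copied from this notebook (the summed estimate and the B03002 total quoted in the code comment):

```python
est_sum = 2707267.1336591267   # agg['est_total'].sum()
census_total = 2707648         # citywide B03002 total quoted above

diff = est_sum - census_total
rel_diff = diff / census_total
print(round(diff, 1), f"{rel_diff:.4%}")
```

The estimate falls short by about 381 people, or roughly 0.014% of the citywide total.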

In [111]:
# export to csv
agg.to_csv('../processed/chicago_race_agg_community.csv',index=False)