### Census Geographies
The following are the base geographical units from the Census. Each table has a unique ID for joining to the crosswalks.

|Table Name|Unique ID (length)| Description| Change Date|
| :--- | :--- | :--- | ---: |
|CSA|FIPS (3)|Combined Statistical Area|2/1/2021|
|CBSA|FIPS (5)|Core-Based Statistical Area (Metropolitan/Micropolitan Area)|2/1/2021|
|METDIV|FIPS (5)|Metropolitan Division|2/1/2021|
|COUNTY|GEOID (5)|Counties|1/6/2023|
|COUSUB|GEOID (10)|County Subdivision|2/1/2021|
|TRACT|GEOID (11)|Tracts| 1/6/2023|
|BLOCKGROUP|GEOID (12)|Block Group| 1/6/2023|
|BLOCK|GEOID (15)|Blocks| 1/6/2023|

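The ID lengths in the table above nest: a 15-character block GEOID starts with its 12-character block group GEOID, which starts with the 11-character tract GEOID, which in turn starts with the 5-character county GEOID. A minimal sketch of deriving the parent IDs by slicing, assuming GEOIDs are kept as zero-padded strings. The helper name `parent_geoids` and the sample ID are made up for illustration.

```python
def parent_geoids(block_geoid: str) -> dict:
    """Split a block GEOID into the GEOIDs of its parent geographies."""
    # Restore a leading zero that is lost when the ID was read as a number
    geoid = block_geoid.zfill(15)
    return {
        "county": geoid[:5],        # state (2) + county (3)
        "tract": geoid[:11],        # county + tract (6)
        "blockgroup": geoid[:12],   # tract + first digit of the block
        "block": geoid,             # full 15-character ID
    }

# Hypothetical 14-character input with its leading zero dropped
parents = parent_geoids("10030002001000")
print(parents)
```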
**Non-Hierarchical Geographies**

|Table Name|Unique ID (length)| Description|Change Date|
| :--- | :--- | :--- | ---: |
|CD|GEOID (4)|118th Congressional Districts| 1/10/2023|
|PLACE|GEOID (7)| All Places| 2/1/2021|

### Census of Local Governments

The Census of Governments has 38,767 local governments listed as of 2021. The list was subdivided into three parts to isolate local governments that fall under the umbrella of NLC and match them to the Census geographies.

1. COUNTIES - includes two city-counties and three NLC members that are officially counties
2. MUNICIPALITIES - includes incorporated places
3. TOWNSHIPS - includes county-subdivision geographic units from New England, Pennsylvania and New Jersey
    
These tables are combined to create the following table which can be merged to the crosswalks and the base geographies.

|Table Name|Unique ID (length)| Description|Change Date|
| :--- | :--- | :--- | ---: |
|CTV|GEOID (5/7/10)|All Cities, Towns and Villages| 2021|

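Because the three source lists key to different base geographies, a CTV GEOID is 5, 7 or 10 characters long. The sketch below maps that length back to the base table the ID joins to. It is only an illustration: the function name `ctv_base_table` is hypothetical, and the sample IDs are used purely as examples of each length.

```python
# Length of a CTV GEOID -> base geography table it joins to
BASE_TABLE_BY_LENGTH = {5: "COUNTY", 7: "PLACE", 10: "COUSUB"}

def ctv_base_table(geoid: str) -> str:
    """Return the base table name for a CTV GEOID, judged by its length."""
    try:
        return BASE_TABLE_BY_LENGTH[len(geoid)]
    except KeyError:
        raise ValueError(f"Unexpected CTV GEOID length: {geoid!r}") from None

# Sample IDs of each length (state+county, state+place, state+county+cousub)
for sample in ["06075", "0644000", "2502507000"]:
    print(sample, "->", ctv_base_table(sample))
```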
Total: 22,655 cities, towns and villages; 17,054 are within a Metro or Micro Area.


### Crosswalks

The following crosswalk tables can be used to join the primary CTV table to related geographies. Join each column listed under "Join On" to the unique ID (GEOID or FIPS) of the corresponding join table.

|Table Name|Join On| Join Table|
| :--- | :--- | :--- |
|CTV_x_COUNTY|GEOID_ctv|CTV|
||GEOID_county|COUNTY|
||FIPS_csa|CSA|
||FIPS_cbsa|CBSA|
||FIPS_metdiv|METDIV|
|CTV_x_BLOCK|GEOID_ctv|CTV|
||GEOID_county|COUNTY|
||GEOID_cousub|COUSUB|
||GEOID_place|PLACE|
||GEOID_tract|TRACT|
||GEOID_blockgroup|BLOCKGROUP|
||GEOID_block|BLOCK|
|CTV_x_CD|GEOID_ctv|CTV|
||GEOID_cd|CD|

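As an illustration of how the crosswalks are meant to be used, here is a small sketch that joins CTV to CD through CTV_x_CD. The three DataFrames are toy stand-ins with invented values; only the column names follow the tables above.

```python
import pandas as pd

# Toy stand-ins for the CTV, CTV_x_CD and CD tables (all values are invented)
CTV = pd.DataFrame({"GEOID": ["0100124", "0100460"],
                    "FULL_NAME": ["CITY A", "TOWN B"]})
CTV_x_CD = pd.DataFrame({"GEOID_ctv": ["0100124", "0100124", "0100460"],
                         "GEOID_cd": ["0101", "0102", "0102"]})
CD = pd.DataFrame({"GEOID": ["0101", "0102"],
                   "NAME": ["AL-01", "AL-02"]})

# CTV -> crosswalk on the CTV key, then crosswalk -> CD on the CD key.
# add_suffix renames CD's columns to GEOID_cd / NAME_cd so they line up
# with the crosswalk column and do not collide with CTV's columns.
ctv_cd = (
    CTV.merge(CTV_x_CD, how="left", left_on="GEOID", right_on="GEOID_ctv")
       .merge(CD.add_suffix("_cd"), how="left", on="GEOID_cd")
)
print(ctv_cd[["FULL_NAME", "NAME_cd"]])
```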
In [1]:
#Import Packages
import pandas as pd
import os
from tqdm import tqdm

# County to CSA
County, Metropolitan Divisions, Core-based Statistical Areas, Combined Statistical Areas, County to CSA Crosswalk and County Subdivisions

In [2]:
STATE = pd.read_csv('Tables/states.csv',dtype={'FIPS':'str'})

In [3]:
CSA = pd.read_csv('Tables/csa.csv',
                  usecols=['CSAFP','NAME','ALAND','AWATER','INTPTLAT','INTPTLON'],
                  dtype={'CSAFP':'str'})
CSA.columns = ['FIPS','NAME','ALAND','AWATER','LAT','LON']

In [4]:
CBSA = pd.read_csv('Tables/cbsa.csv',
                   usecols=['CBSAFP','NAME','LSAD','ALAND','AWATER','INTPTLAT','INTPTLON'],
                   dtype={'CBSAFP':'str'})
CBSA.columns = ['FIPS','NAME','TYPE','ALAND','AWATER','LAT','LON']
CBSA['TYPE'] = CBSA['TYPE'].map({'M1':'Metro Area','M2':'Micro Area'})

In [5]:
METDIV = pd.read_csv('Tables/metdiv.csv',
                   usecols=['METDIVFP','NAME','ALAND','AWATER','INTPTLAT','INTPTLON'],
                   dtype={'METDIVFP':'str'})
METDIV.columns = ['FIPS','NAME','ALAND','AWATER','LAT','LON']

In [6]:
COUNTY = pd.read_csv('Tables/counties.csv',
                     usecols=['STATEFP','COUNTYFP','GEOID','NAME','LSAD','CSAFP',
                              'CBSAFP','METDIVFP','ALAND','AWATER','INTPTLAT','INTPTLON'],
                     dtype={'STATEFP':'str','COUNTYFP':'str','GEOID':'str','CSAFP':'str','CBSAFP':'str','METDIVFP':'str'})
COUNTY.columns = ['FIPS_state','FIPS','GEOID','NAME','TYPE','FIPS_csa','FIPS_cbsa','FIPS_metdiv',
                  'ALAND','AWATER','LAT','LON']
Central_Outlying = pd.read_csv('Tables/central_outlying.csv', dtype={'FIPS_state':'str','FIPS':'str'})
COUNTY = COUNTY.merge(Central_Outlying, how='left',on=['FIPS_state','FIPS'])
COUNTY = COUNTY.iloc[:,[2,5,6,7,3,4,12,8,9,10,11]]
CountyCrossWalk = COUNTY.iloc[:,:4]
COUNTY = COUNTY.drop(COUNTY.columns[[1,2,3]], axis = 1)
COUNTY['TYPE'] = COUNTY['TYPE'].map({0:"0",
                                     3: "City and Borough",
                                     4: "Borough",
                                     5: "Census Area",
                                     6: "County",
                                     7: "District",
                                     10: "Island",
                                     12: "Municipality",
                                     13: "Municipio",
                                     15: "Parish",
                                     25: "City"
                                    }
                                   )

In [7]:
COUSUB = pd.read_csv('Tables/cousub.csv',
                    usecols=['GEOID','NAME','LSAD','ALAND','AWATER','INTPTLAT','INTPTLON'],
                    dtype={'GEOID':'str'})
COUSUB.columns = ['GEOID','NAME','TYPE','ALAND','AWATER','LAT','LON']
COUSUB['TYPE'] = COUSUB['TYPE'].map({0:"0",
                                     20:"Barrio",
                                     21:"Borough",
                                     22:"CCD",
                                     23:"Census Subarea",
                                     24:"Census Subdistrict",
                                     25:"City",
                                     26:"County",
                                     27:"District",
                                     28:"District",
                                     29:"Precinct",
                                     30:"Precinct",
                                     31:"Gore",
                                     32:"Grant",
                                     36:"Location",
                                     37:"Municipality",
                                     39:"Plantation",
                                     41:"Barrio-Pueblo",
                                     42:"Purchase",
                                     43:"Town",
                                     44:"Township",
                                     45:"Township",
                                     46:"UT",
                                     47:"Village",
                                     49:"Charter Township",
                                     86:"Reservation"
                                    }
                                   )

# Block to Tract
Block, Blockgroup, Tract and Block Crosswalk

In [8]:
TRACT = pd.read_csv('Tables/tract.csv',
                    usecols=['GEOID','NAMELSAD','ALAND','AWATER','INTPTLAT','INTPTLON'],
                    dtype={'GEOID':'str'})
TRACT.columns = ['GEOID','NAME','ALAND','AWATER','LAT','LON']

In [9]:
BLOCKGROUP = pd.read_csv('Tables/blockgroup.csv',
                         usecols=['GEOID','NAMELSAD','ALAND','AWATER','INTPTLAT','INTPTLON'],
                         dtype={'GEOID':'str'})
BLOCKGROUP.columns = ['GEOID','NAME','ALAND','AWATER','LAT','LON']

In [10]:
# Build the folder path with os.path.join so it also works outside Windows
blockfiles = os.listdir(os.path.join('Tables', 'blocks'))
BLOCK = []

for i in tqdm(blockfiles, total = len(blockfiles)):
    block = pd.read_csv('Tables/blocks/'+i,
                        usecols=['GEOID20','NAME20','UR20','ALAND20','AWATER20',
                                 'INTPTLAT20','INTPTLON20','HOUSING20','POP20'],
                        dtype={'GEOID20':'str'})
    block.columns = ['GEOID','NAME','URBAN_RURAL','ALAND','AWATER','LAT','LON','HOUSEHOLDS','POPULATION']
    block['URBAN_RURAL'] = block['URBAN_RURAL'].map({'U':'Urban','R':'Rural'})
    BLOCK.append(block)

BLOCK = pd.concat(BLOCK)

100%|██████████████████████████████████████████████████████████████████████████████████████| 56/56 [00:09<00:00,  5.84it/s]


In [11]:
BlockCrossWalk = pd.read_csv('Tables/blocksCrosswalk.csv',
                             usecols=['FULLCODE','STATE','COUNTY','TRACT','BLOCK','PLACE','COUSUB'],
                             dtype={'FULLCODE':'str','STATE':'str','COUNTY':'str','TRACT':'str',
                                    'BLOCK':'str','PLACE':'str','COUSUB':'str'})
BlockCrossWalk['GEOID'] = BlockCrossWalk.FULLCODE.str.zfill(15)
BlockCrossWalk['GEOID_county'] = BlockCrossWalk['GEOID'].str[:5]
BlockCrossWalk['GEOID_tract'] = BlockCrossWalk['GEOID'].str[:11]
BlockCrossWalk['GEOID_blockgroup'] = BlockCrossWalk['GEOID'].str[:12]
BlockCrossWalk['GEOID_cousub'] = BlockCrossWalk['GEOID_county'] + BlockCrossWalk['COUSUB']
BlockCrossWalk['GEOID_place'] = BlockCrossWalk['STATE'] + BlockCrossWalk['PLACE']
BlockCrossWalk = BlockCrossWalk.iloc[:,7:]

# Non-hierarchical Geographies
Congressional Districts, Places and Census of Governments

In [12]:
CD = pd.read_csv('Tables/CD.csv',
                usecols=['STATEFP20','GEOID20','CD118FP','NAMELSAD20','ALAND20','AWATER20','INTPTLAT20','INTPTLON20'],
                dtype={'STATEFP20':'str','GEOID20':'str','CD118FP':'str'})
CD.columns = ['FIPS_state','GEOID','FIPS','NAME','ALAND','AWATER','LAT','LON']
CD = CD[CD.FIPS != 'ZZ']
CD = CD.merge(STATE.add_suffix('_state'), how = 'left',on = 'FIPS_state')
CD['NAME'] = CD['CODE_state'] + '-' + CD['FIPS']
CD = CD.iloc[:,[1,3,4,5,6,7]]

In [13]:
PLACE = pd.read_csv('Tables/place.csv',
                    usecols=['GEOID','NAME','LSAD','ALAND','AWATER','INTPTLAT','INTPTLON'],
                    dtype={'GEOID':'str'})
PLACE.columns = ['GEOID','NAME','TYPE','ALAND','AWATER','LAT','LON']
PLACE['TYPE'] = PLACE['TYPE'].map({"00":"00",
                                   "21":"Borough",
                                   "25":"City",
                                   "35":"Metro Township",
                                   "37":"Municipality",
                                   "43":"Town",
                                   "47":"Village",
                                   "53":"City and Borough",
                                   "55":"Comunidad",
                                   "57":"CDP",
                                   "62":"Zona Urbana",
                                   "CG":"Consolidated Government",
                                   "CN":"Corporation",
                                   "MG":"Metropolitan Government",
                                   "UC":"Urban County",
                                   "UG":"Unified Government"
                                  }
                                 )

In [14]:
#census of Governments
COG = pd.read_csv('Tables/CensusofGovernments.csv',
                 usecols=['CENSUS_ID_PID6','UNIT_NAME','UNIT_TYPE','POPULATION','POPULATION_YEAR','FIPS_STATE','FIPS_COUNTY','FIPS_PLACE','COUNTY_AREA_NAME','IS_ACTIVE'],
                 dtype={'CENSUS_ID_PID6':'str','FIPS_STATE':'str','FIPS_COUNTY':'str','FIPS_PLACE':'str','POPULATION':'float'},
                 thousands=',')

# .copy() makes each subset an independent DataFrame, so adding the GEOID column
# below does not raise pandas' SettingWithCopyWarning
COUNTIES = COG[COG.CENSUS_ID_PID6.isin(['164742','183701','170369','136002','176669'])].copy()
COUNTIES['GEOID'] = COUNTIES['FIPS_STATE']+COUNTIES['FIPS_COUNTY']
COUNTIES = COUNTIES.merge(COUNTY, how = 'left',on = 'GEOID')
COUNTIES = COUNTIES.iloc[:,[10,5,6,1,2,12,3,4,14,15,16,17]]
COUNTIES['UNIT_TYPE'] = 'COUNTY'

TOWNSHIPS = COG[(COG.UNIT_TYPE == '3 - TOWNSHIP') & (COG.IS_ACTIVE == 'Y') & ~COG.FIPS_STATE.isin(['17','18','20','26','27','29','31','36','38','39','46','55'])].copy()
TOWNSHIPS['GEOID'] = TOWNSHIPS['FIPS_STATE']+TOWNSHIPS['FIPS_COUNTY']+TOWNSHIPS['FIPS_PLACE']
TOWNSHIPS = TOWNSHIPS.merge(COUSUB, how = 'left',on = 'GEOID')
TOWNSHIPS = TOWNSHIPS.iloc[:,[10,5,6,1,2,12,3,4,13,14,15,16]]
TOWNSHIPS['UNIT_TYPE'] = 'COUSUB'

MUNICIPALITIES = COG[(COG.UNIT_TYPE == '2 - MUNICIPAL') & (COG.IS_ACTIVE == 'Y')].copy()
MUNICIPALITIES['GEOID'] = MUNICIPALITIES['FIPS_STATE']+MUNICIPALITIES['FIPS_PLACE']
MUNICIPALITIES = MUNICIPALITIES.merge(PLACE, how = 'inner', on='GEOID')
MUNICIPALITIES = MUNICIPALITIES.iloc[:,[10,5,6,1,2,12,3,4,13,14,15,16]]
MUNICIPALITIES['UNIT_TYPE'] = 'PLACE'

CTV = pd.concat([COUNTIES, TOWNSHIPS, MUNICIPALITIES])
CTV.columns = ['GEOID','FIPS_state','FIPS_county','FULL_NAME','GEO_UNIT','TYPE','POPULATION','POP_YEAR','ALAND','AWATER','LAT','LON']
CTV = CTV.merge(STATE.add_suffix('_state'),how='left',on='FIPS_state')
CTV = CTV.iloc[:,[0,1,2,3,4,5,12,6,7,8,9,10,11]]

# Crosswalks

In [15]:
#Cog to CD
county2cd = pd.read_csv('Tables/county2cd.csv',
                        usecols=['GEOID_CD118_20','GEOID_COUNTY_20','AREALAND_PART','AREAWATER_PART'],
                        dtype={'GEOID_CD118_20':'str','GEOID_COUNTY_20':'str'})
county2cd['GEOID_COUNTY_20'] = county2cd.GEOID_COUNTY_20.str.zfill(5)
county2cd.rename(columns = {'GEOID_CD118_20':'GEOID_cd','GEOID_COUNTY_20':'GEOID_ctv','AREALAND_PART':'ALAND_part','AREAWATER_PART':'AWATER_part'}, inplace = True)

cousub2cd = pd.read_csv('Tables/cousub2cd.csv',
                        usecols=['GEOID_CD118_20','GEOID_COUSUB_20','AREALAND_PART','AREAWATER_PART'],
                        dtype={'GEOID_CD118_20':'str','GEOID_COUSUB_20':'str'})
cousub2cd['GEOID_COUSUB_20'] = cousub2cd.GEOID_COUSUB_20.str.zfill(10)
cousub2cd.rename(columns = {'GEOID_CD118_20':'GEOID_cd','GEOID_COUSUB_20':'GEOID_ctv','AREALAND_PART':'ALAND_part','AREAWATER_PART':'AWATER_part'}, inplace = True)

place2cd = pd.read_csv('Tables/place2cd.csv',
                       usecols=['GEOID_CD118_20','GEOID_PLACE_20','AREALAND_PART','AREAWATER_PART'],
                       dtype={'GEOID_CD118_20':'str','GEOID_PLACE_20':'str'})
place2cd.rename(columns = {'GEOID_CD118_20':'GEOID_cd','GEOID_PLACE_20':'GEOID_ctv','AREALAND_PART':'ALAND_part','AREAWATER_PART':'AWATER_part'}, inplace = True)

county2cd = county2cd.merge(COUNTIES, how = 'inner',left_on='GEOID_ctv', right_on = 'GEOID').iloc[:,:4]
cousub2cd = cousub2cd.merge(TOWNSHIPS, how = 'inner',left_on='GEOID_ctv', right_on = 'GEOID').iloc[:,:4]
place2cd = place2cd.merge(MUNICIPALITIES, how = 'inner',left_on='GEOID_ctv', right_on = 'GEOID').iloc[:,:4]

CTV_x_CD = pd.concat([county2cd,cousub2cd,place2cd])
CTV_x_CD = CTV_x_CD.iloc[:,[1,0,2,3]]

In [16]:
#Merge with block crosswalk
county_x_block = BlockCrossWalk.merge(COUNTIES, left_on = ['GEOID_county'], right_on = ['GEOID'], how = 'inner', suffixes = ('_block','_ctv')).iloc[:,:8]
township_x_block = BlockCrossWalk.merge(TOWNSHIPS, left_on = ['GEOID_cousub'], right_on = ['GEOID'], how = 'inner', suffixes = ('_block','_ctv')).iloc[:,:8]
place_x_block = BlockCrossWalk.merge(MUNICIPALITIES, left_on = ['GEOID_place'], right_on = ['GEOID'], how = 'inner', suffixes = ('_block','_ctv')).iloc[:,:8]

CTV_x_BLOCK = pd.concat([county_x_block,township_x_block,place_x_block])
CTV_x_BLOCK = CTV_x_BLOCK.iloc[:,[7,1,4,5,2,3,0]]

In [17]:
#Merge with County Crosswalk
CTV_x_COUNTY = CTV_x_BLOCK.iloc[:,:2].drop_duplicates(ignore_index=True).merge(CountyCrossWalk, left_on=['GEOID_county'],right_on=['GEOID'])
CTV_x_COUNTY = CTV_x_COUNTY.iloc[:,[0,1,3,4,5]]

In [18]:
#Table Export to CSV
STATE.to_csv('Output/STATE.csv', index = False)
CSA.to_csv('Output/CSA.csv', index = False)
CBSA.to_csv('Output/CBSA.csv', index = False)
METDIV.to_csv('Output/METDIV.csv', index = False)
COUNTY.to_csv('Output/COUNTY.csv', index = False)
COUSUB.to_csv('Output/COUSUB.csv', index = False)
TRACT.to_csv('Output/TRACT.csv', index = False)
BLOCKGROUP.to_csv('Output/BLOCKGROUP.csv', index = False)
BLOCK.to_csv('Output/BLOCK.csv', index = False)
CD.to_csv('Output/CD.csv', index = False)
PLACE.to_csv('Output/PLACE.csv', index = False)

CTV.to_csv('Output/CTV.csv', index = False)
CTV_x_CD.to_csv('Output/CTV_x_CD.csv', index = False)
CTV_x_BLOCK.to_csv('Output/CTV_x_BLOCK.csv', index = False)
CTV_x_COUNTY.to_csv('Output/CTV_x_COUNTY.csv', index = False)

# Next Steps

* Compare to Netforum and Upload new attributes
* Custom Shapefile

* Stack mysidewalk data by geographies and merge on GEOIDs
* Identify data gaps by geography
* Weighting of geographies and crosswalks based on size and population
    - Primary county vs multiple counties and weighting (by population and area)
    - Filtering out 0 pop blocks (weighting by population and area)
* Network diagram with principal cities/ central/outlying counties and distance calculations
* CTV to PUMAS
* CTV to Zip codes

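One possible starting point for the weighting item above, not something the notebook implements yet: use the `ALAND_part` column carried in CTV_x_CD to compute the share of each CTV's land area that falls in each congressional district, then keep the largest share as the primary match. The values below are invented, and the `area_share` column and `primary` table are hypothetical names.

```python
import pandas as pd

# Toy crosswalk rows shaped like CTV_x_CD (all values are invented)
xwalk = pd.DataFrame({
    "GEOID_ctv": ["0100124", "0100124", "0100460"],
    "GEOID_cd": ["0101", "0102", "0102"],
    "ALAND_part": [750, 250, 400],
})

# Share of each CTV's crosswalked land area that lies in each district
total = xwalk.groupby("GEOID_ctv")["ALAND_part"].transform("sum")
xwalk["area_share"] = xwalk["ALAND_part"] / total

# Keep, for each CTV, the district row holding the largest share
primary = xwalk.loc[xwalk.groupby("GEOID_ctv")["area_share"].idxmax()]
print(primary)
```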
In [19]:
#Principal_Cities = pd.read_csv('Tables/principal_cities.csv',usecols=['FIPS_cbsa','NAME_city','FIPS_state','FIPS_place'], dtype={'FIPS_cbsa':'str','FIPS_state':'str','FIPS_place':'str'})