# US Census Bureau Region and Division Codes and State FIPS Codes

**[Work in progress]**

This notebook creates .csv files of US Census Bureau regions, divisions, and state FIPS codes for ingestion into the Knowledge Graph.

![](../../docs/USRegionsDivisions.png)

Data source: [2017 Census Bureau Region and Division Codes and State FIPS Codes](https://www.census.gov/geographies/reference-files/2017/demo/popest/2017-fips.html)

Author: Peter Rose (pwrose@ucsd.edu)

In [1]:
import os
from pathlib import Path
import pandas as pd

In [2]:
pd.options.display.max_rows = None  # display all rows
pd.options.display.max_columns = None  # display all columns

In [3]:
NEO4J_HOME = Path(os.getenv('NEO4J_HOME'))
print(NEO4J_HOME)

/Users/peter/Library/Application Support/Neo4j Desktop/Application/neo4jDatabases/database-4af96121-2328-4e2f-ba60-6d8b728a26d5/installation-4.0.3
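
If `NEO4J_HOME` is not set, `os.getenv` returns `None` and `Path(None)` fails with a `TypeError` that does not name the missing variable. A minimal sketch of a more explicit check follows; the helper name `resolve_import_dir` is hypothetical and not part of this notebook.

```python
import os
from pathlib import Path


def resolve_import_dir(env_var="NEO4J_HOME"):
    """Return <NEO4J_HOME>/import, or raise if the variable is unset or empty."""
    home = os.getenv(env_var)
    if not home:
        # Fail early with a message that names the missing variable
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return Path(home) / "import"
```

The returned directory could then take the place of `NEO4J_HOME / "import/..."` when building the output paths.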


### Read Census Bureau region, division, and state codes

In [4]:
census_url = 'https://www2.census.gov/programs-surveys/popest/geographies/2017/state-geocodes-v2017.xlsx'

In [5]:
df = pd.read_excel(census_url, dtype='str', skiprows=5)

In [6]:
df.head()

Unnamed: 0,Region,Division,State (FIPS),Name
0,1,0,00,Northeast Region
1,1,1,00,New England Division
2,1,1,09,Connecticut
3,1,1,23,Maine
4,1,1,25,Massachusetts
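
Reading every column as text matters because state FIPS codes are two-character strings, and the queries further down compare against `'00'`. The sketch below illustrates the effect on a hand-made three-row sample (not the real Census file), using `pd.read_csv` on an in-memory buffer as a stand-in for `pd.read_excel`; both functions accept a `dtype` argument.

```python
import io

import pandas as pd

# Hypothetical sample in the same column layout as the Census table
sample = io.StringIO(
    "Region,Division,State (FIPS),Name\n"
    "1,0,00,Northeast Region\n"
    "1,1,09,Connecticut\n"
    "3,6,01,Alabama\n"
)

# With dtype=str, values are kept exactly as written
as_text = pd.read_csv(sample, dtype=str)

# Without it, pandas infers integers and the leading zeros are lost
sample.seek(0)
inferred = pd.read_csv(sample)

fips_text = as_text["State (FIPS)"].tolist()
fips_inferred = inferred["State (FIPS)"].tolist()
```

With inferred types, a filter such as `fips == '00'` would match nothing, so the division and state selections depend on the text dtype.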


In [7]:
df.rename(columns={'State (FIPS)': 'fips', 'Name': 'name'}, inplace=True)

### Example: look up a state by name

In [8]:
df.query("name == 'Alabama'")

Unnamed: 0,Region,Division,fips,name
39,3,6,01,Alabama


### Create list of US Regions

In [9]:
# Region rows are the ones with Division == '0'
regions = df.query("Division == '0'").copy()
regions.rename(columns={'Region': 'id'}, inplace=True)
regions['id'] = 'US.' + regions['id']  # e.g. 'US.1'
regions['parentId'] = 'US'  # every region has the parent 'US'
regions = regions[['id', 'name', 'parentId']]

In [10]:
regions.head()

Unnamed: 0,id,name,parentId
0,US.1,Northeast Region,US
12,US.2,Midwest Region,US
27,US.3,South Region,US
48,US.4,West Region,US


In [11]:
regions.to_csv(NEO4J_HOME / "import/00i-USCensus2017Region.csv", index=False)

### Create list of US Divisions

In [12]:
# Division rows have a nonzero Division and the placeholder state code '00'
divisions = df.query("Division != '0'").query("fips == '00'").copy()
divisions.rename(columns={'Division': 'id'}, inplace=True)
divisions['parentId'] = 'US.' + divisions['Region']  # parent region, e.g. 'US.1'
divisions['id'] = 'US.' + divisions['Region'] + '.' + divisions['id']  # e.g. 'US.1.2'
divisions = divisions[['id', 'name', 'parentId']]

In [13]:
divisions.head(20)

Unnamed: 0,id,name,parentId
1,US.1.1,New England Division,US.1
8,US.1.2,Middle Atlantic Division,US.1
13,US.2.3,East North Central Division,US.2
19,US.2.4,West North Central Division,US.2
28,US.3.5,South Atlantic Division,US.3
38,US.3.6,East South Central Division,US.3
43,US.3.7,West South Central Division,US.3
49,US.4.8,Mountain Division,US.4
58,US.4.9,Pacific Division,US.4


In [14]:
divisions.to_csv(NEO4J_HOME / "import/00i-USCensus2017Division.csv", index=False)
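
Because each `parentId` is meant to point at an `id` one level up, it can be worth checking for dangling references before the files are ingested. The following is a sketch of such a check, not part of the original workflow; `find_orphans` and the demo frames are hypothetical, and `US.9` is deliberately invalid.

```python
import pandas as pd


def find_orphans(children, parents):
    """Return rows of `children` whose parentId is not an id in `parents`."""
    return children[~children["parentId"].isin(parents["id"])]


# Hand-made rows that follow the notebook's id scheme
regions_demo = pd.DataFrame({"id": ["US.1", "US.2"], "parentId": ["US", "US"]})
divisions_demo = pd.DataFrame(
    {
        "id": ["US.1.1", "US.2.3", "US.9.9"],
        "parentId": ["US.1", "US.2", "US.9"],  # "US.9" has no matching region
    }
)

orphans = find_orphans(divisions_demo, regions_demo)
```

Applied to the real frames, `find_orphans(divisions, regions)` should come back empty; the same call works for `states` against `divisions`.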

### Create list of US State FIPS codes

In [15]:
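# State rows are the remaining ones: nonzero Division and a FIPS code other than '00'
states = df.query("Division != '0'").query("fips != '00'").copy()
states['parentId'] = 'US.' + states['Region'] + '.' + states['Division']  # parent division, e.g. 'US.1.1'
states = states[['name', 'fips', 'parentId']]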

#### Rename District of Columbia to match GeoNames.org (Washington, D.C.)

In [16]:
states['name'] = states['name'].str.replace('District of Columbia', 'Washington, D.C.')

In [17]:
states.head(100)

Unnamed: 0,name,fips,parentId
2,Connecticut,09,US.1.1
3,Maine,23,US.1.1
4,Massachusetts,25,US.1.1
5,New Hampshire,33,US.1.1
6,Rhode Island,44,US.1.1
7,Vermont,50,US.1.1
9,New Jersey,34,US.1.2
10,New York,36,US.1.2
11,Pennsylvania,42,US.1.2
14,Illinois,17,US.2.3


In [18]:
states.to_csv(NEO4J_HOME / "import/00i-USCensus2017State.csv", index=False)