# Municipalities

Exploratory data analysis of the raw 2020 TIGER/Line Shapefiles for U.S. census places and county subdivisions.

### Summary

To identify municipalities, it is necessary to reference both U.S. census places and county subdivisions:

**Places.** The raw 2020 dataset of U.S. places is split into 56 files. Each record represents a place, defined as a "concentration of population [... that] may or may not have legally prescribed limits, powers, or functions. This concentration of population must have a name, be locally recognized, and not be part of any other place" ([Reference](https://www2.census.gov/geo/pdfs/reference/GARM/Ch9GARM.pdf)). There are 32,188 places in total, spanning the 50 U.S. states, the District of Columbia, American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands. In addition to the geometry column, relevant columns include the geography name (`NAME`) and the computed name (`NAMELSAD`), which appends the type of political subdivision (e.g., town or village) to the name, as in `Osgood village`. The dataset uses the coordinate reference system (CRS) EPSG:4269 (NAD83), which is commonly used by U.S. federal agencies.

**County Subdivisions.** The U.S. Census Bureau defines county subdivisions as "minor civil divisions (MCDs) or census county divisions (CCDs). A State has either MCDs or their statistical equivalents, or CCDs; it cannot contain both. [...] In the State of Alaska, which has no counties and no MCDs, the Census Bureau and State officials have established census subareas (CSAs) as the statistical equivalents of MCDs" ([Reference](https://www2.census.gov/geo/pdfs/reference/GARM/Ch8GARM.pdf)). The raw dataset of U.S. county subdivisions is also split into 56 files covering the same geographic extent. It contains 36,639 records and uses the same CRS, EPSG:4269. Relevant columns include the geometry, name (`NAME`), and computed name (`NAMELSAD`).
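
Some entities appear in both datasets. In the Ohio listings under Exploration, for example, `Akron city` is both a place and a county subdivision, while `Ada village` appears only among the places. The sketch below shows one way the two sources might be combined without double counting. It is an assumption-laden illustration, not the method used in this notebook: the helper name `merge_unique` is hypothetical, the records are plain dictionaries rather than GeoDataFrame rows, and matching on state code plus `NAMELSAD` is only a rough stand-in for a proper identifier-based or spatial match.

```python
def merge_unique(places, cousubs):
    """Keep every county subdivision, then add places not already present.

    Matching key (an assumption): state FIPS code plus NAMELSAD.
    """
    seen = {(rec["STATEFP"], rec["NAMELSAD"]) for rec in cousubs}
    merged = list(cousubs)
    for rec in places:
        key = (rec["STATEFP"], rec["NAMELSAD"])
        if key not in seen:
            merged.append(rec)
            seen.add(key)
    return merged


# Sample records using names that appear in the Ohio output.
places = [
    {"STATEFP": "39", "NAMELSAD": "Akron city"},
    {"STATEFP": "39", "NAMELSAD": "Ada village"},
]
cousubs = [
    {"STATEFP": "39", "NAMELSAD": "Akron city"},
    {"STATEFP": "39", "NAMELSAD": "Adams township"},
    {"STATEFP": "39", "NAMELSAD": "Adams township"},
]

merged = merge_unique(places, cousubs)
print([rec["NAMELSAD"] for rec in merged])
```

County subdivisions are kept as-is here because several of them legitimately share a name (the repeated `Adams township` entries lie in different counties), so de-duplicating them by name would discard real records.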

### Exploration

Examine census places.

In [12]:
import geopandas as gpd
import glob

In [25]:
fpaths = glob.glob("../data/raw/census/places/tl_2020_39_place.zip")
fpaths.sort()
print(len(fpaths))

1
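
The pattern above matches a single file (the `39` in the name is Ohio's state FIPS code). As a hedged sketch, a wildcard in the same position could pick up every per-state file, and the FIPS code can be recovered from each file name. The `tl_2020_<FIPS>_place.zip` naming is inferred from the one path shown here, and the example creates empty placeholder files in a temporary directory so that it runs without the real data.

```python
import glob
import os
import re
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create empty stand-ins for three per-state archives (placeholders only).
    for fips in ("39", "20", "01"):
        open(os.path.join(tmpdir, f"tl_2020_{fips}_place.zip"), "w").close()

    # The wildcard replaces the two-digit state FIPS code.
    fpaths = sorted(glob.glob(os.path.join(tmpdir, "tl_2020_*_place.zip")))

    # Pull the FIPS code back out of each matched file name.
    state_fips = [
        re.search(r"tl_2020_(\d{2})_place\.zip$", path).group(1)
        for path in fpaths
    ]

print(len(fpaths), state_fips)
```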


In [26]:
gdf = gpd.read_file(fpaths[0])
gdf.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 1265 entries, 0 to 1264
Data columns (total 17 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   STATEFP   1265 non-null   object  
 1   PLACEFP   1265 non-null   object  
 2   PLACENS   1265 non-null   object  
 3   GEOID     1265 non-null   object  
 4   NAME      1265 non-null   object  
 5   NAMELSAD  1265 non-null   object  
 6   LSAD      1265 non-null   object  
 7   CLASSFP   1265 non-null   object  
 8   PCICBSA   1265 non-null   object  
 9   PCINECTA  1265 non-null   object  
 10  MTFCC     1265 non-null   object  
 11  FUNCSTAT  1265 non-null   object  
 12  ALAND     1265 non-null   int64   
 13  AWATER    1265 non-null   int64   
 14  INTPTLAT  1265 non-null   object  
 15  INTPTLON  1265 non-null   object  
 16  geometry  1265 non-null   geometry
dtypes: geometry(1), int64(2), object(14)
memory usage: 168.1+ KB


In [15]:
gdf.head(2)

|  | STATEFP | PLACEFP | PLACENS | GEOID | NAME | NAMELSAD | LSAD | CLASSFP | PCICBSA | PCINECTA | MTFCC | FUNCSTAT | ALAND | AWATER | INTPTLAT | INTPTLON | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 39 | 58912 | 2399590 | 3958912 | Osgood | Osgood village | 47 | C1 | N | N | G4110 | A | 902595 | 0 | 40.339539 | -84.4960179 | POLYGON ((-84.50548 40.33951, -84.50548 40.339... |
| 1 | 39 | 83972 | 2400154 | 3983972 | Weston | Weston village | 47 | C1 | N | N | G4110 | A | 2951747 | 6786 | 41.3459807 | -83.7946092 | POLYGON ((-83.80560 41.35394, -83.80521 41.353... |


In [6]:
gdf.crs

<Geographic 2D CRS: EPSG:4269>
Name: NAD83
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: North America - onshore and offshore: Canada - Alberta; British Columbia; Manitoba; New Brunswick; Newfoundland and Labrador; Northwest Territories; Nova Scotia; Nunavut; Ontario; Prince Edward Island; Quebec; Saskatchewan; Yukon. Puerto Rico. United States (USA) - Alabama; Alaska; Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Hawaii; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming. US Virgin Islands. British Virgin Islands

In [7]:
num_places = 0
class_name = []
crs_name = []

for fpath in fpaths:
    gdf = gpd.read_file(fpath)
    num_places += len(gdf)
    class_name.extend(gdf["CLASSFP"].unique().tolist())
    crs_name.append(gdf.crs.name)

In [8]:
print(f"There are {num_places:,} place(s) spanning the 50 U.S. states, "
      f"the District of Columbia, and the U.S. territories.")

There are 32,188 place(s) spanning the 50 U.S. states, the District of Columbia, and the U.S. territories.


In [27]:
for p in gdf["NAMELSAD"].sort_values().tolist():
    print(p)

Aberdeen village
Ada village
Adamsville village
Addyston village
Adelphi village
Adena village
Ai CDP
Akron city
Albany village
Alexandria village
Alger village
Alliance city
Alvordton CDP
Amanda village
Amberley village
Amelia CDP
Amesville village
Amherst city
Amsterdam village
Andersonville CDP
Andover village
Anna village
Ansonia village
Antioch village
Antwerp village
Apple Creek village
Apple Valley CDP
Aquilla village
Arcadia village
Arcanum village
Archbold village
Arlington Heights village
Arlington village
Ashland city
Ashley village
Ashtabula city
Ashville village
Athalia village
Athens city
Attica village
Atwater CDP
Aurora city
Austinburg CDP
Austintown CDP
Avon Lake city
Avon city
Bailey Lakes village
Bainbridge CDP
Bainbridge village
Bairdstown village
Ballville CDP
Baltic village
Baltimore village
Bannock CDP
Barberton city
Barnesville village
Barnhill village
Bascom CDP
Bass Lake CDP
Batavia village
Batesville village
Bay View village
Bay Village city
Beach City villag

In [8]:
set(sorted(class_name))
# Exclude C9, M2

{'C1', 'C2', 'C5', 'C6', 'C7', 'C8', 'C9', 'M2', 'U1', 'U2'}
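
The comment in the cell above flags the class codes `C9` and `M2` for exclusion. A minimal sketch of that filter follows, using plain dictionaries so that it runs without the shapefiles. The helper name `drop_excluded` and the two "Example place" records are hypothetical; only `Osgood village` (class `C1`) comes from the data shown earlier.

```python
EXCLUDED_PLACE_CLASSES = {"C9", "M2"}  # codes named in the notebook comment


def drop_excluded(records, excluded=EXCLUDED_PLACE_CLASSES):
    """Return only the records whose CLASSFP is not in the excluded set."""
    return [rec for rec in records if rec["CLASSFP"] not in excluded]


sample = [
    {"NAMELSAD": "Osgood village", "CLASSFP": "C1"},
    {"NAMELSAD": "Example place A", "CLASSFP": "C9"},  # made-up record
    {"NAMELSAD": "Example place B", "CLASSFP": "M2"},  # made-up record
]

kept = drop_excluded(sample)
print([rec["NAMELSAD"] for rec in kept])

# On a GeoDataFrame the equivalent would be roughly:
#   gdf[~gdf["CLASSFP"].isin(list(EXCLUDED_PLACE_CLASSES))]
```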

In [9]:
set(sorted(crs_name))

{'NAD83'}

Examine county subdivisions.

In [33]:
fpaths = glob.glob("../data/raw/census/county_subdivisions/tl_2020_20_cousub.zip")
fpaths.sort()
print(len(fpaths))

1


In [34]:
gdf = gpd.read_file(fpaths[0])
gdf.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 1531 entries, 0 to 1530
Data columns (total 19 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   STATEFP   1531 non-null   object  
 1   COUNTYFP  1531 non-null   object  
 2   COUSUBFP  1531 non-null   object  
 3   COUSUBNS  1531 non-null   object  
 4   GEOID     1531 non-null   object  
 5   NAME      1531 non-null   object  
 6   NAMELSAD  1531 non-null   object  
 7   LSAD      1531 non-null   object  
 8   CLASSFP   1531 non-null   object  
 9   MTFCC     1531 non-null   object  
 10  CNECTAFP  0 non-null      float64 
 11  NECTAFP   0 non-null      float64 
 12  NCTADVFP  0 non-null      float64 
 13  FUNCSTAT  1531 non-null   object  
 14  ALAND     1531 non-null   int64   
 15  AWATER    1531 non-null   int64   
 16  INTPTLAT  1531 non-null   object  
 17  INTPTLON  1531 non-null   object  
 18  geometry  1531 non-null   geometry
dtypes: float64(3), geometry(1), int64(2), ob

In [38]:
gdf.query("GEOID == '2015171242'")

|  | STATEFP | COUNTYFP | COUSUBFP | COUSUBNS | GEOID | NAME | NAMELSAD | LSAD | CLASSFP | MTFCC | CNECTAFP | NECTAFP | NCTADVFP | FUNCSTAT | ALAND | AWATER | INTPTLAT | INTPTLON | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 909 | 20 | 151 | 71242 | 470371 | 2015171242 | 10 | Township 10 | 45 | T1 | G4040 |  |  |  | A | 187584433 | 1333 | 37.5123571 | -98.8940644 | POLYGON ((-99.01404 37.55687, -99.00651 37.556... |


In [30]:
gdf.query("NAMELSAD == 'Adams township'")

|  | STATEFP | COUNTYFP | COUSUBFP | COUSUBNS | GEOID | NAME | NAMELSAD | LSAD | CLASSFP | MTFCC | CNECTAFP | NECTAFP | NCTADVFP | FUNCSTAT | ALAND | AWATER | INTPTLAT | INTPTLON | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 273 | 39 | 147 | 324 | 1086940 | 3914700324 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 93026010 | 466243 | 41.2078427 | -83.0099899 | POLYGON ((-83.07439 41.18911, -83.07428 41.189... |
| 279 | 39 | 111 | 296 | 1086646 | 3911100296 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 58194840 | 14626 | 39.7766286 | -80.9938494 | POLYGON ((-81.05586 39.78148, -81.05584 39.781... |
| 485 | 39 | 21 | 212 | 1085838 | 3902100212 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 80420471 | 0 | 40.2336238 | -83.9599249 | POLYGON ((-84.02132 40.20175, -84.02110 40.203... |
| 613 | 39 | 59 | 282 | 1086177 | 3905900282 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 65338471 | 0 | 40.0407069 | -81.6754846 | POLYGON ((-81.72419 40.00652, -81.72411 40.008... |
| 813 | 39 | 167 | 338 | 1087124 | 3916700338 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 80095875 | 1637609 | 39.5444874 | -81.5273754 | POLYGON ((-81.58474 39.57361, -81.58470 39.574... |
| 835 | 39 | 27 | 226 | 1085876 | 3902700226 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 55579196 | 607316 | 39.4464589 | -83.924718 | POLYGON ((-83.98866 39.44266, -83.98865 39.442... |
| 891 | 39 | 31 | 240 | 1085909 | 3903100240 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 66407404 | 0 | 40.3328567 | -81.6658887 | POLYGON ((-81.71515 40.30438, -81.71508 40.310... |
| 928 | 39 | 39 | 268 | 1086029 | 3903900268 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 92178444 | 0 | 41.3840276 | -84.2847172 | POLYGON ((-84.34169 41.35593, -84.34168 41.362... |
| 935 | 39 | 119 | 310 | 1086713 | 3911900310 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 64937108 | 731157 | 40.1250297 | -81.8587827 | POLYGON ((-81.90849 40.08681, -81.90820 40.092... |
| 1183 | 39 | 37 | 254 | 1086009 | 3903700254 | Adams | Adams township | 44 | T1 | G4040 |  |  |  | A | 96223877 | 830136 | 40.1303431 | -84.4890801 | POLYGON ((-84.56120 40.14196, -84.56122 40.146... |
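
The query result above returns ten rows named `Adams township`, each with a different `GEOID`, which suggests that `NAMELSAD` alone is not a safe key for county subdivisions. The sketch below illustrates that check with `collections.Counter` on a small hand-copied sample of `(GEOID, NAMELSAD)` pairs taken from this notebook's output; it is an illustration rather than a full validation of the dataset.

```python
from collections import Counter

# (GEOID, NAMELSAD) pairs copied from the Ohio county subdivision output.
rows = [
    ("3914700324", "Adams township"),
    ("3911100296", "Adams township"),
    ("3902100212", "Adams township"),
    ("3911548930", "Meigsville township"),
]

# Names that occur more than once in the sample.
name_counts = Counter(name for _, name in rows)
duplicated = sorted(name for name, count in name_counts.items() if count > 1)

# GEOIDs, by contrast, are distinct for every row in the sample.
geoids_unique = len({geoid for geoid, _ in rows}) == len(rows)

print(duplicated, geoids_unique)
```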


In [22]:
gdf.head(2)

|  | STATEFP | COUNTYFP | COUSUBFP | COUSUBNS | GEOID | NAME | NAMELSAD | LSAD | CLASSFP | MTFCC | CNECTAFP | NECTAFP | NCTADVFP | FUNCSTAT | ALAND | AWATER | INTPTLAT | INTPTLON | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 39 | 115 | 48930 | 1086691 | 3911548930 | Meigsville | Meigsville township | 44 | T1 | G4040 |  |  |  | A | 80606792 | 612759 | 39.6373963 | -81.7573546 | POLYGON ((-81.82489 39.62507, -81.82457 39.630... |
| 1 | 39 | 115 | 52122 | 1086692 | 3911552122 | Morgan | Morgan township | 44 | T1 | G4040 |  |  |  | A | 30935491 | 1071578 | 39.6607406 | -81.8405166 | POLYGON ((-81.88911 39.68041, -81.88901 39.680... |
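
In both datasets the `GEOID` values look like a concatenation of the FIPS components. The place `3958912` reads as state `39` plus place `58912`, and the county subdivision `3911548930` reads as state `39`, county `115`, and subdivision `48930`. The sketch below splits a `GEOID` on that basis, assuming fixed widths of 2, 3, and 5 digits; the helper name `split_geoid` is hypothetical.

```python
def split_geoid(geoid):
    """Split a 7-digit place GEOID or a 10-digit county subdivision GEOID."""
    if len(geoid) == 7:
        return {"STATEFP": geoid[:2], "PLACEFP": geoid[2:]}
    if len(geoid) == 10:
        return {
            "STATEFP": geoid[:2],
            "COUNTYFP": geoid[2:5],
            "COUSUBFP": geoid[5:],
        }
    raise ValueError(f"unexpected GEOID length: {geoid!r}")


place_parts = split_geoid("3958912")       # Osgood village
cousub_parts = split_geoid("3911548930")   # Meigsville township
padded_parts = split_geoid("3902100212")   # one of the Adams townships

print(place_parts)
print(cousub_parts)
print(padded_parts)
```

The third call yields `021` and `00212`, whereas the rendered table shows `21` and `212` for that row. The leading zeros were presumably dropped when the output was rendered, since the columns are stored as strings (`object` dtype).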


In [13]:
gdf.crs

<Geographic 2D CRS: EPSG:4269>
Name: NAD83
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: North America - onshore and offshore: Canada - Alberta; British Columbia; Manitoba; New Brunswick; Newfoundland and Labrador; Northwest Territories; Nova Scotia; Nunavut; Ontario; Prince Edward Island; Quebec; Saskatchewan; Yukon. Puerto Rico. United States (USA) - Alabama; Alaska; Arizona; Arkansas; California; Colorado; Connecticut; Delaware; Florida; Georgia; Hawaii; Idaho; Illinois; Indiana; Iowa; Kansas; Kentucky; Louisiana; Maine; Maryland; Massachusetts; Michigan; Minnesota; Mississippi; Missouri; Montana; Nebraska; Nevada; New Hampshire; New Jersey; New Mexico; New York; North Carolina; North Dakota; Ohio; Oklahoma; Oregon; Pennsylvania; Rhode Island; South Carolina; South Dakota; Tennessee; Texas; Utah; Vermont; Virginia; Washington; West Virginia; Wisconsin; Wyoming. US Virgin Islands. British Virgin Islands

In [14]:
num_divisions = 0
class_name = []
crs_name = []

for fpath in fpaths:
    gdf = gpd.read_file(fpath)
    num_divisions += len(gdf)
    class_name.extend(gdf["CLASSFP"].unique().tolist())
    crs_name.append(gdf.crs.name)

In [15]:
print(f"There are {num_divisions:,} county subdivision(s) spanning the 50 U.S. "
      f"states, the District of Columbia, and the U.S. territories.")

There are 36,639 county subdivision(s) spanning the 50 U.S. states, the District of Columbia, and the U.S. territories.


In [24]:
for p in gdf["NAMELSAD"].sort_values().tolist():
    print(p)

Adams township
Adams township
Adams township
Adams township
Adams township
Adams township
Adams township
Adams township
Adams township
Adams township
Addison township
Aid township
Akron city
Alexander township
Allen township
Allen township
Allen township
Allen township
Alliance city
Amanda township
Amanda township
Amanda township
Amberley village
Amboy township
American township
Ames township
Amherst city
Amherst township
Anderson township
Andover township
Antrim township
Archer township
Arlington Heights village
Arlington village
Ashland city
Ashley village
Ashtabula township
Athens township
Athens township
Atwater township
Auburn township
Auburn township
Auburn township
Auglaize township
Auglaize township
Augusta township
Aurelius township
Aurora city
Austinburg township
Austintown township
Avon Lake city
Avon city
Bainbridge township
Ballville township
Barberton city
Barlow township
Bartlow township
Batavia township
Bath township
Bath township
Bath township
Baughman township
Bay Vil

In [16]:
set(sorted(class_name))
# Exclude T9, Z1 (inactive), Z9

{'C2', 'C5', 'T1', 'T5', 'T9', 'Z1', 'Z2', 'Z3', 'Z5', 'Z7', 'Z9'}

In [17]:
set(sorted(crs_name))

{'NAD83'}