Working with Geographic Data

Manu Murugesan edited this page Mar 13, 2026 · 2 revisions

medicaid-utils includes geographic classification features for assigning rural/urban status and primary care service areas to beneficiaries based on their ZIP code.

Rural-Urban Classification

The person summary preprocessing automatically assigns rural/urban codes when preprocess=True:

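from medicaid_utils.preprocessing import max_ps

# preprocess=True runs the preprocessing step that assigns the geographic columns
ps = max_ps.MAXPS(year=2012, state="WY", data_root="/data/cms", preprocess=True)
# ps.df now has ruca_code, rucc_code, and rural/urban flags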

RUCA Codes (Rural-Urban Commuting Area)

Based on the USDA RUCA 3.1 classification:

  • Codes 1–3: Urban
  • Codes 4–10: Rural
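
The cutoff above can be sketched as a small function. This is a minimal illustration, not the medicaid-utils implementation: `ruca_is_rural` is a hypothetical helper, and returning `None` for missing or out-of-range codes (such as 99, which USDA uses for "not coded") is an assumption about how such values might reasonably be handled.

```python
import math
from typing import Optional


def ruca_is_rural(ruca_code: Optional[float]) -> Optional[bool]:
    """Hypothetical helper: True for rural, False for urban, None if unknown.

    Secondary RUCA codes carry a decimal part (for example 4.1), so only the
    integer (primary) part is compared against the cutoff.
    """
    if ruca_code is None or math.isnan(ruca_code):
        return None
    primary = int(ruca_code)  # 4.1 -> 4
    if not 1 <= primary <= 10:
        return None  # e.g. 99 ("not coded") is neither rural nor urban
    return primary >= 4


for code in (1, 3, 4.1, 10, 99):
    print(code, ruca_is_rural(code))
```

Checking the range before applying the cutoff matters because a bare `code >= 4` test would label 99 as rural.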

RUCC Codes (Rural-Urban Continuum)

Based on the USDA Rural-Urban Continuum Codes:

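  • Codes 1–7: Urban (non-rural)
  • Codes 8–9: Rural

This cutoff is narrower than USDA's own metropolitan/nonmetropolitan split, which places codes 1–3 in metropolitan counties and codes 4–9 in nonmetropolitan counties.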
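
To see how much the choice of cutoff matters, here is a minimal sketch with the threshold exposed as a parameter. `rucc_is_rural` and `rural_from` are hypothetical names, not part of medicaid-utils. The default of 8 follows the rule listed above, while 4 corresponds to treating every USDA nonmetropolitan code (4–9) as rural.

```python
def rucc_is_rural(rucc_code: int, rural_from: int = 8) -> bool:
    """Hypothetical helper: True when rucc_code is at or above the cutoff."""
    if not 1 <= rucc_code <= 9:
        raise ValueError(f"RUCC codes run from 1 to 9, got {rucc_code!r}")
    return rucc_code >= rural_from


# Codes whose label changes between the two cutoffs
differs = [
    code
    for code in range(1, 10)
    if rucc_is_rural(code) != rucc_is_rural(code, rural_from=4)
]
print(differs)  # [4, 5, 6, 7]
```

Counties with codes 4–7 are the ones to watch when comparing results against studies that define rural as nonmetropolitan.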

Primary Care Service Areas (PCSA)

PCSA codes are assigned based on ZIP code crosswalks from the Dartmouth Atlas.
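
A ZIP-based lookup only works if ZIP codes are compared as zero-padded five-character strings. The sketch below illustrates that step under stated assumptions: the column names `zip` and `pcsa`, the helper names, and the sample rows are all made up for illustration and do not reflect the bundled crosswalk file or the library's internals.

```python
import csv
import io
from typing import Dict, Optional

# Made-up rows for illustration only; not real Dartmouth Atlas data.
SAMPLE_CROSSWALK_CSV = """zip,pcsa
00501,PCSA_A
82001,PCSA_B
"""


def normalize_zip(raw: object) -> Optional[str]:
    """Return a 5-character ZIP string, or None if the value is unusable."""
    text = str(raw).strip().split("-")[0]  # drop a ZIP+4 suffix if present
    if not text.isdigit() or len(text) > 5:
        return None
    return text.zfill(5)  # restore leading zeros lost to integer storage


def load_crosswalk(handle) -> Dict[Optional[str], str]:
    """Read a ZIP-to-PCSA mapping from an open CSV file-like object."""
    return {normalize_zip(row["zip"]): row["pcsa"] for row in csv.DictReader(handle)}


zip_to_pcsa = load_crosswalk(io.StringIO(SAMPLE_CROSSWALK_CSV))
print(zip_to_pcsa.get(normalize_zip(501)))           # PCSA_A
print(zip_to_pcsa.get(normalize_zip("82001-1234")))  # PCSA_B
print(zip_to_pcsa.get(normalize_zip("99999")))       # None
```

ZIP codes that do not appear in the crosswalk come back as `None`, so unmatched beneficiaries stay distinguishable from matched ones.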

Geographic Crosswalk Module

The other_datasets.zip module provides utilities for building geographic crosswalks:

from medicaid_utils.other_datasets import zip as zip_utils

# Generate a ZIP-to-PCSA-to-RUCA crosswalk
zip_utils.generate_zip_pcsa_ruca_crosswalk()
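
This page does not describe the arguments, output, or internals of `generate_zip_pcsa_ruca_crosswalk`. As a rough mental model only, a ZIP-to-PCSA-to-RUCA crosswalk can be thought of as a join of two ZIP-keyed tables. The sketch below uses a hypothetical `build_crosswalk` helper and placeholder values; it is not the library's implementation.

```python
from typing import Dict, List, Optional, Tuple

CrosswalkRow = Tuple[str, Optional[str], Optional[float]]


def build_crosswalk(
    zip_to_pcsa: Dict[str, str], zip_to_ruca: Dict[str, float]
) -> List[CrosswalkRow]:
    """Outer-join two ZIP-keyed mappings into (zip, pcsa, ruca) rows."""
    all_zips = sorted(set(zip_to_pcsa) | set(zip_to_ruca))
    return [(z, zip_to_pcsa.get(z), zip_to_ruca.get(z)) for z in all_zips]


# Placeholder values for illustration only
zip_to_pcsa = {"00001": "PCSA_A", "00002": "PCSA_B"}
zip_to_ruca = {"00002": 4.0, "00003": 10.0}

rows = build_crosswalk(zip_to_pcsa, zip_to_ruca)
for row in rows:
    print(row)
```

An outer join keeps ZIP codes that appear in only one source, with `None` marking the missing side, which makes gaps between the source datasets easy to audit.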

Data Sources

The geographic data bundled with medicaid-utils was compiled from:

| Dataset | Source |
|---|---|
| RUCA 3.1 codes | USDA Economic Research Service |
| RUCC codes | USDA Economic Research Service |
| PCSA crosswalk | Dartmouth Atlas |
| ZIP-ZCTA crosswalk | UDS Mapper |
| ZIP centroids | NBER |