Skip to content

Latest commit

 

History

History
210 lines (175 loc) · 6.52 KB

README.md

File metadata and controls

210 lines (175 loc) · 6.52 KB

python-censusbatchgeocoder

A simple Python wrapper for U.S. Census Geocoding Services API batch service.

Build Status PyPI version Coverage Status

Installation

$ pip install censusbatchgeocoder

Basic usage

Importing the library

>>> import censusbatchgeocoder

According to the official Census documentation, the input is expected to contain the following fields:

  • id: Your unique identifier for the record
  • address: Structure number and street name (required)
  • city: City name (required)
  • state: State (optional)
  • zipcode: ZIP Code (optional)

You can geocode a comma-delimited file from the filesystem. Results are returned as a list of dictionaries.

An example could look like this:

id,address,city,state,zipcode
1,1600 Pennsylvania Ave NW,Washington,DC,20006
2,202 W. 1st Street,Los Angeles,CA,90012

Which is then passed in like this:

>>> results = censusbatchgeocoder.geocode("./my_file.csv")

The results are returned with the following columns from the Census

  • id: The unique id provided with the record.
  • returned_address: The address that was submitted to the geocoder.
  • geocoded_address: The address of the match returned by the geocoder.
  • is_match: Whether or not the geocoder found a match.
  • is_exact: The precision of the match.
  • coordinates: The longitude and latitude of the match.
  • tiger_line: The Census TIGER line of the match.
  • side: The side of the Census TIGER line of the match.
  • state_fips: The FIPS state code identifying the state of the match.
  • county_fips: The FIPS county code identifying the county of the match.
  • tract: The Census tract of the match.
>>> print results
[{'geocoded_address': '202 W. 1st Street, Los Angeles, CA, 90012',
  'block': '1034',
  'coordinates': '-118.24456,34.053005',
  'county_fips': '037',
  'returned_address': '202 W 1ST ST, LOS ANGELES, CA, 90012',
  'id': '2',
  'is_exact': 'Exact',
  'is_match': 'Match',
  'side': 'L',
  'state_fips': '06',
  'tiger_line': '141618115',
  'tract': '207400'},
 {'geocoded_address': '1600 Pennsylvania Ave NW, Washington, DC, 20006',
  'block': '1031',
  'coordinates': '-77.03535,38.898754',
  'county_fips': '001',
  'returned_address': '1600 PENNSYLVANIA AVE NW, WASHINGTON, DC, 20502',
  'id': '1',
  'is_exact': 'Non_Exact',
  'is_match': 'Match',
  'side': 'L',
  'state_fips': '11',
  'tiger_line': '76225813',
  'tract': '006202'}]

Any extra metadata fields included in the file are still present in the returned data.

So the my_metadata column here...

id,address,city,state,zipcode,my_metadata
1,1600 Pennsylvania Ave NW,Washington,DC,20006,foo
2,202 W. 1st Street,Los Angeles,CA,90012,bar

.. is still there after you geocode.

>>> censusbatchgeocoder.geocode("./my_file.csv")
[{'address': '1600 Pennsylvania Ave NW',
  'block': '1031',
  'city': 'Washington',
  'coordinates': '-77.03535,38.898754',
  'county_fips': '001',
  'geocoded_address': '1600 Pennsylvania Ave NW, Washington, DC, 20006',
  'id': '1',
  'is_exact': 'Non_Exact',
  'is_match': 'Match',
  'my_metadata': 'foo',
  'returned_address': '1600 PENNSYLVANIA AVE NW, WASHINGTON, DC, 20502',
  'side': 'L',
  'state': 'DC',
  'state_fips': '11',
  'tiger_line': '76225813',
  'tract': '006202',
  'zipcode': '20006'},
 {'address': '202 W. 1st Street',
  'block': '1034',
  'city': 'Los Angeles',
  'coordinates': '-118.24456,34.053005',
  'county_fips': '037',
  'geocoded_address': '202 W. 1st Street, Los Angeles, CA, 90012',
  'id': '2',
  'is_exact': 'Exact',
  'is_match': 'Match',
  'my_metadata': 'bar',
  'returned_address': '202 W 1ST ST, LOS ANGELES, CA, 90012',
  'side': 'L',
  'state': 'CA',
  'state_fips': '06',
  'tiger_line': '141618115',
  'tract': '207400',
  'zipcode': '90012'}]

Custom column names

If you column headers do not exactly match those expected by the geocoder you can override themself.

So a file like this:

foo,bar,baz,bada,boom
1,521 SWARTHMORE AVENUE,PACIFIC PALISADES,CA,90272-4350
2,2015 W TEMPLE STREET,LOS ANGELES,CA,90026-4913

Can be mapped like this:

>>> censusbatchgeocoder.geocode(
    self.weird_path,
    id="foo",
    address="bar",
    city="baz",
    state="bada",
    zipcode="boom"
)

Optional columns

The state and ZIP Code columns are optional. If your data doesn't have them, pass None as keyword arguments.

>>> censusbatchgeocoder.geocode("./my_file.csv", state=None, zipcode=None)

File objects

You can also geocode an in-memory file object of data in CSV format.

>>> my_data = """id,address,city,state,zipcode
1,1600 Pennsylvania Ave NW,Washington,DC,20006
2,202 W. 1st Street,Los Angeles,CA,90012"""
>>> censusbatchgeocoder.geocode(io.StringIO(my_data))
[{'geocoded_address': '202 W. 1st Street, Los Angeles, CA, 90012',
  'block': '1034',
  'coordinates': '-118.24456,34.053005',
  'county_fips': '037',
  'returned_address': '202 W 1ST ST, LOS ANGELES, CA, 90012',
  'id': '2',
  'is_exact': 'Exact',
  'is_match': 'Match',
  'side': 'L',
  'state_fips': '06',
  'tiger_line': '141618115',
  'tract': '207400'},
 {'geocoded_address': '1600 Pennsylvania Ave NW, Washington, DC, 20006',
  'block': '1031',
  'coordinates': '-77.03535,38.898754',
  'county_fips': '001',
  'returned_address': '1600 PENNSYLVANIA AVE NW, WASHINGTON, DC, 20502',
  'id': '1',
  'is_exact': 'Non_Exact',
  'is_match': 'Match',
  'side': 'L',
  'state_fips': '11',
  'tiger_line': '76225813',
  'tract': '006202'}]