The goal of CensusPy is to expose the vast amount of data the government collects on US citizens to the broader programming community. Written as a wrapper around existing Census Bureau APIs, CensusPy 1.1.0 currently supports:
- Business Dynamics Statistics (BDS)
- Annual Survey of Entrepreneurs (ASE)
- Decennial Census Surname Files (DCSF)
The end goal is to support all databases provided by the Census Bureau.
CensusPy is available on PyPI, so installation is as simple as:
pip install censuspy
CensusPy supports only Python 3.0 and later.
The Business Dynamics Statistics (BDS) includes measures of establishment openings and closings, firm startups, job creation and destruction by firm size, age, and industrial sector, and several other statistics on business dynamics. The BDS is made up of only one sub-dataset.
Initialize the BDS object using your API key & geographic level of query:
    from censuspy import bds
    state = bds.bds(api_key='YOUR_API_KEY_HERE', geo='state')
Pull total employment numbers for Massachusetts (FIPS code 25) in 2014:
    ma_emp = state.get(metric='emp', code=25, time=2014)
    print(ma_emp)
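CensusPy hides the underlying HTTP request, but it can help to see roughly what a call like the one above might map to. The sketch below is not CensusPy's implementation. It assumes the Census API's general query-string convention (`get`, `for`, `time`, `key`), covers only the `us` and `state` levels, and uses a hypothetical helper name, `build_query`. State FIPS codes are two digits, so the integer 6 is sent as `06`.

```python
from urllib.parse import urlencode


def build_query(metric, geo, time, api_key, code=None):
    """Build a Census-style query string (hypothetical helper, not part of CensusPy)."""
    if geo == "us":
        # National queries need no FIPS code.
        geography = "us:*"
    else:
        if code is None:
            raise ValueError("code is required when geo is not 'us'")
        # State FIPS codes are two digits, e.g. 6 -> "06".
        geography = f"{geo}:{int(code):02d}"
    params = {"get": metric, "for": geography, "time": time, "key": api_key}
    # Keep ':' and '*' readable instead of percent-encoding them.
    return urlencode(params, safe=":*")


print(build_query("emp", "state", 2014, "YOUR_API_KEY_HERE", code=25))
```

The query string would be appended to the dataset's endpoint URL, which is deliberately left out here because it differs per dataset.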
Parameters:
- metric (required): specify metric to pull
  - full BDS variables list
- code (conditionally required): specify state or metro FIPS code
  - only required if geographic level != us
  - FIPS state codes
- time (required): specify time period
  - acceptable values include 1976 - 2014
  - might not return results for every year if there is no data for the specific geography
- sic1 (optional): specify industry sector
  - default = 0 (all included)
  - options listed on BDS website
- fage4 (optional): specify firm age
  - default = 'm' (all included)
  - options listed on BDS website
- fsize (optional): specify firm size
  - default = 'm' (all included)
  - options listed on BDS website
- ifsize (optional): specify initial firm size
  - default = 'm' (all included)
  - options listed on BDS website
Resources:
- General information about the BDS database
- BDS API call examples and supported geographies
- List of available BDS metrics/variables
- FIPS State Codes
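The BDS rules for `time` and `code` described above can be checked before a request is sent. The following is a minimal sketch of such a check; `check_bds_args` is a hypothetical helper written for illustration and is not part of CensusPy, which may validate its inputs differently.

```python
def check_bds_args(geo, time, code=None):
    """Validate BDS arguments before a call (hypothetical helper, not part of CensusPy)."""
    # The documentation lists 1976 - 2014 as the acceptable range for time.
    if not 1976 <= int(time) <= 2014:
        raise ValueError(f"time must be between 1976 and 2014, got {time}")
    # A FIPS code is only required below the national level.
    if geo != "us" and code is None:
        raise ValueError("code is required when geo is not 'us'")
    return True


check_bds_args(geo="state", time=2014, code=25)  # passes
check_bds_args(geo="us", time=1976)              # passes, no code needed
```

Failing early like this avoids spending an API request on a query that cannot return data.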
The Annual Survey of Entrepreneurs (ASE) supplements the 5-year Survey of Business Owners (SBO) program and provides more timely updates on the status, nature, and scope of women-, minority-, and veteran-owned businesses for 2014. The ASE has three sub-datasets:
- Company Summary (CSA)
- Characteristics of Businesses (CSCB)
- Characteristics of Business Owners (CSCBO)
Initialize the ASE object using your API key & geographic level of query, then specify the dataset that you want to access. In this example we will work with the Company Summary (CSA) dataset:
    from censuspy import ase
    state = ase.csa(api_key='YOUR_API_KEY_HERE', geo='state')
Pull total employment numbers for Massachusetts (FIPS code 25) in 2014:
    ma_emp = state.get(metric='emp', code=25)
    print(ma_emp)
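To pull the same metric for several states, the call above can be wrapped in a loop. The sketch below assumes only the `get(metric=..., code=...)` signature shown in the example. `pull_for_states` is a hypothetical helper, and `FakeClient` is a stand-in that returns no real data, so the sketch runs without an API key.

```python
def pull_for_states(client, metric, codes):
    """Call client.get once per FIPS code and collect the results by code."""
    results = {}
    for code in codes:
        results[code] = client.get(metric=metric, code=code)
    return results


class FakeClient:
    """Stand-in for an initialized wrapper object; returns no real data."""

    def get(self, metric, code):
        return f"<{metric} for FIPS {code}>"


# With a real key, pass the object returned by ase.csa(...) instead of FakeClient().
print(pull_for_states(FakeClient(), "emp", [25, 36]))
```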
Provides data for employer businesses by sector, gender, ethnicity, race, veteran status, years in business, receipts size of firm, and employment size of firm for the U.S., states, and the fifty most populous metropolitan statistical areas (MSAs).
Parameters:
- metric (required): specify metric to pull
  - full CSA variables list
- code (conditionally required): specify state or metro FIPS code
  - only required if geographic level != us
  - FIPS state codes
- empszfi (optional): employment size of firms
  - options for CSA empszfi input
- rcpszfi (optional): sales, receipts, and revenue size of firms
  - options for CSA rcpszfi input
- sex (optional): gender code
  - options for CSA sex input
- vet_group (optional): veteran group
  - options for CSA vet_group input
- naics2012 (optional): 2012 NAICS code
  - options for CSA naics2012 input
- yibszfi (optional): years in business
  - options for CSA yibszfi input
- eth_group (optional): ethnicity code
  - options for CSA eth_group input
- race_group (optional): race code
  - options for CSA race_group input
Resources:
- General information about the ASE database
- CSA API call examples and supported geographies
- List of available CSA metrics/variables
- FIPS State Codes
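Because every CSA filter is optional, it can be convenient to collect only the ones that are actually set before calling `get`. The sketch below is one way to do that. `csa_filters` is a hypothetical helper, not part of CensusPy, and the value `"001"` is a placeholder, so consult the options lists above for valid inputs.

```python
# Optional CSA filter names, taken from the parameter list above.
CSA_FILTERS = {
    "empszfi", "rcpszfi", "sex", "vet_group",
    "naics2012", "yibszfi", "eth_group", "race_group",
}


def csa_filters(**kwargs):
    """Keep only the optional CSA filters that were set (hypothetical helper)."""
    unknown = set(kwargs) - CSA_FILTERS
    if unknown:
        raise ValueError(f"unknown CSA filter(s): {sorted(unknown)}")
    # Drop filters left as None so they are not sent at all.
    return {name: value for name, value in kwargs.items() if value is not None}


filters = csa_filters(sex="001", vet_group=None)  # "001" is a placeholder value
print(filters)
```

The resulting dictionary could then be unpacked into a call such as `state.get(metric='emp', code=25, **filters)`.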
Provides data for employer firms by sector, gender, ethnicity, race, veteran status, and years in business for the U.S., states, and fifty most populous MSAs, including detailed business characteristics.
Parameters:
- metric (required): specify metric to pull
  - full CSCB variables list
- code (conditionally required): specify state or metro FIPS code
  - only required if geographic level != us
  - FIPS state codes
- acqbuscap (optional): amount of capital used to start or acquire the business
  - options for CSCB acqbuscap input
- asecb (optional): gender, race, ethnicity, and veteran status code
  - options for CSCB asecb input
- avoidfinan (optional): reasons for avoiding additional financing
  - options for CSCB avoidfinan input
- benefits (optional): employee benefits paid totally or partly by the business
  - options for CSCB benefits input
- busact (optional): business activity characteristics
  - options for CSCB busact input
- busaspir (optional): owner's business aspirations
  - options for CSCB busaspir input
- busoutus (optional): operations outside of the US
  - options for CSCB busoutus input
- ceaseops (optional): whether business is currently operating or, if not, reason for ceasing operations
  - options for CSCB ceaseops input
- cust (optional): customers accounting for 10% or more of total sales of goods/services
  - options for CSCB cust input
- custlocpct (optional): geographic location of business customers/clients
  - options for CSCB custlocpct input
- famown (optional): family owned business codes
  - options for CSCB famown input
- fundsrc (optional): funding sources and total amount of funding
  - options for CSCB fundsrc input
- innovimp (optional): business product/process innovations/improvements in the past three years
  - options for CSCB innovimp input
- intelctprop (optional): owned intellectual property
  - options for CSCB intelctprop input
- lang (optional): languages used to conduct transactions with customers
  - options for CSCB lang input
- naics2012 (optional): 2012 NAICS codes
  - options for CSCB naics2012 input
- negprofit (optional): negative impacts on business profitability
  - options for CSCB negprofit input
- newfundrel (optional): new funding relationships
  - options for CSCB newfundrel input
- opfran (optional): whether the business operated as a franchise
  - options for CSCB opfran input
- outsrcus (optional): business functions or services outsourced to a location outside the US
  - options for CSCB outsrcus input
- ownrnum (optional): number of owners in the business code
  - options for CSCB ownrnum input
- pecommrc (optional): e-commerce sales as a % of total sales
  - options for CSCB pecommrc input
- pexport (optional): export sales as a % of total sales
  - options for CSCB pexport input
- profit (optional): profitability of the business
  - options for CSCB profit input
- rdpuramt (optional): amount used to purchase R&D activities
  - options for CSCB rdpuramt input
- rdtotalcst (optional): total cost of R&D activities
  - options for CSCB rdtotalcst input
- rdworkers (optional): workers that did the R&D activities
  - options for CSCB rdworkers input
- spouses (optional): spouses jointly owned and operated business codes
  - options for CSCB spouses input
- strtsrce (optional): sources of capital used to start or acquire the business
  - options for CSCB strtsrce input
- website (optional): business website codes
  - options for CSCB website input
- workers (optional): types of workers used codes
  - options for CSCB workers input
- yibszfi (optional): years in business
  - options for CSCB yibszfi input
- yrestbus (optional): year business was originally established
  - options for CSCB yrestbus input
Resources:
- General information about the ASE database
- CSCB API call examples and supported geographies
- List of available CSCB metrics/variables
- FIPS State Codes
Provides data for owners of respondent employer firms by sector, gender, ethnicity, race, veteran status, and years in business for the U.S., states, and top fifty most populous MSAs, including detailed owner characteristics.
Parameters:
- metric (required): specify metric to pull
  - only option for CSCBO is ownpdemp and variations on it
  - full CSCBO variables list
- code (conditionally required): specify state or metro FIPS code
  - only required if geographic level != us
  - FIPS state codes
- acqbus (optional): how owner initially acquired business
  - options for CSCBO acqbus input
- asecbo (optional): gender, ethnicity, race, and veteran status code
  - options for CSCBO asecbo input
- educ (optional): highest level of education before establishing business
  - options for CSCBO educ input
- hrswrkd (optional): average hours spent per week managing or working in business
  - options for CSCBO hrswrkd input
- naics2012 (optional): 2012 NAICS codes
  - options for CSCBO naics2012 input
- ownrage (optional): owner's age
  - options for CSCBO ownrage input
- pfnct (optional): primary functions in the business
  - options for CSCBO pfnct input
- priorbus (optional): whether they owned another business prior to establishing current business
  - options for CSCBO priorbus input
- prminc (optional): primary source of personal income
  - options for CSCBO prminc input
- usborncit (optional): whether they are a US born citizen
  - options for CSCBO usborncit input
- yracqbus (optional): year when business was acquired
  - options for CSCBO yracqbus input
Resources:
- General information about the ASE database
- CSCBO API call examples and supported geographies
- List of available CSCBO metrics/variables
- FIPS State Codes
The Census Bureau's Decennial Census Surname Files (DCSF) contain rank and frequency data on surnames reported 100 or more times in the decennial census, along with Hispanic origin and race category percentages. The latter are suppressed where necessary for confidentiality. The data focus on summarized aggregates of counts and characteristics associated with surnames and do not in any way identify specific individuals.
Initialize the DCSF object using your API key & time parameter (2010 or 2000):
    from censuspy import dcsf
    us2010 = dcsf.dcsf(api_key='YOUR_API_KEY_HERE', time=2010)
Pull ranking and count of reported occurrences for "Smith" as a surname:
    us2010_smith = us2010.get(metric='count', name="Smith")
    # the wrapper will return a dictionary with three keys: metric, rank, and name
    # metric will be whatever is passed in the metric parameter (count in this ex.)
    print(us2010_smith['rank'])    # will yield the rank of Smith
    print(us2010_smith['metric'])  # will yield the count
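Since the wrapper returns a dictionary with the keys `metric`, `rank`, and `name`, a small formatting function can make results easier to read. The sketch below assumes only that three-key shape. `describe` is a hypothetical helper, and the sample values are placeholders rather than real census figures.

```python
def describe(result):
    """Format a DCSF result dict with the keys 'name', 'rank', and 'metric'."""
    missing = {"name", "rank", "metric"} - set(result)
    if missing:
        raise KeyError(f"result is missing key(s): {sorted(missing)}")
    return f"{result['name']}: rank {result['rank']}, value {result['metric']}"


# Placeholder values for illustration only; not real census figures.
sample = {"name": "EXAMPLE", "rank": "0", "metric": "0"}
print(describe(sample))
```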
Parameters:
- metric (required): specify metric to pull
  - full DCSF variables list
- time (required): specify time period
  - options include 2010 or 2000
- name (conditionally required): specify the surname you'd like to search for
  - will return "N/A" if surname is not available
- rank (conditionally required): specify a surname rank to search on
  - will return "N/A" if rank is not available
- Either name or rank must be specified; otherwise the wrapper will raise a ValueError for missing parameters
Resources:
- General information about the DCSF database
- DCSF API call examples and supported geographies
- List of available DCSF metrics/variables
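The DCSF lookup rules above (either `name` or `rank` is required, and unavailable values come back as "N/A") can be sketched as two small helpers. Both are hypothetical and not part of CensusPy. `to_int_or_none` additionally assumes that available values are integers or plain digit strings, which the documentation above does not confirm.

```python
def require_name_or_rank(name=None, rank=None):
    """Mirror the documented rule: at least one of name or rank must be given."""
    if name is None and rank is None:
        raise ValueError("missing parameters: specify name or rank")
    # Prefer name when both are supplied (an arbitrary choice for this sketch).
    return {"name": name} if name is not None else {"rank": rank}


def to_int_or_none(value):
    """Convert a returned value to int, mapping the documented "N/A" to None."""
    if value == "N/A":
        return None
    return int(value)


print(require_name_or_rank(name="Smith"))
print(to_int_or_none("N/A"))
```

Mapping "N/A" to `None` keeps downstream arithmetic from silently operating on a string.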
Broadly speaking, my goal is to cover all the business-focused datasets before moving to the purely demographic data. The main motivation behind that is personal, since I'm deriving personal value from developing this wrapper. That being said -- if there is significant interest in exposing a specific dataset, then I'm more than happy to entertain that as well. Please feel free to send any requests to dnrkaseff360@gmail.com.
Roadmap:
- Annual Survey of Entrepreneurs (March 2018) [DONE]
- Decennial Census Surname Files (March 2018) [DONE]
- County Business Patterns and Nonemployer Statistics (April 2018)
- Economic Census (May 2018)
- Economic Indicators (June 2018)
- 0.0.1: initial beta release
- 0.0.2: hot fix to allow imports of specific database wrappers instead of having to import the entire package
- 1.0.0: go live! added support for ASE and implemented minor code changes to make calls more efficient from a resource perspective
- 1.1.0: added support for DCSF
MIT License
Copyright (c) 2018 DnrkasEFF
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.