# Example of using the package 'community_ry2403'

### Background
A researcher needs to look up some community-level data from the US Census Bureau. However, he found that some of the data are reported at the ZIP Code level and others at the ZIP Code Tabulation Area (ZCTA) level. To keep the unit of analysis consistent, he needs to convert, or aggregate, between the two levels. Then he found a package...

### Prepare the library
https://test.pypi.org/project/community-ry2403/

In [62]:
!pip install -i https://test.pypi.org/pypi/ --extra-index-url https://pypi.org/simple community-ry2403

Looking in indexes: https://test.pypi.org/pypi/, https://pypi.org/simple


# 1: Search variables with fuzzy search

**Specifically, he is interested in data on households by marital status, but he doesn't know the exact variable codes in the Census survey. So he will use the class `variables` and its method `find_variable()`.**

In [26]:
from community_ry2403.community_ry2403 import variables
f = variables()   #Initialize the class variables
f.find_variable(keyword='total households by marry')

| | variable | label |
|---|---|---|
| 0 | DP02_0001E | HOUSEHOLDS BY TYPE!!Total households |
| 1 | DP02_0001PE | HOUSEHOLDS BY TYPE!!Total households |
| 2 | DP02_0002E | HOUSEHOLDS BY TYPE!!Total households!!Married-... |
| 3 | DP02_0002PE | HOUSEHOLDS BY TYPE!!Total households!!Married-... |
| 4 | DP02_0003E | HOUSEHOLDS BY TYPE!!Total households!!Married-... |
| ... | ... | ... |
| 123 | DP03_0072PE | INCOME AND BENEFITS (IN 2019 INFLATION-ADJUSTE... |
| 124 | DP03_0073E | INCOME AND BENEFITS (IN 2019 INFLATION-ADJUSTE... |
| 125 | DP03_0073PE | INCOME AND BENEFITS (IN 2019 INFLATION-ADJUSTE... |
| 126 | DP03_0074E | INCOME AND BENEFITS (IN 2019 INFLATION-ADJUSTE... |


**The results show some relevant variables and their labels.**
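
The source does not say how `find_variable()` matches a keyword against labels. As a rough illustration of what a token-based fuzzy search could look like, here is a minimal sketch using the standard-library `difflib`. The catalogue codes (`X001` to `X003`), two of the three labels, the helper names, and the `cutoff` and `min_hits` thresholds are all hypothetical and are not taken from the package.

```python
import difflib
import re

# Hypothetical catalogue: codes and labels are made up for illustration only.
catalogue = {
    "X001": "HOUSEHOLDS BY TYPE!!Total households",
    "X002": "HOUSEHOLDS BY TYPE!!Total households!!Married-couple household",
    "X003": "SCHOOL ENROLLMENT!!Population 3 years and over enrolled in school",
}

def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def fuzzy_score(keyword, label, cutoff=0.6):
    """Count the keyword words that approximately match some word in the label."""
    label_tokens = tokenize(label)
    hits = 0
    for word in tokenize(keyword):
        if difflib.get_close_matches(word, label_tokens, n=1, cutoff=cutoff):
            hits += 1
    return hits

def find(keyword, catalogue, min_hits=2):
    """Return codes with at least min_hits matching words, best match first."""
    scored = [(fuzzy_score(keyword, label), code) for code, label in catalogue.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [code for score, code in scored if score >= min_hits]

matches = find("total households by marry", catalogue)
print(matches)
```

In this sketch the misspelled word "marry" still counts as a hit for a label containing "Married", which is the kind of tolerance a fuzzy search is meant to provide.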

# 2: Get a crosswalk of ZIP-ZCTA-State
**Now that he has some ZIP-level data, he wonders which state and ZCTA each ZIP Code belongs to. He needs the function `get_code()`.**

In [65]:
from community_ry2403.community_ry2403 import get_code
get_code(area_code=['10025','10036'],level='zIP')

| | ZIP | STATE | ZCTA | STATE_CODE |
|---|---|---|---|---|
| 3213 | 10025 | NY | 10025 | 36 |
| 3224 | 10036 | NY | 10036 | 36 |


# 3: Query data from ZIP Codes Business Patterns (ZCBP) and American Community Survey (ACS)
**After checking the related variables above, he decides to look up the variables 'DP02_0001E', 'DP02_0002E', and 'DP02_0003PE' from ACS, and some variables from ZCBP. He will use the functions `census_data()` and `business_data()`.**

In [68]:
from community_ry2403.community_ry2403 import census_data
from community_ry2403.community_ry2403 import business_data

In [67]:
v=['DP02_0001E','DP02_0002E','DP02_0003PE']
census_data(year=2019,variable=v,area_code=['10025','10036'])

| | DP02_0001E | DP02_0002E | DP02_0003PE | ZCTA |
|---|---|---|---|---|
| 1 | 17260 | 3111 | 4.2 | 10025 |
| 2 | 41355 | 14506 | 12.8 | 10036 |


In [66]:
v = ['EMP','EMP_N','ESTAB','PAYANN','PAYANN_N','PAYQTR1']
business_data(year=2019,variable=v,area_code=['10025','10036'],industry=72)

| | EMP | EMP_N | ESTAB | PAYANN | PAYANN_N | PAYQTR1 | NAICS2017 | ZIP |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 255 | 0 | 0 | 0 | 72 | 10025 |
| 2 | 0 | 0 | 577 | 0 | 0 | 0 | 72 | 10036 |


# 4: Convert units, query data with customized report level

**Since the other data in his analysis are all at the ZIP Code level, he next needs to change the report level to ZIP Code. He will use the class `search` and its methods `census()` and `business()`.**

In [69]:
from community_ry2403.community_ry2403 import search
api = search()   ##Initialize the class search

In [73]:
help(search.business)

Help on function business in module community_ry2403.community_ry2403:

business(self, area_code, geography, year, variable, industry)
    A function to get community business data with customized report level (ZIP / ZCTA) from Zip Code Businiss Pattern(ZCBP) and Community Business Pattern(CBP) API.
    If report level is ZIP, keep the original value.
    If report level is ZCTA, for percent data, use the mean estimate within the correspondent ZIP areas, 
    and for absolute data, use the sum estimate within correspondent ZIP areas.
    
    Parameters
    ----------
    area_code: a list of strings
        ZIP code or ZCTA code
    geography: str
        data reported level: ZIP or ZCTA; case-insensitive.
    year: int
        data reported year
    variable: a list of strings 
        variable codes of data
    industry: int
        NAICS 2017 Code identifying which industry of business data you wish to get
    
    Returns
    ----------
    A pandas dataframe containing the report

In [74]:
help(search.census)

Help on function census in module community_ry2403.community_ry2403:

census(self, area_code, geography, year, variable)
    A function to get community census data with customized report level (ZIP / ZCTA) from American Census Survey(ACS) API.
    If report level is ZCTA, keep the original value.
    If report level is ZIP, for percent data, use the original value within the correspondent ZCTA area, 
    and for absolute data, use the mean estimate within correspondent ZCTA area.
    
    Parameters
    ----------
    area_code: a list of strings
    ZIP code or ZCTA code
    geography: str
    data reported level: ZIP or ZCTA; case-insensitive.
    year: int
    data reported year
    variable: a list of strings 
    variable codes of data
    
    Returns
    ----------
    A pandas dataframe containing the report level, the area code and values you looked up
    
    
    Examples
    ----------
    >>> area_code = ['10025','10036']
    >>> geography = 'Zip'
    >>> year = 2019
   

In [75]:
v=['DP02_0001E','DP02_0002E','DP02_0003PE']
api.census(area_code=['10025','10036'],geography='zip',year=2019, variable=v)

| | ZIP | DP02_0001E_ZIP | DP02_0002E_ZIP | DP02_0003PE_ZIP |
|---|---|---|---|---|
| 0 | 10025 | 17260.0 | 3111.0 | 4.2 |
| 1 | 10036 | 13785.0 | 4835.333333 | 12.8 |
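
The ZIP-level values for 10036 can be checked by hand against the ZCTA-level output of `census_data()`. The sketch below applies the rule stated in the `census()` docstring: percent values are kept, and absolute values are averaged over the ZIP Codes of the ZCTA. The ZIP counts per ZCTA (1 and 3) are an assumption inferred from the ratio 41355 / 13785, not something the package reports here, and treating the `PE` suffix as the percent marker is likewise an assumption about how percent variables are recognised.

```python
# ZCTA-level values copied from the census_data() output earlier in this example.
zcta_rows = {
    "10025": {"DP02_0001E": 17260, "DP02_0002E": 3111, "DP02_0003PE": 4.2},
    "10036": {"DP02_0001E": 41355, "DP02_0002E": 14506, "DP02_0003PE": 12.8},
}

# ASSUMPTION: number of ZIP Codes mapped to each ZCTA, inferred from the output.
zips_per_zcta = {"10025": 1, "10036": 3}

def zcta_to_zip(values, n_zips):
    """Keep percent variables as-is; split absolute variables evenly across ZIPs."""
    converted = {}
    for name, value in values.items():
        if name.endswith("PE"):   # assumed marker for percent estimates
            converted[name] = value
        else:
            converted[name] = value / n_zips
    return converted

zip_rows = {
    code: zcta_to_zip(values, zips_per_zcta[code])
    for code, values in zcta_rows.items()
}

for code, values in zip_rows.items():
    print(code, values)
```

Under these assumptions the sketch yields 13785.0 and roughly 4835.33 for 10036, matching the table, while the percent value 12.8 is unchanged.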


In [77]:
v = ['EMP','EMP_N','ESTAB','PAYANN','PAYANN_N','PAYQTR1']
api.business(area_code=['10025','10036'],geography='zip',year=2019, variable=v,industry=72)

| | EMP | EMP_N | ESTAB | PAYANN | PAYANN_N | PAYQTR1 | NAICS2017 | ZIP |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 255 | 0 | 0 | 0 | 72 | 10025 |
| 2 | 0 | 0 | 577 | 0 | 0 | 0 | 72 | 10036 |
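
With `geography='zip'` the business values are returned unchanged, as the docstring says. The opposite direction (ZIP to ZCTA) is described in the docstring as a sum for absolute data and a mean for percent data. The sketch below illustrates that rule only; the crosswalk, the ZIP rows, the `SHARE_PCT` column, and the function name are all hypothetical and do not come from the package or from Census data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ZIP -> ZCTA crosswalk and ZIP-level rows, for illustration only.
crosswalk = {"Z1": "A", "Z2": "A", "Z3": "B"}
zip_rows = {
    "Z1": {"ESTAB": 100, "SHARE_PCT": 10.0},
    "Z2": {"ESTAB": 50, "SHARE_PCT": 20.0},
    "Z3": {"ESTAB": 7, "SHARE_PCT": 5.0},
}
percent_columns = {"SHARE_PCT"}  # hypothetical percent variable

def zip_to_zcta(rows, crosswalk, percent_columns):
    """Aggregate ZIP rows to ZCTA: mean for percent columns, sum for the rest."""
    grouped = defaultdict(lambda: defaultdict(list))
    for zip_code, values in rows.items():
        zcta = crosswalk[zip_code]
        for name, value in values.items():
            grouped[zcta][name].append(value)

    result = {}
    for zcta, columns in grouped.items():
        result[zcta] = {
            name: mean(vals) if name in percent_columns else sum(vals)
            for name, vals in columns.items()
        }
    return result

zcta_rows = zip_to_zcta(zip_rows, crosswalk, percent_columns)
print(zcta_rows)
```

Here the two ZIPs in the hypothetical ZCTA "A" are combined into one row whose count is the sum of both and whose percent is their simple average.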
