## `acs_sdoh`

In [1]:
from CIFTools import acs_sdoh

With `acs_sdoh`, you can download data not only for Cancer_InFocus but also for your own data projects.   
Let's first look at how to download ACS SDOH data.   
`acs_sdoh` takes the following arguments:   
* year: int
* state_fips : str, int or list of str (2 digit state fips code)
* query_level : str (possible values: 'state', 'county', 'tract', 'county subregion', 'block', 'zip')
* key : census key
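
Because `state_fips` accepts several shapes, it can help to bring inputs into one uniform form before passing them on. The sketch below shows one way to do that; `normalize_state_fips` is a hypothetical helper written for illustration and is not part of CIFTools.

```python
from typing import List, Sequence, Union

def normalize_state_fips(
    state_fips: Union[str, int, Sequence[Union[str, int]]]
) -> List[str]:
    """Return state FIPS codes as a list of zero-padded two-digit strings."""
    # A single value (str or int) is wrapped in a list so both cases share one path.
    if isinstance(state_fips, (str, int)):
        state_fips = [state_fips]
    codes = []
    for code in state_fips:
        text = str(code).strip().zfill(2)
        if len(text) != 2 or not text.isdigit():
            raise ValueError(f"not a 2-digit state FIPS code: {code!r}")
        codes.append(text)
    return codes

print(normalize_state_fips(21))           # ['21']
print(normalize_state_fips(["6", "21"]))  # ['06', '21']
```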

You can sign up for a Census API key at: https://api.census.gov/data/key_signup.html

This tutorial uses a sample key, which may have expired by the time you read it.
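
Rather than pasting a key into the notebook, one option is to read it from an environment variable. This is a minimal sketch; the variable name `CENSUS_API_KEY` and the helper `get_census_key` are assumptions made for illustration, not something CIFTools defines.

```python
import os
from typing import Mapping, Optional

def get_census_key(env: Optional[Mapping[str, str]] = None,
                   name: str = "CENSUS_API_KEY") -> str:
    """Read the Census API key from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"set the {name} environment variable to your Census API key")
    return key

# Example with an explicit mapping so the sketch runs without touching os.environ.
print(get_census_key({"CENSUS_API_KEY": "abc123"}))  # abc123
```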

In [2]:
key = "YOUR_CENSUS_API_KEY"  # replace with your Census API user key

In [3]:
sdoh = acs_sdoh(2019, '21', 'county', key = key)

To scrape the data for Cancer_InFocus, call `cancer_infocus_download`.   
This method does not require any arguments.

In [4]:
data_dictionary = sdoh.cancer_infocus_download()

{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': ['B27001', 'C27007'], 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B25002', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B17026', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B08141', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B23025', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B19083', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B25070', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B25034', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B19058', 'acs_type': ''}
{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': 'B15003', 'acs_type': ''}

`cancer_infocus_download()` returns a dictionary whose keys are dataset names and whose values are the corresponding pandas DataFrames.
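
Because the result is an ordinary dictionary of DataFrames, it can be inspected with a simple loop. The sketch below uses a small stand-in dictionary built from two rows of the transportation output in this tutorial; `summarize_tables` is a hypothetical helper and not part of CIFTools.

```python
import pandas as pd

def summarize_tables(tables):
    """Return one row per dataset with its number of rows and columns."""
    rows = [
        {"dataset": name, "n_rows": df.shape[0], "n_cols": df.shape[1]}
        for name, df in tables.items()
    ]
    return pd.DataFrame(rows, columns=["dataset", "n_rows", "n_cols"])

# Stand-in for the dictionary returned by cancer_infocus_download().
tables = {
    "transportation": pd.DataFrame({
        "FIPS": ["21079", "21037"],
        "County": ["Garrard County", "Campbell County"],
        "no_vehicle": [0.026159, 0.026872],
    }),
}
summary = summarize_tables(tables)
print(summary)
```

The same loop could instead call `df.to_csv(f"{name}.csv", index=False)` to write each table to its own file.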

In [5]:
data_dictionary.keys()

dict_keys(['insurance', 'vacancy', 'poverty', 'transportation', 'employment', 'gini_index', 'rent_to_income', 'houses_before_1960', 'public_assistance', 'education', 'income', 'demographic_age', 'demographic_race'])

In [7]:
data_dictionary['transportation'].head()

Unnamed: 0,FIPS,County,State,no_vehicle,two_or_more_vehicle,three_or_more_vehicle
0,21079,Garrard County,Kentucky,0.026159,0.867958,0.472388
1,21037,Campbell County,Kentucky,0.026872,0.779089,0.334509
2,21063,Elliott County,Kentucky,0.009025,0.852587,0.324308
3,21123,Larue County,Kentucky,0.000801,0.856342,0.463805
4,21167,Mercer County,Kentucky,0.01897,0.824078,0.399004
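
The values in this table lie between 0 and 1, so they can be sorted and filtered directly. The sketch below rebuilds the five rows shown above and ranks them by `no_vehicle`; reading the values as proportions is an assumption, since the tutorial does not state the denominator.

```python
import pandas as pd

transportation = pd.DataFrame({
    "FIPS": ["21079", "21037", "21063", "21123", "21167"],
    "County": ["Garrard County", "Campbell County", "Elliott County",
               "Larue County", "Mercer County"],
    "no_vehicle": [0.026159, 0.026872, 0.009025, 0.000801, 0.01897],
})

# Sort from the largest to the smallest value and express it as a percentage.
ranked = (
    transportation
    .sort_values("no_vehicle", ascending=False)
    .assign(no_vehicle_pct=lambda d: (d["no_vehicle"] * 100).round(2))
    .reset_index(drop=True)
)
print(ranked[["County", "no_vehicle_pct"]])
```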


### Querying other datasets

To query other datasets, use the `add_custom_table` method.   
It requires the following three arguments:
* group_id: acs group id (e.g. B01001)
    - you can provide multiple group ids in a list (e.g. \["B01001", "C27007"\])
    - however, all the ACS groups must belong to the same ACS type
* acs_type:
    - '' : acs5
    - 'profile' : acs5/profile
    - 'subject' : acs5/subject
    - for more information, please visit: https://api.census.gov/data.html
* name: the dictionary key under which the dataset is stored in `data_dictionary`
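
The three `acs_type` values correspond to dataset paths on the Census API. As a rough sketch of that mapping, the snippet below assembles a county-level request URL for one group. `build_acs_url` is a hypothetical helper, and CIFTools may construct its requests differently.

```python
from urllib.parse import urlencode

# acs_type value -> dataset path on api.census.gov
ACS_PATHS = {"": "acs/acs5", "profile": "acs/acs5/profile", "subject": "acs/acs5/subject"}

def build_acs_url(year: int, state_fips: str, group_id: str, acs_type: str = "") -> str:
    """Assemble a county-level Census API URL for one ACS group (illustrative only)."""
    if acs_type not in ACS_PATHS:
        raise ValueError(f"unknown acs_type: {acs_type!r}")
    query = urlencode(
        {"get": f"group({group_id})", "for": "county:*", "in": f"state:{state_fips}"},
        safe="():*",  # keep the Census query syntax readable
    )
    return f"https://api.census.gov/data/{year}/{ACS_PATHS[acs_type]}?{query}"

print(build_acs_url(2019, "21", "B01001"))
# https://api.census.gov/data/2019/acs/acs5?get=group(B01001)&for=county:*&in=state:21
```

A real request would also append the API key as a `key` query parameter.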

`add_custom_table` is used as a decorator: the function you decorate defines how the DataFrame downloaded from the Census is organized.   
In the following example, the function returns the raw data unchanged.
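
To make the decorator pattern concrete, here is a toy registry that stores a cleaning function under a name and applies it later. `TableRegistry` and its methods are invented for this sketch and do not describe how CIFTools is implemented.

```python
from typing import Callable, Dict, List

class TableRegistry:
    """Toy stand-in that mimics registering cleaning functions by decorator."""

    def __init__(self) -> None:
        self.cleaners: Dict[str, Callable[[List[dict]], List[dict]]] = {}

    def add_custom_table(self, name: str):
        # The outer call receives the arguments; the inner function receives
        # the decorated function and stores it under the given name.
        def register(func):
            self.cleaners[name] = func
            return func
        return register

    def run(self, name: str, raw_rows: List[dict]) -> List[dict]:
        # Apply the cleaning function registered under `name` to the raw rows.
        return self.cleaners[name](raw_rows)

registry = TableRegistry()

@registry.add_custom_table("sample")
def keep_raw(rows):
    return rows  # no cleaning, mirrors returning the raw data unchanged

print(registry.run("sample", [{"FIPS": "21079"}]))  # [{'FIPS': '21079'}]
```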

In [9]:
sdoh.clean_functions()

In [10]:
@sdoh.add_custom_table(["C27007", "B27001"], '', 'sample')
def download_custom_data(df):
    return df

{'year': 2019, 'state_fips': '21', 'query_level': 'county', 'acs_group': ['C27007', 'B27001'], 'acs_type': ''}


In [11]:
data_dictionary = sdoh.download_all()

In [13]:
data_dictionary['sample'].head()

Unnamed: 0,FIPS,County,State,B27001_001E,B27001_002E,B27001_003E,B27001_004E,B27001_005E,B27001_006E,B27001_007E,...,C27007_012E,C27007_013E,C27007_014E,C27007_015E,C27007_016E,C27007_017E,C27007_018E,C27007_019E,C27007_020E,C27007_021E
0,21079,Garrard County,Kentucky,17342,8483,675,675,0,1490,1469,...,8859,1915,1053,862,5273,1474,3799,1671,231,1440
1,21037,Campbell County,Kentucky,91367,44451,3352,3274,78,7322,7219,...,46916,10593,2835,7758,28694,4183,24511,7629,1152,6477
2,21063,Elliott County,Kentucky,5971,2759,220,220,0,545,538,...,3212,694,436,258,1807,695,1112,711,140,571
3,21123,Larue County,Kentucky,13912,6809,457,457,0,1159,1129,...,7103,1719,931,788,4107,1494,2613,1277,218,1059
4,21167,Mercer County,Kentucky,21412,10535,814,790,24,1812,1743,...,10877,2319,880,1439,6371,1541,4830,2187,275,1912
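
When several groups are requested together, their columns arrive in one wide table. One way to work with a single group is to select columns by their group prefix. The sketch below uses the first two rows and a few columns from the output above; `columns_for_group` is a hypothetical helper.

```python
import pandas as pd

ID_COLUMNS = ["FIPS", "County", "State"]

def columns_for_group(df: pd.DataFrame, group_id: str) -> pd.DataFrame:
    """Keep the identifier columns plus every column belonging to one ACS group."""
    group_cols = [c for c in df.columns if c.startswith(f"{group_id}_")]
    return df[ID_COLUMNS + group_cols]

sample = pd.DataFrame({
    "FIPS": ["21079", "21037"],
    "County": ["Garrard County", "Campbell County"],
    "State": ["Kentucky", "Kentucky"],
    "B27001_001E": [17342, 91367],
    "B27001_002E": [8483, 44451],
    "C27007_021E": [1440, 6477],
})
print(columns_for_group(sample, "B27001"))
```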


### ACSConfig

When exploring an ACS group, you can use `ACSConfig`.
`ACSConfig` requires the following arguments:
* year : str or int
* state_fips : a list of state fips or a single state fips as str
* query_level: str
* acs_group  : str
* acs_type   : str (optional)

In [24]:
from CIF_Config import ACSConfig
from pprint import pprint

In [19]:
cfg = ACSConfig(2020, 21, 'tract', 'B15001', acs_type = '')

In [20]:
cfg

ACSConfig(year=2020, state_fips=21, query_level='tract', acs_group='B15001', acs_type='')

`ACSConfig` provides both the variables within the group and their labels.   
It also provides a table describing each variable.

In [26]:
print(cfg.variables)

['B15001_001E', 'B15001_002E', 'B15001_003E', 'B15001_004E', 'B15001_005E', 'B15001_006E', 'B15001_007E', 'B15001_008E', 'B15001_009E', 'B15001_010E', 'B15001_011E', 'B15001_012E', 'B15001_013E', 'B15001_014E', 'B15001_015E', 'B15001_016E', 'B15001_017E', 'B15001_018E', 'B15001_019E', 'B15001_020E', 'B15001_021E', 'B15001_022E', 'B15001_023E', 'B15001_024E', 'B15001_025E', 'B15001_026E', 'B15001_027E', 'B15001_028E', 'B15001_029E', 'B15001_030E', 'B15001_031E', 'B15001_032E', 'B15001_033E', 'B15001_034E', 'B15001_035E', 'B15001_036E', 'B15001_037E', 'B15001_038E', 'B15001_039E', 'B15001_040E', 'B15001_041E', 'B15001_042E', 'B15001_043E', 'B15001_044E', 'B15001_045E', 'B15001_046E', 'B15001_047E', 'B15001_048E', 'B15001_049E', 'B15001_050E', 'B15001_051E', 'B15001_052E', 'B15001_053E', 'B15001_054E', 'B15001_055E', 'B15001_056E', 'B15001_057E', 'B15001_058E', 'B15001_059E', 'B15001_060E', 'B15001_061E', 'B15001_062E', 'B15001_063E', 'B15001_064E', 'B15001_065E', 'B15001_066E', 'B15001_0

In [27]:
print(cfg.labels)

['Total', 'Male', 'Male - 18 to 24 years', '18 to 24 years - Less than 9th grade', '18 to 24 years - 9th to 12th grade, no diploma', '18 to 24 years - High school graduate (includes equivalency)', '18 to 24 years - Some college, no degree', "18 to 24 years - Associate's degree", "18 to 24 years - Bachelor's degree", '18 to 24 years - Graduate or professional degree', 'Male - 25 to 34 years', '25 to 34 years - Less than 9th grade', '25 to 34 years - 9th to 12th grade, no diploma', '25 to 34 years - High school graduate (includes equivalency)', '25 to 34 years - Some college, no degree', "25 to 34 years - Associate's degree", "25 to 34 years - Bachelor's degree", '25 to 34 years - Graduate or professional degree', 'Male - 35 to 44 years', '35 to 44 years - Less than 9th grade', '35 to 44 years - 9th to 12th grade, no diploma', '35 to 44 years - High school graduate (includes equivalency)', '35 to 44 years - Some college, no degree', "35 to 44 years - Associate's degree", "35 to 44 years 
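
Since `variables` and `labels` appear to be parallel lists, they can be paired into a lookup table. The sketch below uses only the first three entries printed above and assumes the two lists are in the same order, which the tutorial does not state explicitly.

```python
# First three entries copied from the printed output of cfg.variables and cfg.labels.
variables = ["B15001_001E", "B15001_002E", "B15001_003E"]
labels = ["Total", "Male", "Male - 18 to 24 years"]

# Pair each variable code with its label; the length check guards against a mismatch.
if len(variables) != len(labels):
    raise ValueError("variables and labels must have the same length")
label_by_variable = dict(zip(variables, labels))

print(label_by_variable["B15001_003E"])  # Male - 18 to 24 years
```

A mapping like this can be passed to `DataFrame.rename(columns=...)` to replace variable codes with readable names.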

In [29]:
cfg.var_desc

Unnamed: 0,name,label,concept
9131,B15001_014E,Estimate!!Total:!!Male:!!25 to 34 years:!!High...,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
9133,B15001_015E,Estimate!!Total:!!Male:!!25 to 34 years:!!Some...,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
9136,B15001_016E,Estimate!!Total:!!Male:!!25 to 34 years:!!Asso...,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
9138,B15001_017E,Estimate!!Total:!!Male:!!25 to 34 years:!!Bach...,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
9141,B15001_010E,Estimate!!Total:!!Male:!!18 to 24 years:!!Grad...,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
...,...,...,...
18017,B15001_043E,Estimate!!Total:!!Female:,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
18024,B15001_044E,Estimate!!Total:!!Female:!!18 to 24 years:,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
18030,B15001_045E,Estimate!!Total:!!Female:!!18 to 24 years:!!Le...,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
18046,B15001_040E,Estimate!!Total:!!Male:!!65 years and over:!!A...,SEX BY AGE BY EDUCATIONAL ATTAINMENT FOR THE P...
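
The `label` column uses `!!` to separate the levels of a variable's hierarchy. Below is a small sketch of splitting such a label into its parts, using one complete label from the table above; `split_label` is a hypothetical helper.

```python
from typing import List

def split_label(label: str) -> List[str]:
    """Split a Census variable label on '!!' and drop trailing colons."""
    return [part.rstrip(":").strip() for part in label.split("!!") if part.strip()]

print(split_label("Estimate!!Total:!!Female:!!18 to 24 years:"))
# ['Estimate', 'Total', 'Female', '18 to 24 years']
```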


## Downloading facility data

To download facility data, use the `gen_facility_data` function.   
The function requires only one argument:
* location : str or List\[str\] (abbreviation(s) of state name(s))

In [1]:
from CIFTools import gen_facility_data

In [None]:
facility_data = gen_facility_data(['KY','WV'])

In [None]:
import pandas as pd
all_facility = pd.concat(facility_data.values(), axis = 0).reset_index(drop = True)

In [None]:
all_facility.head()
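
Concatenating `facility_data.values()` discards the dictionary keys. If it is useful to remember which entry each row came from, `pd.concat` can take the dictionary itself and record the keys as an index level. The sketch below uses a stand-in dictionary with placeholder rows; the keys and column names of the real `gen_facility_data` output are not shown in this tutorial, so both are assumptions.

```python
import pandas as pd

# Stand-in for the dictionary returned by gen_facility_data();
# the keys, column name, and rows are placeholders.
facility_data = {
    "first": pd.DataFrame({"Name": ["Facility A", "Facility B"]}),
    "second": pd.DataFrame({"Name": ["Facility C"]}),
}

# Passing the mapping keeps each key as an outer index level named "source",
# which is then moved into a regular column.
all_facility = (
    pd.concat(facility_data, names=["source", "row"])
    .reset_index(level="source")
    .reset_index(drop=True)
)
print(all_facility)
```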