In [1]:
!pip -q install -r ./config_files/requirements.txt

In [2]:
import pandas as pd

In [3]:
pd.options.mode.chained_assignment = None

# Forecast Automation Tool Documentation

**Authors: CRA, EWA**

Before proceeding, connect to the SANDAG network. It is also highly recommended to configure your Jupyter environment with nbextensions, specifically the "Collapsible Headings" extension, which makes the threshold dictionaries easier to view.

Definitions of the dataset's features can be found in the ABM input-files data dictionary: https://github.com/SANDAG/ABM/wiki/input-files#input-files-data-dictionary

## Part 1
The purpose of Part 1 is to concatenate all the ABM data stored as CSV files on SANDAG's T-Drive and aggregate them at the MGRA, CPA, Jurisdiction, and Region levels. To download data for a single datasource ID **or** for two datasource IDs to compare, run `initiate_window()` in a code cell below.

In [6]:
%run ./scripts/Part_1.ipynb

In [7]:
#initiate_window()

## Part 2
Part 2 runs a series of checks that flag anomalies in the ABM data and write them to the `part_2_outputs` directory.

In [162]:
%run ./scripts/Part_2.ipynb

#### Inputs

The following cell should be filled out according to the desired checks.

- **download_ds_data(first_ID, second_ID=None, folder='./outputs/')**: downloads the data files needed by the checks below as dataframes.
    - Inputs: first_ID, second_ID (optional), folder (***make sure this is the directory that contains the files generated in Part 1***)
    - Outputs: dataframes that can be accessed:
        - If second_ID is not provided: mgra_first, cpa_first, jur_first, reg_first
        - If second_ID is provided: mgra_first, cpa_first, jur_first, reg_first, mgra_second, cpa_second, jur_second, reg_second, mgra_both, mgra_diff

In [163]:
first_ID = 'DS41' # this is the first datasource ID
second_ID = None # this will only be used if running comparison functions
file_path = './outputs/' # this folder should contain all the files created in part 1

In [164]:
download_ds_data(first_ID, second_ID=second_ID, folder=file_path)

You have all the files you need to run the non-comparison functions


#### Input and Consistency Checks

These functions compare the CSV files against the expected output and flag any anomalies.

- **check_cols(dataframe)**: checks that the necessary columns exist in the imported dataset and returns which columns are missing, or True if all columns exist.
    - Inputs: any geography level dataframe
    - Outputs: string stating check outcome


- **compare_totals(mgra_dataframe, jur_dataframe, reg_dataframe)**: checks that the column totals match across geography levels (MGRA, jurisdiction, and region). Returns a dictionary whose keys are the non-MGRA geography levels and whose values are strings describing how many columns match the MGRA-level totals.
    - Inputs: MGRA-level dataframe, jurisdiction-level dataframe, region-level dataframe
    - Outputs: dictionary containing strings stating the check outcome for each geography level


- **database_comparison(dataframe, first_ID)**: compares total population values (the sum of database values for gender, ethnicity, age, jobs, and housing) from the SQL data with the corresponding values in the CSV files. Outputs a dictionary with a dataframe for each category.
    - Inputs: MGRA-level dataframe, datasource-ID
    - Outputs: dictionary with comparison differences for each category (Gender, Ethnicity, Age, Jobs, Housing)
    
- **additional_comparisons(dataframe)**: identifies mismatches between school and hotel values. Outputs a dictionary with a dataframe for each category.
    - Inputs: MGRA-level dataframe
    - Outputs: dictionary with comparison differences for each category (Hotel, School)
   
- **check_vacancy_rate(mgra_dataframe)**: checks and flags any rows with a vacancy rate of at least 4 percent.
    - Inputs: MGRA-level dataframe
    - Outputs: dataframe with new `Flag` column which holds True values for rows with a vacancy rate of 4 percent or higher.
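
The idea behind a totals check such as `compare_totals` can be sketched as follows. This is only an illustration under stated assumptions, not the code in `Part_2.ipynb`: the helper `count_matching_columns`, the toy dataframes, and the layout with `year` as a plain column are all hypothetical.

```python
import pandas as pd

def count_matching_columns(mgra_df, region_df, tol=1e-6):
    """Hypothetical helper: count columns whose MGRA-level sums equal the region totals."""
    # Only compare columns present at both geography levels.
    shared = [c for c in mgra_df.columns if c in region_df.columns and c != "year"]
    mgra_totals = mgra_df.groupby("year")[shared].sum()
    region_totals = region_df.set_index("year")[shared]
    # A column matches only if the totals agree (within tol) in every year.
    matches = ((mgra_totals - region_totals).abs() <= tol).all()
    return int(matches.sum()), len(shared)

# Toy data (assumed layout: one row per MGRA and year, one row per region and year).
mgra = pd.DataFrame({
    "year": [2016, 2016, 2018, 2018],
    "pop": [10, 20, 12, 22],
    "hs": [4, 6, 5, 7],
})
region = pd.DataFrame({
    "year": [2016, 2018],
    "pop": [30, 34],
    "hs": [10, 13],  # the MGRA sum for 2018 is 12, so "hs" does not match
})

matched, total = count_matching_columns(mgra, region)
print(f"{matched} of {total} columns matched.")
```

Comparing within a small tolerance rather than with strict equality is a design choice of this sketch; it avoids spurious mismatches on floating-point columns.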
    

In [165]:
check_cols(mgra_first)

'All desired columns exist.'

In [166]:
compare_totals(mgra_first, jur_first, reg_first)

{'jurisdiction': '129 columns did not match out of 139 columns.',
 'region': 'all columns matched.'}

In [167]:
mismatches = database_comparison(mgra_first, first_ID)

Output generated successfully.


In [168]:
# mismatches['Gender']

In [None]:
# mismatches['Ethnicity']

In [None]:
# mismatches['Age']

In [170]:
# mismatches['Jobs']

In [169]:
# mismatches['Housing']

In [171]:
additional_mismatches = additional_comparisons(mgra_first)

In [177]:
# additional_mismatches['Hotel']

In [179]:
# additional_mismatches['School']

In [160]:
vacancy_df = check_vacancy_rate(mgra_first)

In [161]:
vacancy_df

mgra,year,taz,hs,hs_Single_Family,hs_Multiple_Family,hs_Mobile_Homes,Household Population (hh),hh_Single_Family,hh_Multiple_Family,hh_Mobile_Homes,gq_civ,...,Asian,Black,Hispanic,Other,Pacific Islander,Two or More,White,Female,Male,Flag
1,2016,3331,19,19,0,0,18,18,0,0,0,...,1.0,0.0,1.0,1.0,0.0,2.0,36.0,23.0,18.0,True
1,2018,3331,19,19,0,0,18,18,0,0,0,...,1.0,0.0,11.0,1.0,0.0,2.0,35.0,26.0,24.0,True
1,2020,3331,19,19,0,0,18,18,0,0,0,...,1.0,0.0,4.0,1.0,0.0,6.0,31.0,23.0,20.0,True
1,2023,3331,20,20,0,0,18,18,0,0,0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,False
1,2025,3331,20,20,0,0,18,18,0,0,0,...,1.0,0.0,4.0,1.0,0.0,2.0,34.0,24.0,18.0,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
23002,2032,1254,120,20,100,0,98,17,81,0,0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,False
23002,2035,1254,120,20,100,0,98,17,81,0,0,...,44.0,3.0,52.0,1.0,1.0,3.0,133.0,125.0,114.0,True
23002,2040,1254,120,20,100,0,102,18,84,0,0,...,69.0,2.0,60.0,0.0,1.0,14.0,109.0,136.0,120.0,True
23002,2045,1254,120,20,100,0,109,20,89,0,0,...,77.0,5.0,69.0,1.0,0.0,9.0,102.0,134.0,129.0,True
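
To review only the flagged rows of a table like the one above, the `Flag` column can be used as a boolean mask. The sketch below uses a small toy dataframe with an assumed `(mgra, year)` index rather than the real `vacancy_df`.

```python
import pandas as pd

# Toy stand-in for a checked dataframe (assumed index levels: mgra, year).
toy = pd.DataFrame(
    {"hs": [19, 20, 120], "Flag": [True, False, True]},
    index=pd.MultiIndex.from_tuples(
        [(1, 2016), (1, 2023), (23002, 2035)], names=["mgra", "year"]
    ),
)

# Keep only the rows where the check raised a flag.
flagged = toy[toy["Flag"]]

# List the distinct MGRAs that have at least one flagged year.
flagged_mgras = flagged.index.get_level_values("mgra").unique().tolist()
print(flagged_mgras)
```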


#### Threshold Analysis Checks

These functions calculate differences across datasources, years, or proportions, and identify anomalies using specified thresholds. Thresholds should be specified using dictionaries containing both the value threshold and the percentage threshold. The functions use **OR** logic, so specifying thresholds for multiple columns will flag any row that meets any of the specified thresholds.
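
The OR logic described above can be sketched roughly as follows. The helper `flag_any_threshold` and the toy data are hypothetical, and the sketch assumes that a threshold of 0 means "not specified" and that thresholds apply to absolute differences; the actual functions in `Part_2.ipynb` may differ.

```python
import pandas as pd

def flag_any_threshold(diff_df, pct_df, thresholds):
    """Hypothetical helper: True for rows where any specified threshold is met."""
    flag = pd.Series(False, index=diff_df.index)
    for col, limits in thresholds.items():
        value_limit = limits["value_threshold"]
        pct_limit = limits["percentage_threshold"]
        # Assumption: 0 means no threshold was specified, so the column is skipped.
        if value_limit:
            flag |= diff_df[col].abs() >= value_limit
        if pct_limit:
            flag |= pct_df[col].abs() >= pct_limit
    return flag

# Toy absolute and percentage differences for three rows.
diff = pd.DataFrame({"pop": [5, 150, 20], "hs": [0, 3, 40]})
pct = pd.DataFrame({"pop": [1.0, 30.0, 4.0], "hs": [0.0, 2.0, 55.0]})
toy_thresholds = {
    "pop": {"value_threshold": 100, "percentage_threshold": 0},
    "hs": {"value_threshold": 0, "percentage_threshold": 50},
}

flags = flag_any_threshold(diff, pct, toy_thresholds)
print(flags.tolist())
```

Here the second row is flagged by the `pop` value threshold and the third by the `hs` percentage threshold, which is what OR logic across columns means in practice.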

- **yearly_diff_threshold(dataframe, threshold dictionary)**: given a dictionary with columns and thresholds, flags any differences between years that meet the threshold level. 
    - Inputs: Any geography level dataframe, thresholds dictionary with both actual and percentage thresholds (*The input dictionary should have 0 values for columns without a specified threshold.*)
    - Outputs: dataframe with differences by year with a `Flag` column that indicates whether the specified threshold(s) were met.


- **ds_diff_threshold(mgra_diff, mgra_second, threshold dictionary)**: given a dictionary with columns and thresholds, flags any differences between datasource IDs that meet the threshold level.
    - Inputs: datasource-ID difference MGRA-level dataframe, second datasource-ID MGRA-level dataframe, thresholds dictionary with both actual and percentage thresholds (*The input dictionary should have 0 values for columns without a specified threshold.*)
    - Outputs: dataframe with differences by datasource-ID with a `Flag` column that indicates whether the specified threshold(s) were met.
    

- **shares(dataframe, threshold dictionary)**: given a dictionary with columns and thresholds, flags proportions of yearly percent change within designated columns that meet the threshold level.
    - Inputs: Any geography level dataframe, thresholds dictionary (*The input dictionary should have 0 values for columns without a specified threshold.*)
    - Outputs: dataframe with yearly percent change with a `Flag` column that indicates whether the specified threshold(s) were met.
    - Example: for an income category (\$15,000 to \$29,999), the value = (difference in that income category from 2016 to 2018) / (sum of all the income categories)
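
The share calculation in the example can be worked through on toy numbers. This sketch assumes the denominator is the group total in the later year, which the description does not specify; the column values are invented, and the totals are kept equal in both years so the result does not depend on that choice.

```python
import pandas as pd

income_cols = ["Less than $15,000", "$15,000 to $29,999", "$30,000 to $44,999"]

# Toy data for a single MGRA: one row per year, one column per income category.
one_mgra = pd.DataFrame(
    {
        "Less than $15,000": [10, 10],
        "$15,000 to $29,999": [20, 30],
        "$30,000 to $44,999": [20, 10],
    },
    index=pd.Index([2016, 2018], name="year"),
)

# Difference in each category between consecutive years (first year is NaN).
change = one_mgra[income_cols].diff()

# Sum of all income categories per year (assumed denominator).
group_total = one_mgra[income_cols].sum(axis=1)

# Share of the group total represented by each category's change.
share_change = change.div(group_total, axis=0)
print(share_change.loc[2018])
```

With these numbers the second category grows by 10 out of a group total of 50, so its share change is 0.2, and a threshold of 0.2 or lower on that column would flag the row.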



##### Yearly Difference Thresholds

In [None]:
year_thresholds = {
    'taz':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs_Single_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs_Multiple_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs_Mobile_Homes':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Household Population (hh)':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hh_Single_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hh_Multiple_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hh_Mobile_Homes':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'gq_civ':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Group Quarters - Military (gq_mil)':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Less than $15,000':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$15,000 to $29,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$30,000 to $44,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$45,000 to $59,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$60,000 to $74,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$75,000 to $99,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$100,000 to $124,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$125,000 to $149,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$150,000 to $199,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$200,000 or more':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hhs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'pop':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hhp':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_Agricultural_and_Extractive':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_non_bldg_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_non_bldg_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_utilities_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_utilities_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_bldg_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_bldg_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_Manufacturing_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_Manufacturing_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_whsle_whs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_trans':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_retail':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_prof_bus_svcs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_prof_bus_svcs_bldg_maint':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_pvt_ed_k12':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_pvt_ed_post_k12_Other_Residential':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_health':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_personal_svcs_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_amusement':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_hotel':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_restaurant_bar':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_personal_svcs_retail':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_religious':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_pvt_hh':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_state_local_Government_ent':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_fed_non_Military':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_fed_Military':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_state_local_Government_blue':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_state_local_Government_white':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_public_ed':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_own_occ_dwell_mgmt':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_fed_Government_accts':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_st_lcl_Government_accts':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_cap_accts':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_total':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'enrollgradekto8':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'enrollgrade9to12':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'collegeenroll':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'othercollegeenroll':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'adultschenrl':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'ech_dist':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hch_dist':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'pseudomsa':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'parkarea':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hstallsoth':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hstallssam':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hparkcost':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'numfreehrs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dstallsoth':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dstallssam':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dparkcost':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'mstallsoth':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'mstallssam':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'mparkcost':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'totint':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'duden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'empden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'popden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'retempden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'totintbin':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'empdenbin':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dudenbin':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'zip09':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'parkactive':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'openspaceparkpreserve':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'beachactive':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'budgetroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'economyroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'luxuryroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'midpriceroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'upscaleroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hotelroomtotal':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'luz_id':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'truckregiontype':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'district27':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'milestocoast':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'acres':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'effective_acres':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'land_acres':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'units':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'vacancy':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'unoccupiable':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'vacancy_rate':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'elem_population':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'high_population':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '10 to 14':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '15 to 17':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '18 and 19':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '20 to 24':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '25 to 29':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '30 to 34':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '35 to 39':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '40 to 44':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '45 to 49':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '5 to 9':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '50 to 54':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '55 to 59':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '60 and 61':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '62 to 64':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '65 to 69':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '70 to 74':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '75 to 79':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '80 to 84':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '85 and Older':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Under 5':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'American Indian':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Asian':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Black':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Hispanic':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Other':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Pacific Islander':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Two or More':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'White':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Female':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Male':
        {'value_threshold': 0,
        'percentage_threshold': 0},
}
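
As an alternative to editing the long literal above, the same structure could be generated with all-zero defaults and only the columns of interest overridden. This is a convenience sketch: `make_thresholds` is a hypothetical helper, not part of the tool, and the numbers are placeholders (the units expected for `percentage_threshold` are not stated in this notebook).

```python
def make_thresholds(columns, overrides=None):
    """Hypothetical helper: zero thresholds for every column, then apply overrides."""
    thresholds = {
        col: {"value_threshold": 0, "percentage_threshold": 0} for col in columns
    }
    for col, limits in (overrides or {}).items():
        # Fail loudly on misspelled column names instead of silently ignoring them.
        if col not in thresholds:
            raise KeyError(f"Unknown column: {col}")
        thresholds[col].update(limits)
    return thresholds

example = make_thresholds(
    ["pop", "hhp", "emp_total"],
    overrides={"pop": {"value_threshold": 500, "percentage_threshold": 20}},
)
print(example["pop"])
```

In practice the column list could come from the dataframe itself, for example `list(mgra_first.columns)`, so the dictionary stays in sync with the data.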

In [None]:
year_thresh_df = yearly_diff_threshold(mgra_first, year_thresholds)

In [None]:
year_thresh_df[year_thresh_df['Flag']]

##### DS Difference Thresholds (requires two datasource IDs)

In [None]:
ds_thresholds = {
    'taz':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs_Single_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs_Multiple_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hs_Mobile_Homes':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Household Population (hh)':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hh_Single_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hh_Multiple_Family':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hh_Mobile_Homes':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'gq_civ':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Group Quarters - Military (gq_mil)':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Less than $15,000':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$15,000 to $29,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$30,000 to $44,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$45,000 to $59,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$60,000 to $74,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$75,000 to $99,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$100,000 to $124,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$125,000 to $149,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$150,000 to $199,999':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '$200,000 or more':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hhs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'pop':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hhp':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_Agricultural_and_Extractive':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_non_bldg_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_non_bldg_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_utilities_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_utilities_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_bldg_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_const_bldg_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_Manufacturing_prod':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_Manufacturing_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_whsle_whs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_trans':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_retail':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_prof_bus_svcs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_prof_bus_svcs_bldg_maint':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_pvt_ed_k12':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_pvt_ed_post_k12_Other_Residential':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_health':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_personal_svcs_Office':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_amusement':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_hotel':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_restaurant_bar':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_personal_svcs_retail':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_religious':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_pvt_hh':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_state_local_Government_ent':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_fed_non_Military':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_fed_Military':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_state_local_Government_blue':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_state_local_Government_white':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_public_ed':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_own_occ_dwell_mgmt':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_fed_Government_accts':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_st_lcl_Government_accts':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_cap_accts':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'emp_total':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'enrollgradekto8':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'enrollgrade9to12':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'collegeenroll':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'othercollegeenroll':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'adultschenrl':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'ech_dist':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hch_dist':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'pseudomsa':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'parkarea':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hstallsoth':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hstallssam':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hparkcost':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'numfreehrs':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dstallsoth':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dstallssam':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dparkcost':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'mstallsoth':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'mstallssam':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'mparkcost':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'totint':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'duden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'empden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'popden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'retempden':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'totintbin':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'empdenbin':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'dudenbin':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'zip09':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'parkactive':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'openspaceparkpreserve':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'beachactive':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'budgetroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'economyroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'luxuryroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'midpriceroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'upscaleroom':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'hotelroomtotal':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'luz_id':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'truckregiontype':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'district27':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'milestocoast':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'acres':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'effective_acres':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'land_acres':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'units':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'vacancy':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'unoccupiable':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'vacancy_rate':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'elem_population':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'high_population':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '10 to 14':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '15 to 17':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '18 and 19':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '20 to 24':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '25 to 29':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '30 to 34':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '35 to 39':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '40 to 44':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '45 to 49':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '5 to 9':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '50 to 54':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '55 to 59':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '60 and 61':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '62 to 64':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '65 to 69':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '70 to 74':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '75 to 79':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '80 to 84':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    '85 and Older':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Under 5':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'American Indian':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Asian':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Black':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Hispanic':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Other':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Pacific Islander':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Two or More':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'White':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Female':
        {'value_threshold': 0,
        'percentage_threshold': 0},
    'Male':
        {'value_threshold': 0,
        'percentage_threshold': 0},
}

In [None]:
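if second_ID:
    ds_diff_example = ds_diff_threshold(mgra_diff, mgra_second, ds_thresholds)
    # A bare expression inside an `if` block is not echoed by Jupyter, so display it explicitly.
    display(ds_diff_example)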

##### Share Thresholds

###### Employment

In [None]:
employment_thresholds = {'emp_Agricultural_and_Extractive': 0,
 'emp_const_non_bldg_prod': 0,
 'emp_const_non_bldg_Office': 0,
 'emp_utilities_prod': 0,
 'emp_utilities_Office': 0,
 'emp_const_bldg_prod': 0,
 'emp_const_bldg_Office': 0,
 'emp_Manufacturing_prod': 0,
 'emp_Manufacturing_Office': 0,
 'emp_whsle_whs': 0,
 'emp_trans': 0,
 'emp_retail': 0,
 'emp_prof_bus_svcs': 0,
 'emp_prof_bus_svcs_bldg_maint': 0,
 'emp_pvt_ed_k12': 0,
 'emp_pvt_ed_post_k12_Other_Residential': 0,
 'emp_health': 0,
 'emp_personal_svcs_Office': 0,
 'emp_amusement': 0,
 'emp_hotel': 0,
 'emp_restaurant_bar': 0,
 'emp_personal_svcs_retail': 0,
 'emp_religious': 0,
 'emp_pvt_hh': 0,
 'emp_state_local_Government_ent': 0,
 'emp_fed_non_Military': 0,
 'emp_fed_Military': 0,
 'emp_state_local_Government_blue': 0,
 'emp_state_local_Government_white': 0,
 'emp_public_ed': 0,
 'emp_own_occ_dwell_mgmt': 0,
 'emp_fed_Government_accts': 0,
 'emp_st_lcl_Government_accts': 0,
 'emp_cap_accts': 0,
 'emp_total': 0}

In [None]:
employment_shares = shares(mgra_first, threshold_dict=employment_thresholds)

In [None]:
employment_shares

###### Income

In [None]:
income_thresholds = {'Less than $15,000': 0,
 '$15,000 to $29,999': 0,
 '$30,000 to $44,999': 0,
 '$45,000 to $59,999': 0,
 '$60,000 to $74,999': 0,
 '$75,000 to $99,999': 0,
 '$100,000 to $124,999': 0,
 '$125,000 to $149,999': 0,
 '$150,000 to $199,999': 0,
 '$200,000 or more': 0}

In [None]:
income_shares = shares(mgra_first, threshold_dict=income_thresholds)

In [None]:
income_shares

###### Ethnicities

In [None]:
ethnicity_thresholds = {'Hispanic': 0,
 'White': 0,
 'Black': 0,
 'American Indian': 0,
 'Asian': 0,
 'Pacific Islander': 0,
 'Other': 0,
 'Two or More': 0}

In [None]:
ethnicity_shares = shares(mgra_first, threshold_dict=ethnicity_thresholds)

In [None]:
ethnicity_shares

###### Custom

In [None]:
# custom_thresholds = {'column 1': 0,
#  'column 2': 0,
#  'column 3': 0}

In [None]:
# custom_shares = shares(mgra_first, threshold_dict=custom_thresholds)

In [None]:
# custom_shares