# MD Demographics

In [1]:
# Import relevant libraries
import os, sys
import pandas as pd, geopandas as gpd

In [2]:
# Set Default Parameters
path_in = r"E:\DHD_Solar\Campaigns\2023\MD\Q1\GIS\01_Working_Data\Census"
fileIn_1 = "X01_Population.csv"
fileIn_2 = "X17_Poverty.csv"
json_in  = "Tracts.geojson"

path_out = r"E:\DHD_Solar\Campaigns\2023\MD\Q1\GIS\01_Working_Data\Census\output"
json_out = "demographics.geojson"

# Makes sure the folder structure for the output exists and creates it if it does not.
os.makedirs(path_out, exist_ok=True)

## Processing

This notebook joins demographic data to the corresponding census tracts for MD. Ideally, the same process will also apply to other states without major recoding.

The processing proceeds as follows (italicized steps are performed in ArcGIS before running this script):
1. *Filter fields view for the following fields*
    - X01_SEX_AND_AGE
        - B01001e1
    - X17_POVERTY
        - C17002e2
        - C17002e3
        - C17002e4
        - C17002e5
        - C17002e6
        - C17002e7 \
        Note that the sum of these fields is the total population with income at or below 1.99 times the poverty level.
2. *Create "Tract" field as "Double" type in both tables and census tract shapefile table*
3. *Calculate "Tract" field as "!GEOID![7:]" for tables and just as "!GEOID!" for the shapefile*
4. *Export tables to new CSVs in 01_Working_Data\Census*
5. *Export census tracts to GeoJSON*
6. Load data into Demographics_Processing notebook
7. Combine data based on Tract field into master GeoDataFrame
8. Export new GeoJSON with compiled demographic data by census tract.
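
Steps 2 and 3 build the join key in ArcGIS. As a minimal sketch of the same calculation in Python, the hypothetical helper `tract_key` below drops the first seven characters of a table GEOID and converts the remainder to a float, matching the "Double" field type. The sample GEOID values, including the `14000US` prefix, are assumptions made for illustration and are not taken from the actual tables.

```python
def tract_key(geoid: str, strip_prefix: bool) -> float:
    """Hypothetical helper mirroring the ArcGIS "Tract" field calculation."""
    # Tables use !GEOID![7:], the shapefile uses !GEOID! unchanged.
    digits = geoid[7:] if strip_prefix else geoid
    return float(digits)

# Assumed GEOID formats, for illustration only.
table_geoid = "14000US24001000100"  # census table row
shape_geoid = "24001000100"         # tract shapefile row

table_key = tract_key(table_geoid, strip_prefix=True)
shape_key = tract_key(shape_geoid, strip_prefix=False)
print(table_key == shape_key)
```

If the two keys do not compare equal for a known tract, the join in step 7 will not find a match for it.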

### Load Data

In [3]:
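# Build paths with os.path.join; "\{" inside a non-raw f-string is an invalid escape sequence
pop_df = pd.read_csv(os.path.join(path_in, fileIn_1))
pov_df = pd.read_csv(os.path.join(path_in, fileIn_2))
tracts = gpd.read_file(os.path.join(path_in, json_in))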

In [4]:
tracts.drop(columns=['INTPTLAT', 'INTPTLON', 'ALAND', 'AWATER', 'FID'], inplace=True)

In [5]:
# Create temporary dataframe for calculations
calc_df = pd.DataFrame(tracts['Tract'])

In [7]:
pops = []
sums = []
pov_cols = ['C17002e2', 'C17002e3', 'C17002e4',
            'C17002e5', 'C17002e6', 'C17002e7']

# Collect values from dataframes, one entry per tract
for i in calc_df['Tract']:
    # Total population for this tract
    pops.append(pop_df.loc[pop_df['Tract'] == i, 'B01001e1'].values[0])
    # Take the first matching row so each tract contributes exactly one sum
    row = pov_df.loc[pov_df['Tract'] == i, pov_cols].values[0]
    # Sum our poverty numbers
    sums.append(sum(int(item) for item in row))
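
The per-tract lookup above can also be written as a join. The following is a sketch of that alternative, not the approach used in this notebook: the hypothetical helper `build_calc_df` sums the six poverty fields row-wise and left-joins both tables on `Tract`. The toy tables contain made-up values for illustration only.

```python
import pandas as pd

POV_COLS = ['C17002e2', 'C17002e3', 'C17002e4',
            'C17002e5', 'C17002e6', 'C17002e7']

def build_calc_df(tract_ids, pop_df, pov_df):
    """Hypothetical helper: one row per tract with Pop and Poverty columns."""
    calc = pd.DataFrame({'Tract': list(tract_ids)})
    pop = pop_df[['Tract', 'B01001e1']].rename(columns={'B01001e1': 'Pop'})
    # Row-wise sum of the six poverty-ratio counts
    pov = pov_df[['Tract']].assign(Poverty=pov_df[POV_COLS].sum(axis=1))
    # A left join keeps every tract in its original order; validate='m:1'
    # raises pandas.errors.MergeError if a table repeats a Tract value.
    calc = calc.merge(pop, on='Tract', how='left', validate='m:1')
    return calc.merge(pov, on='Tract', how='left', validate='m:1')

# Toy data for illustration only (not real census values)
toy_pop = pd.DataFrame({'Tract': [1.0, 2.0], 'B01001e1': [100, 200]})
toy_pov = pd.DataFrame({'Tract': [1.0, 2.0],
                        'C17002e2': [1, 10], 'C17002e3': [2, 10],
                        'C17002e4': [3, 10], 'C17002e5': [4, 10],
                        'C17002e6': [5, 10], 'C17002e7': [6, 10]})
toy_calc = build_calc_df([2.0, 1.0], toy_pop, toy_pov)
print(toy_calc)
```

With a left join, a tract that is missing from either table gets `NaN` in the corresponding column instead of raising an `IndexError`, so gaps in the census tables can be inspected after the fact.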

In [9]:
# Add lists to calc_df
calc_df['Pop'] = pops
calc_df['Poverty'] = sums

In [11]:
calc_df['perc_Poverty'] = calc_df['Poverty']/calc_df['Pop']
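
Tracts with a population of zero are worth a second look here, since plain division in pandas yields `NaN` for 0/0 and `inf` for a nonzero count over 0. A minimal sketch of a guarded version follows; `poverty_share` is a hypothetical helper and the toy values are made up for illustration.

```python
import pandas as pd

def poverty_share(poverty: pd.Series, pop: pd.Series) -> pd.Series:
    """Hypothetical helper: Poverty / Pop, with NaN where Pop is not positive."""
    safe_pop = pop.where(pop > 0)  # non-positive populations become NaN
    return poverty / safe_pop

# Toy data for illustration only (not real census values)
toy = pd.DataFrame({'Pop': [100, 0, 0], 'Poverty': [21, 0, 5]})
toy['perc_Poverty'] = poverty_share(toy['Poverty'], toy['Pop'])
print(toy)
```

Note that the result is a fraction of the tract population rather than a value on a 0 to 100 scale.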

In [14]:
tracts['perc_Poverty'] = calc_df['perc_Poverty']

In [19]:
tracts.to_file(os.path.join(path_out, json_out), driver="GeoJSON")