<a href="https://colab.research.google.com/github/ertomz/h4bl-superfund-website/blob/main/H4BL_Demographic_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import pandas as pd
import numpy as np

Let's import some data:

In [None]:
#census data
#here's the metadata:
#https://www2.census.gov/programs-surveys/popest/technical-documentation/file-layouts/2010-2019/cc-est2019-alldata.pdf

In [None]:
census = pd.read_csv('census2019countyCLEANED.csv', encoding = 'ISO-8859-1')
census.head(300)

Unnamed: 0,YEAR,AGEGRP,STATE,COUNTY,STNAME,CTYNAME,TOT_POP,BAC_MALE,BAC_FEMALE
0,12,0,1,1,Alabama,Autauga County,55869,5514,6270
1,12,0,1,3,Alabama,Baldwin County,223234,10346,11008
2,12,0,1,5,Alabama,Barbour County,24686,6432,5685
3,12,0,1,7,Alabama,Bibb County,22394,3010,1897
4,12,0,1,9,Alabama,Blount County,57826,658,618
...,...,...,...,...,...,...,...,...,...
295,12,0,8,101,Colorado,Pueblo County,168424,3126,2575
296,12,0,8,103,Colorado,Rio Blanco County,6324,79,69
297,12,0,8,105,Colorado,Rio Grande County,11267,117,88
298,12,0,8,107,Colorado,Routt County,25638,222,157


The Superfund site data lists county names without suffixes such as 'County' or 'Parish', so we strip those suffixes from the census names to make the two datasets match.

In [None]:
# Strip the longest suffixes first, so 'City and Borough' is removed
# before the shorter 'Borough' can break it apart.
census['CTYNAME']=census['CTYNAME'].str.replace('City and Borough', '', regex=False)
census['CTYNAME']=census['CTYNAME'].str.replace('Census Area', '', regex=False)
census['CTYNAME']=census['CTYNAME'].str.replace('County', '', regex=False)
census['CTYNAME']=census['CTYNAME'].str.replace('Parish', '', regex=False)
census['CTYNAME']=census['CTYNAME'].str.replace('Borough', '', regex=False)
census['CTYNAME']=census['CTYNAME'].str.replace('Municipality', '', regex=False)
census['CTYNAME']=census['CTYNAME'].str.replace('city', '', regex=False)

#census['CTYNAME']=census['CTYNAME'].str.split(' ',expand=True)[0:-1].str[:-1]

census['CTYNAME']=census['CTYNAME'].str.strip(' ')
census

Unnamed: 0,YEAR,AGEGRP,STATE,COUNTY,STNAME,CTYNAME,TOT_POP,BAC_MALE,BAC_FEMALE,CountyState
0,12,0,1,1,Alabama,Autauga,55869,5514,6270,"Autauga, Alabama"
1,12,0,1,3,Alabama,Baldwin,223234,10346,11008,"Baldwin, Alabama"
2,12,0,1,5,Alabama,Barbour,24686,6432,5685,"Barbour, Alabama"
3,12,0,1,7,Alabama,Bibb,22394,3010,1897,"Bibb, Alabama"
4,12,0,1,9,Alabama,Blount,57826,658,618,"Blount, Alabama"
...,...,...,...,...,...,...,...,...,...,...
3137,12,0,56,37,Wyoming,Sweetwater,42343,481,389,"Sweetwater, Wyoming"
3138,12,0,56,39,Wyoming,Teton,23464,147,110,"Teton, Wyoming"
3139,12,0,56,41,Wyoming,Uinta,20226,122,111,"Uinta, Wyoming"
3140,12,0,56,43,Wyoming,Washakie,7805,55,38,"Washakie, Wyoming"


Ah, much better :)
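
As a hedged alternative to the chain of replacements above, the same clean-up can be sketched as one regular expression. The sample names below are illustrative, not rows pulled from the census file, and unlike the cell above this pattern only removes a suffix at the end of the name.

```python
import pandas as pd

# Illustrative names only; the suffix list follows the suffixes targeted above.
names = pd.Series([
    "Autauga County",
    "Acadia Parish",
    "Juneau City and Borough",
    "Anchorage Municipality",
    "Bethel Census Area",
    "Baltimore city",
])

# Longer alternatives are listed first so "City and Borough" is tried
# before the shorter "Borough".
suffix_pattern = (
    r"\s+(?:City and Borough|Census Area|Municipality|Borough|County|Parish|city)$"
)

cleaned = names.str.replace(suffix_pattern, "", regex=True).str.strip()
print(cleaned.tolist())
```

Anchoring the pattern with `$` is a design choice: it avoids touching a word like "County" if it ever appears in the middle of a name.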

Now, we make a 'County, State' column so that we can match Superfund sites to the correct census rows. The county name alone is not enough, because the same name can appear in more than one state (there is a Yuma in both Arizona and Colorado, for example).

In [None]:
census['CountyState']= census['CTYNAME'].str.cat(census['STNAME'], sep =", ") 
census

Unnamed: 0,YEAR,AGEGRP,STATE,COUNTY,STNAME,CTYNAME,TOT_POP,BAC_MALE,BAC_FEMALE,CountyState
0,12,0,1,1,Alabama,Autauga,55869,5514,6270,"Autauga, Alabama"
1,12,0,1,3,Alabama,Baldwin,223234,10346,11008,"Baldwin, Alabama"
2,12,0,1,5,Alabama,Barbour,24686,6432,5685,"Barbour, Alabama"
3,12,0,1,7,Alabama,Bibb,22394,3010,1897,"Bibb, Alabama"
4,12,0,1,9,Alabama,Blount,57826,658,618,"Blount, Alabama"
...,...,...,...,...,...,...,...,...,...,...
3137,12,0,56,37,Wyoming,Sweetwater,42343,481,389,"Sweetwater, Wyoming"
3138,12,0,56,39,Wyoming,Teton,23464,147,110,"Teton, Wyoming"
3139,12,0,56,41,Wyoming,Uinta,20226,122,111,"Uinta, Wyoming"
3140,12,0,56,43,Wyoming,Washakie,7805,55,38,"Washakie, Wyoming"
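
To see why the combined key matters, here is a small sketch on a toy table. The three rows are a hand-built miniature, not the real census frame, and the variable names are hypothetical.

```python
import pandas as pd

# Hand-built miniature of the census table (assumption: not the real data).
toy = pd.DataFrame({
    "CTYNAME": ["Yuma", "Yuma", "Autauga"],
    "STNAME": ["Arizona", "Colorado", "Alabama"],
})
toy["CountyState"] = toy["CTYNAME"].str.cat(toy["STNAME"], sep=", ")

# County names alone collide across states...
name_is_ambiguous = toy["CTYNAME"].duplicated(keep=False)
# ...while the combined key identifies each row uniquely.
key_is_unique = toy["CountyState"].is_unique

print(toy[name_is_ambiguous])
print(key_is_unique)
```

Running the same `is_unique` check on the full `census['CountyState']` column would be a reasonable sanity test before joining.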


### Alright, now moving on to the Superfund data:

In [None]:
#superfund data
superfunds = pd.read_csv("superfunds.csv")
superfunds

Unnamed: 0,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y
0,Army Creek Landfill,69.92,DED980494496,300086,3,Delaware,New Castle County,New Castle,NPL Site,39.653061,-75.608331,12/30/1982,09/08/1983,04/29/1994,236,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,13,-8.416681e+06,4.815653e+06
1,Delaware Sand & Gravel Landfill,46.60,DED000605972,300034,3,Delaware,New Castle County,New Castle,NPL Site,39.651389,-75.602781,12/30/1982,09/08/1983,08/12/1997,445,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,14,-8.416063e+06,4.815412e+06
2,Delaware City PVC Plant,30.55,DED980551667,300091,3,Delaware,Delaware City,New Castle,NPL Site,39.586111,-75.649439,12/30/1982,09/08/1983,09/26/2001,787,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,15,-8.421257e+06,4.805978e+06
3,"Harvey & Knott Drum, Inc.",30.77,DED980713093,300123,3,Delaware,Kirkwood,New Castle,NPL Site,39.573331,-75.770839,12/30/1982,09/08/1983,06/22/1994,238,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,16,-8.434771e+06,4.804132e+06
4,New Castle Steel,30.40,DED980705255,300106,3,Delaware,New Castle County,New Castle,Deleted NPL Site,39.657781,-75.577769,12/30/1982,09/08/1983,08/17/1988,30,09/22/1988,03/17/1989,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,No,,,,,17,-8.413279e+06,4.816336e+06
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1705,Barber Orchard,70.71,NCSFN0406989,406989,4,North Carolina,Waynesville,Haywood,NPL Site,35.445833,-83.063889,01/11/2001,09/13/2001,09/29/2011,1116,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,173,-9.246630e+06,4.224634e+06
1706,Cooper Drum Company,50.00,CAD055753370,903253,9,California,South Gate,Los Angeles,NPL Site,33.946972,-118.179694,01/11/2001,06/14/2001,,0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,354,-1.315570e+07,4.021684e+06
1707,Quanta Resources,50.00,NJD000606442,200034,2,New Jersey,Edgewater,Bergen,NPL Site,40.804306,-73.989167,01/11/2001,09/05/2002,,0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2002-...",,,,No,,,,,1371,-8.236436e+06,4.983520e+06
1708,Griggs & Walnut Ground Water Plume,50.00,NM0002271286,605116,6,New Mexico,Las Cruces,Dona Ana,NPL Site,32.315556,-106.760000,01/11/2001,06/14/2001,07/20/2012,1124,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,1405,-1.188447e+07,3.804804e+06


### Let's explore the makeup of these Superfund sites.
Which are active? Completed? TBD?

In [None]:
active = superfunds[superfunds["Status"] == "NPL Site"].shape
deleted = superfunds[superfunds["Status"] == "Deleted NPL Site"].shape
proposed = superfunds[superfunds["Status"] == "Proposed NPL Site"].shape

(active, deleted, proposed)

((1259, 32), (404, 32), (47, 32))

There are 1259 active sites, 404 deleted sites, and 47 proposed sites at the moment. 
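
The same tally can be sketched in one call with `value_counts`, which avoids keeping three separately named variables in sync. The frame below is a hypothetical stand-in for `superfunds`; only the `Status` column matters here.

```python
import pandas as pd

# Hypothetical stand-in for the Superfund table (five made-up rows).
toy_sites = pd.DataFrame({
    "Status": [
        "NPL Site",
        "NPL Site",
        "Deleted NPL Site",
        "Proposed NPL Site",
        "NPL Site",
    ],
})

# One row per distinct status, with the number of sites in each.
status_counts = toy_sites["Status"].value_counts()
print(status_counts)
```

On the real data this would be `superfunds["Status"].value_counts()`.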

Let's make another one of those 'County, State' columns for our upcoming join:

In [None]:
superfunds['CountyState']= superfunds['County'].str.cat(superfunds['State'], sep =", ") 
superfunds

Unnamed: 0,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y,CountyState
0,Army Creek Landfill,69.92,DED980494496,300086,3,Delaware,New Castle County,New Castle,NPL Site,39.653061,-75.608331,12/30/1982,09/08/1983,04/29/1994,236,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,13,-8.416681e+06,4.815653e+06,"New Castle, Delaware"
1,Delaware Sand & Gravel Landfill,46.60,DED000605972,300034,3,Delaware,New Castle County,New Castle,NPL Site,39.651389,-75.602781,12/30/1982,09/08/1983,08/12/1997,445,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,14,-8.416063e+06,4.815412e+06,"New Castle, Delaware"
2,Delaware City PVC Plant,30.55,DED980551667,300091,3,Delaware,Delaware City,New Castle,NPL Site,39.586111,-75.649439,12/30/1982,09/08/1983,09/26/2001,787,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,15,-8.421257e+06,4.805978e+06,"New Castle, Delaware"
3,"Harvey & Knott Drum, Inc.",30.77,DED980713093,300123,3,Delaware,Kirkwood,New Castle,NPL Site,39.573331,-75.770839,12/30/1982,09/08/1983,06/22/1994,238,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,16,-8.434771e+06,4.804132e+06,"New Castle, Delaware"
4,New Castle Steel,30.40,DED980705255,300106,3,Delaware,New Castle County,New Castle,Deleted NPL Site,39.657781,-75.577769,12/30/1982,09/08/1983,08/17/1988,30,09/22/1988,03/17/1989,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,No,,,,,17,-8.413279e+06,4.816336e+06,"New Castle, Delaware"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1705,Barber Orchard,70.71,NCSFN0406989,406989,4,North Carolina,Waynesville,Haywood,NPL Site,35.445833,-83.063889,01/11/2001,09/13/2001,09/29/2011,1116,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,173,-9.246630e+06,4.224634e+06,"Haywood, North Carolina"
1706,Cooper Drum Company,50.00,CAD055753370,903253,9,California,South Gate,Los Angeles,NPL Site,33.946972,-118.179694,01/11/2001,06/14/2001,,0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,354,-1.315570e+07,4.021684e+06,"Los Angeles, California"
1707,Quanta Resources,50.00,NJD000606442,200034,2,New Jersey,Edgewater,Bergen,NPL Site,40.804306,-73.989167,01/11/2001,09/05/2002,,0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2002-...",,,,No,,,,,1371,-8.236436e+06,4.983520e+06,"Bergen, New Jersey"
1708,Griggs & Walnut Ground Water Plume,50.00,NM0002271286,605116,6,New Mexico,Las Cruces,Dona Ana,NPL Site,32.315556,-106.760000,01/11/2001,06/14/2001,07/20/2012,1124,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,1405,-1.188447e+07,3.804804e+06,"Dona Ana, New Mexico"


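Next, we index the Superfund table by the 'County, State' column we prepared above. Note that `.squeeze()` only collapses a DataFrame to a Series when it has a single column; with this many columns the result is still a DataFrame, as the output shows.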

In [None]:
superfunds_series = superfunds.set_index('CountyState').squeeze()
superfunds_series 

Unnamed: 0_level_0,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1
"New Castle, Delaware",Army Creek Landfill,69.92,DED980494496,300086,3,Delaware,New Castle County,New Castle,NPL Site,39.653061,-75.608331,12/30/1982,09/08/1983,04/29/1994,236,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,13,-8.416681e+06,4.815653e+06
"New Castle, Delaware",Delaware Sand & Gravel Landfill,46.60,DED000605972,300034,3,Delaware,New Castle County,New Castle,NPL Site,39.651389,-75.602781,12/30/1982,09/08/1983,08/12/1997,445,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,14,-8.416063e+06,4.815412e+06
"New Castle, Delaware",Delaware City PVC Plant,30.55,DED980551667,300091,3,Delaware,Delaware City,New Castle,NPL Site,39.586111,-75.649439,12/30/1982,09/08/1983,09/26/2001,787,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,15,-8.421257e+06,4.805978e+06
"New Castle, Delaware","Harvey & Knott Drum, Inc.",30.77,DED980713093,300123,3,Delaware,Kirkwood,New Castle,NPL Site,39.573331,-75.770839,12/30/1982,09/08/1983,06/22/1994,238,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,16,-8.434771e+06,4.804132e+06
"New Castle, Delaware",New Castle Steel,30.40,DED980705255,300106,3,Delaware,New Castle County,New Castle,Deleted NPL Site,39.657781,-75.577769,12/30/1982,09/08/1983,08/17/1988,30,09/22/1988,03/17/1989,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,No,,,,,17,-8.413279e+06,4.816336e+06
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Haywood, North Carolina",Barber Orchard,70.71,NCSFN0406989,406989,4,North Carolina,Waynesville,Haywood,NPL Site,35.445833,-83.063889,01/11/2001,09/13/2001,09/29/2011,1116,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,173,-9.246630e+06,4.224634e+06
"Los Angeles, California",Cooper Drum Company,50.00,CAD055753370,903253,9,California,South Gate,Los Angeles,NPL Site,33.946972,-118.179694,01/11/2001,06/14/2001,,0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,354,-1.315570e+07,4.021684e+06
"Bergen, New Jersey",Quanta Resources,50.00,NJD000606442,200034,2,New Jersey,Edgewater,Bergen,NPL Site,40.804306,-73.989167,01/11/2001,09/05/2002,,0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2002-...",,,,No,,,,,1371,-8.236436e+06,4.983520e+06
"Dona Ana, New Mexico",Griggs & Walnut Ground Water Plume,50.00,NM0002271286,605116,6,New Mexico,Las Cruces,Dona Ana,NPL Site,32.315556,-106.760000,01/11/2001,06/14/2001,07/20/2012,1124,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2001-...",,,,No,,,,,1405,-1.188447e+07,3.804804e+06

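A short sketch of how `.squeeze()` behaves, since the variable name `superfunds_series` suggests a Series. The toy frame is hypothetical and reuses two county keys and scores from the table above.

```python
import pandas as pd

toy = pd.DataFrame({
    "CountyState": ["New Castle, Delaware", "Bergen, New Jersey"],
    "Site Score": [69.92, 50.00],
    "Status": ["NPL Site", "NPL Site"],
})

# Two value columns: no axis has length 1, so squeeze() changes nothing.
two_cols = toy.set_index("CountyState").squeeze()

# One value column: the column axis has length 1 and is squeezed away.
one_col = toy[["CountyState", "Site Score"]].set_index("CountyState").squeeze()

print(type(two_cols).__name__)
print(type(one_col).__name__)
```

So for the wide Superfund table, `set_index('CountyState')` alone would give the same result.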

### Let's make a column for each county's % Black population: add the Black (alone or in combination) male (`BAC_MALE`) and female (`BAC_FEMALE`) counts, then divide by the county's total population (`TOT_POP`).

In [None]:
census.head(20)

Unnamed: 0,YEAR,AGEGRP,STATE,COUNTY,STNAME,CTYNAME,TOT_POP,BAC_MALE,BAC_FEMALE,CountyState
0,12,0,1,1,Alabama,Autauga,55869,5514,6270,"Autauga, Alabama"
1,12,0,1,3,Alabama,Baldwin,223234,10346,11008,"Baldwin, Alabama"
2,12,0,1,5,Alabama,Barbour,24686,6432,5685,"Barbour, Alabama"
3,12,0,1,7,Alabama,Bibb,22394,3010,1897,"Bibb, Alabama"
4,12,0,1,9,Alabama,Blount,57826,658,618,"Blount, Alabama"
5,12,0,1,11,Alabama,Bullock,10101,3753,3418,"Bullock, Alabama"
6,12,0,1,13,Alabama,Butler,19448,3941,4924,"Butler, Alabama"
7,12,0,1,15,Alabama,Calhoun,113605,11846,13637,"Calhoun, Alabama"
8,12,0,1,17,Alabama,Chambers,33254,6351,7276,"Chambers, Alabama"
9,12,0,1,19,Alabama,Cherokee,26196,642,604,"Cherokee, Alabama"


### Joining the Superfund and County Data

In [None]:
census["PCNT_BLACK"] = (census["BAC_MALE"] + census["BAC_FEMALE"]) / census["TOT_POP"]

census_series = census.set_index('CountyState').squeeze()

superfund_and_census = census_series.join(superfunds_series)

#remove any counties that aren't in the superfund dataset
#superfund_and_census = superfund_and_census[superfund_and_census['Site Score'].notna()]
superfund_and_census

Unnamed: 0_level_0,YEAR,AGEGRP,STATE,COUNTY,STNAME,CTYNAME,TOT_POP,BAC_MALE,BAC_FEMALE,PCNT_BLACK,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1
"Abbeville, South Carolina",12,0,45,1,South Carolina,Abbeville,24527,3304,3710,0.285971,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Acadia, Louisiana",12,0,22,1,Louisiana,Acadia,62045,5791,6108,0.191780,EVR-Wood Treating/Evangeline Refining Company,48.20,LAN000605517,605517.0,6.0,Louisiana,Jennings,Acadia,NPL Site,30.248056,-92.6175,03/15/2012,09/18/2012,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...",,,,No,,,,,553.0,-1.031013e+07,3.535475e+06
"Accomack, Virginia",12,0,51,1,Virginia,Accomack,32316,4630,5002,0.298057,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Ada, Idaho",12,0,16,1,Idaho,Ada,481587,5778,4613,0.021577,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Adair, Iowa",12,0,19,1,Iowa,Adair,7152,45,37,0.011465,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Yuma, Arizona",12,0,4,27,Arizona,Yuma,213787,4315,2726,0.032935,Yuma Marine Corps Air Station,32.24,AZ0971590062,900885.0,9.0,Arizona,Yuma,Yuma,NPL Site,32.654581,-114.5888,06/24/1988,02/21/1990,09/20/2000,720.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,851.0,-1.275597e+07,3.849545e+06
"Yuma, Colorado",12,0,8,125,Colorado,Yuma,10019,63,42,0.010480,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Zapata, Texas",12,0,48,505,Texas,Zapata,14179,61,42,0.007264,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Zavala, Texas",12,0,48,507,Texas,Zavala,11840,107,79,0.015709,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

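A note on what the join above does: `DataFrame.join` defaults to a left join on the index, so every census county is kept, counties with several sites are repeated once per site, and sites whose key has no census match are silently dropped. The sketch below illustrates this on toy frames; the site names and the unmatched key are made up.

```python
import pandas as pd

census_toy = pd.DataFrame(
    {"PCNT_BLACK": [0.285971, 0.191780]},
    index=pd.Index(
        ["Abbeville, South Carolina", "Acadia, Louisiana"], name="CountyState"
    ),
)

# Hypothetical sites: two in one county, one with a key the census lacks.
sites_toy = pd.DataFrame(
    {"Site Name": ["Site A", "Site B", "Site C"]},
    index=pd.Index(
        ["Acadia, Louisiana", "Acadia, Louisiana", "Nowhere, Atlantis"],
        name="CountyState",
    ),
)

joined = census_toy.join(sites_toy)  # how="left" is the default

# Site keys that found no census row, e.g. because of spelling differences.
unmatched = sites_toy.index.difference(census_toy.index)

print(joined)
print(list(unmatched))
```

Checking `unmatched` on the real tables would show which Superfund sites fall out of the analysis because their county key differs between the two sources.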

Let's check that we're only using the 'all age groups' rows (`AGEGRP` 0) and the most recent population estimate, July 2019 (`YEAR` 12). Then, let's keep only the columns that will be helpful for us and compute a 'Percent Black' column (`PCNT_BLACK`) that gives the Black proportion of each county's population.

In [None]:
census = census[census['AGEGRP'] == 0]
census = census[census['YEAR'] == 12]

black_census = census.loc[:, ["STATE", "CountyState", "STNAME", "CTYNAME", "TOT_POP", "BAC_MALE", "BAC_FEMALE"]]
black_census["PCNT_BLACK"] = (black_census["BAC_MALE"] + black_census["BAC_FEMALE"]) / black_census["TOT_POP"]
black_census

Unnamed: 0,STATE,CountyState,STNAME,CTYNAME,TOT_POP,BAC_MALE,BAC_FEMALE,PCNT_BLACK
0,1,"Autauga, Alabama",Alabama,Autauga,55869,5514,6270,0.210922
1,1,"Baldwin, Alabama",Alabama,Baldwin,223234,10346,11008,0.095657
2,1,"Barbour, Alabama",Alabama,Barbour,24686,6432,5685,0.490845
3,1,"Bibb, Alabama",Alabama,Bibb,22394,3010,1897,0.219121
4,1,"Blount, Alabama",Alabama,Blount,57826,658,618,0.022066
...,...,...,...,...,...,...,...,...
3137,56,"Sweetwater, Wyoming",Wyoming,Sweetwater,42343,481,389,0.020546
3138,56,"Teton, Wyoming",Wyoming,Teton,23464,147,110,0.010953
3139,56,"Uinta, Wyoming",Wyoming,Uinta,20226,122,111,0.011520
3140,56,"Washakie, Wyoming",Wyoming,Washakie,7805,55,38,0.011915
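
As a worked check of the `PCNT_BLACK` formula, the sketch below recomputes the value for a single county using the Autauga, Alabama figures shown in the tables above. The one-row frame is built by hand for illustration.

```python
import pandas as pd

# Figures copied from the Autauga, Alabama row above.
row = pd.DataFrame({
    "CountyState": ["Autauga, Alabama"],
    "TOT_POP": [55869],
    "BAC_MALE": [5514],
    "BAC_FEMALE": [6270],
})

# (5514 + 6270) / 55869
row["PCNT_BLACK"] = (row["BAC_MALE"] + row["BAC_FEMALE"]) / row["TOT_POP"]

autauga_share = round(row.loc[0, "PCNT_BLACK"], 6)
print(autauga_share)
```

The result agrees with the 0.210922 listed for Autauga in the output above. Note that the value is a proportion between 0 and 1, not a percentage.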


In [None]:
superfund_and_census.to_csv("Census_County_and_Superfund_ALL.csv", encoding='utf-8', index=False)
superfund_and_census

Unnamed: 0_level_0,YEAR,AGEGRP,STATE,COUNTY,STNAME,CTYNAME,TOT_POP,BAC_MALE,BAC_FEMALE,PCNT_BLACK,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1
"Abbeville, South Carolina",12,0,45,1,South Carolina,Abbeville,24527,3304,3710,0.285971,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Acadia, Louisiana",12,0,22,1,Louisiana,Acadia,62045,5791,6108,0.191780,EVR-Wood Treating/Evangeline Refining Company,48.20,LAN000605517,605517.0,6.0,Louisiana,Jennings,Acadia,NPL Site,30.248056,-92.6175,03/15/2012,09/18/2012,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...",,,,No,,,,,553.0,-1.031013e+07,3.535475e+06
"Accomack, Virginia",12,0,51,1,Virginia,Accomack,32316,4630,5002,0.298057,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Ada, Idaho",12,0,16,1,Idaho,Ada,481587,5778,4613,0.021577,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Adair, Iowa",12,0,19,1,Iowa,Adair,7152,45,37,0.011465,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Yuma, Arizona",12,0,4,27,Arizona,Yuma,213787,4315,2726,0.032935,Yuma Marine Corps Air Station,32.24,AZ0971590062,900885.0,9.0,Arizona,Yuma,Yuma,NPL Site,32.654581,-114.5888,06/24/1988,02/21/1990,09/20/2000,720.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,851.0,-1.275597e+07,3.849545e+06
"Yuma, Colorado",12,0,8,125,Colorado,Yuma,10019,63,42,0.010480,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Zapata, Texas",12,0,48,505,Texas,Zapata,14179,61,42,0.007264,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
"Zavala, Texas",12,0,48,507,Texas,Zavala,11840,107,79,0.015709,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


Let's keep only the counties whose Black population share is above the national average of about 13%. We can also drop the raw population columns, since we no longer need them.

source: https://www.indexmundi.com/facts/united-states/quick-facts/all-states/black-population-percentage#map 



In [None]:
black_census = black_census[black_census["PCNT_BLACK"] > 0.13]
del black_census['BAC_MALE']
del black_census['BAC_FEMALE']
del black_census['TOT_POP']
del black_census['STATE']
black_census

Unnamed: 0,CountyState,STNAME,CTYNAME,PCNT_BLACK
0,"Autauga, Alabama",Alabama,Autauga,0.210922
2,"Barbour, Alabama",Alabama,Barbour,0.490845
3,"Bibb, Alabama",Alabama,Bibb,0.219121
5,"Bullock, Alabama",Alabama,Bullock,0.709930
6,"Butler, Alabama",Alabama,Butler,0.455831
...,...,...,...,...
2950,"Waynesboro, Virginia",Virginia,Waynesboro,0.160097
2951,"Williamsburg, Virginia",Virginia,Williamsburg,0.175003
2952,"Winchester, Virginia",Virginia,Winchester,0.138151
3087,"Milwaukee, Wisconsin",Wisconsin,Milwaukee,0.291590


In [None]:
black_census.to_csv("Black_Census_County.csv", encoding='utf-8', index=False)

747 of the 3142 counties in this census data have a Black population percentage that is higher than the national average (13%). That is a little under 24%. 
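
The filter and the share quoted above can be sketched as follows. The toy frame reuses the first five Alabama values from the earlier output; `THRESHOLD` is a name introduced here for illustration.

```python
import pandas as pd

THRESHOLD = 0.13  # national average used in this notebook

toy = pd.DataFrame({
    "CountyState": [
        "Autauga, Alabama",
        "Baldwin, Alabama",
        "Barbour, Alabama",
        "Bibb, Alabama",
        "Blount, Alabama",
    ],
    "PCNT_BLACK": [0.210922, 0.095657, 0.490845, 0.219121, 0.022066],
})

# Boolean mask: True where the county is above the threshold.
above = toy["PCNT_BLACK"] > THRESHOLD
kept = int(above.sum())

# Share and remainder for the counts quoted in the text.
share_pct = round(747 / 3142 * 100, 1)
removed = 3142 - 747

print(kept, len(toy))
print(share_pct, removed)
```

On the full table, `(black_census["PCNT_BLACK"] > 0.13).mean()` evaluated before filtering would give the share directly.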

Let's index this table by 'CountyState' too, so the join can align on that key.

In [None]:
black_census_series = black_census.set_index('CountyState').squeeze()
black_census_series

Unnamed: 0_level_0,STNAME,CTYNAME,PCNT_BLACK
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
"Autauga, Alabama",Alabama,Autauga,0.210922
"Barbour, Alabama",Alabama,Barbour,0.490845
"Bibb, Alabama",Alabama,Bibb,0.219121
"Bullock, Alabama",Alabama,Bullock,0.709930
"Butler, Alabama",Alabama,Butler,0.455831
...,...,...,...
"Waynesboro, Virginia",Virginia,Waynesboro,0.160097
"Williamsburg, Virginia",Virginia,Williamsburg,0.175003
"Winchester, Virginia",Virginia,Winchester,0.138151
"Milwaukee, Wisconsin",Wisconsin,Milwaukee,0.291590


We now see that the counties with Black & Black-mixed populations of 13% or less are gone. This step removed 2,395 of the 3,142 counties.

### Next, let's join the Superfund data and Black Census data on their 'County, State' key.

In [None]:
superfund_and_blackcensus = black_census_series.join(superfunds_series)

#remove any counties that aren't in the superfund dataset
superfund_and_blackcensus = superfund_and_blackcensus[superfund_and_blackcensus['Site Score'].notna()]
superfund_and_blackcensus

Unnamed: 0_level_0,STNAME,CTYNAME,PCNT_BLACK,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1
"Acadia, Louisiana",Louisiana,Acadia,0.191780,EVR-Wood Treating/Evangeline Refining Company,48.20,LAN000605517,605517.0,6.0,Louisiana,Jennings,Acadia,NPL Site,30.248056,-92.617500,03/15/2012,09/18/2012,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...",,,,No,,,,,553.0,-1.031013e+07,3.535475e+06
"Aiken, South Carolina",South Carolina,Aiken,0.265216,Clearwater Finishing,47.99,SCD003303120,403391.0,4.0,South Carolina,Clearwater,Aiken,NPL Site,33.500907,-81.892130,11/08/2019,09/03/2020,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2019-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2020-...",,,,No,,,,,1204.0,-9.116190e+06,3.961981e+06
"Alachua, Florida",Florida,Alachua,0.222459,Cabot/Koppers,36.69,FLD980709356,400903.0,4.0,Florida,Gainesville,Alachua,NPL Site,29.675000,-82.323061,09/08/1983,09/21/1984,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,62.0,-9.164161e+06,3.461842e+06
"Albany, New York",New York,Albany,0.158632,"Mercury Refining, Inc.",44.58,NYD048148175,201552.0,2.0,New York,Colonie,Albany,NPL Site,42.689719,-73.804169,12/30/1982,09/08/1983,04/30/2015,1170.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1535.0,-8.215843e+06,5.264863e+06
"Alexander, Illinois",Illinois,Alexander,0.332755,Ilada Energy Co.,34.21,ILD980996789,500942.0,5.0,Illinois,East Cape Girardeau,Alexander,Deleted NPL Site,37.258400,-89.463500,06/24/1988,10/04/1989,09/28/1999,654.0,11/09/2000,01/08/2001,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2000-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2000-...",,No,,,,,412.0,-9.959031e+06,4.475186e+06
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"York, South Carolina",South Carolina,York,0.206457,"Leonard Chemical Co., Inc.",47.10,SCD991279324,403481.0,4.0,South Carolina,Rock Hill,York,NPL Site,34.851669,-80.904169,09/08/1983,09/21/1984,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1176.0,-9.006211e+06,4.143742e+06
"York, South Carolina",South Carolina,York,0.206457,Rock Hill Chemical Co.,40.29,SCD980844005,403425.0,4.0,South Carolina,Rock Hill,York,NPL Site,34.966100,-80.998500,06/24/1988,02/21/1990,12/31/1996,419.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1187.0,-9.016712e+06,4.159275e+06
"York, Virginia",Virginia,York,0.156781,Chisman Creek,47.19,VAD980712913,302756.0,3.0,Virginia,York County,York,NPL Site,37.177000,-76.463100,12/30/1982,09/08/1983,12/21/1990,50.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1023.0,-8.511833e+06,4.463807e+06
"York, Virginia",Virginia,York,0.156781,Naval Weapons Station - Yorktown,50.00,VA8170024170,302869.0,3.0,Virginia,Yorktown,York,NPL Site,37.245833,-76.588889,02/07/1992,10/14/1992,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1042.0,-8.525836e+06,4.473428e+06


In [None]:
superfund_and_blackcensus.to_csv("SuperFund_and_BlackCensus.csv", encoding='utf-8', index=False)

Here we have all the Superfund sites in counties with a Black population share above 13%: all **587** of them (that's ***34%*** of all Superfund sites in the database). Let's just clean it up a smidge:

In [None]:
del superfund_and_blackcensus["STNAME"]
superfund_and_blackcensus

Unnamed: 0_level_0,CTYNAME,PCNT_BLACK,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1
"Acadia, Louisiana",Acadia,0.191780,EVR-Wood Treating/Evangeline Refining Company,48.20,LAN000605517,605517.0,6.0,Louisiana,Jennings,Acadia,NPL Site,30.248056,-92.617500,03/15/2012,09/18/2012,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...",,,,No,,,,,553.0,-1.031013e+07,3.535475e+06
"Aiken, South Carolina",Aiken,0.265216,Clearwater Finishing,47.99,SCD003303120,403391.0,4.0,South Carolina,Clearwater,Aiken,NPL Site,33.500907,-81.892130,11/08/2019,09/03/2020,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2019-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2020-...",,,,No,,,,,1204.0,-9.116190e+06,3.961981e+06
"Alachua, Florida",Alachua,0.222459,Cabot/Koppers,36.69,FLD980709356,400903.0,4.0,Florida,Gainesville,Alachua,NPL Site,29.675000,-82.323061,09/08/1983,09/21/1984,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,62.0,-9.164161e+06,3.461842e+06
"Albany, New York",Albany,0.158632,"Mercury Refining, Inc.",44.58,NYD048148175,201552.0,2.0,New York,Colonie,Albany,NPL Site,42.689719,-73.804169,12/30/1982,09/08/1983,04/30/2015,1170.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1535.0,-8.215843e+06,5.264863e+06
"Alexander, Illinois",Alexander,0.332755,Ilada Energy Co.,34.21,ILD980996789,500942.0,5.0,Illinois,East Cape Girardeau,Alexander,Deleted NPL Site,37.258400,-89.463500,06/24/1988,10/04/1989,09/28/1999,654.0,11/09/2000,01/08/2001,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2000-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2000-...",,No,,,,,412.0,-9.959031e+06,4.475186e+06
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"York, South Carolina",York,0.206457,"Leonard Chemical Co., Inc.",47.10,SCD991279324,403481.0,4.0,South Carolina,Rock Hill,York,NPL Site,34.851669,-80.904169,09/08/1983,09/21/1984,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1176.0,-9.006211e+06,4.143742e+06
"York, South Carolina",York,0.206457,Rock Hill Chemical Co.,40.29,SCD980844005,403425.0,4.0,South Carolina,Rock Hill,York,NPL Site,34.966100,-80.998500,06/24/1988,02/21/1990,12/31/1996,419.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1187.0,-9.016712e+06,4.159275e+06
"York, Virginia",York,0.156781,Chisman Creek,47.19,VAD980712913,302756.0,3.0,Virginia,York County,York,NPL Site,37.177000,-76.463100,12/30/1982,09/08/1983,12/21/1990,50.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1023.0,-8.511833e+06,4.463807e+06
"York, Virginia",York,0.156781,Naval Weapons Station - Yorktown,50.00,VA8170024170,302869.0,3.0,Virginia,Yorktown,York,NPL Site,37.245833,-76.588889,02/07/1992,10/14/1992,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1042.0,-8.525836e+06,4.473428e+06


In [None]:
superfund_and_blackcensus.index.unique()
# there are 215 unique counties with both superfund sites and a black population >13%

Index(['Acadia, Louisiana', 'Aiken, South Carolina', 'Alachua, Florida',
       'Albany, New York', 'Alexander, Illinois', 'Allegheny, Pennsylvania',
       'Allen, Indiana', 'Allendale, South Carolina', 'Anne Arundel, Maryland',
       'Arapahoe, Colorado',
       ...
       'Wayne, Michigan', 'Webster, Louisiana', 'Westchester, New York',
       'Westmoreland, Virginia', 'Will, Illinois', 'Winn, Louisiana',
       'Winnebago, Illinois', 'Winston, Mississippi', 'York, South Carolina',
       'York, Virginia'],
      dtype='object', name='CountyState', length=215)
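
Because `join` keeps one row per Superfund site, a county with several sites appears several times in the index, which is why counties are counted with `index.unique()` rather than by row count. Here is a minimal sketch of that behaviour on made-up toy frames (`census_toy`, `sites_toy`, and every value in them are hypothetical, not taken from the real data):

```python
import pandas as pd

# Toy stand-ins for the notebook's frames (values are made up for illustration)
census_toy = pd.DataFrame(
    {"PCNT_BLACK": [0.19, 0.27, 0.40]},
    index=pd.Index(
        ["Acadia, Louisiana", "Aiken, South Carolina", "Nowhere, Alabama"],
        name="CountyState",
    ),
)
sites_toy = pd.DataFrame(
    {"Site Name": ["Site A", "Site B", "Site C"], "Site Score": [48.2, 47.99, 30.0]},
    index=pd.Index(
        ["Acadia, Louisiana", "Aiken, South Carolina", "Aiken, South Carolina"],
        name="CountyState",
    ),
)

# DataFrame.join defaults to a left join on the index: every census row is kept,
# and a county with several sites is repeated once per site.
joined = census_toy.join(sites_toy)

# Counties without a matching site get NaN in the site columns, so drop them.
joined = joined[joined["Site Score"].notna()]

n_sites = len(joined)                  # one row per site
n_counties = joined.index.nunique()    # distinct counties among those rows
print(n_sites, n_counties)
```

In this toy case the row count (sites) and the unique-index count (counties) differ, which mirrors the 587 sites versus 215 counties reported in this notebook.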

- Meanwhile, there are 3142 total counties in this census data. So, 215/3142 = about 7% of US counties have both Superfund sites and an above-average Black proportion of their population.

- If 215 counties have both these qualities, then 215/747 = 28.8% of all counties with Black pop. % > 13% have at least one Superfund site.

- So, while only 747/3142 = 23.8% of counties have a Black population percentage higher than the national average, they contain 34.7% of all Superfund sites in the database.
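
The ratios in the bullets above can be re-derived directly from the counts quoted in this notebook. This is only an arithmetic check; the helper name `pct` is introduced here for illustration:

```python
# Counts quoted in this notebook
total_counties = 3142        # counties in the census file
high_black_counties = 747    # counties with Black share > 13%
overlap_counties = 215       # of those, counties with at least one Superfund site


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


share_of_all_counties = pct(overlap_counties, total_counties)
share_of_high_black = pct(overlap_counties, high_black_counties)
high_black_share = pct(high_black_counties, total_counties)

print(share_of_all_counties, share_of_high_black, high_black_share)
```

The share of Superfund sites (34.7%) is not recomputed here because the total number of sites in the database is not shown in this section.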

### Pollution Score Statistics
Comparing Site Scores for counties with above-average black pop. percentage and below-average black pop. percentage.

In [None]:
superfund_and_blackcensus = superfund_and_blackcensus[superfund_and_blackcensus['Site Score'].notna()]
# NOTE: the result of this filter is not assigned, so it does not change the mean below
superfund_and_blackcensus[superfund_and_blackcensus['Site Score'] > 0]
superfund_and_blackcensus['Site Score'].mean()

41.476695059625214

The average site score for counties with a greater-than-average Black pop. percentage is 41.48.

Now we will look at the average site score in counties with below-average black pop. percentage.

In [None]:
low_black_census = census.loc[:, ["STATE", "CountyState", "STNAME", "CTYNAME", "TOT_POP", "BAC_MALE", "BAC_FEMALE"]]
low_black_census["PCNT_BLACK"] = (low_black_census["BAC_MALE"] + low_black_census["BAC_FEMALE"]) / low_black_census["TOT_POP"]
low_black_census

low_black_census = low_black_census[low_black_census["PCNT_BLACK"] <= 0.13]
del low_black_census['BAC_MALE']
del low_black_census['BAC_FEMALE']
del low_black_census['TOT_POP']
del low_black_census['STATE']
low_black_census

Unnamed: 0,CountyState,STNAME,CTYNAME,PCNT_BLACK
1,"Baldwin, Alabama",Alabama,Baldwin,0.095657
4,"Blount, Alabama",Alabama,Blount,0.022066
9,"Cherokee, Alabama",Alabama,Cherokee,0.047565
10,"Chilton, Alabama",Alabama,Chilton,0.111326
14,"Cleburne, Alabama",Alabama,Cleburne,0.035211
...,...,...,...,...
3137,"Sweetwater, Wyoming",Wyoming,Sweetwater,0.020546
3138,"Teton, Wyoming",Wyoming,Teton,0.010953
3139,"Uinta, Wyoming",Wyoming,Uinta,0.011520
3140,"Washakie, Wyoming",Wyoming,Washakie,0.011915


In [None]:
low_black_census_series = low_black_census.set_index('CountyState').squeeze()
low_black_census_series

Unnamed: 0_level_0,STNAME,CTYNAME,PCNT_BLACK
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
"Baldwin, Alabama",Alabama,Baldwin,0.095657
"Blount, Alabama",Alabama,Blount,0.022066
"Cherokee, Alabama",Alabama,Cherokee,0.047565
"Chilton, Alabama",Alabama,Chilton,0.111326
"Cleburne, Alabama",Alabama,Cleburne,0.035211
...,...,...,...
"Sweetwater, Wyoming",Wyoming,Sweetwater,0.020546
"Teton, Wyoming",Wyoming,Teton,0.010953
"Uinta, Wyoming",Wyoming,Uinta,0.011520
"Washakie, Wyoming",Wyoming,Washakie,0.011915


In [None]:
superfund_and_lowblackcensus = low_black_census_series.join(superfunds_series)

#remove any counties that aren't in the superfund dataset
superfund_and_lowblackcensus = superfund_and_lowblackcensus[superfund_and_lowblackcensus['Site Score'].notna()]
# the filter must reference this cell's own frame (the >50% frame is not defined until later)
superfund_and_lowblackcensus[superfund_and_lowblackcensus['Site Score'] > 0]
superfund_and_lowblackcensus

Now that the df is set up, let's calculate the avg site score:

In [None]:
superfund_and_lowblackcensus['Site Score'].mean()

42.054500458295195

The average site score for counties with a less than average black pop. percentage is 42.05, a hair above that for counties with a higher than average black pop. percentage. 
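
One caveat when reading these two means: in pandas, a boolean filter only restricts the rows if its result is assigned to a name. The sketch below uses a made-up `scores` frame to show the difference. Whether the real data contains any zero scores is not visible in this notebook's output, so this illustrates the mechanism rather than correcting the numbers above:

```python
import pandas as pd

# Hypothetical scores, including one zero
scores = pd.DataFrame({"Site Score": [50.0, 30.0, 0.0, 40.0]})

# Evaluating a filter without assigning it leaves the frame unchanged.
scores[scores["Site Score"] > 0]
mean_all = scores["Site Score"].mean()

# Assigning the filtered result is what actually drops the zero-score rows.
positive = scores[scores["Site Score"] > 0]
mean_positive = positive["Site Score"].mean()

print(mean_all, mean_positive)
```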

One potential area for outside research is to evaluate whether site scores for predominantly white areas were exaggerated compared to scores for areas with higher Black populations.

### Average Time on the NPL List
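Exploring the average time that a site has been waiting on the NPL, based on the percentage of Black pop. in that county. Time is measured from the Proposed Date to the Construction Completion Date, or to today if construction is not yet complete.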

In [None]:
today = pd.to_datetime("today")
today

Timestamp('2021-02-21 19:09:07.817563')

First, for counties with higher than average black population:

In [None]:
superfund_and_blackcensus['Construction Completion Date'] = superfund_and_blackcensus['Construction Completion Date'].fillna(today)

superfund_and_blackcensus['Proposed Date'] = pd.to_datetime(superfund_and_blackcensus['Proposed Date'])
superfund_and_blackcensus['Construction Completion Date'] = pd.to_datetime(superfund_and_blackcensus['Construction Completion Date'])


superfund_and_blackcensus['date_diff'] = superfund_and_blackcensus['Construction Completion Date'] - superfund_and_blackcensus['Proposed Date']

superfund_and_blackcensus['date_diff'].mean()

Timedelta('5815 days 06:14:49.531145920')

Now, for counties with lower than average black population percentage:

---



In [None]:

superfund_and_lowblackcensus['Construction Completion Date'] = superfund_and_lowblackcensus['Construction Completion Date'].fillna(today)

superfund_and_lowblackcensus['Proposed Date'] = pd.to_datetime(superfund_and_lowblackcensus['Proposed Date'])
superfund_and_lowblackcensus['Construction Completion Date'] = pd.to_datetime(superfund_and_lowblackcensus['Construction Completion Date'])


superfund_and_lowblackcensus['date_diff'] = superfund_and_lowblackcensus['Construction Completion Date'] - superfund_and_lowblackcensus['Proposed Date']

superfund_and_lowblackcensus['date_diff'].mean()

Timedelta('5901 days 00:15:12.658257152')

These counties have a slightly higher average residence time on the list: about 5,901 days (roughly 16.2 years) versus about 5,815 days (roughly 15.9 years).
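
The time-on-list calculation can also be written by parsing both date columns first and only then filling the missing completion dates, which avoids mixing strings and timestamps in one column. This is a sketch on hypothetical rows, with a fixed reference date `as_of` standing in for `today` so the result is reproducible:

```python
import pandas as pd

# Hypothetical rows; dates use the MM/DD/YYYY format seen in the Superfund table
sites = pd.DataFrame({
    "Proposed Date": ["01/01/2000", "01/01/2010"],
    "Construction Completion Date": ["01/01/2010", None],
})
as_of = pd.Timestamp("2021-02-21")  # assumed reference date, stands in for `today`

proposed = pd.to_datetime(sites["Proposed Date"], format="%m/%d/%Y")
completed = pd.to_datetime(sites["Construction Completion Date"], format="%m/%d/%Y")

# Sites with no completion date are treated as still open on the reference date.
time_on_list = completed.fillna(as_of) - proposed

mean_delta = time_on_list.mean()
mean_days = mean_delta.days
mean_years = mean_delta / pd.Timedelta(days=365.25)  # Timedelta ratio gives a float

print(mean_days, round(mean_years, 2))
```

Dividing by a 365.25-day `Timedelta` is one simple way to express the mean in years; it is an approximation, not a calendar-exact conversion.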

What about counties with a Black population share of 50% or more?

In [None]:
gt50_black_census = census.loc[:, ["STATE", "CountyState", "STNAME", "CTYNAME", "TOT_POP", "BAC_MALE", "BAC_FEMALE"]]
gt50_black_census["PCNT_BLACK"] = (gt50_black_census["BAC_MALE"] + gt50_black_census["BAC_FEMALE"]) / gt50_black_census["TOT_POP"]
gt50_black_census

gt50_black_census = gt50_black_census[gt50_black_census["PCNT_BLACK"] >= 0.50]
del gt50_black_census['BAC_MALE']
del gt50_black_census['BAC_FEMALE']
del gt50_black_census['TOT_POP']
del gt50_black_census['STATE']

gt50_black_census_series = gt50_black_census.set_index('CountyState').squeeze()
gt50_black_census_series

Unnamed: 0_level_0,STNAME,CTYNAME,PCNT_BLACK
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
"Bullock, Alabama",Alabama,Bullock,0.709930
"Dallas, Alabama",Alabama,Dallas,0.712765
"Greene, Alabama",Alabama,Greene,0.804833
"Hale, Alabama",Alabama,Hale,0.583578
"Lowndes, Alabama",Alabama,Lowndes,0.727740
...,...,...,...
"Emporia, Virginia",Virginia,Emporia,0.662364
"Franklin, Virginia",Virginia,Franklin,0.588804
"Hampton, Virginia",Virginia,Hampton,0.542592
"Petersburg, Virginia",Virginia,Petersburg,0.788873


In [None]:
superfund_and_gt50blackcensus = gt50_black_census_series.join(superfunds_series)

#remove any counties that aren't in the superfund dataset
superfund_and_gt50blackcensus = superfund_and_gt50blackcensus[superfund_and_gt50blackcensus['Site Score'].notna()]
superfund_and_gt50blackcensus[superfund_and_gt50blackcensus['Site Score'] > 0]

Unnamed: 0_level_0,STNAME,CTYNAME,PCNT_BLACK,Site Name,Site Score,Site EPA ID,SEMS ID,Region ID,State,City,County,Status,Latitude,Longitude,Proposed Date,Listing Date,Construction Completion Date,Construction Completion Number,NOID Date,Deletion Date,Site Listing Narrative,Site Progress Profile,Proposed FR Notice,Listing FR Notice,NOID FR Notice,Deletion FR Notice,Restoration FR Notice Jumper Page,Site has had a Partial Deletion,CreationDate,Creator,EditDate,Editor,ObjectId2,x,y
CountyState,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1
"Allendale, South Carolina",South Carolina,Allendale,0.734116,Helena Chemical Co. Landfill,33.89,SCD058753971,403309.0,4.0,South Carolina,Fairfax,Allendale,NPL Site,32.9412,-81.239,06/24/1988,02/21/1990,09/13/1999,628.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,1188.0,-9043484.0,3887502.0
"Baltimore, Maryland",Maryland,Baltimore,0.642019,Kane & Lombard Street Drums,30.15,MDD980923783,300344.0,3.0,Maryland,Baltimore,Baltimore,NPL Site,39.2956,-76.5419,10/15/1984,06/10/1986,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...",,,,No,,,,,203.0,-8520605.0,4764103.0
"Baltimore, Maryland",Maryland,Baltimore,0.642019,68th Street Dump,50.0,MDD980918387,300338.0,3.0,Maryland,Baltimore,Baltimore,Proposed NPL Site,39.307967,-76.517886,04/30/2003,,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2003-...",,,,,No,,,,,219.0,-8517932.0,4765882.0
"Baltimore, Maryland",Maryland,Baltimore,0.642019,Sauer Dump,50.0,MDD981038334,300348.0,3.0,Maryland,Dundalk,Baltimore,NPL Site,39.270267,-76.45271,03/10/2011,03/15/2012,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2011-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...",,,,No,,,,,222.0,-8510677.0,4760459.0
"Bibb, Georgia",Georgia,Bibb,0.568481,Armstrong World Industries,50.0,GAN000410033,410033.0,4.0,Georgia,Macon,Bibb,NPL Site,32.773497,-83.6516,10/21/2010,09/16/2011,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2010-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2011-...",,,,No,,,,,739.0,-9312054.0,3865278.0
"Bibb, Georgia",Georgia,Bibb,0.568481,Macon Naval Ordnance Plant,48.97,GAD003302676,405304.0,4.0,Georgia,Macon,Bibb,NPL Site,32.777658,-83.639675,03/15/2012,05/24/2013,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2012-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2013-...",,,,No,,,,,740.0,-9310726.0,3865829.0
"Charles, Maryland",Maryland,Charles,0.529049,Indian Head Naval Surface Warfare Center,50.0,MD7170024684,300430.0,3.0,Maryland,Indian Head,Charles,NPL Site,38.591389,-77.174306,02/13/1995,09/29/1995,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-1995-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-1995-...",,,,No,,,,,213.0,-8591004.0,4663309.0
"Coahoma, Mississippi",Mississippi,Coahoma,0.780058,Red Panther Chemical Company,39.43,MSD000272385,402231.0,4.0,Mississippi,Clarksdale,Coahoma,Deleted NPL Site,34.187408,-90.561625,03/10/2011,09/16/2011,06/09/2020,1216.0,07/15/2020,09/30/2020,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2011-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2011-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2020-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2020-...",,No,,,,,125.0,-10081270.0,4053994.0
"Copiah, Mississippi",Mississippi,Copiah,0.525387,Potter Co.,50.0,MSD056029648,404404.0,4.0,Mississippi,Wesson,Copiah,Proposed NPL Site,31.710278,-90.393056,05/10/1993,,,0.0,,,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...",,,,,No,,,,,118.0,-10062510.0,3725340.0
"Crittenden, Arkansas",Arkansas,Crittenden,0.553436,Gurley Pit,40.13,ARD035662469,600077.0,6.0,Arkansas,Edmondson,Crittenden,Deleted NPL Site,35.1209,-90.3116,12/30/1982,09/08/1983,09/13/1994,254.0,07/28/2003,11/06/2003,"<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://cumulis.epa.gov/supercpad/cur...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""https://semspub.epa.gov/src/document/...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2003-...","<a href=""http://www.gpo.gov/fdsys/pkg/FR-2003-...",,No,,,,,827.0,-10053440.0,4180323.0


In [None]:
superfund_and_gt50blackcensus['Site Score'].mean()

38.28581395348838

Interesting. Communities with greater than 50% Black population have lower scores for their Superfund sites. I would like to believe this is a good thing, but my understanding of institutionalized racism leads me to wonder whether these scores are entirely accurate. Perhaps, less-needy sites are given 