This repository contains the code used in the quantitative analysis for the evaluation of the Prospective Purchaser Agreement (PPA) program. A short description of the analysis conducted with each file is provided below. The evaluation report is located here.
The analysis team used this code to perform the following analyses and create the following figures:
- Conduct exploratory data analysis of the economic and PPA data to identify any trends or anomalies
- Determine the area of each Superfund site with economic data
- Create Figure 1 (Number of Unique Sites with a PPA by EPA Region since 1991)
- Create Figure 4 (Number and percentage of unique sites with a PPA(s), with and without economic data by EPA Region)
- Observe the distributions (via boxplots and histograms) and medians of 5 variables of interest for sites among 3 distinct groups.
  - The 3 distinct groups include:
    - Superfund sites with no PPA or enforcement instrument
    - Superfund sites with at least one enforcement instrument (not including a PPA)
    - Superfund sites with at least 1 PPA (among other enforcement instruments)
  - The 5 variables of interest include:
    - Number of businesses at the Superfund site location
    - Number of employees employed at the businesses
    - Annual sales of the businesses (gross sales)
    - Annual income of the businesses (profit)
    - Area (in acreage) of the Superfund site with economic data (both for PPA and non-PPA sites)
- Conduct a Shapiro-Wilk test to determine the normality of each distribution
- Conduct a Kruskal-Wallis statistical test to determine if the distribution of economic and Superfund size values for the three groups differed significantly from each other
- Conduct Mann-Whitney U statistical tests to compare pairs of the different distributions for each of the 5 variables
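
The testing sequence listed above (group medians, Shapiro-Wilk, Kruskal-Wallis, then pairwise Mann-Whitney U) can be sketched for a single variable as follows. This is a minimal illustration and not the notebook's actual code: the group labels, the data values, and the 0.05 significance level are all made up for the example, and it assumes `scipy` is installed. This README does not state which significance level or multiple-comparison adjustment, if any, the evaluation used.

```python
from itertools import combinations
from statistics import median

from scipy import stats

# Hypothetical values of one variable (e.g. number of businesses) for each
# of the three site groups. These numbers are illustrative, not real data.
groups = {
    "no_instrument": [2, 5, 1, 8, 3, 4, 6, 2, 7, 3],
    "non_ppa_instrument": [4, 9, 6, 12, 5, 7, 10, 8, 6, 11],
    "ppa": [15, 22, 9, 30, 18, 25, 14, 40, 20, 27],
}

ALPHA = 0.05  # assumed significance level, not taken from the evaluation

# Median of the variable within each group.
medians = {name: median(values) for name, values in groups.items()}

# Shapiro-Wilk: a small p-value suggests the group is not normally
# distributed, which motivates the non-parametric tests below.
normality_p = {name: stats.shapiro(values).pvalue for name, values in groups.items()}

# Kruskal-Wallis: tests whether at least one group's distribution differs.
kw = stats.kruskal(*groups.values())

# Mann-Whitney U: compares each pair of groups.
pairwise_p = {
    (a, b): stats.mannwhitneyu(groups[a], groups[b], alternative="two-sided").pvalue
    for a, b in combinations(groups, 2)
}

print("Medians:", medians)
print("Shapiro-Wilk p-values:", normality_p)
print(f"Kruskal-Wallis: H={kw.statistic:.2f}, p={kw.pvalue:.4g}")
for pair, p in pairwise_p.items():
    verdict = "differ" if p < ALPHA else "no significant difference"
    print(f"Mann-Whitney U {pair}: p={p:.4g} ({verdict})")
```

In practice the same steps would be repeated for each of the 5 variables, with the lists replaced by the corresponding columns of the economic data.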
PPA Demographic Analysis (both the ppa_demographic_analysis.ipynb and PPA_demographic_analysis_5_21_25.R files)
The analysis team used this code for the demographic analysis, which combines location data from PPA sites with demographic data from the U.S. Census Bureau's American Community Survey.
This repository includes some code initially created by two EPA contractors, Censeo Consulting Group (Censeo) and Industrial Economics (IEc), in support of this evaluation. The EPA finished the data collection and analysis for this evaluation without contractor support. Credit for the initial creation of each file is provided below.
EPA created/compiled files: PPA_Economic_Analysis.ipynb; ppa_demographic_analysis.ipynb; Superfund_areas.ipynb; README.md
Censeo/IEc initially created/compiled files, with significant EPA edits to meet the needs of the evaluation: PPA_demographic_analysis_5_21_25.R