# Exploring ECHO Data in Your Area
This workbook is a way to quickly view data from EPA's Enforcement and Compliance History Online portal that is relevant just to your area.
It is designed to work with the [ECHO Exporter](https://echo.epa.gov/tools/data-downloads#exporter) file (download as ZIP file and extract to CSV).

Input your zip code into the text box below, then click the "Run" button to see data for your region!

In [7]:
# Not currently using this section, it's a way to make this page into more of an app/less of a data science notebook

# # Widgets for defined interaction
# import ipywidgets as widgets
# from IPython.display import display
# w = widgets.Text("53703")
# display(w)

In [8]:
my_zip = "98296"
data_location = "data/ECHO_EXPORTER.csv" # Where the ECHO data is saved on your computer in relation to this file

### Below this point, everything is calculated automatically

You don't need to interact with it in order to get it to work, but if you want to dive deeper, you can use it to get started exploring!

In [3]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [52]:
# Define columns of interest (see the echo_exporter_columns xlsx file that comes bundled with the csv download)
# This is not a comprehensive list of columns; more are available.
# This dictionary maps the column titles to their data types, to allow for faster import

# Note to self - right now mapping everything not explicitly a number as a string, might be an issue later
column_mapping = {
    "REGISTRY_ID": str,
    "FAC_NAME": str,
    "FAC_ZIP": str,
    "FAC_LAT": float,
    "FAC_LONG": float,
    "FAC_QTRS_WITH_NC": float,
    "CAA_PERMIT_TYPES": str,
    "CWA_PERMIT_TYPES": str,
    "RCRA_PERMIT_TYPES": str
}
# not currently using: "FAC_3YR_COMPLIANCE_HISTORY", "FAC_INSPECTION_COUNT", "FAC_INFORMAL_COUNT", "FAC_FORMAL_ACTION_COUNT",

In [53]:
# Get the data
echo_data = pd.read_csv(data_location, usecols = list(column_mapping.keys()), dtype=column_mapping)
echo_data.head()

Unnamed: 0,REGISTRY_ID,FAC_NAME,FAC_ZIP,FAC_LAT,FAC_LONG,FAC_QTRS_WITH_NC,CAA_PERMIT_TYPES,CWA_PERMIT_TYPES,RCRA_PERMIT_TYPES
0,110007920152,GREAT WESTERN CHEMICAL CO KENAI,99611,60.65448,-151.310147,0.0,,,VSQG
1,110064336240,LOST VALLEY LAKE RESORT,65066,38.47606,-91.387974,10.0,,Minor,
2,110010488105,CROW CANYON CLEANERS,94568,37.705084,-121.936713,0.0,,,Transporter
3,110002976067,APPLE CLNRS,80260,39.85628,-104.996657,0.0,,,Other
4,110013694582,NEWTOWN PUBLIC WORKS FACILITY,6470,41.3864,-73.27817,0.0,,Minor,


In [64]:
# Filter to just your zip code
my_echo = echo_data[echo_data["FAC_ZIP"] == my_zip]

# Get an idea of how much data we're working with (rows, cols)
print("There are %s facilities in %s tracked in the ECHO database." %(my_echo.shape[0], my_zip))

There are 59 facilities in 98296 tracked in the ECHO database.


In [75]:
# What permit types are active in this zip code?

permit_cols = ["CAA_PERMIT_TYPES","CWA_PERMIT_TYPES","RCRA_PERMIT_TYPES"]

# For each permit type
for permit_law in permit_cols:
    print(permit_law)
    # Find out what the unique values are
    permit_types = my_echo[permit_law].unique().tolist()
    print(permit_types)
    for permit_type in permit_types:
        # Count those unique values
        print(permit_law.replace("_PERMIT_TYPES",""))
        print(permit_type)
        print(my_echo[my_echo[permit_law] == permit_type].shape[0])
# Output a dataframe (permit type, count)

CAA_PERMIT_TYPES


AttributeError: 'list' object has no attribute 'dropna'

In [None]:
# How many facilities have been out of compliance in the last 12 quarters?

noncompliant = my_echo[my_echo["FAC_QTRS_WITH_NC"] > 0]
plt.pie([noncompliant.shape[0], my_echo.shape[0] - noncompliant.shape[0]], labels=["Noncompliant", "Compliant"], autopct='%1.1f%%', shadow=True)

plt.title("%s of %s Total Facilities Noncompliant in %s in the last 12 qtrs" %(noncompliant.shape[0], my_echo.shape[0], my_zip))

In [None]:
# Which facilities aren't in compliance?

print("Facilities in %s noncompliant in the last 12 quarters:" %my_zip)
print(noncompliant["FAC_NAME"])

# Next questions
* Which facilities aren't in compliance?
* Where are they (on a map)?
* What are the top 3 noncompliant facilities in the zip code and what are they violating?
* Which types of noncompliance are we experiencing here?
* What permits are issued in this region?
* Beyond "significant" – how much over their permits are they?
