### <p style="text-align: right;"> &#9989; Elizabeth Walter</p>

# Day 10 In-class Assignment
# Get the Lead Out: Understanding The Water Crisis in Flint, MI

<div align="center"><img src="http://media2.govtech.com/images/940*627/Rusty+Corroded+Pipes.jpg" width=400px></div>


## Learning Goals:

By the end of this assignment you should be able to:
* Use Pandas to filter data to select particular subsets of interest
* Articulate, based on your own perception, what you think makes a data visualization "good" versus "bad"
* Use data to support a claim or make an argument

Also, data analysis matters! Data analysis is something that can and should be used for (among other things): 

- [improving local government](http://www.codeforamerica.org)
- [improving the Federal government](https://www.whitehouse.gov/digital/united-states-digital-service)
- [serving humanity](http://www.datakind.org)

Data visualization is one of the most important parts of modeling - it gives us an intuitive understanding of the system we are interested in. How we represent data affects everything from [understanding poverty in the developing world](http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen) to [grappling with the global spread of lethal diseases](http://www.ted.com/talks/hans_rosling_the_truth_about_hiv). We want you to be able to find things out about the models you create and use visual information to make convincing arguments.

## Assignment instructions

Work with your group to complete this assignment. Instructions for submitting this assignment are at the end of the notebook. The assignment is due at the end of class.

___
## Exploring data

Today we will embark on another **data science** project, following up on some of our earlier exploratory data analysis efforts. We will primarily use Pandas for this.

The data set we will use is real; it comes from the Flint water crisis that first made the news in 2014, just down the road from where we are. In fact, it was only in April of 2018 that the Governor of Michigan stated that lead levels had returned to the federal safety limits. However, Flint still has many lead service lines that need to be replaced.

We will analyze the data ourselves and come up with conclusions and policy decisions based on what we find. The data is real, so it will be imperfect. It is important to learn to handle real-world data. The process of doing real data science typically follows these steps:

1. **Inspecting**: you should *always* look at your data before you attempt to do anything. If you are using Pandas you can simply look at your dataframe using shift-enter in a cell with the name of the dataframe; Pandas will nicely format that to look like a spreadsheet.
2. **Cleaning**: you will probably find problems in your data set, such as missing values or values that don't make sense. You will need to remove those from the data set, and you can use Pandas operations for removing rows and/or columns. 
3. **Transforming**: it is likely that the data has come to you in some form that you can't immediately use and you will need to transform it in some way. The transformation is very problem dependent, but might include changing the units (e.g., multiplying a column by a conversion factor), creating a new quantity from others (e.g., adding two columns together to create a new column) or removing cases that aren't of interest (e.g., filtering on a specific feature).
4. **Modeling**: finally you can get to the fun part - thinking about what your data is telling you! There are too many techniques for data modeling to list here, but you have already seen one example: looking for correlations in the Great Lakes levels by plotting the depth of one lake versus another.
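
The four steps above can be sketched on a tiny table. This is only an illustration: the column names (`site`, `depth_ft`, `flow`), the values, and the use of `-999` as a bad-reading marker are all made up, and a real data set will need its own cleaning rules.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table: names and values are invented for illustration.
raw = pd.DataFrame({
    "site": ["A", "B", "C", "D", "E"],
    "depth_ft": [10.0, np.nan, 12.5, -999.0, 11.0],  # NaN and -999 stand in for bad readings
    "flow": [1.0, 2.0, 2.5, 3.0, 2.0],
})

# 1. Inspecting: look at the first rows and the summary statistics.
print(raw.head())
print(raw.describe())

# 2. Cleaning: drop rows with missing depths, then rows with impossible depths.
clean = raw.dropna(subset=["depth_ft"])
clean = clean[clean["depth_ft"] >= 0].copy()

# 3. Transforming: convert units (feet to meters) into a new column.
FEET_TO_METERS = 0.3048
clean["depth_m"] = clean["depth_ft"] * FEET_TO_METERS

# 4. Modeling: a first look at how two columns relate.
correlation = clean["depth_m"].corr(clean["flow"])
print(f"correlation between depth and flow: {correlation:.2f}")
```

Looking at `describe()` before cleaning is what exposes the `-999` value here: it shows up as a minimum that makes no physical sense.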

&#9989;&nbsp;  **Talk through each of these four steps with your group members, create a markdown cell below, and write some examples that you have seen so far in class or experiences you've had in other classes or outside of class** (e.g., using the Great Lakes levels data).


I have used raw data to create dynamic databases in Excel, and I have used empirical data for statistical and regression analysis.

---
### Visualizing Data
Before we get started with the data analysis, we want to cover the general concept of visualization. While you have found it easy to make really high quality plots with Python's matplotlib, a more important challenge is to be able to get information out of those plots and to tell a story to your audience. **Think about the issues in this list, discuss them in your group and find some examples from around the web to paste into this notebook - one or two examples for each bullet below. Try to find both good and bad examples**. The markdown cell below shows you one way that you can embed images in a notebook. 


Things to think about and look for:

* overuse of color, marker type, marker size, line type, line thickness: what do you feel is a good rule for using these attributes?
* adding information to your simple plots: how can you use marker size and/or color to add another dimension to your data presentation?
* color blindness: what types of color blindness are there and what are the best color palettes to use?
* Seaborn: today we will be using Seaborn; look at this [page](https://seaborn.pydata.org/tutorial/color_palettes.html) and discuss their thoughts on these issues.
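
As one possible starting point for the second and third bullets, here is a minimal matplotlib sketch that uses color and marker size to carry an extra variable. The data is randomly generated and the variable names are placeholders; `viridis` is used because it is a perceptually uniform colormap, but it is only one reasonable choice.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up illustrative data: x and y are the two plotted quantities,
# and "third" is an extra variable encoded with color and marker size.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 10, size=30)
y = 2.0 * x + rng.normal(0, 2, size=30)
third = rng.uniform(1, 5, size=30)

fig, ax = plt.subplots(figsize=(6, 4))
points = ax.scatter(
    x, y,
    c=third,         # color encodes the third variable
    s=20 * third,    # marker area also scales with the third variable
    cmap="viridis",  # perceptually uniform colormap
    alpha=0.8,
)
fig.colorbar(points, ax=ax, label="third variable (arbitrary units)")
ax.set_xlabel("x (arbitrary units)")
ax.set_ylabel("y (arbitrary units)")
ax.set_title("Color and size as an extra dimension")
```

Encoding the same variable with both color and size is deliberately redundant: a reader who has trouble telling the colors apart can still read the marker sizes.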

Edit the markdown cell below to record your thoughts and embed some examples.

&#9989;&nbsp;  **Put some notes from your discussion here and include some example plots**

I don't know what color palettes are best to use, but I do know that most people who are co
* A plot with a scale that is not well suited to your data can make it harder to see individual data points, changes in the data, or differences between plots of two different data sets, and so makes it difficult to interpret the visual meaningfully.

You can embed an image from the web using this structure in a markdown cell:

`<img src="http://theurloftheimage.com" width=300px>`

___
## Continuing our data science efforts!

Now, let's turn back to data science!

Today we want you to think about what the data is telling you, while still using code to help you and applying some of the visualization ideas you explored above.

We'll be looking at the publicly released [Flint Water Quality dataset](http://flintwaterstudy.org/2015/12/complete-dataset-lead-results-in-tap-water-for-271-flint-samples/). This is a dataset of nearly 300 tests run by volunteers at Virginia Tech on water samples obtained from Flint residents. You can learn more about their efforts [here](http://flintwaterstudy.org/about-page/about-us/). The water testing method involves collecting three different bottles' worth of water at timed intervals; our analysis will focus on just the first collection at each testing site.
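
Focusing on the first collection amounts to selecting one column. The sketch below uses four rows copied from the dataset (column names as they appear in the data) and also flags the houses that were sampled twice. How to treat those repeated houses is a judgment call that this sketch does not settle.

```python
import pandas as pd

# Four rows copied from the dataset, for illustration only.
sample = pd.DataFrame({
    "SampleID": [30, 31, 31, 33],
    "Zip Code": [48506, 48503, 48503, 48503],
    "Ward": [4, 7, 7, 6],
    "PbBottle1_ppb": [0.639, 6.087, 10.32, 66.88],
    "PbBottle2_ppb": [0.223, 28.87, 13.47, 2.662],
    "PbBottle3_ppb": [0.194, 2.13, 18.19, 2.082],
})

# Keep only the identifying columns and the first-bottle measurement.
first_bottle = sample[["SampleID", "Ward", "PbBottle1_ppb"]]

# Flag every row whose SampleID appears more than once (house sampled twice).
repeated = first_bottle[first_bottle["SampleID"].duplicated(keep=False)]
print(repeated)
```

Passing `keep=False` marks all copies of a repeated `SampleID`, not just the later ones, so both measurements for the same house stay visible.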

You'll be considering the following questions in the context of U.S. Environmental Protection Agency (EPA) guidelines about lead contaminants, which state:

> Lead and copper are regulated by a treatment technique that requires systems to control the corrosiveness of their water. If more than 10% of tap water samples exceed the action level, water systems must take additional steps. For copper, the action level is 1.3 mg/L, and for lead is 0.015 mg/L. 
>
> Source: (http://www.epa.gov/your-drinking-water/table-regulated-drinking-water-contaminants#seven). 

&#9989;&nbsp;  **Talk through this EPA guideline with your group members and make sure you understand it. You will be searching through the data to explore whether or not the EPA guideline has been met.**
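
One way to turn the guideline into a calculation is sketched below. It uses the first-bottle readings of only the first twelve samples in the dataset, so its result says nothing about Flint as a whole. It also assumes that 1 mg/L is about 1000 ppb for dilute water samples, which makes the lead action level 15 ppb.

```python
import pandas as pd

# First-bottle lead readings (ppb) for the first twelve samples in the dataset.
# This is a small subset for illustration, not the full analysis.
bottle1_ppb = pd.Series(
    [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2, 4.358, 24.37],
    name="PbBottle1_ppb",
)

# The EPA action level is quoted in mg/L, while the data is in ppb.
# Assumption: for dilute water samples, 1 mg/L is about 1000 ppb.
LEAD_ACTION_LEVEL_MG_PER_L = 0.015
lead_action_level_ppb = LEAD_ACTION_LEVEL_MG_PER_L * 1000

# A boolean mask marks the samples above the action level.
exceeds = bottle1_ppb > lead_action_level_ppb

# The mean of a boolean Series is the fraction of True values.
fraction_exceeding = exceeds.mean()

print(f"{exceeds.sum()} of {len(bottle1_ppb)} samples exceed {lead_action_level_ppb:g} ppb")
print(f"fraction exceeding: {fraction_exceeding:.1%}")
print("more than 10% exceed the action level:", fraction_exceeding > 0.10)
```

The same three lines (mask, mean, comparison) apply unchanged to the `PbBottle1_ppb` column of the full dataframe once it is loaded.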

Now, load in the libraries you need and the data, just as you did in the previous in-class assignment.

In [1]:
import matplotlib.pyplot as plt
%matplotlib inline

import numpy as np
import pandas as pd

# Loading the data
flint_data = pd.read_json("""[{"SampleID":1,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":0.344,"PbBottle2_ppb":0.226,"PbBottle3_ppb":0.145},{"SampleID":2,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":8.133,"PbBottle2_ppb":10.77,"PbBottle3_ppb":2.761},{"SampleID":4,"Zip Code":48504,"Ward":1,"PbBottle1_ppb":1.111,"PbBottle2_ppb":0.11,"PbBottle3_ppb":0.123},{"SampleID":5,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":8.007,"PbBottle2_ppb":7.446,"PbBottle3_ppb":3.384},{"SampleID":6,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":1.951,"PbBottle2_ppb":0.048,"PbBottle3_ppb":0.035},{"SampleID":7,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":7.2,"PbBottle2_ppb":1.4,"PbBottle3_ppb":0.2},{"SampleID":8,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":40.63,"PbBottle2_ppb":9.726,"PbBottle3_ppb":6.132},{"SampleID":9,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":1.1,"PbBottle2_ppb":2.5,"PbBottle3_ppb":0.1},{"SampleID":12,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":10.6,"PbBottle2_ppb":1.038,"PbBottle3_ppb":1.294},{"SampleID":13,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":6.2,"PbBottle2_ppb":4.2,"PbBottle3_ppb":2.3},{"SampleID":15,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":4.358,"PbBottle2_ppb":0.822,"PbBottle3_ppb":0.147},{"SampleID":16,"Zip Code":48505,"Ward":5,"PbBottle1_ppb":24.37,"PbBottle2_ppb":8.796,"PbBottle3_ppb":4.347},{"SampleID":17,"Zip Code":48505,"Ward":2,"PbBottle1_ppb":6.609,"PbBottle2_ppb":5.752,"PbBottle3_ppb":1.433},{"SampleID":18,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":4.062,"PbBottle2_ppb":1.099,"PbBottle3_ppb":1.085},{"SampleID":19,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":2.484,"PbBottle2_ppb":0.72,"PbBottle3_ppb":0.565},{"SampleID":20,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":0.438,"PbBottle2_ppb":1.046,"PbBottle3_ppb":0.511},{"SampleID":21,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":1.29,"PbBottle2_ppb":0.243,"PbBottle3_ppb":0.225},{"SampleID":22,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":0.548,"PbBottle2_ppb":0.622,"PbBottle3_ppb":0.361},{"SampleID":23,"Zip 
Code":48504,"Ward":2,"PbBottle1_ppb":3.131,"PbBottle2_ppb":0.674,"PbBottle3_ppb":0.683},{"SampleID":24,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":120,"PbBottle2_ppb":239.7,"PbBottle3_ppb":29.71},{"SampleID":25,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":2.911,"PbBottle2_ppb":0.406,"PbBottle3_ppb":0.237},{"SampleID":26,"Zip Code":48505,"Ward":5,"PbBottle1_ppb":16.52,"PbBottle2_ppb":10.26,"PbBottle3_ppb":2.762},{"SampleID":27,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":1.984,"PbBottle2_ppb":1.13,"PbBottle3_ppb":0.712},{"SampleID":28,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":5.367,"PbBottle2_ppb":2.474,"PbBottle3_ppb":1.616},{"SampleID":29,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":5.5,"PbBottle2_ppb":8.4,"PbBottle3_ppb":2.4},{"SampleID":30,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.639,"PbBottle2_ppb":0.223,"PbBottle3_ppb":0.194},{"SampleID":31,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":6.087,"PbBottle2_ppb":28.87,"PbBottle3_ppb":2.13,"Notes":"*house sampled twice"},{"SampleID":31,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":10.32,"PbBottle2_ppb":13.47,"PbBottle3_ppb":18.19,"Notes":"*house sampled twice"},{"SampleID":33,"Zip Code":48503,"Ward":6,"PbBottle1_ppb":66.88,"PbBottle2_ppb":2.662,"PbBottle3_ppb":2.082},{"SampleID":34,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":20.41,"PbBottle2_ppb":3.543,"PbBottle3_ppb":2.344},{"SampleID":35,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":109.6,"PbBottle2_ppb":80.47,"PbBottle3_ppb":94.52},{"SampleID":36,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":5.06,"PbBottle2_ppb":3.406,"PbBottle3_ppb":4.088},{"SampleID":37,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":2.774,"PbBottle2_ppb":0.21,"PbBottle3_ppb":0.264},{"SampleID":38,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":4.453,"PbBottle2_ppb":3.679,"PbBottle3_ppb":3.523},{"SampleID":39,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":0.4,"PbBottle2_ppb":0.3,"PbBottle3_ppb":0.7},{"SampleID":40,"Zip Code":48529,"Ward":9,"PbBottle1_ppb":0.974,"PbBottle2_ppb":0.142,"PbBottle3_ppb":0.118},{"SampleID":41,"Zip 
Code":48505,"Ward":5,"PbBottle1_ppb":3.228,"PbBottle2_ppb":2.534,"PbBottle3_ppb":2.222},{"SampleID":42,"Zip Code":48505,"Ward":2,"PbBottle1_ppb":12.55,"PbBottle2_ppb":4.132,"PbBottle3_ppb":0.12},{"SampleID":43,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":0.501,"PbBottle2_ppb":0.156,"PbBottle3_ppb":15.14},{"SampleID":44,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":2.448,"PbBottle2_ppb":0.373,"PbBottle3_ppb":0.288},{"SampleID":45,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":5.508,"PbBottle2_ppb":5.157,"PbBottle3_ppb":2.621},{"SampleID":46,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":1.293,"PbBottle2_ppb":0.441,"PbBottle3_ppb":0.281},{"SampleID":47,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":4.699,"PbBottle2_ppb":1.395,"PbBottle3_ppb":0.329},{"SampleID":48,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":6.093,"PbBottle2_ppb":2.682,"PbBottle3_ppb":1.458},{"SampleID":49,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":0.8,"PbBottle2_ppb":0.8,"PbBottle3_ppb":0.5},{"SampleID":50,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":1.626,"PbBottle2_ppb":1.332,"PbBottle3_ppb":0.327},{"SampleID":51,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":2.576,"PbBottle2_ppb":2.852,"PbBottle3_ppb":1.48},{"SampleID":52,"Zip Code":48504,"Ward":1,"PbBottle1_ppb":2.362,"PbBottle2_ppb":0.467,"PbBottle3_ppb":0.339},{"SampleID":53,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":1.585,"PbBottle2_ppb":0.494,"PbBottle3_ppb":1.232},{"SampleID":54,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":3.058,"PbBottle2_ppb":1.808,"PbBottle3_ppb":1.169},{"SampleID":55,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":2.423,"PbBottle2_ppb":0.393,"PbBottle3_ppb":0.373},{"SampleID":56,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":30.91,"PbBottle2_ppb":42.58,"PbBottle3_ppb":44.6},{"SampleID":57,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":4.47,"PbBottle2_ppb":3.649,"PbBottle3_ppb":1},{"SampleID":58,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":2.172,"PbBottle2_ppb":1.76,"PbBottle3_ppb":1.44},{"SampleID":59,"Zip 
Code":48505,"Ward":3,"PbBottle1_ppb":1.8,"PbBottle2_ppb":0.5,"PbBottle3_ppb":0.2},{"SampleID":63,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":0.965,"PbBottle2_ppb":0.166,"PbBottle3_ppb":0.319},{"SampleID":65,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":7.636,"PbBottle2_ppb":5.206,"PbBottle3_ppb":9.239},{"SampleID":66,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":3.158,"PbBottle2_ppb":1.948,"PbBottle3_ppb":2.802},{"SampleID":67,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":105.3,"PbBottle2_ppb":12.84,"PbBottle3_ppb":4.534},{"SampleID":68,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":4.476,"PbBottle2_ppb":0.355,"PbBottle3_ppb":0.334},{"SampleID":69,"Zip Code":48504,"Ward":1,"PbBottle1_ppb":2.828,"PbBottle2_ppb":6.694,"PbBottle3_ppb":20.99},{"SampleID":71,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":2.481,"PbBottle2_ppb":3.86,"PbBottle3_ppb":24.64},{"SampleID":72,"Zip Code":48507,"Ward":5,"PbBottle1_ppb":11.52,"PbBottle2_ppb":0.288,"PbBottle3_ppb":0.215},{"SampleID":73,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":3.784,"PbBottle2_ppb":0.292,"PbBottle3_ppb":0.258},{"SampleID":74,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":1.344,"PbBottle2_ppb":0.729,"PbBottle3_ppb":1.226},{"SampleID":75,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":11.93,"PbBottle2_ppb":9.645,"PbBottle3_ppb":3.514},{"SampleID":76,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":10.96,"PbBottle2_ppb":7.744,"PbBottle3_ppb":4.16},{"SampleID":77,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":3.341,"PbBottle2_ppb":0.555,"PbBottle3_ppb":0.917},{"SampleID":78,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":1.229,"PbBottle2_ppb":1.192,"PbBottle3_ppb":0.218},{"SampleID":79,"Zip Code":48503,"Ward":6,"PbBottle1_ppb":6.3,"PbBottle2_ppb":1.1,"PbBottle3_ppb":0.3,"Notes":"*house sampled twice"},{"SampleID":79,"Zip Code":48503,"Ward":6,"PbBottle1_ppb":5.153,"PbBottle2_ppb":0.385,"PbBottle3_ppb":0.322,"Notes":"*house sampled twice"},{"SampleID":80,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":6.054,"PbBottle2_ppb":0.927,"PbBottle3_ppb":0.676},{"SampleID":82,"Zip 
Code":48504,"Ward":2,"PbBottle1_ppb":31.14,"PbBottle2_ppb":4.73,"PbBottle3_ppb":3.188},{"SampleID":83,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":102.7,"PbBottle2_ppb":9.894,"PbBottle3_ppb":3.133},{"SampleID":84,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":1.38,"PbBottle2_ppb":3.734,"PbBottle3_ppb":0.524},{"SampleID":85,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":1.132,"PbBottle2_ppb":2.17,"PbBottle3_ppb":0.465},{"SampleID":87,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":3.232,"PbBottle2_ppb":2.989,"PbBottle3_ppb":1.927},{"SampleID":88,"Zip Code":48532,"Ward":8,"PbBottle1_ppb":0.507,"PbBottle2_ppb":2.315,"PbBottle3_ppb":0.231},{"SampleID":90,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":8.561,"PbBottle2_ppb":5.141,"PbBottle3_ppb":4.724},{"SampleID":91,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":9.997,"PbBottle2_ppb":0.983,"PbBottle3_ppb":0.611},{"SampleID":92,"Zip Code":48504,"Ward":1,"PbBottle1_ppb":4.152,"PbBottle2_ppb":0.758,"PbBottle3_ppb":0.433},{"SampleID":93,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":75.82,"PbBottle2_ppb":11.65,"PbBottle3_ppb":3.942},{"SampleID":95,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":138.8,"PbBottle2_ppb":2.745,"PbBottle3_ppb":0.797},{"SampleID":96,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":0.8,"PbBottle2_ppb":0.2,"PbBottle3_ppb":0.2},{"SampleID":97,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":7.244,"PbBottle2_ppb":1051,"PbBottle3_ppb":1.328},{"SampleID":98,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":1.621,"PbBottle2_ppb":0.3,"PbBottle3_ppb":0.238},{"SampleID":99,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":1.032,"PbBottle2_ppb":0.363,"PbBottle3_ppb":0.216},{"SampleID":100,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":0.866,"PbBottle2_ppb":0.292,"PbBottle3_ppb":0.269},{"SampleID":101,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":2.525,"PbBottle2_ppb":0.59,"PbBottle3_ppb":0.438},{"SampleID":102,"Zip Code":48505,"Ward":5,"PbBottle1_ppb":9.408,"PbBottle2_ppb":4.444,"PbBottle3_ppb":3.935},{"SampleID":103,"Zip 
Code":48505,"Ward":0,"PbBottle1_ppb":0.739,"PbBottle2_ppb":4.883,"PbBottle3_ppb":0.953},{"SampleID":104,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":0.9,"PbBottle2_ppb":0.2,"PbBottle3_ppb":0.1},{"SampleID":105,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":1.403,"PbBottle2_ppb":0.142,"PbBottle3_ppb":0.121},{"SampleID":106,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":5.655,"PbBottle2_ppb":5.882,"PbBottle3_ppb":10.66},{"SampleID":107,"Zip Code":48505,"Ward":2,"PbBottle1_ppb":31.06,"PbBottle2_ppb":8.578,"PbBottle3_ppb":3.176},{"SampleID":108,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.469,"PbBottle2_ppb":0.291,"PbBottle3_ppb":0.25},{"SampleID":109,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":23.85,"PbBottle2_ppb":2.301,"PbBottle3_ppb":1.62},{"SampleID":110,"Zip Code":48505,"Ward":2,"PbBottle1_ppb":9.766,"PbBottle2_ppb":11.13,"PbBottle3_ppb":7.144},{"SampleID":111,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":4.69,"PbBottle2_ppb":0.953,"PbBottle3_ppb":0.929},{"SampleID":112,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":4.066,"PbBottle2_ppb":5.894,"PbBottle3_ppb":4.76},{"SampleID":113,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":0.846,"PbBottle2_ppb":0.455,"PbBottle3_ppb":0.366},{"SampleID":114,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":2.054,"PbBottle2_ppb":3.978,"PbBottle3_ppb":0.355},{"SampleID":115,"Zip Code":48506,"Ward":7,"PbBottle1_ppb":3.744,"PbBottle2_ppb":5.592,"PbBottle3_ppb":2.476},{"SampleID":116,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":12.9,"PbBottle2_ppb":2.202,"PbBottle3_ppb":1.667},{"SampleID":117,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":0.543,"PbBottle2_ppb":0.183,"PbBottle3_ppb":0.162},{"SampleID":118,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":6.877,"PbBottle2_ppb":2.984,"PbBottle3_ppb":2.201},{"SampleID":119,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":0.552,"PbBottle2_ppb":0.19,"PbBottle3_ppb":0.205},{"SampleID":121,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":59,"PbBottle2_ppb":2.9,"PbBottle3_ppb":0.5},{"SampleID":122,"Zip 
Code":48506,"Ward":4,"PbBottle1_ppb":0.349,"PbBottle2_ppb":0.13,"PbBottle3_ppb":0.131},{"SampleID":123,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":4.764,"PbBottle2_ppb":1.388,"PbBottle3_ppb":1.06},{"SampleID":124,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.832,"PbBottle2_ppb":0.284,"PbBottle3_ppb":0.214},{"SampleID":125,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.224,"PbBottle2_ppb":0.568,"PbBottle3_ppb":0.465},{"SampleID":126,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":15.9,"PbBottle2_ppb":3.7,"PbBottle3_ppb":2.2},{"SampleID":127,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":5.667,"PbBottle2_ppb":1.405,"PbBottle3_ppb":0.896},{"SampleID":128,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":3.564,"PbBottle2_ppb":2.767,"PbBottle3_ppb":2.127},{"SampleID":129,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.475,"PbBottle2_ppb":0.2,"PbBottle3_ppb":0.268},{"SampleID":130,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":5.3,"PbBottle2_ppb":0.5,"PbBottle3_ppb":0.2},{"SampleID":131,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.166,"PbBottle2_ppb":0.736,"PbBottle3_ppb":0.269},{"SampleID":132,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.684,"PbBottle2_ppb":0.306,"PbBottle3_ppb":0.094},{"SampleID":133,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":6.347,"PbBottle2_ppb":1.724,"PbBottle3_ppb":0.678},{"SampleID":134,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":10.56,"PbBottle2_ppb":5.672,"PbBottle3_ppb":4.813},{"SampleID":135,"Zip Code":48502,"Ward":5,"PbBottle1_ppb":2.273,"PbBottle2_ppb":2.808,"PbBottle3_ppb":3.048},{"SampleID":136,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":1.571,"PbBottle2_ppb":1.265,"PbBottle3_ppb":0.316},{"SampleID":137,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":5.402,"PbBottle2_ppb":4.196,"PbBottle3_ppb":1.945},{"SampleID":138,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":43.19,"PbBottle2_ppb":7.688,"PbBottle3_ppb":4.39},{"SampleID":139,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":1.492,"PbBottle2_ppb":1.409,"PbBottle3_ppb":0.378},{"SampleID":140,"Zip 
Code":48503,"Ward":5,"PbBottle1_ppb":66.24,"PbBottle2_ppb":17.75,"PbBottle3_ppb":8.815},{"SampleID":141,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":1.799,"PbBottle2_ppb":0.032,"PbBottle3_ppb":0.031},{"SampleID":142,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":1.861,"PbBottle2_ppb":1.355,"PbBottle3_ppb":0.64},{"SampleID":143,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":2.672,"PbBottle2_ppb":2.001,"PbBottle3_ppb":1.094},{"SampleID":144,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":3.741,"PbBottle2_ppb":1.211,"PbBottle3_ppb":0.258},{"SampleID":145,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.934,"PbBottle2_ppb":0.374,"PbBottle3_ppb":0.424},{"SampleID":146,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":27.05,"PbBottle2_ppb":0.902,"PbBottle3_ppb":0.61},{"SampleID":147,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.174,"PbBottle2_ppb":0.291,"PbBottle3_ppb":4.055},{"SampleID":148,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":2.325,"PbBottle2_ppb":1.099,"PbBottle3_ppb":0.466},{"SampleID":149,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.966,"PbBottle2_ppb":0.253,"PbBottle3_ppb":0.201},{"SampleID":150,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.959,"PbBottle2_ppb":0.438,"PbBottle3_ppb":0.448},{"SampleID":151,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.823,"PbBottle2_ppb":1.881,"PbBottle3_ppb":0.412},{"SampleID":152,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":11.2,"PbBottle2_ppb":7.553,"PbBottle3_ppb":12.21},{"SampleID":153,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":5.668,"PbBottle2_ppb":3.341,"PbBottle3_ppb":3.268},{"SampleID":154,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":6.261,"PbBottle2_ppb":1.316,"PbBottle3_ppb":0.5},{"SampleID":155,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":4.797,"PbBottle2_ppb":1.594,"PbBottle3_ppb":1.264},{"SampleID":156,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.64,"PbBottle2_ppb":0.905,"PbBottle3_ppb":0.151},{"SampleID":158,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":8.713,"PbBottle2_ppb":2.799,"PbBottle3_ppb":50.97},{"SampleID":159,"Zip 
Code":48507,"Ward":9,"PbBottle1_ppb":2.544,"PbBottle2_ppb":1.099,"PbBottle3_ppb":0.498},{"SampleID":161,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":0.41,"PbBottle2_ppb":0.096,"PbBottle3_ppb":0.116},{"SampleID":162,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":32.85,"PbBottle2_ppb":35.76,"PbBottle3_ppb":9.103},{"SampleID":163,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":12.87,"PbBottle2_ppb":14.87,"PbBottle3_ppb":6.326},{"SampleID":164,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":38.02,"PbBottle2_ppb":38.7,"PbBottle3_ppb":38.94},{"SampleID":165,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":2.435,"PbBottle2_ppb":8.183,"PbBottle3_ppb":1.296},{"SampleID":166,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":2.997,"PbBottle2_ppb":1.867,"PbBottle3_ppb":1.512},{"SampleID":167,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":11,"PbBottle2_ppb":10.53,"PbBottle3_ppb":8.688},{"SampleID":168,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":6.219,"PbBottle2_ppb":12.33,"PbBottle3_ppb":4.202},{"SampleID":169,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":8.8,"PbBottle2_ppb":3.1,"PbBottle3_ppb":4.5},{"SampleID":170,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":8.071,"PbBottle2_ppb":0.947,"PbBottle3_ppb":0.839},{"SampleID":171,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":3.262,"PbBottle2_ppb":0.453,"PbBottle3_ppb":0.252},{"SampleID":172,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":2.267,"PbBottle2_ppb":0.541,"PbBottle3_ppb":0.391},{"SampleID":173,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":0.922,"PbBottle2_ppb":0.878,"PbBottle3_ppb":0.491},{"SampleID":174,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":27.02,"PbBottle2_ppb":31.25,"PbBottle3_ppb":11.37},{"SampleID":176,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":0.906,"PbBottle2_ppb":0.961,"PbBottle3_ppb":1.052},{"SampleID":177,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":2.85,"PbBottle2_ppb":6.862,"PbBottle3_ppb":0.951},{"SampleID":178,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":1.852,"PbBottle2_ppb":0.472,"PbBottle3_ppb":0.422},{"SampleID":179,"Zip 
Code":48506,"Ward":4,"PbBottle1_ppb":5.35,"PbBottle2_ppb":1.328,"PbBottle3_ppb":0.595},{"SampleID":180,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":25.21,"PbBottle2_ppb":4.337,"PbBottle3_ppb":1.019},{"SampleID":182,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":15.55,"PbBottle2_ppb":3.962,"PbBottle3_ppb":1.861},{"SampleID":183,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.793,"PbBottle2_ppb":0.533,"PbBottle3_ppb":0.391},{"SampleID":184,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":5.068,"PbBottle2_ppb":0.683,"PbBottle3_ppb":0.489},{"SampleID":185,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":26.64,"PbBottle2_ppb":8.878,"PbBottle3_ppb":6.619},{"SampleID":186,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.867,"PbBottle2_ppb":0.165,"PbBottle3_ppb":0.175},{"SampleID":189,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":19.16,"PbBottle2_ppb":12.54,"PbBottle3_ppb":7.719},{"SampleID":191,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":28.7,"PbBottle2_ppb":12.7,"PbBottle3_ppb":8.6},{"SampleID":192,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":20.22,"PbBottle2_ppb":8.908,"PbBottle3_ppb":6.677},{"SampleID":193,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":2.9,"PbBottle2_ppb":0.6,"PbBottle3_ppb":0.7},{"SampleID":194,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":18.86,"PbBottle2_ppb":5.051,"PbBottle3_ppb":2.548},{"SampleID":195,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":2.816,"PbBottle2_ppb":0.324,"PbBottle3_ppb":0.362},{"SampleID":196,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":118.4,"PbBottle2_ppb":40.78,"PbBottle3_ppb":39.99},{"SampleID":197,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":27.45,"PbBottle2_ppb":0.939,"PbBottle3_ppb":0.533},{"SampleID":198,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.2,"PbBottle2_ppb":0.1,"PbBottle3_ppb":0.1},{"SampleID":200,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":4.681,"PbBottle2_ppb":0.755,"PbBottle3_ppb":0.456},{"SampleID":201,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":11.57,"PbBottle2_ppb":6.08,"PbBottle3_ppb":1.782},{"SampleID":202,"Zip 
Code":48532,"Ward":8,"PbBottle1_ppb":6.557,"PbBottle2_ppb":0.289,"PbBottle3_ppb":0.371},{"SampleID":203,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":3.4,"PbBottle2_ppb":9.6,"PbBottle3_ppb":1.7},{"SampleID":204,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":0.7,"PbBottle2_ppb":0.2,"PbBottle3_ppb":0.2},{"SampleID":205,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":158,"PbBottle2_ppb":90.83,"PbBottle3_ppb":91.69},{"SampleID":206,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":0.977,"PbBottle2_ppb":0.47,"PbBottle3_ppb":0.381},{"SampleID":207,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":8.471,"PbBottle2_ppb":4.692,"PbBottle3_ppb":1.48},{"SampleID":208,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":11.47,"PbBottle2_ppb":23.15,"PbBottle3_ppb":7.129},{"SampleID":209,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":5.228,"PbBottle2_ppb":2.477,"PbBottle3_ppb":1.014},{"SampleID":210,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":0.956,"PbBottle2_ppb":0.196,"PbBottle3_ppb":0.157},{"SampleID":211,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":1.671,"PbBottle2_ppb":0.405,"PbBottle3_ppb":4.721},{"SampleID":212,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":1.152,"PbBottle2_ppb":0.708,"PbBottle3_ppb":0.282},{"SampleID":213,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":0.5,"PbBottle2_ppb":0.1,"PbBottle3_ppb":0.1},{"SampleID":214,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":10.74,"PbBottle2_ppb":2.331,"PbBottle3_ppb":1.628},{"SampleID":215,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":3.9,"PbBottle2_ppb":0.4,"PbBottle3_ppb":0.2},{"SampleID":216,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":2.149,"PbBottle2_ppb":0.368,"PbBottle3_ppb":0.333},{"SampleID":217,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":1.1,"PbBottle2_ppb":0.4,"PbBottle3_ppb":0.2},{"SampleID":218,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":7.087,"PbBottle2_ppb":9.467,"PbBottle3_ppb":1.28},{"SampleID":219,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":1.329,"PbBottle2_ppb":0.609,"PbBottle3_ppb":0.527},{"SampleID":220,"Zip 
Code":48507,"Ward":9,"PbBottle1_ppb":6.2,"PbBottle2_ppb":0.7,"PbBottle3_ppb":0.6},{"SampleID":221,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":0.8,"PbBottle2_ppb":0.26,"PbBottle3_ppb":0.255},{"SampleID":222,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":9.3,"PbBottle2_ppb":9.7,"PbBottle3_ppb":5},{"SampleID":223,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":2.1,"PbBottle2_ppb":1.2,"PbBottle3_ppb":0.5},{"SampleID":224,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":4.563,"PbBottle2_ppb":3.106,"PbBottle3_ppb":2.997},{"SampleID":225,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":4.808,"PbBottle2_ppb":6.196,"PbBottle3_ppb":1.523},{"SampleID":226,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":0.753,"PbBottle2_ppb":2.526,"PbBottle3_ppb":0.549},{"SampleID":227,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":1.862,"PbBottle2_ppb":1.213,"PbBottle3_ppb":0.898},{"SampleID":228,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.183,"PbBottle2_ppb":0.366,"PbBottle3_ppb":0.201},{"SampleID":229,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":8.2,"PbBottle2_ppb":3.2,"PbBottle3_ppb":2.6},{"SampleID":230,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":3.679,"PbBottle2_ppb":0.498,"PbBottle3_ppb":0.288},{"SampleID":231,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":2.37,"PbBottle2_ppb":7.333,"PbBottle3_ppb":3.797},{"SampleID":234,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":0.828,"PbBottle2_ppb":1.318,"PbBottle3_ppb":0.233},{"SampleID":235,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":0.719,"PbBottle2_ppb":0.254,"PbBottle3_ppb":0.058},{"SampleID":236,"Zip Code":48504,"Ward":1,"PbBottle1_ppb":2.822,"PbBottle2_ppb":1.221,"PbBottle3_ppb":0.258},{"SampleID":237,"Zip Code":48504,"Ward":8,"PbBottle1_ppb":2.867,"PbBottle2_ppb":0.723,"PbBottle3_ppb":0.744},{"SampleID":238,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":2.332,"PbBottle2_ppb":3.588,"PbBottle3_ppb":1.221},{"SampleID":240,"Zip Code":48503,"Ward":8,"PbBottle1_ppb":4.401,"PbBottle2_ppb":2.111,"PbBottle3_ppb":1.572},{"SampleID":241,"Zip 
Code":48504,"Ward":6,"PbBottle1_ppb":2.708,"PbBottle2_ppb":2.238,"PbBottle3_ppb":0.809},{"SampleID":242,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":34.13,"PbBottle2_ppb":6.002,"PbBottle3_ppb":1.71},{"SampleID":243,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":5.218,"PbBottle2_ppb":2.614,"PbBottle3_ppb":0.831},{"SampleID":244,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":15.73,"PbBottle2_ppb":13.95,"PbBottle3_ppb":3.584},{"SampleID":245,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":3.045,"PbBottle2_ppb":2.744,"PbBottle3_ppb":0.299},{"SampleID":246,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.1,"PbBottle2_ppb":0.5,"PbBottle3_ppb":0.3},{"SampleID":247,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.386,"PbBottle2_ppb":0.288,"PbBottle3_ppb":0.432},{"SampleID":248,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":0.915,"PbBottle2_ppb":0.354,"PbBottle3_ppb":0.306},{"SampleID":249,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":2.145,"PbBottle2_ppb":0.345,"PbBottle3_ppb":3.738},{"SampleID":250,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":4.056,"PbBottle2_ppb":0.547,"PbBottle3_ppb":0.378},{"SampleID":251,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.668,"PbBottle2_ppb":1.508,"PbBottle3_ppb":2.72},{"SampleID":252,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":7.575,"PbBottle2_ppb":1.362,"PbBottle3_ppb":1.094},{"SampleID":253,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":5.59,"PbBottle2_ppb":4.306,"PbBottle3_ppb":2.019},{"SampleID":254,"Zip Code":48503,"Ward":5,"PbBottle1_ppb":0.708,"PbBottle2_ppb":0.326,"PbBottle3_ppb":0.303},{"SampleID":255,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":1.701,"PbBottle2_ppb":4.397,"PbBottle3_ppb":1.287},{"SampleID":256,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":1.467,"PbBottle2_ppb":0.149,"PbBottle3_ppb":0.137},{"SampleID":258,"Zip Code":48504,"Ward":2,"PbBottle1_ppb":2.582,"PbBottle2_ppb":259.8,"PbBottle3_ppb":61.96},{"SampleID":259,"Zip Code":48505,"Ward":2,"PbBottle1_ppb":22.08,"PbBottle2_ppb":15.86,"PbBottle3_ppb":9.262},{"SampleID":260,"Zip 
Code":48507,"Ward":8,"PbBottle1_ppb":16.51,"PbBottle2_ppb":2.024,"PbBottle3_ppb":7.068},{"SampleID":262,"Zip Code":48507,"Ward":8,"PbBottle1_ppb":56.26,"PbBottle2_ppb":4.692,"PbBottle3_ppb":1.243},{"SampleID":263,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":2.433,"PbBottle2_ppb":1.334,"PbBottle3_ppb":1.376},{"SampleID":264,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":0.5,"PbBottle2_ppb":0.2,"PbBottle3_ppb":0.5},{"SampleID":265,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":29.13,"PbBottle2_ppb":11.57,"PbBottle3_ppb":6.388},{"SampleID":266,"Zip Code":48505,"Ward":1,"PbBottle1_ppb":12.3,"PbBottle2_ppb":0.5,"PbBottle3_ppb":0.4},{"SampleID":267,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":3.445,"PbBottle2_ppb":0.29,"PbBottle3_ppb":0.167},{"SampleID":268,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":16.49,"PbBottle2_ppb":12.83,"PbBottle3_ppb":9.018},{"SampleID":269,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":3.365,"PbBottle2_ppb":2.45,"PbBottle3_ppb":1.675},{"SampleID":270,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.154,"PbBottle2_ppb":0.176,"PbBottle3_ppb":0.12},{"SampleID":271,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":13.53,"PbBottle2_ppb":21.91,"PbBottle3_ppb":4.675},{"SampleID":272,"Zip Code":48504,"Ward":6,"PbBottle1_ppb":2.229,"PbBottle2_ppb":1.573,"PbBottle3_ppb":0.84},{"SampleID":273,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":28.91,"PbBottle2_ppb":5.471,"PbBottle3_ppb":3.056},{"SampleID":274,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":6.601,"PbBottle2_ppb":1.929,"PbBottle3_ppb":0.417},{"SampleID":275,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":0.948,"PbBottle2_ppb":0.27,"PbBottle3_ppb":0.207},{"SampleID":276,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":3.484,"PbBottle2_ppb":0.434,"PbBottle3_ppb":0.306},{"SampleID":278,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.888,"PbBottle2_ppb":0.359,"PbBottle3_ppb":0.322},{"SampleID":279,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":13.95,"PbBottle2_ppb":12.2,"PbBottle3_ppb":8.251},{"SampleID":280,"Zip 
Code":48504,"Ward":6,"PbBottle1_ppb":6.27,"PbBottle2_ppb":4.036,"PbBottle3_ppb":1.182},{"SampleID":281,"Zip Code":48506,"Ward":7,"PbBottle1_ppb":19.12,"PbBottle2_ppb":22.02,"PbBottle3_ppb":7.968},{"SampleID":282,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":1.633,"PbBottle2_ppb":0.465,"PbBottle3_ppb":0.238},{"SampleID":283,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.114,"PbBottle2_ppb":0.605,"PbBottle3_ppb":0.255},{"SampleID":284,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":3.9,"PbBottle2_ppb":0.558,"PbBottle3_ppb":0.504},{"SampleID":285,"Zip Code":48504,"Ward":1,"PbBottle1_ppb":3.521,"PbBottle2_ppb":0.45,"PbBottle3_ppb":0.321},{"SampleID":286,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":3.832,"PbBottle2_ppb":0.794,"PbBottle3_ppb":0.339},{"SampleID":287,"Zip Code":48505,"Ward":3,"PbBottle1_ppb":3.243,"PbBottle2_ppb":0.738,"PbBottle3_ppb":0.27},{"SampleID":289,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":0.99,"PbBottle2_ppb":0.25,"PbBottle3_ppb":0.263},{"SampleID":290,"Zip Code":48507,"Ward":9,"PbBottle1_ppb":1.203,"PbBottle2_ppb":19.26,"PbBottle3_ppb":1.626},{"SampleID":291,"Zip Code":48506,"Ward":3,"PbBottle1_ppb":2.261,"PbBottle2_ppb":0.102,"PbBottle3_ppb":0.407},{"SampleID":292,"Zip Code":48503,"Ward":4,"PbBottle1_ppb":16.99,"PbBottle2_ppb":6.32,"PbBottle3_ppb":3.585},{"SampleID":293,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":3.322,"PbBottle2_ppb":2.559,"PbBottle3_ppb":1.512},{"SampleID":294,"Zip Code":48506,"Ward":4,"PbBottle1_ppb":14.33,"PbBottle2_ppb":1.284,"PbBottle3_ppb":0.323},{"SampleID":295,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":18.11,"PbBottle2_ppb":20.21,"PbBottle3_ppb":4.263},{"SampleID":296,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":12.81,"PbBottle2_ppb":7.874,"PbBottle3_ppb":1.78},{"SampleID":298,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":1.083,"PbBottle2_ppb":0.322,"PbBottle3_ppb":0.26},{"SampleID":299,"Zip Code":48503,"Ward":7,"PbBottle1_ppb":29.59,"PbBottle2_ppb":3.258,"PbBottle3_ppb":1.843},{"SampleID":300,"Zip 
Code":48505,"Ward":1,"PbBottle1_ppb":4.287,"PbBottle2_ppb":4.345,"PbBottle3_ppb":4.905}]""")



### Understanding the structure of the data

Remind yourself of some of the Pandas operations you can use to **inspect** the data.

&#9989;&nbsp;  How do you look at the first few rows of the dataframe? Do that here.

In [32]:
# We can use pandas to help us see what the data looks like

## If we have a dataframe (like `flint_data`, which we loaded in),
##   then we can use a built-in Pandas function to look at the first few rows

flint_data.head()

Unnamed: 0,SampleID,Zip Code,Ward,PbBottle1_ppb,PbBottle2_ppb,PbBottle3_ppb,Notes
0,1,48504,6,0.344,0.226,0.145,
1,2,48507,9,8.133,10.77,2.761,
2,4,48504,1,1.111,0.11,0.123,
3,5,48507,8,8.007,7.446,3.384,
4,6,48505,3,1.951,0.048,0.035,


&#9989;&nbsp;  How do you look at the end of the dataframe? What if you want more rows than it is showing you? **Show how to do that here.**

In [37]:
# Put your code here
flint_data.tail()

# To see more rows, pass the number of rows n to .head() (first n rows)
# or to .tail() (last n rows).

# To find the total number of rows and columns in the dataframe, check flint_data.shape.
# If n exceeds the total number of rows, you simply get the whole dataframe back.

flint_data.head(10) # say you want to see 10 rows



Unnamed: 0,SampleID,Zip Code,Ward,PbBottle1_ppb,PbBottle2_ppb,PbBottle3_ppb,Notes
0,1,48504,6,0.344,0.226,0.145,
1,2,48507,9,8.133,10.77,2.761,
2,4,48504,1,1.111,0.11,0.123,
3,5,48507,8,8.007,7.446,3.384,
4,6,48505,3,1.951,0.048,0.035,
5,7,48507,9,7.2,1.4,0.2,
6,8,48507,9,40.63,9.726,6.132,
7,9,48503,5,1.1,2.5,0.1,
8,12,48507,9,10.6,1.038,1.294,
9,13,48505,3,6.2,4.2,2.3,


### Questions you should ask yourself about the data you loaded in

Look at the data above and ask yourself some questions.

**These kinds of questions are in the toolbox of every data scientist.** 

- What's the structure of the data I loaded?
  - How many columns of data do I see?
  - How many did I *expect* to see? 
  - Is this what I thought the data might look like when I loaded it?
- Do I think I understand what *kind* of data is in each column? 
  - Do the ZIP codes look like what ZIP codes should look like?
  - Which columns seem to be integer numbers?
  - Which columns do not contain anything useful?
- What things do I *not understand*? 
  - Does anything look weird? If so, what?
  - Is there data that's not showing up?
  - Are there values in the data that don't make sense?
  - If some data seems weird, can I work around that in my analysis?

### Quick summary of the data: describe

&#9989;&nbsp;  Use the `describe` method to examine the properties of the entire dataframe for each of the columns of data.

- What kind of information is `describe()` giving me?
- Does it make sense for all the types of data in my dataset?
- Are there any parts of the data where `describe()` seems...less than helpful?

In [4]:
# Put your code for testing out the "describe" function here
flint_data.describe()

Unnamed: 0,SampleID,Zip Code,Ward,PbBottle1_ppb,PbBottle2_ppb,PbBottle3_ppb
count,271.0,271.0,271.0,271.0,271.0,271.0
mean,150.856089,48505.103321,5.313653,10.645993,10.301144,3.660705
std,86.30895,3.114546,2.668291,21.560778,67.531251,10.5385
min,1.0,48502.0,0.0,0.344,0.032,0.031
25%,77.5,48503.0,3.0,1.578,0.46,0.306
50%,149.0,48505.0,6.0,3.521,1.4,0.831
75%,224.5,48506.0,8.0,9.05,4.8065,2.7405
max,300.0,48532.0,9.0,158.0,1051.0,94.52
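The summary above also reports a mean and standard deviation for identifier-like columns such as `SampleID` and `Zip Code`, where those statistics carry little meaning. A minimal sketch of restricting `describe()` to the measurement columns is shown here. It assumes a small stand-in dataframe, `flint_sample` (a hypothetical name), built from the first ten rows printed earlier in this notebook rather than the full `flint_data`.

```python
import pandas as pd

# Stand-in for `flint_data`: the first ten rows shown earlier in this notebook.
# `flint_sample` is a hypothetical name used only for this sketch.
flint_sample = pd.DataFrame({
    "SampleID": [1, 2, 4, 5, 6, 7, 8, 9, 12, 13],
    "Zip Code": [48504, 48507, 48504, 48507, 48505, 48507, 48507, 48503, 48507, 48505],
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
    "PbBottle2_ppb": [0.226, 10.77, 0.11, 7.446, 0.048, 1.4, 9.726, 2.5, 1.038, 4.2],
    "PbBottle3_ppb": [0.145, 2.761, 0.123, 3.384, 0.035, 0.2, 6.132, 0.1, 1.294, 2.3],
})

# Select only the lead measurements before summarizing.
lead_columns = ["PbBottle1_ppb", "PbBottle2_ppb", "PbBottle3_ppb"]
lead_summary = flint_sample[lead_columns].describe()
print(lead_summary)
```

With the real data you would replace `flint_sample` with `flint_data`; the numbers printed here describe only the ten-row stand-in.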


#### Key for data fields in this data set:

- *SampleID*: Unique study code for each sample
- *Zip Code*: location where samples were collected
- *Ward*: location where samples were collected
- *PbBottle1_ppb*: Concentration of lead in parts per billion in sample acquired at initial turn on of water
- *PbBottle2_ppb*: Concentration of lead in parts per billion in sample acquired after 45 seconds of flushing water
- *PbBottle3_ppb*: Concentration of lead in parts per billion in sample acquired after 120 seconds of flushing water



### Filtering data (using Boolean operations)

Earlier in this course, we learned about using Boolean variables and `if` statements. Let's use those ideas to filter the data.

A different way of selecting information in a data frame is based on the data values.  You can create a "mask" of true and false values (which can be useful for NumPy arrays as well!) by doing simple greater than/less than/equal to tests. For example, if you want to find out which rows of the column "PbBottle1_ppb" have values greater than 20, you'd say:

    flint_data['PbBottle1_ppb'] > 20

This returns a Pandas Series of true and/or false values, one for each row.  You can store this in its own variable:

    mask = flint_data['PbBottle1_ppb'] > 20
    
and then you can use it to actually *get* a data frame containing all of the values that are greater than 20.  For example,

    flint_data[mask]
   
will return a data frame that contains only the rows where  "PbBottle1_ppb" is greater than 20.  Note that the line above is equivalent to saying

    flint_data[flint_data['PbBottle1_ppb'] > 20]

but this second way of saying it is a bit harder to read.  Note that you can also use "less than" (<) or "equal to" (==) to do this sort of test!

Because the returned quantity is a data frame, you can then treat it the same way you did the original frame.  So, you can also choose to get just one column of the masked data, say the "SampleID", by saying:

    flint_data[mask]['SampleID']
    
 Which is the same thing as saying:
 
     new_data_frame = flint_data[mask]
     new_data_frame['SampleID']
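The masking steps above can be sketched end to end on a small stand-in dataframe. The name `flint_sample` is hypothetical; its values are the first ten rows printed earlier in this notebook, so only one reading exceeds 20 ppb. The `.loc[mask, column]` form at the end is an alternative to `flint_data[mask]['SampleID']` that selects rows and column in one step.

```python
import pandas as pd

# Stand-in for `flint_data` (hypothetical name): first ten rows shown earlier.
flint_sample = pd.DataFrame({
    "SampleID": [1, 2, 4, 5, 6, 7, 8, 9, 12, 13],
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
})

# The comparison produces one True/False value per row.
mask = flint_sample["PbBottle1_ppb"] > 20
print(mask.dtype)  # bool

# Indexing with the mask keeps only the rows where the mask is True.
high_rows = flint_sample[mask]
print(high_rows)

# Rows and one column selected in a single step.
high_ids = flint_sample.loc[mask, "SampleID"]
print(high_ids.tolist())  # [8]
```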

&#9989;&nbsp;  **Try all of these things below, and discuss the logic with your group members to make sure each person understands how this works.**

In [22]:
# Splitting the cell after each step to show what changed, step by step

flint_data['Zip Code'] 
# Select the 'Zip Code' column from the dataframe to see the starting point

0      48504
1      48507
2      48504
3      48507
4      48505
       ...  
266    48503
267    48503
268    48503
269    48503
270    48505
Name: Zip Code, Length: 271, dtype: int64

In [21]:
flint_data['Zip Code'] == 48504 
# Creates a mask: a Series of True/False values that is True wherever 'Zip Code' equals 48504

0       True
1      False
2       True
3      False
4      False
       ...  
266    False
267    False
268    False
269    False
270    False
Name: Zip Code, Length: 271, dtype: bool

In [23]:
zipcode_mask = flint_data['Zip Code'] == 48504 # store the mask for 'Zip Code' == 48504 in the variable 'zipcode_mask'
flint_data[zipcode_mask] # use the mask to get a data frame containing only the rows where the mask is True
    # (rows where 'Zip Code' equals 48504)

Unnamed: 0,SampleID,Zip Code,Ward,PbBottle1_ppb,PbBottle2_ppb,PbBottle3_ppb,Notes
0,1,48504,6,0.344,0.226,0.145,
2,4,48504,1,1.111,0.11,0.123,
17,22,48504,6,0.548,0.622,0.361,
18,23,48504,2,3.131,0.674,0.683,
19,24,48504,6,120.0,239.7,29.71,


### Creating new dataframes by filtering or sorting old ones

It's also important to note that since the returned quantity is a data frame, you can apply several masks of the sort above, one after another, to make a more complicated query.  For example, if I want every row where "PbBottle1_ppb" is greater than 20 and "PbBottle3_ppb" is also greater than 20, I could do:

    first_mask = flint_data['PbBottle1_ppb'] > 20
    first_masked_frame = flint_data[first_mask]
    second_mask = first_masked_frame['PbBottle3_ppb'] > 20
    second_masked_frame = first_masked_frame[second_mask]

Note that the code creates a data frame where the first sample had values greater than 20, and then it used *that* frame to create the mask (`second_mask`) that was used to create a new data frame where the third sample was also greater than 20.  You can then see how many samples still have high values after that by typing `second_masked_frame` to see the output, or just counting one of the columns by saying `second_masked_frame['SampleID'].count()`
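As a hedged alternative to chaining two masked frames, Pandas also lets you combine Boolean masks element-wise with `&` (and) or `|` (or); each comparison needs its own parentheses. The sketch below uses a hypothetical stand-in, `flint_sample`, built from the first ten rows printed earlier, and lower thresholds (5 and 2 ppb) than the text, because none of those ten rows exceed 20 ppb in both bottles.

```python
import pandas as pd

# Stand-in for `flint_data` (hypothetical name): first ten rows shown earlier.
flint_sample = pd.DataFrame({
    "SampleID": [1, 2, 4, 5, 6, 7, 8, 9, 12, 13],
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
    "PbBottle3_ppb": [0.145, 2.761, 0.123, 3.384, 0.035, 0.2, 6.132, 0.1, 1.294, 2.3],
})

# One combined mask: both conditions must be True for a row to be kept.
both_mask = (flint_sample["PbBottle1_ppb"] > 5) & (flint_sample["PbBottle3_ppb"] > 2)
combined = flint_sample[both_mask]

# The chained approach from the text, with the same thresholds.
first_masked_frame = flint_sample[flint_sample["PbBottle1_ppb"] > 5]
chained = first_masked_frame[first_masked_frame["PbBottle3_ppb"] > 2]

print(combined["SampleID"].tolist())
print(combined.equals(chained))
```

Both approaches select the same rows; the combined mask avoids creating the intermediate frame.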

&#9989;&nbsp;  **Try it below and then test out changing the threshold values to make sure that the results match your expectations.**

In [30]:
# IMPORTANT TO KNOW!

first_mask_zipcode = flint_data['Zip Code'] == 48504 # mask that is True where 'Zip Code' equals 48504
first_masked_frame = flint_data[first_mask_zipcode] # data frame containing only the rows where 'Zip Code' equals 48504
second_mask_pbbottle1 = first_masked_frame['PbBottle1_ppb'] > 30 # mask built from first_masked_frame...
# that is True where 'PbBottle1_ppb' > 30
second_masked_frame = first_masked_frame[second_mask_pbbottle1] # data frame containing only the rows where 'Zip Code' equals 48504 and 'PbBottle1_ppb' > 30
second_masked_frame

Unnamed: 0,SampleID,Zip Code,Ward,PbBottle1_ppb,PbBottle2_ppb,PbBottle3_ppb,Notes
0,1,48504,6,0.344,0.226,0.145,
2,4,48504,1,1.111,0.11,0.123,
17,22,48504,6,0.548,0.622,0.361,
18,23,48504,2,3.131,0.674,0.683,
19,24,48504,6,120.0,239.7,29.71,
24,29,48504,2,5.5,8.4,2.4,
30,35,48504,6,109.6,80.47,94.52,
32,37,48504,2,2.774,0.21,0.264,
39,44,48504,2,2.448,0.373,0.288,
41,46,48504,6,1.293,0.441,0.281,


### Filtering data (using `isin`)

Finally, you can select records in a data table based on a set of values in one of the columns using the `isin()` method.  (Note: read that as "**is in**".)  So, you can create a mask for a single SampleID by giving it a list with a single number in it, like so:

    flint_data['SampleID'].isin([4])
    
or a list containing many numbers:

    flint_data['SampleID'].isin([2,4,6,8])
    
and then create a data frame with just those values:

    mask = flint_data['SampleID'].isin([2,4,6,8])
    some_samples = flint_data[mask]
    
Note that you can also create the list separately and then create the mask using that list.  For example:

    list_of_samples = [2,4,6,8]
    mask = flint_data['SampleID'].isin(list_of_samples)
    some_samples = flint_data[mask]
    
is equivalent to the block of code above it.  Give it a try - **make sure to type `some_samples` at the end of the cell to see what it outputs**.  Also, remember that `some_samples` is a data frame, and you can treat it the same way you treated any other data frame!

In [38]:
list_of_samples = [2,4,6,8]
mask = flint_data['SampleID'].isin(list_of_samples)
some_samples = flint_data[mask]
some_samples

Unnamed: 0,SampleID,Zip Code,Ward,PbBottle1_ppb,PbBottle2_ppb,PbBottle3_ppb,Notes
1,2,48507,9,8.133,10.77,2.761,
2,4,48504,1,1.111,0.11,0.123,
4,6,48505,3,1.951,0.048,0.035,
6,8,48507,9,40.63,9.726,6.132,


&#9989;&nbsp; **Question**: What if I want to see all of the rows from a particular zip code? How would I use `.isin()` to do this for zip code 48507? **Put your code below**.

In [40]:
# Put your code here
mask3 = flint_data['Zip Code'].isin([48507]) # True for rows whose zip code is 48507
zip_48507 = flint_data[mask3] # keep only those rows
zip_48507

Unnamed: 0,SampleID,Zip Code,Ward,PbBottle1_ppb,PbBottle2_ppb,PbBottle3_ppb,Notes
1,2,48507,9,8.133,10.77,2.761,
3,5,48507,8,8.007,7.446,3.384,
5,7,48507,9,7.2,1.4,0.2,
6,8,48507,9,40.63,9.726,6.132,
8,12,48507,9,10.6,1.038,1.294,
10,15,48507,9,4.358,0.822,0.147,
14,19,48507,9,2.484,0.72,0.565,
15,20,48507,9,0.438,1.046,0.511,
46,51,48507,8,2.576,2.852,1.48,
62,72,48507,5,11.52,0.288,0.215,


____
### Outliers
____

Sometimes your data will contain values of the correct type (e.g., integer), but the value is very far from reasonable. Such values are called outliers. As a budding data scientist, you need to know how to find outliers and be able to assess whether they are real or not. For example, suppose you were given some data about speeds of cars driving on the freeway, and further suppose that one of the data entries was $174$ mph. Would you assume that it is real, or that the person who typed it in made a typo? What if the value was $548$ mph? If you know nothing else about the data, the responsibility falls on you to decide what to do with those outliers. 

&#9989;&nbsp; **How can you find the outliers? Are there any in this dataset? Maybe the people taking the lead measurements made mistakes from time to time? Think about how you can find potential outliers in this dataset, describe your methodology and suggest potential outliers.** 

<font size=+3>&#9998;</font> *Put your answer here*
* If you assume a normal distribution, values more than 3 standard deviations (std) below or above the mean are commonly treated as outliers.
    * Find the mean of 'PbBottle1_ppb', 'PbBottle2_ppb', and 'PbBottle3_ppb'.
    * Find the standard deviation of 'PbBottle1_ppb', 'PbBottle2_ppb', and 'PbBottle3_ppb'.
    * For 'PbBottle1_ppb', compute mean - 3 std and mean + 3 std; values outside that range are potential outliers.
         * Do the same for 'PbBottle2_ppb' and 'PbBottle3_ppb'.
* If you do NOT assume a normal distribution, especially if the data are skewed, use the quartiles and the interquartile range (IQR = Q3 - Q1): potential outliers are values below Q1 - 1.5 IQR or above Q3 + 1.5 IQR.
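The two rules described above can be sketched as follows. This is a minimal illustration, not a definitive method: it assumes a hypothetical stand-in dataframe, `flint_sample`, built from the first ten rows printed earlier in this notebook, and it checks only 'PbBottle1_ppb'.

```python
import pandas as pd

# Stand-in for `flint_data` (hypothetical name): first ten rows shown earlier.
flint_sample = pd.DataFrame({
    "SampleID": [1, 2, 4, 5, 6, 7, 8, 9, 12, 13],
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
})
values = flint_sample["PbBottle1_ppb"]

# Rule 1: more than 3 standard deviations from the mean.
mean, std = values.mean(), values.std()
sigma_mask = (values < mean - 3 * std) | (values > mean + 3 * std)

# Rule 2: outside Q1 - 1.5*IQR or Q3 + 1.5*IQR.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
iqr_mask = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)

sigma_ids = flint_sample.loc[sigma_mask, "SampleID"].tolist()
iqr_ids = flint_sample.loc[iqr_mask, "SampleID"].tolist()
print("3-std rule flags SampleIDs:", sigma_ids)
print("IQR rule flags SampleIDs:", iqr_ids)
```

In this ten-row stand-in, the 40.63 ppb reading is flagged by the IQR rule but not by the 3-std rule, because that single large reading inflates the standard deviation. Results on the full dataset may differ.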


In [None]:
# Test out any code you think might be useful for finding outliers


### Data Transforming

As you might have already learned earlier in the course, data almost always comes to you in an imperfect form and you need to perform cleaning and transforming. 

The lead sample readings are in ppb (parts per billion). But, the U.S. Environmental Protection Agency's guidelines are expressed in mg/L, or milligrams per liter. **What will we need to do in order to compare the data collected to the EPA's guideline threshold?**

*Note*: 1 part-per-million, or 1ppm, equals 1 milligram per liter (1mg/L)

Review this unit conversion:

$$ x \text{ ppb} \times \frac{1 \text{ ppm}}{1000 \text{ ppb}} \times \frac{1 \text{ mg/L}}{1 \text{ ppm}} $$

Would this give you the units that you want?

In [None]:
# Write any python you need to here

# Create a new column and set its values to the first draw's readings divided by 1000
# (You'll need to transform the data for the other readings as well!)
flint_data['new_b1_mg_L'] = flint_data['PbBottle1_ppb']/1000

# Check to see if it worked
flint_data.head()

# Add code to convert the other readings
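One possible way to convert all three readings at once is to loop over the columns. This is a sketch under stated assumptions: `flint_sample` is a hypothetical stand-in built from the first ten rows printed earlier, and the new column names `new_b2_mg_L` and `new_b3_mg_L` are made up here to follow the `new_b1_mg_L` pattern used above.

```python
import pandas as pd

# Stand-in for `flint_data` (hypothetical name): first ten rows shown earlier.
flint_sample = pd.DataFrame({
    "SampleID": [1, 2, 4, 5, 6, 7, 8, 9, 12, 13],
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
    "PbBottle2_ppb": [0.226, 10.77, 0.11, 7.446, 0.048, 1.4, 9.726, 2.5, 1.038, 4.2],
    "PbBottle3_ppb": [0.145, 2.761, 0.123, 3.384, 0.035, 0.2, 6.132, 0.1, 1.294, 2.3],
})

PPB_PER_MG_L = 1000  # 1 mg/L = 1 ppm = 1000 ppb

# Add one mg/L column per bottle: new_b1_mg_L, new_b2_mg_L, new_b3_mg_L.
lead_columns = ["PbBottle1_ppb", "PbBottle2_ppb", "PbBottle3_ppb"]
for bottle_number, column in enumerate(lead_columns, start=1):
    flint_sample[f"new_b{bottle_number}_mg_L"] = flint_sample[column] / PPB_PER_MG_L

print(flint_sample[["SampleID", "new_b1_mg_L", "new_b2_mg_L", "new_b3_mg_L"]])
```

For example, the 40.63 ppb first-draw reading becomes 0.04063 mg/L.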


### Thinking more about outliers

One way to find an outlier is through visualization of the data. One way of doing this is by using a boxplot. 

<div align="center"><img src="https://miro.medium.com/max/1280/1*2c21SkzJMf3frPXPAR_gZA.png" width="500px"></div>

A boxplot is a statistical graph that shows the range of the data, outliers, the median value, and the 25th and 75th percentile values. Boxplots are another great tool for understanding the statistics of your data. Try using matplotlib to plot a boxplot and identify some outliers in the cell below. 


In [None]:
#Put your code here

# matplotlib and seaborn both provide a boxplot function: plt.boxplot() and sns.boxplot()
# (the name is the same but the arguments differ: plt.boxplot() takes the values directly,
#  while sns.boxplot() can take a dataframe through its data argument)
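A minimal matplotlib sketch of such a boxplot is shown below. It assumes a hypothetical stand-in dataframe, `flint_sample`, built from the first ten rows printed earlier, and uses matplotlib's default whiskers (1.5 times the IQR), so any points drawn beyond the whiskers are the candidate outliers.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for `flint_data` (hypothetical name): first ten rows shown earlier.
flint_sample = pd.DataFrame({
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
    "PbBottle2_ppb": [0.226, 10.77, 0.11, 7.446, 0.048, 1.4, 9.726, 2.5, 1.038, 4.2],
    "PbBottle3_ppb": [0.145, 2.761, 0.123, 3.384, 0.035, 0.2, 6.132, 0.1, 1.294, 2.3],
})
lead_columns = ["PbBottle1_ppb", "PbBottle2_ppb", "PbBottle3_ppb"]

# One box per bottle; boxplot() returns a dict of the drawn artists.
fig, ax = plt.subplots()
box = ax.boxplot([flint_sample[column] for column in lead_columns])
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(lead_columns)
ax.set_ylabel("Lead concentration (ppb)")

# The "fliers" are the points drawn beyond the whiskers.
n_fliers = [len(flier.get_ydata()) for flier in box["fliers"]]
print(dict(zip(lead_columns, n_fliers)))
```

In a notebook the figure displays below the cell; the printed counts give the number of points beyond the whiskers for each bottle in this small stand-in.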

But, as you might have already figured out when you were thinking about outliers earlier, masks in Pandas also offer a way to find values that satisfy a specific condition.

&#9989;&nbsp; **Review the following code as a possible starting point to search for outliers if you didn't already come up with a method earlier. Modify and test the code as needed to locate other outliers in other columns. You might want to explore outliers in the new columns you just created as well.**

In [None]:
# find outlier data for bottle 2
# -- how do you decide what would be considered an outlier?
# -- What is the "right" threshold?
outliers = (flint_data['PbBottle2_ppb'] > 10)
flint_data[outliers]

# You should check for outliers in the other samples as well.


<font size=+3>&#9998;</font> Write any markdown here - explain what you did in the cells above!

### Getting another sense of the data

&#9989;&nbsp; What's the mean value of each column for the lead readings? Is there a pandas function you can use for this?

In [None]:
# Put your code here
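One way to get the column means is the dataframe's `.mean()` method; `.median()` is included in this sketch as a point of comparison, since a few large readings can pull the mean upward. The dataframe `flint_sample` is a hypothetical stand-in built from the first ten rows printed earlier, so the numbers are not those of the full dataset.

```python
import pandas as pd

# Stand-in for `flint_data` (hypothetical name): first ten rows shown earlier.
flint_sample = pd.DataFrame({
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
    "PbBottle2_ppb": [0.226, 10.77, 0.11, 7.446, 0.048, 1.4, 9.726, 2.5, 1.038, 4.2],
    "PbBottle3_ppb": [0.145, 2.761, 0.123, 3.384, 0.035, 0.2, 6.132, 0.1, 1.294, 2.3],
})
lead_columns = ["PbBottle1_ppb", "PbBottle2_ppb", "PbBottle3_ppb"]

# Mean and median of each lead column, returned as Series indexed by column name.
lead_means = flint_sample[lead_columns].mean()
lead_medians = flint_sample[lead_columns].median()

# Side-by-side comparison.
print(pd.DataFrame({"mean_ppb": lead_means, "median_ppb": lead_medians}))
```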


<font size=+3>&#9998;</font> Write markdown text here to explain what you did in the cells above!

Now pause for a moment and think about the following question:

Is the mean value of the lead levels representative of how good (or bad) the overall lead levels in Flint water are? If so, why; if not, why not? 

**Really take some time to think this one through.** 

&#9989;&nbsp; Try to justify your group's opinions by using plots, calculations, or anything else you feel appropriately supports your point. Examine the [Seaborn gallery](https://seaborn.pydata.org/examples/index.html) for some **new types of plots** that might be useful for achieving these goals.

In [None]:
# Add your code here.
# Use Seaborn for your plots.
# In order to make a seaborn style plot, you'll need to import seaborn and use the "set()" function
# let the instructor know if you run into issues with this.



<font size=+3>&#9998;</font> write any markdown notes here

How does the mean value for all the readings compare to the EPA's "action level"?  What about the range of values from the readings?  As a reminder, here's what the EPA guidelines say:

> Lead and copper are regulated by a treatment technique that requires systems to control the corrosiveness of their water. If more than 10% of tap water samples exceed the action level, water systems must take additional steps. For copper, the action level is 1.3 mg/L, and for lead is 0.015 mg/L. 
>
> Source: (http://www.epa.gov/your-drinking-water/table-regulated-drinking-water-contaminants#seven). 

&#9989;&nbsp;  **Write some code to try to determine whether or not the measurements in this dataset would meet the EPA requirements**

In [None]:
# Write any python you need here
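A rough sketch of one possible check follows. The assumptions are loud ones: it uses a hypothetical ten-row stand-in, `flint_sample`, built from the rows printed earlier; it looks only at the first-draw column; and it converts the 0.015 mg/L lead action level to 15 ppb so it can be compared with the readings directly. It computes the fraction of readings above the action level and the 90th percentile, which are two views of the same "more than 10% of samples" test.

```python
import pandas as pd

# Stand-in for `flint_data` (hypothetical name): first ten rows shown earlier.
flint_sample = pd.DataFrame({
    "SampleID": [1, 2, 4, 5, 6, 7, 8, 9, 12, 13],
    "PbBottle1_ppb": [0.344, 8.133, 1.111, 8.007, 1.951, 7.2, 40.63, 1.1, 10.6, 6.2],
})

ACTION_LEVEL_MG_L = 0.015                     # EPA lead action level quoted above
action_level_ppb = ACTION_LEVEL_MG_L * 1000   # 1 mg/L = 1000 ppb, so 15 ppb

first_draw = flint_sample["PbBottle1_ppb"]

# The mean of a Boolean mask is the fraction of True values.
fraction_above = (first_draw > action_level_ppb).mean()

# 90th percentile: the value that 90% of the readings fall at or below.
p90_ppb = first_draw.quantile(0.90)

# "More than 10% of tap water samples exceed the action level"
exceeds_rule = fraction_above > 0.10

print(f"Fraction above action level: {fraction_above:.0%}")
print(f"90th percentile: {p90_ppb:.3f} ppb")
print("More than 10% above action level:", bool(exceeds_rule))
```

The result for these ten rows says nothing about the full dataset; run the same steps on `flint_data` to compare with the instructors' findings.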


**What did you find?**

When the instructors analyzed the data, they found that mean levels are below the action limit, but their analysis clearly showed that 16% of the samples exceeded the EPA action limit. Put another way, the 90th percentile of the readings is more than 1.5 times the EPA legal limit. 

**How does this compare to your analysis?**

### Reflecting on your analysis

Is comparing the mean to the action level enough to tell us whether Flint had a definite problem with its drinking water? If so, why? If not, why not? 


If you were in a position to make a policy decision based on the data and the EPA guidelines, what recommendation would you make?

**Take some time to think this through. If you can, make a plot that you might show to a policy maker to make your point**

In [None]:
# Write any additional python you might need to here


<font size=+3>&#9998;</font> record your thoughts in this markdown cell

---

<img src="http://america.aljazeera.com/content/ajam/articles/2015/10/19/michigan-admits-mistakes-in-flint-water-testing/jcr:content/headlineImage.adapt.1460.high.Flint_Hed_20151019.1445295000666.jpg" width=400px>

---


## Congratulations, you're done!

Submit this assignment by uploading your notebook to the course Desire2Learn web page.  Go to the "In-Class Assignments" folder, find the appropriate submission link, and upload everything there. Make sure your name is on it!

&#169; Copyright 2018,  Michigan State University Board of Trustees