# C3M1 Lesson 3 Practice Lab: Air Quality - Looping Through Data

In this lab, you will continue analyzing the annual average fine particle (PM 2.5) levels for different areas in New York City. Now that you've learned more tools, let's see what you can do!

You will be working with a reduced version of the EPA Air Quality dataset, which contains New York City air quality surveillance data. In particular, you will be working with the following three features:

- `geo_place_name`: name of the geographical region being measured.
- `data_value`: annual average measurement of fine particles, measured in mcg/m3.
- `year`: the year the measurements correspond to.

## General instructions
- **Replace any instances of `None` with your own code**. All `None`s must be replaced.
- **Compare your results with the expected output** shown below the code.
- **Check the solution** using the expandable cell to verify your answer.

Happy coding!

<div style="background-color: #FAD888; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width:95%
">
<strong>Important note</strong>: Code blocks with None will not run properly. If you run them before completing the exercise, you will likely get an error. 

</div>

## Table of Contents

- [Step 1: Load the data](#step-1)
- [Step 2: Years over 12 mcg/m3](#step-2)
- [Step 3: Values per region](#step-3)


<a id="step-1"></a>

## Step 1: Load the data

<div style="background-color: #C6E2FF; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width:95%
">
    <strong>▶▶▶ Directions</strong> 
        <ol>
            <li>Run the cell below to load the data.</li>
        </ol>
</div>

In [None]:
from helper_functions import get_list

geo_place_name = get_list('geo_place_name')
data_value = get_list('data_value')
year = get_list('year')

num_observations = len(year)

<a id="step-2"></a>

## Step 2: Years over 12 mcg/m3

Fine particle concentrations above 12 mcg/m³ pose a moderate risk, potentially causing mild respiratory issues for people sensitive to air pollution. In this step you will identify every location and year whose value exceeds this threshold.

#### Exercise 1: Find the Locations and Years With Critical Fine Particle Values

<div style="background-color: #C6E2FF; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width:95%
">
    <strong>▶▶▶ Directions</strong> 
        <ol>
            <li>Iterate over each observation in the dataset. You can do this with a <code>for</code> loop.</li>
            <ul>
                <li><strong>Hint:</strong> Use the <code>range()</code> function in the loop so that you can access elements from all three lists with the same index.</li>
            </ul>
            <li>For each iteration, check whether the fine particle value of the observation is greater than 12.</li>
            <li>If the value of fine particles is higher than 12, print out the location and the year.</li>
            <ul>
                <li><strong>Hint:</strong> You can print out multiple values at once if you separate them with a comma within the <code>print()</code>.</li>
            </ul>
            <li>Print the results </li>
        </ol>
</div>
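
The directions above rely on the three lists being parallel: index `i` refers to the same observation in each list. As a minimal sketch of that pattern, the example below uses small made-up lists (`cities`, `temps`, `years_demo`, and `threshold` are hypothetical names and values, not part of the lab data). It also stores the matches in a list, which is optional but makes them easy to inspect afterwards.

```python
# Hypothetical toy data, not the lab dataset: three parallel lists
cities = ["Oslo", "Lima", "Cairo", "Lima"]
temps = [4.5, 19.0, 28.5, 21.0]
years_demo = [2020, 2020, 2021, 2021]

threshold = 20  # made-up cutoff for this illustration

matches = []
# range(len(...)) yields the indices 0, 1, 2, 3, which are valid for all three lists
for i in range(len(temps)):
    # Keep only the observations whose value exceeds the threshold
    if temps[i] > threshold:
        matches.append((cities[i], years_demo[i]))
        # Separate values with a comma to print them on one line
        print(cities[i], years_demo[i])
```

The same index `i` is used for every list, which is why any of the three lengths works in `range()` as long as the lists have equal length.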


In [None]:
# Use range to iterate over the length of any of the three lists
for i in range(None):
    # Check if the fine particles value is larger than 12
    if data_value[i] > None:
        # Print out location and year
        print(None[i], None[i])

<details open>
<summary style="background-color: #c6e2ff6c; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.01); width: 95%; text-align: left; cursor: pointer; font-weight: bold;">
Expected output:</summary> 

<small>

```mkdn
Chelsea - Clinton 2009
Chelsea - Clinton 2010
Chelsea - Clinton 2011
Chelsea - Clinton 2012
Chelsea - Clinton 2013
Chelsea - Clinton 2014
Upper West Side 2009
```

</small>
</details>


<details>
<summary style="background-color: #FDBFC7; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width: 95%; text-align: left; cursor: pointer; font-weight: bold;">
Click here to see the solution</summary> 

<ul style="background-color: #FFF8F8; padding: 10px; border-radius: 3px; margin-top: 5px; width: 95%; box-shadow: inset 0 2px 4px rgba(0, 0, 0, 0.1);">
   
Your solution should look something like this:

```python
# Use range to iterate over the length of any of the three lists
for i in range(len(data_value)):
    # Check if the fine particles value is larger than 12
    if data_value[i] > 12:
        # Print out location and year
        print(geo_place_name[i], year[i])
```
</details>

<a id="step-3"></a>

## Step 3: Values per region

In the previous lesson, you calculated the average fine particle value across all years. Now you will use a `for` loop to split the data by region and calculate the average for each region separately.

#### Exercise 2: Split the Data by Regions

You will begin by splitting the <code>data_value</code> variable into three lists, one for each region, to perform further calculations.

<div style="background-color: #C6E2FF; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width:95%
">
    <strong>▶▶▶ Directions</strong> 
        <ol>
            <li>Create three empty lists, one for each region.</li>
            <li>Iterate over each observation in the dataset. You can do this with a <code>for</code> loop.</li>
            <li>For each iteration, check whether the observation corresponds to "Chelsea - Clinton", "Long Island City - Astoria", or "Upper West Side", and append its value to the matching list.</li>
            <li>How many branches did your code need?</li>
            <li>Print the results.</li>
        </ol>
</div>
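
As a rough illustration of the branching described above, the sketch below splits one list into three groups with `if` / `elif` / `else`. The data and names (`labels`, `scores`, `red_scores`, and so on) are made up for this example and are not part of the lab.

```python
# Hypothetical toy data, not the lab dataset: two parallel lists
labels = ["red", "green", "blue", "red", "blue", "red"]
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# One empty list per group
red_scores = []
green_scores = []
blue_scores = []

for i in range(len(labels)):
    if labels[i] == "red":
        red_scores.append(scores[i])
    elif labels[i] == "green":
        green_scores.append(scores[i])
    else:
        # Assumes "blue" is the only label left once the first two are ruled out
        blue_scores.append(scores[i])

print(len(red_scores), len(green_scores), len(blue_scores))
```

The final `else` catches everything the earlier checks did not match, so this shape is only safe when you know the data contains exactly three categories. If other labels could appear, a third explicit `elif` would be the more cautious choice.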


In [None]:
# create an empty list for each region
CC_values = None # Chelsea - Clinton
LI_values = None # Long Island City - Astoria
UWS_values = None # Upper West Side

# iterate over the variable geo_place_name
for i in range(None):
    # check if the region is Chelsea - Clinton
    if geo_place_name[i] == "Chelsea - Clinton":
        # if True: append the data value to the list CC_values
        CC_values.append(data_value[i])    

    # if it is not  Chelsea - Clinton, 
    # check if the region is Long Island City - Astoria
    elif None == None:
        # if True: append the data value to the list LI_values
        LI_values.None(None)
        
    # if it is not  Chelsea - Clinton or Long Island City - Astoria
    else:
        # append the data value to the list UWS_values
        UWS_values.None(None)

# Print the results
print("There are ", len(CC_values), " observations for Chelsea - Clinton")
print("There are ", len(LI_values), " observations for Long Island City - Astoria")
print("There are ", len(UWS_values), " observations for Upper West Side")

<details open>
<summary style="background-color: #c6e2ff6c; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.01); width: 95%; text-align: left; cursor: pointer; font-weight: bold;">
Expected output:</summary> 

<small>

```mkdn
There are  13  observations for Chelsea - Clinton
There are  13  observations for Long Island City - Astoria
There are  13  observations for Upper West Side
```

</small>
</details>


<details>
<summary style="background-color: #FDBFC7; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width: 95%; text-align: left; cursor: pointer; font-weight: bold;">
Click here to see the solution</summary> 

<ul style="background-color: #FFF8F8; padding: 10px; border-radius: 3px; margin-top: 5px; width: 95%; box-shadow: inset 0 2px 4px rgba(0, 0, 0, 0.1);">
   
Your solution should look something like this:

```python
# create an empty list for each region
CC_values = [] # Chelsea - Clinton
LI_values = [] # Long Island City - Astoria
UWS_values = [] # Upper West Side

# iterate over the variable geo_place_name
for i in range(num_observations):
    # check if the region is Chelsea - Clinton
    if geo_place_name[i] == "Chelsea - Clinton":
        # if True: append the data value to the list CC_values
        CC_values.append(data_value[i])    

    # if it is not  Chelsea - Clinton, 
    # check if the region is Long Island City - Astoria
    elif geo_place_name[i] == "Long Island City - Astoria":
        # if True: append the data value to the list LI_values
        LI_values.append(data_value[i])
        
    # if it is not  Chelsea - Clinton or Long Island City - Astoria
    else:
        # append the data value to the list UWS_values
        UWS_values.append(data_value[i])

# Print the results
print("There are ", len(CC_values), " observations for Chelsea - Clinton")
print("There are ", len(LI_values), " observations for Long Island City - Astoria")
print("There are ", len(UWS_values), " observations for Upper West Side")
```
</details>

#### Exercise 3: Calculate the Average for Each Region

Now that you have one list of values per region, you can calculate the average fine particle value for each region.

<div style="background-color: #C6E2FF; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width:95%
">
    <strong>▶▶▶ Directions</strong> 
        <ol>
            <li>Use the lists you created in the previous cell to find the average for each region.</li>
            <li>Print the results.</li>
        </ol>
</div>
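
An average is the total of the values divided by the number of values. The sketch below shows that calculation on a made-up list (`sample_values` and `empty_values` are hypothetical and not part of the lab data), along with one possible guard for an empty list.

```python
# Hypothetical toy values, not the lab dataset
sample_values = [2.0, 4.0, 9.0]

# Average = sum of the values divided by how many values there are
average_sample = sum(sample_values) / len(sample_values)
print("Average:", average_sample)

# For an empty list, len() returns 0 and the division would raise
# ZeroDivisionError, so checking the length first is one way to avoid that
empty_values = []
if len(empty_values) > 0:
    print("Average:", sum(empty_values) / len(empty_values))
else:
    print("No values to average")
```

In this lab each region is expected to have observations, so the guard is not strictly needed, but it can help explain a `ZeroDivisionError` if one of your lists ends up empty.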

In [None]:
# Calculate the average value for each region
average_CC = sum(None) / len(None) # Chelsea - Clinton
average_LI = None / None # Long Island City - Astoria
average_UWS = None / None # Upper West Side

print("The Chelsea - Clinton area has had an average fine particle value of ", average_CC)
print("The Long Island City - Astoria area has had an average fine particle value of ", average_LI)
print("The Upper West Side area has had an average fine particle value of ", average_UWS)

<details open>
<summary style="background-color: #c6e2ff6c; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.01); width: 95%; text-align: left; cursor: pointer; font-weight: bold;">
Expected output:</summary> 

<small>

```mkdn
The Chelsea - Clinton area has had an average fine particle value of  11.284615384615385
The Long Island City - Astoria area has had an average fine particle value of  9.153846153846155
The Upper West Side area has had an average fine particle value of  9.338461538461539
```

</small>
</details>


<details>
<summary style="background-color: #FDBFC7; padding: 10px; border-radius: 3px; box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1); width: 95%; text-align: left; cursor: pointer; font-weight: bold;">
Click here to see the solution</summary> 

<ul style="background-color: #FFF8F8; padding: 10px; border-radius: 3px; margin-top: 5px; width: 95%; box-shadow: inset 0 2px 4px rgba(0, 0, 0, 0.1);">
   
Your solution should look something like this:

```python
# Calculate the average value for each region
average_CC = sum(CC_values) / len(CC_values) # Chelsea - Clinton
average_LI = sum(LI_values) / len(LI_values) # Long Island City - Astoria
average_UWS = sum(UWS_values) / len(UWS_values) # Upper West Side

print("The Chelsea - Clinton area has had an average fine particle value of ", average_CC)
print("The Long Island City - Astoria area has had an average fine particle value of ", average_LI)
print("The Upper West Side area has had an average fine particle value of ", average_UWS)
```
</details>

Congratulations on making it to the end of this lab. We hope you enjoyed it!