This morning we will practice **for loops**.  For loops are a fundamental skill all programmers need to master.  They allow us to cycle through each item of an iterable.  

An iterable is:

> An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict...

> Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …).

[source](https://docs.python.org/3/glossary.html)
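To make the definition concrete, here is a minimal sketch that loops over a few built-in iterables. The variable names and sample values are made up for illustration and are not part of the Brooklyn dataset.

```python
# Hypothetical sample values, chosen only to illustrate iteration
readings = [1.6, 0.7, 2.4]      # a list
label = "pm25"                  # a str
lookup = {"a": 1, "b": 2}       # a dict

# A for loop hands us one member of the list at a time
collected = []
for value in readings:
    collected.append(value)

# Iterating over a str yields one character at a time
letters = []
for character in label:
    letters.append(character)

# Iterating over a dict yields its keys
keys = []
for key in lookup:
    keys.append(key)

print(collected, letters, keys)
```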

For this exercise, you will import a dataset about air quality in Brooklyn.  The data was gathered using the [Open Air Quality API](https://openaq.org/#/?_k=7pqqwf).  

The data is stored in two lists, each containing 10,000 items. 

  - `measurements` holds daily measures of pm25 (particulate matter) readings in µg/m3.  
  - `dates` holds the corresponding timestamps of when a measurement was taken.
  

In [1]:
# Run this cell with no changes
import pickle
import numpy as np
import pandas as pd

with open('data/brooklyn.p','rb') as read_file:
    dates, measurements = pickle.load(read_file)

The indices align.  In other words, `dates[0]` holds the timestamp for `measurements[0]`.

In [2]:
len(dates) == len(measurements)

True
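As a small illustration of what aligned indices mean, the sketch below pairs items by position. It assumes short stand-in lists (plain strings instead of Timestamp objects) so that it runs without the pickle file.

```python
# Hypothetical stand-ins for the first few entries of `dates` and `measurements`
sample_dates = ["2019-06-27 21:00", "2019-06-27 22:00", "2019-06-27 23:00"]
sample_measurements = [1.6, 0.7, 1.6]

# Because the lists are aligned, the same index i refers to the same observation
paired = []
for i in range(len(sample_dates)):
    paired.append((sample_dates[i], sample_measurements[i]))

print(paired[0])
```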

The short-term standard for pm25 (24-hour or daily average) is 35 micrograms per cubic meter of air (µg/m3). We will treat this value as the safe level in the tasks below.
[source](https://www.health.ny.gov/environmental/indoors/air/pmq_a.htm)

# Task 1

> Using a `for loop`, count how often Brooklyn air quality is above the safe level.


In [3]:
# your code here

In [4]:
#__SOLUTION__
# Count the readings at or above the 35 µg/m3 standard (note: >= also counts readings of exactly 35)

count = 0
for measurement in measurements:
    if measurement >= 35:
        count += 1
        
count

12
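The task says "above" while the solution compares with `>=`, so it is worth seeing how the two comparisons differ at the boundary. This is a minimal sketch on hypothetical readings; `SAFE_LEVEL` is a name introduced here for illustration.

```python
SAFE_LEVEL = 35  # µg/m3, the short-term standard quoted earlier

# Hypothetical readings, including one that sits exactly on the threshold
sample = [34.9, 35, 35.1, 12.0]

strictly_above = 0
at_or_above = 0
for reading in sample:
    if reading > SAFE_LEVEL:
        strictly_above += 1
    if reading >= SAFE_LEVEL:
        at_or_above += 1

print(strictly_above, at_or_above)
```

The two counts only differ when a reading equals the threshold exactly, so on real data they may well agree.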

# Task 2

> Using a for loop, find the maximum measurement of pm25 in the dataset.


In [5]:
# Your answer here
max_pm25 = None

In [6]:
#__SOLUTION__

max_pm25 = 0

for measurement in measurements:
    if measurement > max_pm25:
        max_pm25 = measurement



In [7]:
# Run this cell. 
assert(max_pm25==max(measurements))

# If you assigned the incorrect value to the max_pm25 variable, you will receive an error message
print('Correct')

Correct
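Starting the running maximum at 0 works as long as at least one reading is positive. As a hedged variant that makes no assumption about the sign of the data, the sketch below starts from negative infinity. The sample values are hypothetical.

```python
# Hypothetical readings that are all negative, to show why the start value matters
sample = [-1.2, -0.4, -3.0]

# Starting at 0 never gets updated, because no reading exceeds 0
max_from_zero = 0
for reading in sample:
    if reading > max_from_zero:
        max_from_zero = reading

# Starting at negative infinity is replaced by the first reading we see
running_max = float("-inf")
for reading in sample:
    if reading > running_max:
        running_max = reading

print(max_from_zero, running_max)
```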


# Task 3

> Using a for loop, find the minimum measurement of pm25


In [8]:
# your code here

min_pm25 = None

In [9]:
#__SOLUTION__
# Find the minimum reading of pm25
min_pm25 = max_pm25

for measurement in measurements:
    if measurement < min_pm25:
        min_pm25 = measurement



In [10]:
# Check here
assert(min_pm25 == min(measurements))
print('Correct')

Correct


# Task 4

Each date in the dates list is a Timestamp object.

In [11]:
type(dates[0])

pandas._libs.tslibs.timestamps.Timestamp

For each date, we can find the day of the week by calling the method day_name like so:

`date.day_name()`
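For a quick feel of what `day_name` returns, here is a minimal sketch on a single hand-built timestamp. The value is a sample chosen for illustration, and the block assumes pandas is installed.

```python
import pandas as pd

# A sample timezone-aware timestamp, built by hand for illustration
sample_date = pd.Timestamp("2019-06-27 21:00:00", tz="UTC")

# day_name() is a method, so it needs the parentheses
day = sample_date.day_name()
print(day)
```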

With a for loop, create a list of the names of the days of the week for every entry.

In [12]:
# Your code here
day_of_week = None


In [13]:
#__SOLUTION__
day_of_week = []

for date in dates:
    day_of_week.append(date.day_name())


In [16]:
assert(day_of_week[0] == 'Thursday')
print('Correct')

Correct


# Task 5

The `zip` built-in function allows us to iterate through two lists at the same time. 
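A minimal sketch of `zip` in a for loop, using short hypothetical lists rather than the full dataset:

```python
# Hypothetical stand-ins for `dates` and `measurements`
sample_dates = ["2019-06-27 21:00", "2019-06-27 22:00", "2019-06-27 23:00"]
sample_measurements = [1.6, 0.7, 1.6]

# Each pass through the loop unpacks one tuple into two names
lines = []
for date, pm25 in zip(sample_dates, sample_measurements):
    lines.append(f"{date}: {pm25}")

# zip stops as soon as the shortest input runs out
truncated = list(zip([1, 2, 3], "ab"))

print(lines)
print(truncated)
```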

In [17]:
zip?

Init signature: zip(self, /, *args, **kwargs)
Docstring:     
zip(*iterables) --> A zip object yielding tuples until an input is exhausted.

   >>> list(zip('abcdefg', range(3), range(4)))
   [('a', 0, 0), ('b', 1, 1), ('c', 2, 2)]

The zip object yields n-length tuples, where n is the number of iterables
passed as positional arguments to zip().  The i-th element in every tuple
comes from the i-th iterable argument to zip().  This continues until the
shortest argument is exhausted.
Type:           type
Subclasses:     


Using a for loop and `zip`, create a dictionary whose keys are the dates and whose values are the corresponding pm25 measurements.


In [19]:
# Your code here
date_measurement = None



In [20]:
#__SOLUTION__

date_measurement = {}

for date, pm25 in zip(dates, measurements):
    date_measurement[date] = pm25

date_measurement

{Timestamp('2019-06-27 21:00:00+0000', tz='UTC'): 1.6,
 Timestamp('2019-06-27 22:00:00+0000', tz='UTC'): 0.7,
 Timestamp('2019-06-27 23:00:00+0000', tz='UTC'): 1.6,
 Timestamp('2019-06-28 00:00:00+0000', tz='UTC'): 2.4,
 Timestamp('2019-06-28 01:00:00+0000', tz='UTC'): 4.2,
 Timestamp('2019-06-28 02:00:00+0000', tz='UTC'): 4,
 Timestamp('2019-06-28 03:00:00+0000', tz='UTC'): 4,
 Timestamp('2019-06-28 04:00:00+0000', tz='UTC'): 4.7,
 Timestamp('2019-06-28 05:00:00+0000', tz='UTC'): 4.8,
 Timestamp('2019-06-28 06:00:00+0000', tz='UTC'): 5.2,
 Timestamp('2019-06-28 07:00:00+0000', tz='UTC'): 5.9,
 Timestamp('2019-06-28 08:00:00+0000', tz='UTC'): 4.8,
 Timestamp('2019-06-28 09:00:00+0000', tz='UTC'): 6.4,
 Timestamp('2019-06-28 10:00:00+0000', tz='UTC'): 5.1,
 Timestamp('2019-06-28 11:00:00+0000', tz='UTC'): 3.6,
 Timestamp('2019-06-28 12:00:00+0000', tz='UTC'): 4.8,
 Timestamp('2019-06-28 13:00:00+0000', tz='UTC'): 6,
 Timestamp('2019-06-28 14:00:00+0000', tz='UTC'): 6.1,
 Timestamp('2019

In [29]:
assert(date_measurement[list(date_measurement.keys())[0]] == 1.6)
print('Correct')

Correct


# Task 6
Find the average reading on each day of the week using whatever technique you care to choose.

In [None]:
# Your code here

In [117]:
#__SOLUTION__

for week_day in set(day_of_week):
    
    day_cumulative = 0
    count = 0
    
    for day, measurement in zip(day_of_week, measurements):
        if week_day == day:
            day_cumulative += measurement
            count += 1
    
    print(week_day, day_cumulative/count)

Sunday 6.031232876712325
Wednesday 5.821192528735631
Tuesday 5.342183098591558
Friday 5.607234617985123
Saturday 5.411336599020287
Monday 5.555672268907566
Thursday 6.100790229885051
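The nested loops above rescan the full list once per weekday. As one alternative, here is a minimal sketch that accumulates totals and counts in a single pass. It uses hypothetical sample values rather than the Brooklyn data, and the names `totals`, `counts`, and `averages` are introduced only for this example.

```python
# Hypothetical stand-ins for `day_of_week` and `measurements`
sample_days = ["Thursday", "Thursday", "Friday", "Friday", "Friday"]
sample_measurements = [2.0, 4.0, 3.0, 6.0, 9.0]

totals = {}  # running sum of readings per weekday
counts = {}  # number of readings per weekday

for day, measurement in zip(sample_days, sample_measurements):
    # dict.get returns 0 the first time a weekday is seen
    totals[day] = totals.get(day, 0) + measurement
    counts[day] = counts.get(day, 0) + 1

averages = {}
for day in totals:
    averages[day] = totals[day] / counts[day]

print(averages)
```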


# Bonus
Dictionaries keep their items in insertion order (guaranteed since Python 3.7), but they are not sorted by value.  However, there are ways to sort dictionaries. Google how to sort a dictionary, and find the dates of the top 5 worst air-quality readings in Brooklyn.

In [38]:
# Your code here
airquality_sorted = None

In [39]:
#__SOLUTION__
# Sort the (date, measurement) pairs by measurement, highest first, and keep the top 5

airquality_sorted = dict(sorted(date_measurement.items(), key=lambda x: x[1], reverse=True)[:5])
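The one-liner above packs several steps together. As a sketch, the same idea can be unpacked on a small hypothetical dictionary (string keys instead of Timestamps) to see what each step produces:

```python
# Hypothetical dictionary standing in for `date_measurement`
sample = {"a": 3.5, "b": 9.1, "c": 0.2, "d": 7.7}

# Step 1: .items() gives (key, value) pairs; sort them by the value, largest first
ranked = sorted(sample.items(), key=lambda item: item[1], reverse=True)

# Step 2: slice off the top entries
top_two = ranked[:2]

# Step 3: rebuild a dict, which keeps the sorted insertion order
top_two_dict = dict(top_two)

print(ranked)
print(list(top_two_dict.keys()))
```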

In [45]:
assert list(airquality_sorted.values()) == [55.4, 54.7, 53.3, 47.5, 44]
print('Correct')

Correct
