# Introduction to Data Science for Public Policy
## Class 4: Loops
## Thomas Monk

**Recap**. Let's start with Question 1 in the Problem Set - this will give us some practice with dictionaries. Download the problem set and the assignment whilst we're there. Then we'll move on to loops.

**Plan for today**.

After the loop slides, we're going to practice these immediately in Problem Set 1.

Then, we'll move on to the assignment (using real data!). This will allow us to practice all the skills we've learnt so far.

First, some answers to questions that came up yesterday.

**1**. We can use `zip()` to convert lists to dicts.

In [1]:
fruits = ["Apple", "Pear", "Peach", "Banana"]
prices = [0.35, 0.40, 0.40, 0.28]

fruit_dictionary = dict(zip(fruits, prices))

print(fruit_dictionary)

{'Apple': 0.35, 'Pear': 0.4, 'Peach': 0.4, 'Banana': 0.28}
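To see what `zip()` is doing before `dict()` gets involved, here is a minimal sketch reusing the `fruits` and `prices` lists from above. It also shows one thing to watch for: `zip()` stops at the shorter list, so an unmatched item is silently dropped. The `extra_fruits` list is a made-up example, not part of the problem set.

```python
fruits = ["Apple", "Pear", "Peach", "Banana"]
prices = [0.35, 0.40, 0.40, 0.28]

# zip() pairs up items position by position
pairs = list(zip(fruits, prices))
print(pairs)

# Hypothetical example: one more fruit than we have prices for
extra_fruits = fruits + ["Mango"]
short_dictionary = dict(zip(extra_fruits, prices))

# zip() stopped at the shorter list, so "Mango" never made it in
print("Mango" in short_dictionary)
```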


**2.** Printing keys and values.

In [6]:
us_state_to_abbrev = {"Alabama": "AL","Alaska": "AK","Arizona": "AZ","Arkansas": "AR","California": "CA","Colorado": "CO","Connecticut": "CT","Delaware": "DE","Florida": "FL","Georgia": "GA","Hawaii": "HI","Idaho": "ID","Illinois": "IL","Indiana": "IN","Iowa": "IA","Kansas": "KS","Kentucky": "KY","Louisiana": "LA","Maine": "ME","Maryland": "MD","Massachusetts": "MA","Michigan": "MI","Minnesota": "MN","Mississippi": "MS","Missouri": "MO","Montana": "MT","Nebraska": "NE","Nevada": "NV","New Hampshire": "NH","New Jersey": "NJ","New Mexico": "NM","New York": "NY","North Carolina": "NC","North Dakota": "ND","Ohio": "OH","Oklahoma": "OK","Oregon": "OR","Pennsylvania": "PA","Rhode Island": "RI","South Carolina": "SC","South Dakota": "SD","Tennessee": "TN","Texas": "TX","Utah": "UT","Vermont": "VT","Virginia": "VA","Washington": "WA","West Virginia": "WV","Wisconsin": "WI","Wyoming": "WY","District of Columbia": "DC","American Samoa": "AS","Guam": "GU","Northern Mariana Islands": "MP","Puerto Rico": "PR","United States Minor Outlying Islands": "UM","U.S. Virgin Islands": "VI",}

for key, value in us_state_to_abbrev.items():
    print(key, "is also known as", value)

Alabama is also known as AL
Alaska is also known as AK
Arizona is also known as AZ
Arkansas is also known as AR
California is also known as CA
Colorado is also known as CO
Connecticut is also known as CT
Delaware is also known as DE
Florida is also known as FL
Georgia is also known as GA
Hawaii is also known as HI
Idaho is also known as ID
Illinois is also known as IL
Indiana is also known as IN
Iowa is also known as IA
Kansas is also known as KS
Kentucky is also known as KY
Louisiana is also known as LA
Maine is also known as ME
Maryland is also known as MD
Massachusetts is also known as MA
Michigan is also known as MI
Minnesota is also known as MN
Mississippi is also known as MS
Missouri is also known as MO
Montana is also known as MT
Nebraska is also known as NE
Nevada is also known as NV
New Hampshire is also known as NH
New Jersey is also known as NJ
New Mexico is also known as NM
New York is also known as NY
North Carolina is also known as NC
North Dakota is also known as ND
Ohio
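If we only want the keys or only the values, a short sketch like the one below should do it. It uses a shortened, illustrative copy of the dictionary (three states only) to keep the output small.

```python
# A shortened, illustrative version of the dictionary above
us_state_to_abbrev = {"Alabama": "AL", "Alaska": "AK", "Arizona": "AZ"}

# .keys() and .values() give us each side of the dictionary on its own
state_names = list(us_state_to_abbrev.keys())
abbreviations = list(us_state_to_abbrev.values())
print(state_names)
print(abbreviations)

# Looping over a dictionary directly gives us the keys
for state in us_state_to_abbrev:
    print(state, "->", us_state_to_abbrev[state])
```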

# Loops

Loops are another big part of controlling the flow of a program, along with conditionals.

You'll notice that they play really nicely with lists and dicts, and are part of the **programming intuition** I want you to develop.

Hopefully we can start thinking in terms of loops - in terms of *automation* of tasks.

What is a loop? A loop is just a way to repeatedly execute some code.

Loops aren't absent from Stata; we're just deterred from thinking in that way because variables aren't as readily accessible!

Look at the example below - notice the *syntax*.

In [8]:
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
for planet in planets:
    print(planet, end=' ') # What's the end argument doing here?

Mercury Venus Earth Mars Jupiter Saturn Uranus Neptune 

The `for` loop iterates through each member of the list `planets`. Each iteration of the loop redefines our variable `planet` with the current item. 
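A common pattern is to update a running total on each pass through the loop. The sketch below is one way to do that with the same `planets` list; `total_letters` is just an illustrative name.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

total_letters = 0  # running total, updated on every iteration
for planet in planets:
    total_letters += len(planet)

print(total_letters)

# After the loop ends, planet still holds the last item it was given
print(planet)
```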

Let's count with a loop - we'll use the `range` function.

In [14]:
for x in range(10):
    print(x, end=" ")

0 1 2 3 4 5 6 7 8 9 

`range` gives us a sequence of values to iterate over. Remember that Python counts from zero, so `range(10)` gives us 10 values, starting at 0 and ending at 9!
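`range` also accepts a start value and a step size. A quick sketch (the variable names are only for illustration):

```python
# range(start, stop): starts at 1 and stops just before 11
one_to_ten = list(range(1, 11))
print(one_to_ten)

# range(start, stop, step): counts up in fives, stopping before 20
fives = list(range(0, 20, 5))
print(fives)

# Used directly in a loop, just like range(10)
for x in range(0, 20, 5):
    print(x, end=" ")
```

In every case the stop value itself is left out.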

We can even iterate through a string with a for loop, iterating through the characters.

In [15]:
s = 'steganograpHy is the practicE of conceaLing a file, message, image, or video within another fiLe, message, image, Or video.'
msg = ''
# print all the uppercase letters in s, one at a time
for char in s:
    if char.isupper():
        print(char, end='')    

HELLO
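Rather than printing each letter as we find it, we could collect the letters into a string and print it once at the end. This is a sketch of one way to do it, using the empty string `msg` as the place to store them:

```python
s = 'steganograpHy is the practicE of conceaLing a file, message, image, or video within another fiLe, message, image, Or video.'
msg = ''
# add each uppercase letter in s to the end of msg
for char in s:
    if char.isupper():
        msg += char

print(msg)
```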

`while` loops are another type of loop.

Instead of looping through a set of given items, these repeat the code within the loop **while** a condition is true.

In [17]:
i = 0
while i < 10:
    print(i, end=' ')
    i += 1 # increase the value of i by 1

0 1 2 3 4 5 6 7 8 9 

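We were able to count exactly as the `for` loop above did, just using a different type of loop.

Notice we defined our own counter here - `i` - which we had to update ourselves inside the loop. If we forgot the `i += 1` line, the condition would always be true and the loop would never stop.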
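`while` loops are most natural when we don't know in advance how many repetitions we need. As a rough sketch with made-up numbers (a balance of 100 growing at 5% a year), we can ask how many years it takes to double:

```python
balance = 100.0  # hypothetical starting balance
years = 0

# keep going until the balance has at least doubled
while balance < 200:
    balance *= 1.05  # hypothetical 5% growth each year
    years += 1

print(years)
```

We could not have written `range(...)` here without already knowing the answer.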

## List comprehensions - semi-advanced
List comprehensions are a cool feature of Python - they could make your life easier if you understand them - or you can ignore them completely!

They mix loops and lists together.

In [20]:
squares = [n**2 for n in range(10)]
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Notice that we've just built a list from a loop on a **single line**. How could we have done this otherwise?

In [22]:
squares = []
for n in range(10):
    squares.append(n**2)
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

We can also add an if condition:

In [23]:
short_planets = [planet for planet in planets if len(planet) < 6]
short_planets

['Venus', 'Earth', 'Mars']

Here's an example of filtering with an if condition and applying some transformation to the loop variable:

In [24]:
loud_short_planets = [planet.upper() + '!' for planet in planets if len(planet) < 6]
loud_short_planets

['VENUS!', 'EARTH!', 'MARS!']
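If the one-line version is hard to read, it can help to write it out as an ordinary loop. The sketch below should build the same list: the `if` comes from the end of the comprehension and the transformation comes from the start.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']

loud_short_planets = []
for planet in planets:
    if len(planet) < 6:                                  # the filter
        loud_short_planets.append(planet.upper() + '!')  # the transformation

print(loud_short_planets)
```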

This can get tricky and complex! But it can be very useful.