Control Structures
------------------

We've spent some time going into detail about some of the data types and structures available in Python. It's now time to talk about how to navigate through this data and use it to make decisions. Traversing data and making decisions based upon it are common to every programming language, and are known as control flow. Python provides rich control flow, with many conveniences for power users. Here, we cover just the basics; to learn more, please [consult the documentation](http://docs.python.org/2/tutorial/controlflow.html). 

A common theme throughout this discussion of control structures is the notion of a "block of code." Blocks of code are **demarcated by a specific level of indentation**, and are typically introduced by a control structure statement whose line ends with a colon, `:`. We'll see examples below. 

Finally, note that control structures can be nested arbitrarily, depending on the tasks you're trying to accomplish. 
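
As a minimal sketch of both ideas, the snippet below nests an `if` inside a `for` loop (both are covered in detail later in this section). The list `readings` and the threshold of 110 are made up for illustration. Each extra level of indentation opens a new block, and a block ends where the indentation returns to the previous level.

```python
# Hypothetical data, chosen only for illustration
readings = [108, 112, 115, 109]

hot = []  # will collect the readings above the threshold
for r in readings:       # the colon opens the block of the loop
    if r > 110:          # a nested block, indented one level deeper
        hot.append(r)
    # back at the loop's indentation level: still inside the for loop
    print("checked", r)

# no indentation: this line runs once, after the loop has finished
print("hot readings:", hot)
```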

### while statements:

A while loop keeps iterating as long as a given condition is true, and stops once the condition becomes false. For example, the following loop counts from 0 to 5 and stops when `i` becomes larger than 5:

In [None]:
i=0
while i<=5:
    print(i)
    i = i + 1

Or consider the following example: The tea starts at 115 degrees Fahrenheit. You want it at 110 degrees. A chip of ice turns out to lower the temperature one degree every second. You test the temperature each time, and also print out the temperature before reducing the temperature. In Python you could write and run the code below:

In [None]:
import time # This is just to use the time.sleep function

temperature = 115  
while temperature > 110: # keep looping while the tea is too hot
    print(temperature)
    time.sleep(1.0) # Wait for 1 sec
    temperature = temperature - 1
     
print('The tea is cool enough.')

Consider the code below. It will keep asking the user for a password until the user enters the correct password, which is `ilovepython`.

In [None]:
password = ""
secret_password = "ilovepython"
while password != secret_password:
    password = input("Please enter the password: ")
    if password == secret_password:
        print("Thank you. You have entered the correct password")
    else:
        print("Sorry, the value entered is incorrect - try again")

And here is a simplified simulator that asks you how much money you are withdrawing from a bank account each year, until you run out of money. Notice that the loop will keep running forever if you never withdraw more money than you have.

In [None]:
money_in_bank = 1000
interest = 6
year = 2017
while money_in_bank>0:
    print("At the beginning of {y} you have ${m}.".format(y=year, m=money_in_bank))
    withdrawal = int(input("How much do you want to withdraw in {y}? ".format(y = year)))
    money_in_bank = money_in_bank - withdrawal
    money_in_bank = money_in_bank * (1 + interest/100)
    # Report the end-of-year balance before moving on to the next year
    print("At the end of {y} you have ${m}.".format(y=year, m=money_in_bank))
    year = year + 1
    print("-----------------")
print("You have no money left!")
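
Because the simulator above depends on user input, it can run forever. The sketch below is a non-interactive variant under two assumptions that are not in the original: a fixed `withdrawal` of 300 per year, and a `max_years` cap added to the loop condition as a guard. It shows one common way to make sure a while loop always terminates.

```python
money_in_bank = 1000
interest = 6        # percent per year
withdrawal = 300    # assumed fixed amount, instead of asking the user
max_years = 50      # assumed safety cap, so the loop cannot run forever

years = 0
while money_in_bank > 0 and years < max_years:
    money_in_bank = money_in_bank - withdrawal
    money_in_bank = money_in_bank * (1 + interest / 100)
    years = years + 1

print("The loop stopped after", years, "years")
```

With these numbers the balance drops below zero in the fourth year, so the cap is never reached; it only matters when the withdrawals are too small to empty the account.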

#### Exercise

* Write a program that prompts the user to enter numbers, one per line, ending with a line containing 0, and keep a running sum of the numbers. Print out the partial sum, as the numbers are entered. Once you get a zero, stop and print out the final sum.

### Break and Continue: 

These two statements modify the iteration of loops. `break` *exits immediately* the *innermost loop* in which it appears. In contrast, `continue` skips the rest of the current iteration and goes on to the *next iteration of the same loop*.

In [None]:
cnt = 0
while True:
    cnt = cnt + 1
    n = input("Please enter 'hello':")
    if n == 'hello':
        break
    print("I politely asked you to say 'hello'!")
    
print('Hello to you, too. Did I have to ask', cnt, "times?")

In [None]:
import time 
temperature = 133 
while temperature > 110: # keep looping while the tea is too hot
    temperature = temperature - 1
    if temperature % 5 != 0: # If the temperature is not divisible by 5
        continue # We keep running the loop, but will not print
                 # the temperature and have a delay
    time.sleep(0.5)
    print(temperature)
     
print('The tea is cool enough.')
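
To compare the two statements without user input or delays, the sketch below runs the same counting loop twice and records which values get processed. The names `seen_with_break` and `seen_with_continue` and the cutoff value 4 are chosen only for this illustration.

```python
# Variant 1: break leaves the loop entirely when n reaches 4
seen_with_break = []
n = 0
while n < 7:
    n = n + 1
    if n == 4:
        break
    seen_with_break.append(n)

# Variant 2: continue skips only the iteration where n is 4
seen_with_continue = []
n = 0
while n < 7:
    n = n + 1   # increment before continue, otherwise the loop never ends
    if n == 4:
        continue
    seen_with_continue.append(n)

print(seen_with_break)     # [1, 2, 3]
print(seen_with_continue)  # [1, 2, 3, 5, 6, 7]
```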

### for Statements:

**See also LPTHW, Exp 32.**

`for` statements are a convenient way to iterate through the values contained in a data structure. The loop goes through the elements of the data structure one at a time, assigning each element in turn to a variable. The code block associated with the `for` statement (or for loop) is then executed with this value.

In [None]:
set_a = {1, 2, 3, 4}
for i in set_a:
    print(i, " squared is:", i*i )

In [None]:
print("a more complex block")
set_a = {1, 2, 3, 4, 5, 6}
for i in set_a:
    # print(i)
    if i >= 3:
        print("==> ",i, " squared is:", i*i )

In [None]:
print("this also works for lists")
list_a = [1,2,3]
for num in list_a:
    print(num)

In [None]:
print("dictionaries let you iterate through keys, values, or both")
phones = {
    "Panos": "212-998-0803",
    "Maria": "656-233-5555",
    "John": "693-232-5776",
    "Jake": "415-794-3423"
}

In [None]:
print("Iterating over keys")
for k in phones.keys():
    print("key =", k, ", value =", phones[k])

In [None]:
print("Iterating over values")
for v in phones.values():
    print(v)

In [None]:
print("Iterating over both keys and values")
# items() returns *tuples* that correspond to key-value pairs
# ("Panos", "212-998-0803"), ("Maria", "656-233-5555"), etc.
for (k,v) in phones.items():
    print(k, v)

In [None]:
nba_teams = ["Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlotte Hornets", "Chicago Bulls", "Cleveland Cavaliers", "Dallas Mavericks", "Denver Nuggets", "Detroit Pistons", "Golden State Warriors", "Houston Rockets", "Indiana Pacers", "LA Clippers", "Los Angeles Lakers", "Memphis Grizzlies", "Miami Heat", "Milwaukee Bucks", "Minnesota Timberwolves", "New Orleans Pelicans", "New York Knicks", "Oklahoma City Thunder", "Orlando Magic", "Philadelphia 76ers", "Phoenix Suns", "Portland Trail Blazers", "Sacramento Kings", "San Antonio Spurs", "Toronto Raptors", "Utah Jazz"]
print("The list contains", len(nba_teams), "teams")
for team in nba_teams:
    print(team)

#### Exercise

* print the names of the people from the dictionary below, by iterating through the keys
* print the age of each person, by iterating through the keys, and then looking up the "YOB" entry.
* print the names of people born after 1980
* print the number of children for each person. You need to check if the "Children" list exists in the dictionary.
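
As a hint for the last item, here is a minimal sketch of two common ways to deal with a key that may be missing. The `person` record and the `"Awards"` key are hypothetical and deliberately not the exercise data.

```python
# Hypothetical record, not part of the exercise data
person = {"Job": "Analyst", "YOB": 1990}

# Option 1: test for the key with the "in" operator
has_awards = "Awards" in person        # False, the key is absent

# Option 2: dict.get returns a default value when the key is missing
awards = person.get("Awards", [])      # falls back to an empty list

print(has_awards, len(awards))
```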

In [None]:
data = {
        "Foster": {
            "Job": "Professor", 
            "YOB": 1965, 
            "Children": ["Hannah"],
            "Awards": ["Best Teacher 2014", "Best Researcher 2015"],
            "Salary": 120000
        }, 
        "Joe": {
            "Job": "Data Scientist", 
            "YOB": 1981,
            "Salary": 200000
        },
        "Maria": { 
            "Job": "Software Engineer", 
            "YOB": 1993, 
            "Children": [],
            "Awards": ["Dean's List 2013", "Valedictorian 2011", "First place in Math Olympiad 2010"]
        }, 
        "Panos": { 
            "Job": "Professor", 
            "YOB": 1976, 
            "Children": ["Gregory", "Anna"]
        },
    }

In [None]:
## Print the names of people in the data

In [None]:
## Print the names and age

In [None]:
## Print the names of people born after 1980

In [None]:
## Print the number of children for each person

### Using Break/Continue with for loops

Let's see an example of using `break` and `continue` within a for loop.

In [None]:
nba_teams = ["Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlotte Hornets", "Chicago Bulls", "Cleveland Cavaliers", "Dallas Mavericks", "Denver Nuggets", "Detroit Pistons", "Golden State Warriors", "Houston Rockets", "Indiana Pacers", "LA Clippers", "Los Angeles Lakers", "Memphis Grizzlies", "Miami Heat", "Milwaukee Bucks", "Minnesota Timberwolves", "New Orleans Pelicans", "New York Knicks", "Oklahoma City Thunder", "Orlando Magic", "Philadelphia 76ers", "Phoenix Suns", "Portland Trail Blazers", "Sacramento Kings", "San Antonio Spurs", "Toronto Raptors", "Utah Jazz"]
print("The list contains", len(nba_teams), "teams")

We will now search through the list of teams, to find whether there is a team that contains the `looking_for` string. Try the variant with the `break` and with the `continue`, to see the difference.

In [None]:
looking_for = "Brooklyn"
for team in nba_teams:
    if looking_for in team: 
        print("We found the team:", team, "containing", looking_for)
        print("We will stop searching now")
        break # we leave the loop entirely
        # continue # we skip the rest of this iteration and move on to the next team
    # else:
    print(team, "does not contain the string", looking_for)
    
print("Out of the loop!")

Technically, we can simulate the use of `break` and `continue` with `if-else` statements, but their usage often makes the code easier to read. Consider the following example, where we want to search for a team, and if we find a team that matches, we want to see if they made it to the playoffs.

In [None]:
playoff = ["Atlanta Hawks", "Boston Celtics", "Charlotte Hornets","Cleveland Cavaliers", "Dallas Mavericks", "Detroit Pistons","Golden State Warriors", "Houston Rockets", "Indiana Pacers", "LA Clippers", "Memphis Grizzlies", "Miami Heat", "Oklahoma City Thunder", "Portland Trail Blazers", "San Antonio Spurs", "Toronto Raptors"]
print("The list contains", len(playoff), "teams")

In [None]:
looking_for = "Clippers"

for team in nba_teams:
    # If the team does not match, we continue searching
    # without executing the remaining code
    if looking_for not in team: 
        continue
 
    # If we have found a matching team, we check for their status in playoffs
    if team in playoff:
        print(team, "was in the playoffs!")
    else:
        print(team, "was not in the playoffs...")
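
For comparison, the same search can be sketched without `continue`, by nesting the playoff check inside the `if`. The lists `teams` and `in_playoffs` are shortened, hypothetical stand-ins for the full lists, so that the snippet runs on its own.

```python
# Shortened, hypothetical versions of the team lists
teams = ["Boston Celtics", "LA Clippers", "Utah Jazz"]
in_playoffs = ["Boston Celtics", "LA Clippers"]
looking_for = "Clippers"

results = []
for team in teams:
    if looking_for in team:
        # Everything that should happen for a match sits one level deeper
        if team in in_playoffs:
            results.append(team + " was in the playoffs!")
        else:
            results.append(team + " was not in the playoffs...")
    # Non-matching teams simply fall through to the next iteration

print(results)
```

The behavior is the same, but the main logic ends up one indentation level deeper. With several such conditions the nesting grows quickly, which is where `continue` tends to keep the code flatter.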


### Ranges of Integers:

Often it is convenient to define (and iterate through) ranges of integers. Python has a convenient range function that allows you to do just this.

In [None]:
list(range(20))

In [None]:
print(list(range(10)))  # start at zero, stop before the specified ceiling value
# range(10) <=> range(0,10)
for i in range(10):
    print(i, "squared is", i*i)

In [None]:
# When range has two parameters, it starts from the first parameter
# and stops just before the second
print(list(range(-5, 5)))  # from the left value, up to but not including the right value

In [None]:
# When range has a third argument, this is the "step" value
print(list(range(-5, 50, 5)))  # from the left value, up to but not including the middle value, incrementing by the right value
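
The step can also be negative, in which case `range` counts downwards and still excludes the stop value. A small sketch, with the variable names `countdown` and `empty` chosen only for illustration:

```python
# Counting down from 10 in steps of 2; the stop value 0 is excluded
countdown = list(range(10, 0, -2))
print(countdown)  # [10, 8, 6, 4, 2]

# If the step points away from the stop value, the range is empty
empty = list(range(0, 10, -1))
print(empty)  # []
```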

#### Warning

Those who are already familiar with programming will tend to write code like this:

In [None]:
# Old style, using indexing for loops
names = ["Abe", "Bill", "Chris", "Dorothy", "Ellis"]
for i in range(0,len(names)):
    print(names[i])

instead of 

In [None]:
# Pythonic style, use iterators
names = ["Abe", "Bill", "Chris", "Dorothy", "Ellis"]
for n in names:
    print(n)

*Avoid* using the indexing style method for iterating through data structures. While technically both generate the same result, the "Pythonic" way of doing things is the latter: It is simpler, more readable, and less prone to errors. 
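
When the position of each element is genuinely needed, the built-in `enumerate` function provides it without manual indexing. A minimal sketch; the `numbered` list and the choice of `start=1` are just for illustration:

```python
names = ["Abe", "Bill", "Chris", "Dorothy", "Ellis"]

numbered = []
# enumerate yields (position, element) pairs; start=1 begins counting at 1
for position, name in enumerate(names, start=1):
    numbered.append(str(position) + ". " + name)

print(numbered)
```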

#### Exercise

* print your name 10 times (easy, peasy). 
* print on the screen a "triangle", by printing first "#", then "##", then "###", etc. Repeat 10 times; _Hint: The command `print(i*'#')` will print the character '#' a total of `i` times._

In [None]:
#
##
###
####
#####
######
#######
########
#########
##########

List Comprehensions
-------------------

The practical data scientist often faces situations where one list is to be transformed into another list, transforming the values in the input array, filtering out certain undesired values, etc. List comprehensions are a natural, flexible way to perform these transformations on the elements in a list. 

The syntax of list comprehensions is based on the way mathematicians define sets and lists, a syntax that leaves it clear what the contents should be:

+ `S = {x² : x in {0 ... 9}}`

Python's list comprehensions give a very natural way to write statements just like these. It may look strange early on, but it becomes a very natural and concise way of creating lists, without having to write for-loops.

In [None]:
S = [] # initialize the list
for x in range(10):
    S.append(x*x)
print(S)

In [None]:
# This code below will create a list with the squares
# of the numbers from 0 to 9 
S = [] # we create an empty list
for i in range(10): # We iterate over all numbers from 0 to 9
    S.append(i*i) # We add in the list the square of the number i
print(S)  # we print the list

In [None]:
S = [i*i for i in range(10)]
print(S)

Now let's do one more example:

+ `V = (1, 2, 4, 8, ..., 2¹²)`


In [None]:
V = [2**i for i in range(13)]
print(V)

In [None]:
V= []
for i in range(13):
    V.append(2**i)
print(V)

### The *if* statement within a list comprehension

Now let's consider the following case:

+ `M = {x | x in S and x even}`

**Note that the list comprehension for deriving M uses an `if` clause to filter out the values that aren't of interest**, keeping only the even squares.

In [None]:
S = [i*i for i in range(10)]
print(S)

In [None]:
M = []
for i in S: # iterate through all elements in S
    if i%2 == 0: # if i is an even number
        M.append(i) # ..add it to the list
print(M)

In [None]:
M = [x for x in S if x%2 == 0]
print(M)
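
The transformation and the filter can also be combined in a single comprehension, without building the intermediate list first. The sketch below is one way to write it; the name `even_squares` is chosen for illustration.

```python
# Square each number from 0 to 9, keeping only the squares that are even
even_squares = [i*i for i in range(10) if (i*i) % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
```

The expression before `for` describes each output value, and the `if` clause at the end decides which values are kept.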

These are simple examples, using numerical computation. Let's see a more "practical" use: in the following operation we transform a string into a list of values, a more complex operation: 

In [None]:
words = 'The quick brown fox jumps over the lazy dog'
[(w.upper(), w.lower(), len(w)) for w in words.split()]

#### Exercise

* List each word and its length from the string 'The quick brown fox jumps over the lazy dog', conditioned on the length of the word being four characters and above
* List only words with the letter o in them

In [None]:
# List each word and its length from the string 
# 'The quick brown fox jumps over the lazy dog', 
# conditioned on the length of the word being four characters and above


In [None]:
# List only words with the letter o in them


* You are given the `wsj` article below. Write a list comprehension for getting the words that appear more than once. 
    * Use the `.split()` command for splitting, without passing a parameter.
    * When counting words, case does not matter (i.e., YAHOO is the same as Yahoo).

* Find all the *characters* in the article that are not letters or numbers. You can use the `isdigit()` and `isalpha()` functions, which work on strings (e.g., `"Panos".isalpha()` and `"1234".isdigit()` return True). 

In [None]:
wsj = """
Yahoo Inc. disclosed a massive security breach by a “state-sponsored actor” affecting at least 500 million users, potentially the largest such data breach on record and the latest hurdle for the beaten-down internet company as it works through the sale of its core business.
Yahoo said certain user account information—including names, email addresses, telephone numbers, dates of birth, hashed passwords and, in some cases, encrypted or unencrypted security questions and answers—was stolen from the company’s network in late 2014 by what it believes is a state-sponsored actor.
Yahoo said it is notifying potentially affected users and has taken steps to secure their accounts by invalidating unencrypted security questions and answers so they can’t be used to access an account and asking potentially affected users to change their passwords.
Yahoo recommended users who haven’t changed their passwords since 2014 do so. It also encouraged users change their passwords as well as security questions and answers for any other accounts on which they use the same or similar information used for their Yahoo account.
The company, which is working with law enforcement, said the continuing investigation indicates that stolen information didn't include unprotected passwords, payment-card data or bank account information.
With 500 million user accounts affected, this is the largest-ever publicly disclosed data breach, according to Paul Stephens, director of policy and advocacy with Privacy Rights Clearing House, a not-for-profit group that compiles information on data breaches.
No evidence has been found to suggest the state-sponsored actor is currently in Yahoo’s network, and Yahoo didn’t name the country it suspected was involved. In August, a hacker called “Peace” appeared in online forums, offering to sell 200 million of the company’s usernames and passwords for about $1,900 in total. Peace had previously sold data taken from breaches at Myspace and LinkedIn Corp.
"""