In [2]:
# setup
import numpy as np
import pandas as pd
import datetime
import pandas_datareader.data as web   ##  <--- first time we have imported this

ModuleNotFoundError: No module named 'pandas_datareader'

# Control Flow and Functions

![](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTS-MO16PIRjAPHAjpgW3jFscP6HQNHL0ulJlMSmoqV43UL92bxHw)



---



## For Loops

```
for i in thing:
  do something
```

We first need to define a collection (`thing`) above the loop; the loop then does something once for each element within `thing`.

For loops are very powerful: they let us run one piece of code many times, operating on each element in turn.

In [0]:
# basics
x = "economics"

# for loop
for i in x:
  print(i)

> Above, each letter of the string is passed to `i`, in order, and we simply print it out.  Remember from the first couple of classes that we can slice up strings just like lists.
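
As a quick refresher on that point, here is a minimal sketch showing that the same slice syntax works on a string and on a list. The name `letters` is introduced here only for illustration.

```python
x = "economics"
letters = list(x)  # illustrative helper: one letter per list element

# the same slice syntax works on both objects
first_three_str = x[:3]
first_three_list = letters[:3]

print(first_three_str)
print(first_three_list)
```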

In [0]:
# a little harder (remember, upper bound is exclusive, so 1-9)
my_values = range(1,10)
my_values
  

In [0]:
# what are the values?
for i in my_values:
  print(i)

In [0]:
# but we can also perform operations
for i in my_values:
  print(i**2)

In [0]:
# create a simple dataframe
my_df = pd.DataFrame({'a': np.random.randint(0, 45, size=10), 
                      'b': np.random.randint(60, 100, size=10)})
my_df.head()

In [0]:
# for each row, print out the product of a and b
# not practical, but introduces the concept of going through elements of our objects for some need
for i in my_df.index:
  # a single row of the dataframe, looked up by its index label
  tmp = my_df.loc[i]
  print(tmp['a'] * tmp['b'])

### A more complex example

While it is more complex, it builds upon all of the pieces that we have learned in class:

- leverage lists to store our elements, and as a place to store our data with a simple append
- format strings to build a proper url dynamically
- read data from the web
- add a column to flag which dataset it is
- append the data with concat
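
The pattern in the list above can also be tried without a network connection. This is a minimal sketch that swaps the web request for small made-up dataframes; the labels, column names, and values are placeholders, not real data.

```python
import pandas as pd

labels = ['AAA', 'BBB']   # placeholder labels, not real teams
frames = []               # empty list to collect one dataframe per label

for label in labels:
    # stand-in for reading a table from the web; the values are made up
    tmp_df = pd.DataFrame({'game': [1, 2], 'goals': [0, 0]})
    tmp_df['team'] = label      # flag which dataset each row came from
    frames.append(tmp_df)

# stack all of the collected dataframes in one step
stacked = pd.concat(frames, ignore_index=True)
print(stacked)
```

Passing the whole list to `pd.concat` means the code does not change if more labels are added.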

In [0]:
# a more complicated approach
teams = ['BOS', 'TOR']

# create an empty list
my_data = []

# for each team, get the data and append to my_data
for team in teams:
  print(team)
  
  # build the url
  url = "https://www.hockey-reference.com/teams/{}/2019_games.html".format(team)
  print(url)
  
  ## get the data
  tmp_page = pd.read_html(url)
  tmp_df = tmp_page[0]
  tmp_df['team'] = team
  
  ## append the dataframe to my_data
  my_data.append(tmp_df)
  

In [0]:
# how many elements in my_data
len(my_data)

In [0]:
# assign each to a dataframe and confirm
df1 = my_data[0]
df2 = my_data[1]
print(type(df1))
print(type(df2))

In [0]:
# last but not least, let's stack them
team_data = pd.concat([df1, df2])

In [0]:
team_data.head()

In [0]:
team_data.tail()

### Exercise:

- Create a list called `colors` that has 5 values (as strings): blue, red, green, orange, yellow
- Create an empty list called `characters`
- For each element of `colors`, calculate the number of characters in the word and append the result to `characters`
- What is the average string length?



---



## If / Elif / Else

![](http://i.imgur.com/fqJOBUS.png)

In [0]:
# a simple example
a = 100
b = 50
if a > b:
  print("a is greater than b")

In [0]:
# what happens when the values change?
a = 5
b = 175
if a > b:
  print("a is greater than b")

No output is printed. Why?
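
One way to investigate is to look at the condition on its own. A minimal check, reusing the values from the cell above (the name `condition` is introduced here for illustration):

```python
a = 5
b = 175

# the body of the if statement only runs when this expression is True
condition = a > b
print(condition)
```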

In [0]:
# we can control for this case by:
a = 5
b = 175
if a > b:
  print("a is greater than b")
else:
  print("a is smaller than b")

Let's change the values one more time:

In [0]:
a = 500
b = 500
if a > b:
  print("a is greater than b")
else:
  print("a is smaller than b")

In [0]:
# we need one more test to ensure we are covering all of the conditions
a = 500
b = 500
if a > b:
  print("a is greater than b")
elif a == b:
  print("a and b are equal")
else:
  print("a is smaller than b")

### If / Elif / Else Exercise

For the numbers 1 to 102 (inclusive), store all of the numbers that are evenly divisible by 3 in a list called `by3`.  How many numbers are in `by3`?

> Hint:  you can use the modulo operator `%` to calculate a remainder.  If zero is returned, the numbers are evenly divisible
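
To illustrate the hint without giving away the exercise, here is a small sketch of `%` on two arbitrary numbers. The name `is_divisible` is just for illustration.

```python
# remainder after division
print(10 % 3)   # 10 = 3*3 + 1
print(9 % 3)    # 9 = 3*3 + 0

# a remainder of zero means the number divides evenly
is_divisible = (9 % 3 == 0)
print(is_divisible)
```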



---



## Functions

![](https://bjc.edc.org/bjc-r/img/python/simple_python_function_colored.png)

- The image above defines a function using `def`
- The function is called `count_up`
- The function has one input, which is given a label/name called `num`
- A for loop iterates over a range from 1 to `num` + 1 (the upper bound is exclusive, so 1 through `num`) and simply prints each value
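
Since the image may not always render, here is a sketch of what the bullets describe. It is reconstructed from the description above, so it may differ in detail from the pictured code.

```python
def count_up(num):
    # loop over 1, 2, ..., num (the upper bound of range is exclusive)
    for i in range(1, num + 1):
        print(i)

# call the function with an example input
count_up(3)
```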



---

Functions are a powerful way to write code once and reuse it. A function can include advanced logic, such as loops and if/else, and return a result.

The methods we apply to a dataframe or series are functions tied to the `DataFrame` or `Series` class.
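
A small sketch of that idea: the same `sum` can be called as a method on a series, or looked up on the `Series` class and handed the series explicitly. The series values here are made up for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3])   # made-up values

# the usual method call
total_method = s.sum()

# the same function, accessed through the class, with the series passed in
total_function = pd.Series.sum(s)

print(total_method, total_function)
```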

We can use functions for repeatable code or to apply more advanced logic to our data in pandas.
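
As one possible illustration, a function with if/else logic can be applied to every value of a series with `Series.apply`. The function `label_size`, its cutoff of 50, and the values are all hypothetical.

```python
import pandas as pd

def label_size(value):
    # hypothetical rule, for illustration only
    if value >= 50:
        return "high"
    else:
        return "low"

scores = pd.Series([10, 50, 75])     # made-up values
labels = scores.apply(label_size)    # runs label_size once per element
print(labels)
```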

In [0]:
# create a range of values from 1 to 4 (inclusive)
x = range(1, 5)
x

In [0]:
# square each value
np.array(x) ** 2

In [0]:
# we can mirror this easily with a function
def squared(x):
  return x ** 2

> Note the explicit use of `return`, which tells Python what the output of our function is.  In the image above, the function prints its values instead of returning them.
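
The difference matters as soon as we try to store the result. A minimal sketch, using two illustrative functions (`squared_return` and `squared_print` are names made up for this comparison):

```python
def squared_return(x):
    return x ** 2       # hands the value back to the caller

def squared_print(x):
    print(x ** 2)       # displays the value but does not hand it back

kept = squared_return(3)
lost = squared_print(3)

print(kept)   # the stored result
print(lost)   # a function without return gives back None
```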

In [0]:
# run the function
squared(2)

In [0]:
# how about another example
squared(-9)

In [0]:
# apply it to the numpy array
xnp = np.array(x)
squared(xnp)

In [0]:
## a more complex example
def vodoo_math(a, b):
  if a == b:
    print("the numbers are equal - squaring the numbers")
    return a * b
  elif a < b:
    print("a is less than b, subtracting the numbers")
    return a - b
  else:
    print("a is greater than b, dividing the numbers")
    return a / b
  

In [0]:
# test it out
vodoo_math(5, 5)

In [0]:
vodoo_math(1, 4)

In [0]:
vodoo_math(7, 2)

### Function Exercise

This exercise combines some of the fundamentals from earlier in the semester with the content in this module.

Create a function that does the following:
- simulates the roll of a die
- if the value is 6, return a value of 2
- if the value is 3, 4 or 5, return a value of -1
- if the value is 1, return a value of 0
- if the value is 2, return a value of 1

Create a variable called `points` which has a value of 50

Roll the die 100 times, and for each roll, update the value of `points` based on the roll

Save the current value of `points` as an entry in a list called `results` (there should be 100 values at the end)

Create a line plot that shows how the value of `points` changes with each roll

Hints:

- numpy's `random` module has a function called `choice`
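
To show how `choice` behaves without solving the exercise, here is a small sketch using a coin flip instead of a die. The options list is made up for illustration.

```python
import numpy as np

# pick one value at random from a list of options
flip = np.random.choice(['heads', 'tails'])
print(flip)

# size= draws several values at once (with replacement by default)
flips = np.random.choice(['heads', 'tails'], size=5)
print(flips)
```

The results are random, so they will differ from run to run.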



