# Lab 5: Probability & Randomization I

Welcome to Lab 5! In this lab, we will go over conditionals and iteration and introduce the concept of randomness. Randomness and probability are central to statistics. All of this material is covered in [Chapter 8](https://inferentialthinking.com/chapters/09/Randomness.html) of the textbook.

First, enter your name in the first cell below, then run both cells to set up the imports and tests.

In [None]:
name = ...

In [None]:
import numpy as np
from datascience import *
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

# Fix for datascience plots: restore the collections.Iterable alias,
# which was removed in Python 3.10.
import collections
import collections.abc as abc
collections.Iterable = abc.Iterable

# These lines load the tests.
from gofer.ok import check

## 1. Conditionals

In Python, a Boolean value is either `True` or `False`. Comparison operators produce Boolean values; they include `<` (less than), `>` (greater than), and `==` (equal to). For a complete list, refer to [Booleans and Comparison](https://inferentialthinking.com/chapters/09/Randomness.html) at the start of Chapter 8.

Run the cell below to see an example of a comparison operator in action.

In [None]:
bool_expression = 3 > 1 + 1
bool_expression

Comparison operators also work on arrays. The comparison is applied to each element, and the result is an array of Boolean values.

In [None]:
make_array(1, 5, 7, 8, 3, -1) > 3
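
Because `True` counts as 1 and `False` as 0, the `True` entries of such an array can be counted with `np.count_nonzero`. Here is a minimal sketch with made-up temperature data; it uses plain `np.array` in place of `make_array` so that it runs without the `datascience` package, and the names `temperatures`, `is_hot`, and `number_hot_days` are chosen only for this example.

```python
import numpy as np

# Hypothetical data: daily high temperatures in degrees Celsius.
temperatures = np.array([18, 25, 31, 22, 29, 35])

# Comparing an array with a number compares every element.
is_hot = temperatures > 28    # an array of True/False values

# np.count_nonzero counts the True entries.
number_hot_days = np.count_nonzero(is_hot)

print(is_hot)
print(number_hot_days)
```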

You're hanging out with friends and have ordered pizza. You have a stack of pizza boxes that includes plain cheese, veggie, supreme, and pepperoni.

Using the function call `np.random.choice(array_name)`, let's simulate opening a pizza box at random. Start by running the cell below several times, and observe how the results change.

In [None]:
pizza = make_array('cheese', 'veggie', 'supreme', 'pepperoni')
np.random.choice(pizza)
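
`np.random.choice` can also make several draws in one call when given a second argument, the number of draws. The sketch below assumes plain `np.array` in place of `make_array` so that it runs without the `datascience` package; the name `three_boxes` is made up for illustration.

```python
import numpy as np

pizza = np.array(['cheese', 'veggie', 'supreme', 'pepperoni'])

# The second argument is the number of draws. Draws are made with
# replacement by default, so the same type can appear more than once.
three_boxes = np.random.choice(pizza, 3)

print(three_boxes)
```

The output changes from run to run, just as it does with a single draw.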

**Question 1:** Suppose we took 7 slices of pizza at random and stored the results in an array called `seven_slices`. Find the number of slices of pepperoni pizza (do not hardcode the answer).

*Hint:* Our solution involves a comparison operator and the `np.count_nonzero` function.

In [None]:
seven_slices = make_array('cheese', 'supreme', 'cheese', 'pepperoni', 'veggie', 'veggie', 'cheese')
number_pepperoni = ...
number_pepperoni

In [None]:
check('tests/q1.py')

**Conditional Statements**

A conditional statement is a multi-line statement that allows Python to choose among different alternatives based on whether some condition is true.

Here is a basic example.

```
def sign(x):
    if x > 0:
        return 'Positive'
```

If the input `x` is greater than `0`, the function returns the string `'Positive'`. For any other input, the function reaches its end without executing a `return` statement, so it returns `None`.

If we want to test several conditions in sequence, we use the following general format.

```
if <if expression>:
    <if body>
elif <elif expression 0>:
    <elif body 0>
elif <elif expression 1>:
    <elif body 1>
...
else:
    <else body>
```

Only one of the bodies will ever be executed. Each `if` and `elif` expression is evaluated in order, starting at the top. As soon as one of them is true, the corresponding body is executed and the rest of the conditional statement is skipped. If none of the `if` or `elif` expressions is true, then the `else` body is executed. For more examples and explanation, refer to [Section 8.1](https://www.inferentialthinking.com/chapters/08/1/conditional-statements.html).
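
As a sketch of this format, the `sign` function above could be extended with an `elif` and an `else` branch so that it returns a string for every number. The strings `'Negative'` and `'Zero'` are choices made here for illustration.

```python
def sign(x):
    if x > 0:
        return 'Positive'
    elif x < 0:
        return 'Negative'
    else:
        # Reached only when x is neither greater than nor less than 0.
        return 'Zero'

print(sign(3))
print(sign(-2))
print(sign(0))
```

For each call, Python checks the conditions from top to bottom and runs only the first body whose condition is true.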

**Question 2:** We want to write a function that returns the price of a pizza depending on its type. The function takes in the type of pizza as a string and returns a price: 10 dollars for cheese, 12 dollars for veggie, 13 dollars for pepperoni, and 15 dollars for supreme.

In [None]:
def pizza_price(pizza):
    if ...
        return 10
    # next condition should return 12
    ...
    # next condition should return 13
    ...
    # next condition should return 15
    ...

plain_price = pizza_price('cheese')
plain_price

In [None]:
check('tests/q2.py')

## 2. Iteration

Using a `for` statement, we can perform a task multiple times. This is known as iteration. Here, we'll simulate drawing different suits from a deck of cards. 

In [None]:
suits = make_array("♤", "♡", "♢", "♧")

draws = make_array()

repetitions = 6

for i in np.arange(repetitions):
    draws = np.append(draws, np.random.choice(suits))

draws

Another use of iteration is to loop through a set of values. For instance, we can print out all of the colors of the rainbow.

In [None]:
rainbow = make_array("red", "orange", "yellow", "green", "blue", "indigo", "violet")

for color in rainbow:
    print(color)

We can see that the indented part of the `for` loop, known as the body, is executed once for each item in `rainbow`. Note that the name `color` is arbitrary; we could easily have named it something else.
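
A `for` loop can also build up a running total instead of an array. The following is a minimal sketch that simulates rolling a six-sided die 10 times and sums the results; the names `faces`, `rolls`, and `total` are made up for this example.

```python
import numpy as np

faces = np.arange(1, 7)   # the six faces of a die: 1, 2, ..., 6
rolls = 10

total = 0                 # start the running total at zero
for i in np.arange(rolls):
    # Add one randomly chosen face to the running total.
    total = total + np.random.choice(faces)

print(total)
```

The total must be assigned before the loop starts. Otherwise, the first pass through the body would refer to a name that does not exist yet.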

**Question 3:** You're working at a pizza shop and want to simulate the amount of money the shop brings in over two weeks. The shop sells 500 pizzas a week on average, so the simulation uses 1000 pizzas in total. Write code that simulates the total money made after two weeks using your `pizza_price` function from Question 2.

In [None]:
possible_pizzas = ...
pizzas = 1000

total_money = ...

for ... in range(...): 
    pizza = ... # each time a pizza among the possible_pizzas is randomly selected
    cost = ... # call the function pizza_price to calculate the price for the selected pizza
    ...        # add the pizza price to the total_money

total_money

In [None]:
check('tests/q3.py')

**Question 4:** Charles Darwin was a famous 19th-century naturalist and biologist. Of his several theories, one of the best known involved the finches on the Galápagos Islands and helped form his theory of natural selection and speciation. In this question, we will loop through the text of Darwin's book *On the Origin of Species* and count the number of times he uses the word "bird" or "birds".

In [None]:
darwin_string = open('darwin_origin_species.txt', encoding='utf-8').read()
darwin_words = np.array(darwin_string.split()) # split the string into an array of words separated by whitespace

birds = ... # set the initial count number
for ... in ...     # provide the for loop range
    if ...== ...   # to see if a word matches either "bird" or 'birds'
        ......
    elif ...== ... # to see if a word matches either "bird" or 'birds'
        ......

birds 

In [None]:
import numpy as np
darwin_string = open('darwin_origin_species.txt', encoding='utf-8').read()
darwin_words = np.array(darwin_string.split())
len(darwin_words)

In [None]:
check('tests/q4.py')

In [None]:
# For your convenience, you can run this cell to run all the tests at once!
import glob
from gofer.ok import check
correct = 0
total = 4
for x in range(1, total+1):
    print('Testing question {}: '.format(str(x)))
    g = check('tests/q{}.py'.format(str(x)))
    if g.grade == 1.0:
        print("Passed")
        correct += 1
    else:
        print('Failed')
        display(g)

print('Grade:  {}'.format(str(correct/total)))

In [None]:
print(name," Great work!")