# Lab 4: Simulations

Welcome to Lab 4! 

We will go over [iteration](https://www.inferentialthinking.com/chapters/09/2/Iteration.html) and [simulations](https://www.inferentialthinking.com/chapters/09/3/Simulation.html), as well as the concept of [randomness](https://www.inferentialthinking.com/chapters/09/Randomness.html).

First, set up the tests and imports by running the cell below.

In [18]:
# Run this cell, but please don't change it.

# These lines mount Google Drive so the notebook can read the lab's data files.
from google.colab import drive
drive.mount('/content/drive')

# These lines import the NumPy and datascience modules.
import numpy as np
from datascience import *

# These lines do some fancy plotting magic.
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

Mounted at /content/drive


## 1. Nachos and Conditionals

In Python, the Boolean data type contains only two values: `True` and `False`. Expressions containing comparison operators such as `<` (less than), `>` (greater than), and `==` (equal to) evaluate to Boolean values. A list of common comparison operators can be found below!

<img src="comparisons.png">

Run the cell below to see an example of a comparison operator in action.

In [19]:
3 > 1 + 1

True

We can even assign the result of a comparison operation to a variable.

In [20]:
result = 10 / 2 == 5
result

True

Arrays are compatible with comparison operators. The comparison is applied to each element, and the output is an array of Boolean values.

In [21]:
make_array(1, 5, 7, 8, 3, -1) > 3

array([False,  True,  True,  True, False, False], dtype=bool)

One day, when you come home after a long week, you see a hot bowl of nachos waiting on the dining table! Let's say that whenever you take a nacho from the bowl, it will either have only **cheese**, only **salsa**, **both** cheese and salsa, or **neither** cheese nor salsa (a sad tortilla chip indeed). 

Let's try to simulate taking nachos from the bowl at random using the function `np.random.choice(...)`.

### `np.random.choice`

`np.random.choice` picks one item at random from the given array. It is equally likely to pick any of the items. Run the cell below several times, and observe how the results change.

In [22]:
nachos = make_array('cheese', 'salsa', 'both', 'neither')
np.random.choice(nachos)

'cheese'

To repeat this process multiple times, pass in an int `n` as the second argument to return `n` random choices. By default, `np.random.choice` samples **with replacement**, so the same item can be chosen more than once, and it returns an *array* of items.

Run the next cell to see an example of sampling with replacement 10 times from the `nachos` array.

In [23]:
np.random.choice(nachos, 10)

array(['salsa', 'cheese', 'both', 'cheese', 'salsa', 'neither', 'neither',
       'neither', 'neither', 'cheese'],
      dtype='<U7')
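For contrast, here is a minimal sketch of sampling without replacement, using the `replace=False` argument of `np.random.choice`. It uses `np.array` in place of `make_array` so that it runs without the `datascience` package; the names `with_replacement` and `without_replacement` are illustrative and not part of the lab.

```python
import numpy as np

# np.array stands in for make_array so the sketch runs without the datascience package.
nachos = np.array(['cheese', 'salsa', 'both', 'neither'])

# With replacement (the default): the same nacho type can be drawn more than once.
with_replacement = np.random.choice(nachos, 4)

# Without replacement: each nacho type is drawn at most once,
# so asking for all 4 items returns them in a random order.
without_replacement = np.random.choice(nachos, 4, replace=False)

print(with_replacement)
print(without_replacement)
```

The rest of this lab relies on the default behavior, sampling with replacement.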

To count the number of times a certain type of nacho is randomly chosen, we can use `np.count_nonzero`.

### `np.count_nonzero`

`np.count_nonzero` counts the number of non-zero values that appear in an array. When an array of Boolean values is passed to the function, it counts the number of `True` values (remember that in Python, `True` is coded as 1 and `False` is coded as 0).

Run the next cell to see an example that uses `np.count_nonzero`.

In [24]:
np.count_nonzero(make_array(True, False, False, True, True))

3
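Combining a comparison with `np.count_nonzero` gives a count of the items that satisfy a condition. The sketch below reuses the numbers from the earlier array comparison; `np.array` stands in for `make_array` so the cell runs without the `datascience` package, and the names `values` and `num_greater_than_three` are illustrative.

```python
import numpy as np

# np.array stands in for make_array so the sketch runs without the datascience package.
values = np.array([1, 5, 7, 8, 3, -1])

# values > 3 is a Boolean array; count_nonzero counts its True entries.
num_greater_than_three = np.count_nonzero(values > 3)
print(num_greater_than_three)  # 3, because only 5, 7, and 8 are greater than 3

# On a numeric array, count_nonzero counts every entry that is not 0.
print(np.count_nonzero(np.array([0, 2, 0, -1])))  # 2
```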

<span style='background:yellow'>**Question 1.1.**</span> Assume we took ten nachos at random, and stored the results in an array called `ten_nachos` as done below. Find the number of nachos with only cheese using code (do not hardcode the answer).  

*Hint:* Our solution involves a comparison operator (e.g. `==`, `<`, ...) and the `np.count_nonzero` function.

In [25]:
ten_nachos = make_array('neither', 'cheese', 'both', 'both', 'cheese', 'salsa', 'both', 'neither', 'cheese', 'both')
number_cheese = np.count_nonzero(ten_nachos == "cheese")
number_cheese

3

In [26]:
# TEST
number_cheese == 3

True

### Conditional Statements

A conditional statement is a multi-line statement that allows Python to choose among different alternatives based on the truth value of an expression.

Here is a basic example.

```
def sign(x):
    if x > 0:
        return 'Positive'
    else:
        return 'Negative'
```

If the input `x` is greater than `0`, we return the string `'Positive'`. Otherwise, we return `'Negative'`.

If we want to test multiple conditions at once, we use the following general format.

```
if <if expression>:
    <if body>
elif <elif expression 0>:
    <elif body 0>
elif <elif expression 1>:
    <elif body 1>
...
else:
    <else body>
```

Only the body for the first conditional expression that is true will be evaluated. Each `if` and `elif` expression is evaluated and considered in order, starting at the top. As soon as a true value is found, the corresponding body is executed, and the rest of the conditional statement is skipped. If none of the `if` or `elif` expressions are true, then the `else body` is executed. 
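As a small illustration of the `if`/`elif`/`else` format, note that the `sign` function above returns `'Negative'` when `x` is `0`. The sketch below adds an `elif` branch to handle that case; `sign_with_zero` is a hypothetical name and is not part of the lab.

```python
def sign_with_zero(x):
    """Hypothetical variant of sign that treats zero as its own case."""
    if x > 0:
        return 'Positive'
    elif x == 0:
        # Reached only when the first condition (x > 0) is false.
        return 'Zero'
    else:
        # Reached only when both conditions above are false.
        return 'Negative'

print(sign_with_zero(3))
print(sign_with_zero(0))
print(sign_with_zero(-2))
```

Because the branches are checked in order, at most one body runs for any given input.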

For more examples and explanation, refer to the section on conditional statements [here](https://www.inferentialthinking.com/chapters/09/1/conditional-statements.html).

<span style='background:yellow'>**Question 1.2.**</span> Complete the following conditional statement so that the string `'More please'` is assigned to the variable `say_please` if the number of nachos with cheese in `ten_nachos` is less than `5`.

*Hint*: You should be using `number_cheese` from Question 1.1.

In [27]:
say_please = '?'

if number_cheese < 5:
    say_please = 'More please'
    
say_please

'More please'

In [28]:
# TEST
say_please == 'More please'

True

<span style='background:yellow'>**Question 1.3.**</span> Write a function called `nacho_reaction` that returns a reaction (as a string) based on the type of nacho passed in as an argument. Use the table below to match the nacho type to the appropriate reaction.

<img src="nacho_reactions.png">

*Hint:* If you're failing the test, double check the spelling of your reactions.

In [29]:
def nacho_reaction(nacho):
    if nacho == 'cheese':
        return 'Cheesy!'
    # next condition should return 'Spicy!'
    elif nacho == 'salsa':
        return 'Spicy!'
    # next condition should return 'Wow!'
    elif nacho == 'both':
        return 'Wow!'
    # next condition should return 'Meh.'
    else:
        return 'Meh.'

spicy_nacho = nacho_reaction('salsa')
spicy_nacho

'Spicy!'

In [30]:
# TEST
nacho_reaction('salsa') == 'Spicy!'

True

In [31]:
# TEST
nacho_reaction('cheese') == 'Cheesy!'

True

In [32]:
# TEST
nacho_reaction('both') == 'Wow!'

True

In [33]:
# TEST
nacho_reaction('neither') == 'Meh.'

True

<span style='background:yellow'>**Question 1.4.**</span> Create a table `ten_nachos_reactions` that consists of the nachos in `ten_nachos` as well as the reactions for each of those nachos. The columns should be called `Nachos` and `Reactions`.

*Hint:* Use the `apply` method. 

In [34]:
ten_nachos_reactions = Table().with_column('Nachos', ten_nachos)
ten_nachos_reactions = ten_nachos_reactions.with_column('Reactions', ten_nachos_reactions.apply(nacho_reaction, 'Nachos'))
ten_nachos_reactions

| Nachos | Reactions |
|---|---|
| neither | Meh. |
| cheese | Cheesy! |
| both | Wow! |
| both | Wow! |
| cheese | Cheesy! |
| salsa | Spicy! |
| both | Wow! |
| neither | Meh. |
| cheese | Cheesy! |
| both | Wow! |


In [35]:
# TEST
# One or more of the reaction results could be incorrect;
np.count_nonzero(ten_nachos_reactions.column('Reactions') == make_array('Meh.', 'Cheesy!', 'Wow!', 'Wow!', 'Cheesy!', 'Spicy!', 'Wow!', 'Meh.', 'Cheesy!', 'Wow!')) == 10

True

<span style='background:yellow'>**Question 1.5.**</span> Using code, find the number of 'Wow!' reactions for the nachos in `ten_nachos_reactions`.

In [36]:
# Compare each reaction to 'Wow!' and count the True values.
number_wow_reactions = np.count_nonzero(ten_nachos_reactions.column('Reactions') == 'Wow!')
number_wow_reactions

4

In [37]:
# TEST
2 < number_wow_reactions < 6

True

In [38]:
# TEST 
# Incorrect value for number_wow_reactions
number_wow_reactions == 4

True

## 2. Simulations and For Loops

Using a `for` statement, we can perform a task multiple times. This is known as iteration.

One use of iteration is to loop through a set of values. For instance, we can print out all of the colors of the rainbow.

In [39]:
rainbow = make_array("red", "orange", "yellow", "green", "blue", "indigo", "violet")

for color in rainbow:
    print(color)

red
orange
yellow
green
blue
indigo
violet


We can see that the indented part of the `for` loop, known as the body, is executed once for each item in `rainbow`. The name `color` is assigned to the next value in `rainbow` at the start of each iteration. Note that the name `color` is arbitrary; we could easily have named it something else. The important thing is we stay consistent throughout the `for` loop. 

In [40]:
for another_name in rainbow:
    print(another_name)

red
orange
yellow
green
blue
indigo
violet


In general, however, we would like the variable name to be somewhat informative. 
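A common pattern is to update a running total inside the loop body. The sketch below adds up the number of letters across all of the colors; `np.array` stands in for `make_array` so it runs without the `datascience` package, and `total_letters` is an illustrative name.

```python
import numpy as np

# np.array stands in for make_array so the sketch runs without the datascience package.
rainbow = np.array(["red", "orange", "yellow", "green", "blue", "indigo", "violet"])

# Start the running total at 0, then add to it once per color.
total_letters = 0
for color in rainbow:
    total_letters = total_letters + len(color)

print(total_letters)  # 3 + 6 + 6 + 5 + 4 + 6 + 6 = 36
```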

<span style='background:yellow'>**Question 2.1.**</span> In the following cell, we've loaded the text of _Pride and Prejudice_ by Jane Austen, split it into individual words, and stored these words in an array `p_and_p_words`. Using a `for` loop, assign `longer_than_five` to the number of words in the novel that are more than 5 letters long.

*Hint*: You can find the number of letters in a word with the `len` function.

In [41]:
austen_string = open('/content/drive/My Drive/Colab Notebooks/Austen_PrideAndPrejudice.txt', encoding='utf-8').read()
p_and_p_words = np.array(austen_string.split())

# Count the words with more than 5 letters, one word at a time.
longer_than_five = 0
for word in p_and_p_words:
    if len(word) > 5:
        longer_than_five = longer_than_five + 1

longer_than_five

35453

In [42]:
# TEST
longer_than_five == 35453

True

<span style='background:yellow'>**Question 2.2.**</span> Using a simulation with 10,000 trials, assign `num_different` to the number of trials in which two words picked uniformly at random (with replacement) from _Pride and Prejudice_ have different lengths.

*Hint 1*: What function did we use in section 1 to sample at random with replacement from an array? 

*Hint 2*: Remember that `!=` checks for non-equality between two items.

In [43]:
trials = 10000
num_different = 0
for i in range(trials):
    # Pick two words at random, with replacement, and compare their lengths.
    two_words = np.random.choice(p_and_p_words, 2)
    if len(two_words[0]) != len(two_words[1]):
        num_different = num_different + 1
num_different

8534

In [44]:
# TEST
8100 <= num_different <= 9100

True
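The same kind of simulation can also be written without an explicit loop, by drawing all of the first words and all of the second words at once. This is only a sketch: the short `words` array is a hypothetical stand-in for `p_and_p_words` so that the cell runs without the novel's text file, and the fixed seed is there only to make reruns repeatable.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only so reruns give the same draws

# Hypothetical stand-in for p_and_p_words.
words = np.array(['it', 'is', 'a', 'truth', 'universally', 'acknowledged'])
trials = 10000

# Compute each word's length once, then sample lengths instead of words.
word_lengths = np.array([len(word) for word in words])
first_lengths = np.random.choice(word_lengths, trials)
second_lengths = np.random.choice(word_lengths, trials)

# Element-wise comparison: one True per trial in which the lengths differ.
num_different = np.count_nonzero(first_lengths != second_lengths)
print(num_different)
```

Sampling lengths rather than words is equivalent here because each word is equally likely to be picked, so each length is drawn in proportion to how many words have it.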

We can also use `np.random.choice` to simulate multiple trials.

<span style='background:yellow'>**Question 2.3.**</span> Allie is playing darts. Her dartboard contains ten equal-sized zones with point values from 1 to 10. Write code that simulates her total score after 1000 dart tosses.

*Hint:* First decide the possible values you can take in the experiment (point values in this case). Then use `np.random.choice` to simulate Allie's tosses. Finally, sum up the scores to get Allie's total score.

In [45]:
# The ten possible point values on the dartboard.
possible_point_values = make_array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
num_tosses = 1000
# Simulate 1000 tosses by sampling point values at random, with replacement.
simulated_tosses = np.random.choice(possible_point_values, num_tosses)
# Allie's total score is the sum of the simulated tosses.
total_score = sum(simulated_tosses)
print(total_score)

5543


In [46]:
# TEST
1000 <= total_score <= 10000

True

## 3. Probability

We will be testing some probability concepts that were introduced in lecture. For each of the following problems, we will introduce a problem statement and give you a proposed answer. You must assign the provided variable to one of the following three integers, depending on whether the proposed answer is too high, too low, or correct.

1. Assign the variable to 1 if you believe our proposed answer is too high.
2. Assign the variable to 2 if you believe our proposed answer is too low.
3. Assign the variable to 3 if you believe our proposed answer is correct.


You are more than welcome to create more cells in this notebook to use for arithmetic operations.

<span style='background:yellow'>**Question 3.1.**</span> You roll a 6-sided die 10 times. What is the chance of getting 10 sixes?

Our proposed answer: $$\left(\frac{1}{6}\right)^{10}$$

Assign `ten_sixes` to either 1, 2, or 3 depending on if you think our answer is too high, too low, or correct. 

In [47]:
main = (1/6)**10   # our proposed answer
check = (1/6)**10  # the correct chance: each of the 10 rolls is a six with chance 1/6
print("main : ", main)
print("check: ", check)
if main == check:
    ten_sixes = 3
elif main < check:
    ten_sixes = 2
else:
    ten_sixes = 1
ten_sixes

main :  1.6538171687920194e-08
check:  1.6538171687920194e-08
3


In [48]:
# TEST
ten_sixes == 3

True

<span style='background:yellow'>**Question 3.2.**</span> Take the same problem set-up as before, rolling a fair die 10 times. What is the chance that every roll is less than or equal to 5?

Our proposed answer: $$1 - \left(\frac{1}{6}\right)^{10}$$

Assign `five_or_less` to either 1, 2, or 3. 

In [49]:
main = 1 - (1/6)**10  # our proposed answer
check = (5/6)**10     # the correct chance: each of the 10 rolls is at most 5 with chance 5/6
print("main :", main)
print("check: ", check)
if main == check:
    five_or_less = 3
elif main < check:
    five_or_less = 2
else:
    five_or_less = 1
five_or_less

main : 0.9999999834618283
check:  0.1615055828898458
1


In [50]:
# TEST
five_or_less == 1

True
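A simulation offers a rough sanity check on the value $(5/6)^{10}$ for this question. The sketch below repeats the 10-roll experiment many times and reports the proportion of repetitions in which every roll is at most 5. The names `repetitions`, `num_all_five_or_less`, and `estimate` are illustrative, and the fixed seed is an assumption made only so reruns are repeatable.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only so reruns give the same rolls

die_faces = np.arange(1, 7)  # the faces 1 through 6
repetitions = 10000
num_all_five_or_less = 0

for i in range(repetitions):
    # One experiment: roll the die 10 times.
    rolls = np.random.choice(die_faces, 10)
    # The experiment counts only if all 10 rolls are at most 5.
    if np.count_nonzero(rolls <= 5) == 10:
        num_all_five_or_less = num_all_five_or_less + 1

estimate = num_all_five_or_less / repetitions
exact = (5/6)**10
print("simulated:", estimate)
print("exact:    ", exact)
```

The simulated proportion should land close to the exact value and far below the proposed $1 - (1/6)^{10}$, which is the chance of *not* rolling ten sixes.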

<span style='background:yellow'>**Question 3.3.**</span> Assume we are picking a lottery ticket. We must choose three distinct numbers from 1 to 1000 and write them on a ticket. Next, someone draws three numbers, one at a time, from a bowl containing the numbers 1 to 1000, without putting any drawn number back. We win if our numbers are all called in order.

If we decide to play the game and pick our numbers as 12, 140, and 890, what is the chance that we win? 

Our proposed answer: $$\left(\frac{3}{1000}\right)^3$$

Assign `lottery` to either 1, 2, or 3. 

In [51]:
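main = (3/1000)**3  # our proposed answer
# The correct chance: our first number is called first (1 of 1000), our second
# number is called next (1 of the 999 left), then our third (1 of the 998 left).
check = (1/1000) * (1/999) * (1/998)
print("main:", main)
print("check: ", check)
if main == check:
    lottery = 3
elif main < check:
    lottery = 2
else:
    lottery = 1
lottery

main: 2.7e-08
check:  1.003007015031063e-09
1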


In [52]:
# TEST
lottery == 1

True
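The multiplication rule behind this question can be worked out exactly with Python's built-in `fractions` module, which avoids floating-point rounding. This is a sketch; the names `first_match`, `second_match`, `third_match`, `win_chance`, and `proposed` are illustrative.

```python
from fractions import Fraction

# Chance that each of our numbers is called at its turn, drawing without replacement.
first_match = Fraction(1, 1000)   # 1000 numbers in the bowl
second_match = Fraction(1, 999)   # 999 numbers left
third_match = Fraction(1, 998)    # 998 numbers left

# Multiplication rule: all three matches must happen, in order.
win_chance = first_match * second_match * third_match
proposed = Fraction(3, 1000) ** 3

print(win_chance)             # 1/997002000
print(proposed > win_chance)  # True: the proposed answer is too high
```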

<span style='background:yellow'>**Question 3.4.**</span> Assume we have two lists, list A and list B. List A contains the numbers [20,10,30], while list B contains the numbers [10,30,20,40,30]. We choose one number from list A randomly and one number from list B randomly. What is the chance that the number we drew from list A is larger than or equal to the number we drew from list B?

Our proposed answer: $$\frac{1}{5}$$

Assign `list_chances` to either 1, 2, or 3. 

*Hint: Consider the different possible ways that the items in List A can be greater than or equal to items in List B. Try working out your thoughts with pencil and paper. What do you think the correct solution will be close to?*

In [53]:
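listA = [20, 10, 30]
listB = [10, 30, 20, 40, 30]
main = 1/5  # our proposed answer
# Count the (A, B) pairs in which the number from A is at least the number from B.
favorable = 0
for a in listA:
    for b in listB:
        if a >= b:
            favorable = favorable + 1
# All 3 * 5 = 15 pairs are equally likely, and 7 of them are favorable.
check = favorable / (len(listA) * len(listB))
print("main: ", main)
print("Actual_value:", check)
if main == check:
    list_chances = 3
elif main < check:
    list_chances = 2
else:
    list_chances = 1
list_chances

main:  0.2
Actual_value: 0.4666666666666667
2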


In [54]:
# TEST
list_chances == 2

True

Great job! You're finished with lab 4! Be sure to...

* **run all the tests**,
* **print the notebook as a PDF**,
* and **submit both the notebook and the PDF to Canvas**.