In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("lab05.ipynb")

<img style="display: block; margin-left: auto; margin-right: auto" src="./ccsf-logo.png" width="250rem;" alt="The CCSF black and white logo">

# Lab 05 - Simulation and Chance

## References

* [Sections 9.0 - 9.5 of the Textbook](https://inferentialthinking.com/chapters/09/Randomness.html)
* [datascience Documentation](https://datascience.readthedocs.io/)
* [Python Quick Reference](https://ccsf-math-108.github.io/materials-fa23/resources/quick_reference.html)

## Assignment Reminders

- Make sure to run the code cell at the top of this notebook that starts with `# Initialize Otter` to load the auto-grader.
- For all tasks indicated with a 🔎 that you must write explanations and sentences for, provide your answer in the designated space.
- Throughout this assignment and all future ones, please be sure to not re-assign variables throughout the notebook! _For example, if you use `max_temperature` in your answer to one question, do not reassign it later on. Otherwise, you will fail tests that you thought you were passing previously!_
- Collaborating on labs is more than okay -- it's encouraged! You should rarely remain stuck for more than a few minutes on questions in labs, so ask an instructor or classmate for help. (Explaining things is beneficial, too -- the best way to solidify your knowledge of a subject is to explain it.) Please don't just share answers, though.
- View the related <a href="https://ccsf.instructure.com" target="_blank">Canvas</a> Assignment page for additional details.

Run the following cell to set up the lab, and make sure you run the cell at the top of the notebook that initializes Otter.

In [None]:
from datascience import *
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

## Nachos and Conditionals

In Python, the Boolean data type contains only two values: `True` and `False`. Expressions containing comparison operators such as `<` (less than), `>` (greater than), and `==` (equal to) evaluate to Boolean values. Common comparison operators are listed below.

| Comparison            | Operator | True example | False example |
|-----------------------|----------|--------------|---------------|
| Less than             | <        | 2 < 3        | 2 < 2         |
| Greater than          | >        | 3 > 2        | 3 > 3         |
| Less than or equal    | <=       | 2 <= 2       | 3 <= 2        |
| Greater than or equal | >=       | 3 >= 3       | 2 >= 3        |
| Equal                 | ==       | 3 == 3       | 3 == 2        |
| Not equal             | !=       | 3 != 2       | 2 != 2        |

Run the cell below to see an example of a comparison operator in action.

In [None]:
3 > 1 + 1

We can even assign the result of a comparison operation to a variable.

In [None]:
result = 10 / 2 == 5
result

Arrays are compatible with comparison operators. The comparison is applied to each element, and the output is an array of Boolean values.

In [None]:
make_array(1, 5, 7, 8, 3, -1) > 3

One day, when you come home after a long week, you see a hot bowl of nachos waiting on the dining table! Let's say that whenever you take a nacho from the bowl, it will either have only **cheese**, only **salsa**, **both** cheese and salsa, or **neither** cheese nor salsa (a sad tortilla chip indeed). 

Let's simulate taking nachos from the bowl at random using the function `np.random.choice(...)`.

### `np.random.choice`

`np.random.choice` picks one item at random from the given array. It is equally likely to pick any of the items. Run the cell below several times, and observe how the results change.

In [None]:
nachos = make_array('cheese', 'salsa', 'both', 'neither')
np.random.choice(nachos)

To repeat this process multiple times, pass in an int `n` as the second argument to get `n` random choices. By default, `np.random.choice` samples **with replacement**, so the same item can be chosen more than once, and returns an *array* of items.

Run the next cell to see an example of sampling with replacement 10 times from the `nachos` array.

In [None]:
np.random.choice(nachos, 10)
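
To make the phrase "with replacement" more concrete, here is a small sketch that contrasts the default behavior with the optional `replace=False` argument of `np.random.choice`. It uses `np.array` in place of `make_array` so that it runs with NumPy alone; the variable names `with_replacement` and `without_replacement` are only for illustration.

```python
import numpy as np

# np.array stands in for the lab's make_array in this sketch
nachos = np.array(['cheese', 'salsa', 'both', 'neither'])

# With replacement (the default): the same item can appear more than once,
# so we can draw more items than the array contains
with_replacement = np.random.choice(nachos, 10)

# Without replacement: each item can be chosen at most once,
# so the sample size cannot exceed the length of the array
without_replacement = np.random.choice(nachos, 4, replace=False)

print(with_replacement)
print(without_replacement)
```

The second result is always some ordering of the four nacho types, while the first almost always contains repeats.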

To count the number of times a certain type of nacho is randomly chosen, we can use `np.count_nonzero`.

### `np.count_nonzero`

`np.count_nonzero` counts the number of non-zero values that appear in an array. When an array of Boolean values is passed to the function, it counts the number of `True` values (remember that in Python, `True` is coded as 1 and `False` is coded as 0).

Run the next cell to see an example that uses `np.count_nonzero`.

In [None]:
np.count_nonzero(make_array(True, False, False, True, True))
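
Because a comparison on an array produces an array of Boolean values, the two ideas can be chained to count how many items meet a condition. The sketch below uses made-up temperature readings and `np.array` in place of `make_array`; the names `temperatures`, `is_warm`, and `number_warm` are hypothetical.

```python
import numpy as np

# Made-up temperature readings, for illustration only
temperatures = np.array([61, 75, 58, 80, 75, 66])

# Step 1: the comparison produces one True/False value per element
is_warm = temperatures > 70

# Step 2: counting the non-zero (True) values gives the number of warm readings
number_warm = np.count_nonzero(is_warm)

print(is_warm)
print(number_warm)
```

The two steps can also be written on one line by placing the comparison directly inside the call to `np.count_nonzero`.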

### Task 01 📍

Assume we took ten nachos at random, and stored the results in an array called `ten_nachos` as done below. Find the number of nachos with only cheese using code (do not hardcode the answer).  

*Hint:* Our solution involves a comparison operator (e.g. `==`, `<`, ...) and the `np.count_nonzero` function.


In [None]:
ten_nachos = make_array('neither', 'cheese', 'both', 'both', 'cheese', 'salsa', 'both', 'neither', 'cheese', 'both')
number_cheese = ...
number_cheese

In [None]:
grader.check("task_01")

### Conditional Statements

A conditional statement is a multi-line statement that allows Python to choose among different alternatives based on the truth value of an expression.

Here is a basic example.

```
def sign(x):
    if x > 0:
        return 'Positive'
    else:
        return 'Negative'
```

If the input `x` is greater than `0`, we return the string `'Positive'`. Otherwise, we return `'Negative'`.

If we want to test multiple conditions at once, we use the following general format.

```
if <if expression>:
    <if body>
elif <elif expression 0>:
    <elif body 0>
elif <elif expression 1>:
    <elif body 1>
...
else:
    <else body>
```

Only the body for the first conditional expression that is true will be evaluated. Each `if` and `elif` expression is evaluated and considered in order, starting at the top. As soon as a true value is found, the corresponding body is executed, and the rest of the conditional statement is skipped. If none of the `if` or `elif` expressions are true, then the `else body` is executed. 
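
As one possible illustration of this format, the function below labels a temperature. The function name `describe_temperature` and its cutoffs are invented for this sketch. Notice that an input of `90` satisfies more than one of the expressions, but only the body of the first true expression runs.

```python
def describe_temperature(degrees):
    """Return a label for a temperature (hypothetical cutoffs)."""
    if degrees >= 85:
        return 'hot'
    elif degrees >= 65:
        # Reached only when degrees is below 85
        return 'warm'
    elif degrees >= 45:
        # Reached only when degrees is below 65
        return 'cool'
    else:
        # Reached only when none of the expressions above are true
        return 'cold'

print(describe_temperature(90))
print(describe_temperature(70))
print(describe_temperature(30))
```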

For more examples and explanation, refer to the section on conditional statements [here](https://inferentialthinking.com/chapters/09/1/Conditional_Statements.html).

### Task 02 📍

Complete the following conditional statement so that the string `'More please'` is assigned to the variable `say_please` if the number of nachos with cheese in `ten_nachos` is less than `5`.

*Hint*: You should be using `number_cheese` from Task 01.


In [None]:
say_please = '?'

if ...:
    say_please = 'More please' 

say_please

In [None]:
grader.check("task_02")

### Task 03 📍

Write a function called `nacho_reaction` that returns a reaction (as a string) based on the type of nacho passed in as an argument. Use the table below to match the nacho type to the appropriate reaction.

<img src="./nacho_reactions.png">

*Hint:* If you're failing the test, double check the spelling of your reactions.


In [None]:
def nacho_reaction(nacho):
    if nacho == "cheese":
        ...
    ... :
        ...
    ... :
        ...
    ... :
        ...


spicy_nacho = nacho_reaction('salsa')
spicy_nacho

In [None]:
grader.check("task_03")

### Task 04 📍

Create a table `ten_nachos_reactions` that consists of the nachos in `ten_nachos` as well as the reactions for each of those nachos. The columns should be called `Nachos` and `Reactions`.

*Hint:* Use the `apply` method. 


In [None]:
ten_nachos_tbl = Table().with_column('Nachos', ten_nachos)
...
ten_nachos_reactions

In [None]:
grader.check("task_04")

### Task 05 📍

Using code, find the number of 'Wow!' reactions for the nachos in `ten_nachos_reactions`.


In [None]:
number_wow_reactions = ...
number_wow_reactions

In [None]:
grader.check("task_05")

## Simulations and For Loops

Using a `for` statement, we can perform a task multiple times. This is known as iteration.

One use of iteration is to loop through a set of values. For instance, we can print out all of the colors of the rainbow.

In [None]:
rainbow = make_array("red", "orange", "yellow", "green", "blue", "indigo", "violet")

for color in rainbow:
    print(color)

We can see that the indented part of the `for` loop, known as the body, is executed once for each item in `rainbow`. The name `color` is assigned to the next value in `rainbow` at the start of each iteration. Note that the name `color` is arbitrary; we could easily have named it something else. The important thing is we stay consistent throughout the `for` loop. 

In [None]:
for another_name in rainbow:
    print(another_name)

In general, however, we would like the variable name to be somewhat informative. 
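
A common pattern is to combine a `for` loop with a conditional statement to count the items that meet a condition. The sketch below is one way to do that with made-up numbers; the names `numbers` and `even_count` are hypothetical, and `np.array` is used in place of `make_array`.

```python
import numpy as np

# Made-up values, for illustration only
numbers = np.array([4, 7, 10, 13, 18, 21])

# Start the count at zero before the loop begins
even_count = 0

for number in numbers:
    # The body runs once per item; the count grows only for even numbers
    if number % 2 == 0:
        even_count = even_count + 1

print(even_count)
```

The count must be initialized before the loop. If it were assigned inside the body, it would be reset on every iteration.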

### Task 06 📍

In the following cell, we've loaded the text of _Pride and Prejudice_ by Jane Austen, split it into individual words, and stored these words in an array `p_and_p_words`. Using a `for` loop, assign `longer_than_five` to the number of words in the novel that are more than 5 characters long.

*Hint*: You can find the number of letters in a word with the `len` function.

*Note*: You should expect "words" like `"About` to be included in this collection of words that are more than 5 characters long because the quotation mark is not removed with the `split` function.


In [None]:
austen_string = open('Austen_PrideAndPrejudice.txt', encoding='utf-8').read()
p_and_p_words = np.array(austen_string.split())

longer_than_five = ...

# a for loop would be useful here



longer_than_five

In [None]:
grader.check("task_06")

### Task 07 📍

Using a simulation with 10,000 trials, assign `num_different` to the number of trials in which two words picked uniformly at random (with replacement) from _Pride and Prejudice_ have different lengths.

*Hint 1*: What function did we use in the Nachos and Conditionals section to sample at random with replacement from an array?

*Hint 2*: Remember that `!=` checks for non-equality between two items.


In [None]:
trials = 10000
num_different = ...

for ... in ...:
    ...

num_different

In [None]:
grader.check("task_07")

## Probability

An electronic board with lights contains 10 equal-sized zones labeled with values from 1 to 10.

<img src="./number_grid.png" alt="a 2x5 grid of numbers 1 - 10" width=40%></img>

The board illuminates a sequence of two numbers at random. First, one number lights up in pink. Then a second number, which must be different from the first, lights up in orange. For example, the following image shows 6 lit up in pink followed by 5 lit up in orange.

<img src="./number_grid_color.png" alt="a 2x5 grid of numbers 1 - 10 where 6 is pink and 5 is orange" width=40%></img>

After the sequence of two numbers is shown, the board resets.

<img src="./number_grid.png" alt="a 2x5 grid of numbers 1 - 10" width=40%></img>

Use this situation to practice with the basic ideas of probability that you've learned so far.

### A Definition of Probability

When every outcome is equally likely, the probability of an event is the number of outcomes that make up the event divided by the total number of possible outcomes.
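
As a small worked example of this definition, consider one roll of a fair six-sided die (a scenario chosen only for illustration, separate from the light board). The sketch counts the outcomes in the event "the roll is even" and divides by the total number of outcomes; the names `die_faces` and `chance_even` are hypothetical.

```python
import numpy as np

# All equally likely outcomes of one roll of a fair die
die_faces = np.arange(1, 7)

# Number of outcomes that make up the event "the roll is even"
even_outcomes = np.count_nonzero(die_faces % 2 == 0)

# Probability = outcomes in the event / total possible outcomes
chance_even = even_outcomes / len(die_faces)

print(chance_even)
```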

### Task 08 📍

A sequence of two numbers is shown. What is the probability that the first number in the sequence of two numbers illuminated is a 6? 

Assign `chance_6` to a value or expression that represents this probability value. The value should be a number between `0` and `1`, inclusive.

In [None]:
chance_6 = ...
chance_6

In [None]:
grader.check("task_08")

### The Multiplication Rule

To calculate the probability that one event happens and then another event also happens, multiply the chance of the first event by the chance of the second event given that the first event has happened. This is generally referred to as the multiplication rule.
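
Here is a sketch of the multiplication rule in a different, made-up setting: a bag holds 3 red marbles and 2 blue marbles, and two marbles are drawn without replacement. The variable names are hypothetical. The key step is that the second chance is computed assuming the first draw has already removed a red marble.

```python
# Hypothetical bag of marbles, for illustration only
red = 3
blue = 2
total = red + blue

# Chance that the first marble drawn is red
chance_first_red = red / total

# Chance that the second marble is red, given that a red marble
# was already removed: one fewer red marble and one fewer marble overall
chance_second_red_given_first = (red - 1) / (total - 1)

# Multiplication rule: both events happen
chance_both_red = chance_first_red * chance_second_red_given_first

print(chance_both_red)
```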

### Task 09 📍

A sequence of two numbers is shown. What is the probability that the sequence (6, 5) is illuminated on the board? 

Assign `chance_65` to a value or expression that represents this probability value. The value should be a number between `0` and `1`, inclusive.

In [None]:
chance_65 = ...
chance_65

In [None]:
grader.check("task_09")

### The Addition Rule for Disjoint Events

If there are two distinct ways for an event to occur, calculate the chance of each way happening and add the values. This is called the addition rule for disjoint events.
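
The sketch below applies the addition rule to a made-up example: a bag holds 3 red marbles and 2 blue marbles, two marbles are drawn without replacement, and the event of interest is "one marble of each color." That event can happen in two ways that cannot both occur in the same pair of draws, red then blue or blue then red, so their chances are added. The variable names are hypothetical.

```python
# Hypothetical bag of marbles, for illustration only
red = 3
blue = 2
total = red + blue

# Way 1: red first, then blue (each way uses the multiplication rule)
chance_red_then_blue = (red / total) * (blue / (total - 1))

# Way 2: blue first, then red
chance_blue_then_red = (blue / total) * (red / (total - 1))

# Addition rule: the two ways are disjoint, so add their chances
chance_one_of_each = chance_red_then_blue + chance_blue_then_red

print(chance_one_of_each)
```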

### Task 10 📍

A sequence of two numbers is shown. What is the probability that the sequence (6, 5) or the sequence (5, 6) is illuminated on the board? 

Assign `chance_65_56` to a value or expression that represents this probability value. The value should be a number between `0` and `1`, inclusive.

In [None]:
chance_65_56 = ...
chance_65_56

In [None]:
grader.check("task_10")

### The Complement Rule

The chance of an event not happening is one minus the chance that it will happen. This is generally referred to as the complement rule.
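
The complement rule is especially useful for "at least one" questions, because the opposite event, "none at all," is often easier to compute. As an illustration in a made-up setting, the sketch below finds the chance of rolling at least one six in four rolls of a fair die. The variable names are hypothetical.

```python
# Chance that a single roll of a fair die is NOT a six
chance_not_six = 5 / 6

# Multiplication rule: chance that all four rolls are not a six
rolls = 4
chance_no_sixes = chance_not_six ** rolls

# Complement rule: "at least one six" is the opposite of "no sixes"
chance_at_least_one_six = 1 - chance_no_sixes

print(chance_at_least_one_six)
```

The result is a little more than one half, even though the chance of a six on any single roll is only 1/6.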

### Task 11 📍

Twenty sequences of two numbers are shown. What is the probability that at least one of the sequences is (5, 6)?

Assign `chance_at_least_one_56` to a value or expression that represents this probability value. The value should be a number between `0` and `1`, inclusive.

In [None]:
chance_at_least_one_56 = ...
chance_at_least_one_56

In [None]:
grader.check("task_11")

## Submit your Lab to Canvas

Once you have finished working on the lab questions, prepare to submit your work in Canvas by completing the following steps.

1. In the related Canvas Assignment page, check the requirements for a Complete score for this lab assignment.
2. Double-check that you have run the code cell near the end of the notebook that contains the command `grader.check_all()`. This command will run all of the tests on your responses to the auto-graded tasks marked with 📍.
3. Double-check your responses to the manually graded tasks marked with 📍🔎.
4. Select the menu items `File`, `Save and Export Notebook As...`, and `Html_embed` in the notebook's Toolbar to download an HTML version of this notebook file.
5. In the related Canvas Assignment page, click Start Assignment or New Attempt to upload the downloaded HTML file.

---

To double-check your work, the cell below will rerun all of the autograder tests.

In [None]:
grader.check_all()