In [3]:
# Initialize Otter
import otter
grader = otter.Notebook("hw05.ipynb")

<img style="display: block; margin-left: auto; margin-right: auto" src="./ccsf-logo.png" width="250rem;" alt="The CCSF black and white logo">

# Homework 5: Iteration and Chance

## References

* [Sections 9.0 - 9.3](https://inferentialthinking.com/chapters/09/Randomness.html)
* [Chapter 10](https://inferentialthinking.com/chapters/10/Sampling_and_Empirical_Distributions.html)
* [`datascience` Documentation](https://datascience.readthedocs.io/)
* [Python Quick Reference](https://ccsf-math-108.github.io/materials-fa23/resources/quick_reference.html)

## Assignment Reminders

- Make sure to run the code cell at the top of this notebook that starts with `# Initialize Otter` to load the auto-grader.
- For tasks marked with a 🔎, which require written explanations, provide your answer in the designated space.
- Throughout this assignment and all future ones, please be sure not to reassign variables! _For example, if you use `max_temperature` in your answer to one question, do not reassign it later on. Otherwise, you will fail tests that you thought you were passing previously!_
- We encourage you to discuss this assignment with others but make sure to write and submit your own code. Refer to the syllabus to learn more about how to learn cooperatively.
- Unless you are asked otherwise, use the non-interactive visualizations when asked to produce a visualization for a task.
- View the related <a href="https://ccsf.instructure.com" target="_blank">Canvas</a> Assignment page for additional details.

Run the following code cell to import the tools for this assignment.

In [4]:
from datascience import *
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

## 2023-24 CCSF Football Season

You are going to analyze how well the CCSF football team performed in the 2023-24 season. A football game is divided into four periods, called quarters. The number of points CCSF scored in each quarter (e.g. `CCSF 1Q`) and the number of points their opponent scored in each quarter (e.g. `Opp 1Q`) are stored in a table called `games`. If the game is tied at the end of the 4th quarter, the game goes into an additional period called overtime.

**Notes**:
* The 2023-24 season data was collected using the Game Log from [the CCSF Athletics website](https://ccsfathletics.com/sports/fball/2023-24/teams/sanfrancisco?view=gamelog).
* The table `games_full` contains additional statistics at the right end of the table for your curiosity.
* A `nan` value indicates that no data was provided on the website.

In [5]:
games_full = Table.read_table("ccsf_fb.csv")
games = games_full.select(1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
games.show()

Opponent,CCSF 1Q,CCSF 2Q,CCSF 3Q,CCSF 4Q,CCSF OT,Opp 1Q,Opp 2Q,Opp 3Q,Opp 4Q,Opp OT
Santa Rosa,13,21,3,21,0,0,0,0,14,0
Sacramento City,0,3,14,7,0,3,0,0,0,0
Butte,0,7,20,0,0,9,0,0,6,0
Fresno,0,21,10,14,0,3,7,0,14,0
Sierra,14,14,14,7,0,7,14,14,0,0
San Joaquin Delta,7,27,7,22,0,0,3,0,0,0
Laney,7,17,7,7,0,0,0,0,15,0
Diablo Valley,14,29,0,13,0,0,0,0,0,0
Chabot,7,21,21,10,0,0,0,6,0,0
San Mateo,10,0,0,7,0,14,0,14,16,0


Let's start by finding the total points each team scored in a game.

### Task 01 📍

Write a function called `sum_scores`.  It should take five arguments, where each argument is the team's quarter or overtime score. It should return the team's total score for that game.

_Points:_ 2

In [10]:
def sum_scores(q1, q2, q3, q4, ot):
    """Returns the total score calculated by adding up the score of each quarter and the overtime"""
    return q1 + q2 + q3 + q4 + ot

sum_scores(14, 7, 3, 0, 3) #DO NOT CHANGE THIS LINE

27

In [11]:
grader.check("task_01")

### Task 02 📍

Create a new table `final_scores` with three columns in this *specific* order: `Opponent`, `CCSF Score`, `Opponent Score`. You will have to create the `CCSF Score` and `Opponent Score` columns. Use the function `sum_scores` you just defined in the previous question for this problem.

*Hint:* If you want to apply a function that takes in multiple arguments, you can pass multiple column names as arguments in `tbl.apply()`. The column values will be passed into the corresponding arguments of the function. Take a look at the Python reference for syntax.

*Tip:* If you’re running into issues creating `final_scores`, check that `ccsf_scores` and `opponent_scores` output what you want. Also, you will more than likely want to create more steps/intermediate variables.



_Points:_ 3

In [20]:
# Calculate total scores and create the final_scores table
final_scores = games.with_columns(
    'CCSF Score', games.apply(sum_scores, 'CCSF 1Q', 'CCSF 2Q', 'CCSF 3Q', 'CCSF 4Q', 'CCSF OT'),
    'Opponent Score', games.apply(sum_scores, 'Opp 1Q', 'Opp 2Q', 'Opp 3Q', 'Opp 4Q', 'Opp OT')
).select('Opponent', 'CCSF Score', 'Opponent Score')

final_scores

Opponent,CCSF Score,Opponent Score
Santa Rosa,58,14
Sacramento City,24,3
Butte,27,15
Fresno,45,24
Sierra,49,35
San Joaquin Delta,63,3
Laney,38,15
Diablo Valley,56,0
Chabot,59,6
San Mateo,17,44


In [21]:
grader.check("task_02")

We can get specific row objects from a table. You can use `tbl.row(n)` to get the `(n+1)`th row of a table. `row.item("column_name")` will allow you to select the element that corresponds to `column_name` in a particular row. Here's an example:

In [22]:
# Just run this cell
games.row(10)

Row(Opponent='American River', CCSF 1Q=0, CCSF 2Q=0, CCSF 3Q=6, CCSF 4Q=0, CCSF OT=0, Opp 1Q=14, Opp 2Q=10, Opp 3Q=3, Opp 4Q=14, Opp OT=0)

In [23]:
# Just run this cell
games.row(10).item("CCSF 4Q")

0

### Task 03 📍

We want to see whether or not CCSF won a particular game. Write a function called `did_ccsf_win`. It should take one argument: a **row object** from the `final_scores` table. It should return `True` if CCSF's score was greater than the Opponent's score, and `False` otherwise.

*Note 1*: "Row object" means a row from the table extracted (behind the scenes) using `tbl.row(index)` that contains all the data for that specific row. It is **not** the index of a row. Do not try to call `final_scores.row(row)` inside of the function.

*Note 2*: If you're still confused by row objects, try printing out `final_scores.row(1)` in a new cell to visually see what it looks like! This piece of code is pulling out the row object located at index 1 of the `final_scores` table and returning it. When you display it in a cell, you'll see that it is not located within a table, but is instead a standalone row object!




_Points:_ 3

In [30]:
def did_ccsf_win(row):
    return row.item('CCSF Score') > row.item('Opponent Score')

In [31]:
grader.check("task_03")

### Task 04 📍

Unfortunately, CCSF did not win against every opponent during the 2023-24 season. Using the `final_scores` table, assign `results` to an array of `True` and `False` values that correspond to whether or not CCSF won. Add the `results` array to the `final_scores` table, and assign this to `final_scores_with_results`. Then assign the number of wins and losses CCSF had to `ccsf_wins` and `ccsf_losses`, respectively.

*Hint*: When you only pass a function name and no column labels through `tbl.apply()`, the function gets applied to every row in `tbl`.



_Points:_ 4

In [32]:
results = final_scores.apply(did_ccsf_win)
final_scores_with_results = final_scores.with_column('Results', results)
ccsf_wins = sum(results)
ccsf_losses = final_scores.num_rows - ccsf_wins

# Don't delete or edit the following line:
print(f"In the 2023-24 Season, CCSF Football won {ccsf_wins} games and lost {ccsf_losses} games. Go RAMS!")

In the 2023-24 Season, CCSF Football won 9 games and lost 2 games. Go RAMS!


In [33]:
grader.check("task_04")
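
A note on why `sum(results)` can count wins: Python treats `True` as 1 and `False` as 0 when adding. The sketch below uses a made-up list (`example_results` is hypothetical, not the CCSF data) in plain Python rather than a `datascience` array.

```python
# Hypothetical results for five games (not the CCSF data).
example_results = [True, False, True, True, False]

# Adding booleans counts the True values: True acts as 1, False as 0.
example_wins = sum(example_results)
example_losses = len(example_results) - example_wins

print(example_wins, example_losses)  # 3 2
```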

### Task 05 📍

Sometimes in football, the two teams are equally matched and the game is quite close. Other times, it is a blowout, where the winning team wins by a large margin of victory. Let's define a **big win** to be a game in which the winning team won by more than 10 points. 

Create a function called `is_big_win`.
* The function should accept a single row (`datascience.tables.Row` data type) from a table like the `final_scores` table.
* The function should output `True` if CCSF won by **more than** 10 points. Otherwise, it should return `False`.

Test your function on the first row of the table. `final_scores.row(0)` should be the row object `Row(Opponent='Santa Rosa', CCSF Score=58, Opponent Score=14)`. Since CCSF's score is more than 10 points larger than Santa Rosa's score, `is_big_win` applied to that row should output `True`.

_Points:_ 4

In [34]:
def is_big_win(row):
    ccsf_score = row.item('CCSF Score')
    opp_score = row.item('Opponent Score')
    return ccsf_score - opp_score > 10


# Test your function 
row1 = final_scores.row(0)
is_big_win(row1)

True

In [35]:
grader.check("task_05")

### Task 06 📍

Use your `final_scores` table with your `is_big_win` function to assign `big_wins` to an array of team names that CCSF had big wins against during the 2023-24 football season.

_Points:_ 5

In [40]:
big_wins = np.array([])

for row_index in range(final_scores.num_rows):
    row = final_scores.row(row_index)
    opponent = row.item('Opponent')
    if is_big_win(row):
        big_wins = np.append(big_wins, opponent)

big_wins

array(['Santa Rosa', 'Sacramento City', 'Butte', 'Fresno', 'Sierra',
       'San Joaquin Delta', 'Laney', 'Diablo Valley', 'Chabot'],
      dtype='<U32')

In [39]:
grader.check("task_06")
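
For comparison, the same filter can be sketched in plain Python with a list comprehension. This is only an illustration, not the approach the task asks for. The names `final_score_rows` and `big_wins_list` are hypothetical, and the scores are copied by hand from the ten rows of `final_scores` displayed earlier, so any game not shown in that display is missing here.

```python
# (opponent, CCSF score, opponent score), copied from the displayed
# final_scores rows. Assumption: only the ten displayed rows are included.
final_score_rows = [
    ("Santa Rosa", 58, 14),
    ("Sacramento City", 24, 3),
    ("Butte", 27, 15),
    ("Fresno", 45, 24),
    ("Sierra", 49, 35),
    ("San Joaquin Delta", 63, 3),
    ("Laney", 38, 15),
    ("Diablo Valley", 56, 0),
    ("Chabot", 59, 6),
    ("San Mateo", 17, 44),
]

# Keep the opponent's name whenever CCSF won by more than 10 points.
big_wins_list = [
    opponent
    for opponent, ccsf_score, opponent_score in final_score_rows
    if ccsf_score - opponent_score > 10
]

print(big_wins_list)
```

The loop in the task and the comprehension here visit the rows in the same order, so the names come out in table order either way.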

## Roulette

### Chance

A Nevada roulette wheel has 38 pockets and a small ball that rests on the wheel. When the wheel is spun, the ball comes to rest in one of the 38 pockets. That pocket is declared the winner. 

The pockets are labeled 0, 00, 1, 2, 3, 4, ... , 36. Pockets 0 and 00 are green, and the other pockets are alternately red and black. The table `wheel` is a representation of a Nevada roulette wheel. **Note that *both* columns consist of strings.** Below is an example of a roulette wheel!

<img src="./roulette_wheel.jpeg" alt="roulette wheel" width="330px">

Run the cell below to load the `wheel` table.

In [42]:
wheel = Table.read_table('roulette_wheel.csv', dtype=str)
wheel

Pocket,Color
0,green
00,green
1,red
2,black
3,red
4,black
5,red
6,black
7,red
8,black


Before you do the following tasks, make sure you understand the logic behind all the examples in [Section 9.5](https://inferentialthinking.com/chapters/09/5/Finding_Probabilities.html). 

Good ways to approach probability calculations include:

- Thinking one trial at a time: What does the first one have to be? Then what does the next one have to be?
- Breaking up the event into distinct ways in which it can happen.
- Seeing if it is easier to find the chance that the event does not happen.

On each spin of a roulette wheel, all 38 pockets are equally likely to be the winner regardless of the results of other spins. Among the 38 pockets, 18 are red, 18 black, and 2 green.
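
As a small illustration of the three approaches listed above, the sketch below computes a few exact chances with Python's standard `fractions` module. The events and variable names (`two_reds`, `red_or_green`, `not_green`) are made up for this example and are not part of the tasks; only the pocket counts come from the description above.

```python
from fractions import Fraction

# Pocket counts taken from the description above.
RED, BLACK, GREEN, TOTAL = 18, 18, 2, 38

# One trial at a time: red on the first spin AND red on the second spin.
# Spins do not affect each other, so the chances multiply.
two_reds = Fraction(RED, TOTAL) * Fraction(RED, TOTAL)

# Distinct ways: a single spin is red OR green (it cannot be both),
# so the chances add.
red_or_green = Fraction(RED, TOTAL) + Fraction(GREEN, TOTAL)

# Complement: the chance that a single spin is NOT green.
not_green = 1 - Fraction(GREEN, TOTAL)

print(two_reds, red_or_green, not_green)  # 81/361 10/19 18/19
```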

#### Task 07 📍

The winning pocket is black on all of the first three spins.

_Provide an expression that Python evaluates to the chance of the event described._

_Points:_ 2

In [43]:
first_three_black = (18/38) * (18/38) * (18/38)

In [44]:
grader.check("task_07")

#### Task 08 📍

The color green never wins in the first 10 spins.

_Provide an expression that Python evaluates to the chance of the event described._

_Points:_ 2

In [49]:
no_green = (1 - 2/38)**10
no_green

0.5823566532299399

In [50]:
grader.check("task_08")

#### Task 09 📍

The color green wins **at least** once on the first 10 spins.

_Provide an expression that Python evaluates to the chance of the event described._

_Points:_ 2

In [52]:
at_least_one_green = 1 - (1 - 2/38)**10
at_least_one_green

0.4176433467700601

In [48]:
grader.check("task_09")
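
One way to sanity-check a complement calculation like the one in Task 09 is to simulate the spins. The sketch below is a rough check using only the standard `random` module. The seed, the number of trials, and the helper `green_in_ten_spins` are arbitrary choices made for this illustration.

```python
import random

random.seed(108)  # arbitrary seed so the run is repeatable

# 38 pockets by color only: 2 green, 18 red, 18 black.
pockets = ["green"] * 2 + ["red"] * 18 + ["black"] * 18

def green_in_ten_spins():
    """Simulate 10 independent spins; return True if green wins at least once."""
    return any(random.choice(pockets) == "green" for _ in range(10))

trials = 20_000
simulated = sum(green_in_ten_spins() for _ in range(trials)) / trials
exact = 1 - (1 - 2/38) ** 10

print(f"simulated: {simulated:.3f}, exact: {exact:.3f}")
```

The simulated proportion should land close to the exact value, and increasing `trials` typically brings it closer.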

### Comparing Chances

In each of the following two tasks, two events A and B are described. Choose one of the following three options and set each answer variable to a single integer:

1. Event A is more likely than Event B
2. Event B is more likely than Event A
3. The two events have the same chance.

You should be able to make the choices **without calculation**. Good ways to approach this exercise include imagining carrying out the chance experiments yourself, one trial at a time, and thinking about the [law of averages](https://inferentialthinking.com/chapters/10/1/Empirical_Distributions.html#the-law-of-averages).

#### Task 10 📍

A child picks four times at random from a box that has four toy animals: a bear, an elephant, a giraffe, and a kangaroo.

- Event A: all four different animals are picked, assuming the child picks without replacement
- Event B: all four different animals are picked, assuming the child picks with replacement

_Points:_ 2

In [53]:
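# Without replacement, the four picks must all be different, so Event A is certain.
# With replacement, repeats are possible, so Event B is less likely.
toys_option = 1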

In [54]:
grader.check("task_10")
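
Although the task is meant to be answered without calculation, the two events can be checked afterwards by listing every equally likely outcome. The sketch below uses `itertools` from the standard library. The helper `chance_all_different` and the names `chance_a` and `chance_b` are made up for this illustration.

```python
from fractions import Fraction
from itertools import permutations, product

animals = ["bear", "elephant", "giraffe", "kangaroo"]

def chance_all_different(outcomes):
    """Fraction of equally likely outcomes in which all four picks differ."""
    outcomes = list(outcomes)
    hits = sum(len(set(picks)) == 4 for picks in outcomes)
    return Fraction(hits, len(outcomes))

# Event A, without replacement: every ordering of the four animals.
chance_a = chance_all_different(permutations(animals, 4))

# Event B, with replacement: every sequence of four picks, repeats allowed.
chance_b = chance_all_different(product(animals, repeat=4))

print(chance_a, chance_b)  # 1 3/32
```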

#### Task 11 📍

In a lottery, two numbers are drawn at random without replacement from the integers 1 through 1000.

- Event A: The number 8 is picked on both draws
- Event B: The same number is picked on both draws


_Points:_ 2

In [55]:
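# Without replacement, the same number cannot be drawn twice,
# so both events are impossible and have the same chance (zero).
lottery_option = 3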

In [56]:
grader.check("task_11")
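
The same kind of after-the-fact check works for the lottery: list every ordered pair of draws made without replacement and count how many match each event. This is a brute-force sketch using `itertools.permutations`. The counter names are made up for this illustration.

```python
from itertools import permutations

total = both_eight = same_number = 0

# Without replacement, an outcome is an ordered pair of two different numbers.
for first, second in permutations(range(1, 1001), 2):
    total += 1
    both_eight += (first == 8 and second == 8)   # Event A
    same_number += (first == second)             # Event B

print(total, both_eight, same_number)  # 999000 0 0
```

Neither event ever occurs among the possible outcomes, because a number that has been drawn is not available for the second draw.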

## Submit your Homework to Canvas

Once you have finished working on the homework tasks, prepare to submit your work in Canvas by completing the following steps.

1. In the related Canvas Assignment page, check the rubric to know how you will be scored for this assignment.
2. Double-check that you have run the code cell near the end of the notebook that contains the command `grader.check_all()`. This command runs all of the tests on your responses to the auto-graded tasks marked with 📍.
3. Double-check your responses to the manually graded tasks marked with 📍🔎.
4. Select the menu items "File" and "Save Notebook" in the notebook's Toolbar to save your work and create a specific checkpoint in the notebook's work history.
5. Select the menu items "File" and "Download" in the notebook's Toolbar to download the notebook (.ipynb) file.
6. In the related Canvas Assignment page, click Start Assignment or New Attempt to upload the downloaded .ipynb file.

**Keep in mind that the autograder does not always check for correctness. Sometimes it just checks for the format of your answer, so passing the autograder for a question does not mean you got the answer correct for that question.**

---

To double-check your work, the cell below will rerun all of the autograder tests.

In [57]:
grader.check_all()

task_01 results: All test cases passed!
task_01 - 1 message: ✅ Your function works with our 4 quarter scores and over time score.

task_02 results: All test cases passed!
task_02 - 1 message: ✅ final_scores has the correct number of columns.
task_02 - 2 message: ✅ final_scores has the correct labels.

task_03 results: All test cases passed!
task_03 - 1 message: ✅ You've created a function called did_ccsf_win.
task_03 - 2 message: ✅ Your function shows that CCSF won against Sacramento City.

task_04 results: All test cases passed!
task_04 - 1 message: ✅ ccsf_wins is a possible value
task_04 - 2 message: ✅ ccsf_losses is a possible value

task_05 results: All test cases passed!
task_05 - 1 message: ✅ It seems like you defined a function called is_big_win.
task_05 - 2 message: ✅ Your function produces the correct output for the first row of final_scores.

task_06 results: All test cases passed!
task_06 - 1 message: ✅ big_wins is a NumPy array.
task_06 - 2 message: ✅ The first item in big_