# Lab 3: DataFrames, Control Flow, and Probability

## Due Thursday, April 25th at 11:59PM

Welcome to Lab 3! This week, we will go over more DataFrame manipulation techniques, conditionals and iteration, and introduce the concept of randomness. You should complete this entire lab so that all tests pass and submit it to Gradescope by 11:59PM on the due date.

Refer to the following readings:
- Grouping with subgroups (see [BPD 11.4](https://notes.dsc10.com/02-data_sets/groupby.html#subgroups))
- Merging DataFrames (see [BPD 13](https://notes.dsc10.com/02-data_sets/merging.html))
- Conditional statements (see [CIT 9.1](https://inferentialthinking.com/chapters/09/1/Conditional_Statements.html))
- Iteration (see [CIT 9.2](https://inferentialthinking.com/chapters/09/2/Iteration.html))
- Probability (see [CIT 9.5](https://inferentialthinking.com/chapters/09/5/Finding_Probabilities.html))

First, set up the tests and imports by running the cell below.

In [1]:
import numpy as np
import babypandas as bpd

import matplotlib
import matplotlib.pyplot as plt
plt.style.use('ggplot')

# import otter
# grader = otter.Notebook()

## 1. California National Parks 🏞️ 🐻

In this question, we'll take a closer look at the `.merge` and `.groupby` DataFrame methods.

We will be working with two DataFrames, `parks` and `species`, which provide information on California National Parks and the species of plants and animals found there, respectively. These are a subset of a [larger dataset provided by the National Park Service](https://www.kaggle.com/nationalparkservice/park-biodiversity).

Run the cells below to load in our data. 

In [2]:
parks = bpd.read_csv('california_parks.csv')
parks

Unnamed: 0,Park Code,Park Name,State,Acres,Latitude,Longitude
0,CHIS,Channel Islands National Park,CA,249561,34.01,-119.42
1,JOTR,Joshua Tree National Park,CA,789745,33.79,-115.9
2,LAVO,Lassen Volcanic National Park,CA,106372,40.49,-121.51
3,PINN,Pinnacles National Park,CA,26606,36.48,-121.16
4,REDW,Redwood National Park,CA,112512,41.3,-124.0
5,SEKI,Sequoia and Kings Canyon National Parks,CA,865952,36.43,-118.68
6,YOSE,Yosemite National Park,CA,761266,37.83,-119.5


In [3]:
species = bpd.read_csv('california_parks_species.csv')
species

Unnamed: 0,Park Name,Category,Order,Family,Common Names,Abundance
0,Channel Islands National Park,Mammal,Carnivora,Canidae,Channel Islands Gray Fox,Rare
1,Channel Islands National Park,Mammal,Carnivora,Mephitidae,Spotted Skunk,Uncommon
2,Channel Islands National Park,Mammal,Carnivora,Mustelidae,Sea Otter,Unknown
3,Channel Islands National Park,Mammal,Carnivora,Otariidae,Guadalupe Fur Seal,Occasional
4,Channel Islands National Park,Mammal,Carnivora,Otariidae,Northern Fur Seal,Uncommon
...,...,...,...,...,...,...
17780,Yosemite National Park,Vascular Plant,Solanales,Solanaceae,Parish's Nightshade,Rare
17781,Yosemite National Park,Vascular Plant,Solanales,Solanaceae,"Chaparral Nightshade, Purple Nightshade",Uncommon
17782,Yosemite National Park,Vascular Plant,Vitales,Vitaceae,"Thicket Creeper, Virginia Creeper, Woodbine",Rare
17783,Yosemite National Park,Vascular Plant,Vitales,Vitaceae,"California Grape, California Wild Grape",Uncommon


**Question 1.1.** Below, create a DataFrame named `species_counts` to show how many species there are in each park. This DataFrame should:
- Have one row for each park.
- Be indexed by `'Park Name'`.
- Have one column named `'Count'` that contains the number of species per park.

In [4]:
species_counts = species.groupby('Park Name').count() 
species_counts = species_counts.assign(Count=species_counts.get('Category')).get(['Count'])
species_counts

Unnamed: 0_level_0,Count
Park Name,Unnamed: 1_level_1
Channel Islands National Park,1885
Joshua Tree National Park,2294
Lassen Volcanic National Park,1797
Pinnacles National Park,1416
Redwood National Park,6310
Sequoia and Kings Canyon National Parks,1995
Yosemite National Park,2088


**Question 1.2.** Below, use the `.merge` method to create a new DataFrame named `parks_with_species`, which should have one row for each park. `parks_with_species` should have all the columns in `parks` plus and additional column called `'Count'` with the number of species in each park. Your DataFrame should look like this:

|    | Park Code   | Park Name                               | State   |   Acres |   Latitude |   Longitude |   Count |
|---:|------------|----------------------------------------|--------|--------|-----------|------------|--------|
|  0 | CHIS        | Channel Islands National Park           | CA      |  249561 |      34.01 |     -119.42 |    1885 |
|  1 | JOTR        | Joshua Tree National Park               | CA      |  789745 |      33.79 |     -115.9  |    2294 |
|  2 | LAVO        | Lassen Volcanic National Park           | CA      |  106372 |      40.49 |     -121.51 |    1797 |
|  3 | PINN        | Pinnacles National Park                 | CA      |   26606 |      36.48 |     -121.16 |    1416 |
|  4 | REDW        | Redwood National Park                   | CA      |  112512 |      41.3  |     -124    |    6310 |
|  5 | SEKI        | Sequoia and Kings Canyon National Parks | CA      |  865952 |      36.43 |     -118.68 |    1995 |
|  6 | YOSE        | Yosemite National Park                  | CA      |  761266 |      37.83 |     -119.5  |    2088 |

In [5]:
reset_species_counts = species_counts.reset_index()
reset_species_counts

Unnamed: 0,Park Name,Count
0,Channel Islands National Park,1885
1,Joshua Tree National Park,2294
2,Lassen Volcanic National Park,1797
3,Pinnacles National Park,1416
4,Redwood National Park,6310
5,Sequoia and Kings Canyon National Parks,1995
6,Yosemite National Park,2088


In [6]:
parks_with_species = parks.merge(reset_species_counts, on='Park Name')
parks_with_species

Unnamed: 0,Park Code,Park Name,State,Acres,Latitude,Longitude,Count
0,CHIS,Channel Islands National Park,CA,249561,34.01,-119.42,1885
1,JOTR,Joshua Tree National Park,CA,789745,33.79,-115.9,2294
2,LAVO,Lassen Volcanic National Park,CA,106372,40.49,-121.51,1797
3,PINN,Pinnacles National Park,CA,26606,36.48,-121.16,1416
4,REDW,Redwood National Park,CA,112512,41.3,-124.0,6310
5,SEKI,Sequoia and Kings Canyon National Parks,CA,865952,36.43,-118.68,1995
6,YOSE,Yosemite National Park,CA,761266,37.83,-119.5,2088


**Question 1.3.** Using the `.groupby` method, assign the variable `species_category` to a DataFrame that has one row for each `'Category'` of species at each park.

Reset the index and assign columns so that you have three columns: `'Park Name'`, `'Category'`, and `'Count'`. Your DataFrame should look like this:

|     | Park Name                     | Category            | Count |
|-----|-------------------------------|---------------------|-------|
| 0   | Channel Islands National Park | Algae               | 61    |
| 1   | Channel Islands National Park | Amphibian           | 4     |
| 2   | Channel Islands National Park | Bird                | 357   |
| 3   | Channel Islands National Park | Crab/Lobster/Shrimp | 11    |
| 4   | Channel Islands National Park | Fish                | 273   |
| ... | ...                           | ...                 | ...   |
| 71  | Yosemite National Park        | Bird                | 270   |
| 72  | Yosemite National Park        | Fish                | 10    |
| 73  | Yosemite National Park        | Mammal              | 88    |
| 74  | Yosemite National Park        | Reptile             | 22    |
| 75  | Yosemite National Park        | Vascular Plant      | 1683  |

In [7]:
park_with_cat = species.merge(reset_species_counts, on = 'Park Name')
park_with_cat

Unnamed: 0,Park Name,Category,Order,Family,Common Names,Abundance,Count
0,Channel Islands National Park,Mammal,Carnivora,Canidae,Channel Islands Gray Fox,Rare,1885
1,Channel Islands National Park,Mammal,Carnivora,Mephitidae,Spotted Skunk,Uncommon,1885
2,Channel Islands National Park,Mammal,Carnivora,Mustelidae,Sea Otter,Unknown,1885
3,Channel Islands National Park,Mammal,Carnivora,Otariidae,Guadalupe Fur Seal,Occasional,1885
4,Channel Islands National Park,Mammal,Carnivora,Otariidae,Northern Fur Seal,Uncommon,1885
...,...,...,...,...,...,...,...
17780,Yosemite National Park,Vascular Plant,Solanales,Solanaceae,Parish's Nightshade,Rare,2088
17781,Yosemite National Park,Vascular Plant,Solanales,Solanaceae,"Chaparral Nightshade, Purple Nightshade",Uncommon,2088
17782,Yosemite National Park,Vascular Plant,Vitales,Vitaceae,"Thicket Creeper, Virginia Creeper, Woodbine",Rare,2088
17783,Yosemite National Park,Vascular Plant,Vitales,Vitaceae,"California Grape, California Wild Grape",Uncommon,2088


In [8]:
species_category = park_with_cat.groupby(['Park Name', 'Category']).count().get(['Count'])
species_category = species_category.reset_index()
species_category

Unnamed: 0,Park Name,Category,Count
0,Channel Islands National Park,Algae,61
1,Channel Islands National Park,Amphibian,4
2,Channel Islands National Park,Bird,357
3,Channel Islands National Park,Crab/Lobster/Shrimp,11
4,Channel Islands National Park,Fish,273
...,...,...,...
71,Yosemite National Park,Bird,270
72,Yosemite National Park,Fish,10
73,Yosemite National Park,Mammal,88
74,Yosemite National Park,Reptile,22


## 2. Coffee Shop ☕

In Python, Boolean values can either be `True` or `False`. We get Boolean values when using comparison operators, among which are `<` (less than), `>` (greater than), and `==` (equal to). A more complete list can be found below.

|symbol|meaning|
|--------|--------|
|`==` |equal to |
|`!=` |not equal to |
|`<`|less than|
|`<=`|less than or equal to|
|`>`|greater than|
|`>=`|greater than or equal to|


Run the cell below to see an example of a comparison operator in action.

In [9]:
3 > 1 + 1

True

We can even assign the result of a comparison operation to a variable.

In [10]:
result = 10 / 2 == 5
result

True

Arrays are compatible with comparison operators, and comparisons happen one element at a time. The output is an array of Boolean values.

In [11]:
odd_numbers = np.array([1, 3, 5, 7, 9]) 
odd_numbers > 2

array([False,  True,  True,  True,  True])

After making a comparison, we can count how many `True` value are in the resulting array using the function `np.count_nonzero`. When called on an array of Boolean values, `np.count_nonzero` returns the **number of `True` values** in the array. 

For example, let's see what happens when we give the above array of Boolean values as the argument to `np.count_nonzero`.

In [12]:
np.count_nonzero(odd_numbers > 2)

4

This tells us that there are 4 values in `odd_numbers` that are larger than 2. That's correct; those numbers are 3, 5, 7, and 9.

The name `np.count_nonzero` can be a little misleading, since we're not using it to tell us how many nonzero numbers are in `odd_numbers`. Rather, we're using it to tell us how many values in a Boolean array are `True`.

The name comes from the way Python treats Boolean values under the hood. In Python, `True` values are treated like the number 1, and `False` values like the number 0. So when use `np.count_nonzero` on an array of Booleans, it is effectively counting the number of nonzero values, due to the fact that `True` is 1 and `False` is 0.

Let's say you own a small coffee shop, and you have to keep track of what your customers order. Whenever someone orders coffee at your shop, they will order one of the following sizes: Small, Medium, Large, or Extra Large.

<img src='coffee_sizes.png' width=300>

Using the function call `np.random.choice(array_name)`, let's simulate customers ordering sizes of coffee at random. Start by running the cell below several times, each time representing a new customer's order.

In [13]:
coffee_sizes = np.array(['Small', 'Medium', 'Large', 'Extra Large'])
np.random.choice(coffee_sizes)

'Large'

We can add a second argument to our call to `np.random.choice` to simulate several coffee orders at the same time. The second argument represents how many coffees are being ordered. The result will be an *array* of coffee orders instead of just one order.

Let's suppose you get ten customers one morning. Run the cell below to see what sizes of coffee they get.

In [14]:
ten_coffees = np.random.choice(coffee_sizes, 10)
ten_coffees

array(['Large', 'Large', 'Large', 'Extra Large', 'Small', 'Large',
       'Medium', 'Extra Large', 'Small', 'Medium'], dtype='<U11')

Note that the cell above uses randomness, meaning if you run it again, you might get a different result. In the questions that follow, we'll ask you to work with the variable `ten_coffees`. But since `ten_coffees` might be a different set of ten coffees the next time you run the notebook, you'll want to avoid answering questions about this specific set of ten coffees. So don't type in (or *hardcode*) any answers based on the current contents of `ten_coffees` you see above. Instead, write code that will work generally, regardless of what exact values are stored in `ten_coffees`. 

**Question 2.1.** Find the number of Small coffees in `ten_coffees` using code (do not hardcode the answer).  

_**Hint:**_ Our solution involves a comparison operator and the `np.count_nonzero` function.

In [15]:
number_small = np.count_nonzero(ten_coffees == 'Small')
number_small

2

**Conditional Statements**

A conditional statement is made up of multiple lines of code that allow Python to choose from different alternatives based on whether some condition is true.

Here is a basic example.

```py
def sign(x):
    if x > 0:
        return 'Positive'
```

The function determines whether the input `x` is greater than `0`, and if so, gives us the string `'Positive'` back. If not, the function gives no output.

If we want to test multiple conditions at once, we use the following general format.

```py
if <condition 1>:
    <body 1>
elif <condition 2>:
    <body 2>
elif <condition 3>:
    <body 3>
...
else:
    <body 4>
```

Only one of the bodies will ever be executed. Each `if` and `elif` (else-if) condition is evaluated and considered in order, starting at the top. As soon as a `True` value is found (i.e. once a condition is met), the corresponding body is executed, and the rest of the conditions are skipped. If none of the `if` or `elif` conditions are true, then the code indented under `else` (`<body 4>` in this example) is executed. For more examples and explanation, refer to [CIT 9.1](https://inferentialthinking.com/chapters/09/1/Conditional_Statements.html?highlight=else).

**Question 2.2.** Complete the implementation of the function `coffee_price`, which takes in the `size` of a coffee as a string and returns a float representing the price of that coffee in dollars, based on the relationship in the table below.

| Coffee Size    | Price ($) |
| ----------- | ----------- |
| Small      | 2.99      |
| Medium  | 3.99        |
| Large      | 4.79      |
| Extra Large  | 4.99        |

In [16]:
def coffee_price(size):
    if size == 'Small':
        return 2.99
    elif size == 'Medium':
        return 3.99
    elif size == 'Large':
        return 4.79
    elif size == 'Extra Large':
        return 4.99

# This is an example call to your function.
medium_price = coffee_price('Medium')
medium_price

3.99

Now consider the DataFrame `ten_coffees_df` defined below.

In [17]:
ten_coffees_df = bpd.DataFrame().assign(Size=ten_coffees)
ten_coffees_df

Unnamed: 0,Size
0,Large
1,Large
2,Large
3,Extra Large
4,Small
5,Large
6,Medium
7,Extra Large
8,Small
9,Medium


**Question 2.3.** Add a column named `'Price'` to the DataFrame `ten_coffees_df` that includes the price of each drink.

_**Hint:**_ Use the `.apply` method.

In [18]:
ten_coffees_df = ten_coffees_df.assign(Price=ten_coffees_df.get('Size').apply(coffee_price))
ten_coffees_df

Unnamed: 0,Size,Price
0,Large,4.79
1,Large,4.79
2,Large,4.79
3,Extra Large,4.99
4,Small,2.99
5,Large,4.79
6,Medium,3.99
7,Extra Large,4.99
8,Small,2.99
9,Medium,3.99


**Question 2.4.** Using code, find the number of coffees in `ten_coffees_df` that cost more than $4.  Think about how you could find this both by using DataFrame methods or by using `np.count_nonzero`.

In [19]:
over_4_dollars = np.count_nonzero(ten_coffees_df.get('Price') > 4)
over_4_dollars

6

**Question 2.5.** Complete the function `large_or_xl` below. The function takes as input any DataFrame of coffee sizes and prices, with column names `'Size'` and `'Price'`. The function compares the number of Large and Extra Large coffees. If there are more Large coffees, the function returns `'More Large coffees'` and if there are more Extra Large coffees, the function returns `'More Extra Large coffees'`. If there are an equal number of each, the function returns `'Same amount'`.

In [20]:
def large_or_xl(coffee_df):
    coffees = coffee_df.get('Size') 
    number_large = np.count_nonzero(coffees == 'Large')
    number_extra_large = np.count_nonzero(coffees == 'Extra Large')
    # Now return the appropriate string comparing the number of Large and Extra Large coffees.
    if number_large > number_extra_large:
        return 'More Large coffees'
    elif number_large < number_extra_large:
        return 'More Extra Large coffees'
    elif number_large == number_extra_large:
        return 'Same amount'

# Below, we create a DataFrame with randomly-generated data and test your function on it.
# Do NOT change anything below this line.
# However, you may want to add a new cell and evaluate large_or_xl(ten_coffees_df)
# to see if your function behaves as expected.
np.random.seed(24)
many_coffees = bpd.DataFrame().assign(Size=np.random.choice(coffee_sizes, 250))
many_coffees = many_coffees.assign(Price=many_coffees.get('Size').apply(coffee_price))
result = large_or_xl(many_coffees)
result

'More Extra Large coffees'

## 3. Iteration 🔂

`for`-loops allow us to iterate – that is, to run a piece of code multiple times. Here, we'll simulate the act of drawing different suits from a deck of cards. This is like drawing a card, putting it back in the deck, and drawing a card again (with replacement) because on each draw, you have an equal chance of getting any of the four suits. 🃏

In [21]:
suits = np.array(['♣️', '♥️', '♠️', '♦️'])

draws = np.array([])

repetitions = 6

for i in np.arange(repetitions):
    chosen_suit = np.random.choice(suits)
    draws = np.append(draws, chosen_suit)

draws

array(['♥️', '♦️', '♠️', '♥️', '♣️', '♦️'], dtype='<U32')

Another use of iteration is to loop through a set of values. For instance, we can print out all of the colors of the rainbow. 🌈

In [22]:
rainbow = np.array(["red", "orange", "yellow", "green", "blue", "indigo", "violet"])

for color in rainbow:
    print(color)

red
orange
yellow
green
blue
indigo
violet


We can see that the indented part of the `for`-loop, known as the body, is executed once for each item in `rainbow`. Note that the name `color` is arbitrary; we could replace both instances of `color` in the cell above with any valid variable name and the code would work the same.

We can also use a `for`-loop to add to a variable in an iterative fashion. If we want to keep track of how many times some event occurs, for example, we can set a variable to 0, then add 1 to it every time the event happens. This is kind of like keeping a tally. In the end, the variable (or the tally) represents the total number of times our event happened. Gradually adding on to a variable in this way is called the **accumulator pattern** as discussed in [Lecture 10](https://dsc10.com/resources/lectures/lec10/lec10.html#The-accumulator-pattern). We'll use this a lot!

Below, we use the accumulator pattern to count the number of even numbers in an array of numbers. Each time we encounter an even number in `num_array`, we increase `even_count` by 1. To check if an individual number is even, we compute its remainder when divided by 2 using the `%` ([modulus](https://www.freecodecamp.org/news/the-python-modulo-operator-what-does-the-symbol-mean-in-python-solved/#:~:text=The%20%25%20symbol%20in%20Python%20is,basic%20syntax%20is%3A%20a%20%25%20b)) operator.

In [23]:
num_array = np.array([1, 3, 4, 7, 21, 23, 28, 28, 30])

even_count = 0

for i in num_array:
    if i % 2 == 0:
        even_count = even_count + 1
        
even_count

4

**Question 3.1.** Gina is playing darts. 🎯 Her dartboard contains ten equal-sized zones with point values from 1 to 10. Write code using `np.random.choice` that simulates her total score after 1000 dart tosses.

In [24]:
possible_point_values = np.arange(1, 10)
tosses = 1000

total_score  = 0
for i in range(tosses):
    points = np.random.choice(possible_point_values)
    total_score = np.append(total_score, points).sum()

total_score

4925

**Question 3.2.** What is the average point value of a dart thrown by Gina?

In [25]:
average_score = total_score/tosses
average_score

4.925

**Question 3.3.** In the following cell, we've loaded in the text of _Peter Pan_ (1911) by J. M. Barrie. We've split the text into individual words, and stored these words in an array. Using a `for`-loop, assign `longer_than_8` to the number of words in the novel that are more than 8 letters long.  Look at [CIT 9.2](https://inferentialthinking.com/chapters/09/2/Iteration.html) if you get stuck.

_**Hint:**_ You can find the number of letters in a word with the `len` function.

In [26]:
peter_string = open('peter_pan.txt', encoding='utf-8').read()
peter_words = np.array(peter_string.split())

array_more_than_8 = np.array([])

#for string in peter_words:
#     if len(string) > 8:
#         np.append(array_more_than_8, string)
        
print(peter_words[0:10])  

['Chapter' 'I.' 'PETER' 'BREAKS' 'THROUGH' 'All' 'children,' 'except'
 'one,' 'grow']


In [27]:
print(len(peter_words[0]))    

7


In [28]:
for string in peter_words[0:10]:
    if len(string) > 8:
        print(string)
        array_more_than_8 = np.append(array_more_than_8, string)
        print(array_more_than_8)
    array_more_than_8    

children,
['children,']


In [29]:
peter_string = open('peter_pan.txt', encoding='utf-8').read()
peter_words = np.array(peter_string.split())

array_more_than_8 = np.array([])

for string in peter_words:
    if len(string) > 8:
        array_more_than_8 = np.append(array_more_than_8, string)

longer_than_8 = len(array_more_than_8)
longer_than_8

2452

Another use of the accumulator pattern is to create an array of results. We can start with an empty array, which we initialize once at the beginning, before iterating. Then inside a loop, we generate some result and append this result onto the end of the array. At the end of the loop, the array will contain **all** of the results we generated.  This is kind of like writing down all the results on a piece of paper, one at a time, as you generate them. We used this strategy in the [coin-flipping example in Lecture 10](https://dsc10.com/resources/lectures/lec10/lec10.html#Example:-Coin-flipping).

The function we use to actually append values onto an existing array is `np.append`. This function takes as input the name of an array and some value to append. It returns an array with all the existing values in the input array, plus one more. It does not modify the input array directly, so you have to remember to save it. When we use `np.append` with the accumulator pattern, our code typically looks something like this: 

```py
results_array = np.append(results_array, result)
```

**Question 3.4.** Use the strategy outlined above to assign `long_words` to an array of all the words in  _Peter Pan_ that are more than 8 characters long.  

In [30]:
long_words = array_more_than_8
long_words

array(['children,', 'delightful,', 'henceforth', ..., 'Neverland,',
       'daughter,', 'heartless.'], dtype='<U57')

## 4. Hungry Billy 🍗 🍕🍟
After a long day of class, Billy decides to go to Dirty Birds for dinner. Today's menu has Billy's four favorite foods: wings, pizza, fries, and mozzarella sticks. However, each dish has a 25% chance of running out before Billy can get to Dirty Birds.

***Note:*** Use Python as your calculator. Your answers should be expressions (like `0.5 ** 2`); don't simplify your answers using an outside calculator. Also, all of your answers should be given as decimals between 0 and 1, not percentages.

**Question 4.1.** What is the probability that Billy will be able to eat wings at Dirty Birds?

In [31]:
wings_prob = (1-0.25) ### 100% - the prob of not having wings
wings_prob

0.75

**Question 4.2.** What is the probability that Billy will be able to eat all four of these foods at Dirty Birds?

In [32]:
all_prob = (0.75**4) ### the prob of having everything
all_prob

0.31640625

**Question 4.3.** What is the probability that Dirty Birds will have run out of at least one of the four foods before Billy can get there?

In [33]:
something_is_out = (1-0.75**4) ### 100% - the prob of having everything
something_is_out

0.68359375

To make up for their unpredictable food supply, Dirty Birds decides to hold a contest for some free HDH Dining swag. There is a bag with three red marbles, three green marbles, and three blue marbles. Billy has to draw three marbles **without replacement**. In order to win, all three marbles Billy draws must be of different colors.

**Question 4.4.** What is the probability that Billy wins the contest?

_**Hint:**_ If you're stuck, start by determining the probability that the second marble Billy draws is different from the first marble Billy draws.

In [34]:
### 9/9 of getting any balls. Then 6/8 of getting  a ball of another color. Then 3/7 of getting the third color

winning_prob = (6/8)*(3/7)
winning_prob

0.3214285714285714

Congratulations! You are done with Lab 3.
