# Randomness and Probability

In this week's lesson, we're going to explore the functions and methods python offers for studying probability. We'll even perform some virtual experiments to illustrate some important probability concepts. 

As a quick recap: probability is a math term that refers to how **likely** something is to happen. Probabilities can be expressed as a fraction or decimal between 0 and 1, where **0** is the probability of something that *can't ever happen*, **1** is the probability of something that will *always happen no matter what*, and an outcome with probability **0.5** or **$\frac{1}{2}$** will happen about half the time (roughly 2 out of 4 tries, 10 out of 20 tries, 50 out of 100 tries, and so on).


The math concept of probability is completely tied to the idea of randomness. When situations are random, there's no certain way to predict how they will turn out. We can only make guesses that some outcomes are likely and others are unlikely. A standard example is a coin flip - we can be pretty certain that "heads" will be face up about $\frac{1}{2}$ of the time, so the probability is **0.5**. On the other hand, every time you wake up, you are on planet Earth, so the likelihood that you are on Earth when you wake up in the morning is virtually **1** (in the year 2020 at least). 

I'm sure you can think of a fun example for an outcome that has a **0** probability...



## Coding a bag full of marbles

Well, let's get to it. First we'll import the `random` module.

In [16]:
import random

Next, we'll fill a bag with a few marbles. 
#### More specifically, we're assigning the variable `bag` a list with 8 elements that are the string `yellow` and 8 elements that are the string `blue`. 

In [17]:
bag=['yellow','yellow','yellow','yellow',
    'yellow','yellow','yellow','yellow',
    'blue','blue','blue','blue',
    'blue','blue','blue','blue']

In [18]:
print(bag) #take a look in the bag

['yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue']


It was a bit painful typing that by hand, so here's a quicker way of making that list. We can multiply lists to copy them a certain number of times, and we can add lists to stick the lists together. Pretty logical, I think.

#### Now `bag` should have 10 of each color

In [19]:
bag=['yellow']*10+['blue']*10
print(bag)

['yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'yellow', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue']


Here's how we can take a list and count the number of repeated elements:

In [20]:
bag.count('blue')

10

In [21]:
bag.count('yellow')

10

And here's how we grab the entry at index 0 (the "first" entry):

In [22]:
bag[0]

'yellow'

and the element at index 1: 
#### (the, erm... "second" entry?? These are bad words for this purpose):

In [23]:
bag[1]

'yellow'

and so on:

In [24]:
bag[3]

'yellow'

In [25]:
bag[9]

'yellow'

In [26]:
bag[10]

'blue'

In [27]:
bag[19]

'blue'

In [28]:
bag[20]

IndexError: list index out of range

Uh-oh, why did that happen? Please explain why `bag[20]` gives `list index out of range`...

[Because indexing starts at 0, the 20 marbles sit at indexes 0 through 19, so there is nothing at index 20.]
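Here's a small sketch of which indexes are valid for a 20-element list. The name `marbles` is just a stand-in I made up for this example so it doesn't disturb `bag`:

```python
# A 20-element list built the same way as the bag above
marbles = ['yellow'] * 10 + ['blue'] * 10

print(len(marbles))                # how many elements the list holds
print(marbles[0])                  # the first valid index is 0
print(marbles[len(marbles) - 1])   # the last valid index is len(marbles) - 1

# Asking for index 20 raises an IndexError, which we can catch and print
try:
    marbles[20]
except IndexError as err:
    print('IndexError:', err)
```

The last valid index is always one less than `len( )` of the list.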

Now let's use the `random` module. `random.shuffle( )` takes a list and rearranges it in place so that the elements end up in a random order:

In [29]:
random.shuffle(bag)
print(bag)

['yellow', 'blue', 'yellow', 'blue', 'yellow', 'yellow', 'yellow', 'yellow', 'blue', 'yellow', 'yellow', 'blue', 'yellow', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'yellow']


As you repeat the above code cell, the following cell will print out whatever is at the beginning of `bag` after it is randomly shuffled. What is the probability that the following cell prints `blue`?

[0.5, because 10 of the 20 marbles are blue]

In [30]:
print(bag[0])

yellow


The following code is similar to an experiment where your friend shakes up a bag of marbles (10 yellow and 10 blue), asks you to draw one and then record the color 20 different times, making sure you put the marble back in the bag each time:

In [31]:
for i in range(20):
    random.shuffle(bag) #these two lines happen 20 times
    print(bag[0])       #

yellow
yellow
blue
yellow
yellow
blue
blue
blue
blue
blue
blue
blue
yellow
blue
yellow
yellow
blue
yellow
blue
yellow


What is the ratio of blue marbles drawn to total marbles drawn? Should this always be equal to $\frac{1}{2}$? If you repeat the experiment, is the ratio the same? Explain:

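[There were 11 blue draws and 9 yellow draws, so the ratio of blue to total is 11/20 = 0.55. It won't always equal 1/2, and the ratio changes when the experiment is repeated because the draws are random.]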
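As a check on the counting, here's a short sketch that copies the 20 draws printed above into a list by hand and lets python work out the ratio. The names `draws` and `blue_ratio` are my own choices for this example:

```python
# The 20 draws printed above, copied by hand in the same order
draws = ['yellow', 'yellow', 'blue', 'yellow', 'yellow',
         'blue', 'blue', 'blue', 'blue', 'blue',
         'blue', 'blue', 'yellow', 'blue', 'yellow',
         'yellow', 'blue', 'yellow', 'blue', 'yellow']

# ratio of blue marbles drawn to total marbles drawn
blue_ratio = draws.count('blue') / len(draws)
print(draws.count('blue'), 'blue out of', len(draws))
print(blue_ratio)
```

Dividing by `len(draws)` instead of typing 20 means the same line still works if the number of draws changes.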

## Getting more efficient

Counting everything by hand is good to help guide your understanding, but it's also a little tedious. In this next section, we'll get python to do all the dirty work for us. 

Let's imagine an empty bag called `empty`. We'll assign it the empty list `[]`. We can then "fill" it with our data by using the `append( )` method, as demonstrated by the following three cells:

In [32]:
empty=[]
empty

[]

In [33]:
empty.append('green')
empty

['green']

In [34]:
empty.append('red')
empty

['green', 'red']

#### (`append` always puts its input at the end of the list it is called with.)

So, with this in mind, let's repeat our experiment, and collect the data about our random draws in `empty` (no, it won't be empty after the for loop!) 

In [35]:
empty=[]
for i in range(20):
    random.shuffle(bag) 
    empty.append(bag[0])
print(empty) #this happens after everything indented above finishes

['yellow', 'blue', 'blue', 'yellow', 'blue', 'blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow', 'blue', 'blue', 'yellow', 'yellow', 'blue', 'yellow', 'blue', 'yellow', 'yellow']


Now, let's count the yellows and blues and store these in `results`

In [36]:
results=[empty.count('yellow'), empty.count('blue')]
results

[10, 10]

And we can calculate the observed ratios like so

In [37]:
results[0]/20 #number of yellows to total trials

0.5

In [38]:
results[1]/20 #number of blues to total trials

0.5

The great thing about python doing this work is that now we can ask it to do experiments that would take humans a very long time to do.
#### This one is a similar experiment where 1000 draws are recorded before the ratios are calculated 

In [39]:
empty=[]
for i in range(1000):
    random.shuffle(bag)
    empty.append(bag[0])
results=[empty.count('yellow'), empty.count('blue')]

In [40]:
results[0]/1000

0.493

In [41]:
results[1]/1000

0.507

In [42]:
empty #thanks python, for not making me count this up myself!

['blue',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'blue',
 'blue',
 'blue',
 'yellow',
 'yellow',
 'blue',
 'yellow',
 'yellow',
 'yellow',
 'yellow',
 'yellow',
 'yellow',
 'yellow',
 'yellow',
 'blue',
 'yellow',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'blue',
 'yellow',
 'blue',
 'blue',
 'blue',
 'blue',
 'yellow',
 'blue',
 'blue',
 'blue',
 'blue',
 'blue',
 'blue',
 'yellow',
 'yellow',
 'yellow',
 'blue',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'blue',
 'blue',
 'yellow',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'blue',
 'yellow',
 'blue',
 'blue',
 'blue',
 'blue',
 'yellow',
 'yellow',
 'yellow',
 'blue',
 'blue',
 'blue',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'yellow',
 'blue',
 'blue',
 'blue',
 'yellow',
 'blue',
 'blue',
 'blue',
 'blue',
 'blue',
 'blue',
 'yellow',
 'yello

We've been studying the experimentally observed probability of drawing from a bag of marbles. These types of observations are often called "statistics." The observed outcomes of an experiment will often closely resemble the mathematical probability of that outcome, but thanks to randomness, we won't always get exactly $\frac{10}{20}$ draws that are blue, or even $\frac{500}{1000}$ draws that are blue. 
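To see how the observed ratio behaves as the number of draws grows, here's a rough sketch. The function name `observed_blue_ratio` and the trial counts are my own picks for illustration, and the printed numbers will differ every time you run it:

```python
import random

def observed_blue_ratio(trials):
    """Shuffle a bag of 10 yellow and 10 blue marbles `trials` times and
    return the fraction of shuffles that leave a blue marble at index 0."""
    marbles = ['yellow'] * 10 + ['blue'] * 10
    blues = 0
    for _ in range(trials):
        random.shuffle(marbles)
        if marbles[0] == 'blue':
            blues += 1
    return blues / trials

# try a small, a medium and a large experiment
for trials in [20, 1000, 20000]:
    print(trials, observed_blue_ratio(trials))
```

With more draws the ratio usually lands closer to 0.5, though any single run can still wander a bit.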

### First quick project:
To wrap this section up, write some code below that will fill a bag with 4 green marbles, 5 red marbles, and 16 pink marbles. Your code should calculate the observed probabilities of drawing a green, a red, and a pink marble.

Begin by calculating the simple probabilities of drawing a green, a red, and a pink marble:

In [43]:
4/25 # ratio of green to total

0.16

In [44]:
5/25 # ratio of red to total

0.2

In [45]:
16/25 # ratio of pink to total

0.64

Next, assign `bag` a list containing 4, 5, and 16 strings that are `'green'`, `'red'`, and `'pink'` respectively. 

In [46]:
bag = ['green']*4+['red']*5+['pink']*16

Create an `empty` list for keeping track of your draws, then run your experiment by shuffling the bag, drawing one of the elements, and appending it to empty. (do refer to my code above, but try writing it by yourself before copying)

In [47]:
empty=[]
#next, use a for loop to repeat the shuffling, drawing and appending over and over
for i in range(110):
    random.shuffle(bag)   #shuffle the bag
    empty.append(bag[0])  #draw the marble at index 0 and record it

Finally, calculate the number of 'red', 'green', and 'pink' draws divided by the number of trials to find your experimental probabilities.

In [48]:
empty.count('green')/len(empty)
#hint: use empty.count('red') to find how many 'red' elements appear in empty

Do your experimental probabilities look close to the simple probabilities you calculated at the beginning? When you repeat the experiment, does it seem to be close each time?

[your response here]
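If you'd like something to compare your own attempt against, here's one possible sketch of the whole project. It's only one way to do it: the names `marbles`, `draws` and `trials` are my own, and 1000 trials is an arbitrary choice.

```python
import random

# 4 green, 5 red and 16 pink marbles (25 in total)
marbles = ['green'] * 4 + ['red'] * 5 + ['pink'] * 16
trials = 1000  # arbitrary; any reasonably large number works

# shuffle, draw the marble at index 0, and record it, over and over
draws = []
for _ in range(trials):
    random.shuffle(marbles)
    draws.append(marbles[0])

# compare the simple probability with the observed ratio for each color
for color in ['green', 'red', 'pink']:
    simple = marbles.count(color) / len(marbles)
    observed = draws.count(color) / trials
    print(color, 'simple:', simple, 'observed:', observed)
```

The observed values change on every run, but they should hover near 0.16, 0.2 and 0.64.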

Let's move onto a different situation and study the multiplication rule of counting.

## Counting

You may think, "Mr. P, I've already learned how to count!" And I would likely believe you. But, there are situations in the study of probability where counting the number of unique possible outcomes is actually pretty challenging, and certainly not obvious.

Let's pretend you have 4 different choices to make every day you get dressed for work. You'll choose one of each of the following:
- 3 hats
- 5 tops
- 4 bottoms
- 3 pairs of shoes

Now, let's get creative.

In [49]:
hats = ['cool hat', 'straw hat', 'Seattle Sonics hat']
tops = ['blue t-shirt', 'red sweater', 'button down tee', 'XXXXL yellow tee', 'leather tee']
bottoms = ['jeans', 'shorts', 'PJs', 'kilt']
shoes = ['tennis shoes', 'clogs', 'fancy clogs']

The `random` module has another useful method: `choice( )`

It picks a random element from the list you give as its input, like so:

In [50]:
random.choice(tops)

'button down tee'

Each time you run the above code, it will output a random pick from the list `tops`.

Next, mainly for our enjoyment, let's pick a random outfit:

In [51]:
random.choice(hats)+random.choice(tops)+random.choice(bottoms)+random.choice(shoes)

'straw hatblue t-shirtjeansfancy clogs'

Here is a way that's a bit more organized:

In [52]:
hat=random.choice(hats)
top=random.choice(tops)
bottom=random.choice(bottoms)
shoe=random.choice(shoes)

outfit=[hat,top,bottom,shoe] #collect the four picks in one list
print(outfit)

['cool hat', 'leather tee', 'jeans', 'clogs']


How could you print out all the elements of `tops`, one by one?

In [53]:
n = tops
print(n)# your code here

['blue t-shirt', 'red sweater', 'button down tee', 'XXXXL yellow tee', 'leather tee']


In [54]:
"""Try it by yourself

.
.
.
.

Before you look at my answer below

.
.
.
.
.
.

Just see if you can, then copy if you're not sure!


.
.
.
.
.

Ok here's my answer!"""

"Try it by yourself\n\n.\n.\n.\n.\n\nBefore you look at my answer below\n\n.\n.\n.\n.\n.\n.\n\nJust see if you can, then copy if you're not sure!\n\n\n.\n.\n.\n.\n.\n\nOk here's my answer!"

In [55]:
for t in tops:
    print(t)

blue t-shirt
red sweater
button down tee
XXXXL yellow tee
leather tee


Great! Let's pose the question we're all dying to figure out: 

## How many different outfits are possible given your current wardrobe?

It's not an obvious question, but we'll soon learn that it is a mathematically simple one to answer.

But first, let's see if we can get python to print out every unique outfit we could choose.

Let's use a similar approach as in our last experiment: start with an empty list of `outfits`, then append all the choices of hat, top, bottom and shoes one after the other.

In [56]:
outfits=[]
for hat in hats:

    for top in tops:

        for bottom in bottoms:

            for s in shoes:
                outfits.append([hat, top, bottom, s])

In [57]:
outfits # you can learn a lot about for loops by trying to find the pattern in this list!

[['cool hat', 'blue t-shirt', 'jeans', 'tennis shoes'],
 ['cool hat', 'blue t-shirt', 'jeans', 'clogs'],
 ['cool hat', 'blue t-shirt', 'jeans', 'fancy clogs'],
 ['cool hat', 'blue t-shirt', 'shorts', 'tennis shoes'],
 ['cool hat', 'blue t-shirt', 'shorts', 'clogs'],
 ['cool hat', 'blue t-shirt', 'shorts', 'fancy clogs'],
 ['cool hat', 'blue t-shirt', 'PJs', 'tennis shoes'],
 ['cool hat', 'blue t-shirt', 'PJs', 'clogs'],
 ['cool hat', 'blue t-shirt', 'PJs', 'fancy clogs'],
 ['cool hat', 'blue t-shirt', 'kilt', 'tennis shoes'],
 ['cool hat', 'blue t-shirt', 'kilt', 'clogs'],
 ['cool hat', 'blue t-shirt', 'kilt', 'fancy clogs'],
 ['cool hat', 'red sweater', 'jeans', 'tennis shoes'],
 ['cool hat', 'red sweater', 'jeans', 'clogs'],
 ['cool hat', 'red sweater', 'jeans', 'fancy clogs'],
 ['cool hat', 'red sweater', 'shorts', 'tennis shoes'],
 ['cool hat', 'red sweater', 'shorts', 'clogs'],
 ['cool hat', 'red sweater', 'shorts', 'fancy clogs'],
 ['cool hat', 'red sweater', 'PJs', 'tennis shoes

Now, finding the length of our list `outfits` is easy using the common python function, `len( )`:

In [58]:
len(outfits)

180

Perhaps you have already figured out the quick way of finding the answer to the question of how many unique outfits we have, but if not, here's a hint:

In [59]:
3*5*4*3

180

So, please explain, what do you think the "rule of product" means for counting? (see https://en.wikipedia.org/wiki/Rule_of_product for a way too complicated answer)

[The rule of product is a way of counting: when you make several choices one after another, the total number of possible outcomes is the number of options for each choice multiplied together.]

Now, wasn't that satisfying? 180 outfits, count 'em! And yes, I could have just told you to multiply the number of hats by the number of tops by the number of bottoms by the number of pairs of shoes, but would you have believed me? I think it's much more convincing to let python **show** you every single unique outfit. Plus, now I have a great new way to get dressed in the morning!

In [60]:
random.choice(outfits)

['cool hat', 'XXXXL yellow tee', 'jeans', 'tennis shoes']
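As a side note on the rule of product, python's standard library has a shortcut for the four nested for loops: `itertools.product`. This is a swapped-in technique, not what we used above, and `all_outfits` is just a name I chose for the example:

```python
from itertools import product

hats = ['cool hat', 'straw hat', 'Seattle Sonics hat']
tops = ['blue t-shirt', 'red sweater', 'button down tee', 'XXXXL yellow tee', 'leather tee']
bottoms = ['jeans', 'shorts', 'PJs', 'kilt']
shoes = ['tennis shoes', 'clogs', 'fancy clogs']

# product() pairs every hat with every top, bottom and pair of shoes
all_outfits = list(product(hats, tops, bottoms, shoes))

print(len(all_outfits))                                     # counted one by one
print(len(hats) * len(tops) * len(bottoms) * len(shoes))    # rule of product
```

Both lines print the same number, which is the rule of product in action. Each outfit comes back as a tuple rather than a list, but it holds the same four items.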

## Card games

The last situation I want to explore in our Randomness assignment involves a deck of cards. If you're not familiar with what a deck of cards consists of, please quickly skim https://boardgamegeek.com/wiki/page/standard_deck_playing_card_games

Let's think about how python could generate a deck of cards.

In [61]:
suits = ['♤', '♧', '♡', '♢']
cards = ['A']+list(range(2,11))+['J','Q','K'] # range(2,11) stops before 11, so it gives 2 through 10

In [62]:
deck=[]
for suit in suits:
    for card in cards:
        deck.append(str(card) + suit) #without str(card), sometimes we'd be adding an int to a str!

In [63]:
print(deck)

['A♤', '2♤', '3♤', '4♤', '5♤', '6♤', '7♤', '8♤', '9♤', '10♤', 'J♤', 'Q♤', 'K♤', 'A♧', '2♧', '3♧', '4♧', '5♧', '6♧', '7♧', '8♧', '9♧', '10♧', 'J♧', 'Q♧', 'K♧', 'A♡', '2♡', '3♡', '4♡', '5♡', '6♡', '7♡', '8♡', '9♡', '10♡', 'J♡', 'Q♡', 'K♡', 'A♢', '2♢', '3♢', '4♢', '5♢', '6♢', '7♢', '8♢', '9♢', '10♢', 'J♢', 'Q♢', 'K♢']


In [64]:
len(deck) #hopefully we have 52 cards!

52

In fact, let's capture all that code as a nice buttoned-up function, `make_deck()`.

In [65]:
def make_deck(): #this function has no input! That's ok, we don't need any input...
    suits = ['♤', '♧', '♡', '♢']
    cards = ['A']+list(range(2,11))+['J','Q','K']
    deck=[]
    for suit in suits:
        for card in cards:
            deck.append(str(card) + suit) #without str(card), sometimes we'd be adding an int to a str!
    return deck #but it does have output!

Now we can reset our deck by calling `make_deck()` and assigning its output to a variable.

In [66]:
deck=make_deck()
deck

['A♤',
 '2♤',
 '3♤',
 '4♤',
 '5♤',
 '6♤',
 '7♤',
 '8♤',
 '9♤',
 '10♤',
 'J♤',
 'Q♤',
 'K♤',
 'A♧',
 '2♧',
 '3♧',
 '4♧',
 '5♧',
 '6♧',
 '7♧',
 '8♧',
 '9♧',
 '10♧',
 'J♧',
 'Q♧',
 'K♧',
 'A♡',
 '2♡',
 '3♡',
 '4♡',
 '5♡',
 '6♡',
 '7♡',
 '8♡',
 '9♡',
 '10♡',
 'J♡',
 'Q♡',
 'K♡',
 'A♢',
 '2♢',
 '3♢',
 '4♢',
 '5♢',
 '6♢',
 '7♢',
 '8♢',
 '9♢',
 '10♢',
 'J♢',
 'Q♢',
 'K♢']

Ok, great! Now that `deck` holds a full deck of cards, we can proceed to play around with some playing card probabilities!

How could we draw a card at random?

In [67]:
random.shuffle(deck)# your code here
print(deck[0])

J♤


How could we shuffle the deck? (randomly change the position of every card in the deck)

In [68]:
random.shuffle(deck)
print(deck)# your code here

['6♢', 'A♢', '7♡', '6♤', 'K♢', '9♡', '6♧', '2♤', '4♡', '7♢', '5♡', '6♡', 'A♤', '9♢', 'A♧', '9♧', 'Q♧', 'J♢', '3♧', 'J♧', 'Q♤', '8♤', '4♤', 'K♡', '10♢', '5♢', 'A♡', 'Q♡', '4♧', '9♤', '5♤', '3♢', '8♡', 'K♧', 'J♤', '10♤', '5♧', 'J♡', '7♤', '7♧', '3♡', 'K♤', '2♢', '8♧', 'Q♢', '8♢', '10♡', '4♢', '3♤', '2♡', '10♧', '2♧']


The `pop()` method removes an element from a list and hands it back to you. It takes the index of the element you wish to remove and returns that element: 

In [69]:
card=deck.pop(0) #draw the card at index 0 and remove it
card

'6♢'

This is a lot like drawing the first card from the pile.

In [70]:
deck

['A♢',
 '7♡',
 '6♤',
 'K♢',
 '9♡',
 '6♧',
 '2♤',
 '4♡',
 '7♢',
 '5♡',
 '6♡',
 'A♤',
 '9♢',
 'A♧',
 '9♧',
 'Q♧',
 'J♢',
 '3♧',
 'J♧',
 'Q♤',
 '8♤',
 '4♤',
 'K♡',
 '10♢',
 '5♢',
 'A♡',
 'Q♡',
 '4♧',
 '9♤',
 '5♤',
 '3♢',
 '8♡',
 'K♧',
 'J♤',
 '10♤',
 '5♧',
 'J♡',
 '7♤',
 '7♧',
 '3♡',
 'K♤',
 '2♢',
 '8♧',
 'Q♢',
 '8♢',
 '10♡',
 '4♢',
 '3♤',
 '2♡',
 '10♧',
 '2♧']

Since we're going to be drawing a lot of cards, let's make this action into a nice function:

In [71]:
def draw_top(d):
    return d.pop(0)

In [72]:
draw_top(deck)

'A♢'

Eventually though, we will want to be able to draw a random card and then remove that card from the deck. We can do that with the `remove( )` method:

In [73]:
draw = random.choice(deck)
deck.remove(draw) #draw holds our card
print(draw)

6♤


In [74]:
deck #now the deck doesn't have that card either

['7♡',
 'K♢',
 '9♡',
 '6♧',
 '2♤',
 '4♡',
 '7♢',
 '5♡',
 '6♡',
 'A♤',
 '9♢',
 'A♧',
 '9♧',
 'Q♧',
 'J♢',
 '3♧',
 'J♧',
 'Q♤',
 '8♤',
 '4♤',
 'K♡',
 '10♢',
 '5♢',
 'A♡',
 'Q♡',
 '4♧',
 '9♤',
 '5♤',
 '3♢',
 '8♡',
 'K♧',
 'J♤',
 '10♤',
 '5♧',
 'J♡',
 '7♤',
 '7♧',
 '3♡',
 'K♤',
 '2♢',
 '8♧',
 'Q♢',
 '8♢',
 '10♡',
 '4♢',
 '3♤',
 '2♡',
 '10♧',
 '2♧']

It would probably be convenient to make a function that does this for us, since we'll be doing it a lot. Give it a try! Write a function `draw_card( )` that takes a list of cards as an input, chooses one at random, removes it from the list and returns it!

In [75]:
def draw_card(d):
    draw = random.choice(d)
    d.remove(draw)
    return draw# give it a try, scroll down to see my solution 
    #I did not know how to do this one

In [76]:
"""Nothing to see here folks
.
.
.
.
.
.
This is a placeholder so you can't see my answer...
.
.
.
.
.
My answer is below, but try it above first!
.
.
.
.
.
Really, give it a try!
.
.
.
.
.
.
.
.
.
.
.
.
Remember, "mistakes are proof that you are trying!"""

'Nothing to see here folks\n.\n.\n.\n.\n.\n.\nThis is a placeholder so you can\'t see my answer...\n.\n.\n.\n.\n.\nMy answer is below, but try it above first!\n.\n.\n.\n.\n.\nReally, give it a try!\n.\n.\n.\n.\n.\n.\n.\n.\n.\n.\n.\n.\nRemember, "mistakes are proof that you are trying!'

In [77]:
def draw_card(d):
    draw=random.choice(d)
    d.remove(draw)
    return draw

In [78]:
draw_card(deck)

'9♢'

In [79]:
len(deck)

48
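By the way, there's more than one way to write this function. Here's an alternative sketch that picks a random index and uses `pop( )` instead of `choice( )` and `remove( )`. The names `draw_card_by_index` and `small_deck` are made up for this example:

```python
import random

def draw_card_by_index(d):
    """Remove and return a randomly chosen card from the list d."""
    position = random.randrange(len(d))  # a random valid index, 0 up to len(d)-1
    return d.pop(position)               # pop removes the card and returns it

# try it on a tiny deck so the effect is easy to see
small_deck = ['A♤', 'K♡', '7♢', '2♧']
card = draw_card_by_index(small_deck)
print(card)
print(small_deck)  # one card shorter now
```

Either version shrinks the list by one card each time it is called.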

How'd you do? Ok, well we're pretty well set up to do some random card experiments now. Let's start by resetting our deck of cards:

In [80]:
deck=make_deck()

Let's draw a random card and check if it's an ace. Keep running the cell if it's not an ace!

In [86]:
card=draw_card(deck)
if card[0]=='A':
    print("Wow, it's an Ace! Unbelievable!!")

How many times did you have to run the code cell until you got an ace?

Remember, `draw_card()` is removing cards every time you run it! How many cards are left?

In [82]:
len(deck)

51

What is the simple probability of drawing an ace?


In [83]:
4/52# your calculation here

0.07692307692307693

### Last experiments

Now, write some code to set up an experiment: shuffle the deck and pick the top card. Do this 100 times. Calculate the experimental probability of drawing an ace!
#### (hint, just use `deck[0]`, not our `draw_card()` function, since we want to keep the card in the deck!)

In [84]:
deck=make_deck() #reset to a full 52-card deck first, since draw_card() removed some cards
draws=[] # start by defining an empty holder for the draws
# next, use a for loop to repeat the shuffle/draw process 100 times
for i in range(100):
    random.shuffle(deck)   #inside the for loop, shuffle the deck
    draws.append(deck[0])  #then pick the top card, deck[0], and append it onto the list

#finally, count the number of A cards
aces=0
for card in draws:
    if card[0]=='A':
        aces+=1

#and divide that by how many total draws (100)
aces/100


And here's my final challenge to you! 

#### (And it's pretty dang challenging, I'd say)

Write some code that:
1. Resets the deck
2. Draws 5 cards at random, and puts them into a "hand" (just a list, in this case)
3. Records this hand of cards
4. Repeats items 1, 2 and 3 100 times.
5. Calculates the experimental probability of drawing at least one ace when you draw 5 cards at random 

In [119]:
deck=make_deck()
for draw in range(6):
    print(draw)
#draw_card(deck)*5# your code here

0
1
2
3
4
5
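If you get stuck, here's one possible sketch of the five steps. It's only one approach: I've used `random.sample( )` to pull 5 different cards in a single step (a swap for calling `draw_card()` five times), repeated `make_deck()` so the cell runs by itself, and made up the names `hands`, `trials` and `hands_with_ace`. The last two lines compare against the calculated probability, which assumes `math.comb` is available (Python 3.8 or later).

```python
import random
from math import comb  # comb(n, k) counts the ways to choose k items from n

def make_deck():
    suits = ['♤', '♧', '♡', '♢']
    cards = ['A'] + list(range(2, 11)) + ['J', 'Q', 'K']
    return [str(card) + suit for suit in suits for card in cards]

trials = 100
hands = []
for _ in range(trials):              # 4. repeat 100 times
    deck = make_deck()               # 1. reset the deck
    hand = random.sample(deck, 5)    # 2. draw 5 different cards at random
    hands.append(hand)               # 3. record the hand

# 5. count the hands that hold at least one ace
hands_with_ace = 0
for hand in hands:
    if any(card[0] == 'A' for card in hand):
        hands_with_ace += 1
print(hands_with_ace / trials)

# for comparison: 1 minus the chance that all 5 cards are non-aces
exact = 1 - comb(48, 5) / comb(52, 5)
print(exact)
```

The experimental value jumps around from run to run, while the calculated one stays put at roughly 0.34.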


Ok, folks, I hope you found this primer on probability and statistics with python an interesting challenge! I am excited to see what you all come up with. Please reach out with any and all questions, or just to get a hint!


Written by John Platter, 2020