### Random Samples

I just want to show you a variant of `random.choices`, which we saw in the previous video.

`choices` chooses `k` random elements from some sequence, **with replacement**.

This means we could create a random selection containing more elements than we started off with:

In [1]:
import random

In [2]:
random.choices(list('abc'), k=10)

['a', 'a', 'c', 'a', 'a', 'a', 'c', 'b', 'a', 'a']

Sometimes, however, we do not want that replacement. Instead we want a sample of the population, so once an element has been randomly selected, it cannot be selected again.

This is where the `sample` function comes in - it does exactly that. Of course, we can no longer pick more elements than we have in our population. Also, picking a sample equal in size to the population basically returns a "shuffled" copy of the population.

In [3]:
l = range(20)

In [4]:
random.sample(l, k=10)

[2, 11, 4, 9, 12, 5, 15, 8, 1, 3]

We can even set the sample size equal to the population size:

In [5]:
random.sample(l, k=20)

[16, 19, 0, 7, 17, 11, 4, 8, 6, 15, 1, 3, 13, 10, 9, 2, 18, 12, 14, 5]
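As a rough sketch of the "shuffled population" idea, the cell below compares a full-size sample with `random.shuffle`. The names `population`, `shuffled_copy` and `in_place` are just illustrative and are not part of the original notebook.

```python
import random

population = list(range(20))

# A full-size sample is a new list that holds every element exactly once.
shuffled_copy = random.sample(population, k=len(population))

# random.sample leaves the original sequence untouched...
print(population == list(range(20)))

# ...whereas random.shuffle reorders a mutable sequence in place
# and returns None, so we shuffle a copy here.
in_place = population.copy()
random.shuffle(in_place)

# Both results contain the same elements as the population.
print(sorted(shuffled_copy) == sorted(in_place) == population)
```

One practical difference is that `sample` also works on immutable sequences such as a `range` or a tuple, while `shuffle` needs something it can mutate.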

However, the sample size cannot be larger than the population size:

In [6]:
random.sample(l, 50)

ValueError: Sample larger than population or is negative
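If the requested size might exceed the population, one option is to catch the exception or cap `k` first. The helper `capped_sample` below is a hypothetical name used only for illustration, and this is a sketch rather than a recommended pattern for every case.

```python
import random

population = range(20)
error = None

# Option 1: let sample raise, and handle the ValueError.
try:
    random.sample(population, k=50)
except ValueError as exc:
    error = exc
    print(f"Cannot sample: {exc}")


# Option 2 (hypothetical helper): never ask for more than is available.
def capped_sample(population, k):
    """Return a sample of at most len(population) elements."""
    return random.sample(population, k=min(k, len(population)))


print(len(capped_sample(population, 50)))
```

Note that capping only guards against an oversized `k`. A negative `k` would still raise a `ValueError`.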

It is also worth pointing out that if you set a specific seed, your sample selection becomes repeatable:

In [7]:
random.seed(0)
random.sample(l, k=5)

[12, 13, 1, 8, 15]

In [8]:
random.seed(0)
random.sample(l, k=5)

[12, 13, 1, 8, 15]
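Calling `random.seed` resets the module's shared generator. When repeatability is needed without touching that global state, a separate `random.Random` instance can be seeded instead. This is a minimal sketch, and the names `rng_a` and `rng_b` are illustrative only.

```python
import random

population = range(20)

# Two independent generators, both seeded with the same value.
rng_a = random.Random(0)
rng_b = random.Random(0)

first = rng_a.sample(population, k=5)
second = rng_b.sample(population, k=5)

# Same seed and same sequence of calls, so the samples match.
print(first == second)
```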

Let's see how we might use this to select some cards from a deck. Obviously we don't want replacement here - once a card has been picked from a deck, it's no longer available for a second random pick.

In [9]:
suits = 'C', 'D', 'H', 'A'
ranks = tuple(range(2,11)) + tuple('JQKA')

In [10]:
suits

('C', 'D', 'H', 'A')

In [11]:
ranks

(2, 3, 4, 5, 6, 7, 8, 9, 10, 'J', 'Q', 'K', 'A')

Now we have to combine suits and ranks to form a deck.

In [12]:
deck = [str(rank) + suit for suit in suits for rank in ranks]

In [13]:
print(deck)

['2C', '3C', '4C', '5C', '6C', '7C', '8C', '9C', '10C', 'JC', 'QC', 'KC', 'AC', '2D', '3D', '4D', '5D', '6D', '7D', '8D', '9D', '10D', 'JD', 'QD', 'KD', 'AD', '2H', '3H', '4H', '5H', '6H', '7H', '8H', '9H', '10H', 'JH', 'QH', 'KH', 'AH', '2A', '3A', '4A', '5A', '6A', '7A', '8A', '9A', '10A', 'JA', 'QA', 'KA', 'AA']
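As a quick sanity check on the deck, we can confirm it has one card per suit and rank combination with no duplicates. The `itertools.product` version is an alternative way to write the same pairing, not the approach used in the cells above, and `deck_alt` is an illustrative name.

```python
from itertools import product

suits = 'C', 'D', 'H', 'A'
ranks = tuple(range(2, 11)) + tuple('JQKA')

# Same comprehension as in the notebook cell above.
deck = [str(rank) + suit for suit in suits for rank in ranks]

# Alternative: product(suits, ranks) varies ranks fastest,
# which gives the same ordering as the nested comprehension.
deck_alt = [f'{rank}{suit}' for suit, rank in product(suits, ranks)]

print(len(deck))                   # number of cards
print(len(set(deck)) == len(deck)) # no duplicate cards
print(deck == deck_alt)            # both constructions agree
```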


Let's import `Counter` from the `collections` module to check that we get no repetition when we pull a sample, as opposed to when we use `choices`.

In [14]:
from collections import Counter

In [15]:
Counter(random.sample(deck, k=20))

Counter({'AD': 1,
         'KA': 1,
         '8D': 1,
         '6H': 1,
         'JD': 1,
         'KH': 1,
         '2D': 1,
         '8H': 1,
         '10C': 1,
         '7D': 1,
         '6A': 1,
         '8C': 1,
         '2A': 1,
         '5D': 1,
         '10H': 1,
         'JC': 1,
         'QA': 1,
         '3A': 1,
         '6C': 1,
         '10D': 1})

But if we use `choices`, we will most likely get some repetitions:

In [16]:
Counter(random.choices(deck, k=20))

Counter({'KD': 2,
         '2C': 2,
         '7C': 1,
         'JD': 1,
         '7H': 1,
         '10A': 1,
         'KA': 1,
         '7A': 1,
         '2D': 1,
         '4A': 1,
         '4H': 1,
         'KH': 1,
         '9D': 1,
         '5A': 1,
         '10H': 1,
         'AD': 1,
         '8A': 1,
         'AC': 1})
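To put a number on "most likely", here is a rough sketch that computes the probability of drawing 20 distinct cards from 52 with replacement, and then checks it empirically. It assumes every card is equally likely on each draw. The names `deck_size`, `p_all_distinct`, `trials` and `distinct_runs` are illustrative.

```python
import math
import random

deck_size = 52
k = 20

# The i-th draw (0-based) must avoid the i cards already seen:
# P = (52/52) * (51/52) * ... * (33/52)
p_all_distinct = math.prod((deck_size - i) / deck_size for i in range(k))
print(f'P(no repeats in {k} draws): {p_all_distinct:.4f}')

# Empirical check: count how often choices returns k distinct cards.
deck = list(range(deck_size))
trials = 10_000
distinct_runs = sum(
    len(set(random.choices(deck, k=k))) == k for _ in range(trials)
)
print(f'Observed fraction with no repeats: {distinct_runs / trials:.4f}')
```

The computed probability is small, so a run of `choices` with no repeated card is the exception rather than the rule.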