### Random Seeds

The random module provides a variety of functions related to (pseudo) random numbers.

The problem with using random numbers in your code is that it can be difficult to debug, because the random number sequence is not the same from run to run of your program. If your code fails somewhere in the middle of a run, it is difficult to make the problem repeatable. Debugging intermittent, non-repeatable failures is one of the worst things to do!

Fortunately, when using the random module, we can set the seed for the underlying random number generator.

Random numbers are not truly random - they are generated in such a way that the numbers appear random and evenly distributed, but in fact they are being generated using a specific algorithm.

That algorithm depends on a seed value. That seed value will determine the exact sequence of randomly generated numbers (so as you can see, it's not truly random). Setting different seeds will result in different random sequences, but setting the seed to the same value will result in the same sequence being generated.

By default, the seed comes from a randomness source provided by the operating system (or from the system time if none is available), hence every time you run your program a different seed is set. But we can easily set the seed to something specific - very useful for debugging purposes.

In [3]:
import random

for _ in range(10):
    print(random.randint(10,50), random.random())

28 0.28485030707117365
16 0.9641745358546493
34 0.2795019853989167
21 0.046005455867326184
29 0.7471879797760834
29 0.857150870460005
30 0.331791006790802
39 0.7663729603587679
28 0.48473930849856894
35 0.3025960256614897


In [4]:
random.seed(0)
for i in range(10):
    print(random.randint(10, 20), random.random())

16 0.7579544029403025
16 0.04048437818077755
18 0.48592769656281265
14 0.9677999949201714
15 0.5833820394550312
13 0.5046868558173903
14 0.1397457849666789
11 0.6183689966753316
14 0.9872592010330129
18 0.9827854760376531
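To make the reproducibility claim concrete, here is a minimal sketch that reseeds the generator and compares the resulting sequences. The helper name `draw` is just an illustration, not something from the random module.

```python
import random

def draw(seed, n=5):
    """Seed the module-level generator, then draw n floats."""
    random.seed(seed)
    return [random.random() for _ in range(n)]

first = draw(0)
second = draw(0)   # same seed again
other = draw(1)    # a different seed

print(first == second)  # True: same seed, same sequence
print(first == other)   # False: different seed, different sequence
```

Note that the seed resets the whole sequence, so the order of the calls that follow it matters: the same calls in the same order after the same seed give the same results.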


### Random Choices
How would you pick a random element from a list?

You might be tempted to use the random module to pick a random index (integer) and use that index to retrieve the element from the list (or, more generally, sequence).


In [5]:
import random
random.seed(0) # set a seed so we always generate the same random sequences:
l = [10, 20, 30, 40, 50, 60]
random_index = random.randrange(len(l))
l[random_index]

40

In [6]:
random.seed(0)
for i in range(10):
    print(l[random.randrange(len(l))])

40
40
10
30
50
40
40
30
40
30
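The index-based approach above works, but the random module also offers `choice`, which does the index lookup for us. A minimal sketch, reusing the same list:

```python
import random

l = [10, 20, 30, 40, 50, 60]

random.seed(0)
picked = random.choice(l)  # one element, chosen uniformly at random
print(picked)
```

`choice` works on any non-empty sequence (something that supports `len` and indexing), not just lists.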


The random module also has a choices function, which allows us to make multiple random choices (as opposed to choice, which only picks one).

The thing is, choices has a few more advanced features built in.

In [9]:
list_1 = list(range(1000))
random.choices(list_1, k=5)

[902, 310, 729, 898, 683]

In [11]:
list_2 = ['a', 'b', 'c']
random.seed(0)
for _ in range(10):
    print(random.choices(list_2, k=2))

['c', 'c']
['b', 'a']
['b', 'b']
['c', 'a']
['b', 'b']
['c', 'b']
['a', 'c']
['b', 'a']
['c', 'c']
['c', 'c']


In [12]:
for _ in range(10):
    print(random.choices(list_2, k=5))

['a', 'c', 'c', 'c', 'b']
['a', 'b', 'b', 'c', 'c']
['b', 'c', 'a', 'c', 'b']
['a', 'c', 'b', 'c', 'c']
['a', 'b', 'c', 'a', 'a']
['c', 'a', 'b', 'a', 'c']
['c', 'b', 'a', 'a', 'b']
['c', 'a', 'b', 'c', 'b']
['c', 'b', 'c', 'b', 'b']
['b', 'b', 'b', 'b', 'a']


In [13]:
'''
In addition, we can also specify a weight for each item in the population.
This essentially allows us to have certain items be picked more often than others.
The weight list must be the same length as the population.
'''

weights_2 = [10, 1, 1]
for _ in range(10):
    print(random.choices(list_2, k=5, weights=weights_2))

['a', 'a', 'a', 'a', 'a']
['a', 'a', 'b', 'c', 'b']
['b', 'c', 'a', 'a', 'a']
['a', 'a', 'b', 'b', 'a']
['c', 'a', 'a', 'a', 'c']
['c', 'a', 'a', 'a', 'a']
['a', 'b', 'a', 'a', 'a']
['a', 'a', 'a', 'a', 'a']
['a', 'a', 'a', 'a', 'b']
['a', 'a', 'a', 'a', 'a']
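To see what the weights do over many draws, we can count how often each item comes up. This is only a rough check, assuming the same `list_2` and `weights_2` as above; the sample size of 12,000 is an arbitrary choice.

```python
import random
from collections import Counter

list_2 = ['a', 'b', 'c']
weights_2 = [10, 1, 1]

random.seed(0)
draws = random.choices(list_2, weights=weights_2, k=12_000)
counts = Counter(draws)

# Weights are relative: with 10:1:1 we expect roughly 10/12 of the draws to be 'a'
for item in list_2:
    print(item, counts[item], round(counts[item] / len(draws), 3))
```

The weights do not have to add up to 1 or to 100; only their ratios matter.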


### Random Samples
I just want to show you a variant of random.choices, which we saw in the previous section.

choices chooses k random elements from some sequence, with replacement.

Sometimes however, we do not want that replacement - instead we want a population sample (so once an element has been randomly selected, it cannot be selected again).

This is where the sample function comes in - it does exactly that. Of course, we can no longer pick more elements than we have in our population. Also, picking a sample equal in size to the population basically returns a "shuffled" population.

In [14]:
l = range(20)
random.sample(l, k=10)

[18, 3, 12, 2, 11, 13, 1, 0, 9, 17]
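The three properties mentioned above (no repeats, a full-size sample is a shuffle, and the size limit) can be checked with a short sketch. The variable names here are just for illustration.

```python
import random

population = list(range(20))

sample = random.sample(population, k=10)
print(len(sample) == len(set(sample)))  # True: no element is picked twice

# A sample as large as the population is a shuffled copy of it
shuffled = random.sample(population, k=len(population))
print(sorted(shuffled) == population)  # True

# Asking for more elements than the population holds raises ValueError
try:
    random.sample(population, k=len(population) + 1)
except ValueError as ex:
    print('ValueError:', ex)
```

Unlike random.shuffle, sample leaves the original population untouched and returns a new list.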

### Timing code using timeit
When we were looking at decorators we wrote a timing decorator. It could even take a number of repetitions as a parameter. This can be handy to time functions directly in your code without affecting the result of the function. But it wrote the results out to the console, and sometimes we just want to access the timing data right inside our Python code.

The timeit module in Python is an alternative that works well for some things. It is a little more complicated to use because it runs 'outside' of our local namespace, and you have to pass just small snippets of code to it (well, you can pass multi-line chunks of code, but it gets tedious). You also have to make it aware of your global or local scope if that's needed by the code you want to time. One thing it does that we did not do is temporarily disable the garbage collector. Still, there are a lot of pitfalls to benchmarking, and this approach, like ours, is good enough for most cases. YMMV.

It has the advantage that it can also be run directly from the command line.

In [15]:
from timeit import timeit

Basically the timeit function needs to know a few things:

- the Python statement to run (the stmt argument)
- how many times to run the same code (the number argument - watch out, the default is 1_000_000 times!)
- any setup code (like imports) (the setup argument)
- an optional scope that acts like a global scope to the statement (the globals argument)
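As a first small sketch of stmt and number, here is a statement that needs no outside names at all, so neither setup nor globals is required. The statement and the run count are arbitrary choices for illustration.

```python
from timeit import timeit

runs = 10_000
# stmt needs no outside names here, so no setup or globals is required
total = timeit(stmt='sum(range(100))', number=runs)

print(total)         # total seconds for all runs
print(total / runs)  # average seconds per run
```

Keep in mind that timeit returns the total time for all the runs, not the time per run.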

In [16]:
import math
math.sqrt(2)



1.4142135623730951

In [17]:
timeit(stmt='math.sqrt(2)')

NameError: name 'math' is not defined

In [18]:
timeit(stmt='import math\nmath.sqrt(2)')

# OR
# timeit(stmt='math.sqrt(2)', setup='import math')

0.4592923999998675
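The ways of making math visible to the timed statement are not quite equivalent, so here is a side-by-side sketch. It assumes a reduced run count of 100,000 to keep things quick, and it passes an explicit dictionary as globals; in a notebook, `globals=globals()` is a common shortcut that exposes the whole current namespace instead.

```python
import math
from timeit import timeit

runs = 100_000

# import inside stmt: the import statement is executed on every run
t_stmt = timeit(stmt='import math\nmath.sqrt(2)', number=runs)

# import in setup: executed once, before the timed runs
t_setup = timeit(stmt='math.sqrt(2)', setup='import math', number=runs)

# globals: the statement looks up names in the dict we pass in
t_globals = timeit(stmt='math.sqrt(2)', globals={'math': math}, number=runs)

print(t_stmt, t_setup, t_globals)
```

The actual timings depend on the machine, but the first variant pays for the (cached) import on every run, so setup or globals is usually the better fit when we only want to time the statement itself.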