# Random numbers

Monte Carlo methods rely heavily on random numbers. It is therefore essential that the numbers generated be as close to truly random as possible, especially when a large number of them is required. This is not an easy task!

Discussing all the practical issues involved is beyond the scope of this introduction and we will rely on NumPy's built-in functions.

## Uniform random distribution

The most basic and fundamental distribution of random numbers is the uniform distribution. It describes a sequence of random numbers that are distributed in a given interval with equal probability. Let us see how this works for a few examples. As usual, we first import a few libraries.

In [None]:
from matplotlib import pyplot
import numpy
%matplotlib inline
from matplotlib import rcParams
rcParams['font.family'] = 'serif'
rcParams['font.size'] = 16

In [None]:
# Let's draw one random number
x = numpy.random.random_sample()
print(x)

If you execute the above piece of code several times, you will get a different number between 0 and 1 each time. As the function numpy.random.random_sample() draws uniformly between 0 and 1, all values in this interval are equally likely to be drawn (note that 0 is a possible output while 1 is excluded; one says that the numbers are drawn from the half-open interval [0,1)).

If you know in advance how many random numbers you would like to draw, you may also specify this as an option and numpy.random.random_sample() will return an appropriately sized array:

In [None]:
# We draw ten random numbers
x = numpy.random.random_sample(10)
print(x)
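
As a quick sanity check of the interval [0,1) described above, one can draw a large sample and look at its extreme values and its mean. This is only a sketch: the sample size of one million is an arbitrary choice, and the name `sample` is introduced here for illustration.

```python
import numpy

# Draw a large sample; the size is an arbitrary choice for illustration
sample = numpy.random.random_sample(1000000)

# Every value should satisfy 0 <= value < 1
print(sample.min() >= 0.0, sample.max() < 1.0)

# For a uniform distribution on [0,1), the sample mean should be close to 0.5
print(sample.mean())
```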

### Are the numbers really random?

You may wonder how your computer is able to draw these random numbers and whether they are really random. In fact, they are not. The internal algorithm produces a perfectly repeatable sequence of numbers that only looks random. Moreover, after a (very) large number of draws, the sequence repeats itself.
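
To make the idea of a repeatable, periodic sequence concrete, here is a toy linear congruential generator. It is not the algorithm NumPy uses (the legacy numpy.random functions are based on the Mersenne Twister, whose period is vastly longer); the function `lcg` and its parameters `a`, `c` and `m` are hypothetical values chosen only so that the period is short enough to see.

```python
from itertools import islice

def lcg(seed, a=5, c=3, m=16):
    """Toy generator (illustration only): yields integers in the range 0..m-1."""
    state = seed
    while True:
        # Each new number is computed deterministically from the previous one
        state = (a * state + c) % m
        yield state

# Two runs started from the same seed produce exactly the same numbers
first_run = list(islice(lcg(seed=7), 20))
second_run = list(islice(lcg(seed=7), 20))
print(first_run)
print(first_run == second_run)

# With these parameters the sequence repeats itself after 16 draws
print(first_run[:4] == first_run[16:20])
```

The numbers look scrambled, yet they are entirely determined by the seed, and the sequence cycles once every possible state has been visited.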

When you start Python (through the notebook for example), the system picks a starting place in the sequence, and each call to the numpy.random.random_sample() function then returns the next numbers in the sequence. You may however pick the starting place in the sequence yourself and therefore draw a perfectly reproducible set of numbers. This is done by specifying the so-called 'seed' of the generator like this:

In [None]:
numpy.random.seed(0)
x = numpy.random.random_sample(10)
print(x)

If you repeat the above piece of code several times, you will observe that the floats returned are always identical! The seed, 0 in this case, is an integer that fixes a starting place in the sequence of "pseudo-random" numbers. Choose a different value for the seed and see what happens.

Now try the following code:

In [None]:
numpy.random.seed(0)
x = numpy.random.random_sample(5)
print(x)
x = numpy.random.random_sample(5)
print(x)


Compare this to the previous 10 numbers generated (with seed=0). Do you see what's happening?
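
The comparison can also be done programmatically. The sketch below assumes the legacy seeding interface used in this notebook; the names `ten_at_once`, `first_five` and `next_five` are introduced here for illustration.

```python
import numpy

# Draw ten numbers in a single call
numpy.random.seed(0)
ten_at_once = numpy.random.random_sample(10)

# Reset the generator and draw the same quantity in two calls of five
numpy.random.seed(0)
first_five = numpy.random.random_sample(5)
next_five = numpy.random.random_sample(5)

# The second call continues where the first one stopped in the sequence
two_calls = numpy.concatenate((first_five, next_five))
print(numpy.array_equal(ten_at_once, two_calls))
```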

Setting the seed of the random number generator can be very useful if you want to debug a code and have a repeatable output. However, when you perform a real computation, you should let the system pick the seed itself (NumPy takes it from the operating system's source of randomness when one is available, and from the clock otherwise). This is done each time NumPy is loaded in a new Python session, but you may also do it yourself by calling numpy.random.seed() without any argument.

### Uniform distribution in the interval [a,b)

Very often, one is interested in a sample of random numbers in the interval [a,b) instead of just [0,1). This is easily obtained by first drawing the numbers between 0 and 1 and then 'rescaling' the output like this:

In [None]:
# Bounds of the target interval [a,b)
a = 5
b = 20
x = numpy.random.random_sample(10)
# Rescale the draws from [0,1) to [a,b)
y = (b-a)*x + a
print(y)
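
NumPy also provides numpy.random.uniform(), which takes the bounds of the interval directly and should be equivalent to the manual rescaling above. A minimal sketch follows; the variable name `y_uniform` is introduced here for illustration.

```python
import numpy

a = 5
b = 20

# Draw ten numbers directly from the interval [a,b)
y_uniform = numpy.random.uniform(a, b, 10)
print(y_uniform)

# All draws should lie within the requested bounds
print(numpy.all((y_uniform >= a) & (y_uniform <= b)))
```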

## Visualising a probability distribution

The NumPy function numpy.random.random_sample() is designed to return numbers between 0 and 1 with equal probability, and one therefore speaks of a uniform random distribution. A good way of checking this is to plot the histogram of the random numbers generated. In a histogram, the interval of interest is first divided into smaller intervals called bins. For each bin, the histogram displays the number of occurrences of numbers that have a value within the bin. Let's try it:

In [None]:
binwidth = 1
pyplot.xlim(5, 20)
# Plot the raw counts in bins of width 1 covering the interval from 5 to 20
pyplot.hist(y, bins=numpy.arange(5., 21., binwidth), density=False);

By examining the histogram you should be able to confirm how many of the random numbers drawn above fall in each bin (in this case we have created bins that span the intervals [i,i+1) where i are the integers between 5 and 19).

Of course, the above distribution is far from uniform; we only drew 10 random numbers, so there was no chance of having the same number of occurrences in each bin (we have 15 of them!). The distribution will only appear uniform if we draw a large number of random numbers, so that the fluctuations in the histogram are smoothed out. Try it: redraw the histogram for 10000, or even 1000000, random numbers.
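
The smoothing can also be quantified without plotting. The sketch below uses numpy.histogram() to count the occurrences in each bin for increasing sample sizes and prints the standard deviation of the counts relative to their mean. The names `spreads`, `counts` and `edges` are introduced here for illustration, and the seed is fixed only to make the output repeatable.

```python
import numpy

a = 5
b = 20
bins = numpy.arange(5., 21., 1)

numpy.random.seed(0)
spreads = {}
for n in (10, 10000, 1000000):
    # Draw n numbers in [a,b) and count how many fall in each bin
    sample = (b - a) * numpy.random.random_sample(n) + a
    counts, edges = numpy.histogram(sample, bins=bins)
    # Relative fluctuation of the bin counts around their mean
    spreads[n] = counts.std() / counts.mean()
    print(n, spreads[n])
```

The relative fluctuation should shrink as the sample grows, which is why the histogram looks flatter for larger samples.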

Finally, you should draw the histogram with the option 'density=True' (named 'normed' in older versions of Matplotlib). That way, the count in each bin is divided by the total number of random numbers drawn and by the bin width, so that the area under the histogram equals 1. With bins of width 1, as used here, each value is simply the frequency of obtaining a random number in the given bin. As you increase the size of your random number sample and decrease the size of the bins, the normalised histogram converges to the **probability distribution function** of the random numbers.
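
The normalisation can be checked numerically with numpy.histogram(), which accepts the same 'density' option. This is a sketch under the assumptions used above (interval [5,20), bins of width 1, and an arbitrary sample size of one million); for a uniform distribution on [a,b), each normalised value should be close to 1/(b-a).

```python
import numpy

a = 5
b = 20
binwidth = 1
bins = numpy.arange(5., 21., binwidth)

numpy.random.seed(0)
sample = (b - a) * numpy.random.random_sample(1000000) + a

# Normalised histogram: counts divided by (sample size * bin width)
density, edges = numpy.histogram(sample, bins=bins, density=True)

# The area under the normalised histogram should equal 1
area = numpy.sum(density * numpy.diff(edges))
print(area)

# Each value should be close to the uniform density 1/(b-a)
print(density.min(), density.max(), 1 / (b - a))
```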
