# Generating random numbers with NumPy

NumPy greatly streamlines the process of generating many random numbers!

## Import NumPy

In [None]:
import numpy as np

## Random number generators

NumPy has its own family of random number generators. Before drawing any numbers we must create a GENERATOR object. `np.random.default_rng()` returns one, and the integer argument is the SEED.

In [None]:
rg = np.random.default_rng(12345)

In [None]:
%whos

Generating random numbers requires calling methods associated with the generator object.

A uniform random number in the interval [0, 1) is created with the `rg.random()` method.

In [None]:
rg.random()

In [None]:
rg.random()

In [None]:
rg.random()

Resetting the seed, by re-creating the generator with the same seed value, allows us to reproduce the previous SEQUENCE of random numbers.

In [None]:
rg = np.random.default_rng(12345)

print( rg.random(), rg.random(), rg.random() )
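
The reproducibility idea can be checked directly. The following is a minimal sketch, assuming two extra generator names (`rg_a`, `rg_b`) and an arbitrary second seed (54321) that are not part of the original notebook.

```python
import numpy as np

# Two generators built from the same seed (12345, as in the cells above)
rg_a = np.random.default_rng(12345)
rg_b = np.random.default_rng(12345)

draws_a = [rg_a.random() for _ in range(3)]
draws_b = [rg_b.random() for _ in range(3)]

# Same seed, same sequence
print(draws_a == draws_b)

# A different seed (54321 is an arbitrary, hypothetical choice) gives a different sequence
rg_c = np.random.default_rng(54321)
draws_c = [rg_c.random() for _ in range(3)]
print(draws_a == draws_c)
```

The first comparison should print `True` and the second `False`.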

But NumPy allows us to generate MANY random numbers in one function call!

The standard library `random` module required us to ITERATE via for-loops or list comprehensions to generate SEQUENCES of random numbers.

NumPy handles all of that for us!

In [None]:
rg = np.random.default_rng(12345)

In [None]:
rg.random( 3 )
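
To make the contrast with the `random` module concrete, here is a small side-by-side sketch. The variable names (`n`, `values_list`, `values_array`) are illustrative assumptions. The two libraries use different underlying algorithms, so the values should not be expected to match even though the same seed is used.

```python
import random
import numpy as np

n = 3

# Standard library: one value per call, so we iterate
random.seed(12345)
values_list = [random.random() for _ in range(n)]

# NumPy: a single call returns an array holding n values
rg = np.random.default_rng(12345)
values_array = rg.random(n)

print(type(values_list), len(values_list))
print(type(values_array), values_array.shape)
```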

This allows us to generate a LARGE number of values and summarize them easily!!

In [None]:
rg.random( 1001 ).mean()

We can also tell `rg.random()` to generate a 2D array by passing a tuple of (rows, columns)!

In [None]:
rg = np.random.default_rng(12345)

In [None]:
rg.random( (4, 2) )

In [None]:
rg.random( (1001, 5) ).shape

In [None]:
rg.random( (5, 1001) ).shape

In [None]:
rg.random( (10000, 500) ).shape

In [None]:
rg.random( (10000, 500) ).size
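
The `.shape` and `.size` attributes are related: the size is the product of the entries in the shape. A small sketch, assuming an array named `A` that is not in the original notebook:

```python
import numpy as np

rg = np.random.default_rng(12345)
A = rg.random((4, 2))   # 4 rows, 2 columns

print(A.shape)                      # (4, 2)
print(A.size)                       # 8
print(A.size == np.prod(A.shape))   # True
```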

## Other distributions

In [None]:
help( np.random.random )

In [None]:
help( np.random.normal )

To generate 7 random numbers from a Gaussian (bell curve) distribution with mean -100 and standard deviation 5:

In [None]:
rg = np.random.default_rng(12345)

In [None]:
rg.normal( -100, 5, 7 )

In [None]:
rg.normal( -100, 5, (4, 2) )
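
The three positional arguments to `rg.normal()` are the mean (`loc`), the standard deviation (`scale`), and the number or shape of values (`size`). One way to sanity check that reading is to draw many values and summarize them. This is a sketch, assuming a sample of 100,000 values stored in a hypothetical array `z`:

```python
import numpy as np

rg = np.random.default_rng(12345)

# Keyword form of rg.normal( -100, 5, 100000 )
z = rg.normal(loc=-100, scale=5, size=100_000)

print(z.mean())          # should be close to -100
print(z.std(ddof=1))     # should be close to 5
```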

# Simulate the standard error on the mean (SEM)

NumPy greatly streamlines and simplifies the simulation!!!!

## Import NumPy

In [None]:
import numpy as np

## Define the random number generator

In [None]:
rg = np.random.default_rng(1976)

In [None]:
%whos

## Simulation

We can generate as many random numbers as we want and organize the results in a 2D array!

Let's simulate 7 random numbers from a Uniform distribution between 0 and 1 and REPLICATE that process 5000 times.

In [None]:
Ns = 7

In [None]:
nr = 5000

Store the random numbers in an array named `X`.

In [None]:
X = rg.random( (Ns, nr) )

In [None]:
X.shape

In [None]:
X.ndim

In [None]:
X.size

In [None]:
X[:, :3]

### Calculate the average from the 7 random numbers

Each column is an individual replication of 7 numbers.

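Calculate the average of each column! Calling `X.mean()` with no arguments averages ALL values into a single number. Setting `axis = 0` averages down the rows, which gives one average per column.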

In [None]:
X.mean()

In [None]:
X.mean( axis = 0 ).shape

In [None]:
x_means = X.mean( axis = 0 )

In [None]:
x_means.shape

In [None]:
x_means[:4]

In [None]:
x_means[-1]
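
The meaning of the `axis` argument is easier to see on a tiny array with known values. The array `M` below is a made-up example, not part of the simulation:

```python
import numpy as np

# 2 rows, 3 columns
M = np.array([[1.0, 2.0, 3.0],
              [5.0, 6.0, 7.0]])

print(M.mean())          # average of all 6 values -> 4.0
print(M.mean(axis=0))    # one average per column  -> [3. 4. 5.]
print(M.mean(axis=1))    # one average per row     -> [2. 6.]
```

With `axis=0` the result has as many entries as there are columns, which is why `X.mean( axis = 0 )` returns one average per replication.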

### Calculate the standard error on the mean

The SEM is the STANDARD DEVIATION of the averages!

The `.std()` must be applied to the result stored in `x_means`.

BUT REMEMBER to set `ddof = 1`, so the variance is computed with the n - 1 denominator (the UNBIASED variance estimator)!

In [None]:
x_means.std( ddof = 1 )
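
As a rough check on the simulated value, we can compare it with the textbook formula for the SEM, the population standard deviation divided by the square root of the sample size. For a Uniform distribution between 0 and 1 the standard deviation is sqrt(1/12). The formula and the names `sem_simulated` and `sem_theory` are additions here, not part of the original notebook.

```python
import numpy as np

Ns, nr = 7, 5000
rg = np.random.default_rng(1976)
x_means = rg.random((Ns, nr)).mean(axis=0)

# Simulated SEM: standard deviation of the replicated averages
sem_simulated = x_means.std(ddof=1)

# Theoretical SEM for Uniform(0, 1): sigma / sqrt(n), with sigma = sqrt(1/12)
sem_theory = np.sqrt(1 / 12) / np.sqrt(Ns)

print(sem_simulated, sem_theory)

# ddof=0 divides by n, ddof=1 divides by n - 1, so the second value is slightly larger
print(x_means.std(ddof=0), x_means.std(ddof=1))
```

The two SEM values should agree to roughly two decimal places. With 5000 replications the difference between `ddof=0` and `ddof=1` is tiny, but it matters for small samples.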

### Everything in one line of code

In [None]:
rg = np.random.default_rng(1976)

rg.random( (Ns, nr) ).mean( axis = 0 ).std( ddof = 1 )

## Study the effect of the sample size on the standard error

In [None]:
sample_sizes = 5 * ( 2 ** np.arange(12) )

In [None]:
sample_sizes

In [None]:
sample_sizes.shape

In [None]:
list( sample_sizes )

Let's iterate and apply the simulation for the standard error on the mean to the different sample sizes!!!!

In [None]:
rg = np.random.default_rng(1976)

In [None]:
sem_vs_sample_size = [ rg.random( (nns, nr) ).mean(axis=0).std(ddof=1) for nns in list(sample_sizes)  ]

In [None]:
sem_vs_sample_size

Let's plot the standard error on the mean vs the sample size.

In [None]:
import matplotlib.pyplot as plt

In [None]:
fig, ax = plt.subplots(figsize=(12, 6))

ax.plot( sample_sizes, sem_vs_sample_size, '-o' )
ax.set_xscale('log')
ax.set_xlabel('sample size')
ax.set_ylabel('standard error on the mean')
ax.grid(True)
plt.show()
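
The curve in the plot should fall off like one over the square root of the sample size. The sketch below compares the simulated values with that theoretical curve for a Uniform distribution between 0 and 1. It uses only the first 6 sample sizes to stay fast, and the names `sem_sim` and `sem_theory` are assumptions introduced for this illustration.

```python
import numpy as np

nr = 5000
sample_sizes = 5 * (2 ** np.arange(6))   # 5, 10, 20, 40, 80, 160
rg = np.random.default_rng(1976)

# Same simulation as above, collected into an array
sem_sim = np.array([rg.random((n, nr)).mean(axis=0).std(ddof=1)
                    for n in sample_sizes])

# Theoretical SEM for Uniform(0, 1): sqrt(1/12) / sqrt(n)
sem_theory = np.sqrt(1 / 12) / np.sqrt(sample_sizes)

for n, s, t in zip(sample_sizes, sem_sim, sem_theory):
    print(n, round(float(s), 4), round(float(t), 4))
```

Each doubling of the sample size should shrink the SEM by a factor of about sqrt(2), not by a factor of 2.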

# RESHAPING with NumPy

We need to review DIMENSIONS or SHAPES in NumPy. What is the difference between ROWS and COLUMNS?

## Import Modules

In [None]:
import numpy as np

## Introduction to reshaping

Let's begin by reviewing a 1D NumPy array.

In [None]:
np.arange( 1, 25 )

In [None]:
np.arange( 1, 25 ).ndim

In [None]:
np.arange( 1, 25 ).shape

In [None]:
np.arange( 1, 25 ).size

In [None]:
x = np.arange( 1, 25 )

In [None]:
%whos

In [None]:
x

In [None]:
x.size

In [None]:
x.ndim

In [None]:
x.shape

Let's convert or **RESHAPE** the 1D array into a 2D array!!!!

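We will literally ADD a dimension to the object and thus modify how the values are ORGANIZED and INDEXED! The values themselves and their order do not change, and `x` itself is not modified: `.reshape()` returns a new array object.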

We reshape with the `.reshape()` method. All NumPy arrays have the `.reshape()` method.

`.reshape( <NEW ROWS>, <NEW COLUMNS> )`

Let's convert `x` into a 2D array with 12 rows and 2 columns.

In [None]:
x

In [None]:
x.reshape( 12, 2 )

In [None]:
[ [1, 2],
  [3, 4], 
  [5, 6] ]

In [None]:
x.reshape( 12, 2 ).ndim

The `.ndim` attribute is DIFFERENT AFTER applying the reshaping operation! Originally, the `x` object has 1 dimension:

In [None]:
x.ndim

In [None]:
x.shape

In [None]:
x.reshape( 12, 2 ).shape

HOWEVER...VERY IMPORTANTLY...the TOTAL number of ELEMENTS has NOT changed!!!

In [None]:
x.size

In [None]:
x.reshape( 12, 2 ).size

In [None]:
x

In [None]:
x.reshape( 12, 2 )

In [None]:
x2 = x.reshape( 12, 2 )

In [None]:
x2

In [None]:
x2[ :3, : ]

In [None]:
x[ :3 ]
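
By default `.reshape()` fills the new array ROW BY ROW, so the first row of the 2D array holds the first values of the 1D array. A short sketch of that bookkeeping, using the same `x` and `x2` as above:

```python
import numpy as np

x = np.arange(1, 25)
x2 = x.reshape(12, 2)

print(x2[0])       # first row holds the first two values of x  -> [1 2]
print(x2[1])       # second row holds the next two values       -> [3 4]
print(x.shape)     # x itself is unchanged                      -> (24,)

# Element in row i, column j of x2 is element i * 2 + j of x
print(x2[5, 1] == x[5 * 2 + 1])   # True
```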

Of course, we could have reshaped to something other than 12 rows!

We could have reshaped to 6 rows and 4 columns!

In [None]:
x.reshape( 6, 4 )

Or, I could do 4 rows and 6 columns!

In [None]:
x.reshape( 4, 6 )

Or, 2 rows and 12 columns!

In [None]:
x.reshape( 2, 12 )

In [None]:
x.reshape( 2, 12 ).ndim

In [None]:
x.reshape( 2, 12 ).shape

In [None]:
x.reshape( 2, 12 ).size

I could even try 8 rows and 3 columns!

In [None]:
x.reshape( 8, 3 )

Or 3 rows and 8 columns!

In [None]:
x.reshape( 3, 8 )

The important point is that the NUMBER OF ROWS multiplied by the NUMBER OF COLUMNS must equal the total number of elements or SIZE!

If you don't want to do the mental math and type both the NEW number of ROWS and the NEW number of COLUMNS, you can use -1 for one of the two arguments. NumPy then infers that dimension from the SIZE. Only ONE argument can be -1.

In [None]:
x.reshape( 12, -1 )

In [None]:
x.reshape( 3, -1 )

In [None]:
x.reshape( 3, -1 ).shape

In [None]:
x.reshape( -1, 2 )

In [None]:
x.reshape( -1, 8 )

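PLEASE BE CAREFUL! Each of the following cells raises a `ValueError` because the requested shape is not compatible with the 24 elements in `x`.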

In [None]:
x.reshape( 9, -1 )

In [None]:
x.reshape( 9, 3 )

In [None]:
x.reshape( 9, 2 )

In [None]:
x.reshape( 8, 4 )
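
If we want to see which shapes fail without stopping the notebook, we can catch the error. This is a minimal sketch. The `outcome` dictionary and the extra working shape `(8, 3)` are additions for illustration.

```python
import numpy as np

x = np.arange(1, 25)   # 24 elements

outcome = {}
for shape in [(9, -1), (9, 3), (9, 2), (8, 4), (8, 3)]:
    try:
        # Works only when the shape is compatible with 24 elements
        outcome[shape] = x.reshape(shape).shape
    except ValueError as err:
        outcome[shape] = "ValueError"
        print(shape, "failed:", err)

print(outcome)
```

Only `(8, 3)` succeeds. With `(9, -1)` NumPy cannot infer the number of columns because 24 is not divisible by 9.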

The SIZE must equal the PRODUCT of the number of rows and the number of columns!

## Other reshaping procedures

The most common reshaping technique is the TRANSPOSE of a 2D array.

Transposing SWAPS the rows and the columns.

To see how to transpose, let's work with a 2D array named `y`.

In [None]:
y = np.arange( 1, 25 ).reshape( 8, -1 )

In [None]:
y

In [None]:
y.ndim

In [None]:
%whos

In [None]:
y.shape

NumPy arrays have the `.T` attribute, which RETURNS the transpose. The original `y` is not modified.

In [None]:
y.T

In [None]:
y

In [None]:
y.shape

In [None]:
y.T.shape

Transposing is unfortunately what a lot of people will think of when you say RESHAPING the array!

But the `.reshape()` method is FAR MORE GENERAL than just TRANSPOSING!!

In [None]:
y

In [None]:
y.T

In [None]:
y.reshape( 12, 2 )

In [None]:
y.reshape( 4, 6 )
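
A transpose and a reshape can produce the SAME shape but a DIFFERENT arrangement of values. The sketch below compares `y.T` with `y.reshape(3, 8)`. The names `yt` and `yr` are assumptions for this example.

```python
import numpy as np

y = np.arange(1, 25).reshape(8, -1)   # 8 rows, 3 columns

yt = y.T               # transpose: columns of y become rows
yr = y.reshape(3, 8)   # reshape: values are refilled row by row

print(yt.shape == yr.shape)     # True, both are (3, 8)
print(yt[0])                    # [ 1  4  7 10 13 16 19 22]
print(yr[0])                    # [1 2 3 4 5 6 7 8]
print(np.array_equal(yt, yr))   # False
```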

In fact, `.reshape()` is so general that we can even turn a 2D array into a SINGLE ROW or SINGLE COLUMN object! Note that the result is still a 2D array.

In [None]:
y

In [None]:
y.reshape( 1, -1 )

In [None]:
y.reshape( -1, 1 )

In [None]:
x

In [None]:
y.reshape( 1, -1 )

In [None]:
x.ndim

In [None]:
x.shape

In [None]:
y.reshape( 1, -1 ).ndim

In [None]:
y.reshape( 1, -1 ).shape
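
A 1D array with 24 elements and a 2D array with 1 row and 24 columns hold the same values but are indexed differently. A small sketch, assuming the names `row` and `col` for the single row and single column versions:

```python
import numpy as np

x = np.arange(1, 25)
row = x.reshape(1, -1)   # 2D, single row
col = x.reshape(-1, 1)   # 2D, single column

print(x.shape, row.shape, col.shape)   # (24,) (1, 24) (24, 1)
print(x.ndim, row.ndim, col.ndim)      # 1 2 2

print(x[0])            # a single value -> 1
print(row[0].shape)    # the whole first row, a 1D array -> (24,)
print(row[0, 0])       # need two indices to reach a single value -> 1
```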

Sometimes we need to DROP a 2D array DOWN to a 1D array.

We can do that with the `.ravel()` method.

In [None]:
x

In [None]:
x.ndim

In [None]:
y.reshape( 1, -1 ).shape

In [None]:
y.reshape( 1, -1 ).ravel()

In [None]:
y.reshape( 1, -1 ).ravel().shape

In [None]:
y.reshape( 1, -1 ).ravel().ndim

In [None]:
y.reshape( -1, 1 ).ravel()

In [None]:
y.reshape( -1, 1 ).ravel().ndim

In [None]:
y.reshape( -1, 1 ).ravel().shape
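
The `.ravel()` method also works directly on the full 2D array, not just on single row or single column objects. The sketch below also compares it with the related `.flatten()` method, which is not used elsewhere in this notebook. `.flatten()` always returns a copy, while `.ravel()` returns a view of the original data when it can.

```python
import numpy as np

y = np.arange(1, 25).reshape(8, -1)   # 8 rows, 3 columns

flat = y.ravel()
print(flat.ndim, flat.shape)                    # 1 (24,)
print(np.array_equal(flat, np.arange(1, 25)))   # True, values are read row by row

# ravel() shares memory with y here, flatten() does not
print(np.shares_memory(y, flat))          # True
print(np.shares_memory(y, y.flatten()))   # False
```

Because the raveled array can be a view, changing a value in `flat` may also change `y`. Use `.flatten()` or `.copy()` when an independent array is needed.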

## Summary

We need to pay close attention to the number of ROWS and COLUMNS or the DIMENSIONS and SHAPE of the NumPy array!!!

We can modify the SHAPE with `.reshape()` and we can FLATTEN a 2D array into a 1D array with `.ravel()`.