# Reproducibility of Randomness

- Reproducibility is an important aspect of scientific research and is critical for ensuring that results can be validated and replicated.
- In Deep Learning (DL) and other computational methods, randomness enters through steps such as random initialization of weights, random shuffling of data, and random sampling of mini-batches during training.

- This randomness can cause significant run-to-run variability in DL models, which makes results hard to compare and reproduce. "Seeding" addresses this by making the randomness itself reproducible.
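
To make the variability concrete, here is a minimal sketch showing that two unseeded initializations of the same layer differ, while two seeded ones match. The layer size and the helper name `init_weights` are illustrative assumptions, not part of the original notebook.

```python
import torch

def init_weights(seed=None):
    """Return the initial weights of a small linear layer (hypothetical helper)."""
    if seed is not None:
        torch.manual_seed(seed)  # fix PyTorch's global generator before drawing
    layer = torch.nn.Linear(4, 2)  # weights are drawn randomly on construction
    return layer.weight.detach().clone()

unseeded_a = init_weights()
unseeded_b = init_weights()
seeded_a = init_weights(seed=17)
seeded_b = init_weights(seed=17)

print(torch.equal(unseeded_a, unseeded_b))  # False: the two draws differ
print(torch.equal(seeded_a, seeded_b))      # True: same seed, same weights
```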



In [None]:
import matplotlib.pyplot as plt
import torch
import numpy as np


# Reproducible Randomness using 'Seeding'

- Seeding is the process of setting a random seed, or starting point, for the random number generator used in a DL model.
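- With a fixed seed, the random number generator produces the same sequence of numbers every time the code is run. This generally holds across machines when the library version and hardware type are the same, but it is not guaranteed across different library versions or between CPU and GPU execution.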
- This ensures that the results of the model are reproducible and can be validated and replicated.

- To achieve reproducible randomness, seed the random number generator at the start of the script or notebook, before any random operation, and use the same seed value on every run.
- This allows researchers to compare and validate results, and it also makes it easier to debug and understand the behavior of the model.
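
One common way to follow this advice is to collect all seeding calls in a single helper that runs first. The sketch below assumes the code draws random numbers from Python's `random` module, NumPy's global generator, and PyTorch; the helper name `seed_everything` is hypothetical.

```python
import random

import numpy as np
import torch

def seed_everything(seed):
    """Seed the Python, NumPy, and PyTorch global generators (hypothetical helper)."""
    random.seed(seed)        # Python's built-in random module
    np.random.seed(seed)     # NumPy's global (legacy) generator
    torch.manual_seed(seed)  # PyTorch's global generator

def draw():
    """Take one draw from each library."""
    return random.random(), np.random.randn(2), torch.randn(2)

seed_everything(17)
first = draw()
seed_everything(17)  # same seed again: every generator restarts its sequence
second = draw()

print(first[0] == second[0])                # True
print(np.array_equal(first[1], second[1]))  # True
print(torch.equal(first[2], second[2]))     # True
```

Calling the helper once at the top of a notebook, before building the model or loading data, keeps all three libraries on the same footing.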

In [None]:
# How to use NumPy's and PyTorch's seed functions

# generate a few random numbers
np.random.randn(5)

# no seed is set, so the output differs every time this cell runs

array([-1.77528229,  1.31487654, -0.47344805, -1.0922299 , -0.25002744])

In [None]:
# repeat after fixing the seed (old but still widely used method)
np.random.seed(17)
print(np.random.randn(3))
print(np.random.randn(3))

# prints the same numbers on every run

# [ 0.27626589 -1.85462808  0.62390111]
# [1.14531129 1.03719047 1.88663893]

[ 0.27626589 -1.85462808  0.62390111]
[1.14531129 1.03719047 1.88663893]
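
The two printed lines above differ from each other because the second call continues the sequence. A small sketch of what reseeding does, using only `np.random.seed` and `np.random.randn` as in the cell above (the variable names are illustrative):

```python
import numpy as np

np.random.seed(17)
first_draw = np.random.randn(3)
next_draw = np.random.randn(3)   # continues the sequence, so it differs from first_draw

np.random.seed(17)               # reset the global generator to the same starting point
replayed = np.random.randn(3)

print(np.array_equal(first_draw, replayed))   # True: the sequence restarts
print(np.array_equal(first_draw, next_draw))  # False: later draws are new numbers
```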


# Local seed mechanism in NumPy (RandomState objects)

In [None]:
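# RandomState is NumPy's legacy generator class; each instance keeps its own seed and state
randseed1 = np.random.RandomState(17)
randseed2 = np.random.RandomState(20210530)

print(randseed1.randn(3)) # same numbers on every run
print(randseed2.randn(3)) # different from above, but the same on every run
print(randseed1.randn(3)) # next draw from randseed1: continues its sequence, same on every run
print(randseed2.randn(3)) # next draw from randseed2: continues its sequence, same on every run
print(np.random.randn(3)) # global generator, not reseeded here: different every time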


# [ 0.27626589 -1.85462808  0.62390111]
# [-0.24972681 -1.01951826  2.23461339]
# [1.14531129 1.03719047 1.88663893]
# [0.72764703 1.2921122  1.15494929]
# [0.10385631 0.30059104 0.9682053 ]

[ 0.27626589 -1.85462808  0.62390111]
[-0.24972681 -1.01951826  2.23461339]
[1.14531129 1.03719047 1.88663893]
[0.72764703 1.2921122  1.15494929]
[-1.16537308 -2.03599479 -1.15541329]
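
For comparison, current NumPy documentation recommends `np.random.default_rng` for new code, which returns a `Generator` object with its own state. The sketch below is an addition to the original notebook and the variable names are illustrative. Note that a `Generator` uses a different algorithm from `RandomState`, so the same seed does not give the same numbers in both.

```python
import numpy as np

# Two Generator objects created with the same seed
rng_a = np.random.default_rng(17)
rng_b = np.random.default_rng(17)

draw_a = rng_a.standard_normal(3)
draw_b = rng_b.standard_normal(3)
print(np.array_equal(draw_a, draw_b))  # True: same seed, same sequence

# The legacy class with the same seed produces a different sequence
legacy = np.random.RandomState(17).randn(3)
print(np.array_equal(draw_a, legacy))  # False: different underlying generator
```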


# Using PyTorch

In [None]:
# Python, NumPy, and PyTorch each keep their own random state, so be mindful of which seeds are set and their scope

torch.randn(3) # PyTorch's random number generator (unseeded here, so different every run)

tensor([-0.6124, -1.1835, -1.4831])

In [None]:
torch.manual_seed(17) # this function is local to PyTorch and doesn't affect NumPy
print(torch.randn(5))

# torch's seed doesn't spread to NumPy
print(np.random.randn(5))

tensor([-1.4135,  0.2336,  0.0340,  0.3499, -0.0145])
[-0.98339611  0.24452002 -0.58140974  0.4295639   0.79840199]
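
PyTorch also offers a local counterpart to NumPy's generator objects: a `torch.Generator` can be seeded on its own and passed to sampling functions through the `generator` argument. The following is a minimal sketch with illustrative variable names, not part of the original notebook.

```python
import torch

# Two independent generator objects with the same seed
gen_a = torch.Generator().manual_seed(17)
gen_b = torch.Generator().manual_seed(17)

a = torch.randn(3, generator=gen_a)
b = torch.randn(3, generator=gen_b)
print(torch.equal(a, b))  # True: same seed, same sequence

# Drawing from a local generator leaves the global generator untouched
torch.manual_seed(0)
x = torch.randn(2)
torch.manual_seed(0)
_ = torch.randn(3, generator=gen_a)  # advances gen_a only
y = torch.randn(2)
print(torch.equal(x, y))  # True: the global sequence was not disturbed
```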
