In [1]:
import numpy as np

Randomness is used for:

- Initialising weights

- Shuffling datasets

- Train / validation split

- Monte-Carlo simulations

- Stochastic Gradient Descent

If the model always sees data in the same order or starts in the same way, it can learn a distorted or incomplete view of reality.

This distortion is called bias.

Bias here does not mean social bias; it means a systematic error in what the model learns.

**np.random.seed() - Reproducibility**

In [2]:
np.random.seed(42)

With a fixed seed, every run of the code produces the same sequence of random numbers.
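
A small check of this behaviour, as a sketch: reseeding with the same value replays the same numbers, while drawing again without reseeding continues the sequence. The variable names are illustrative. The last lines use `np.random.default_rng`, the local generator that the NumPy documentation recommends for new code; it is not used elsewhere in these notes.

```python
import numpy as np

# Seed, draw, then reseed with the same value and draw again.
np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)
second = np.random.rand(3)

# Without reseeding, the generator keeps advancing, so this draw differs.
third = np.random.rand(3)

print(np.array_equal(first, second))  # True
print(np.array_equal(second, third))  # False

# Alternative: a local generator object with its own seed.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
print(np.array_equal(rng_a.random(3), rng_b.random(3)))  # True
```

Note that `np.random.seed` sets global state, so any other code that draws from `np.random` in between changes which numbers come next.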

**np.random.rand() - Uniform random numbers in [0, 1)**

In [3]:
x = np.random.rand(5)
print(x)

[0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]


Used for:

- Weight initialisation

- Noise injection

In [4]:
# 2D example
W = np.random.rand(3, 4)
print(W)

[[0.15599452 0.05808361 0.86617615 0.60111501]
 [0.70807258 0.02058449 0.96990985 0.83244264]
 [0.21233911 0.18182497 0.18340451 0.30424224]]


**np.random.randn() - Normal (Gaussian) distribution**

In [7]:
# Mean = 0, Std = 1
x = np.random.randn(5)
print(x)

[ 0.39423302  0.12221917 -0.51543566 -0.60025385  0.94743982]


Used in:

- Neural network initialisation

- Simulation noise

In [6]:
# Weight matrix example
W = np.random.randn(3, 2)
W

array([[-0.74240684, -0.7033438 ],
       [-2.13962066, -0.62947496],
       [ 0.59772047,  2.55948803]])

**np.random.uniform() - Custom range**

In [8]:
x = np.random.uniform(low=-1, high=1, size=5)
print(x)

[-0.80465577  0.36846605 -0.11969501 -0.75592353 -0.00964618]


Used when:

- Weights must stay in a controlled range

- Noise must be bounded
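
To make the "controlled range" idea concrete, here is a minimal sketch that draws a weight matrix inside fixed bounds in two ways: directly with `np.random.uniform`, and by rescaling `np.random.rand` samples. The bounds `-0.5` and `0.5` and the shape `(3, 4)` are illustrative assumptions.

```python
import numpy as np

np.random.seed(0)

low, high = -0.5, 0.5  # illustrative bounds

# Direct draw from the custom range [low, high).
w_direct = np.random.uniform(low=low, high=high, size=(3, 4))

# Same distribution, built by rescaling samples from [0, 1).
w_scaled = low + (high - low) * np.random.rand(3, 4)

for name, w in (("uniform", w_direct), ("rescaled rand", w_scaled)):
    inside = bool((w >= low).all() and (w < high).all())
    print(name, w.shape, "all inside bounds:", inside)
```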

**np.random.normal() - Custom Gaussian**

In [9]:
x = np.random.normal(loc=0, scale=0.1, size=5)
print(x)

[ 0.08225449 -0.12208436  0.02088636 -0.19596701 -0.1328186 ]


Used in:

- Small perturbations

- Evolutionary algorithms
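
As a sketch of the "small perturbation" use case, the snippet below adds zero-mean Gaussian noise to an existing weight vector, which is roughly what a mutation step in a simple evolutionary algorithm does. The weight values and `sigma` are made up for illustration.

```python
import numpy as np

np.random.seed(0)

weights = np.array([0.5, -1.2, 0.8])  # illustrative parent weights
sigma = 0.1                           # small standard deviation

# Zero-mean noise with the same shape as the weights.
noise = np.random.normal(loc=0.0, scale=sigma, size=weights.shape)
mutated = weights + noise

print("original:", weights)
print("mutated: ", mutated)
print("largest change:", np.abs(mutated - weights).max())
```

A smaller `scale` keeps the mutated weights closer to the originals; a larger one explores more aggressively.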

**np.random.randint() - Random integers**

In [10]:
# Integers from 0 to 9: the upper bound (10) is excluded
x = np.random.randint(0, 10, size=6)
print(x)

[1 9 1 9 3 7]


Used in:

- Random labels (testing)

- Sampling indices

- Mini-batch selection
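
Two properties of `np.random.randint` matter for these uses, sketched below: the upper bound is exclusive, so `randint(0, n)` always yields valid indices for an array of length `n`, and draws are independent, so indices can repeat. The sizes are illustrative.

```python
import numpy as np

np.random.seed(0)

n_samples = 10

# The upper bound is exclusive: values fall in 0..9, all valid row indices.
idx = np.random.randint(0, n_samples, size=1000)
print(idx.min(), idx.max())

# Draws are independent, so a batch of indices may contain repeats.
batch_idx = np.random.randint(0, n_samples, size=6)
print(batch_idx, "unique values:", len(np.unique(batch_idx)))

# Random binary labels for testing a pipeline.
labels = np.random.randint(0, 2, size=8)
print(labels)
```

When each sample should appear at most once in a batch, `np.random.choice(..., replace=False)` is the better fit.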

**np.random.choice() - Sampling from a set**

In [11]:
data = np.array([10, 20, 30, 40, 50])

sample = np.random.choice(data, size=3, replace=False)
print(sample)

[50 30 20]


Used in:

- Mini-batches

- Bootstrapping

- Cross-validation

In [12]:
# Common ML pattern: sample row positions rather than values,
# then use them to index the feature and label arrays
indices = np.random.choice(len(data), size=3, replace=False)
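
The reason for sampling positions instead of values is that one set of indices can select matching rows from several arrays. A minimal sketch, using a hypothetical toy dataset (`features` and `labels` are not from the cells above):

```python
import numpy as np

np.random.seed(0)

# Hypothetical toy dataset: 5 samples, 2 features, one label each.
features = np.arange(10).reshape(5, 2)
labels = np.array([0, 1, 0, 1, 1])

# Sample row positions once, without repeats.
indices = np.random.choice(len(features), size=3, replace=False)

# The same positions index both arrays, so each row keeps its label.
features_batch = features[indices]
labels_batch = labels[indices]

print(indices)
print(features_batch)
print(labels_batch)
```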


**np.random.shuffle() - Shuffle data in-place**

In [13]:
X = np.array([[1, 10],
              [2, 20],
              [3, 30],
              [4, 40]])

np.random.shuffle(X)
print(X)

[[ 4 40]
 [ 2 20]
 [ 1 10]
 [ 3 30]]


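Shuffles along the first axis only: rows are reordered, but the contents of each row stay intact. The original array is modified in place and the function returns `None`.

Typically used to reorder a dataset before training.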
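
Because `np.random.shuffle` works on one array at a time, calling it separately on features and labels would break their pairing. One common way around this, sketched here with an illustrative label array `y`, is to draw a single permutation with `np.random.permutation` and apply it to both arrays.

```python
import numpy as np

np.random.seed(0)

X = np.array([[1, 10], [2, 20], [3, 30], [4, 40]])
y = np.array([1, 2, 3, 4])  # label k belongs to the row that starts with k

# One permutation of row positions, applied to both arrays.
perm = np.random.permutation(len(X))
X_shuffled = X[perm]
y_shuffled = y[perm]

print(X_shuffled)
print(y_shuffled)

# Rows and labels still match, and the originals are left unchanged.
print(np.array_equal(X_shuffled[:, 0], y_shuffled))  # True
```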

In [14]:
# Train-Test Split Example
X = np.arange(10)

np.random.shuffle(X)

train = X[:7]
test = X[7:]

print("Train:", train)
print("Test:", test)

Train: [1 2 5 0 4 6 7]
Test: [3 8 9]
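
The same split can be extended to a feature matrix with labels. The helper below is a hypothetical sketch (the name `train_test_split_arrays`, its parameters, and the toy data are assumptions): it shuffles row positions once and cuts both arrays at the same point, using a local `np.random.RandomState` so the global seed is not disturbed.

```python
import numpy as np

def train_test_split_arrays(X, y, train_fraction=0.7, seed=0):
    """Hypothetical helper: shuffle once, then split X and y at the same point."""
    rng = np.random.RandomState(seed)  # local generator, global state untouched
    perm = rng.permutation(len(X))
    n_train = int(round(train_fraction * len(X)))
    train_idx, test_idx = perm[:n_train], perm[n_train:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy data: row i is [2*i, 2*i + 1] and has label i.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split_arrays(X, y)
print(X_train.shape, X_test.shape)  # (7, 2) (3, 2)
print("train labels:", y_train)
print("test labels: ", y_test)
```

Every sample lands in exactly one of the two sets, and indexing instead of in-place shuffling leaves the input arrays in their original order.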


**Real ML Example: Weight Initialisation + Mini-Batch**

In [15]:
np.random.seed(1)

# Dataset
X = np.random.rand(100, 3)
y = np.random.randint(0, 2, size=100)

# Weights
w = np.random.randn(3)

# Mini-batch indices
batch_idx = np.random.choice(100, size=10, replace=False)

X_batch = X[batch_idx]
y_batch = y[batch_idx]

print("Batch shape:", X_batch.shape)

Batch shape: (10, 3)
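
The cell above draws a single batch. In stochastic gradient descent the usual pattern is to reshuffle once per epoch and then walk through the data in consecutive slices, so every sample is visited exactly once per epoch. A minimal sketch of that loop, with illustrative values for `batch_size` and `n_epochs` and no actual gradient step:

```python
import numpy as np

np.random.seed(1)

# Same synthetic dataset shape as above.
X = np.random.rand(100, 3)
y = np.random.randint(0, 2, size=100)

batch_size = 10
n_epochs = 2  # illustrative value

for epoch in range(n_epochs):
    # Fresh ordering each epoch, so batch composition changes between epochs.
    order = np.random.permutation(len(X))
    seen = 0
    for start in range(0, len(X), batch_size):
        batch_idx = order[start:start + batch_size]
        X_batch, y_batch = X[batch_idx], y[batch_idx]
        seen += len(X_batch)
        # A gradient step on (X_batch, y_batch) would go here.
    print(f"epoch {epoch}: visited {seen} samples")
```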
