### Topic: Different ways to generate random variables in NumPy, and what a random bit generator is

In [104]:
import numpy as np
from numpy.random import Generator as gen
from numpy.random import PCG64 as pcg


- The `Generator` class takes a bit generator as input and creates generator objects. PCG stands for permuted congruential generator; `PCG64` is a bit generator.

- In NumPy, bit generators like PCG64 include underlying methods (or pointers to methods) for generating raw random bits. These methods are implemented in low-level languages like C for efficiency, and Python provides an interface to use them.

- PCG64 exposes function pointers that return raw random values, including 64-bit unsigned integers.
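To make the idea of "raw random bits" concrete, the sketch below asks a bit generator for its raw output directly, without going through a `Generator`. The variable names and the seed value are arbitrary choices for illustration.

```python
import numpy as np
from numpy.random import PCG64

# Seeded only so the sketch is repeatable; 2024 is an arbitrary value.
bit_gen = PCG64(seed=2024)

# random_raw() returns the unprocessed integers produced by the bit generator.
raw_values = bit_gen.random_raw(size=3)

print(raw_values.dtype)  # uint64
print(raw_values)        # three large unsigned integers
```

These integers are not drawn from any named distribution yet; turning them into normal, uniform, or other variates is the job of the `Generator` wrapped around the bit generator.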

In [105]:
array_rg=gen(pcg())
array_rg

Generator(PCG64) at 0x111D80900


## Generating Random Values from a Normal Distribution

When you use the `normal()` method in NumPy, it generates random numbers from a **normal distribution**, also known as a **Gaussian distribution** or **bell-shaped curve**. This distribution is widely used in statistics and real-world scenarios because it describes how many natural phenomena behave, such as human heights or test scores. The random numbers generated by the `normal()` method are centered around a specified mean and spread out based on a specified standard deviation.

---

### What is a Normal Distribution?

A **normal distribution** is defined by two key parameters:
1. **Mean (μ):** This is the center of the distribution, where most values cluster. By default, NumPy sets the mean to **0**.
2. **Standard Deviation (σ):** This measures the spread of the distribution. A smaller value results in numbers being tightly clustered around the mean, while a larger value spreads them further out. By default, NumPy sets the standard deviation to **1**.

In simpler terms:
- Most random numbers will be close to the mean.
- As you move farther from the mean, the likelihood of those numbers decreases.
- For example, in a standard normal distribution (mean = 0, standard deviation = 1), about **68%** of values fall within the range \([-1, 1]\), and about **99.7%** fall within \([-3, 3]\).
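
The 68% and 99.7% figures can be checked empirically. The following is a minimal sketch that draws a large sample from a standard normal distribution and measures the fraction of values within one and three standard deviations; the seed and sample size are arbitrary assumptions.

```python
import numpy as np
from numpy.random import Generator, PCG64

# Seed chosen arbitrarily so the check is repeatable.
rng = Generator(PCG64(seed=0))
samples = rng.normal(size=100_000)

# Fraction of samples within 1 and 3 standard deviations of the mean (0).
within_1 = np.mean(np.abs(samples) <= 1)
within_3 = np.mean(np.abs(samples) <= 3)

print(f"within [-1, 1]: {within_1:.3f}")  # close to 0.683
print(f"within [-3, 3]: {within_3:.3f}")  # close to 0.997
```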

---

### How Does NumPy Generate Random Numbers from a Normal Distribution?

The `normal()` method in NumPy is part of the **Generator** class, which is a modern system for generating random numbers. The **Generator** class includes methods for generating numbers from various probability distributions, and `normal()` specifically deals with the normal distribution.

1. **Random Bit Generator (RBG):** The `Generator` object relies on a **Pseudo-Random Number Generator (PRNG)** to produce sequences of numbers that appear random.
2. **PCG64 Algorithm:** In our example, the `array_rg` object is created using the **PCG64 algorithm**, a fast and reliable PRNG with good statistical properties.
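3. **Mathematical Model:** The `normal()` method transforms the raw random numbers generated by the PRNG into numbers that follow the normal distribution using a sampling algorithm. NumPy's documentation describes the `Generator` as using a Ziggurat method for this, whereas the legacy `RandomState` used a polar Box-Muller transform.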
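
To illustrate the general idea of turning uniform random numbers into normally distributed ones, here is a minimal sketch of the classic Box-Muller transform. It is shown only because it is short and easy to follow; it is not presented as the code NumPy runs internally, and the helper name `box_muller` is hypothetical.

```python
import numpy as np
from numpy.random import Generator, PCG64

def box_muller(rng, n):
    """Return n standard-normal samples built from uniform draws (illustrative helper)."""
    # rng.random() is uniform on [0, 1); using 1 - u keeps log() away from zero.
    u1 = 1.0 - rng.random(n)
    u2 = rng.random(n)
    radius = np.sqrt(-2.0 * np.log(u1))
    angle = 2.0 * np.pi * u2
    return radius * np.cos(angle)

rng = Generator(PCG64(seed=1))  # arbitrary seed for repeatability
z = box_muller(rng, 100_000)

print(z.mean(), z.std())  # roughly 0 and 1
```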

---

### Using the `normal()` Method in NumPy

- **Single Random Number:** Calling `array_rg.normal()` generates one random number from the normal distribution with default parameters (mean = 0, standard deviation = 1).
- **Array of Random Numbers:** By specifying the `size` parameter, you can generate arrays of any size or shape:
  - `array_rg.normal(size=10)` creates an array of 10 random numbers.
  - `array_rg.normal(size=(6, 6))` generates a 6x6 matrix of random numbers.

---

### Why is This Useful?

The normal distribution is incredibly important because it reflects how real-world data is often distributed:
- Most values are close to the average (mean), with fewer extreme values.
- For example, in human heights, most people are of average height, with very tall or very short individuals being less common.

NumPy makes it easy to simulate such data for experiments, simulations, or modeling. The `normal()` method is flexible and allows you to:
- Specify a different mean and standard deviation.
- Generate random numbers in 1D, 2D, or even higher dimensions.
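
As a sketch of both options together, the example below uses the `loc` (mean) and `scale` (standard deviation) parameters of `normal()` along with a 2D `size`. The height figures (mean 170 cm, standard deviation 10 cm) are made-up values for illustration, not real measurements.

```python
import numpy as np
from numpy.random import Generator, PCG64

rng = Generator(PCG64(seed=7))  # arbitrary seed for repeatability

# Hypothetical heights in cm: mean 170, standard deviation 10,
# arranged as 1000 rows by 3 columns.
heights = rng.normal(loc=170, scale=10, size=(1000, 3))

print(heights.shape)                  # (1000, 3)
print(heights.mean(), heights.std())  # roughly 170 and 10
```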

---





In [106]:
array_rg.normal()

0.816351997494115

In [107]:
array_rg.normal(size=10)
# The values will almost certainly differ from what anyone else gets: pcg() was called without a seed,
# so the bit generator was initialized from fresh, unpredictable entropy when it was created.
# seed: a set of starting parameters for the algorithm


array([ 0.11395489, -1.46119515,  0.03303646,  1.18133187,  1.96519303,
       -1.25556345, -1.16151885, -0.76277552,  0.73835355, -0.06426705])

In [108]:
array_rg.normal(size=(6,6))

array([[ 1.01550339,  0.73812128, -0.5326279 ,  0.23416632,  0.20929344,
         0.07644285],
       [ 0.5910469 ,  1.51765477,  2.02741444, -0.43180656,  0.81528704,
         0.56350609],
       [-3.01446349,  0.59523767, -1.58356493, -1.26542076, -0.89775253,
         0.83119585],
       [ 0.29552741,  0.77305999, -1.52707802, -0.31491796, -0.728422  ,
         1.6225263 ],
       [ 0.51132457,  0.28762878,  1.12091873, -0.12841962, -0.88394333,
         0.90442226],
       [-0.56617108, -0.84834435,  1.53740517, -0.23931325,  0.25885508,
        -1.53073297]])

#### If the generator always produces different values and we use that random dataset to train a model, the results cannot be reproduced from one run to the next. To fix this, we pass a seed to pcg when creating the generator object. This ensures that every time we run the same calls with the same arguments, we get the same output.

In [109]:
array_rg=gen(pcg(seed=56))
array_rg.normal(size=(6,6))

array([[-0.84994072,  0.32194085,  1.78904286,  0.8793392 ,  0.37159282,
         1.48769378],
       [ 0.72866553,  2.08124873, -1.33726808,  0.12535155,  0.54422123,
        -0.67382748],
       [ 0.01964472, -1.05808512, -1.34806622,  1.21982602, -2.40019087,
        -1.55657975],
       [-0.41482473, -1.00506989,  0.96917647,  0.64675038,  0.26726439,
         1.54703204],
       [ 1.17653654,  1.14996121,  0.38008235,  0.03514379,  0.8768966 ,
         0.97870581],
       [ 0.36795013,  0.17431947,  0.5349831 ,  0.90704283,  0.98899818,
        -0.80529877]])

- When we execute the normal() method again in a new Jupyter Notebook cell, it generates a different outcome each time we run the cell. This happens because the seed only sets the generator's starting state. Every call advances that state, so the next call continues the sequence instead of restarting it.

- If we want the same results again in another cell, we must recreate the generator with the same seed before the call. This ensures that the random number sequence starts from the same point. Reseeding is only appropriate when identical output is the goal; calls that are meant to produce fresh values should keep drawing from the same generator.
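
A quick way to see how seeding behaves is to compare two generators built from the same seed, and to compare consecutive calls on one generator. The sketch below also saves and restores the bit generator's `state`, which is another way to replay a draw. The variable names are arbitrary.

```python
import numpy as np
from numpy.random import Generator, PCG64

rng_a = Generator(PCG64(seed=56))
rng_b = Generator(PCG64(seed=56))

first_a = rng_a.normal(size=3)
first_b = rng_b.normal(size=3)
second_a = rng_a.normal(size=3)  # continues the sequence of rng_a

print(np.array_equal(first_a, first_b))   # True: same seed, same starting point
print(np.array_equal(first_a, second_a))  # False: the state has advanced

# Save the current state, draw, restore the state, and draw again.
saved_state = rng_a.bit_generator.state
draw_1 = rng_a.normal(size=3)
rng_a.bit_generator.state = saved_state
draw_2 = rng_a.normal(size=3)

print(np.array_equal(draw_1, draw_2))     # True: same state, same output
```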

In [110]:

array_rg.normal(size=(6,6))

array([[-0.27967585,  0.74853881,  0.13229222, -0.01968978,  0.95685845,
        -0.08517853],
       [ 0.63303659, -1.15275681,  1.72740472, -0.59258512, -0.35135543,
         2.43353726],
       [ 1.43843797,  0.48945543, -0.39600758, -0.45785367, -1.58864579,
        -0.71498977],
       [-2.7130341 ,  0.39751722,  0.15586318, -1.06543045,  0.07774814,
        -1.62957302],
       [-0.42699464,  0.40270864,  0.99968769,  0.35438673, -0.8256648 ,
        -0.49180551],
       [ 1.75186548, -1.94760034,  1.20858425, -2.09512612, -0.99543563,
         0.75756955]])

In [111]:
# To reproduce earlier results, recreate the generator with the same seed; otherwise the next call just continues the sequence.
array_rg=gen(pcg(seed=56))
array_rg.normal(size=(6,6))

array([[-0.84994072,  0.32194085,  1.78904286,  0.8793392 ,  0.37159282,
         1.48769378],
       [ 0.72866553,  2.08124873, -1.33726808,  0.12535155,  0.54422123,
        -0.67382748],
       [ 0.01964472, -1.05808512, -1.34806622,  1.21982602, -2.40019087,
        -1.55657975],
       [-0.41482473, -1.00506989,  0.96917647,  0.64675038,  0.26726439,
         1.54703204],
       [ 1.17653654,  1.14996121,  0.38008235,  0.03514379,  0.8768966 ,
         0.97870581],
       [ 0.36795013,  0.17431947,  0.5349831 ,  0.90704283,  0.98899818,
        -0.80529877]])