In [34]:
# Continue generating arrays of random data from different probability distributions.
import numpy as np
from numpy.random import Generator as gen
from numpy.random import PCG64 as pcg


In [35]:
array_rg=gen(pcg(seed=246))
array_rg


Generator(PCG64) at 0x106757760





# Understanding Poisson Distribution in NumPy

## What is Poisson Distribution?
The **Poisson distribution** is a statistical model used to describe the probability of a certain number of events occurring within a fixed interval of time, space, or any measurable dimension. It is based on the following assumptions:
1. **Independence**: Events occur independently of each other.
2. **Constant Rate**: The average number of events (\( \lambda \)) is consistent across intervals.
3. **Non-Negative Counts**: The outcome is always a non-negative integer (e.g., 0, 1, 2, ...).
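
Under these assumptions, the probability of observing exactly \( k \) events is \( P(X = k) = \lambda^k e^{-\lambda} / k! \). The following is a minimal sketch that evaluates this formula with the standard library only; the helper name `poisson_pmf` is illustrative and is not part of NumPy.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events at average rate lam (hypothetical helper)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Probabilities of observing 0, 1, 2 and 3 events when lam = 1
probs = [poisson_pmf(k, 1.0) for k in range(4)]
total = sum(probs)

print([round(p, 4) for p in probs])
print(round(total, 3))
```

For \( \lambda = 1 \) these four probabilities add up to about 0.98, so counts above 3 are rare.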

---

## Key Concepts
- **Lambda (\( \lambda \))**:
  - Represents the **average number of events** expected in a given interval.
  - Example: If \( \lambda = 1 \), it means we expect, on average, 1 event per interval.

- **Output Values**:
  - The Poisson distribution outputs integers representing the **count of events** in an interval.
  - The counts are random but centered around the value of \( \lambda \).

- **Typical Behavior**:
  - For small \( \lambda \) (e.g., \( \lambda = 1 \)), the counts are usually low (0, 1, 2, or 3).
  - As \( \lambda \) increases, higher counts become more probable.
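
One way to see that counts are centered on \( \lambda \) is to draw a large sample and compare its statistics with \( \lambda \). For a Poisson distribution both the mean and the variance equal \( \lambda \). The sketch below assumes a sample size of 100,000 and the rates 1 and 4, both chosen only for illustration.

```python
import numpy as np
from numpy.random import Generator, PCG64

rng = Generator(PCG64(seed=246))

# Draw many counts for two rates and compare sample statistics with lambda.
stats = {}
for lam in (1, 4):
    counts = rng.poisson(lam=lam, size=100_000)
    stats[lam] = (counts.mean(), counts.var())
    print(f"lam={lam}: mean={counts.mean():.3f}, variance={counts.var():.3f}")
```

The printed mean and variance should both land close to the chosen rate, with small differences due to sampling noise.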

---

## Default Behavior in NumPy
When using the Poisson distribution in NumPy:
1. **Default Lambda**:
   - If \( \lambda \) (denoted as `lam`) is not specified, it defaults to \( \lambda = 1 \).
   - This means the distribution expects 1 event on average per interval.

2. **Output Shape**:
   - You can specify the size (shape) of the output array (e.g., a \(3 \times 3\) matrix).
   - Each element in the array represents an independent random count of events.
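
The default can be checked with a short sketch: two generators created with the same seed produce the same stream, so a call without `lam` should match a call with `lam=1.0`. The variable names are illustrative.

```python
import numpy as np
from numpy.random import Generator, PCG64

# Two generators with the same seed produce the same stream of random numbers.
default_draw = Generator(PCG64(seed=246)).poisson(size=(3, 3))
explicit_draw = Generator(PCG64(seed=246)).poisson(lam=1.0, size=(3, 3))

print(default_draw.shape)                           # shape follows the size argument
print(np.array_equal(default_draw, explicit_draw))  # identical draws
```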

---

## Characteristics of Poisson Distribution
1. **Cluster Around Lambda**:
   - The majority of values will be near \( \lambda \).
   - For \( \lambda = 1 \), most values are 0, 1, 2, or 3.

2. **Skewness**:
   - For smaller \( \lambda \), the distribution is skewed to the right: most counts are low, with a long tail of occasional high values.
   - As \( \lambda \) increases, the distribution becomes more symmetric.

3. **Real-World Examples**:
   - Number of emails received in an hour.
   - Number of customers arriving at a store per minute.
   - Number of accidents on a road per day.
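
The skewness behavior described above can be checked numerically. The theoretical skewness of a Poisson distribution is \( 1 / \sqrt{\lambda} \), so it shrinks as \( \lambda \) grows. The sketch below estimates it from samples; `sample_skewness` is a hypothetical helper written for this example, and the sample size is an arbitrary choice.

```python
import numpy as np
from numpy.random import Generator, PCG64

def sample_skewness(values):
    """Third standardized moment of a sample (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    return (centered ** 3).mean() / values.std() ** 3

rng = Generator(PCG64(seed=246))
skew_by_lam = {}
for lam in (1, 10, 100):
    counts = rng.poisson(lam=lam, size=200_000)
    skew_by_lam[lam] = sample_skewness(counts)
    print(f"lam={lam}: sample skewness={skew_by_lam[lam]:.3f}, "
          f"1/sqrt(lam)={lam ** -0.5:.3f}")
```

The estimates should fall from roughly 1 at \( \lambda = 1 \) toward roughly 0.1 at \( \lambda = 100 \), which is the move toward symmetry.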

---

In [36]:
# Generate an array from the Poisson distribution.
# With the default lam=1, the counts are mostly 0, 1, 2, or 3, because Poisson samples cluster around lambda.
array_rg=gen(pcg(seed=246))
array_rg.poisson(size=(3,3))

array([[0, 0, 3],
       [2, 0, 1],
       [1, 1, 0]])

In [37]:
# setting the lambda value
array_rg=gen(pcg(seed=246))
array_rg.poisson(lam=100,size=(3,3))

array([[ 92, 109, 108],
       [ 87, 114, 102],
       [ 96,  74,  91]])

In [38]:
# Generate an array using the binomial distribution.

# The binomial distribution counts how many times a given outcome occurs over a series of trials
# when each trial has only two possible outcomes (pass or fail, heads or tails, etc.).
# Besides size, it requires two arguments: n, the number of trials,
# and p, the probability of the desired outcome on each trial.

array_rg=gen(pcg(seed=246))
array_rg.binomial(n=20, p=0.7,size=(3,3))

# With p=0.7 and n=20 independent trials, most elements of the output array are close to 14, which is 70 percent of 20.

array([[15, 15, 12],
       [13, 13, 16],
       [12, 10, 16]])
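
A 3x3 array is too small to judge the distribution. As a rough check, a larger sample can be compared with the theoretical mean n * p and standard deviation sqrt(n * p * (1 - p)). The sample size of 100,000 is an assumption made for this sketch.

```python
import numpy as np
from numpy.random import Generator, PCG64

n, p = 20, 0.7
rng = Generator(PCG64(seed=246))
draws = rng.binomial(n=n, p=p, size=100_000)

expected_mean = n * p                    # n * p = 14
expected_std = np.sqrt(n * p * (1 - p))  # about 2.05

print(f"sample mean={draws.mean():.3f}, expected={expected_mean:.3f}")
print(f"sample std={draws.std():.3f}, expected={expected_std:.3f}")
```

A standard deviation of about 2 explains why single draws such as 10 or 16 are unremarkable for these parameters.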

In [39]:
# Simulate draws from a logistic distribution.
# It does not take n and p; instead it takes loc and scale.

# loc (location) is the centre of the distribution, playing the role that the mean lambda plays in the Poisson distribution; scale determines the spread (width) of the distribution.

array_rg=gen(pcg(seed=246))
array_rg.logistic(loc=10,scale=2.5,size=(3,3))

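# Note in the array below that the values scatter around loc=10; scale=2.5 sets the width of the spread but is not a hard bound, and some values (e.g. 18.6 and 5.24) lie more than 2.5 away from 10.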

array([[ 7.28550632,  8.09899721, 13.20635804],
       [12.35738522, 13.00231881,  6.73628199],
       [14.16223615, 18.5972148 ,  5.24109497]])
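
The `scale` parameter is not the standard deviation. For a logistic distribution the standard deviation is scale * pi / sqrt(3), and only about 46 percent of draws fall within one `scale` of `loc`. The sketch below checks both figures on a larger sample; the sample size is an assumption made for illustration.

```python
import numpy as np
from numpy.random import Generator, PCG64

loc, scale = 10, 2.5
rng = Generator(PCG64(seed=246))
draws = rng.logistic(loc=loc, scale=scale, size=100_000)

expected_std = scale * np.pi / np.sqrt(3)  # about 4.53, wider than scale itself
share_within_scale = np.mean(np.abs(draws - loc) <= scale)

print(f"sample mean={draws.mean():.3f}, loc={loc}")
print(f"sample std={draws.std():.3f}, expected={expected_std:.3f}")
print(f"share within one scale of loc={share_within_scale:.3f}")
```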