# Geometric Distribution

The Geometric distribution is a discrete probability distribution that models the number of trials needed to achieve the first success in a sequence of independent Bernoulli trials (where each trial has two possible outcomes: success or failure). It is often used to model the number of attempts required to achieve a specific outcome.

The Geometric distribution rests on three assumptions:

- The trials being conducted are independent.
- There can only be two outcomes of each trial - success or failure.
- The success probability, denoted by p, is the same for each trial.
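
As a minimal sketch of these assumptions in action, the snippet below repeats independent trials with a fixed success probability until the first success and returns the trial count. The helper name `trials_until_first_success` and the seed are illustrative choices, not part of any library. For a large number of draws, the printed sample mean should land near 1/p.

```python
import random


def trials_until_first_success(p, rng):
    """Run independent Bernoulli(p) trials; return the index of the first success."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    k = 1
    # rng.random() is uniform on [0, 1), so a trial fails with probability 1 - p.
    while rng.random() >= p:
        k += 1
    return k


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [trials_until_first_success(1 / 6, rng) for _ in range(10_000)]
print(sum(draws) / len(draws))  # sample mean of the simulated trial counts
```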

## **Key Characteristics**

1. **Probability of Success (p)**: The probability of success on each trial. 

2. **Probability of Failure (1-p)**: The probability of failure on each trial.

## **Probability Mass Function (PMF)**

The PMF of the Geometric distribution is given by:
$$
P(X = k) = (1 - p)^{k - 1} p
$$
for k = 1, 2, 3, ..., where:
- X is the random variable representing the number of trials up to and including the first success.
- p is the probability of success on each trial.
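
The formula can be read as "k - 1 failures followed by one success". The sketch below evaluates it directly in plain Python (the function name `geometric_pmf` is an illustrative choice) and checks numerically that the probabilities sum to approximately 1 when the series is truncated far into the tail.

```python
def geometric_pmf(k, p):
    """P(X = k): k - 1 failures followed by one success."""
    if k < 1:
        return 0.0
    return (1 - p) ** (k - 1) * p


p = 1 / 6
for k in range(1, 6):
    print(k, geometric_pmf(k, p))

# Over k = 1, 2, 3, ... the probabilities form a geometric series summing to 1;
# truncating at k = 499 leaves only a negligible tail for p = 1/6.
total = sum(geometric_pmf(k, p) for k in range(1, 500))
print(total)
```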

## **Cumulative Distribution Function (CDF)**
The CDF of the Geometric distribution is:
$$
F(k) = P(X \leq k) = 1 - (1 - p)^k
$$
for k = 1, 2, 3,...
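
The closed form follows from a short argument: X exceeds k exactly when the first k trials all fail, so

$$
P(X > k) = (1 - p)^k \quad \Longrightarrow \quad F(k) = 1 - P(X > k) = 1 - (1 - p)^k
$$

The same result comes from summing the PMF as a finite geometric series:

$$
\sum_{j=1}^{k} (1 - p)^{j - 1} p = p \cdot \frac{1 - (1 - p)^k}{1 - (1 - p)} = 1 - (1 - p)^k
$$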

## **Mean and Variance**

- **Mean (Expected Value)**: E[X] = 1/p

- **Variance**: Var(X) = (1-p)/p^2
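
These closed forms can be checked numerically. The sketch below approximates both moments by summing over a truncated support; the cutoff of 2000 is an arbitrary choice that leaves a negligible tail for p = 1/6.

```python
p = 1 / 6
ks = range(1, 2000)  # truncated support; the remaining tail mass is negligible here
pmf = [(1 - p) ** (k - 1) * p for k in ks]

mean = sum(k * q for k, q in zip(ks, pmf))               # E[X]
second_moment = sum(k * k * q for k, q in zip(ks, pmf))  # E[X^2]
variance = second_moment - mean ** 2                     # Var(X) = E[X^2] - E[X]^2

print(mean, 1 / p)                 # numerical sum vs. closed form 1/p
print(variance, (1 - p) / p ** 2)  # numerical sum vs. closed form (1-p)/p^2
```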

## **Memoryless Property**

One notable property of the Geometric distribution is the memoryless property: given that the first n trials have all failed, the number of additional trials needed for the first success has the same distribution as X itself. Past failures carry no information about future trials. For non-negative integers n and m, this can be expressed as:

$$
P(X > n + m \mid X > n) = P(X > m)
$$
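
The identity can be verified with the survival function P(X > n) = (1 - p)^n, since the ratio (1 - p)^(n + m) / (1 - p)^n reduces to (1 - p)^m. The sketch below checks this for one arbitrary choice of n and m; the helper name `survival` is illustrative.

```python
def survival(n, p):
    """P(X > n): the first n trials are all failures."""
    return (1 - p) ** n


p = 1 / 6
n, m = 3, 4

conditional = survival(n + m, p) / survival(n, p)  # P(X > n + m | X > n)
unconditional = survival(m, p)                     # P(X > m)

print(conditional, unconditional)  # the two values agree up to rounding error
```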

### Example:
Suppose you are rolling a fair six-sided die and you want to find the probability that you roll a six for the first time on the fourth roll. Here, a success is rolling a six (with probability p = 1/6).

Using the PMF:
$$
P(X = 4) = \left(1 - \frac{1}{6}\right)^{4-1} \times \frac{1}{6} = \left(\frac{5}{6}\right)^3 \times \frac{1}{6} \approx 0.096
$$

## **Use Cases**

1. **Quality Control**: Number of items inspected before finding a defective one.
2. **Customer Service**: Number of calls made until the first successful contact with a customer.
3. **Sports**: Number of attempts before scoring a goal.

```python
import numpy as np
from scipy.stats import geom

# Probability of success
p = 1/6

# Generate random samples
samples = np.random.geometric(p, size=1000)

# Calculate PMF for k = 4
k = 4
pmf_value = geom.pmf(k, p)
print(f"P(X = {k}) = {pmf_value}")

# Calculate the mean and variance
mean = geom.mean(p)
variance = geom.var(p)
print(f"Mean: {mean}, Variance: {variance}")
```

Output:

```
P(X = 4) = 0.09645061728395063
Mean: 6.0, Variance: 30.000000000000007
```


- **How does the probability of success p affect the shape of the Geometric distribution?**

The PMF is always largest at k = 1 and shrinks by a factor of 1 - p with each additional trial. As p increases, the probability mass concentrates on small values of k and the tail decays faster. As p decreases, the distribution spreads out and has a longer right tail.
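
To make the effect of p concrete, the sketch below tabulates the first few PMF values for three arbitrarily chosen success probabilities. The function name `geometric_pmf` and the values of p are illustrative.

```python
def geometric_pmf(k, p):
    """P(X = k) = (1 - p)^(k - 1) * p for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


# PMF values at k = 1..6 for a small, a moderate, and a large success probability.
rows = {p: [geometric_pmf(k, p) for k in range(1, 7)] for p in (0.2, 0.5, 0.8)}

for p, values in rows.items():
    print(f"p = {p}: {[round(v, 4) for v in values]}")
```

Each row starts at p and decreases from there; the larger p is, the faster the values fall toward zero.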

