Q1. What is the Probability density function?

The probability density function (PDF) describes how the probability of a continuous random variable is distributed over its possible values. Unlike discrete random variables, which use probability mass functions, a continuous random variable takes values in an uncountable range, so the probability of any single exact value is zero. Probabilities are instead obtained as areas under the PDF over intervals.

Here are some key points about PDFs:

1. **Non-Negative**: The PDF \( f(x) \) is non-negative for all values of \( x \), meaning \( f(x) \geq 0 \).

2. **Total Area**: The total area under the PDF over the entire range of possible values equals 1, i.e. \( \int_{-\infty}^{\infty} f(x)\,dx = 1 \). This is the continuous counterpart of discrete probabilities summing to one.

3. **Not a Probability**: The value of the PDF at a specific point does not represent a probability. Instead, it indicates the relative likelihood of the random variable being near that value.

4. **Examples**: normal distribution, exponential distribution, uniform distribution.
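
The key points above can be checked numerically. The following is a minimal sketch, assuming a standard normal density and a uniform density on [0, 0.5]. The helper names `std_normal_pdf`, `uniform_pdf`, and `integrate` are illustrative, and the integral is approximated with a simple midpoint rule rather than computed exactly.

```python
import math

def std_normal_pdf(x):
    """Density of the standard normal distribution (mean 0, std 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def uniform_pdf(x, low=0.0, high=0.5):
    """Density of a uniform distribution on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def integrate(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Total area is (approximately) 1; the tails beyond +/-10 are negligible.
total_area = integrate(std_normal_pdf, -10, 10)

# Probabilities come from areas over intervals, not from point values.
prob_near_zero = integrate(std_normal_pdf, -0.1, 0.1)

# A density value can exceed 1: the uniform density on [0, 0.5] equals 2.
density_value = uniform_pdf(0.25)

print(f"Total area under the normal PDF: {total_area:.4f}")
print(f"P(-0.1 <= X <= 0.1): {prob_near_zero:.4f}")
print(f"Uniform density at x = 0.25: {density_value}")
```

The last line illustrates why a PDF value is not a probability: a density of 2 is perfectly valid as long as the total area remains 1.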


Q2. What are the types of probability distributions?


Probability distributions can be broadly categorized into two main types: discrete and continuous distributions. Here's a breakdown of each type along with some common examples:

### 1. Discrete Probability Distributions
These distributions apply to scenarios where the possible outcomes are countable.

- **Bernoulli Distribution**: Models a single trial with two possible outcomes (success/failure).
- **Binomial Distribution**: Represents the number of successes in a fixed number of independent Bernoulli trials.
- **Poisson Distribution**: Models the number of events occurring within a fixed interval of time or space, given a known average rate of occurrence.
- **Geometric Distribution**: Models the number of trials needed for the first success in repeated Bernoulli trials.
- **Negative Binomial Distribution**: Generalizes the geometric distribution to count the number of trials needed to achieve a fixed number of successes.
- **Hypergeometric Distribution**: Models the number of successes in a sample drawn without replacement from a finite population.

### 2. Continuous Probability Distributions
These distributions apply to scenarios where the outcomes can take on any value within a continuous range.

- **Normal Distribution**: A symmetric distribution characterized by its bell-shaped curve, defined by its mean and standard deviation.
- **Uniform Distribution**: All outcomes are equally likely within a specified range.
- **Exponential Distribution**: Models the time between events in a Poisson process; often used for modeling waiting times.
- **Gamma Distribution**: Generalizes the exponential distribution; useful in various fields like queuing and reliability analysis.
- **Beta Distribution**: Defined on the interval [0, 1]; often used in Bayesian statistics and modeling proportions.
- **Log-Normal Distribution**: Models variables whose logarithm is normally distributed; common in financial contexts.

### 3. Other Types
- **Multivariate Distributions**: Distributions that involve multiple random variables (e.g., multivariate normal distribution).
- **Joint and Marginal Distributions**: Joint distributions describe the probabilities of two or more random variables occurring together, while marginal distributions describe the probabilities of a subset of those variables.

Each distribution has its own characteristics and is suitable for different types of data and scenarios. Understanding the context and nature of the data is key to selecting the appropriate probability distribution.


Q3. Write a Python function to calculate the probability density function of a normal distribution with
given mean and standard deviation at a given point.

The PDF of a normal distribution is given by the formula:

\[
f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}
\]

Where:
- \( \mu \) is the mean
- \( \sigma \) is the standard deviation
- \( x \) is the point at which you want to calculate the PDF

Here’s a Python function to compute the PDF of a normal distribution:

```python
import math

def normal_pdf(x, mean, std_dev):
    """
    Calculate the probability density function of a normal distribution.

    Parameters:
    x (float): The point at which to calculate the PDF.
    mean (float): The mean (μ) of the normal distribution.
    std_dev (float): The standard deviation (σ) of the normal distribution.

    Returns:
    float: The value of the PDF at the specified point x.
    """
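    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    # Normalizing constant 1 / (sigma * sqrt(2 * pi))
    coefficient = 1 / (std_dev * math.sqrt(2 * math.pi))
    # Exponential term e^(-(x - mu)^2 / (2 * sigma^2))
    exponential_term = math.exp(-0.5 * ((x - mean) / std_dev) ** 2)
    return coefficient * exponential_term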

# Example usage:
mean = 0
std_dev = 1
x = 1
pdf_value = normal_pdf(x, mean, std_dev)
print(f"The PDF of the normal distribution at x={x} is {pdf_value:.4f}")
```

### Explanation
- The function `normal_pdf` takes three parameters: `x`, `mean`, and `std_dev`.
- It calculates the coefficient and exponent as per the formula.
- Finally, it returns the PDF value at the specified point.

You can modify the `mean`, `std_dev`, and `x` variables to compute the PDF for different scenarios.
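
As a sanity check, the formula can be compared against the standard library. This is a small sketch, assuming Python 3.8 or later, where `statistics.NormalDist` is available. The names `by_formula` and `reference` are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, x = 0, 1, 1

# Value from the closed-form formula
by_formula = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Value from the standard library's normal distribution object
reference = NormalDist(mu=mu, sigma=sigma).pdf(x)

print(f"Formula:    {by_formula:.6f}")
print(f"NormalDist: {reference:.6f}")
```

Both values should agree up to floating-point rounding.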

Q4. What are the properties of Binomial distribution? Give two examples of events where binomial
distribution can be applied.

### Properties of Binomial Distribution

1. **Fixed Number of Trials (n)**: The number of trials is predetermined and remains constant.

2. **Two Possible Outcomes**: Each trial results in one of two outcomes: success or failure.

3. **Constant Probability (p)**: The probability of success remains the same for each trial.

4. **Independence**: The outcome of one trial does not affect the outcomes of other trials.

5. **Discrete Distribution**: The binomial distribution is discrete, as it counts the number of successes over a finite number of trials.

6. **Mean (Expected Value)**: The mean of a binomial distribution is given by:
   \[
   \mu = n \cdot p
   \]

7. **Variance**: The variance of a binomial distribution is given by:
   \[
   \sigma^2 = n \cdot p \cdot (1 - p)
   \]

8. **Probability Mass Function (PMF)**: The probability of getting exactly \( k \) successes in \( n \) trials is calculated using:
   \[
   P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}
   \]
   where \(\binom{n}{k}\) is the binomial coefficient.

### Examples of Events for Binomial Distribution

1. **Flipping a Coin**:
   - **Scenario**: Tossing a fair coin 10 times.
   - **Application**: Each toss is an independent trial with two outcomes: heads (success) or tails (failure). If we want to find the probability of getting exactly 4 heads in 10 tosses, we can apply the binomial distribution with \( n = 10 \) and \( p = 0.5 \).

2. **Product Defects in Manufacturing**:
   - **Scenario**: A factory produces widgets, and it is known that 2% of the widgets are defective.
   - **Application**: If a quality control inspector randomly selects 50 widgets, we can model the number of defective widgets found as a binomial distribution. Here, \( n = 50 \) and \( p = 0.02 \). We could calculate the probability of finding exactly 3 defective widgets.

These examples demonstrate how the binomial distribution can be used to model real-world situations where outcomes can be classified as success or failure.
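
The two examples can be evaluated directly from the PMF. This is a minimal sketch, assuming Python 3.8 or later for `math.comb`. The function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example 1: exactly 4 heads in 10 tosses of a fair coin
coin = binomial_pmf(4, 10, 0.5)

# Example 2: exactly 3 defective widgets among 50, with a 2% defect rate
defects = binomial_pmf(3, 50, 0.02)

print(f"P(4 heads in 10 tosses) = {coin:.4f}")
print(f"P(3 defects in 50 widgets) = {defects:.4f}")
```

For the coin example the result is \( \binom{10}{4} / 2^{10} = 210 / 1024 \approx 0.2051 \).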

Q5. Generate a random sample of size 1000 from a binomial distribution with probability of success 0.4
and plot a histogram of the results using matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt

# Parameters
n = 1          # Number of trials
p = 0.4        # Probability of success
sample_size = 1000  # Size of the random sample

# Generate random sample from binomial distribution
sample = np.random.binomial(n, p, sample_size)

# Plotting the histogram
# Bin edges at -0.5, 0.5, ..., n + 0.5 give each integer outcome its own bar
bins = np.arange(n + 2) - 0.5
plt.figure(figsize=(10, 6))
plt.hist(sample, bins=bins, density=True, alpha=0.6, color='blue', edgecolor='black')
plt.title(f'Histogram of Binomial Distribution (n={n}, p={p})')
plt.xlabel('Number of Successes')
plt.ylabel('Relative Frequency')
plt.xticks(range(n + 1))  # Possible outcomes are 0, 1, ..., n
plt.grid(axis='y', alpha=0.75)
plt.show()
```

### Explanation
1. **Parameters**:
   - `n`: Number of trials. The question does not specify it, so it is set to 1 here (a single Bernoulli trial).
   - `p`: Probability of success (0.4 in this case).
   - `sample_size`: The size of the sample (1000).

2. **Random Sample Generation**:
   - `np.random.binomial(n, p, sample_size)`: Generates a sample of size 1000 from a binomial distribution with the specified parameters.

3. **Plotting**:
   - A histogram is created to visualize the distribution of the results.
   - `bins = np.arange(n + 2) - 0.5`: Places the bin edges at -0.5, 0.5, and 1.5, so the outcomes 0 and 1 each fall into their own bin. Using `np.arange(0, 2)` instead would produce the edges `[0, 1]`, a single bin that lumps both outcomes together.
   - `density=True`: With bins of width 1, the bar heights are the relative frequencies of each outcome, so they should be close to 0.6 and 0.4.

4. **Display**:
   - The histogram is displayed with appropriate titles and labels.
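
Because the question does not fix the number of trials, a larger `n` gives a more recognizable binomial shape. The sketch below assumes `n = 10` (an illustrative choice, not part of the question) and tabulates relative frequencies instead of plotting them, so it runs without a display. The seed is arbitrary and only there for reproducibility.

```python
import numpy as np

n, p, sample_size = 10, 0.4, 1000      # n = 10 is an assumption for illustration
rng = np.random.default_rng(seed=42)   # arbitrary seed for reproducibility
sample = rng.binomial(n, p, size=sample_size)

# Relative frequency of each possible outcome 0, 1, ..., n
rel_freq = np.bincount(sample, minlength=n + 1) / sample_size

for k, freq in enumerate(rel_freq):
    print(f"{k:2d} successes: {freq:.3f}")
print(f"Sample mean: {sample.mean():.3f} (theoretical n*p = {n * p})")
```

The same `rel_freq` array could be passed to `plt.bar(range(n + 1), rel_freq)` to draw the histogram.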


Q6. Write a Python function to calculate the cumulative distribution function of a Poisson distribution
with given mean at a given point.




The PMF of a Poisson distribution is given by:

\[
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}
\]

Where:
- \( \lambda \) is the mean (rate) of the distribution.
- \( k \) is the number of occurrences.

The CDF is defined as:

\[
P(X \leq k) = \sum_{i=0}^{k} P(X = i)
\]

### Python Function

Here’s a Python function to calculate the CDF of a Poisson distribution:

```python
import math

def poisson_cdf(k, mean):
    """
    Calculate the cumulative distribution function (CDF) of a Poisson distribution.

    Parameters:
    k (int): The number of occurrences (point at which to calculate the CDF).
    mean (float): The mean (λ) of the Poisson distribution.

    Returns:
    float: The value of the CDF at the specified point k.
    """
    cdf = 0.0
    for i in range(k + 1):
        cdf += (mean ** i) * math.exp(-mean) / math.factorial(i)
    return cdf

# Example usage:
mean = 3.0  # mean of the Poisson distribution
k = 5       # point at which to calculate the CDF
cdf_value = poisson_cdf(k, mean)
print(f"The CDF of the Poisson distribution at k={k} with mean={mean} is {cdf_value:.4f}")
```

### Explanation
1. **Function Definition**: The function `poisson_cdf` takes two parameters: `k` (the number of occurrences) and `mean` (the mean of the Poisson distribution).
  
2. **CDF Calculation**:
   - A loop runs from 0 to \( k \), summing the probabilities using the Poisson PMF formula.

3. **Example Usage**: The example shows how to use the function to calculate the CDF for a Poisson distribution with a mean of 3.0 at \( k = 5 \).

You can modify the `mean` and `k` values to calculate the CDF for different scenarios.
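
For large `k`, computing `mean ** i` and `math.factorial(i)` separately can overflow. One possible alternative, sketched here under the hypothetical name `poisson_cdf_recurrence`, builds each PMF term from the previous one using \( P(X = i) = P(X = i - 1) \cdot \lambda / i \).

```python
import math

def poisson_cdf_recurrence(k, lam):
    """Poisson CDF P(X <= k), accumulating PMF terms via a recurrence."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if k < 0:
        return 0.0  # no probability mass below zero
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, int(k) + 1):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return min(cdf, 1.0)   # guard against rounding slightly above 1

print(f"{poisson_cdf_recurrence(5, 3.0):.4f}")
```

This avoids large intermediate numbers, although for very large \( \lambda \) the starting term \( e^{-\lambda} \) itself underflows to zero, and a log-space approach would be needed.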

Q7. How is the Binomial distribution different from the Poisson distribution?

The **Binomial distribution** and the **Poisson distribution** are both discrete probability distributions, but they differ in their underlying assumptions and applications. Here's a breakdown of the key differences between them.

### 1. **Nature of Events**
   - **Binomial Distribution**: Describes the number of successes in a fixed number of independent **Bernoulli trials**, where each trial has two possible outcomes (success or failure).
   - **Poisson Distribution**: Describes the number of events that occur in a fixed interval of time, space, or another dimension, where the events are independent, and the probability of an event occurring in a small interval is proportional to the size of the interval.

### 2. **Parameters**
   - **Binomial Distribution**:
     - \( n \): The number of trials.
     - \( p \): The probability of success on a single trial.
   - **Poisson Distribution**:
     - \( \lambda \): The average number of events occurring in a given time interval (rate parameter).

### 3. **Conditions**
   - **Binomial Distribution**:
     - Fixed number of trials (\(n\)).
     - Each trial is independent.
     - The probability of success \(p\) is constant across trials.
   - **Poisson Distribution**:
     - The number of events in a fixed interval of time or space is counted.
     - Events occur independently of each other.
     - The probability of more than one event happening in a very small interval is close to 0.

### 4. **Range of Outcomes**
   - **Binomial Distribution**: The number of successes can be any integer from 0 to \(n\) (the total number of trials).
   - **Poisson Distribution**: The number of events can be any non-negative integer (0, 1, 2, 3, ...), theoretically unbounded.

### 5. **Mean and Variance**
   - **Binomial Distribution**:
     - Mean \( = np \)
     - Variance \( = np(1 - p) \)
   - **Poisson Distribution**:
     - Mean \( = \lambda \)
     - Variance \( = \lambda \) (The mean and variance are equal in a Poisson distribution).

### 6. **When to Use**
   - **Binomial Distribution**: When the number of trials is fixed and you want to find the probability of a certain number of successes out of those trials. Example: The number of heads in 10 coin flips.
   - **Poisson Distribution**: When you are interested in counting the number of events that occur in a fixed interval of time or space. Example: The number of customer arrivals at a store in an hour.

### 7. **Approximation**
   - For large \( n \) and small \( p \), the binomial distribution can be approximated by a Poisson distribution with \( \lambda = np \), especially when \( np \) is moderately sized.
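
The approximation can be illustrated numerically. This is a small sketch with assumed values \( n = 1000 \) and \( p = 0.003 \) (so \( \lambda = 3 \)), chosen only for illustration.

```python
from math import comb, exp, factorial

n, p = 1000, 0.003   # assumed values: large n, small p
lam = n * p          # matching Poisson mean

max_diff = 0.0
for k in range(9):
    binom = comb(n, k) * p**k * (1 - p) ** (n - k)
    poisson = lam**k * exp(-lam) / factorial(k)
    max_diff = max(max_diff, abs(binom - poisson))
    print(f"k={k}: binomial={binom:.5f}  poisson={poisson:.5f}")

print(f"Largest absolute difference: {max_diff:.5f}")
```

The two columns should agree to about three decimal places for these parameters.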

### Summary Table:

| **Characteristic**          | **Binomial Distribution**                               | **Poisson Distribution**                        |
|-----------------------------|---------------------------------------------------------|-------------------------------------------------|
| **Nature**                   | Fixed number of trials (n), count successes             | Count events in a time/space interval           |
| **Parameters**               | \(n\) (trials), \(p\) (probability of success)          | \( \lambda \) (mean number of events)           |
| **Range**                    | \(0\) to \(n\)                                          | 0 to infinity                                   |
| **Mean**                     | \( np \)                                                | \( \lambda \)                                   |
| **Variance**                 | \( np(1-p) \)                                           | \( \lambda \)                                   |
| **Application**              | Success/failure in trials                               | Count of rare events                            |



Q8. Generate a random sample of size 1000 from a Poisson distribution with mean 5 and calculate the
sample mean and variance.

The random sample of size 1000 from a Poisson distribution with a mean of 5 has:

- **Sample Mean**: 4.981
- **Sample Variance**: 5.093

These values are close to the theoretical mean and variance of the Poisson distribution, which are both equal to 5.
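
The code that produced these figures is not shown, so the following is only a sketch of how such a sample might be generated with NumPy. The seed is an arbitrary assumption, and the printed values will differ slightly from the ones reported above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)    # arbitrary seed, for reproducibility only
sample = rng.poisson(lam=5, size=1000)

sample_mean = sample.mean()
sample_variance = sample.var(ddof=1)   # ddof=1 gives the unbiased sample variance

print(f"Sample Mean: {sample_mean:.3f}")
print(f"Sample Variance: {sample_variance:.3f}")
```

Using `ddof=1` divides by \( n - 1 \) rather than \( n \). With 1000 observations the difference is negligible.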

Q9. How are the mean and variance related in the Binomial distribution and the Poisson distribution?

In both the **Binomial distribution** and the **Poisson distribution**, the mean and variance are directly related to the parameters of the distributions, but their relationships differ due to the nature of the distributions.

### 1. **Binomial Distribution**
The **Binomial distribution** describes the number of successes in a fixed number of independent Bernoulli trials, each with a constant probability of success. It is parameterized by:
- \( n \): The number of trials.
- \( p \): The probability of success on a single trial.

#### **Mean and Variance in Binomial Distribution:**
- **Mean** \( \mu = np \): The mean is the product of the number of trials \( n \) and the probability of success \( p \).
- **Variance** \( \sigma^2 = np(1 - p) \): The variance is the product of the mean \( np \) and \( (1 - p) \), which represents the probability of failure.

#### Relationship:
- The **variance** of the binomial distribution depends on both the mean \( np \) and the probability of failure \( (1 - p) \). If \( p \) is small, the variance is close to the mean. However, the variance is always **less than the mean** for any non-extreme value of \( p \) (i.e., \( 0 < p < 1 \)).
- For a fixed \( n \), the variance is largest at \( p = 0.5 \), where it equals \( \frac{n}{4} \), because the product \( p(1 - p) \) is maximized there.

---

### 2. **Poisson Distribution**
The **Poisson distribution** models the number of events occurring in a fixed interval of time or space, with the events occurring independently of each other. It is parameterized by:
- \( \lambda \): The average number of events (mean rate of occurrence) in the given interval.

#### **Mean and Variance in Poisson Distribution:**
- **Mean** \( \mu = \lambda \): The mean is simply the rate parameter \( \lambda \), representing the average number of occurrences.
- **Variance** \( \sigma^2 = \lambda \): The variance of the Poisson distribution is also equal to \( \lambda \).

#### Relationship:
- In the **Poisson distribution**, the **mean and variance are equal**. This is a characteristic property of the Poisson distribution and a useful check in practice: count data whose variance clearly differs from the mean are usually not well described by a Poisson model.

---

### **Comparison of the Relationship Between Mean and Variance**:

| **Distribution**      | **Mean**           | **Variance**               | **Relationship**                                         |
|-----------------------|--------------------|----------------------------|----------------------------------------------------------|
| **Binomial**           | \( \mu = np \)     | \( \sigma^2 = np(1 - p) \) | Variance depends on \( p \); variance is less than mean   |
| **Poisson**            | \( \mu = \lambda \)| \( \sigma^2 = \lambda \)   | Variance equals the mean                                  |

### Key Insights:
- In the **Binomial distribution**, the variance is smaller than the mean for every \( 0 < p < 1 \), since \( np(1 - p) < np \). It depends on both \( p \) and \( n \) and, for a fixed \( n \), is largest at \( p = 0.5 \).
- In the **Poisson distribution**, the mean and variance are always identical.
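
A quick calculation makes the binomial relationship concrete: the ratio of variance to mean is \( 1 - p \), which is always below 1 and approaches 1 (the Poisson case) as \( p \) shrinks. The sketch below assumes \( n = 20 \) and a handful of illustrative values of \( p \).

```python
n = 20  # assumed number of trials, for illustration

rows = []
for p in (0.01, 0.1, 0.5, 0.9):
    mean = n * p
    variance = n * p * (1 - p)
    rows.append((p, mean, variance, variance / mean))

for p, mean, variance, ratio in rows:
    print(f"p={p}: mean={mean:.2f}, variance={variance:.2f}, variance/mean={ratio:.2f}")
```

Among the listed values, the variance is largest at \( p = 0.5 \), yet it is still only half the mean there.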

Q10. In normal distribution with respect to mean position, where does the least frequent data appear?

In a **normal distribution**, the data is symmetrically distributed around the mean, forming a bell-shaped curve. The most frequent data (i.e., the mode) occurs at the **mean** position, as the mean, median, and mode are all the same in a perfectly normal distribution. The further you move away from the mean, the frequency of data points decreases.

### The Least Frequent Data in a Normal Distribution:
The least frequent data appears at the **tails** of the distribution, which are farthest away from the mean. These tails extend indefinitely in both directions, but the probability of data points occurring in the tails is extremely low.

#### Key Points:
- **At the mean**: The frequency of data is the highest.
- **At the tails** (both far left and far right of the mean): The frequency of data is the lowest.
- As you move several standard deviations away from the mean (whether positive or negative), the probability of data points appearing decreases rapidly.

### In Terms of Standard Deviations:
- In any normal distribution (the 68-95-99.7 rule):
  - Around **68%** of the data lies within **1 standard deviation** of the mean.
  - Around **95%** of the data lies within **2 standard deviations**.
  - Around **99.7%** of the data lies within **3 standard deviations**.
- Data points beyond **3 standard deviations** from the mean are rare, meaning the tails (i.e., \( x \) values far from the mean) contain the least frequent occurrences.
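
These percentages follow from the normal CDF: the probability of falling within \( k \) standard deviations of the mean is \( \operatorname{erf}(k / \sqrt{2}) \). The sketch below uses only the standard library. The function name `prob_within` is illustrative.

```python
import math

def prob_within(k):
    """P(|X - mean| <= k standard deviations) for any normal distribution."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    inside = prob_within(k)
    print(f"Within {k} sd: {inside:.4f}   in the tails: {1 - inside:.4f}")
```

The shrinking second column shows how quickly the tails thin out as the distance from the mean grows.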

### Visual Representation:
In the bell curve of a normal distribution, the tails (extreme left and extreme right, far from the mean) represent the least frequent data. These are the outliers or rare occurrences in the dataset.