Q1. What is the Probability density function?

The Probability Density Function (PDF) describes how probability is distributed over the values of a continuous random variable. The value of the PDF at a point is not the probability of that point: for a continuous variable, the probability of any single exact value is zero. Instead, the PDF gives a density, and the probability that the variable falls within a range is the area under the PDF over that range.

Mathematically, for a continuous random variable \(X\), the PDF is represented as \(f(x)\), where:

- \(f(x)\) ≥ 0 for all \(x\) (non-negative)
- The area under the curve over a certain range gives the probability that \(X\) falls within that range.
- The integral of the PDF over its entire range is equal to 1.

For example, for a normal distribution, \(f(x)\) is the height of the bell curve at \(x\). That height is not the probability of observing exactly \(x\); it measures how densely probability is concentrated near \(x\), so an interval around a high-density point carries more probability than an equally wide interval around a low-density point.

The PDF takes a different mathematical form for each distribution, and that form determines which ranges of values are more or less likely.
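To make the properties listed above concrete, here is a minimal sketch that approximates areas under a standard normal PDF with a simple midpoint rule. The helper names `standard_normal_pdf` and `area_under`, the step count, and the use of [-8, 8] as a stand-in for the whole real line are assumptions made for illustration only.

```python
import math

def standard_normal_pdf(x):
    """Density of the standard normal distribution (mean 0, std dev 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def area_under(pdf, a, b, steps=20_000):
    """Approximate the integral of pdf over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

total_area = area_under(standard_normal_pdf, -8, 8)        # stand-in for (-inf, inf)
prob_within_one = area_under(standard_normal_pdf, -1, 1)   # P(-1 <= X <= 1)
prob_single_point = area_under(standard_normal_pdf, 1, 1)  # zero-width interval

print(f"Total area:       {total_area:.6f}")
print(f"P(-1 <= X <= 1):  {prob_within_one:.6f}")
print(f"P(X = 1):         {prob_single_point:.6f}")
```

The total area comes out very close to 1, the interval from -1 to 1 captures roughly 68% of the probability, and a zero-width interval has zero probability even though the density at that point is positive.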

Q2. What are the types of Probability distribution?

There are several types of probability distributions, each with its own characteristics and applications. Here are some common ones:

1. Discrete Probability Distributions:
    - Bernoulli Distribution: Models a single trial with two outcomes (success or failure).
    - Binomial Distribution: Describes the number of successes in a fixed number of independent trials.
    - Poisson Distribution: Models the number of events occurring in a fixed interval of time or space when the events occur with a known constant rate and independently of the time since the last event.

2. Continuous Probability Distributions:
    - Normal Distribution (Gaussian Distribution): Symmetric, bell-shaped distribution used in various fields due to its prevalence in nature and the Central Limit Theorem.
    - Exponential Distribution: Models the time between events in a Poisson process (events occur continuously and independently at a constant average rate).
    - Uniform Distribution: All values within a range are equally likely.

Each continuous distribution is characterized by a probability density function (PDF), and each discrete distribution by a probability mass function (PMF), which gives the probability of each individual outcome. These distributions are used in statistics, the sciences, engineering, finance, and many other fields to model and analyze random phenomena.
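As a rough illustration, the sketch below draws samples from several of the distributions listed above using only Python's standard `random` module and compares each sample mean with the theoretical mean. The parameter values, the seed, and the helper names `bernoulli` and `binomial` are arbitrary choices for this example; the Poisson distribution is left out because the standard library has no sampler for it.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 20_000               # draws per distribution (arbitrary choice)

def bernoulli(p):
    """One trial: 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

def binomial(n, p):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli(p) for _ in range(n))

samplers = {
    # name: (function drawing one value, theoretical mean)
    "Bernoulli(p=0.3)":      (lambda: bernoulli(0.3), 0.3),
    "Binomial(n=10, p=0.3)": (lambda: binomial(10, 0.3), 3.0),
    "Normal(mu=2, sigma=1)": (lambda: rng.gauss(2, 1), 2.0),
    "Exponential(rate=0.5)": (lambda: rng.expovariate(0.5), 2.0),
    "Uniform(0, 4)":         (lambda: rng.uniform(0, 4), 2.0),
}

sample_means = {}
for name, (draw, theoretical_mean) in samplers.items():
    sample_means[name] = statistics.fmean(draw() for _ in range(N))
    print(f"{name:24s} sample mean = {sample_means[name]:.3f} (theory {theoretical_mean})")
```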

Q3. Write a Python function to calculate the probability density function of a normal distribution with
given mean and standard deviation at a given point.

Here's a Python function that calculates the probability density function (PDF) of a normal distribution at a given point \( x \) with a specified mean (\( \mu \)) and standard deviation (\( \sigma \)):

import math

def normal_pdf(x, mean, std_dev):
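    # A normal distribution requires a positive standard deviation
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")

    # Calculate the exponent in the PDF formula
    exponent = -0.5 * ((x - mean) / std_dev) ** 2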
    
    # Calculate the constant factor in the PDF formula
    constant = 1 / (std_dev * math.sqrt(2 * math.pi))
    
    # Calculate and return the PDF value
    pdf = constant * math.exp(exponent)
    return pdf

You can use this function by passing the value at which you want to calculate the PDF, along with the mean and standard deviation of the normal distribution. For instance:

mean = 10  # Mean of the normal distribution
std_dev = 2  # Standard deviation of the normal distribution
point = 12  # Point at which to calculate the PDF

pdf_value = normal_pdf(point, mean, std_dev)
print(f"The probability density function at {point} is: {pdf_value}")

Replace the `mean`, `std_dev`, and `point` variables with your desired values to calculate the PDF at a specific point for a normal distribution with the given mean and standard deviation.
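One way to sanity-check a hand-written formula like this is to compare it with `statistics.NormalDist` from Python's standard library. The sketch below repeats the same formula so that it runs on its own; the example inputs match the ones used above.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mean, std_dev):
    """Same formula as the function above, repeated so this block is self-contained."""
    exponent = -0.5 * ((x - mean) / std_dev) ** 2
    return math.exp(exponent) / (std_dev * math.sqrt(2 * math.pi))

hand_written = normal_pdf(12, 10, 2)
reference = NormalDist(mu=10, sigma=2).pdf(12)

print(f"normal_pdf: {hand_written:.6f}")
print(f"NormalDist: {reference:.6f}")
print("Match:", math.isclose(hand_written, reference))
```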

Q4. What are the properties of Binomial distribution? Give two examples of events where binomial
distribution can be applied.

The Binomial distribution is characterized by several key properties:

1. Fixed Number of Trials: The distribution models a fixed number of independent trials, denoted as \( n \).

2. Two Possible Outcomes: Each trial results in one of two outcomes, usually termed as success (S) or failure (F).

3. Constant Probability of Success: The probability of success, denoted as \( p \), remains constant across all trials.

4. Independence: Each trial is independent of the others.

5. Discreteness: The distribution is discrete; the number of successes can only take the integer values \( 0, 1, \dots, n \).

Examples where the Binomial distribution can be applied:

1. Coin Flipping: Consider flipping a fair coin \( n \) times and counting the number of heads obtained. Each flip has two outcomes (heads or tails), the probability of getting heads (success) remains the same for each flip, and the trials are independent.

2. Medical Testing: Suppose a diagnostic test returns the correct result for any given patient with probability \( p \), and therefore an incorrect result with probability \( 1 - p \). If \( n \) patients are tested independently, the number of correct results follows a binomial distribution with parameters \( n \) and \( p \), since each test has two possible outcomes (correct or incorrect) and the same probability of success.

In essence, the Binomial distribution finds application in scenarios involving a fixed number of independent trials, each with two possible outcomes and a constant probability of success.
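The coin-flipping example can be sketched numerically with the binomial probability mass function, \( P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \). The function name `binomial_pmf` and the choice of ten flips are assumptions for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # ten flips of a fair coin
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(f"P(exactly 5 heads) = {pmf[5]:.4f}")
print(f"Sum over k = 0..{n} = {sum(pmf):.4f}")
```

Because the outcomes \( 0, 1, \dots, n \) are the only possible counts, their probabilities sum to 1.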

Q5. Generate a random sample of size 1000 from a binomial distribution with probability of success 0.4
and plot a histogram of the results using matplotlib.

You can generate a random sample from a binomial distribution using Python's NumPy library and then plot a histogram of the results using Matplotlib. Here's an example:

import numpy as np
import matplotlib.pyplot as plt

# Parameters for the binomial distribution
probability_success = 0.4
num_trials = 1       # Trials per draw; the question does not specify n, so n = 1 is assumed
sample_size = 1000   # Number of draws in the sample

# Generate the random sample from the binomial distribution
random_sample = np.random.binomial(num_trials, probability_success, sample_size)

# Plot a histogram with one bin centered on each outcome (0 and 1)
plt.hist(random_sample, bins=[-0.5, 0.5, 1.5], edgecolor='black')
plt.xlabel('Outcome')
plt.ylabel('Frequency')
plt.title('Histogram of Binomial Distribution')
plt.xticks([0, 1], ['Failure', 'Success'])  # Labeling x-axis ticks
plt.show()

This code snippet generates a random sample of size 1000 from a binomial distribution with a probability of success of 0.4 and a single trial per draw, so each value is either 0 (failure) or 1 (success), and the histogram displays how often each occurs. Adjust `probability_success`, `num_trials`, and `sample_size` as needed; if `num_trials` is larger than 1, the draws range from 0 to `num_trials`, so the bin edges and tick labels need to change accordingly.
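The question does not say how many trials each draw consists of, and the code above passes 1 as the number of trials, so every draw is a single success/failure outcome. As a sketch of the other common reading, the block below assumes 10 trials per draw (an assumption, not something stated in the question) and uses only the standard library, printing a text histogram instead of a Matplotlib figure.

```python
import random
from collections import Counter

rng = random.Random(0)   # fixed seed for repeatability
n_trials = 10            # assumed; the question does not specify n
p_success = 0.4
sample_size = 1000

# Each draw counts the successes in n_trials independent Bernoulli trials.
sample = [sum(rng.random() < p_success for _ in range(n_trials))
          for _ in range(sample_size)]

counts = Counter(sample)
for k in range(n_trials + 1):
    # One '#' per 5 draws keeps the bars short enough to read
    print(f"{k:2d} | {'#' * (counts[k] // 5)} {counts[k]}")
```

The bars should peak near \( np = 4 \) successes and thin out toward 0 and 10.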

Q6. Write a Python function to calculate the cumulative distribution function of a Poisson distribution
with given mean at a given point.

The cumulative distribution function (CDF) of a Poisson distribution with mean \(\lambda\), evaluated at a non-negative integer \(k\), gives the probability of observing at most \(k\) events. It can be calculated using the formula:

\[ \text{CDF}(k, \lambda) = \sum_{i=0}^{k} \frac{e^{-\lambda} \cdot \lambda^i}{i!} \]

Here's a Python function to calculate the CDF of a Poisson distribution at a given point \(k\) with a specified mean \(\lambda\):

import math

def poisson_cdf(k, lambd):
    cdf = 0
    for i in range(k + 1):
        cdf += math.exp(-lambd) * (lambd ** i) / math.factorial(i)
    return cdf

You can use this function by passing the value at which you want to calculate the CDF and the mean of the Poisson distribution. For example:

mean_lambda = 3.5  # Mean of the Poisson distribution
point_k = 2  # Point at which to calculate the CDF

cdf_value = poisson_cdf(point_k, mean_lambda)
print(f"The cumulative distribution function at {point_k} is: {cdf_value}")

Replace the `mean_lambda` and `point_k` variables with your desired values to calculate the CDF at a specific point for a Poisson distribution with the given mean.
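For large \(k\), computing `lambd ** i` and `math.factorial(i)` separately produces very large intermediate numbers. A possible variant, sketched below under the assumption that \(k\) is truncated to an integer, builds each term from the previous one using \( P(X = i) = P(X = i - 1) \cdot \lambda / i \). The name `poisson_cdf_iterative` is hypothetical.

```python
import math

def poisson_cdf_iterative(k, lam):
    """P(X <= k) for X ~ Poisson(lam), building each term from the previous one."""
    if k < 0:
        return 0.0             # no probability mass below zero
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, int(k) + 1):
        term *= lam / i        # P(X = i) = P(X = i - 1) * lam / i
        cdf += term
    return cdf

cdf_at_2 = poisson_cdf_iterative(2, 3.5)
cdf_at_50 = poisson_cdf_iterative(50, 3.5)
print(f"P(X <= 2)  = {cdf_at_2:.6f}")
print(f"P(X <= 50) = {cdf_at_50:.6f}")
```

The second value is very close to 1, as expected for a CDF evaluated far above the mean.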

Q7. How is the Binomial distribution different from the Poisson distribution?

The Binomial distribution and the Poisson distribution are both fundamental probability distributions, but they differ in several key aspects:

1. Nature of Trials:
   - Binomial Distribution: It models a fixed number of independent trials, each with two possible outcomes (success or failure).
   - Poisson Distribution: It models the number of events occurring in a fixed interval of time or space, assuming a continuous process where events occur at a known average rate but independently of the time since the last event.

2. Range of Possible Values:
   - Binomial Distribution: The count of successes is bounded; it takes integer values from 0 to \(n\).
   - Poisson Distribution: The count of events has no upper bound; it can be any non-negative integer.

3. Parameters:
   - Binomial Distribution: Characterized by \(n\) (the number of trials) and \(p\) (the probability of success in each trial).
   - Poisson Distribution: Characterized by a single parameter \(\lambda\) (the average rate of occurrence).

4. Assumptions:
   - Binomial Distribution: Assumes a fixed number of trials and a constant probability of success in each trial.
   - Poisson Distribution: Assumes events occur randomly and independently at a constant average rate.

5. Applicability:
   - Binomial Distribution: Often used when there's a fixed number of trials, such as coin flips, dice rolls, or the number of successes in a specific number of experiments.
   - Poisson Distribution: Applied in scenarios where events occur continuously and independently at a known average rate, like the number of phone calls at a call center within a specific time frame or the number of arrivals at a service point within a given interval.

In summary, while both distributions involve discrete counts, the Binomial distribution deals with a fixed number of trials and a limited number of potential outcomes, whereas the Poisson distribution deals with a continuous process and an unlimited number of potential occurrences within a fixed interval.
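The two distributions are also connected: when \(n\) is large and \(p\) is small with \(np = \lambda\) held fixed, the Binomial PMF approaches the Poisson PMF. The sketch below illustrates this numerically; the value \(\lambda = 3\), the values of \(n\), and the range \(k = 0, \dots, 10\) are arbitrary choices for the example.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3.0
max_gap = {}
for n in (10, 100, 10_000):
    p = lam / n  # shrink p as n grows so the mean n * p stays at lam
    max_gap[n] = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
                     for k in range(11))
    print(f"n = {n:6d}: largest PMF gap for k = 0..10 is {max_gap[n]:.6f}")
```

The gap shrinks as \(n\) grows, which is why the Poisson distribution is often used as an approximation for rare events across many trials.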

Q8. Generate a random sample of size 1000 from a Poisson distribution with mean 5 and calculate the
sample mean and variance.

To generate a random sample from a Poisson distribution with a mean of 5 and calculate the sample mean and variance using Python, you can use the NumPy library. Here's an example:

import numpy as np

# Parameters for the Poisson distribution
mean = 5
sample_size = 1000

# Generate random sample from Poisson distribution
random_sample = np.random.poisson(mean, sample_size)

# Calculate sample mean and variance
sample_mean = np.mean(random_sample)
sample_variance = np.var(random_sample, ddof=1)  # ddof=1 divides by n - 1 (unbiased sample variance)

print(f"Sample Mean: {sample_mean}")
print(f"Sample Variance: {sample_variance}")

This code snippet generates a random sample of size 1000 from a Poisson distribution with a mean of 5 using NumPy's `random.poisson()` function. Then, it calculates the sample mean and variance using NumPy's `mean()` and `var()` functions, respectively.

The `sample_mean` variable will contain the calculated sample mean, while the `sample_variance` variable will contain the calculated sample variance of the generated random sample. Adjust the `mean` and `sample_size` variables as needed for different means and sample sizes.
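For a Poisson distribution the theoretical mean and variance are both equal to the mean parameter, so both sample statistics should land near 5. The sketch below repeats the experiment with the standard library only, swapping in Knuth's multiplication method as the sampler (this is not how NumPy generates its draws), and shows the two common variance estimators side by side. Note that NumPy's `var()` divides by \(n\) unless `ddof=1` is passed. The function name `poisson_draw` and the seed are assumptions for illustration.

```python
import math
import random
import statistics

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Keep multiplying uniform draws until the product falls to the threshold
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(1)  # fixed seed for repeatability
sample = [poisson_draw(5, rng) for _ in range(1000)]

sample_mean = statistics.fmean(sample)
unbiased_var = statistics.variance(sample)      # divides by n - 1
population_var = statistics.pvariance(sample)   # divides by n

print(f"Sample mean:               {sample_mean:.3f}")
print(f"Variance (n - 1 divisor):  {unbiased_var:.3f}")
print(f"Variance (n divisor):      {population_var:.3f}")
```

With 1000 observations the two variance estimates differ only by a factor of 999/1000, so the choice matters mainly for small samples.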

Q9. How are the mean and variance related in the Binomial distribution and the Poisson distribution?

In both Binomial and Poisson distributions, the mean and variance are related, but the specific relationships differ due to the nature of these distributions.

Binomial Distribution:

- Mean of a Binomial Distribution: For a Binomial distribution with parameters \( n \) (number of trials) and \( p \) (probability of success), the mean (\( \mu \)) is given by \( \mu = np \).
  
- Variance of a Binomial Distribution: The variance (\( \sigma^2 \)) of a Binomial distribution is calculated using the formula \( \sigma^2 = np(1-p) \).

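In a Binomial distribution, both the mean and the variance depend on the number of trials \( n \) and the probability of success \( p \). They are linked by \( \sigma^2 = \mu(1-p) \), so the variance is always smaller than the mean whenever \( p > 0 \). For a fixed \( n \), the variance is largest at \( p = 0.5 \) and shrinks toward 0 as \( p \) approaches 0 or 1.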

Poisson Distribution:

- Mean and Variance of a Poisson Distribution: For a Poisson distribution with rate \( \lambda \), the mean (\( \mu \)) and the variance (\( \sigma^2 \)) are equal: \( \mu = \sigma^2 = \lambda \).

In a Poisson distribution, the mean and variance are both solely determined by the average rate of occurrence (\( \lambda \)). As \( \lambda \) increases, both the mean and variance also increase, indicating a wider spread of the distribution.

In summary, for a Binomial distribution, the variance depends on both the number of trials and the probability of success, whereas for a Poisson distribution, the variance is solely determined by the mean (\( \lambda \)), and both the mean and variance are equal.
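These closed-form results can be checked numerically by computing the mean and variance directly from each PMF. The sketch below does so for a few values of \( p \) with \( n = 20 \), and for a Poisson distribution with \( \lambda = 4 \) whose PMF is truncated at \( k = 100 \), where the remaining tail is negligible. The helper name `moments` and all parameter values are assumptions for illustration.

```python
import math

def moments(pmf_values):
    """Mean and variance of a distribution given as [P(X=0), P(X=1), ...]."""
    mean = sum(k * q for k, q in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf_values))
    return mean, var

n, lam = 20, 4.0
results = {}
for p in (0.2, 0.5, 0.8):
    pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    results[p] = moments(pmf)
    mean, var = results[p]
    print(f"Binomial(n={n}, p={p}): mean = {mean:.3f}, variance = {var:.3f}")

# Poisson PMF truncated at k = 100; the omitted tail is negligible for lam = 4
poisson_pmf = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(101)]
poisson_mean, poisson_var = moments(poisson_pmf)
print(f"Poisson(lam={lam}): mean = {poisson_mean:.3f}, variance = {poisson_var:.3f}")
```

The binomial variance matches \( np(1-p) \) in each case and peaks at \( p = 0.5 \), while the Poisson mean and variance coincide.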

Q10. In a normal distribution, where does the least frequent data appear relative to the mean?

In a normal distribution, the least frequent data occurs at the tails of the distribution, farthest away from the mean. The tails of a normal distribution represent the extreme values, and these values are less likely to occur compared to the values closer to the mean.

Because the normal distribution is symmetric about its mean:

- The data at the extremes, both the far left (lower than the mean by multiple standard deviations) and far right (higher than the mean by multiple standard deviations), are the least frequent.
- The probability density decreases as you move away from the mean towards the tails of the distribution, at the same rate on both sides.

This is in line with the properties of a normal distribution, where the majority of the data is concentrated around the mean, and as you move away from the mean towards the tails, the probability of occurrence decreases rapidly.
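How quickly the tails thin out can be sketched with `statistics.NormalDist` from the standard library. The block below uses a standard normal distribution (mean 0, standard deviation 1) as an assumed example and reports the share of probability within and beyond 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

within = {}
for k in (1, 2, 3):
    # Probability of landing within k standard deviations of the mean
    within[k] = std_normal.cdf(k) - std_normal.cdf(-k)
    print(f"within {k} sd: {within[k]:.4f}   beyond: {1 - within[k]:.4f}")

# Density three standard deviations out, relative to the density at the mean
density_ratio = std_normal.pdf(3) / std_normal.pdf(0)
print(f"pdf(3) / pdf(0) = {density_ratio:.4f}")
```

Only about 0.3% of values fall more than three standard deviations from the mean, which is where the least frequent data lies.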