# 1. What is the Probability density function?
ANSWER:

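In probability theory, the probability density function (PDF) of a continuous random variable is a function whose value at a point describes the relative likelihood of the variable taking a value near that point. For a continuous variable the probability of any single exact value is zero, so probabilities are assigned to ranges of values rather than to individual points. The PDF is used to model continuous probability distributions, such as the normal distribution or the exponential distribution.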

The PDF is defined as the derivative of the cumulative distribution function (CDF), which gives the probability that a random variable takes a value less than or equal to a given value. In mathematical notation, the PDF of a random variable X is denoted as f(x), and is given by:

f(x) = d/dx F(x)

where F(x) is the CDF of X. The PDF describes the shape of the probability distribution, and the probability that X falls between two values a and b is the area under f(x) between them, which equals F(b) - F(a).

The total area under the PDF curve is equal to 1, since the probability of a random variable taking any value within its range is 100%. The PDF is a fundamental concept in probability theory, and is used in many areas of science, engineering, and finance to model and analyze data.
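
As a rough numerical check of the facts above (the PDF is the derivative of the CDF, and the total area under it is 1), the sketch below uses the standard normal distribution. The helper names `std_normal_cdf`, `std_normal_pdf`, `numerical_derivative`, and `trapezoid_area` are illustrative choices and not part of any library; the step size and interval are arbitrary assumptions.

```python
import math

def std_normal_cdf(x):
    """CDF of the standard normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_normal_pdf(x):
    """Closed-form PDF of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def numerical_derivative(func, x, h=1e-5):
    """Central-difference approximation of d/dx func(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def trapezoid_area(func, a, b, steps=10_000):
    """Approximate the integral of func over [a, b] with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * width)
    return total * width

# f(x) = d/dx F(x): the two numbers should agree to several decimal places
print(numerical_derivative(std_normal_cdf, 1.0), std_normal_pdf(1.0))

# The area under the PDF over a wide interval should be close to 1
print(trapezoid_area(std_normal_pdf, -8.0, 8.0))

# P(a <= X <= b) is the area under the PDF between a and b, i.e. F(b) - F(a)
print(trapezoid_area(std_normal_pdf, -1.0, 1.0),
      std_normal_cdf(1.0) - std_normal_cdf(-1.0))
```

The interval [-8, 8] stands in for the whole real line here; the normal density outside it is negligible.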

# 2. What are the types of Probability distribution?
ANSWER:

There are many types of probability distributions, but some of the most common are:

    Normal distribution: also known as the Gaussian distribution, this is a continuous probability distribution that is symmetric and bell-shaped. It is widely used in statistics, and many natural phenomena are approximately normally distributed.

    Binomial distribution: a discrete probability distribution that models the number of successes in a fixed number of independent trials, where each trial has only two possible outcomes (e.g., success or failure) and the same probability of success.

    Poisson distribution: a discrete probability distribution that models the number of events occurring in a fixed interval of time or space, when events happen independently at a constant average rate. It is often used in fields such as biology, physics, and finance.

    Exponential distribution: a continuous probability distribution that models the time between consecutive events in a Poisson process. It is commonly used to model the waiting time between events.

    Gamma distribution: a continuous probability distribution that, for an integer shape parameter, models the waiting time until a specified number of events occur in a Poisson process. It is commonly used in reliability engineering to model the time to failure of a system.

    Uniform distribution: a continuous probability distribution in which every value within a given range is equally likely, so the density is constant over that range. It is commonly used in simulations and random number generation.
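
To make the list more concrete, here is a minimal sketch that draws from several of these distributions using only Python's standard `random` module and compares each sample mean with the theoretical mean. The parameter values, the seed, and the helper `draw_binomial` are assumptions made for illustration; the Poisson distribution is left out because the standard library has no direct sampler for it.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs draw the same values
N = 20_000      # number of draws per distribution (arbitrary choice)

def draw_binomial(n, p):
    """Number of successes in n independent trials with success probability p."""
    return sum(random.random() < p for _ in range(n))

samples = {
    # name: (list of draws, theoretical mean)
    "normal(mu=0, sigma=1)": ([random.gauss(0.0, 1.0) for _ in range(N)], 0.0),
    "binomial(n=10, p=0.4)": ([draw_binomial(10, 0.4) for _ in range(N)], 10 * 0.4),
    "exponential(rate=2)": ([random.expovariate(2.0) for _ in range(N)], 1 / 2.0),
    "gamma(shape=3, scale=2)": ([random.gammavariate(3.0, 2.0) for _ in range(N)], 3.0 * 2.0),
    "uniform(0, 1)": ([random.uniform(0.0, 1.0) for _ in range(N)], 0.5),
}

for name, (draws, expected_mean) in samples.items():
    print(f"{name:24s} sample mean = {statistics.mean(draws):.3f}, "
          f"theoretical mean = {expected_mean:.3f}")
```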

# 3. Write a Python function to calculate the probability density function of a normal distribution with given mean and standard deviation at a given point.
ANSWER:

    import math

    def normal_pdf(x, mean, std_dev):
        """
        Calculates the probability density function of a normal distribution
        with a given mean and standard deviation at a given point x.
        """
        exponent = -(x - mean)**2 / (2 * std_dev**2)
        denominator = std_dev * math.sqrt(2 * math.pi)
        pdf = (1 / denominator) * math.exp(exponent)
        return pdf


EXAMPLE1:

    mean = 0
    std_dev = 1
    x = 1
    pdf = normal_pdf(x, mean, std_dev)
    print("PDF at x = 1 is:", pdf)

OUTPUT (rounded to six decimal places):

    PDF at x = 1 is: 0.241971


# 4. What are the properties of Binomial distribution? Give two examples of events where binomial distribution can be applied.
ANSWER:

The binomial distribution is a discrete probability distribution that models the number of successes in a fixed number of independent trials, where each trial has only two possible outcomes, typically labeled as success or failure. The properties of the binomial distribution include:

    Fixed number of trials: The binomial distribution assumes a fixed number of independent trials, denoted by n.

    Two outcomes: Each trial in the binomial distribution has only two possible outcomes, typically labeled as success or failure.

    Independent trials: The trials in the binomial distribution are assumed to be independent, meaning that the outcome of one trial does not affect the outcome of any other trial.

    Constant probability of success: The probability of success in each trial is constant, denoted by p.

    Discrete distribution: The binomial distribution is a discrete probability distribution, meaning that the number of successes must be a whole number.

Two examples of events where the binomial distribution can be applied are:

    Flipping a coin: A coin flip can be modeled using the binomial distribution, where each flip has only two possible outcomes (heads or tails), and the probability of success (i.e., getting heads) is constant at 0.5.

    Quality control: In manufacturing, the number of defective items in a sample can be modeled using the binomial distribution, where each item is either defective or not defective, and the probability of a defective item is constant. The number of defective items in a sample can then be used to make decisions about the quality of the production process.
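
The two examples above can be sketched with the binomial probability mass function, P(X = k) = C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf`, the sample size of 20, and the 5% defect rate are assumptions chosen for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Coin flipping: probability of exactly 5 heads in 10 fair flips
print(binomial_pmf(5, 10, 0.5))  # 252 / 1024 = 0.24609375

# Quality control: probability of at most 1 defective item in a sample of 20,
# assuming (hypothetically) that each item is defective with probability 0.05
p_at_most_one = sum(binomial_pmf(k, 20, 0.05) for k in range(2))
print(p_at_most_one)

# The probabilities over all possible counts k = 0..n add up to 1
print(sum(binomial_pmf(k, 10, 0.5) for k in range(11)))
```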

# 5. Generate a random sample of size 1000 from a binomial distribution with probability of success 0.4 and plot a histogram of the results using matplotlib.
ANSWER:

    import numpy as np
    import matplotlib.pyplot as plt

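    # set parameters
    n = 1000  # number of trials per draw (not given in the question; assumed here)
    p = 0.4  # probability of success

    # generate a random sample of 1000 draws, each counting successes in n trials
    sample = np.random.binomial(n, p, size=1000)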

    # plot histogram
    plt.hist(sample, bins=20, alpha=0.5, density=True, color='b')
    plt.title("Histogram of Binomial Distribution")
    plt.xlabel("Number of Successes")
    plt.ylabel("Probability Density")
    plt.show()


# 6. Write a Python function to calculate the cumulative distribution function of a Poisson distribution with given mean at a given point.
ANSWER:

    import math

    def poisson_cdf(x, mean):
        """
        Calculates the cumulative distribution function of a Poisson distribution
        with a given mean at a given point x.
        """
        cdf = 0
        for i in range(x+1):
            cdf += (mean**i / math.factorial(i)) * math.exp(-mean)
        return cdf

EXAMPLE1:

    mean = 3
    x = 5
    cdf = poisson_cdf(x, mean)
    print("CDF at x = 5 is:", cdf)

OUTPUT (rounded to six decimal places):

    CDF at x = 5 is: 0.916082


# 7. How is the Binomial distribution different from the Poisson distribution?
ANSWER:
    
Binomial and Poisson distributions are both discrete probability distributions that are used to model counts of events. The key differences between the two are:

    Number of trials: The binomial distribution models the number of successes in a fixed number of independent trials, while the Poisson distribution models the number of occurrences of an event in a fixed interval of time or space.

    Probability of success: In the binomial distribution, the probability of success is constant and independent of the number of trials. In the Poisson distribution, the probability of an event occurring in a fixed interval is proportional to the length of the interval, but is independent of the number of previous occurrences.

    Type of events: The binomial distribution is used to model events that have only two possible outcomes, while the Poisson distribution is used to model events that can occur any number of times.

    Assumptions: The binomial distribution assumes a fixed number of independent trials with a constant probability of success, while the Poisson distribution assumes that the events occur independently and at a constant rate.

    Mean and variance: The mean of a binomial distribution is n * p, where n is the number of trials and p is the probability of success. The variance is n * p * (1 - p). In contrast, the mean and variance of a Poisson distribution are both equal to lambda, which is the rate of occurrence of the event.

    Limiting case: The Poisson distribution is a limiting case of the binomial distribution when the number of trials n becomes very large and the probability of success p becomes very small, while the product n * p stays fixed at lambda.
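
The limiting case can be checked numerically. The sketch below holds n * p fixed at an assumed value of lambda = 3 and measures how far the Binomial(n, lambda / n) probabilities are from the Poisson(lambda) probabilities as n grows. The names `binomial_pmf`, `poisson_pmf`, and `max_gap`, and the choice to compare only k = 0..20, are illustrative assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 3.0  # fixed mean n * p (assumed value for the demonstration)

def max_gap(n):
    """Largest absolute difference between the two PMFs over k = 0..20."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))

# The gap should shrink as n grows and p = lam / n shrinks
for n in (30, 100, 1000, 10_000):
    print(n, max_gap(n))
```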

# 8. Generate a random sample of size 1000 from a Poisson distribution with mean 5 and calculate the sample mean and variance.
ANSWER:

    import numpy as np

    # Generate random sample
    sample = np.random.poisson(lam=5, size=1000)

    # Calculate sample mean and variance
    # (np.var uses ddof=0 by default; pass ddof=1 for the unbiased sample variance)
    sample_mean = np.mean(sample)
    sample_var = np.var(sample)

    print("Sample mean:", sample_mean)
    print("Sample variance:", sample_var)

OUTPUT (values vary from run to run because no random seed is set):

    Sample mean: 5.034
    Sample variance: 5.062524


# 9. How are the mean and variance related in the Binomial distribution and the Poisson distribution?
ANSWER:

In the binomial distribution, the mean (μ) and variance (σ^2) are:

μ = np

σ^2 = np(1-p) = μ(1-p)

where n is the number of trials and p is the probability of success. Because 0 < 1-p < 1 whenever 0 < p < 1, the variance is always smaller than the mean, and for a fixed n it is largest at p = 0.5.

In the Poisson distribution, the mean (λ) and variance (σ^2) are equal:

σ^2 = λ

The variance is equal to the mean. A binomial distribution with the same mean has the smaller variance np(1-p), so for equal means the Poisson distribution is the more spread out of the two.

Both distributions are commonly used to model count data, but they differ in their assumptions and the types of events they model. The binomial distribution models the number of successes in a fixed number of independent trials, while the Poisson distribution models the number of occurrences of an event in a fixed interval of time or space. The Poisson distribution is a limiting case of the binomial distribution when the number of trials becomes very large and the probability of success becomes very small with np held fixed, and it assumes that the events occur independently and at a constant rate.
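
As a sketch of these relationships, the code below computes the mean and variance directly from each probability mass function. The parameters n = 50 and p = 0.1 are assumed values, the helper `mean_and_variance` is a hypothetical name, and the Poisson sum is truncated at k = 99, where the remaining probability is negligible for lambda = 5.

```python
import math

def mean_and_variance(pmf, support):
    """Mean and variance of a discrete distribution given its PMF and support."""
    mean = sum(k * pmf(k) for k in support)
    variance = sum((k - mean) ** 2 * pmf(k) for k in support)
    return mean, variance

n, p = 50, 0.1  # assumed binomial parameters
lam = n * p     # Poisson rate chosen so both distributions share the same mean

def binomial_pmf(k):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k):
    return lam**k * math.exp(-lam) / math.factorial(k)

binom_mean, binom_var = mean_and_variance(binomial_pmf, range(n + 1))
pois_mean, pois_var = mean_and_variance(poisson_pmf, range(100))  # truncated support

# Binomial: variance = mean * (1 - p), which is smaller than the mean
print("binomial:", binom_mean, binom_var, binom_mean * (1 - p))

# Poisson: variance = mean = lambda
print("poisson: ", pois_mean, pois_var)
```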

# 10. In a normal distribution, where does the least frequent data appear relative to the mean?
ANSWER:

In a normal distribution, the least frequent data appears at the tails of the distribution, which are the regions farthest from the mean. This is because the normal distribution is a bell-shaped curve that is symmetrical around the mean, with the highest frequency of data occurring at the center of the distribution.

The normal distribution is characterized by two parameters: the mean (μ) and the standard deviation (σ). The mean represents the center of the distribution, and the standard deviation represents the spread of the distribution. About 68% of the data falls within one standard deviation of the mean, 95% falls within two standard deviations, and 99.7% falls within three standard deviations.

Therefore, the least frequent data in a normal distribution appears in the tails, which are more than two or three standard deviations away from the mean. Values this far from the mean are often treated as outliers, since they represent the rarest and most extreme observations in the dataset.
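
The percentages quoted above can be reproduced with the error function: for any normal distribution, the probability of landing within k standard deviations of the mean is erf(k / sqrt(2)). The function name `prob_within` is an illustrative choice.

```python
import math

def prob_within(k):
    """P(|X - mu| <= k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    inside = prob_within(k)
    tails = 1.0 - inside  # probability mass in the two tails combined
    print(f"within {k} standard deviation(s): {inside:.4f}, in the tails: {tails:.4f}")
```

The tail probability drops quickly as k increases, which is why observations beyond two or three standard deviations are the least frequent.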