Q1. What is the Probability density function?
The Probability Density Function (PDF) is a concept in probability theory and statistics that describes how probability is distributed over the values of a continuous random variable. For a continuous variable the probability of any single exact value is zero, so f(x) is a density rather than a probability: the probability that the variable falls in a small interval around x is approximately f(x) multiplied by the width of that interval. The PDF plays the role for continuous random variables that the probability mass function plays for discrete ones.

In mathematical notation, the PDF is denoted by f(x) for a random variable X, and it satisfies the following properties:

f(x) is non-negative: f(x) ≥ 0 for all x in the range of X.
The area under the curve of the PDF over its entire range is equal to 1: ∫f(x)dx = 1.
The probability of X being within a specific interval [a, b] is given by the integral of the PDF over that interval:
P(a ≤ X ≤ b) = ∫[a, b] f(x)dx
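
The properties above can be checked numerically. The following is a minimal sketch, assuming the standard normal density as the example f(x) and a simple midpoint rule for the integral; the helper names `standard_normal_pdf` and `integrate` are illustrative and not part of the original answer.

```python
import math

def standard_normal_pdf(x):
    """Density of the standard normal distribution, used here as a sample f(x)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Total area: the tails beyond +/-10 are negligible, so this is close to 1.
total_area = integrate(standard_normal_pdf, -10, 10)

# P(-1 <= X <= 1): area under the curve between -1 and 1 (about 0.6827).
prob_within_one = integrate(standard_normal_pdf, -1, 1)

print(total_area, prob_within_one)
```

The same `integrate` helper works for any interval [a, b], which mirrors the third property: a probability is always an area under f(x), never a single value of f(x).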

Q2. What are the types of Probability distribution?
There are several types of probability distributions, each applicable to different types of random variables. Some common probability distributions include:

Bernoulli Distribution: Describes the probability of success (1) or failure (0) in a single trial.
Binomial Distribution: Represents the probability of a certain number of successes in a fixed number of independent Bernoulli trials.
Poisson Distribution: Models the number of events that occur in a fixed interval of time or space, assuming a constant rate of occurrence.
Normal Distribution (Gaussian Distribution): A continuous distribution with a symmetric bell-shaped curve, commonly used in many statistical applications.
Exponential Distribution: Models the waiting time between events in a Poisson process. It is continuous, whereas the Poisson distribution counts the events themselves.
Uniform Distribution: Has constant probability across a specific range, resulting in a rectangular-shaped probability density function.
Gamma Distribution: Generalizes the exponential distribution. For an integer shape parameter k it models the waiting time until the k-th event in a Poisson process.
Chi-Square Distribution: Arises from the sum of squared independent standard normal variables and is commonly used in hypothesis testing.
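
Two of the relationships in this list can be illustrated by simulation: a binomial count is a sum of Bernoulli trials, and a chi-square value is a sum of squared standard normal draws. This is only a sketch using Python's `random` module; the helper names, the seed, and the parameter values (n = 10, p = 0.3, 4 degrees of freedom) are assumptions chosen for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def bernoulli(p):
    """One trial: 1 (success) with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

def binomial(n, p):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli(p) for _ in range(n))

def chi_square(k):
    """Sum of k squared standard normal draws."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(k))

draws = 20_000
binomial_mean = sum(binomial(10, 0.3) for _ in range(draws)) / draws
chi_square_mean = sum(chi_square(4) for _ in range(draws)) / draws

# Theory: Binomial(10, 0.3) has mean 3; chi-square with 4 degrees of freedom has mean 4.
print(binomial_mean, chi_square_mean)
```

The simulated averages should land close to the theoretical means, with small differences due to sampling noise.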

Q3. Write a Python function to calculate the probability density function of a normal distribution with given mean and standard deviation at a given point.

import math

def normal_pdf(x, mean, std_dev):
    coefficient = 1 / (std_dev * math.sqrt(2 * math.pi))
    exponent = -((x - mean) ** 2) / (2 * std_dev ** 2)
    return coefficient * math.exp(exponent)
This function takes the value x, mean, and standard deviation as inputs and returns the probability density of the normal distribution at that point.
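
One way to gain confidence in a hand-written density is to compare it with the standard library. The sketch below assumes a variant named `normal_pdf_checked` (a hypothetical name) that also rejects a non-positive standard deviation, and compares it with `statistics.NormalDist.pdf` at a few arbitrary points.

```python
import math
from statistics import NormalDist

def normal_pdf_checked(x, mean, std_dev):
    """Normal density at x; rejects a non-positive standard deviation."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    z = (x - mean) / std_dev
    return math.exp(-0.5 * z * z) / (std_dev * math.sqrt(2 * math.pi))

# Compare with the standard library at a few points.
for x in (-1.0, 0.0, 2.5):
    ours = normal_pdf_checked(x, mean=1.0, std_dev=2.0)
    reference = NormalDist(mu=1.0, sigma=2.0).pdf(x)
    assert math.isclose(ours, reference, rel_tol=1e-9)

print(normal_pdf_checked(0, 0, 1))  # peak of the standard normal, 1/sqrt(2*pi)
```

The validation step is a design choice rather than a requirement of the question: without it, a standard deviation of zero would surface as a less descriptive ZeroDivisionError.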

Q4. What are the properties of Binomial distribution? Give two examples of events where binomial distribution can be applied.

Properties of the Binomial Distribution:

The experiment consists of a fixed number of independent trials.
Each trial has two possible outcomes: success (S) or failure (F).
The probability of success (p) is constant for each trial.
The two outcomes of a trial are mutually exclusive: each trial ends in either success or failure, never both.
The random variable X represents the number of successes in a fixed number of trials (n).

Examples of events where the binomial distribution can be applied:

Tossing a fair coin: The probability of heads is the same for every toss, and the outcome of one toss does not affect any other. The number of heads in a fixed number of tosses therefore follows a binomial distribution.
Exam pass rate: Suppose each student passes an exam with the same probability, independently of the others. The binomial distribution then gives the probability that a specific number of students pass out of a fixed number who sit the exam.
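
Both examples can be turned into numbers with the binomial probability mass function, P(X = k) = C(n, k) * p^k * (1 - p)^(n - k). The sketch below is illustrative only: the function name `binomial_pmf` and the exam figures (10 students, pass probability 0.7) are assumptions, not values from the original answer.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Coin example: exactly 5 heads in 10 tosses of a fair coin.
p_five_heads = binomial_pmf(5, 10, 0.5)

# Exam example (hypothetical numbers): at least 8 of 10 students pass when p = 0.7.
p_at_least_eight = sum(binomial_pmf(k, 10, 0.7) for k in range(8, 11))

print(p_five_heads, p_at_least_eight)
```

Summing the PMF over a range of k, as in the exam example, is how "at least" and "at most" questions are answered for a discrete distribution.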

Q5. Generate a random sample of size 1000 from a binomial distribution with a probability of success 0.4 and plot a histogram of the results using matplotlib.

import numpy as np
import matplotlib.pyplot as plt

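# Generate random sample from binomial distribution
# (n = 1 trial per draw, so each value is 0 for failure or 1 for success)
sample_size = 1000
probability_of_success = 0.4
random_sample = np.random.binomial(1, probability_of_success, size=sample_size)

# Plot histogram; bin edges at -0.5, 0.5, 1.5 centre the bars on 0 and 1
plt.hist(random_sample, bins=[-0.5, 0.5, 1.5], edgecolor='black')
plt.xticks([0, 1], ['Failure', 'Success'])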
plt.xlabel('Outcome')
plt.ylabel('Frequency')
plt.title('Binomial Distribution Sample')
plt.show()
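This code generates a random sample of 1000 data points from a binomial distribution with a probability of success of 0.4. Because the question does not specify the number of trials, the code uses n = 1 trial per draw (a Bernoulli distribution), so every data point is either 0 (failure) or 1 (success). The histogram displays the frequency of successes and failures in the sample.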

Q6. Write a Python function to calculate the cumulative distribution function of a Poisson distribution with a given mean at a given point.

import math

def poisson_cdf(k, mean):
    cdf = 0
    for i in range(k + 1):
        cdf += (mean ** i) * math.exp(-mean) / math.factorial(i)
    return cdf
This function takes the value k (the point at which to calculate the cumulative distribution) and the mean of the Poisson distribution as inputs and returns the cumulative probability up to that point.
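
The loop above calls math.factorial(i), whose value no longer fits in a float once i exceeds 170, so a large k can raise an OverflowError. A possible alternative, sketched here under the hypothetical name `poisson_cdf_iterative`, builds each term from the previous one using P(X = i) = P(X = i - 1) * mean / i and never forms a factorial.

```python
import math

def poisson_cdf_iterative(k, mean):
    """P(X <= k) for X ~ Poisson(mean), building each term from the previous one."""
    if mean < 0:
        raise ValueError("mean must be non-negative")
    if k < 0:
        return 0.0
    term = math.exp(-mean)  # P(X = 0)
    total = term
    for i in range(1, int(k) + 1):
        term *= mean / i     # P(X = i) = P(X = i - 1) * mean / i
        total += term
    return min(total, 1.0)   # guard against rounding slightly above 1

print(poisson_cdf_iterative(2, 2.0))     # P(X <= 2) = 5 * exp(-2)
print(poisson_cdf_iterative(1000, 5.0))  # far in the right tail, so close to 1
```

For small k the two approaches should agree; the iterative form is mainly useful when k is large.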

Q7. How is the Binomial distribution different from the Poisson distribution?
The Binomial and Poisson distributions are both discrete probability distributions that model counts of events. However, they have different characteristics and are applied in different scenarios:

Binomial Distribution:

Applicable when the number of trials (n) is fixed, and each trial has two possible outcomes: success or failure.
The probability of success (p) is constant for each trial.
The random variable X represents the number of successes in n trials.
The trials are independent, and the two outcomes of each trial are mutually exclusive.
The binomial distribution is discrete.
The mean of the distribution is given by n * p, and the variance is given by n * p * (1 - p).

Poisson Distribution:

Applicable when the number of trials (n) is not fixed, but the average rate of occurrence of an event (λ) is known.
The probability of a single success in a small interval is proportional to the size of the interval.
The random variable X represents the number of events occurring in a fixed interval of time or space.
The events are independent and occur at a constant average rate.
The Poisson distribution is discrete.
The mean and variance of the distribution are both given by λ.

In summary, the main difference is that the Binomial distribution counts successes in a fixed number of trials, so the count can never exceed n, while the Poisson distribution counts events in a fixed interval with a known average rate and has no upper limit on the count.
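
The two distributions are also connected: when the number of trials n is large and p is small, with n * p held fixed at λ, the Binomial probabilities approach the Poisson probabilities. The sketch below illustrates this numerically; the helper names and the choice λ = 3 are assumptions made for the example.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

def largest_gap(n, lam, max_k=10):
    """Largest difference between the two PMFs over k = 0..max_k, with p = lam / n."""
    return max(
        abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for k in range(max_k + 1)
    )

# Hold the expected count n * p fixed at 3 while the number of trials grows.
for n in (10, 100, 10_000):
    print(n, largest_gap(n, 3.0))
```

The printed gap should shrink as n grows, which is why the Poisson distribution is often used as an approximation for rare events across many trials.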

Q8. Generate a random sample of size 1000 from a Poisson distribution with mean 5 and calculate the sample mean and variance.

import numpy as np

# Generate random sample from Poisson distribution
sample_size = 1000
mean = 5
random_sample = np.random.poisson(mean, size=sample_size)

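# Calculate sample mean and variance
# (ddof=1 divides by n - 1, giving the unbiased sample variance;
# np.var defaults to ddof=0, the population formula)
sample_mean = np.mean(random_sample)
sample_variance = np.var(random_sample, ddof=1)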

print("Sample Mean:", sample_mean)
print("Sample Variance:", sample_variance)
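This code generates a random sample of size 1000 from a Poisson distribution with a mean of 5. It then calculates the sample mean and variance of the generated sample. Because the mean and variance of a Poisson distribution are both equal to λ, both values should come out close to 5.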
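
A variance can be computed by dividing by n (the population formula) or by n - 1 (the usual sample formula), and the two give different answers on the same data. The sketch below shows the difference with Python's `statistics` module on a small made-up list; the data values are purely illustrative.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # small made-up sample

sample_mean = statistics.mean(data)
population_variance = statistics.pvariance(data)  # divides by n
sample_variance = statistics.variance(data)       # divides by n - 1

print(sample_mean, population_variance, sample_variance)
```

With 1000 observations the two formulas differ by a factor of 1000/999, so the choice matters little at that sample size, but it is worth knowing which one a library applies by default.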

Q9. How are mean and variance related in the Binomial distribution and Poisson distribution?
In both distributions the mean and variance are determined by the distribution's parameters:

The Binomial distribution has two parameters, the number of trials (n) and the probability of success (p). The Poisson distribution has a single parameter, the average rate of occurrence (λ).
The mean is equal to the product of the number of trials (n) and the probability of success (p) in the Binomial distribution, i.e., μ = n * p.
The mean is equal to the average rate of occurrence (λ) in the Poisson distribution, i.e., μ = λ.
The variance is equal to the product of the number of trials (n), the probability of success (p), and the probability of failure (1-p) in the Binomial distribution, i.e., σ^2 = n * p * (1 - p).
The variance is equal to the average rate of occurrence (λ) in the Poisson distribution, i.e., σ^2 = λ.
In summary, the Poisson distribution has variance equal to its mean (σ^2 = μ = λ), whereas the Binomial distribution has variance σ^2 = μ * (1 - p), which is smaller than its mean whenever p > 0. When n is large and p is small, 1 - p is close to 1, so the Binomial variance is approximately equal to its mean; this is the regime in which the Poisson distribution approximates the Binomial.
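
The formulas μ = n * p, σ^2 = n * p * (1 - p) and μ = σ^2 = λ can be checked directly from the probability mass functions. This is a minimal sketch; the helper `moments` and the parameter values (n = 20, p = 0.3, λ = 6) are assumptions chosen so that both distributions share the same mean.

```python
from math import comb, exp

def moments(pmf_values):
    """Mean and variance of a discrete distribution given P(X = k) for k = 0, 1, 2, ..."""
    mean = sum(k * p for k, p in enumerate(pmf_values))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(pmf_values))
    return mean, variance

n, p, lam = 20, 0.3, 6.0

binomial = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Poisson terms built iteratively; truncated at k = 100, where the tail is negligible for lam = 6.
poisson = [exp(-lam)]
for k in range(1, 101):
    poisson.append(poisson[-1] * lam / k)

binomial_mean, binomial_var = moments(binomial)  # theory: 6.0 and 4.2
poisson_mean, poisson_var = moments(poisson)     # theory: 6.0 and 6.0

# Ratio of variance to mean: 1 - p for the Binomial, 1 for the Poisson.
print(binomial_var / binomial_mean, poisson_var / poisson_mean)
```

With equal means, the Binomial variance comes out smaller than the Poisson variance by the factor 1 - p.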

Q10. In a normal distribution, where does the least frequent data appear relative to the mean?
In a normal distribution, the least frequent data appears in the tails of the distribution, farthest from the mean. The normal distribution is a symmetric distribution, which means that data is equally likely to appear on either side of the mean.

The tails of the distribution extend infinitely in both directions, and as we move farther away from the mean, the data becomes less frequent. The data points in the tails represent extreme values and are less probable to occur compared to values closer to the mean.

As we move towards the mean from the tails, the frequency of data points increases, reaching its peak at the mean, where the most frequent data appears. The distribution is symmetric, so the frequencies of data points on one side of the mean mirror the frequencies on the other side.
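
How quickly the tails thin out can be quantified with the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the standard library; the helper name `two_sided_tail` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def two_sided_tail(k):
    """P(|Z| > k): probability of landing more than k standard deviations from the mean."""
    return 2 * (1 - standard_normal.cdf(k))

for k in (1, 2, 3, 4):
    print(k, two_sided_tail(k))
```

Roughly 32% of values lie more than one standard deviation from the mean, about 4.6% more than two, and about 0.27% more than three, which matches the familiar 68-95-99.7 rule.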