### Question 1: What is a random variable in probability theory?

**Answer:**  
A random variable is a function that assigns a real number to each outcome in the sample space of a random experiment. For example, when rolling a die, the random variable maps each face to a number from 1 to 6. There are two main types: discrete (taking countable values) and continuous (taking any value in an interval). Random variables make it possible to compute quantities such as expectation and variance, which are essential for modeling and inference in statistics. Understanding them is foundational for later topics such as probability distributions and hypothesis testing.
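
As a minimal sketch of this definition, the snippet below treats a random variable as an ordinary Python function defined on a sample space. The two-coin experiment and the name `num_heads` are illustrative assumptions, not part of the original question.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two fair coin tosses: ('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')
sample_space = list(product("HT", repeat=2))


def num_heads(outcome):
    """Random variable X: maps an outcome to the number of heads it contains."""
    return outcome.count("H")


# All outcomes are equally likely, so P(X = x) = (outcomes mapped to x) / (total outcomes)
counts = Counter(num_heads(outcome) for outcome in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

Because X takes only the values 0, 1, and 2, it is a discrete random variable, and its probabilities sum to 1.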

### Question 2: What are the types of random variables?

**Answer:**  
Random variables are mainly divided into discrete and continuous types. Discrete random variables take countable, distinct values, such as the number of heads in a series of coin tosses or the number of students in a class. Continuous random variables can take any value within an interval, such as height or weight, and are described by probability density functions. The distinction matters because it determines which probability distributions and analysis methods are appropriate.

### Question 3: Explain the difference between discrete and continuous distributions.

**Answer:**  
Discrete and continuous distributions differ mainly in the values their variables can take. Discrete distributions describe countable outcomes, such as the number of successes in a set of trials, while continuous distributions describe variables that can take any value within an interval, like height or time. A discrete distribution assigns a probability to each exact value. A continuous distribution assigns probability only to ranges of values, through a density function, so the probability of any single exact value is zero. Examples of discrete distributions include the binomial and Poisson; continuous examples include the normal and exponential distributions.
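
The contrast between point probabilities and densities can be sketched with the Python standard library. The height parameters (mean 170, standard deviation 10) are assumed purely for illustration.

```python
from math import comb
from statistics import NormalDist

# Discrete: number of heads in 10 fair coin tosses (binomial with n=10, p=0.5).
# An exact value has a genuine probability.
n, p = 10, 0.5
prob_exactly_5 = comb(n, 5) * p**5 * (1 - p) ** (n - 5)

# Continuous: heights modeled as Normal(mean=170, sd=10) -- illustrative values.
height = NormalDist(mu=170, sigma=10)
density_at_170 = height.pdf(170)                      # a density, not a probability
prob_165_to_175 = height.cdf(175) - height.cdf(165)   # probability of a range

print(f"P(exactly 5 heads)      = {prob_exactly_5:.4f}")
print(f"density at 170          = {density_at_170:.4f}")
print(f"P(165 <= height <= 175) = {prob_165_to_175:.4f}")
```

For the continuous variable, a probability only appears once the density is accumulated over a range, here as a difference of two CDF values.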

### Question 4: What is a binomial distribution, and how is it used in probability?

**Answer:**  
The binomial distribution gives the probability of observing a given number of successes in a fixed number of independent trials, where each trial has two possible outcomes (success/failure) and the same probability of success. It’s widely used in quality control, surveys, and experimentation where each observation falls into one of two categories. Its parameters are the number of trials and the success probability. This distribution helps analyze binary categorical data and supports predictions through inferential statistics.
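
A short sketch of the binomial probability formula, P(X = k) = C(n, k) p^k (1 - p)^(n - k), applied to a quality-control setting. The batch size of 20 and the 5% defect rate are hypothetical values chosen for illustration, and `binomial_pmf` is a helper name introduced here.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical scenario: 20 items inspected, each defective with probability 0.05.
n, p = 20, 0.05

prob_no_defects = binomial_pmf(0, n, p)
prob_at_most_one = sum(binomial_pmf(k, n, p) for k in range(2))
total_probability = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"P(no defects)         = {prob_no_defects:.4f}")
print(f"P(at most one defect) = {prob_at_most_one:.4f}")
print(f"Sum over all k        = {total_probability:.4f}")
```

Summing the probabilities over every possible count from 0 to n gives 1, which is a quick check that the helper is a valid probability mass function.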

### Question 5: What is the standard normal distribution, and why is it important?

**Answer:**  
The standard normal distribution is the normal distribution with mean zero and standard deviation one; its density is the familiar bell-shaped curve. Any normal variable can be converted to it by standardizing: subtract the mean and divide by the standard deviation, which gives a z-score. It’s important because z-scores allow values from different normal distributions to be compared on a common scale. Many statistical procedures, including hypothesis tests and confidence intervals, rely on standard normal probabilities and critical values, which makes this distribution foundational in fields such as quality control.
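
The comparison through z-scores can be sketched as follows. The two exams and their means and standard deviations are hypothetical, and `z_score` is a helper name introduced for the example.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Standardize x: the number of standard deviations it lies from the mean."""
    return (x - mean) / sd


# Hypothetical exams on different scales (values chosen for illustration).
z_a = z_score(85, mean=70, sd=10)      # exam A
z_b = z_score(600, mean=500, sd=100)   # exam B

standard_normal = NormalDist()         # mean 0, standard deviation 1
pct_a = standard_normal.cdf(z_a)       # share of scores below this z-score
pct_b = standard_normal.cdf(z_b)

print(f"Exam A: z = {z_a:.2f}, percentile = {pct_a:.1%}")
print(f"Exam B: z = {z_b:.2f}, percentile = {pct_b:.1%}")
```

Although 600 is the larger raw score, the score of 85 sits further above its own mean in standard-deviation units, so it is the stronger result under the assumption that both exams are normally distributed.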

### Question 6: What is the Central Limit Theorem (CLT), and why is it critical in statistics?

**Answer:**  
The Central Limit Theorem states that the distribution of the sample mean approaches a normal distribution as the sample size grows, regardless of the shape of the original population's distribution, provided the observations are independent and the population has a finite variance. This theorem justifies normal-based inference techniques even when the data themselves are not normal, supporting confidence intervals and hypothesis testing. It is central to many practical statistical analyses in economics, biology, and the social sciences.
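
A small simulation can illustrate the theorem. This is a sketch under assumed settings: the exponential population, the sample size of 50, the 5000 repetitions, and the fixed seed are all choices made for the example.

```python
import random
from statistics import fmean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable

sample_size = 50         # observations per sample
num_samples = 5000       # number of sample means to collect

# Exponential(rate=1) is strongly right-skewed, with mean 1 and standard deviation 1.
sample_means = [
    fmean(rng.expovariate(1.0) for _ in range(sample_size))
    for _ in range(num_samples)
]

mean_of_means = fmean(sample_means)
sd_of_means = stdev(sample_means)
theoretical_sd = 1 / sample_size**0.5   # population sd / sqrt(n)

print(f"Mean of sample means: {mean_of_means:.3f} (population mean is 1)")
print(f"SD of sample means:   {sd_of_means:.3f} (theory: {theoretical_sd:.3f})")
```

Plotting a histogram of `sample_means` would show a roughly symmetric, bell-shaped curve even though the underlying exponential population is heavily skewed.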

### Question 7: What is the significance of confidence intervals in statistical analysis?

**Answer:**  
Confidence intervals give a range of plausible values for a population parameter at a chosen confidence level, often 95%. The confidence level describes the procedure: if sampling were repeated many times, about 95% of the intervals constructed this way would contain the true parameter. Confidence intervals express the precision of an estimate and the uncertainty due to sampling variability, helping researchers make decisions that go beyond single-point estimates. They are widely used in clinical research, polling, and quality control.
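
The meaning of the confidence level can be checked with a coverage simulation. This is a sketch under simplifying assumptions: the population is normal with made-up parameters, and its standard deviation is treated as known so that a z-based interval applies.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(42)            # fixed seed so the run is repeatable

true_mean, sigma = 100.0, 15.0     # assumed population parameters (illustrative)
n, repetitions = 30, 2000

z = NormalDist().inv_cdf(0.975)    # about 1.96 for a two-sided 95% interval
margin = z * sigma / n**0.5        # sigma treated as known to keep the sketch simple

covered = 0
for _ in range(repetitions):
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - margin <= true_mean <= x_bar + margin:
        covered += 1

coverage = covered / repetitions
print(f"Intervals containing the true mean: {coverage:.1%}")
```

Any single interval either contains the true mean or it does not; the 95% refers to the long-run share of intervals that do.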

### Question 8: What is the concept of expected value in a probability distribution?

**Answer:**  
Expected value is the probability-weighted average of the values a random variable can take, and it measures the central tendency of a probability distribution. For a discrete variable it is the sum of each value multiplied by its probability. It is crucial for decision-making in risk management, insurance, and finance, because it describes the long-run average outcome in uncertain scenarios.
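
As a small worked example, the expected value and variance of a fair six-sided die can be computed exactly from its probability mass function. The use of `Fraction` is a choice made here to avoid floating-point rounding.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# E[X] = sum of x * P(X = x)
expected_value = sum(x * p for x, p in pmf.items())

# Var(X) = sum of (x - E[X])^2 * P(X = x)
variance = sum((x - expected_value) ** 2 * p for x, p in pmf.items())

print("E[X]   =", expected_value)   # 7/2
print("Var(X) =", variance)         # 35/12
```

The expected value of 3.5 is not a face of the die; it is the average that the results of many rolls would settle around.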

### Question 9: Write a Python program to generate 1000 random numbers from a normal distribution with mean = 50 and standard deviation = 5. Compute its mean and standard deviation using NumPy, and draw a histogram to visualize the distribution.

```
import numpy as np
import matplotlib.pyplot as plt

# Draw 1000 samples from a normal distribution with mean 50 and standard deviation 5
data = np.random.normal(50, 5, 1000)

# Mean and standard deviation of the generated data
# (np.std uses the population formula, ddof=0, by default)
mean = np.mean(data)
std = np.std(data)
print('Mean:', mean)
print('Std:', std)

# Histogram of the generated values
plt.hist(data, bins=30, alpha=0.7, color='blue')
plt.title('Histogram of Normal Distribution')
plt.show()
```
Output:

Mean: 49.920473005363064

Std: 5.046439466970164

### Question 10: You are working as a data analyst for a retail company. The company has collected daily sales data for 2 years and wants you to identify the overall sales trend.

daily_sales = [220, 245, 210, 265, 230, 250, 260, 275, 240, 255, 
               235, 260, 245, 250, 225, 270, 265, 255, 250, 260]

Explain how you would apply the Central Limit Theorem to estimate the average sales with a 95% confidence interval.
Write the Python code to compute the mean sales and its confidence interval.

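Explanation: The Central Limit Theorem implies that the distribution of sample means is approximately normal when the sample is reasonably large, so the sample mean can be used to build an interval estimate for average sales. Because this sample has only 20 observations and the population standard deviation is unknown, the code uses the t-distribution with n - 1 = 19 degrees of freedom rather than the normal distribution. The 95% confidence interval is then computed from the sample mean, the sample standard deviation, and the sample size.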
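
Written out, the calculation is the standard t-based interval for a mean. The notation is introduced here for clarity: $\bar{x}$ is the sample mean, $s$ the sample standard deviation (computed with $n - 1$ in the denominator), $n$ the sample size, and $t_{\alpha/2,\,n-1}$ the two-sided critical value of the t-distribution.

$$
\bar{x} \;\pm\; t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}
$$

For the 20 values given, $n = 20$, $\bar{x} = 248.25$, $s \approx 17.27$, and $t_{0.025,\,19} \approx 2.093$, so the margin of error is about $8.08$.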

Python code:
```
import numpy as np
from scipy import stats

daily_sales = [220, 245, 210, 265, 230, 250, 260, 275, 240, 255,
               235, 260, 245, 250, 225, 270, 265, 255, 250, 260]

# Sample statistics (ddof=1 gives the sample standard deviation)
sample_mean = np.mean(daily_sales)
sample_std = np.std(daily_sales, ddof=1)
sample_size = len(daily_sales)

# Two-sided critical value from the t-distribution with n - 1 degrees of freedom
confidence_level = 0.95
alpha = 1 - confidence_level
t_critical = stats.t.ppf(1 - alpha/2, df=sample_size-1)

# Margin of error = critical value * standard error of the mean
margin = t_critical * (sample_std / sample_size**0.5)
confidence_interval = (sample_mean - margin, sample_mean + margin)

print('Mean Sales:', sample_mean)
print('95% CI:', confidence_interval)
```

Output:

Mean Sales: 248.25

95% CI: (np.float64(240.16957025147158), np.float64(256.3304297485284))