## Q1: What is the difference between a t-test and a z-test? Provide an example scenario where you would use each type of test.

## Ans:

A t-test and a z-test are both statistical hypothesis tests used to make inferences about population parameters based on sample data. However, they are suited for different situations, primarily based on the characteristics of the data and what we know about the population.

T-Test:

1. Used for small sample sizes: T-tests are used when the population standard deviation is unknown, which matters most when the sample is relatively small (commonly fewer than 30 observations).
2. Sample standard deviation: In a t-test, we use the sample standard deviation to estimate the population standard deviation.
3. Example Scenario: Suppose you want to test whether a new teaching method improves students' test scores. You randomly select 20 students, apply the new method to them, and then compare their scores to the scores of 20 other students who were taught using the traditional method. In this case, you would use a two-sample t-test because each group is small (n = 20) and you don't know the population standard deviation.

Z-Test:

1. Used for large sample sizes: Z-tests are appropriate when you have a relatively large sample size (typically greater than 30 observations) or when you know the population standard deviation.
2. Population standard deviation: In a z-test, you use the known population standard deviation or, if the sample size is large enough, you can use the sample standard deviation as an estimate of the population standard deviation.
3. Example Scenario: Suppose you want to test whether the average height of adult males in a particular city is significantly different from the national average height. You collect height data from 500 adult males in that city and know the population standard deviation for height in the national population. Here, you would use a z-test because you have a large sample size (n = 500) and you know the population standard deviation.
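
As a rough illustration of the z-test scenario above, the following sketch computes the z statistic and a two-sided p-value using only Python's standard library. The sample mean (176 cm), national mean (175 cm), and population standard deviation (8 cm) are hypothetical values chosen for illustration; only n = 500 comes from the scenario.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return the z statistic and two-sided p-value for a one-sample z-test."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: city sample mean 176 cm, national mean 175 cm,
# known national standard deviation 8 cm; n = 500 as in the scenario.
z, p = one_sample_z_test(sample_mean=176, pop_mean=175, pop_sd=8, n=500)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A t-test uses the same form of statistic but replaces the known population standard deviation with the sample standard deviation and compares against a t-distribution with n − 1 degrees of freedom. The standard library has no t-distribution, so in practice a library such as SciPy would typically be used for that case.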

## Q2: Differentiate between one-tailed and two-tailed tests.

## Ans:

One-tailed and two-tailed tests are two types of hypothesis tests used in statistics to determine the significance of observed sample results in relation to a null hypothesis. The key difference between them lies in the directionality of the hypothesis being tested and the critical region for rejecting the null hypothesis.

One-Tailed Test:

1. Directional Hypothesis: In a one-tailed test, we have a specific direction in mind when formulating our hypothesis. We are testing whether the population parameter is greater than or less than a certain value.
2. Critical Region: The critical region for a one-tailed test lies entirely in one tail of the sampling distribution.
3. Rejection Region: If the sample result falls into the critical region, we reject the null hypothesis. This means we are specifically looking for an effect in one direction.

Example of a One-Tailed Test:\
Suppose we want to test whether a new drug improves patients' recovery time after surgery. Our null hypothesis (H0) might be: "The new drug has no effect on recovery time." Our alternative hypothesis (Ha) for a one-tailed test might be: "The new drug reduces recovery time." In this case, we are only interested in whether the drug makes recovery faster, and we would look for evidence of this effect in one direction.

Two-Tailed Test:

1. Non-Directional Hypothesis: In a two-tailed test, we do not have a specific direction in mind when formulating our hypothesis. We are testing whether the population parameter is different from a certain value, but we do not specify whether it's greater or less than that value.
2. Critical Region: The critical region for a two-tailed test is split between both tails of the sampling distribution.
3. Rejection Region: If the sample result falls into either of the two critical regions, we reject the null hypothesis. This means we are looking for evidence of an effect in either direction.

Example of a Two-Tailed Test:\
Suppose we want to test whether a coin is fair. Our null hypothesis (H0) might be: "The coin is fair, and the probability of getting heads is 0.5." Our alternative hypothesis (Ha) for a two-tailed test might be: "The coin is not fair, and the probability of getting heads is not equal to 0.5." In this case, we are interested in detecting any deviation from the expected value of 0.5, whether it's more heads (greater than 0.5) or fewer heads (less than 0.5).
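
The practical difference between the two kinds of test can be sketched numerically. The snippet below assumes a z statistic (standard normal sampling distribution) and a significance level of 0.05; the value z = 1.8 is a hypothetical test statistic chosen to show a case where the two tests disagree.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def p_values(z):
    """Return (upper-tail one-tailed p, two-tailed p) for a z statistic."""
    one_tailed = 1 - std_normal.cdf(z)
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))
    return one_tailed, two_tailed


alpha = 0.05
one_tailed_critical = std_normal.inv_cdf(1 - alpha)      # about 1.645
two_tailed_critical = std_normal.inv_cdf(1 - alpha / 2)  # about 1.960

z = 1.8  # hypothetical test statistic
one_p, two_p = p_values(z)
print(f"one-tailed p = {one_p:.4f}, two-tailed p = {two_p:.4f}")
```

With these assumptions, z = 1.8 exceeds the one-tailed critical value but not the two-tailed one, so the same data would reject H0 in a one-tailed test and fail to reject it in a two-tailed test.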

## Q3: Explain the concept of Type 1 and Type 2 errors in hypothesis testing. Provide an example scenario for each type of error.

## Ans:

In hypothesis testing, Type I and Type II errors are two types of errors that can occur when making decisions about the null hypothesis (H0) and the alternative hypothesis (Ha). These errors are associated with the incorrect rejection or acceptance of a null hypothesis based on sample data.

1. Type I Error (False Positive):\
Definition: Type I error occurs when we reject a null hypothesis that is actually true. In other words, we conclude that there is a significant effect or difference when, in reality, there is none.\
Symbol: The probability of making a Type I error is denoted by α (alpha), the significance level of the test.\
Example Scenario for Type I Error:\
Suppose a pharmaceutical company is testing a new drug for its effectiveness in treating a certain condition. They set up their hypothesis test as follows:\
Null Hypothesis (H0): The new drug has no effect (i.e., it's not better than a placebo).\
Alternative Hypothesis (Ha): The new drug is effective (i.e., it's better than a placebo).\
If, after conducting the test, they incorrectly conclude that the new drug is effective (reject H0) when, in reality, it's not (H0 is true), this would be a Type I error. It could lead to the drug being marketed and prescribed when it's actually ineffective.

2. Type II Error (False Negative):\
Definition: Type II error occurs when we fail to reject a null hypothesis that is actually false. In other words, we conclude that there is no significant effect or difference when, in reality, there is one.\
Symbol: The probability of making a Type II error is denoted by β (beta); the power of the test is 1 − β.\
Example Scenario for Type II Error:\
Suppose a quality control team is testing a production process to ensure that it meets certain specifications. They set up their hypothesis test as follows:\
Null Hypothesis (H0): The production process meets specifications.\
Alternative Hypothesis (Ha): The production process does not meet specifications.\
If, after conducting the test, they fail to detect that the production process is not meeting specifications (fail to reject H0) when, in reality, it's producing defective products (Ha is true), this would be a Type II error. Defective products could continue to be produced and shipped, leading to potential quality and safety issues.

In summary:

1. Type I error is the incorrect rejection of a true null hypothesis, leading to a false positive result.
2. Type II error is the failure to reject a false null hypothesis, leading to a false negative result.
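
Both error rates can be estimated by simulation. The sketch below is an illustration under assumed settings, not part of the original answer: it repeatedly draws normal samples (n = 30, known standard deviation 1) and runs a two-sided z-test of H0: mean = 0 at α = 0.05. The true mean of 0.3 used for the Type II case is a hypothetical effect size.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mean, null_mean=0.0, sd=1.0, n=30,
                   alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated two-sided z-tests that reject H0: mean == null_mean."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        z = (fmean(sample) - null_mean) / (sd / sqrt(n))
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


# H0 is true here, so every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.3)
print(f"Type I rate ~ {type_1_rate:.3f}, Type II rate ~ {type_2_rate:.3f}")
```

The estimated Type I rate should land close to the chosen α, while the Type II rate depends on the effect size and sample size assumed.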

## Q4: Explain Bayes's theorem with an example.

## Ans:

Bayes's theorem is a fundamental concept in probability theory and statistics used to update the probability of an event based on new information or evidence. The theorem is particularly useful in situations where you want to revise your beliefs or probabilities in light of new data. Bayesian statistics is an approach to data analysis and parameter estimation based on Bayes's theorem.

The formula for Bayes's theorem is as follows:\
$P(A|B)=\frac{P(B|A)P(A)}{P(B)}$

Where:

1. P(A∣B): represents the probability of event A occurring given that event B has occurred.
2. P(B∣A): is the probability of event B occurring given that event A has occurred.
3. P(A): is the prior probability of event A.
4. P(B): is the marginal probability of event B, which serves as a normalization constant.

Example:

Let's illustrate Bayes's theorem with a medical diagnosis scenario:

Suppose a doctor wants to determine the probability that a patient has a certain rare disease (D) based on the result of a diagnostic test (T). The doctor knows the following probabilities:

1. The probability that a patient has the disease, P(D), is 0.01 (1% of the population has the disease).
2. The probability that the diagnostic test correctly identifies the disease if the patient has it, P(T∣D), is 0.95 (a 95% true positive rate).
3. The probability that the diagnostic test incorrectly indicates the presence of the disease when the patient doesn't have it, P(T∣¬D), is 0.10 (a 10% false positive rate).

The goal is to find the probability that a patient truly has the disease given a positive test result, P(D∣T).

Using Bayes's theorem:

$P(D|T) = \frac{P(T|D)P(D)}{P(T)}$

To calculate P(T), we can use the law of total probability:

$P(T) = P(T|D)P(D) + P(T|\neg D)P(\neg D)$

where $P(\neg D) = 1 - P(D)$ is the probability that a patient does not have the disease.

Now, plug in the values:

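$P(T) = (0.95 \cdot 0.01) + (0.10 \cdot 0.99) = 0.0095 + 0.099 = 0.1085$

Now, we can calculate P(D∣T):

$P(D|T) = \frac{0.95 \cdot 0.01}{0.1085} \approx 0.0876$

So even after a positive test, the probability that the patient has the disease is only about 8.8%, because the disease is rare and false positives outnumber true positives.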
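
The calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `posterior` and its parameter names are illustrative choices, while the three input probabilities are the ones given in the example.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(D|T) via Bayes's theorem, with P(T) from the law of total probability."""
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_positive


p_disease_given_positive = posterior(prior=0.01, true_positive_rate=0.95,
                                     false_positive_rate=0.10)
print(round(p_disease_given_positive, 4))  # 0.0876
```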

## Q5: What is a confidence interval? How to calculate the confidence interval, explain with an example.

## Ans:

A confidence interval is a statistical range or interval constructed around a sample statistic, such as a sample mean or proportion, that is used to estimate the range within which the true population parameter is likely to fall with a certain level of confidence. It provides a measure of the precision or uncertainty associated with the sample statistic.

The key components of a confidence interval are:

1. Sample Statistic: The value computed from the sample data, such as the sample mean or proportion.
2. Margin of Error: A range added to and subtracted from the sample statistic to create the interval. It represents the precision of the estimate.
3. Confidence Level: The level of confidence you have that the true population parameter lies within the interval. Common confidence levels are 90%, 95%, and 99%, but other levels can be used.

The formula for calculating a confidence interval for a population mean (when the population standard deviation is known) is:

$\text{Confidence Interval} = \text{Sample Mean} \pm \frac{Z\sigma}{\sqrt{n}}$

Where:

1. Sample Mean: The mean of the sample data.
2. Z: The critical value from the standard normal distribution corresponding to the desired confidence level. For example, for a 95% confidence interval, Z ≈ 1.96.
3. σ: The population standard deviation.
4. n: The sample size.

Now, let's illustrate how to calculate a confidence interval with an example:

Example:
Suppose we want to estimate the average height of adult males in a certain city. We collect a random sample of 100 adult males and find that the sample mean height is 175 cm, and we know from previous research that the population standard deviation is 8 cm. We want to calculate a 95% confidence interval for the population mean height.

1. Find the critical value (Z) for a 95% confidence level: Z ≈ 1.96.
2. Plug the values into the formula: $\text{Confidence Interval} = 175 \pm 1.96 \cdot \frac{8}{\sqrt{100}}$
3. Calculate the margin of error: $\text{Margin of Error} = 1.96 \cdot \frac{8}{10} \approx 1.568$
4. Calculate the confidence interval: $\text{Confidence Interval} = 175 \pm 1.568 = (173.432, 176.568)$

So, with 95% confidence, we can say that the true average height of adult males in the city is likely to fall within the interval of 173.432 cm to 176.568 cm based on our sample data and the known population standard deviation.
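
The same interval can be reproduced in Python. This sketch assumes the known-σ formula given earlier and uses the standard library's `NormalDist` to look up the critical value instead of a table; the function name `z_confidence_interval` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, pop_sd, n, confidence=0.95):
    """Confidence interval for a mean when the population sd is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * pop_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(sample_mean=175, pop_sd=8, n=100)
print(f"({low:.3f}, {high:.3f})")  # (173.432, 176.568)
```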

## Q7. Calculate the 95% confidence interval for a sample of data with a mean of 50 and a standard deviation of 5. Interpret the results.

## Ans:

![](7.jpg)

## Q8. What is the margin of error in a confidence interval? How does sample size affect the margin of error? Provide an example of a scenario where a larger sample size would result in a smaller margin of error.

## Ans:

The margin of error (MOE) in a confidence interval is a measure of the range or uncertainty associated with the estimate of a population parameter, such as the population mean or proportion, based on a sample. It quantifies how much the sample statistic (e.g., sample mean or sample proportion) is expected to vary from the true population parameter if we were to take multiple random samples from the same population.

The key points about the margin of error are as follows:

1. Direct Relationship with Confidence Level: The margin of error increases with the confidence level. As we increase the confidence level (e.g., from 90% to 95% to 99%), the margin of error also increases because we are widening the interval to be more confident about capturing the true parameter.

2. Inverse Relationship with Sample Size: The margin of error decreases as the sample size increases, in proportion to $1/\sqrt{n}$. Larger sample sizes provide more information about the population, leading to more precise estimates.

Example Scenario:

Let's consider an example involving a political poll to understand how sample size affects the margin of error:

Suppose we want to estimate the proportion of voters in a city who support Candidate A in an upcoming election. We decide to conduct a survey and have two options for the sample size:

Option 1: Survey 200 randomly selected voters.\
Option 2: Survey 1,000 randomly selected voters.

We want to calculate 95% confidence intervals for both options.

1. Option 1 (Sample size = 200): The margin of error will be relatively larger because the sample size is smaller. If about 50% of respondents support Candidate A, the margin of error is about ±6.9%, giving a confidence interval of roughly 43% to 57%.
2. Option 2 (Sample size = 1,000): The margin of error will be relatively smaller because the sample size is larger. With the same 50% support, the margin of error is about ±3.1%, giving a confidence interval of roughly 47% to 53%.
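
To see the effect of sample size concretely, the sketch below computes the 95% margin of error for a proportion at both sample sizes, using the normal approximation. The sample proportion of 0.5 is an assumption made for illustration (it gives the widest interval); the scenario itself does not state the observed support.

```python
from math import sqrt
from statistics import NormalDist


def proportion_margin_of_error(p_hat, n, confidence=0.95):
    """Normal-approximation margin of error for a sample proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


for n in (200, 1000):
    moe = proportion_margin_of_error(p_hat=0.5, n=n)  # p_hat = 0.5 is assumed
    print(f"n = {n}: margin of error = ±{moe:.1%}")
```

Because the margin of error shrinks with the square root of n, a fivefold increase in sample size cuts it by a factor of about 2.2 rather than 5.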

## Q9. Calculate the z-score for a data point with a value of 75, a population mean of 70, and a population standard deviation of 5. Interpret the results.

## Ans:

Z-score = $\frac{x-\mu}{\sigma} = \frac{75-70}{5} = 1$

Interpretation: The data point lies exactly one standard deviation above the population mean.

## Q10. In a study of the effectiveness of a new weight loss drug, a sample of 50 participants lost an average of 6 pounds with a standard deviation of 2.5 pounds. Conduct a hypothesis test to determine if the drug is significantly effective at a 95% confidence level using a t-test.

## Ans:

![](10.jpg)

## Q11. In a survey of 500 people, 65% reported being satisfied with their current job. Calculate the 95% confidence interval for the true proportion of people who are satisfied with their job.

## Ans:

![](11.jpg)
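
As a numerical cross-check, the interval can also be computed in Python. This sketch assumes the normal-approximation (Wald) interval for a proportion, which may or may not be the exact method used in the worked solution; the inputs n = 500 and 65% come from the question.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat, confidence = 500, 0.65, 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96
standard_error = sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - z * standard_error, p_hat + z * standard_error
print(f"({low:.3f}, {high:.3f})")  # (0.608, 0.692)
```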

## Q12. A researcher is testing the effectiveness of two different teaching methods on student performance. Sample A has a mean score of 85 with a standard deviation of 6, while sample B has a mean score of 82 with a standard deviation of 5. Conduct a hypothesis test to determine if the two teaching methods have a significant difference in student performance using a t-test with a significance level of 0.01.

## Ans:

![](12a.jpg)

![](12b.jpg)

## Q13. A population has a mean of 60 and a standard deviation of 8. A sample of 50 observations has a mean of 65. Calculate the 90% confidence interval for the true population mean.

## Ans:

![](13.jpg)

## Q14. In a study of the effects of caffeine on reaction time, a sample of 30 participants had an average reaction time of 0.25 seconds with a standard deviation of 0.05 seconds. Conduct a hypothesis test to determine if the caffeine has a significant effect on reaction time at a 90% confidence level using a t-test.

## Ans:
