# 1. What is the key factor that makes the difference between ideas that can, and cannot be examined and tested statistically? What would you describe is the key "criteria" defining what a good null hypothesis is? And what is the difference between a null hypothesis and an alternative hypothesis in the context of hypothesis testing?

The key factor differentiating testable ideas from non-testable ones in statistics is the ability to collect data that can be analyzed to provide evidence for or against an idea.
<p>
A good null hypothesis should be:
<p>
Testable: We need data that can potentially contradict it.<p>
Default/Uninteresting: It often represents the status quo or a lack of effect.<p>
A "Straw Man": It may be something we do not actually believe, but it gives us a claim that the data can potentially disprove, which moves us towards a more interesting conclusion.<p>
The null hypothesis (H<sub>0</sub>) is our initial assumption, often representing "no effect" or the default belief. The alternative hypothesis (H<sub>A</sub>) simply states that the null hypothesis is false. It doesn't specify how it's false, just that it is.

# 2. "It is important to note that outcomes of tests refer to the population parameter, rather than the sample statistic! As such, the result that we get is for the population." In terms of the distinctions between the concepts of [...] how would you describe what the sentence above means?

Observed Values (x<sub>i</sub>): These are the individual data points we collect in our sample. For example, if we're measuring the height of students, each student's height is an x<sub>i</sub>.
<p>
Sample Average (x̄): This is the average of all the observed values in our sample. It is a statistic: we can compute it exactly from our data, and we use it as an estimate of the population average.
<p>
Population Average (μ): This is the true average value we're interested in, but we usually can't measure it directly (e.g., the average height of all students, not just those in our sample).
<p>
Hypothesized Value (μ<sub>0</sub>): This is the specific value of the population average we're testing in our null hypothesis (H<sub>0</sub>). For example, we might hypothesize that the average height of all students (μ) is 5 feet 8 inches (μ<sub>0</sub>).
<p>
The sentence means that while we use data from our sample to perform a hypothesis test, the conclusions we draw apply to the population parameter (μ), not just the sample statistic (x̄).
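<p>
To make the four quantities concrete, here is a minimal sketch using the height example. The population mean, the spread, and the sample size are made-up assumptions for illustration only; in a real study μ would be unknown.

```python
import random
import statistics

random.seed(0)

# Hypothetical population of heights in inches (all numbers are assumptions).
mu = 67.0      # population average: the parameter, normally unknown
mu_0 = 68.0    # hypothesized value from H0 (5 feet 8 inches)

# Observed values x_i: one sample of 30 students.
x = [random.gauss(mu, 3.0) for _ in range(30)]

# Sample average x-bar: a statistic that changes from sample to sample.
x_bar = statistics.mean(x)

print(f"x_bar = {x_bar:.2f}, mu_0 = {mu_0}, mu = {mu}")
```

The test uses x̄ as evidence, but the statement being tested (μ = μ<sub>0</sub>) is about μ.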

# 3. "Imagine a world where the null hypothesis is true" when calculating a p-value? Explain why this is.

We "imagine a world where the null hypothesis is true" when calculating a p-value because the p-value is specifically defined as the probability of observing our data (or more extreme data) if the null hypothesis were actually true.
<p>
Sampling Distribution Under H<sub>0</sub>: To calculate a p-value, we need to consider the sampling distribution of our test statistic under the null hypothesis. This distribution shows us the range of values we'd expect for our statistic due to random chance alone if the null hypothesis were true.
<p>
Comparing Our Observed Statistic: We then compare our observed statistic to this sampling distribution. If our observed statistic falls in an extreme tail of the distribution (meaning it's unlikely to occur by chance if H<sub>0</sub> were true), we get a small p-value, providing evidence against the null hypothesis.
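<p>
The two steps above can be sketched with a simulation. This is only an illustration under assumed numbers: a hypothetical coin-flip experiment with 20 flips and 15 observed heads, with H<sub>0</sub> saying the coin is fair, and a one-sided "as or more extreme" rule.

```python
import random

random.seed(1)

n_flips = 20          # hypothetical sample size
observed_heads = 15   # hypothetical observed statistic
p_null = 0.5          # H0: the coin is fair

# Sampling distribution under H0: repeat the experiment in a world where H0 is true.
n_sim = 10_000
simulated = [
    sum(random.random() < p_null for _ in range(n_flips))
    for _ in range(n_sim)
]

# p-value: share of simulated statistics at least as extreme as the observed one.
p_value = sum(s >= observed_heads for s in simulated) / n_sim
print(p_value)
```

Every simulated value comes from the imagined world where H<sub>0</sub> holds, which is why the p-value only makes sense under that assumption.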

# 4. A smaller p-value makes the null hypothesis look more ridiculous. Explain why this is.

A smaller p-value makes the null hypothesis look more "ridiculous" because it means our observed data is increasingly unlikely to have occurred by chance if the null hypothesis were actually true. A small p-value indicates that our observed test statistic falls far out in the tail of the sampling distribution under the null hypothesis. This suggests that our data is so unusual under the assumption of H<sub>0</sub> that it's more plausible that H<sub>0</sub> is actually false. The smaller the p-value, the stronger the evidence against H<sub>0</sub>.
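<p>
As a rough illustration of "further into the tail means a smaller p-value", the sketch below assumes a test statistic that is approximately standard normal under H<sub>0</sub> (an assumption made for this example, not something stated in the question) and prints the upper-tail probability for increasingly extreme values.

```python
from math import erfc, sqrt

def upper_tail_normal(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

# The further the observed statistic sits from what H0 predicts, the smaller the p-value.
z_values = (0.5, 1.0, 2.0, 3.0)
tail_probabilities = [upper_tail_normal(z) for z in z_values]

for z, p in zip(z_values, tail_probabilities):
    print(f"z = {z}: one-sided p-value = {p:.4f}")
```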

# 5. Gunturkun kissing couples experiment

We are testing the null hypothesis H<sub>0</sub> that humans have no preference for left or right head tilt when kissing, meaning the probability of tilting either left or right is 50/50. <p>
The data from the study shows that 80 out of 124 couples tilted their heads to the right.
We assume a binomial distribution for this problem since each couple’s head tilt can be considered a Bernoulli trial (tilt right or not).
Under H<sub>0</sub>, the probability of tilting right is 0.5 (50%).<p>
Number of right tilts (successes): 80<p>
Total number of couples (trials): 124<p>
I used the binomial test to compute the probability of observing 80 or more couples tilting their heads to the right, given the null hypothesis H<sub>0</sub> (50% chance of either tilt).
The alternative hypothesis is one-sided (greater), since we are interested in whether more couples tilt to the right than expected by random chance.
I computed the p-value with a binomial test from `scipy.stats` in Python, and it turned out to be 0.00078. The code below calls `scipy.stats.binomtest()`, since the older `scipy.stats.binom_test()` function has been removed from recent SciPy releases.

    import scipy.stats as stats

    # Number of successes (tilt right), total observations, and the null hypothesis probability (50%)
    successes = 80
    n = 124
    null_hypothesis_prob = 0.5

    # Perform a one-sided binomial test; binomtest returns a result object with a .pvalue attribute
    result = stats.binomtest(successes, n, null_hypothesis_prob, alternative='greater')
    p_value = result.pvalue
    print(p_value)

This p-value is very small, indicating that it’s highly unlikely to observe this many couples tilting their heads to the right if the true probability were 50%. Therefore, we reject the null hypothesis and conclude that there is very strong evidence against it.
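<p>
As a cross-check that does not depend on the SciPy version, the same one-sided tail probability can be summed directly from the binomial formula. This is a sketch of the calculation using only the standard library, not the code used for the result above; the helper name `binomial_upper_tail` is made up for this example.

```python
from math import comb

def binomial_upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5), computed exactly from binomial coefficients."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

successes = 80
n = 124

# Probability of 80 or more right tilts out of 124 if each tilt were a 50/50 event.
p_exact = binomial_upper_tail(successes, n)
print(p_exact)
```

With p = 0.5 every sequence of 124 tilts is equally likely, so the tail probability is just the number of sequences with at least 80 right tilts divided by 2<sup>124</sup>.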

# 6. Can a smaller p-value definitively prove that the null hypothesis is false? Is it possible to definitively prove that Fido (from the "second pre-lecture video") is innocent using a p-value? Is it possible to definitively prove that Fido is guilty using a p-value? How low or high does a p-value have to be to definitely prove one or the other?

It is not possible to definitively prove that Fido is innocent, or that Fido is guilty, using a p-value. The p-value only tells us how ridiculous our evidence makes the null hypothesis look. A smaller p-value does not definitively prove the null hypothesis false, because there is always some chance of seeing such data when it is true, and a larger p-value does not prove it true, because it only means that there is not enough evidence to reject it. No p-value, however low or high, definitively proves one or the other.
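<p>
One way to see why a small p-value is not proof is to simulate a world where the null hypothesis really is true and count how often a small p-value still appears. The sketch below assumes a hypothetical fair-coin experiment with 20 flips and a one-sided test; the 0.05 cutoff is only an illustrative choice.

```python
import random
from math import comb

random.seed(2)

n_flips = 20

# Exact one-sided p-value P(X >= k) for every possible head count k under H0: p = 0.5.
p_value_for = [
    sum(comb(n_flips, j) for j in range(k, n_flips + 1)) / 2 ** n_flips
    for k in range(n_flips + 1)
]

n_experiments = 5_000
small_p_count = 0
for _ in range(n_experiments):
    # The coin is fair in every experiment, so H0 is true by construction.
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    if p_value_for[heads] < 0.05:
        small_p_count += 1

false_alarm_rate = small_p_count / n_experiments
print(false_alarm_rate)
```

Some experiments produce a small p-value even though H<sub>0</sub> is true in all of them, so a small p-value is strong evidence rather than proof.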

# 7. Describe (perhaps with the help of your ChatBot) what changed in the code; how this changes the interpretation of the hypothesis test; and whether or not we should indeed expect the p-value to be smaller in the "one tailed" versus "two tailed" analysis.


p-value Calculation: In the original (two-sided) test, we counted bootstrapped means that were at least as far from the null hypothesis value as the observed mean, in either direction (greater or less).<p>In the updated one-sided test, we only count bootstrapped means that are at least as large as the observed mean, which is evidence that the population mean is greater than the null hypothesis value (65 in this case).<p>
Condition for One-sided Test: The condition `bootstrapped_means_under_H0 >= observed_mean` is used to calculate the p-value, meaning we only consider the proportion of bootstrapped means that are greater than or equal to the observed mean.<p>
Removal of Symmetry: We removed the calculation for checking "as or more extreme" values on both sides of the null hypothesis value. The symmetric region used in the two-sided test (for values below the null) was eliminated.<p>
One-sided Test Interpretation: The one-sided test changes the alternative hypothesis from "the population mean differs from 65" to "the population mean is greater than 65."<p>This changes the interpretation of the test. A small one-sided p-value is strong evidence that the population mean, not just the sample mean, is larger than the hypothesized value.<p>
Comparison to Two-sided Test: A two-sided test is more conservative because it tests for both directions (whether the statistic is either greater or smaller than the hypothesized value).<p>As a result, a two-sided test generally requires stronger evidence (a larger deviation from the null) to reject the null hypothesis.
<p>
Should the p-value be Smaller in the One-tailed Test?<p>
Yes, provided the observed statistic falls on the side named by the alternative hypothesis (here, above the null value), we expect the one-tailed p-value to be smaller than the two-tailed one.<p>
This is because the one-tailed test only counts simulated statistics that are as or more extreme in one direction (greater than or equal to the observed mean), whereas the two-tailed test counts extreme values in both directions (greater and less).<p>
Since the one-tailed count leaves out the opposite tail, its p-value can be no larger than the two-tailed value, and it is roughly half of it when the sampling distribution is symmetric. If the observed statistic had instead fallen below the null value, the "greater than" one-tailed p-value would be large rather than small.
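<p>
The comparison can be sketched as follows. The original notebook code and data are not reproduced in this answer, so the sample values and the "shift the sample so its mean equals the null value, then bootstrap" step are assumptions for illustration; only the names `bootstrapped_means_under_H0` and `observed_mean` and the null value 65 come from the description above.

```python
import random

random.seed(3)

mu_0 = 65  # hypothesized mean from the question

# Hypothetical sample (made up for this sketch).
sample = [61, 72, 68, 66, 70, 59, 74, 69, 67, 71,
          63, 73, 66, 70, 68, 64, 75, 62, 69, 71]
observed_mean = sum(sample) / len(sample)

# Simulate H0: shift the sample so its mean equals mu_0, then resample with replacement.
shifted = [x - observed_mean + mu_0 for x in sample]
n_boot = 5_000
bootstrapped_means_under_H0 = [
    sum(random.choices(shifted, k=len(shifted))) / len(shifted)
    for _ in range(n_boot)
]

# Two-sided: as or more extreme in either direction from mu_0.
distance = abs(observed_mean - mu_0)
two_sided_p = sum(abs(m - mu_0) >= distance for m in bootstrapped_means_under_H0) / n_boot

# One-sided: only means at least as large as the observed mean.
one_sided_p = sum(m >= observed_mean for m in bootstrapped_means_under_H0) / n_boot

print(one_sided_p, two_sided_p)
```

Because the observed mean here is above 65, every bootstrapped mean counted by the one-sided condition is also counted by the two-sided one, so the one-sided p-value cannot exceed the two-sided p-value.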

# 8. 

# 9. Yes