# DSC 10 Discussion Week 6
---

Welcome to Discussion 6!

**IMPORTANT** 
- **MIDTERM** on November 10th 
- I will not be able to answer any midterm-related content questions until everyone has taken the exam

<img src="data/panda_smile.jpg" width="500">

## $\underline{Lecture\ 10 : Models\ and\ Statistics}$

### Model
- a set of assumptions about data
- assessing the quality of models $\rightarrow$ statistical inference!

### Terminology
- **Parameter** : a number associated with the *population* $\rightarrow$ rarely known exactly
- **Statistic** : a number calculated from the *sample* $\rightarrow$ estimate of a parameter

### Bias-Variance trade-off
- **Bias** : systematic error in one direction (too high or too low) $\rightarrow$ good estimates have *LOW bias*
- **Variance** : degree to which the value of an estimate varies from one sample to the next $\rightarrow$ good estimates have *LOW variance*
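
To make the two terms concrete, here is a minimal sketch. It assumes a made-up population (the integers 1 through 100) and two hypothetical estimators of the population maximum: the sample maximum, and twice the sample mean. The population, sample size, seed, and variable names are illustrative assumptions, not from lecture.

```python
import numpy as np

np.random.seed(10)  # arbitrary seed, only so the sketch is reproducible

population = np.arange(1, 101)  # made-up population; the parameter of interest is its max, 100
sample_size = 20
num_repetitions = 2000

max_estimates = np.array([])
double_mean_estimates = np.array([])
for _ in range(num_repetitions):
    # Draw one random sample and compute both estimates from it
    sample = np.random.choice(population, sample_size, replace=False)
    max_estimates = np.append(max_estimates, sample.max())
    double_mean_estimates = np.append(double_mean_estimates, 2 * sample.mean())

# Bias shows up in the average of the estimates; variance shows up in their spread
print("sample max:      average =", max_estimates.mean(), " SD =", max_estimates.std())
print("2 * sample mean: average =", double_mean_estimates.mean(), " SD =", double_mean_estimates.std())
```

The sample maximum can never exceed the true maximum, so it is biased low, but it varies little between samples. Twice the sample mean lands much closer to 100 on average, but it varies far more. That tension is the trade-off.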

### Simulation
- **Single experiment** : ```np.random.multinomial(sample_size, pop_distribution)```
- **A bunch of experiments** : iteration!
- **Visualize** : plot! $\rightarrow$ often *histogram* to show distribution
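
The first two steps above can be sketched as follows. The population distribution (a 60/40 split), the sample size, the number of repetitions, and the seed are all made up for illustration.

```python
import numpy as np

np.random.seed(11)  # arbitrary seed, only so the sketch is reproducible

pop_distribution = np.array([0.6, 0.4])  # hypothetical two-category population
sample_size = 50
num_experiments = 1000

# Single experiment: how many of the 50 draws landed in each category
one_experiment = np.random.multinomial(sample_size, pop_distribution)
print(one_experiment)

# A bunch of experiments: repeat, and keep track of the statistic each time
first_category_counts = np.array([])
for _ in range(num_experiments):
    counts = np.random.multinomial(sample_size, pop_distribution)
    first_category_counts = np.append(first_category_counts, counts[0])

print(first_category_counts.mean())
```

Plotting `first_category_counts` as a histogram would then show the empirical distribution of the statistic.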

## $\underline{Lecture\ 11 : Hypothesis\ Testing}$

### Two Viewpoints
- **Null Hypothesis** : default view $\rightarrow$ must be simulatable
- **Alternative Hypothesis** : opposite of Null Hypothesis

### Computing statistics under Null Hypothesis
- Choose a relevant *test statistic*
    - counts, ratios, differences, absolute differences, etc. depending on problem
    - **Total Variation Distance** : a measure of how far apart two distributions are
    - be careful with use of ```abs()```!
- Track experiment outcomes and compute the **empirical distribution of the statistic under the null hypothesis**
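
As a sketch of why `abs()` deserves care, compare two made-up categorical distributions with and without absolute values. The function name `total_variation_distance` and both arrays are assumptions for illustration.

```python
import numpy as np

def total_variation_distance(dist1, dist2):
    """Half the sum of the absolute differences between two distributions."""
    return np.abs(dist1 - dist2).sum() / 2

observed = np.array([0.5, 0.3, 0.2])  # hypothetical observed proportions
expected = np.array([0.4, 0.4, 0.2])  # hypothetical model proportions

# Without abs(), positive and negative differences cancel out
raw_sum = (observed - expected).sum()
print(raw_sum)

# With abs(), the differences accumulate
tvd = total_variation_distance(observed, expected)
print(tvd)
```

Because both distributions sum to 1, the raw differences always sum to (essentially) zero, which makes them useless as a statistic. Taking absolute values first is what lets the statistic grow as the distributions move apart.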

### Drawing conclusions
- Compare the following : 
    - **observed test statistic** (red dot/line from class) 
    - **empirical distribution under the null hypothesis** (histograms from experiments)
- Determine if the observed value is consistent with the null hypothesis
    - by visualization or some other conventional quantitative measure
    - **p-value** : probability, computed under the null hypothesis, of seeing a result *at least* as extreme as the observed one
        - common cutoff is 5% for statistical significance
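
A tiny worked example of an empirical p-value, assuming that larger values of the statistic point towards the alternative. The ten simulated values and the observed value are made up so the arithmetic can be checked by hand.

```python
import numpy as np

# Hypothetical statistics from ten simulated experiments under the null
simulated_stats = np.array([0, 1, 2, 3, 4, 5, 1, 2, 0, 3])
observed_stat = 4

# Proportion of simulated statistics at least as extreme as the observed one
p_value = (simulated_stats >= observed_stat).mean()
print(p_value)  # 2 of the 10 values (4 and 5) are >= 4, so this prints 0.2
```

A p-value of 0.2 is above the 5% cutoff, so in this toy case the observation would be considered consistent with the null hypothesis.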

#### Extra
- You can find additional help on these topics in the course [textbook](https://eldridgejm.github.io/dive_into_data_science/front.html).
- [Here](https://ucsd-ets.github.io/dsc10-2020-fa/published/default/reference/babypandas-reference.pdf) is a pointer to that reference sheet we saw last time.

In [1]:
import babypandas as bpd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Example 1: Fighting Professors

Two professors are fighting about who is the better teacher. To settle the matter, they decide to give each of their classes the same exam. The professor whose class performs better will be considered the better teacher.

## The Data

In [2]:
scores = bpd.read_csv('data/scores.csv')
scores

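| Unnamed: 0 | prof | score |
|---|---|---|
| 0 | A | 87.133940 |
| 1 | A | 67.265656 |
| 2 | A | 62.050823 |
| 3 | A | 81.964750 |
| 4 | A | 91.909607 |
| ... | ... | ... |
| 121 | B | 79.515321 |
| 122 | B | 72.241259 |
| 123 | B | 76.123686 |
| 124 | B | 69.966714 |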


## Exploration

<!-- BEGIN QUESTION -->

Which professor (A or B) appears to have "won"?

<!--
BEGIN QUESTION
name: q10
manual: true
-->

In [17]:
scores.groupby('prof').mean() # SOLUTION NO PROMPT
won_prof = 'A' # SOLUTION

In [None]:
grader.check("q10")

<!-- END QUESTION -->



## Question 1

The winning professor claims that they are significantly better than the other professor -- and it isn't just due to random chance. What technique can we use to evaluate their claim?

**Answer**:

<!-- BEGIN QUESTION -->

<!--
BEGIN QUESTION
name: q11
manual: true
-->

_Type your answer here, replacing this text._

**Solution**:

Hypothesis testing

<!-- END QUESTION -->



## Question 2

What are the null and alternative hypotheses?

- **Null**:
- **Alternative**:

<!-- BEGIN QUESTION -->

<!--
BEGIN QUESTION
name: q12
manual: true
-->

_Type your answer here, replacing this text._

**Solution**:

Null hypothesis: In the population, the distribution of scores for students under professor A is the same as the distribution for students under professor B. The difference in the sample is due to chance.

Alternative hypothesis: In the population, students under professor A have higher scores, on average, than students under professor B.

<!-- END QUESTION -->



## Question 3

What test statistic can we use? Remember: it is usually better for *large* values of the test statistic to point towards the alternative hypothesis.

**Answer**:

<!-- BEGIN QUESTION -->

<!--
BEGIN QUESTION
name: q13
manual: true
-->

_Type your answer here, replacing this text._

**Solution**:

Difference in means between student scores under professor A and professor B (A minus B). The larger the difference, the more the statistic points towards professor A, i.e., the alternative hypothesis.

<!-- END QUESTION -->



## Question 4

What was the *observed value* of your test statistic?

In [5]:
obs = scores.groupby('prof').mean().get('score').loc['A'] - scores.groupby('prof').mean().get('score').loc['B'] # SOLUTION
obs

4.7411054460319235

In [None]:
grader.check("q14")

## Question 5

Implement your chosen technique to test whether the null hypothesis should be rejected.

In [7]:
num_simulations = 1000
simulated_stats = np.array([]) # SOLUTION
# BEGIN SOLUTION NO PROMPT
for _ in range(num_simulations):
    shuffled_scores = np.random.permutation(scores.get('score'))
    shuffled = scores.assign(shuffled_score = shuffled_scores)
    group_means = shuffled.groupby('prof').mean().get('shuffled_score')
    difference = group_means.loc['A'] - group_means.loc['B']
    simulated_stats = np.append(simulated_stats, difference)
# END SOLUTION
simulated_stats

array([-8.03926446e-01,  8.97716941e-01,  3.22565010e+00,  2.36161957e+00,
        3.95190648e-01, -2.62548026e-01,  1.58764686e+00,  3.24102629e+00,
       -5.58152612e-01,  1.28746188e+00, -2.04997617e+00, -2.52243166e+00,
        3.68243414e+00, -2.34986667e+00,  3.34508544e-01,  2.30471600e-01,
        1.69733717e+00,  1.41808526e+00, -8.32158317e-01, -1.21975316e+00,
        1.26563438e+00,  2.01503269e+00,  6.57817378e-01,  1.10095518e+00,
       -3.61009359e-01,  3.36875483e-01, -9.91085335e-01,  2.05952645e+00,
        5.64684740e-01,  1.47361849e+00,  3.54957639e+00,  2.69705909e+00,
       -2.36416253e+00, -2.99829660e+00,  2.83669144e-01,  1.04211749e+00,
       -1.20987863e+00,  2.26751194e+00,  5.01290382e-01,  9.65896134e-01,
       -3.17607100e-01, -2.43772241e-01,  1.24400880e+00, -1.05497140e+00,
       -5.25774316e+00, -8.45420040e-01, -1.12782432e+00,  3.66905715e-01,
       -1.56270158e+00,  2.93657808e-01, -8.89914095e-01, -1.48170774e+00,
       -1.37602141e+00, -
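
The same shuffling idea can be sketched without the course data. The two score arrays below are synthetic (drawn from made-up normal distributions, with made-up group sizes), and `difference_in_means` is a hypothetical helper, so the numbers will not match the output above.

```python
import numpy as np

np.random.seed(6)  # arbitrary seed, only so the sketch is reproducible

# Synthetic scores, NOT the contents of data/scores.csv
scores_a = np.random.normal(80, 10, 60)
scores_b = np.random.normal(75, 10, 65)

all_scores = np.append(scores_a, scores_b)
num_a = len(scores_a)

def difference_in_means(values, num_first_group):
    """Mean of the first num_first_group values minus the mean of the rest."""
    return values[:num_first_group].mean() - values[num_first_group:].mean()

observed_difference = difference_in_means(all_scores, num_a)

simulated_differences = np.array([])
for _ in range(1000):
    # Shuffling breaks any link between professor and score, as the null assumes
    shuffled = np.random.permutation(all_scores)
    simulated_differences = np.append(
        simulated_differences, difference_in_means(shuffled, num_a)
    )

empirical_p = (simulated_differences >= observed_difference).mean()
print(observed_difference, empirical_p)
```

Shuffling the scores while keeping group sizes fixed is equivalent to shuffling the professor labels. Under the null hypothesis the simulated differences should be centered near zero.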

## Question 6

What is the probability that we see a value of the test statistic at least as large as our observed value if the null hypothesis is true?

In [8]:
p_val = (simulated_stats >= obs).mean() # SOLUTION
p_val

0.002

In [None]:
grader.check("q16")

## Question 7

The "winning" professor claims that the results show that they are the better teacher. Is this correct?

In [10]:
claim_true_or_false = True # SOLUTION
claim_true_or_false

True

In [None]:
grader.check("q17")

# Example 2: Fun with Test Statistics

## Question 8

You want to test whether a coin is fair. Your hypotheses are:

- **Null**: the coin is fair
- **Alternative**: the coin is not fair

You'll flip the coin 100 times. What test statistic should you use to assess your claim?

In [12]:
# fill out the following code to set up this experiment

num_flips = 100 # SOLUTION

# model the probability of our coin
model = np.array([0.5, 0.5]) # SOLUTION

# flip our coin ... times
flip_outcomes = np.random.multinomial(num_flips, model) # SOLUTION

# flip_outcomes = [num_heads, num_tails]
num_heads = flip_outcomes[0]

# What is our test statistic?
def test_statistic(num_heads):
    return np.abs(num_heads - 50) # SOLUTION

# compute test statistic
print(f"Test statistic result : {test_statistic(num_heads)}")

Test statistic result : 2


## Question 9

In your experiment, you saw 61 heads. What is the observed value of your test statistic?

In [13]:
num_heads_experiment = 61

observed_test_statistic = test_statistic(num_heads_experiment) # SOLUTION

print(f"Test statistic result : {observed_test_statistic}")

Test statistic result : 11
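
To judge whether an observed statistic of 11 is unusual for a fair coin, one option is to simulate the statistic many times under the null hypothesis. This sketch is self-contained, so it redefines the statistic under the hypothetical name `abs_distance_from_50`. The number of repetitions and the seed are arbitrary choices.

```python
import numpy as np

np.random.seed(9)  # arbitrary seed, only so the sketch is reproducible

num_flips = 100
fair_coin = np.array([0.5, 0.5])
observed_statistic = abs(61 - 50)

def abs_distance_from_50(num_heads):
    """How far the number of heads is from the 50 expected for a fair coin."""
    return np.abs(num_heads - 50)

simulated_statistics = np.array([])
for _ in range(10000):
    # One experiment under the null: flip a fair coin 100 times
    num_heads = np.random.multinomial(num_flips, fair_coin)[0]
    simulated_statistics = np.append(
        simulated_statistics, abs_distance_from_50(num_heads)
    )

# Proportion of simulated experiments at least as extreme as what we observed
p_value = (simulated_statistics >= observed_statistic).mean()
print(p_value)
```

With a fair coin, a statistic of 11 or more turns up in only a few percent of simulated experiments, so the estimate usually falls below the common 5% cutoff.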


## Question 10

You want to test whether an *n*-sided die is fair. Your hypotheses are:

- **Null**: the die is fair
- **Alternative**: the die is not fair

You'll roll the die 100 times. What test statistic should you use to assess your claim?

In [14]:
# fill out the following code to set up this experiment


# specify number of sides
N = 20
num_rolls = 100 # SOLUTION

# model the probability of our die
model_die = np.array([1/N]*N) # SOLUTION

# roll our die ... times
roll_outcomes = np.random.multinomial(num_rolls, model_die) # SOLUTION

# roll_outcomes = [count_num_side_1 ,..., ..., count_num_side_N]
# roll_outcomes_prob = [perc_num_side_1 ,..., ..., perc_num_side_N]

roll_outcomes_prob = roll_outcomes / num_rolls # SOLUTION

# What is our test statistic?
def test_statistic_die(roll_outcomes_prob, model_die):
    return np.abs(roll_outcomes_prob - model_die).sum() / 2 # SOLUTION

# compute test statistic
print(f"Test statistic result : {test_statistic_die(roll_outcomes_prob, model_die)}")

Test statistic result : 0.15000000000000002


## Question 11

You rolled a 4-sided die 100 times and got "one" 20 times, "two" 30 times, "three" 40 times, and "four" 10 times. What is the observed value of your test statistic?

In [15]:
# specify number of sides
N = 4
num_rolls = 100 # SOLUTION

# Given roll outcomes
roll_outcomes = np.array([20, 30, 40, 10]) 
roll_outcomes_prob = roll_outcomes / num_rolls # SOLUTION

# model the probability of our die
model_die = np.array([1/N]*N) # SOLUTION

# compute the test statistic
# (stored under a new name so the test_statistic function from Question 8 is not overwritten)
observed_tvd = test_statistic_die(roll_outcomes_prob, model_die) # SOLUTION

# display results
print(f"Test statistic result : {observed_tvd}")

Test statistic result : 0.2
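
Written out by hand, with each side of a fair 4-sided die having probability $0.25$, the computation is:

$$\frac{1}{2}\Big(|0.20 - 0.25| + |0.30 - 0.25| + |0.40 - 0.25| + |0.10 - 0.25|\Big) = \frac{1}{2}\,(0.05 + 0.05 + 0.15 + 0.15) = 0.2$$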


## Question 12

You rolled a 2-sided die 100 times and got "one" 61 times and "two" 39 times. What is the observed value of your test statistic?

In [16]:
# specify number of sides
N = 2
num_rolls = 100 # SOLUTION

# Given roll outcomes
roll_outcomes = np.array([61, 39]) 
roll_outcomes_prob = roll_outcomes / num_rolls # SOLUTION

# model the probability of our die
model_die = np.array([1/N]*N) # SOLUTION

# compute the test statistic
# (stored under a new name so the test_statistic function from Question 8 is not overwritten)
observed_tvd = test_statistic_die(roll_outcomes_prob, model_die) # SOLUTION

# display results
print(f"Test statistic result : {observed_tvd}")

Test statistic result : 0.10999999999999999
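
With only two sides, this statistic carries the same information as the coin statistic from Question 8: it equals the absolute difference from 50, divided by the 100 rolls. Here is a quick check, assuming 100 rolls and using the hypothetical helper name `tvd_from_first_count`:

```python
import numpy as np

num_rolls = 100
fair_model = np.array([0.5, 0.5])

def tvd_from_first_count(first_count):
    """Half the sum of absolute differences between observed proportions and a fair 2-sided die."""
    proportions = np.array([first_count, num_rolls - first_count]) / num_rolls
    return np.abs(proportions - fair_model).sum() / 2

# Compare against |count - 50| / 100 for a few counts of side "one"
for first_count in [39, 50, 61]:
    print(first_count, tvd_from_first_count(first_count), abs(first_count - 50) / num_rolls)
```

The two columns agree up to floating-point rounding, which is why the result above displays as 0.10999999999999999 rather than 0.11.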
