
**Simple Non-Parametric T-tests**

In [None]:
import numpy as np
import scipy.stats as st
from statsmodels.stats.descriptivestats import sign_test


According to Microsoft Copilot: the name of the NumPy package in Python stands for Numerical Python.

The scipy.stats package is a submodule of SciPy. It provides statistical functions such as t-tests and includes probability distributions.

The sign_test function from statsmodels runs the sign test, a non-parametric test. It evaluates whether the median of a distribution differs from a specified value (zero by default).

**Sign test: one sample, skewed distribution**

Null hypothesis: a positive difference and a negative difference between paired measurements of a single sample are equally likely (p = 0.5 for each direction).

In [None]:
# Make some paired data
a = [3,10,4,20,4,7,50,3,5,5,7] # this could be a pre-treatment score for a patient
b = [5,9,10,15,6,5,43,6,2,1,0] # this could be a post-treatment score for the same patient
diff = [bi-ai for ai, bi in zip(a,b)] # "i" represents which a and which b values are paired. zip(a, b) pairs up the elements from a and b.
# The result is a list of differences.

_, p = sign_test(diff) # this line runs the sign test on the list of differences. It asks the question - "is the direction of change from a to b statistically significant?"
# sign_test returns the test statistic and the p-value "p". The underscore "_" is used to ignore the test statistic.
print(f'p={p:.2f}') # The p-value is printed and rounded to 2 decimal places, using an f-string for formatting.

p=0.55
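The sign test above can also be reproduced by hand, which makes the null hypothesis concrete: count the positive and negative differences, then ask how surprising that split is under a fair coin. This is a minimal sketch, assuming SciPy 1.7 or later (where `scipy.stats.binomtest` is available); the names `n_pos`, `n_neg`, and `p_by_hand` are introduced here only for illustration.

```python
import scipy.stats as st

# Same paired data as in the cell above
a = [3, 10, 4, 20, 4, 7, 50, 3, 5, 5, 7]
b = [5, 9, 10, 15, 6, 5, 43, 6, 2, 1, 0]
diff = [bi - ai for ai, bi in zip(a, b)]

# Count the direction of each change; differences of exactly zero are dropped
n_pos = sum(d > 0 for d in diff)
n_neg = sum(d < 0 for d in diff)

# Under the null hypothesis, each non-zero difference is positive with probability 0.5
p_by_hand = st.binomtest(n_pos, n_pos + n_neg, p=0.5, alternative='two-sided').pvalue
print(f'{n_pos} positive, {n_neg} negative, p={p_by_hand:.2f}')
```

With 4 positive and 7 negative differences, this two-sided binomial p-value rounds to 0.55, matching the output of `sign_test` above.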


**Wilcoxon signed-rank test: one sample or paired samples, symmetric distribution(s)**

Null hypothesis: the sample tested came from a population with a specified median value. For paired samples, the null hypothesis is that the differences between the pairs have a median of zero.

Typically used as a substitute for a one-sample or paired t-test when the data distribution is not normal.

One sample: test whether the median of a single sample is equal to a specific value.
Paired sample: test whether the differences between two related samples, like before/after measurements from the same subject, are centered on zero (that is, whether the median difference is zero).


In [None]:

samples = np.random.randint(0, high=51, size=200) # uses NumPy to generate a list/array of 200 random integers between 0 and 50
null_hypothesis_median = 24 # sets the null hypothesis median to 24. Here, we assume the true median of the population from which 'samples' were drawn is 24.

# Unlike in Matlab, the scipy implementation does not handle the case of comparing
#  to a median other than zero, so we make this a (fake) paired two-sample test
#  by subtracting the median from each value
_, p = st.wilcoxon(samples-null_hypothesis_median)
print(f'p = {p:.2f}')

p = 0.98
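The paired-sample case described above is not shown in the cell, so here is a minimal sketch of it. It reuses the `a` and `b` lists from the sign test section as the paired measurements (an assumption made for illustration only) and shows that passing the two samples to `st.wilcoxon` gives the same p-value as passing their differences.

```python
import scipy.stats as st

# Paired data reused from the sign test section (pre- and post-treatment scores)
a = [3, 10, 4, 20, 4, 7, 50, 3, 5, 5, 7]
b = [5, 9, 10, 15, 6, 5, 43, 6, 2, 1, 0]
diff = [bi - ai for ai, bi in zip(a, b)]

# Paired form: wilcoxon computes the differences b - a internally
res_paired = st.wilcoxon(b, a)

# One-sample form: test whether the differences are centered on zero
res_diff = st.wilcoxon(diff)

print(f'paired p = {res_paired.pvalue:.2f}, differences p = {res_diff.pvalue:.2f}')
```

Because these differences contain tied magnitudes, SciPy may fall back to a normal approximation for the p-value, and some versions print a warning when it does.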


**Mann-Whitney: unpaired, two sample**

Null hypothesis: two unpaired samples come from independent distributions that share the same median value.

Typically used as a substitute for a two-sample t-test when the data are not normally distributed and the question is whether the two distributions differ in their medians.

In [None]:

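X = np.random.randint(0, high=51, size=200) # 200 random integers between 0 and 50
Y = 2 + np.random.randint(0, high=51, size=200) # a second, independent sample, shifted up by 2
_, p = st.mannwhitneyu(X,Y) # compare the two unpaired samples; "_" ignores the test statistic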
print(f'p = {p:.2f}')

p = 0.01
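Because `X` and `Y` are drawn at random, the p-value printed above changes from run to run. The following is a sketch of a reproducible version, assuming NumPy's `default_rng` generator with an arbitrary seed of 0 (the seed and the name `rng` are choices made here, not part of the original cell). It also passes `alternative='two-sided'` explicitly, since older SciPy releases used a different default.

```python
import numpy as np
import scipy.stats as st

# A seeded generator makes the random samples, and so the p-value, repeatable
rng = np.random.default_rng(seed=0)

X = rng.integers(0, 51, size=200)      # integers from 0 to 50 (upper bound is exclusive)
Y = 2 + rng.integers(0, 51, size=200)  # independent sample shifted up by 2

# State the alternative explicitly so the result does not depend on the SciPy version
_, p = st.mannwhitneyu(X, Y, alternative='two-sided')
print(f'p = {p:.2f}')
```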


**Other notes**

When deciding between the Wilcoxon signed-rank test and the sign test on paired measurements, consider these factors:
*   Is the distribution of differences symmetric or skewed?
*   Does the mean roughly approximate the median or is it different?
*   Do you want the sign and magnitude of the differences or just the sign?

The primary benefit of the sign test is that it is not sensitive to outliers/skew, because it uses only the sign of each difference and ignores its magnitude.

A distribution is symmetric if the values are evenly spread around a central point, like a mirror image.
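The first two questions in the list above can be checked numerically. This is a rough sketch, using the differences from the sign test section as example data; the variable names are introduced here for illustration, and the cutoffs for "close enough" are a judgment call rather than a fixed rule.

```python
import numpy as np
import scipy.stats as st

# Differences (b - a) from the paired data in the sign test section
diff = np.array([2, -1, 6, -5, 2, -2, -7, 3, -3, -4, -7])

mean_diff = diff.mean()        # pulled toward any long tail
median_diff = np.median(diff)  # robust to outliers
skewness = st.skew(diff)       # near 0 for a symmetric sample

print(f'mean = {mean_diff:.2f}, median = {median_diff:.2f}, skewness = {skewness:.2f}')
```

If the mean and median are close and the skewness is near zero, the symmetry assumption behind the Wilcoxon signed-rank test is more plausible; if not, the sign test is the safer choice.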




**Exercise 1**

You are a behavioral geologist measuring the reaction time of rocks in response to a tone. Specifically, you want to compare the median reaction time of geodes to that of limestone. You recruit 20 rocks per group, and run your reaction-time experiment. What test would you use to compare median reaction times between geodes and limestone, and why?

***My answer:*** I would use the Mann-Whitney test because the geodes and limestones are two independent populations and the goal is to compare the medians.

**Exercise 2**

You are a brilliant scientist working at a biotech firm developing a vaccine that reverses aging. Wow! To test the efficacy of the vaccine, you recruit 50 people, give them a course of your vaccine, and measure their age with a very special scale before and after treatment. You want to start by refuting the simple that that the participants' measured ages are not changed by the treatment. What test do you use and why?

***My answer:*** I think that in this context, without knowing what the distribution of differences looks like, I would use the sign test, since the scenario frames my aim as looking only at the direction of the change, not its magnitude.

***My question:*** I'm not sure what the second to last sentence was supposed to say.

**Exercise 3**

You are a neuroeconomist and believe you have developed a wearable device that can change consumer preferences about a given product. To test your device, you present product X to a group of 40 individuals, and ask them to fill out a survey assessing how much they like the product (larger score means they like it more). Then, you have the individuals wear the device, present product X, and assess how much they like the product. You want to know if the device reliably increases, decreases, or does not affect their liking of product X. What test would you use and why? What result would indicate that their liking has increased?

***My answer:*** I would use the sign test because this will provide the direction of change in the participants' results. A positive difference (score with the device minus score without it) would indicate the participant liked the product more, while a negative difference would indicate that the participant liked the product less. A difference of zero would be consistent with the null hypothesis of there being no difference before and after participants wore the device.
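To make "a result that indicates increased liking" concrete, here is a small sketch of a one-sided sign test. The `before` and `after` scores are made-up numbers for eight hypothetical participants (not real survey data), and the test is built from `scipy.stats.binomtest`, assuming SciPy 1.7 or later.

```python
import scipy.stats as st

# Hypothetical liking scores, invented purely for illustration
before = [3, 5, 4, 6, 2, 7, 5, 4]  # without the device
after = [4, 7, 4, 8, 3, 6, 6, 5]   # with the device

diff = [x_after - x_before for x_before, x_after in zip(before, after)]

# Count increases and decreases; unchanged scores (zeros) are dropped
n_pos = sum(d > 0 for d in diff)
n_neg = sum(d < 0 for d in diff)

# One-sided test: are increases more common than a fair coin would predict?
p_increase = st.binomtest(n_pos, n_pos + n_neg, p=0.5, alternative='greater').pvalue
print(f'{n_pos} increases, {n_neg} decreases, one-sided p = {p_increase:.4f}')
```

A clear majority of positive differences together with a small one-sided p-value would point to increased liking. Swapping in `alternative='less'` would test for a decrease instead.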