## Introduction to the Sample Final Test

Dear Students,

Welcome to the sample final test for our laboratory course. This test is designed to assess your understanding and application of the concepts and techniques we have covered throughout the semester. 

Instructions:

Read Each Question Carefully: Ensure you understand what is being asked before you start coding.

Write Clean and Commented Code: Your code should be well-organized and include comments explaining your logic.

Test Your Code: Make sure to test your code with different inputs to ensure it works correctly.

Draw Conclusions: State your final decision for each test and comment on both the statistical and the practical significance of the result.

Resources:

You are allowed to use your notes, textbooks, and online resources to help you complete the test. 

**Please be advised that the use of any Generative AI (GenAI) tools is strictly prohibited during this test. This includes, but is not limited to, tools that generate code, text, or any other form of content based on AI algorithms.**

Collaboration with classmates is not permitted. This test is an individual assessment of your skills.

I encourage you to take your time and approach each question methodically. This test is an opportunity to demonstrate your proficiency and understanding of the material. 

Best regards,

Karol
/Mathematical Statistics 2024/2025/



# Task 1: Verify the Hypothesis

Objective: Verify the hypothesis that the salaries of professors working in theoretical departments (B) are much lower than those working in applied departments (A).

In [13]:
import pandas as pd
import numpy as np

# Load the Salaries dataset from the URL
url = "https://vincentarelbundock.github.io/Rdatasets/csv/carData/Salaries.csv"
salaries = pd.read_csv(url)

# Filter the data based on the department type
theoretical_salaries = salaries[salaries['discipline'] == 'B']['salary']
applied_salaries = salaries[salaries['discipline'] == 'A']['salary']

# Display the first few rows of the dataset
salaries

     rownames      rank  discipline  yrs.since.phd  yrs.service   sex  salary
0           1      Prof           B             19           18  Male  139750
1           2      Prof           B             20           16  Male  173200
2           3  AsstProf           B              4            3  Male   79750
3           4      Prof           B             45           39  Male  115000
4           5      Prof           B             40           41  Male  141500
...       ...       ...         ...            ...          ...   ...     ...
392       393      Prof           A             33           30  Male  103106
393       394      Prof           A             31           19  Male  150564
394       395      Prof           A             42           25  Male  101738
395       396      Prof           A             25           15  Male   95329


In [8]:
# your solution
print(theoretical_salaries.size, applied_salaries.size)

from scipy.stats import shapiro

stat, p = shapiro(theoretical_salaries)
print(f'Shapiro Statistic={stat}, p={p}')

stat, p = shapiro(applied_salaries)
print(f'Shapiro Statistic={stat}, p={p}')


216 181
Shapiro Statistic=0.9611942982211139, p=1.271009083842308e-05
Shapiro Statistic=0.941829929220077, p=1.0267201943206342e-06


In [15]:
from scipy.stats import boxcox

# Note: each call estimates its own lambda, so the two transformed samples
# are not on a common scale and their means are not directly comparable.
t_theoretical_salaries, _ = boxcox(theoretical_salaries)
t_applied_salaries, _ = boxcox(applied_salaries)

stat, p = shapiro(t_theoretical_salaries)
print(f'Shapiro Statistic={stat}, p={p}')

stat, p = shapiro(t_applied_salaries)
print(f'Shapiro Statistic={stat}, p={p}')

Shapiro Statistic=0.9891412860313052, p=0.10216707231581107
Shapiro Statistic=0.9850858810631478, p=0.051302860959439006
OK


In [14]:
from scipy.stats import levene

stat, p_value = levene(t_theoretical_salaries, t_applied_salaries)
print(f"Statistic: {stat}, P-value: {p_value}")

Statistic: 286.4828141687703, P-value: 1.0247816132973972e-48


In [23]:
from scipy.stats import ttest_ind

# T-Test(t_applied_salaries < t_theoretical_salaries)
t_stat, p_value = ttest_ind(t_applied_salaries, t_theoretical_salaries, equal_var=False, alternative='less')

print(f"T-statistic: {t_stat:.2f}, P-value: {p_value}")

T-statistic: -1589.20, P-value: 0.0


## REJECT H_0
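
Because `boxcox` is fitted separately for each group above, the two transformed samples do not share a scale, which may explain the extreme t-statistic. The following is a minimal sketch of an alternative, assuming a shared log transform and Welch's one-sided test. The helper name `compare_lower` and the synthetic samples are hypothetical stand-ins, not the real salary data.

```python
import numpy as np
from scipy.stats import ttest_ind


def compare_lower(lower_group, higher_group):
    """One-sided Welch test of H1: mean(log lower_group) < mean(log higher_group)."""
    x = np.log(np.asarray(lower_group, dtype=float))
    y = np.log(np.asarray(higher_group, dtype=float))
    t_stat, p_value = ttest_ind(x, y, equal_var=False, alternative="less")
    # Cohen's d on the log scale, using the pooled standard deviation
    pooled_var = ((len(x) - 1) * x.var(ddof=1) + (len(y) - 1) * y.var(ddof=1)) / (len(x) + len(y) - 2)
    cohen_d = (x.mean() - y.mean()) / np.sqrt(pooled_var)
    return t_stat, p_value, cohen_d


# Hypothetical salary samples (not the real data), only to show the call
rng = np.random.default_rng(0)
group_low = rng.lognormal(mean=11.4, sigma=0.2, size=200)
group_high = rng.lognormal(mean=11.6, sigma=0.2, size=200)

t_stat, p_value, cohen_d = compare_lower(group_low, group_high)
print(f"t = {t_stat:.2f}, p = {p_value:.4g}, d = {cohen_d:.2f}")
```

The order of the arguments sets the direction: with `alternative="less"`, the first sample is the one hypothesised to be lower. For the objective as stated (B lower than A), the call would be `compare_lower(theoretical_salaries, applied_salaries)`.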


# Task 2: Verify the Hypothesis

Objective: Verify if the proportion of higher rank professors (associate and full professors) is significantly different between male and female scientists.

In [28]:
from scipy.stats import chi2_contingency
from pandas import crosstab

# Note: 'AsstProf' maps to None here, so assistant professors are dropped from the crosstab below
salaries['higher_rank'] = salaries['rank'].apply(lambda x: 'low' if x == 'AssocProf' else 'high' if x == 'Prof' else None)

# def empty_function(value):
#     if value == 'AssocProf':
#         return 'low'
#     elif value == 'Prof':
#         return 'high'
#     else:
#         return None

table = crosstab(salaries['higher_rank'], salaries['sex'])
print(table)

chi2, p, dof, expected = chi2_contingency(table)

print(f"Chi2 Statistic: {chi2}, Df: {dof}, P-value: {p}")

sex          Female  Male
higher_rank              
high             18   248
low              10    54
Chi2 Statistic: 4.134655979582055, Df: 1, P-value: 0.04201360631381723
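
The objective groups associate and full professors together as the higher ranks, so under that reading assistant professors would form the comparison group rather than being excluded. A minimal sketch of that grouping follows. The helper name `higher_rank_test` and the toy counts are hypothetical and only reuse the column names of the Salaries dataset.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def higher_rank_test(df):
    """Chi-square test of independence between higher rank (high/low) and sex."""
    is_higher = df["rank"].isin(["AssocProf", "Prof"]).map({True: "high", False: "low"})
    table = pd.crosstab(is_higher, df["sex"])
    chi2, p, dof, expected = chi2_contingency(table)
    return table, chi2, p, dof, expected


# Toy data (hypothetical counts) with the same column names as the Salaries dataset
toy = pd.DataFrame({
    "rank": ["Prof"] * 30 + ["AssocProf"] * 10 + ["AsstProf"] * 20,
    "sex": (["Male"] * 24 + ["Female"] * 6      # Prof
            + ["Male"] * 7 + ["Female"] * 3     # AssocProf
            + ["Male"] * 12 + ["Female"] * 8),  # AsstProf
})

table, chi2, p, dof, expected = higher_rank_test(toy)
print(table)
print(f"Chi2 = {chi2:.3f}, df = {dof}, p = {p:.4f}")
```

It is worth inspecting `expected` before trusting the result: if any expected count falls below about 5, Fisher's exact test is the usual fallback for a 2x2 table.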


# Task 3: Verify the Hypothesis

Objective: Verify if the salaries of professors are significantly different based on rank, gender, and discipline, and check for interactions between these groups.

In [29]:
from scipy.stats import shapiro

stat, p = shapiro(salaries['salary'])
print(f'Shapiro Statistic={stat}, p={p}')

Shapiro Statistic=0.9598763278424717, p=6.076052123031469e-09


In [30]:
from scipy.stats import boxcox

salaries['t_salary'], _ = boxcox(salaries['salary'])

stat, p = shapiro(salaries['t_salary'])
print(f'Shapiro Statistic={stat}, p={p}')

Shapiro Statistic=0.9913485882391487, p=0.020271080134795293


In [42]:
column = 'rank'
groups = [salaries[salaries[column] == g]['t_salary'] for g in salaries[column].unique()]
stat, p = levene(*groups)
print(f'Levene Statistic Rank={stat}, p={p}')

column = 'sex'
groups = [salaries[salaries[column] == g]['t_salary'] for g in salaries[column].unique()]
stat, p = levene(*groups)
print(f'Levene Statistic Sex={stat}, p={p}')

column = 'discipline'
groups = [salaries[salaries[column] == g]['t_salary'] for g in salaries[column].unique()]
stat, p = levene(*groups)
print(f'Levene Statistic Discipline={stat}, p={p}')

Levene Statistic Rank=15.39244267707073, p=3.65975319846777e-07
Levene Statistic Sex=0.1404424384435778, p=0.7080426838760642
Levene Statistic Discipline=1.4833588695145101, p=0.2239767480337511


In [43]:
from statsmodels.formula.api import ols
import statsmodels.api as sm

model = ols('t_salary ~ sex + discipline', data=salaries).fit()
anova_table = sm.stats.anova_lm(model, typ=2)

print(anova_table)

              sum_sq     df          F    PR(>F)
sex         0.002357    1.0   9.265586  0.002492
discipline  0.003334    1.0  13.107670  0.000332
Residual    0.100214  394.0        NaN       NaN


# Task 4: Verify the Hypothesis

Objective: Verify if credit amounts (in DM) are significantly different for people applying with different job, personal status, sex, or age.

In [3]:
import pandas as pd

# Load the GermanCredit dataset from GitHub
url = "https://raw.githubusercontent.com/selva86/datasets/master/GermanCredit.csv"
germancredit = pd.read_csv(url)

# Display the first few rows of the dataset
print(germancredit.columns)
print(germancredit.head())

Index(['status', 'duration', 'credit_history', 'purpose', 'amount', 'savings',
       'employment_duration', 'installment_rate', 'personal_status_sex',
       'other_debtors', 'present_residence', 'property', 'age',
       'other_installment_plans', 'housing', 'number_credits', 'job',
       'people_liable', 'telephone', 'foreign_worker', 'credit_risk'],
      dtype='object')
                status  duration                            credit_history  \
0         ... < 100 DM         6   critical account/other credits existing   
1    0 <= ... < 200 DM        48  existing credits paid back duly till now   
2  no checking account        12   critical account/other credits existing   
3         ... < 100 DM        42  existing credits paid back duly till now   
4         ... < 100 DM        24           delay in paying off in the past   

               purpose  amount                     savings  \
0  domestic appliances    1169  unknown/no savings account   
1  domestic appliances    59

In [20]:
# your solution
# 2-way ANCOVA
from scipy.stats import shapiro
from scipy.stats import boxcox

# Box-Cox transform of the credit amount, followed by a normality check
transformed_data, lambda_value = boxcox(germancredit['amount'])

stat, p = shapiro(transformed_data)
print(f'Shapiro Statistic={stat}, p={p}')

# Store the transformed amount for the Levene test and the model below
germancredit['amount_trans'] = transformed_data


Shapiro Statistic=0.994295610968472, p=0.0007607681859426297


In [21]:
from scipy.stats import levene

credit_wide = germancredit.melt(id_vars='amount_trans', value_vars=['personal_status_sex'])
print(credit_wide.head())
groups = [credit_wide[credit_wide["value"] == g]["amount_trans"] for g in credit_wide["value"].unique()]
stat, p = levene(*groups)
print(f'Levene Statistic={stat}, p={p}')

   amount_trans             variable                                value
0      5.682667  personal_status_sex                        male : single
1      6.666001  personal_status_sex  female : divorced/separated/married
2      6.047305  personal_status_sex                        male : single
3      6.825672  personal_status_sex                        male : single
4      6.550332  personal_status_sex                        male : single
Levene Statistic=1.0432200773471447, p=0.3725560351845742


In [37]:
from statsmodels.formula.api import ols
import statsmodels.api as sm
# Perform two-way ANCOVA
model = ols('amount_trans ~ personal_status_sex * job + age', data=germancredit).fit()
anova_table = sm.stats.anova_lm(model, typ=2)

# Display the ANCOVA results
print(anova_table)

                             sum_sq     df          F        PR(>F)
personal_status_sex       15.374696    3.0  27.241914  3.054340e-12
job                        9.639518    3.0  17.079941  5.106528e-08
personal_status_sex:job    2.876209    9.0   1.698753  9.462776e-02
age                        1.046340    1.0   5.561925  1.854999e-02
Residual                 185.115491  984.0        NaN           NaN




# Task 5: Evaluate Interaction Between Group and Time

Description: 

The data provide the anxiety scores, measured at three time points, of three groups of individuals practicing physical exercises at different levels (grp1: basal, grp2: moderate, and grp3: high).

Objective: Evaluate if there is an interaction between group and time in explaining anxiety scores.

In [13]:
import pandas as pd

# Load the anxiety dataset from GitHub
url = "https://raw.githubusercontent.com/kflisikowski/ds/master/anxiety.csv"
anxiety_data = pd.read_csv(url)

# Display the first few rows of the dataset
print(anxiety_data.head())

   Unnamed: 0  id group    t1    t2    t3
0           1   1  grp1  14.1  14.4  14.1
1           2   2  grp1  14.5  14.6  14.3
2           3   3  grp1  15.7  15.2  14.9
3           4   4  grp1  16.0  15.5  15.3
4           5   5  grp1  16.5  15.8  15.7


In [None]:
# your solution
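
One possible approach, sketched here under stated assumptions, is a linear mixed model with a random intercept per subject, comparing the fit with and without the group-by-time interaction through a likelihood-ratio test. This swaps in a mixed model for the classical two-way mixed ANOVA. The data below are synthetic (hypothetical values) and only mirror the layout of `anxiety.csv`.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2

# Synthetic stand-in with the same layout as anxiety.csv (hypothetical values)
rng = np.random.default_rng(2)
n_per_group = 15
wide = pd.DataFrame({
    "id": np.arange(1, 3 * n_per_group + 1),
    "group": np.repeat(["grp1", "grp2", "grp3"], n_per_group),
})
subject_level = rng.normal(16.5, 1.0, size=len(wide))
drop_per_step = wide["group"].map({"grp1": 0.2, "grp2": 0.5, "grp3": 1.5}).to_numpy()
for step, col in enumerate(["t1", "t2", "t3"]):
    wide[col] = subject_level - step * drop_per_step + rng.normal(0, 0.3, size=len(wide))

# Wide -> long: one row per subject and time point
long = wide.melt(id_vars=["id", "group"], value_vars=["t1", "t2", "t3"],
                 var_name="time", value_name="score")

# Random intercept per subject; fit with ML so the two models are comparable
full = smf.mixedlm("score ~ C(group) * C(time)", long, groups=long["id"]).fit(reml=False)
reduced = smf.mixedlm("score ~ C(group) + C(time)", long, groups=long["id"]).fit(reml=False)

# Likelihood-ratio test of the interaction terms
lr_stat = 2 * (full.llf - reduced.llf)
df_diff = len(full.fe_params) - len(reduced.fe_params)
p_value = chi2.sf(lr_stat, df_diff)
print(f"LR = {lr_stat:.2f}, df = {df_diff}, p = {p_value:.4g}")
```

With the real data, `wide` would simply be `anxiety_data`. A small p-value would suggest that the change in anxiety over time differs between the exercise groups.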

# Task 6: Evaluate the Goodness of Fit

Objective: Use the goodness of fit test to determine whether the distribution of credit amounts for male customers matches that of female customers.

In [14]:
import pandas as pd

# Load the German Credit dataset from GitHub
url = "https://raw.githubusercontent.com/selva86/datasets/master/GermanCredit.csv"
germancredit = pd.read_csv(url)

# Display the first few rows of the dataset
print(germancredit.head())

                status  duration                            credit_history  \
0         ... < 100 DM         6   critical account/other credits existing   
1    0 <= ... < 200 DM        48  existing credits paid back duly till now   
2  no checking account        12   critical account/other credits existing   
3         ... < 100 DM        42  existing credits paid back duly till now   
4         ... < 100 DM        24           delay in paying off in the past   

               purpose  amount                     savings  \
0  domestic appliances    1169  unknown/no savings account   
1  domestic appliances    5951                ... < 100 DM   
2           retraining    2096                ... < 100 DM   
3     radio/television    7882                ... < 100 DM   
4            car (new)    4870                ... < 100 DM   

  employment_duration  installment_rate                  personal_status_sex  \
0      ... >= 7 years                 4                        male : single  

In [None]:
# your solution
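
A minimal sketch of one way to compare the two distributions, assuming a two-sample Kolmogorov-Smirnov test is an acceptable reading of "goodness of fit" here. The toy amounts are hypothetical, and the code assumes every `personal_status_sex` label starts with "male" or "female", as in the rows printed above.

```python
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp

# Toy stand-in for germancredit (hypothetical amounts); labels follow the format shown above
rng = np.random.default_rng(3)
toy = pd.DataFrame({
    "personal_status_sex": ["male : single"] * 120
                           + ["female : divorced/separated/married"] * 80,
    "amount": np.concatenate([rng.lognormal(7.9, 0.8, 120),
                              rng.lognormal(7.7, 0.8, 80)]).round(),
})

# Derive sex from the text before the colon
toy["sex"] = toy["personal_status_sex"].str.split(":").str[0].str.strip()

male = toy.loc[toy["sex"] == "male", "amount"]
female = toy.loc[toy["sex"] == "female", "amount"]

# Two-sample Kolmogorov-Smirnov test: H0 says both samples come from the same distribution
stat, p_value = ks_2samp(male, female)
print(f"KS statistic = {stat:.3f}, p = {p_value:.4f}")
```

An alternative with the same goal is to bin the amounts and run a chi-square test, using the female proportions per bin as the expected distribution for the male customers.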

# Task 7: Evaluate the Change in Asthma Symptoms Over Time

Objective: Determine if there is a significant change in asthma symptoms reported by participants at two different time points.

In [2]:
import pandas as pd

# Load the asthma dataset from GitHub
url = "https://github.com/bougioukas/basic_stats_R/raw/main/data/asthma.xlsx"
asthma_data = pd.read_excel(url)

# Display the first few rows of the dataset
print(asthma_data.head())

  know_begin know_end
0        yes      yes
1         no       no
2        yes       no
3         no       no
4         no       no


In [None]:
# your solution
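
Because the same participants answer at both time points and the answers are binary, McNemar's test is a natural candidate. The sketch below uses toy paired answers (hypothetical counts) with the same column names as `asthma.xlsx`.

```python
import pandas as pd
from statsmodels.stats.contingency_tables import mcnemar

# Toy paired answers (hypothetical counts), same column names as asthma.xlsx
toy = pd.DataFrame({
    "know_begin": ["yes"] * 10 + ["yes"] * 3 + ["no"] * 12 + ["no"] * 15,
    "know_end":   ["yes"] * 10 + ["no"] * 3 + ["yes"] * 12 + ["no"] * 15,
})

# 2x2 table of paired responses; the off-diagonal cells hold the subjects who changed
table = pd.crosstab(toy["know_begin"], toy["know_end"])
print(table)

# Exact McNemar test (a binomial test on the discordant pairs)
result = mcnemar(table.to_numpy(), exact=True)
print(f"statistic = {result.statistic}, p = {result.pvalue:.4f}")
```

Only the discordant pairs (here 12 who moved from "no" to "yes" and 3 who moved the other way) drive the test. The exact version is the safer choice when those counts are small.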

# Task 8: Differences of BG readings Over Time 

Objective: Determine if there is a significant difference in the blood glucose (BG) readings over multiple time points.

Data: Let's use a hypothetical example of blood glucose (BG) readings of persons with diabetes.

The measurement is taken three times (before, during, and after a given clinical treatment), and we want to know whether the readings differ significantly across these time points.

In [4]:
# Read the dataset from the URL
import io
import requests
import pandas as pd

url = "https://raw.githubusercontent.com/trangel/stats-with-python/master/data/BG-db.csv"
s = requests.get(url).content
df = pd.read_csv(io.StringIO(s.decode('utf-8')), index_col=0)


df.columns=['before','during','after']
df.index.name='Subject'
df.head(10)

             before      during       after
Subject
0         89.162573   94.023517   94.594145
1         90.857629   95.273755   95.040646
2         94.912999    96.61287   95.200472
3         95.254064   96.818673   97.205801
4         97.136291   97.760342    98.42884
5         99.809999   99.169227   98.867769
6        101.094087   99.579283   99.790581
7        101.531428   99.661758  100.669928
8        101.981148  100.812359  101.751155
9        101.993065  102.274035  101.751638


In [None]:
# your solution
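
A minimal sketch of two common options for three repeated measurements: a one-way repeated-measures ANOVA and the non-parametric Friedman test. To stay self-contained, it uses only the ten subjects printed by `df.head(10)` above, so its results describe that subset and not the full dataset. The names `df10` and `long` are introduced here for illustration.

```python
import pandas as pd
from scipy.stats import friedmanchisquare
from statsmodels.stats.anova import AnovaRM

# First ten subjects as printed by df.head(10) above
df10 = pd.DataFrame({
    "before": [89.162573, 90.857629, 94.912999, 95.254064, 97.136291,
               99.809999, 101.094087, 101.531428, 101.981148, 101.993065],
    "during": [94.023517, 95.273755, 96.61287, 96.818673, 97.760342,
               99.169227, 99.579283, 99.661758, 100.812359, 102.274035],
    "after":  [94.594145, 95.040646, 95.200472, 97.205801, 98.42884,
               98.867769, 99.790581, 100.669928, 101.751155, 101.751638],
})
df10.index.name = "Subject"

# Wide -> long: one row per subject and time point
long = df10.reset_index().melt(id_vars="Subject", var_name="time", value_name="bg")

# Parametric option: one-way repeated-measures ANOVA
rm_table = AnovaRM(long, depvar="bg", subject="Subject", within=["time"]).fit().anova_table
print(rm_table)

# Non-parametric option: Friedman test on the three paired columns
stat, p_value = friedmanchisquare(df10["before"], df10["during"], df10["after"])
print(f"Friedman chi2 = {stat:.3f}, p = {p_value:.4f}")
```

Which of the two to report depends on whether the normality and sphericity assumptions look tenable. If they are in doubt, the Friedman test is the more conservative choice.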

# Task 9: Evaluate the Change in Mice Weights Before and After Treatment

Objective: Determine if there is a significant difference in the weights of mice before and after treatment.

In [6]:
import pandas as pd

# Weight of the mice before treatment
before = [200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7]

# Weight of the mice after treatment
after = [392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2]

# Create a data frame
my_data = pd.DataFrame({
    'group': ['before'] * len(before) + ['after'] * len(after),
    'weight': before + after
})

# Display the first few rows of the dataset
print(my_data.head(10))

    group  weight
0  before   200.1
1  before   190.9
2  before   192.7
3  before   213.0
4  before   241.4
5  before   196.9
6  before   172.2
7  before   185.5
8  before   205.2
9  before   193.7


In [None]:
# your solution
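
Since each mouse is weighed twice, the two samples are paired. A minimal sketch, assuming a paired t-test is appropriate once the differences look roughly normal, using the `before` and `after` lists defined above:

```python
import numpy as np
from scipy.stats import shapiro, ttest_rel

# Weights from the task statement
before = [200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7]
after = [392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2]

# The paired t-test assumes the differences (not the raw weights) are normal
diff = np.array(after) - np.array(before)
w_stat, w_p = shapiro(diff)
print(f"Shapiro on differences: W = {w_stat:.3f}, p = {w_p:.3f}")

# Paired t-test of H0: mean difference = 0
t_stat, p_value = ttest_rel(after, before)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}")

# Effect size for paired data: mean difference over the SD of the differences
cohen_d = diff.mean() / diff.std(ddof=1)
print(f"mean difference = {diff.mean():.2f}, Cohen's d = {cohen_d:.2f}")
```

If the Shapiro-Wilk check on the differences fails, `scipy.stats.wilcoxon(after, before)` is the usual non-parametric substitute.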

# Task 10: Calculate Effect Size and Power 

Objective: Use Python to calculate the effect size and power for a test comparing the total bill amounts between smokers and non-smokers. Interpret your results. If the power is not satisfactory, how many observations should we sample to achieve 90% power?

The tips dataset contains information about tips received by waitstaff in a restaurant, including various attributes such as total bill, tip amount, sex of the bill payer, whether the payer is a smoker, day of the week, time of day, and size of the party.

The tips dataset contains the following columns:

total_bill: The total bill amount (including tip) in dollars.

tip: The tip amount in dollars.

sex: The sex of the bill payer (Male or Female).

smoker: Whether the bill payer is a smoker (Yes or No).

day: The day of the week (Thur, Fri, Sat, Sun).

time: The time of day (Lunch or Dinner).

size: The size of the party.

In [8]:
import seaborn as sns
import pandas as pd

# Load the tips dataset
tips = sns.load_dataset('tips')

# Display the first few rows of the dataset
print(tips.head())

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4


In [None]:
# your solution
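
A hedged sketch of the calculation, assuming Cohen's d with a pooled standard deviation as the effect size and a two-sided independent-samples t-test at alpha = 0.05. The helper `cohens_d`, the two synthetic samples, and their sizes are hypothetical stand-ins for the smoker and non-smoker bills.

```python
import numpy as np
from statsmodels.stats.power import TTestIndPower


def cohens_d(x, y):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    pooled_var = ((len(x) - 1) * x.var(ddof=1) + (len(y) - 1) * y.var(ddof=1)) / (len(x) + len(y) - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)


# Synthetic bills (hypothetical values and group sizes), only to show the calls
rng = np.random.default_rng(4)
smokers = rng.normal(23.0, 8.0, 90)
non_smokers = rng.normal(19.0, 8.0, 150)

d = cohens_d(smokers, non_smokers)
analysis = TTestIndPower()

# Achieved power for the observed effect size and group sizes
power = analysis.power(effect_size=abs(d), nobs1=len(smokers),
                       ratio=len(non_smokers) / len(smokers), alpha=0.05)

# Observations per group needed for 90% power, assuming equal group sizes
n_needed = float(np.squeeze(analysis.solve_power(effect_size=abs(d), power=0.90,
                                                 alpha=0.05, ratio=1.0)))
print(f"d = {d:.3f}, power = {power:.3f}, n per group for 90% power = {int(np.ceil(n_needed))}")
```

With the real data, the two samples would be `tips.loc[tips['smoker'] == 'Yes', 'total_bill']` and the corresponding `'No'` subset. A very small observed effect size can push the required sample size into the thousands, which is itself a statement about practical significance.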

# Task 11: Two-Way ANOVA

Objective: Three teachers graded final exams for students, and each exam varied in difficulty (Easy, Medium, Hard). We are interested in investigating whether there are differences in scores based on:
1.	The teacher grading the exam.
2.	The difficulty level of the exam.
3.	The interaction between teacher and difficulty level.


In [1]:
import pandas as pd

# Data
data = {
    'Teacher': ['Teacher 1'] * 15 + ['Teacher 2'] * 15 + ['Teacher 3'] * 15,
    'Difficulty': ['Easy'] * 5 + ['Medium'] * 5 + ['Hard'] * 5 +
                  ['Easy'] * 5 + ['Medium'] * 5 + ['Hard'] * 5 +
                  ['Easy'] * 5 + ['Medium'] * 5 + ['Hard'] * 5,
    'Scores': [78, 82, 85, 80, 79, 70, 75, 73, 72, 74, 60, 65, 62, 63, 61,
               80, 85, 83, 81, 82, 72, 74, 75, 73, 71, 63, 64, 66, 62, 65,
               81, 83, 82, 84, 85, 73, 71, 74, 72, 70, 64, 63, 62, 65, 66]
}

# Create the DataFrame
df = pd.DataFrame(data)

# Display the DataFrame
print(df)

      Teacher Difficulty  Scores
0   Teacher 1       Easy      78
1   Teacher 1       Easy      82
2   Teacher 1       Easy      85
3   Teacher 1       Easy      80
4   Teacher 1       Easy      79
5   Teacher 1     Medium      70
6   Teacher 1     Medium      75
7   Teacher 1     Medium      73
8   Teacher 1     Medium      72
9   Teacher 1     Medium      74
10  Teacher 1       Hard      60
11  Teacher 1       Hard      65
12  Teacher 1       Hard      62
13  Teacher 1       Hard      63
14  Teacher 1       Hard      61
15  Teacher 2       Easy      80
16  Teacher 2       Easy      85
17  Teacher 2       Easy      83
18  Teacher 2       Easy      81
19  Teacher 2       Easy      82
20  Teacher 2     Medium      72
21  Teacher 2     Medium      74
22  Teacher 2     Medium      75
23  Teacher 2     Medium      73
24  Teacher 2     Medium      71
25  Teacher 2       Hard      63
26  Teacher 2       Hard      64
27  Teacher 2       Hard      66
28  Teacher 2       Hard      62
29  Teache

In [None]:
# your solution

# Task 12: Mantel-Haenszel test

Objective: Use the Mantel-Haenszel test to determine if there is a significant difference in the changes in doctoral program completion status between male and female students, controlling for the time variable (initial and final status).

Gdańsk Tech classified students entering the PhD programs in a given year by their status 6 years later, with the data broken down by gender. Initial status: men, 15 completed and 5 not completed; women, 10 completed and 10 not completed. Final status: men, 12 completed and 8 not completed; women, 8 completed and 12 not completed. Determine if there is a significant difference in the changes of doctoral program completion status between male and female students.


In [None]:
# your solution
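
A minimal sketch using `StratifiedTable` from `statsmodels`, under the assumption that "controlling for time" means treating the initial and final status as two strata, each holding a 2x2 table of gender by completion. The variable names are introduced here for illustration.

```python
import numpy as np
from statsmodels.stats.contingency_tables import StratifiedTable

# One 2x2 table per stratum; rows = men, women; columns = completed, not completed
initial = np.array([[15, 5],
                    [10, 10]])
final = np.array([[12, 8],
                  [8, 12]])

strat = StratifiedTable([initial, final])

# Mantel-Haenszel pooled odds ratio across the two strata
print(f"Pooled odds ratio: {strat.oddsratio_pooled:.3f}")

# Cochran-Mantel-Haenszel test of H0: common odds ratio equals 1
cmh = strat.test_null_odds(correction=True)
print(f"CMH statistic = {cmh.statistic:.3f}, p = {cmh.pvalue:.4f}")

# Breslow-Day style check that the odds ratios are homogeneous across strata
homogeneity = strat.test_equal_odds()
print(f"Homogeneity statistic = {homogeneity.statistic:.3f}, p = {homogeneity.pvalue:.4f}")
```

The pooled odds ratio summarises the gender difference in completion across both time points, while the homogeneity test indicates whether that difference changes between the initial and final status.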