
#WELCOME!

We (Bartosz Wójtowicz, MA, and Prof. Stanisława Steuden, The John Paul II Catholic University of Lublin) wrote this code to calculate the results reported in our article 'Do you feel like a bored stiff? Experimentally Induced Boredom Increases Hostility, Avoidance Motivation and Decreases Mindfulness'.

---

What is the schedule?

1 - import libraries

2 - functions used in script

3 - import data and screening

4 - reliability coeffs (Cronbach's alpha)

5 - compare pre and post-test

6 - main analysis: linear regressions

7 - extra analysis: curvilinear regressions

---

We share this code with you; feel free to use it or check it.

#1 - Libraries

In [43]:
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from scipy.stats import pearsonr, ttest_rel
import statistics
import statsmodels.api as sm # import statsmodels
from scipy import stats

import sklearn
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.metrics import r2_score

#2 - Functions

In [3]:
def CronbachAlphaa(df_list, names_list):
  subscales = []
  alpha = []

  for df, name in zip(df_list, names_list):
    # 1. Transform the df into a correlation matrix
    df_corr = df.corr()

    # 2.1 Calculate N
    # The number of variables equals the number of columns in the df
    N = df.shape[1]

    # 2.2 Calculate R
    # For this, we'll loop through the columns and append every
    # relevant correlation (the entries below the diagonal) to an
    # array called "rs". Then, we'll calculate the mean of "rs"
    rs = np.array([])
    for i, col in enumerate(df_corr.columns):
        sum_ = df_corr[col][i+1:].values
        rs = np.append(sum_, rs)
    mean_r = np.mean(rs)

    # 3. Use the formula to calculate Cronbach's Alpha
    # (standardized form, based on the mean inter-item correlation)
    cronbach_alpha = (N * mean_r) / (1 + (N - 1) * mean_r)
    cronbach_alpha = round(cronbach_alpha, 3)

    subscales.append(name)
    alpha.append(cronbach_alpha)
  
  alpha_dict = {
      'subscale':subscales,
      'alpha':alpha
  }
  
  return pd.DataFrame(alpha_dict)
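
The function above uses the standardized form of Cronbach's alpha, alpha = N * mean_r / (1 + (N - 1) * mean_r), where mean_r is the mean inter-item correlation. The sketch below is only an illustration on made-up data: the helper names `standardized_alpha` and `raw_alpha` and the toy items are assumptions, not part of the original analysis. It compares the standardized form with the variance-based form of alpha.

```python
import numpy as np
import pandas as pd

def standardized_alpha(items: pd.DataFrame) -> float:
    """Standardized alpha from the mean inter-item correlation."""
    corr = items.corr().to_numpy()
    n_items = corr.shape[0]
    # mean of the correlations above the diagonal (each item pair counted once)
    mean_r = corr[np.triu_indices(n_items, k=1)].mean()
    return (n_items * mean_r) / (1 + (n_items - 1) * mean_r)

def raw_alpha(items: pd.DataFrame) -> float:
    """Variance-based alpha: k/(k-1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical toy data: 200 respondents, 4 items that share one latent factor
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
toy_items = pd.DataFrame({f"item_{i}": latent + rng.normal(size=200) for i in range(4)})

alpha_std = standardized_alpha(toy_items)
alpha_raw = raw_alpha(toy_items)
print(round(alpha_std, 3), round(alpha_raw, 3))
```

The two forms coincide exactly when the items are z-scored first, and they stay close whenever the item variances are similar.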

In [70]:
def MakeRegressions(X, y):
  # Printed labels are kept in Polish: 'Regresja liniowa' = linear regression,
  # 'Regresja kwadratowa' = quadratic regression, 'Regresja stopnia N' = degree-N regression

  # linear regression
  regressor = LinearRegression()
  regressor.fit(X, y)
  y_pred_lin = regressor.predict(X)

  linear_r2 = r2_score(y, y_pred_lin)
  print('Regresja liniowa R2 = ',linear_r2)


  # quadratic (2nd-degree) regression
  poly = PolynomialFeatures(degree=2)
  X_2 = poly.fit_transform(X)
  regressor.fit(X_2,y)
  y_pred_2 = regressor.predict(X_2)
  squared_r2 = r2_score(y, y_pred_2)
  print('Regresja kwadratowa R2 = ',squared_r2)

  # 3rd-degree regression
  poly = PolynomialFeatures(degree=3)
  X_3 = poly.fit_transform(X)
  regressor.fit(X_3,y)
  y_pred_3 = regressor.predict(X_3)
  deg3_r2 = r2_score(y, y_pred_3)
  print('Regresja stopnia 3 R2 = ',deg3_r2)

  # 4th-degree regression
  poly = PolynomialFeatures(degree=4)
  X_4 = poly.fit_transform(X)
  regressor.fit(X_4,y)
  y_pred_4 = regressor.predict(X_4)
  deg4_r2 = r2_score(y, y_pred_4)
  print('Regresja stopnia 4 R2 = ',deg4_r2)

In [52]:
def MakeScatterPlot(df_1, df_2, title, x_title, y_title):
  fig = go.Figure(
      data=go.Scatter(
          x = df_1,
          y = df_2,
          mode = 'markers'
          ),
      layout=go.Layout(
          title = title,
          xaxis_title = x_title,
          yaxis_title = y_title,
          width = 1000,
          height = 700
      ))
  fig.show()

#3 - DATA

In [None]:
data_raw = pd.read_excel('/content/data.xlsx')
data_whole = data_raw.copy()

emotion_pre_df = pd.read_excel('/content/emotions_pre.xlsx')
emotion_post_df = pd.read_excel('/content/emotions_post.xlsx')
data_whole

In [5]:
for c in data_whole.columns:
  print(c)

id
sex
age
bored
boredom_level
std_boredom_level
photo_1
photo_2
photo_3
photo_4
photo_5
photo_6
photo_7
photo_8
photo_9
photo_10
photo_11
photo_12
rt_1
rt_2
rt_3
rt_4
rt_5
rt_6
rt_7
rt_8
rt_9
rt_10
rt_11
rt_12
neg_aff_1
fear_1
sad_1
guilt_1
hostility_1
disgust_1
tired_1
pos_aff_1
joy_1
self_conf_1
mindfulness_1
calm_1
surprise_1
neg_aff_2
fear_2
sad_2
guilt_2
hostility_2
disgust_2
tired_2
pos_aff_2
joy_2
self_conf_2
mindfulness_2
calm_2
surprise_2
rt_neg_aff
rt_fear
rt_sad
rt_guilt
rt_hostility
rt_disgust
rt_tired
rt_pos_aff
rt_joy
rt_self_conf
rt_mindfulness
rt_calm
rt_surprise
neg_aff
fear
sad
guilt
hostility
disgust
tired
pos_aff
joy
self_conf
mindfulness
calm
surprise


##Screening data
1. Delete records of participants who did not get bored
2. Delete outliers: remove a record if the result on at least one subscale is lower than -3 SD or higher than +3 SD (the filters below apply this rule to `boredom_level`)
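
The two screening rules can be sketched on made-up data as follows. The toy DataFrame and the names `toy`, `keep` and `screened` are illustrative assumptions; only the column names mirror the real data set.

```python
import pandas as pd

# Hypothetical toy sample: a 'bored' flag plus a boredom score with one extreme value
toy = pd.DataFrame({
    "bored": ["yes"] * 19 + ["no"],
    "boredom_level": [70, 72, 68, 75, 71, 69, 73, 74, 70, 72,
                      71, 69, 70, 73, 72, 68, 74, 71, 500, 70],
})

mean_b = toy["boredom_level"].mean()
sd_b = toy["boredom_level"].std()

# Combine all conditions into one boolean mask
keep = (
    (toy["bored"] == "yes")
    & (toy["boredom_level"] > mean_b - 3 * sd_b)
    & (toy["boredom_level"] < mean_b + 3 * sd_b)
)
screened = toy[keep]
print(len(toy), "->", len(screened))
```

Combining the conditions with `&` keeps every mask aligned with the same index, so no reindexing is needed when the mask is applied.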

###Prepare filters

In [6]:
mean_boredom = data_whole['boredom_level'].mean()
sd_boredom = data_whole['boredom_level'].std()

filter_1 = data_whole['bored'] == 'yes'
filter_2 = data_whole['boredom_level'] > mean_boredom - 3*sd_boredom
filter_3 = data_whole['boredom_level'] < mean_boredom + 3*sd_boredom

###Use filters

In [7]:
# combine the three conditions into a single boolean mask;
# .copy() lets us add new columns to the screened data later
data_whole = data_whole[filter_1 & filter_2 & filter_3].copy()


In [8]:
print(f'The research sample - after screening -  has N={len(data_whole)}')

The research sample - after screening -  has N=59


#4 - RELIABILITY COEFFS

In [None]:
for c in emotion_pre_df.columns:
  print(c)

##Pre-test

In [10]:
df_pos_pre = emotion_pre_df[['uważny','silny','zainspirowany','czujny','aktywny','podekscytwany','dumny','entuzjastyczny','zdecydowany','zainteresowany']]
df_neg_pre = emotion_pre_df[['drażliwy','obawiający się','zaniepokojony','winny','nerwowy','wrogi','roztrzęsiony','zawstydzony','przestraszony','zestresowany']]
df_fear_pre = emotion_pre_df[['obawiający się','chwiejny','nerwowy','roztrzęsiony','przestraszony','przerażony']]
df_sad_pre = emotion_pre_df[['smutny','sam','przygnębiony','samotny','przybity']]
df_guilt_pre = emotion_pre_df[['zdegustowany','winny','zawstydzony','zły na siebie','zasługujący na potępienie','niezadowolony z siebie']]
df_host_pre = emotion_pre_df[['zdegustowany','czujący pogardę','drażliwy','rozzłoszczony','wrogi','czujący odrazę']]
df_joy_pre = emotion_pre_df[['wesoły','zachwycony','szczęśliwy','radosny','podekscytwany','żwawy','entuzjastyczny','energiczny']]
df_mind_pre = emotion_pre_df[['uważny','czujny','zdecydowany','skoncentrowany']]
df_conf_pre = emotion_pre_df[['śmiały','silny','nieustraszony','odważny','dumny','pewny siebie']]
df_tired_pre = emotion_pre_df[['ospały','zmęczony','śpiący','senny']]
df_calm_pre = emotion_pre_df[['odprężony','wyciszony','na luzie']]
df_surp_pre = emotion_pre_df[['zaskoczony','zdumiony','zadziwiony']]

df_list_pre = [df_neg_pre, df_fear_pre, df_sad_pre, df_guilt_pre, df_host_pre, df_tired_pre, df_pos_pre, df_joy_pre, df_conf_pre, df_mind_pre, df_calm_pre, df_surp_pre]
names_pre = ['neg_pre','fear_pre','sad_pre','guilt_pre','host_pre','tired_pre','pos_pre','joy_pre','conf_pre','mind_pre','calm_pre','surp_pre']

In [11]:
df_alpha_pre = CronbachAlphaa(df_list_pre, names_pre)

df_alpha_pre

Unnamed: 0,subscale,alpha
0,neg_pre,0.849
1,fear_pre,0.798
2,sad_pre,0.875
3,guilt_pre,0.808
4,host_pre,0.725
5,tired_pre,0.903
6,pos_pre,0.716
7,joy_pre,0.806
8,conf_pre,0.816
9,mind_pre,0.633


##Post-test

In [12]:
df_pos_post = emotion_post_df[['uważny','silny','zainspirowany','czujny','aktywny','podekscytwany','dumny','entuzjastyczny','zdecydowany','zainteresowany']]
df_neg_post = emotion_post_df[['drażliwy','obawiający się','zaniepokojony','winny','nerwowy','wrogi','roztrzęsiony','zawstydzony','przestraszony','zestresowany']]
df_fear_post = emotion_post_df[['obawiający się','chwiejny','nerwowy','roztrzęsiony','przestraszony','przerażony']]
df_sad_post = emotion_post_df[['smutny','sam','przygnębiony','samotny','przybity']]
df_guilt_post = emotion_post_df[['zdegustowany','winny','zawstydzony','zły na siebie','zasługujący na potępienie','niezadowolony z siebie']]
df_host_post = emotion_post_df[['zdegustowany','czujący pogardę','drażliwy','rozzłoszczony','wrogi','czujący odrazę']]
df_joy_post = emotion_post_df[['wesoły','zachwycony','szczęśliwy','radosny','podekscytwany','żwawy','entuzjastyczny','energiczny']]
df_mind_post = emotion_post_df[['uważny','czujny','zdecydowany','skoncentrowany']]
df_conf_post = emotion_post_df[['śmiały','silny','nieustraszony','odważny','dumny','pewny siebie']]
df_tired_post = emotion_post_df[['ospały','zmęczony','śpiący','senny']]
df_calm_post = emotion_post_df[['odprężony','wyciszony','na luzie']]
df_surp_post = emotion_post_df[['zaskoczony','zdumiony','zadziwiony']]

df_list_post = [df_neg_post, df_fear_post, df_sad_post, df_guilt_post, df_host_post, df_tired_post, df_pos_post, df_joy_post, df_conf_post, df_mind_post, df_calm_post, df_surp_post]
names_post = ['neg_post','fear_post','sad_post','guilt_post','host_post','tired_post','pos_post','joy_post','conf_post','mind_post','calm_post','surp_post']

In [13]:
df_alpha_post = CronbachAlphaa(df_list_post, names_post)

df_alpha_post

Unnamed: 0,subscale,alpha
0,neg_post,0.883
1,fear_post,0.879
2,sad_post,0.884
3,guilt_post,0.758
4,host_post,0.89
5,tired_post,0.861
6,pos_post,0.846
7,joy_post,0.841
8,conf_post,0.758
9,mind_post,0.765


##General

In [14]:
dict_alpha = {
    'subscale':names_pre+names_post,
    'alpha':df_alpha_pre['alpha'].to_list()+df_alpha_post['alpha'].to_list()
}

df_alpha = pd.DataFrame(dict_alpha).T
df_alpha

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
subscale,neg_pre,fear_pre,sad_pre,guilt_pre,host_pre,tired_pre,pos_pre,joy_pre,conf_pre,mind_pre,calm_pre,surp_pre,neg_post,fear_post,sad_post,guilt_post,host_post,tired_post,pos_post,joy_post,conf_post,mind_post,calm_post,surp_post
alpha,0.849,0.798,0.875,0.808,0.725,0.903,0.716,0.806,0.816,0.633,0.439,0.747,0.883,0.879,0.884,0.758,0.89,0.861,0.846,0.841,0.758,0.765,0.506,0.816


#5 - COMPARE PRE AND POST-TEST


##Add columns with mean reaction times of photo rating task

In [32]:
data_whole['mean_rt_pre'] = data_whole[['rt_1','rt_2','rt_3','rt_4','rt_5','rt_6']].mean(axis=1)
data_whole['mean_rt_post'] = data_whole[['rt_7','rt_8','rt_9','rt_10','rt_11','rt_12']].mean(axis=1)
data_whole.head()

Unnamed: 0,id,sex,age,bored,boredom_level,std_boredom_level,photo_1,photo_2,photo_3,photo_4,photo_5,photo_6,photo_7,photo_8,photo_9,photo_10,photo_11,photo_12,rt_1,rt_2,rt_3,rt_4,rt_5,rt_6,rt_7,rt_8,rt_9,rt_10,rt_11,rt_12,neg_aff_1,fear_1,sad_1,guilt_1,hostility_1,disgust_1,tired_1,pos_aff_1,joy_1,self_conf_1,...,guilt_2,hostility_2,disgust_2,tired_2,pos_aff_2,joy_2,self_conf_2,mindfulness_2,calm_2,surprise_2,rt_neg_aff,rt_fear,rt_sad,rt_guilt,rt_hostility,rt_disgust,rt_tired,rt_pos_aff,rt_joy,rt_self_conf,rt_mindfulness,rt_calm,rt_surprise,neg_aff,fear,sad,guilt,hostility,disgust,tired,pos_aff,joy,self_conf,mindfulness,calm,surprise,mean_pre,mean_post,mean_rt_pre,mean_rt_post
0,1,male,21,yes,56,-1.490836,4,5,6,7,3,1,3,4,6,3,2,5,6.843366,6.774051,5.854006,4.20224,11.088847,5.382434,6.878299,3.477545,4.006206,4.188837,6.95716,3.601301,25,8,14,18,8,6,8,27,23,20,...,17,11,4,13,24,20,17,15,11,4,1.820583,1.709402,1.981099,1.58311,2.004345,1.232639,2.200262,1.868854,1.895965,1.757156,1.682501,1.531974,2.016069,4,0,2,-1,3,-2,5,-5,-3,-3,0,0,-2,6.690824,4.851558,6.690824,4.851558
1,2,male,22,yes,76,-0.341829,5,5,6,6,6,4,5,6,4,5,2,6,6.494722,3.713602,2.971776,3.129173,5.353667,2.945511,5.68915,3.338221,3.266094,3.834831,6.833323,3.572935,29,7,8,6,6,5,12,34,22,17,...,7,6,7,11,31,22,21,16,11,6,1.269524,1.496081,1.650274,1.407007,1.323642,1.922727,1.453893,1.809881,2.356215,1.687866,1.743071,2.386268,2.313132,1,1,0,1,0,2,-1,2,0,4,1,0,0,4.101408,4.422426,4.101408,4.422426
2,3,male,22,yes,100,1.036978,7,7,7,6,5,5,6,5,7,6,6,6,6.654982,3.621899,2.866908,7.093405,5.3483,8.906936,6.542446,6.174742,6.334032,4.290494,4.101693,5.465377,11,12,23,18,11,6,11,30,16,17,...,16,12,5,12,28,11,20,17,8,15,2.204739,2.385546,2.014631,2.130921,2.882301,2.313276,2.030253,2.261895,2.799713,2.886488,2.526246,2.852765,2.479717,-1,-3,2,-2,1,-1,1,2,-5,3,1,-1,9,5.748738,5.484797,5.748738,5.484797
3,4,male,23,yes,100,1.036978,3,4,3,2,4,1,1,2,3,4,1,3,3.370701,5.387383,2.928206,2.864354,5.366482,3.373197,4.402694,3.459649,1.865895,6.598516,3.548205,2.888457,11,6,8,7,8,7,13,23,29,21,...,11,25,5,17,17,26,21,14,8,12,1.63384,1.523673,1.368837,1.93758,2.12364,1.342989,1.442085,1.401582,2.220207,1.60334,1.353279,2.012111,1.918888,8,3,2,4,17,-2,4,1,-3,0,-2,-2,3,3.88172,3.793902,3.88172,3.793902
4,5,male,23,yes,84,0.117773,4,5,5,5,3,3,4,6,4,5,5,7,6.943733,7.002155,10.996005,4.915954,20.422287,17.149401,6.358012,3.812209,7.283081,12.98211,7.540267,9.494517,15,8,11,7,6,4,8,34,19,19,...,6,9,5,14,23,21,20,12,9,4,2.04514,2.458417,2.480509,2.40229,2.472205,2.590943,1.938022,2.568736,2.667546,2.887715,2.824324,3.414757,3.032894,1,0,-1,-1,3,1,6,0,2,1,1,0,1,11.238256,7.911699,11.238256,7.911699


##Paired-samples t test

In [33]:
subscales = ['neg_aff','fear','sad','guilt','hostility','tired','pos_aff','joy','self_conf','mindfulness','calm','surprise','photo rating time']
cols_pre = ['neg_aff_1','fear_1','sad_1','guilt_1','hostility_1','tired_1','pos_aff_1','joy_1','self_conf_1','mindfulness_1','calm_1','surprise_1','mean_rt_pre']
cols_post = ['neg_aff_2','fear_2','sad_2','guilt_2','hostility_2','tired_2','pos_aff_2','joy_2','self_conf_2','mindfulness_2','calm_2','surprise_2','mean_rt_post']
t_list = []
p_list = []
m_pre_list = []
m_post_list = []
sd_pre_list = []
sd_post_list = []

for pre, post in zip(cols_pre, cols_post):
  result = ttest_rel(data_whole[pre], data_whole[post])  # run the test once, reuse both values
  t = round(result[0], 2)
  p = round(result[1], 3)
  m_pre = round(data_whole[pre].mean(), 2)
  sd_pre = round(data_whole[pre].std(), 2)
  m_post = round(data_whole[post].mean(), 2)
  sd_post = round(data_whole[post].std(), 2)

  t_list.append(t)
  p_list.append(p)
  m_pre_list.append(m_pre)
  m_post_list.append(m_post)
  sd_pre_list.append(sd_pre)
  sd_post_list.append(sd_post)

ttest_dict = {
    'subscale':subscales,
    'm_pre':m_pre_list,
    'sd_pre':sd_pre_list,
    'm_post':m_post_list,
    'sd_post':sd_post_list,
    't':t_list,
    'p':p_list,
}

ttest_df = pd.DataFrame(ttest_dict)
ttest_df

Unnamed: 0,subscale,m_pre,sd_pre,m_post,sd_post,t,p
0,neg_aff,16.0,4.6,18.69,6.83,-2.49,0.016
1,fear,9.54,3.62,10.69,4.55,-2.59,0.012
2,sad,9.54,4.33,10.69,4.84,-3.08,0.003
3,guilt,10.2,4.75,9.9,4.66,0.72,0.473
4,hostility,7.75,2.45,12.05,5.28,-7.02,0.0
5,tired,10.08,3.94,14.08,3.51,-7.58,0.0
6,pos_aff,31.83,4.64,29.25,7.11,3.38,0.001
7,joy,22.22,4.6,21.41,5.71,1.3,0.2
8,self_conf,17.69,3.94,17.56,4.46,0.38,0.704
9,mindfulness,14.76,2.27,13.1,3.45,4.0,0.0
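
As a reminder of what `ttest_rel` computes: the paired-samples t statistic is the mean of the pre-post differences divided by its standard error. The sketch below checks this on simulated scores; the data and the variable names are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical pre/post scores for 59 participants
rng = np.random.default_rng(1)
pre = rng.normal(10, 3, size=59)
post = pre + rng.normal(1.5, 2, size=59)

t_stat, p_val = ttest_rel(pre, post)

# Same statistic computed by hand from the difference scores
diff = pre - post
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))
print(round(t_stat, 2), round(t_manual, 2))
```

With `pre` passed first, a negative t means that the post-test mean is higher, which is the sign convention used in the table above.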


##Cohen's d for the paired-samples t test

In [34]:
# correlations between measures
corr_list = []

for pre, post in zip(cols_pre, cols_post):
  corr_list.append(pearsonr(data_whole[pre], data_whole[post])[0])

# calculate d (mean difference divided by the SD of the difference scores)
d_list = []

for i, corr in zip(range(len(ttest_df)), corr_list):
  m = ttest_df.loc[i]['m_pre'] - ttest_df.loc[i]['m_post']
  
  s1 = ttest_df.loc[i]['sd_pre']
  s2 = ttest_df.loc[i]['sd_post']
  s = np.sqrt(s1*s1 + s2*s2 - 2*corr*s1*s2)

  d_list.append(round(m / s, 2)) 

ttest_df['d'] = d_list
ttest_df

Unnamed: 0,subscale,m_pre,sd_pre,m_post,sd_post,t,p,d
0,neg_aff,16.0,4.6,18.69,6.83,-2.49,0.016,-0.32
1,fear,9.54,3.62,10.69,4.55,-2.59,0.012,-0.34
2,sad,9.54,4.33,10.69,4.84,-3.08,0.003,-0.4
3,guilt,10.2,4.75,9.9,4.66,0.72,0.473,0.09
4,hostility,7.75,2.45,12.05,5.28,-7.02,0.0,-0.91
5,tired,10.08,3.94,14.08,3.51,-7.58,0.0,-0.99
6,pos_aff,31.83,4.64,29.25,7.11,3.38,0.001,0.44
7,joy,22.22,4.6,21.41,5.71,1.3,0.2,0.17
8,self_conf,17.69,3.94,17.56,4.46,0.38,0.704,0.05
9,mindfulness,14.76,2.27,13.1,3.45,4.0,0.0,0.52
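
The denominator used above, sqrt(s1^2 + s2^2 - 2*r*s1*s2), is the standard deviation of the difference scores, so this d equals t / sqrt(n). For example, for hostility in the table, -7.02 / sqrt(59) is about -0.91. A minimal check on simulated data (the scores and variable names are illustrative assumptions):

```python
import numpy as np
from scipy.stats import pearsonr, ttest_rel

# Hypothetical pre/post scores for 59 participants
rng = np.random.default_rng(2)
pre = rng.normal(8, 2.5, size=59)
post = pre + rng.normal(4, 4.5, size=59)

# d from the two SDs and the pre-post correlation (same formula as above)
r = pearsonr(pre, post)[0]
s1, s2 = pre.std(ddof=1), post.std(ddof=1)
sd_diff = np.sqrt(s1**2 + s2**2 - 2 * r * s1 * s2)
d_from_sds = (pre.mean() - post.mean()) / sd_diff

# d computed directly from the difference scores
diff = pre - post
d_direct = diff.mean() / diff.std(ddof=1)

# d recovered from the paired t statistic
t_stat = ttest_rel(pre, post)[0]
d_from_t = t_stat / np.sqrt(len(pre))

print(round(d_from_sds, 3), round(d_direct, 3), round(d_from_t, 3))
```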


##Export table with t test

In [35]:
ttest_df.to_excel('ttest_table.xlsx')

#6 - RUN REGRESSIONS
predictor: Boredom level

dependent variables: post-test measures with a significant t test

##Which t test comparisons are identified as significant?
max alpha error probability (Bonferroni correction) = .05 / 12 ≈ .004
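
A minimal sketch of this threshold and of the filtering step. The p-values and the names `toy_p` and `significant` are made up for illustration; `n_tests = 12` is the number stated above.

```python
import pandas as pd

alpha = 0.05
n_tests = 12                   # number of comparisons stated in the text above
threshold = alpha / n_tests    # Bonferroni-corrected threshold

# Hypothetical p-values, only to show the filtering step
toy_p = pd.DataFrame({"subscale": ["a", "b", "c"], "p": [0.0005, 0.004, 0.03]})
significant = toy_p[toy_p["p"] < threshold]
print(round(threshold, 4), list(significant["subscale"]))
```

Note that the unrounded threshold is slightly larger than .004, so filtering with the rounded value is marginally stricter.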

In [37]:
sig_filter = ttest_df['p'] < .004  # avoid shadowing the built-in filter()
sig_df = ttest_df[sig_filter]
sig_df

Unnamed: 0,subscale,m_pre,sd_pre,m_post,sd_post,t,p,d
2,sad,9.54,4.33,10.69,4.84,-3.08,0.003,-0.4
4,hostility,7.75,2.45,12.05,5.28,-7.02,0.0,-0.91
5,tired,10.08,3.94,14.08,3.51,-7.58,0.0,-0.99
6,pos_aff,31.83,4.64,29.25,7.11,3.38,0.001,0.44
9,mindfulness,14.76,2.27,13.1,3.45,4.0,0.0,0.52
11,surprise,5.47,2.06,7.36,3.04,-5.09,0.0,-0.67
12,photo rating time,6.13,2.05,5.37,1.98,3.74,0.0,0.48


##Run regressions

###Unstandardized coeffs

In [47]:
subscales = ['sad_2','hostility_2','tired_2','pos_aff_2','mindfulness_2','surprise_2','mean_rt_post']

for scale in subscales:
  x = data_whole['boredom_level']
  x = sm.add_constant(x)
  y = data_whole[scale]

  model = sm.OLS(y, x).fit()

  print(f'>>>>>>>>>> LINEAR REGRESSION: X=boredom level, y={scale}\n')
  print(model.summary())
  print('\n'*5)

>>>>>>>>>> LINEAR REGRESSION: X=boredom level, y=sad_2

                            OLS Regression Results                            
Dep. Variable:                  sad_2   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                 -0.018
Method:                 Least Squares   F-statistic:                 0.0009281
Date:                Wed, 31 Mar 2021   Prob (F-statistic):              0.976
Time:                        15:06:03   Log-Likelihood:                -176.25
No. Observations:                  59   AIC:                             356.5
Df Residuals:                      57   BIC:                             360.6
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                    coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------

###Standardized coeffs

In [48]:
df_z = data_whole[['boredom_level','sad_2','hostility_2','tired_2','pos_aff_2','mindfulness_2','surprise_2','mean_rt_post']]
df_z = df_z.apply(stats.zscore)

for scale in subscales:
  x = df_z['boredom_level']
  x = sm.add_constant(x)
  y = df_z[scale]

  model = sm.OLS(y, x).fit()

  print(f'>>>>>>>>>> LINEAR REGRESSION: X=boredom level, y={scale}\n')
  print(model.summary())
  print('\n'*5)

>>>>>>>>>> LINEAR REGRESSION: X=boredom level, y=sad_2

                            OLS Regression Results                            
Dep. Variable:                  sad_2   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                 -0.018
Method:                 Least Squares   F-statistic:                 0.0009281
Date:                Wed, 31 Mar 2021   Prob (F-statistic):              0.976
Time:                        15:06:14   Log-Likelihood:                -83.717
No. Observations:                  59   AIC:                             171.4
Df Residuals:                      57   BIC:                             175.6
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                    coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------
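
With a single predictor, the slope of the regression on z-scored variables equals the Pearson correlation between the two variables, and the intercept is zero. This is also why R-squared is identical in the standardized and unstandardized outputs. A small check on simulated data (all values and names are illustrative assumptions):

```python
import numpy as np
from scipy import stats

# Hypothetical predictor and outcome for 59 participants
rng = np.random.default_rng(3)
x = rng.normal(80, 15, size=59)
y = 0.1 * x + rng.normal(0, 5, size=59)

# Regress the z-scored outcome on the z-scored predictor
zx, zy = stats.zscore(x), stats.zscore(y)
slope, intercept = np.polyfit(zx, zy, deg=1)

r = np.corrcoef(x, y)[0, 1]
print(round(slope, 3), round(r, 3))
```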

#7 - EXPLORE CURVILINEAR MODELS

In [51]:
subscales = ['sad_2','hostility_2','tired_2','pos_aff_2','mindfulness_2','surprise_2','mean_rt_post']

for scale in subscales:
  x = data_whole['boredom_level']
  x = x.to_numpy().reshape(-1,1)

  y = data_whole[scale]
  y = y.to_numpy().reshape(-1,1)

  print(f'>>>>>>>>>> REGRESSIONs: X=boredom level, y={scale}\n')
  MakeRegressions(x, y)
  print('\n'*2)

>>>>>>>>>> REGRESSIONs: X=boredom level, y=sad_2

Regresja liniowa R2 =  1.6282414500912168e-05
Regresja kwadratowa R2 =  0.003784859576911548
Regresja stopnia 3 R2 =  0.10291355668722002
Regresja stopnia 4 R2 =  0.1055975606196966



>>>>>>>>>> REGRESSIONs: X=boredom level, y=hostility_2

Regresja liniowa R2 =  0.06447102512437719
Regresja kwadratowa R2 =  0.11488154350986723
Regresja stopnia 3 R2 =  0.1425845273099139
Regresja stopnia 4 R2 =  0.14607188587469755



>>>>>>>>>> REGRESSIONs: X=boredom level, y=tired_2

Regresja liniowa R2 =  0.05642954548395662
Regresja kwadratowa R2 =  0.058131477797160214
Regresja stopnia 3 R2 =  0.058151376178791736
Regresja stopnia 4 R2 =  0.058214999037841775



>>>>>>>>>> REGRESSIONs: X=boredom level, y=pos_aff_2

Regresja liniowa R2 =  0.0038090674100490496
Regresja kwadratowa R2 =  0.015598247270988574
Regresja stopnia 3 R2 =  0.03572069348995177
Regresja stopnia 4 R2 =  0.09840056650390117



>>>>>>>>>> REGRESSIONs: X=boredom level, y=mindfulne
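
When reading these values, keep in mind that in-sample R2 can never decrease when a higher polynomial degree is added, because each model contains the previous one as a special case. A larger R2 alone is therefore not evidence of a curvilinear effect. The sketch below, on simulated data with a purely linear effect, contrasts R2 with adjusted R2; the helper `poly_r2` and the toy data are assumptions for illustration.

```python
import numpy as np

def poly_r2(x, y, degree):
    """Return (R2, adjusted R2) of a least-squares polynomial fit of the given degree."""
    coefs = np.polyfit(x, y, deg=degree)
    y_hat = np.polyval(coefs, x)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    n = len(y)
    # penalize R2 for the number of predictors (here: the polynomial degree)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - degree - 1)
    return r2, adj_r2

# Hypothetical data: the true relationship is linear, the predictor is already centered
rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, size=59)
y = 0.5 * x + rng.normal(0, 1, size=59)

results = {deg: poly_r2(x, y, deg) for deg in range(1, 5)}
for deg, (r2, adj_r2) in results.items():
    print(f"degree {deg}: R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```

The statsmodels summaries in the sections below report adjusted R-squared and the model F test, which are better suited for judging the higher-degree models.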

##Sadness

In [53]:
MakeScatterPlot(data_whole['boredom_level'], data_whole['sad_2'], f'Boredom -> Sadness (n={len(data_whole)})', 'Boredom','Sadness')

In [56]:
df_1 = data_whole[['boredom_level','sad_2']].copy()  # work on a copy so that new columns are not set on a slice
df_1['boredom^2'] = pow(df_1['boredom_level'],2)
df_1['boredom^3'] = pow(df_1['boredom_level'],3)

X = df_1[['boredom_level', 'boredom^2','boredom^3']]
X = sm.add_constant(X)
y = df_1['sad_2']

model = sm.OLS(y, X).fit()

print(model.summary())

                            OLS Regression Results                            
Dep. Variable:                  sad_2   R-squared:                       0.103
Model:                            OLS   Adj. R-squared:                  0.054
Method:                 Least Squares   F-statistic:                     2.103
Date:                Wed, 31 Mar 2021   Prob (F-statistic):              0.110
Time:                        15:13:43   Log-Likelihood:                -173.04
No. Observations:                  59   AIC:                             354.1
Df Residuals:                      55   BIC:                             362.4
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                    coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------
const          -224.4864    100.312     -2.238



##Hostility

In [57]:
MakeScatterPlot(data_whole['boredom_level'], data_whole['hostility_2'], f'Boredom -> Hostility (n={len(data_whole)})', 'Boredom','Hostility')

In [58]:
df_2 = data_whole[['boredom_level','hostility_2']].copy()  # work on a copy so that new columns are not set on a slice
df_2['boredom^2'] = pow(df_2['boredom_level'],2)

X = df_2[['boredom_level', 'boredom^2']]
X = sm.add_constant(X)
y = df_2['hostility_2']

model = sm.OLS(y, X).fit()

print(model.summary())

                            OLS Regression Results                            
Dep. Variable:            hostility_2   R-squared:                       0.115
Model:                            OLS   Adj. R-squared:                  0.083
Method:                 Least Squares   F-statistic:                     3.634
Date:                Wed, 31 Mar 2021   Prob (F-statistic):             0.0328
Time:                        15:15:12   Log-Likelihood:                -177.78
No. Observations:                  59   AIC:                             361.6
Df Residuals:                      56   BIC:                             367.8
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
                    coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------
const            38.1916     18.992      2.011



###Visualization

In [59]:
boredom = df_2['boredom_level'].to_numpy().reshape(-1,1)
hostility = df_2['hostility_2'].to_numpy().reshape(-1,1)

# linear
regressor = LinearRegression()
regressor.fit(boredom, hostility)
y_pred_lin = regressor.predict(boredom)
hostility_pred_lin = pd.DataFrame(y_pred_lin)

# 2nd degree
poly = PolynomialFeatures(degree=2)
boredom_2 = poly.fit_transform(boredom)

regressor.fit(boredom_2,hostility)
y_pred_2 = regressor.predict(boredom_2)
degree_2_r2 = r2_score(hostility, y_pred_2)

# visualization data: pair each boredom value with its prediction, then sort by boredom
temp_df_2 = pd.DataFrame({'boredom': boredom.ravel(), 'hostility_pred': y_pred_2.ravel()})
temp_df_2.sort_values(by='boredom', inplace=True)
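
An alternative way to prepare the curve for plotting is to evaluate the fitted polynomial on an evenly spaced grid instead of at the observed boredom values, which gives a smooth line even where observations are sparse. This is a sketch under assumptions: it uses `np.polyfit` rather than the scikit-learn pipeline above, and the simulated data and the names `boredom_toy`, `hostility_toy`, `grid` and `curve` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a quadratic trend
rng = np.random.default_rng(5)
boredom_toy = rng.uniform(50, 100, size=59)
hostility_toy = 0.01 * (boredom_toy - 75) ** 2 + rng.normal(0, 3, size=59)

# Fit a 2nd-degree polynomial and evaluate it on a sorted, evenly spaced grid
coefs = np.polyfit(boredom_toy, hostility_toy, deg=2)
grid = np.linspace(boredom_toy.min(), boredom_toy.max(), 200)
curve = pd.DataFrame({"boredom": grid, "hostility_pred": np.polyval(coefs, grid)})
print(curve.head())
```

The `curve` columns can then be passed as `x` and `y` to a `go.Scatter` trace with `mode='lines'`.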

In [62]:
# overall plot
  
fig = go.Figure(
      data=[
            go.Scatter(
                name = 'Raw results',
                x = df_2['boredom_level'],
                y = df_2['hostility_2'],
                mode = 'markers',
                opacity=0.75
            ),
            go.Scatter(
                name = f'Linear regression, p>.05',
                x = df_2['boredom_level'],
                y = hostility_pred_lin[0],
                mode = 'lines',
                marker_color='#FF9F1C'
            ),
            go.Scatter(
                name = '2nd-degree regression, p=.033',
                x = temp_df_2['boredom'],
                y = temp_df_2['hostility_pred'],
                mode = 'lines',
                marker_color='#011627'
            ),
            ],
      layout=go.Layout(
          title = f'Boredom -> Hostility (N={len(df_2)})',
          xaxis_title = 'Boredom',
          yaxis_title='Hostility',
          width = 1100,
          height = 700,
          font_size=16
      ))
fig.show()

##Reaction times

In [65]:
MakeScatterPlot(data_whole['boredom_level'], data_whole['mean_rt_post'], f'Boredom -> Reaction times (n={len(data_whole)})', 'Boredom','Reaction times')

In [66]:
df_2 = data_whole[['boredom_level','mean_rt_post']].copy()  # work on a copy so that new columns are not set on a slice
df_2['boredom^2'] = pow(df_2['boredom_level'],2)

X = df_2[['boredom_level', 'boredom^2']]
X = sm.add_constant(X)
y = df_2['mean_rt_post']

model = sm.OLS(y, X).fit()

print(model.summary())

                            OLS Regression Results                            
Dep. Variable:           mean_rt_post   R-squared:                       0.102
Model:                            OLS   Adj. R-squared:                  0.070
Method:                 Least Squares   F-statistic:                     3.178
Date:                Wed, 31 Mar 2021   Prob (F-statistic):             0.0493
Time:                        15:25:35   Log-Likelihood:                -120.22
No. Observations:                  59   AIC:                             246.4
Df Residuals:                      56   BIC:                             252.7
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
                    coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------
const           -12.5490      7.159     -1.753



###Visualization

In [67]:
boredom = df_2['boredom_level'].to_numpy().reshape(-1,1)
rt = df_2['mean_rt_post'].to_numpy().reshape(-1,1)

# linear
regressor = LinearRegression()
regressor.fit(boredom, rt)
y_pred_lin = regressor.predict(boredom)
rt_pred_lin = pd.DataFrame(y_pred_lin)

# 2nd degree
poly = PolynomialFeatures(degree=2)
boredom_2 = poly.fit_transform(boredom)

regressor.fit(boredom_2,rt)
y_pred_2 = regressor.predict(boredom_2)
degree_2_r2 = r2_score(rt, y_pred_2)

# visualization data: pair each boredom value with its prediction, then sort by boredom
temp_df_2 = pd.DataFrame({'boredom': boredom.ravel(), 'rt_pred': y_pred_2.ravel()})
temp_df_2.sort_values(by='boredom', inplace=True)

In [69]:
# overall plot
  
fig = go.Figure(
      data=[
            go.Scatter(
                name = 'Raw results',
                x = df_2['boredom_level'],
                y = df_2['mean_rt_post'],
                mode = 'markers',
                opacity=0.75
            ),
            go.Scatter(
                name = f'Linear regression, p>.05',
                x = df_2['boredom_level'],
                y = rt_pred_lin[0],
                mode = 'lines',
                marker_color='#FF9F1C'
            ),
            go.Scatter(
                name = '2nd-degree regression, p=.049',
                x = temp_df_2['boredom'],
                y = temp_df_2['rt_pred'],
                mode = 'lines',
                marker_color='#011627'
            ),
            ],
      layout=go.Layout(
          title = f'Boredom -> rt (N={len(df_2)})',
          xaxis_title = 'Boredom',
          yaxis_title='rt',
          width = 1100,
          height = 700,
          font_size=16
      ))
fig.show()