# Testing hypotheses on synthetic dummy data

## Setup & loading data

In [1]:
import numpy as np
import pandas as pd
from scipy.stats import ttest_1samp, ttest_rel

In [2]:
responses_t1 = pd.read_csv("dataset_T1.csv")
responses_t2 = pd.read_csv("dataset_T2.csv")

### Create the aggregated values

Each value score is the mean of its items: biospheric ((pv19 + pv22)/2), altruistic ((pv3 + pv12 + pv18)/3), hedonic ((pv10 + pv21)/2) and egoistic ((pv2 + pv4 + pv13)/3).

In [3]:
# axis=1 averages the items within each row, giving one score per respondent
responses_t1['egoistic']   = responses_t1[[f'pv{i}' for i in [ 2,  4, 13]]].mean(axis=1)
responses_t1['altruistic'] = responses_t1[[f'pv{i}' for i in [ 3, 12, 18]]].mean(axis=1)
responses_t1['hedonic']    = responses_t1[[f'pv{i}' for i in [10, 21]]].mean(axis=1)
responses_t1['biospheric'] = responses_t1[[f'pv{i}' for i in [19, 22]]].mean(axis=1)
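The aggregates are meant to be per-respondent item means, so the averaging has to run across columns within each row. The sketch below illustrates the difference between `mean(axis=1)` and the default `mean()` on a hypothetical three-row frame (the values are made up for illustration and are not taken from `dataset_T1.csv`).

```python
import pandas as pd

# Hypothetical toy responses: three respondents, the two biospheric items
toy = pd.DataFrame({"pv19": [1, 4, 7], "pv22": [3, 6, 7]})

# axis=1 averages across the item columns within each row
toy["biospheric"] = toy[["pv19", "pv22"]].mean(axis=1)

# The default (axis=0) averages down each column instead and returns a
# Series indexed by column name, which does not line up with the rows
column_means = toy[["pv19", "pv22"]].mean()

print(toy["biospheric"].tolist())
print(column_means.index.tolist())
```

Assigning the column-wise result to a new column would align on the index labels (`pv19`, `pv22`) rather than on the respondent rows, so the new column would come out as missing values.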

## Changes in energy-saving behaviour

1.	The set room temperature differs between time one (temp_pref_1, temp_pref_2, temp_pref_3, temp_pref_4, temp_pref_5) and time two (temp_pref_1_t2, temp_pref_2_t2, temp_pref_3_t2, temp_pref_4_t2, temp_pref_5_t2): it is higher at time two.
Proposed analysis: paired t-test.


In [4]:
# Testing difference between two related samples with expectation that t1 < t2
for i in range(1, 6):
    test_result = ttest_rel(responses_t1[f'temp_pref_{i}'], responses_t2[f'temp_pref_{i}_t2'], alternative='less')
    print(f'Test for temp_pref_{i}: {test_result}')

Test for temp_pref_1: TtestResult(statistic=-0.9585487119228076, pvalue=0.17006062139351663, df=99)
Test for temp_pref_2: TtestResult(statistic=1.1161418858721988, pvalue=0.8664684136125256, df=99)
Test for temp_pref_3: TtestResult(statistic=0.9228791308292538, pvalue=0.8208430181259463, df=99)
Test for temp_pref_4: TtestResult(statistic=0.5769906816497485, pvalue=0.7173722500392348, df=99)
Test for temp_pref_5: TtestResult(statistic=0.3534294541816439, pvalue=0.637740995759017, df=99)
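As a cross-check on what `ttest_rel` computes, a paired t-test is equivalent to a one-sample t-test of the within-person differences against zero, which is where the imported `ttest_1samp` could be used. The sketch below shows this on simulated data; the arrays `t1` and `t2` and their distribution parameters are assumptions made for illustration, not values from the datasets.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

# Simulated (hypothetical) temperature settings for 100 respondents
rng = np.random.default_rng(0)
t1 = rng.normal(20.0, 1.5, size=100)
t2 = t1 + rng.normal(0.2, 1.0, size=100)

# Paired test with the one-sided alternative t1 < t2
paired = ttest_rel(t1, t2, alternative="less")

# Same hypothesis expressed on the differences: mean(t1 - t2) < 0
one_sample = ttest_1samp(t1 - t2, popmean=0.0, alternative="less")

print(paired.statistic, one_sample.statistic)
print(paired.pvalue, one_sample.pvalue)
```

Both calls should report the same statistic and p-value, which makes the direction of `alternative='less'` easy to verify: it tests whether the time-one values are lower than the time-two values.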


2.	Curtailment behaviours differ between time one (empty, clothes, night, away, shut, shw_time) and time two (empty_t2, clothes_t2, night_t2, away_t2, shut_t2, shw_time_t2): they are lower at time two.
Proposed analysis: paired t-test.

In [None]:
# Testing difference between two related samples with expectation that t1 > t2
for description in ['empty', 'clothes', 'night', 'away', 'shut', 'shw_time']:
    test_result = ttest_rel(responses_t1[description], responses_t2[f'{description}_t2'], alternative='greater')
    print(f'Test for {description}: {test_result}')
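`ttest_rel` pairs observations by position, so the two frames need to list the same respondents in the same order. If that is not guaranteed by the files, one option is to merge on an identifier first. The sketch below assumes a hypothetical `participant_id` column, which may not exist in the actual datasets, and uses made-up values.

```python
import pandas as pd
from scipy.stats import ttest_rel

# Hypothetical frames with the same respondents in a different row order
t1 = pd.DataFrame({"participant_id": [1, 2, 3, 4], "empty": [5, 4, 3, 5]})
t2 = pd.DataFrame({"participant_id": [4, 3, 1, 2], "empty_t2": [4, 2, 4, 4]})

# Inner merge keeps respondents present at both time points;
# validate raises if an identifier is duplicated in either frame
paired = t1.merge(t2, on="participant_id", how="inner", validate="one_to_one")

# Each row now holds one respondent's time-one and time-two answers
result = ttest_rel(paired["empty"], paired["empty_t2"], alternative="greater")
print(result)
```

Merging also drops respondents who answered at only one time point, which a positional comparison of two frames of different lengths would not handle.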

## Changes in energy-saving behaviour differ based on intrinsic motivation

1.	Stronger biospheric and altruistic values (Bio_values, Altr_values) will be associated with smaller differences in temperature settings between time one (temp_pref_4, temp_pref_5, temp_pref_6) and time two (temp_pref_4_t2, temp_pref_5_t2, temp_pref_6_t2).  
Proposed analysis: Pearson’s correlation or linear regression.

2.	Stronger egoistic and hedonic values (Ego_values, Hed_values) will be associated with larger differences in temperature settings between time one (temp_pref_4, temp_pref_5, temp_pref_6) and time two (temp_pref_4_t2, temp_pref_5_t2, temp_pref_6_t2).
Proposed analysis: Pearson’s correlation or linear regression.

3.	Stronger biospheric and altruistic values (Bio_values, Altr_values) will be associated with smaller differences in curtailment behaviours between time one (empty, clothes, night, away, shut, shw_time, shw_time_num, shw_frw_num, bath_num) and time two (empty_t2, clothes_t2, night_t2, away_t2, shut_t2, shw_time_t2, shw_time_num_t2, shw_frw_num_t2, bath_num_t2).
Proposed analysis: Pearson’s correlation or linear regression.

4.	Stronger egoistic and hedonic values (Ego_values, Hed_values) will be associated with larger differences in curtailment behaviours between time one (empty, clothes, night, away, shut, shw_time, shw_time_num, shw_frw_num, bath_num) and time two (empty_t2, clothes_t2, night_t2, away_t2, shut_t2, shw_time_t2, shw_time_num_t2, shw_frw_num_t2, bath_num_t2).
Proposed analysis: Pearson’s correlation or linear regression.
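The four hypotheses above share one pattern: compute a per-respondent difference score between time two and time one, then relate it to a value score. A minimal sketch of that pattern follows, on simulated data. The frame `toy`, the column `temp_diff_4`, and the choice of a signed difference (time two minus time one) are assumptions for illustration; an absolute difference may suit the "smaller/larger differences" wording better.

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress, pearsonr

# Simulated (hypothetical) data: one value score and one temperature item
rng = np.random.default_rng(42)
n = 100
toy = pd.DataFrame({
    "biospheric": rng.uniform(1, 7, size=n),
    "temp_pref_4": rng.normal(20.0, 1.5, size=n),
    "temp_pref_4_t2": rng.normal(20.0, 1.5, size=n),
})

# Signed difference score per respondent (time two minus time one)
toy["temp_diff_4"] = toy["temp_pref_4_t2"] - toy["temp_pref_4"]

# Pearson's correlation between the value score and the difference score
r, p = pearsonr(toy["biospheric"], toy["temp_diff_4"])

# Simple linear regression of the difference score on the value score
fit = linregress(toy["biospheric"], toy["temp_diff_4"])

print(f"r={r:.3f}, p={p:.3f}, slope={fit.slope:.3f}")
```

With a single predictor, the two proposed analyses agree: `fit.rvalue` equals the Pearson coefficient and both carry the same two-sided p-value, so the choice mainly matters once further predictors or covariates are added.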