Hypothesis 1. There is a significant linear relationship between voltage and global active power

The null and alternative hypotheses are:

H₀: ρ = 0 (no correlation between voltage and global active power)
H₁: ρ ≠ 0 (there is a correlation between voltage and global active power)

To test whether voltage readings are linearly associated with global active power usage, we use a t-test for the Pearson correlation coefficient, which is appropriate because both variables are continuous. The test is two-tailed, so it detects an association in either direction.

We calculate the Pearson correlation coefficient between voltage and global active power,
compute the t-statistic for testing the significance of the correlation, and
determine the critical value for a two-tailed test at α = 0.05.

The test will help determine if there's statistically significant evidence of a relationship between voltage readings and global active power usage. 
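
For reference, the t-statistic used in this test can be written as follows, with r denoting the sample Pearson correlation and n the number of paired observations (the symbols r, n, and df are notation chosen here for illustration):

```latex
t = r \sqrt{\frac{n - 2}{1 - r^{2}}}, \qquad \mathrm{df} = n - 2
```

Under H₀, and under the usual assumptions of the test, t follows a Student's t distribution with n − 2 degrees of freedom, so H₀ is rejected at α = 0.05 when |t| exceeds the 0.975 quantile of that distribution.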

In [5]:
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns

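def test_voltage_power_correlation(df):
    """Two-tailed test of H0: rho = 0 for Voltage vs. Global_active_power."""
    # Pearson r and its two-sided p-value (both columns must be free of missing values)
    correlation, p_value = stats.pearsonr(df['Voltage'], df['Global_active_power'])
    
    # t-statistic for the correlation: t = r * sqrt((n - 2) / (1 - r^2))
    n = len(df)
    t_stat = correlation * np.sqrt((n-2)/(1-correlation**2))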
    
    # Calculate degrees of freedom
    df_stat = n - 2
    
    # Calculate critical value for two-tailed test at 0.05 significance level
    critical_value = stats.t.ppf(0.975, df_stat)
    
    results = {
        'correlation': correlation,
        'p_value': p_value,
        't_statistic': t_stat,
        'critical_value': critical_value,
        'degrees_of_freedom': df_stat
    }
    
    return results

results = test_voltage_power_correlation(df)

print("\nHypothesis Test Results:")
print("-----------------------")
print(f"Correlation Coefficient (r): {results['correlation']:.4f}")
print(f"t-statistic: {results['t_statistic']:.4f}")
print(f"Critical Value (two-tailed, α=0.05): ±{results['critical_value']:.4f}")
print(f"p-value: {results['p_value']:.4f}")
print(f"Degrees of freedom: {results['degrees_of_freedom']}")

print("\nInterpretation:")
print("--------------")
if abs(results['t_statistic']) > results['critical_value']:
    print("Reject the null hypothesis.")
    if results['correlation'] > 0:
        print("There is significant evidence of a positive correlation between voltage and global active power.")
    else:
        print("There is significant evidence of a negative correlation between voltage and global active power.")
else:
    print("Fail to reject the null hypothesis.")
    print("There is insufficient evidence of a correlation between voltage and global active power.")



Hypothesis Test Results:
-----------------------
Correlation Coefficient (r): -0.3154
t-statistic: -459.7670
Critical Value (two-tailed, α=0.05): ±1.9600
p-value: 0.0000
Degrees of freedom: 1913073

Interpretation:
--------------
Reject the null hypothesis.
There is significant evidence of a negative correlation between voltage and global active power.
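
Two points help when reading this output. The p-value is printed with four decimal places, so 0.0000 means a value below 0.00005, not exactly zero. And with roughly 1.9 million observations even a modest correlation is highly significant, so the effect size is worth reporting too: r = -0.3154 gives r² ≈ 0.099, meaning about 10% of the variance in one variable is linearly associated with the other. The sketch below is a minimal illustration on synthetic data, not the notebook's dataset. The helper name `correlation_t_test` and the generated arrays are assumptions made for this example. It shows that the p-value derived from the t-statistic agrees with the one returned by `stats.pearsonr`, and reports r² alongside it.

```python
import numpy as np
from scipy import stats


def correlation_t_test(x, y, alpha=0.05):
    """Hypothetical helper: two-tailed t-test for a Pearson correlation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Sample correlation and the p-value computed by SciPy
    r, p_scipy = stats.pearsonr(x, y)

    # t-statistic with n - 2 degrees of freedom
    dof = len(x) - 2
    t_stat = r * np.sqrt(dof / (1.0 - r**2))

    # Two-sided p-value from the t distribution (survival function of |t|)
    p_manual = 2.0 * stats.t.sf(abs(t_stat), dof)

    return {
        "r": r,
        "r_squared": r**2,  # share of variance explained by the linear fit
        "t_statistic": t_stat,
        "p_manual": p_manual,
        "p_scipy": p_scipy,
        "reject_null": p_manual < alpha,
    }


# Synthetic data with a built-in negative linear relationship (illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = -0.3 * x + rng.normal(size=500)

check = correlation_t_test(x, y)
print(f"r = {check['r']:.4f}, r^2 = {check['r_squared']:.4f}")
print(f"p (from t) = {check['p_manual']:.3e}, p (pearsonr) = {check['p_scipy']:.3e}")
```

Printing the p-value in scientific notation (`.3e`) avoids the rounded 0.0000 display, although for extremely large |t| the value can still underflow to zero in floating point.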
