Scenario: We want to test whether the average weight of wolves differs significantly between two regions (Region A and Region B), and whether it differs between male and female wolves in these regions.

Data: Region (A or B), Gender (Male or Female), Weight (in kg)

In [1]:
import numpy as np
from scipy import stats

In [2]:
# Given data
data = [
    {'Region': 'A', 'Gender': 'Male', 'Weight': 40.5},
    {'Region': 'A', 'Gender': 'Female', 'Weight': 38.2},
    {'Region': 'A', 'Gender': 'Male', 'Weight': 42.0},
    {'Region': 'A', 'Gender': 'Female', 'Weight': 37.5},
    {'Region': 'A', 'Gender': 'Male', 'Weight': 41.3},
    {'Region': 'A', 'Gender': 'Female', 'Weight': 36.7},
    {'Region': 'B', 'Gender': 'Male', 'Weight': 45.2},
    {'Region': 'B', 'Gender': 'Female', 'Weight': 43.0},
    {'Region': 'B', 'Gender': 'Male', 'Weight': 46.5},
    {'Region': 'B', 'Gender': 'Female', 'Weight': 42.8},
    {'Region': 'B', 'Gender': 'Male', 'Weight': 44.1},
    {'Region': 'B', 'Gender': 'Female', 'Weight': 41.7}
]

In [3]:
# Convert data to numpy arrays for easier manipulation
weights_A = np.array([d['Weight'] for d in data if d['Region'] == 'A'])
weights_B = np.array([d['Weight'] for d in data if d['Region'] == 'B'])

weights_male = np.array([d['Weight'] for d in data if d['Gender'] == 'Male'])
weights_female = np.array([d['Weight'] for d in data if d['Gender'] == 'Female'])


In [4]:
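# Two-sample Z-test on the difference of two group means (normal approximation)
def z_test(group1, group2):
    mean1, mean2 = np.mean(group1), np.mean(group2)
    # Sample standard deviations (ddof=1) stand in for the unknown population values
    std1, std2 = np.std(group1, ddof=1), np.std(group2, ddof=1)
    n1, n2 = len(group1), len(group2)

    # Standard error of the difference in means (unpooled, despite the usual "pooled" label)
    std_error = np.sqrt(std1**2/n1 + std2**2/n2)
    z_score = (mean1 - mean2) / std_error
    p_value = stats.norm.sf(abs(z_score)) * 2  # two-tailed p-value

    return z_score, p_value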
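
For reference, the statistic computed by `z_test` can be written out as follows. The symbols are introduced here for illustration only: \bar{x}_i is the sample mean, s_i the sample standard deviation (ddof=1), n_i the size of group i, and \Phi the standard normal CDF.

```latex
z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
p = 2\,\bigl(1 - \Phi(|z|)\bigr)
```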

In [5]:
# Perform Z-tests
z_region, p_region = z_test(weights_A, weights_B)
z_gender, p_gender = z_test(weights_male, weights_female)

In [6]:
# Output the results
print(f"Region Z-test: Z = {z_region:.2f}, p = {p_region:.4f}")
print(f"Gender Z-test: Z = {z_gender:.2f}, p = {p_gender:.4f}")

Region Z-test: Z = -3.95, p = 0.0001
Gender Z-test: Z = 2.18, p = 0.0293
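
Each group here has only six observations, so the normal approximation behind the Z-test is rough. A possible cross-check, which is not part of the original analysis, is Welch's t-test via `scipy.stats.ttest_ind` with `equal_var=False`. It uses the same statistic as `z_test` but takes the p-value from a t distribution, so the p-values come out larger and a borderline result may change. The sketch below repeats the group arrays so that it runs on its own.

```python
import numpy as np
from scipy import stats

# Same weights as in the data above, grouped by region and by gender
weights_A = np.array([40.5, 38.2, 42.0, 37.5, 41.3, 36.7])
weights_B = np.array([45.2, 43.0, 46.5, 42.8, 44.1, 41.7])
weights_male = np.array([40.5, 42.0, 41.3, 45.2, 46.5, 44.1])
weights_female = np.array([38.2, 37.5, 36.7, 43.0, 42.8, 41.7])

# Welch's t-test: unequal variances allowed, p-value from a t distribution
t_region = stats.ttest_ind(weights_A, weights_B, equal_var=False)
t_gender = stats.ttest_ind(weights_male, weights_female, equal_var=False)

print(f"Region Welch t-test: t = {t_region.statistic:.2f}, p = {t_region.pvalue:.4f}")
print(f"Gender Welch t-test: t = {t_gender.statistic:.2f}, p = {t_gender.pvalue:.4f}")
```

If the two tests disagree at the chosen significance level, the t-based result is usually the more cautious one to report for samples this small.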



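Region Hypothesis:
The Z-score is -3.95, and the p-value is 0.0001.
Since the p-value is less than 0.05, we reject the null hypothesis of equal mean weights at the 5% significance level.
This indicates a significant difference in the average weight of wolves between Region A and Region B. The negative Z-score means the Region A sample mean (39.37 kg) is lower than the Region B sample mean (43.88 kg).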


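Gender Hypothesis:
The Z-score is 2.18, and the p-value is 0.0293.
Since the p-value is less than 0.05, we reject the null hypothesis of equal mean weights at the 5% significance level.
This indicates a significant difference in the average weight of male and female wolves when both regions are combined. The positive Z-score means the male sample mean (43.27 kg) is higher than the female sample mean (39.98 kg).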
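
The scenario also asks about male and female wolves within each region, whereas the gender test above combines both regions. One way to look at this, sketched here as an assumption rather than as part of the original analysis, is to run Welch's t-test separately per region. The names `records`, `weights_for`, and `within_region` are hypothetical helpers introduced for this sketch. With only three wolves per subgroup, the results are indicative at best.

```python
import numpy as np
from scipy import stats

# Same records as the data above, written as (region, gender, weight in kg) tuples
records = [
    ('A', 'Male', 40.5), ('A', 'Female', 38.2), ('A', 'Male', 42.0),
    ('A', 'Female', 37.5), ('A', 'Male', 41.3), ('A', 'Female', 36.7),
    ('B', 'Male', 45.2), ('B', 'Female', 43.0), ('B', 'Male', 46.5),
    ('B', 'Female', 42.8), ('B', 'Male', 44.1), ('B', 'Female', 41.7),
]

def weights_for(region, gender):
    """Return the weights of one region/gender subgroup as a numpy array."""
    return np.array([w for r, g, w in records if r == region and g == gender])

# Compare males and females separately inside each region
within_region = {}
for region in ('A', 'B'):
    males = weights_for(region, 'Male')
    females = weights_for(region, 'Female')
    result = stats.ttest_ind(males, females, equal_var=False)
    within_region[region] = (result.statistic, result.pvalue)
    print(f"Region {region} gender t-test: t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```

A two-way ANOVA with a region-by-gender interaction term would be another option, but it needs more data per cell than this sample provides to be reliable.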