# Group 2: Advanced Hypothesis Testing

**Goal**: Learn the chi-square test of independence and two-sample tests

**Tasks**:
1. Chi-square test of independence
2. Two-sample t-test
3. Effect size calculation
4. Power analysis

In [None]:
# Import the required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.stats import chi2, t, norm

# Plot settings
plt.style.use('seaborn-v0_8')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 12
np.random.seed(42)

## Task 1: Chi-square Test of Independence

**Problem**: 
Does food choice in restaurants differ by customer gender?

Data:
- Men: Pizza (45), Salad (15), Burger (40)
- Women: Pizza (25), Salad (35), Burger (20)
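
Under H₀ the expected count in each cell is (row total × column total) / n, and the test statistic sums (observed − expected)² / expected over all cells. The sketch below works through that arithmetic by hand for the counts listed above. The variable names are illustrative assumptions and are not part of the exercise cells.

```python
import numpy as np

# Observed counts: rows = Men, Women; columns = Pizza, Salad, Burger
observed = np.array([[45, 15, 40],
                     [25, 35, 20]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
n_total = observed.sum()

# Expected counts under independence: E_ij = row_i * col_j / n
expected_manual = row_totals * col_totals / n_total

# Pearson chi-square statistic and its degrees of freedom
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected_manual.round(2))
print(f"chi2 = {chi2_manual:.4f}, dof = {dof_manual}")
```

The hand-computed statistic should agree with the value returned by `scipy.stats.chi2_contingency`, which makes it a convenient cross-check.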

In [None]:
# Build the contingency table (rows: gender, columns: food choice)
data = {
    'Pizza': [45, 25],
    'Salad': [15, 35],
    'Burger': [40, 20]
}

contingency_table = pd.DataFrame(data, index=['Men', 'Women'])

print("TASK 1: CHI-SQUARE TEST OF INDEPENDENCE")
print("=" * 40)
print("Contingency table:")
print(contingency_table)

print("\nHypotheses:")
print("H₀: Gender and food choice are independent")
print("H₁: Gender and food choice are associated")

In [None]:
# Run the chi-square test of independence

# chi2_contingency returns the statistic, the p-value, the degrees of freedom,
# and the expected frequencies under H₀
chi2_stat, p_value, dof, expected = stats.chi2_contingency(contingency_table)

alpha = 0.05
critical_value = chi2.ppf(1 - alpha, dof)

print(f"Chi-square statistic = {chi2_stat:.4f}")
print(f"p-value = {p_value:.4f}")
print(f"Degrees of freedom = {dof}")
print(f"Critical value = {critical_value:.4f}")

print("\nExpected frequencies:")
expected_df = pd.DataFrame(expected, 
                          index=contingency_table.index, 
                          columns=contingency_table.columns)
print(expected_df.round(2))

In [None]:
# Compute the effect size (Cramér's V)

n = contingency_table.to_numpy().sum()  # total number of observations
# V = sqrt(chi2 / (n * (min(rows, columns) - 1)))
cramers_v = np.sqrt(chi2_stat / (n * (min(contingency_table.shape) - 1)))

print(f"Effect size (Cramér's V) = {cramers_v:.4f}")

# Effect size interpretation
if cramers_v < 0.1:
    effect_size = "Very small"
elif cramers_v < 0.3:
    effect_size = "Small"
elif cramers_v < 0.5:
    effect_size = "Medium"
else:
    effect_size = "Large"
    
print(f"Effect size interpretation: {effect_size} association")

In [None]:
# Visualize the results
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(18, 5))

# Observed frequencies
sns.heatmap(contingency_table, annot=True, fmt='d', cmap='Blues', ax=ax1)
ax1.set_title('Observed frequencies')

# Expected frequencies
sns.heatmap(expected_df, annot=True, fmt='.1f', cmap='Oranges', ax=ax2)
ax2.set_title('Expected frequencies')

# Chi-square distribution (extend the x-range so the curve reaches the test statistic)
x = np.linspace(0, max(15, 1.2 * chi2_stat), 1000)
y = chi2.pdf(x, dof)

ax3.plot(x, y, 'b-', linewidth=2, label=f'χ² distribution (df={dof})')
ax3.axvline(chi2_stat, color='red', linestyle='--', linewidth=2, 
            label=f'χ² = {chi2_stat:.3f}')
ax3.axvline(critical_value, color='orange', linestyle=':', linewidth=2, 
            label=f'Critical = {critical_value:.3f}')

# Critical region
x_critical = x[x >= critical_value]
y_critical = chi2.pdf(x_critical, dof)
ax3.fill_between(x_critical, y_critical, alpha=0.3, color='red')

ax3.set_xlabel('χ² value')
ax3.set_ylabel('Probability density')
ax3.set_title('Chi-square test')
ax3.legend()
ax3.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Conclusion
print("\nCONCLUSION:")
if p_value < alpha:
    print("✅ We reject the null hypothesis")
    print("   There is an association between gender and food choice")
else:
    print("❌ We fail to reject the null hypothesis")
    print("   No evidence of an association between gender and food choice")

## Task 2: Two-sample t-test (independent samples)

**Problem**: 
Do two vitamin supplements differ in effectiveness?

- Vitamin A group: [12, 15, 18, 14, 16, 13, 17, 15, 19, 14]
- Vitamin B group: [16, 19, 22, 18, 20, 17, 21, 18, 23, 19]
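
With equal variances assumed, the t-statistic is the difference in group means divided by a standard error built from the pooled variance. The sketch below computes it by hand for the two groups above. The names `pooled_var`, `se_diff` and `t_manual` are illustrative assumptions, not names used in the exercise cells.

```python
import numpy as np

group_a = np.array([12, 15, 18, 14, 16, 13, 17, 15, 19, 14])
group_b = np.array([16, 19, 22, 18, 20, 17, 21, 18, 23, 19])
n_a, n_b = len(group_a), len(group_b)

# Sample variances (ddof=1 divides by n - 1)
var_a = group_a.var(ddof=1)
var_b = group_b.var(ddof=1)

# Pooled variance: each group's variance weighted by its degrees of freedom
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

# Standard error of the difference in means
se_diff = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# t-statistic for H0: mean_a == mean_b (sign follows A minus B)
t_manual = (group_a.mean() - group_b.mean()) / se_diff

print(f"pooled variance = {pooled_var:.3f}")
print(f"t = {t_manual:.4f} with {n_a + n_b - 2} degrees of freedom")
```

The result should match `scipy.stats.ttest_ind(group_a, group_b, equal_var=True)`, which uses the same pooled-variance formula.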

In [None]:
# Enter the data
group_a = [12, 15, 18, 14, 16, 13, 17, 15, 19, 14]
group_b = [16, 19, 22, 18, 20, 17, 21, 18, 23, 19]

print("TASK 2: TWO-SAMPLE T-TEST")
print("=" * 35)
print("H₀: μ₁ = μ₂ (the group means are equal)")
print("H₁: μ₁ ≠ μ₂ (the group means differ)")

print(f"\nVitamin A group: {group_a}")
print(f"Vitamin B group: {group_b}")

In [None]:
# Compute descriptive statistics

mean_a = np.mean(group_a)          # Vitamin A mean
mean_b = np.mean(group_b)          # Vitamin B mean
std_a = np.std(group_a, ddof=1)    # Vitamin A sample standard deviation (n - 1)
std_b = np.std(group_b, ddof=1)    # Vitamin B sample standard deviation (n - 1)
n_a = len(group_a)                 # Vitamin A sample size
n_b = len(group_b)                 # Vitamin B sample size

print(f"Vitamin A: Mean = {mean_a:.2f}, Std = {std_a:.2f}, n = {n_a}")
print(f"Vitamin B: Mean = {mean_b:.2f}, Std = {std_b:.2f}, n = {n_b}")
print(f"Difference (B - A) = {mean_b - mean_a:.2f}")

In [None]:
# Run the t-test (assuming equal variances)

# ttest_ind with equal_var=True is the pooled-variance two-sample t-test
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)

alpha = 0.05
df = n_a + n_b - 2
critical_value = t.ppf(1 - alpha/2, df)

print(f"\nT-statistic = {t_stat:.4f}")
print(f"p-value = {p_value:.4f}")
print(f"Degrees of freedom = {df}")
print(f"Critical value (±) = {critical_value:.3f}")

In [None]:
# Compute the effect size (Cohen's d)

# Pooled standard deviation
pooled_std = np.sqrt(((n_a - 1) * std_a**2 + (n_b - 1) * std_b**2) / (n_a + n_b - 2))

# Cohen's d: difference in means in units of the pooled standard deviation
# (A minus B, the same sign convention as the t-statistic)
cohens_d = (mean_a - mean_b) / pooled_std

print(f"Pooled standard deviation = {pooled_std:.3f}")
print(f"Cohen's d = {cohens_d:.3f}")

# Effect size interpretation (Cohen's benchmarks: 0.2 small, 0.5 medium, 0.8 large)
if abs(cohens_d) < 0.2:
    effect_interpretation = "Very small"
elif abs(cohens_d) < 0.5:
    effect_interpretation = "Small"
elif abs(cohens_d) < 0.8:
    effect_interpretation = "Medium"
else:
    effect_interpretation = "Large"
    
print(f"Effect size interpretation: {effect_interpretation} effect")

In [None]:
# Visualize the results
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Box plot
data_for_plot = [group_a, group_b]
labels = ['Vitamin A', 'Vitamin B']
ax1.boxplot(data_for_plot, labels=labels)
ax1.set_ylabel('Effectiveness score')
ax1.set_title('Comparison by group')
ax1.grid(True, alpha=0.3)

# T-distribution
x = np.linspace(-5, 5, 1000)
y = t.pdf(x, df)

ax2.plot(x, y, 'b-', linewidth=2, label=f't-distribution (df={df})')
ax2.axvline(t_stat, color='red', linestyle='--', linewidth=2, 
            label=f'T-statistic = {t_stat:.3f}')
ax2.axvline(critical_value, color='orange', linestyle=':', linewidth=2, 
            label=f'Critical = ±{critical_value:.3f}')
ax2.axvline(-critical_value, color='orange', linestyle=':', linewidth=2)

# Critical regions
x_left = x[x <= -critical_value]
y_left = t.pdf(x_left, df)
x_right = x[x >= critical_value]
y_right = t.pdf(x_right, df)

ax2.fill_between(x_left, y_left, alpha=0.3, color='red')
ax2.fill_between(x_right, y_right, alpha=0.3, color='red')

ax2.set_xlabel('T-score')
ax2.set_ylabel('Probability density')
ax2.set_title('Two-sample t-test')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Conclusion
print("\nCONCLUSION:")
if p_value < alpha:
    print("✅ We reject the null hypothesis")
    print("   There is a statistically significant difference between the two vitamins")
    if mean_b > mean_a:
        print("   Vitamin B is more effective")
    else:
        print("   Vitamin A is more effective")
else:
    print("❌ We fail to reject the null hypothesis")
    print("   No statistically significant difference between the two vitamins was detected")

## Task 3: Power Analysis

**Goal**: Compute the power of the test and determine the required sample size
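
Power is the probability of rejecting H₀ when a true effect of a given size exists, and it grows with the sample size. The sketch below shows this for a two-sided, equal-variance two-sample t-test with equal group sizes, using SciPy's noncentral t-distribution (`scipy.stats.nct`). The helper `two_sample_power` and the sample sizes in the loop are hypothetical choices made for illustration.

```python
import numpy as np
from scipy.stats import nct, t


def two_sample_power(d, n_per_group, alpha=0.05):
    """Power of a two-sided pooled t-test with two groups of equal size."""
    df = 2 * n_per_group - 2
    t_crit = t.ppf(1 - alpha / 2, df)
    # Noncentrality parameter for equal group sizes: d * sqrt(n / 2)
    ncp = d * np.sqrt(n_per_group / 2)
    # Probability that |T| exceeds the critical value under the alternative
    return 1 - nct.cdf(t_crit, df, ncp) + nct.cdf(-t_crit, df, ncp)


for n in (10, 20, 30, 50):
    print(f"n per group = {n:2d}: power = {two_sample_power(0.8, n):.3f}")
```

Scanning the printed values for the first one above 0.8 gives a rough answer to the sample-size question without any extra library.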

In [None]:
# Power analysis requires the statsmodels library
# If it is not installed: pip install statsmodels

# Planning values: Cohen's d = 0.8, alpha = 0.05, 10 observations per group
effect_size = 0.8
alpha = 0.05
sample_size = 10  # per group

try:
    # TTestIndPower covers the two-sample (independent) t-test used in Task 2;
    # ttest_power / tt_solve_power describe the one-sample test instead
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    power = analysis.power(effect_size=effect_size, nobs1=sample_size,
                           alpha=alpha, ratio=1.0, alternative='two-sided')

    print("POWER ANALYSIS")
    print("=" * 20)
    print(f"Effect size (Cohen's d) = {effect_size}")
    print(f"Alpha = {alpha}")
    print(f"Sample size per group = {sample_size}")
    print(f"Power = {power:.3f}")

    # Sample size per group needed for power = 0.8
    required_n = analysis.solve_power(effect_size=effect_size, power=0.8,
                                      alpha=alpha, ratio=1.0)
    print(f"\nRequired sample size per group for 80% power = {required_n:.1f}")

except ImportError:
    print("The statsmodels library is not installed")
    print("For power analysis: pip install statsmodels")

    # Manual power calculation with the noncentral t-distribution
    # (scipy.stats.t has no noncentrality parameter, so nct is needed)
    from scipy.stats import nct

    print("\nManual power calculation:")
    df_power = 2 * sample_size - 2
    t_crit = t.ppf(1 - alpha / 2, df_power)

    # Noncentrality parameter for two equal groups of size n: d * sqrt(n / 2)
    ncp = effect_size * np.sqrt(sample_size / 2)

    # Power = P(|T| > t_crit) when T follows the noncentral t-distribution
    power_manual = 1 - nct.cdf(t_crit, df_power, ncp) + nct.cdf(-t_crit, df_power, ncp)
    print(f"Power = {power_manual:.3f}")

## Homework

### 1. Chi-square goodness-of-fit test
In backgammon, each face of the die should come up with equal probability.
Results of 100 rolls:
- 1: 18 times
- 2: 22 times
- 3: 16 times
- 4: 14 times
- 5: 12 times
- 6: 18 times

**Question**: Is the die fair?
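
One possible starting point, assuming `scipy.stats.chisquare` is used for the goodness-of-fit test: a fair die implies the same expected count for every face, so the expected frequencies are the total number of rolls divided by six. The variable names are illustrative, and the interpretation, effect size and plots are left for the full analysis.

```python
from scipy import stats

# Observed counts for faces 1..6 over 100 rolls
observed_rolls = [18, 22, 16, 14, 12, 18]

# Under H0 (fair die) every face has the same expected count
expected_rolls = [sum(observed_rolls) / 6] * 6

gof_stat, gof_p = stats.chisquare(observed_rolls, f_exp=expected_rolls)
gof_dof = len(observed_rolls) - 1  # number of categories minus one

print(f"chi2 = {gof_stat:.4f}, dof = {gof_dof}, p-value = {gof_p:.4f}")
```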

### 2. Paired t-test
Body weight of 10 people before and after a training program:
- Before: [70, 68, 72, 75, 69, 71, 73, 67, 74, 70]
- After: [68, 65, 70, 72, 66, 69, 70, 64, 71, 67]

**Question**: Does the training program reduce weight?
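
Because the question asks about a decrease, a one-sided paired test is a natural fit. The sketch below is one possible starting point, assuming `scipy.stats.ttest_rel` with `alternative='greater'` (available in recent SciPy versions), which tests whether the mean of before minus after is greater than zero. Variable names are illustrative.

```python
import numpy as np
from scipy import stats

before = [70, 68, 72, 75, 69, 71, 73, 67, 74, 70]
after = [68, 65, 70, 72, 66, 69, 70, 64, 71, 67]

# Per-person differences: positive values mean weight went down
differences = np.array(before) - np.array(after)

# H0: mean difference = 0, H1: mean difference > 0 (weight decreased)
paired_stat, paired_p = stats.ttest_rel(before, after, alternative='greater')

print(f"mean difference = {differences.mean():.2f}")
print(f"t = {paired_stat:.4f}, one-sided p-value = {paired_p:.6f}")
```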

### 3. Practical assignment
- Perform a complete analysis for both tests
- Compute the effect size
- Visualize the results
- Run a power analysis (if possible)
- Write a conclusion