### **Chi-Square Goodness of Fit Test – Another Example**  

#### **Scenario: Customer Preferences for Coffee Brands ☕**  
A coffee company wants to know if customer preferences for different coffee brands are **evenly distributed** or if some brands are more popular.  

They survey **200 customers** and record their brand preferences among **four major brands**: **Starbucks, Dunkin', Tim Hortons, and Peet’s**. If preferences were **uniform**, each brand would be expected to be chosen by **50 customers**.  

---

### **Step 1: Define Hypotheses**  
- **Null Hypothesis ($H_0$)**: Customers choose brands **equally** (i.e., uniform distribution).  
- **Alternative Hypothesis ($H_a$)**: Customer preferences **are not evenly distributed**.

---

### **Step 2: Observed & Expected Frequencies**  
| Coffee Brand  | Observed Customers | Expected Customers (if uniform) |
|--------------|------------------|------------------------------|
| Starbucks    | 60               | 50  |
| Dunkin'      | 55               | 50  |
| Tim Hortons  | 40               | 50  |
| Peet’s       | 45               | 50  |
| **Total**    | **200**           | **200** |

🔹 **Expected frequency** for each brand:  
$$
\frac{200 \text{ customers}}{4 \text{ brands}} = 50
$$
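
The expected counts can also be derived from the observed data instead of being typed by hand. The following is a minimal sketch, assuming the observed counts from the table above; the variable names are illustrative only.

```python
import numpy as np

# Observed counts from the survey table
# (order: Starbucks, Dunkin', Tim Hortons, Peet's)
observed = np.array([60, 55, 40, 45])

# Under a uniform null hypothesis, each brand gets an equal share of the total
n_customers = observed.sum()
n_brands = observed.size
expected = np.full(n_brands, n_customers / n_brands)

print(f"Total customers: {n_customers}, brands: {n_brands}")
print(f"Expected count per brand: {expected[0]:.1f}")
```

Deriving the expected counts this way keeps them consistent with the survey total if the observed data change.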

---

### **Step 3: Compute Chi-Square Statistic**  
$$
\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
$$

where:  
- $O_i$ = observed frequency for brand $i$  
- $E_i$ = expected frequency for brand $i$  
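
To make the formula concrete, the statistic can be computed term by term before relying on a library call. This is a small sketch using the observed and expected counts from the Step 2 table; the names `contributions` and `chi_square_manual` are illustrative.

```python
import numpy as np

# Counts from the Step 2 table
# (order: Starbucks, Dunkin', Tim Hortons, Peet's)
observed = np.array([60, 55, 40, 45])
expected = np.array([50, 50, 50, 50])

# Per-brand contribution: (O_i - E_i)^2 / E_i
contributions = (observed - expected) ** 2 / expected

# The chi-square statistic is the sum of the contributions
chi_square_manual = contributions.sum()

for brand, term in zip(["Starbucks", "Dunkin'", "Tim Hortons", "Peet's"], contributions):
    print(f"{brand}: {term:.2f}")
print(f"Chi-square statistic: {chi_square_manual:.4f}")
```

Starbucks and Tim Hortons each contribute 100/50 = 2.0, while Dunkin' and Peet’s each contribute 25/50 = 0.5, giving a total of 5.0.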




### **Step 4: Python Code**

```python
import numpy as np
import scipy.stats as stats

# Observed frequencies from survey
observed = np.array([60, 55, 40, 45])

# Expected frequencies if all brands were equally preferred
expected = np.array([50, 50, 50, 50])

# Perform Chi-Square Goodness of Fit test
chi_square_stat, p_value = stats.chisquare(observed, f_exp=expected)

print(f"Chi-Square Statistic: {chi_square_stat:.4f}, P-value: {p_value:.4f}")
```

Output:

```
Chi-Square Statistic: 5.0000, P-value: 0.1718
```



### **Step 5: Interpret Results**
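- **If p-value < 0.05**: Reject $H_0$ → Preferences **are not** equally distributed.  
- **If p-value ≥ 0.05**: Fail to reject $H_0$ → No strong evidence of unequal preferences.  

Here the p-value is **0.1718**, which is above 0.05, so we **fail to reject** $H_0$: the survey does not provide strong evidence that preferences differ across the four brands.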
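
The decision rule can be written out explicitly. The sketch below assumes the statistic of 5.0 reported above and a significance level of 0.05; it uses `scipy.stats.chi2` with 3 degrees of freedom (number of brands minus one) to recover the p-value and the equivalent critical value. The variable names are illustrative.

```python
from scipy import stats

alpha = 0.05            # significance level used in the decision rule
chi_square_stat = 5.0   # statistic reported for the survey data
df = 4 - 1              # degrees of freedom: number of brands minus one

# Upper-tail probability of the chi-square distribution at the statistic
p_value = stats.chi2.sf(chi_square_stat, df)

# Equivalent view: the statistic must exceed this value to reject H0
critical_value = stats.chi2.ppf(1 - alpha, df)

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value: {p_value:.4f}, critical value: {critical_value:.4f}")
print(f"Decision: {decision}")
```

Comparing the p-value with `alpha` and comparing the statistic with the critical value are two views of the same rule, so they always lead to the same decision.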

---

### **Conclusion**
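This test assesses whether **real-world customer choices** are consistent with the expected distribution (uniform in this case). A p-value below the chosen threshold suggests that some brands are **more popular than others**; a p-value at or above it, as in this survey, means the observed differences could plausibly arise from sampling variation alone.  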

