### Task:

Is the beef consumption in Argentina significantly different from that in Bangladesh?

H₀: Mean beef consumption (Argentina) = Mean beef consumption (Bangladesh)

H₁: Mean beef consumption (Argentina) ≠ Mean beef consumption (Bangladesh)

In [212]:
import gdown
from scipy import stats

file_id = '1yB5qSBOLl96Y563nIewKOU8RN_gsY3dO'  # Make sure it's a string
gdown.download(f'https://drive.google.com/uc?id={file_id}', 'data.csv', quiet=False)

Downloading...
From: https://drive.google.com/uc?id=1yB5qSBOLl96Y563nIewKOU8RN_gsY3dO
To: C:\Users\ncc\PycharmProjects\GEN-AI\WEEK 9\data.csv
100%|██████████| 52.0k/52.0k [00:00<00:00, 282kB/s]


'data.csv'

In [213]:
import pandas as pd
import numpy as np
# Read the downloaded CSV file
df = pd.read_csv('data.csv')
df.dropna(inplace=True)
df

Unnamed: 0.1,Unnamed: 0,country,food_category,consumption,co2_emission
0,1,Argentina,pork,10.51,37.20
1,2,Argentina,poultry,38.66,41.53
2,3,Argentina,beef,55.48,1712.00
3,4,Argentina,lamb_goat,1.56,54.63
4,5,Argentina,fish,4.36,6.96
...,...,...,...,...,...
1425,1426,Bangladesh,dairy,21.91,31.21
1426,1427,Bangladesh,wheat,17.47,3.33
1427,1428,Bangladesh,rice,171.73,219.76
1428,1429,Bangladesh,soybeans,0.61,0.27


In [214]:
arg_beef = df[(df['country'] == 'Argentina') & (df['food_category'] == 'beef')]
arg_beef_consump = arg_beef['consumption']

bang_beef = df[(df['country'] == 'Bangladesh') & (df['food_category'] == 'beef')]
bang_beef_consump = bang_beef['consumption']


In [215]:
arg_beef_consump

2    55.48
Name: consumption, dtype: float64

In [216]:
bang_beef_consump

1421    1.28
Name: consumption, dtype: float64
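
Each filter above returns a single row, so the dataset holds only one beef observation per country. The sketch below, which uses a hypothetical two-row frame `beef` built from the values shown in the outputs above, illustrates why a t-test cannot be run on these rows directly: the sample standard deviation of a single observation is undefined. This is presumably why the following cells draw simulated samples around each value instead.

```python
import pandas as pd

# Hypothetical stand-in for the two filtered rows (values copied from the outputs above).
beef = pd.DataFrame({
    "country": ["Argentina", "Bangladesh"],
    "consumption": [55.48, 1.28],
})

# Observation count and sample standard deviation (ddof=1) per country.
# With one row per group, the standard deviation comes out as NaN.
summary = beef.groupby("country")["consumption"].agg(["count", "std"])
print(summary)
```

With one observation per group there is no within-group variance to estimate, so any test result further down depends on the spread assumed for the simulated samples.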

In [217]:
arg_beef_consump_mean = arg_beef_consump.mean()
bang_beef_consump_mean = bang_beef_consump.mean()

print("Mean Beef Consumption(Argentina): ", arg_beef_consump_mean)
print("Mean Beef Consumption(Bangladesh): ", bang_beef_consump_mean)

Mean Beef Consumption(Argentina):  55.48
Mean Beef Consumption(Bangladesh):  1.28


In [218]:
np.random.seed(42) # for reproducibility

In [219]:
arg_beef_samples = np.random.normal(loc=arg_beef_consump_mean, scale=5, size=30)
arg_beef_samples

array([57.96357077, 54.78867849, 58.71844269, 63.09514928, 54.30923313,
       54.30931522, 63.37606408, 59.31717365, 53.13262807, 58.19280022,
       53.16291154, 53.15135123, 56.68981136, 45.91359878, 46.85541084,
       52.66856235, 50.4158444 , 57.05123666, 50.93987962, 48.41848149,
       62.80824384, 54.3511185 , 55.81764102, 48.35625907, 52.75808638,
       56.03461295, 49.72503211, 57.35849009, 52.47680655, 54.02153125])

In [220]:
bang_beef_samples = np.random.normal(loc=bang_beef_consump_mean, scale=5, size=30)
bang_beef_samples

array([-1.72853306, 10.54139092,  1.21251388, -4.00855464,  5.39272456,
       -4.82421825,  2.32431798, -8.51835062, -5.36093024,  2.26430618,
        4.9723329 ,  2.13684141,  0.70175859, -0.22551848, -6.11260995,
       -2.31922104, -1.02319385,  6.56561113,  2.99809145, -7.53520078,
        2.90041985, -0.6454114 , -2.10461   ,  4.33838144,  6.43499761,
        5.9364006 , -2.91608762, -0.26606188,  2.93631716,  6.15772564])

In [221]:
# One-sample t-test: Argentina beef samples vs. the Bangladesh mean
t_stat, p_val = stats.ttest_1samp(arg_beef_samples, bang_beef_consump_mean)

In [222]:
print("T-statistic: ", t_stat)
print("P-value: ", p_val)

T-statistic:  64.82465065513577
P-value:  6.1573169718969674e-33
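
As a cross-check on what `stats.ttest_1samp` computes, the statistic can be rebuilt by hand as t = (x̄ − μ₀) / (s / √n) with n − 1 degrees of freedom. The sketch below is illustrative only: it regenerates the simulated Argentina samples with the same seed and parameters as the cells above, and the names `samples`, `mu0`, `t_manual`, and `p_manual` are introduced here rather than taken from the notebook.

```python
import numpy as np
from scipy import stats

np.random.seed(42)  # same seed as the notebook
samples = np.random.normal(loc=55.48, scale=5, size=30)  # simulated Argentina samples
mu0 = 1.28  # Bangladesh value used as the reference mean

n = samples.size
standard_error = samples.std(ddof=1) / np.sqrt(n)  # s / sqrt(n)
t_manual = (samples.mean() - mu0) / standard_error
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)  # two-sided p-value

t_scipy, p_scipy = stats.ttest_1samp(samples, mu0)
print("manual:", t_manual, p_manual)
print("scipy: ", t_scipy, p_scipy)
```

The manual and SciPy values should agree up to floating-point error.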


In [223]:
# One-sample t-test: Bangladesh beef samples vs. the Argentina mean
t_stat, p_val = stats.ttest_1samp(bang_beef_samples, arg_beef_consump_mean)

In [224]:
print("T-statistic: ", t_stat)
print("P-value: ", p_val)

T-statistic:  -64.47923713919985
P-value:  7.181767814881781e-33


Interpretation:

If p < 0.05, reject H₀ → Mean beef consumption (Argentina) ≠ Mean beef consumption (Bangladesh)

If p ≥ 0.05, fail to reject H₀.

**CONCLUSION**: Since p < 0.05 in both one-sample tests, we reject H₀: for the simulated samples, the mean beef consumption of Argentina is significantly different from that of Bangladesh.

In [225]:

# Independent two-sample t-test (Welch's version, since equal_var=False)
t_statistic, p_value = stats.ttest_ind(arg_beef_samples, bang_beef_samples, equal_var=False)


In [226]:

print(f"T-statistic: {t_statistic}")
print(f"P-value: {p_value}")

T-statistic: 45.56550535993598
P-value: 4.508880481655939e-47
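
With `equal_var=False`, `stats.ttest_ind` performs Welch's t-test, which does not pool the two variances and uses the Welch–Satterthwaite approximation for the degrees of freedom. The sketch below recomputes both by hand as a sanity check. It regenerates the two simulated samples with the same seed, order, and parameters as the cells above. The variable names (`arg`, `bang`, `df_welch`, and so on) are illustrative and not from the notebook.

```python
import numpy as np
from scipy import stats

np.random.seed(42)  # same seed and draw order as the notebook
arg = np.random.normal(loc=55.48, scale=5, size=30)   # simulated Argentina samples
bang = np.random.normal(loc=1.28, scale=5, size=30)   # simulated Bangladesh samples

# Per-group variance of the sample mean: s^2 / n
v_arg = arg.var(ddof=1) / arg.size
v_bang = bang.var(ddof=1) / bang.size

# Welch's t statistic and Welch–Satterthwaite degrees of freedom
t_manual = (arg.mean() - bang.mean()) / np.sqrt(v_arg + v_bang)
df_welch = (v_arg + v_bang) ** 2 / (
    v_arg ** 2 / (arg.size - 1) + v_bang ** 2 / (bang.size - 1)
)
p_manual = 2 * stats.t.sf(abs(t_manual), df=df_welch)  # two-sided p-value

t_scipy, p_scipy = stats.ttest_ind(arg, bang, equal_var=False)
print("manual:", t_manual, df_welch, p_manual)
print("scipy: ", t_scipy, p_scipy)
```

Because both samples were drawn with the same scale, the Welch degrees of freedom should land close to the pooled value of 58, and the result differs little from a pooled-variance test.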


In [227]:
alpha = 0.05

print(f"\nSignificance level (alpha): {alpha}")
if p_value < alpha:
    print(f"The p-value ({p_value:.3e}) is less than alpha. Reject the null hypothesis.")
    print("Conclusion: The mean beef consumption in Argentina is significantly different from that in Bangladesh.")
else:
    print(f"The p-value ({p_value:.3e}) is greater than or equal to alpha. Fail to reject the null hypothesis.")
    print("Conclusion: There is no significant difference in the mean beef consumption.")


Significance level (alpha): 0.05
The p-value (4.509e-47) is less than alpha. Reject the null hypothesis.
Conclusion: The mean beef consumption in Argentina is significantly different from that in Bangladesh.
