### **Data Ingestion**

In [5]:
# import the necessary packages
import gdown
import pandas as pd
import numpy as np
import scipy.stats as stat

In [6]:
# load the data

file_id = '1yB5qSBOLl96Y563nIewKOU8RN_gsY3dO'  # Make sure it's a string
gdown.download(f'https://drive.google.com/uc?id={file_id}', 'data.csv', quiet=False)

Downloading...
From: https://drive.google.com/uc?id=1yB5qSBOLl96Y563nIewKOU8RN_gsY3dO
To: c:\Users\ncc\Desktop\task\Week_9\data.csv
100%|██████████| 52.0k/52.0k [00:00<00:00, 240kB/s]


'data.csv'

In [7]:
file = pd.read_csv('data.csv')

### **Data Cleaning**

In [8]:
file.isna().sum()

Unnamed: 0       0
country          0
food_category    0
consumption      0
co2_emission     0
dtype: int64

In [10]:
# drop unnecessary columns
file.drop('Unnamed: 0', axis = 1, inplace = True)

#### **Question**

Is the beef consumption in Argentina significantly different from that in Bangladesh?


- H₀: Mean beef consumption (Argentina) = Mean beef consumption (Bangladesh)

- H₁: Mean beef consumption (Argentina) ≠ Mean beef consumption (Bangladesh)

#### **Solution:**

To determine whether beef consumption in Argentina is significantly different from that in Bangladesh, we compare the mean consumption of the two countries.


In [None]:
# Select Argentina beef consumption
arg_beef = file[(file["country"]=="Argentina") & (file["food_category"]=="beef")]
arg_beef

Unnamed: 0,country,food_category,consumption,co2_emission
2,Argentina,beef,55.48,1712.0


In [None]:
# Select Bangladesh beef consumption
bang_beef = file[(file['country'] == 'Bangladesh') & (file['food_category'] == 'beef')]
bang_beef

Unnamed: 0,country,food_category,consumption,co2_emission
1421,Bangladesh,beef,1.28,39.5


In [26]:
# Extract the consumption values for both countries
arg_consump = arg_beef["consumption"]
bang_consump = bang_beef['consumption']

In [28]:
# Get the consumption mean of both countries
arg_consump_mean = arg_consump.mean()
bang_consump_mean = bang_consump.mean()

print(f'The mean consumption for Argentina is: {arg_consump_mean}\nThe mean consumption for Bangladesh is : {bang_consump_mean}')

The mean consumption for Argentina is: 55.48
The mean consumption for Bangladesh is : 1.28
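Each mean above comes from a single row, so there is no within-country variation to test yet. The sketch below shows one way to confirm the number of observations per group. It uses a toy frame that holds only the two rows displayed above, not the full dataset. The names `toy`, `counts`, and `stds` are hypothetical.

```python
import pandas as pd

# Toy frame holding only the two rows displayed above (not the full dataset)
toy = pd.DataFrame({
    "country": ["Argentina", "Bangladesh"],
    "food_category": ["beef", "beef"],
    "consumption": [55.48, 1.28],
    "co2_emission": [1712.0, 39.5],
})

# Count observations per (country, food_category) pair
counts = toy.groupby(["country", "food_category"]).size()
print(counts)

# With one observation per group, the sample standard deviation is undefined (NaN)
stds = toy.groupby("country")["consumption"].std()
print(stds)
```

A count of 1 per group means a t-test cannot be run on the raw values, which is presumably why the notebook goes on to simulate samples.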


In [29]:
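# Simulate 30 observations around Argentina's mean, because the data holds only one
# beef row for Argentina. scale=5 (the standard deviation) is an assumed value, and
# no seed is set, so the samples change on every run.
rg_beef_samples = np.random.normal(loc=arg_consump_mean, scale=5, size=30)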
rg_beef_samples

array([51.72689367, 49.32880518, 57.63086503, 50.44784177, 55.22003662,
       60.71995913, 53.55656492, 59.63675417, 48.86241364, 60.21104525,
       56.46874139, 54.02369781, 60.29904946, 46.61847023, 65.07971936,
       50.9535088 , 58.09514997, 56.2461046 , 55.59148433, 54.33427123,
       61.15477623, 56.70274767, 55.28860963, 54.68961909, 51.83041659,
       64.8309752 , 59.67205768, 59.19455662, 50.15818384, 57.06493335])

In [30]:
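# One-sample t-test: does the mean of the simulated Argentina samples differ from 50?
# Note: 50 is a fixed reference value, not Bangladesh's mean (1.28)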
t_stat, p_val = stat.ttest_1samp(rg_beef_samples, 50)

In [31]:
print("T-statistic:", t_stat)

T-statistic: 6.958242448732897


In [33]:
print('P-value: ', p_val)



P-value:  1.1953807721651018e-07
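As a cross-check, the one-sample t-statistic can be computed by hand as t = (x̄ − μ₀) / (s / √n) and compared with SciPy's result. This is a minimal sketch, assuming a fixed seed and NumPy's `default_rng` generator, neither of which the notebook uses, so the numbers will not match the outputs above. The names `samples`, `popmean`, `t_manual`, and `p_manual` are hypothetical.

```python
import numpy as np
import scipy.stats as stat

rng = np.random.default_rng(0)  # assumed seed, for reproducibility only
samples = rng.normal(loc=55.48, scale=5, size=30)  # same loc, scale, size as above
popmean = 50  # hypothesized mean used in the test above

n = samples.size
# Standard error of the mean, using the sample standard deviation (ddof=1)
se = samples.std(ddof=1) / np.sqrt(n)
t_manual = (samples.mean() - popmean) / se
# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stat.t.sf(abs(t_manual), df=n - 1)

t_scipy, p_scipy = stat.ttest_1samp(samples, popmean)
print("manual:", t_manual, p_manual)
print("scipy: ", t_scipy, p_scipy)
```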


#### **Interpretation**

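Decision rule at the 5% significance level:

- If p < 0.05, reject H₀: the difference is statistically significant.
- If p ≥ 0.05, fail to reject H₀.

Here p ≈ 1.2 × 10⁻⁷, which is well below 0.05, so H₀ is rejected for the test that was run. Note that this test compares simulated Argentina samples against a fixed value of 50, not against Bangladesh, so it does not directly answer the question about the two countries.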
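The hypotheses stated earlier compare two country means, which corresponds to a two-sample test. The sketch below shows how that could look with Welch's t-test (`equal_var=False`) on simulated data. The seed, the sample size of 30, and both standard deviations (5 for Argentina, 0.5 for Bangladesh) are assumptions, because the dataset has only one beef value per country. The names `arg_sim`, `bang_sim`, `t_two`, and `p_two` are hypothetical.

```python
import numpy as np
import scipy.stats as stat

rng = np.random.default_rng(42)  # assumed seed, for reproducibility only

# Simulated samples centred on the observed values (55.48 and 1.28);
# the spreads below are assumed, not estimated from the data
arg_sim = rng.normal(loc=55.48, scale=5, size=30)
bang_sim = rng.normal(loc=1.28, scale=0.5, size=30)

# Welch's two-sample t-test: does not assume equal variances
t_two, p_two = stat.ttest_ind(arg_sim, bang_sim, equal_var=False)

alpha = 0.05
decision = "reject H0" if p_two < alpha else "fail to reject H0"
print(f"t = {t_two:.3f}, p = {p_two:.3g} -> {decision}")
```

Because the samples are simulated, the outcome reflects the assumed spreads as much as the data. With real repeated measurements per country, the same call would apply unchanged.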