# 🌍 SDG 13 Project: Predicting CO₂ Emissions

**Goal**: Use supervised learning to predict CO₂ emissions based on GDP and population data.

**SDG Focus**: SDG 13 – Climate Action

This notebook trains a linear regression model on real-world-inspired data to forecast emissions and help inform sustainable policy decisions.

In [None]:
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score

## 🔍 Load and Explore the Dataset

In [None]:
# Load the dataset
df = pd.read_csv('co2_data.csv')
df.head()

## 🔧 Preprocess the Data

In [None]:
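# Drop rows missing any value the model needs; gaps in unrelated columns no longer discard a row
df = df.dropna(subset=['GDP', 'Population', 'CO2_Emissions'])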

# Select features and target
X = df[['GDP', 'Population']]
y = df['CO2_Emissions']
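
Dropping rows removes them silently, so it can help to check how many rows (and which ones) are lost before modeling. The sketch below is a minimal illustration on a made-up frame standing in for `co2_data.csv`. The `Country` column and all values are assumptions, not part of the original dataset description.

```python
import pandas as pd

# Hypothetical stand-in for co2_data.csv; the Country column and all values are illustrative.
raw = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "GDP": [1.2, None, 3.4, 0.8],
    "Population": [10.0, 25.0, None, 5.0],
    "CO2_Emissions": [4.1, 9.3, 12.0, None],
})

model_cols = ["GDP", "Population", "CO2_Emissions"]

# Number of missing values in each column the model uses
missing_per_column = raw[model_cols].isna().sum()

# Rows that dropna would discard, kept here for inspection
dropped = raw[raw[model_cols].isna().any(axis=1)]
kept = raw.dropna(subset=model_cols)

print(missing_per_column)
print(f"Dropped {len(dropped)} of {len(raw)} rows: {dropped['Country'].tolist()}")
```

If a large share of rows is dropped, or the dropped rows cluster in particular countries, the fitted model describes only the countries that report complete data.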

## 🧠 Train the Model

In [None]:
# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)
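
A fitted `LinearRegression` exposes one slope per feature in `coef_` and the constant term in `intercept_`. The sketch below shows how to read them, using synthetic, noise-free data with a known relationship so the recovered values can be checked. The numbers and the `demo_model` name are illustrative assumptions, not results from this project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with a known linear relationship (illustrative values only)
rng = np.random.default_rng(0)
gdp = rng.uniform(1, 100, size=200)
population = rng.uniform(1, 50, size=200)
emissions = 2.0 * gdp + 0.5 * population + 3.0  # noise-free on purpose

X_demo = np.column_stack([gdp, population])
demo_model = LinearRegression().fit(X_demo, emissions)

# coef_ lists slopes in feature-column order; intercept_ is the constant term
for name, coef in zip(["GDP", "Population"], demo_model.coef_):
    print(f"{name}: {coef:.3f}")
print(f"Intercept: {demo_model.intercept_:.3f}")
```

In this notebook the same attributes are available as `model.coef_` and `model.intercept_` after fitting. Each slope is expressed in target units per unit of its feature, so raw magnitudes are not directly comparable when GDP and population are measured on different scales.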

## 📈 Evaluate the Model

In [None]:
# Predict on test set
predictions = model.predict(X_test)

# Evaluate
mae = mean_absolute_error(y_test, predictions)
r2 = r2_score(y_test, predictions)

print(f"Mean Absolute Error: {mae:.2f}")
print(f"R² Score: {r2:.2f}")
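
To make the two metrics concrete, the sketch below computes them by hand from their standard definitions, which are also what `mean_absolute_error` and `r2_score` implement by default. The helper names and the four data points are made up for illustration.

```python
def mean_absolute_error_manual(actual, predicted):
    """Average of |actual - predicted|, in the units of the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_manual(actual, predicted):
    """1 - SS_res / SS_tot: improvement over always predicting the mean."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up numbers for illustration only
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

mae_demo = mean_absolute_error_manual(actual, predicted)
r2_demo = r2_manual(actual, predicted)
baseline_r2 = r2_manual(actual, [25.0] * 4)  # always predicting the mean

print(f"MAE: {mae_demo:.2f}")          # MAE: 2.50
print(f"R²: {r2_demo:.3f}")            # R²: 0.948
print(f"Baseline R²: {baseline_r2:.1f}")  # Baseline R²: 0.0
```

MAE is read in the units of the target (here, Mt of CO₂). An R² of 0 means the model does no better than predicting the mean of the test set, and negative values mean it does worse.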

## 📊 Visualize Predictions

In [None]:
# Scatter plot of actual vs predicted
plt.figure(figsize=(8,6))
plt.scatter(y_test, predictions, color='blue')
# Dashed diagonal marks perfect predictions (predicted == actual)
lims = [y_test.min(), y_test.max()]
plt.plot(lims, lims, color='red', linestyle='--')
plt.xlabel("Actual CO₂ Emissions (Mt)")
plt.ylabel("Predicted CO₂ Emissions (Mt)")
plt.title("Actual vs Predicted CO₂ Emissions")
plt.grid(True)
plt.show()

## 🧠 Ethical Reflection
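- **Bias Risk**: Countries with missing or under-reported data (especially in the Global South) are dropped or misrepresented during preprocessing, so the model may be least accurate for the countries that are already least visible in the data.
- **Sustainability**: Accurate emission forecasts can help governments allocate resources and design better-targeted climate policy.
- **Improvement**: Incorporate more variables, such as renewable energy share or deforestation rates, that GDP and population alone do not capture.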
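
One way to probe the bias risk noted above is to compare prediction error across groups of countries rather than relying on a single overall score. The sketch below is a minimal illustration. The `Region` column, the group labels, and all values are hypothetical, since the dataset described here is only known to contain GDP, population, and emissions.

```python
import pandas as pd

# Hypothetical test-set results; Region and all values are illustrative assumptions.
results = pd.DataFrame({
    "Region": ["North", "North", "South", "South", "South"],
    "actual": [100.0, 80.0, 20.0, 15.0, 30.0],
    "predicted": [98.0, 83.0, 29.0, 8.0, 38.0],
})

results["abs_error"] = (results["actual"] - results["predicted"]).abs()

# MAE and sample count per group; a large gap between groups is a warning sign
by_region = results.groupby("Region")["abs_error"].agg(mae="mean", n="count")
print(by_region)
```

In the notebook, such a table could be built from `y_test` and `predictions` if a grouping label were joined in via `X_test.index`. Small groups give noisy estimates, so the sample count matters as much as the error itself.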