# 04 Predictive Modeling (Simple and Explainable)

## Objectives

- Build a simple, explainable forecast model
- Evaluate performance with time-aware splits
- Export predictions for the dashboard

## Inputs

- data/processed/v1/environmental_trends_clean.csv

## Outputs

- data/processed/v1/model_predictions.csv

## Additional Comments

- Report limitations and avoid overclaiming

---

# Change working directory

In [1]:
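import os

# Move up one level from the notebook folder to the project root so that
# relative paths such as data/processed/... resolve.
# Run this cell only once: each re-run moves up another level.
current_dir = os.getcwd()
os.chdir(os.path.dirname(current_dir))
os.getcwd()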

'c:\\Users\\sergi\\OneDrive\\Documents\\Code Institute Data analytics\\Capstone project 3\\Global_environmental_trends_2000_2024\\global_env_trend'
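Re-running the cell above moves the working directory up one more level each time. A minimal sketch of a re-runnable alternative follows, assuming the project root is the nearest folder that contains a `data` directory. The helper name `find_project_root` and the `marker` parameter are hypothetical, not part of the original notebook.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "data") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`.

    Assumption: the project root is identified by a `data` sub-directory.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent in turn.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(
        f"No directory containing '{marker}' found at or above {start}"
    )
```

In a notebook this could be used as `os.chdir(find_project_root(Path.cwd()))`, which gives the same result however many times the cell is executed.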

# Load processed data

In [2]:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
clean_path = "data/processed/v1/environmental_trends_clean.csv"
df = pd.read_csv(clean_path)
df.head()

| Unnamed: 0 | Year | Country | Avg_Temperature_degC | CO2_Emissions_tons_per_capita | Sea_Level_Rise_mm | Rainfall_mm | Population | Renewable_Energy_pct | Extreme_Weather_Events | Forest_Area_pct |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000 | United States | 13.5 | 20.2 | 0 | 715 | 282500000 | 6.2 | 38 | 33.1 |
| 1 | 2000 | China | 12.8 | 2.7 | 0 | 645 | 1267000000 | 16.5 | 24 | 18.8 |
| 2 | 2000 | Germany | 9.3 | 10.1 | 0 | 700 | 82200000 | 6.6 | 12 | 31.8 |
| 3 | 2000 | Brazil | 24.9 | 1.9 | 0 | 1760 | 175000000 | 83.7 | 18 | 65.4 |
| 4 | 2000 | Australia | 21.7 | 17.2 | 0 | 534 | 19200000 | 8.8 | 11 | 16.2 |


# Prepare features and target (placeholder)

In [3]:
df_model = df.dropna(subset=["Year", "Avg_Temperature_degC"])
X = df_model[["Year"]]
y = df_model["Avg_Temperature_degC"]

# Time-aware split and model training (placeholder)

In [4]:
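# Train on years up to and including split_year; evaluate on later years only.
split_year = 2018
train = df_model[df_model["Year"] <= split_year]
test = df_model[df_model["Year"] > split_year]

# One pooled line (Year -> temperature) fitted across all countries.
model = LinearRegression()
model.fit(train[["Year"]], train["Avg_Temperature_degC"])
preds = model.predict(test[["Year"]])

mae = mean_absolute_error(test["Avg_Temperature_degC"], preds)
# Take the square root of the MSE directly: the squared=False argument was
# deprecated in scikit-learn 1.4 and removed in later releases.
rmse = mean_squared_error(test["Avg_Temperature_degC"], preds) ** 0.5
mae, rmse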

(6.399958579881656, 7.1623526237200075)
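The pooled model above predicts the same temperature for every country in a given year, so part of this error may simply reflect differences in baseline climate between countries rather than a poor trend estimate. One way to check is to fit a separate line per country and compare the test MAE. The sketch below is illustrative only: the helper `per_country_trend_mae` is hypothetical, and the toy data is made up to show the effect, not taken from the project dataset.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error


def per_country_trend_mae(df, split_year, target="Avg_Temperature_degC"):
    """Fit one Year -> target line per country; return the MAE over all test rows."""
    actual, predicted = [], []
    for _, group in df.groupby("Country"):
        train = group[group["Year"] <= split_year]
        test = group[group["Year"] > split_year]
        if len(train) < 2 or test.empty:
            continue  # not enough history to fit a line, or nothing to evaluate
        model = LinearRegression().fit(train[["Year"]], train[target])
        predicted.extend(model.predict(test[["Year"]]))
        actual.extend(test[target])
    return mean_absolute_error(actual, predicted)


# Made-up toy data (NOT the project dataset): two countries with different
# baselines but the same 0.1 degC/year trend.
toy = pd.DataFrame({
    "Year": [2000, 2001, 2002, 2003] * 2,
    "Country": ["A"] * 4 + ["B"] * 4,
    "Avg_Temperature_degC": [10.0, 10.1, 10.2, 10.3, 25.0, 25.1, 25.2, 25.3],
})

# Per-country lines, evaluated on the held-out year.
per_country_mae = per_country_trend_mae(toy, split_year=2002)

# Pooled line across both countries, for comparison.
pooled_train = toy[toy["Year"] <= 2002]
pooled_test = toy[toy["Year"] > 2002]
pooled = LinearRegression().fit(
    pooled_train[["Year"]], pooled_train["Avg_Temperature_degC"]
)
pooled_mae = mean_absolute_error(
    pooled_test["Avg_Temperature_degC"], pooled.predict(pooled_test[["Year"]])
)
```

On the toy data the pooled error is dominated by the gap between the two baselines, while the per-country error is close to zero. Whether the same holds for the real data would need to be checked by calling `per_country_trend_mae(df_model, split_year)`.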

# Export predictions (placeholder)

In [5]:
# Keep the identifying columns so the dashboard can match predictions to rows.
id_cols = ["Year", "Country"] if "Country" in test.columns else ["Year"]
preds_df = test[id_cols].copy()
preds_df["Predicted_Avg_Temperature_degC"] = preds
preds_path = "data/processed/v1/model_predictions.csv"
preds_df.to_csv(preds_path, index=False)
preds_path

'data/processed/v1/model_predictions.csv'
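The exported file contains predictions only, so the dashboard cannot show how far off they are. A possible extension, sketched below under the assumption that the dashboard would benefit from actual values and per-row errors, adds both columns before export. The helper `build_predictions_frame` and the column name `Abs_Error_degC` are hypothetical, and the rows are made up for illustration.

```python
import pandas as pd


def build_predictions_frame(test, preds, target="Avg_Temperature_degC"):
    """Return identifiers, actual values, predictions and absolute errors."""
    id_cols = [c for c in ("Year", "Country") if c in test.columns]
    out = test[id_cols + [target]].copy()
    out["Predicted_" + target] = preds
    # Absolute error per row, in the same unit as the target (degC).
    out["Abs_Error_degC"] = (out[target] - out["Predicted_" + target]).abs()
    return out


# Made-up rows for illustration only (not taken from the project data).
toy_test = pd.DataFrame({
    "Year": [2019, 2019],
    "Country": ["A", "B"],
    "Avg_Temperature_degC": [10.0, 25.0],
})
toy_preds = [17.5, 17.5]
export_df = build_predictions_frame(toy_test, toy_preds)
```

With the notebook's own variables this would be `build_predictions_frame(test, preds).to_csv(preds_path, index=False)`.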