# Day 7: Improving Predictions with Advanced Models

So far, we have built and evaluated two models: **Linear Regression** and **Decision Tree**.  
Today, we will explore **more advanced algorithms** to improve our predictions:

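- **Random Forest Regressor** → an ensemble of decision trees, each trained on a bootstrap sample of the data, whose predictions are averaged.  
- **Gradient Boosting Regressor** → a model that adds trees one at a time, each fitted to the errors left by the trees before it.  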

We will compare their performance with the earlier models and check whether they predict more accurately.
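
To make the two ideas above concrete, here is a minimal sketch on stand-in data. The data from `make_regression` and every `_demo` name are assumptions for illustration, not part of our stock dataset. It checks that a forest's prediction is the mean of its trees' predictions, and that the training error of a boosted model shrinks as trees are added.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Stand-in regression data (an assumption for illustration only).
X_demo, y_demo = make_regression(
    n_samples=200, n_features=4, noise=10.0, random_state=0
)

# Random Forest: the forest's prediction is the mean over its individual trees.
rf_demo = RandomForestRegressor(n_estimators=20, random_state=0).fit(X_demo, y_demo)
per_tree = np.stack([tree.predict(X_demo) for tree in rf_demo.estimators_])
forest_matches_mean = np.allclose(per_tree.mean(axis=0), rf_demo.predict(X_demo))
print("Forest equals mean of its trees:", forest_matches_mean)

# Gradient Boosting: staged_predict yields the prediction after each added tree,
# so we can watch the training error fall as later trees correct earlier ones.
gb_demo = GradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, random_state=0
).fit(X_demo, y_demo)
staged_mse = [mean_squared_error(y_demo, p) for p in gb_demo.staged_predict(X_demo)]
print("Training MSE after 1 tree:  ", round(staged_mse[0], 2))
print("Training MSE after 50 trees:", round(staged_mse[-1], 2))
```

Note that this tracks training error only; error on held-out data can stop improving, or get worse, as more trees are added.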


# Day 7: Trying Advanced Models - Random Forest & Gradient Boosting

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Load the dataset (same synthetic stock data we created earlier)
stock_data = pd.read_csv("synthetic_stock_data.csv")

# Features and target
X = stock_data[["open", "high", "low", "volume"]]
y = stock_data["close"]

# Split into training & testing (same as Day 6)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# ---- Train Random Forest ----
rf_model = RandomForestRegressor(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)
rf_preds = rf_model.predict(X_test)

rf_mse = mean_squared_error(y_test, rf_preds)
rf_r2 = r2_score(y_test, rf_preds)

# ---- Train Gradient Boosting ----
gb_model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, random_state=42)
gb_model.fit(X_train, y_train)
gb_preds = gb_model.predict(X_test)

gb_mse = mean_squared_error(y_test, gb_preds)
gb_r2 = r2_score(y_test, gb_preds)

# ---- Print results ----
print("🔹 Random Forest → MSE:", round(rf_mse, 2), " R²:", round(rf_r2, 2))
print("🔹 Gradient Boosting → MSE:", round(gb_mse, 2), " R²:", round(gb_r2, 2))


🔹 Random Forest → MSE: 0.24  R²: 0.98
🔹 Gradient Boosting → MSE: 0.18  R²: 0.99
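
To put these numbers next to the earlier models, one option is to fit all four on the same split and tabulate the metrics. The sketch below is an assumption-heavy illustration: `compare_models` is a hypothetical helper, the Linear Regression and Decision Tree settings are scikit-learn defaults rather than necessarily the ones used on earlier days, and the demo frame is generated stand-in data, not `synthetic_stock_data.csv`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def compare_models(X_train, X_test, y_train, y_test):
    """Fit four regressors on one split and return their test metrics (hypothetical helper)."""
    models = {
        "Linear Regression": LinearRegression(),
        "Decision Tree": DecisionTreeRegressor(random_state=42),
        "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
        "Gradient Boosting": GradientBoostingRegressor(random_state=42),
    }
    rows = []
    for name, model in models.items():
        preds = model.fit(X_train, y_train).predict(X_test)
        rows.append({
            "model": name,
            "mse": mean_squared_error(y_test, preds),
            "r2": r2_score(y_test, preds),
        })
    # Lowest test MSE first.
    return pd.DataFrame(rows).sort_values("mse").reset_index(drop=True)


# Stand-in frame with the same column names as our dataset (values are made up).
rng = np.random.default_rng(0)
n = 300
open_ = 100 + rng.normal(0, 1, n).cumsum()
close = open_ + rng.normal(0, 1, n)
demo = pd.DataFrame({
    "open": open_,
    "high": np.maximum(open_, close) + rng.uniform(0, 1, n),
    "low": np.minimum(open_, close) - rng.uniform(0, 1, n),
    "volume": rng.integers(1000, 5000, n),
    "close": close,
})

Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    demo[["open", "high", "low", "volume"]], demo["close"],
    test_size=0.2, random_state=42,
)
results = compare_models(Xd_train, Xd_test, yd_train, yd_test)
print(results.round(3))
```

With the notebook's own split, the call would be `compare_models(X_train, X_test, y_train, y_test)`.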


### Day 7 Summary – Advanced Models

Today we tested two advanced models: **Random Forest** and **Gradient Boosting**.

- **Random Forest** → MSE = 0.24, R² = 0.98  
- **Gradient Boosting** → MSE = 0.18, R² = 0.99  

📊 **Observation:**  
Both models fit the held-out data well. On this single train/test split, Gradient Boosting scored a lower MSE (0.18 vs. 0.24) and a marginally higher R² (0.99 vs. 0.98) than Random Forest. The gap is small, so it suggests, rather than proves, that boosting captures the patterns in our synthetic stock dataset more effectively.
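
A single train/test split gives only one estimate of each model's error. One way to check whether a small gap holds up is k-fold cross-validation. This is a minimal sketch on stand-in data; the `make_regression` data, the `_cv` names, and the choice of 5 folds are all assumptions, and in the notebook you would pass our `X` and `y` instead.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Stand-in data (an assumption for illustration, not the stock dataset).
X_cv, y_cv = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

models_cv = {
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(
        n_estimators=100, learning_rate=0.1, random_state=42
    ),
}

# Shuffled 5-fold CV; both models are scored on the same folds.
folds = KFold(n_splits=5, shuffle=True, random_state=42)

cv_mse = {}
for name, model in models_cv.items():
    # scikit-learn reports negated MSE so that higher is better; flip the sign back.
    scores = -cross_val_score(
        model, X_cv, y_cv, cv=folds, scoring="neg_mean_squared_error"
    )
    cv_mse[name] = scores
    print(f"{name}: mean MSE = {scores.mean():.2f}, std = {scores.std():.2f}")
```

If the difference between the mean scores is small relative to the spread across folds, the ranking of the two models should be treated as tentative.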

Next, we can explore **hyperparameter tuning** to further optimize these models.
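
As a preview of what tuning could look like, here is a minimal sketch using scikit-learn's `GridSearchCV` on Gradient Boosting. The grid values, the 3-fold setting, and the stand-in data are assumptions chosen to keep the example fast, not recommended settings for our dataset.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Stand-in data (an assumption for illustration, not the stock dataset).
X_gs, y_gs = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
Xg_train, Xg_test, yg_train, yg_test = train_test_split(
    X_gs, y_gs, test_size=0.2, random_state=42
)

# A deliberately small, illustrative grid (2 x 2 x 2 = 8 combinations).
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
# The search uses cross-validation on the training data only.
search.fit(Xg_train, yg_train)

# The test set is touched once, at the end, to score the refitted best model.
tuned_mse = mean_squared_error(yg_test, search.predict(Xg_test))
print("Best parameters:", search.best_params_)
print("Test MSE of tuned model:", round(tuned_mse, 2))
```

Keeping the test set out of the search means the final MSE is not biased by the parameter selection.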
