---

## **Machine Learning - II**

---

### **Stacked Regression Model Comparison**


In [None]:
# Import necessary libraries
from numpy import mean  # To calculate the average of cross-validation scores
from sklearn.datasets import make_regression  # To generate a synthetic regression dataset
from sklearn.model_selection import cross_val_score, RepeatedKFold  # For cross-validation and model evaluation
from sklearn.linear_model import LinearRegression  # Meta-model for stacking
from sklearn.neighbors import KNeighborsRegressor  # Base model 1 (K-Nearest Neighbors)
from sklearn.tree import DecisionTreeRegressor  # Base model 2 (Decision Tree)
from sklearn.svm import SVR  # Base model 3 (Support Vector Regressor)
from sklearn.ensemble import StackingRegressor  # For building the Stacking ensemble model

In [None]:
# Create a synthetic regression dataset
X, y = make_regression(n_samples=1000, n_features=20, random_state=1)  # 1000 samples, 20 input features; random_state fixes the seed so the data is reproducible
# X: Feature matrix with 1000 samples and 20 features
# y: Target vector with 1000 values corresponding to each sample
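
It can help to check what `make_regression` actually generated before comparing models. The sketch below is an illustrative addition, not part of the original notebook: it requests the ground-truth coefficients with `coef=True` and relies on the library defaults (`n_informative=10`, `noise=0.0`, `bias=0.0`), which are assumptions about the call above rather than anything the notebook sets explicitly.

```python
import numpy as np
from sklearn.datasets import make_regression

# Same call as in the notebook, but also return the true coefficients
X, y, coef = make_regression(n_samples=1000, n_features=20,
                             random_state=1, coef=True)

print(X.shape, y.shape)        # (1000, 20) (1000,)

# With the default n_informative=10, only 10 features drive the target
n_informative = int((coef != 0).sum())
print(n_informative)

# With the default noise=0.0 and bias=0.0, y is an exact linear function of X
is_exact_linear = bool(np.allclose(X @ coef, y))
print(is_exact_linear)
```

Under these defaults the target is a noise-free linear combination of the features, which is worth keeping in mind when reading the scores of the non-linear base models.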

In [None]:
# Define a function to create a stacking model
def get_stacking():
    level0 = list()  # Initialize an empty list to hold base models (level 0)
    level0.append(('knn', KNeighborsRegressor()))  # Add KNN as a base model
    level0.append(('cart', DecisionTreeRegressor()))  # Add Decision Tree as a base model
    level0.append(('svm', SVR()))  # Add Support Vector Regressor as a base model
    level1 = LinearRegression()  # Define Linear Regression as the meta-model (level 1)
    model = StackingRegressor(estimators=level0, final_estimator=level1)  # Combine base models and meta-model in a Stacking Regressor
    return model  # Return the complete stacking model

# Level 0 holds the base models; it is a list because there are several of them
# Level 1 is the meta-model that learns how to combine the base models' predictions
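
To make the two levels concrete, here is a rough sketch of what stacking does during fitting, written out by hand. It assumes the `StackingRegressor` defaults (5-fold cross-validation to produce out-of-fold predictions, no passthrough of the original features); the variable names and the `random_state=1` on the tree are illustrative choices, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=1000, n_features=20, random_state=1)

base_models = [
    ('knn', KNeighborsRegressor()),
    ('cart', DecisionTreeRegressor(random_state=1)),  # seed added for repeatability
    ('svm', SVR()),
]

# Level 0: each base model predicts every sample while that sample is held out,
# so the meta-model never sees predictions made on a model's own training rows
meta_features = np.column_stack([
    cross_val_predict(model, X, y, cv=5) for _, model in base_models
])

# Level 1: the meta-model is trained on the out-of-fold predictions
meta_model = LinearRegression().fit(meta_features, y)

# For new data, the base models are refit on the full training set first
fitted = [model.fit(X, y) for _, model in base_models]
X_new = X[:5]  # stand-in for unseen rows
new_meta = np.column_stack([m.predict(X_new) for m in fitted])
stacked_pred = meta_model.predict(new_meta)
print(stacked_pred)
```

This is only meant to show the data flow; in practice `StackingRegressor` handles cloning, fitting, and prediction in one estimator.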

In [None]:
# Define a function to retrieve the models for comparison
def get_models():
    models = dict()  # Initialize an empty dictionary to store models
    models['knn'] = KNeighborsRegressor()  # Add KNN model to the dictionary
    models['cart'] = DecisionTreeRegressor()  # Add Decision Tree model to the dictionary
    models['svm'] = SVR()  # Add SVR model to the dictionary
    models['stacking'] = get_stacking()  # Add the stacking model to the dictionary
    return models  # Return the dictionary of models

In [None]:
# Define a function to evaluate a model using cross-validation
def evaluate_model(model, X, y):
    cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)  # 10-fold cross-validation repeated 3 times
    # This splits the 1000 samples into 10 folds and repeats the process 3 times with different splits, giving 30 scores
    scores = cross_val_score(model, X, y, scoring='neg_mean_absolute_error', cv=cv)  # Evaluate using negative MAE
    # scikit-learn scorers follow a higher-is-better convention, so MAE is negated; values closer to 0 are better
    return scores  # Return the cross-validation scores
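
Because the scorer returns negated errors, it is often easier to read the results after flipping the sign. The sketch below repeats the same cross-validation setup for a single model (KNN, chosen only as an example) and reports the positive MAE with its standard deviation; the `mae` variable is an illustrative name.

```python
from numpy import mean, std
from sklearn.datasets import make_regression
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=1000, n_features=20, random_state=1)

# Same scheme as evaluate_model: 10 folds x 3 repeats = 30 scores
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
scores = cross_val_score(KNeighborsRegressor(), X, y,
                         scoring='neg_mean_absolute_error', cv=cv)

mae = -scores  # flip the sign to get the ordinary (positive) MAE per fold
print(len(scores))
print(f'MAE: {mean(mae):.3f} (std {std(mae):.3f})')
```

Reporting the spread alongside the mean gives a sense of how stable the score is across folds.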

In [None]:
# Get the dictionary of models (KNN, CART, SVR, Stacking)
models = get_models()

# Initialize empty lists to store results and model names
results, names = list(), list()

# Loop through each model in the dictionary
for name, model in models.items():
    scores = evaluate_model(model, X, y)  # Evaluate the model on the dataset using cross-validation
    results.append(scores)  # Append the scores to the results list
    names.append(name)  # Append the model name to the names list
    print(f'{name}: {mean(scores)}')  # Print the model name and the mean cross-validation score (mean of 30 scores)

knn: -72.38368150286077
cart: -89.59476145848235
svm: -113.9538393363414
stacking: -40.600319718110576


### **Interpretation of Results:**
We compared four models on a synthetic regression dataset: `KNeighborsRegressor` (KNN), `DecisionTreeRegressor` (CART), `SVR`, and a `StackingRegressor`. Each model was evaluated with 10-fold cross-validation repeated 3 times, using negative mean absolute error (NMAE) as the metric. Because the error is negated, values closer to zero indicate better performance.

Here are the results for each model:
- **KNN (KNeighborsRegressor):** NMAE = `-72.38`
- **CART (DecisionTreeRegressor):** NMAE = `-89.59`
- **SVR (Support Vector Regressor):** NMAE = `-113.95`
- **Stacking Model:** NMAE = `-40.60`
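
The comparison can be made more concrete with a little arithmetic on the reported scores. The snippet below is a small illustrative calculation using the mean NMAE values printed above; the dictionary and variable names are not from the original notebook.

```python
# Mean NMAE values as printed by the evaluation loop
nmae = {
    'knn': -72.38368150286077,
    'cart': -89.59476145848235,
    'svm': -113.9538393363414,
    'stacking': -40.600319718110576,
}

# Convert to ordinary MAE (lower is better) and rank the models
mae = {name: -score for name, score in nmae.items()}
ranking = sorted(mae, key=mae.get)
print(ranking)  # ['stacking', 'knn', 'cart', 'svm']

# Relative error reduction of stacking versus the best single base model
best_base = min((name for name in mae if name != 'stacking'), key=mae.get)
reduction = (mae[best_base] - mae['stacking']) / mae[best_base]
print(f'stacking vs {best_base}: {reduction:.1%} lower MAE')  # 43.9% lower MAE
```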

### Interpretation:
1. **KNN:** The KNeighborsRegressor was the best of the three base models, with an NMAE of `-72.38`. It outperformed both CART and SVR, but its error was still roughly 1.8 times that of the stacking model.
   
2. **CART:** The DecisionTreeRegressor performed worse than KNN, yielding an NMAE of `-89.59`. With default settings the tree has no depth limit, so it tends to overfit the training data, which likely contributed to its weaker cross-validated score.
   
3. **SVR:** The SVR model had the poorest performance, with an NMAE of `-113.95`. SVR is sensitive to hyperparameters such as `C` and `epsilon` and can struggle on high-dimensional data; it was used here with default settings and no tuning, which could explain its poor performance.
   
4. **Stacking Model:** The stacking model achieved the best performance, with an NMAE of `-40.60`. By feeding the predictions of the base models (KNN, CART, SVR) into a meta-model (LinearRegression) that makes the final prediction, the stacking approach reduced the error by roughly 44% relative to KNN, the best base model.
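
One way to probe the tuning explanation for SVR is to compare the default model against one with a weaker regularization setting. The sketch below is a quick check under assumptions, not a tuning procedure: `C=100.0` is an arbitrary illustrative value, a plain 5-fold split is used to keep it fast, and `cv_mae` is a hypothetical helper.

```python
from numpy import mean
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVR

X, y = make_regression(n_samples=1000, n_features=20, random_state=1)
cv = KFold(n_splits=5, shuffle=True, random_state=1)

def cv_mae(model):
    """Return the mean cross-validated MAE (positive, lower is better)."""
    scores = cross_val_score(model, X, y,
                             scoring='neg_mean_absolute_error', cv=cv)
    return -mean(scores)

default_mae = cv_mae(SVR())           # default C=1.0
larger_c_mae = cv_mae(SVR(C=100.0))   # illustrative value, not a tuned optimum

print(f'SVR default : {default_mae:.2f}')
print(f'SVR C=100   : {larger_c_mae:.2f}')
```

If the second score is clearly lower, that supports the idea that the default SVR was underfitting rather than that SVR is unsuited to the data. A proper search (for example with `GridSearchCV`) would be needed to draw firmer conclusions.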

### Why Stacking Was Done:
The performance of individual base models was not satisfactory:
- **KNN** had moderate performance but could be improved.
- **CART** and **SVR** struggled to capture the relationships in the data well.
- No single model provided an optimal solution on its own.

Stacking was used to **leverage the strengths** of each base model and create a more powerful ensemble. The idea behind stacking is that different kinds of models make different kinds of errors, so a meta-model trained on their predictions can learn how much to trust each one and produce a better final prediction than any of them alone.

In this case, the **StackingRegressor** took the predictions of the KNN, DecisionTree, and SVR models as inputs, and the meta-model (LinearRegression) learned how to weight and combine them. This approach reduced the cross-validated error, as shown by the best NMAE score of `-40.60`.
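
What the meta-model learned can be inspected directly after fitting. The following sketch fits the same stacking configuration on the full dataset and prints the linear weights assigned to each base model's predictions. It uses the fitted `final_estimator_` attribute of `StackingRegressor`; the `random_state=1` on the tree and the variable names are illustrative additions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=1000, n_features=20, random_state=1)

estimators = [
    ('knn', KNeighborsRegressor()),
    ('cart', DecisionTreeRegressor(random_state=1)),  # seed added for repeatability
    ('svm', SVR()),
]
stack = StackingRegressor(estimators=estimators,
                          final_estimator=LinearRegression())
stack.fit(X, y)

# One coefficient per base model: how strongly its prediction is weighted
weights = stack.final_estimator_.coef_
for (name, _), weight in zip(estimators, weights):
    print(f'{name}: {weight:.3f}')
print(f'intercept: {stack.final_estimator_.intercept_:.3f}')
```

The weights are not bounded or normalized, so they are best read as relative contributions rather than percentages.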

### Conclusion:
The results suggest that stacking can be an effective way to improve performance, especially when the base models alone are insufficient. On this synthetic dataset the stacking model outperformed every individual model by a large margin, making it the best choice among the four models compared. The **combination of multiple models** generalized better across the cross-validation folds, reducing the prediction error and yielding more reliable results.