<a href="https://colab.research.google.com/github/calmrocks/master-machine-learning-engineer/blob/main/basic_models/Regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Regression Case Study: House Price Prediction

## Introduction
This notebook demonstrates regression modeling using the California Housing dataset. We'll explore various regression techniques and best practices for predicting house prices.

## Dataset Overview
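The California Housing dataset (derived from the 1990 U.S. census) has 20,640 rows, one per census block group, with eight numeric features and one target:
- Median house value (target variable, in units of $100,000)
- Housing features: `HouseAge`, `AveRooms`, `AveBedrms`, `AveOccup`
- Location information: `Latitude`, `Longitude`
- Economic and demographic indicators: `MedInc` (median income) and `Population`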

## Objectives
1. Implement and compare different regression models
2. Apply feature engineering and selection
3. Handle non-linear relationships
4. Evaluate model performance
5. Demonstrate regression best practices

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

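# Set plotting style
# (the 'seaborn' style name is no longer available in recent Matplotlib releases;
# sns.set_theme() applies the seaborn look on its own)
sns.set_theme()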

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

## Data Loading and Exploration

In this section, we'll:
1. Load and examine the dataset
2. Analyze feature distributions
3. Investigate relationships between features
4. Check for data quality issues
5. Explore target variable distribution

In [None]:
# Load California Housing dataset
housing = fetch_california_housing()
df = pd.DataFrame(housing.data, columns=housing.feature_names)
df['PRICE'] = housing.target

# Display basic information
print("Dataset Shape:", df.shape)
print("\nFeatures:")
print(housing.feature_names)
print("\nFirst few rows:")
display(df.head())

# Basic statistics
print("\nBasic Statistics:")
display(df.describe())

# Check for missing values
print("\nMissing Values:")
print(df.isnull().sum())

### Target Variable Analysis

Let's examine the distribution of house prices and identify any potential issues:

In [None]:
# Analyze target variable distribution
plt.figure(figsize=(15, 5))

# Histogram
plt.subplot(1, 3, 1)
sns.histplot(data=df, x='PRICE', bins=50)
plt.title('Distribution of House Prices')
plt.xlabel('Price (100k$)')

# Box plot
plt.subplot(1, 3, 2)
sns.boxplot(data=df, y='PRICE')
plt.title('Box Plot of House Prices')
plt.ylabel('Price (100k$)')

# Q-Q plot
plt.subplot(1, 3, 3)
from scipy import stats
stats.probplot(df['PRICE'], dist="norm", plot=plt)
plt.title('Q-Q Plot of House Prices')

plt.tight_layout()
plt.show()

# Print summary statistics
print("\nPrice Statistics:")
print(df['PRICE'].describe())

# Check skewness and kurtosis
print("\nSkewness:", df['PRICE'].skew())
print("Kurtosis:", df['PRICE'].kurtosis())

### Feature Distributions

Examine the distribution of each feature to identify patterns and potential issues:

In [None]:
# Create distribution plots for all features
features = housing.feature_names
n_features = len(features)
n_rows = (n_features + 1) // 2

plt.figure(figsize=(15, 5*n_rows))

for idx, feature in enumerate(features, 1):
    plt.subplot(n_rows, 2, idx)
    
    # Histogram with KDE
    sns.histplot(data=df, x=feature, bins=50, kde=True)
    plt.title(f'Distribution of {feature}')
    
    # Add skewness and kurtosis annotations
    skew = df[feature].skew()
    kurt = df[feature].kurtosis()
    plt.annotate(f'Skewness: {skew:.2f}\nKurtosis: {kurt:.2f}',
                 xy=(0.95, 0.95), xycoords='axes fraction',
                 ha='right', va='top',
                 bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))

plt.tight_layout()
plt.show()

# Print summary of feature characteristics
print("Feature Characteristics:")
for feature in features:
    print(f"\n{feature}:")
    print(f"Range: [{df[feature].min():.2f}, {df[feature].max():.2f}]")
    print(f"Skewness: {df[feature].skew():.2f}")
    print(f"Kurtosis: {df[feature].kurtosis():.2f}")

### Correlation Analysis

Investigate relationships between features and the target variable:

In [None]:
# Calculate correlation matrix
correlation = df.corr()

# Plot correlation heatmap
plt.figure(figsize=(12, 8))
sns.heatmap(correlation, annot=True, cmap='coolwarm', center=0)
plt.title('Correlation Matrix')
plt.show()

# Print correlations with target variable
price_correlations = correlation['PRICE'].sort_values(ascending=False)
print("\nCorrelations with Price:")
print(price_correlations)

# Scatter plots for the features most strongly correlated with price (by absolute value)
top_features = price_correlations.drop('PRICE').abs().sort_values(ascending=False).index[:4]
plt.figure(figsize=(15, 4))

for idx, feature in enumerate(top_features, 1):
    plt.subplot(1, 4, idx)
    plt.scatter(df[feature], df['PRICE'], alpha=0.5)
    plt.xlabel(feature)
    plt.ylabel('PRICE')
    plt.title(f'Price vs {feature}')

plt.tight_layout()
plt.show()
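
Pearson correlation, which `df.corr()` computes by default, only measures linear association. As a rough illustration on synthetic data (the `demo` frame and its column names are made up for this sketch and are not part of the housing dataset), comparing it with Spearman rank correlation can hint at relationships that are monotonic but non-linear:

```python
import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration only)
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=500)
demo = pd.DataFrame({
    "x": x,
    "y_monotonic": np.exp(x),        # monotonic but strongly non-linear in x
    "y_u_shaped": (x - 2.5) ** 2,    # non-monotonic: neither coefficient detects it
})

pearson = demo.corr(method="pearson")["x"]
spearman = demo.corr(method="spearman")["x"]

print(pd.DataFrame({"pearson": pearson, "spearman": spearman}).round(3))
```

On the housing data, the same comparison could be run with `df.corr(method='spearman')['PRICE']`. A large gap between the two coefficients for a feature suggests that a transformation may help a linear model.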

### Feature Interactions

Explore interactions between important features:

In [None]:
# Create pair plots for top correlated features
top_features_with_price = list(top_features) + ['PRICE']
sns.pairplot(df[top_features_with_price], diag_kind='kde')
plt.suptitle('Pair Plot of Top Features', y=1.02)
plt.show()

# 3D scatter plot for top 2 features vs price
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')

scatter = ax.scatter(df[top_features[0]], 
                    df[top_features[1]], 
                    df['PRICE'],
                    c=df['PRICE'],
                    cmap='viridis')

ax.set_xlabel(top_features[0])
ax.set_ylabel(top_features[1])
ax.set_zlabel('PRICE')
plt.colorbar(scatter)
plt.title('3D Visualization of Top Features vs Price')
plt.show()

### Initial Findings

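1. **Target Variable (Price)**:
   - Distribution is right-skewed (see the skewness value printed above)
   - Values are capped near 5.0 (about $500,000), which shows up as a spike at the upper end of the histogram
   - A log transformation of the target may be worth testing

2. **Feature Characteristics**:
   - Features sit on very different scales (for example, `Population` versus `MedInc`)
   - `AveRooms`, `AveBedrms`, `Population`, and `AveOccup` have long right tails
   - Extreme values in these features are candidates for capping or transformation

3. **Relationships**:
   - `MedInc` has the strongest linear correlation with price
   - Location (`Latitude`, `Longitude`) relates to price non-linearly
   - `AveRooms` and `AveBedrms` are strongly correlated with each other, as are `Latitude` and `Longitude`

4. **Data Quality**:
   - No missing values in this dataset
   - Outliers can dominate squared-error metrics and linear fits
   - Scale-sensitive models (linear, regularized) need feature scaling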

Next steps:
1. Feature engineering based on observed patterns
2. Handle skewed distributions
3. Address outliers
4. Prepare data for modeling
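
One common way to address outliers is to cap values that fall outside a multiple of the interquartile range (IQR). The sketch below is a minimal illustration, not the notebook's own method: the helper `clip_outliers_iqr`, the multiplier `k=1.5`, and the toy `rooms` column are all assumptions made for the example.

```python
import pandas as pd

def clip_outliers_iqr(frame, columns, k=1.5):
    """Return a copy of `frame` with each listed column capped at [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is the conventional box-plot rule.
    """
    clipped = frame.copy()
    for col in columns:
        q1, q3 = frame[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        clipped[col] = frame[col].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return clipped

# Toy data with one extreme value (made up for this example)
toy = pd.DataFrame({"rooms": [4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 140.0]})
capped = clip_outliers_iqr(toy, ["rooms"])
print(capped["rooms"].tolist())
```

To avoid leaking test-set information, the quantiles would ideally be computed on the training split only and then applied to the test split.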

## Feature Engineering and Preprocessing

Based on our exploratory analysis, we'll:
1. Handle skewed features and outliers
2. Create interaction features
3. Apply feature transformations
4. Scale features appropriately
5. Prepare data for modeling

### Handling Skewed Features

Apply appropriate transformations to handle skewed distributions:

In [None]:
from scipy import stats

def analyze_skewness(data, threshold=0.5):
    """Analyze feature skewness and suggest transformations."""
    skew_stats = pd.DataFrame({
        'Original_Skew': data.skew()
    })
    
    # Try different transformations
    for feature in data.columns:
        # Log transformation (adding 1 to handle zeros)
        if (data[feature] >= 0).all():
            log_skew = np.log1p(data[feature]).skew()
            skew_stats.loc[feature, 'Log_Skew'] = log_skew
        
        # Square root transformation
        if (data[feature] >= 0).all():
            sqrt_skew = np.sqrt(data[feature]).skew()
            skew_stats.loc[feature, 'Sqrt_Skew'] = sqrt_skew
        
        # Box-Cox transformation
        if (data[feature] > 0).all():
            boxcox_data, _ = stats.boxcox(data[feature])
            boxcox_skew = pd.Series(boxcox_data).skew()
            skew_stats.loc[feature, 'BoxCox_Skew'] = boxcox_skew
    
    return skew_stats

# Analyze skewness
skew_analysis = analyze_skewness(df)
print("Skewness Analysis:")
display(skew_analysis)

# Apply transformations to skewed features
df_transformed = df.copy()
transformations = {}

for feature in df.columns.drop('PRICE'):  # Skip the target so no PRICE_transformed column leaks into the features
    if abs(df[feature].skew()) > 0.5:  # Apply transformation if |skewness| > 0.5
        if (df[feature] >= 0).all():
            # Choose best transformation based on skewness reduction
            skew_values = skew_analysis.loc[feature].dropna()
            best_transform = skew_values.abs().idxmin()
            
            if 'Log' in best_transform:
                df_transformed[f'{feature}_transformed'] = np.log1p(df[feature])
                transformations[feature] = 'log'
            elif 'Sqrt' in best_transform:
                df_transformed[f'{feature}_transformed'] = np.sqrt(df[feature])
                transformations[feature] = 'sqrt'
            elif 'BoxCox' in best_transform and (df[feature] > 0).all():
                df_transformed[f'{feature}_transformed'], _ = stats.boxcox(df[feature])
                transformations[feature] = 'boxcox'

# Visualize transformations
for feature, transform in transformations.items():
    plt.figure(figsize=(15, 5))
    
    # Original distribution
    plt.subplot(1, 2, 1)
    sns.histplot(data=df, x=feature, bins=50, kde=True)
    plt.title(f'Original {feature} Distribution')
    
    # Transformed distribution
    plt.subplot(1, 2, 2)
    sns.histplot(data=df_transformed, x=f'{feature}_transformed', bins=50, kde=True)
    plt.title(f'Transformed {feature} Distribution ({transform})')
    
    plt.tight_layout()
    plt.show()
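
The Box-Cox parameter above is estimated on the full dataset. A variation that keeps the test data out of that estimate is to fit scikit-learn's `PowerTransformer` on the training split only. The following is a sketch on synthetic log-normal data (the array `X_demo` is an assumption for illustration, not a housing feature):

```python
import numpy as np
from scipy.stats import skew
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PowerTransformer

# Synthetic, strictly positive, right-skewed feature (illustrative only)
rng = np.random.default_rng(42)
X_demo = rng.lognormal(mean=0.0, sigma=1.0, size=(2000, 1))

X_tr, X_te = train_test_split(X_demo, test_size=0.2, random_state=42)

# Box-Cox requires strictly positive input; method="yeo-johnson" also accepts zeros and negatives
pt = PowerTransformer(method="box-cox")
X_tr_t = pt.fit_transform(X_tr)   # lambda is estimated from the training split only
X_te_t = pt.transform(X_te)       # the same lambda is reused on the test split

skew_before = skew(X_tr[:, 0])
skew_after = skew(X_tr_t[:, 0])
print(f"Training skewness before: {skew_before:.2f}, after: {skew_after:.2f}")
```

By default `PowerTransformer` also standardizes its output, so a separate scaling step is optional for columns transformed this way.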

### Feature Interactions

Create meaningful interaction features based on domain knowledge and correlations:

In [None]:
# Create interaction features
def create_interactions(df, top_n=3):
    """Create interaction features from top correlated features."""
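    # Get top correlated features with price (excluding PRICE and any PRICE-derived column)
    correlations = df.corr()['PRICE'].abs().sort_values(ascending=False)
    correlations = correlations[~correlations.index.str.startswith('PRICE')]
    top_features = correlations.index[:top_n]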
    
    df_interactions = df.copy()
    
    # Create multiplicative interactions
    for i in range(len(top_features)):
        for j in range(i+1, len(top_features)):
            feat1, feat2 = top_features[i], top_features[j]
            interaction_name = f'{feat1}_{feat2}_interaction'
            df_interactions[interaction_name] = df[feat1] * df[feat2]
    
    # Create ratio features
    for i in range(len(top_features)):
        for j in range(len(top_features)):
            if i != j:
                feat1, feat2 = top_features[i], top_features[j]
                ratio_name = f'{feat1}_per_{feat2}'
                df_interactions[ratio_name] = df[feat1] / (df[feat2] + 1e-6)
    
    return df_interactions

# Create interaction features
df_with_interactions = create_interactions(df_transformed)

# Analyze new features
new_features = [col for col in df_with_interactions.columns 
               if col not in df_transformed.columns]

# Calculate correlations with price for new features
new_correlations = df_with_interactions[new_features + ['PRICE']].corr()['PRICE']
print("Correlations of new features with price:")
display(new_correlations.sort_values(ascending=False))

# Visualize top new features
top_new_features = new_correlations.abs().sort_values(ascending=False).head(4).index

plt.figure(figsize=(15, 4))
for idx, feature in enumerate(top_new_features, 1):
    plt.subplot(1, 4, idx)
    plt.scatter(df_with_interactions[feature], 
               df_with_interactions['PRICE'], 
               alpha=0.5)
    plt.xlabel(feature)
    plt.ylabel('PRICE')
    plt.title(f'Price vs {feature}')
plt.tight_layout()
plt.show()

### Polynomial Features

Create polynomial features to capture non-linear relationships:

In [None]:
from sklearn.preprocessing import PolynomialFeatures

def create_polynomial_features(df, features, degree=2):
    """Create polynomial features for specified columns."""
    poly = PolynomialFeatures(degree=degree, include_bias=False)
    poly_features = poly.fit_transform(df[features])
    
    # Generate feature names
    feature_names = poly.get_feature_names_out(features)
    
    # Create DataFrame with polynomial features (reuse df's index so the copied columns align)
    df_poly = pd.DataFrame(poly_features, columns=feature_names, index=df.index)
    
    # Add non-polynomial columns
    for col in df.columns:
        if col not in features:
            df_poly[col] = df[col]
    
    return df_poly

# Select features for polynomial transformation
poly_features = df.corr()['PRICE'].abs().sort_values(ascending=False)[1:4].index

# Create polynomial features
df_with_poly = create_polynomial_features(df_with_interactions, poly_features)

# Analyze new polynomial features
new_poly_features = [col for col in df_with_poly.columns 
                    if col not in df_with_interactions.columns]

# Calculate correlations with price for polynomial features
poly_correlations = df_with_poly[new_poly_features + ['PRICE']].corr()['PRICE']
print("Correlations of polynomial features with price:")
display(poly_correlations.sort_values(ascending=False))

# Visualize top polynomial features
top_poly_features = poly_correlations.abs().sort_values(ascending=False).head(4).index

plt.figure(figsize=(15, 4))
for idx, feature in enumerate(top_poly_features, 1):
    plt.subplot(1, 4, idx)
    plt.scatter(df_with_poly[feature], 
               df_with_poly['PRICE'], 
               alpha=0.5)
    plt.xlabel(feature)
    plt.ylabel('PRICE')
    plt.title(f'Price vs {feature}')
plt.tight_layout()
plt.show()

### Feature Selection

Score features with three methods (absolute correlation, F-regression, and mutual information) and keep those that rank highly under at least two:

In [None]:
from sklearn.feature_selection import SelectKBest, f_regression, mutual_info_regression

def select_features(X, y, n_features=20):
    """Select top features using multiple methods."""
    # Correlation based selection
    correlations = pd.DataFrame({
        'feature': X.columns,
        'correlation': X.corrwith(y).abs()
    }).sort_values('correlation', ascending=False)
    
    # F-regression based selection
    f_selector = SelectKBest(f_regression, k=n_features)
    f_selector.fit(X, y)
    f_scores = pd.DataFrame({
        'feature': X.columns,
        'f_score': f_selector.scores_
    }).sort_values('f_score', ascending=False)
    
    # Mutual information based selection
    mi_selector = SelectKBest(mutual_info_regression, k=n_features)
    mi_selector.fit(X, y)
    mi_scores = pd.DataFrame({
        'feature': X.columns,
        'mi_score': mi_selector.scores_
    }).sort_values('mi_score', ascending=False)
    
    return correlations, f_scores, mi_scores

# Prepare data for feature selection
X = df_with_poly.drop('PRICE', axis=1)
y = df_with_poly['PRICE']

# Select features
corr_features, f_features, mi_features = select_features(X, y)

# Plot feature importance scores
plt.figure(figsize=(15, 5))

# Correlation based importance
plt.subplot(1, 3, 1)
sns.barplot(data=corr_features.head(10), x='correlation', y='feature')
plt.title('Top Features by Correlation')

# F-score based importance
plt.subplot(1, 3, 2)
sns.barplot(data=f_features.head(10), x='f_score', y='feature')
plt.title('Top Features by F-Score')

# Mutual information based importance
plt.subplot(1, 3, 3)
sns.barplot(data=mi_features.head(10), x='mi_score', y='feature')
plt.title('Top Features by Mutual Information')

plt.tight_layout()
plt.show()

# Select final features that appear in top 10 of at least 2 methods
top_features_sets = [
    set(corr_features.head(10)['feature']),
    set(f_features.head(10)['feature']),
    set(mi_features.head(10)['feature'])
]

# Build an ordered list (following X's column order) so the result is reproducible across runs
final_features = [
    feature for feature in X.columns
    if sum(feature in feature_set for feature_set in top_features_sets) >= 2
]

print("\nSelected features:")
print(final_features)
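
The scores above are computed on the full dataset, before the train/test split, so the held-out rows influence which features are kept. One way to avoid this is to put the selector inside a scikit-learn `Pipeline`, so it is refit on each training fold. The sketch below uses synthetic data from `make_regression` and an arbitrary `k=10`; both are assumptions for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression problem (illustrative stand-in for the engineered feature matrix)
X_demo, y_demo = make_regression(
    n_samples=300, n_features=30, n_informative=5, noise=10.0, random_state=0
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_regression, k=10)),  # refit inside every CV fold
    ("model", LinearRegression()),
])

scores = cross_val_score(pipe, X_demo, y_demo, cv=5, scoring="r2")
print(f"R2 per fold: {scores.round(3)}, mean: {scores.mean():.3f}")
```

With this layout the selection step only ever sees the rows used for fitting, which gives a less optimistic performance estimate.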

### Feature Scaling

Scale the selected features for modeling:

In [None]:
from sklearn.preprocessing import StandardScaler, RobustScaler

# Prepare final feature set
X_final = df_with_poly[list(final_features)]
y_final = df_with_poly['PRICE']

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X_final, y_final, test_size=0.2, random_state=42
)

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Convert to DataFrame to maintain feature names and the original row index
X_train_scaled = pd.DataFrame(X_train_scaled, columns=X_train.columns, index=X_train.index)
X_test_scaled = pd.DataFrame(X_test_scaled, columns=X_test.columns, index=X_test.index)

# Display scaling results
print("Scaled features statistics:")
display(X_train_scaled.describe())

# Save preprocessed data
preprocessed_data = {
    'X_train': X_train_scaled,
    'X_test': X_test_scaled,
    'y_train': y_train,
    'y_test': y_test,
    'scaler': scaler,
    'feature_names': list(final_features)
}
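
`RobustScaler` is imported above but not used. As a hedged illustration of when it might be preferred, the sketch below compares it with `StandardScaler` on a toy column containing one extreme value (the numbers are made up for the example):

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

# Four ordinary values and one extreme outlier (illustrative only)
col = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

standard = StandardScaler().fit_transform(col).ravel()  # centers on mean, scales by std
robust = RobustScaler().fit_transform(col).ravel()      # centers on median, scales by IQR

print("StandardScaler:", standard.round(4))
print("RobustScaler:  ", robust.round(4))
```

Under `StandardScaler` the outlier inflates the standard deviation, so the four ordinary values end up almost indistinguishable; `RobustScaler` keeps them spread out because the median and IQR are barely affected by the extreme value.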

### Preprocessing Summary

1. **Feature Transformations**:
   - Handled skewed features
   - Created interaction features
   - Added polynomial terms

2. **Feature Selection**:
   - Used multiple selection methods
   - Combined results for robust selection
   - Identified most important features

3. **Data Preparation**:
   - Scaled features appropriately
   - Split data for modeling
   - Preserved feature names

Next steps:
1. Implement regression models
2. Compare model performance
3. Fine-tune best models

## Model Implementation

We'll implement several regression models:
1. Linear Regression (baseline)
2. Ridge and Lasso Regression
3. Random Forest
4. XGBoost
5. LightGBM

For each model, we'll:
- Train with default or near-default parameters (no tuning yet)
- Make predictions
- Calculate regression metrics
- Analyze residuals

In [None]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import cross_val_score

def evaluate_regression_model(y_true, y_pred, model_name):
    """Calculate and display regression metrics."""
    mse = mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    
    print(f"{model_name} Performance:")
    print(f"RMSE: {rmse:.4f}")
    print(f"MAE: {mae:.4f}")
    print(f"R2 Score: {r2:.4f}")
    
    # Plot actual vs predicted
    plt.figure(figsize=(15, 5))
    
    # Actual vs Predicted
    plt.subplot(1, 2, 1)
    plt.scatter(y_true, y_pred, alpha=0.5)
    plt.plot([y_true.min(), y_true.max()], [y_true.min(), y_true.max()], 'r--', lw=2)
    plt.xlabel('Actual Price')
    plt.ylabel('Predicted Price')
    plt.title(f'{model_name}: Actual vs Predicted')
    
    # Residuals
    residuals = y_true - y_pred
    plt.subplot(1, 2, 2)
    plt.scatter(y_pred, residuals, alpha=0.5)
    plt.axhline(y=0, color='r', linestyle='--')
    plt.xlabel('Predicted Price')
    plt.ylabel('Residuals')
    plt.title('Residual Plot')
    
    plt.tight_layout()
    plt.show()
    
    return {
        'rmse': rmse,
        'mae': mae,
        'r2': r2
    }
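
For reference, the three metrics reported by `evaluate_regression_model` can be reproduced by hand. The sketch below uses four made-up values to check the formulas RMSE = sqrt(mean((y - ŷ)²)), MAE = mean(|y - ŷ|), and R² = 1 - SS_res / SS_tot against scikit-learn:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy values (made up for this example)
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 5.0])

errors = y_true - y_pred
rmse_manual = np.sqrt(np.mean(errors ** 2))
mae_manual = np.mean(np.abs(errors))
ss_res = np.sum(errors ** 2)                       # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
r2_manual = 1 - ss_res / ss_tot

print(f"RMSE: {rmse_manual:.4f}, MAE: {mae_manual:.4f}, R2: {r2_manual:.4f}")

# The hand-computed values agree with scikit-learn
assert np.isclose(rmse_manual, np.sqrt(mean_squared_error(y_true, y_pred)))
assert np.isclose(mae_manual, mean_absolute_error(y_true, y_pred))
assert np.isclose(r2_manual, r2_score(y_true, y_pred))
```

Because the target is in units of $100,000, an RMSE of 0.5 on the housing data would correspond to a typical error of roughly $50,000.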

### Linear Regression (Baseline)

Implement basic linear regression as a baseline model:

In [None]:
from sklearn.linear_model import LinearRegression

# Train linear regression
lr_model = LinearRegression()
lr_model.fit(X_train_scaled, y_train)

# Make predictions
lr_pred = lr_model.predict(X_test_scaled)

# Evaluate model
lr_metrics = evaluate_regression_model(y_test, lr_pred, 'Linear Regression')

# Feature importance
lr_importance = pd.DataFrame({
    'feature': X_train_scaled.columns,
    'coefficient': abs(lr_model.coef_)
}).sort_values('coefficient', ascending=False)

plt.figure(figsize=(10, 6))
sns.barplot(data=lr_importance.head(10), x='coefficient', y='feature')
plt.title('Top 10 Features (Linear Regression)')
plt.show()

### Ridge and Lasso Regression

Implement regularized regression models:

In [None]:
from sklearn.linear_model import Ridge, Lasso

# Ridge Regression
ridge_model = Ridge(alpha=1.0)
ridge_model.fit(X_train_scaled, y_train)
ridge_pred = ridge_model.predict(X_test_scaled)
ridge_metrics = evaluate_regression_model(y_test, ridge_pred, 'Ridge Regression')

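# Lasso Regression
# With the target in units of $100k and standardized features, the default alpha=1.0
# is likely to shrink every coefficient to zero, so a smaller penalty is used here
lasso_model = Lasso(alpha=0.01)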
lasso_model.fit(X_train_scaled, y_train)
lasso_pred = lasso_model.predict(X_test_scaled)
lasso_metrics = evaluate_regression_model(y_test, lasso_pred, 'Lasso Regression')

# Compare feature importance
plt.figure(figsize=(15, 5))

# Ridge coefficients
plt.subplot(1, 2, 1)
ridge_importance = pd.DataFrame({
    'feature': X_train_scaled.columns,
    'coefficient': abs(ridge_model.coef_)
}).sort_values('coefficient', ascending=False)
sns.barplot(data=ridge_importance.head(10), x='coefficient', y='feature')
plt.title('Top 10 Features (Ridge)')

# Lasso coefficients
plt.subplot(1, 2, 2)
lasso_importance = pd.DataFrame({
    'feature': X_train_scaled.columns,
    'coefficient': abs(lasso_model.coef_)
}).sort_values('coefficient', ascending=False)
sns.barplot(data=lasso_importance.head(10), x='coefficient', y='feature')
plt.title('Top 10 Features (Lasso)')

plt.tight_layout()
plt.show()
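
The penalty strength `alpha` matters a great deal for Lasso. With standardized features and a target of roughly unit variance, `alpha=1.0` exceeds the largest feature-target correlation, which removes every feature from the model. The sketch below shows this on synthetic data and uses `LassoCV` to choose `alpha` by cross-validation; the data and all settings are assumptions for illustration:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic problem, rescaled so features and target both have unit variance (illustrative only)
X_demo, y_demo = make_regression(
    n_samples=400, n_features=10, n_informative=4, noise=5.0, random_state=0
)
X_demo = StandardScaler().fit_transform(X_demo)
y_demo = (y_demo - y_demo.mean()) / y_demo.std()

# A penalty of 1.0 is strong enough here to zero out every coefficient
strong = Lasso(alpha=1.0).fit(X_demo, y_demo)
print("Non-zero coefficients at alpha=1.0:", np.count_nonzero(strong.coef_))

# LassoCV searches a grid of alphas with 5-fold cross-validation
cv_model = LassoCV(cv=5, random_state=0).fit(X_demo, y_demo)
print(f"Alpha chosen by CV: {cv_model.alpha_:.4f}")
print("Non-zero coefficients at chosen alpha:", np.count_nonzero(cv_model.coef_))
```

`RidgeCV` offers the analogous search for Ridge regression.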

### Random Forest Regression

Implement a Random Forest model:

In [None]:
from sklearn.ensemble import RandomForestRegressor

# Train Random Forest
rf_model = RandomForestRegressor(
    n_estimators=100,
    max_depth=None,
    min_samples_split=2,
    random_state=42
)

rf_model.fit(X_train_scaled, y_train)
rf_pred = rf_model.predict(X_test_scaled)
rf_metrics = evaluate_regression_model(y_test, rf_pred, 'Random Forest')

# Feature importance
rf_importance = pd.DataFrame({
    'feature': X_train_scaled.columns,
    'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)

plt.figure(figsize=(10, 6))
sns.barplot(data=rf_importance.head(10), x='importance', y='feature')
plt.title('Top 10 Features (Random Forest)')
plt.show()

### XGBoost Regression

Implement an XGBoost model:

In [None]:
from xgboost import XGBRegressor

# Train XGBoost
xgb_model = XGBRegressor(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=6,
    random_state=42
)

xgb_model.fit(X_train_scaled, y_train)
xgb_pred = xgb_model.predict(X_test_scaled)
xgb_metrics = evaluate_regression_model(y_test, xgb_pred, 'XGBoost')

# Feature importance
xgb_importance = pd.DataFrame({
    'feature': X_train_scaled.columns,
    'importance': xgb_model.feature_importances_
}).sort_values('importance', ascending=False)

plt.figure(figsize=(10, 6))
sns.barplot(data=xgb_importance.head(10), x='importance', y='feature')
plt.title('Top 10 Features (XGBoost)')
plt.show()

### LightGBM Regression

Implement a LightGBM model:

In [None]:
from lightgbm import LGBMRegressor

# Train LightGBM
lgb_model = LGBMRegressor(
    n_estimators=100,
    learning_rate=0.1,
    num_leaves=31,
    random_state=42
)

lgb_model.fit(X_train_scaled, y_train)
lgb_pred = lgb_model.predict(X_test_scaled)
lgb_metrics = evaluate_regression_model(y_test, lgb_pred, 'LightGBM')

# Feature importance
lgb_importance = pd.DataFrame({
    'feature': X_train_scaled.columns,
    'importance': lgb_model.feature_importances_
}).sort_values('importance', ascending=False)

plt.figure(figsize=(10, 6))
sns.barplot(data=lgb_importance.head(10), x='importance', y='feature')
plt.title('Top 10 Features (LightGBM)')
plt.show()

### Model Comparison

In [None]:
# Collect all results
models = {
    'Linear Regression': lr_metrics,
    'Ridge': ridge_metrics,
    'Lasso': lasso_metrics,
    'Random Forest': rf_metrics,
    'XGBoost': xgb_metrics,
    'LightGBM': lgb_metrics
}

# Create comparison DataFrame
comparison_df = pd.DataFrame(models).T
print("Model Comparison:")
display(comparison_df)

# Plot metrics comparison
plt.figure(figsize=(15, 5))

# RMSE
plt.subplot(1, 3, 1)
sns.barplot(data=comparison_df.reset_index(), x='index', y='rmse')
plt.xticks(rotation=45)
plt.title('RMSE by Model')

# MAE
plt.subplot(1, 3, 2)
sns.barplot(data=comparison_df.reset_index(), x='index', y='mae')
plt.xticks(rotation=45)
plt.title('MAE by Model')

# R2
plt.subplot(1, 3, 3)
sns.barplot(data=comparison_df.reset_index(), x='index', y='r2')
plt.xticks(rotation=45)
plt.title('R² Score by Model')

plt.tight_layout()
plt.show()
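
The comparison above rests on a single train/test split. A sketch of a more stable comparison uses `cross_val_score` (imported earlier but not yet used) to report the mean and spread of RMSE across folds. The synthetic data and the reduced model list here are assumptions chosen to keep the example fast:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic linear problem (illustrative stand-in for X_train_scaled / y_train)
X_demo, y_demo = make_regression(n_samples=200, n_features=8, noise=15.0, random_state=0)

candidates = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=42),
}

cv = KFold(n_splits=5, shuffle=True, random_state=42)
rows = {}
for name, model in candidates.items():
    # scikit-learn returns negated RMSE so that higher is better; flip the sign back
    fold_rmse = -cross_val_score(
        model, X_demo, y_demo, cv=cv, scoring="neg_root_mean_squared_error"
    )
    rows[name] = {"rmse_mean": fold_rmse.mean(), "rmse_std": fold_rmse.std()}

cv_summary = pd.DataFrame(rows).T.sort_values("rmse_mean")
print(cv_summary)
```

On this synthetic linear data the linear models come out ahead; the ranking on the housing data may differ, which is the point of running the comparison.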

### Model Implementation Summary

1. **Model Performance**:
   - Best performing model: [based on results]
   - Performance metrics comparison
   - Residual analysis insights

2. **Feature Importance**:
   - Consistent important features across models
   - Different importance rankings
   - Feature selection validation

3. **Model Characteristics**:
   - Linear vs non-linear performance
   - Regularization effects
   - Ensemble method benefits

Next steps:
1. Hyperparameter tuning
2. Model stacking/ensembling
3. Cross-validation analysis
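
As a starting point for the hyperparameter tuning step, the following sketch runs a small grid search over the Ridge penalty with `GridSearchCV`. The synthetic data and the grid values are assumptions for illustration; the same pattern applies to the tree-based models with their own parameter names:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative stand-in for the training split)
X_demo, y_demo = make_regression(n_samples=300, n_features=10, noise=20.0, random_state=0)

# Scaling lives inside the pipeline so it is refit on each training fold
pipe = make_pipeline(StandardScaler(), Ridge())

# make_pipeline names each step after its lowercased class, hence the "ridge__" prefix
param_grid = {"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_root_mean_squared_error")
search.fit(X_demo, y_demo)

print("Best alpha:", search.best_params_["ridge__alpha"])
print(f"Best cross-validated RMSE: {-search.best_score_:.3f}")
```

For larger search spaces, `RandomizedSearchCV` samples a fixed number of parameter combinations instead of trying every one.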