# Regression Models

This notebook fits several linear regression models to the training data and compares their performance on a held-out validation set. The models considered are Linear Regression, Ridge Regression, and Lasso Regression.

In [1]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_squared_error, r2_score
import warnings

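# Silence library warnings to keep cell output short.
# Note: this also hides scikit-learn's ConvergenceWarning, which Lasso can emit.
warnings.filterwarnings('ignore')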

In [2]:
# Load the training data
train_data = pd.read_csv('../data/train.csv')
train_data.head()

In [3]:
# Prepare the data for modeling
X = train_data.drop('target', axis=1)  # Replace 'target' with the actual target column name
y = train_data['target']  # Replace 'target' with the actual target column name

# Split the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
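
scikit-learn's linear models expect a fully numeric feature matrix with no missing values, so it can help to inspect `X` before fitting. The following is a minimal sketch under that assumption; the helper `check_features` and the toy columns (`area`, `rooms`, `city`) are hypothetical and do not come from `train.csv`.

```python
import pandas as pd


def check_features(X: pd.DataFrame) -> dict:
    """Report columns that linear models cannot consume without preprocessing."""
    # Columns whose dtype is not numeric (for example strings or categories)
    non_numeric = X.select_dtypes(exclude="number").columns.tolist()
    # Columns that contain at least one missing value
    missing = X.columns[X.isna().any()].tolist()
    return {"non_numeric": non_numeric, "missing": missing}


# Toy frame standing in for the real feature matrix (hypothetical columns)
demo = pd.DataFrame({
    "area": [50.0, 72.5, None],
    "rooms": [2, 3, 4],
    "city": ["A", "B", "A"],
})

report = check_features(demo)
print(report)
```

If either list is non-empty for the real data, the affected columns would need encoding or imputation before the models in the next cell can be fitted.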

In [4]:
# Initialize models
models = {
    'Linear Regression': LinearRegression(),
    'Ridge Regression': Ridge(),  # default regularization strength: alpha=1.0
    'Lasso Regression': Lasso()   # default regularization strength: alpha=1.0
}

# Fit each model on the training split and score it on the validation split
results = {}
for model_name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    mse = mean_squared_error(y_val, y_pred)
    r2 = r2_score(y_val, y_pred)
    results[model_name] = {'MSE': mse, 'R2': r2}

# Display results
results_df = pd.DataFrame(results).T
results_df
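
The comparison above uses one train/validation split and unscaled features. Ridge and Lasso penalize coefficient size, so their results depend on feature scale, and a single split can be noisy. The sketch below shows one possible refinement, not the method used in this notebook: it standardizes features inside a pipeline and averages MSE over 5-fold cross-validation. It runs on synthetic data from `make_regression`, and the names `X_demo`, `y_demo`, and `cv_mse` are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real training data (assumption, not train.csv)
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, noise=10.0, random_state=42
)

cv_mse = {}
for name, estimator in {"Ridge": Ridge(alpha=1.0), "Lasso": Lasso(alpha=1.0)}.items():
    # Scaling inside the pipeline is refit on each training fold,
    # so no information from the held-out fold leaks into the scaler.
    pipeline = make_pipeline(StandardScaler(), estimator)
    scores = cross_val_score(
        pipeline, X_demo, y_demo, cv=5, scoring="neg_mean_squared_error"
    )
    # scikit-learn returns negated MSE so that higher is better; flip the sign back
    cv_mse[name] = -scores.mean()

print(cv_mse)
```

The same loop could be pointed at `X` and `y` from this notebook once the features are numeric and complete.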

## Conclusion

The model with the lowest validation MSE is the strongest candidate; because both metrics are computed on the same validation set, it is also the model with the highest R2. The next steps are to select that model and prepare it for deployment on AWS SageMaker.
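
Selecting the best model can be done programmatically from a table shaped like `results_df`. The sketch below assumes that layout (one row per model, with `MSE` and `R2` columns); the scores in it are placeholders for illustration only and are not results produced by this notebook.

```python
import pandas as pd

# Placeholder scores for illustration only, NOT results from this notebook
example_results = {
    "Linear Regression": {"MSE": 4.0, "R2": 0.80},
    "Ridge Regression": {"MSE": 3.5, "R2": 0.83},
    "Lasso Regression": {"MSE": 5.0, "R2": 0.75},
}

# Same layout as results_df: one row per model, one column per metric
example_df = pd.DataFrame(example_results).T

# Rank models from lowest to highest validation error
ranked = example_df.sort_values("MSE")
best_model_name = ranked.index[0]
print(best_model_name)
```

With the real `results_df`, the resulting name can be used to look up the fitted estimator in the `models` dictionary.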