
In [1]:
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_squared_error

# Load and split the dataset
digits = load_digits()
X, y = digits.data, digits.target

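# Normalize features
# NOTE: the scaler is fit on the full dataset before splitting, so validation and
# test statistics leak into preprocessing; fit on X_train only for a stricter evaluation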
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split into training (70%), validation (15%), and test sets (15%)
X_train, X_temp, y_train, y_temp = train_test_split(X_scaled, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Initialize models
linear_model = LinearRegression()
ridge_model = Ridge()
lasso_model = Lasso(max_iter=10000)

# Train and evaluate Linear Regression
linear_model.fit(X_train, y_train)
y_val_pred_linear = linear_model.predict(X_val)
mse_linear = mean_squared_error(y_val, y_val_pred_linear)

# Hyperparameter tuning for Ridge Regression
ridge_params = {'alpha': [0.1, 1, 10, 100]}
ridge_grid = GridSearchCV(ridge_model, ridge_params, scoring='neg_mean_squared_error', cv=5)
ridge_grid.fit(X_train, y_train)
best_ridge = ridge_grid.best_estimator_
mse_ridge = mean_squared_error(y_val, best_ridge.predict(X_val))

# Hyperparameter tuning for Lasso Regression
lasso_params = {'alpha': [0.01, 0.1, 1, 10]}
lasso_grid = GridSearchCV(lasso_model, lasso_params, scoring='neg_mean_squared_error', cv=5)
lasso_grid.fit(X_train, y_train)
best_lasso = lasso_grid.best_estimator_
mse_lasso = mean_squared_error(y_val, best_lasso.predict(X_val))

# Report Validation MSE
print(f"Validation MSE - Linear Regression: {mse_linear}")
print(f"Validation MSE - Best Ridge: {mse_ridge} (alpha={ridge_grid.best_params_['alpha']})")
print(f"Validation MSE - Best Lasso: {mse_lasso} (alpha={lasso_grid.best_params_['alpha']})")


Validation MSE - Linear Regression: 3.141272639031655
Validation MSE - Best Ridge: 3.1274636794376445 (alpha=10)
Validation MSE - Best Lasso: 3.103793673920913 (alpha=0.01)


In [2]:
# Evaluate the tuned Ridge model on the test set
# (Lasso had a slightly lower validation MSE, so Ridge is not the validation winner)
y_test_pred = best_ridge.predict(X_test)
test_mse = mean_squared_error(y_test, y_test_pred)
print(f"Test MSE for Best Ridge Model: {test_mse}")

Test MSE for Best Ridge Model: 3.535971902972227


Model Selection Process:
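Linear Regression, Ridge Regression, and Lasso Regression were trained with scikit-learn, treating the digit label (0 to 9) as a numeric regression target. Ridge and Lasso were tuned over their 𝛼 parameter using GridSearchCV with 5-fold cross-validation on the training set (𝛼 ∈ {0.1, 1, 10, 100} for Ridge and 𝛼 ∈ {0.01, 0.1, 1, 10} for Lasso). The tuned models were then compared by MSE on the held-out validation set, and the tuned Ridge model was evaluated on the test set.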
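
The selection procedure described above can be sketched as follows. This is a minimal sketch rather than the notebook's exact code: it assumes the scaler is placed inside a Pipeline so that it is refit on the training portion of every fold, and the names `candidates`, `val_mse`, and `best_name` are illustrative. Because of that change, the numbers it prints may differ slightly from the outputs recorded above.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same 70/15/15 split as the notebook, but on unscaled features
X, y = load_digits(return_X_y=True)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Assumption: scaling lives inside each pipeline, so it only ever sees training data
candidates = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "ridge": GridSearchCV(
        make_pipeline(StandardScaler(), Ridge()),
        {"ridge__alpha": [0.1, 1, 10, 100]},
        scoring="neg_mean_squared_error",
        cv=5,
    ),
    "lasso": GridSearchCV(
        make_pipeline(StandardScaler(), Lasso(max_iter=10000)),
        {"lasso__alpha": [0.01, 0.1, 1, 10]},
        scoring="neg_mean_squared_error",
        cv=5,
    ),
}

# Fit every candidate on the training set and score it on the validation set
val_mse = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    val_mse[name] = mean_squared_error(y_val, model.predict(X_val))

# Pick the candidate with the lowest validation MSE, then touch the test set once
best_name = min(val_mse, key=val_mse.get)
test_mse = mean_squared_error(y_test, candidates[best_name].predict(X_test))

print(val_mse)
print(f"Selected: {best_name}, test MSE: {test_mse}")
```

Selecting programmatically with `min` keeps the test-set evaluation tied to whichever model actually had the lowest validation error.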

Performance and Errors:
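Validation MSE was nearly identical across the three models: 3.141 for Linear Regression, 3.127 for Ridge (𝛼 = 10), and 3.104 for Lasso (𝛼 = 0.01). Lasso, not Ridge, therefore had the lowest validation error. Only the tuned Ridge model was evaluated on the test set, where it reached an MSE of 3.536 (an RMSE of about 1.88 on labels ranging from 0 to 9). Errors may be more frequent for visually similar digits, such as 3 and 8, but this analysis did not break errors down by digit. More fundamentally, regressing on the label assumes the digits are numerically ordered with respect to pixel intensity, a relationship that a linear model is unlikely to capture.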
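
One way to check which digits drive the error is to round the regression outputs to the nearest label and tabulate them against the true labels. The sketch below is an assumption about how such a check could look, not part of the original analysis: it refits Ridge with 𝛼 = 10 using the notebook's preprocessing, and `per_digit_mse` and `cm` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import Ridge
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Mirror the notebook: scale all features, then split 70/15/15
X, y = load_digits(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)
X_train, X_temp, y_train, y_temp = train_test_split(X_scaled, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

model = Ridge(alpha=10).fit(X_train, y_train)
raw_pred = model.predict(X_val)

# Squared error of the raw regression output, grouped by true digit
per_digit_mse = {
    d: float(np.mean((raw_pred[y_val == d] - d) ** 2))
    for d in range(10)
    if np.any(y_val == d)
}

# Round to the nearest valid label to see which digits get confused with which
rounded = np.clip(np.rint(raw_pred), 0, 9).astype(int)
cm = confusion_matrix(y_val, rounded, labels=np.arange(10))

for d, mse in per_digit_mse.items():
    print(f"digit {d}: validation MSE {mse:.3f}")
print(cm)
```

Row `i`, column `j` of `cm` counts validation samples whose true digit is `i` and whose rounded prediction is `j`, so off-diagonal mass in rows 3 and 8 would support the visual-similarity hypothesis.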

Findings:

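Regularization in Ridge and Lasso lowered validation MSE only slightly (by roughly 0.4% and 1.2% relative to Linear Regression), so overfitting does not appear to be the main source of error.
Lasso (𝛼 = 0.01) edged out Ridge (𝛼 = 10) on the validation set, but the gap is too small to show a clear advantage for either sparse or small non-zero coefficients.
Linear Regression was only marginally worse than the regularized models.
All three models had an MSE above 3 on labels from 0 to 9, which suggests that non-linear methods, or treating the task as classification, could further improve performance.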
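
As a hedged illustration of the last point, the sketch below fits a non-linear classifier on the same split. The choice of an RBF-kernel `SVC` with default hyperparameters is an assumption made for illustration and was not part of the original experiments; the MSE of the predicted labels is reported only so the result can be set beside the regression numbers.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same 70/15/15 split as the notebook
X, y = load_digits(return_X_y=True)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Assumption: an RBF-kernel SVM classifier with default settings, scaled inside a pipeline
svc = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svc.fit(X_train, y_train)

val_pred = svc.predict(X_val)
val_accuracy = accuracy_score(y_val, val_pred)
# MSE between predicted and true labels, for rough comparison with the regressors
val_label_mse = mean_squared_error(y_val, val_pred)

print(f"Validation accuracy (SVC): {val_accuracy}")
print(f"Validation MSE of predicted labels (SVC): {val_label_mse}")
```

Accuracy is the more natural metric for a classifier, since MSE on labels penalizes confusing 1 with 9 far more than confusing 1 with 2 even though both are single misclassifications.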