# Assignment 2: Regression, Multi-class, and Hyper-parameter Tuning

### Task 1: Regression Metrics (30 points total)

The code below executes the following steps:
* Load the California Housing dataset from sklearn.
* Split the dataset into training and testing sets.
* Train a linear regression model on the training data.

It is your task to:
* Evaluate the model's performance using Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared metrics.
* Print the evaluation results.
* Interpret the results and discuss how each metric reflects the performance of a regression model.

In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Load the California Housing dataset
housing = fetch_california_housing()
X, y = housing.data, housing.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Absolute Error (MAE): {mae:.4f}")
print(f"Mean Squared Error (MSE): {mse:.4f}")
print(f"R-squared (R²): {r2:.4f}")

Mean Absolute Error (MAE): 0.5332
Mean Squared Error (MSE): 0.5559
R-squared (R²): 0.5758


**Question: Interpret the results. How might we interpret the model performance and communicate it to stakeholders? (20 points)**

*Your Answer:*

MAE is the average absolute difference between predicted and actual values. The target is the median house value in units of $100,000, so an MAE of 0.53 means that, on average, the predicted value was off by about $53,000. MSE is the average of the squared differences between predicted and actual values. Because the differences are squared, it penalizes large errors more heavily, and its units are squared as well; its square root (RMSE), about 0.75 or roughly $75,000, is on the same scale as the target. R-squared is different: it measures the proportion of the variance in the target that the model explains. The closer R-squared is to 1, the better the model fits. On this test set, with the features used, the model explains about 58% of the variance in housing prices.
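
To make the three definitions concrete, here is a minimal sketch that computes MAE, MSE, RMSE, and R-squared by hand with NumPy and checks them against scikit-learn. The four values in `y_true_toy` and `y_pred_toy` are hypothetical numbers chosen for easy arithmetic (in the dataset's units of $100,000); they are not taken from the model above.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical toy values, in units of $100,000 (not from the notebook's model)
y_true_toy = np.array([2.0, 1.5, 3.0, 4.5])
y_pred_toy = np.array([2.5, 1.0, 2.0, 4.5])

errors = y_true_toy - y_pred_toy

# MAE: mean of the absolute errors
mae_manual = np.mean(np.abs(errors))

# MSE: mean of the squared errors; RMSE puts it back on the target's scale
mse_manual = np.mean(errors ** 2)
rmse_manual = np.sqrt(mse_manual)

# R-squared: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true_toy - y_true_toy.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(f"MAE:  {mae_manual:.4f}")
print(f"MSE:  {mse_manual:.4f}")
print(f"RMSE: {rmse_manual:.4f}")
print(f"R²:   {r2_manual:.4f}")
```

The single error of 1.0 contributes 1.0 of the 1.5 total squared error but only half of the total absolute error, which is why MSE reacts more strongly to large misses than MAE does.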

### Task 2: Multiclass Classification Metrics (30 points total)

The code below executes the following steps:
* Load the Iris dataset from scikit-learn.
* Split the dataset into training and testing sets.
* Train a multiclass classification model, logistic regression, on the training data.

It is your task to:
* Evaluate the model's performance using precision, recall, and F1 score.
* Visualize a confusion matrix.
* Print the evaluation results.
* Interpret the results and discuss how each metric reflects the performance of a classification model.

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = model.predict(X_test)

# Evaluate model performance

precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
conf_matrix = confusion_matrix(y_test, y_pred)

print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1 Score: {f1:.4f}")
print("Confusion Matrix:")
print(conf_matrix)

Precision: 1.0000
Recall: 1.0000
F1 Score: 1.0000
Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]


**Question: Interpret the results. How might we interpret the model performance and communicate it to stakeholders? (20 points)**


*Your Answer:*
According to the numbers above, the model performs very well. It makes little to no incorrect predictions, little to no misclassified species, and g
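
One way to see where the precision and recall figures come from is to derive them per class from the confusion matrix itself. The sketch below assumes scikit-learn's convention (rows are true classes, columns are predicted classes). `per_class_metrics` is a hypothetical helper written for this illustration, and `one_error_matrix` is a made-up matrix with a single misclassification, included only to show how the numbers move.

```python
import numpy as np

def per_class_metrics(conf_matrix):
    """Return per-class precision and recall (hypothetical helper).

    Assumes rows are true classes and columns are predicted classes.
    """
    conf_matrix = np.asarray(conf_matrix)
    true_positives = np.diag(conf_matrix)
    precision = true_positives / conf_matrix.sum(axis=0)  # divide by column totals
    recall = true_positives / conf_matrix.sum(axis=1)     # divide by row totals
    return precision, recall

# Confusion matrix printed in the notebook output above
notebook_matrix = [[10, 0, 0],
                   [0, 9, 0],
                   [0, 0, 11]]
precision_nb, recall_nb = per_class_metrics(notebook_matrix)
print("Notebook matrix  precision:", precision_nb, "recall:", recall_nb)

# Made-up matrix: one class-1 flower predicted as class 2
one_error_matrix = [[10, 0, 0],
                    [0, 8, 1],
                    [0, 0, 11]]
precision_err, recall_err = per_class_metrics(one_error_matrix)
print("One-error matrix precision:", precision_err, "recall:", recall_err)
```

In the made-up case, the single mistake lowers recall for class 1 (8 of 9 found) and precision for class 2 (11 of 12 predictions correct), while class 0 is unaffected. With only 30 test samples, each misclassification moves accuracy by about 3.3 percentage points.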

### Task 3: Model Selection, Hyperparameter Tuning, and Cross-Validation (40 points total)
The code below executes the following steps:
* Load in the Iris dataset.
* Split into training and testing sets.

It is your task to:
* Implement a grid search with cross-validation to tune hyperparameters for a classification model (e.g. random forest).
* Explore different hyperparameters (e.g. number of estimators for random forest).
* Evaluate the model's performance using accuracy, precision, recall, and F1 score on the testing set.
* Print the **best hyperparameters** and evaluation results.

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report
from sklearn.model_selection import cross_val_score

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the parameter grid for the RandomForestClassifier
param_grid = {
    'n_estimators': [50, 100, 200],  # Number of trees in the forest
    'max_depth': [None, 10, 20],     # Maximum depth of each tree
    'min_samples_split': [2, 5, 10], # Minimum samples required to split an internal node
    'min_samples_leaf': [1, 2, 4]     # Minimum samples required to be at a leaf node
}

# Create a RandomForestClassifier
rf_classifier = RandomForestClassifier(random_state=42)

# Perform GridSearchCV with cross-validation
grid_search = GridSearchCV(estimator=rf_classifier, param_grid=param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)


# Get the best model and its hyperparameters
best_rf = grid_search.best_estimator_
best_params = grid_search.best_params_

# Make predictions on the test set
y_pred = best_rf.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

# Print the results
print("Best Hyperparameters:", best_params)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)
print(classification_report(y_test, y_pred))

# Optional: evaluate using cross-validation on the entire dataset to get a more robust estimate.
cv_scores = cross_val_score(best_rf, X, y, cv=5, scoring='accuracy')
print("\nCross-validation scores:", cv_scores)
print("Mean cross-validation accuracy:", cv_scores.mean())




Best Hyperparameters: {'max_depth': None, 'min_samples_leaf': 2, 'min_samples_split': 2, 'n_estimators': 200}
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1 Score: 1.0
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30


Cross-validation scores: [0.96666667 0.96666667 0.93333333 0.9        1.        ]
Mean cross-validation accuracy: 0.9533333333333334
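
The perfect test score rests on only 30 samples, while the cross-validated accuracy is about 0.95, so it can help to look at how close the candidate settings were to each other during the search. The sketch below reads the `cv_results_` attribute of a fitted `GridSearchCV`. The reduced grid `small_grid` is an assumption made to keep the example fast; it is not the grid used above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X_iris, y_iris = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_iris, y_iris, test_size=0.2, random_state=42
)

# Reduced grid (an assumption for speed, not the notebook's full grid)
small_grid = {"n_estimators": [10, 50], "max_depth": [None, 3]}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid=small_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_tr, y_tr)

# cv_results_ holds one entry per parameter combination
results = search.cv_results_
rows = sorted(
    zip(
        results["rank_test_score"],
        results["mean_test_score"],
        results["std_test_score"],
        results["params"],
    ),
    key=lambda row: row[0],
)
for rank, mean_score, std_score, params in rows:
    print(f"rank {rank}: {mean_score:.4f} ± {std_score:.4f}  {params}")
```

If the mean scores differ by less than their standard deviations, the "best" hyperparameters are not clearly better than the alternatives, which is worth saying when reporting results.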


### OPTIONAL Task 4: Custom Scoring Metric (20 bonus points)

In sklearn, you are not limited to using their scoring functions. You can create your own!

You can create a custom scoring metric in scikit-learn by defining a scoring function and then using the `make_scorer` function to wrap it as a scorer.

**For bonus points:**

* Define a custom scoring function `custom_scoring` that calculates the weighted sum of precision and recall for a binary classification problem.
* Then wrap this function using `make_scorer` to create a custom scorer `custom_scorer`.
* Use this custom scorer in cross-validation to evaluate the performance of a logistic regression model.
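
The steps above could be sketched as follows. This is one possible approach, not the assignment's reference solution: it assumes the positive class is labeled 1 (the default for `precision_score` and `recall_score`) and passes `zero_division=0` so that a fold with no positive predictions scores 0 rather than raising a warning.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score, recall_score
from sklearn.model_selection import cross_val_score

def custom_scoring(y_true, y_pred, precision_weight=0.6, recall_weight=0.4):
    # Weighted sum of binary precision and recall (positive class assumed to be 1)
    precision = precision_score(y_true, y_pred, zero_division=0)
    recall = recall_score(y_true, y_pred, zero_division=0)
    return precision_weight * precision + recall_weight * recall

# Wrap the function so scikit-learn can call it as scorer(estimator, X, y)
custom_scorer = make_scorer(custom_scoring)

# Same synthetic data settings as the test cell in this task
X_demo, y_demo = make_classification(n_samples=100, n_features=10, random_state=42)
demo_scores = cross_val_score(
    LogisticRegression(), X_demo, y_demo, cv=5, scoring=custom_scorer
)

print("Custom Scores:", demo_scores)
print("Mean Custom Score:", demo_scores.mean())
```

Because the two weights sum to 1, the score stays between 0 and 1. Other weights can be supplied through `make_scorer(custom_scoring, precision_weight=0.8, recall_weight=0.2)`, since `make_scorer` forwards extra keyword arguments to the scoring function.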

In [None]:
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Define your custom scoring function
def custom_scoring(y_true, y_pred, precision_weight=0.6, recall_weight=0.4):
    # YOUR CODE HERE

    return score

# Wrap the custom scoring function as a scorer
# YOUR CODE HERE



In [None]:
# THIS CODE TESTS YOUR FUNCTION
# Generate sample data
X, y = make_classification(n_samples=100, n_features=10, random_state=42)

# Create and train a model using cross-validation
model = LogisticRegression()
scores = cross_val_score(model, X, y, cv=5, scoring=custom_scorer)

# Print the custom scores obtained from cross-validation
print("Custom Scores:", scores)
print("Mean Custom Score:", scores.mean())