# Lab 7: Neural Networks with Scikit-learn (Regression)

#### Objective:
In this lab, you will use Scikit-learn's Multi-layer Perceptron (MLP) to build a neural network model that predicts house prices using the **California Housing dataset**.

### Dataset Description:
The California Housing dataset is derived from the 1990 U.S. census and contains one row per census block group in California. It is a regression task where we predict the median house value of a block group (the target, expressed in units of $100,000) based on features such as:

- `MedInc`: Median income in the block group.
- `HouseAge`: Median age of houses in the block group.
- `AveRooms`: Average number of rooms per household.
- `AveOccup`: Average number of household members.
- `Latitude` and `Longitude`: Geographical location of the block group.

This dataset is commonly used for regression tasks because it provides a continuous target variable (house value). Its features cover both socioeconomic factors and geographic location, each of which may influence house prices.

### Load and explore the Dataset
Start by importing the necessary libraries and loading the dataset.

In [5]:
import pandas as pd
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

import matplotlib.pyplot as plt

import warnings
# Silence all warnings, including the ConvergenceWarning that MLPRegressor
# raises when training stops at max_iter before converging
warnings.filterwarnings('ignore')

# Load the dataset
data = fetch_california_housing()
X = data.data
y = data.target

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=41)

# Scale the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


### What is happening here?
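We're loading the dataset, splitting it into training and testing sets (80-20 split), and scaling the features to have zero mean and unit variance. The scaler is fitted on the training set only and then reused on the test set, so no information from the test set leaks into preprocessing. MLPs are sensitive to feature scale, so standardising the inputs typically helps gradient-based training converge faster.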
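To make the scaling step concrete, here is a minimal sketch on toy data. The arrays `X_toy_train` and `X_toy_test` are made-up values for illustration, not the housing data. It shows that `fit_transform` standardises the training set exactly, while `transform` applies the training mean and standard deviation to the test set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Toy features on very different scales (assumed values, not the housing data)
X_toy_train = rng.normal(loc=[3.0, 1000.0], scale=[1.5, 400.0], size=(200, 2))
X_toy_test = rng.normal(loc=[3.0, 1000.0], scale=[1.5, 400.0], size=(50, 2))

toy_scaler = StandardScaler()
X_toy_train_scaled = toy_scaler.fit_transform(X_toy_train)  # learns mean/std from the training data
X_toy_test_scaled = toy_scaler.transform(X_toy_test)        # reuses the training mean/std

train_means = X_toy_train_scaled.mean(axis=0)
train_stds = X_toy_train_scaled.std(axis=0)

# The same transformation written out by hand, using the fitted statistics
manual_test_scaled = (X_toy_test - toy_scaler.mean_) / toy_scaler.scale_

print("Scaled training means:", train_means)
print("Scaled training stds: ", train_stds)
print("Scaled test means:    ", X_toy_test_scaled.mean(axis=0))
```

The scaled test set will usually have a mean close to, but not exactly, zero, because its statistics were never used to fit the scaler.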

##### Build a Feed-forward Neural Network (MLP) using Scikit-learn:
Scikit-learn provides an easy way to define a neural network using `MLPRegressor`. This model will use a simple multi-layer perceptron (feed-forward network).

**Note:** The grid search below runs 120 cross-validation fits, so this cell takes some time to run.

In [8]:
# Define the model; max_iter=50 caps training at 50 epochs to keep the search fast
mlp = MLPRegressor(max_iter=50)

# Hyperparameter grid for tuning
param_grid = {
    'hidden_layer_sizes': [(5,), (10,), (15,)],
    'activation': ['tanh', 'relu'],
    'solver': ['adam', 'sgd'],
    'alpha': [0.0001, 0.001]
}

# Apply GridSearchCV for hyperparameter tuning
grid_search = GridSearchCV(mlp, param_grid, cv=5, scoring='neg_mean_squared_error', verbose=1)
grid_search.fit(X_train, y_train)

# Best model after grid search
best_model = grid_search.best_estimator_

# Evaluate the model on the test set
test_score = best_model.score(X_test, y_test)
print(f"Test set R^2: {test_score:.4f}")


Fitting 5 folds for each of 24 candidates, totalling 120 fits
Test set R^2: 0.7078


##### Steps:
- `MLPRegressor`: Creates a multi-layer perceptron regression model.
- Hyperparameter grid: Defines the search space for the following hyperparameters:
    - `hidden_layer_sizes`: Number of neurons in each hidden layer. Every candidate here has a single hidden layer with 5, 10, or 15 neurons.
    - `activation`: Activation function for the hidden layers (`tanh` or `relu`).
    - `solver`: The optimiser used to update the weights (`adam` or `sgd`).
    - `alpha`: Strength of the L2 regularisation penalty.
- `GridSearchCV`: Tries every combination in the grid using 5-fold cross-validation and keeps the combination with the best mean score (here, the lowest mean squared error).
- Model evaluation: We evaluate the best model on the held-out test set using R², which is what `best_model.score` returns for a regressor.
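The "24 candidates, totalling 120 fits" line in the output follows directly from the grid. As a small sketch, `ParameterGrid` can enumerate the combinations without training anything. The name `demo_grid` is introduced here only for illustration and mirrors the grid above.

```python
from sklearn.model_selection import ParameterGrid

# Same search space as the grid above, copied under a new name for illustration
demo_grid = {
    'hidden_layer_sizes': [(5,), (10,), (15,)],
    'activation': ['tanh', 'relu'],
    'solver': ['adam', 'sgd'],
    'alpha': [0.0001, 0.001],
}

n_candidates = len(ParameterGrid(demo_grid))  # 3 * 2 * 2 * 2 combinations
n_folds = 5
n_fits = n_candidates * n_folds

print(n_candidates, n_fits)  # 24 120
```

With the default `refit=True`, `GridSearchCV` then trains the winning combination once more on the full training set, which is the model exposed as `best_estimator_`.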

**R² (coefficient of determination):** A measure of how much of the variability of the target variable the model explains. The closer it is to 1, the better the model; a model that always predicts the mean of the target scores 0.
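As a worked example of this definition, the sketch below computes R² by hand as one minus the ratio of the residual sum of squares to the total sum of squares, and compares it with `sklearn.metrics.r2_score`. The values in `y_true` and `y_pred` are made up for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up targets and predictions, for illustration only
y_true = np.array([1.5, 2.0, 3.2, 4.1, 0.9])
y_pred = np.array([1.4, 2.3, 2.9, 4.0, 1.2])

ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares around the mean
r2_manual = 1.0 - ss_res / ss_tot

r2_sklearn = r2_score(y_true, y_pred)

# Always predicting the mean of the targets gives R² = 0
baseline_r2 = r2_score(y_true, np.full_like(y_true, y_true.mean()))

print(f"Manual R^2:   {r2_manual:.4f}")
print(f"sklearn R^2:  {r2_sklearn:.4f}")
print(f"Baseline R^2: {baseline_r2:.4f}")
```

Because the residual sum of squares can exceed the total sum of squares, R² can be negative for a model that performs worse than the mean baseline.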

### Implement Regularisation and Optimisation:
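Now, let's tune the **L2** regularisation strength (`alpha`, a ridge-style penalty on the weights) alongside the **Adam** and **SGD** solvers and the learning-rate schedule. Note that `learning_rate` only takes effect when `solver='sgd'`, so the `adam` candidates that differ only in `learning_rate` train the same configuration.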

In [11]:
param_grid = {
    'hidden_layer_sizes': [(5,), (10,), (15,)],
    'activation': ['relu'],
    'solver': ['adam', 'sgd'],
    'alpha': [0.0001, 0.001],  # L2 regularisation strength
    'learning_rate': ['constant', 'adaptive']
}

# Reuse the mlp estimator (max_iter=50) defined in the previous cell
grid_search = GridSearchCV(mlp, param_grid, cv=5, scoring='neg_mean_squared_error', verbose=1)
grid_search.fit(X_train, y_train)

best_model = grid_search.best_estimator_
test_score = best_model.score(X_test, y_test)
print(f"Test set R^2 with Regularisation: {test_score:.4f}")


Fitting 5 folds for each of 24 candidates, totalling 120 fits
Test set R^2 with Regularisation: 0.6912
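Both searches rank candidates by `neg_mean_squared_error`, while the test set is scored with R², so it can help to look at what the search itself selected. The sketch below is a minimal, fast-running illustration on synthetic data from `make_regression`, used as a stand-in for the housing data. The names `X_demo`, `y_demo`, `small_grid`, and `search`, as well as the smaller grid and `cv=3`, are assumptions chosen to keep the run short.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data (assumption) so the sketch runs quickly and offline
X_demo, y_demo = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
y_demo = (y_demo - y_demo.mean()) / y_demo.std()  # keep the target on a small scale

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

# A deliberately small grid (assumption) to keep the example fast
small_grid = {
    'hidden_layer_sizes': [(5,), (10,)],
    'alpha': [0.0001, 0.001],
}

search = GridSearchCV(
    MLPRegressor(max_iter=200, random_state=0),  # fixed seed for repeatable results
    small_grid,
    cv=3,
    scoring='neg_mean_squared_error',
)
search.fit(X_tr, y_tr)

best_params = search.best_params_         # winning hyperparameter combination
cv_rmse = np.sqrt(-search.best_score_)    # best_score_ is a negated MSE, so flip the sign
test_r2 = search.best_estimator_.score(X_te, y_te)  # the estimator's own score is R^2

print("Best parameters:", best_params)
print(f"Cross-validated RMSE: {cv_rmse:.4f}")
print(f"Test set R^2: {test_r2:.4f}")
```

`MLPRegressor` starts from random weights, and the cells above do not set `random_state`, so scores such as the two test-set R² values reported in this lab can vary from run to run.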
