# Boston Housing Regression

This notebook contains hands-on exercises on the Boston Housing dataset. It covers data preprocessing, a Scikit-learn Linear Regression model, and a TensorFlow neural network for regression.

## 1. Setting Up the Environment

First, we'll import all the necessary libraries for our analysis:
- **pandas & numpy**: For data manipulation and numerical operations
- **scikit-learn**: For machine learning algorithms and preprocessing tools
- **TensorFlow**: For building neural networks
- **matplotlib & seaborn**: For data visualization

In [2]:
import pandas as pd
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
import matplotlib.pyplot as plt
import seaborn as sns

## 2. Data Preprocessing

In this section, we will load the Boston Housing dataset, check for missing values, and prepare standardized training and test sets.

In [3]:
# Load the Boston Housing dataset
boston = fetch_openml(name='boston', version=1, as_frame=False)
df = pd.DataFrame(data=boston.data, columns=boston.feature_names)
df['PRICE'] = boston.target

# Check for missing values
print('Missing values in each column:', df.isnull().sum())

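# Separate the features from the target
X = df.drop('PRICE', axis=1)
y = df['PRICE']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale the features: fit the scaler on the training set only so that
# test-set statistics do not leak into preprocessing
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)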

Missing values in each column: CRIM       0
ZN         0
INDUS      0
CHAS       0
NOX        0
RM         0
AGE        0
DIS        0
RAD        0
TAX        0
PTRATIO    0
B          0
LSTAT      0
PRICE      0
dtype: int64
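
Standardization, as performed by `StandardScaler`, maps each feature to zero mean and unit variance via z = (x - mean) / std. The sketch below is a minimal pure-Python illustration of that formula, not the scikit-learn implementation; the toy values and the helper names `fit_standardizer` and `standardize` are hypothetical. It learns the statistics from training values only and reuses them for test values, a common convention for keeping test data out of preprocessing.

```python
from statistics import fmean, pstdev


def fit_standardizer(column):
    """Learn the mean and population std (ddof=0) of one feature column."""
    mean = fmean(column)
    std = pstdev(column)
    if std == 0:
        std = 1.0  # guard: a constant column is left unscaled
    return mean, std


def standardize(column, mean, std):
    """Apply z = (x - mean) / std with previously learned statistics."""
    return [(x - mean) / std for x in column]


# Hypothetical toy values standing in for one feature column
train_values = [5.0, 6.0, 7.0]
test_values = [8.0]

mean, std = fit_standardizer(train_values)            # statistics from training data only
train_scaled = standardize(train_values, mean, std)
test_scaled = standardize(test_values, mean, std)     # reuse the training statistics

print(train_scaled, test_scaled)
```

The scaled training values have mean 0 and standard deviation 1 by construction; the test value is expressed in the same units and can fall outside the training range.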


## 3. Building a Scikit-learn Linear Regression Model

In this section, we will create a Linear Regression model using Scikit-learn and evaluate its performance.

In [4]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Create and train the model
model_sk = LinearRegression()
model_sk.fit(X_train, y_train)

# Make predictions
y_pred_sk = model_sk.predict(X_test)

# Evaluate the model
mse_sk = mean_squared_error(y_test, y_pred_sk)
r2_sk = r2_score(y_test, y_pred_sk)
print('Scikit-learn Linear Regression MSE:', mse_sk)
print('Scikit-learn Linear Regression R^2:', r2_sk)

Scikit-learn Linear Regression MSE: 24.29111947497353
Scikit-learn Linear Regression R^2: 0.6687594935356318
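
To make the two metrics concrete, here is a minimal pure-Python sketch of how MSE and R^2 are defined. The toy targets and predictions are hypothetical and unrelated to the Boston data; the helper names `mse` and `r2` are illustrative, not the scikit-learn functions.

```python
import math
from statistics import fmean


def mse(y_true, y_pred):
    """Mean of the squared residuals."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def r2(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data
y_true_toy = [20.0, 25.0, 30.0, 35.0]
y_pred_toy = [22.0, 24.0, 29.0, 37.0]

toy_mse = mse(y_true_toy, y_pred_toy)
toy_rmse = math.sqrt(toy_mse)  # RMSE is in the same units as the target
toy_r2 = r2(y_true_toy, y_pred_toy)

print(toy_mse, toy_rmse, toy_r2)
```

Applied to the result above, the square root of the reported MSE (24.29) is about 4.93. Since the target in this dataset is expressed in thousands of dollars, that corresponds to a typical prediction error of roughly $4,900.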


## 4. Constructing a TensorFlow Neural Network for Regression

In this section, we will build a simple neural network using TensorFlow for regression tasks.
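
As a rough size check for the network used in this section (13 input features feeding dense layers of 64, 32, and 1 units), the sketch below counts trainable parameters by hand. It assumes fully connected layers with one bias per unit; the helper name `dense_param_count` is hypothetical.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


n_features = 13            # feature columns in the dataset, excluding PRICE
layer_units = [64, 32, 1]  # hidden layers followed by the single regression output

counts = []
n_in = n_features
for units in layer_units:
    counts.append(dense_param_count(n_in, units))
    n_in = units  # the output width of this layer feeds the next one

total_params = sum(counts)
print(counts, total_params)  # [896, 2080, 33] 3009
```

With about 3,000 parameters and only a few hundred training rows, the network has enough capacity to overfit, which is one reason to compare its test error against the linear baseline.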

In [5]:
# Build the neural network model
model_tf = tf.keras.Sequential([
    tf.keras.Input(shape=(X_train.shape[1],)),  # One input per feature
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)  # Output layer for regression (linear activation)
])

# Compile the model
model_tf.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

# Train the model
model_tf.fit(X_train, y_train, epochs=100, batch_size=16, verbose=1)

# Evaluate the model
loss, mae_tf = model_tf.evaluate(X_test, y_test)
print('TensorFlow Neural Network MAE:', mae_tf)

Epoch 1/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - loss: 590.1748 - mae: 22.5749
Epoch 2/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 517.6006 - mae: 21.0358
Epoch 3/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 412.3836 - mae: 18.3887
Epoch 4/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 237.8670 - mae: 13.2342
Epoch 5/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 93.2354 - mae: 7.8462
Epoch 6/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 73.7786 - mae: 6.2799
Epoch 7/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 36.6407 - mae: 4.4063
Epoch 8/100
26/26 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 26.5717 - mae: 3.8014
Epoch 9/100
...
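
The `26/26` in each log line is the number of batches per epoch. The sketch below reproduces that figure from the split and batch size used in this notebook. It assumes the dataset has 506 rows (the usual size of this dataset, not printed anywhere above) and that the test set size is rounded up, which is how `train_test_split` treats a fractional `test_size`.

```python
import math

n_rows = 506          # assumption: standard size of the Boston Housing dataset
test_fraction = 0.2   # test_size used in the split
batch_size = 16       # batch_size passed to model_tf.fit

n_test = math.ceil(n_rows * test_fraction)   # test rows, rounded up
n_train = n_rows - n_test                    # remaining rows are used for training

# The last batch of an epoch may be smaller than batch_size, so round up
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, steps_per_epoch)  # 404 26
```

Each epoch therefore performs 26 gradient updates, and 100 epochs amount to 2,600 updates in total.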

## 5. Conclusion

In this notebook, we preprocessed the Boston Housing dataset, trained a Scikit-learn Linear Regression model (test MSE of about 24.29, R^2 of about 0.67), and built a TensorFlow neural network for the same regression task. As next steps, participants can explore model tuning and additional evaluation metrics.