# Week 8 Lab Assignment: Neural Networks and Overfitting

### Objective
In this lab, you will build a simple neural network using Python and TensorFlow/Keras. You will learn to train the network on a dataset, evaluate its performance, and apply techniques to prevent overfitting.

### 1. Setup and Installations
**Objective:** Ensure all necessary packages are installed and imported for the lab.

**Tasks:**
1. Install the required Python packages: TensorFlow, Keras, pandas, NumPy, Matplotlib, and scikit-learn (used in Section 4 for scaling and splitting the data).

In [1]:
# Install necessary packages
%pip install tensorflow keras pandas numpy matplotlib scikit-learn

### 2. Import Libraries
**Objective:** Import all necessary libraries for data manipulation, neural network building, and evaluation.


In [None]:
# Import necessary packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping
%matplotlib inline

### 3. Load and Explore Dataset
**Objective:** Gain a preliminary understanding of the dataset to be used for training the neural network.

**Tasks:**
1. **Load the Dataset:** Import the dataset into a Pandas DataFrame.
2. **Inspect the Data:** Use Pandas functions to inspect the first few rows, check for missing values, and understand the data types.
3. **Summary Statistics:** Generate summary statistics for numerical columns.

In [None]:
# Load the dataset
df = pd.read_csv('neural_network_data.csv')

# Inspect the first few rows
print(df.head())

# Check for missing values
print(df.isnull().sum())

# Check the data type of each column
print(df.dtypes)

# Generate summary statistics
print(df.describe())

### 4. Data Preparation
**Objective:** Prepare the data for training by splitting it into training and test sets and standardizing the features.

**Tasks:**
1. **Train-Test Split:** Split the data into training and testing sets.
2. **Normalize Features:** Scale features to have zero mean and unit variance. Fit the scaler on the training set only, then apply it to the test set, so that no test-set statistics leak into training.

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Separate features and target
X = df.drop('Target', axis=1)
y = df['Target']

# Split the data into training and testing sets (70% / 30%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(f'Training set size: {X_train.shape}')
print(f'Test set size: {X_test.shape}')
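
To make the scaling step concrete, the sketch below reproduces by hand what standardization computes: a per-column mean and standard deviation, estimated here on the training rows and then reused unchanged on the test rows. The toy arrays and the names `train`, `test`, `mu`, and `sigma` are made up for illustration and are not part of the lab dataset.

```python
import numpy as np

# Toy feature matrices (hypothetical values, not the lab data)
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
test = np.array([[2.5, 25.0], [5.0, 50.0]])

# Estimate the statistics on the training rows only
mu = train.mean(axis=0)
sigma = train.std(axis=0)  # population std (ddof=0), the convention StandardScaler uses

# Apply the same transformation to both sets
train_scaled = (train - mu) / sigma
test_scaled = (test - mu) / sigma

print(train_scaled.mean(axis=0))  # approximately 0 for every column
print(train_scaled.std(axis=0))   # approximately 1 for every column
print(test_scaled)                # test rows need not have zero mean or unit variance
```

Because the test rows are transformed with the training statistics, their scaled mean and variance will generally differ from 0 and 1, which is expected.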

### 5. Building and Training a Neural Network
**Objective:** Build a simple neural network, train it on the dataset, and evaluate its performance.

**Tasks:**
1. **Build the Neural Network:** Define the architecture of the neural network using Keras.
2. **Train the Neural Network:** Compile the model, specify the optimizer, loss function, and metrics, and fit the model to the training data.
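
The loss function chosen at compile time determines what training minimizes. For a binary target with a sigmoid output, binary cross-entropy is a common choice. As a rough illustration of what that loss measures, the sketch below computes it by hand with NumPy. The function name `binary_cross_entropy` and the sample labels and probabilities are made up for this example, and the Keras implementation may differ in details such as how it clips probabilities.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log(0) never occurs
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(losses))

# Hypothetical labels and predictions
labels = [1, 0, 1, 0]
mostly_right = [0.9, 0.1, 0.8, 0.2]
mostly_wrong = [0.1, 0.9, 0.2, 0.8]

loss_right = binary_cross_entropy(labels, mostly_right)
loss_wrong = binary_cross_entropy(labels, mostly_wrong)
print(f'Loss when predictions agree with labels: {loss_right:.4f}')
print(f'Loss when predictions contradict labels: {loss_wrong:.4f}')
```

Confident predictions that contradict the labels are penalized far more heavily than confident predictions that agree with them.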

In [None]:
# Build the neural network
model = Sequential()
model.add(keras.Input(shape=(X_train.shape[1],)))  # explicit input layer; newer Keras versions warn about input_dim
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

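# Train the neural network, holding out 20% of the training data for validation
history = model.fit(X_train, y_train, validation_split=0.2, epochs=50, batch_size=10, verbose=1)

# Evaluate the baseline model on the held-out test set
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f'Baseline test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}')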

# Plot training & validation accuracy values
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
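
One way to read the curves numerically is to look at where validation accuracy peaks and how far training accuracy ends up above it. The helper below is a minimal sketch of that idea. The function name `summarize_history` and the example curves are hypothetical; in the lab you would pass `history.history` instead.

```python
def summarize_history(history_dict):
    """Return the best validation epoch (1-based) and the final train/validation accuracy gap."""
    train_acc = history_dict['accuracy']
    val_acc = history_dict['val_accuracy']
    # Epoch at which validation accuracy peaked (first occurrence on ties)
    best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1
    # A large positive gap at the end of training suggests overfitting
    final_gap = train_acc[-1] - val_acc[-1]
    return best_epoch, final_gap

# Made-up curves: training accuracy keeps rising while validation accuracy peaks, then declines
example = {
    'accuracy':     [0.70, 0.80, 0.88, 0.94, 0.98],
    'val_accuracy': [0.68, 0.76, 0.80, 0.78, 0.75],
}
best_epoch, final_gap = summarize_history(example)
print(f'Best validation epoch: {best_epoch}, final gap: {final_gap:.2f}')
```

A validation peak well before the final epoch, combined with a widening gap, is the pattern that the techniques in the next section aim to reduce.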

### 6. Preventing Overfitting
**Objective:** Implement dropout, regularization, and early stopping to prevent overfitting.

**Tasks:**
1. **Add Dropout:** Integrate dropout layers into the neural network to reduce overfitting.
2. **Apply Regularization:** Use L2 regularization to penalize large weights.
3. **Early Stopping:** Implement early stopping to halt training once validation loss has not improved for a set number of epochs (`patience`), restoring the best weights seen.
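
To give a feel for what the first two techniques do numerically, the sketch below implements inverted dropout and an L2 penalty term with NumPy. It is a simplified illustration, not the Keras implementation: the function names `dropout_forward` and `l2_penalty`, the array shapes, and the weight values are assumptions chosen for the example.

```python
import numpy as np

def dropout_forward(activations, rate, rng):
    """Inverted dropout: zero a fraction `rate` of units and rescale the survivors."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True means the unit is kept
    return activations * mask / keep_prob

def l2_penalty(weights, lam):
    """L2 penalty term: lam times the sum of squared weights."""
    return lam * float(np.sum(weights ** 2))

rng = np.random.default_rng(0)
activations = np.ones((1000, 64))  # hypothetical layer output
dropped = dropout_forward(activations, rate=0.5, rng=rng)

print(np.unique(dropped))  # dropped units are 0, kept units are scaled by 1 / keep_prob
print(dropped.mean())      # stays close to the original mean of 1.0

weights = np.array([[0.5, -1.0], [2.0, 0.0]])  # hypothetical weight matrix
print(l2_penalty(weights, lam=0.01))  # 0.01 * (0.25 + 1.0 + 4.0 + 0.0)
```

Rescaling the surviving units keeps the expected activation unchanged, so no adjustment is needed at inference time when dropout is switched off. The L2 term is added to the loss, which is why large weights become more expensive.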

In [None]:
# Rebuild the neural network with dropout and regularization
model = Sequential()
model.add(keras.Input(shape=(X_train.shape[1],)))  # explicit input layer; newer Keras versions warn about input_dim
model.add(Dense(64, activation='relu', kernel_regularizer=keras.regularizers.l2(0.01)))
model.add(Dropout(0.5))
model.add(Dense(32, activation='relu', kernel_regularizer=keras.regularizers.l2(0.01)))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Implement early stopping
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

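# Train the model with early stopping (training may end before epoch 50)
history = model.fit(X_train, y_train, validation_split=0.2, epochs=50, batch_size=10, verbose=1, callbacks=[early_stopping])

# Evaluate the regularized model on the held-out test set
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f'Regularized test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}')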

# Plot training & validation accuracy values
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy with Overfitting Prevention')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
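
The patience rule used by early stopping can be traced by hand. The sketch below is a simplified pure-Python version of that rule, assuming any decrease in validation loss counts as an improvement (it ignores options such as a minimum delta). The function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based, for per-epoch validation losses."""
    best_loss = float('inf')
    best_epoch = 0
    wait = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience was never exhausted, so training ran to the last epoch
    return len(val_losses), best_epoch

# Made-up validation losses: the minimum is at epoch 3, then the loss drifts upward
losses = [0.60, 0.50, 0.45, 0.47, 0.46, 0.48, 0.49, 0.50, 0.51]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=5)
print(f'Stopped after epoch {stop_epoch}; best epoch was {best_epoch}')
```

With `patience=5`, training continues for five epochs past the best one before stopping, which is why restoring the best weights matters: otherwise the model would keep the weights from the final, worse epoch.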

### 7. Submission
**Deliverables:**
- Jupyter Notebook (.ipynb) with all code and model evaluations.
- A brief report (1-2 paragraphs) summarizing the findings, comparing model performance, and discussing the impact of overfitting prevention techniques.

**Deadline:** Submit your completed notebook and report to the course portal by the end of class.