<a href='https://ai.meng.duke.edu'><img align="left" style="padding-top:10px;" src="https://storage.googleapis.com/aipi_datasets/Duke-AIPI-Logo.png"></a>

# Deep Learning in TensorFlow
This notebook provides an introduction to building neural networks in TensorFlow for modeling tasks using structured data.

### Contents
1) [Regression Models](#regression)  
2) [Binary Classification Models](#binary-classification)  
3) [Multiclass Classification](#multiclass-classification)  
4) [Saving Models](#saving-models)


In [None]:
# Run this cell only if working in Colab
# Connects to any needed files from GitHub and Google Drive
import os

# Remove Colab default sample_data
!rm -r ./sample_data

# Clone GitHub files to colab workspace
repo_name = "AIPI540-Deep-Learning-Applications" # Enter repo name
git_path = 'https://github.com/AIPI540/AIPI540-Deep-Learning-Applications.git'
!git clone "{git_path}"

# Install dependencies from requirements.txt file
#!pip install -r "{os.path.join(repo_name,'requirements.txt')}"

# Change working directory to location of notebook
notebook_dir = '1_intro_neuralnets'
path_to_notebook = os.path.join(repo_name,notebook_dir)
%cd "{path_to_notebook}"
%ls

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Regression
For regression we will use MSE as our loss function.
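
As a concrete illustration, mean squared error (MSE) averages the squared differences between the targets and the predictions. The minimal sketch below computes it in plain Python on made-up values; the numbers and the helper name `mse` are illustrative only and are not taken from the dataset used in this notebook.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)

# Made-up targets and predictions, for illustration only
y_true = [0.0, 0.5, 1.0, 0.25]
y_pred = [0.5, 0.5, 0.5, 0.25]

loss = mse(y_true, y_pred)
print(loss)
```

Because the errors are squared, a single prediction that is far off contributes much more to the loss than several predictions that are slightly off.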

In [None]:
# Read data in and clean up
crimes = pd.read_csv('data/communities.csv',na_values=['?'])
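# Fill missing values with column means (numeric columns only, since communityname holds strings)
crimes.fillna(crimes.mean(numeric_only=True),inplace=True)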
crimes.drop(columns=['state','country','community','communityname','fold'],inplace=True)

X = crimes.iloc[:,:-1]
y = crimes.iloc[:,-1]

# Split our data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X,y,random_state=0, test_size=0.2)

# Define input shape
input_shape=(X_train.shape[1])

### Step 1: Set up dataloaders for our data
The first step is to set up the dataloaders that feed our data into the model. In TensorFlow these are `tf.data.Dataset` objects. We will create a `trainloader` and a `testloader` for the training and test data, which let us iteratively feed the data into our model in batches (called "mini-batches") of a size that we specify.
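
To make the idea of mini-batches concrete, here is a minimal plain-Python sketch of shuffling a dataset and splitting it into batches. The helper name `make_batches` and the dataset of 100 sample indices are hypothetical; the TensorFlow code in this notebook does the equivalent work with `tf.data.Dataset`.

```python
import math
import random

def make_batches(samples, batch_size, shuffle=False, seed=0):
    """Split samples into consecutive mini-batches, optionally shuffling first."""
    samples = list(samples)
    if shuffle:
        # Shuffle individual samples before they are grouped into batches
        random.Random(seed).shuffle(samples)
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

# Hypothetical dataset of 100 sample indices
batches = make_batches(range(100), batch_size=32, shuffle=True)
batch_sizes = [len(b) for b in batches]
print(len(batches), batch_sizes)
```

With 100 samples and a batch size of 32 there are 4 batches, and the last one holds only the 4 leftover samples. One pass over all batches is one epoch.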

In [None]:
def prep_dataloaders(X_train, y_train, X_test, y_test, batch_size):
    # Convert training and test data to TensorFlow Dataset
    trainset = tf.data.Dataset.from_tensor_slices((np.array(X_train).astype('float32'),
                                                  np.array(y_train).astype('float32').reshape(-1, 1)))
    testset = tf.data.Dataset.from_tensor_slices((np.array(X_test).astype('float32'),
                                                 np.array(y_test).astype('float32').reshape(-1, 1)))

    # Shuffle and batch the training data
    trainloader = trainset.shuffle(len(X_train)).batch(batch_size)
    # Batch the test data
    testloader = testset.batch(batch_size)

    return trainloader, testloader

# Use the function with your data
batchsize = 32
trainloader, testloader = prep_dataloaders(X_train, y_train, X_test, y_test, batchsize)


### Step 2: Define the regression network using the TensorFlow Sequential API

Compile the model with our loss = 'mean_squared_error' and set Stochastic Gradient Descent as our optimizer with a learning rate of 0.01.
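
To show what the optimizer setting means, here is a minimal sketch of gradient descent on an MSE loss for a one-parameter model `y = w * x`. The data, the starting weight, and the helper names `sgd_step` and `mse_loss` are made up for illustration. Keras applies the same kind of rule, new weight = old weight minus learning rate times gradient, to every weight in the network, with the gradient estimated from one mini-batch at a time (which is where "stochastic" comes from).

```python
def mse_loss(w, xs, ys):
    """MSE of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sgd_step(w, xs, ys, learning_rate=0.01):
    """One gradient-descent update of w for the model y = w * x under MSE loss."""
    n = len(xs)
    # Derivative of mean((w*x - y)^2) with respect to w is mean(2 * (w*x - y) * x)
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
    return w - learning_rate * grad

# Made-up mini-batch generated by the relationship y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.0
loss_before = mse_loss(w, xs, ys)
for _ in range(200):
    w = sgd_step(w, xs, ys, learning_rate=0.01)
loss_after = mse_loss(w, xs, ys)
print(w, loss_before, loss_after)
```

After repeated updates the weight approaches 2, the value that generated the data, and the loss shrinks toward zero.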

In [None]:
# Define the regression network using the Sequential API
def build_regression_net(input_shape, n_hidden1, n_hidden2):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(n_hidden1, activation='relu', input_shape=(input_shape,)),
        tf.keras.layers.Dense(n_hidden2, activation='relu'),
        tf.keras.layers.Dense(1)
    ])
    return model

# Instantiate the neural network
model = build_regression_net(input_shape=input_shape, n_hidden1=50, n_hidden2=5)

# Compile the model
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
            loss='mean_squared_error')

model.summary()
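
The parameter counts that `model.summary()` reports for fully connected layers can be checked by hand: each `Dense` layer has one weight per input-output pair plus one bias per output unit. The sketch below assumes a hypothetical input of 100 features (the real number depends on the dataset), and the helper name `dense_param_count` is illustrative.

```python
def dense_param_count(layer_sizes):
    """Trainable parameters in a stack of fully connected layers.

    layer_sizes lists the width of each layer, starting with the input size.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

# Hypothetical 100 input features, hidden layers of 50 and 5 units, 1 output
n_params = dense_param_count([100, 50, 5, 1])
print(n_params)
```

For these sizes the layers contribute 5050, 255, and 6 parameters respectively.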

### Step 3: Train the model

Use `model.fit` to train the model. Set `verbose` to 0 (silent), 1 (progress bar), or 2 (one line per epoch) depending on how much information you want to see while training.

In [None]:
# Number of iterations (epochs) to train
n_iter = 200

# Train the model
history = model.fit(trainloader, epochs=n_iter, verbose=0)

# Plotting the cost (loss) over epochs
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()

### Step 4: Test the model on the test set

Use `model.evaluate` to compute the loss (here, the MSE) on the test set.

In [None]:
test_loss = model.evaluate(testloader)

print('Test loss:', test_loss)

# Binary classification
For binary classification, we can use a sigmoid activation function on the output layer to get our predictions in the range (0,1) and then use binary cross entropy as our loss function.
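
As a worked example of these two pieces, the minimal sketch below applies a sigmoid to a few made-up raw outputs (logits) and computes the average binary cross-entropy against made-up labels. The helper names and values are illustrative. Keras's built-in loss also clips probabilities to avoid taking the log of 0, which this sketch does not do.

```python
import math

def sigmoid(z):
    """Map a real-valued score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p_pred):
    """Average binary cross-entropy for 0/1 labels and predicted probabilities."""
    losses = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
              for y, p in zip(y_true, p_pred)]
    return sum(losses) / len(losses)

# Made-up raw network outputs (logits) and true labels
logits = [2.0, -1.0, 0.0]
labels = [1, 0, 1]

probs = [sigmoid(z) for z in logits]
loss = binary_cross_entropy(labels, probs)
print(probs, loss)
```

A confident correct prediction (the first sample) adds little to the loss, while an uncertain one (the third sample, with probability 0.5) adds more.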

In [None]:
from sklearn.datasets import load_breast_cancer
data=load_breast_cancer(as_frame=True)
X,y=data.data,data.target
# Since the default in the file is 0=malignant 1=benign we want to reverse these
y=(y==0).astype(int)
X,y= np.array(X),np.array(y)

# Let's set aside a test set and use the remainder for training
X_train,X_test,y_train,y_test = train_test_split(X, y, random_state=0,test_size=0.2)

# Let's scale our data to help the algorithm converge faster
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Set input shape of model
input_shape=(X_train_scaled.shape[1])

In [None]:
# Set random seeds for reproducibility
tf.random.set_seed(0)

# Convert training and test data to TensorFlow Datasets
# Shuffle before batching so that individual samples, not whole batches, are reshuffled each epoch
trainloader = tf.data.Dataset.from_tensor_slices((X_train_scaled, y_train)).shuffle(len(X_train_scaled)).batch(32)
testloader = tf.data.Dataset.from_tensor_slices((X_test_scaled, y_test)).batch(32)


In [None]:
# Define the feedforward network using the Sequential API
def build_feedforward_net(input_shape, n_hidden1, n_hidden2):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(n_hidden1, activation='relu', input_shape=(input_shape,)),
        tf.keras.layers.Dense(n_hidden2, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    return model

# Instantiate the neural network
model = build_feedforward_net(input_shape=input_shape, n_hidden1=50, n_hidden2=20)

# Compile the model
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss='binary_crossentropy',  # Use binary crossentropy for binary classification
              metrics=['accuracy'])  # Track accuracy

model.summary()

In [None]:
# Train the model
num_iter = 200
history = model.fit(trainloader, epochs=num_iter,verbose=0)

# Plotting the cost (loss) over epochs
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Cost/Loss')
plt.show()


In [None]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(testloader)
print('Test set accuracy is {:.3f}'.format(test_acc))

# Predict and process the outputs
test_predictions = model.predict(testloader)
test_predictions = np.round(test_predictions).flatten()  # Convert probabilities to binary predictions
test_accuracy = np.sum(test_predictions == y_test) / len(y_test)
print('Test set accuracy calculated manually is {:.3f}'.format(test_accuracy))
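
The manual check above rounds each predicted probability to 0 or 1 and compares the result with the true label, which amounts to using a decision threshold of 0.5. A minimal plain-Python sketch of the same calculation, on made-up probabilities and labels, looks like this (the helper name `accuracy_from_probabilities` is hypothetical):

```python
def accuracy_from_probabilities(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the 0/1 label."""
    preds = [1 if p > threshold else 0 for p in probs]
    correct = sum(int(pred == y) for pred, y in zip(preds, labels))
    return correct / len(labels)

# Made-up predicted probabilities and true labels
probs = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 0]

acc = accuracy_from_probabilities(probs, labels)
print(acc)
```

Here three of the four thresholded predictions match the labels. Raising or lowering the threshold trades off false positives against false negatives.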


# Multiclass classification
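For a multi-class problem we use softmax as the output activation function to convert the outputs to class probabilities, rather than sigmoid as we did in binary classification. We will use cross-entropy as the loss function. Because our labels are integer class codes rather than one-hot vectors, we use the sparse categorical form of the loss.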
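
As a worked example, the minimal sketch below applies softmax to made-up scores for a 3-class problem and computes the cross-entropy for a single sample whose true class is given as an integer index. The helper names and numbers are illustrative only.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_cross_entropy(probs, label):
    """Cross-entropy for one sample whose true class is the integer `label`."""
    return -math.log(probs[label])

# Made-up scores for a 3-class problem; the true class is index 0
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = sparse_cross_entropy(probs, label=0)
predicted_class = probs.index(max(probs))
print(probs, loss, predicted_class)
```

The loss depends only on the probability assigned to the true class, so it falls toward zero as that probability approaches 1.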

In [None]:
# Load the iris data
iris = pd.read_csv('data/iris.csv')
iris.head()

In [None]:
# Separate into X and y
# Convert string species values in y to numerical codes for modeling
X = iris.drop('species',axis=1)
y = iris['species'].astype('category').cat.codes

In [None]:
# Split data into training and test sets
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=0)
print("Shape of X_train, y_train:",X_train.shape,y_train.shape)
print("Shape of X_test, y_test:",X_test.shape,y_test.shape)

# Let's scale our data to help the algorithm converge faster
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Convert y_train and y_test to arrays so all inputs are in NumPy
y_train = np.array(y_train)
y_test = np.array(y_test)

# Define input shape
input_shape=(X_train_scaled.shape[1])

In [None]:
# Set random seeds for reproducibility
tf.random.set_seed(0)

# Convert training and test data to TensorFlow Datasets
# Shuffle before batching so that individual samples, not whole batches, are reshuffled each epoch
batch_size=32
trainloader = tf.data.Dataset.from_tensor_slices((X_train_scaled, y_train)).shuffle(len(X_train_scaled)).batch(batch_size)
testloader = tf.data.Dataset.from_tensor_slices((X_test_scaled, y_test)).batch(batch_size)

In [None]:
def build_multiclass_net(input_shape, n_hidden1, n_hidden2, n_hidden3, n_output):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(n_hidden1, activation='relu', input_shape=(input_shape,)),
        tf.keras.layers.Dense(n_hidden2, activation='relu'),
        tf.keras.layers.Dense(n_hidden3, activation='relu'),
        tf.keras.layers.Dense(n_output, activation='softmax')
    ])
    return model


# Instantiate the neural network
model = build_multiclass_net(input_shape=input_shape, n_hidden1=100, n_hidden2=50, n_hidden3=10, n_output=3)

# Compile the model (specify the optimizer, loss, and metrics)
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])

model.summary()

In [None]:
# Train the model
num_iter = 200
history = model.fit(trainloader, epochs=num_iter,verbose=0)

# Plotting the cost (loss) over epochs
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('Cost/Loss')
plt.show()

In [None]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(testloader)
print('Test set accuracy is {:.3f}'.format(test_acc))

# Saving models
To save TensorFlow (Keras) models for later use, we have several options:  
1) We can save the entire model, including the architecture, weights, training configuration, and optimizer state, and then load it and use it for prediction or continue training  
2) We can save only the weights, which contain all the learned parameters of the model (the weights and biases) but not the architecture itself. To use them, we instantiate a new model of the same architecture and then load the saved weights into it  
3) We can save only the architecture as JSON, without the weights or training configuration  
4) We can export the model to the TensorFlow SavedModel format using `tf.saved_model`

### 1. Save/Load Entire Model (Architecture, Weights, Training Configuration, Optimizer State)

This saves the architecture, weights, training configuration (loss, optimizer), and the state of the optimizer so that you can resume training where you left off.

In [None]:
# Saving
model.save('path_to_my_model.h5')

# Loading
from tensorflow import keras
model = keras.models.load_model('path_to_my_model.h5')

### 2. Save/Load Only the Model's Weights

This is useful when you need to use the same model architecture with different data or training configurations.

In [None]:
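# Saving (recent Keras versions require the file name to end in .weights.h5)
model.save_weights('path_to_my_weights.weights.h5')

# Loading (the model must already have the same architecture as the one that was saved)
model.load_weights('path_to_my_weights.weights.h5')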

### 3. Save/Load Only the Model's Architecture

This method saves only the architecture of the model, not its weights or training configuration.

In [None]:
# Saving
json_string = model.to_json()
with open('path_to_my_model.json', 'w') as file:
    file.write(json_string)


# Loading
with open('path_to_my_model.json', 'r') as file:
    json_string = file.read()
model = keras.models.model_from_json(json_string)

### 4. Save/Load Using the tf.saved_model Module

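This creates a SavedModel folder with a TensorFlow checkpoint and a .pb file containing the graph. Note that `tf.saved_model.load` returns a generic TensorFlow object rather than a Keras model, so Keras methods such as `fit` and `predict` are not available on it.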

In [None]:
# Saving
tf.saved_model.save(model, 'path_to_saved_model')

# Loading
loaded_model = tf.saved_model.load('path_to_saved_model')