
# Model Building


## Overview

Model building in deep learning is the process of designing, creating, and training neural network architectures that make predictions from input data. Deep learning is a subfield of machine learning whose models are loosely inspired by the way biological neurons connect; stacking many layers lets them learn patterns and representations from large amounts of data. This overview gives a high-level view of the key steps involved in model building:

1. Data Collection and Preprocessing:
The first step in model building is to collect and prepare the data. High-quality data is essential for training accurate deep learning models. This involves gathering relevant datasets, cleaning the data to handle missing values and outliers, and transforming the data into a suitable format for training.

2. Defining the Architecture:
The architecture of a deep learning model refers to its overall structure, including the number and types of layers, the number of neurons in each layer, and how they are interconnected. The choice of architecture depends on the specific task at hand, such as image classification, natural language processing, or time series forecasting.

3. Choosing Activation Functions:
Activation functions introduce non-linearity to the model, allowing it to learn complex patterns in the data. Common activation functions include ReLU (Rectified Linear Unit), sigmoid, and tanh. The selection of appropriate activation functions is crucial for the model's performance.

4. Building the Model:
After defining the architecture and activation functions, the model is constructed using deep learning frameworks or libraries like TensorFlow or PyTorch. These libraries provide convenient interfaces for creating and configuring neural networks.

5. Compiling the Model:
Before training the model, it needs to be compiled with an appropriate optimizer, loss function, and evaluation metrics. The optimizer, such as Adam or SGD (Stochastic Gradient Descent), updates the model's weights during training to minimize the loss function. The loss function quantifies the difference between the predicted outputs and the true labels, guiding the model to make better predictions.

6. Training the Model:
During the training phase, the model is exposed to the training data, and the weights of the neural network are adjusted iteratively to minimize the loss function. For each batch of data, the process involves forward propagation (making predictions and computing the loss) and backward propagation (computing the gradient of the loss with respect to every weight), after which the optimizer uses those gradients to update the weights.

7. Hyperparameter Tuning:
Deep learning models have several hyperparameters, such as learning rate, batch size, and the number of epochs. Optimizing these hyperparameters can significantly impact the model's performance. Hyperparameter tuning involves experimenting with different values and selecting the best configuration based on performance metrics.

8. Evaluating the Model:
Once the model is trained, it needs to be evaluated on a separate validation or test dataset to assess its performance. Common evaluation metrics for classification tasks include accuracy, precision, recall, and F1 score, while regression tasks may use mean squared error (MSE) or mean absolute error (MAE).

9. Fine-tuning and Regularization:
To further enhance the model's performance and prevent overfitting, techniques such as regularization (e.g., L1 and L2 regularization) and dropout can be applied. Fine-tuning the model involves adjusting the model's parameters or using transfer learning from pre-trained models to improve performance on specific tasks.

10. Deployment and Inference:
After successfully building and evaluating the model, it can be deployed to make predictions on new, unseen data. This involves feeding the new data into the trained model and obtaining predictions or classifications based on the learned patterns.
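
Steps 3, 5, and 6 above can be made concrete with a small NumPy sketch. It is not how Keras works internally, only an illustration under stated assumptions: the data is randomly generated, the network size (2 inputs, 4 hidden ReLU units, 1 sigmoid output), the learning rate, and all variable names are chosen for this example, and plain full-batch gradient descent stands in for an optimizer such as Adam.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data, made up for illustration: 200 samples with 2 features.
# The label is 1 when the two features sum to a positive number.
X_toy = rng.normal(size=(200, 2))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(float).reshape(-1, 1)

# Architecture: 2 inputs -> 4 hidden units (ReLU) -> 1 output unit (sigmoid)
W1 = rng.normal(scale=0.5, size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))
b2 = np.zeros((1, 1))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip probabilities so that log(0) never occurs
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)))


learning_rate = 0.1
losses = []
for epoch in range(300):
    # Forward propagation: inputs -> hidden activations -> predicted probability
    z1 = X_toy @ W1 + b1
    a1 = np.maximum(z1, 0.0)            # ReLU
    y_prob = sigmoid(a1 @ W2 + b2)
    losses.append(binary_crossentropy(y_toy, y_prob))

    # Backward propagation: gradients of the mean loss w.r.t. each parameter
    n = X_toy.shape[0]
    dz2 = (y_prob - y_toy) / n          # sigmoid + cross-entropy gradient
    dW2 = a1.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ W2.T) * (z1 > 0)       # ReLU passes gradient only where z1 > 0
    dW1 = X_toy.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # Gradient descent update
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

print(f"loss at start: {losses[0]:.3f}, loss at end: {losses[-1]:.3f}")
```

Frameworks such as Keras automate the backward pass and the update rule, so in practice only the architecture, loss, and optimizer need to be specified.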

In conclusion, model building in deep learning is a multidimensional process that requires careful consideration of data, architecture, hyperparameters, and evaluation metrics. By iteratively refining these components, developers can create powerful and accurate deep learning models capable of solving a wide range of real-world problems.

# Training the model and making predictions



Let's use the Pima Indians Diabetes dataset to build a simple deep learning model in Keras. This example will take you through loading the data, preparing it, defining the model, compiling the model, fitting the model, and making predictions.

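First, make sure the required packages are installed. The walkthrough needs pandas and Keras, and the sample solution further down also uses scikit-learn. If they are missing, running `!pip install pandas scikit-learn tensorflow` in a Jupyter notebook cell should install them, since recent TensorFlow releases ship with Keras.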

**Step 1: Load the Data**



In [None]:
import pandas as pd

url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
names = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']
data = pd.read_csv(url, names=names)


**Step 2: Preparing the Data**
Here, we'll split the dataset into the input features (X), which are the first eight columns, and the target variable to predict (Y), which is the `Outcome` column.


In [None]:
# Split the data into input (X) and target (Y)
X = data.iloc[:, 0:8]
Y = data.iloc[:, 8]


It is also a good practice to normalize the input data, using techniques such as Z-Score normalization or Min-Max Scaling. But to keep this example simple, we won't do it here.
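
For reference, both scaling techniques mentioned above can be sketched in a few lines of NumPy. The array `X_toy` and its values are made up for illustration; with the real dataset the statistics would normally be computed on the training rows only and then reused for any test rows.

```python
import numpy as np

# Made-up feature matrix: rows are samples, columns are features on different scales
X_toy = np.array([[1.0, 100.0],
                  [2.0, 300.0],
                  [3.0, 500.0]])

# Z-score normalization: subtract the column mean, divide by the column standard deviation
X_zscore = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

# Min-max scaling: map each column linearly onto the range [0, 1]
col_min = X_toy.min(axis=0)
col_max = X_toy.max(axis=0)
X_minmax = (X_toy - col_min) / (col_max - col_min)

print(X_zscore)
print(X_minmax)
```

After z-score normalization every column has mean 0 and standard deviation 1; after min-max scaling every column spans 0 to 1.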

**Step 3: Define the Model**
Next, let's define our deep learning model. We'll create a simple fully connected model with two hidden layers (12 and 8 units) and a single sigmoid output unit.


In [None]:
from keras.models import Sequential
from keras.layers import Dense

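# Define the keras model as a linear stack of fully connected (Dense) layers
model = Sequential()
# First hidden layer: 12 units; input_dim=8 matches the 8 feature columns in X
model.add(Dense(12, input_dim=8, activation='relu'))
# Second hidden layer: 8 units
model.add(Dense(8, activation='relu'))
# Output layer: 1 unit with sigmoid, giving a probability between 0 and 1
model.add(Dense(1, activation='sigmoid'))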
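
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand: each Dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_param_count` below is a hypothetical name introduced for this sketch, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# 8 input features, followed by Dense layers with 12, 8, and 1 units
layer_sizes = [8, 12, 8, 1]
per_layer = [dense_param_count(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [108, 104, 9]
print(total)      # 221
```

Calling `model.summary()` on the model defined above should report the same per-layer and total counts.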


**Step 4: Compile the Model**
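Before we can train the model, we need to compile it, which sets the loss function to minimize, the optimizer that updates the weights, and the metrics to report.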


In [None]:
# Compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
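
The `binary_crossentropy` loss named above penalizes confident wrong predictions much more than confident correct ones. A minimal NumPy sketch of the formula follows; the function is written for illustration only (Keras has its own implementation), and the example probabilities are made up.

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log(0) never occurs
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)))


confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(round(confident_right, 4))  # 0.1054
print(round(confident_wrong, 4))  # 2.3026
```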


**Step 5: Fit the Model**
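Let's fit the model to our data. We train for 150 epochs (full passes over the dataset) and update the weights after every batch of 10 rows.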


In [None]:
# Fit the keras model on the dataset
model.fit(X, Y, epochs=150, batch_size=10)
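
The relationship between `epochs`, `batch_size`, and the number of weight updates can be checked with simple arithmetic. This sketch assumes the CSV has 768 rows, which is the size usually reported for this dataset; `len(data)` gives the actual value.

```python
import math

n_samples = 768   # assumed number of rows in the dataset
batch_size = 10
epochs = 150

# The last batch of each epoch is smaller (8 rows), hence the ceiling
steps_per_epoch = math.ceil(n_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 77
print(total_updates)    # 11550
```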


**Step 6: Making Predictions**
Now we can make predictions. For simplicity we reuse the training data here, which means the results say nothing about how the model would perform on unseen rows.


In [None]:
# Make probability predictions with the model (one value between 0 and 1 per row)
predictions = model.predict(X)
# Convert probabilities to class labels (0 or 1) using a 0.5 threshold
rounded = (predictions > 0.5).astype(int).flatten()
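
To see how thresholded predictions are compared with the true labels, here is a small standalone sketch. The arrays `probs` and `y_true` are hypothetical stand-ins for `model.predict(X)` and `Y`; the values are made up.

```python
import numpy as np

# Hypothetical model output: one probability per row, shape (n_samples, 1)
probs = np.array([[0.92], [0.40], [0.51], [0.08], [0.65]])
y_true = np.array([1, 0, 0, 0, 1])

# Probabilities above 0.5 are labeled 1, everything else 0
predicted = (probs[:, 0] > 0.5).astype(int)

# Accuracy is the fraction of rows where prediction and label agree
accuracy = float((predicted == y_true).mean())

print(predicted)  # [1 0 1 0 1]
print(accuracy)   # 0.8
```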


Note that there are many ways to improve this model. To keep the example simple, we didn't split the data into training and test datasets, nor did we use cross-validation. In a real application, you'd want to do that, and also experiment with different model structures, different forms of regularization, and other techniques to optimize your model's performance. You might also want to save your trained model for future use: Keras's `model.save()` method stores the whole model (architecture, weights, and training configuration), while `model.save_weights()` stores the weights only.
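
The improvements listed above might be combined as in the following sketch. It runs on randomly generated stand-in data rather than the diabetes dataset, and the layer sizes, dropout rate, patience, file name, and variable names are all assumptions made for illustration, not tuned recommendations. It also assumes scikit-learn is installed.

```python
import os
import tempfile

import numpy as np
from keras import Input
from keras.callbacks import EarlyStopping
from keras.layers import Dense, Dropout
from keras.models import Sequential, load_model
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an 8-feature dataset (random values, not real patient data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 8)).astype("float32")
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype("float32")

# Hold out 20% of the rows for the final evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

demo_model = Sequential()
demo_model.add(Input(shape=(8,)))
demo_model.add(Dense(12, activation="relu"))
demo_model.add(Dropout(0.2))  # regularization: randomly zero 20% of activations while training
demo_model.add(Dense(1, activation="sigmoid"))
demo_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Stop when the validation loss has not improved for 5 epochs in a row
early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)
history = demo_model.fit(
    X_train, y_train,
    validation_split=0.2,   # part of the training rows is used for validation
    epochs=30, batch_size=16,
    callbacks=[early_stop], verbose=0,
)

# Evaluate once on rows the model has never seen
test_loss, test_accuracy = demo_model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {test_accuracy:.2f}")

# Save the whole model and load it back
model_path = os.path.join(tempfile.mkdtemp(), "demo_model.keras")
demo_model.save(model_path)
restored = load_model(model_path)
```

The restored model should produce the same predictions as the original, which is a quick way to confirm that saving worked.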


# Reflection Points

1. **Data Cleaning, Transformation, and Preparation for Modeling**:
   - How do you handle missing values in a dataset? (Answer: Missing values can be imputed using techniques such as mean, median, or interpolation, or the rows with missing values can be dropped depending on the context.)
   - What are some common techniques for handling outliers in data? (Answer: Outliers can be detected with statistical measures such as z-scores or percentile and interquartile-range limits, and then treated by removing, capping, or transforming the affected values.)
   - Why is feature scaling important in machine learning models? (Answer: Feature scaling ensures that different features have a similar scale, preventing certain features from dominating the model's learning process.)

2. **Building and Compiling a Keras Model**:
   - How do you define a sequential model in Keras? (Answer: A sequential model is defined by stacking layers one after another using the `Sequential` class in Keras.)
   - What are activation functions and why are they used in neural networks? (Answer: Activation functions introduce non-linearity and help the model learn complex patterns in the data.)
   - What is the purpose of compiling a Keras model? (Answer: Compiling a model defines the loss function, optimizer, and metrics to be used during the training process.)

3. **Training the Model and Making Predictions**:
   - How do you split a dataset into training and testing sets? (Answer: The dataset can be split using functions like `train_test_split` from scikit-learn, allocating a certain percentage of the data for testing.)
   - What is the purpose of training a machine learning model? (Answer: Training involves fitting the model to the training data, allowing it to learn the underlying patterns and relationships in the data.)
   - How do you evaluate the performance of a trained model? (Answer: Performance evaluation can be done using metrics such as accuracy, precision, recall, F1-score, or using techniques like cross-validation.)
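
As an illustration of the first reflection point, the sketch below marks placeholder zeros as missing and imputes them with the column median. The DataFrame `toy` and its values are made up; the assumption that a zero means "not recorded" holds for measurements such as glucose or BMI, but would be wrong for a column like the number of pregnancies.

```python
import pandas as pd

# Made-up values for illustration; a 0 here stands in for "not recorded"
toy = pd.DataFrame({
    "Glucose": [148, 85, 0, 89, 137],
    "BMI": [33.6, 26.6, 23.3, 0.0, 43.1],
})

# Replace zeros with NaN so pandas treats them as missing
cleaned = toy.mask(toy == 0)
n_missing = cleaned.isna().sum()

# Impute each missing value with the median of its column
imputed = cleaned.fillna(cleaned.median())

print(n_missing)
print(imputed)
```

The median is used here because it is less sensitive to extreme values than the mean; dropping the affected rows with `cleaned.dropna()` is the other option mentioned in the answer above.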


# Exercise


Task 1: Data Preparation and Preprocessing

1. Load the dataset from the given URL.
2. Explore the dataset and check for any missing values.
3. Split the dataset into features (X) and the target variable (y).
4. Split the data into training and testing sets.

Task 2: Build and Train the Neural Network

1. Build a sequential neural network using Keras with the following architecture:
   - Input layer with appropriate input shape (number of features in X).
   - One or more hidden layers with ReLU activation.
   - Output layer with a sigmoid activation function (since it's a binary classification problem).

2. Compile the model with an appropriate loss function and optimizer for binary classification.
3. Train the model using the training data for a reasonable number of epochs.

Task 3: Evaluate the Model

1. Evaluate the trained model using the testing data.
2. Calculate the accuracy and other relevant metrics (e.g., precision, recall, F1-score) for the model's performance.



# Sample Solution

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense

# Task 1: Data Preparation and Preprocessing
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
names = ["preg", "plas", "pres", "skin", "insu", "mass", "pedi", "age", "class"]
df = pd.read_csv(url, names=names)

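# Check for missing values. A count of 0 here does not rule out missing data:
# zeros in measurement columns such as "plas", "pres", "skin", "insu", and "mass"
# are commonly treated as unrecorded values in this dataset.
print(df.isnull().sum())
# Count zeros per feature column as a rough check for unrecorded measurements
print((df.iloc[:, :-1] == 0).sum())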

# Split features (X) and target variable (y)
X = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

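# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features: fit the scaler on the training split only,
# then apply the same transformation to the test split
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)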

# Task 2: Build and Train the Neural Network
model = Sequential()
model.add(Dense(8, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(X_train, y_train, epochs=50, batch_size=10, verbose=1)

# Task 3: Evaluate the Model
_, accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy: {:.2f}%".format(accuracy * 100))
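
Task 3 also asks for precision, recall, and F1-score. One way to compute them is with `sklearn.metrics`, sketched here on hypothetical arrays: `y_test_demo` and `y_prob_demo` are made-up stand-ins for `y_test` and the flattened output of `model.predict(X_test)`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

# Made-up labels and predicted probabilities for illustration
y_test_demo = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob_demo = np.array([0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.7, 0.3])

# Turn probabilities into class labels with a 0.5 threshold
y_pred_demo = (y_prob_demo > 0.5).astype(int)

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_test_demo, y_pred_demo)
precision = precision_score(y_test_demo, y_pred_demo)
recall = recall_score(y_test_demo, y_pred_demo)
f1 = f1_score(y_test_demo, y_pred_demo)

print(cm)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1: {f1:.2f}")
```

With the trained model from the solution, the same calls would take `y_test` and `(model.predict(X_test) > 0.5).astype(int).flatten()` as inputs.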


# A quiz on Model Building


1. What is Keras?
   <br>a) A deep learning library for Java
   <br>b) A high-level API for building and training deep learning models
   <br>c) A programming language for machine learning
   <br>d) A database management system

2. What does the `Sequential` model in Keras allow you to do?
   <br>a) Build complex recurrent neural networks
   <br>b) Stack layers linearly to create feedforward neural networks
   <br>c) Create unsupervised learning models
   <br>d) Combine different types of neural network architectures

3. How do you add layers to a Keras model?
   <br>a) `add_layer()` method
   <br>b) `add()` function
   <br>c) `add_layer()` function
   <br>d) `add()` method

4. What is the purpose of compiling a Keras model?
   <br>a) Save the model to disk
   <br>b) Prepare the model for deployment
   <br>c) Convert the model to TensorFlow format
   <br>d) Specify the optimizer, loss function, and metrics for training

5. Which optimizer, an adaptive variant of stochastic gradient descent, is used in this notebook's `compile()` calls?
   <br>a) RMSprop
   <br>b) AdaBoost
   <br>c) L-BFGS
   <br>d) Adam

6. What is the purpose of the `fit()` method in Keras?
   <br>a) Evaluate the model on a test dataset
   <br>b) Fine-tune the model's hyperparameters
   <br>c) Compile the model with an optimizer
   <br>d) Train the model on training data

7. How do you evaluate a trained Keras model on a test dataset?
   <br>a) `evaluate_model()`
   <br>b) `evaluate()`
   <br>c) `predict()`
   <br>d) `score()`

8. Which technique helps prevent overfitting by randomly setting a fraction of a layer's activations to zero during training?
   <br>a) Data augmentation
   <br>b) Early stopping
   <br>c) Dropout
   <br>d) Batch normalization

---
**Answers:**

1. b) A high-level API for building and training deep learning models.

2. b) Stack layers linearly to create feedforward neural networks.

3. d) `add()` method.

4. d) Specify the optimizer, loss function, and metrics for training.

5. d) Adam.

6. d) Train the model on training data.

7. b) `evaluate()`.

8. c) Dropout.

---