# Submodule 4 - Model Building, Evaluation, Interpretation, and Deployment


## Overview
In this submodule, you will explore common model types and model evaluation techniques, examine interpretability methods, and
learn best practices for real-world deployment that support the responsible and effective use of AI/ML.

## Learning Objectives
By the end of this submodule, you should be able to:

+ Gain a comprehensive understanding of the AI/ML model building process.
+ Master essential techniques for evaluating model performance and identifying potential biases.
+ Develop skills in interpreting AI/ML models and understanding their decision-making processes.
+ Learn about the challenges and best practices for deploying AI/ML models in real-world biomedical settings.

## Prerequisites
* An AWS account with access to Amazon SageMaker
* Basic understanding of Python programming

## Get Started
- Watch the Lecture Videos.
- Complete the Quizzes to solidify your understanding.
- Enhance your programming skills with Tutorials.
- Challenge yourself with the Exercises.

## 1. ML Models and Model Evaluation

The lecture covers machine learning (ML) models and methods for evaluating their performance. A machine learning model defines the relationship between input (independent) and output (dependent) variables in a dataset. Different types of models exist depending on the nature of the input-output relationship, including regression, classification, and clustering models.

**Regression models**, such as linear and non-linear regression, are supervised learning methods primarily used to predict continuous variables. Simple linear regression examines the linear relationship between one independent and one dependent variable, while multiple linear regression incorporates additional predictors. Non-linear regression models capture curved relationships, for example by applying polynomial transformations to the predictors, and are likewise used for continuous output variables.
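As a minimal sketch of simple linear regression, the block below fits a slope and intercept by ordinary least squares using only the Python standard library. The function name `fit_simple_linear_regression` and the toy `dose`/`response` values are illustrative assumptions, not material from the lecture. In practice, a library such as scikit-learn would typically be used instead.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) minimizing the squared error of y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The slope is the covariance of x and y divided by the variance of x.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data generated from y = 2x + 1 with no noise.
dose = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_simple_linear_regression(dose, response)
prediction = slope * 6.0 + intercept  # prediction for an unseen input
print(slope, intercept, prediction)
```

Because the toy data lie exactly on a line, the fitted parameters recover the generating slope and intercept. Real data would leave a non-zero residual error.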

**Classification models**, another category of supervised learning, output categorical labels instead of continuous values. They are widely used in tasks such as spam detection and image classification. These models include linear models (logistic regression), non-parametric models (k-nearest neighbors), tree-based methods (decision trees and random forests), and neural networks. Logistic regression, for instance, uses probability to classify inputs and supports binary, multi-class, and ordinal categorization. Support Vector Machines (SVM) and decision trees can handle both classification and regression tasks, offering flexible options depending on data structure and desired output.
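To make the idea of a non-parametric classifier concrete, here is a small sketch of k-nearest neighbors written with the standard library. The helper `knn_predict`, the two-dimensional points, and the "active"/"inactive" labels are hypothetical examples chosen for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k neighbors is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training set with two well-separated classes.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["inactive", "inactive", "inactive", "active", "active", "active"]

print(knn_predict(train_points, train_labels, (1.5, 1.5)))
print(knn_predict(train_points, train_labels, (6.5, 6.5)))
```

The model stores the training data rather than learning coefficients, which is what makes it non-parametric. The choice of `k` is a hyperparameter.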

**Clustering models** represent unsupervised learning techniques. These models group unlabeled data points based on similarity, commonly using methods like K-means (partition-based clustering), hierarchical clustering, and density-based clustering. K-means divides data into a predefined number of groups, while hierarchical clustering creates a tree-like structure of data groupings, allowing for a flexible number of clusters. Density-based clustering forms clusters from areas of high data density, though it may struggle with datasets that vary in density or have many dimensions.
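The two alternating steps of K-means (assign each point to its nearest centroid, then move each centroid to the mean of its members) can be sketched as follows. This is an illustrative implementation, not the lecture's code. It assumes the first `k` points are used as initial centroids, which keeps the example deterministic but is a weaker choice than the seeding strategies used by libraries.

```python
import math


def kmeans(points, k, n_iter=10):
    """Return (labels, centroids) after n_iter rounds of assignment and update."""
    # Assumption: seed the centroids with the first k points.
    centroids = [points[i] for i in range(k)]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return labels, centroids


# Hypothetical unlabeled data forming two groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The number of clusters `k` must be chosen before fitting, which is the "predefined" aspect of K-means noted above.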

The lecture also addresses model evaluation, which is essential for tuning models and improving prediction accuracy. For classification models, common evaluation metrics include accuracy, precision, recall, and the F1 score. The Receiver Operating Characteristic (ROC) curve and its Area Under the Curve (AUC) summarize a classifier's performance across decision thresholds, where an AUC closer to 1 indicates better discrimination. Regression model evaluation often relies on metrics like the Coefficient of Determination (R²) and Mean Squared Error (MSE) to quantify model fit and prediction error. Clustering models use external validation measures, such as homogeneity and completeness, as well as internal measures, such as compactness and separation, to assess clustering quality.
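The following sketch computes the classification and regression metrics named above directly from their definitions. The function names and the small label and value lists are assumptions made for illustration. Libraries such as scikit-learn provide equivalent, more thoroughly tested functions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def regression_metrics(y_true, y_pred):
    """Mean squared error and coefficient of determination (R^2)."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {"mse": ss_res / n, "r2": 1.0 - ss_res / ss_tot}


# Hypothetical labels: 3 true positives, 1 false negative, 1 false positive.
clf = classification_metrics(
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
)
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(clf)
print(reg)
```

Precision and recall can diverge sharply on imbalanced data even when accuracy looks high, which is why they are usually reported together.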

Overall, this lecture provides a comprehensive overview of ML models and their respective evaluation strategies, emphasizing practical and theoretical approaches to ensure robust and effective model performance.

### Lecture Video

```python
from IPython.display import YouTubeVideo

# Embed the lecture video from YouTube
YouTubeVideo(id='ml_models_and_model_evaluation', height=200, width=400)
```

### Lecture Slides

Download the lecture slides [ML Models and Model Evaluation](Submodule_4/Lectures/Submodule_4_Lecture_1_ML_Models_and_Model_Evaluation.pptx).

### Quizzes

```python
%pip install jupyterquiz
from jupyterquiz import display_quiz
display_quiz("Submodule_4/Quizzes/Submodule_4_Quiz_1_ML_Models_and_Model_Evaluation.json")
```

## 2. Model Tuning, Model Interpretation, and Model Deployment

The lecture focuses on three main aspects of machine learning: model tuning, interpretation, and deployment. Model tuning involves adjusting hyperparameters, which are preset parameters in an algorithm that guide the model's learning process but are not derived from the data itself. Examples include the learning rate in neural networks and the regularization penalty in logistic regression. Tuning aims to balance the bias-variance tradeoff, a critical factor in model performance. High bias can lead to underfitting, where the model fails to capture relevant patterns in the data, while high variance can lead to overfitting, where the model performs well on training data but poorly on new data. Techniques like cross-validation, specifically K-fold and leave-one-out cross-validation, are commonly used to evaluate and fine-tune models.
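As a rough illustration of how K-fold cross-validation partitions a dataset, the sketch below generates train and validation index lists so that every sample is used for validation exactly once. The generator name `k_fold_indices` is hypothetical, and the sketch assumes no shuffling, whereas real workflows usually shuffle (and often stratify) before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, non-overlapping folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# 5-fold split of 10 samples: each fold holds out 2 samples for validation.
folds = list(k_fold_indices(10, 5))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)

# Leave-one-out cross-validation is the special case k == n_samples.
loo_folds = list(k_fold_indices(4, 4))
```

A model would be trained on each `train_idx` and scored on the matching `val_idx`, and the scores averaged to estimate performance on new data.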

Hyperparameter tuning strategies, such as grid search and randomized search, help optimize model performance. Grid search exhaustively evaluates every combination of values in a manually defined parameter grid, so its cost grows quickly as hyperparameters are added. Randomized search instead samples a fixed number of candidates from specified parameter distributions, which often makes it more efficient when the search space is large. Combined with cross-validation, both methods provide a structured approach to finding well-performing hyperparameters.
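The difference between the two strategies can be sketched with a toy objective. Here `validation_score` is a hypothetical stand-in for a cross-validated score, and the hyperparameter names, grid values, and sampling ranges are assumptions chosen for illustration only.

```python
import itertools
import random


def validation_score(learning_rate, penalty):
    """Hypothetical stand-in for a cross-validated score (higher is better)."""
    return -((learning_rate - 0.1) ** 2) - ((penalty - 1.0) ** 2)


def grid_search(score_fn, grid):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def randomized_search(score_fn, samplers, n_iter, seed=0):
    """Evaluate n_iter randomly sampled candidates and keep the best."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: sample(rng) for name, sample in samplers.items()}
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


param_grid = {"learning_rate": [0.01, 0.1, 1.0], "penalty": [0.1, 1.0, 10.0]}
# Sample each hyperparameter log-uniformly over an assumed range.
samplers = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),
    "penalty": lambda rng: 10 ** rng.uniform(-2, 2),
}

grid_best, grid_score = grid_search(validation_score, param_grid)
rand_best, rand_score = randomized_search(validation_score, samplers, n_iter=20)
print(grid_best, grid_score)
print(rand_best, rand_score)
```

Grid search here costs 9 evaluations and can only return values listed in the grid. Randomized search has a budget set by `n_iter` regardless of how many hyperparameters are tuned.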

Model interpretation is vital for understanding how a model makes predictions, especially in complex or "black box" algorithms. Interpretable models, like decision trees, offer insight into variable importance and feature interactions directly. Tools such as Skater, an open-source Python library, can be used for model-agnostic interpretation, allowing both global and local interpretations to clarify how specific features influence predictions. This interpretability is crucial for debugging models, enhancing transparency, and explaining results to stakeholders.
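One widely used model-agnostic technique for global interpretation is permutation feature importance: shuffle one feature at a time and measure how much the prediction error grows. The sketch below illustrates that idea with the standard library. It does not use Skater's API, and the `black_box` model and data are hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only feature j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted_error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def black_box(row):
    """Hypothetical trained model: depends on feature 0 and ignores feature 1."""
    return 3.0 * row[0] + 0.0 * row[1]


X = [[float(i), float((i * 7) % 10)] for i in range(10)]
y = [black_box(row) for row in X]

importances = permutation_importance(black_box, X, y)
print(importances)
```

Shuffling the ignored feature leaves the error unchanged, so its importance is zero, while shuffling the feature the model relies on raises the error. The procedure only calls `predict`, so it works for any model.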

Finally, model deployment involves saving the trained model and making it accessible for future predictions. One common approach is model persistence, where the model is stored on a permanent medium and later loaded for batch or real-time predictions. Another method is custom development, in which the prediction logic is separated from the training process, enabling scalable deployment solutions. Best practices in deployment emphasize adherence to the FAIR data principles, which ask that data be findable, accessible, interoperable, and reusable, and which promote long-term model usability and reliability.
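A minimal sketch of model persistence, assuming the trained model can be represented as a dictionary of learned parameters: the parameters are written to disk with Python's `pickle` module, reloaded, and passed to a separate prediction function. The file name `model.pkl`, the parameter values, and the `predict` helper are illustrative assumptions.

```python
import os
import pickle
import tempfile

# Hypothetical learned parameters produced by an earlier training step.
trained_params = {"slope": 2.0, "intercept": 1.0}


def predict(params, x):
    """Prediction logic kept separate from training; it only needs the parameters."""
    return params["slope"] * x + params["intercept"]


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")

    # Persist the trained parameters to a file.
    with open(model_path, "wb") as fh:
        pickle.dump(trained_params, fh)

    # Later, possibly in another process, reload them for inference.
    with open(model_path, "rb") as fh:
        restored_params = pickle.load(fh)

# Batch prediction with the restored model.
batch_predictions = [predict(restored_params, x) for x in [0.0, 1.0, 2.0]]
print(batch_predictions)
```

Only unpickle files from trusted sources, because loading a pickle can execute arbitrary code. Recording the library versions used for training alongside the saved file also helps keep the model reusable.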

This lecture provides an overview of model tuning, interpretation, and deployment, emphasizing practical techniques and considerations to maximize a model’s performance, interpretability, and utility in real-world applications.

### Lecture Video

```python
from IPython.display import YouTubeVideo

# Embed the lecture video from YouTube
YouTubeVideo(id='model_tunning_interpretation_deployment', height=200, width=400)
```

### Lecture Slides

Download the lecture slides [Model Tuning, Model Interpretation, and Model Deployment](Submodule_4/Lectures/Submodule_4_Lecture_2_Model_Tuning_Interpretation_Deployment.pptx).

### Quizzes

```python
%pip install jupyterquiz
from jupyterquiz import display_quiz
display_quiz("Submodule_4/Quizzes/Submodule_4_Quiz_2_Model_Tuning_Interpretation_Deployment.json")
```

## 3. Tutorials
+ [Model Building and Evaluation](Submodule_4/Tutorials/Submodule_4_Tutorial_1_Model_Building_and_Evaluation.ipynb)
+ [Model Tuning, Model Interpretation, and Model Deployment](Submodule_4/Tutorials/Submodule_4_Tutorial_2_Model_Tunning_Interpretation_Deployment.ipynb)
+ [Predict Drug Activity for Androgen Receptor](Submodule_4/Tutorials/Submodule_4_Tutorial_3_Predict_Drug_Activity_for_Androgen_Receptor.ipynb)

## 4. Exercises
+ [Exploratory Analysis of Wine Types and Quality Data](Submodule_4/Exercises/Submodule_4_Exercise_1_Exploratory_Analysis_of_Wine_Types_and_Quality_Data.ipynb) ([Solution](Submodule_4/Exercises/Submodule_4_Exercise_1_Exploratory_Analysis_of_Wine_Types_and_Quality_Data_Solution.ipynb))
+ [Predicting Wine Types](Submodule_4/Exercises/Submodule_4_Exercise_2_Predicting_Wine_Types.ipynb) ([Solution](Submodule_4/Exercises/Submodule_4_Exercise_2_Predicting_Wine_Types_Solution.ipynb))
+ [Predicting Wine Quality](Submodule_4/Exercises/Submodule_4_Exercise_3_Predicting_Wine_Quality.ipynb) ([Solution](Submodule_4/Exercises/Submodule_4_Exercise_3_Predicting_Wine_Quality_Solution.ipynb))

## Conclusions
This submodule covered machine learning (ML) models and their evaluation, tuning, interpretation, and deployment. ML models, such as regression, classification, and clustering models, are used to analyze and predict patterns in data. Evaluation metrics, like accuracy, precision, recall, F1 score, and ROC AUC, assess their performance. Model tuning optimizes hyperparameters to balance the bias-variance trade-off and improve accuracy. Interpretable models such as decision trees, together with model-agnostic tools such as Skater, help explain how models make predictions. Finally, model deployment involves saving models and making them accessible for future use, adhering to the FAIR data principles for long-term usability.

## Clean up
Remember to shut down your VM and delete any related resources.