# 🤖 Use Pre-trained CNN Models to Identify MPox

Here, we use pre-trained CNN models to distinguish MPox from other dermatological conditions. The dataset contains images for the following 14 classes:

- Actinic keratoses
- Basal cell carcinoma
- Benign keratosis-like lesions
- Chickenpox
- Cowpox
- Dermatofibroma
- Healthy
- HFMD
- Measles
- Melanocytic nevi
- Melanoma
- Monkeypox
- Squamous cell carcinoma
- Vascular lesions

We are going to use the following pre-trained CNN models to identify MPox.

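- ***ResNet50V2***
  - ResNet50V2 is the second version of the ResNet50 deep convolutional neural network architecture. It uses pre-activation residual blocks, in which batch normalization and the ReLU activation are applied before each convolution, and it keeps the shortcut path as a clean identity mapping wherever possible. These changes are intended to improve training stability and performance on image recognition tasks.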
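
To make the block structure concrete, here is a minimal NumPy sketch of one residual block in the V2 ordering, where batch normalization and ReLU come before each weight layer and the input is added back unchanged. It is an illustration only, not the Keras implementation: dense matrices stand in for convolutions, the learnable batch-normalization scale and shift are omitted, and the function names are made up for this example.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature (column) to zero mean and unit variance over the batch."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)

def pre_activation_block(x, w1, w2):
    """Toy residual block in the pre-activation ordering (hypothetical helper).

    Dense layers stand in for convolutions in this sketch.
    """
    out = relu(batch_norm(x)) @ w1    # BN -> ReLU -> weights
    out = relu(batch_norm(out)) @ w2  # BN -> ReLU -> weights
    return x + out                    # identity shortcut, no activation after the sum

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))          # batch of 8 samples with 4 features
w1 = rng.normal(size=(4, 4)) * 0.1
w2 = rng.normal(size=(4, 4)) * 0.1
y = pre_activation_block(x, w1, w2)
print(y.shape)  # (8, 4)
```

Because the shortcut is a plain addition, the block returns its input unchanged when the weights are zero, which is one intuition for why deep stacks of such blocks remain trainable.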

## 🗂️ Import Libraries

The following code block imports the libraries used throughout this notebook.

#### 🔢 `numpy`
- NumPy is a powerful Python library used for numerical computing, providing support for large, multi-dimensional arrays, matrices, and high-level mathematical functions to operate on them efficiently.

#### 🐼 `pandas`
- Pandas is a Python library designed for data manipulation and analysis, offering easy-to-use data structures like DataFrames and Series for handling structured data efficiently.

#### 👣 `pathlib`
- Pathlib is a module in the Python standard library that provides an object-oriented interface for working with filesystem paths, making path manipulation and file operations more intuitive and cross-platform.
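
As a small illustration of that object-oriented interface, the sketch below builds a throwaway folder tree with one sub-folder per class, finds every `.jpg` recursively, and reads each label from the parent folder name. The file names and the temporary directory are made up for the example; only the one-folder-per-class layout mirrors this dataset.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "train"
    # Hypothetical layout: one sub-folder per class, with placeholder files
    for class_name, file_name in [("Monkeypox", "a.jpg"), ("Healthy", "b.jpg")]:
        (root / class_name).mkdir(parents=True)
        (root / class_name / file_name).touch()

    # Recursively find every .jpg and take the label from the parent folder name
    found = sorted((p.parent.name, p.name) for p in root.glob("**/*.jpg"))

print(found)  # [('Healthy', 'b.jpg'), ('Monkeypox', 'a.jpg')]
```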

#### 📈 `matplotlib`
- Matplotlib is a Python library for creating static, interactive, and animated visualizations in a variety of formats, including plots, graphs, and charts.

#### ⚡ `tensorflow`
- TensorFlow is an open-source library designed for building and deploying machine learning and deep learning models, offering a flexible ecosystem for numerical computation and AI development.

#### 🌊 `seaborn`
- Seaborn is a Python library built on Matplotlib that simplifies creating aesthetically pleasing and informative statistical graphics for data visualization.

#### 💻 `os`
- The os module in the Python standard library provides a way to interact with the operating system, enabling tasks such as file and directory manipulation, environment variable access, and process management.

#### 🔬 `scikit-learn`
- Scikit-learn is a Python library that provides simple and efficient tools for data mining, data analysis, and machine learning, including classification, regression, and clustering algorithms.

In [7]:
## 🗂️ Import Libraries

import numpy as np
import pandas as pd
from pathlib import Path
import os
import matplotlib.pyplot as plt
from IPython.display import Image, display
import matplotlib.cm as cm
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns
import tensorflow as tf

## ♾️ Constants

In [8]:
## ♾️ Constants

### Directory Paths
ORIGINAL_DATA_PATH = './data/original'
AUGMENTED_DATA_PATH = './data/augmented'
TRAIN_DIRECTORY = 'train'
TEST_DIRECTORY = 'test'
VALIDATION_DIRECTORY = 'val'

### Directories containing the images
DATA_DIRECTORIES = [
    'Actinic keratoses',
    'Basal cell carcinoma',
    'Benign keratosis-like lesions',
    'Chickenpox',
    'Cowpox',
    'Dermatofibroma',
    'Healthy',
    'HFMD',
    'Measles',
    'Melanocytic nevi',
    'Melanoma',
    'Monkeypox',
    'Squamous cell carcinoma',
    'Vascular lesions',
]
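
The constants above can be combined into concrete directory paths. The sketch below shows one way to do that with `pathlib`; `split_dir` and `class_dir` are hypothetical helpers, not part of the notebook, and the example only relies on the train split having one sub-folder per class.

```python
from pathlib import Path

# Values copied from the constants cell
ORIGINAL_DATA_PATH = './data/original'
TRAIN_DIRECTORY = 'train'

def split_dir(data_root, split):
    """Hypothetical helper: directory that holds one split of a dataset."""
    return Path(data_root) / split

def class_dir(data_root, split, class_name):
    """Hypothetical helper: directory that holds one class within a split."""
    return split_dir(data_root, split) / class_name

train_dir = split_dir(ORIGINAL_DATA_PATH, TRAIN_DIRECTORY)
mpox_train_dir = class_dir(ORIGINAL_DATA_PATH, TRAIN_DIRECTORY, 'Monkeypox')

print(train_dir.as_posix())       # data/original/train
print(mpox_train_dir.as_posix())  # data/original/train/Monkeypox
```

Note that `Path` drops the leading `./`, which is why the paths shown in the data frame later in this notebook start with `data/original/...`.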


## ✨ Part 1 - Create Models using Original Images

In this section, we create the models using only the original images, not the augmented ones. The training dataset already contains a sufficient amount of data, so we first check whether we can reach good accuracy without augmentation.

If the accuracies are low, we will create the models again using the augmented images.

## 🗃️ Creating File Data Frame

In [12]:
## 🗃️ Creating File Data Frame

### Load the base directory path
path = os.path.join(ORIGINAL_DATA_PATH, TRAIN_DIRECTORY)
image_dir = Path(path)

### Get file paths and assign labels
### Each image's label is the name of its parent (class) directory
file_paths = list(image_dir.glob('**/*.jpg'))
labels = [file_path.parent.name for file_path in file_paths]

### Convert both lists to named Series so they can be combined into one data frame
file_paths = pd.Series(file_paths, name='Path').astype(str)
labels = pd.Series(labels, name='Label')

### Concatenate file paths and labels
image_df = pd.concat([file_paths, labels], axis=1)

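### Sample 190 images from each class so that every class is equally represented
### Otherwise, the dataset would be biased toward the larger classes
### Note: sample() raises a ValueError if a class has fewer than 190 images
samples = []
for label in image_df['Label'].unique():
    samples.append(image_df[image_df['Label'] == label].sample(190, random_state=42))

### Combine the per-class samples and shuffle the rows
image_df = pd.concat(samples, axis=0).sample(frac=1.0, random_state=42).reset_index(drop=True)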

### Show the results
image_df.head(10)

| | Path | Label |
|---|---|---|
| 0 | data/original/train/Squamous cell carcinoma/IS... | Squamous cell carcinoma |
| 1 | data/original/train/Benign keratosis-like lesi... | Benign keratosis-like lesions |
| 2 | data/original/train/HFMD/HFMD_98_01_7.jpg | HFMD |
| 3 | data/original/train/Chickenpox/CHP_23_01_2.jpg | Chickenpox |
| 4 | data/original/train/Measles/MSL_34_01_5.jpg | Measles |
| 5 | data/original/train/Actinic keratoses/ISIC_006... | Actinic keratoses |
| 6 | data/original/train/Chickenpox/CHP_25_01_7.jpg | Chickenpox |
| 7 | data/original/train/Squamous cell carcinoma/IS... | Squamous cell carcinoma |
| 8 | data/original/train/Benign keratosis-like lesi... | Benign keratosis-like lesions |
| 9 | data/original/train/Healthy/HEALTHY_10_01_7.jpg | Healthy |
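
The per-class sampling above assumes that every class has at least 190 training images; pandas raises a `ValueError` when `sample` is asked for more rows than exist (without replacement). The sketch below wraps the same sampling logic in a hypothetical helper, `balanced_sample`, that checks the class counts first and reports which classes are too small. The toy data frame and its paths are made up for the example.

```python
import pandas as pd

def balanced_sample(df, per_class, label_col='Label', seed=42):
    """Draw `per_class` rows from every class, failing early with a clear message.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    counts = df[label_col].value_counts()
    too_small = counts[counts < per_class]
    if not too_small.empty:
        raise ValueError(
            f"Classes with fewer than {per_class} rows: {too_small.to_dict()}"
        )
    # Same per-class sampling as in the notebook cell
    parts = [
        df[df[label_col] == label].sample(per_class, random_state=seed)
        for label in df[label_col].unique()
    ]
    # Combine the per-class samples and shuffle the rows
    return pd.concat(parts, axis=0).sample(frac=1.0, random_state=seed).reset_index(drop=True)

# Toy data frame with made-up paths: 5 Monkeypox rows and 4 Healthy rows
toy = pd.DataFrame({
    'Path': [f'img_{i}.jpg' for i in range(9)],
    'Label': ['Monkeypox'] * 5 + ['Healthy'] * 4,
})

balanced = balanced_sample(toy, per_class=3)
print(balanced['Label'].value_counts().to_dict())
```

Calling `balanced_sample(image_df, per_class=190)` on the real data frame would behave like the notebook cell when all classes are large enough, and otherwise name the classes that fall short.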
