# Skin Problem Classification Notebook 
---

## Team Information

**Team ID:** C241-PS385  

**Members:**    
- Stefanus Bernard Melkisedek - [GitHub Profile](https://github.com/stefansphtr)
- Debby Trinita - [GitHub Profile](https://github.com/debbytrinita)
- Mhd. Reza Kurniawan Lubis - [GitHub Profile](https://github.com/rezakur)

## Chosen Development Environment

We chose Google Colab as our primary development environment, mainly because it provides free access to GPU and TPU resources. These accelerators considerably shorten model training time.

## 1. Import Libraries

In [None]:
# Standard library imports
import os
import random
import shutil
import zipfile

# Third-party imports
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

## 2. Load and Preprocess Data

This section covers the data loading and preprocessing steps.

The step-by-step process is as follows:
1. **Mounting Google Drive** - This step is necessary to access the dataset stored in Google Drive.
   
2. **Extracting the Dataset** - The dataset is stored in a zip file. We will extract the contents of the zip file to access the dataset.
   
3. **Copying the Data to the Local Directory** - We will copy the dataset to the local directory to facilitate data loading.
   
4. **Defining Directories and Parameters** - We will define the directories and parameters required for data loading.
   
5. **Checking Column Names** - We will check the column names to ensure that they are clean and consistent.
   
6. **Cleaning Column Names** - We will clean the column names to ensure that they are consistent and easy to work with.

### 2.1 Mounting Google Drive

The code mounts Google Drive to access the dataset stored there.

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
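
Once the drive is mounted, it can help to confirm that the expected folders are reachable before running the later cells. The following is a minimal sketch, not part of the original notebook: `missing_paths` is a hypothetical helper name, and the path in the usage comment is taken from the cells in this notebook.

```python
import os


def missing_paths(paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [path for path in paths if not os.path.exists(path)]


# Example usage in Colab, after drive.mount(...) has finished:
# missing = missing_paths(
#     ['/content/drive/Shareddrives/Capstone_Project/Machine_Learning/data/']
# )
# if missing:
#     print(f"These paths are not reachable: {missing}")
```

An empty list means every path was found, so the check can be dropped into a cell without changing the rest of the workflow.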

### 2.2 Extracting the Dataset

The code defines the path to the dataset zip file and the path where the dataset will be extracted. It then checks if the data is already extracted. If not, it extracts the zip file.

In [None]:
# Define the path to the dataset zip file
dataset_zip_file_path = '/content/drive/Shareddrives/Capstone_Project/Machine_Learning/data/skin_problem_dataset.zip'

# Define the path where the dataset will be extracted
extraction_path = '/content/drive/Shareddrives/Capstone_Project/Machine_Learning/data/'

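# Check if the data is already extracted. The zip file sits inside
# extraction_path, so that folder always exists; look for any entry other
# than the zip file itself instead (this assumes the folder holds nothing
# but the zip file and its extracted contents).
zip_file_name = os.path.basename(dataset_zip_file_path)
already_extracted = os.path.isdir(extraction_path) and any(
    entry != zip_file_name for entry in os.listdir(extraction_path)
)

if not already_extracted:
    try:
        # Open the dataset zip file in read mode
        with zipfile.ZipFile(dataset_zip_file_path, 'r') as dataset_zip_file:
            # Extract all files from the dataset zip file to the defined path
            dataset_zip_file.extractall(extraction_path)
    except Exception as e:
        print(f"An error occurred while extracting the zip file: {e}")
else:
    print("Data is already extracted.")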
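
The "already extracted" check can also be made per file by comparing the archive's member list with what is on disk. The sketch below illustrates that idea under our own assumptions: `extract_if_needed` is a hypothetical helper, not part of the original notebook, and it only tests for the presence of each member, not for matching content.

```python
import os
import zipfile


def extract_if_needed(zip_path, target_dir):
    """Extract `zip_path` into `target_dir` unless every member already exists.

    Returns True if extraction ran and False if it was skipped.
    """
    with zipfile.ZipFile(zip_path, 'r') as archive:
        # namelist() returns the relative path of every member in the archive
        members = archive.namelist()
        all_present = all(
            os.path.exists(os.path.join(target_dir, name)) for name in members
        )
        if all_present:
            return False
        archive.extractall(target_dir)
        return True


# Example usage with the variables defined in the cell above:
# extracted = extract_if_needed(dataset_zip_file_path, extraction_path)
```

Because the function only skips when every member is present, a partially extracted dataset (for example after an interrupted session) is extracted again on the next run.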

### 2.3 Copying the Data to the Local Directory

The code defines the source and destination directories and copies the data from Google Drive to the local Colab disk. If the destination directory already exists, the copy is skipped.

In [None]:
# Define source and destination directories
source_dir = '/content/drive/Shareddrives/Capstone_Project/Machine_Learning/data/'
destination_dir = '/content/data/'

In [None]:
# Copy the data to the local environment
try:
    if not os.path.exists(destination_dir):
        shutil.copytree(source_dir, destination_dir)
    else:
        print("Destination directory already exists. Files were not copied.")
except Exception as e:
    print(f"An error occurred while copying files: {e}")
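
Since the source directory also contains the dataset zip file, the copy above brings the archive along with the extracted data. A possible variation, shown here only as a sketch, skips `.zip` files and reports how many files ended up in the destination. `copy_dataset` is a hypothetical helper that is not part of the original notebook, and `dirs_exist_ok` requires Python 3.8 or newer.

```python
import os
import shutil


def copy_dataset(source_dir, destination_dir):
    """Copy `source_dir` to `destination_dir`, skipping .zip archives.

    Returns the number of files found under `destination_dir` afterwards.
    """
    shutil.copytree(
        source_dir,
        destination_dir,
        ignore=shutil.ignore_patterns('*.zip'),  # leave archives on Drive
        dirs_exist_ok=True,  # allow re-running into an existing folder
    )
    # Count every file below the destination as a quick sanity check
    return sum(len(files) for _, _, files in os.walk(destination_dir))


# Example usage with the directories defined above:
# n_files = copy_dataset(source_dir, destination_dir)
# print(f"{n_files} files available in {destination_dir}")
```

Unlike the cell above, this version does not skip the copy when the destination exists; it overwrites files with the same name, which is an intentional difference to keep in mind.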