<a href="https://colab.research.google.com/github/YuanlongZHANG96/COVID19-CT-Team_16/blob/main/DL4H_Team_16.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Final Project - Team 16

- Paper Name: Deep learning enables accurate diagnosis of novel coronavirus (COVID-19) with CT images
- Paper Link: https://www.medrxiv.org/content/10.1101/2020.02.23.20026930v1
- GitHub Repo: https://github.com/biomed-AI/COVID19-CT

---

# Notebook Instructions
- This notebook has been tested in Google Colab with a CUDA-capable GPU runtime.
- The training and testing cells call `.cuda()` directly, so a GPU runtime is required. A GPU may not always be available on the free Colab tier; consider the paid version if you need more GPU availability.

# Introduction

## Background of the Problem
- **Type of Problem**: The paper tackles diagnosing COVID-19 using computed tomography (CT) images, categorizing it under disease prediction and medical image analysis that leverages deep learning for feature extraction and classification.
- **Importance/Meaning of Solving the Problem**: Accurate and rapid diagnosis of COVID-19 is crucial due to its fast spread and severe health implications. CT images are vital diagnostic tools, especially when other testing methods are constrained or slow. Enhancing diagnosis accuracy with AI supports timely treatment and aids in controlling the spread.
- **Difficulty of the Problem**: Diagnosing is challenging due to the subtle differences between COVID-19 and other types of pneumonia visible in CT scans, which require highly accurate models capable of differentiating these fine details.
- **State of the Art Methods and Effectiveness**: Before this study, methods such as standard convolutional neural networks (CNNs) were employed but did not achieve the accuracy needed for fine-grained classification required by CT images. This paper advances these methods by improving both accuracy and interpretability.

## Paper Explanation
- **Proposal**: The research introduces a deep learning-based CT diagnosis system named DRENet, designed to enhance COVID-19 diagnosis from CT images. It incorporates ResNet50 with a Feature Pyramid Network (FPN) for improved feature extraction at multiple scales.
- **Innovations of the Method**: The innovation resides in merging deep learning techniques with attention mechanisms to better detect and classify COVID-19 features in CT scans. Utilizing FPN allows detecting lesions at various scales, thus boosting the model’s ability to identify pertinent features across diverse image presentations.
- **Effectiveness of the Proposed Method**: The method showed high effectiveness with an AUC (Area Under the Curve) of 0.95, recall of 0.96, and precision of 0.79, indicating a robust capability to distinguish between COVID-19 and bacterial pneumonia, which are frequently confused in clinical settings.
- **Contribution to the Research Regime**: The paper's contribution is noteworthy as it not only increases the accuracy of COVID-19 diagnosis through imaging but also aids the interpretability of AI in medical diagnostics, essential for clinical acceptance where comprehending the AI's decision-making process can assist physicians in making informed decisions.

This holistic approach not only pushes forward the technology in medical diagnostics but also sets a foundation for future research into AI applications in medicine, particularly in improving the reliability and usability of such systems in real-world clinical environments.
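
As a reminder of what the recall and precision figures quoted above measure, here is a minimal sketch that computes both from confusion-matrix counts. The helper `precision_recall` and the counts are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts."""
    # Precision: of all predicted positives, how many are truly positive
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all true positives, how many the model found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not from the paper)
precision, recall = precision_recall(tp=24, fp=6, fn=1)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A high recall with a lower precision, as in the paper's reported figures, means few positive cases are missed at the cost of more false alarms.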



# Scope of Reproducibility - TODO:

List hypotheses from the paper you will test and the corresponding experiments you will run.


1.   Hypothesis 1: xxxxxxx
2.   Hypothesis 2: xxxxxxx

You can insert images in this notebook text, [see this link](https://stackoverflow.com/questions/50670920/how-to-insert-an-inline-image-in-google-colaboratory-from-google-drive) and example below:

![sample_image.png](https://drive.google.com/uc?export=view&id=1g2efvsRJDxTxKz-OY3loMhihrEUdBxbc)


# Methodology

The methodology is the core of your project. It consists of runnable code with the necessary annotations to show the experiments you executed to test the hypotheses.

The methodology contains at least two subsections, **data** and **model**.

### Environment and Packages
#### Import the Published Packages

In [None]:
# Uncomment and run the installation below if the packages are not already installed
#!pip install requests numpy torch scikit-learn gdown matplotlib seaborn

In [None]:
import sys
import os
import zipfile
import requests
from io import BytesIO
from datetime import datetime
import pickle
import numpy as np
import torch
import torch.utils.data
from torch.nn import DataParallel
from torch.optim.lr_scheduler import MultiStepLR
from torch.utils.data import DataLoader
from sklearn.metrics import roc_auc_score, mean_squared_error
from math import sqrt
import gdown
import matplotlib.pyplot as plt
import seaborn as sns

#### Download and Import the Private Packages
- Define the download functions

In [None]:
# https://drive.google.com/file/d/1sSbRS4-cNyATJbwqpe97YvGG8qZkulTG/view?usp=sharing
# Function to download and unzip files from GitHub into a target directory
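def download_and_unzip(url, target_folder):
    os.makedirs(target_folder, exist_ok=True)  # Create the target folder if it doesn't exist
    response = requests.get(url, stream=True)
    if response.status_code == 200:
        zipfile_path = os.path.join(target_folder, 'temp.zip')
        with open(zipfile_path, 'wb') as f:
            # Write in chunks so that stream=True avoids holding the whole archive in memory
            for chunk in response.iter_content(chunk_size=1 << 20):
                f.write(chunk)
        with zipfile.ZipFile(zipfile_path, 'r') as zip_ref:
            zip_ref.extractall(target_folder)
        os.remove(zipfile_path)  # Clean up temp file
    else:
        print(f"Failed to download from {url} (HTTP {response.status_code})")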

# Function to download and unzip files from Google Drive to target directory
def gdown_and_unzip(file_id, output, target_folder):
    os.makedirs(target_folder, exist_ok=True)  # Create the target folder if it doesn't exist
    zipfile_path = os.path.join(target_folder, output)
    url = f"https://drive.google.com/uc?id={file_id}"
    gdown.download(url, zipfile_path, quiet=False)

    with zipfile.ZipFile(zipfile_path, 'r') as zip_ref:
        zip_ref.extractall(target_folder)

    os.remove(zipfile_path)  # Clean up temp file



- Call the functions and import the private packages

In [None]:
# Download and extract the ZIP file
file_id = "1sSbRS4-cNyATJbwqpe97YvGG8qZkulTG"
output = "core.zip"
target_folder = '.'

gdown_and_unzip(file_id, output, target_folder)

# Import the required functions from downloaded packages
from core import model, dataset
from core.config import BATCH_SIZE, PROPOSAL_NUM, SAVE_FREQ, LR, WD, resume, save_dir
from core.utils import init_log, progress_bar

### Mode Setup
**demo_mode**
- If **True**: we use a small data sample and train for only a few epochs;
- If **False**: we download the original data and train for the full number of epochs.

In [None]:
demo_mode = True

##  Data - TODO
Data includes raw data (MIMIC III tables), descriptive statistics (our homework questions), and data processing (feature engineering).
  * Source of the data: where the data is collected from; if data is synthetic or self-generated, explain how. If possible, please provide a link to the raw datasets.
  * Statistics: include basic descriptive statistics of the dataset like size, cross validation split, label distribution, etc.
  * Data processing: how you manipulate the data, e.g., changing the class labels, splitting the dataset into train/valid/test, refining the dataset.
  * Illustration: printing results, plotting figures for illustration.
  * You can upload your raw dataset to Google Drive and mount this Colab to the same directory. If your raw dataset is too large, you can upload the processed dataset and have a code to load the processed dataset.

### Load the Data Depending on Mode

In [None]:
raw_data_dir = ""

# Load the data according to the selected mode:
# in demo mode, download the small dataset from Google Drive;
# in regular mode, download the complete dataset from the original GitHub repo.
if not demo_mode:
    # URLs
    url_prefix = "https://github.com/biomed-AI/COVID19-CT/blob/a7f9e65cc2c1dd699b010a8963f6923b9b426ae4/local_traniner/input"
    test_zip_url = url_prefix + "/test.zip?raw=true"
    train_zip_url = url_prefix + "/train.zip?raw=true"
    val_zip_url = url_prefix + "/val.zip?raw=true"

    # Path to the target folder
    target_folder = './input_complete'

    # Ensure the target directory exists
    os.makedirs(target_folder, exist_ok=True)

    # Download and unzip each file
    download_and_unzip(test_zip_url, target_folder)
    download_and_unzip(train_zip_url, target_folder)
    download_and_unzip(val_zip_url, target_folder)
else:
    file_id = "1N2k3Sm3m7aKNQE1b9AbcGrknRzJdvT90"
    output = "input.zip"
    target_folder = './input'
    gdown_and_unzip(file_id, output, target_folder)


def load_image_data():
    # Load the image folders
    if demo_mode:
        train_path = './input/train/'
        val_path = './input/val/'
        test_path = './input/test/'
    else:
        train_path = './input_complete/train/'
        val_path = './input_complete/val/'
        test_path = './input_complete/test/'

    trainset = dataset.SARS(root=train_path, is_train=True)
    valset = dataset.SARS(root=val_path, is_train=False)
    testset = dataset.SARS(root=test_path, is_train=False)
    return trainset, valset, testset

# Load raw data
trainset, valset, testset = load_image_data()

### Calculate Statistics of Data

In [None]:
# Calculate basic statistics of the data
def calculate_stats(trainset, valset, testset):
    # Calculate the number of samples in each set
    num_train_samples = len(trainset)
    num_val_samples = len(valset)
    num_test_samples = len(testset)

    print(f'Number of training samples: {num_train_samples}')
    print(f'Number of validation samples: {num_val_samples}')
    print(f'Number of test samples: {num_test_samples}')

calculate_stats(trainset, valset, testset)
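
The statistics above cover only split sizes. The label distribution could be summarized along the following lines. This is a minimal sketch under an assumption: each sample is an indexable record whose label sits at position 1, mirroring how the training loop reads `data[1]` from a batch. The helper `label_distribution` and the toy samples are hypothetical; with the real `dataset.SARS` objects, iterating every sample would also load every image, which may be slow.

```python
from collections import Counter


def label_distribution(samples, label_index=1):
    """Return {label: (count, fraction)} for an iterable of sample records."""
    counts = Counter(sample[label_index] for sample in samples)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in sorted(counts.items())}


# Toy stand-in for a dataset: (image placeholder, label) pairs
toy_samples = [("img_a", 0), ("img_b", 1), ("img_c", 1), ("img_d", 1)]
for label, (count, fraction) in label_distribution(toy_samples).items():
    print(f"label {label}: {count} samples ({fraction:.0%})")
```

Reporting the class balance per split makes it easier to judge whether accuracy alone is a meaningful metric.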


##   Model - TODO
The model section includes the model definition (usually a class), model training, and other necessary parts.
  * Model architecture: layer number/size/type, activation function, etc
  * Training objectives: loss function, optimizer, weight of each loss term, etc
  * Others: whether the model is pretrained, Monte Carlo simulation for uncertainty analysis, etc
  * The code of model should have classes of the model, functions of model training, model validation, etc.
  * If your model training is done outside of this notebook, please upload the trained model here and develop a function to load and test it.

### The Models
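- The model we use comes from `model.py` in the `core` folder, which we imported at the beginning of the notebook. The training setup builds it with `model.attention_net(topN=PROPOSAL_NUM, n_class=2)`.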

## Training

### Define the Training Process
- We define the training loop inside the notebook so that it can run in the local (Colab) environment.

In [None]:
# Initialize Variable
save_dir1 = './'

# Define Training Process
def train(batch_size, proposal_num, save_freq, lr, wd, resume_file, save_directory, net, end_epoch):
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    start_epoch = 1
    save_dir = os.path.join(save_directory, datetime.now().strftime('%Y%m%d_%H%M%S'))

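    # Report the PyTorch / CUDA environment (torch is already imported at the top)
    print(f'torch version: {torch.__version__}')
    print(f'CUDA available: {torch.cuda.is_available()}')
    print(f'CUDA version: {torch.version.cuda}')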

    net = net.cuda()
    net = DataParallel(net)

    train_losses = []
    test_losses = []
    train_accuracies = []
    test_accuracies = []

    skip_epoch = 0

    for epoch in range(start_epoch, end_epoch):
        add = epoch > skip_epoch
        for scheduler in schedulers:
            scheduler.step()

        # begin training
        _print('--' * 50)
        net.train()
        train_correct = 0
        total = 0
        for i, data in enumerate(trainloader):
            img, label, img_raw = data[0].cuda(), data[1].cuda(), data[2]
            batch_size = img.size(0)
            raw_optimizer.zero_grad()
            part_optimizer.zero_grad()
            concat_optimizer.zero_grad()
            partcls_optimizer.zero_grad()
            raw_logits, concat_logits, part_logits, _, top_n_prob = net(img, img_raw, add)
            part_loss = model.list_loss(part_logits.view(batch_size * PROPOSAL_NUM, -1),
                                        label.unsqueeze(1).repeat(1, PROPOSAL_NUM).view(-1)).view(batch_size, PROPOSAL_NUM)
            raw_loss = creterion(raw_logits, label)
            concat_loss = creterion(concat_logits, label)
            rank_loss = model.ranking_loss(top_n_prob, part_loss)
            partcls_loss = creterion(part_logits.view(batch_size * PROPOSAL_NUM, -1),
                                     label.unsqueeze(1).repeat(1, PROPOSAL_NUM).view(-1))

            total_loss = raw_loss + rank_loss + concat_loss + partcls_loss
            total_loss.backward()
            raw_optimizer.step()
            part_optimizer.step()
            concat_optimizer.step()
            partcls_optimizer.step()
            progress_bar(i, len(trainloader), 'train')

            _, concat_predict = torch.max(concat_logits, 1)
            total += batch_size
            train_correct += torch.sum(concat_predict.data == label.data)

        print(f'running train acc: {float(train_correct) / total:.3f}')
        # Use a context manager so the pickle file is closed properly
        with open('./model.pkl', 'wb') as f:
            pickle.dump(net, f)
        if epoch % SAVE_FREQ == 0:
            train_loss = 0
            train_correct = 0
            total = 0
            net.eval()
            auc_label_lst = []
            auc_pred_lst = []
            people_lst = []
            file_name_lst = []
            for i, data in enumerate(valloader):
                with torch.no_grad():
                    img, label, img_raw = data[0].cuda(), data[1].cuda(), data[2]
                    batch_size = img.size(0)
                    _, concat_logits, _, _, _, = net(img, img_raw, add)
                    # calculate loss
                    concat_loss = creterion(concat_logits, label)
                    # calculate accuracy
                    _, concat_predict = torch.max(concat_logits, 1)
                    auc_label_lst += list(label.data.cpu().numpy())
                    pred = torch.nn.Softmax(1)(concat_logits)
                    auc_pred_lst.append(pred.data.cpu().numpy())
                    people_lst.append(data[3])
                    file_name_lst.append(data[4])

                    total += batch_size
                    train_correct += torch.sum(concat_predict.data == label.data)
                    train_loss += concat_loss.item() * batch_size
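                    progress_bar(i, len(valloader), 'eval val set')
            train_acc = float(train_correct) / total
            train_loss = train_loss / total

            # For final reporting purposes.
            # Note: the loop above iterates valloader, so despite their names
            # train_loss / train_acc hold validation-set metrics.
            train_losses.append(train_loss)
            train_accuracies.append(train_acc)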

            _print(
                'epoch:{} - train loss: {:.3f} and train acc: {:.3f} total sample: {}'.format(
                    epoch,
                    train_loss,
                    train_acc,
                    total))

            print(f'auc: {roc_auc_score(auc_label_lst, np.concatenate(auc_pred_lst, 0)[:, 1]):.4f}')
            np.save('./train_pred.npy', np.concatenate(auc_pred_lst, 0))
            np.save('./train_label.npy', np.array(auc_label_lst))
            np.save('./train_people.npy', np.concatenate(people_lst, 0))
            np.save('./train_file_name.npy', np.concatenate(file_name_lst, 0))
            # evaluate on test set
            test_loss = 0
            test_correct = 0
            total = 0
            auc_label_lst = []
            auc_pred_lst = []
            people_lst = []
            img_vis_lst = []
            file_name_lst = []
            anchor_lst = []
            for i, data in enumerate(testloader):
    # =============================================================================
    #             if i < 1:
    #                 continue
    # =============================================================================
                with torch.no_grad():
                    img, label, img_raw = data[0].cuda(), data[1].cuda(), data[2]
                    batch_size = img.size(0)
                    _, concat_logits, _, _, _ = net(img, img_raw, add, False)
                    # calculate loss
                    concat_loss = creterion(concat_logits, label)
                    # calculate accuracy
                    _, concat_predict = torch.max(concat_logits, 1)
                    auc_label_lst += list(label.data.cpu().numpy())
                    pred = torch.nn.Softmax(1)(concat_logits)
                    auc_pred_lst.append(pred.data.cpu().numpy())
                    people_lst.append(data[3])
                    file_name_lst += list(data[4])
    # =============================================================================
    #                 img_vis_lst.append(img_vis)
    #                 anchor_lst.append(anchor)
    # =============================================================================

                    total += batch_size
                    test_correct += torch.sum(concat_predict.data == label.data)
                    test_loss += concat_loss.item() * batch_size
                    progress_bar(i, len(testloader), 'eval test set')
            test_acc = float(test_correct) / total
            test_loss = test_loss / total

            # Final eval purposes
            test_losses.append(test_loss)
            test_accuracies.append(test_acc)

            _print(
                'epoch:{} - test loss: {:.3f} and test acc: {:.3f} total sample: {}'.format(
                    epoch,
                    test_loss,
                    test_acc,
                    total))


            print(f'auc: {roc_auc_score(auc_label_lst, np.concatenate(auc_pred_lst, 0)[:, 1]):.4f}')
            np.save('./test_pred.npy', np.concatenate(auc_pred_lst, 0))
            np.save('./test_label.npy', np.array(auc_label_lst))
            np.save('./test_people.npy', np.concatenate(people_lst, 0))
            np.save('./test_file_name.npy', np.array(file_name_lst))

    # =============================================================================
    #         np.save('./test_anchor_lst.npy', np.concatenate(anchor_lst, 0))
    #         np.save('./test_vis.npy', np.concatenate(img_vis_lst, 0))
    #         assert 0
    # =============================================================================
            # save model
            net_state_dict = net.module.state_dict()
            os.makedirs(save_dir, exist_ok=True)
            torch.save({
                'epoch': epoch,
                'train_loss': train_loss,
                'train_acc': train_acc,
                'test_loss': test_loss,
                'test_acc': test_acc,
                'net_state_dict': net_state_dict},
                os.path.join(save_dir, '%03d.ckpt' % epoch))
    # =============================================================================
    #         assert 0
    # =============================================================================
    print('finished training')
    # Store final results in a dictionary
    training_results = {
        'train_losses': train_losses,
        'test_losses': test_losses,
        'train_accuracies': train_accuracies,
        'test_accuracies': test_accuracies
    }

    return training_results
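
The training step above sums four terms: `raw_loss`, `concat_loss`, `partcls_loss` and `rank_loss`. The last comes from `model.ranking_loss`, which lives in `core/model.py` and is not shown in this notebook. The sketch below is therefore only an assumption about what such a term typically does: a generic pairwise hinge ranking loss for one sample, which pushes proposals with a lower classification loss to receive a higher score. The function name, the margin of 1.0 and the toy numbers are all hypothetical.

```python
def pairwise_ranking_loss(scores, losses, margin=1.0):
    """Hinge ranking loss for one sample (illustrative, not the repo's code).

    scores: proposal scores, higher means "more informative"
    losses: per-proposal classification losses, lower means better
    """
    total = 0.0
    for s_i, l_i in zip(scores, losses):
        for s_j, l_j in zip(scores, losses):
            # Proposal j is worse than i, so i should outscore j by the margin
            if l_j > l_i:
                total += max(0.0, margin - s_i + s_j)
    return total


# Scores that agree with the loss ordering incur no penalty
well_ranked = pairwise_ranking_loss(scores=[2.0, 0.0], losses=[0.1, 0.9])
# Scores that contradict the loss ordering are penalized
badly_ranked = pairwise_ranking_loss(scores=[0.0, 2.0], losses=[0.1, 0.9])
print(well_ranked, badly_ranked)
```

Consult `core/model.py` for the exact formulation used in this project.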

### Initialize the Training Session

In [None]:
# Initialize the training model
def initialize_training():
    global net, trainloader, valloader, testloader, creterion, raw_optimizer, concat_optimizer, part_optimizer, partcls_optimizer, schedulers, _print, start_epoch, save_dir
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    start_epoch = 1
    save_dir = os.path.join(save_dir1, datetime.now().strftime('%Y%m%d_%H%M%S'))
    if os.path.exists(save_dir):
        raise FileExistsError('model dir exists!')
    os.makedirs(save_dir)
    logging = init_log(save_dir)
    _print = logging.info

    # read dataset (same folders and flags as load_image_data() defined earlier)
    trainset, valset, testset = load_image_data()
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE,
                                              shuffle=True, num_workers=8, drop_last=False)
    testloader = torch.utils.data.DataLoader(testset, batch_size=BATCH_SIZE,
                                             shuffle=False, num_workers=8, drop_last=False)
    valloader = torch.utils.data.DataLoader(valset, batch_size=BATCH_SIZE,
                                            shuffle=False, num_workers=8, drop_last=False)

    n_class = 2
    # define model
    net = model.attention_net(topN=PROPOSAL_NUM, n_class=n_class)
    if resume:
        ckpt = torch.load(resume)
        net.load_state_dict(ckpt['net_state_dict'])
        start_epoch = ckpt['epoch'] + 1
    creterion = torch.nn.CrossEntropyLoss()

    # define optimizers
    raw_parameters = list(net.pretrained_model.parameters())
    part_parameters = list(net.proposal_net.parameters())
    concat_parameters = list(net.concat_net.parameters())
    partcls_parameters = list(net.partcls_net.parameters())

    raw_optimizer = torch.optim.SGD(raw_parameters, lr=LR, momentum=0.9, weight_decay=WD)
    concat_optimizer = torch.optim.SGD(concat_parameters, lr=LR, momentum=0.9, weight_decay=WD)
    part_optimizer = torch.optim.SGD(part_parameters, lr=LR, momentum=0.9, weight_decay=WD)
    partcls_optimizer = torch.optim.SGD(partcls_parameters, lr=LR, momentum=0.9, weight_decay=WD)

    schedulers = [MultiStepLR(raw_optimizer, milestones=[60, 100], gamma=0.1),
                  MultiStepLR(concat_optimizer, milestones=[60, 100], gamma=0.1),
                  MultiStepLR(part_optimizer, milestones=[60, 100], gamma=0.1),
                  MultiStepLR(partcls_optimizer, milestones=[60, 100], gamma=0.1)]

    if resume:
        ckpt = torch.load(resume)
        net.pretrained_model.load_state_dict({layer.replace('pretrained_model.', ''): ckpt['net_state_dict'][layer]
                                              for layer in ckpt['net_state_dict'] if 'pretrained_model' in layer})

        start_epoch = ckpt['epoch'] + 1
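
To make the learning-rate schedule configured above concrete, the following sketch reproduces the closed form of a multi-step decay in plain Python: the rate is multiplied by `gamma` once for every milestone that has been reached. The helper `multistep_lr` is hypothetical, and the base rate of 0.001 is only a placeholder, since the actual `LR` comes from `core.config` and is not shown here.

```python
from bisect import bisect_right


def multistep_lr(base_lr, epoch, milestones=(60, 100), gamma=0.1):
    """Learning rate after `epoch` scheduler steps under a multi-step decay."""
    # bisect_right counts how many milestones are <= epoch
    return base_lr * gamma ** bisect_right(milestones, epoch)


# Placeholder base rate; the notebook's real value is LR from core.config
for epoch in (1, 59, 60, 99, 100):
    print(epoch, multistep_lr(0.001, epoch))
```

With milestones at 60 and 100, a demo run of only a few epochs never reaches the first decay, so the demo trains at the base rate throughout.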



### Training Process

In [None]:
# Initialize
# train() loops over range(1, END_EPOCH), so demo mode runs 5 epochs
END_EPOCH = 6 if demo_mode else 500

initialize_training()
training_results = train(net=net,
                         batch_size=BATCH_SIZE,
                         proposal_num=PROPOSAL_NUM,
                         save_freq=SAVE_FREQ,
                         lr=LR,
                         wd=WD,
                         save_directory=save_dir,
                         resume_file=resume,
                         end_epoch=END_EPOCH)

## Testing

### Download Trained Models

In [None]:
# Download the existing pretrained model
url = 'https://drive.google.com/uc?id=1vGOnn_KPy9InVgGdymivurewcWIK5f0X'
output = 'model.pth'
gdown.download(url, output, quiet=False)

### Define the Test Function and Run the Test

In [None]:
def test(model_path, net, testloader, creterion):
    checkpoint = torch.load(model_path)
    if 'net_state_dict' in checkpoint:
        net.load_state_dict(checkpoint['net_state_dict'])
    else:
        net.load_state_dict(checkpoint)

    net.eval()  # Set the model to evaluation mode
    test_loss = 0
    test_correct = 0
    total = 0
    auc_label_lst = []
    auc_pred_lst = []

    with torch.no_grad():
        for data in testloader:
            img, label, img_raw = data[0].cuda(), data[1].cuda(), data[2].cuda()
            outputs = net(img, img_raw)

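            # net returns several outputs; train() unpacks them as
            # (raw_logits, concat_logits, part_logits, _, top_n_prob).
            # Index 0 selects raw_logits; note that train() reports its
            # metrics from concat_logits (index 1) instead.
            if isinstance(outputs, (list, tuple)):
                outputs = outputs[0]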

            loss = creterion(outputs, label)
            test_loss += loss.item() * label.size(0)
            _, predicted = torch.max(outputs, 1)
            test_correct += (predicted == label).sum().item()
            total += label.size(0)
            auc_label_lst.append(label.cpu().numpy())
            auc_pred_lst.append(outputs.softmax(dim=1).cpu().numpy())  # Using softmax to get probabilities for AUC

    test_loss /= total
    test_acc = test_correct / total
    auc_score = roc_auc_score(np.concatenate(auc_label_lst), np.concatenate(auc_pred_lst, axis=0)[:, 1])

    return {'loss': test_loss, 'accuracy': test_acc, 'auc': auc_score}

model_path = 'model.pth'

# Move model to GPU
net = net.cuda()

# Call the test function
test_results = test(model_path, net, testloader, creterion)
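
For readers unfamiliar with the AUC value returned by `test`, here is a minimal pure-Python sketch of the quantity that `roc_auc_score` computes for binary labels: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function `pairwise_auc` is a hypothetical helper for illustration, not part of the notebook, and its quadratic loop is only suitable for small inputs.

```python
def pairwise_auc(labels, scores):
    """Binary ROC AUC as the fraction of correctly ordered (pos, neg) pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Small toy example: three of the four (positive, negative) pairs are ordered correctly
auc = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking, which is a useful baseline when reading the test result.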

# Results - TO BE UPDATED
In this section, you should finish training your model or load your trained model, then share the results with the necessary metrics and figures.

Please test and report results for all experiments that you run with:

*   specific numbers (accuracy, AUC, RMSE, etc)
*   figures (loss shrinkage, outputs from GAN, annotation or label of sample pictures, etc)


## Define the Results Function

In [None]:
def summarize_results(train_losses, val_losses, train_accuracies, val_accuracies, test_results):
    # Calculate the final metrics
    final_train_acc = train_accuracies[-1]
    final_val_acc = val_accuracies[-1]
    final_auc = test_results['auc']
    final_accuracy = test_results['accuracy']
    test_loss = test_results['loss']

    # Print the summary of results
    print("Final Training Accuracy: {:.2f}%".format(final_train_acc * 100))
    print("Final Validation Accuracy: {:.2f}%".format(final_val_acc * 100))
    print("Test Accuracy: {:.2f}%".format(final_accuracy * 100))
    print("Test AUC: {:.4f}".format(final_auc))
    print("Test Loss: {:.4f}".format(test_loss))

    # print("Test RMSE: {:.4f}".format(final_rmse))

    # Plotting the metrics
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(train_losses, label='Training Loss')
    plt.plot(val_losses, label='Validation Loss')
    plt.title('Loss over epochs')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(train_accuracies, label='Training Accuracy')
    plt.plot(val_accuracies, label='Validation Accuracy')
    plt.title('Accuracy over epochs')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.show()

## Run the Evaluation and Display Results

In [None]:
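# train() evaluates on valloader and testloader at each checkpoint, so the
# 'train_*' lists hold validation-set metrics and the 'test_*' lists hold
# test-set metrics; read the plot labels in summarize_results accordingly.
summarize_results(training_results['train_losses'],
                  training_results['test_losses'],
                  training_results['train_accuracies'],
                  training_results['test_accuracies'],
                  test_results)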

## Results, Analysis and Plans

### Results
The replication attempt of the original study yielded a final training accuracy of 69.09%, with a drop in validation accuracy to 53.57%. The test accuracy was 54.76%, with an AUC of 0.6569 and a test loss of 1.6631. The loss and accuracy graphs over epochs show a trend of increasing loss and plateauing accuracy, indicating potential issues with the model's learning and generalization capability.

### Analyses
The loss graph demonstrates an overfitting trend, with the training loss decreasing while the validation loss increases. This divergence suggests that the model is memorizing the training data rather than learning features generalizable to unseen data. The accuracy graph supports this, as the validation accuracy does not improve in tandem with the training accuracy. The suboptimal AUC value further indicates that the model's ability to distinguish between classes is limited.

### Plans
Given these results, the following plan is proposed to improve the replication attempt:

- **Data Examination**: Reassess the data used for training and validation to ensure it is representative and balanced.
- **Training Strategy**: Implement a more robust training strategy, such as cross-validation or a different split of the data, to increase the model's ability to generalize.
- **Increased Iterations**: Extend the number of epochs while incorporating early stopping mechanisms to find a better balance between learning and overfitting.
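
The early stopping mentioned in the plan could look roughly like the sketch below. It is a minimal illustration, assuming validation loss is the monitored quantity; the class `EarlyStopping`, its `patience` and `min_delta` parameters, and the loss values are hypothetical and not part of the current notebook.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` checks."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_checks = 0

    def step(self, val_loss):
        """Record one validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience


# Hypothetical validation losses: improvement stalls after the second epoch
val_losses = [0.9, 0.7, 0.72, 0.75, 0.8, 0.6]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(f"stopped at epoch {stopped_at}, best val loss {stopper.best}")
```

In the training loop, `step` would be called once per evaluation checkpoint, and the checkpoint with the best validation loss would be the one kept for testing.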


## Model comparison

In [None]:
# Compare your model with others
# You don't need to re-run all the other experiments; you can refer directly to the metrics/numbers reported in the paper

# Discussion

In this section, you should discuss your work and make future plans. The discussion should address the following questions:
  * Assess whether the paper is reproducible.
  * If your results are negative, explain why the paper is not reproducible.
  * Describe “What was easy” and “What was difficult” during the reproduction.
  * Make suggestions to the authors or other reproducers on how to improve reproducibility.
  * Describe what you will do in the next phase.



In [None]:
# no code is required for this section
'''
If you want to use an image from outside this notebook for explanation,
you can read and plot it here, as in the Scope of Reproducibility section
'''

# References

1.   Sun, J, [paper title], [journal title], [year], [volume]:[issue], doi: [doi link to paper]


