<a href="https://colab.research.google.com/github/abarrie2/cs598-dlh-project/blob/main/DL4H_Team_24.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Grading rubric

## Draft (20 Points)

Jupyter Notebook (both .PDF and .ipynb files)

You need to use the report template and fill out the following sections, each of which we will score based on the clarity and appropriateness of your writing (percentage of total grade for each component shown). All the information must be in the Jupyter notebook.

- Introduction (2)
  - A clear, high-level description of what the original paper is about and what is the contribution of it
- Scope of reproducibility (2)
- Methodology (8)
  - Data
    - Data descriptions
    - Implementation code
  - Model
    - Model descriptions
    - Implementation code
  - Training
    - Computational requirements
    - Implementation code
  - Evaluation
    - Metrics descriptions
    - Implementation code
- Results (8)
  - Results
  - Analyses
  - Plans

## Final Descriptive Notebook Report (55 Points)

Jupyter Notebook (both in .PDF and .ipynb format)

- Introduction (5):
    - A clear, high-level description of what the original paper is about and what is the contribution of it
- Scope of reproducibility (5)
- Methodology (15)
  - Environment
    - Python version
    - Dependencies/packages needed
  - Data
    - Data download instruction
    - Data descriptions with helpful charts and visualizations
    - Preprocessing code + command
  - Model
    - Citation to the original paper
    - Link to the original paper’s repo (if applicable)
    - Model descriptions
    - Implementation code
    - Pretrained model (if applicable)
  - Training
    - Hyperparams
      - Report at least 3 types of hyperparameters such as learning rate, batch size, hidden size, dropout
    - Computational requirements
      - Report at least 3 types of requirements such as type of hardware, average runtime for each epoch, total number of trials, GPU hrs used, # training epochs
      - Training code
  - Evaluation
    - Metrics descriptions
    - Evaluation code
- Results (15)
  - Table of results (no need to include additional experiments, but main reproducibility result should be included)
  - All claims should be supported by experiment results
  - Discuss with respect to the hypothesis and results from the original paper
  - Experiments beyond the original paper
    - Credits for each experiment depend on how hard it is to run the experiments. Each experiment should include results and a discussion
    - Ablation Study.
- Discussion (10)
  - Implications of the experimental results, whether the original paper was reproducible, and if it wasn’t, what factors made it irreproducible
  - “What was easy”
  - “What was difficult”
  - Recommendations to the original authors or others who work in this area for improving reproducibility
- Public GitHub Repo (5)
  - Publish your code in a public repository on GitHub and attach the URL in the notebook.
  - Make sure your code is documented properly. 
    - A README.md file describing the exact steps to run your code is required. 
    - Check [ML Code Completeness Checklist](https://github.com/paperswithcode/releasing-research-code)
    - Check [Best Practices for Reproducibility](https://www.cs.mcgill.ca/~ksinha4/practices_for_reproducibility/)

---

# FAQ and Attentions
* Copy and move this template to your Google Drive. Name your notebook with your team ID (upper-left corner). Don't edit this original file.
* This template covers most questions we want to ask about your reproduction experiment. You don't need to exactly follow the template, however, you should address the questions. Please feel free to customize your report accordingly.
* Any report must include runnable code and the necessary annotations (in text and code comments).
* The notebook serves as a demo and uses only a small amount of data (a subset of the original data or processed data). The entire runtime of the notebook, including data reading, data processing, model training, printing, and figure plotting, must be within 8 minutes; otherwise, you may be penalized on the grade.
  * If the raw dataset is too large to load, you can select a subset of the data, pre-process it, upload the subset or processed data to Google Drive, and load it in this notebook.
  * If full training takes too long, set the number of training epochs to a small number (e.g., 3) to show that the training is runnable.
  * For model validation results, you can train the model outside this notebook in advance, then load the pretrained model and use it for validation (display the figures, print the metrics).
* Post-processing is important! Present the results with plots and figures. The code to summarize results and plot figures may be tedious, but it will not be a waste of time, since these figures can be reused in your presentation. Figures plotted in code should have titles or captions where necessary (e.g., title your figure "Figure 1. xxxx").
* There is no page limit for your notebook report. You can also use separate notebooks for the report; just make sure your grader can access and run/test them.
* If you use outside resources, please cite them (in any format). Include links to the resources if necessary.

# Mount Notebook to Google Drive
Upload the data, pretrained model, figures, etc. to your Google Drive, then mount this notebook to Google Drive. After that, you can access the resources freely.

Instruction: https://colab.research.google.com/notebooks/io.ipynb

Example: https://colab.research.google.com/drive/1srw_HFWQ2SMgmWIawucXfusGzrj1_U0q

Video: https://www.youtube.com/watch?v=zc8g8lGcwQU

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Introduction
This is an introduction to your report; edit this text/markdown section to compose it. In this section, you should introduce:

*   Background of the problem
  * what type of problem: disease/readmission/mortality prediction, feature engineering, data processing, etc.
  * what is the importance/meaning of solving the problem
  * what is the difficulty of the problem
  * the state-of-the-art methods and their effectiveness.
*   Paper explanation
  * what the paper proposed
  * what the innovations of the method are
  * how well the proposed method works (by its own metrics)
  * what the contribution to the research field is (referring to the Background above, how important the paper is to the problem).


In [None]:
# code comment is used as inline annotations for your coding

# Scope of Reproducibility:

List hypotheses from the paper you will test and the corresponding experiments you will run.


1.   Hypothesis 1: xxxxxxx
2.   Hypothesis 2: xxxxxxx

You can insert images in this notebook text, [see this link](https://stackoverflow.com/questions/50670920/how-to-insert-an-inline-image-in-google-colaboratory-from-google-drive) and example below:

![sample_image.png](https://drive.google.com/uc?export=view&id=1g2efvsRJDxTxKz-OY3loMhihrEUdBxbc)



You can also use code to display images, see the code below.

The images must be saved in Google Drive first.


In [None]:
# no code is required for this section
'''
if you want to use an image outside this notebook for explanation,
you can upload it to your google drive and show it with OpenCV or matplotlib
'''
# mount this notebook to your google drive
drive.mount('/content/gdrive')

# define dirs to workspace and data
img_dir = '/content/gdrive/My Drive/Colab Notebooks/<path-to-your-image>'

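import cv2
import matplotlib.pyplot as plt

img = cv2.imread(img_dir)
# cv2.imshow() opens a GUI window and does not work in Colab; plot with matplotlib instead.
# OpenCV loads images as BGR, so convert to RGB before plotting.
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.title("Title")
plt.axis("off")
plt.show()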


# Methodology

This methodology is the core of your project. It consists of runnable code with the necessary annotations to show the experiments you executed to test the hypotheses.

The methodology contains at least two subsections, **data** and **model**, for your experiment.

### Create environment

Create `conda` environment for the project using the `environment.yml` file:

```bash
conda env create --prefix .envs/dlh-team24 -f environment.yml
```

Activate the environment with:
```bash
conda activate .envs/dlh-team24
```
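
The rubric asks for the Python version and the dependencies to be reported. A minimal sketch of how that could be printed from inside the environment, assuming the packages imported in this notebook are the ones worth listing (the `PACKAGES` tuple and the `report_environment` helper are illustrative names, not part of the project):

```python
import platform
from importlib import metadata

# Assumed distribution names for the packages this notebook imports.
PACKAGES = ("numpy", "pandas", "matplotlib", "scikit-learn", "torch", "vitaldb")

def report_environment(packages=PACKAGES):
    """Return the Python version and the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            versions[name] = "not installed"
    return versions

for name, version in report_environment().items():
    print(f"{name:>14}: {version}")
```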

In [None]:
# Import packages
import os
import random

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import torch
from torch.utils.data import Dataset
import vitaldb

#from google.colab import drive


##  Data
Data includes raw data (MIMIC III tables), descriptive statistics (our homework questions), and data processing (feature engineering).
  * Source of the data: where the data is collected from; if data is synthetic or self-generated, explain how. If possible, please provide a link to the raw datasets.
  * Statistics: include basic descriptive statistics of the dataset like size, cross validation split, label distribution, etc.
  * Data process: how you manipulate the data, e.g., changing the class labels, splitting the dataset into train/valid/test, refining the dataset.
  * Illustration: printing results, plotting figures for illustration.
  * You can upload your raw dataset to Google Drive and mount this Colab to the same directory. If your raw dataset is too large, you can upload the processed dataset and have a code to load the processed dataset.

### Set Up Local Data Caches

Since the VitalDB data is static, local copies are stored and reused to avoid expensive downloads and to speed up data processing.

The default directory defined below is already in the project `.gitignore` file. If later modified, it should also be added to the project `.gitignore`.

In [None]:
VITALDB_CACHE = './vitaldb_cache'
VITAL_ALL = 'vital_all'
VITAL_MINI = 'vital_mini'
VITAL_METADATA = 'metadata'

In [None]:
!mkdir -p $VITALDB_CACHE
!mkdir -p $VITALDB_CACHE/$VITAL_ALL
!mkdir -p $VITALDB_CACHE/$VITAL_MINI
!mkdir -p $VITALDB_CACHE/$VITAL_METADATA
!ls -l $VITALDB_CACHE

### OSFS Bulk Data Download

**This step is not required, but will significantly speed up downstream processing and avoid a high volume of API requests to the VitalDB web site.**

The cache population code checks if OSFS bulk download data of VitalDB vital files is locally available.

- Manually download the OSF Storage archives from the following site: https://osf.io/dtc45/
    - `Vital Files 0001-2000`
    - `Vital Files 2001-4000`
    - `Vital Files 4001-6388`
- Once the `OSF Storage (United States)` link is clicked a `Download as zip` link will appear.
- Once downloaded, extract each of the 3 zip archives.
- Move all files from each of the unzipped directories into the `${VITALDB_CACHE}/${VITAL_ALL}` directory.
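
The extract-and-move steps can also be scripted. The following is a sketch only: it assumes the three zip archives were saved into a single download directory, and the function name and example paths are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path

def unpack_vital_archives(download_dir, target_dir):
    """Extract every zip in download_dir and move its .vital files into target_dir.

    Returns the number of files moved. Files already present in target_dir are kept.
    """
    download_dir = Path(download_dir).expanduser()
    target_dir = Path(target_dir).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = 0
    for archive in sorted(download_dir.glob("*.zip")):
        extract_dir = download_dir / archive.stem
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(extract_dir)
        # Materialize the listing before moving files out of the tree.
        for vital_file in sorted(extract_dir.rglob("*.vital")):
            destination = target_dir / vital_file.name
            if not destination.exists():
                shutil.move(str(vital_file), str(destination))
                moved += 1
        # Remove the temporary extraction directory.
        shutil.rmtree(extract_dir)
    return moved

# Example (hypothetical download location):
# unpack_vital_archives("~/Downloads/vitaldb_osf", "./vitaldb_cache/vital_all")
```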

In [None]:
# Returns the Pandas DataFrame for the specified dataset.
#   One of 'cases', 'labs', or 'trks'
# If the file exists locally, create and return the DataFrame.
# Else, download and cache the csv first, then return the DataFrame.
def vitaldb_dataframe_loader(dataset_name):
    if dataset_name not in ['cases', 'labs', 'trks']:
        raise ValueError(f'Invalid dataset name: {dataset_name}')
    file_path = f'{VITALDB_CACHE}/{VITAL_METADATA}/{dataset_name}.csv'
    if os.path.isfile(file_path):
        print(f'{dataset_name}.csv exists locally.')
        df = pd.read_csv(file_path)
        return df
    else:
        print(f'downloading {dataset_name} and storing in the local cache for future reuse.')
        df = pd.read_csv(f'https://api.vitaldb.net/{dataset_name}')
        df.to_csv(file_path, index=False)
        return df

# Cases

In [None]:
cases = vitaldb_dataframe_loader('cases')
cases = cases.set_index('caseid')
cases.shape

In [None]:
cases.index.nunique()

In [None]:
cases.head()

In [None]:
cases['sex'].value_counts()

# Tracks

In [None]:
trks = vitaldb_dataframe_loader('trks')
trks = trks.set_index('caseid')
trks.shape

In [None]:
trks.index.nunique()

In [None]:
trks.groupby('caseid')[['tid']].count().plot();

In [None]:
trks.groupby('caseid')[['tid']].count().hist();

In [None]:
trks.groupby('tname').count().sort_values(by='tid', ascending=False)

## Parameters of Interest

### Hemodynamic Parameters Reference
https://vitaldb.net/dataset/?query=overview#h.f7d712ycdpk2

**Solar8000/ART_MBP**

mean blood pressure

Parameter, Description, Type/Hz, Unit

Solar8000/ART_MBP, Mean arterial pressure, N, mmHg

In [None]:
trks[trks['tname'].str.contains('Solar8000/ART_MBP')].shape

**SNUADC/ART**

arterial blood pressure waveform

Parameter, Description, Type/Hz, Unit

SNUADC/ART, Arterial pressure wave, W/500, mmHg

In [None]:
trks[trks['tname'].str.contains('SNUADC/ART')].shape

**SNUADC/ECG_II**

electrocardiogram waveform

Parameter, Description, Type/Hz, Unit

SNUADC/ECG_II, ECG lead II wave, W/500, mV

In [None]:
trks[trks['tname'].str.contains('SNUADC/ECG_II')].shape

**BIS/EEG1_WAV**

electroencephalogram waveform

Parameter, Description, Type/Hz, Unit

BIS/EEG1_WAV, EEG wave from channel 1, W/128, uV

In [None]:
trks[trks['tname'].str.contains('BIS/EEG1_WAV')].shape

# Cases of Interest

These are the subset of case ids for which modelling and analysis will be performed based upon inclusion criteria and waveform data availability.

In [None]:
TRACK_NAMES = ['SNUADC/ART', 'SNUADC/ECG_II', 'BIS/EEG1_WAV']
TRACK_SRATES = [500, 500, 128]

In [None]:
# As in the paper, select cases which meet the following criteria:
#
# For patients, the inclusion criteria were as follows:
# (1) adults (age >= 18)
# (2) administered general anaesthesia
# (3) undergone non-cardiac surgery. 
#
# For waveform data, the inclusion criteria were as follows:
# (1) no missing monitoring for ABP, ECG, and EEG waveforms
# (2) no cases containing false events or non-events due to poor signal quality
#     (checked in second stage of data preprocessing)

# adult
inclusion_1 = cases.loc[cases['age'] >= 18].index
print(f'{len(cases)-len(inclusion_1)} cases excluded, {len(inclusion_1)} remaining due to age criteria')

# general anesthesia
inclusion_2 = cases.loc[cases['ane_type'] == 'General'].index
print(f'{len(cases)-len(inclusion_2)} cases excluded, {len(inclusion_2)} remaining due to anesthesia criteria')

# non-cardiac surgery
inclusion_3 = cases.loc[
    ~cases['opname'].str.contains("cardiac", case=False)
    & ~cases['opname'].str.contains("aneurysmal", case=False)
].index
print(f'{len(cases)-len(inclusion_3)} cases excluded, {len(inclusion_3)} remaining due to non-cardiac surgery criteria')

# ABP, ECG, EEG waveforms
inclusion_4 = trks.loc[trks['tname'].isin(TRACK_NAMES)].index.value_counts()
inclusion_4 = inclusion_4[inclusion_4 == len(TRACK_NAMES)].index
print(f'{len(cases)-len(inclusion_4)} cases excluded, {len(inclusion_4)} remaining due to missing waveform data')

cases_of_interest_idx = inclusion_1 \
    .intersection(inclusion_2) \
    .intersection(inclusion_3) \
    .intersection(inclusion_4)

cases_of_interest = cases.loc[cases_of_interest_idx]

print()
print(f'{cases_of_interest_idx.shape[0]} out of {cases.shape[0]} total cases remaining after exclusions applied')

In [None]:
cases_of_interest.head(n=5)

# Tracks of Interest

These are the subset of tracks (waveforms) for the cases of interest identified above.

In [None]:
# A single case maps to one or more waveform tracks. Select only the tracks required for analysis.
trks_of_interest = trks.loc[cases_of_interest_idx][trks.loc[cases_of_interest_idx]['tname'].isin(TRACK_NAMES)]
trks_of_interest.shape

In [None]:
trks_of_interest.head(n=5)

In [None]:
trks_of_interest_idx = trks_of_interest.set_index('tid').index
trks_of_interest_idx.shape

## Build Tracks Cache for Local Processing

Track data are large and therefore expensive to download each time they are used.
By default, the vital file format stores all tracks for each case internally. Since only certain tracks per case are required, each vital file can be further truncated to only store the tracks for needed waveforms.

In [None]:
# Maximum number of cases of interest for which to download data.
# Set to a small value for demo purposes, else set to None to disable and download all.
#MAX_CASES = None
MAX_CASES = 20

In [None]:
# Trim cases of interest to MAX_CASES
if MAX_CASES:
    cases_of_interest_idx = cases_of_interest_idx[:MAX_CASES]

In [None]:
# Ensure the full vital file dataset is available for cases of interest.
count_downloaded = 0
count_present = 0

#for i, idx in enumerate(cases.index):
for i, idx in enumerate(cases_of_interest_idx):
    if MAX_CASES and i >= MAX_CASES:
        break

    full_path = f'{VITALDB_CACHE}/{VITAL_ALL}/{idx:04d}.vital'
    if not os.path.isfile(full_path):
        print(f'Missing vital file: {full_path}')
        # Download and save the file.
        vf = vitaldb.VitalFile(idx)
        vf.to_vital(full_path)
        count_downloaded += 1
    else:
        count_present += 1

print()
print(f'Count of cases of interest:           {cases_of_interest_idx.shape[0]}')
print(f'Count of vital files downloaded:      {count_downloaded}')
print(f'Count of vital files already present: {count_present}')

In [None]:
# Convert vital files to "mini" versions including only the subset of tracks based on TRACK_NAMES defined above.
# Only perform conversion for the cases of interest.
# NOTE: If this cell is interrupted, it can be restarted and will continue where it left off.
count_minified = 0
count_present = 0

for i, idx in enumerate(cases_of_interest_idx):
    if MAX_CASES and i >= MAX_CASES:
        break
    
    full_path = f'{VITALDB_CACHE}/{VITAL_ALL}/{idx:04d}.vital'
    mini_path = f'{VITALDB_CACHE}/{VITAL_MINI}/{idx:04d}_mini.vital'
    if not os.path.isfile(mini_path):
        print(f'Creating mini vital file: {idx}')
        vf = vitaldb.VitalFile(full_path, TRACK_NAMES)
        vf.to_vital(mini_path)
        count_minified += 1
    else:
        count_present += 1

print()
print(f'Count of cases of interest:           {cases_of_interest_idx.shape[0]}')
print(f'Count of vital files minified:        {count_minified}')
print(f'Count of vital files already present: {count_present}')

In [None]:
# Exclude cases where ABP j signal quality (jSQI) < 0.8
# TODO: Implement jSQI function
# TODO: Filter cases with jSQI < 0.8

In [None]:
# Generate hypotensive events
# Hypotensive events are defined as a 1-minute interval with sustained ABP of less than 65 mmHg
# Note: Hypotensive events should be at least 20 minutes apart to minimize potential residual effects from previous events
# TODO: Implement hypotension event generation function
# TODO: Generate hypotension events

# Generate hypotension non-events
# To sample non-events, 30-minute segments where the ABP was above 75 mmHg were selected, and then
# three one-minute samples of each waveform were obtained from the middle of the segment
# TODO: Implement hypotension non-event generation function
# TODO: Generate hypotension non-events

# XXX Create dummy events with random labels for now
def generate_dummy_data(cases_of_interest_idx):
    # Accumulate rows in a list; it is converted to a DataFrame on return
    generated_data = []
    
    # Loop through each case index
    for case in cases_of_interest_idx:
        # Generate a random number of rows between 5 and 20
        num_rows = random.randint(5, 20)
        
        # Generate data for each row
        for _ in range(num_rows):
            starttime = random.randint(0, 1200)
            endtime = starttime + 60
            label = random.randint(0, 1)
            
            # Append the row to the list
            generated_data.append([
                case,
                starttime,
                endtime,
                label
            ])
    
    return pd.DataFrame(generated_data, columns=['caseidx', 'starttime', 'endtime', 'label'])
samples = generate_dummy_data(cases_of_interest_idx)
samples
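
The event definition in the comments above (a one-minute interval with sustained ABP below 65 mmHg, with events at least 20 minutes apart) could be sketched as follows. This is not the paper's implementation: it assumes a regularly sampled mean blood pressure series as input, a 2-second sampling interval, and that the 20-minute spacing is measured from event start to event start. The function name is illustrative.

```python
import numpy as np

def find_hypotensive_events(mbp, interval_s=2.0, threshold=65.0,
                            min_duration_s=60.0, min_gap_s=1200.0):
    """Return start times (in seconds) of windows where MBP stays below threshold.

    NaN samples are treated as not hypotensive. Accepted events are at least
    min_gap_s apart, measured start to start (an assumption of this sketch).
    """
    mbp = np.asarray(mbp, dtype=float)
    window = int(round(min_duration_s / interval_s))  # samples per event window
    below = mbp < threshold  # comparisons with NaN evaluate to False
    events = []
    run = 0  # length of the current run of below-threshold samples
    last_start = -np.inf
    for i, flag in enumerate(below):
        run = run + 1 if flag else 0
        if run >= window:
            start_s = (i - window + 1) * interval_s
            if start_s - last_start >= min_gap_s:
                events.append(start_s)
                last_start = start_s
    return events

# Toy series: a 60 s dip starting at 200 s and a longer dip starting at 1660 s.
toy = np.concatenate([np.full(100, 80.0), np.full(30, 60.0),
                      np.full(700, 80.0), np.full(40, 60.0)])
print(find_hypotensive_events(toy))
```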

In [None]:
# Preprocess data tracks

# ABP waveforms are used without further pre-processing
# ECG waveforms are band-pass filtered between 1 and 40 Hz, and Z-score normalized
# EEG waveforms are band-pass filtered between 0.5 and 40 Hz
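
The filtering described in the comments above could look like the sketch below. The filter design is not stated here, so a brick-wall FFT filter is used as a simple stand-in; a Butterworth band-pass from `scipy.signal` would be a common alternative. All function names are illustrative.

```python
import numpy as np

def fft_bandpass(signal, srate_hz, low_hz, high_hz):
    """Zero out frequency components outside [low_hz, high_hz] (brick-wall filter)."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / srate_hz)
    spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)

def zscore(signal):
    """Shift to zero mean and scale to unit standard deviation."""
    signal = np.asarray(signal, dtype=float)
    centered = signal - signal.mean()
    std = signal.std()
    return centered / std if std > 0 else centered

def preprocess_ecg(ecg, srate_hz=500):
    # ECG: band-pass 1-40 Hz, then Z-score normalization
    return zscore(fft_bandpass(ecg, srate_hz, 1.0, 40.0))

def preprocess_eeg(eeg, srate_hz=128):
    # EEG: band-pass 0.5-40 Hz
    return fft_bandpass(eeg, srate_hz, 0.5, 40.0)

# Toy check: a 10 Hz component survives; the offset and 100 Hz component are removed.
t = np.arange(5000) / 500.0
toy_ecg = 5.0 + np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 100 * t)
filtered = fft_bandpass(toy_ecg, 500, 1.0, 40.0)
```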

In [None]:
# Split data into training, validation, and test sets
# Use 6:1:3 ratio and prevent samples from a single case from being split across different sets
# Note: number of samples at each time point is not the same, because the first event can occur before the 3/5/10/15 minute mark

# Set target sizes
train_ratio = 0.6
val_ratio = 0.1
test_ratio = 1 - train_ratio - val_ratio # ensure ratios sum to 1

# Assume that on average cases have the ~same number of events so we can split by case rather than event
# Note: this means that the ratios will be approximate

# Get unique cases
unique_cases = samples['caseidx'].unique()

# Split cases into train and other
train_caseidx, other_caseidx = train_test_split(unique_cases, test_size=(1 - train_ratio), random_state=42)
# Split other into val and test
val_caseidx, test_caseidx = train_test_split(other_caseidx, test_size=(test_ratio / (1 - train_ratio)), random_state=42)

# Create datasets
samples_train = samples[samples['caseidx'].isin(train_caseidx)]
samples_val = samples[samples['caseidx'].isin(val_caseidx)]
samples_test = samples[samples['caseidx'].isin(test_caseidx)]

# Check how many samples are in each set
print(f"Train samples: {len(samples_train)}, ({len(samples_train) / len(samples):.2%})")
print(f"Val samples: {len(samples_val)}, ({len(samples_val) / len(samples):.2%})")
print(f"Test samples: {len(samples_test)}, ({len(samples_test) / len(samples):.2%})")
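
Because the split is made by case rather than by sample, it is worth checking that no case id ends up in more than one set. A small sketch of such a check (the helper name is hypothetical):

```python
def check_case_level_split(train_cases, val_cases, test_cases):
    """Raise if a case id appears in more than one split; return case fractions per split."""
    train, val, test = set(train_cases), set(val_cases), set(test_cases)
    overlap = (train & val) | (train & test) | (val & test)
    if overlap:
        raise ValueError(f"cases shared between splits: {sorted(overlap)}")
    total = len(train) + len(val) + len(test)
    return {
        "train": len(train) / total,
        "val": len(val) / total,
        "test": len(test) / total,
    }

# Toy example with ten case ids split 6:1:3.
print(check_case_level_split([1, 2, 3, 4, 5, 6], [7], [8, 9, 10]))
```

In this notebook the call would be `check_case_level_split(train_caseidx, val_caseidx, test_caseidx)`.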

In [None]:
# Create vitalDataset class
class vitalDataset(Dataset):
    def __init__(self, file_dir, samples, track_names, track_srates_hz):
        # samples should be a DataFrame with columns (caseidx, starttime, endtime, label)
        self.file_dir = file_dir
        self.samples = samples
        self.track_names = track_names
        self.track_srates_hz = track_srates_hz

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Get metadata for this event
        caseidx, starttime, endtime, label = self.samples.iloc[idx]
        # Load vital file
        file_path = os.path.join(self.file_dir, f"{caseidx:04d}.vital")
        vf = vitaldb.VitalFile(file_path, self.track_names)
        # Crop samples to target interval
        vf.crop(starttime, endtime)
        # Create target tensor sized to hold every track's samples back-to-back
        samples = torch.zeros(int((endtime-starttime)*sum(self.track_srates_hz)))
        # Populate each track
        for i, (track_name, rate) in enumerate(zip(self.track_names, self.track_srates_hz)):
            # Get samples for this track
            track_samples, _ = vf.get_samples(track_name, 1/rate)
            #track_samples = vf.to_numpy(track_name, 1/rate)
            # Convert to tensor and store in samples
            start = int((endtime-starttime)*sum(self.track_srates_hz[:i]))
            end = start + int((endtime-starttime)*self.track_srates_hz[i])
            samples[start:end] = torch.tensor(track_samples)
        return samples, label
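
The dataset class writes the tracks back-to-back into one vector, so each track occupies a contiguous slice whose length depends on its sampling rate. A small sketch of that layout for a 60-second sample (the helper name is illustrative):

```python
def track_slices(duration_s, srates_hz):
    """Return the (start, end) offsets of each track in the concatenated sample vector."""
    slices, start = [], 0
    for rate in srates_hz:
        end = start + int(duration_s * rate)
        slices.append((start, end))
        start = end
    return slices

# ABP and ECG at 500 Hz, EEG at 128 Hz, 60 seconds each.
print(track_slices(60, [500, 500, 128]))
# → [(0, 30000), (30000, 60000), (60000, 67680)]
```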

In [None]:
train_dataset = vitalDataset(f'{VITALDB_CACHE}/{VITAL_ALL}/', samples_train, TRACK_NAMES, TRACK_SRATES)
val_dataset = vitalDataset(f'{VITALDB_CACHE}/{VITAL_ALL}/', samples_val, TRACK_NAMES, TRACK_SRATES)
test_dataset = vitalDataset(f'{VITALDB_CACHE}/{VITAL_ALL}/', samples_test, TRACK_NAMES, TRACK_SRATES)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=32, shuffle=False)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32, shuffle=False)

In [None]:
# dir and function to load raw data
raw_data_dir = '/content/gdrive/My Drive/Colab Notebooks/<path-to-raw-data>'

def load_raw_data(raw_data_dir):
  # implement this function to load raw data to dataframe/numpy array/tensor
  return None

raw_data = load_raw_data(raw_data_dir)

# calculate statistics
def calculate_stats(raw_data):
  # implement this function to calculate the statistics
  # it is encouraged to print out the results
  return None

# process raw data
def process_data(raw_data):
  # implement this function to process the data as you need
  return None

processed_data = process_data(raw_data)

''' you can load the processed data directly
processed_data_dir = '/content/gdrive/My Drive/Colab Notebooks/<path-to-processed-data>'
def load_processed_data(processed_data_dir):
  pass

'''

##   Model
The model section includes the model definition, which is usually a class, model training, and other necessary parts.
  * Model architecture: layer number/size/type, activation function, etc
  * Training objectives: loss function, optimizer, weight of each loss term, etc
  * Others: whether the model is pretrained, Monte Carlo simulation for uncertainty analysis, etc
  * The code of model should have classes of the model, functions of model training, model validation, etc.
  * If your model training is done outside of this notebook, please upload the trained model here and develop a function to load and test it.

In [None]:
class my_model():
  # use this class to define your model
  pass

model = my_model()
loss_func = None
optimizer = None

def train_model_one_iter(model, loss_func, optimizer):
  # implement one training epoch here and return (train_loss, valid_loss)
  # placeholder values keep the template runnable until the model is implemented
  return float('nan'), float('nan')

num_epoch = 10
# model training loop: it is better to print the training/validation losses during the training
for i in range(num_epoch):
  train_loss, valid_loss = train_model_one_iter(model, loss_func, optimizer)
  print("Train Loss: %.2f, Validation Loss: %.2f" % (train_loss, valid_loss))


# Results
In this section, you should finish training your model or load your trained model. That is a great experiment! Share the results with others, using the necessary metrics and figures.

Please test and report results for all experiments that you run with:

*   specific numbers (accuracy, AUC, RMSE, etc)
*   figures (loss shrinkage, outputs from GAN, annotation or label of sample pictures, etc)
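
As an illustration of reporting specific numbers, the sketch below computes accuracy and AUC with NumPy only. The labels and scores are toy values rather than results from this project, and the function names are illustrative; a library routine such as scikit-learn's `roc_auc_score` would normally be used instead.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def auroc(y_true, y_score):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Toy labels and scores, thresholded at 0.5 for the accuracy calculation.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
preds = [int(s >= 0.5) for s in scores]
print(f"Accuracy: {accuracy(labels, preds):.2f}, AUROC: {auroc(labels, scores):.2f}")
```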


In [None]:
# metrics to evaluate my model

# plot figures to better show the results

# it is better to save the numbers and figures for your presentation.

## Model comparison

In [None]:
# compare your model with others
# you don't need to re-run all other experiments; instead, you can directly refer to the metrics/numbers in the paper

# Discussion

In this section, you should discuss your work and make future plans. The discussion should address the following questions:
  * Assess whether the paper is reproducible.
  * Explain why it is not reproducible if your results are negative.
  * Describe “What was easy” and “What was difficult” during the reproduction.
  * Make suggestions to the authors or other reproducers on how to improve reproducibility.
  * Describe what you will do in the next phase.



In [None]:
# no code is required for this section
'''
if you want to use an image outside this notebook for explanation,
you can read and plot it here like the Scope of Reproducibility
'''

# References

1.   Sun, J, [paper title], [journal title], [year], [volume]:[issue], doi: [doi link to paper]



# Feel free to add new sections