# MedNIST Dataset Upload

This notebook uploads the MedNIST dataset to an XNAT instance.

Before running this notebook, create a new project in your XNAT named MedNIST, then Start Jupyter from the MedNIST project. 

In [1]:
import requests
import xnat

In [2]:
# ID of the XNAT project created above; change this if your project ID differs
projectId = 'MedNIST'

### Download the MedNIST dataset

Source: https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/mednist_tutorial.ipynb

The MedNIST dataset was gathered from several sets from TCIA, the RSNA Bone Age Challenge, and the NIH Chest X-ray dataset.

The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license.

In [3]:
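import requests, tarfile, io

# URL of the MedNIST archive
resource = "https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/MedNIST.tar.gz"

# The archive unpacks into a MedNIST/ folder inside this directory
data_directory = "./MedNIST"

r = requests.get(resource)
r.raise_for_status()  # stop here if the download failed

# Extract the archive straight from memory; the context manager closes it
with tarfile.open(fileobj=io.BytesIO(r.content)) as archive:
    archive.extractall(data_directory)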
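To confirm the extraction worked before uploading, it can help to count the images in each category folder. The following is a minimal sketch, assuming the layout implied by the upload paths used later in this notebook (one subfolder per category, each holding `.jpeg` files). The helper name `count_images` is hypothetical and not part of any library.

```python
from pathlib import Path

def count_images(root):
    """Return {category_name: number_of_jpeg_files} for each subfolder of root.

    Hypothetical helper for this sketch; it only looks one level deep
    and only counts files ending in .jpeg.
    """
    root = Path(root)
    return {
        folder.name: sum(1 for _ in folder.glob("*.jpeg"))
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }

# Example call, assuming the archive was extracted as in the cell above:
# count_images("./MedNIST/MedNIST")
```

If a category is missing or its count is zero, the download or extraction most likely did not complete.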

### Connect to XNAT

If you are running this upload notebook outside of XNAT/Jupyter, you'll need to supply a host, username, and password to XNATpy when connecting.
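One way to supply those credentials without hard-coding them in the notebook is to read them from environment variables. This is a sketch only: the variable names `XNAT_UPLOAD_HOST`, `XNAT_UPLOAD_USER`, and `XNAT_UPLOAD_PASSWORD` and both helper functions are assumptions made for illustration. The only XNATpy call used is `xnat.connect` with its `server`, `user`, and `password` arguments.

```python
import os

# Hypothetical variable names chosen for this sketch
REQUIRED_VARS = ("XNAT_UPLOAD_HOST", "XNAT_UPLOAD_USER", "XNAT_UPLOAD_PASSWORD")

def connection_settings(env=os.environ):
    """Build keyword arguments for xnat.connect from environment variables."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "server": env["XNAT_UPLOAD_HOST"],
        "user": env["XNAT_UPLOAD_USER"],
        "password": env["XNAT_UPLOAD_PASSWORD"],
    }

def connect_from_env():
    """Open an XNATpy connection using the settings above."""
    import xnat  # imported here so the helper above works without xnat installed
    return xnat.connect(**connection_settings())
```

Outside XNAT/Jupyter, `connection = connect_from_env()` would then take the place of the bare `xnat.connect()` call.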

In [4]:
connection = xnat.connect()
connection.caching = False

In [5]:
project = connection.projects[projectId]

### Upload Data
Each image is imported as a new subject with a new image session, so the upload might take a while. Each image category has ~10000 images, except for Breast MRI with ~8900. We'll upload the first 1000 images from each category.
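The upload cells below all build their subject label, session label, and file name from the category name and a zero-padded index. As a sketch of that naming scheme, the hypothetical helper `make_labels` (not part of XNATpy) reproduces the same strings:

```python
def make_labels(category, index, width=6):
    """Return (subject_label, experiment_label, file_name) for one image.

    Hypothetical helper mirroring the naming used in the upload cells:
    the index is zero-padded to `width` digits.
    """
    num = str(index).zfill(width)
    subject_label = f"MedNIST_{category}_S{num}"
    experiment_label = f"MedNIST_{category}_E{num}"
    file_name = f"{num}.jpeg"
    return subject_label, experiment_label, file_name

print(make_labels("AbdomenCT", 7))
# ('MedNIST_AbdomenCT_S000007', 'MedNIST_AbdomenCT_E000007', '000007.jpeg')
```

Because every image gets its own subject and session, the labels must be unique within the project, which is why the category name is part of each label.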

In [6]:
subj_start = 0
subj_stop  = 1000

#### Upload AbdomenCT

In [7]:
for i in range(subj_start, subj_stop):
    
    # Zero-padded index, matching the file names in the dataset (e.g. 000007)
    num = f"{i}".zfill(6)
    subjectLabel = f"MedNIST_AbdomenCT_S{num}"
    experimentLabel = f"MedNIST_AbdomenCT_E{num}"
    
    # Create the subject, CT session, scan, and resource, then upload the image
    subject = connection.classes.SubjectData(parent=project, label=subjectLabel)
    experiment = connection.classes.CtSessionData(parent=subject, label=experimentLabel)
    scan = connection.classes.CtScanData(parent=experiment, id='AbdomenCT')
    resource = connection.classes.ResourceCatalog(parent=scan, label='JPEG')
    resource.upload(f'{data_directory}/MedNIST/AbdomenCT/{num}.jpeg', f'{num}.jpeg')

#### Upload BreastMRI

In [8]:
for i in range(subj_start, subj_stop):
    
    num = f"{i}".zfill(6)
    subjectLabel = f"MedNIST_BreastMRI_S{num}"
    experimentLabel = f"MedNIST_BreastMRI_E{num}"
    
    subject = connection.classes.SubjectData(parent=project, label=subjectLabel)
    experiment = connection.classes.MrSessionData(parent=subject, label=experimentLabel)
    scan = connection.classes.MrScanData(parent=experiment, id='BreastMRI')
    resource = connection.classes.ResourceCatalog(parent=scan, label='JPEG')
    resource.upload(f'{data_directory}/MedNIST/BreastMRI/{num}.jpeg', f'{num}.jpeg')

#### Upload ChestCT

In [9]:
for i in range(subj_start, subj_stop):
    
    num = f"{i}".zfill(6)
    subjectLabel = f"MedNIST_ChestCT_S{num}"
    experimentLabel = f"MedNIST_ChestCT_E{num}"
    
    subject = connection.classes.SubjectData(parent=project, label=subjectLabel)
    experiment = connection.classes.CtSessionData(parent=subject, label=experimentLabel)
    scan = connection.classes.CtScanData(parent=experiment, id='ChestCT')
    resource = connection.classes.ResourceCatalog(parent=scan, label='JPEG')
    resource.upload(f'{data_directory}/MedNIST/ChestCT/{num}.jpeg', f'{num}.jpeg')

#### Upload CXR

In [13]:
for i in range(subj_start, subj_stop):
    
    num = f"{i}".zfill(6)
    subjectLabel = f"MedNIST_CXR_S{num}"
    experimentLabel = f"MedNIST_CXR_E{num}"
    
    subject = connection.classes.SubjectData(parent=project, label=subjectLabel)
    experiment = connection.classes.CrSessionData(parent=subject, label=experimentLabel)
    scan = connection.classes.CrScanData(parent=experiment, id='CXR')
    resource = connection.classes.ResourceCatalog(parent=scan, label='JPEG')
    resource.upload(f'{data_directory}/MedNIST/CXR/{num}.jpeg', f'{num}.jpeg')

#### Upload Hand

In [13]:
for i in range(subj_start, subj_stop):
    
    num = f"{i}".zfill(6)
    subjectLabel = f"MedNIST_Hand_S{num}"
    experimentLabel = f"MedNIST_Hand_E{num}"
    
    subject = connection.classes.SubjectData(parent=project, label=subjectLabel)
    experiment = connection.classes.CrSessionData(parent=subject, label=experimentLabel)
    scan = connection.classes.CrScanData(parent=experiment, id='Hand')
    resource = connection.classes.ResourceCatalog(parent=scan, label='JPEG')
    resource.upload(f'{data_directory}/MedNIST/Hand/{num}.jpeg', f'{num}.jpeg')

#### Upload HeadCT

In [14]:
for i in range(subj_start, subj_stop):
    
    num = f"{i}".zfill(6)
    subjectLabel = f"MedNIST_HeadCT_S{num}"
    experimentLabel = f"MedNIST_HeadCT_E{num}"
    
    subject = connection.classes.SubjectData(parent=project, label=subjectLabel)
    experiment = connection.classes.CtSessionData(parent=subject, label=experimentLabel)
    scan = connection.classes.CtScanData(parent=experiment, id='HeadCT')
    resource = connection.classes.ResourceCatalog(parent=scan, label='JPEG')
    resource.upload(f'{data_directory}/MedNIST/HeadCT/{num}.jpeg', f'{num}.jpeg')

### Restart
If you're running this in XNAT/Jupyter, you may need to Stop Jupyter and then Start Jupyter again from XNAT for the data to appear in /data.