<a href="https://colab.research.google.com/github/kirbyju/TCIA_Notebooks/blob/main/ACNS0332/ACNS0332.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Accessing and visualizing DICOM annotations from the ACNS0332 dataset hosted on TCIA

This notebook demonstrates how to access the **"Chemotherapy and Radiation Therapy in Treating Young Patients With Newly Diagnosed, Previously Untreated, High-Risk Medulloblastoma/PNET (ACNS0332)"** Collection hosted on [The Cancer Imaging Archive (TCIA)](https://cancerimagingarchive.net).  This dataset includes [DICOM MRI images](https://doi.org/10.7937/TCIA.582B-XZ89) hosted on TCIA and [clinical data](https://nctn-data-archive.nci.nih.gov/node/838) hosted by the NCTN Data Archive.  The National Cancer Institute has also funded an activity to generate and publish [annotations (3D segmentation labels and seed points)](https://doi.org/10.7937/D8A8-6252) on TCIA to help jumpstart research on tumor detection and auto-segmentation methods.  


# 1 Learn about and request access to the ACNS0332 datasets

The imaging, clinical and annotation data for ACNS0332 are described in detail at the following links.  These pages are publicly visible without logging in so you can obtain an understanding of the dataset before going through the trouble of requesting access:

1.  [ACNS0332 Collection Summary](https://doi.org/10.7937/TCIA.582B-XZ89)
2.  [ACNS0332 Annotation Summary](https://doi.org/10.7937/D8A8-6252)
3.  Descriptions of the 3 clinical datasets can be viewed at https://nctn-data-archive.nci.nih.gov/node/838.  Clicking on each dataset also allows you to view a detailed Data Dictionary outlining the types of clinical variables that were collected.

### Requesting Access to the data
In order to download the actual data you must request access through the NCTN Data Archive via the following steps:
 
 1. [Register an account on the NCTN Data Archive](https://nctn-data-archive.nci.nih.gov/).  
 2. After logging in, use the "Request Data" link in the left side menu.  
 3. Follow the on screen instructions, and enter ***NCT00392327*** when asked which trial you want to request.  
 4. In step 2 of the Create Request form, be sure to select “Imaging Data Requested”. 
 
Once you are approved for access you'll be able to download the clinical data from the NCTN Archive.  You will then be asked to create an account on TCIA with the same email address so that you can access the imaging data.  Please contact NCINCTNDataArchive@mail.nih.gov for any questions about access requests.  

# 2 Set up your TCIA credential file

Since the ACNS0332 collection requires a login, you must set up a TCIA credential file that contains your user name and password. 

**NOTE:** You must enter your real user name and password before you run this, or go and edit the resulting text file with your real credentials after it's created. 

In [None]:
# Create the credential file

lines = ['userName=YourUserName', 'passWord=YourPassword']
with open('credentials.txt', 'w') as f:
    f.write('\n'.join(lines))
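
Because this file stores your password in plain text, you may want to restrict who can read it. The following is a minimal sketch, assuming a POSIX file system such as the Linux VM behind Colab. The helper name `write_credentials` is ours for illustration and is not part of any TCIA tooling.

```python
import os

def write_credentials(path, user_name, pass_word):
    """Write a TCIA credential file readable only by the current user."""
    lines = ['userName=' + user_name, 'passWord=' + pass_word]
    # create (or truncate) the file with owner-only read/write permissions
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, 'w') as f:
        f.write('\n'.join(lines))
    # tighten permissions in case the file already existed with a looser mode
    os.chmod(path, 0o600)
```

Calling `write_credentials('credentials.txt', 'YourUserName', 'YourPassword')` produces the same two-line file as the cell above.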

# 3 Downloading images and annotations with NBIA Data Retriever

TCIA utilizes software called NBIA to manage its DICOM data.  One way to download TCIA data is to install the [Linux command-line version of the NBIA Data Retriever](https://wiki.cancerimagingarchive.net/x/2QKPBQ) using the following steps.  This tool provides a number of useful features such as multi-threaded downloads, auto-retry if there are any problems, saving data in an organized hierarchy on your hard drive (Collection > Patient > Study > Series > Images) and providing a CSV file containing key DICOM metadata about the images you've downloaded.

### 3.1 Install the NBIA Data Retriever CLI package

In [None]:
# install NBIA Data Retriever CLI software for downloading images later in this notebook

# -p avoids an error if the directory already exists (e.g. when re-running this cell)
!mkdir -p /usr/share/desktop-directories/
!wget -P /content/NBIA-Data-Retriever https://cbiit-download.nci.nih.gov/nbia/releases/ForTCIA/NBIADataRetriever_4.4/nbia-data-retriever-4.4.deb
!dpkg -i /content/NBIA-Data-Retriever/nbia-data-retriever-4.4.deb

# NOTE: If you're working on a Linux OS that uses RPM packages you can change the wget line above to point to
#       https://cbiit-download.nci.nih.gov/nbia/releases/ForTCIA/NBIADataRetriever_4.4/NBIADataRetriever-4.4-1.x86_64.rpm

### 3.2 Download the full dataset using NBIA Data Retriever CLI
The Data Retriever software works by ingesting a "manifest" file that contains the DICOM Series Instance UIDs of the scans you'd like to download. The manifest files can be downloaded from [this page](https://doi.org/10.7937/D8A8-6252), but you can also use wget to obtain these manifests with the commands below.

* ACNS0332 Annotations -- Segmentations, Seed Points, and Negative Findings Assessments 
* Original ACNS0332 Images used to create Segmentations & Seed Points
* Original ACNS0332 Images used to create Negative Assessment reports
* Manifest containing examples of each annotation type for a single subject/study (useful for quick testing/demos)

In [None]:
# ACNS0332 Annotations -- Segmentations, Seed Points, and Negative Findings Assessments
!wget -O /content/ACNS0332-Tumor-Annotations-manifest.tcia https://wiki.cancerimagingarchive.net/download/attachments/119703167/ACNS0332-Tumor-Annotations-manifest.tcia?api=v2

# Original ACNS0332 Images used to create Segmentations & Seed Points
!wget -O /content/ACNS0332-OriginalMRs-SEGSandSeedpoints-manifest.tcia https://wiki.cancerimagingarchive.net/download/attachments/119703167/ACNS0332-OriginalMRs-SEGSandSeedpoints-manifest.tcia?api=v2

# Original ACNS0332 Images used to create Negative Assessment reports
# (no segmentation or seed points created for the scan)
!wget -O /content/ACNS0332-OriginalMRs-NegativeAssessments-manifest.tcia https://wiki.cancerimagingarchive.net/download/attachments/119703167/ACNS0332-OriginalMRs-NegativeAssessments-manifest.tcia?api=v2

# Single subject manifest containing examples of each annotation type
# Use this one for a quick demo
!wget -O /content/acns0332-demo-PARJIR.tcia https://github.com/kirbyju/TCIA_Notebooks/raw/main/ACNS0332/acns0332-demo-PARJIR.tcia
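
To see which series a manifest will request before you start a download, you can read it as plain text. This sketch assumes the manifest layout we have seen from TCIA: a few `key=value` header lines, then a `ListOfSeriesToDownload=` line, then one Series Instance UID per line. Please open one of your downloaded manifests to confirm that this holds. `read_manifest_uids` is a hypothetical helper, not part of the NBIA Data Retriever.

```python
def read_manifest_uids(path):
    """Return the Series Instance UIDs listed in a .tcia manifest file."""
    uids = []
    in_list = False
    with open(path, 'rt') as f:
        for line in f:
            line = line.strip()
            # assumption: UIDs start on the line after this header
            if line.startswith('ListOfSeriesToDownload='):
                in_list = True
                continue
            if in_list and line:
                uids.append(line)
    return uids
```

For example, `len(read_manifest_uids('/content/acns0332-demo-PARJIR.tcia'))` should tell you how many series the demo manifest will download.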

Now we can open the manifest file(s) with the NBIA Data Retriever to download the actual data. ***Please note*** that after running the following command you have to ***click in the output cell and type "y"*** to agree with the TCIA Data Usage Policy to start the download.

You can repeat this step for each manifest you'd like to download by changing the path.  For demonstration purposes we'll use the single subject manifest.

In [None]:
# download the original MRI scans, seed points, and segmentations

!/opt/nbia-data-retriever/nbia-data-retriever --cli '/content/acns0332-demo-PARJIR.tcia' -d /content/ -l /content/credentials.txt
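
Once the download finishes, you can check what arrived by walking the output folder. The sketch below counts image files per folder. It assumes the downloaded images use a `.dcm` extension, and `count_dicom_files` is a name chosen for this example.

```python
import os

def count_dicom_files(root):
    """Return {relative folder: number of .dcm files} for folders under root that hold images."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        n = sum(1 for name in filenames if name.lower().endswith('.dcm'))
        if n:
            counts[os.path.relpath(dirpath, root)] = n
    return counts
```

Each key in the returned dictionary corresponds to one series folder in the Collection > Patient > Study > Series hierarchy, so the number of keys gives a quick series count.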

# 4 Accessing the REST APIs 
The NBIA REST APIs are another way to query metadata and download image data.  Since the ACNS0332 dataset is "limited access" we'll need to use the "NBIA Search with Authentication REST API" described at https://wiki.cancerimagingarchive.net/x/X4ATBg which enables you to use your login credentials to create API tokens to access this Collection.

In the following examples we'll use the API to construct queries to explore and download ACNS0332 data.  Many of these queries shown below allow for additional query parameters to refine your search results.  These are covered in the aforementioned documentation.

In [None]:
# imports

import requests
import pandas as pd

### 4.1 Use credential file to create an API token

These steps use the credential file you created previously to generate an access token to query restricted Collections on TCIA.  

***Note:*** Tokens are valid for 2 hours and must be refreshed after that point. See https://wiki.cancerimagingarchive.net/x/X4ATBg for more details. 

In [None]:
# extract the user/pw from the credential file to variables for use in subsequent API calls and downloads          

credentialFilePath = 'credentials.txt'
mylines = []                                  
with open (credentialFilePath, 'rt') as myfile: 
    for myline in myfile:                     
        mylines.append(myline)   

userName = mylines[0].rstrip('\n').split(r'userName=')[1]
passWord = mylines[1].rstrip('\n').split(r'passWord=')[1]  
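
The parsing above relies on `userName` being on the first line and `passWord` on the second. If you edit the file by hand, an order-independent parser may be more forgiving. This is a sketch only, and `read_credentials` is a hypothetical helper.

```python
def read_credentials(path):
    """Parse key=value lines from a credential file into a dict."""
    creds = {}
    with open(path, 'rt') as f:
        for line in f:
            line = line.rstrip('\r\n')
            if '=' not in line:
                continue  # skip blank or malformed lines
            # split on the first '=' only, so a password may itself contain '='
            key, value = line.split('=', 1)
            creds[key] = value
    return creds
```

With this helper, `creds = read_credentials('credentials.txt')` gives you `creds['userName']` and `creds['passWord']` regardless of line order.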

In [None]:
# request token

token_url = "https://services.cancerimagingarchive.net/nbia-api/oauth/token?username="+userName+"&password="+passWord+"&grant_type=password&client_id=nbiaRestAPIClient&client_secret=ItsBetweenUAndMe"
access_token = requests.get(token_url).json()["access_token"]
print (access_token)
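
The token URL above is built by plain string concatenation, which can break if your user name or password contains characters such as `&`, `=` or spaces. The sketch below URL-encodes the same parameters with the standard library and adds a small check for the 2-hour token lifetime. The function names and the 60-second safety margin are our own choices for illustration.

```python
import time
from urllib.parse import urlencode

TOKEN_ENDPOINT = "https://services.cancerimagingarchive.net/nbia-api/oauth/token"
TOKEN_LIFETIME_SECONDS = 2 * 60 * 60  # tokens are valid for 2 hours

def build_token_url(user_name, pass_word):
    """Build the token request URL with URL-encoded credentials."""
    params = {
        "username": user_name,
        "password": pass_word,
        "grant_type": "password",
        "client_id": "nbiaRestAPIClient",
        "client_secret": "ItsBetweenUAndMe",
    }
    return TOKEN_ENDPOINT + "?" + urlencode(params)

def token_expired(issued_at, now=None, margin=60):
    """Return True if a token issued at `issued_at` (epoch seconds) should be refreshed."""
    now = time.time() if now is None else now
    return now - issued_at >= TOKEN_LIFETIME_SECONDS - margin
```

You could record `issued_at = time.time()` right after requesting a token and call `token_expired(issued_at)` before long-running downloads to decide whether to request a new one.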


### 4.2 Explore the data with REST API Queries

Now we'll set some variables that will apply to the remaining queries.

In [None]:
# set base URL to use the NBIA Search API w/ Authentication.
# Documentation about this API is at https://wiki.cancerimagingarchive.net/x/X4ATBg
base_url = "https://services.cancerimagingarchive.net/nbia-api/services/v2/"

# set Advanced URL to use the NBIA Advanced API.
# Documentation about this API is at https://wiki.cancerimagingarchive.net/x/YoATBg
adv_url = "https://services.cancerimagingarchive.net/nbia-api/services/"

# set collection you want to explore
collection = "ACNS0332"

# set API call headers to use the access token we created
api_call_headers = {'Authorization': 'Bearer ' + access_token}

Next let's run some queries to learn about what types of images are available in this Collection.

In [None]:
# print body part(s) examined in the collection as JSON

data_url = base_url + "getBodyPartValues?Collection=" + collection
data = requests.get(data_url, headers = api_call_headers)
if data.text != "":
    data = data.json()
    print (data)
else:
    print("Collection not found")

In [None]:
# print modalities in the collection as JSON

data_url = base_url + "getModalityValues?Collection=" + collection
data = requests.get(data_url, headers = api_call_headers)
if data.text != "":
    data = data.json()
    print (data)
else:
    print("Collection not found")
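
The query cells in this section all follow the same pattern: build a URL from an endpoint name and query parameters, send it with the token header, and treat an empty body as "nothing found". A small pair of helpers can capture that pattern. This is a sketch under our own naming (`build_query_url` and `decode_response` are not part of the NBIA API), and it deliberately leaves the network call to you.

```python
import json
from urllib.parse import urlencode

def build_query_url(base_url, endpoint, **params):
    """Build an API URL, URL-encoding parameters and skipping those set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base_url + endpoint
    return url + "?" + query if query else url

def decode_response(text):
    """Return parsed JSON, or an empty list when the API sends an empty body."""
    return json.loads(text) if text.strip() else []
```

For example, `requests.get(build_query_url(base_url, "getModalityValues", Collection=collection), headers=api_call_headers)` sends the same request as the cell above. Check the API documentation for which additional parameters each endpoint accepts.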

In [None]:
# Count the number of patients with a given modality in the collection
# For ACNS0332 the 3D segmentations are SEG modality. 
# RTSTRUCT was used to record seed points and scans where no tumor was found.

# get list of available modalities and their patient counts
data_url = adv_url + "getModalityValuesAndCounts?Collection=" + collection
data = requests.get(data_url, headers = api_call_headers)

# count unique patients for each modality
if data.text != "":
    df = pd.DataFrame(data.json())
    df.rename(columns = {'criteria':'Modality', 'count':'PatientCount'}, inplace = True)
    df.PatientCount = df.PatientCount.astype(int)
    display(df.sort_values(by='PatientCount', ascending=False))
else:
    print("Collection not found.")

In [None]:
# Count the number of patients with a given body part examined in the collection

# get list of available body parts examined
data_url = adv_url + "getBodyPartValuesAndCounts?Collection=" + collection
data = requests.get(data_url, headers = api_call_headers)

# count unique patients for each body part examined
if data.text != "":
    df = pd.DataFrame(data.json())
    df.rename(columns = {'criteria':'BodyPartExamined', 'count':'PatientCount'}, inplace = True)
    df.PatientCount = df.PatientCount.astype(int)
    display(df.sort_values(by='PatientCount', ascending=False))
else:
    print("Collection not found.")

Now let's run some queries to see what we can learn about the patient cohort from the DICOM metadata.  This information can include things like age, gender, and ethnicity.  However, in the case of ACNS0332, most of this information is also available in the clinical data at https://nctn-data-archive.nci.nih.gov/node/838.

In [None]:
# obtain patient details (e.g. species, gender, ethnicity) for the collection 
# as JSON and create pandas dataframe w/ optional file export

data_url = base_url + "getPatient?Collection=" + collection
data = requests.get(data_url, headers = api_call_headers)
if data.text != "":
    df = pd.DataFrame(data.json())
    display(df)
    # optional - to save to JSON or CSV file
    df.to_csv(collection + '_patient_metadata.csv')
    # df.to_json(collection + '_patient_metadata.json')
else:
    print("Collection not found.")

In [None]:
# obtain study/visit details (e.g. anonymized study date, age at the time of visit) for 
# each patient in a given collection as JSON and create pandas dataframe w/ optional file export

data_url = base_url + "getPatientStudy?Collection=" + collection
data = requests.get(data_url, headers = api_call_headers)
if data.text != "":
    df = pd.DataFrame(data.json()).sort_values(by=['PatientID','StudyDate'])
    display(df)
    # optional - to save to JSON or CSV file
    df.to_csv(collection + '_study_metadata.csv')
    # df.to_json(collection + '_study_metadata.json')
else:
    print("Collection not found.")

We can also create a report that gives useful metadata about each scan in the dataset (e.g. series description, modality, scanner manufacturer & software version, number of images).  

***Note:*** We'll define a function for this so it can be used later in the notebook.

In [None]:
# obtain scan/series metadata for a collection as JSON

def getSeries(collection):
    data_url = base_url + "getSeries?Collection=" + collection
    data = requests.get(data_url, headers = api_call_headers)
    if data.text != "":
        df = pd.DataFrame(data.json()).sort_values(by=['PatientID','SeriesDate'])
        # optional - save to CSV file
        df.to_csv(collection + '_scan_metadata.csv')
        return df
    else:
        print("Collection not found.")

df = getSeries(collection)
display(df)

Finally, we can use that scan report dataframe to generate some helpful summary statistics about the Collection.

In [None]:
# Calculate summary statistics for a given collection 

# Summarize patients
print('Summary Statistics for', collection,'\n')
print('Subjects: ', len(df['PatientID'].value_counts()), 'subjects')
print('Studies: ', len(df['StudyInstanceUID'].value_counts()), 'studies')
print('Series: ', len(df['SeriesInstanceUID'].value_counts()), 'series')
print('Images: ', df['ImageCount'].sum(), 'images\n')

# Summarize modalities
print("Series Counts - Modality:")
print(df['Modality'].value_counts(dropna=False),'\n')

# Summarize body parts
print("Series Counts - Body Parts Examined:")
print(df['BodyPartExamined'].value_counts(dropna=False),'\n')

# Summarize manufacturers
print("Series Counts - Device Manufacturers:")
print(df['Manufacturer'].value_counts(dropna=False))

### 4.3 Downloading data with the REST API
Now we'll walk through using the API to download data.  This can be useful if you'd like to download specific scans from previous API queries rather than using an existing manifest file or if you can't install the NBIA Data Retriever.  

As a reminder, many of the scans in the ACNS0332 Collection were not annotated by the authors of https://doi.org/10.7937/D8A8-6252.  The reasons for this are outlined in the Annotation Protocol on that page.  As a result, you may wish to download only a subset of the scans such as:

1. Seed points
2. 3D segmentations
3. All MRI images used to create either seed points or segmentations
4. Only MRI images used to create seed points
5. Only MRI images used to create segmentations
6. Only MRI images with negative finding assessments

The following examples will demonstrate how to download the full collection as well as how to tackle each of these specialized use cases.  

***Note: By default only the first 3 scans for each use case below will be downloaded for demonstration purposes. If you'd like to download the full collection you must comment out the relevant lines.***

In [None]:
# download imports
import requests, zipfile
from io import BytesIO

First, let's define a generic download function that we can re-use for the various use cases.  This will take a list of series UIDs as the input, download each scan, and create a dataframe/CSV that contains the metadata about each of those scans.  It also accepts an optional parameter to specify a file name if you'd like a CSV export of the dataframe.

In [None]:
# define a function that accepts a list of SeriesInstanceUIDs and downloads each series
# reminder: this only downloads the first 3 scans unless you comment out that section

def downloadSeries(series_data, csv_filename=""):  
    manifestDF=pd.DataFrame()
    seriesUID = ''
    count = 0
    for x in series_data:
        seriesUID = x
        data_url = base_url + "getImage?SeriesInstanceUID=" + seriesUID
        print("Downloading " + data_url)
        data = requests.get(data_url, headers = api_call_headers)
        file = zipfile.ZipFile(BytesIO(data.content))
        # print(file.namelist())
        file.extractall(path = "apiDownload/" + collection + "/" + seriesUID)
        # write the series metadata to a dataframe
        metadata_url = base_url + "getSeriesMetaData?SeriesInstanceUID=" + seriesUID
        metadata = requests.get(metadata_url, headers = api_call_headers).json()
        newRow = pd.DataFrame.from_dict(metadata)
        manifestDF = pd.concat([manifestDF, newRow], ignore_index = True)
        # Stop after 3 scans for demo purposes - comment out these next 3 lines to download the full results
        count += 1
        if count == 3:
            break
    # display manifest dataframe and/or save manifest to CSV file
    if csv_filename != "":
        manifestDF.to_csv(csv_filename + '.csv')
        display(manifestDF)
    else:
        display(manifestDF)
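
The download function above passes the response straight to `zipfile.ZipFile`, which raises an error if the server returns something other than a ZIP archive. That might happen, for example, if your token has expired. The sketch below checks the bytes first and reports a clearer error. `extract_series_zip` is a hypothetical helper and the error message is our own wording.

```python
import zipfile
from io import BytesIO

def extract_series_zip(content, dest_dir):
    """Extract a downloaded series archive and return the list of file names it contained.

    Raises ValueError when the bytes are not a ZIP archive.
    """
    buffer = BytesIO(content)
    if not zipfile.is_zipfile(buffer):
        raise ValueError("Response was not a ZIP archive; check your token and the series UID")
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(path=dest_dir)
        return archive.namelist()
```

Inside `downloadSeries` this could replace the two `zipfile` lines, e.g. `extract_series_zip(data.content, "apiDownload/" + collection + "/" + seriesUID)`.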

The most basic use case would be to simply download the entire Collection.  This will provide all of the annotation data (seed points, segmentations, negative finding reports) as well as all of the original scans in the ACNS0332 collection.  Make sure you have enough disk space (~95 GBytes) if you comment out the code that limits the download to the first 3 scans!  The following code will also save important metadata about the scans into a Pandas dataframe and CSV file.  

In [None]:
# call getSeries function to retrieve scan metadata for the whole collection
df = getSeries(collection)

# extract the SeriesInstanceUID column
series_data = list(df['SeriesInstanceUID'])

# feed series_data to our downloadSeries function
downloadSeries(series_data, collection + "_full_Collection")

To identify the subsets for the other use cases we'll leverage the supplemental spreadsheet the authors provided, which you can download directly from https://doi.org/10.7937/D8A8-6252 or retrieve with ***wget*** using the command below.

In [None]:
# wget ACNS0332 annotation metadata file

!wget -O /content/ACNS0332_annotations_metadata-2022-09-22.csv https://wiki.cancerimagingarchive.net/download/attachments/119703167/ACNS0332_annotations_metadata-2022-09-22.csv?api=v2

Let's take a look at the contents of the spreadsheet using a Pandas dataframe.  

In [None]:
# load annotation metadata spreadsheet to df

annotation_Metadata = pd.read_csv('/content/ACNS0332_annotations_metadata-2022-09-22.csv')

display(annotation_Metadata)

The following cells will let you build a list of Series Instance UIDs to download based on the previously mentioned use cases.

In [None]:
# Use case: Download seed point RTSTRUCTs

# filter dataframe to only include seed point rows
seedPoints = annotation_Metadata[annotation_Metadata['StructureSetLabel'] == 'Seed Points']
#display(seedPoints)

# extract series UID column to list for downloading
series_data = seedPoints["SeriesInstanceUID"].tolist()

# feed series_data to our downloadSeries function
downloadSeries(series_data, collection + "_seedPoints")

In [None]:
# Use case: Download 3d segmentations

# filter dataframe to only include segmentations
segs = annotation_Metadata[annotation_Metadata['DICOM Type'] == 'SEG']
# display(segs)

# extract series UID column to list for downloading
series_data = segs["SeriesInstanceUID"].tolist()

# feed series_data to our downloadSeries function
downloadSeries(series_data, collection + "_segs")

The following cells will download the corresponding MRIs that were annotated.  ***There is significant overlap in the MRIs used between these two sets, so if you're doing full downloads you should only utilize one of the 3 cells below depending on your use case:***
1. All MRIs used for both segmentations ***and*** seed points
2. Only MRIs used for segmentations
3. Only MRIs used for seed points

In [None]:
# Use case: Download all MRIs for segmentations AND seed points

# filter dataframe to only include seg and seed point rows (remove "no findings")
ref_Series = annotation_Metadata[(annotation_Metadata['StructureSetLabel'] == 'Seed Points') |
                                 (annotation_Metadata['DICOM Type'] == 'SEG')]

# remove duplicate ReferencedSeriesUIDs
clean_refSeries = ref_Series.drop_duplicates(subset='ReferencedSeriesInstanceUID')
# display(clean_refSeries)

# extract series UID column to list for downloading
series_data = clean_refSeries["ReferencedSeriesInstanceUID"].tolist()

# feed series_data to our downloadSeries function
downloadSeries(series_data, collection + "_seg_seed_MRIs")

In [None]:
# Use case: Download only MRI images used to create seed points
# ONLY USE THIS OPTION IF YOU DO NOT WANT THE ADDITIONAL 
# MRI SCANS FOR SEGMENTATIONS 

# filter dataframe to only include seed point rows
ref_Series = annotation_Metadata[annotation_Metadata['StructureSetLabel'] == 'Seed Points']
# display(ref_Series)

# remove duplicate ReferencedSeriesUIDs
clean_refSeries = ref_Series.drop_duplicates(subset='ReferencedSeriesInstanceUID')
# display(clean_refSeries)

# extract series UID column to list for downloading
series_data = clean_refSeries["ReferencedSeriesInstanceUID"].tolist()

# feed series_data to our downloadSeries function
downloadSeries(series_data, collection + "_seed_MRIs")

In [None]:
# Use case: Download only MRI images used to create 3D segmentations
# ONLY USE THIS OPTION IF YOU DO NOT WANT THE ADDITIONAL 
# MRI SCANS USED FOR SEED POINTS 

# filter dataframe to only include seg
ref_Series = annotation_Metadata[annotation_Metadata['DICOM Type'] == 'SEG']
# display(ref_Series)

# remove duplicate ReferencedSeriesUIDs
clean_refSeries = ref_Series.drop_duplicates(subset='ReferencedSeriesInstanceUID')
# display(clean_refSeries)

# extract series UID column to list for downloading
series_data = clean_refSeries["ReferencedSeriesInstanceUID"].tolist()

# feed series_data to our downloadSeries function
downloadSeries(series_data, collection + "_seg_MRIs")

The following code will download the MRI scans for images with negative finding assessments.  These are cases where the authors of the dataset did not find anything that could be annotated.  Downloading these scans could be useful if you are training a tumor/metastases detection model.

In [None]:
# Use case: Download only images with negative finding assessments
# ONLY USE THIS OPTION IF YOU WANT TO REVIEW THE ADDITIONAL 
# MRI SCANS WHERE THE AUTHORS INDICATED THERE WAS NOTHING TO ANNOTATE 

# filter dataframe to only include MRIs with "no findings"
ref_Series = annotation_Metadata[annotation_Metadata['StructureSetLabel'] == 'No Findings']

# remove duplicate ReferencedSeriesUIDs
clean_refSeries = ref_Series.drop_duplicates(subset='ReferencedSeriesInstanceUID')

# extract series UID column to list for downloading
series_data = clean_refSeries["ReferencedSeriesInstanceUID"].tolist()

# feed series_data to our downloadSeries function
downloadSeries(series_data, collection + "_noFinding_MRIs")
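
The use-case cells above all share one pattern: filter the annotation rows by `StructureSetLabel` and/or `DICOM Type`, then collect the unique UIDs from one column. The sketch below expresses that pattern as a single function over a list of row dictionaries, such as the output of `annotation_Metadata.to_dict('records')`. The function name `select_series` is our own, and the column names are the ones used in the cells above.

```python
def select_series(rows, label=None, dicom_type=None, column='ReferencedSeriesInstanceUID'):
    """Return unique UIDs from `column` for rows matching either filter, in first-seen order."""
    seen = set()
    uids = []
    for row in rows:
        matches = ((label is not None and row['StructureSetLabel'] == label) or
                   (dicom_type is not None and row['DICOM Type'] == dicom_type))
        if matches and row[column] not in seen:
            seen.add(row[column])
            uids.append(row[column])
    return uids
```

For example, `select_series(rows, label='Seed Points', dicom_type='SEG')` would list the MRIs used for either annotation type, while `select_series(rows, dicom_type='SEG', column='SeriesInstanceUID')` would list the segmentation series themselves.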

In [None]:
# TROUBLESHOOTING ONLY - CAN DELETE IN FINAL COPY
# calculate the differences, if any, between the MRI UIDs referenced by seed points, segmentations and no-finding reports

# filter dataframe to only include no findings rows
noFindings = annotation_Metadata[annotation_Metadata['StructureSetLabel'] == 'No Findings']
#display(noFindings)
print(len(noFindings))

# remove duplicates
clean_noFindings = noFindings.drop_duplicates(subset='ReferencedSeriesInstanceUID')
#display(clean_noFindings)
print(len(clean_noFindings))

# extract no findings ref series UID column
noFinding_refSeries = clean_noFindings["ReferencedSeriesInstanceUID"].tolist()
#print(len(noFinding_refSeries))
#print(noFinding_refSeries)

# filter dataframe to only include seed point rows
seedPoints = annotation_Metadata[annotation_Metadata['StructureSetLabel'] == 'Seed Points']
#display(seedPoints)
print(len(seedPoints))

# remove duplicates
clean_seedPoints = seedPoints.drop_duplicates(subset='ReferencedSeriesInstanceUID')
#display(clean_seedPoints)
print(len(clean_seedPoints))

# extract seed point ref series UID column
seedPoint_refSeries = clean_seedPoints["ReferencedSeriesInstanceUID"].tolist()
#print(len(seedPoint_refSeries))
#print(seedPoint_refSeries)

# filter dataframe to only include seg rows
segs = annotation_Metadata[annotation_Metadata['DICOM Type'] == 'SEG']
#display(segs)
print(len(segs))

# remove duplicates
clean_segs = segs.drop_duplicates(subset='ReferencedSeriesInstanceUID')
#display(clean_segs)
print(len(clean_segs))

# extract seg ref series UID column
seg_refSeries = clean_segs["ReferencedSeriesInstanceUID"].tolist()
#print(len(seg_refSeries))
#print(seg_refSeries)

# compare ref series UID columns for seed vs seg
diff = []
for element in seedPoint_refSeries:
    if element not in seg_refSeries:
        diff.append(element)
print('List seed point reference UIDs that are not in the segmentation list')
print(len(diff))
print(diff)

# compare ref series UID columns for seg vs seed
diff = []
for element in seg_refSeries:
    if element not in seedPoint_refSeries:
        diff.append(element)
print('List segmentation reference UIDs that are not in the seed point list')
print(len(diff))
print(diff)

# compare ref series UID columns for no finding vs seg
diff = []
for element in seg_refSeries:
    if element not in noFinding_refSeries:
        diff.append(element)
print(len(diff))
print(diff)

# compare ref series UID columns for no finding vs seed
diff = []
for element in seedPoint_refSeries:
    if element not in noFinding_refSeries:
        diff.append(element)
print(len(diff))
print(diff)
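
The list comparisons in the cell above can also be written with Python set operations, which report both one-sided differences and the overlap in a single call. This is a sketch of an alternative and not the approach the notebook itself uses. `compare_uid_lists` is a name chosen for this example.

```python
def compare_uid_lists(first, second):
    """Summarize the overlap between two lists of UIDs using set operations."""
    a, b = set(first), set(second)
    return {
        'only_first': sorted(a - b),   # UIDs that appear only in the first list
        'only_second': sorted(b - a),  # UIDs that appear only in the second list
        'shared': sorted(a & b),       # UIDs that appear in both lists
    }
```

For example, `compare_uid_lists(seedPoint_refSeries, seg_refSeries)` would show how many MRIs are referenced by both seed points and segmentations.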

# Acknowledgements
[The Cancer Imaging Archive (TCIA)](https://www.cancerimagingarchive.net/) is a service which de-identifies and hosts a large publicly available archive of medical images of cancer.  TCIA is funded by the [Cancer Imaging Program (CIP)](https://imaging.cancer.gov/), a part of the United States [National Cancer Institute (NCI)](https://www.cancer.gov/), and is managed by the [Frederick National Laboratory for Cancer Research (FNLCR)](https://frederick.cancer.gov/).

This notebook was created by [Justin Kirby](https://www.linkedin.com/in/justinkirby82/), [Petr Jordan](https://www.linkedin.com/in/petrjordan/) and Qinyan Pan.  If you leverage the ACNS0332 or any other TCIA datasets in your work please be sure to comply with the [TCIA Data Usage Policy](https://wiki.cancerimagingarchive.net/x/c4hF). Upon receiving access, you must also abide by the terms of your NCTN/NCORP Data Archive’s Data Use Agreement (DUA). You are not allowed to redistribute the data or use it for other purposes. Attribution should include references to the following citations:

## Data Citations

1. Hwang, E. I., Kool, M., Burger, P. C., Capper, D., Chavez, L., Brabetz, S., Williams-Hughes, C., Billups, C., Heier, L., Jaju, A., Michalski, J., Li, Y., Leary, S., Zhou, T., von Deimling, A., Jones, D. T. W., Fouladi, M., Pollack, I. F., Gajjar, A., … Olson, J. M. (2021). Chemotherapy and Radiation Therapy in Treating Young Patients With Newly Diagnosed, Previously Untreated, High-Risk Medulloblastoma/PNET (ACNS0332) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.582B-XZ89
2. Rozenfeld, M., & Jordan, P. (2022). Annotations for Chemotherapy and Radiation Therapy in Treating Young Patients With Newly Diagnosed, Previously Untreated, High-Risk Medulloblastoma/PNET (ACNS0332-Tumor-Annotations) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/D8A8-6252

## Publication Citation

Hwang, E. I., Kool, M., Burger, P. C., Capper, D., Chavez, L., Brabetz, S., Williams-Hughes, C., Billups, C., Heier, L., Jaju, A., Michalski, J., Li, Y., Leary, S., Zhou, T., von Deimling, A., Jones, D. T. W., Fouladi, M., Pollack, I. F., Gajjar, A., … Olson, J. M. (2018). Extensive Molecular and Clinical Heterogeneity in Patients With Histologically Diagnosed CNS-PNET Treated as a Single Entity: A Report From the Children’s Oncology Group Randomized ACNS0332 Trial. Journal of Clinical Oncology, 36(34), 3388–3395. https://doi.org/10.1200/jco.2017.76.4720. Epub ahead of print. PMID: 30332335.

## TCIA Citation

Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, 26(6), 1045–1057. https://doi.org/10.1007/s10278-013-9622-7