You can download and run this notebook locally, or you can run it for free in a cloud environment using Colab or SageMaker Studio Lab:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/SL2027/TCIA_Notebooks/blob/main/TCIA_REST_API_Complete_Documentation.ipynb)

[![Open In SageMaker Studio Lab](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github.com/SL2027/TCIA_Notebooks/blob/main/TCIA_REST_API_Complete_Documentation.ipynb)

### Set logging level to INFO in Google Colab (optional)
Skip this step unless you're running on **Google Colab**, whose root logging handler only shows warnings and errors by default. To see INFO statements, run the following code. This is particularly helpful for the download examples, because it lets you see progress as downloads complete.

In [None]:
import logging

# Check current handlers
#print(logging.root.handlers)

# Remove all handlers associated with the root logger object.
for handler in logging.root.handlers[:]:
    logging.root.removeHandler(handler)
#print(logging.root.handlers)

# Set handler with level = info
logging.basicConfig(format='%(asctime)s:%(levelname)s:%(message)s', 
                    level=logging.INFO)

print("Logging set to INFO")

<a id = "summary"></a>
# Summary

Access to large, high-quality datasets is essential for researchers to understand disease and precision medicine pathways, especially in cancer. However, HIPAA constraints make sharing medical images outside an individual institution a complex process. [The Cancer Imaging Archive (TCIA)](https://www.cancerimagingarchive.net/) is a public service funded by the National Cancer Institute which addresses this challenge by providing hosting and de-identification services that take major burdens of data sharing off researchers.

The tcia_utils package contains functions to simplify common tasks one might perform when interacting with The Cancer Imaging Archive (TCIA) via Jupyter/Python. Learn more about TCIA and its open-access datasets at https://www.cancerimagingarchive.net/. Please be sure to comply with the TCIA Data Usage Policy. Learn more about the tcia_utils package on its PyPI page at https://pypi.org/project/tcia-utils/.

## Installation

In [None]:
!pip install --upgrade -q tcia-utils
!pip install --upgrade -q pandas
!pip install --upgrade -q requests

## Usage
To import functions related to the NBIA software, which holds TCIA's DICOM radiology data:

In [None]:
import requests
import pandas as pd
from tcia_utils import nbia

# 1. Learn about Available Collections on the TCIA Website

[Browsing Collections](https://www.cancerimagingarchive.net/collections) and viewing [Analysis Results](https://www.cancerimagingarchive.net/tcia-analysis-results/) of TCIA datasets are the easiest ways to become familiar with what is available. These pages will help you quickly identify datasets of interest, find valuable supporting data that are not available via our APIs (e.g. clinical spreadsheets and non-DICOM segmentation data), and answer the most common questions you might have about the datasets.

# 2. REST API Overview 
TCIA uses software called NBIA to manage DICOM data. The NBIA REST APIs are provided for the search and download functions used in the TCIA radiology portal and allow access to both public and limited access collections.
1. The [NBIA Search REST APIs](https://wiki.cancerimagingarchive.net/x/fILTB) allow you to perform basic queries and download data from **public** collections. These APIs do not require a TCIA account.
2. The [NBIA Search with Authentication REST APIs](https://wiki.cancerimagingarchive.net/x/X4ATBg) allow you to perform basic queries and download data from **public and limited-access** collections. These APIs require a TCIA account to create authentication tokens.
3. The [NBIA Advanced REST APIs](https://wiki.cancerimagingarchive.net/x/YoATBg) also allow access to **public and limited-access** collections, but provide query endpoints mostly geared towards developers seeking to integrate searching and downloading TCIA data into web and desktop applications. This API requires a TCIA account to create authentication tokens.

# 3. Query Functions

Detailed usage of some of these functions can be found at https://github.com/kirbyju/TCIA_Notebooks/blob/main/TCIA_REST_API_Queries.ipynb.

## queryData

**Params: (endpoint, api_url)**

* Provides error handling for requests.get()
* Formats output as JSON by default with options for "df" (dataframe) and "csv"
* Because it is called by query functions that use requests.get(), ***<font color='red'>please do NOT use this function</font>***.
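
The output formats listed above recur in every query function below. As a rough illustration of what the "csv" option amounts to, the following sketch turns a list of JSON-style records into CSV text using only the standard library. The helper name `records_to_csv` and the sample records are hypothetical, and this is not how tcia_utils itself performs the conversion.

```python
import csv
import io

def records_to_csv(records):
    """Serialize a list of dicts (JSON-style records) to CSV text."""
    if not records:
        return ""
    # Collect every key that appears, preserving first-seen order.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# Hypothetical records standing in for a query result.
sample = [{"Collection": "CPTAC-SAR"}, {"Collection": "CPTAC-LUAD"}]
print(records_to_csv(sample))
```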

## getCollections

**Params: (api_url = "", format = "")**

* *Optional: api_url, format*
* Gets a list of collections from a specified api_url

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getCollections(format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getCollections(format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getCollections(format = "csv")

## getBodyPart

**Params: (collection = "", modality = "", api_url = "", format = "")**

* *Optional: api_url, format*
* Gets Body Part Examined metadata from a specified api_url
* Allows filtering by collection and modality

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getBodyPart(collection = "CPTAC-SAR", modality = "CT", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getBodyPart(collection = "CPTAC-SAR", modality = "CT", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getBodyPart(collection = "CPTAC-SAR", modality = "CT", format = "csv")

## getModality

**Params: (collection = "", bodyPart = "", api_url = "", format = "")**

* *Optional: api_url, format*
* Gets Modalities metadata from a specified api_url
* Allows filtering by collection and bodyPart

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getModality(collection = "CPTAC-SAR", bodyPart = "EXTREMITY", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getModality(collection = "CPTAC-SAR", bodyPart = "EXTREMITY", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getModality(collection = "CPTAC-SAR", bodyPart = "EXTREMITY", format = "csv")

## getPatient

**Params: (collection = "", api_url = "", format = "")**

* *Optional: api_url, format*
* Gets Patient metadata from a specified api_url
* Allows filtering by collection

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getPatient(collection = "CPTAC-SAR", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getPatient(collection = "CPTAC-SAR", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getPatient(collection = "CPTAC-SAR", format = "csv")

## getPatientByCollectionAndModality

**Params: (collection, modality, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets Patient IDs from a specified api_url
* Returns a list of patient IDs

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getPatientByCollectionAndModality(collection = "CPTAC-SAR", modality = "CT", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getPatientByCollectionAndModality(collection = "CPTAC-SAR", modality = "CT", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getPatientByCollectionAndModality(collection = "CPTAC-SAR", modality = "CT", format = "csv")

## getNewPatientsInCollection

**Params: (collection, date, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets "new" patient metadata from a specified api_url
* The date format is YYYY/MM/DD
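
Because the date argument is a plain string, it can be convenient to build it from a `datetime.date` object rather than typing it by hand. The sketch below does this with the standard library; the helper name `to_tcia_date` is hypothetical and not part of tcia_utils.

```python
from datetime import date

def to_tcia_date(d):
    """Format a date object as a YYYY/MM/DD string."""
    return d.strftime("%Y/%m/%d")

print(to_tcia_date(date(2000, 8, 20)))  # 2000/08/20
```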

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getNewPatientsInCollection(collection = "CPTAC-SAR", date = "2000/08/20", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getNewPatientsInCollection(collection = "CPTAC-SAR", date = "2000/08/20", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getNewPatientsInCollection(collection = "CPTAC-SAR", date = "2000/08/20", format = "csv")

## getStudy

**Params: (collection, patientId = "", studyUid = "", api_url = "", format = "")**
* *Optional: patientId, studyUid, api_url, format*
* Gets Study (visit/timepoint) metadata from a specified api_url

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getStudy(collection = "CPTAC-SAR", patientId = "", studyUid = "", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getStudy(collection = "CPTAC-SAR", patientId = "", studyUid = "", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getStudy(collection = "CPTAC-SAR", patientId = "", studyUid = "", format = "csv")

## getNewStudiesInPatient
**Params: (collection, patientId, date, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets "new" study metadata for a patient from a specified api_url
* The date format is YYYY/MM/DD

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getNewStudiesInPatient(collection = "CPTAC-SAR", patientId = "C3N-00843", date = "2010/09/06", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getNewStudiesInPatient(collection = "CPTAC-SAR", patientId = "C3N-00843", date = "2010/09/06", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getNewStudiesInPatient(collection = "CPTAC-SAR", patientId = "C3N-00843", date = "2010/09/06", format = "csv")

## getSeries

**Params: (collection = "", patientId = "", studyUid = "", seriesUid = "", modality = "", bodyPart = "",<br>
manufacturer = "", manufacturerModel = "", api_url = "", format = "")**

* *All parameters are optional.*
* Gets Series (scan) metadata from a specified api_url
* Allows filtering by collection, patient ID, study UID, series UID, modality, body part, manufacturer & model
* ***Note: Since the output of this function can be very long, it is advisable to save the output to a variable and only display a portion of it at a time when the output format is JSON.***

In [None]:
# If the format is not specified, it returns a JSON object.
data = nbia.getSeries(collection = "CPTAC-SAR", patientId = "", studyUid = "", seriesUid = "", 
                    modality = "", bodyPart = "", manufacturer = "", manufacturerModel = "", format = "")

sample = data[:3]
print(sample)
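
When the JSON output is long, a quick tally can be more informative than printing raw records. The sketch below counts records per modality with `collections.Counter`. The sample records are made up for illustration, and the `Modality` key is an assumption about the shape of each record, so check the keys in your own output before relying on it.

```python
from collections import Counter

def count_by_key(records, key):
    """Count how many records carry each value of `key` (missing -> "unknown")."""
    return Counter(record.get(key, "unknown") for record in records)

# Hypothetical records standing in for getSeries() JSON output.
sample_series = [
    {"SeriesInstanceUID": "1.2.3", "Modality": "CT"},
    {"SeriesInstanceUID": "1.2.4", "Modality": "MR"},
    {"SeriesInstanceUID": "1.2.5", "Modality": "CT"},
]
counts = count_by_key(sample_series, "Modality")
print(counts.most_common())
```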

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getSeries(collection = "CPTAC-SAR", patientId = "", studyUid = "", seriesUid = "", 
                    modality = "", bodyPart = "", manufacturer = "", manufacturerModel = "", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getSeries(collection = "CPTAC-SAR", patientId = "", studyUid = "", seriesUid = "", 
                    modality = "", bodyPart = "", manufacturer = "", manufacturerModel = "", format = "csv")

## getUpdatedSeries
**Params: (date, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets "new" series metadata from a specified api_url
* The date format is YYYY/MM/DD
* ***NOTE: Unlike other API endpoints, the underlying endpoint expects MM/DD/YYYY. tcia-utils converts your YYYY/MM/DD input for you, so date handling stays consistent across functions.***
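
The date conversion mentioned in the note can be sketched with the standard library as follows. This is only an illustration of the idea; the helper name `to_month_first` is hypothetical and the package's own implementation may differ.

```python
from datetime import datetime

def to_month_first(date_string):
    """Convert 'YYYY/MM/DD' to 'MM/DD/YYYY'; raises ValueError on bad input."""
    parsed = datetime.strptime(date_string, "%Y/%m/%d")
    return parsed.strftime("%m/%d/%Y")

print(to_month_first("2010/09/06"))  # 09/06/2010
```

Parsing with `strptime` rather than rearranging substrings means a malformed date fails loudly instead of being sent to the server.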

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getUpdatedSeries(date = "2010/09/06", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getUpdatedSeries(date = "2010/09/06", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getUpdatedSeries(date = "2010/09/06", format = "csv")

## getSeriesMetadata
**Params: (seriesUid, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets Series (scan) metadata from a specified api_url
* Output includes DOI and license details that are not in the getSeries() function

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getSeriesMetadata(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getSeriesMetadata(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getSeriesMetadata(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "csv")

## getSeriesSize
**Params: (seriesUid, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets the file count and disk size of a series/scan

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getSeriesSize(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getSeriesSize(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getSeriesSize(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "csv")

## getSopInstanceUids
**Params: (seriesUid, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets SOP Instance UIDs from a specific series/scan

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getSopInstanceUids(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getSopInstanceUids(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getSopInstanceUids(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "csv")

## getManufacturer
**Params: (collection = "", modality = "", bodyPart = "", api_url = "", format = "")**

* *All parameters are optional.*
* Gets manufacturer metadata from a specified api_url
* Allows filtering by collection, body part & modality

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getManufacturer(collection = "", modality = "", bodyPart = "", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getManufacturer(collection = "", modality = "", bodyPart = "", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getManufacturer(collection = "", modality = "", bodyPart = "", format = "csv")

## getSharedCart

**Params: (name, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets "Shared Cart" (scan) metadata from a specified api_url
* First use https://nbia.cancerimagingarchive.net/nbia-search/ to add data to your cart, then click "Share" > "Share my cart".
* The "name" parameter is the last part of the generated URL. E.g., https://nbia.cancerimagingarchive.net/nbia-search/?saved-cart=nbia-49121659384603347 has a cart "name" of "nbia-49121659384603347".
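
If you have the shared URL on hand, the cart name can be pulled out programmatically instead of copied by eye. The sketch below reads the `saved-cart` query parameter seen in the example URL using `urllib.parse`; the helper name `cart_name_from_url` is hypothetical.

```python
from urllib.parse import urlparse, parse_qs

def cart_name_from_url(url):
    """Return the value of the saved-cart query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("saved-cart")
    return values[0] if values else None

url = "https://nbia.cancerimagingarchive.net/nbia-search/?saved-cart=nbia-49121659384603347"
print(cart_name_from_url(url))  # nbia-49121659384603347
```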

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getSharedCart(name = "nbia-49121659384603347", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getSharedCart(name = "nbia-49121659384603347", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getSharedCart(name = "nbia-49121659384603347", format = "csv")

## getCollectionDescriptions

**Params: (api_url = "", format = "")**

* *All parameters are optional.*
* Gets HTML-formatted descriptions of collections and their DOIs

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getCollectionDescriptions(format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getCollectionDescriptions(format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getCollectionDescriptions(format = "csv")

## getCollectionPatientCounts
**Params: (api_url = "", format = "")**

* *All parameters are optional.*
* Gets patient counts by collection from the Advanced API

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getCollectionPatientCounts(format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getCollectionPatientCounts(format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getCollectionPatientCounts(format = "csv")

## getModalityCounts

**Params: (collection = "", bodyPart = "", api_url = "", format = "")**

* *All parameters are optional.*
* Gets counts of Modality metadata from Advanced API
* Allows filtering by collection and bodyPart

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getModalityCounts(collection = "CPTAC-SAR", bodyPart = "", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getModalityCounts(collection = "CPTAC-SAR", bodyPart = "EXTREMITY", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getModalityCounts(collection = "CPTAC-SAR", bodyPart = "EXTREMITY", format = "csv")

## getBodyPartCounts

**Params: (collection = "", modality = "", api_url = "", format = "")**

* *All parameters are optional.*
* Gets counts of Body Part metadata from Advanced API
* Allows filtering by collection and modality

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getBodyPartCounts(collection = "CPTAC-SAR", modality = "CT", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getBodyPartCounts(collection = "CPTAC-SAR", modality = "CT", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getBodyPartCounts(collection = "CPTAC-SAR", modality = "CT", format = "csv")

## getManufacturerCounts

**Params: (collection = "", modality = "", bodyPart = "", api_url = "", format = "")**

* *All parameters are optional.*
* Gets counts of Manufacturer metadata from Advanced API
* Allows filtering by collection, body part and modality

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getManufacturerCounts(collection = "CPTAC-SAR",  modality = "CT", bodyPart = "", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getManufacturerCounts(collection = "CPTAC-SAR",  modality = "CT", bodyPart = "", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getManufacturerCounts(collection = "CPTAC-SAR",  modality = "CT", bodyPart = "", format = "csv")

## getSeriesList

**Params: (list, api_url = "", csv_filename = "")**

* *Optional: api_url, csv_filename*
* Gets series metadata from the Advanced API
* Allows submission of a list of UIDs
* Returns result as dataframe and CSV

In [None]:
series_list = ["1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", "1.3.6.1.4.1.14519.5.2.1.6834.5010.215193814203822462481389051414"]
nbia.getSeriesList(list = series_list)

## getDicomTags

**Params: (seriesUid, api_url = "", format = "")**

* *Optional: api_url, format*
* Gets DICOM tag metadata for a given series UID (scan)

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getDicomTags(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getDicomTags(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getDicomTags(seriesUid = "1.3.6.1.4.1.14519.5.2.1.3320.3273.106936860187940539374736870621", format = "csv")

## getDoiMetadata

**Params: (doi, output = "", api_url = "", format = "")**

* *Optional: output, api_url, format*
* Gets a list of Collections if output = "", or Series if output = "series", associated with a DOI.
* The result includes whether the data are 3rd party analyses or not.

In [None]:
# If the format is not specified, it returns a JSON object.
nbia.getDoiMetadata(doi = "https://doi.org/10.7937/K9/TCIA.2018.PAT12TBS", output = "", format = "")

In [None]:
# With output = "series", the JSON result lists series instead of collections.
nbia.getDoiMetadata(doi = "https://doi.org/10.7937/K9/TCIA.2018.PAT12TBS", output = "series", format = "")

In [None]:
# If the format is set to "df", it returns a pandas dataframe object.
nbia.getDoiMetadata(doi = "https://doi.org/10.7937/K9/TCIA.2018.PAT12TBS", output = "", format = "df")

In [None]:
# With output = "series" and format = "df", it returns a dataframe of series.
nbia.getDoiMetadata(doi = "https://doi.org/10.7937/K9/TCIA.2018.PAT12TBS", output = "series", format = "df")

In [None]:
# If the format is set to "csv", it saves a csv file to the workspace.
nbia.getDoiMetadata(doi = "https://doi.org/10.7937/K9/TCIA.2018.PAT12TBS", output = "", format = "csv")

In [None]:
# With output = "series" and format = "csv", it saves the series list as a csv file.
nbia.getDoiMetadata(doi = "https://doi.org/10.7937/K9/TCIA.2018.PAT12TBS", output = "series", format = "csv")

## getSimpleSearchWithModalityAndBodyPartPaged

**Params: (collections = [], species = [], modalities = [], bodyParts = [], manufacturers  = [], <br>
fromDate = "", toDate = "", patients = [], minStudies: int = 0, modalityAnded = False, <br>
start = 0, size = 10, sortDirection = 'ascending', sortField = 'subject', api_url = "", format = "")**

* *All parameters are optional.*
* Takes the same parameters as the SimpleSearch GUI
* Use more parameters to narrow the number of subjects received.
* **Note: This function only supports JSON output, so please leave the format parameter as is.**

In [None]:
nbia.getSimpleSearchWithModalityAndBodyPartPaged(collections = ["CPTAC-LUAD"], modalities = ["CT"], format = "")

# 4. Download Functions

Detailed usage of some of these functions can be found at https://github.com/kirbyju/TCIA_Notebooks/blob/main/TCIA_REST_API_Downloads.ipynb.

## downloadSeries

**Params: (series_data, number = 0, path = "", hash = "", api_url = "",<br>
input_type = "", format = "", csv_filename = "")**

* Ingests a set of seriesUids and downloads them
* By default, series_data expects JSON containing "SeriesInstanceUID" elements.
* Set number = n to download the first n series if you don't want the full dataset.
* Set hash = "y" if you'd like to retrieve MD5 hash values for each image.
* Saves to tciaDownload folder in current directory if no path is specified
* Set input_type = "list" to pass a list of Series UIDs instead of JSON.
* Set input_type = "manifest" to pass the path of a *.TCIA manifest file as series_data.
* Format can be set to "df" or "csv" to return series metadata.
* Setting a csv_filename will create the csv even if format isn't specified.
* The metadata includes info about series that have previously been downloaded.
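
As a small illustration of the input options above, the sketch below extracts the "SeriesInstanceUID" values from JSON-style records so they could be passed as a list. The helper name `series_uids` and the sample records are hypothetical; the download call itself is left commented out because it needs network access.

```python
def series_uids(series_json, number=0):
    """Extract SeriesInstanceUID values; number > 0 keeps only the first `number`."""
    uids = [record["SeriesInstanceUID"] for record in series_json]
    return uids[:number] if number > 0 else uids

# Hypothetical records; real ones would come from a query such as getSeries().
sample_series = [
    {"SeriesInstanceUID": "1.2.3"},
    {"SeriesInstanceUID": "1.2.4"},
    {"SeriesInstanceUID": "1.2.5"},
]
first_two = series_uids(sample_series, number=2)
print(first_two)

# The list could then be handed to downloadSeries (requires network access):
# nbia.downloadSeries(first_two, input_type = "list")
```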

## downloadImage
**Params: (seriesUID, sopUID, path = "", api_url = "")**

* Ingests a seriesUid and a SopInstanceUid and downloads the image

# 5. Image Visualization Functions

Detailed usage of these functions can be found at https://github.com/kirbyju/TCIA_Notebooks/blob/main/TCIA_REST_API_Queries.ipynb.

## viewSeries

**Params: (seriesUid = "", path = "")**

* Visualizes a Series (scan) you've downloaded in the notebook
* Requires EITHER a seriesUid or path parameter
* Leave seriesUid empty if you want to provide a custom path.
* The function assumes "tciaDownload/\<seriesUid\>/" as path if seriesUid is provided since this is where downloadSeries() saves data.
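
The path rules above can be sketched as a small helper. This only mirrors the behavior described in the bullets and is not the package's code; the name `resolve_series_path` is hypothetical, and giving seriesUid precedence when both arguments are supplied is an assumption.

```python
import os

def resolve_series_path(seriesUid="", path=""):
    """Pick the folder to read, following the rules described above."""
    if seriesUid:
        # downloadSeries() saves to tciaDownload/<seriesUid>/ by default.
        return os.path.join("tciaDownload", seriesUid)
    if path:
        return path
    raise ValueError("Provide either seriesUid or path.")

print(resolve_series_path(seriesUid="1.2.3"))
print(resolve_series_path(path="my_scans/series_a"))
```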

## makeVizLinks

**Params: (series_data, csv_filename="")**

* Ingests JSON output of getSeries() or getSharedCart()
* Creates URLs to visualize them in a browser
* The links appear in the last 2 columns of the dataframe.
* TCIA links display the individual series described in each row.
* IDC links display the entire study (all scans from that time point).
* IDC links may not work if IDC hasn't mirrored the series from TCIA yet.
* This function only works with fully public datasets (no limited-access data).
* Optionally accepts a csv_filename parameter if you'd like to export a CSV file.

# 6. Other Functions

## setApiUrl

**Params: (endpoint, api_url)**

* Checks for valid security tokens where needed
* Because it is called by other functions to select base URL, ***<font color='red'>please do NOT use this function</font>***.
* ***Note:*** Nearly all functions allow you to specify **api_url** as a parameter. This lets you choose whether to access restricted collections or the [National Lung Screening Trial (NLST)](https://doi.org/10.7937/TCIA.HMQ8-J677) collection, which lives on a separate server due to its size (>26,000 patients!). We'll provide examples to show how this works later in the notebook.

## manifestToList

**Params: (manifest)**

* Ingests a TCIA manifest file and removes header
* Returns a list of series UIDs
* Because it is primarily a helper function used by downloadSeries() and makeSeriesReport(), ***<font color='red'>please do NOT use this function</font>***.
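
To make the idea of "removing the header" concrete, here is a minimal sketch that works on manifest text already read into lines. It assumes the header ends with a `ListOfSeriesToDownload=` line followed by one series UID per line; that layout, the helper name `manifest_lines_to_uids`, and the sample text are assumptions for illustration, not a description of how tcia_utils parses the file.

```python
def manifest_lines_to_uids(lines, marker="ListOfSeriesToDownload="):
    """Return the non-empty lines that follow the header marker line."""
    stripped = [line.strip() for line in lines]
    if marker not in stripped:
        raise ValueError("Header marker not found in manifest.")
    start = stripped.index(marker) + 1
    return [line for line in stripped[start:] if line]

# Made-up manifest text for illustration only.
sample_manifest = """manifestVersion=3.0
ListOfSeriesToDownload=
1.2.3
1.2.4
"""
print(manifest_lines_to_uids(sample_manifest.splitlines()))
```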

## getToken
**Params: (user = "", pw = "", api_url = "")**

* Retrieves security token to access APIs that require authorization
* Provides interactive prompts for user/pw if they're not specified as parameters
* Use getToken() for querying restricted collections with the "Search API"
* Use getToken(api_url = "nlst") for "Advanced API" queries of the National Lung Screening Trial
* Sets expiration time for tokens (2 hours from creation)
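
In a long-running notebook it can help to keep track of when a token was requested. The sketch below checks whether the two-hour lifetime mentioned above has elapsed; the names `TOKEN_LIFETIME` and `token_expired` are hypothetical and the package may track expiry differently.

```python
from datetime import datetime, timedelta

TOKEN_LIFETIME = timedelta(hours=2)  # lifetime stated in the list above

def token_expired(created_at, now, lifetime=TOKEN_LIFETIME):
    """Return True once `lifetime` has elapsed since the token was created."""
    return now >= created_at + lifetime

created = datetime(2024, 1, 1, 9, 0)
print(token_expired(created, datetime(2024, 1, 1, 10, 59)))  # False
print(token_expired(created, datetime(2024, 1, 1, 11, 0)))   # True
```

When such a check returns True, calling getToken() again would be the natural next step.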

In [None]:
nbia.getToken()

## makeCredentialFile
**Params: (user = "", pw = "")**

* Creates a credential file to use with NBIA Data Retriever
* Provides interactive prompts for user/pw if they're not specified as parameters
* ***Note: A credential file is a text file that passes the user's credentials in the following format:***
    * userName = YourUserName
    * passWord = YourPassword
    * *Both parameters are case-sensitive.*
* Users are encouraged to take a look at the file being generated.
* Documentation at https://wiki.cancerimagingarchive.net/x/2QKPBQ and notebook at https://github.com/kirbyju/TCIA_Notebooks/blob/main/TCIA_Linux_Data_Retriever_App.ipynb.
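
One way to take a look at a credential file is to parse it and confirm that both expected keys are present. The sketch below does this on an in-memory string; the helper names are hypothetical, the sample text simply copies the format shown above, and treating the key names as case-sensitive is this sketch's reading of the note.

```python
def parse_credentials(text):
    """Parse 'key = value' lines into a dict, ignoring blank lines."""
    creds = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        creds[key.strip()] = value.strip()
    return creds

def has_required_keys(creds):
    """Keys are compared exactly, so 'username' does not satisfy 'userName'."""
    return {"userName", "passWord"} <= set(creds)

sample = "userName = YourUserName\npassWord = YourPassword\n"
print(has_required_keys(parse_credentials(sample)))          # True
print(has_required_keys(parse_credentials("username = x")))  # False
```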

In [None]:
nbia.makeCredentialFile()

## makeSeriesReport

**Params: (series_data, input_type = "", format = "", filename = None, api_url = "")**

* Ingests JSON output from any function that returns series-level data and creates summary report
* Specify input_type = "manifest" to ingest a *.TCIA manifest file or "list" for a python list of UIDs.
* If input_type = "manifest" or "list" and there are series UIDs that are restricted, you must call getToken() with a user ID that has access to all UIDs before calling this function.
* Specifying api_url is only necessary if you are using input_type = "manifest" or "list" with NLST data (e.g. api_url = "nlst").
* Specify format = "var" to return the report values as a dictionary.
* For example, after saving the function output to report_data, you can access a value with subjects = report_data["subjects"].
* Specify format = "file" to save the report to a file.
* Specify a filename parameter to set a filename if you don't want the default filename.

In [None]:
data = nbia.getSeries(collection = "CPTAC-SAR", patientId = "", studyUid = "", seriesUid = "", 
                    modality = "", bodyPart = "", manufacturer = "", manufacturerModel = "", format = "df")
nbia.makeSeriesReport(data)

In [None]:
data = nbia.getSharedCart(name = "nbia-49121659384603347")
nbia.makeSeriesReport(data)

In [None]:
manifest = requests.get("https://wiki.cancerimagingarchive.net/download/attachments/22512757/doiJNLP-Fo0H1NtD.tcia?version=1&modificationDate=1534787017928&api=v2")
with open('RIDER_Breast_MRI.tcia', 'wb') as f:
    f.write(manifest.content)
nbia.makeSeriesReport("RIDER_Breast_MRI.tcia", input_type = "manifest")

We can also use other parameters to make the report in the format we want.

In [None]:
data = nbia.getSharedCart(name = "nbia-49121659384603347")
nbia.makeSeriesReport(data, format = "file", filename = "MyCart.txt")

In [None]:
data = nbia.getSharedCart(name = "nbia-49121659384603347")
nbia.makeSeriesReport(data, format = "var")

# Acknowledgements
TCIA is funded by the [Cancer Imaging Program (CIP)](https://imaging.cancer.gov/), a part of the United States [National Cancer Institute (NCI)](https://www.cancer.gov/). It is managed by the [Frederick National Laboratory for Cancer Research (FNLCR)](https://frederick.cancer.gov/) and hosted by the [University of Arkansas for Medical Sciences (UAMS)](https://www.uams.edu/).

This notebook was created by [Justin Kirby](https://www.linkedin.com/in/justinkirby82/) and [Adam Li](https://www.linkedin.com/in/adam-l-713885121). If you leverage this notebook or any TCIA datasets in your work, please be sure to comply with the [TCIA Data Usage Policy](https://wiki.cancerimagingarchive.net/x/c4hF). In particular, make sure to cite the DOI(s) for the specific TCIA datasets you used in addition to the following paper!

# TCIA Citation

Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, 26(6), 1045–1057. https://doi.org/10.1007/s10278-013-9622-7