## 📦 Download Collection From TCIA

**The Cancer Imaging Archive (TCIA)** is a large public archive of de-identified medical images, organized into **collections**.
A **collection** is a group of related imaging datasets, usually from a specific study, disease type, or institution (e.g., `UPENN-GBM` for glioblastoma patients from the University of Pennsylvania).

Before you can interact with TCIA in Python, you’ll need to install the [`tcia_utils`](https://pypi.org/project/tcia-utils/) package:

```bash
pip install tcia-utils
```

The rest of this tutorial assumes the package is installed.

The next block of code will:

1. **Import the necessary libraries**, including `nbia` from `tcia_utils`, which provides access to TCIA data.
2. **Fetch a list of available collections** using `nbia.getCollections()`. This helps you explore the different datasets hosted on TCIA.
3. **Retrieve the series metadata** for a specific collection (in this case, `UPENN-GBM`) using `nbia.getSeries(...)`. A **series** refers to a set of related DICOM images (e.g., an MRI scan sequence).
4. **Download the image series data** using `nbia.downloadSeries(...)`, which saves the selected series to your local machine for analysis or visualization. For example, you can choose to download only the first 2 series to keep it lightweight.

In [1]:
# import the required modules
from tcia_utils import nbia

In [2]:
# fetch a list of available collections
nbia.getCollections()

2025-05-12 18:01:06,390:INFO:Success - Token saved to global api_call_headers variable and expires at 2025-05-12 20:01:06.390890
2025-05-12 18:01:06,393:INFO:Accessing public data anonymously. To access restricted data use nbia.getToken() with your credentials.
2025-05-12 18:01:06,396:INFO:Calling getCollectionValues with parameters {}


[{'Collection': '4D-Lung'},
 {'Collection': 'ACRIN-6698'},
 {'Collection': 'ACRIN-Contralateral-Breast-MR'},
 {'Collection': 'ACRIN-FLT-Breast'},
 {'Collection': 'ACRIN-NSCLC-FDG-PET'},
 {'Collection': 'Adrenal-ACC-Ki67-Seg'},
 {'Collection': 'Advanced-MRI-Breast-Lesions'},
 {'Collection': 'Anti-PD-1_Lung'},
 {'Collection': 'B-mode-and-CEUS-Liver'},
 {'Collection': 'BREAST-DIAGNOSIS'},
 {'Collection': 'Breast-Cancer-Screening-DBT'},
 {'Collection': 'Breast-MRI-NACT-Pilot'},
 {'Collection': 'C4KC-KiTS'},
 {'Collection': 'CBIS-DDSM'},
 {'Collection': 'CC-Radiomics-Phantom'},
 {'Collection': 'CC-Radiomics-Phantom-2'},
 {'Collection': 'CC-Radiomics-Phantom-3'},
 {'Collection': 'CC-Tumor-Heterogeneity'},
 {'Collection': 'CMB-AML'},
 {'Collection': 'CMB-BRCA'},
 {'Collection': 'CMB-CRC'},
 {'Collection': 'CMB-GEC'},
 {'Collection': 'CMB-LCA'},
 {'Collection': 'CMB-MEL'},
 {'Collection': 'CMB-MML'},
 {'Collection': 'CMB-OV'},
 {'Collection': 'CMB-PCA'},
 {'Collection': 'CMMD'},
 {'Collection'
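The list returned by `nbia.getCollections()` is long, so a small search helper can save some scrolling. The sketch below assumes the return value is a list of `{'Collection': name}` dictionaries, as in the output above. The names `collection_names`, `find_collections`, and `sample` are hypothetical helpers introduced here for illustration; they are not part of `tcia_utils`.

```python
def collection_names(collections):
    """Return collection names, sorted case-insensitively."""
    names = [item["Collection"] for item in collections if "Collection" in item]
    return sorted(names, key=str.lower)


def find_collections(collections, keyword):
    """Return the names that contain `keyword`, ignoring case."""
    keyword = keyword.lower()
    return [name for name in collection_names(collections) if keyword in name.lower()]


# A few entries copied from the output above; in the notebook you would
# pass the result of nbia.getCollections() instead.
sample = [
    {"Collection": "BREAST-DIAGNOSIS"},
    {"Collection": "4D-Lung"},
    {"Collection": "CMB-BRCA"},
    {"Collection": "ACRIN-Contralateral-Breast-MR"},
]

print(find_collections(sample, "breast"))
```

In the notebook, `find_collections(nbia.getCollections(), "gbm")` would be one way to look for glioblastoma collections, assuming the return format described above.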

In [3]:
# retrieve the series metadata for the UPENN-GBM collection
data = nbia.getSeries(collection='UPENN-GBM')

2025-05-12 18:01:10,759:INFO:Calling getSeries with parameters {'Collection': 'UPENN-GBM'}


In [4]:
# download the image series data
nbia.downloadSeries(data, number=2)

2025-05-12 18:01:19,641:INFO:Downloading 2 out of 3680 Series Instance UIDs (scans).
2025-05-12 18:01:19,645:INFO:Directory 'tciaDownload' created successfully.
2025-05-12 18:01:19,650:INFO:Downloading... https://services.cancerimagingarchive.net/nbia-api/services/v2/getImage?NewFileNames=Yes&SeriesInstanceUID=1.3.6.1.4.1.14519.5.2.1.269607322454961545178354587485557855557
2025-05-12 18:01:26,304:INFO:Downloading... https://services.cancerimagingarchive.net/nbia-api/services/v2/getImage?NewFileNames=Yes&SeriesInstanceUID=1.3.6.1.4.1.14519.5.2.1.326941089799128875716457447813731935085
2025-05-12 18:01:30,850:INFO:Downloaded 2 out of 2 Series Instance UIDs (scans).
0 failed to download.
0 previously downloaded.
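To confirm what actually landed on disk, you can list the series folders and count the files in each. This is a minimal sketch that assumes each series is saved in its own subfolder of `tciaDownload`, named after its Series Instance UID (the layout the log above suggests). `summarize_downloads` is a hypothetical helper, not a `tcia_utils` function.

```python
from pathlib import Path


def summarize_downloads(root="tciaDownload"):
    """Map each series folder name under `root` to the number of files it holds."""
    root_path = Path(root)
    if not root_path.is_dir():
        return {}  # nothing has been downloaded yet
    summary = {}
    for series_dir in sorted(root_path.iterdir()):
        if series_dir.is_dir():
            summary[series_dir.name] = sum(1 for f in series_dir.iterdir() if f.is_file())
    return summary


# Print one line per downloaded series folder (prints nothing if the folder is absent).
for series_uid, n_files in summarize_downloads().items():
    print(f"{series_uid}: {n_files} files")
```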


## View the Downloaded DICOM Series

After downloading the image series, you can **visualize the DICOM files** using the [`simpleDicomViewer`](https://pypi.org/project/simpleDicomViewer/) package. This lightweight viewer helps you quickly inspect the medical imaging data without needing a full-fledged DICOM viewer.

### 📦 Prerequisite

Make sure to install the package before running the viewer code:

```bash
pip install simpleDicomViewer
```

### The next code block:
* **Imports** the DICOM viewer from `simpleDicomViewer`.
* **Defines** the Series Instance UID (`seriesUid`) of the scan you want to display.
* **Calls** `viewSeries()` with the path to the downloaded series folder (usually `tciaDownload/<seriesUid>`).

In a notebook, this displays an interactive viewer with a slider for scrolling through the image slices, a helpful step for understanding the structure of the data before doing any processing or analysis.

In [5]:
from simpleDicomViewer import dicomViewer

seriesUid = "1.3.6.1.4.1.14519.5.2.1.269607322454961545178354587485557855557"
dicomViewer.viewSeries(f"tciaDownload/{seriesUid}")

interactive(children=(IntSlider(value=59, description='x', max=119), Output()), _dom_classes=('widget-interact…
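Before opening a folder in the viewer, it can help to check that it really contains DICOM files. DICOM Part 10 files begin with a 128-byte preamble followed by the four bytes `DICM`, so a quick standard-library check is possible without a DICOM parser. The helpers `is_dicom_file` and `count_dicom_files` below are hypothetical names introduced for this sketch.

```python
from pathlib import Path


def is_dicom_file(path):
    """Return True if the file has the 'DICM' marker after the 128-byte preamble."""
    with open(path, "rb") as fh:
        fh.seek(128)
        return fh.read(4) == b"DICM"


def count_dicom_files(series_dir):
    """Count the files in `series_dir` that carry the DICOM marker."""
    series_path = Path(series_dir)
    if not series_path.is_dir():
        return 0  # folder missing, e.g. the series was not downloaded
    return sum(1 for f in series_path.iterdir() if f.is_file() and is_dicom_file(f))


seriesUid = "1.3.6.1.4.1.14519.5.2.1.269607322454961545178354587485557855557"
print(count_dicom_files(f"tciaDownload/{seriesUid}"), "DICOM files found")
```

This check only looks at the file header; it does not validate the rest of the DICOM content.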

## Querying and Downloading Specific DICOM Series from TCIA

The `getSeries()` function in the `tcia_utils.nbia` module queries TCIA (The Cancer Imaging Archive) for imaging series and accepts optional filter parameters such as collection, modality, and manufacturer. This lets you narrow a dataset by clinical or technical characteristics.

### The following block of code will:

* Query the TCIA database for DICOM image series that:

  * Are part of the **UPENN-GBM** collection (glioblastoma data),
  * Have the imaging **modality** set to **MR** (Magnetic Resonance),
  * Were acquired using **SIEMENS** imaging equipment.
* Print the total number of image series that match the specified criteria.

In [6]:
# getSeries with query parameters
data = nbia.getSeries(collection = 'UPENN-GBM',
                      modality = "MR",
                      manufacturer = "SIEMENS")

print(len(data), 'Series returned')

2025-05-12 18:01:42,284:INFO:Calling getSeries with parameters {'Collection': 'UPENN-GBM', 'Modality': 'MR', 'Manufacturer': 'SIEMENS'}


3625 Series returned
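With several thousand series returned, a quick tally by attribute gives a feel for what the query contains. The sketch below assumes that `getSeries()` without a `format` argument returns a list of dictionaries and that the keys match the column names seen in the DataFrame output later in this tutorial (for example `BodyPartExamined`). The `count_by` helper and the `sample` records are hypothetical and use placeholder UIDs.

```python
from collections import Counter


def count_by(series, key):
    """Tally series records by the value stored under `key`."""
    return Counter(record.get(key) or "(missing)" for record in series)


# Placeholder records for illustration; in the notebook, pass `data` instead.
sample = [
    {"SeriesInstanceUID": "1.2.3.1", "BodyPartExamined": "BRAIN"},
    {"SeriesInstanceUID": "1.2.3.2", "BodyPartExamined": "BRAIN"},
    {"SeriesInstanceUID": "1.2.3.3", "BodyPartExamined": "HEADNECK"},
    {"SeriesInstanceUID": "1.2.3.4", "BodyPartExamined": ""},
]

for value, n in count_by(sample, "BodyPartExamined").most_common():
    print(f"{value}: {n}")
```

Empty or absent values are grouped under `(missing)` so they stay visible in the tally.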


## Downloading and Previewing Series Data

After retrieving metadata about the available series, the `downloadSeries()` function can be used to download a specified number of series for local inspection or processing.

### The subsequent block of code will:

* Download **two image series** from the previously filtered dataset.
* Return the download information as a **pandas DataFrame** using the `format="df"` argument.
* Display the DataFrame to provide a structured overview of the downloaded series, including attributes such as Series UID, Study Date, and Manufacturer.

In [7]:
df = nbia.downloadSeries(data, number = 2, format = "df")
display(df)

2025-05-12 18:01:48,944:INFO:Downloading 2 out of 3625 Series Instance UIDs (scans).
2025-05-12 18:01:48,946:INFO:Directory 'tciaDownload' already exists.
2025-05-12 18:01:48,951:INFO:Downloading... https://services.cancerimagingarchive.net/nbia-api/services/v2/getImage?NewFileNames=Yes&SeriesInstanceUID=1.3.6.1.4.1.14519.5.2.1.228154008480265056687138530279461887340
2025-05-12 18:01:56,824:INFO:Downloading... https://services.cancerimagingarchive.net/nbia-api/services/v2/getImage?NewFileNames=Yes&SeriesInstanceUID=1.3.6.1.4.1.14519.5.2.1.23166033272237112806900422604932457206
2025-05-12 18:02:13,357:INFO:Downloaded 2 out of 2 Series Instance UIDs (scans).
0 failed to download.
0 previously downloaded.


Unnamed: 0,Series UID,Collection,3rd Party Analysis,Data Description URI,Subject ID,Study UID,Study Description,Study Date,Series Description,Manufacturer,...,License URL,Annotation Size,Date Released,Series Date,Protocol Name,Body Part Examined,Annotations Flag,Manufacturer Model Name,Software Versions,TimeStamp
0,1.3.6.1.4.1.14519.5.2.1.2281540084802650566871...,UPENN-GBM,NO,https://doi.org/10.7937/TCIA.709X-DN49,UPENN-GBM-00273,1.3.6.1.4.1.14519.5.2.1.1677264445954634641384...,BrainTumor,03-19-2003,AX T2 ProcessedCaPTk,SIEMENS,...,https://creativecommons.org/licenses/by/4.0/,0,Tue Jun 21 00:00:00 UTC 2022,Wed Mar 19 00:00:00 UTC 2003,AX T2,HEAD,False,Avanto,syngo MR B15,Thu May 06 19:56:53 UTC 2021
1,1.3.6.1.4.1.14519.5.2.1.2316603327223711280690...,UPENN-GBM,NO,https://doi.org/10.7937/TCIA.709X-DN49,UPENN-GBM-00273,1.3.6.1.4.1.14519.5.2.1.2813012108293682536587...,BrainTumor,03-19-2003,AX T1 MPRAGE ISOTROPIC ProcessedCaPTk,SIEMENS,...,https://creativecommons.org/licenses/by/4.0/,0,Tue Jun 21 00:00:00 UTC 2022,Wed Mar 19 00:00:00 UTC 2003,AX T1 MPRAGE ISOTROPIC,HEAD,False,Avanto,syngo MR B15,Thu May 06 20:30:55 UTC 2021


In [8]:
# getSeries with query parameters, returned as a DataFrame
df = nbia.getSeries(collection = 'UPENN-GBM',
                    modality = "MR",
                    manufacturer = "SIEMENS",
                    format = "df")

2025-05-12 18:02:13,461:INFO:Calling getSeries with parameters {'Collection': 'UPENN-GBM', 'Modality': 'MR', 'Manufacturer': 'SIEMENS'}


In [9]:
# Filter the series metadata to rows with "flair" in ProtocolName and "t2" in SeriesDescription
# (case=False ignores case; na=False treats missing values as non-matches)
filtered_df = df[df['ProtocolName'].str.contains('flair', case=False, na=False) &
                 df['SeriesDescription'].str.contains('t2', case=False, na=False)]

display(filtered_df)

Unnamed: 0,SeriesInstanceUID,StudyInstanceUID,Modality,ProtocolName,SeriesDate,SeriesDescription,BodyPartExamined,SeriesNumber,Collection,PatientID,...,ImageCount,TimeStamp,LicenseName,LicenseURI,CollectionURI,FileSize,DateReleased,StudyDesc,StudyDate,ThirdPartyAnalysis
9,1.3.6.1.4.1.14519.5.2.1.2750266617443020225997...,1.3.6.1.4.1.14519.5.2.1.1781261480501855891335...,MR,t2_Flair_axial,2002-08-23 00:00:00.0,t2_Flair_axial: Processed_CaPTk,BRAIN,12,UPENN-GBM,UPENN-GBM-00004,...,60,2020-12-03 12:57:11.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6077172,2022-06-21 00:00:00.0,MRI BRAIN W/INJ/MHDI,2002-08-23 00:00:00.0,NO
15,1.3.6.1.4.1.14519.5.2.1.2409809537973331376182...,1.3.6.1.4.1.14519.5.2.1.2904152627740700204304...,MR,t2_Flair_axial,2002-09-01 00:00:00.0,t2_Flair_axial: Processed_CaPTk,BRAIN,12,UPENN-GBM,UPENN-GBM-00009,...,60,2020-12-03 12:56:43.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6078302,2022-06-21 00:00:00.0,MRI BRAIN W/INJ/MHDI,2002-09-01 00:00:00.0,NO
20,1.3.6.1.4.1.14519.5.2.1.6773477728861886890986...,1.3.6.1.4.1.14519.5.2.1.1936324223877448285580...,MR,t2_Flair_axial,2003-04-16 00:00:00.0,t2_Flair_axial: Processed_CaPTk,BRAIN,2,UPENN-GBM,UPENN-GBM-00003,...,60,2020-12-03 12:57:07.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6075996,2022-06-21 00:00:00.0,BRAIN^ROUTINE,2003-04-16 00:00:00.0,NO
31,1.3.6.1.4.1.14519.5.2.1.2548372895971539919666...,1.3.6.1.4.1.14519.5.2.1.2078155039591835232909...,MR,t2_Flair_axial,2003-06-29 00:00:00.0,t2_Flair_axial: Processed_CaPTk,,12,UPENN-GBM,UPENN-GBM-00015,...,60,2021-04-22 19:50:23.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6074812,2022-06-21 00:00:00.0,BRAIN^SPECTROSCOPY,2003-06-29 00:00:00.0,NO
53,1.3.6.1.4.1.14519.5.2.1.1308287424086917791399...,1.3.6.1.4.1.14519.5.2.1.2073339823343933670886...,MR,t2_Flair_axial,2003-02-12 00:00:00.0,t2_Flair_axial: Processed_CaPTk,,12,UPENN-GBM,UPENN-GBM-00014,...,60,2021-04-22 20:00:23.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6076960,2022-06-21 00:00:00.0,BRAIN^SPECTROSCOPY,2003-02-12 00:00:00.0,NO
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3595,1.3.6.1.4.1.14519.5.2.1.4134921694029967302218...,1.3.6.1.4.1.14519.5.2.1.3159814803738447676292...,MR,T2 AXIAL FLAIR_TERA,2012-10-05 00:00:00.0,T2 AXIAL FLAIR_TERA : Processed_CaPTk,HEADNECK,27,UPENN-GBM,UPENN-GBM-00552,...,60,2021-05-12 21:13:49.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,23775406,2022-06-21 00:00:00.0,MR HEAD FUNCTIONAL MOTOR LANGUAGE,2012-10-05 00:00:00.0,NO
3605,1.3.6.1.4.1.14519.5.2.1.6383474903657008282180...,1.3.6.1.4.1.14519.5.2.1.3372525308556639919145...,MR,t2_Flair_axial,2002-05-11 00:00:00.0,t2_Flair_axial: Processed_CaPTk,BRAIN,13,UPENN-GBM,UPENN-GBM-00056,...,60,2021-04-22 20:03:15.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6077484,2022-06-21 00:00:00.0,MRI BRAIN W/INJ/MHDI,2002-05-11 00:00:00.0,NO
3610,1.3.6.1.4.1.14519.5.2.1.5173797555894628549331...,1.3.6.1.4.1.14519.5.2.1.2710970768513155461290...,MR,t2_Flair_axial_TERA,2012-06-20 00:00:00.0,t2_Flair_axial_TERA : Processed_CaPTk,HEADNECK,12,UPENN-GBM,UPENN-GBM-00476,...,60,2021-04-17 15:13:44.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6082246,2022-06-21 00:00:00.0,BrainTumor,2012-06-20 00:00:00.0,NO
3617,1.3.6.1.4.1.14519.5.2.1.2083220381141573778379...,1.3.6.1.4.1.14519.5.2.1.1316014409073684955697...,MR,t2_Flair_axial,2013-11-17 00:00:00.0,t2_Flair_axial: Processed_CaPTk,HEADNECK,12,UPENN-GBM,UPENN-GBM-00608,...,60,2021-05-12 21:03:06.0,Creative Commons Attribution 4.0 International...,https://creativecommons.org/licenses/by/4.0/,https://doi.org/10.7937/TCIA.709X-DN49,6083300,2022-06-21 00:00:00.0,MR HEAD W AND WO IV CONTRAST TUMOR NEW,2013-11-17 00:00:00.0,NO
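Before downloading a filtered selection, it can be worth estimating how much data is involved. The sketch below works on plain records (a list of dictionaries) and assumes the `SeriesInstanceUID` and `FileSize` fields shown in the table above, with `FileSize` in bytes. `estimate_download` is a hypothetical helper; the UIDs in `sample` are placeholders, while the two sizes are taken from the first two rows of the output.

```python
def estimate_download(records, uid_key="SeriesInstanceUID", size_key="FileSize"):
    """Return (list of series UIDs, total size in bytes) for the given records."""
    uids = [record[uid_key] for record in records]
    total_bytes = sum(int(record.get(size_key) or 0) for record in records)
    return uids, total_bytes


# Placeholder UIDs; FileSize values copied from the filtered output above.
sample = [
    {"SeriesInstanceUID": "1.2.3.1", "FileSize": 6077172},
    {"SeriesInstanceUID": "1.2.3.2", "FileSize": 6078302},
]

uids, total_bytes = estimate_download(sample)
print(f"{len(uids)} series, about {total_bytes / 1e6:.1f} MB")
```

With the DataFrame from the previous cell, `filtered_df.to_dict("records")` would produce records in this shape.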
