idc_new_data_v17_to_v24.md

Andrey Fedorov edited this page May 29, 2026 · 1 revision

IDC Data Added Since v17 (v18–v24)

Generated: 2026-05-29

Prompt: summarize the new data in IDC since v17
Skill: imaging-data-commons v1.6.4
Model: claude-sonnet-4-6
IDC data version: v24
idc-index: 0.12.3

Volume Overview

| Version | New Collections | New Patients | New Series | Size Added |
|---|---|---|---|---|
| v18 | 4 | 26,934 | 384,653 | 15.0 TB |
| v19 | 45 | 11,172 | 42,103 | 12.5 TB |
| v20 | 27 | 7,679 | 12,363 | 6.7 TB |
| v21 | 10 | 1,987 | 3,393 | 1.9 TB |
| v22 | 25 | 8,266 | 14,473 | 5.6 TB |
| v23 | 36 | 9,775 | 27,709 | 0.5 TB |
| v24 | 16 | 6,039 | 39,872 | 5.7 TB |
| Total | 163 | 71,852 | 524,566 | ~48 TB |
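
The totals row can be re-derived from the per-version rows. The following is a minimal sketch that only uses the figures printed in the table; the names `ROWS` and `column_totals` are illustrative and not part of idc-index. Because the per-version sizes are already rounded to 0.1 TB, the size total it produces is approximate.

```python
# Per-version figures copied from the Volume Overview table:
# (version, new collections, new patients, new series, size added in TB)
ROWS = [
    ("v18", 4, 26_934, 384_653, 15.0),
    ("v19", 45, 11_172, 42_103, 12.5),
    ("v20", 27, 7_679, 12_363, 6.7),
    ("v21", 10, 1_987, 3_393, 1.9),
    ("v22", 25, 8_266, 14_473, 5.6),
    ("v23", 36, 9_775, 27_709, 0.5),
    ("v24", 16, 6_039, 39_872, 5.7),
]


def column_totals(rows):
    """Sum each numeric column across versions."""
    collections = sum(r[1] for r in rows)
    patients = sum(r[2] for r in rows)
    series = sum(r[3] for r in rows)
    # Round to one decimal to hide floating-point noise in the TB sum.
    size_tb = round(sum(r[4] for r in rows), 1)
    return collections, patients, series, size_tb


totals = column_totals(ROWS)
print(totals)
```

The count columns reproduce the totals row exactly (163 collections, 71,852 patients, 524,566 series), and the rounded sizes sum to 47.9 TB.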

New Clinical/Imaging Collections by Version

v18 — +15 TB, dominated by AI-generated segmentations

| collection_id | Cancer Types | Patients | Supporting Data |
|---|---|---|---|
| advanced_mri_breast_lesions | Breast Cancer | 632 | Clinical |

v19 — Largest expansion in collection count (+45 collections)

| collection_id | Cancer Types | Patients | Supporting Data |
|---|---|---|---|
| ccdi_mci | Various | 4,407 | |
| cmb_aml | Acute Myeloid Leukemia | 11 | Clinical |
| cmb_crc | Colorectal Cancer | 74 | Clinical |
| cmb_gec | Gastroesophageal Cancer | 16 | Clinical |
| cmb_lca | Lung Cancer | 162 | Clinical |
| cmb_mel | Melanoma | 54 | Clinical |
| cmb_mml | Multiple Myeloma | 138 | Clinical |
| cmb_pca | Prostate Cancer | 51 | Clinical |
| gtex | Non-diseased (controls) | 971 | Clinical, Genomics |

v20

| collection_id | Cancer Types | Patients | Supporting Data |
|---|---|---|---|
| cmb_brca | Breast Invasive Carcinoma | 78 | Clinical |
| cmb_ov | Ovarian Cancer | 31 | Clinical |
| mediastinal_lymph_node_seg | Multi-cancer (11 types) | 513 | Clinical |
| spine_mets_ct_seg | Spine mets, 9 cancer types | 55 | Clinical |

v21

| collection_id | Cancer Types | Patients | Supporting Data |
|---|---|---|---|
| varepop_apollo | 8 types (including esophageal, HNSCC, lung, pancreatic, thymoma, colon) | 41 | |

v22 — Phantoms and pathology

| collection_id | Cancer Types | Patients | Supporting Data |
|---|---|---|---|
| bonemarrowwsi_pediatricleukemia | Leukemia (WSI) | 245 | Clinical |
| cbis_ddsm | Breast Cancer / Non-Cancer (mammography) | 6,671 | Image Analyses |
| cc_radiomics_phantom | Lung Phantom | 17 | Image Analyses |
| cc_radiomics_phantom_2 | Phantom | 251 | |
| cc_radiomics_phantom_3 | Head/Chest/Phantom | 95 | Image Analyses |
| ct4harmonization_multicentric | Liver Phantom | 1 | Software/Source Code |
| qiba_ct_liver_phantom | Liver Phantom | 3 | Image Analyses |
| qin_breast_02 | Breast Cancer | 13 | Clinical |
| qin_pet_phantom | Phantom | 2 | |
| rider_phantom_mri | Phantom | 10 | |
| rider_phantom_pet_ct | Phantom | 20 | Image Analyses |

v24

| collection_id | Cancer Types | Patients | Supporting Data |
|---|---|---|---|
| catch | Melanoma, SCC, MPNST, skin fibrous histiocytoma (WSI) | 282 | |
| cddp_eagle_1 | Lung Adenocarcinoma | 49 | |
| cgci_blgsp | Burkitt Lymphoma | 388 | |
| cgci_htmcp_cc | Cervical Squamous Cell Carcinoma | 211 | |
| cgci_htmcp_dlbcl | Diffuse Large B-Cell Lymphoma | 43 | |
| cgci_htmcp_lc | Non-Small Cell Carcinoma (lung) | 27 | |
| cptac_stad | Stomach Adenocarcinoma | 20 | Clinical, Genomics, Proteomics |
| eay131 | NCI-MATCH basket trial, 47 cancer types | 2,813 | Clinical |
| hcmi_cmdc | 7 cancer types across 14 body sites | 382 | |
| htan_tnp_sardana | Colon Mucinous Adenocarcinoma | 1 | |
| ldct_and_projection_data | Lung Cancer (low-dose CT + raw projections) | 200 | |
| pdxnet | PDX Network, 29 cancer types | 919 | Clinical, Image Analyses, Genomics |
| psma_pet_ct_lesions | Prostate Cancer (PSMA PET/CT + segmentations) | 378 | Image Analyses |
| spinal_multiple_myeloma_seg | Multiple Myeloma (spine segmentations) | 67 | Clinical |
| uw_cirp_mouse_pet_ct_nsclc | Lung SCC (mouse model, PET/CT) | 14 | Image Analyses, Clinical |

New Analysis Results (AI/Expert Annotations)

| Version | analysis_result_id | Content | Series | Size |
|---|---|---|---|---|
| v18 | TotalSegmentator-CT-Segmentations | Multi-organ auto-segmentations (TotalSegmentator) | 378,153 | 14.3 TB |
| v18 | RMS-Mutation-Prediction-Expert-Annotations | Expert annotations for RMS mutation prediction | 193 | 7 GB |
| v19 | Pan-Cancer-Nuclei-Seg-DICOM | Pan-cancer nuclei segmentations (pathology WSI) | 12,149 | 7.0 TB |
| v19 | Pancreas-CT-SEG | Pancreas CT segmentations | 80 | 0.2 GB |
| v23 | Lung-PET-CT-Dx-Annotations | Lung PET/CT diagnostic annotations | 1,091 | 0.1 GB |
| v23 | NLSTSeg | NLST lung segmentations | 1,803 | 0.4 GB |
| v23 | NLST-Sybil | NLST risk prediction outputs (Sybil model) | 970 | |
| v23 | ProstateX-Targets | ProstateX biopsy target annotations | 345 | |
| v23 | TCGA-GBM360 | TCGA GBM 360° annotations | 691 | 0.4 GB |
| v23 | TCGA-SBU-TIL-Maps | TCGA tumor-infiltrating lymphocyte maps | 21,030 | 2.0 GB |
| v24 | EAY131-Tumor-Annotations | Tumor annotations for NCI-MATCH (eay131) | 15,799 | 1.1 GB |
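
To gauge how much of each version's growth comes from analysis results rather than new imaging, the series counts above can be compared with the New Series column of the Volume Overview table. The sketch below assumes that analysis-result series are included in those per-version New Series counts, which this page does not state explicitly. `NEW_SERIES`, `ANALYSIS_SERIES`, and `analysis_share` are illustrative names.

```python
# New series added per version (from the Volume Overview table).
NEW_SERIES = {"v18": 384_653, "v19": 42_103, "v23": 27_709, "v24": 39_872}

# Series counts per analysis result (from the New Analysis Results table).
ANALYSIS_SERIES = {
    "v18": [378_153, 193],
    "v19": [12_149, 80],
    "v23": [1_091, 1_803, 970, 345, 691, 21_030],
    "v24": [15_799],
}


def analysis_share(version):
    """Percent of a version's new series contributed by analysis results.

    ASSUMPTION: analysis-result series are counted within NEW_SERIES.
    """
    return round(100 * sum(ANALYSIS_SERIES[version]) / NEW_SERIES[version], 1)


for version in ANALYSIS_SERIES:
    print(version, sum(ANALYSIS_SERIES[version]), analysis_share(version))
```

Under that assumption, analysis results account for roughly 98% of the new series in v18 and about 94% in v23, compared with well under half in v19 and v24.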

Key Highlights

  • TotalSegmentator (v18) is the single largest addition: 14.3 TB of multi-organ CT auto-segmentations across 378K series
  • Cancer Moonshot Biobank (v19) introduced 7 clinical cohorts (cmb_*) with clinical supporting data spanning blood, GI, lung, skin, and prostate cancers; v20 added cmb_brca and cmb_ov
  • CCDI-MCI (v19) added 4,407 subjects with broad pediatric/young adult cancer coverage, the largest cohort listed here after cbis_ddsm (v22, 6,671 patients)
  • NCI-MATCH (eay131) (v24) added a multi-institution basket trial cohort with 47 cancer types and 2,813 patients
  • PDXNet (v24) added patient-derived xenograft imaging across 29 cancer types with genomics
  • CGCI suite (v24) added 4 lymphoma/cervical/lung cohorts with 669 total patients
  • Pan-Cancer-Nuclei-Seg-DICOM (v19) added 7 TB of pathology nuclei segmentations
  • GTEx (v19) provides non-diseased tissue controls (971 subjects) paired with genomics data