The Childhood Cancer Data Lab was established by Alex’s Lemonade Stand Foundation (ALSF) in 2017. The Data Lab is a team of data scientists, designers, engineers, and communicators. Our mission is to accelerate the pace of finding novel cures and treatments for childhood cancer by putting resources and knowledge in the hands of pediatric cancer experts.
We construct tools that make vast amounts of data widely available, easily mineable, and broadly reusable. We also train researchers to better understand their own data and to advance their work more quickly. The Data Lab team simultaneously contributes to childhood cancer research and to the open science and open source software communities.
refine.bio is a multi-organism collection of genome-wide transcriptome or gene expression data that has been obtained from publicly available repositories and uniformly processed and normalized.
- Read the documentation.
- Get started with example workflows for use with refine.bio data.
- Learn about building and running the refine.bio project source code.
Single-cell Pediatric Cancer Atlas (ScPCA)
The Single-cell Pediatric Cancer Atlas (ScPCA) focuses on single-cell and single-nuclei profiling with the goal of creating a publicly available atlas of pediatric cancer data.
Ten ScPCA awards were funded by Alex’s Lemonade Stand Foundation.
The ALSF-funded researchers submit their single-cell, single-nuclei, and bulk RNA sequencing data to the Data Lab for processing.
The data from these patient tumors are made available in one location through the ScPCA Portal.
- Processing workflows for ScPCA data.
- ScPCA tools and data files for testing them.
- User information about ScPCA processing.
Open Pediatric Brain Tumor Atlas (OpenPBTA)
The Open Pediatric Brain Tumor Atlas (OpenPBTA) is a global open science initiative that analyzes a large collection of pediatric brain tumor data from 943 tumors.
This project operates on an open contribution model, crowdsourcing contributions from childhood brain cancer experts around the world.
We hold training workshops to teach researchers the data science skills they need to examine their own data. Participants are introduced to the R programming language and to cutting-edge technologies used in single-cell and bulk RNA-sequencing data analysis. All of our training materials are openly licensed and freely available!
- View our current training modules.
- Sign up to be notified about future training workshops.
- Questions? Interested in using our materials to hold your own workshop? Contact us at
For inquiries, please contact us at
Support our work by making a tax-deductible contribution to ALSF’s Childhood Cancer Data Lab. Donate here!