Even when cancer treatment works at first, relapse can happen for one brutal reason: cancer evolves.
A tumor isn’t one uniform enemy—it’s a shifting population of clones. Therapy may wipe out the dominant clone, while a smaller, hidden subclone survives and expands—becoming resistant disease.
In cancer genomics, researchers often receive a flat mutation table with Variant Allele Frequency (VAF) values—but the key questions are evolutionary:
- What happened first?
- What branches emerged?
- How heterogeneous is this tumor?
- Which subclones might drive resistance?
We built Cancer Tumor Evolution Tracker to turn mutation data into a clear evolutionary map that cancer scientists can use immediately.
Cancer Tumor Evolution Tracker converts standard tumor variant data into an evolution blueprint:
A trunk → branch tree visualizing likely tumor progression.
An interactive view of mutations ordered by VAF, with variant details on hover.
Groups mutations into “expansion waves” and labels events as:
- Early / Trunk
- Late / Branch
Computes Shannon entropy to quantify tumor diversity, a proxy for evolutionary potential and resistance risk.
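As a rough illustration of the heterogeneity score, the sketch below computes Shannon entropy (in bits) over the share of mutations that fall into each clone bin. Computing it over bin membership rather than raw VAF values is an assumption, and `shannon_entropy` is a hypothetical helper name, not necessarily the tool's own function.

```python
import math
from collections import Counter


def shannon_entropy(bin_labels):
    """Shannon entropy in bits of how mutations are spread across clone bins.

    bin_labels: one label per mutation, naming the clone bin it belongs to.
    Assumption: diversity is measured over bin membership counts.
    """
    counts = Counter(bin_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no mutations, no measurable diversity

    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy
```

A tumor whose mutations all sit in one bin scores 0, and the score grows as mutations spread more evenly across more bins, which is why it can serve as a simple diversity summary.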
- Accepts CSV and MAF uploads
- Can pull real mutation profiles via the cBioPortal API
This tool is designed for oncology researchers studying tumor dynamics, progression, behavior, and treatment response—not just static tumor classification.
We created an end-to-end research workflow in Python:
- Streamlit — Interactive analysis app (upload/fetch → analyze → visualize)
- Pandas / NumPy — Data cleaning, VAF handling, clone binning, metrics
- Plotly — Interactive timeline and evolution tree visualizations
- cBioPortal API integration — Real-world cancer mutation datasets
We use Variant Allele Frequency (VAF) as a proxy for mutation prevalence and relative timing:
- Earlier mutations are inherited by more descendant cells → tend to have higher VAF
- Later mutations arise in subclones → tend to have lower VAF
- Sort mutations by VAF
- Group similar VAF values into clone bins
  (mutations that likely rose together in the same expansion wave)
- Convert bins into a tree:
- Highest-VAF bin = trunk
- Progressively lower-VAF bins = branches
- Compute Shannon entropy to quantify heterogeneity
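The sort, bin, and tree steps above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the fixed `bin_width` gap rule, the `bin_by_vaf` name, and the choice to hang each bin off the one above it (a linear chain) are all assumptions made for the example.

```python
def bin_by_vaf(mutations, bin_width=0.10):
    """Group mutations into clone bins and label them trunk or branch.

    mutations: list of (name, vaf) pairs with vaf as a fraction in 0..1.
    bin_width: assumed gap rule; a new bin starts when a mutation's VAF
               drops more than this below the top VAF of the current bin.
    Returns bins ordered from highest to lowest VAF.
    """
    # Step 1: sort mutations by VAF, highest first
    ordered = sorted(mutations, key=lambda m: m[1], reverse=True)

    # Step 2: group similar VAF values into clone bins
    bins = []
    for name, vaf in ordered:
        if not bins or bins[-1]["top_vaf"] - vaf > bin_width:
            bins.append({"top_vaf": vaf, "mutations": []})
        bins[-1]["mutations"].append(name)

    # Step 3: highest-VAF bin is the trunk, lower bins are branches
    for i, clone_bin in enumerate(bins):
        clone_bin["label"] = "Early / Trunk" if i == 0 else "Late / Branch"
        # Assumption: each bin descends from the bin directly above it
        clone_bin["parent"] = None if i == 0 else i - 1
    return bins


example = [("TP53", 0.48), ("KRAS", 0.45), ("PIK3CA", 0.22),
           ("SMAD4", 0.20), ("ARID1A", 0.05)]
tree = bin_by_vaf(example)
```

With the illustrative values above, TP53 and KRAS land in the trunk bin, while the lower-VAF mutations form two successive branch bins. A gap-based rule keeps the default fast; it is not a substitute for probabilistic clustering.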
We are transparent that tumor purity and copy-number changes can distort VAF.
However, for rapid hypothesis generation, this approach produces a clear and useful first-pass evolutionary reconstruction.
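To make the distortion concrete, the sketch below applies the commonly used relation between VAF, tumor purity, and copy number to estimate the fraction of cancer cells carrying a mutation. It is not part of the current tool. The function name `estimate_ccf`, the default multiplicity of one mutated copy, and the assumption of two copies in normal cells are all illustrative choices.

```python
def estimate_ccf(vaf, purity, tumor_copy_number=2, multiplicity=1):
    """Estimate cancer cell fraction (CCF) from an observed VAF.

    Uses the standard expectation
        VAF = purity * multiplicity * CCF
              / (purity * tumor_copy_number + 2 * (1 - purity))
    solved for CCF. Assumes normal cells carry 2 copies of the locus and
    that `multiplicity` mutated copies exist per mutated cancer cell.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity <= 0:
        raise ValueError("multiplicity must be positive")

    total_copies = purity * tumor_copy_number + 2 * (1 - purity)
    ccf = vaf * total_copies / (purity * multiplicity)
    # Sampling noise can push the estimate above 1; cap it for readability
    return min(ccf, 1.0)
```

For example, a heterozygous mutation at VAF 0.25 in a diploid region of a 50% pure sample corresponds to a CCF of about 1.0, meaning it could be a trunk event even though its raw VAF looks subclonal.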
- VAF reported as percent vs fraction
- Missing columns
- Variable formats across data sources
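One way to handle these input problems is sketched below using only the standard library, so it runs without Pandas. The `VAF` column name and the `normalize_vafs` helper are hypothetical; `t_alt_count` and `t_ref_count` are the usual MAF read-count columns. The percent check is a heuristic that assumes at least one explicit VAF exceeds 1 when values are percentages.

```python
def normalize_vafs(rows):
    """Return one VAF fraction (0..1) or None per input row.

    rows: list of dicts, e.g. from csv.DictReader.
    Prefers an explicit "VAF" column (hypothetical name); otherwise falls
    back to alt / (alt + ref) from MAF-style read counts.
    """
    def explicit(row):
        raw = row.get("VAF")
        if raw in (None, ""):
            return None
        try:
            return float(raw)
        except (TypeError, ValueError):
            return None

    explicit_values = [explicit(r) for r in rows]
    present = [v for v in explicit_values if v is not None]
    # Heuristic: any value above 1 means the column is in percent
    scale = 100.0 if present and max(present) > 1.0 else 1.0

    result = []
    for row, value in zip(rows, explicit_values):
        if value is not None:
            result.append(value / scale)
            continue
        try:
            alt = float(row["t_alt_count"])
            ref = float(row["t_ref_count"])
        except (KeyError, TypeError, ValueError):
            result.append(None)  # columns missing or unparseable
            continue
        depth = alt + ref
        result.append(alt / depth if depth > 0 else None)
    return result


rows = [{"VAF": "45"}, {"VAF": "12.5"},
        {"t_alt_count": "30", "t_ref_count": "70"},
        {"Hugo_Symbol": "TP53"}]
fractions = normalize_vafs(rows)
```

Rows that yield `None` can then be dropped or reported to the user rather than silently treated as zero. A percent column whose values are all at or below 1 would be misread by this heuristic, so an explicit user setting is safer when the source is known.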
Real tumors can contain many low-frequency mutations, which clutter the tree.
Balancing clarity and honesty about:
- Tumor purity effects
- Copy-number variation effects
Handling empty responses and edge cases from cBioPortal.
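A defensive parser for already-decoded API records might look like the sketch below. It makes no network call. The field names `tumorAltCount`, `tumorRefCount`, `proteinChange`, and the nested `gene.hugoGeneSymbol` are assumptions about the shape of a cBioPortal mutation record and should be checked against the API's current schema; `parse_mutation_records` is a hypothetical helper.

```python
def parse_mutation_records(records):
    """Convert decoded JSON mutation records into simple VAF rows.

    records: list of dicts (or None / empty when the API returns nothing).
    Field names are assumed, not confirmed by the source document.
    Records without usable read counts are skipped, not guessed.
    """
    if not records:
        return []  # empty or missing response: nothing to analyze

    parsed = []
    for rec in records:
        alt = rec.get("tumorAltCount")
        ref = rec.get("tumorRefCount")
        if not isinstance(alt, (int, float)) or not isinstance(ref, (int, float)):
            continue  # counts absent or not numeric
        if alt < 0 or ref < 0 or alt + ref == 0:
            continue  # placeholder values or zero depth give no VAF

        gene_info = rec.get("gene") or {}
        gene = gene_info.get("hugoGeneSymbol") or "unknown"
        parsed.append({
            "gene": gene,
            "protein_change": rec.get("proteinChange"),
            "vaf": alt / (alt + ref),
        })
    return parsed
```

Returning an empty list for an empty response lets the calling code show a "no mutations found" message instead of failing inside the binning or plotting steps.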
- Built a tool focused on tumor dynamics, not static mutation lists
- Turned raw mutation tables into:
- Evolution tree
- Clone bins
- Heterogeneity score
- Made it usable on real-world cancer genomics inputs
- Designed it to be explainable:
- “Trunk vs branch”
- “Expansion waves”
- “Resistance seeds”
- The biggest gap in cancer genomics isn’t always lack of data—it’s lack of interpretability.
- Researchers need tools that convert outputs into structure and narrative.
- Transparent assumptions build trust and usability across interdisciplinary teams.
To evolve from a research/educational tool into a frontline genomics utility:
Reduce VAF distortion using copy-number and tumor purity modeling.
Compare:
- Primary vs metastasis
- Pre-treatment vs post-treatment
to visualize clonal selection.
Add probabilistic clustering modes while keeping a fast default binning method.
Generate structured reports including:
- Evolution tree
- Clone table
- Heterogeneity score
- Research summary
Flag branch mutations in known actionable pathways to accelerate follow-up experiments.
- Python
- Streamlit
- Pandas
- NumPy
- Plotly
- Matplotlib
- cBioPortal
- OpenAI API
Cancer Tumor Evolution Tracker
Turning mutation tables into evolutionary insight.