AdamCarter27/TumorEvolution

Cancer Tumor Evolution Tracker

Inspiration

Even when cancer treatment works at first, relapse can happen for one brutal reason: cancer evolves.

A tumor isn’t one uniform enemy—it’s a shifting population of clones. Therapy may wipe out the dominant clone, while a smaller, hidden subclone survives and expands—becoming resistant disease.

In cancer genomics, researchers often receive a flat mutation table with Variant Allele Frequency (VAF) values—but the key questions are evolutionary:

  • What happened first?
  • What branches emerged?
  • How heterogeneous is this tumor?
  • Which subclones might drive resistance?

We built Cancer Tumor Evolution Tracker to turn mutation data into a clear evolutionary map that cancer scientists can use immediately.


What It Does

Cancer Tumor Evolution Tracker converts standard tumor variant data into an evolution blueprint:

Clonal Evolution Tree

A trunk → branch tree visualizing likely tumor progression.

Mutation Timeline

An interactive view of mutations ordered by VAF, with variant details on hover.

Clone Binning + Labels

Groups mutations into “expansion waves” and labels events as:

  • Early / Trunk
  • Late / Branch

Heterogeneity Score

Computes Shannon entropy to quantify tumor diversity, a proxy for evolutionary potential and resistance risk.
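As a rough illustration of the score, the sketch below computes Shannon entropy in bits over clone labels. It assumes the probabilities are the share of mutations assigned to each clone bin; the README does not say exactly which distribution the app uses, and the function name `shannon_entropy` is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(clone_labels):
    """Shannon entropy (bits) of the distribution of mutations across clones.

    Assumption: p_i is the fraction of mutations assigned to clone i.
    """
    counts = Counter(clone_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no mutations, no diversity to measure
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


# One clone gives 0 bits; mutations spread evenly over more clones score higher.
single_clone = shannon_entropy(["trunk"] * 5)
two_clones = shannon_entropy(["trunk", "trunk", "branch", "branch"])
```

Under this assumption a tumor whose mutations all fall in one bin scores 0, and the score grows as mutations spread more evenly across more bins.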

Real-World Data Support

  • Accepts CSV and MAF uploads
  • Can pull real mutation profiles via cBioPortal API

This tool is designed for oncology researchers studying tumor dynamics, progression, behavior, and treatment response—not just static tumor classification.


How We Built It

We created an end-to-end research workflow in Python:

  • Streamlit — Interactive analysis app (upload/fetch → analyze → visualize)
  • Pandas / NumPy — Data cleaning, VAF handling, clone binning, metrics
  • Plotly — Interactive timeline and evolution tree visualizations
  • cBioPortal API integration — Real-world cancer mutation datasets

Core Idea (Why It Works)

We use Variant Allele Frequency (VAF) as a proxy for mutation prevalence and relative timing:

  • Earlier mutations are inherited by more descendant cells → tend to have higher VAF
  • Later mutations arise in subclones → tend to have lower VAF

Algorithm Overview

  1. Sort mutations by VAF
  2. Group similar VAF values into clone bins
    (Mutations that likely rose together in the same expansion wave)
  3. Convert bins into a tree:
    • Highest-VAF bin = trunk
    • Progressively lower-VAF bins = branches
  4. Compute Shannon entropy to quantify heterogeneity
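Steps 1 to 3 can be sketched as follows. This is a minimal illustration, not the app's actual code: the gap threshold `max_gap`, the function names, and the choice to attach each branch to the next-higher bin are all assumptions made for the example. It uses plain Python rather than Pandas so it runs without dependencies.

```python
def bin_by_vaf(mutations, max_gap=0.05):
    """Group (name, vaf) pairs into clone bins, highest VAF first.

    Assumption: a new bin starts whenever the VAF drops by more than
    max_gap relative to the previous mutation in sorted order.
    """
    ordered = sorted(mutations, key=lambda m: m[1], reverse=True)
    bins = []
    for name, vaf in ordered:
        if bins and bins[-1][-1][1] - vaf <= max_gap:
            bins[-1].append((name, vaf))  # close enough: same expansion wave
        else:
            bins.append([(name, vaf)])  # large gap: start a new bin
    return bins


def bins_to_tree(bins):
    """Turn ordered bins into tree nodes: the first bin is the trunk.

    Assumption: each lower-VAF bin is attached to the bin just above it.
    """
    nodes = []
    for i, members in enumerate(bins):
        nodes.append({
            "clone": f"clone_{i}",
            "role": "trunk" if i == 0 else "branch",
            "parent": None if i == 0 else f"clone_{i - 1}",
            "mutations": [name for name, _ in members],
        })
    return nodes


# Placeholder mutation names and VAFs, for illustration only.
example = [("mut_A", 0.48), ("mut_B", 0.45), ("mut_C", 0.22),
           ("mut_D", 0.20), ("mut_E", 0.05)]
tree = bins_to_tree(bin_by_vaf(example))
```

A fixed gap threshold keeps the default fast and easy to explain, at the cost of being sensitive to where the threshold is set.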

Tumor purity and copy-number changes can distort VAF, so VAF-based ordering is an approximation.
For rapid hypothesis generation, it still yields a clear and useful first-pass evolutionary reconstruction.


Challenges We Ran Into

Genomics Data Messiness

  • VAF reported as percent vs fraction
  • Missing columns
  • Variable formats across data sources
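One way to handle these cases is sketched below, under stated assumptions. Rows are plain dicts, a `vaf` column is used when present, and otherwise VAF is derived from the MAF-style `t_alt_count` and `t_ref_count` columns. The percent-versus-fraction rule (any reported value above 1 means the column is in percent) is a heuristic chosen for this example, and `normalize_vafs` is a hypothetical name.

```python
import math


def _to_float(value):
    """Parse a number, returning None for missing, blank, or NaN values."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def normalize_vafs(rows):
    """Return copies of rows with 'vaf' as a fraction in [0, 1].

    Rows with no usable VAF and no usable read counts are dropped.
    """
    reported = [_to_float(row.get("vaf")) for row in rows]
    present = [v for v in reported if v is not None]
    # Heuristic: if any reported VAF exceeds 1, treat the column as percent.
    scale = 100.0 if present and max(present) > 1.0 else 1.0

    cleaned = []
    for row, vaf in zip(rows, reported):
        if vaf is not None:
            vaf = vaf / scale
        else:
            alt = _to_float(row.get("t_alt_count"))
            ref = _to_float(row.get("t_ref_count"))
            if alt is None or ref is None or alt + ref <= 0:
                continue  # nothing to compute a VAF from
            vaf = alt / (alt + ref)
        if 0.0 <= vaf <= 1.0:
            cleaned.append({**row, "vaf": vaf})
    return cleaned
```

The heuristic fails for a percent-scaled file in which every VAF is at or below 1%, so an explicit user override would be a sensible addition.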

Interpretability

Real tumors can contain many low-frequency mutations, which clutter the tree.
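One possible way to keep the tree readable is to hide mutations below a VAF floor and cap how many are drawn. The sketch below is an assumption about how this could be done, not a description of the app: the thresholds `min_vaf` and `max_shown` and the function name are hypothetical.

```python
def prune_for_display(mutations, min_vaf=0.05, max_shown=25):
    """Select the (name, vaf) pairs to draw and count how many were hidden.

    Mutations below min_vaf are dropped, then the highest-VAF entries
    are kept up to max_shown.
    """
    kept = [m for m in mutations if m[1] >= min_vaf]
    kept.sort(key=lambda m: m[1], reverse=True)
    shown = kept[:max_shown]
    hidden = len(mutations) - len(shown)
    return shown, hidden
```

Returning the hidden count lets the interface report how much was filtered out instead of silently dropping it.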

Scientific Caveats

Balancing clarity with honesty about:

  • Tumor purity effects
  • Copy-number variation effects

API Reliability

Handling empty responses and edge cases from cBioPortal.
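A defensive fetch wrapper along these lines can absorb most of those cases. This is a sketch under assumptions: it expects the endpoint to return a JSON array of mutation records, the URL is supplied by the caller, and `fetch_mutation_records` and its `opener` parameter are hypothetical names. It uses only the standard library.

```python
import json
import urllib.request


def fetch_mutation_records(url, opener=urllib.request.urlopen, timeout=30):
    """Return a list of mutation records, or [] when nothing usable comes back.

    The opener is injectable so the function can be exercised without
    network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read()
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return []
    if not body or not body.strip():
        return []  # empty response body
    try:
        payload = json.loads(body)
    except ValueError:
        return []  # body was not valid JSON
    if not isinstance(payload, list):
        return []  # assumed shape: a JSON array of record objects
    return [record for record in payload if isinstance(record, dict)]
```

Returning an empty list for every failure mode keeps the calling code simple, though a real app would probably also surface the reason to the user.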


Accomplishments We’re Proud Of

  • Built a tool focused on tumor dynamics, not static mutation lists
  • Turned raw mutation tables into:
    • Evolution tree
    • Clone bins
    • Heterogeneity score
  • Made it usable on real-world cancer genomics inputs
  • Designed it to be explainable:
    • “Trunk vs branch”
    • “Expansion waves”
    • “Resistance seeds”

What We Learned

  • The biggest gap in cancer genomics isn’t always lack of data—it’s lack of interpretability.
  • Researchers need tools that convert outputs into structure and narrative.
  • Transparent assumptions build trust and usability across interdisciplinary teams.

What’s Next

To evolve from a research/educational tool into a frontline genomics utility:

Purity + CNV-Aware Corrections

Reduce VAF distortion using copy-number and tumor purity modeling.
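As a sketch of what such a correction could look like, the function below converts an observed VAF into an estimated cancer cell fraction (CCF) using a commonly used relation between VAF, purity, and copy number. It is not implemented in the tool, and the function name, parameter names, and the clamp to 1.0 are assumptions for illustration.

```python
def estimate_ccf(vaf, purity, tumor_cn=2, multiplicity=1, normal_cn=2):
    """Estimate the fraction of cancer cells carrying a mutation.

    Model: expected VAF = purity * multiplicity * CCF
                          / (purity * tumor_cn + (1 - purity) * normal_cn)
    where multiplicity is the number of mutated copies per cancer cell.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity <= 0:
        raise ValueError("multiplicity must be positive")
    total_copies = purity * tumor_cn + (1.0 - purity) * normal_cn
    ccf = vaf * total_copies / (purity * multiplicity)
    return min(1.0, ccf)  # noise can push the raw estimate above 1


# A heterozygous mutation in a diploid region of a 60%-pure sample:
# a VAF near 0.30 corresponds to a mutation present in nearly all cancer cells.
ccf_example = estimate_ccf(0.30, purity=0.6)
```

Binning on CCF instead of raw VAF would make trunk and branch assignments less sensitive to sample purity.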

Multi-Sample Evolution Tracking

Compare paired samples to visualize clonal selection:

  • Primary vs metastasis
  • Pre-treatment vs post-treatment

Improved Clustering Options

Add probabilistic clustering modes while keeping a fast default binning method.

Exportable Reports

Generate structured reports including:

  • Evolution tree
  • Clone table
  • Heterogeneity score
  • Research summary

Resistance Annotation Layer

Flag branch mutations in known actionable pathways to accelerate follow-up experiments.


Built With

  • Python
  • Streamlit
  • Pandas
  • NumPy
  • Plotly
  • Matplotlib
  • cBioPortal
  • OpenAI API

Cancer Tumor Evolution Tracker
Turning mutation tables into evolutionary insight.
