High-Grade Serous Ovarian Cancer Subtypes - Why has the field settled on four?



In this repository, we compare high-grade serous ovarian cancer (HGSC) subtypes across Australian, American, and Japanese populations. We determine that two or three subtypes are most consistent across different datasets. A full report of this analysis is published in G3: Genes, Genomes, Genetics (Way et al. 2016). Instructions are provided in release version 1.3 to reproduce the analysis.

We use data extracted from the Bioconductor package curatedOvarianData (Ganzfried et al. 2013) as well as a dataset we uploaded to GEO (GSE74357). We apply a unified, unsupervised bioinformatics pipeline to compare subtypes across these populations and determine that specific subtypes are reliably identified. The most replicable subtypes are mesenchymal-like and proliferative-like, and their sample representation was highly concordant with other independent clustering studies performed on single populations.

We are currently working on adding African American HGSC samples to this pipeline to determine the representation of HGSC subtypes in an additional population. This project is in development and will be associated with a future release.


For all analysis- or coding-related questions, please file a GitHub issue.


To ensure analysis reproducibility, most packages are versioned using conda. The only exceptions are MCPcounter and ESTIMATE, which are downloaded by running install_custom.R.

To create a complete instance of this environment, run the following:

conda env create --force --file environment.yml
source activate hgsc_subtypes

R --no-save < install_custom.R
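
Because the analysis depends on the versioned packages, it can help to confirm that the hgsc_subtypes environment is actually active before running anything. The following is a minimal sketch, not part of the repository: `require_conda_env` is a hypothetical helper name, and it assumes the environment was activated by name, in which case conda exports that name in the `CONDA_DEFAULT_ENV` variable.

```shell
# Hypothetical helper (not part of this repository): succeed only when the
# named conda environment is the active one.
require_conda_env() {
  expected="$1"
  # conda exports CONDA_DEFAULT_ENV while an environment is active
  # (assumption: the environment was activated by name, not by path).
  if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
    echo "Expected conda environment '$expected', found '${CONDA_DEFAULT_ENV:-none}'" >&2
    return 1
  fi
  return 0
}
```

A script could call `require_conda_env hgsc_subtypes || exit 1` as its first step so that it stops early instead of running against the wrong package versions.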


There are currently two pipelines in place to analyze HGSC subtypes. To reproduce the results of either pipeline, activate the hgsc_subtypes environment and run:

# Cross-population HGSC subtypes analysis 

# African American HGSC subtypes analysis


All data was retrieved from curatedOvarianData except for the Mayo data and AACES data.


This work was supported by the Institute for Quantitative Biomedical Sciences (Dartmouth); The graduate program in Genomics and Computational Biology (Penn); The Norris Cotton Cancer Center Developmental Funds; the National Cancer Institute at the National Institutes of Health (R01 CA168758 to J.A.D., F31 CA186625 to J.R., R01 CA122443 to E.L.G.); The Mayo Clinic Ovarian Cancer SPORE (P50 CA136393 to E.L.G.); The Mayo Clinic Comprehensive Cancer Center-Gene Analysis Shared Resource (P30 CA15083); The Gordon and Betty Moore Foundation’s Data-Driven Discovery Initiative (grant number GBMF 4552 to C.S.G.); and The American Cancer Society (grant number IRG 8200327 to C.S.G.).