Top-level repository for the bioinformatic and analysis pipelines used in the publication "The copy-number landscape of recurrent ovarian high grade serous carcinoma", Smith & Bradley et al. 2023, Nature Communications.
The drivers of recurrence and resistance in ovarian high grade serous carcinoma (HGSC) remain unclear. We investigated the acquisition of resistance by collecting tumour biopsies from a cohort of 276 women with relapsed HGSC in the BriTROC-1 study. Panel sequencing showed close concordance between diagnosis and relapse, with only four discordant cases. There was also very strong concordance in copy number (CN) between diagnosis and relapse, with no significant difference in purity, ploidy or focal somatic CN alterations, even when stratified by platinum sensitivity or prior chemotherapy lines. CN signatures were strongly correlated with immune cell infiltration, whilst diagnosis samples from patients with primary platinum resistance had increased rates of CCNE1 and KRAS amplification and CN signature 1 exposure. Our data show that the HGSC genome is remarkably stable between diagnosis and relapse and acquired chemotherapy resistance does not select for common copy number drivers.
This repository acts as a top-level directory for the NGS data, pre-processing pipelines, analysis pipelines, and figure rendering for the publication.
Analysis | Details |
---|---|
swgs-absolutecn | copy number fitting pipeline used to generate absolute copy number profiles from BAM files |
britroc-1-cn-analysis | code and scripts used to generate copy number alteration & signature analysis markdowns and figures |
britroc-1-sig-analysis | code and scripts used to run differential signature abundance analysis markdowns and figures |
britroc-1-snv-analysis | code and scripts used to generate the SNV calls and perform analysis |
Data | Details |
---|---|
sWGS data | an EGA data repository containing hg19-aligned BAM files for the samples with available sWGS sequencing data |
TAM-Seq data | an EGA data repository containing FASTQ files for the samples with available TAM-Seq SNV sequencing data |
pre-processed sWGS data | a Zenodo repository containing pre-processed absolute copy number profiles from sWGS BAM files |
pre-processed TAM-Seq data | pre-processed SNV data from TAM-Seq sequencing data |
Software versions and dependencies are installed independently for each analysis or pipeline using conda. These pipelines have been tested on CentOS Linux 7.8 and RHEL 7.
Each repository provides its own instructions for installing the required software environments, along with any additional software packages, data requirements and implementation details. See the README.md in each analysis repository for more details.
The runtime required to replicate the results of this publication is approximately 4 hours on a high-performance computing server (6-8 hours on a high-spec laptop). Full runtime from raw data will vary considerably depending on implementation, available hardware, and data processing during non-scripted analytical steps.
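Since each analysis keeps its instructions in its own README.md, it can help to confirm that all four sub-repositories are present before starting. The following is a minimal sketch, not part of the published pipelines: it assumes the sub-repositories have been cloned side by side under one root directory, using the directory names from the Analysis table. The function name `find_missing_readmes` is hypothetical.

```python
from pathlib import Path

# Sub-repository names as listed in the Analysis table.
# Assumption: each is cloned into a directory of the same name.
ANALYSES = [
    "swgs-absolutecn",
    "britroc-1-cn-analysis",
    "britroc-1-sig-analysis",
    "britroc-1-snv-analysis",
]


def find_missing_readmes(root):
    """Return the analyses under `root` that have no README.md file."""
    root = Path(root)
    return [
        name
        for name in ANALYSES
        if not (root / name / "README.md").is_file()
    ]


if __name__ == "__main__":
    # Check the current working directory by default.
    missing = find_missing_readmes(".")
    if missing:
        print("Missing README.md for:", ", ".join(missing))
    else:
        print("All analysis repositories found.")
```

Any name reported as missing indicates a sub-repository that has not been cloned, or has been cloned under a different directory name.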
No demo software or data is provided for the pipelines associated with this publication. All data is either freely available (as pre-processed outputs) or available upon request via EGA.
- Data access
  - Raw sequencing files - apply via the EGA data access committee.
  - Pre-processed data - download directly from the open-access repositories.
- Repository-specific issues - please open an issue on the relevant GitHub repository.
- General questions about running the pipelines - open an issue on this repository.
- Queries directly about the paper - email the corresponding author listed on the publication.
This repository is citable using the DOI 10.5281/zenodo.7942206.
The vast majority of the code used in this publication is open-source and covered by the GPL-3.0 license.
The copy number signature methodology is covered by a GAP Available Source License v1.0 (see here) as part of a pending patent application. This license limitation applies to the copy number signature methodology used in the copy number analysis pipeline.