This repository contains the experiments for the Nature Communications paper "Deciphering complex breakage-fusion-bridge genome rearrangements with Ambigram". We benchmark how well Ambigram deciphers various BFB events from Illumina paired-end (PE) reads, Oxford Nanopore (ONT) long reads, Pacific Biosciences (PB) long reads, and 10x Genomics linked reads, under varying tumor purity and sequencing depth.
- Benchmark based on simulated data: We simulated four data sets to test how well Ambigram deciphers different BFB paths, including fold-back inversions, deletions, and translocations.
- Construct a BFB path from a set of test data (sv and seg files) with Ambigram.
- Generate a BFB FASTA file (base sequence) from the path, using hg38.fa as the reference.
- Simulate PE, PB, ONT, and 10x sequencing reads from the BFB FASTA file, using a dedicated simulator for each platform.
- Align the simulated reads to the human (Homo sapiens) reference genome (e.g., hg38.fa).
- Extract SV (structural variant) information from the BAM file derived from alignment.
- Reconstruct the BFB path with Ambigram and new SV information.
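
The FASTA-generation step above can be illustrated with a minimal Python sketch. It is not the repository's own script: the path representation (a list of `(segment_id, strand)` pairs), the function names, and the toy segment sequences are all assumptions made for illustration. The only point is that a segment traversed in reverse orientation contributes its reverse complement, which is what produces the palindromic structure typical of a fold-back inversion.

```python
# Illustrative sketch only: the path format and all names here are hypothetical,
# not Ambigram's actual input or output format.

# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def path_to_sequence(path, segments):
    """Concatenate segment sequences along a path of (segment_id, strand) steps.

    A step with strand "-" contributes the reverse complement of the segment.
    """
    parts = []
    for seg_id, strand in path:
        seq = segments[seg_id]
        parts.append(seq if strand == "+" else reverse_complement(seq))
    return "".join(parts)


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [">" + name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy segments; in practice these would be extracted from the reference.
    segments = {"A": "AACC", "B": "GGT"}
    # A fold-back: traverse A and B forward, then B and A in reverse.
    path = [("A", "+"), ("B", "+"), ("B", "-"), ("A", "-")]
    print(to_fasta("bfb_toy", path_to_sequence(path, segments)), end="")
```

For real data, the segment sequences would be pulled from hg38.fa by coordinate (for example with `samtools faidx`) rather than typed in by hand.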
**Note: Each directory contains a .sh file for its specific experiment.**
- Extract a subset of the reads from the whole-genome BAM file.
- (Optional) Test the effect of tumor purity and sequencing depth.
- Merge the normal sample with the tumor sample (which carries the SVs) at a given ratio.
- Subsample the clipped BAM file to generate BAM files with different depths.
- Call SVs from the clipped BAM file with an SV caller, e.g., svaba or sniffles.
- Convert the VCF file into an sv file.
- Generate an lh file from the sv file and the seg file.
- Reconstruct the BFB path with Ambigram and new SV information.
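
The purity and depth steps above come down to simple arithmetic, sketched here in Python. This is an illustration under stated assumptions, not the repository's script: the function name and parameters are hypothetical, and purity is approximated as the fraction of the final read depth that comes from the tumor sample (ploidy differences are ignored).

```python
# Illustrative sketch only: names and the purity model are assumptions.

def mixing_fractions(tumor_depth, normal_depth, purity, target_depth):
    """Return (tumor_fraction, normal_fraction) of reads to keep from each BAM.

    The tumor sample should contribute purity * target_depth and the normal
    sample (1 - purity) * target_depth, so each fraction is the desired
    contribution divided by the depth available in that sample.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if tumor_depth <= 0 or normal_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")

    tumor_fraction = purity * target_depth / tumor_depth
    normal_fraction = (1.0 - purity) * target_depth / normal_depth

    # Subsampling can only discard reads, never create them.
    if tumor_fraction > 1.0 or normal_fraction > 1.0:
        raise ValueError("target depth is not reachable from the input depths")
    return tumor_fraction, normal_fraction


if __name__ == "__main__":
    # Example: 60x tumor, 50x normal, aiming for 40x at 75% purity.
    t, n = mixing_fractions(60, 50, purity=0.75, target_depth=40)
    print("keep %.3f of tumor reads and %.3f of normal reads" % (t, n))
```

The resulting fractions can be handed to a subsampling tool such as `samtools view -s`, which takes a value whose integer part is the random seed and whose fractional part is the share of reads to keep. The two subsampled files can then be combined with `samtools merge`.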
- SVAS (https://github.com/paprikachan/SVAS)
- localHap (to be released)
- wgsim (https://github.com/lh3/wgsim)
- bwa (https://github.com/lh3/bwa)
- samtools (https://github.com/samtools/samtools)
- svaba (https://github.com/walaj/svaba)
- pbsim2 (https://github.com/yukiteruono/pbsim2)
- ngmlr (https://github.com/philres/ngmlr)
- sniffles (https://github.com/fritzsedlazeck/Sniffles)
- LRSIM (https://github.com/aquaskyline/LRSIM)
@article{li2023deciphering,
title={Deciphering complex breakage-fusion-bridge genome rearrangements with Ambigram},
author={Li, Chaohui and Chen, Lingxi and Pan, Guangze and Zhang, Wenqian and Li, Shuai Cheng},
journal={Nature Communications},
volume={14},
number={1},
pages={5528},
year={2023},
publisher={Nature Publishing Group UK London}
}