Skip to content

A lightweight utility for BS-seq and MethylC-seq data which applies a double-masking procedure on bisulfite alignments, facilitating variant calling with conventional software.

License

Notifications You must be signed in to change notification settings

bio15anu/revelio

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation


This program takes an input BAM file and masks base positions that are potentially influenced by bisulfite-seq as observed from alignments between query and reference sequences. The script requires either an input FASTA sequence of the reference genome (slow) or a BAM file with MD tags (fast) as generated by 'samtools calmd'.

Requirements

The script requires Pysam which can be installed via Bioconda:

conda install -c bioconda pysam

Basic Usage

The script can run either on an input BAM file with proper MD tags such as those generated by 'samtools calmd', or by taking the reference genome fasta directly. Note that MD tags generated directly via bisulfite-aware alignment software may be inappropriate, as they sometimes follow a non-standard definition.

# using proper MD tags
samtools calmd -b input.bam genome.fa 1> calmd.bam 2> /dev/null
./revelio.py calmd.bam masked.bam

# using reference genome directly
./revelio.py -f genome.fa input.bam masked.bam

About

A lightweight utility for BS-seq and MethylC-seq data which applies a double-masking procedure on bisulfite alignments, facilitating variant calling with conventional software.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages