VR2C

Visualize read-to-contig alignment.

NOTE: This package needs Python-3.6, and is still very rough around the corners, but I just put it earlier in case anyone would like to contribute.

Motivation

Visualizing read-to-contig alignment could be difficult because the number of contigs could be huge compared to a typical genome, which has only a few dozens of contigs/chromosomes. It may be difficult to index the contig FASTA file and load it into a genome browser like IGV.

Usage

vr2c --contigs-to-genome test_data/c2g.sorted.bam --reads-to-contigs test_data/r2c_sorted.bam --contig-id [contig-id] --seqname [chromosome] --positions [positions]

Or equivalently

vr2c -c test_data/c2g.sorted.bam -r test_data/r2c_sorted.bam -t [contig-id] -s [chromosome] -p [positions]

By default it will save the output in [chromosome]_[contig_id]_[positions].png, or you could change the output file name by specifying --output/-o.

vr2c --help
usage: vr2c [-h] -c CONTIGS_TO_GENOME -r READS_TO_CONTIGS -t CONTIG_ID -s
            SEQNAME -p POSITIONS [POSITIONS ...] [--figure-width FIGURE_WIDTH]
            [--seqname-beg-end SEQNAME_BEG_END SEQNAME_BEG_END] [-o OUTPUT]

plot r2c and c2g in one coordinate

optional arguments:
  -h, --help            show this help message and exit
  -c CONTIGS_TO_GENOME, --contigs-to-genome CONTIGS_TO_GENOME
                        input contig-to-genome alignment BAM file
  -r READS_TO_CONTIGS, --reads-to-contigs READS_TO_CONTIGS
                        input read-to-contig alignment BAM file
  -t CONTIG_ID, --contig-id CONTIG_ID
                        contig id, e.g. A1.J70460
  -s SEQNAME, --seqname SEQNAME
                        e.g. chr12. This is required because one contig could
                        be aligned to two chromosomes, e.g. hardclipping
  -p POSITIONS [POSITIONS ...], --positions POSITIONS [POSITIONS ...]
                        plot vertical lines for intended positions. e.g.
                        cleavage site
  --figure-width FIGURE_WIDTH
                        figure width, default to 16 based on experience
  --seqname-beg-end SEQNAME_BEG_END SEQNAME_BEG_END
                        if specified, the beginning and ending of the plot on
                        the chromosome are enforced, e.g. [25357088, 25357993]
  -o OUTPUT, --output OUTPUT
                        default to be <contig_id>.png

An example use case

Output:

$ vr2c -c test_data/c2g.sorted.bam -r test_data/r2c_sorted.bam -t E1.L.4362 -s chr16 -p 23698789
2018-08-23 14:09:09,598|INFO|c2g_bam_file: test_data/c2g.sorted.bam
2018-08-23 14:09:09,598|INFO|r2g_bam_file: test_data/r2c_sorted.bam
329275it [00:00, 470121.10it/s]
2018-08-23 14:09:10,361|INFO|contig.query_name: E1.L.4362
2018-08-23 14:09:10,361|INFO|contig.is_reverse: False
2018-08-23 14:09:10,362|INFO|contig.cigarstring: 501M743N169M661N145M1005N94M1712N220M3379N156M1043N78M492N155M101N183M183N508M13S
2018-08-23 14:09:10,362|INFO|contig.reference_start: 23690160
2018-08-23 14:09:10,362|INFO|contig.reference_end: 23701688
2018-08-23 14:09:10,811|INFO|found 17779 reads; randomly pick 100

Generated plot:

Legend:

shown in the bottom is the contig
green are regular reads
yellow is softclip
red is hardclip (not available in this example)
introns are skipped and the distance between neighbouring exons are not to scale

A few TODOs

currently, it only plot the first 100 reads per contig by default, need to make it adjustable
calculate figure size more properly, currently there could be excessive top margin as seen from above.
choose matplotlib backend accordingly
write tests

Development

virtualenv venv
. venv/bin/activate

python setup.py develop

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
test_data		test_data
vr2c		vr2c
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

VR2C

Motivation

Usage

An example use case

A few TODOs

Development

About

Releases

Packages

Languages

License

zyxue/vr2c

Folders and files

Latest commit

History

Repository files navigation

VR2C

Motivation

Usage

An example use case

A few TODOs

Development

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages