DasProsad/CaVar


CaVar

Check status of CRISPR-Cas9 off-targets from genomic variants


Introduction

CaVar (pronounced as ka-vee-a) is a CLI tool to check the status of CRISPR-Cas off-target sites (OTs) from genomic variants. It takes the output of an upstream OT-finding tool (for example, Cas-Offinder) together with a VCF file and infers the status of each OT site. The status of the PAM (pm) is reported using the scoring scheme in Table 1. The protospacer score (ps) is based on the Hamming distance: if the number of mismatches between the supplied gRNA and the variant protospacer sequence of an off-target site is less than the maximum allowed value, ps is set to 1; otherwise, it is set to 0. Finally, the off-target score is calculated as: os := (ps × 10) + pm. For example, ps = 1 and pm = 3 give os = 13.

old PAM    new PAM         Score (pm)
valid      valid           0
valid      valid (same)    1
invalid    valid           2
valid      invalid         3
invalid    invalid         4

Table 1. Off-target PAM scoring scheme
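The scoring described above can be sketched in a few lines of Python. This is only an illustration, not CaVar's actual code: the function names are hypothetical, the PAM is assumed to be validated with a full regex match, and the first two rows of Table 1 are assumed to differ only in whether the PAM sequence itself changed.

```python
import re


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def pam_score(old_pam: str, new_pam: str, pam_regex: str = "[ATCG]GG") -> int:
    """Return pm following Table 1 (row 1 vs. row 2 split is an assumption)."""
    old_ok = re.fullmatch(pam_regex, old_pam) is not None
    new_ok = re.fullmatch(pam_regex, new_pam) is not None
    if old_ok and new_ok:
        # Assumption: "valid (same)" means the PAM sequence is unchanged.
        return 1 if old_pam == new_pam else 0
    if not old_ok and new_ok:
        return 2
    if old_ok and not new_ok:
        return 3
    return 4


def protospacer_score(grna: str, protospacer: str, max_mismatches: int = 4) -> int:
    """Return ps: 1 if the mismatch count is below the maximum, else 0."""
    return 1 if hamming(grna, protospacer) < max_mismatches else 0


def off_target_score(ps: int, pm: int) -> int:
    """Combine both scores as os := (ps * 10) + pm."""
    return ps * 10 + pm
```

With this encoding the tens digit of os carries ps and the units digit carries pm, so both can be read back from a single number.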

Dependencies

CaVar requires Python >= 3.9 and depends on the following external libraries:

  • cyvcf2 == 0.31.4
  • pysam == 0.23.3

Installation

From PyPI

To install with pip, use:

pip install cavar

From source

To install from source, use the following:

git clone https://github.com/dasprosad/cavar.git
cd cavar
pip install .

Usage and options

Prerequisites

  • Usage of reference genome

The reference genome must be the same one that was used for variant calling. The VCF specification states that the CHROM field must not contain any whitespace, so the FASTA headers must be reformatted to contain only the FASTA ID and no other description.
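One possible way to reformat the headers is sketched below. It is not part of CaVar; the function names and file paths are hypothetical, and it assumes the FASTA ID is everything before the first whitespace on a header line.

```python
from typing import Iterable, Iterator


def strip_fasta_descriptions(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines with header descriptions removed.

    A header such as ">chr1 some description" becomes ">chr1".
    Sequence lines are passed through unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            yield f">{seq_id}\n"
        else:
            yield line


def rewrite_fasta(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path, keeping only the ID in each header."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(strip_fasta_descriptions(src))
```

For example, `rewrite_fasta("hg38.fa", "hg38.ids_only.fa")` would write a copy whose headers contain no whitespace. Any existing FASTA index would need to be rebuilt for the new file.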

  • Finding off-target sites with Cas-Offinder

Cas-Offinder usage is documented on its GitHub project page. Use the header-reformatted reference genome when querying with Cas-Offinder. Set the maximum number of tolerated mismatches according to your Cas system of choice, leaving the PAM as "NNN". An example input file for Cas-Offinder with a maximum of 4 mismatches looks like the following.

/home/user/path/to/ref/hg38.fa
NNNNNNNNNNNNNNNNNNNNNN
ATGTTGATGATAGGATGATNNN 4
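If several gRNAs need to be checked, an input file of this shape can also be generated programmatically. The helper below is a hypothetical sketch that reproduces only the three-line layout shown above (genome path, an all-N pattern line, and the gRNA padded with N for the PAM); it is not part of CaVar or Cas-Offinder.

```python
def build_casoffinder_input(
    genome_path: str, grna: str, pam_length: int = 3, max_mismatches: int = 4
) -> str:
    """Return the text of a Cas-Offinder input file for one gRNA.

    The PAM is left fully degenerate ("N" * pam_length), as recommended above.
    """
    total_length = len(grna) + pam_length
    pattern_line = "N" * total_length
    query_line = f"{grna}{'N' * pam_length} {max_mismatches}"
    return f"{genome_path}\n{pattern_line}\n{query_line}\n"


def write_casoffinder_input(path: str, genome_path: str, grna: str) -> None:
    """Write the input file to disk using the default PAM length and mismatches."""
    with open(path, "w") as handle:
        handle.write(build_casoffinder_input(genome_path, grna))
```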

Run Cas-Offinder with

cas-offinder input.txt C casoffinder_out.tab

  • Converting Cas-Offinder result to BED

There are many ways to convert the result to a BED file; the following awk one-liner is one option.

First, add a BED header (optional):

echo -e "#BED3\n#CHROM\tSTART\tEND\tNAME\tSCORE\tSTRAND" >casoffinder_out.bed

Then use awk to reformat the result as BED:

awk '$1 !~ /^#/ {print $1, $4, $5}' OFS='\t' casoffinder_out.tab >>casoffinder_out.bed
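For readers who prefer to stay in Python, the awk step can be mirrored roughly as follows. The sketch makes the same column selection as the one-liner above (fields 1, 4 and 5, tab-separated, skipping lines whose first field starts with `#`); the function name is hypothetical, and unlike awk it skips blank lines and raises an error on rows with fewer than five fields.

```python
from typing import Iterable, Iterator


def tab_to_bed_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield tab-separated lines built from fields 1, 4 and 5 of each row."""
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment/header lines
        if len(fields) < 5:
            raise ValueError(f"expected at least 5 fields, got: {line!r}")
        yield "\t".join((fields[0], fields[3], fields[4])) + "\n"


def convert_file(tab_path: str, bed_path: str) -> None:
    """Append converted rows to bed_path, like '>>' in the shell command."""
    with open(tab_path) as src, open(bed_path, "a") as dst:
        dst.writelines(tab_to_bed_lines(src))
```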

CaVar usage

  • For help use cavar --help

Options

usage:  cavar [OPTIONS] <grna> <bed file> <vcf file>
        cavar --help

Check status of CRISPR-Cas9 off-targets from genomic variants

positional arguments:
  grna (STR)           grna without the pam sequence
  bedfile (PATH)       path of bedfile
  vcffile (PATH)       path of vcffile

options:
  -h, --help           show this help message and exit
  -v, --version        show program's version number and exit
  -p, --pam-regex STR  pam as regex (default: [ATCG]GG)
  -d, --distance INT   maximum number of mismatches in crRNA (default: 4)
  -o, --outfile PATH   name of outfile (default: outfile.bed)
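The --pam-regex option expects the PAM as a regular expression rather than in IUPAC notation. The helper below is a hypothetical convenience sketch (not part of CaVar) that translates an IUPAC PAM into a character-class regex; N is mapped to [ATCG] so that "NGG" reproduces the default shown above. Whether CaVar anchors the regex or matches it in full is not stated here, so the check with re.fullmatch is only an assumption.

```python
import re

# IUPAC nucleotide codes mapped to regex fragments.
IUPAC_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ATCG]",
}


def iupac_to_regex(pam: str) -> str:
    """Translate an IUPAC PAM such as 'NGG' into a regex such as '[ATCG]GG'."""
    try:
        return "".join(IUPAC_TO_REGEX[base] for base in pam.upper())
    except KeyError as err:
        raise ValueError(f"not an IUPAC nucleotide code: {err.args[0]}") from None


def pam_matches(pam_regex: str, sequence: str) -> bool:
    """Check a candidate PAM sequence against the regex (full match assumed)."""
    return re.fullmatch(pam_regex, sequence) is not None
```

The resulting string could then be passed on the command line, for example `cavar --pam-regex "[ATCG]GG" ...`. For a PAM longer than three bases, the "NNN" placeholder in the Cas-Offinder input would presumably need the same length.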

Examples

  1. Find off-target status of a gRNA with a given Cas9
  • Find OTs of the gRNA with Cas9 across the reference genome

cas-offinder input.txt C casoffinder_ots.tab

  • Convert casoffinder_ots.tab to BED (casoffinder_ots.bed) as described under Prerequisites

  • Check the status of those OTs with genomic variants

cavar --outfile p1_ots.bed ATGTTGATGATAGGATGAT casoffinder_ots.bed p1.vcf.gz

Release

  • 1.0b1 First beta release.

Issues

If you find a bug or would like to request a new feature, please report it on the CaVar GitHub project page.

License

CaVar is distributed under the GNU General Public License v3 (GPLv3). You should have received a copy of the license with CaVar.
