This repository has been archived by the owner on Feb 22, 2024. It is now read-only.

zhangtaolab/CRISPRMatch

CRISPRMatch

CRISPRMatch is no longer actively maintained.

We have released a new CRISPR data analysis tool: CrisprStitch.

Brief introduction

An automatic calculation and visualization tool for high-throughput CRISPR genome-editing data analysis

I. Requirements

Anaconda
python3
bwa
samtools
picard
FLASH

Note: Use Anaconda to install all packages (bwa, samtools, picard, FLASH).
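
Before running the pipeline, it can help to confirm that the external tools listed above are reachable from the shell. The following is a minimal sketch and not part of CRISPRMatch. It assumes the Anaconda packages expose executables named `bwa`, `samtools`, `picard` and `flash`; those names, and the helper `find_missing_tools`, are assumptions made for illustration and may need adjusting on your system.

```python
import shutil

# Assumed executable names; adjust if your installation names them differently.
REQUIRED_TOOLS = ["bwa", "samtools", "picard", "flash"]


def find_missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

If a tool is reported missing even though it is installed, its location can instead be passed to CRISPRMatch explicitly with the `-b`, `-sm` and `-pi` options.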

II. Manual installation

CentOS Linux release 7.3.1611 (terminal)

  1. Install Anaconda
$ yum install wget git
$ mkdir /home/software
$ cd /home/software
$ wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
$ bash Anaconda3-5.0.1-Linux-x86_64.sh
  2. Install required packages
$ conda install bwa \
                samtools \
                picard \
                flash \
                matplotlib \
                pysam \
                pandas \
                argparse \
                numpy
  • Note: To ensure the tool works, please use Anaconda to install all packages (bwa, samtools, picard, FLASH ...)
  3. Download CRISPRMatch and test
$ cd /home/software
$ git clone https://github.com/zhangtaolab/CRISPRMatch.git
$ python3 /home/software/CRISPRMatch/CRISPRMatch.py -h
  
  usage: CRISPRMatch [-h] [--version] [-b BWA] [-sm SAMTOOLS] [-pi PICARD] -g
                    GENOME -i INPUT -gi GROUPINFO [-s SAVED] [-r RESULT]
                    [-t THREADS] [--docker DOCKER]

  CRISPRMatch is for location finding

  optional arguments:
    -h, --help            show this help message and exit
    --version             show program's version number and exit
    -b BWA, --bwa BWA     bwa path
    -sm SAMTOOLS, --samtools SAMTOOLS
                          samtools path
    -pi PICARD, --picard PICARD
                          picard path
    -g GENOME, --genome GENOME
                          fasta format genome file
    -i INPUT, --input INPUT
                          sample information input file
    -gi GROUPINFO, --groupinfo GROUPINFO
                          group information input file
    -s SAVED, --save SAVED
                          tmp saved folder
    -r RESULT, --result RESULT
                          result saved folder
    -t THREADS, --threads THREADS
                          threads number or how may cpu you wanna use

III. Start running

  1. Files for mutation calculation
  • File1: Genome-editing target sequences
    Fasta format example
  • File2: NGS samples information
    Note:
    For the CRISPR-Cas9 system, the 'Note' field must contain the 'gRNA' label.
    For the CRISPR-Cpf1 system, the 'Note' field must contain the 'crRNA' label.
    example:
    sample information
  • File3: NGS group information
    Note: At present, two repeats are supported.
    example:
    group information
  • Note: the information files File1, File2 and File3 are required!
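
Since File1 must be in FASTA format, a quick sanity check of that file before a run can catch formatting problems early. The sketch below is an illustration only and is not taken from CRISPRMatch; the function name `read_fasta` is hypothetical, and the rule of using the first whitespace-delimited token of each header as the sequence name is an assumption.

```python
def read_fasta(path):
    """Parse a FASTA file into a {name: sequence} dict (sequences upper-cased)."""
    records = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                parts = line[1:].split()
                if not parts:
                    raise ValueError("empty FASTA header line")
                # Assumption: the first token of the header is the sequence name.
                name = parts[0]
                if name in records:
                    raise ValueError(f"duplicate sequence name: {name}")
                records[name] = ""
            elif name is None:
                raise ValueError("sequence data found before the first '>' header")
            else:
                records[name] += line.upper()
    return records
```

Calling `read_fasta("sampledata/Samples_gene.fa")` on the bundled sample data should return one entry per target sequence, which makes it easy to compare the sequence names against those used in the sample information file.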

  2. Command line example:

(1) For single long reads

$ cd /home/software/CRISPRMatch/
$ python3 CRISPRMatch.py -g sampledata/Samples_gene.fa -i sampledata/sample_infor.txt -gi sampledata/group_info.txt -t 2
- Note: absolute paths are preferred when using your own data

(2) For paired-end reads

$ cd /home/software/CRISPRMatch/
$ python3 CRISPRMatch_paired.py -g sampledata2/Samples_gene.fa -i sampledata2/sample_infor.txt -gi sampledata2/group_info.txt -t 2
- Note: absolute paths are preferred when using your own data
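
Because absolute paths are preferred for your own data, one way to avoid mistakes is to build the command line programmatically. The sketch below is a hypothetical helper, not part of CRISPRMatch. It uses only the `-g`, `-i`, `-gi` and `-t` options shown in the examples above, and it assumes the Python interpreter running the helper is the Anaconda `python3` that has the required packages installed.

```python
import os
import sys


def build_command(script, genome, sample_info, group_info, threads=2):
    """Build a CRISPRMatch command line with every path made absolute."""
    return [
        sys.executable,  # assumption: the current interpreter is the Anaconda python3
        os.path.abspath(script),
        "-g", os.path.abspath(genome),
        "-i", os.path.abspath(sample_info),
        "-gi", os.path.abspath(group_info),
        "-t", str(threads),
    ]


if __name__ == "__main__":
    cmd = build_command(
        "CRISPRMatch.py",  # or "CRISPRMatch_paired.py" for paired-end reads
        "sampledata/Samples_gene.fa",
        "sampledata/sample_infor.txt",
        "sampledata/group_info.txt",
    )
    # Print the command for inspection instead of running it.
    print(" ".join(cmd))
```

The printed command can be pasted into the terminal, or the list can be passed to `subprocess.run(cmd, check=True)` to launch the analysis directly. Relative paths are resolved against the current working directory, so run the helper from the directory the paths refer to.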