kerenzhou062/rriScan

Overview

Here, we describe RNA-RNA Interaction Scan (rriScan), a general-purpose tool for analyzing RNA-RNA interactome sequencing data, such as MARIO, PARIS, PARIS2, and LIGR-seq data.

System requirements

The software package was tested on Linux systems with the following specifications:

  • RAM: 64GB
  • CPU: 16+ cores

Run time

rriScan is written in C and C++ and is efficient at analyzing RNA-RNA interactome sequencing data. In our tests, most tasks completed within 10 minutes.

Installation

  • Installing rriScan on a Linux server is straightforward. Use the following commands:
    # Assume your installation directory is /username/software
    cd /username/software
    git clone https://github.com/kerenzhou062/rriScan.git
    cd ./rriScan
    sh install.sh
    export PATH=$PATH:/username/software/rriScan/bin

Input

rriScan requires the following input files:

  • Genome sequence in FASTA format: Specify using the --fa argument. Chromosome names in this file must match those in the input BAM file.

  • FAI index file: Specify using the --fai argument. Generate this file using samtools:

    # Example: generate an FAI index for hg38.fa
    samtools faidx hg38.fa

  • BAM file: Specify using the --bam argument. This file should contain sequence alignment data generated by STAR software and stored as Aligned.sortedByCoord.out.bam.

  • Junction file: Specify using the --jun argument. This file contains junction information, which can be generated by STAR software with the --chimOutType Junctions parameter and stored as Chimeric.out.junction.

  • FASTQ or FASTA file: An optional file containing the sequencing reads. Specify using the --read argument.
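
Because chromosome names in the genome FASTA must match those in the BAM file, it can help to compare them before running rriScan. The following is a minimal sketch, not part of rriScan: the helper name missing_in_fai and the file names are hypothetical, and it assumes the BAM header has first been exported as text (for example with samtools view -H). The demo uses small synthetic files.

```shell
#!/bin/sh
# Print sequence names found in a SAM/BAM text header but absent from a FAI index.
# $1: FAI index (first tab-separated column is the sequence name)
# $2: header text, e.g. from:
#     samtools view -H Aligned.sortedByCoord.out.bam > header.sam
missing_in_fai() {
    awk -F '\t' '
        NR == FNR { known[$1] = 1; next }     # first file: collect FAI names
        $1 == "@SQ" {                         # second file: reference entries
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)
                    if (!(name in known)) print name
                }
        }
    ' "$1" "$2"
}

# Demo on synthetic data (toy names and lengths, not a real genome).
tmp=$(mktemp -d)
printf 'chr1\t100\t6\t60\t61\nchr2\t80\t114\t60\t61\n' > "$tmp/genome.fa.fai"
printf '@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:100\n@SQ\tSN:2\tLN:80\n' > "$tmp/header.sam"
missing_in_fai "$tmp/genome.fa.fai" "$tmp/header.sam"   # prints: 2
rm -rf "$tmp"
```

An empty result means every reference name in the BAM header is present in the FAI index.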

Output

The output file contains the following columns:

Column name     Description
lChrom          Chromosome name of left pair
lChromStart     Start coordinate of left pair (0-based)
lChromEnd       End coordinate of left pair
lName           Name of left pair
lScore          Alignment quality of the left paired RNA
lStrand         Strand of left pair
rChrom          Chromosome name of right pair
rChromStart     Start coordinate of right pair (0-based)
rChromEnd       End coordinate of right pair
rName           Name of right pair
rScore          Alignment quality of the right paired RNA
rStrand         Strand of right pair
lociNum         Number of loci
gapDist         Gap distance between the pairs if they are on the same chromosome
readSeq         Sequence of the read
chimericSeq     Full sequence of the chimera
chimericStruct  Predicted structure of the chimera
MFE             Minimum free energy
rriType         Type of RNA-RNA interaction
lAlignSeq       Aligned sequence of left pair
pairs           Base pairings
rAlignSeq       Aligned sequence of right pair
pairNum         Maximum number of continuous perfect pairings
alignScore      Alignment quality between the left and right paired RNAs
loReadNum       Read number of left pair
roReadNum       Read number of right pair
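
The column list above can be used to post-filter results by column name. The sketch below is an illustration under stated assumptions, not part of rriScan: it assumes the output is tab-delimited with the columns in the order listed, and it skips a header line if one is present. The helper names filter_by_mfe and make_row are hypothetical, and the demo rows are synthetic.

```shell
#!/bin/sh
# Column order as listed in the table above (assumed to match the file layout).
RRI_COLS="lChrom lChromStart lChromEnd lName lScore lStrand rChrom rChromStart rChromEnd rName rScore rStrand lociNum gapDist readSeq chimericSeq chimericStruct MFE rriType lAlignSeq pairs rAlignSeq pairNum alignScore loReadNum roReadNum"

# Keep interactions whose MFE is at or below a cutoff and print both intervals.
# $1: rriScan output file, $2: MFE cutoff (e.g. -10)
filter_by_mfe() {
    awk -F '\t' -v OFS='\t' -v cols="$RRI_COLS" -v cutoff="$2" '
        BEGIN {
            n = split(cols, names, " ")
            for (i = 1; i <= n; i++) idx[names[i]] = i   # name -> field number
        }
        /^#/ || $1 == "lChrom" { next }                  # skip a header, if any
        ($(idx["MFE"]) + 0) <= (cutoff + 0) {
            print $(idx["lChrom"]), $(idx["lChromStart"]), $(idx["lChromEnd"]),
                  $(idx["rChrom"]), $(idx["rChromStart"]), $(idx["rChromEnd"]),
                  $(idx["MFE"])
        }
    ' "$1"
}

# Build one synthetic 26-column row: $1 lChrom, $2 rChrom, $3 MFE.
make_row() {
    printf '%s\t100\t120\tL\t10\t+\t%s\t500\t520\tR\t10\t-\t1\t380\t.\t.\t.\t%s\t.\t.\t.\t.\t8\t12\t3\t3\n' "$1" "$2" "$3"
}

demo=$(mktemp)
{ make_row chr1 chr1 -12.5; make_row chr1 chr2 -6.1; } > "$demo"
filter_by_mfe "$demo" -10    # keeps only the row with MFE -12.5
rm -f "$demo"
```

Looking fields up by name rather than by number keeps the filter readable; if the real file layout differs, only RRI_COLS needs to change.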

Basic Usage

The available options for rriScan are as follows:

Usage:  rriScan [options] --fa <fasta file> --fai <fai file> --bam <mapped alignments> --jun <junctions>
File format for mapped alignments is BAM
[options]
-v/--verbose                   : verbose information
-V/--version                   : rriScan version
-h/--help                      : help informations
-S/--small                     : small genome
--fa                           : genome FASTA file. [required]
--fai                          : genome fai file, an index file for fasta file. [required]
--bam                          : alignment file, BAM format. [required]
--jun                          : junction file from STAR software, junction format. [required]
--read                         : read file[fastq or fasta]. [optional]
-o/--output <string>           : output file
-l/--min-seg-len <int>         : minimum length of segments in a chimera [default>=15]
-m/--min-read-number <int>     : minimum read number for chimera [default>=1]
-M/--max-mfe <double>          : maximum MFE in duplex[default<=-5.0]
-p/--min-pair <int>            : minimum pair number in duplex [default>=0]
-s/--min-score <int>           : minimum alignment score in duplex [default>=5]
-g/--min-gap <int>             : minimum gaps between two segments [default>=1]
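
As an illustration of how these options combine, here is a minimal sketch of one invocation. The file names, the output name, and the threshold values are placeholders chosen for this example, not recommendations from the authors. The wrapper function run_rriscan is hypothetical; it only prints the command when rriScan or the input BAM is not available.

```shell
#!/bin/sh
# Placeholder inputs; replace with your own files.
GENOME=hg38.fa
BAM=Aligned.sortedByCoord.out.bam
JUN=Chimeric.out.junction
OUT=rriScan_output.txt

run_rriscan() {
    # Required inputs plus a few optional filters (illustrative values).
    set -- --fa "$GENOME" --fai "$GENOME.fai" --bam "$BAM" --jun "$JUN" \
           --min-seg-len 15 --min-read-number 2 --min-pair 5 \
           --output "$OUT"

    if command -v rriScan >/dev/null 2>&1 && [ -f "$BAM" ]; then
        rriScan "$@"
    else
        # Dry run: show the command instead of executing it.
        echo "would run: rriScan $*"
    fi
}

run_rriscan
```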

Run rriScan on testing dataset

Please check this guide to learn how to run rriScan on a testing dataset.

Acknowledgements

Thanks to everyone who contributed to the public code and libraries (e.g., BamTools) used by rriScan.

Contact

  • Jian-Hua Yang yangjh7@mail.sysu.edu.cn, RNA Information Center, School of Life Sciences, Sun Yat-Sen University
  • Keren Zhou kzhou@stjude.org, Department of Pathology, St. Jude Children’s Research Hospital, Memphis, TN, USA
