cbbzhang/JCcirc

JCcirc

1. Introduction

JCcirc v1.0.0 (circRNA assembler through integrated junction contigs) is a computational tool that uses both back-splice junction (BSJ) and junction contig (JC) features to reconstruct full-length sequences of circular RNAs from RNA-seq datasets. JCcirc integrates junction reads and junction contigs for the assembly of all circRNAs. The BSJ feature is used to determine the boundaries of circRNAs accurately, while the JC feature acts as an extension of junction reads and improves the assembly of circRNAs with low expression levels.

Figure 6: Workflow of JCcirc

2. Prerequisites

Software Prerequisites:

JCcirc is implemented in Perl and runs on Linux.

A de novo transcript assembler (one of Trinity, SPAdes, or SOAPdenovo-Trans)

Aligner

Input files:

JCcirc works with six input files: a GTF annotation file, paired-end RNA-seq data (two FASTQ files), a contig file generated by a de novo assembler, a genome sequence file, and a circRNA junction list.

The contig file can be obtained from any of the following de novo transcript assemblers:

Trinity (trinity/inchworm.DS.fa)
SPAdes (spades/K31/transcripts.fasta)
SOAPdenovo-Trans (SOAP/soap.contig)

The circRNA list contains, for each circRNA, its location (chromosome, start, end), host gene, strand, and junction read IDs.
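
To make the expected layout concrete, here is a minimal Python sketch that parses one line of such a list. It assumes tab-separated columns in the order given above and comma-separated read IDs in the last column; this README does not specify the delimiter, so check both against the output of your circRNA prediction tool. `CircRecord` and `parse_circ_line` are hypothetical names and are not part of JCcirc.

```python
from typing import List, NamedTuple


class CircRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    host_gene: str
    strand: str
    read_ids: List[str]


def parse_circ_line(line: str) -> CircRecord:
    """Parse one line of a circRNA list (assumed: six tab-separated fields)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 tab-separated fields, got {len(fields)}")
    chrom, start, end, host_gene, strand, reads = fields
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")
    # Assumption: junction read IDs are comma-separated within the last column.
    read_ids = [r for r in reads.split(",") if r]
    return CircRecord(chrom, int(start), int(end), host_gene, strand, read_ids)
```

Running a check like this over the list before starting JCcirc can catch malformed lines early.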

3. Usage

Command:
    perl JCcirc.pl -C circ -G genome -F annotation -O out_dir -P 8 --read1 read_1.fq --read2 read_2.fq --contig contig.fa -D 0
Arguments:

    -C, --circ
          input circRNA file, which includes chromosome, start site, end site, host gene, strand, and junction reads ID (required).
    -O, --output
          directory of output (required).
    -G, --genome
          FASTA file of all reference sequences. Please make sure this file is
          the same one provided to the prediction tool (required).
    -F, --annotation
          gene annotation file in GTF format. Please make sure this file is
          the same one provided to the prediction tool.
    -P, --thread
          set number of threads for parallel running (required).
    --read1
          RNA-seq data, read 1 of the paired-end reads, in FASTQ format.
    --read2
          RNA-seq data, read 2 of the paired-end reads, in FASTQ format.
    --contig
          contig sequences (required).
    -D, --difference
          the difference in support numbers between adjacent fragments when generating circRNA isoforms; default is 0 (recommended values are 0, 1, or 2; a larger value is stricter).
    -H, --help
          show this help information.
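
When JCcirc is driven from a script, it can help to assemble the command as an argument list instead of a hand-typed string. The sketch below only uses the flags listed above. The function name `build_jccirc_command`, its keyword names, and the default script path are illustrative assumptions, not part of JCcirc.

```python
import shlex
from typing import List


def build_jccirc_command(circ: str, genome: str, annotation: str, out_dir: str,
                         read1: str, read2: str, contig: str,
                         threads: int = 8, difference: int = 0,
                         script: str = "JCcirc.pl") -> List[str]:
    """Return the JCcirc command as an argument list (hypothetical helper)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "perl", script,
        "-C", circ,
        "-G", genome,
        "-F", annotation,
        "-O", out_dir,
        "-P", str(threads),
        "--read1", read1,
        "--read2", read2,
        "--contig", contig,
        "-D", str(difference),
    ]


cmd = build_jccirc_command("circ", "genome", "annotation", "out_dir",
                           "read_1.fq", "read_2.fq", "contig.fa")
# Print the shell-quoted form for inspection before running anything.
print(shlex.join(cmd))
```

The list can then be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with unusual file names.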

4. Note

  • The RNA-seq data must be paired-end and must be the same files that were used for the de novo assembly.
  • The GTF annotation file must be the same one used by JCcirc and by its upstream software.
  • The -D/--difference parameter is best set to 0, 1, or 2. If introns in the genome are short, use a larger D value. For example, 0 suits human data and 2 suits plant data.

5. Output file

  • fragment_final.txt contains two tab-separated columns:

(1) circRNA location
(2) Location of the circRNA fragments on the genome

  • circ_full_seq.fa contains the assembled full-length circRNA sequences.
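
For downstream analysis, both output files can be loaded with a few lines of Python. The sketch below is not part of JCcirc. It treats the two columns of fragment_final.txt as opaque strings, because their exact coordinate syntax is not described here, and it assumes circ_full_seq.fa is standard FASTA. The function names `read_fragments` and `read_fasta` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def read_fragments(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (circRNA location, fragment locations) pairs, one per non-empty line."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the first tab only; a line without a tab raises ValueError.
        circ, fragments = line.split("\t", 1)
        pairs.append((circ, fragments))
    return pairs


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence}, joining sequence lines that wrap."""
    records: Dict[str, List[str]] = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}
```

Both functions accept any iterable of lines, so an open file handle can be passed directly.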

6. Reference

Zhang J, Zhang H, Ju Z, Peng Y, Pan Y, Xi W, Wei Y. JCcirc: circRNA full-length sequence assembly through integrated junction contigs. Brief Bioinform. 2023 Sep 22;24(6):bbad363.

7. Contact

Please contact Jingjing Zhang (zhangjj@siat.ac.cn) for questions and comments.
