CAFU: A Galaxy framework for exploring unmapped RNA-Seq data

Docker Repository on Quay


  • CAFU is a Galaxy-based bioinformatics framework for the comprehensive assembly and functional annotation of unmapped RNA-seq data from single- and mixed-species samples. It integrates many existing NGS analysis tools with our own programs and provides an easy-to-use interface for managing, manipulating and, most importantly, exploring large-scale unmapped reads.

  • Besides the common steps of read cleaning, read mapping, unmapped read generation and novel transcript assembly, CAFU optionally offers multi-level evidence analysis of the assembled transcripts, analysis of their sequence and expression characteristics, and functional exploration through gene co-expression analysis and genome-wide association analysis.

  • Taking advantage of machine learning (ML) technologies, CAFU also effectively addresses the challenge of classifying species-specific transcripts assembled from the unmapped reads of mixed-species samples.

  • The CAFU project is hosted on GitHub and can also be accessed through its web server. The CAFU Docker image is available from its Docker repository on Quay.


Overview of functional modules in CAFU

How to use CAFU

News and updates

CAFU updated on Jan 1, 2019

  • In the function Assemble Unmapped Reads, a parameter "Memory" was added for setting the maximum memory to be used by Trinity (1G by default).
  • To run the function Species Assignment of Transcripts, users can now use pre-trained or self-trained models. Currently, one pre-trained model is provided; it was trained on 20,502 and 137,052 mRNAs annotated in the reference genomes of the stripe rust pathogen Puccinia striiformis f. sp. tritici (PST-78 v1) and Chinese Spring wheat (IWGSC RefSeq v1.0), respectively.
  • The user tutorial was updated to highlight the importance of the CPUs, Memory and Swap settings for running the CAFU Docker image.
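
As a rough illustration of what the "Memory" parameter controls, the sketch below builds a Trinity command line with an explicit memory cap. It is only an assumption about how such a call could look, not CAFU's actual wrapper: the helper name `build_trinity_command`, the file names and the single-end FASTQ input are hypothetical, while `--seqType`, `--single`, `--max_memory`, `--CPU` and `--output` are standard Trinity options.

```python
import shlex


def build_trinity_command(reads_path, output_dir, max_memory="1G", cpus=1):
    """Return a Trinity argument list for single-end FASTQ reads.

    Hypothetical helper for illustration; CAFU's own wrapper may differ.
    max_memory defaults to "1G", mirroring the default of the "Memory" parameter.
    """
    return [
        "Trinity",
        "--seqType", "fq",           # input reads are in FASTQ format
        "--single", reads_path,      # single-end reads (assumed for this sketch)
        "--max_memory", max_memory,  # upper bound on memory used by Trinity
        "--CPU", str(cpus),          # number of CPU threads
        "--output", output_dir,      # Trinity expects "trinity" in this name
    ]


# Example: raise the memory cap to 4G for a larger set of unmapped reads.
command = build_trinity_command(
    "unmapped_reads.fq", "trinity_out", max_memory="4G", cpus=2
)
print(shlex.join(command))
```

The list can be passed to `subprocess.run(command, check=True)` on a machine where Trinity is installed. The memory cap should stay below the memory assigned to the Docker container.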

CAFU updated on Nov 30, 2018

  • A function Remove Contamination was added to remove potential contaminating sequences using DeconSeq (Schmieder et al., 2011).
  • A function Remove Batch Effect was added to remove batch effects using the R package sva (Leek et al., 2012).

CAFU released on Oct 13, 2018

  • The CAFU source code, web server and Docker image were released for the first time.

How to access help

  • For any bugs or issues, please leave a message at GitHub Issues. We will try our best to deal with all issues as soon as possible.
  • For any suggestions or comments, please email Siyuan Chen or Jingjing Zhai.

How to cite this work

Siyuan Chen, Chengzhi Ren, Jingjing Zhai, Jiantao Yu, Xuyang Zhao, Zelong Li, Ting Zhang, Wenlong Ma, Zhaoxue Han, Chuang Ma, CAFU: A Galaxy framework for exploring unmapped RNA-Seq data. Briefings in Bioinformatics, doi:10.1093/bib/bbz018.
