
INTEGRATE-Neo is a gene fusion neoantigen discovery tool that uses next-generation sequencing data. It is written in C++ and Python.

  • Python
  • Perl
  • awk
  • GCC

If any of the above are not available on your system, please install them. You may also need to install some prerequisite tools:

HLAminer and NetMHC are also included in the vendor directory here.

To compile the C++ part of this pipeline, you may need to install CMake.


Download INTEGRATE-Neo at

Run the installation script:

$ cd INTEGRATE-Neo-V-1.2.0
$ chmod +x
$ ./ -o /opt/bin/

Note that you can install the software to any directory; it does not have to be "/opt/bin/".

Now you have installed:

  • integrate-neo

together with the modules of integrate-neo that can be used as standalone tools:

  • fusionBedpeAnnotator
  • fusionBedpeSubsetter
  • runHLAminer
  • HLAminerToTsv
  • runAddNetMHC4Result
  • runNetMHC4WithSMCRNABedpe

A setup.ini file and a rule.txt file are now also in your destination directory. If you do not want them there, move them wherever you like, but remember to pass the --setup-file and --rule-file options when you run integrate-neo so that it can find them.


Remember to edit the setup.ini file before you run the pipeline for the first time. The one in the installation package uses example paths like "/SOME/PATH/...".
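A quick way to confirm that no example path was left behind is to scan setup.ini for the placeholder prefix. The sketch below is only an illustration: it treats the file as plain text and makes no assumption about its section or key names, and the helper name `find_placeholder_lines` is hypothetical, not part of INTEGRATE-Neo.

```python
from pathlib import Path

# Example prefix used by the shipped setup.ini, as described above.
PLACEHOLDER = "/SOME/PATH"


def find_placeholder_lines(setup_path, placeholder=PLACEHOLDER):
    """Return (line_number, text) pairs for lines that still contain the placeholder.

    Hypothetical helper: the file is scanned as plain text, so it works
    regardless of how setup.ini is organized.
    """
    hits = []
    lines = Path(setup_path).read_text().splitlines()
    for number, line in enumerate(lines, start=1):
        if placeholder in line:
            hits.append((number, line.strip()))
    return hits
```

Calling `find_placeholder_lines("setup.ini")` should return an empty list once every example path has been replaced with a real one.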

The HLAminer reference HLA alleles file, i.e. HLA_ABC_CDS.fasta, must be indexed with bwa before the first run.
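The indexing step can be made repeatable with a small wrapper that runs `bwa index` only when the index files are absent. This is a sketch under two assumptions: that `bwa` is on your PATH, and that your bwa version writes the usual `.amb`, `.ann`, `.bwt`, `.pac`, and `.sa` files next to the FASTA. The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Suffixes of the files that `bwa index` writes next to the FASTA file.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(fasta_path):
    """List the bwa index files that do not exist yet for fasta_path."""
    fasta = Path(fasta_path)
    return [
        fasta.name + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not Path(str(fasta) + suffix).exists()
    ]


def ensure_bwa_index(fasta_path):
    """Run `bwa index` only if at least one index file is missing.

    Returns True when indexing was run and False when the index was
    already complete.
    """
    if not missing_bwa_index_files(fasta_path):
        return False
    if shutil.which("bwa") is None:
        raise RuntimeError("bwa was not found on PATH")
    subprocess.run(["bwa", "index", str(fasta_path)], check=True)
    return True
```

For example, `ensure_bwa_index("HLA_ABC_CDS.fasta")` can be called before every pipeline run; it does nothing once the index exists.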


If you type the following (or python ./ --help):

$ ./

you will see the 14 parameters and their explanations.

The following are the required options:


The --fastq[1/2] and --reference options are self-explanatory: they take the sequencing reads in FASTQ format and the human reference genome in FASTA format.

The --fusion-bedpe option requires a BEDPE file of gene fusions. The file follows the standardized format provided by The ICGC-TCGA DREAM Somatic Mutation Calling - RNA Challenge (SMC-RNA).
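To sanity-check a fusion file before running the pipeline, the records can be parsed with a few lines of Python. The sketch below assumes only the ten standard BEDPE columns as documented by bedtools (chrom1, start1, end1, chrom2, start2, end2, name, score, strand1, strand2); it does not try to name any SMC-RNA specific columns and simply keeps everything after column 10 as raw strings. `BedpeRecord` and `parse_bedpe_line` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class BedpeRecord(NamedTuple):
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    name: str
    score: str            # kept as text because "." is sometimes used
    strand1: str
    strand2: str
    extra: Tuple[str, ...]  # any columns after the first ten, unparsed


def parse_bedpe_line(line):
    """Parse one tab-separated BEDPE line into a BedpeRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError(f"expected at least 10 columns, got {len(fields)}")
    return BedpeRecord(
        fields[0], int(fields[1]), int(fields[2]),
        fields[3], int(fields[4]), int(fields[5]),
        fields[6], fields[7], fields[8], fields[9],
        tuple(fields[10:]),
    )
```

A line that fails to parse (too few columns, non-integer coordinates) raises `ValueError`, which is usually the quickest way to spot a malformed input file.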

The --gene-model option requires a gene annotation file in genePred format.

Download the gtf file from Ensembl:

GRCh37, e.g., v75:

GRCh38, e.g., v86:

and run the following command for v75:

$ gunzip Homo_sapiens.GRCh37.75.gtf.gz
$ ./gtfToGenePred -genePredExt -geneNameAsName2 Homo_sapiens.GRCh37.75.gtf Homo_sapiens.GRCh37.75.genePred

for v86:

$ gunzip Homo_sapiens.GRCh38.86.gtf.gz
$ ./gtfToGenePred -genePredExt -geneNameAsName2 Homo_sapiens.GRCh38.86.gtf Homo_sapiens.GRCh38.86.genePred

FASTA files can also be downloaded from Ensembl:




The output is in BEDPE format. The first 11 columns follow the SMC-RNA format. Columns 12-19 are:

  • Epitope sequence
  • Epitope Affinity (nanoMolar)
  • HLA allele
  • HLA category
  • HLA score
  • HLA e-value
  • HLA confidence


The chromosome names used in the reference genome, the gene models, and the fusions must be consistent with each other (for example, "1" everywhere rather than "1" in one file and "chr1" in another).
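This requirement can be checked up front by comparing the names found in the three inputs. The following is a minimal sketch, not part of INTEGRATE-Neo: it assumes the gene model is a genePred file as written by gtfToGenePred (chromosome in column 2), that the fusion file is tab-separated BEDPE (chromosomes in columns 1 and 4), and that any header lines start with "#". All function names are hypothetical.

```python
def fasta_chromosomes(path):
    """Collect sequence names from FASTA headers (text up to the first whitespace)."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                if parts:
                    names.add(parts[0])
    return names


def column_values(path, columns):
    """Collect the values found in the given 0-based columns of a tab-separated file."""
    values = set()
    with open(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and assumed "#" header lines
            fields = line.rstrip("\n").split("\t")
            for index in columns:
                if index < len(fields):
                    values.add(fields[index])
    return values


def names_missing_from_reference(fasta_path, genepred_path, bedpe_path):
    """Report chromosome names that appear in the other inputs but not in the FASTA."""
    reference = fasta_chromosomes(fasta_path)
    gene_model = column_values(genepred_path, [1])  # genePred column 2: chrom
    fusions = column_values(bedpe_path, [0, 3])     # BEDPE columns 1 and 4
    return {
        "gene_model": gene_model - reference,
        "fusions": fusions - reference,
    }
```

If both sets in the returned dictionary are empty, every chromosome name in the gene model and the fusion file also exists in the reference genome. Note that reading a whole-genome FASTA this way scans every line, so it may take a little while.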


Examples are provided for you to test the code.


Release notes:

12-23-2016: INTEGRATE-Neo v 1.2.0

Updated BedpeAnnotator to v 0.2.0, which adds four new columns: transcript IDs, the lengths of nucleotides in the coding regions of the 5p transcripts, whether the peptides are in-frame, and whether the fusion transcript follows canonical dinucleotides.

01-17-2017: INTEGRATE-Neo v 1.2.1

Updated BedpeAnnotator to v 0.2.1, which includes a bug fix.

