LRCAGE (long-read CAGE)

This repository contains custom scripts for calling peaks, retaining a list of confident transcripts, and building a proteome database for immunopeptidome analysis, all using LRCAGE data as input.

Installation
- Run scripts using Docker images
1. Download the scripts from GitHub
```
cd <your installation directory>
git clone https://github.com/juheon/LRCAGE.git
```
1. Download the Docker image from Docker Hub
```
docker pull jhmaeng/lrcage
docker images   # check that jhmaeng/lrcage is listed
```
1. Modify and run the `run_with_docker.sh` script in your installation directory
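
Before running the wrapper script, it can help to confirm that the image was pulled. The snippet below is a minimal sketch of such a check; the helper name `image_present` is hypothetical and not part of this repository. It relies only on `docker image inspect`, which exits with a non-zero status when the image is not available locally.

```shell
# image_present is a hypothetical helper, not part of the LRCAGE scripts.
# It succeeds only if docker is installed and the named image exists locally.
image_present() {
  command -v docker >/dev/null 2>&1 || return 1
  docker image inspect "$1" >/dev/null 2>&1
}

if image_present jhmaeng/lrcage; then
  echo "jhmaeng/lrcage is available locally"
else
  echo "jhmaeng/lrcage not found; run: docker pull jhmaeng/lrcage"
fi
```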

List of custom scripts

callpeak: a script to call peaks from LRCAGE, LRhex, and nanoCAGE data

usage: LRCAGE callpeak [-h] --inputlist INPUTLIST --peak PEAK
                       (--tpm TPM | --readcount READCOUNT) [--gcap GCAP]
                       [--gcap_mincount GCAP_MINCOUNT]
                       [--half_peak_width HALF_PEAK_WIDTH] [--thread THREAD]

optional arguments:
  -h, --help            show this help message and exit
  --inputlist INPUTLIST
                        list of input bam files
  --peak PEAK           output peak file name
  --tpm TPM             minimum TPM per peak
  --readcount READCOUNT
                        minimum read count per peak
  --gcap GCAP           minimum G-cap ratio
  --gcap_mincount GCAP_MINCOUNT
                        minimum number of soft-clipped G reads
  --half_peak_width HALF_PEAK_WIDTH
                        half peak size
  --thread THREAD       number of threads
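
The following sketch shows one way to assemble a `callpeak` invocation from the options listed above. The BAM paths, the output file name, and the threshold value are placeholders, and the one-path-per-line layout of the input list is an assumption (check `inputlist.txt` in the repository for the actual format). The command is printed rather than executed so it can be reviewed first.

```shell
# Hypothetical BAM paths and output name; replace with your own.
workdir="$(mktemp -d)"
inputlist="$workdir/inputlist.txt"
peak="$workdir/peaks.txt"

# Assumption: one BAM path per line (see inputlist.txt in the repository).
printf '%s\n' /data/sample1.bam /data/sample2.bam > "$inputlist"

# Exactly one of --tpm or --readcount must be given; pick one here.
threshold_mode="tpm"     # or "readcount"
threshold_value="1"      # illustrative value only, not a recommended default

callpeak_cmd="LRCAGE callpeak --inputlist $inputlist --peak $peak --$threshold_mode $threshold_value --thread 4"

# Print the command for review rather than running it.
echo "$callpeak_cmd"
```

Because `--tpm` and `--readcount` are mutually exclusive, keeping the choice in a single `threshold_mode` variable avoids passing both by accident.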

filtertx: a script to retain a list of confident transcripts using transcripts identified from LRCAGE data and a list of peaks.

usage: LRCAGE filtertx [-h] --gtf GTF --talon TALON [--libinfo LIBINFO]
                       [--mincount MINCOUNT] [--peak PEAK]
                       [--peakratio PEAKRATIO] --oprefix OPREFIX

optional arguments:
  -h, --help            show this help message and exit
  --gtf GTF             input gtf file
  --talon TALON         input TALON.tsv file
  --libinfo LIBINFO     library size information
  --mincount MINCOUNT   minimum count to define confident transcripts
  --peak PEAK           peaks used to retain transcripts with complete 5' ends
  --peakratio PEAKRATIO
                        minimum fraction of reads for peak-transcript pair per
                        transcript
  --oprefix OPREFIX     prefix for output files
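
As a rough illustration, a `filtertx` call that combines the required inputs with the optional peak filter might be put together as follows. All file names and both numeric thresholds are hypothetical placeholders, not defaults of the tool; the sketch also assumes that the file passed to `--peak` is a peak list such as the one written by `callpeak`.

```shell
# Hypothetical input and output names; replace with your own files.
gtf="talon.gtf"
talon="talon_abundance.tsv"
peak="peaks.txt"          # assumed to be a peak list, e.g. from callpeak
oprefix="confident_tx"

# Illustrative thresholds only; choose values suited to your data.
mincount="5"
peakratio="0.5"

filtertx_cmd="LRCAGE filtertx --gtf $gtf --talon $talon --peak $peak --peakratio $peakratio --mincount $mincount --oprefix $oprefix"

# Print the command for review rather than running it.
echo "$filtertx_cmd"
```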

buildprot: a script to create a proteome database using newly characterized transcripts as input.

usage: LRCAGE buildprot [-h] --gtf GTF --ref REF [--txinfo TXINFO]
                        [--thread THREAD] --oproteome OPROTEOME --refproteome
                        REFPROTEOME --refgtf REFGTF

optional arguments:
  -h, --help            show this help message and exit
  --gtf GTF             input gtf file
  --ref REF             reference genome fasta
  --txinfo TXINFO       transcript information
  --thread THREAD       number of threads
  --oproteome OPROTEOME
                        output proteome
  --refproteome REFPROTEOME
                        reference proteome
  --refgtf REFGTF       reference gtf
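
A `buildprot` call needs five required options: `--gtf`, `--ref`, `--oproteome`, `--refproteome`, and `--refgtf`. The sketch below assembles such a call with placeholder file names (all hypothetical) and leaves out the optional `--txinfo`, whose file format is not described here.

```shell
# Hypothetical file names; replace with your own inputs and output.
gtf="confident_tx.gtf"                 # transcripts to translate
ref="genome.fa"                        # reference genome FASTA
refgtf="reference.gtf"                 # reference annotation
refproteome="reference_proteome.fa"    # reference proteome
oproteome="custom_proteome.fa"         # output proteome database

buildprot_cmd="LRCAGE buildprot --gtf $gtf --ref $ref --refgtf $refgtf --refproteome $refproteome --oproteome $oproteome --thread 4"

# Print the command for review rather than running it.
echo "$buildprot_cmd"
```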

WashU Epigenome Browser links
