TAC-seq data analysis

This repository contains TAC-seq data analysis software.

Requirements

Linux-based OS
FASTX-Toolkit
Git

Setup

Use the following steps to set up the TAC-seq data analysis software in a terminal:

Install FASTX-Toolkit
Install Git
Download the analysis software using Git: `git clone https://github.com/hindrek/TAC-seq-data-analysis`
Navigate to the analysis location: `cd TAC-seq-data-analysis`
Make tacseq executable: `chmod +x tacseq`

Usage

`tacseq [options] <command>`

Analyze TAC-seq data.

options:

-h display help and exit

commands:

prep prepare samples (FASTQ files) for counting
count count reads and molecules per sample and target

`tacseq prep [options]`

Prepare samples (FASTQ files) for counting.

mandatory:

-i input file: gzip compressed/uncompressed FASTQ file or '-' as standard input (stdin)
-t target file: format is based on the FASTX Barcode Splitter barcode file format (see Target file format)
-o output directory

optional:

-h display help and exit
-m mismatches: number of allowed mismatches per target sequence (default: 5)
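
As a rough illustration of what the `-m` budget means, the sketch below counts position-by-position differences between two sequences of equal length. It assumes mismatches are substitutions only; `count_mismatches` is a hypothetical helper, not part of `tacseq`, and how `tacseq prep` matches reads internally is not shown here.

```shell
# Hypothetical helper (not part of tacseq): count substitutions between
# two sequences of equal length. Exits with status 2 if the lengths differ.
count_mismatches() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (length(a) != length(b)) exit 2
        n = 0
        for (i = 1; i <= length(a); i++)
            if (substr(a, i, 1) != substr(b, i, 1)) n++
        print n
    }'
}

# A read that differs from the target at two positions (3 and 8)
count_mismatches ACGTACGT ACCTACGA
```

Under this assumption, a read would be accepted for a target when the printed count is no larger than the value given to `-m`.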

`tacseq count [options]`

Count reads and molecules per sample and target.

mandatory:

-i input directory: tacseq prep output directory

optional:

-h display help and exit
-u UMI threshold (default: 2)
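
A minimal sketch of how a UMI threshold might separate read counts from molecule counts. It assumes that a UMI counts as one molecule only when it is supported by at least the threshold number of reads, and that the read total covers all reads. `count_umis` and its one-UMI-per-line input are illustrative and may not match what `tacseq count` does internally.

```shell
# Hypothetical helper (not part of tacseq): read one UMI per line on stdin
# and print "reads<TAB>molecules" for a given threshold (default: 2).
count_umis() {
    sort | uniq -c | awk -v t="${1:-2}" '
        { reads += $1 }              # every read is counted
        $1 >= t { molecules++ }      # a UMI seen at least t times is one molecule
        END { printf "%d\t%d\n", reads, molecules }
    '
}

# Six reads, three distinct UMIs; CCCC is seen once and falls below the threshold
printf 'AAAA\nAAAA\nCCCC\nGGGG\nGGGG\nGGGG\n' | count_umis 2
```

With a threshold of 1 the same input would yield three molecules, since every distinct UMI would then be kept.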

Target file format

The target file is a text file that lists the targets, one per line. Each line contains a target ID (must be alphanumeric) followed by the target sequence (only the characters A, C, G and T are allowed). The target ID and the sequence are separated by a TAB character.

Target file example:

TARGET1 TAGGATAGGTGGATTCGGGAACTCCCCGATAGTTTTGTCACATCGACATACTAA
TARGET2 CCAAAGCTTCAACGGACATAGTGTACATACCTACCGTGTTTCCCAGCACCTTCC
TARGET3 CTGCTGTTGCCGCCTGGGGTTTACGCGTGTTGGAGATTGAGTAGCCTCCTCGGC
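
The rules above can be checked before running `tacseq prep`. The following is a sketch only: `validate_targets` is a hypothetical helper, not part of `tacseq`, and it tests just the three rules stated in this section (two TAB-separated fields, alphanumeric ID, sequence of A, C, G and T).

```shell
# Hypothetical helper (not part of tacseq): check a target file against the
# rules above. Prints one message per problem; exit status is 1 if any is found.
validate_targets() {
    awk -F '\t' '
        NF != 2                 { printf "line %d: expected ID<TAB>sequence\n", NR; bad = 1; next }
        $1 !~ /^[A-Za-z0-9]+$/  { printf "line %d: target ID is not alphanumeric\n", NR; bad = 1 }
        $2 !~ /^[ACGT]+$/       { printf "line %d: sequence contains characters other than A, C, G, T\n", NR; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Write a two-line demo file with printf so the separator is a real TAB
demo=$(mktemp)
printf 'TARGET1\tTAGGATAGGTGG\nTARGET2\tCCAAAGCTTCAA\n' > "$demo"
validate_targets "$demo" && echo "target file looks valid"
rm -f "$demo"
```

A common failure this catches is a target file saved with spaces instead of a TAB between the ID and the sequence.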

Output

tacseq prep outputs a directory for each sample with:

- 3 sub-directories with files for each target:
  - loci
  - umis
  - merged
- 2 intermediate files:
  - trimmed.fasta
  - umi_joined.fasta

tacseq count outputs read and molecule counts per target for each sample.

Run example

Step 1 - prepare samples: `./tacseq prep -i example/samples/sample1/sample1.fastq.gz -t example/targets.txt -o example/output/sample1/ -m 5`
Step 2 - count molecules and write results to counts.tsv file: `./tacseq count -i example/output/sample1/ -u 2 > counts.tsv`
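
For more than one sample, the prepare step could be wrapped in a loop. The sketch below only prints the `prep` commands it would run, so they can be inspected first. It assumes every sample follows the `example/samples/<name>/<name>.fastq.gz` layout of the example; `plan_prep` is a hypothetical helper, not part of `tacseq`.

```shell
# Hypothetical helper (not part of tacseq): print one "tacseq prep" command
# per sample found under a samples directory. Nothing is executed.
plan_prep() {
    samples_dir="$1"; targets="$2"; out_dir="$3"
    for fastq in "$samples_dir"/*/*.fastq.gz; do
        [ -e "$fastq" ] || continue          # skip the unmatched glob pattern
        sample=$(basename "$fastq" .fastq.gz)
        printf './tacseq prep -i %s -t %s -o %s/%s/ -m 5\n' \
            "$fastq" "$targets" "$out_dir" "$sample"
    done
}

plan_prep example/samples example/targets.txt example/output
```

Each printed line writes to its own output directory, which can then be passed to `tacseq count -i` as in step 2.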
