Sota-Nakashima/AESPA

AESPA

Overview

AESPA is a wrapper tool for SSERAFIM.
AESPA makes SSERAFIM support multiple species.

                    SSERAFIM only    AESPA & SSERAFIM
single species      ○                ○
multiple species    ×                ○

Description

This tool simplifies and automates the gene expression analysis workflow. It estimates the expression level of each gene from an SRR list downloaded from SRA, a reference genome, and annotation data. SSERAFIM alone can analyze only one species at a time, but combined with AESPA it can analyze multiple species simultaneously.

Demo

  1. Prepare SRR-List, Reference Genome, and Annotation Data

    You can use files from any site as long as the file formats match. However, please make sure that the chromosome information matches between the Reference Genome and the Annotation Data.

  2. Run AESPA in build mode

    aespa build -t ~/SraRunTable.txt
  3. Check the result, then prepare the reference genome path file and the annotation path file as instructed by the console output.

    Output example:

    PAIR-END
    
    *organism*
    Homo_sapiens
    
    Please prepare each absolute path list (.txt) of these reference genomes (.fasta) and anotations (.gtf).
    Use -g and -a option.
    
    SINGLE-END
    
    *organism*
    Gallus_gallus
    Gorilla_gorilla
    Homo_sapiens
    Macaca_mulatta
    Monodelphis_domestica
    Mus_musculus
    Ornithorhynchus_anatinus
    Xenopus_tropicalis
    
    Please prepare each absolute path list (.txt) of these reference genomes (.fasta) and anotations (.gtf).
    Use -G and -A option.
    
  4. Run AESPA in run mode

    aespa run -g ~/reference_single_path.txt -G ~/reference_pair_path.txt -a ~/annotation_single_path.txt -A ~/annotation_pair_path.txt -@ 20 -L

    This step is a little more complicated. If you are unsure, see the example file.
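Step 1 asks that the chromosome information match between the reference genome and the annotation data. One way to check this before running AESPA is sketched below. It is not part of AESPA: the function names are hypothetical, and it assumes uncompressed FASTA and GTF files in which the sequence name is the first word of each FASTA header and the first tab-separated column of each GTF record.

```python
def fasta_sequence_names(fasta_path):
    """Return the set of sequence names found in FASTA header lines."""
    names = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # The name is the first whitespace-delimited token after '>'.
                tokens = line[1:].split()
                if tokens:
                    names.add(tokens[0])
    return names


def gtf_sequence_names(gtf_path):
    """Return the set of names in column 1 of a GTF file, skipping comments."""
    names = set()
    with open(gtf_path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            names.add(line.split("\t", 1)[0])
    return names


def unmatched_sequence_names(fasta_path, gtf_path):
    """Names used by the annotation that are missing from the reference genome."""
    missing = gtf_sequence_names(gtf_path) - fasta_sequence_names(fasta_path)
    return sorted(missing)
```

An empty list from `unmatched_sequence_names` means every sequence name in the annotation also exists in the genome. A non-empty list (for example, names with a `chr` prefix on one side only) points at the records to fix.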

Requirement

AESPA works with conda, using the conda-forge and bioconda channels.

AESPA depends on SSERAFIM. Please install SSERAFIM together with AESPA.

Usage

build mode

aespa build [OPTION] [-t SRR_TABLE_PATH] [-o OUTPUT_DIR]
Mandatory arguments
-t SRR_TABLE_PATH (.txt) 

Default arguments
-o OUTPUT_DIR       Set output directory (default: ./AESPA)
-c CONDA_INIT_PATH  If SSERAFIM prints the error "you have to check ...", set this path again:
                    "~/{YOUR CONDA PACKAGE}/etc/profile.d/conda.sh"
-h HELP             Show help                 
-V VERSION          Show version

run mode

<Both pair-end and single-end>
aespa run [OPTION] [-o OUTPUT_DIR] [-@ PARALLEL]
[-g SINGLE-END_REFERENCE_GENOME_PATH_FILE] [-G PAIR-END_REFERENCE_GENOME_PATH_FILE]
[-a SINGLE-END_ANNOTATION_PATH_FILE] [-A PAIR-END_ANNOTATION_PATH_FILE]

<Only pair-end>
aespa run [OPTION] [-o OUTPUT_DIR] [-@ PARALLEL]
[-G PAIR-END_REFERENCE_GENOME_PATH_FILE] [-A PAIR-END_ANNOTATION_PATH_FILE]

<Only single-end>
aespa run [OPTION] [-o OUTPUT_DIR] [-@ PARALLEL]
[-g SINGLE-END_REFERENCE_GENOME_PATH_FILE] [-a SINGLE-END_ANNOTATION_PATH_FILE]
Mandatory arguments
-g SINGLE-END_REFERENCE_GENOME_PATH_FILE (.txt)
-a SINGLE-END_ANNOTATION_PATH_FILE (.txt)
-G PAIR-END_REFERENCE_GENOME_PATH_FILE (.txt)
-A PAIR-END_ANNOTATION_PATH_FILE (.txt)
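
The path files above are plain-text lists of absolute paths. The helper below is a minimal sketch for writing one. It is not part of AESPA, `write_path_list` is a hypothetical name, and it assumes one path per line. It keeps the order in which the files are passed; whether AESPA requires a particular order (for example, that of the organism list printed in build mode) is not stated here, so please check the example file.

```python
from pathlib import Path


def write_path_list(files, list_path):
    """Write one absolute path per line (assumed format) and return the lines."""
    # expanduser() handles "~", resolve() makes the path absolute.
    lines = [str(Path(f).expanduser().resolve()) for f in files]
    Path(list_path).write_text("\n".join(lines) + "\n")
    return lines
```

For example, `write_path_list(["genomes/Homo_sapiens.fasta"], "reference_pair_path.txt")` writes a one-line list; the file names here are placeholders.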

Default arguments
-o OUTPUT_DIR       Set output directory (default: ./AESPA)
-S SSERAFIM_PATH    Path to SSERAFIM
-c CONDA_INIT_PATH  If the error "you have to check ..." is printed, set this path again:
                    "~/{YOUR CONDA PACKAGE}/etc/profile.d/conda.sh"
-@ PARALLEL (int)   Number of CPU cores to use (default: 1)
-L LIGHT_MODE       Don't create law_data.tar.gz
-h HELP             Show help                 
-V VERSION          Show version
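
The three invocation forms of run mode differ only in which pairs of path files are passed. The sketch below assembles the argument list from the options documented above (`-g`/`-a`, `-G`/`-A`, `-o`, `-@`, `-L`). The function `build_run_command` and its parameters are hypothetical and only illustrate how the flags combine; it does not run AESPA itself.

```python
def build_run_command(single=None, pair=None, output_dir=None,
                      threads=None, light=False):
    """Build an `aespa run` argument list.

    `single` and `pair` are (genome_path_file, annotation_path_file)
    tuples, or None when that read layout is not used.
    """
    if single is None and pair is None:
        raise ValueError("give path files for single-end, pair-end, or both")
    cmd = ["aespa", "run"]
    if single is not None:
        # Single-end data uses the lowercase options.
        cmd += ["-g", single[0], "-a", single[1]]
    if pair is not None:
        # Pair-end data uses the uppercase options.
        cmd += ["-G", pair[0], "-A", pair[1]]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if threads is not None:
        cmd += ["-@", str(threads)]
    if light:
        cmd.append("-L")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` in an environment where `aespa` is on the `PATH`.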

Output

OUTPUT_DIR
│
├─── dataframe.csv
├─── pair
│       ├─── organism.txt
│       └─── single_SRR_List
│               ├─── organism1
│               ├─── organism2
│               │       :
│       
└─── single
        ├─── organism.txt
        └─── single_SRR_List
                ├─── organism1
                ├─── organism2
                │       :

Install

  • Create a virtual environment with conda.
    AESPA can run in the same environment as SSERAFIM. Please see SSERAFIM.
  • Use docker
    Please see this page.

Benchmark

See SSERAFIM.

Acknowledgements

Evolutionary Genetics Lab, Kyushu Univ.

License

GPL v3

Author

Sota Nakashima