SEDA (SEquence DAtaset builder) is an open-source application for processing FASTA files containing DNA and protein sequences. Please visit the official web page of the project for downloads, a complete online manual, and support.
Main features
Among other functions, SEDA allows you to:
- Filter sequences based on different criteria (including text patterns).
- Translate nucleic acid sequences into amino acid sequences.
- Edit sequence headers in different ways.
- Remove duplicated sequences.
- Remove isoforms.
- Sort, merge, split, or reformat FASTA files.
- Use BLAST to perform different types of queries.
- Use Clustal Omega to perform multiple sequence alignments.
- Perform gene annotation using different tools: Splign/Compart, ProSplign/ProCompart, or Augustus (as implemented in SAPP).
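To make a couple of these operations concrete, the following is a minimal sketch in plain Python of what filtering by a text pattern and removing duplicated sequences mean for a FASTA file. It is not SEDA's implementation and does not use SEDA's API; the function names (parse_fasta, filter_by_header, remove_duplicates) and the example data are hypothetical, and treating sequences as duplicates when they match case-insensitively is an assumption made for illustration.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# A record is (header without the leading '>', sequence joined into one string).
Record = Tuple[str, str]


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_header(records: Iterable[Record], pattern: str) -> Iterator[Record]:
    """Keep only the records whose header matches the regular expression."""
    regex = re.compile(pattern)
    return (rec for rec in records if regex.search(rec[0]))


def remove_duplicates(records: Iterable[Record]) -> Iterator[Record]:
    """Keep the first record seen for each distinct sequence.

    Assumption: sequences are compared case-insensitively.
    """
    seen = set()
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            yield header, seq


# Hypothetical example data: seq1 and seq2 carry the same sequence.
EXAMPLE = """>seq1 Homo sapiens
ATGGCC
TAA
>seq2 Mus musculus
ATGGCCTAA
>seq3 Homo sapiens
ATGAAATAG
"""

records = list(parse_fasta(EXAMPLE))
unique = list(remove_duplicates(records))
human = list(filter_by_header(records, r"Homo sapiens"))

for header, seq in unique:
    print(f">{header}\n{seq}")
```

In this sketch, seq2 is dropped by remove_duplicates because its sequence equals that of seq1 once the wrapped lines are joined, and filter_by_header keeps seq1 and seq3.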
Debugging
To see the commands that SEDA executes when running third-party software, launch SEDA with the -Dseda.execution.showcommands=true option.
For programmers
Programmers can take advantage of the SEDA core to develop new operations to process FASTA files. In addition, SEDA has a plugin-based architecture, so new functions can be added to SEDA through plugins. Take a look at the manual for detailed information about this.
Citing
Please cite the following publication if you use SEDA:
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C. P. Vieira; J. Vieira (2020) SEDA: a Desktop Tool Suite for FASTA Files Processing. IEEE/ACM Transactions on Computational Biology and Bioinformatics
Works using SEDA
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) A bioinformatics protocol for quickly creating large-scale phylogenetic trees. 12th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2018. Toledo, Spain. 20 - June
- H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) Bioinformatics Protocols for Quickly Obtaining Large-Scale Data Sets for Phylogenetic Inferences. Interdisciplinary Sciences: Computational Life Sciences
- H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C.P. Vieira; J. Vieira (2019) Inferring Positive Selection in Large Viral Datasets. 13th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2019. Ávila, Spain. 26 - June