sing-group/seda

SEDA

SEDA (SEquence DAtaset builder) is an open-source application for processing FASTA files containing DNA and protein sequences. Please visit the official web page of the project for downloads, a complete online manual, and support.

SEDA Screenshot

Main features

Among other functions, SEDA allows you to:

  • Filter sequences based on different criteria (including text patterns).
  • Translate nucleic acid sequences into amino acid sequences.
  • Edit sequence headers in different ways.
  • Remove duplicated sequences.
  • Remove isoforms.
  • Sort, merge, split, or reformat FASTA files.
  • Use BLAST to perform different types of queries.
  • Use Clustal Omega to perform multiple sequence alignments.
  • Perform gene annotation using different tools: Splign/Compart, ProSplign/ProCompart, Augustus (as implemented in SAPP), or the Conserved Genome Annotation (CGA) Pipeline.
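
To give a feel for what two of the operations listed above do to a FASTA file, here is a minimal Python sketch of duplicate removal and header-pattern filtering. It is not SEDA's implementation (SEDA is a separate application with its own core and options), and the function names `parse_fasta`, `remove_duplicates`, and `filter_by_header` are hypothetical names chosen for this illustration. It assumes that two sequences are duplicates when their residues match exactly, ignoring case.

```python
import re
from typing import List, Tuple

# A record is (header without the leading '>', sequence joined into one string).
Record = Tuple[str, str]


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into a list of (header, sequence) records."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines may be wrapped
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def remove_duplicates(records: List[Record]) -> List[Record]:
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique: List[Record] = []
    for header, sequence in records:
        key = sequence.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, sequence))
    return unique


def filter_by_header(records: List[Record], pattern: str) -> List[Record]:
    """Keep the records whose header matches the given regular expression."""
    regex = re.compile(pattern)
    return [record for record in records if regex.search(record[0])]


# Made-up example data, for illustration only.
example = """>seq1 Homo sapiens
ATGGCC
TAA
>seq2 Mus musculus
ATGGCCTAA
>seq3 Homo sapiens
ATGAAATAA
"""

records = parse_fasta(example)
unique = remove_duplicates(records)                # seq2 repeats the sequence of seq1
human = filter_by_header(unique, r"Homo sapiens")  # keeps seq1 and seq3
```

The sketch keeps the first occurrence of each sequence, which is only one possible policy; the SEDA manual describes the options the application actually offers.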

Debugging

If you need to see the commands that SEDA executes to run third-party software, run SEDA with -Dseda.execution.showcommands=true.

For programmers

Programmers can take advantage of the SEDA core to develop new operations to process FASTA files. In addition, SEDA has a plugin-based architecture, so new functions can be added to SEDA through plugins. Take a look at the manual for detailed information about this.

Citing

Please cite the following publication if you use SEDA:

  • H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C. P. Vieira; J. Vieira (2022) SEDA: a Desktop Tool Suite for FASTA Files Processing. IEEE/ACM Transactions on Computational Biology and Bioinformatics. Volume 19(3), pp. 1850-1860. DOI

Works using SEDA

  • H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) A bioinformatics protocol for quickly creating large-scale phylogenetic trees. 12th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2018. Toledo, Spain. 20 - June DOI
  • H. López-Fernández; P. Duque; S. Henriques; N. Vázquez; F. Fdez-Riverola; C.P. Vieira; M. Reboiro-Jato; J. Vieira (2018) Bioinformatics Protocols for Quickly Obtaining Large-Scale Data Sets for Phylogenetic Inferences. Interdisciplinary Sciences: Computational Life Sciences DOI
  • H. López-Fernández; P. Duque; N. Vázquez; F. Fdez-Riverola; M. Reboiro-Jato; C.P. Vieira; J. Vieira (2019) Inferring Positive Selection in Large Viral Datasets. 13th International Conference on Practical Applications of Computational Biology & Bioinformatics: PACBB 2019. Ávila, Spain. 26 - June DOI

Credits

The Command-Line Interface (CLI), available from SEDA v1.6.0, was developed by David Vila Fernández as a Master's project.