Add option to output clean FASTA headers without coordinate annotations #1

@ayobi


Problem

ITSxRust appends region coordinates to FASTA headers by default, e.g.:

>seq1|full:47-433
>seq2|its1:1-200

While useful for standalone analysis, this breaks interoperability with pipelines (e.g. nf-core/ampliseq) where downstream tools expect FASTA headers to match the sequence IDs in ASV/taxonomy tables exactly.

Proposed solution

Add a flag like --strip-coords or --clean-headers that outputs plain headers without coordinate annotations:

>seq1
>seq2

Default behavior stays the same (coordinates included). The flag just provides an opt-out for pipeline integration.
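
A minimal sketch of the header handling such a flag could use, written with the standard library only. The function name `strip_coords` is hypothetical and not part of ITSxRust. It assumes, based on the two examples above, that the annotation is always the last `|`-separated field and has the form `region:start-end`. Headers that do not match that shape are returned unchanged, so IDs that legitimately contain `|` are preserved.

```rust
/// Hypothetical helper (not an existing ITSxRust function): remove a trailing
/// `|<region>:<start>-<end>` annotation from a FASTA header line.
/// Sequence lines and headers without such an annotation are returned as-is.
fn strip_coords(line: &str) -> &str {
    // Only header lines (starting with '>') can carry the annotation.
    if !line.starts_with('>') {
        return line;
    }
    // Split at the last '|'; without one there is nothing to strip.
    let Some((id, suffix)) = line.rsplit_once('|') else {
        return line;
    };
    // The suffix must look like `region:start-end` (assumed format).
    let Some((region, range)) = suffix.split_once(':') else {
        return line;
    };
    let Some((start, end)) = range.split_once('-') else {
        return line;
    };
    let is_number = |s: &str| !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit());
    let is_region = !region.is_empty() && region.bytes().all(|b| b.is_ascii_alphanumeric());
    if is_region && is_number(start) && is_number(end) {
        id
    } else {
        line
    }
}
```

Applied per output line when the flag is set, `strip_coords(">seq1|full:47-433")` would yield `>seq1`, while a header such as `>sp|P12345` would pass through untouched.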

Current workaround

Post-processing with sed:

sed -i 's/|[a-z]*[0-9]*:[0-9]*-[0-9]*//' output.fasta

Labels: enhancement
