Simon Hegele edited this page Apr 20, 2026 · 11 revisions

Small Scripts for Small Bioinformatics Tasks

A collection of Python scripts.

1 Installation

conda create -n ssfsbt python # (optional but recommended; includes python so that pip installs into this environment)
conda activate ssfsbt  # (optional but recommended)

git clone https://github.com/SimonHegele/SSfSBT
cd SSfSBT
pip install .

This makes the scripts available as command-line tools.

2 Command-line tools

SSfSBT provides a variety of scripts that are available as command-line tools.
Most of them are described in detail on their respective pages of this wiki.

| Tool | Summary |
|---|---|
| aln_pos2pos | Finding corresponding positions in multiple sequence alignments |
| busco_find | Extracting BUSCO transcript sequences |
| busco_merge | Merging multiple reports from BUSCO and compiling a plot |
| kallisto2nanosim | Converting expression profiles from Kallisto for NanoSim |
| lengths | Basic sequence length distribution analysis for one or more FASTA/FASTQ files |
| lr_lordec_contam_filter | Filtering contamination from long reads corrected by HALC or LoRDEC with Kraken2-filtered short reads |
| fa2fq | FASTA to FASTQ conversion with different error rates for nucleotides denoted in upper and lower case |
| fq2fa | FASTQ to FASTA conversion |
| gfa2fa | GFA to FASTA conversion |
| plot_msa | Plotting multiple sequence alignments |
| rnaQUASTcompare | Merging multiple reports from rnaQUAST and compiling a plot |
| sample | Subsampling a fixed number of sequences from FASTA/FASTQ files |
| unambiguous_codes | Replacing ambiguity codes in FASTA/FASTQ files |
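
The sample tool is summarized as subsampling a fixed number of sequences. One common way to do this in a single pass, without holding the whole file in memory, is reservoir sampling. The sketch below illustrates that general technique only. It is not taken from SSfSBT, and the function name `reservoir_sample`, the `seed` parameter, and the record keys `header` and `sequence` are hypothetical.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(records: Iterable[dict], k: int,
                     seed: Optional[int] = None) -> List[dict]:
    """Draw k records uniformly at random from a stream (Algorithm R).

    Hypothetical helper for illustration; not part of SSfSBT.
    """
    rng = random.Random(seed)
    reservoir: List[dict] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Pick a slot in [0, i]; replace only if it falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = record
    return reservoir


# Toy input: 100 records produced lazily, as a streaming reader would.
records = ({"header": f"seq{i}", "sequence": "ACGT" * (i + 1)} for i in range(100))
subset = reservoir_sample(records, k=5, seed=42)
```

If the input holds fewer than k records, this sketch simply returns all of them.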

3 File services

SSfSBT provides file services that read from and write to several file formats commonly used in bioinformatics. They are located in the file_services folder. Each file service is a class that provides class methods.
Their read() methods are generators that yield dictionaries.
Their write() methods accept iterables of dictionaries.
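
The read/write pattern described above can be sketched roughly as follows. This is a minimal illustration of the idea (class methods, a generator that yields dictionaries, a writer that consumes an iterable of dictionaries), not SSfSBT's actual code. The class name `FastaSketch`, the dictionary keys `header` and `sequence`, and the use of open file handles as arguments are all assumptions.

```python
import io
from typing import Iterable, Iterator, TextIO


class FastaSketch:
    """Hypothetical stand-in for a file service; not SSfSBT's actual class."""

    @classmethod
    def read(cls, handle: TextIO) -> Iterator[dict]:
        """Yield one dictionary per FASTA record, one record at a time."""
        header, chunks = None, []
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield {"header": header, "sequence": "".join(chunks)}
                header, chunks = line[1:], []
            else:
                chunks.append(line)
        if header is not None:
            yield {"header": header, "sequence": "".join(chunks)}

    @classmethod
    def write(cls, handle: TextIO, records: Iterable[dict]) -> None:
        """Write any iterable of record dictionaries, including generators."""
        for record in records:
            handle.write(f">{record['header']}\n{record['sequence']}\n")


# Chain a reader and a writer lazily: keep only sequences longer than 4 bases.
source = io.StringIO(">seq1\nACGT\nACGT\n>seq2\nTTGA\n")
target = io.StringIO()
long_records = (r for r in FastaSketch.read(source) if len(r["sequence"]) > 4)
FastaSketch.write(target, long_records)
result = target.getvalue()
```

Because the reader is a generator and the writer accepts any iterable, a filter like the one above can process records one by one without loading the whole file.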

| File type | Can read | Can write | Additional info |
|---|---|---|---|
| FASTA | | | Sequences |
| FASTQ | | | Sequences |
| PAF | | | Pairwise sequence alignments from Minimap2 |
| SAM | | | Pairwise sequence alignments from basically any other alignment tool |
| BCALM (FASTA) | | | De Bruijn Graph from BCALM |
| FASTG (FASTA) | | | De Bruijn Graph from SPAdes |
