
v1.1.0


@SebieF SebieF released this 25 Sep 14:05

25.09.2025 - Version 1.1.0

Feature

  • Adding blosum62 predefined embedder via the blosum Python package, using the BLOSUM62 substitution matrix as
    embeddings
  • Adding AAOntology predefined embedder from https://doi.org/10.1016/j.jmb.2024.168717 using amino acid feature
    scales
  • Adding biotrainer-ready quickstart datasets (subcellular location and secondary structure) in the README.md
  • Adding masked language modeling (MLM) task via residue_to_class protocol, CNN decoder and random_masking option in
    finetuning config
  • Adding LoRA examples for MLM and downstream tasks
  • [BETA] Adding residue_to_value protocol
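
To illustrate the idea behind a substitution-matrix embedder, here is a minimal sketch that does not use biotrainer or the blosum package. It assumes that each residue is represented by its row of the matrix and that a per-sequence embedding is the mean over residues; both are assumptions for illustration, not a description of the actual implementation. The function names are hypothetical, and the matrix is a four-letter excerpt (A, R, N, D) of BLOSUM62.

```python
# Excerpt of BLOSUM62 restricted to the residues A, R, N, D (same order in rows and columns).
BLOSUM62_EXCERPT = {
    "A": [4, -1, -2, -2],
    "R": [-1, 5, 0, -2],
    "N": [-2, 0, 6, 1],
    "D": [-2, -2, 1, 6],
}


def embed_per_residue(sequence: str) -> list[list[int]]:
    """Map every residue to its matrix row (hypothetical helper)."""
    return [list(BLOSUM62_EXCERPT[residue]) for residue in sequence]


def embed_per_sequence(sequence: str) -> list[float]:
    """Mean-pool the per-residue rows into one fixed-size vector (hypothetical helper)."""
    rows = embed_per_residue(sequence)
    return [sum(column) / len(rows) for column in zip(*rows)]


if __name__ == "__main__":
    print(embed_per_residue("AD"))
    print(embed_per_sequence("AD"))
```

With the full 20-letter alphabet, the same lookup would give a per-residue embedding whose dimension equals the number of matrix columns.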

Breaking

  • Refactoring confidence range calculation to use the empirical distribution.
    Bootstrapping and MCD previously assumed a normal distribution, which is acceptable for large sample sizes because
    of the central limit theorem (CLT). Using the empirical distribution is more appropriate, however, and gives
    better upper and lower bounds, especially for small sample sizes
  • autoeval: Adding the framework name to the task name. This makes it easier to add multiple frameworks in the
    future
  • autoeval: Changing the autoeval FLIP scl protocol to sequence_to_class. This requires fewer resources and is still
    valid for evaluating pLMs
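
The difference between the two confidence range calculations can be sketched with the standard library alone. This is a generic illustration under stated assumptions, not biotrainer's code: the function names are hypothetical, the resampled statistic is the mean, and the empirical bounds use a simple nearest-rank percentile.

```python
import random
import statistics


def bootstrap_means(values, n_resamples=1000, seed=0):
    """Resample with replacement and collect the mean of each resample."""
    rng = random.Random(seed)
    n = len(values)
    return [statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)]


def normal_interval(samples, z=1.96):
    """Bounds under a normality assumption: mean +/- z * standard deviation."""
    mean = statistics.fmean(samples)
    spread = z * statistics.stdev(samples)
    return mean - spread, mean + spread


def empirical_interval(samples, alpha=0.05):
    """Bounds read directly from the sorted samples (nearest-rank percentiles)."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    lower = ordered[round((alpha / 2) * last)]
    upper = ordered[round((1 - alpha / 2) * last)]
    return lower, upper


if __name__ == "__main__":
    # Small sample of accuracies close to the upper limit of 1.0.
    accuracies = [1.0, 1.0, 0.98, 1.0, 0.7, 1.0, 0.99, 1.0]
    means = bootstrap_means(accuracies)
    print("normal:   ", normal_interval(means))
    print("empirical:", empirical_interval(means))
```

The empirical bounds are always values that were actually observed among the resamples, so they cannot leave the valid range of the metric. The normal approximation has no such guarantee, which matters most for small or skewed samples.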

Maintenance

  • Updating dependencies

Fixes

  • Fixing broken use_half_precision embeddings mode and adding comment about downstream float32 precision usage
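
As a general note on half precision (not a description of biotrainer internals): values stored as float16 keep their rounding error even when they are later widened for downstream computation. The sketch below uses only the standard library `struct` module, whose `"e"` format is IEEE 754 half precision; the helper name is hypothetical.

```python
import struct


def to_half_and_back(value: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


if __name__ == "__main__":
    original = 0.1
    restored = to_half_and_back(original)
    # Widening the stored value again does not recover the discarded bits.
    print(original, restored, abs(original - restored))
```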