v0.3.1

@jonatanklosko released this 14 Sep 11:04
ef214cc

Added

  • LLaMA model (#199)
  • GPT-NeoX model (#204)
  • Option to customize scores function in classification tasks (#211)
  • Text embedding serving (#214)
  • Bumblebee.cache_dir/0 for discovering cache location (#220)
  • Image embedding serving (#229)
  • Support for compiling text servings for multiple sequence lengths (#228)
  • Support for streaming chunks during text generation (#232)
  • :preallocate_params option for all servings, useful with multiple GPUs (#233)
  • Support for loading params in the .safetensors format (#231)