v0.1.3x

@DominguesM DominguesM released this 14 Jun 13:23
4bc82c1

Added

  • Added high-level streaming completion APIs
    (create_completion_stream, create_completion_stream_with_sampler,
    CompletionChunk and StreamControl), along with richer logprob
    metadata on completions.
  • Added llama-crab-server, an HTTP server binary for local inference
    with completions, chat completions, embeddings, reranking,
    tokenization, detokenization, SSE streaming and optional multimodal
    chat support.
  • Added OpenAI-style high-level convenience helpers for text, chat and
    embeddings with token accounting.
  • Added the server_lfm example wrapper and an lfm-text download
    target for launching the HTTP server with LFM text models.
  • Added the streaming example to demonstrate callback-driven text
    generation.
  • Added mobile-oriented runtime presets through MobilePreset and
    LlamaParams::with_mobile_preset.
  • Added broader tool-call streaming support, including OpenAI-style
    tool-call deltas.
  • Added documentation deployment for the project guide.
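
For readers wiring a client to the server's SSE streaming, the sketch below shows how individual lines of a Server-Sent Events body can be classified. It is not taken from llama-crab-server: the `SseLine` type, the `parse_sse_line` function and the JSON payloads are hypothetical, and the `[DONE]` sentinel is assumed from the OpenAI streaming convention rather than confirmed for this server.

```rust
/// One decoded line of a Server-Sent Events stream (hypothetical type).
#[derive(Debug, PartialEq)]
enum SseLine<'a> {
    /// A `data:` payload, typically one JSON chunk.
    Data(&'a str),
    /// The `[DONE]` sentinel used by OpenAI-style streams (assumed here).
    Done,
    /// Blank separators, comments and fields this sketch does not handle.
    Ignored,
}

/// Classify a single line of an SSE response body.
fn parse_sse_line(line: &str) -> SseLine<'_> {
    // Drop any trailing line terminator left over from splitting the body.
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');
    match line.strip_prefix("data:") {
        Some(rest) => {
            // The SSE format allows one optional space after the colon.
            let payload = rest.strip_prefix(' ').unwrap_or(rest);
            if payload == "[DONE]" {
                SseLine::Done
            } else {
                SseLine::Data(payload)
            }
        }
        None => SseLine::Ignored,
    }
}

fn main() {
    // Made-up response body; the JSON shape is illustrative only.
    let body = "data: {\"text\":\"Hel\"}\n\ndata: {\"text\":\"lo\"}\n\ndata: [DONE]\n\n";
    for line in body.lines() {
        match parse_sse_line(line) {
            SseLine::Data(json) => println!("chunk: {json}"),
            SseLine::Done => {
                println!("stream finished");
                break;
            }
            SseLine::Ignored => {}
        }
    }
}
```

A real client would pass each `Data` payload to a JSON parser and would also need to handle multi-line `data:` fields, which this sketch ignores.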

Changed

  • Migrated the user guide from mdBook to Material for MkDocs, with
    English and Portuguese documentation trees and expanded server,
    mobile, streaming, chat, embeddings and grammar coverage.
  • README files now point users to the new MkDocs guide hosted on
    GitHub Pages.
  • CI and release workflows now build, test and publish
    llama-crab-server alongside the library crates.
  • CI workflows now run through manual dispatch instead of push triggers,
    and documentation jobs use nightly Cargo where required.
  • The hf-tokenizer dependency now enables the onig feature for
    tokenizer compatibility.
  • Rustdoc crate logos now reference the current Canarim Crab asset.

Fixed

  • Removed unused placeholder OpenAI-compat wrapper bindings from
    llama-crab-sys and the old chat module export.
  • Gated the Metal backend build configuration to macOS targets.
  • Hardened the documentation build and the docs deployment workflow.
  • Cleaned up server and example runner support for the new server and
    mobile workflows.
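
As an illustration of the Metal gating fix, here is a minimal sketch of how a Cargo build script can restrict a backend to macOS targets. It is an assumption about the general technique, not the project's actual build script: the `metal_enabled` helper, the `metal` feature name and the `metal_backend` cfg flag are all hypothetical.

```rust
use std::env;

/// Decide whether the Metal backend should be compiled in.
/// Hypothetical helper: Metal is enabled only when it was requested
/// and the compilation target is macOS.
fn metal_enabled(target_os: &str, feature_requested: bool) -> bool {
    feature_requested && target_os == "macos"
}

fn main() {
    // Cargo sets CARGO_CFG_TARGET_OS for build scripts. Reading it (rather
    // than using cfg!(target_os)) matters when cross-compiling, because the
    // build script itself is compiled for the host. Outside Cargo, fall back
    // to the host OS so the sketch still runs.
    let target_os =
        env::var("CARGO_CFG_TARGET_OS").unwrap_or_else(|_| env::consts::OS.to_string());

    // Cargo sets CARGO_FEATURE_<NAME> for each enabled feature; a feature
    // called `metal` is assumed here.
    let requested = env::var_os("CARGO_FEATURE_METAL").is_some();

    if metal_enabled(&target_os, requested) {
        // Hypothetical cfg flag that the crate's source could test with
        // #[cfg(metal_backend)].
        println!("cargo:rustc-cfg=metal_backend");
    }
}
```

With this shape, requesting the feature on Linux or Windows leaves the backend disabled instead of breaking the build.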