v0.5.14

Andyyyy64 released this 29 Jun 04:32

Added

  • Sliding-window attention metadata is now used in model fetching and KV cache estimation, improving VRAM estimates for models that use SWA. (#124)
  • Intel Arc Pro B70 / Battlemage G31 now has curated detection and simulation defaults, including PCI device 0xe223, 32 GB VRAM, and 608 GB/s bandwidth. (#93, #136)
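
To illustrate why sliding-window attention (SWA) metadata changes a VRAM estimate, here is a minimal sketch of a KV cache size calculation. It is not the project's actual code: the function name `kv_cache_bytes` and all of its parameters are hypothetical, and it assumes the common approximation that each layer stores one key and one value vector per KV head per cached token, with SWA layers caching at most `sliding_window` tokens.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len,
                   bytes_per_elem=2, sliding_window=None, n_swa_layers=None):
    """Rough KV cache size in bytes for a single sequence.

    Hypothetical helper for illustration only. Assumes fp16 (2 bytes)
    by default and that SWA layers keep at most `sliding_window` tokens.
    """
    if sliding_window is None:
        # No SWA metadata: every layer caches the full context.
        n_swa_layers = 0
    elif n_swa_layers is None:
        # SWA metadata present but no per-layer split: assume all layers use it.
        n_swa_layers = n_layers

    n_full_layers = n_layers - n_swa_layers

    # Keys and values (factor 2) for every KV head, per token, per layer.
    per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_elem

    # SWA layers never hold more tokens than the window size.
    swa_tokens = context_len if sliding_window is None else min(context_len, sliding_window)

    cached_tokens = n_full_layers * context_len + n_swa_layers * swa_tokens
    return per_token_per_layer * cached_tokens


# Example with made-up model dimensions: 32 layers, 8 KV heads, head_dim 128.
full = kv_cache_bytes(32, 8, 128, context_len=32768)
windowed = kv_cache_bytes(32, 8, 128, context_len=32768, sliding_window=4096)
print(f"full attention: {full / 2**30:.2f} GiB, SWA: {windowed / 2**30:.2f} GiB")
```

Under these assumptions, ignoring the window overstates the cache for long contexts, which is the kind of overestimate the SWA metadata is meant to remove.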

Fixed

  • Model and benchmark metadata fetches now request gzip, deflate encoding instead of Brotli (br), which avoids broken br responses from mirrors or intermediate servers. (#128, #136)
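
As a rough illustration of the fix above, the sketch below builds a request that advertises only gzip and deflate, and decodes a body according to its `Content-Encoding`. The release notes do not say which HTTP client the project uses, so this assumes Python's standard library `urllib`; the helper names `build_request` and `decode_body` are hypothetical.

```python
import gzip
import urllib.request
import zlib


def build_request(url):
    """Build a request that accepts gzip and deflate, but not br."""
    return urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip, deflate"}
    )


def decode_body(data, content_encoding):
    """Decompress a response body based on its Content-Encoding value."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(data)
    if encoding == "deflate":
        try:
            # Standard form: zlib-wrapped deflate stream.
            return zlib.decompress(data)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(data, -zlib.MAX_WBITS)
    # No encoding (or an unknown one): return the bytes unchanged.
    return data
```

A caller would pass the result of `build_request(url)` to `urllib.request.urlopen` and then feed the raw bytes and the response's `Content-Encoding` header to `decode_body`. Because `br` is never advertised, a well-behaved server or mirror should not reply with a Brotli body.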