Sliding-window attention metadata is now used in model fetching and KV cache estimation, improving VRAM estimates for models that use SWA. (#124)
Intel Arc Pro B70 / Battlemage G31 now has curated detection and simulation defaults, including PCI device 0xe223, 32 GB VRAM, and 608 GB/s bandwidth. (#93, #136)
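As a rough illustration of why sliding-window attention (SWA) metadata matters for the KV cache estimate mentioned above, here is a minimal sketch. It is not this project's implementation. The function name, its parameters, and the assumption that an SWA layer caches at most `sliding_window` tokens are all hypothetical and chosen for illustration.

```python
from typing import Optional


def estimate_kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    bytes_per_elem: int = 2,               # assumption: fp16/bf16 cache entries
    sliding_window: Optional[int] = None,  # assumption: window size in tokens
    swa_layers: Optional[int] = None,      # assumption: how many layers use SWA
) -> int:
    """Hypothetical KV cache size estimate that accounts for SWA layers."""
    if sliding_window is None:
        swa_layers = 0                     # no SWA metadata: every layer is full attention
    elif swa_layers is None:
        swa_layers = n_layers              # window given, assume it applies to all layers
    if not 0 <= swa_layers <= n_layers:
        raise ValueError("swa_layers must be between 0 and n_layers")

    # One K and one V vector per KV head, per cached token, per layer.
    bytes_per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_elem

    full_layers = n_layers - swa_layers
    # An SWA layer never needs to cache more tokens than its window.
    swa_tokens = min(context_len, sliding_window) if sliding_window else context_len

    cached_tokens = full_layers * context_len + swa_layers * swa_tokens
    return bytes_per_token_per_layer * cached_tokens
```

Under these assumptions, a 32-layer model with 8 KV heads of dimension 128 at a 32768-token context needs 4 GiB of fp16 cache with full attention, but only 512 MiB if every layer uses a 4096-token window. Ignoring the window would overstate VRAM use by that margin.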
Fixed
Model and benchmark metadata fetches now request gzip or deflate encoding and no longer advertise brotli, avoiding broken br responses from mirrors or intermediate servers. (#128, #136)
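The effect of the encoding fix above can be sketched with the Python standard library. This is an illustrative example only, not the project's code. The helper names `build_metadata_request` and `decode_body` are hypothetical, and the fallback to raw deflate is an assumption about how a tolerant client might behave.

```python
import gzip
import zlib
import urllib.request
from typing import Optional


def build_metadata_request(url: str) -> urllib.request.Request:
    """Build a request that advertises only gzip and deflate (no br)."""
    return urllib.request.Request(
        url,
        headers={"Accept-Encoding": "gzip, deflate"},
    )


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Decode a response body according to its Content-Encoding header."""
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("", "identity"):
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            return zlib.decompress(body)  # zlib-wrapped deflate
        except zlib.error:
            # Assumption: some servers send raw deflate without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # Anything else (for example br) was never requested, so fail loudly.
    raise ValueError(f"unexpected Content-Encoding: {encoding!r}")
```

A caller would pass the request to `urllib.request.urlopen`, read the body, and hand it to `decode_body` together with the response's `Content-Encoding` header. Because brotli is never advertised, a well-behaved server or mirror should not reply with a br body.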