v0.5.10

@Andyyyy64 released this 11 Jun 07:52
9864db4

Fixed

  • Strong partial-offload candidates are no longer buried under weaker full-GPU models: the final sort now counts GPU fit once instead of twice.
  • Light partial offload is penalized less aggressively, while heavy dense offload still gets a strong discount.
  • MoE partial-offload scoring now gives a milder penalty when the active working set can plausibly stay on GPU.
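
The three fixes above can be illustrated with a minimal sketch. This is not the project's actual code: `Candidate`, `offload_multiplier`, `rank`, the 0.25 light/heavy threshold, and every coefficient are hypothetical, chosen only to show the shape of the behavior. GPU fit enters the score once, through the multiplier, and the sort uses that score alone.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    quality: float                  # base quality score, higher is better
    offload_fraction: float         # share of weights kept off-GPU; 0.0 means full GPU
    is_moe: bool = False
    active_fits_gpu: bool = False   # MoE only: active working set fits in VRAM


def offload_multiplier(c: Candidate) -> float:
    """Hypothetical penalty curve; all constants are illustrative assumptions."""
    if c.offload_fraction <= 0.0:
        return 1.0  # full-GPU fit, no penalty
    if c.is_moe and c.active_fits_gpu:
        # Milder penalty: the active experts can plausibly stay on GPU.
        return 1.0 - 0.3 * c.offload_fraction
    if c.offload_fraction <= 0.25:
        # Light partial offload: gentle slope.
        return 1.0 - 0.4 * c.offload_fraction
    # Heavy dense offload: steep slope, continuous at 0.25, floored at 0.2.
    return max(0.2, 0.9 - 1.4 * (c.offload_fraction - 0.25))


def score(c: Candidate) -> float:
    # GPU fit is applied exactly once, here.
    return c.quality * offload_multiplier(c)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    # The sort key is the score alone; GPU fit is not used as a second key,
    # so it cannot be counted twice.
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    models = [
        Candidate("weak-full-gpu", quality=60, offload_fraction=0.0),
        Candidate("strong-light-offload", quality=80, offload_fraction=0.1),
        Candidate("dense-heavy-offload", quality=90, offload_fraction=0.6),
        Candidate("moe-heavy-offload", quality=90, offload_fraction=0.6,
                  is_moe=True, active_fits_gpu=True),
    ]
    for m in rank(models):
        print(f"{m.name}: {score(m):.1f}")
```

With these made-up numbers, the strong model with light offload ranks above the weaker full-GPU model, the MoE model with the same offload fraction as the dense one keeps most of its score, and the heavily offloaded dense model drops to the bottom.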