1.4.3

@eastriverlee eastriverlee released this 07 Mar 17:19
· 6 commits to main since this release

Highlights

  • removed the redundant BOS token from the mistral template, since llama.cpp adds it anyway.
  • added more of the quantization options that llama.cpp supports (it's just a String-typed enum, so you could extend it yourself anyway, but still)
  • func decode(_ token: Token) -> String is now private, and you now have func decode(_ token: [Token]) -> String instead. the former handles multibyte characters under the hood, so it was never supposed to be public in the first place.
  • changed params.n_ctx = UInt32(maxTokenCount) + (maxTokenCount % 2 == 1 ? 1 : 2) to params.n_ctx = UInt32(self.maxTokenCount). the old code was a workaround for an error i was running into; it is now written the way it was supposed to be from the beginning.
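
To illustrate the last two items, here is a rough, self-contained sketch. It is not the library's actual code: `TokenBytes`, `decodeSingle`, `decodeAll`, `oldContextSize` and `newContextSize` are hypothetical names made up for this example, a token is modeled as the raw UTF-8 bytes it stands for (in the real library a Token is a llama.cpp token id), and `maxTokenCount` is assumed to be an `Int`. The first half shows why decoding a single token in isolation can fail on a multibyte character while decoding the whole array works; the second half compares the old and new `n_ctx` expressions.

```swift
import Foundation

// Hypothetical stand-in: a "token" here is just the UTF-8 bytes it maps to.
typealias TokenBytes = [UInt8]

/// Decodes one token on its own.
/// Returns nil when the bytes are not valid UTF-8 by themselves.
func decodeSingle(_ token: TokenBytes) -> String? {
    String(bytes: token, encoding: .utf8)
}

/// Decodes a run of tokens by joining their bytes first,
/// so a character split across tokens is reassembled before decoding.
func decodeAll(_ tokens: [TokenBytes]) -> String {
    String(decoding: tokens.flatMap { $0 }, as: UTF8.self)
}

// "é" is 0xC3 0xA9 in UTF-8; assume the tokenizer split it across two tokens.
let tokens: [TokenBytes] = [[0x63, 0x61, 0x66], [0xC3], [0xA9]]

let half = decodeSingle(tokens[1])   // nil: 0xC3 alone is an incomplete sequence
let whole = decodeAll(tokens)        // "café"

// Old expression from the release notes: pads the context size by 1 or 2.
func oldContextSize(_ maxTokenCount: Int) -> UInt32 {
    UInt32(maxTokenCount) + (maxTokenCount % 2 == 1 ? 1 : 2)
}

// New expression: the context size is exactly maxTokenCount.
func newContextSize(_ maxTokenCount: Int) -> UInt32 {
    UInt32(maxTokenCount)
}

print(half as Any, whole)
print(oldContextSize(2048), newContextSize(2048))
```

With `maxTokenCount = 2048`, the old expression yields 2050 and the new one yields 2048, so the context is no longer slightly larger than what was asked for.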

Full Changelog: v1.4.2...1.4.3