Highlights

Removed the redundant BOS token from the Mistral template, since llama.cpp adds it anyway.
Added more of the quantization options that llama.cpp supports (the type is just a String-backed enum, so you could already extend it yourself, but still).
func decode(_ token: Token) -> String is now private, and you now have func decode(_ token: [Token]) -> String instead. The former handles multibyte characters under the hood, so it was never supposed to be public in the first place.
Changed params.n_ctx = UInt32(maxTokenCount) + (maxTokenCount % 2 == 1 ? 1 : 2) to params.n_ctx = UInt32(self.maxTokenCount). The old code padded the context size (by 1 when maxTokenCount was odd, by 2 when it was even) to work around an error I was hitting at the time; it is now the plain value, as it was supposed to be from the beginning.
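
To illustrate why the single-token decode is better kept private, here is a minimal sketch of the multibyte problem. It does not use this library's API: TokenBytes, decodeEach, and decodeAll are hypothetical names, and each "token" is assumed to be nothing more than the raw UTF-8 bytes of its piece. A character such as "é" can be split across two tokens, so decoding tokens one at a time may produce replacement characters, while decoding the whole sequence at once does not.

```swift
// Hypothetical stand-in for a token: the raw UTF-8 bytes of its piece.
typealias TokenBytes = [UInt8]

// Naive approach: decode every token on its own and join the results.
// Incomplete UTF-8 sequences are repaired with U+FFFD by String(decoding:as:).
func decodeEach(_ tokens: [TokenBytes]) -> String {
    tokens.map { String(decoding: $0, as: UTF8.self) }.joined()
}

// Batch approach: concatenate all bytes first, then decode once.
func decodeAll(_ tokens: [TokenBytes]) -> String {
    String(decoding: tokens.flatMap { $0 }, as: UTF8.self)
}

// "é" is 0xC3 0xA9 in UTF-8; here it is split across the last two tokens.
let tokens: [TokenBytes] = [[0x63, 0x61, 0x66], [0xC3], [0xA9]]

let naive = decodeEach(tokens)   // contains U+FFFD replacement characters
let batched = decodeAll(tokens)  // "café"

print(naive)
print(batched)
```

The array-based decode in this release presumably takes care of this buffering for you, which is why callers should pass the full token sequence rather than decoding tokens individually.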

Full Changelog: v1.4.2...1.4.3
