1.4.3
Highlights
- Removed the redundant BOS token from the mistral template, as `llama.cpp` adds it anyway.
- Added more of the quantization options that `llama.cpp` supports. (It's just a `String`-typed `enum`, so you can extend it anyway, but still.)
- `func decode(_ token: Token) -> String` is now `private`, and you now have `func decode(_ token: [Token]) -> String`. The former handles multibyte characters under the hood, so it was never supposed to be `public`.
- Changed `params.n_ctx = UInt32(maxTokenCount) + (maxTokenCount % 2 == 1 ? 1 : 2)` to `params.n_ctx = UInt32(self.maxTokenCount)`. The prior code was a workaround for an error I was experiencing (for example, a `maxTokenCount` of 2048 yielded an `n_ctx` of 2050); it is now what it was supposed to be from the beginning.
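
To illustrate why decoding one token at a time is easy to misuse, here is a minimal sketch of the general problem, not the library's actual implementation. It assumes a token can map to a partial UTF-8 sequence; `Piece`, `decodeNaively`, and `decodeBuffered` are hypothetical names made up for this example.

```swift
import Foundation

// Hypothetical stand-in for a token: the raw UTF-8 bytes it maps to.
typealias Piece = [UInt8]

// Naive per-token decoding: each piece is decoded in isolation,
// so an incomplete multibyte sequence is replaced with U+FFFD.
func decodeNaively(_ pieces: [Piece]) -> String {
    pieces.map { String(decoding: $0, as: UTF8.self) }.joined()
}

// Buffered decoding: bytes are held back until they form valid UTF-8.
func decodeBuffered(_ pieces: [Piece]) -> String {
    var pending: [UInt8] = []
    var output = ""
    for piece in pieces {
        pending += piece
        // String(bytes:encoding:) returns nil while the buffer is not valid UTF-8.
        if let text = String(bytes: pending, encoding: .utf8) {
            output += text
            pending.removeAll()
        }
    }
    return output
}

// "café" with the two bytes of "é" (0xC3 0xA9) split across two pieces.
let pieces: [Piece] = [[0x63, 0x61, 0x66, 0xC3], [0xA9]]
let naive = decodeNaively(pieces)
let buffered = decodeBuffered(pieces)
```

In this sketch `buffered` comes out as the intended text, while `naive` contains replacement characters where the split character should be. Keeping state between calls is what makes a single-token decoder awkward as a public API.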
Full Changelog: v1.4.2...1.4.3