
Releases: airockchip/rknn-llm

release-v1.0.1

09 May 09:37
  • Reduce memory usage during model conversion
  • Reduce memory usage during inference
  • Increase prefill speed
  • Reduce initialization time
  • Improve quantization accuracy
  • Add support for Gemma, ChatGLM3, MiniCPM, InternLM2, and Phi-3
  • Add server invocation support
  • Add an interface for interrupting inference
  • Add logprob and token_id to the inference return value (see the sketch after this list)
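To illustrate the last item, here is a minimal sketch of how a streaming callback might consume the new per-token metadata. The type and member names used here (StreamResult, on_token, token_id, logprob) are illustrative assumptions, not the actual rkllm API.

```cpp
#include <cstdio>

// Hypothetical sketch only: assumed shape of a per-token result after this
// release, carrying the decoded text chunk plus its token id and log-probability.
struct StreamResult {
    const char* text;     // decoded text for this generation step
    int         token_id; // id of the sampled token
    float       logprob;  // log-probability of the sampled token
};

// Example streaming callback that prints each token together with its metadata.
void on_token(const StreamResult* r, void* /*userdata*/) {
    std::printf("%s\t(id=%d, logprob=%.3f)\n", r->text, r->token_id, r->logprob);
}
```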