v3.10.0 Latest

3.10.0 (2025-06-12)

Features

JSON Schema Grammar: $defs and $ref support with full inferred types (#472) (9cdbce9)
inspect gguf command: format and print the Jinja chat template with --key .chatTemplate (#472) (9cdbce9)
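
For readers unfamiliar with `$defs` and `$ref`, the sketch below shows what these JSON Schema keywords mean: a schema declares reusable definitions under `$defs` and points to them with `$ref`. This is only an illustration of the concept, not node-llama-cpp's implementation. The `resolveRefs` helper and the `Schema` type are hypothetical names, the sketch handles only local `#/$defs/...` references, and it rejects recursive references for simplicity.

```typescript
// Hypothetical helper, not part of node-llama-cpp:
// inlines local "#/$defs/<name>" references into the schema tree.
type Schema = { [key: string]: unknown };

function resolveRefs(node: unknown, defs: Record<string, unknown>, seen: string[] = []): unknown {
    if (Array.isArray(node)) {
        return node.map((item) => resolveRefs(item, defs, seen));
    }
    if (node === null || typeof node !== "object") {
        return node; // primitives are copied as-is
    }

    const obj = node as Schema;
    const ref = obj["$ref"];
    if (typeof ref === "string") {
        const prefix = "#/$defs/";
        if (!ref.startsWith(prefix)) {
            throw new Error("Unsupported reference: " + ref);
        }
        const name = ref.slice(prefix.length);
        if (!(name in defs)) {
            throw new Error("Unknown definition: " + name);
        }
        if (seen.includes(name)) {
            // Assumption of this sketch: recursion is rejected rather than expanded
            throw new Error("Recursive reference: " + name);
        }
        return resolveRefs(defs[name], defs, [...seen, name]);
    }

    const result: Schema = {};
    for (const [key, value] of Object.entries(obj)) {
        if (key === "$defs") continue; // definitions are inlined, so the table is dropped
        result[key] = resolveRefs(value, defs, seen);
    }
    return result;
}

const schema = {
    $defs: {
        person: {
            type: "object",
            properties: { name: { type: "string" } }
        }
    },
    type: "object",
    properties: {
        author: { $ref: "#/$defs/person" },
        reviewers: { type: "array", items: { $ref: "#/$defs/person" } }
    }
};

const inlined = resolveRefs(schema, schema.$defs) as Schema;
console.log(JSON.stringify(inlined, null, 2));
```

Both `author` and the items of `reviewers` end up with the same inlined `person` object schema, which is the reuse that `$defs` and `$ref` are meant to provide.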

Bug Fixes

JinjaTemplateChatWrapper: first function call prefix detection (#472) (9cdbce9)
QwenChatWrapper: improve Qwen chat template detection (#472) (9cdbce9)
apply maxTokens on function calling parameters (#472) (9cdbce9)
adjust default prompt completion length based on SWA size when relevant (#472) (9cdbce9)
improve thought segmentation syntax extraction (#472) (9cdbce9)
adapt to llama.cpp changes (#472) (9cdbce9)

Shipped with llama.cpp release b5640

Assets 16

node-llama-cpp-electron-example.Linux.3.10.0.amd64.deb
sha256:eb2ec8edf84ad08ea2ab849af2b862dd8d98b652c74f588301e35e6374e9edd4
152 MB 2025-06-12T01:28:19Z

node-llama-cpp-electron-example.Linux.3.10.0.amd64.snap
sha256:c2cd7864b42e9bbe0cfcce2d95394e6a5a0fac375f3cc4964bc6b52f997b3e14
219 MB 2025-06-12T01:28:12Z

node-llama-cpp-electron-example.Linux.3.10.0.arm64.AppImage
sha256:31dcb0c3286f27c550b5a187d6b11ad6a8789f8b4c920310e01b434adf22110c
142 MB 2025-06-12T01:27:59Z

node-llama-cpp-electron-example.Linux.3.10.0.arm64.deb
sha256:b2af03be00d51b313042b410c9ac0c410722be558909ba68316194bc9866e717
99.1 MB 2025-06-12T01:28:24Z

node-llama-cpp-electron-example.Linux.3.10.0.arm64.tar.gz
sha256:c193471667fcf191d8dde98eda1163d416e106d5ca40a46416d6a7969e53ffe3
134 MB 2025-06-12T01:28:27Z

node-llama-cpp-electron-example.Linux.3.10.0.x64.tar.gz
sha256:6093d0f7b6cd1d222a6d0b8939e8ca042c1d11d8938a382df66d6b9fbe75d736
260 MB 2025-06-12T01:28:30Z

node-llama-cpp-electron-example.Linux.3.10.0.x86_64.AppImage
sha256:9b5d9b7aeb592050ef9befd7700cf10ec3269cb26bde3050f4727b29671586f4
268 MB 2025-06-12T01:28:04Z

node-llama-cpp-electron-example.macOS.3.10.0.arm64.dmg
sha256:7c015b960cfc9c23d75db3677699e299f47e34c00087053acf682176edd72018
136 MB 2025-06-12T01:24:03Z

node-llama-cpp-electron-example.macOS.3.10.0.arm64.zip
sha256:070d5fe1f69174c173c80688778a09ed56164d6d1c891036447527dd7e879f76
131 MB 2025-06-12T01:24:17Z

node-llama-cpp-electron-example.macOS.3.10.0.x64.dmg
sha256:0656909eef5ef85f39814d670577c8713adaf936bc36ac6b3a205843ede537f2
145 MB 2025-06-12T01:24:11Z

Source code (zip)
2025-06-11T00:13:49Z

Source code (tar.gz)
2025-06-11T00:13:49Z

v3.9.0

3.9.0 (2025-06-04)

Features

reasoning budget (#468) (ea8d904) (documentation: Set Reasoning Budget)
SWA (Sliding Window Attention) support - greatly reduced context memory consumption on supported models (#468) (ea8d904)
documentation: LLM-friendly llms.md and llms-full.md files (#468) (ea8d904)

Bug Fixes

prompt completion edge cases (#468) (ea8d904)
adapt to llama.cpp changes (#468) (ea8d904)

Shipped with llama.cpp release b5590

v3.8.1

3.8.1 (2025-05-19)

Bug Fixes

getLlamaGpuTypes: edge case (#463) (1799127)
remove prompt completion from the cached context window (#463) (1799127)

Shipped with llama.cpp release b5415

v3.8.0

3.8.0 (2025-05-17)

Features

save and restore a context sequence state (#460) (f2cb873) (documentation: Saving and restoring a context sequence evaluation state)
stream function call parameters (#460) (f2cb873) (documentation: API: LLamaChatPromptOptions["onFunctionCallParamsChunk"])
configure Hugging Face remote endpoint for resolving URIs (#460) (f2cb873) (documentation: API: ResolveModelFileOptions["endpoints"])
Qwen 3 support (#460) (f2cb873)
QwenChatWrapper: support discouraging the generation of thoughts (#460) (f2cb873) (documentation: API: QwenChatWrapper constructor > thoughts option)
getLlama: dryRun option (#460) (f2cb873) (documentation: API: LlamaOptions["dryRun"])
getLlamaGpuTypes function (#460) (f2cb873) (documentation: API: getLlamaGpuTypes)

Bug Fixes

adapt to breaking llama.cpp changes (#460) (f2cb873)
capture multi-token segment separators (#460) (f2cb873)
race condition when reading extremely long gguf metadata (#460) (f2cb873)
adapt memory estimation to newly added model architectures (#460) (f2cb873)
skip binary testing under certain problematic conditions (#460) (f2cb873)
improve GPU backend loading error description (#460) (f2cb873)

Shipped with llama.cpp release b5414

v3.7.0

3.7.0 (2025-03-28)

Features

extract function calling syntax from a Jinja template (#444) (c070e81)
full support for Qwen and QwQ via QwenChatWrapper (#444) (c070e81)
export a llama instance getter on a model instance (#444) (c070e81)

Bug Fixes

better handling for function calling with empty parameters (#444) (c070e81)
reranking edge case crash (#444) (c070e81)
limit the context size by default in the node-typescript template (#444) (c070e81)
adapt to breaking llama.cpp changes (#444) (c070e81)
bump min nodejs version to 20 due to dependencies' requirements (#444) (c070e81)
defineChatSessionFunction type (#444) (c070e81)

Shipped with llama.cpp release b4980

v3.6.0

✨ DeepSeek R1 is here! ✨

Read about the release in the blog post

3.6.0 (2025-02-21)

Features

DeepSeek R1 support (#428) (ca6b901) (documentation: DeepSeek R1)
chain of thought segmentation (#428) (ca6b901) (documentation: Stream Response Segments)
pass a model to resolveChatWrapper (#428) (ca6b901)
defineChatSessionFunction: improve params type (#428) (ca6b901)
Electron template: show chain of thought (#428) (ca6b901) (documentation: DeepSeek R1)
Electron template: add functions template (#428) (ca6b901)
Electron template: new icon for the CI build (#428) (ca6b901)
Electron template: update model message in a more stable manner (#428) (ca6b901)
Electron template: more convenient completion (#428) (ca6b901)

Bug Fixes

check path existence before reading its content (#428) (ca6b901)
partial tokens handling (#428) (ca6b901)
uncaught exception (#430) (599a161)
Electron template: non-latin text formatting (#430) (599a161)

Shipped with llama.cpp release b4753

v3.5.0

3.5.0 (2025-01-31)

Features

shorter model URIs (#421) (73454d9) (documentation: Model URIs)

Bug Fixes

add missing Jinja features for DeepSeek (#425) (6e4bf3d)

Shipped with llama.cpp release b4600

v3.4.3

3.4.3 (2025-01-30)

Bug Fixes

adapt to llama.cpp breaking changes (#424) (6e4bf3d)

Shipped with llama.cpp release b4599

v3.4.2

3.4.2 (2025-01-27)

Bug Fixes

metadata string encoding (#420) (314d7e8)
Vulkan parallel decoding (#420) (314d7e8)
try auth token on 401 response (#420) (314d7e8)

Shipped with llama.cpp release b4567

v3.4.1

3.4.1 (2025-01-23)

Bug Fixes

adapt to breaking llama.cpp changes (#415) (86e1bee)
ranking empty inputs (#415) (86e1bee)

Shipped with llama.cpp release b4529

Releases: withcatai/node-llama-cpp
