
feat: Gemma 3 Support #440

Open · yhg8423 opened this issue Mar 13, 2025 · 5 comments · May be fixed by #444

Labels: new feature (New feature or request) · roadmap (Part of the roadmap for node-llama-cpp: https://github.com/orgs/withcatai/projects/1)


yhg8423 (Contributor) commented Mar 13, 2025

Feature Description

Google has released its Gemma 3 models. Supporting them in node-llama-cpp would enhance the library's utility and expand compatibility for users seeking to integrate the latest models into their applications.

The Solution

node-llama-cpp currently cannot load the Gemma 3 model architecture. Support for the Gemma 3 GGUF metadata should be added.

Considered Alternatives

This request simply adds a new model to the library, so I don't think there is an alternative approach.

Additional Context

No response

Related Features to This Feature Request

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, and I know how to start.

@yhg8423 yhg8423 added the new feature and requires triage labels Mar 13, 2025
giladgd (Contributor) commented Mar 13, 2025

I'll release a new version next week that will come with the latest llama.cpp release, which supports Gemma 3 models.

You can always build the latest llama.cpp release in node-llama-cpp to use the latest features, even before a new version of node-llama-cpp is released.
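
For example, to target the most recent release instead of a pinned one (this assumes the --release flag accepts latest, as shown in the node-llama-cpp docs):

npx --no node-llama-cpp source download --release latest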

There's currently an issue with the latest release of llama.cpp, so for now you can use release b4880, which includes Gemma 3 support but predates that issue.
Run this command inside your project directory to download this release and build it:

npx --no node-llama-cpp source download --release b4880
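
Once the build finishes, a Gemma 3 GGUF should load like any other model. Here's a minimal sketch using the standard node-llama-cpp API; the models directory and filename are placeholders for whichever Gemma 3 GGUF you downloaded:

import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

// getLlama() picks up the locally built binaries (e.g. the b4880 build from above)
const llama = await getLlama();

// placeholder path — point this at whichever Gemma 3 GGUF you downloaded
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "gemma-3-4b-it-Q4_K_M.gguf")
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const response = await session.prompt("Hi there, how are you?");
console.log(response);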

@giladgd giladgd self-assigned this Mar 13, 2025
@giladgd giladgd added the roadmap label and removed the requires triage label Mar 13, 2025
@giladgd giladgd moved this to In Progress in node-llama-cpp: roadmap Mar 13, 2025
yhg8423 (Contributor, Author) commented Mar 14, 2025

Oh, I see! Thanks! 👍

briancullinan2 commented

Command looked promising! Not sure where to go from here:

FAILED: CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -DGGML_BACKEND_SHARED -DGGML_SHARED -DGGML_USE_BLAS -DGGML_USE_CPU -DGGML_USE_METAL -DLLAMA_SHARED -DNAPI_VERSION=7 -Dllama_addon_EXPORTS -I/Users/briancullinan/jupyter_ops/node_modules/node-addon-api -I/Users/briancullinan/jupyter_ops/node_modules/node-api-headers/include -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/gpuInfo -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/./llama.cpp/common -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/. -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/../include -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/../common -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/ggml/src/../include -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/common/. -D_DARWIN_USE_64_BIT_INODE=1 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DBUILDING_NODE_EXTENSION -O3 -DNDEBUG -std=gnu++17 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.1.sdk -fPIC -fexceptions -Wno-c++17-extensions -MD -MT CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o -MF CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o.d -o CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o -c /Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/addon/AddonGrammarEvaluationState.cpp
/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/addon/AddonGrammarEvaluationState.cpp:15:15: error: no matching function for call to 'llama_sampler_init_grammar'
   15 |     sampler = llama_sampler_init_grammar(model->model, grammarDef->grammarCode.c_str(), grammarDef->rootRuleName.c_str());
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/../include/llama.h:1203:38: note: candidate function not viable: cannot convert argument of incomplete type 'llama_model *' to 'const struct llama_vocab *' for 1st argument
 1203 |     LLAMA_API struct llama_sampler * llama_sampler_init_grammar(
      |                                      ^
 1204 |             const struct llama_vocab * vocab,
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
[58/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/common.cpp.o
[59/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/arg.cpp.o
[60/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/json-schema-to-grammar.cpp.o
[61/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/chat.cpp.o
ninja: build stopped: subcommand failed.
ERR! OMG Process terminated: 1
✖️ Failed to compile llama.cpp
Failed to build llama.cpp with Metal support. Error: SpawnError: Command npm run -s cmake-js-llama -- compile --log-level warn --config Release --arch=arm64 --out localBuilds/mac-arm64-metal-release-b4880 --runtime-version=22.9.0 --parallel=9 --CDGGML_METAL=1 --CDGGML_CCACHE=OFF exited with code 1

giladgd (Contributor) commented Mar 16, 2025
giladgd commented Mar 16, 2025

@briancullinan2 Are you using the latest version of node-llama-cpp? (version 3.6.0)
If you are, please run this command inside your project and attach its output so I can help you:

npx --no node-llama-cpp inspect gpu

briancullinan2 commented Mar 16, 2025

I believe I'm using the latest version. Other models work fine, but that's the first time I've tried that npx compile command.

OS: macOS 24.3.0 (arm64)
Node: 22.9.0 (arm64)
node-llama-cpp: 3.6.0

Metal: available

Metal device: Apple M1 Max
Metal used VRAM: 0% (64KB/48GB)
Metal free VRAM: 99.99% (48GB/48GB)
Metal unified memory: 48GB (100%)

CPU model: Apple M1 Max
Math cores: 8
Used RAM: 99.6% (63.75GB/64GB)
Free RAM: 0.39% (257.2MB/64GB)
Used swap: 97.6% (26.35GB/27GB)
Max swap size: dynamic
mmap: supported

Is there another version I could try, other than --release b4880, that would help narrow down whether it's my machine or build process? I imagine the distributed pre-built binaries are working in my case, since it didn't rebuild on install. Maybe it's a bug in the build process on Mac?

@giladgd giladgd linked a pull request Mar 20, 2025 that will close this issue