Hi team,
I would like to request official support and documentation for Gemma 3 4B and Gemma 4 integration within llama-cpp-python.
It would be helpful to have clarification regarding:
- Current compatibility status
- Required llama.cpp / llama-cpp-python versions
- Chat template handling requirements
- Recommended GGUF formats and quantizations
- Known limitations or issues
- Recommended inference parameters for stable usage
Any guidance, implementation notes, or examples that do not rely on Ollama would be greatly appreciated.
Kind regards,
Jonathan Park