A lightweight, no-nonsense wrapper for llama.cpp designed specifically for Unity 6.
This package provides a high-performance bridge for running Large Language Models (LLMs) locally in Unity using GGUF-quantized models. It is designed to be "just a wrapper": it gives you direct access to llama.cpp features without the overhead of a complex AI framework.
- Unity 6 (6000.0.x) or higher.
To install llama-cpp-unity using the Unity Package Manager (UPM):
- Open your project in Unity 6.
- Navigate to Window > Package Manager.
- Click the + (plus) icon in the top-left corner and select Add package from git URL....
- Enter the following URL:
  https://github.com/lookbe/llama-cpp-unity.git
- Click Add.
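Alternatively, you can declare the dependency directly in your project's `Packages/manifest.json`. The package name below is an assumption for illustration; check the repository's `package.json` for the exact name:

```json
{
  "dependencies": {
    "com.lookbe.llama-cpp-unity": "https://github.com/lookbe/llama-cpp-unity.git"
  }
}
```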
- Import the BasicChat sample project from the Package Manager.
- Open the sample scene and locate the ChatCompletion component.
- Set the Model Path field to the absolute path of your .gguf model file.
While you can extend the component to use `Application.streamingAssetsPath` on desktop platforms, Android cannot load models directly from StreamingAssets; on Android you must use `Application.persistentDataPath` for the model path.
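One way to handle this per-platform difference is a small helper that resolves the model path at runtime. This is a minimal sketch, not part of the package; `ModelPaths` and the file name are hypothetical:

```csharp
using System.IO;
using UnityEngine;

public static class ModelPaths
{
    // fileName is e.g. "model.gguf" (placeholder; use your actual model name).
    public static string Resolve(string fileName)
    {
#if UNITY_ANDROID && !UNITY_EDITOR
        // Android: the model must already exist in persistentDataPath
        // (copied or downloaded there, see the note above).
        return Path.Combine(Application.persistentDataPath, fileName);
#else
        // Editor/desktop: StreamingAssets is a plain folder on disk,
        // so it can be read directly.
        return Path.Combine(Application.streamingAssetsPath, fileName);
#endif
    }
}
```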
On Android, you must do one of the following:
- Copying: copy the model from `StreamingAssets` to `persistentDataPath` before doing any model loading, OR
- Downloading: create a downloader script that saves the model file directly into `persistentDataPath`.
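The copying option can be sketched as below. On Android, `StreamingAssets` lives inside the APK, so the model has to be read through `UnityWebRequest` rather than `System.IO`; the class and field names here are illustrative, not part of the package:

```csharp
using System.Collections;
using System.IO;
using UnityEngine;
using UnityEngine.Networking;

public class ModelCopier : MonoBehaviour
{
    // Placeholder file name; set this to your actual model file.
    [SerializeField] private string modelFileName = "model.gguf";

    // Call before loading the model; onReady receives the loadable path.
    public IEnumerator CopyModel(System.Action<string> onReady)
    {
        string dest = Path.Combine(Application.persistentDataPath, modelFileName);
        if (!File.Exists(dest))
        {
            string src = Path.Combine(Application.streamingAssetsPath, modelFileName);
            using var req = UnityWebRequest.Get(src);
            // Stream the (potentially multi-GB) model straight to disk
            // instead of buffering it in memory.
            req.downloadHandler = new DownloadHandlerFile(dest);
            yield return req.SendWebRequest();
            if (req.result != UnityWebRequest.Result.Success)
            {
                Debug.LogError($"Model copy failed: {req.error}");
                yield break;
            }
        }
        onReady?.Invoke(dest);
    }
}
```

The downloading option is the same pattern with a remote URL in place of `src`; `DownloadHandlerFile` again avoids holding the whole model in memory.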
- LlamaCpp – LLM inference engine.
If this helps you, consider supporting me:
