This is an LLM project implemented in Java.
You can use it to deploy your own private services. It supports Llama2, GPT, and other open-source models.
- Simple Java library: llama-java-core
- Complete application: octet-chat-app
  - API services: quickly set up private services
  - CLI interaction: simple local chat
- 🦙 Built on llama.cpp
- 😊 Supports AI Agent and implements Function calling based on Qwen-chat
- 🤖 Supports parallel inference, continuous conversation, and text generation
- 📦 Supports the Llama2 and GPT models, such as Baichuan 7B and Qwen 7B
Last updated
...
- 🚀 Added custom AI character and optimized OpenAPI
- 🚀 Added AI Agent and implemented Function calling
- 🚀 Supported dynamic temperature sampling.
- 🚀 Added WebUI to octet-chat-app.
Note
You can quantize the original model yourself or search Hugging Face for open-source models.
How to use
Edit characters.template.json to set a custom AI character. Then run the command-line interaction and specify the name of the AI character you set.
Example
{
  "name": "Assistant Octet",
  "agent_mode": false,
  "prompt": "Answer the questions.",
  "model_parameter": {
    "model_path": "/models/ggml-model-7b_m-q6_k.gguf",
    "model_type": "LLAMA2",
    "context_size": 4096,
    "threads": 6,
    "threads_batch": 6,
    "mmap": true,
    "mlock": false,
    "verbose": true
  },
  "generate_parameter": {
    "temperature": 0.85,
    "repeat_penalty": 1.2,
    "top_k": 40,
    "top_p": 0.9,
    "verbose_prompt": true,
    "user": "User",
    "assistant": "Octet"
  }
}
java -jar octet-chat-app.jar --character YOUR_CHARACTER
Tip
Use --help to view more parameters, for example:
java -jar octet-chat-app.jar --help
usage: Octet.Chat
    --app <arg>             App launch type: cli | api (default: cli).
 -c,--completions           Use completions mode.
 -ch,--character <arg>      Load the specified AI character, default:
                            llama2-chat.
 -h,--help                  Show this help message and exit.
 -q,--questions <arg>       Load the specified user question list, example:
                            /PATH/questions.txt.
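If you want to start the application from another Java program, the options listed above can be assembled into a command line. The following is a minimal sketch and not part of the project: the class `OctetLauncher` and the method `buildCommand` are hypothetical names, the jar is assumed to sit in the working directory, and only flags that appear in the help output are used.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class OctetLauncher {

    /** Builds the argument list for "java -jar <jar> --app <type> --character <name>". */
    static List<String> buildCommand(String jarPath, String appType,
                                     String character, boolean completions) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        cmd.add("--app");
        cmd.add(appType);            // "cli" or "api", per the help text
        cmd.add("--character");
        cmd.add(character);
        if (completions) {
            cmd.add("--completions");
        }
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        String jarPath = "octet-chat-app.jar"; // assumed location of the jar
        List<String> cmd = buildCommand(jarPath, "cli", "YOUR_CHARACTER", false);
        System.out.println(String.join(" ", cmd));

        // Only launch when the jar is actually present.
        if (Files.exists(Path.of(jarPath))) {
            // inheritIO() attaches the child process to this terminal,
            // so the chat session stays interactive.
            Process process = new ProcessBuilder(cmd).inheritIO().start();
            System.out.println("Exit code: " + process.waitFor());
        }
    }
}
```

Keeping the command construction in its own method makes it easy to check the arguments before anything is started.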
Note
The implementation is based on the Qwen-chat series of models. For more information, please refer to: Qwen GitHub
How to use
Download the Qwen-chat model, edit octet.json to set the model file path, and change agent_mode to true to start the agent mode.
- Two plugins are currently implemented. They serve as examples that you can extend.
Plugin | Description |
---|---|
Datetime | A plugin that can query the current system time. |
API | A universal API calling plugin, based on which you can achieve access to services such as weather, text to image, and search. |
Plugin configuration file example: plugins.json
How to use
As with CLI interaction, set a custom AI character and launch the app with the commands below.
Once it is running, open http://YOUR_IP_ADDR:8152/ in your browser.
# Default URL: http://YOUR_IP_ADDR:8152/
cd <YOUR_PATH>/octet-chat-app
bash app_server.sh start YOUR_CHARACTER
Tip
It can be integrated into your own services, such as VS Code, apps, WeChat, etc.
How to call the API
API docs: http://127.0.0.1:8152/swagger-ui.html
curl --location 'http://127.0.0.1:8152/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "USER",
        "content": "Who are you?"
      }
    ],
    "user": "User",
    "stream": true
  }'
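The same request can also be sent from Java. The sketch below uses the JDK's built-in `java.net.http.HttpClient` (Java 11 or later) and mirrors the endpoint, header, and body of the curl call. The class and method names (`ChatCompletionsClient`, `buildBody`, `buildRequest`, `escapeJson`) are hypothetical, and because the exact framing of the streamed response is not described here, the example simply prints each line as it arrives.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.stream.Stream;

class ChatCompletionsClient {

    // Endpoint taken from the curl example; adjust host and port to your deployment.
    static final String ENDPOINT = "http://127.0.0.1:8152/v1/chat/completions";

    /** Escapes characters that must not appear raw inside a JSON string. */
    static String escapeJson(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the same JSON body as the curl example, with a single USER message. */
    static String buildBody(String content, String user, boolean stream) {
        return "{\"messages\":[{\"role\":\"USER\",\"content\":\"" + escapeJson(content)
                + "\"}],\"user\":\"" + escapeJson(user) + "\",\"stream\":" + stream + "}";
    }

    /** Creates a POST request with a JSON content type. */
    static HttpRequest buildRequest(String endpoint, String body) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = buildRequest(ENDPOINT, buildBody("Who are you?", "User", true));
        try {
            // ofLines() exposes the response body as a lazily read stream of lines.
            HttpResponse<Stream<String>> response =
                    client.send(request, HttpResponse.BodyHandlers.ofLines());
            try (Stream<String> lines = response.body()) {
                lines.forEach(System.out::println);
            }
        } catch (IOException e) {
            System.err.println("Could not reach the server: " + e.getMessage());
        }
    }
}
```

For anything beyond a quick test, a JSON library is a safer way to build the request body than string concatenation.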
The API returns data as a stream of chunks like the following:
{
  "id": "octetchat-98fhd2dvj7",
  "model": "Llama2-chat",
  "created": 1695614393810,
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "Hi"
      },
      "finish_reason": "NONE"
    }
  ]
}
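To show how a client might read such a chunk, here is a small sketch that pulls the `content` and `finish_reason` values out of one chunk. It is only an illustration: `StreamChunkReader` and its methods are hypothetical names, it uses regular expressions instead of a real JSON parser, it does not decode escape sequences, and it assumes each chunk has the shape shown above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StreamChunkReader {

    // Matches "content": "<text>", allowing escaped characters inside the string.
    private static final Pattern CONTENT =
            Pattern.compile("\"content\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");

    // Matches "finish_reason": "<value>".
    private static final Pattern FINISH =
            Pattern.compile("\"finish_reason\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the raw (still escaped) text of the first "content" field, if any. */
    static Optional<String> deltaContent(String chunkJson) {
        Matcher m = CONTENT.matcher(chunkJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Returns the value of the first "finish_reason" field, if any. */
    static Optional<String> finishReason(String chunkJson) {
        Matcher m = FINISH.matcher(chunkJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample chunk copied from the response example above.
        String chunk = "{\"id\":\"octetchat-98fhd2dvj7\",\"model\":\"Llama2-chat\","
                + "\"created\":1695614393810,\"choices\":[{\"index\":0,"
                + "\"delta\":{\"content\":\"Hi\"},\"finish_reason\":\"NONE\"}]}";

        // Append each chunk's content to rebuild the full reply.
        StringBuilder reply = new StringBuilder();
        deltaContent(chunk).ifPresent(reply::append);

        System.out.println(reply);                              // prints: Hi
        System.out.println(finishReason(chunk).orElse("n/a"));  // prints: NONE
    }
}
```

In a real client, a JSON library would be the more robust choice; the regular expressions here only keep the example free of dependencies.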
Development
Characters config
Important
- This project does not provide any models. Please obtain the model files yourself and comply with relevant agreements.
- Please do not use this project for illegal purposes, including but not limited to commercial use, profit-making use, or use that violates laws and regulations.
- Any legal liability arising from the use of this project shall be borne by the user, and this project shall not bear any legal liability.
- If you have any questions, please submit them as a GitHub issue.