📃 LangChain-Chatchat (formerly Langchain-ChatGLM)
An open-source, offline-deployable RAG and Agent application project based on large language models like ChatGLM and application frameworks like Langchain.
🤖️ A question-answering application based on local knowledge bases using the langchain concept. The goal is to create a friendly and offline-operable knowledge base Q&A solution that supports Chinese scenarios and open-source models.
💡 Inspired by GanymedeNil's project document.ai and AlexZhangji's ChatGLM-6B Pull Request, this project aims to build a local knowledge base Q&A application that works fully with open-source models. The latest version of the project uses FastChat to integrate models like Vicuna, Alpaca, LLaMA, Koala, and RWKV, leveraging the langchain framework to support API calls served via FastAPI or operation through a WebUI based on Streamlit.
✅ This project supports mainstream open-source LLMs, embedding models, and vector databases, allowing fully offline private deployment with **open-source** models. Additionally, the project supports OpenAI GPT API calls and will continue to expand access to various models and model APIs.
⛓️ The implementation principle of this project is as shown below: loading files -> reading text -> text segmentation -> text vectorization -> question vectorization -> matching the top-k text vectors most similar to the question vector -> adding the matched text as context to the prompt along with the question -> submitting to the LLM to generate an answer.
From the document processing perspective, the implementation process is as follows:
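As a rough illustration of this pipeline, here is a minimal sketch in langchain-style Python. The loader, splitter, embedding, and vector store names follow common `langchain`/`langchain_community` APIs and are illustrative only, not this project's exact code:

```python
# Minimal sketch of the pipeline: load -> split -> vectorize -> match top-k
# -> build prompt -> ask the LLM. Illustrative only; not this project's code.
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

docs = TextLoader("knowledge.txt").load()                      # load file, read text
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)                        # text segmentation

embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-large-zh-v1.5")
store = FAISS.from_documents(chunks, embeddings)               # text vectorization

question = "How do I deploy this project offline?"
top_k = store.similarity_search(question, k=3)                 # question vectorization + top-k match
context = "\n\n".join(doc.page_content for doc in top_k)
prompt = f"Answer the question using the context.\n\nContext:\n{context}\n\nQuestion: {question}"
# `prompt` is then submitted to any LLM client to generate the answer.
```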
🚩 This project does not involve fine-tuning or training processes but can utilize fine-tuning or training to optimize the project's performance.
🌐 The code used in the AutoDL image has been updated to version v0.3.0 of this project.
🐳 Docker images will be updated soon.
🧑‍💻 If you want to contribute to this project, please refer to the Developer Guide for more information on development and deployment.
Features | 0.2.x | 0.3.x |
---|---|---|
Model Integration | Local: fastchat<br>Online: XXXModelWorker | Local: model_provider, supports most mainstream model loading frameworks<br>Online: oneapi<br>All model integrations are compatible with the openai sdk |
Agent | ❌ Unstable | ✅ Optimized for ChatGLM3 and QWen, significantly enhanced Agent capabilities |
LLM Conversations | ✅ | ✅ |
Knowledge Base Conversations | ✅ | ✅ |
Search Engine Conversations | ✅ | ✅ |
File Conversations | ✅ Only vector search | ✅ Unified as File RAG feature, supports BM25+KNN and other retrieval methods (see the sketch below the table) |
Database Conversations | ❌ | ✅ |
ARXIV Document Conversations | ❌ | ✅ |
Wolfram Conversations | ❌ | ✅ |
Text-to-Image | ❌ | ✅ |
Local Knowledge Base Management | ✅ | ✅ |
WEBUI | ✅ | ✅ Better multi-session support, custom system prompts... |
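On the File RAG row above: hybrid retrieval combines a lexical score (BM25) with a dense vector score (KNN). Here is a minimal sketch under stated assumptions (it uses the `rank_bm25` package and a caller-supplied `embed` function; it is not the project's actual implementation):

```python
# Illustrative hybrid BM25 + KNN retrieval: blend normalized lexical and
# dense similarity scores, then return the top-k chunks. Not project code.
import numpy as np
from rank_bm25 import BM25Okapi

def hybrid_top_k(query, chunks, embed, k=3, alpha=0.5):
    """`embed` is any callable mapping text -> 1-D vector (assumed)."""
    bm25 = BM25Okapi([c.split() for c in chunks])
    lexical = np.array(bm25.get_scores(query.split()))          # BM25 scores
    vecs = np.array([embed(c) for c in chunks])
    q = np.asarray(embed(query))
    dense = vecs @ q / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q) + 1e-9)

    def norm(x):                                                # rescale to [0, 1]
        return (x - x.min()) / (np.ptp(x) + 1e-9)

    scores = alpha * norm(lexical) + (1 - alpha) * norm(dense)  # score fusion
    best = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in best]
```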
The core functionality of 0.3.x is implemented by the Agent, but users can also perform tool calls manually:
Operation Method | Function Implemented | Applicable Scenario |
---|---|---|
Select "Enable Agent", choose multiple tools | Automatic tool calls by LLM | Using models with Agent capabilities like |
ChatGLM3/Qwen or online APIs | ||
Select "Enable Agent", choose a single tool | LLM only parses tool parameters | Using models with general Agent |
capabilities, unable to choose tools well Want to manually select functions |
||
Do not select "Enable Agent", choose a single tool | Manually fill in parameters for tool calls without using Agent | |
function | Using models without Agent capabilities |
More features and updates can be experienced in the actual deployment.
This project already supports mainstream open-source large language models and embedding models, such as GLM-4-Chat and Qwen2-Instruct. Users need to start the model deployment framework and load the required models by modifying the configuration information. The local model deployment frameworks supported by this project are as follows:
Model Deployment Framework | Xinference | LocalAI | Ollama | FastChat |
---|---|---|---|---|
Aligned with OpenAI API | ✅ | ✅ | ✅ | ✅ |
Accelerated Inference Engine | GPTQ, GGML, vLLM, TensorRT | GPTQ, GGML, vLLM, TensorRT | GGUF, GGML | vLLM |
Model Types Supported | LLM, Embedding, Rerank, Text-to-Image, Vision, Audio | LLM, Embedding, Rerank, Text-to-Image, Vision, Audio | LLM, Text-to-Image, Vision | LLM, Vision |
Function Call | ✅ | ✅ | ✅ | / |
More Platform Support (CPU, Metal) | ✅ | ✅ | ✅ | ✅ |
Heterogeneous | ✅ | ✅ | / | / |
Cluster | ✅ | ✅ | / | / |
Documentation Link | Xinference Documentation | LocalAI Documentation | Ollama Documentation | FastChat Documentation |
Available Models | Xinference Supported Models | LocalAI Supported Models | Ollama Supported Models | FastChat Supported Models |
In addition to the above local model loading frameworks, the project also supports the One API framework for integrating online APIs, supporting commonly used online APIs such as OpenAI ChatGPT, Azure OpenAI API, Anthropic Claude, Zhipu Qingyan, and Baichuan.
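Since every integration is exposed through an OpenAI-compatible interface, any OpenAI SDK client can talk to the configured platform. A minimal sketch follows; the base URL, API key, and model name are placeholders for your own deployment:

```python
# Minimal OpenAI-SDK call against an OpenAI-compatible endpoint (for example
# a local Xinference server or a One API gateway). The URL, key, and model
# name below are placeholders, not values mandated by this project.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:9997/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="glm4-chat",
    messages=[{"role": "user", "content": "Hello, what can you do?"}],
)
print(resp.choices[0].message.content)
```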
Note
Regarding Xinference loading local models: Xinference built-in models are downloaded automatically. To load a locally downloaded model, start the Xinference service, then execute `streamlit run xinference_manager.py` in the tools/model_loaders directory of the project and set the local path for the specified model as prompted on the page.
💡 On the software side, this project supports Python 3.8-3.11 environments and has been tested on Windows, macOS, and Linux operating systems.
💻 On the hardware side, as version 0.3.0 has been modified to support integration with different model deployment frameworks, it can be used under various hardware conditions such as CPU, GPU, NPU, and MPS.
Starting from version 0.3.0, Langchain-Chatchat provides installation in the form of a Python library. Execute the following command for installation:
```shell
pip install langchain-chatchat -U
```
[!Note]
Since the model deployment framework Xinference requires additional Python dependencies when integrated with Langchain-Chatchat, it is recommended to use the following installation method if you want to use it with the Xinference framework:
pip install "langchain-chatchat[xinference]" -U
- Model Inference Framework and Load Models
Starting from version 0.3.0, Langchain-Chatchat no longer loads models directly from a local model path entered by the user. The model types involved include LLM, Embedding, Reranker, and the multi-modal models to be supported in the future. Instead, it supports integration with mainstream model inference frameworks such as Xinference, Ollama, LocalAI, FastChat, and One API.
Therefore, please ensure to run the model inference framework and load the required models before starting Langchain-Chatchat.
The following uses Xinference as an example. Please refer to the Xinference Document for framework deployment and model loading.
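As a hedged sketch of that flow: once the Xinference server is running (its default endpoint is http://127.0.0.1:9997), a model can be loaded through the Xinference Python client. Treat the call below as an assumption to verify against the Xinference documentation; depending on the Xinference version, additional parameters (engine, format, size) may be required:

```python
# Hedged sketch: load a chat model on a running Xinference server via its
# Python client. Parameter names follow the Xinference client API; check the
# Xinference documentation for the exact signature and model options.
from xinference.client import Client

client = Client("http://127.0.0.1:9997")   # default Xinference endpoint
model_uid = client.launch_model(
    model_name="glm4-chat",                # model to serve (placeholder)
    model_type="LLM",                      # LLM / embedding / rerank ...
)
print(f"Model launched with uid: {model_uid}")
```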
Warning
To avoid dependency conflicts, place Langchain-Chatchat and model deployment frameworks like Xinference in different Python virtual environments, such as conda, venv, virtualenv, etc.
Starting from version 0.3.0, Langchain-Chatchat no longer modifies the configuration through local files but uses command-line methods and will add configuration item modification pages in future versions.
The following introduces how to view and modify the configuration.
Enter the following command to view the optional configuration types:
```shell
chatchat-config --help
```
You will get the following response:
```text
Usage: chatchat-config [OPTIONS] COMMAND [ARGS]...

  Workspace configuration for the `chatchat-config` command

Options:
  --help  Show this message and exit.

Commands:
  basic   Basic configuration
  kb      Knowledge base configuration
  model   Model configuration
  server  Server configuration
```
You can choose the required configuration type from the commands above. For example, to view or modify the basic configuration, enter the following command to get help information:

```shell
chatchat-config basic --help
```
You will get the following response:
```text
Usage: chatchat-config basic [OPTIONS]

  Basic configuration

Options:
  --verbose [true|false]  Whether to enable verbose logging
  --data TEXT             Path for initial data storage. Note: the directory will be cleared and rebuilt
  --format TEXT           Log format
  --clear                 Clear the configuration
  --show                  Show the configuration
  --help                  Show this message and exit.
```
To view the current basic configuration, execute:

```shell
chatchat-config basic --show
```
If no configuration item modification is made, the default configuration is as follows:
```json
{
  "log_verbose": false,
  "CHATCHAT_ROOT": "/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat",
  "DATA_PATH": "/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data",
  "IMG_DIR": "/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/img",
  "NLTK_DATA_PATH": "/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/nltk_data",
  "LOG_FORMAT": "%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s",
  "LOG_PATH": "/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/logs",
  "MEDIA_PATH": "/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/media",
  "BASE_TEMP_DIR": "/root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/temp",
  "class_name": "ConfigBasic"
}
```
To modify the default LLM model in the model configuration, execute the following command to view the configuration item names:

```shell
chatchat-config model --help
```
You will get:
```text
Usage: chatchat-config model [OPTIONS]

  Model configuration

Options:
  --default_llm_model TEXT        Default LLM model
  --default_embedding_model TEXT  Default embedding model
  --agent_model TEXT              Agent model
  --history_len INTEGER           History length
  --max_tokens INTEGER            Maximum number of tokens
  --temperature FLOAT             Temperature
  --support_agent_models TEXT     Supported agent models
  --set_model_platforms TEXT      Model platform configuration as a JSON string
  --set_tool_config TEXT          Tool configuration as a JSON string
  --clear                         Clear the configuration
  --show                          Show the configuration
  --help                          Show this message and exit.
```
First, view the current model configuration parameters:

```shell
chatchat-config model --show
```
You will get:
```json
{
  "DEFAULT_LLM_MODEL": "glm4-chat",
  "DEFAULT_EMBEDDING_MODEL": "bge-large-zh-v1.5",
  "Agent_MODEL": null,
  "HISTORY_LEN": 3,
  "MAX_TOKENS": null,
  "TEMPERATURE": 0.7,
  ...
  "class_name": "ConfigModel"
}
```
To change the default LLM model to qwen2-instruct, execute:

```shell
chatchat-config model --default_llm_model qwen2-instruct
```
For more configuration modification help, refer to README.md
- Custom Model Integration Configuration
After viewing and modifying the project configuration items as described above, proceed to step 2. Model Inference Framework and Load Models to select a model inference framework and load models. Supported model inference frameworks include Xinference, Ollama, LocalAI, FastChat, and One API, with support for new Chinese open-source models such as GLM-4-Chat and Qwen2-Instruct.
If you already have an address with the capability of an OpenAI endpoint, you can configure it directly in MODEL_PLATFORMS as follows:

```shell
chatchat-config model --set_model_platforms TEXT
```

where TEXT is the model platform configuration as a JSON string.

- `platform_name` can be filled in arbitrarily; it just needs to be unique.
- `platform_type` may be used in the future for functional distinctions based on platform type, so it should match the `platform_name`.
- List the models deployed on the framework in the corresponding lists. Different frameworks can load models with the same name, and the project will automatically balance the load.
- Set up the model:

```shell
$ chatchat-config model --set_model_platforms "[{
    \"platform_name\": \"xinference\",
    \"platform_type\": \"xinference\",
    \"api_base_url\": \"http://127.0.0.1:9997/v1\",
    \"api_key\": \"EMPT\",
    \"api_concurrencies\": 5,
    \"llm_models\": [
        \"autodl-tmp-glm-4-9b-chat\"
    ],
    \"embed_models\": [
        \"bge-large-zh-v1.5\"
    ],
    \"image_models\": [],
    \"reranking_models\": [],
    \"speech2text_models\": [],
    \"tts_models\": []
}]"
```
Warning
Before initializing the knowledge base, ensure that the model inference framework and corresponding embedding model are running, and complete the model integration configuration as described in steps 3 and 4.
```shell
cd # Return to the original directory
chatchat-kb -r
```

Specify the text-embedding model for initialization (if needed):

```shell
cd # Return to the original directory
chatchat-kb -r --embed-model=text-embedding-3-small
```
Successful output will be:

```text
----------------------------------------------------------------------------------------------------
Knowledge base name:  samples
Knowledge base type:  faiss
Embedding model:      bge-large-zh-v1.5
Knowledge base path:  /root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/knowledge_base/samples
Total files:          47
Files ingested:       42
Knowledge entries:    740
Time used:            0:02:29.701002
----------------------------------------------------------------------------------------------------
Total time:           0:02:33.414425
```
The knowledge base path is the knowledge_base directory under the path pointed to by the DATA_PATH variable in step 3.2:

```shell
(chatchat) [root@VM-centos ~]# ls /root/anaconda3/envs/chatchat/lib/python3.11/site-packages/chatchat/data/knowledge_base/samples/vector_store
bge-large-zh-v1.5  text-embedding-3-small
```
This issue often occurs in newly created virtual environments and can be confirmed through the following check:

```python
from unstructured.partition.auto import partition
```

If this statement hangs and cannot be executed, run the following commands:

```shell
pip uninstall python-magic-bin
# check the version of the uninstalled package
pip install 'python-magic-bin=={version}'
```

Then follow the instructions in this section to recreate the knowledge base.
```shell
chatchat -a
```

If startup succeeds, the startup output shows the API and WebUI addresses (by default http://127.0.0.1:7861 and http://127.0.0.1:8501).
Warning
As the `DEFAULT_BIND_HOST` of the chatchat-config server configuration is set to `127.0.0.1` by default, the services cannot be accessed through other IPs. To modify it, refer to the following method:

Instructions

```shell
chatchat-config server --show
```
You will get:
```json
{
  "HTTPX_DEFAULT_TIMEOUT": 300.0,
  "OPEN_CROSS_DOMAIN": true,
  "DEFAULT_BIND_HOST": "127.0.0.1",
  "WEBUI_SERVER_PORT": 8501,
  "API_SERVER_PORT": 7861,
  "WEBUI_SERVER": {
    "host": "127.0.0.1",
    "port": 8501
  },
  "API_SERVER": {
    "host": "127.0.0.1",
    "port": 7861
  },
  "class_name": "ConfigServer"
}
```
To access via the machine's IP (such as on a Linux system), change the listening address to `0.0.0.0`:

```shell
chatchat-config server --default_bind_host=0.0.0.0
```
You will get:
```json
{
  "HTTPX_DEFAULT_TIMEOUT": 300.0,
  "OPEN_CROSS_DOMAIN": true,
  "DEFAULT_BIND_HOST": "0.0.0.0",
  "WEBUI_SERVER_PORT": 8501,
  "API_SERVER_PORT": 7861,
  "WEBUI_SERVER": {
    "host": "0.0.0.0",
    "port": 8501
  },
  "API_SERVER": {
    "host": "0.0.0.0",
    "port": 7861
  },
  "class_name": "ConfigServer"
}
```
- The structure of 0.3.x has changed significantly, so it is strongly recommended to redeploy according to the documentation. The following guide does not guarantee 100% compatibility or success. Remember to back up important data in advance!
- First, configure the operating environment according to the steps in Installation.
- Configure `DATA` and other options.
- Copy the knowledge_base directory of the 0.2.x project to the configured `DATA` directory.
The code of this project is released under the Apache-2.0 license.
- April 2023: `Langchain-ChatGLM 0.1.0` released, supporting local knowledge base question answering based on the ChatGLM-6B model.
- August 2023: `Langchain-ChatGLM` renamed to `Langchain-Chatchat`, releasing version `0.2.0`, which uses `fastchat` as the model loading solution and supports more models and databases.
- October 2023: `Langchain-Chatchat 0.2.5` released, launching Agent functionality; the open-source project won the third prize in the hackathon held by `Founder Park & Zhipu AI & Zilliz`.
- December 2023: the `Langchain-Chatchat` open-source project received more than 20K stars.
- June 2024: `Langchain-Chatchat 0.3.0` released, bringing a brand-new project architecture.
- 🔥 Let us look forward to the future story of Chatchat...
🎉 Langchain-Chatchat project WeChat exchange group: if you are also interested in this project, you are welcome to join the group chat and take part in the discussion.
🎉 Langchain-Chatchat project official public account: welcome to scan the code to follow.