Fix some documentation errors and extend the chat history length (for 32k models) (#1630)
* Update the agent prompt code (#1626)

* Fix some documentation errors and extend the chat history length (for 32k models) (#1629)

* Update the agent prompt code

* Modify as needed
zRzRzRzRzRzRzR committed Sep 29, 2023
1 parent eb6f5cf commit e88d926
Showing 4 changed files with 40 additions and 33 deletions.
23 changes: 9 additions & 14 deletions README.md
@@ -167,6 +167,9 @@ docker run -d --gpus all -p 80:8501 registry.cn-beijing.aliyuncs.com/chatchat/ch
- [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh)
- [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh)
- [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh)
- [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5)
- [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5)
- [BAAI/bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct)
- [sensenova/piccolo-base-zh](https://huggingface.co/sensenova/piccolo-base-zh)
- [sensenova/piccolo-large-zh](https://huggingface.co/sensenova/piccolo-large-zh)
@@ -274,28 +277,20 @@ $ git clone https://huggingface.co/moka-ai/m3e-base
Before launching the Web UI or command-line interaction, please check whether the model parameters in [configs/model_config.py](configs/model_config.py) and [configs/server_config.py](configs/server_config.py) meet your requirements:

- Please make sure the local storage path of each downloaded LLM model is written into the `local_model_path` attribute of the corresponding entry in `llm_model_dict`, e.g.:
```python
llm_model_dict = {
    "chatglm2-6b": {
        "local_model_path": "/Users/xxx/Downloads/chatglm2-6b",
        "api_base_url": "http://localhost:8888/v1",  # set to the "api_base_url" of the FastChat service
        "api_key": "EMPTY"
    },
}
```
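
For illustration, here is a minimal sketch of how such an entry might be consumed at startup. The `resolve_model` helper below is hypothetical, not project code; only `llm_model_dict` and its key names come from the example above:

```python
# Hypothetical helper, not part of the project: decide whether to load the
# model from a local checkpoint or reach it through the FastChat API.
llm_model_dict = {
    "chatglm2-6b": {
        "local_model_path": "/Users/xxx/Downloads/chatglm2-6b",
        "api_base_url": "http://localhost:8888/v1",
        "api_key": "EMPTY",
    },
}

def resolve_model(name: str) -> dict:
    cfg = llm_model_dict[name]
    if cfg.get("local_model_path"):
        # A configured local path takes precedence over the API endpoint.
        return {"mode": "local", "path": cfg["local_model_path"]}
    return {"mode": "api", "url": cfg["api_base_url"], "key": cfg["api_key"]}

print(resolve_model("chatglm2-6b"))  # {'mode': 'local', 'path': '/Users/xxx/Downloads/chatglm2-6b'}
```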

- Please make sure the local storage path of the downloaded Embedding model is written at the corresponding model's position in `embedding_model_dict`, e.g.:

```python
embedding_model_dict = {
    "m3e-base": "/Users/xxx/Downloads/m3e-base",
}
```
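
As a quick sanity check, the configured path can be loaded directly — a sketch assuming the `sentence-transformers` package and a locally downloaded `m3e-base`:

```python
# Sketch only: verify the configured local path loads as an embedding model.
from sentence_transformers import SentenceTransformer

embedding_model_dict = {
    "m3e-base": "/Users/xxx/Downloads/m3e-base",
}

model = SentenceTransformer(embedding_model_dict["m3e-base"])
vectors = model.encode(["这是一个测试句子"])
print(vectors.shape)  # (1, 768) for m3e-base
```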

- Please make sure the local tokenizer path has been filled in, e.g.:

```python
text_splitter_dict = {
    "ChineseRecursiveTextSplitter": {
        "source": "huggingface",  ## choose "tiktoken" to use OpenAI's method; if left empty, splitting defaults to character length
        "tokenizer_name_or_path": "",  ## leave blank to use the LLM's own tokenizer
    }
}
```
@@ -358,7 +353,7 @@ $ python startup.py --all-webui --model-name Qwen-7B-Chat

```python
gpus=None,
num_gpus=1,
max_gpu_memory="20GiB"
```
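
For example, a two-GPU variant of these settings might look as follows; the values are illustrative and depend on your hardware:

```python
gpus="0,1",               # CUDA device ids visible to the model worker
num_gpus=2,               # number of GPUs to spread the model across
max_gpu_memory="20GiB"    # memory cap per GPU
```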

39 changes: 22 additions & 17 deletions README_en.md
@@ -160,6 +160,8 @@ Following models are tested by developers with Embedding class of [HuggingFace](
- [moka-ai/m3e-large](https://huggingface.co/moka-ai/m3e-large)
- [BAAI/bge-small-zh](https://huggingface.co/BAAI/bge-small-zh)
- [BAAI/bge-base-zh](https://huggingface.co/BAAI/bge-base-zh)
- [BAAI/bge-base-zh-v1.5](https://huggingface.co/BAAI/bge-base-zh-v1.5)
- [BAAI/bge-large-zh-v1.5](https://huggingface.co/BAAI/bge-large-zh-v1.5)
- [BAAI/bge-large-zh](https://huggingface.co/BAAI/bge-large-zh)
- [BAAI/bge-large-zh-noinstruct](https://huggingface.co/BAAI/bge-large-zh-noinstruct)
- [sensenova/piccolo-base-zh](https://huggingface.co/sensenova/piccolo-base-zh)
@@ -228,30 +230,33 @@ $ git clone https://huggingface.co/moka-ai/m3e-base
```

### 3. Setting Configuration
Copy the model-related parameter configuration template file [configs/model_config.py.example](configs/model_config.py.example) to the `./configs` directory under the project root and rename it to `model_config.py`.

Copy the service-related parameter configuration template file [configs/server_config.py.example](configs/server_config.py.example) to the `./configs` directory under the project root and rename it to `server_config.py`.

Before launching the Web UI or command-line interaction, please check whether the model parameters in `configs/model_config.py` and `configs/server_config.py` meet your requirements.
* If you choose to use OpenAI's Embedding model, please write the model's `key` into `embedding_model_dict`. To use this model, you need to be able to access the official OpenAI API or set up a proxy.
* Please confirm that the paths to the local LLM model and Embedding model have been written into `llm_model_dict` and `embedding_model_dict` of `configs/model_config.py`; here is an example:

```python
llm_model_dict={
"chatglm2-6b": {
"local_model_path": "/Users/xxx/Downloads/chatglm2-6b",
"api_base_url": "http://localhost:8888/v1", # "name"修改为 FastChat 服务中的"api_base_url"
"api_key": "EMPTY"
},
}
```
"m3e-base":"/Users/xxx/Downloads/m3e-base", ``` Please make sure that the local storage path of the downloaded Embedding model is written in the location of the corresponding model, e.g.
```

```python
embedding_model_dict = {
    "m3e-base": "/Users/xxx/Downloads/m3e-base",
}
```

- Please make sure the local tokenizer path has been filled in, e.g.:

```python
text_splitter_dict = {
    "ChineseRecursiveTextSplitter": {
        "source": "huggingface",  ## choose "tiktoken" to use OpenAI's method; if left empty, splitting defaults to character length
        "tokenizer_name_or_path": "",  ## leave blank to use the LLM's own tokenizer
    }
}
```

### 4. Knowledge Base Migration
2 changes: 2 additions & 0 deletions configs/model_config.py.example
@@ -30,6 +30,8 @@ MODEL_PATH = {
"bge-base-zh": "BAAI/bge-base-zh",
"bge-large-zh": "BAAI/bge-large-zh",
"bge-large-zh-noinstruct": "BAAI/bge-large-zh-noinstruct",
"bge-base-zh-v1.5": "BAAI/bge-base-zh-v1.5",
"bge-large-zh-v1.5": "BAAI/bge-large-zh-v1.5",
"piccolo-base-zh": "sensenova/piccolo-base-zh",
"piccolo-large-zh": "sensenova/piccolo-large-zh",
"text-embedding-ada-002": "your OPENAI_API_KEY",
9 changes: 7 additions & 2 deletions webui_pages/dialogue/dialogue.py
@@ -101,8 +101,10 @@ def llm_model_format_func(x):
st.success(msg)
st.session_state["prev_llm_model"] = llm_model

temperature = st.slider("Temperature:", 0.0, 1.0, TEMPERATURE, 0.01)

## some models can handle more than 10 rounds of dialogue
history_len = st.number_input("历史对话轮数:", 0, 20, HISTORY_LEN)

def on_kb_change():
st.toast(f"已加载知识库: {st.session_state.selected_kb}")
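
The jump from a maximum of 10 to 20 history rounds is a token-budget question; a back-of-the-envelope sketch, where the per-message figure is an assumption rather than a measured value:

```python
# Rough arithmetic, assumptions only: why a 32k-context model can afford
# 20 rounds of dialogue history.
avg_tokens_per_message = 300  # assumed average length of one message
rounds = 20
history_tokens = rounds * 2 * avg_tokens_per_message  # user + assistant per round
print(history_tokens)  # 12000 tokens, well within a 32k context window
```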
@@ -117,7 +119,10 @@ def on_kb_change():
key="selected_kb",
)
kb_top_k = st.number_input("匹配知识条数:", 1, 20, VECTOR_SEARCH_TOP_K)

## BGE model scores can exceed 1
score_threshold = st.slider("知识匹配分数阈值:", 0.0, 1.0, float(SCORE_THRESHOLD), 0.01)

# chunk_content = st.checkbox("关联上下文", False, disabled=True)
# chunk_size = st.slider("关联长度:", 0, 500, 250, disabled=True)
elif dialogue_mode == "搜索引擎问答":
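On the score-threshold change: since BGE-style scores can exceed 1, a slider capped at 1.0 only behaves sensibly if out-of-range scores are handled first. A naive illustrative sketch, not the project's actual logic:

```python
# Illustrative only: clamp an out-of-range similarity score before comparing
# it against a slider threshold in [0, 1].
def passes_threshold(score: float, threshold: float) -> bool:
    clamped = min(max(score, 0.0), 1.0)  # naive clamp; a real pipeline might rescale instead
    return clamped >= threshold

print(passes_threshold(1.07, 0.95))  # True: 1.07 clamps to 1.0
```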
