Merged
19 changes: 10 additions & 9 deletions README.md

Large diffs are not rendered by default.

19 changes: 10 additions & 9 deletions README_CN.md

Large diffs are not rendered by default.

18 changes: 9 additions & 9 deletions docs/source/LLM/LLM推理文档.md
@@ -413,21 +413,21 @@ CUDA_VISIBLE_DEVICES=0 swift app-ui --model_type qwen-7b-chat
 import os
 os.environ['CUDA_VISIBLE_DEVICES'] = '0'

-from swift.llm import InferArguments, ModelType, app_ui_main
+from swift.llm import AppUIArguments, ModelType, app_ui_main

-infer_args = InferArguments(model_type=ModelType.qwen_7b_chat)
-app_ui_main(infer_args)
+app_ui_args = AppUIArguments(model_type=ModelType.qwen_7b_chat)
+app_ui_main(app_ui_args)
 ```

 Using bnb quantization:
 ```python
 import os
 os.environ['CUDA_VISIBLE_DEVICES'] = '0'

-from swift.llm import InferArguments, ModelType, app_ui_main
+from swift.llm import AppUIArguments, ModelType, app_ui_main

-infer_args = InferArguments(model_type=ModelType.qwen_7b_chat, quantization_bit=4)
-app_ui_main(infer_args)
+app_ui_args = AppUIArguments(model_type=ModelType.qwen_7b_chat, quantization_bit=4)
+app_ui_main(app_ui_args)
 ```

 ### qwen-7b
@@ -441,10 +441,10 @@ CUDA_VISIBLE_DEVICES=0 swift app-ui --model_type qwen-7b
 import os
 os.environ['CUDA_VISIBLE_DEVICES'] = '0'

-from swift.llm import InferArguments, ModelType, app_ui_main
+from swift.llm import AppUIArguments, ModelType, app_ui_main

-infer_args = InferArguments(model_type=ModelType.qwen_7b)
-app_ui_main(infer_args)
+app_ui_args = AppUIArguments(model_type=ModelType.qwen_7b)
+app_ui_main(app_ui_args)
 ```

 ### Fine-tuned model
35 changes: 23 additions & 12 deletions docs/source/LLM/命令行参数.md
@@ -1,10 +1,12 @@
 # Command-line arguments
 ## Table of contents
-- [sft command-line arguments](#sft-命令行参数)
-- [merge-lora infer app-ui command-line arguments](#merge-lora-infer-app-ui-命令行参数)
-- [deploy command-line arguments](#deploy-命令行参数)
+- [SFT arguments](#SFT-参数)
+- [DPO arguments](#DPO-参数)
+- [merge-lora infer arguments](#merge-lora-infer-参数)
+- [app-ui arguments](#app-ui-参数)
+- [deploy arguments](#deploy-参数)

-## sft command-line arguments
+## SFT arguments
 - `--model_type`: The model type to use; defaults to `None`. If `model_id_or_path` is not specified either, an exception is raised. If `model_id_or_path` is specified, `model_type` is inferred from `model_id_or_path` and `MODEL_MAPPING`. `model_type` and `model_id_or_path` cannot both be specified. The available values are listed by `MODEL_MAPPING.keys()`.
 - `--model_id_or_path`: The `model_id` of the model on the ModelScope Hub, case-insensitive; defaults to `None`. If `--model_id_or_path` is not registered, an exception is raised. You can select the model either through `model_type` or through `model_id_or_path`.
 - `--model_revision`: The revision of the corresponding `model_id` on the ModelScope Hub; defaults to `None`. When `model_revision` is `None`, the revision registered in `MODEL_MAPPING` is used; otherwise the `model_revision` passed on the command line takes precedence.
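
The resolution rules for `--model_type`, `--model_id_or_path`, and `--model_revision` described in this hunk can be sketched as follows. This is a minimal illustration, not swift's implementation: `resolve_model` is a hypothetical helper, the two-entry `MODEL_MAPPING` is a stand-in for the real registry (model ids taken from the supported-models table), and the `"master"` revisions are placeholders.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for swift's MODEL_MAPPING. The model ids come from the
# supported-models table; the "revision" values are placeholders.
MODEL_MAPPING = {
    "deepseek-7b-chat": {
        "model_id_or_path": "deepseek-ai/deepseek-llm-7b-chat",
        "revision": "master",
    },
    "mistral-7b": {
        "model_id_or_path": "AI-ModelScope/Mistral-7B-v0.1",
        "revision": "master",
    },
}


def resolve_model(
    model_type: Optional[str] = None,
    model_id_or_path: Optional[str] = None,
    model_revision: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (model_type, revision) following the documented rules."""
    if model_type is not None and model_id_or_path is not None:
        raise ValueError("model_type and model_id_or_path cannot both be specified")
    if model_type is None and model_id_or_path is None:
        raise ValueError("specify either model_type or model_id_or_path")

    if model_type is None:
        # The model_id lookup is case-insensitive.
        wanted = model_id_or_path.lower()
        for name, info in MODEL_MAPPING.items():
            if info["model_id_or_path"].lower() == wanted:
                model_type = name
                break
        else:
            raise ValueError(f"{model_id_or_path!r} is not registered")
    elif model_type not in MODEL_MAPPING:
        raise ValueError(f"unknown model_type {model_type!r}")

    # An explicit revision wins; otherwise fall back to the registered one.
    if model_revision is None:
        model_revision = MODEL_MAPPING[model_type]["revision"]
    return model_type, model_revision
```

For instance, `resolve_model(model_id_or_path="deepseek-ai/deepseek-llm-7b-chat")` resolves to the `deepseek-7b-chat` entry, while passing both `model_type` and `model_id_or_path` raises a `ValueError`.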
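@@ -92,15 +94,15 @@
 - `--repetition_penalty`: Defaults to `1.05`. This argument only takes effect when `predict_with_generate` is set to True.
 - `--num_beams`: Defaults to `1`. This argument only takes effect when `predict_with_generate` is set to True.

-## DPO arguments
+## DPO arguments

-DPO arguments inherit the SFT arguments above, with the following additions:
+dpo arguments inherit the sft arguments, with the following additions:

-- `--ref_model_type` Reference model type; the available `model_type` values are listed by `MODEL_MAPPING.keys()`.
-- `--max_prompt_length` Maximum prompt length. This argument is passed to DPOTrainer so that the prompt length does not exceed this value; default 1024.
+- `--ref_model_type` Type of the reference model; the available `model_type` values are listed by `MODEL_MAPPING.keys()`.
+- `--max_prompt_length` Maximum prompt length. This argument is passed to DPOTrainer so that the prompt length does not exceed this value; default `1024`.


-## merge-lora infer app-ui command-line arguments
+## merge-lora infer arguments
 - `--model_type`: Defaults to `None`; see the `sft.sh command-line arguments` for details.
 - `--model_id_or_path`: Defaults to `None`; see the `sft.sh command-line arguments` for details. Specifying the model via `model_type` is recommended.
 - `--model_revision`: Defaults to `None`; see the `sft.sh command-line arguments` for details. This argument has no effect if `model_id_or_path` is None or a local model directory.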
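@@ -142,14 +144,23 @@ DPO arguments inherit the SFT arguments above, with the following additions:
 - `--save_safetensors`: Whether to save as a `safetensors` file or a `bin` file. Defaults to `True`.
 - `--overwrite_generation_config`: Whether to save the generation_config used for evaluation as a `generation_config.json` file; defaults to `None`. If `ckpt_dir` is specified, it is set to `True`; otherwise it is set to `False`. The generation_config file saved during training will be overwritten.
 - `--verbose`: If set to False, inference runs with a tqdm-style progress bar. If set to True, the query, response, and label of each inference are printed. Defaults to `None`, which selects automatically: False when `len(val_dataset) >= 100`, True otherwise. This argument only takes effect when evaluating on a dataset.
-- `--share`: Passed to gradio's `demo.queue().launch(...)` function. This argument only takes effect when using `app-ui`.
 - `--gpu_memory_utilization`: An argument for initializing the vllm engine's `EngineArgs`; defaults to `0.9`. This argument only takes effect when using vllm.
 - `--tensor_parallel_size`: An argument for initializing the vllm engine's `EngineArgs`; defaults to `1`. This argument only takes effect when using vllm.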
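
The `None` defaults of `--overwrite_generation_config` and `--verbose` are resolved automatically, as the descriptions above state. A minimal sketch of that resolution, assuming nothing beyond the documented rules; the function names are hypothetical and are not swift's API:

```python
from typing import Optional, Sized


def resolve_verbose(verbose: Optional[bool], val_dataset: Sized) -> bool:
    """Auto-select verbose: False for 100 or more samples, True otherwise."""
    if verbose is not None:
        return verbose  # an explicit value is kept as-is
    return len(val_dataset) < 100


def resolve_overwrite_generation_config(
    overwrite: Optional[bool], ckpt_dir: Optional[str]
) -> bool:
    """Auto-select: True when a ckpt_dir is given, False otherwise."""
    if overwrite is not None:
        return overwrite
    return ckpt_dir is not None
```

With a 100-sample validation set and `verbose=None`, the sketch picks the tqdm-style mode (`False`); with 99 samples it prints each query, response, and label (`True`).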


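-## deploy command-line arguments
+## app-ui arguments
+
+app-ui arguments inherit the infer arguments, with the following additions:
+
+- `server_name`: Defaults to `'127.0.0.1'`. Passed to gradio's `demo.queue().launch(...)` function.
+- `server_port`: Defaults to `7860`. Passed to gradio's `demo.queue().launch(...)` function.
+- `share`: Defaults to `False`. Passed to gradio's `demo.queue().launch(...)` function.
+
+## deploy arguments
+
+deploy arguments inherit the infer arguments, with the following additions:
+
 - `--host`: Defaults to `'127.0.0.1'`.
 - `--port`: Defaults to `8000`.
 - `--ssl_keyfile`: Defaults to `None`.
 - `--ssl_certfile`: Defaults to `None`.
-- Other arguments are inherited from the infer command-line arguments.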
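
The inheritance described in the app-ui and deploy sections can be pictured with plain dataclasses. This is a sketch under stated assumptions: `InferArgs`, `AppUIArgs`, and `DeployArgs` are hypothetical stand-ins rather than swift's `InferArguments`, `AppUIArguments`, or its deploy argument class, and only the fields and defaults listed in this diff are modeled.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class InferArgs:
    """Hypothetical stand-in for the infer arguments (two fields only)."""
    model_type: Optional[str] = None
    ckpt_dir: Optional[str] = None


@dataclass
class AppUIArgs(InferArgs):
    """Infer arguments plus the three gradio launch options."""
    server_name: str = "127.0.0.1"
    server_port: int = 7860
    share: bool = False

    def launch_kwargs(self) -> Dict[str, Any]:
        # Keyword arguments intended for gradio's demo.queue().launch(...).
        return {
            "server_name": self.server_name,
            "server_port": self.server_port,
            "share": self.share,
        }


@dataclass
class DeployArgs(InferArgs):
    """Infer arguments plus the server options listed for deploy."""
    host: str = "127.0.0.1"
    port: int = 8000
    ssl_keyfile: Optional[str] = None
    ssl_certfile: Optional[str] = None
```

Because both subclasses extend the same base, anything that accepts infer arguments keeps working with either one, which is why the examples in this pull request only need to swap `InferArguments` for `AppUIArguments`.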
19 changes: 12 additions & 7 deletions docs/source/LLM/支持的模型和数据集.md
@@ -55,6 +55,8 @@
|yi-34b-chat|[01ai/Yi-34B-Chat](https://modelscope.cn/models/01ai/Yi-34B-Chat/summary)|q_proj, k_proj, v_proj|yi|✔|✔||
|deepseek-7b|[deepseek-ai/deepseek-llm-7b-base](https://modelscope.cn/models/deepseek-ai/deepseek-llm-7b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔||
|deepseek-7b-chat|[deepseek-ai/deepseek-llm-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-llm-7b-chat/summary)|q_proj, k_proj, v_proj|deepseek|✔|✔||
+|deepseek-moe-16b|[deepseek-ai/deepseek-moe-16b-base](https://modelscope.cn/models/deepseek-ai/deepseek-moe-16b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✘||
+|deepseek-moe-16b-chat|[deepseek-ai/deepseek-moe-16b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-moe-16b-chat/summary)|q_proj, k_proj, v_proj|deepseek|✔|✘||
|deepseek-67b|[deepseek-ai/deepseek-llm-67b-base](https://modelscope.cn/models/deepseek-ai/deepseek-llm-67b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔||
|deepseek-67b-chat|[deepseek-ai/deepseek-llm-67b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-llm-67b-chat/summary)|q_proj, k_proj, v_proj|deepseek|✔|✔||
|openbuddy-llama2-13b-chat|[OpenBuddy/openbuddy-llama2-13b-v8.1-fp16](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-13b-v8.1-fp16/summary)|q_proj, k_proj, v_proj|openbuddy|✔|✔||
@@ -64,10 +66,10 @@
|openbuddy-zephyr-7b-chat|[OpenBuddy/openbuddy-zephyr-7b-v14.1](https://modelscope.cn/models/OpenBuddy/openbuddy-zephyr-7b-v14.1/summary)|q_proj, k_proj, v_proj|openbuddy|✔|✔|transformers>=4.34|
|openbuddy-deepseek-67b-chat|[OpenBuddy/openbuddy-deepseek-67b-v15.2](https://modelscope.cn/models/OpenBuddy/openbuddy-deepseek-67b-v15.2/summary)|q_proj, k_proj, v_proj|openbuddy|✔|✔||
|mistral-7b|[AI-ModelScope/Mistral-7B-v0.1](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-v0.1/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔|transformers>=4.34|
-|mistral-7b-chat|[AI-ModelScope/Mistral-7B-Instruct-v0.1](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.1/summary)|q_proj, k_proj, v_proj|llama|✔|✔|transformers>=4.34|
-|mistral-7b-chat-v2|[AI-ModelScope/Mistral-7B-Instruct-v0.2](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.2/summary)|q_proj, k_proj, v_proj|llama|✔|✔|transformers>=4.34|
-|mixtral-7b-moe|[AI-ModelScope/Mixtral-8x7B-v0.1](https://modelscope.cn/models/AI-ModelScope/Mixtral-8x7B-v0.1/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔|transformers>=4.36|
-|mixtral-7b-moe-chat|[AI-ModelScope/Mixtral-8x7B-Instruct-v0.1](https://modelscope.cn/models/AI-ModelScope/Mixtral-8x7B-Instruct-v0.1/summary)|q_proj, k_proj, v_proj|llama|✔|✔|transformers>=4.36|
+|mistral-7b-instruct|[AI-ModelScope/Mistral-7B-Instruct-v0.1](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.1/summary)|q_proj, k_proj, v_proj|llama|✔|✔|transformers>=4.34|
+|mistral-7b-instruct-v2|[AI-ModelScope/Mistral-7B-Instruct-v0.2](https://modelscope.cn/models/AI-ModelScope/Mistral-7B-Instruct-v0.2/summary)|q_proj, k_proj, v_proj|llama|✔|✔|transformers>=4.34|
+|mixtral-moe-7b|[AI-ModelScope/Mixtral-8x7B-v0.1](https://modelscope.cn/models/AI-ModelScope/Mixtral-8x7B-v0.1/summary)|q_proj, k_proj, v_proj|default-generation-bos|✔|✔|transformers>=4.36|
+|mixtral-moe-7b-instruct|[AI-ModelScope/Mixtral-8x7B-Instruct-v0.1](https://modelscope.cn/models/AI-ModelScope/Mixtral-8x7B-Instruct-v0.1/summary)|q_proj, k_proj, v_proj|llama|✔|✔|transformers>=4.36|
|baichuan-7b|[baichuan-inc/baichuan-7B](https://modelscope.cn/models/baichuan-inc/baichuan-7B/summary)|W_pack|default-generation|&#x2718;|&#x2714;|transformers<4.34|
|baichuan-13b|[baichuan-inc/Baichuan-13B-Base](https://modelscope.cn/models/baichuan-inc/Baichuan-13B-Base/summary)|W_pack|default-generation|&#x2718;|&#x2714;|transformers<4.34|
|baichuan-13b-chat|[baichuan-inc/Baichuan-13B-Chat](https://modelscope.cn/models/baichuan-inc/Baichuan-13B-Chat/summary)|W_pack|baichuan|&#x2718;|&#x2714;|transformers<4.34|
@@ -104,11 +106,11 @@
|tongyi-finance-14b-chat-int4|[TongyiFinance/Tongyi-Finance-14B-Chat-Int4](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat-Int4/summary)|c_attn|qwen|&#x2714;|&#x2718;|auto_gptq>=0.5|
|codefuse-codellama-34b-chat|[codefuse-ai/CodeFuse-CodeLlama-34B](https://modelscope.cn/models/codefuse-ai/CodeFuse-CodeLlama-34B/summary)|q_proj, k_proj, v_proj|codefuse-codellama|&#x2714;|&#x2714;||
|deepseek-coder-1_3b|[deepseek-ai/deepseek-coder-1.3b-base](https://modelscope.cn/models/deepseek-ai/deepseek-coder-1.3b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|&#x2714;|&#x2714;||
-|deepseek-coder-1_3b-chat|[deepseek-ai/deepseek-coder-1.3b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-1.3b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|&#x2714;|&#x2714;||
+|deepseek-coder-1_3b-instruct|[deepseek-ai/deepseek-coder-1.3b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-1.3b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|&#x2714;|&#x2714;||
|deepseek-coder-6_7b|[deepseek-ai/deepseek-coder-6.7b-base](https://modelscope.cn/models/deepseek-ai/deepseek-coder-6.7b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|&#x2714;|&#x2714;||
-|deepseek-coder-6_7b-chat|[deepseek-ai/deepseek-coder-6.7b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-6.7b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|&#x2714;|&#x2714;||
+|deepseek-coder-6_7b-instruct|[deepseek-ai/deepseek-coder-6.7b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-6.7b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|&#x2714;|&#x2714;||
|deepseek-coder-33b|[deepseek-ai/deepseek-coder-33b-base](https://modelscope.cn/models/deepseek-ai/deepseek-coder-33b-base/summary)|q_proj, k_proj, v_proj|default-generation-bos|&#x2714;|&#x2714;||
-|deepseek-coder-33b-chat|[deepseek-ai/deepseek-coder-33b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-33b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|&#x2714;|&#x2714;||
+|deepseek-coder-33b-instruct|[deepseek-ai/deepseek-coder-33b-instruct](https://modelscope.cn/models/deepseek-ai/deepseek-coder-33b-instruct/summary)|q_proj, k_proj, v_proj|deepseek-coder|&#x2714;|&#x2714;||
|phi2-3b|[AI-ModelScope/phi-2](https://modelscope.cn/models/AI-ModelScope/phi-2/summary)|Wqkv|default-generation|&#x2714;|&#x2714;||
|cogagent-chat|[ZhipuAI/cogagent-chat](https://modelscope.cn/models/ZhipuAI/cogagent-chat/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent|&#x2718;|&#x2718;||
|cogagent-vqa|[ZhipuAI/cogagent-vqa](https://modelscope.cn/models/ZhipuAI/cogagent-vqa/summary)|vision_expert_query_key_value, vision_expert_dense, language_expert_query_key_value, language_expert_dense, query, key_value, dense|cogagent|&#x2718;|&#x2718;||
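
This file's diff renames several `model_type` values while leaving the underlying ModelScope model ids unchanged. For scripts that still use the old names, the renames can be collected into a lookup. A minimal sketch; `RENAMED_MODEL_TYPES` and `migrate_model_type` are hypothetical names and are not part of swift:

```python
# Old model_type -> new model_type, as listed in the table rows of this diff.
RENAMED_MODEL_TYPES = {
    "mistral-7b-chat": "mistral-7b-instruct",
    "mistral-7b-chat-v2": "mistral-7b-instruct-v2",
    "mixtral-7b-moe": "mixtral-moe-7b",
    "mixtral-7b-moe-chat": "mixtral-moe-7b-instruct",
    "deepseek-coder-1_3b-chat": "deepseek-coder-1_3b-instruct",
    "deepseek-coder-6_7b-chat": "deepseek-coder-6_7b-instruct",
    "deepseek-coder-33b-chat": "deepseek-coder-33b-instruct",
}


def migrate_model_type(model_type: str) -> str:
    """Return the new name for a renamed model_type, else the input unchanged."""
    return RENAMED_MODEL_TYPES.get(model_type, model_type)
```

Names that were not renamed, such as `deepseek-7b-chat`, pass through unchanged.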
@@ -172,5 +174,8 @@
|ner-jave-zh|[damo/zh_ner-JAVE](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)|1266|0|118.3±45.5, min=44, max=223|chat, ner|
|coco-en|[modelscope/coco_2014_caption](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)|414113|40504|298.8±2.8, min=294, max=351|chat, multi-modal, vision|
|🔥coco-mini-en|[modelscope/coco_2014_caption](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)|20000|200|298.8±2.8, min=294, max=339|chat, multi-modal, vision|
+|capcha-images|[AI-ModelScope/captcha-images](https://modelscope.cn/datasets/AI-ModelScope/captcha-images/summary)|6000|2000|29.0±0.0, min=29, max=29|chat, multi-modal, vision|
|aishell1-zh|[speech_asr/speech_asr_aishell1_trainsets](https://modelscope.cn/datasets/speech_asr/speech_asr_aishell1_trainsets/summary)|134424|7176|152.2±36.8, min=63, max=419|chat, multi-modal, audio|
|🔥aishell1-mini-zh|[speech_asr/speech_asr_aishell1_trainsets](https://modelscope.cn/datasets/speech_asr/speech_asr_aishell1_trainsets/summary)|14326|200|152.0±35.5, min=74, max=359|chat, multi-modal, audio|
+|stack-exchange-paired|[AI-ModelScope/stack-exchange-paired](https://modelscope.cn/datasets/AI-ModelScope/stack-exchange-paired/summary)|4483004|0|534.5±594.6, min=31, max=56588|hfrl, dpo, pairwise|
+|hh-rlhf|[AI-ModelScope/hh-rlhf](https://modelscope.cn/datasets/AI-ModelScope/hh-rlhf/summary)|42537|2312|163.4±117.7, min=27, max=964|hfrl, dpo, pairwise|
8 changes: 4 additions & 4 deletions docs/source/LLM/自我认知微调最佳实践.md
@@ -260,14 +260,14 @@ CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir 'qwen-7b-chat/vx-xxx/checkpoint-xx
 import os
 os.environ['CUDA_VISIBLE_DEVICES'] = '0'

-from swift.llm import InferArguments, merge_lora_main, app_ui_main
+from swift.llm import AppUIArguments, merge_lora_main, app_ui_main

 best_model_checkpoint = 'qwen-7b-chat/vx-xxx/checkpoint-xxx'
-infer_args = InferArguments(
+app_ui_args = AppUIArguments(
     ckpt_dir=best_model_checkpoint,
     eval_human=True)
-# merge_lora_main(infer_args)
-result = app_ui_main(infer_args)
+# merge_lora_main(app_ui_args)
+result = app_ui_main(app_ui_args)
 ```

 Using the CLI:
8 changes: 4 additions & 4 deletions examples/pytorch/llm/app.py
@@ -3,14 +3,14 @@
 # os.environ['CUDA_VISIBLE_DEVICES'] = '0'
 import custom

-from swift.llm import InferArguments, ModelType, app_ui_main
+from swift.llm import AppUIArguments, ModelType, app_ui_main

 if __name__ == '__main__':
     # Please refer to the `infer.sh` for setting the parameters.
     # text-generation
-    # args = InferArguments(model_type=ModelType.chatglm3_6b_base)
+    # args = AppUIArguments(model_type=ModelType.chatglm3_6b_base)
     # or chat
-    args = InferArguments(model_type=ModelType.qwen_7b_chat_int4)
+    args = AppUIArguments(model_type=ModelType.qwen_7b_chat_int4)
     # or load from ckpt dir
-    # args = InferArguments(ckpt_dir='xxx/vx_xxx/checkpoint-xxx')
+    # args = AppUIArguments(ckpt_dir='xxx/vx_xxx/checkpoint-xxx')
     app_ui_main(args)
12 changes: 12 additions & 0 deletions examples/pytorch/llm/scripts/deepseek_moe_16b_chat/lora/infer.sh
@@ -0,0 +1,12 @@
+# Experimental environment: A100
+CUDA_VISIBLE_DEVICES=0 \
+swift infer \
+    --ckpt_dir "output/deepseek-moe-16b-chat/vx_xxx/checkpoint-xxx" \
+    --load_dataset_config true \
+    --max_length 4096 \
+    --use_flash_attn true \
+    --max_new_tokens 2048 \
+    --temperature 0.1 \
+    --top_p 0.7 \
+    --repetition_penalty 1.05 \
+    --do_sample true \
12 changes: 12 additions & 0 deletions examples/pytorch/llm/scripts/deepseek_moe_16b_chat/lora/sft.sh
@@ -0,0 +1,12 @@
+# Experimental environment: A100
+# 52GB GPU memory
+CUDA_VISIBLE_DEVICES=0 \
+swift sft \
+    --model_type deepseek-moe-16b-chat \
+    --dataset damo-agent-mini-zh \
+    --train_dataset_sample 20000 \
+    --max_length 4096 \
+    --gradient_checkpointing true \
+    --eval_steps 100 \
+    --use_flash_attn true \
+    --output_dir output \
@@ -3,7 +3,7 @@
 PYTHONPATH=../../.. \
 CUDA_VISIBLE_DEVICES=0 \
 python llm_infer.py \
-    --ckpt_dir "output/mistral-7b-chat/vx_xxx/checkpoint-xxx" \
+    --ckpt_dir "output/mistral-7b-instruct/vx_xxx/checkpoint-xxx" \
     --load_dataset_config true \
     --max_length 4096 \
     --max_new_tokens 2048 \
@@ -37,7 +37,7 @@ torchrun \
     --save_total_limit 2 \
     --logging_steps 10 \
     --push_to_hub false \
-    --hub_model_id mistral-7b-chat-lora \
+    --hub_model_id mistral-7b-instruct-lora \
     --hub_private_repo true \
     --hub_token 'your-sdk-token' \
     --deepspeed_config_path 'ds_config/zero2.json' \
@@ -3,7 +3,7 @@
 PYTHONPATH=../../.. \
 CUDA_VISIBLE_DEVICES=0 \
 python llm_infer.py \
-    --ckpt_dir "output/mistral-7b-chat/vx_xxx/checkpoint-xxx" \
+    --ckpt_dir "output/mistral-7b-instruct/vx_xxx/checkpoint-xxx" \
     --load_dataset_config true \
     --max_length 4096 \
     --max_new_tokens 2048 \
@@ -37,6 +37,6 @@ torchrun \
     --save_total_limit 2 \
     --logging_steps 10 \
     --push_to_hub false \
-    --hub_model_id mistral-7b-chat-lora \
+    --hub_model_id mistral-7b-instruct-lora \
     --hub_private_repo true \
     --hub_token 'your-sdk-token' \
@@ -3,7 +3,7 @@
 PYTHONPATH=../../.. \
 CUDA_VISIBLE_DEVICES=0,1 \
 python llm_infer.py \
-    --ckpt_dir "output/mixtral-7b-moe/vx_xxx/checkpoint-xxx" \
+    --ckpt_dir "output/mixtral-moe-7b/vx_xxx/checkpoint-xxx" \
     --load_dataset_config true \
     --max_length 2048 \
     --use_flash_attn true \
@@ -3,7 +3,7 @@
 PYTHONPATH=../../.. \
 CUDA_VISIBLE_DEVICES=0,1 \
 python llm_infer.py \
-    --ckpt_dir "output/mixtral-7b-moe-chat/vx_xxx/checkpoint-xxx" \
+    --ckpt_dir "output/mixtral-moe-7b-instruct/vx_xxx/checkpoint-xxx" \
     --load_dataset_config true \
     --max_length 2048 \
     --use_flash_attn true \