Merged
79 commits
719b075
fix sh
Jintao-Huang Oct 18, 2023
9bed2f9
update dataset_seed
Jintao-Huang Oct 18, 2023
b98e26e
update dataset_seed
Jintao-Huang Oct 18, 2023
9eca9ba
support model_id_or_path
Jintao-Huang Oct 18, 2023
90de8a6
default dtype is None
Jintao-Huang Oct 18, 2023
bc6d7d2
fix bug
Jintao-Huang Oct 18, 2023
02addf4
feat: acc eval_acc
Jintao-Huang Oct 18, 2023
97e6729
add require_version
Jintao-Huang Oct 18, 2023
44291b0
update readme
Jintao-Huang Oct 18, 2023
54d8455
update baichuan2-13b mp+ddp sh
Jintao-Huang Oct 19, 2023
b4aac14
fix bug
Jintao-Huang Oct 19, 2023
e561b42
model_id_or_path_lower
Jintao-Huang Oct 19, 2023
2e0c404
update register dataset
Jintao-Huang Oct 19, 2023
f58e376
fix acc bug
Jintao-Huang Oct 19, 2023
6c5436f
move code into swift
Jintao-Huang Oct 19, 2023
f8337d8
update register_model
Jintao-Huang Oct 20, 2023
826516c
update max_length -1
Jintao-Huang Oct 20, 2023
0d9021b
update register_template
Jintao-Huang Oct 20, 2023
81204f2
Merge branch 'main' into feat_1018
Jintao-Huang Oct 20, 2023
4878830
update ci test
Jintao-Huang Oct 20, 2023
a3deb9a
fix predict_with_generate bug
Jintao-Huang Oct 20, 2023
7c1c1f7
update ci test
Jintao-Huang Oct 21, 2023
c1cbcf1
update code
Jintao-Huang Oct 22, 2023
5a542ba
update template
Jintao-Huang Oct 22, 2023
8e5b4f5
fix bug
Jintao-Huang Oct 22, 2023
eb09b72
update dataset mixin
Jintao-Huang Oct 22, 2023
eb167a7
update run
Jintao-Huang Oct 22, 2023
9e0f5f4
update run
Jintao-Huang Oct 22, 2023
ce96d3d
fix dataset bug
Jintao-Huang Oct 22, 2023
e2115a6
update io_utils np_utils
Jintao-Huang Oct 22, 2023
76c6412
update code
Jintao-Huang Oct 23, 2023
bfac505
update adapter_cfg in configuration.json
Jintao-Huang Oct 23, 2023
8760aea
update stream inference
Jintao-Huang Oct 23, 2023
4065b78
fix citest
Jintao-Huang Oct 23, 2023
3ec4bf7
fix typo
Jintao-Huang Oct 23, 2023
f1dba7a
update custom example
Jintao-Huang Oct 24, 2023
46f1a33
update sh
Jintao-Huang Oct 24, 2023
caef5a0
update custom
Jintao-Huang Oct 24, 2023
401b59b
mv uilts/llm_utils.py -> llm/utils/utils.py
Jintao-Huang Oct 24, 2023
2b70f28
update readme
Jintao-Huang Oct 24, 2023
02dac23
fix inference_stream bug
Jintao-Huang Oct 24, 2023
e7c78e3
update sh. lora_dropout_p
Jintao-Huang Oct 24, 2023
51fdba6
fix stream bug
Jintao-Huang Oct 25, 2023
45afd5e
update pretrained sh
Jintao-Huang Oct 25, 2023
87dfddc
update code
Jintao-Huang Oct 25, 2023
600a2d7
fix bug
Jintao-Huang Oct 25, 2023
5e320b2
fix bug
Jintao-Huang Oct 25, 2023
4180bab
fix bug
Jintao-Huang Oct 25, 2023
8d2d539
update sh model_type -> model_id_or_path
Jintao-Huang Oct 25, 2023
9afc036
update sh
Jintao-Huang Oct 25, 2023
5a4ca7c
update sh
Jintao-Huang Oct 25, 2023
38e2b31
update sh
Jintao-Huang Oct 25, 2023
c8bcf5e
update readme
Jintao-Huang Oct 25, 2023
3a06d3b
update ci test
Jintao-Huang Oct 25, 2023
1ab60fd
fix typo
Jintao-Huang Oct 25, 2023
dac5729
fix citest
Jintao-Huang Oct 25, 2023
fa6026e
fix citest
Jintao-Huang Oct 25, 2023
b446220
fix citest
Jintao-Huang Oct 25, 2023
10121d1
update sh
Jintao-Huang Oct 25, 2023
0f88ca0
update citest
Jintao-Huang Oct 25, 2023
760edde
update max_new_tokens
Jintao-Huang Oct 25, 2023
1f10836
fix citest
Jintao-Huang Oct 25, 2023
e1849e0
update readme
Jintao-Huang Oct 25, 2023
1e43970
fix inference_stream, template round bug
Jintao-Huang Oct 25, 2023
85da951
update merge_lora
Jintao-Huang Oct 25, 2023
fd79578
fix citest
Jintao-Huang Oct 25, 2023
9b1962c
fix citest
Jintao-Huang Oct 25, 2023
6ef4c2b
update readme
Jintao-Huang Oct 26, 2023
838d33f
update register_dataset
Jintao-Huang Oct 27, 2023
1c95d11
fix bug
Jintao-Huang Oct 27, 2023
ae65e70
update setup.py
Jintao-Huang Oct 27, 2023
b4a1c85
fix bug
Jintao-Huang Oct 27, 2023
21ba918
Merge branch 'main' into feat_1018
Jintao-Huang Oct 27, 2023
b0970de
update sh
Jintao-Huang Oct 27, 2023
b53384e
update sh
Jintao-Huang Oct 27, 2023
1614de6
update readme
Jintao-Huang Oct 27, 2023
576e8cb
update readme
Jintao-Huang Oct 27, 2023
135eb14
update sh
Jintao-Huang Oct 27, 2023
07703b4
update readme
Jintao-Huang Oct 27, 2023
16 changes: 8 additions & 8 deletions README.md
@@ -43,7 +43,7 @@ Press [this link](https://github.com/modelscope/swift/tree/main/examples/pytorch
- 🔥 qwen series: [qwen-7b](https://modelscope.cn/models/qwen/Qwen-7B/summary), [qwen-7b-chat](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary), [qwen-14b](https://modelscope.cn/models/qwen/Qwen-14B/summary), [qwen-14b-chat](https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary), [qwen-7b-chat-int4](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary), [qwen-14b-chat-int4](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary), [qwen-7b-chat-int8](https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int8/summary), [qwen-14b-chat-int8](https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int8/summary)
- 🔥 qwen-vl series: [qwen-vl](https://modelscope.cn/models/qwen/Qwen-VL/summary), [qwen-vl-chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary), [qwen-vl-chat-int4](https://modelscope.cn/models/qwen/Qwen-VL-Chat-Int4/summary)
- baichuan series: [baichuan-7b](https://modelscope.cn/models/baichuan-inc/baichuan-7B/summary), [baichuan-13b](https://modelscope.cn/models/baichuan-inc/Baichuan-13B-Base/summary), [baichuan-13b-chat](https://modelscope.cn/models/baichuan-inc/Baichuan-13B-Chat/summary), [baichuan2-7b](https://modelscope.cn/models/baichuan-inc/Baichuan2-7B-Base/summary), [baichuan2-7b-chat](https://modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat/summary), [baichuan2-13b](https://modelscope.cn/models/baichuan-inc/Baichuan2-13B-Base/summary), [baichuan2-13b-chat](https://modelscope.cn/models/baichuan-inc/Baichuan2-13B-Chat/summary), [baichuan2-7b-chat-int4](https://modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat-4bits/summary), [baichuan2-13b-chat-int4](https://modelscope.cn/models/baichuan-inc/Baichuan2-13B-Chat-4bits/summary)
- - chatglm2 series: [chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary), [chatglm2-6b-32k](https://modelscope.cn/models/ZhipuAI/chatglm2-6b-32k/summary)
+ - chatglm series: [chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary), [chatglm2-6b-32k](https://modelscope.cn/models/ZhipuAI/chatglm2-6b-32k/summary), [chatglm3-6b-base](https://modelscope.cn/models/ZhipuAI/chatglm3-6b-base/summary), [chatglm3-6b](https://modelscope.cn/models/ZhipuAI/chatglm3-6b/summary), [chatglm3-6b-32k](https://modelscope.cn/models/ZhipuAI/chatglm3-6b-32k/summary)
- llama series: [llama2-7b](https://modelscope.cn/models/modelscope/Llama-2-7b-ms/summary), [llama2-7b-chat](https://modelscope.cn/models/modelscope/Llama-2-7b-chat-ms/summary), [llama2-13b](https://modelscope.cn/models/modelscope/Llama-2-13b-ms/summary), [llama2-13b-chat](https://modelscope.cn/models/modelscope/Llama-2-13b-chat-ms/summary), [llama2-70b](https://modelscope.cn/models/modelscope/Llama-2-70b-ms/summary), [llama2-70b-chat](https://modelscope.cn/models/modelscope/Llama-2-70b-chat-ms/summary)
- openbuddy series: [openbuddy-llama2-13b-chat](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-13b-v8.1-fp16/summary), [openbuddy-llama-65b-chat](https://modelscope.cn/models/OpenBuddy/openbuddy-llama-65b-v8-bf16/summary), [openbuddy-llama2-70b-chat](https://modelscope.cn/models/OpenBuddy/openbuddy-llama2-70b-v10.1-bf16/summary), [openbuddy-mistral-7b-chat](https://modelscope.cn/models/OpenBuddy/openbuddy-mistral-7b-v13.1/summary)
- internlm series: [internlm-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-7b/summary), [internlm-7b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-v1_1/summary), [internlm-7b-chat-8k](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b-8k/summary), [internlm-20b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-20b/summary), [internlm-20b-chat](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-20b/summary)
@@ -55,32 +55,32 @@ Press [this link](https://github.com/modelscope/swift/tree/main/examples/pytorch
- NLP:
- General: 🔥[alpaca-en](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-en/summary)(gpt4), 🔥[alpaca-zh](https://modelscope.cn/datasets/AI-ModelScope/alpaca-gpt4-data-zh/summary)(gpt4), [multi-alpaca-all](https://www.modelscope.cn/datasets/damo/nlp_polylm_multialpaca_sft/summary), [instinwild-en](https://www.modelscope.cn/datasets/wyj123456/instinwild/summary), [instinwild-zh](https://www.modelscope.cn/datasets/wyj123456/instinwild/summary), [cot-en](https://www.modelscope.cn/datasets/YorickHe/CoT/summary), [cot-zh](https://www.modelscope.cn/datasets/YorickHe/CoT/summary), [firefly-all-zh](https://www.modelscope.cn/datasets/wyj123456/firefly/summary), [instruct-en](https://www.modelscope.cn/datasets/wyj123456/instruct/summary), [gpt4all-en](https://www.modelscope.cn/datasets/wyj123456/GPT4all/summary), [sharegpt-en](https://www.modelscope.cn/datasets/huangjintao/sharegpt/summary), [sharegpt-zh](https://www.modelscope.cn/datasets/huangjintao/sharegpt/summary)
- Agent: [damo-agent-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary), 🔥[damo-agent-mini-zh](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary)
- - Coding: [code-en](https://www.modelscope.cn/datasets/wyj123456/code_alpaca_en/summary), [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), 🔥[leetcode-python-en](https://modelscope.cn/datasets/AI-ModelScope/leetcode-solutions-python/summary)
+ - Coding: [code-alpaca-en](https://www.modelscope.cn/datasets/wyj123456/code_alpaca_en/summary), [code-python-zh](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary), 🔥[leetcode-python-en](https://modelscope.cn/datasets/AI-ModelScope/leetcode-solutions-python/summary)
- Medical: [medical-en](https://www.modelscope.cn/datasets/huangjintao/medical_zh/summary), [medical-zh](https://www.modelscope.cn/datasets/huangjintao/medical_zh/summary), [medical-mini-zh](https://www.modelscope.cn/datasets/huangjintao/medical_zh/summary)
- Law: 🔥[lawyer-llama-zh](https://modelscope.cn/datasets/AI-ModelScope/lawyer_llama_data/summary), [tigerbot-law-zh](https://modelscope.cn/datasets/AI-ModelScope/tigerbot-law-plugin/summary)
- Math: 🔥[blossom-math-zh](https://modelscope.cn/datasets/AI-ModelScope/blossom-math-v2/summary), [school-math-zh](https://modelscope.cn/datasets/AI-ModelScope/school_math_0.25M/summary)
- SQL: [text2sql-en](https://modelscope.cn/datasets/AI-ModelScope/texttosqlv2_25000_v2/summary), 🔥[sql-create-context-en](https://modelscope.cn/datasets/AI-ModelScope/sql-create-context/summary)
- Text Generation: 🔥[advertise-gen-zh](https://modelscope.cn/datasets/lvjianjin/AdvertiseGen/summary), 🔥[dureader-robust-zh](https://modelscope.cn/datasets/modelscope/DuReader_robust-QG/summary)
- - Classification: [cmnli-zh](https://www.modelscope.cn/datasets/modelscope/clue/summary), [jd-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary)
+ - Classification: [cmnli-zh](https://www.modelscope.cn/datasets/modelscope/clue/summary), [jd-sentiment-zh](https://modelscope.cn/datasets/DAMO_NLP/jd/summary)
- Other: [finance-en](https://www.modelscope.cn/datasets/wyj123456/finance_en/summary), [poetry-zh](https://www.modelscope.cn/datasets/modelscope/chinese-poetry-collection/summary), [cls-fudan-news-zh](https://modelscope.cn/datasets/damo/zh_cls_fudan-news/summary), [ner-jave-zh](https://modelscope.cn/datasets/damo/zh_ner-JAVE/summary)
- Multi-Modal: 🔥[coco-en](https://modelscope.cn/datasets/modelscope/coco_2014_caption/summary)
- Custom Dataset
- Supported Templates:
- Text Generation: default-generation, chatglm2-generation
- - Chat: chatml(qwen), baichuan, chatglm2, llama, openbuddy-llama, default, internlm, xverse
+ - Chat: chatml(qwen), baichuan, chatglm2, chatglm3, llama, openbuddy-llama, default, internlm, xverse


### News
- - 🔥 2023.10.17: Supported int8 models: qwen-7b-chat-int8, qwen-14b-chat-int8. The corresponding shell script can be found at `scripts/qwen_7b_chat_int8`, `scripts/qwen_14b_chat_int8`.
- - 🔥 2023.10.16: Supported int4 models: qwen-7b-chat-int4, qwen-14b-chat-int4, qwen-vl-chat-int4, baichuan2-7b-chat-int4, baichuan2-13b-chat-int4. The corresponding shell script can be found at `scripts/qwen_7b_chat_int4`, `scripts/qwen_14b_chat_int4`, `scripts/qwen_vl_chat_int4`, `scripts/baichuan2_7b_chat_int4`, `scripts/baichuan2_13b_chat_int4`.
+ - 🔥 2023.10.27: Support for chatglm3 series models: chatglm3-6b-base, chatglm3-6b, chatglm3-6b-32k. The corresponding shell script can be found in `scripts/chatglm3_6b_32k`.
+ - 🔥 2023.10.24: Use the registration mechanism to add models, datasets, and chat templates. To customize models, datasets, and chat templates, refer to the "User Guide" section. The corresponding Python file can be found in `custom.py`, and the corresponding shell script can be found in `scripts/custom/tigerbot_13b_chat`.
+ - 🔥 2023.10.17: Supported int4, int8 models: qwen-7b-chat-int4, qwen-14b-chat-int4, qwen-vl-chat-int4, baichuan2-7b-chat-int4, baichuan2-13b-chat-int4, qwen-7b-chat-int8, qwen-14b-chat-int8. The corresponding shell script can be found at `scripts/qwen_7b_chat_int4`, `scripts/qwen_14b_chat_int4`, `scripts/qwen_vl_chat_int4`, `scripts/qwen_7b_chat_int8`, `scripts/qwen_14b_chat_int8`.
- 2023.10.15: Supported ziya2-13b model series: ziya2-13b, ziya2-13b-chat. The corresponding shell script can be found at `scripts/ziya2_13b_chat`.
- 2023.10.12: Supported mistral-7b model series: openbuddy-mistral-7b-chat, mistral-7b, mistral-7b-chat. The corresponding shell script can be found at `scripts/openbuddy_mistral_7b_chat`, `scripts/mistral_7b_chat`.
- 🔥 2023.10.7: Supported DeepSpeed ZeRO-2, enabling LoRA (not just QLoRA) to run DDP on 2*A10. The corresponding shell script can be found at `scripts/qwen_7b_chat/lora_ddp_ds/sft.sh`.
- 2023.10.4: Supported datasets in the fields of mathematics, law, SQL, and coding: blossom-math-zh, school-math-zh, text2sql-en, sql-create-context-en, lawyer-llama-zh, tigerbot-law-zh, leetcode-python-en.
- 🔥 2023.9.25: Supported qwen-14b model series: qwen-14b, qwen-14b-chat. The corresponding shell script can be found at `scripts/qwen_14b`, `scripts/qwen_14b_chat`.
- 2023.9.18: Supported internlm-20b model series: internlm-20b, internlm-20b-chat. The corresponding shell script can be found at `scripts/internlm_20b`, `scripts/internlm_20b_chat`.
- - 🔥 2023.9.12: Supported training with MP+DDP to accelerate full-parameter fine-tuning speed. The corresponding shell script can be found at `scripts/qwen_7b_chat/full_mp_ddp/sft.sh`.
- - 2023.9.5: Supported training that only saves model weights without saving intermediate states such as optimizer weights required for checkpoint resumption, avoiding long checkpoint-saving times and large storage space in full-parameter fine-tuning. You can check the command-line parameter `--only_save_model` in the `sft.sh` script.
+ - 2023.9.12: Supported training with MP+DDP to accelerate full-parameter fine-tuning speed. The corresponding shell script can be found at `scripts/qwen_7b_chat/full_mp_ddp/sft.sh`.
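
The 2023.10.24 News entry mentions a registration mechanism for adding models, datasets, and chat templates. The sketch below shows the general idea of such a mechanism in plain Python. It is not the swift implementation: `LOADERS`, `register`, `load`, and the name `my-custom-chat` are hypothetical, invented here for illustration, and the real interface is the one described in the "User Guide" section and `custom.py`.

```python
from typing import Callable, Dict

# Hypothetical registry for illustration only; not the swift API.
LOADERS: Dict[str, Callable[[], str]] = {}


def register(name: str) -> Callable[[Callable[[], str]], Callable[[], str]]:
    """Return a decorator that stores a loader function under `name`."""
    def decorator(fn: Callable[[], str]) -> Callable[[], str]:
        if name in LOADERS:
            raise ValueError(f"{name!r} is already registered")
        LOADERS[name] = fn
        return fn
    return decorator


@register("my-custom-chat")
def load_my_custom_chat() -> str:
    # A real loader would build and return a model; a string stands in here.
    return "loaded my-custom-chat"


def load(name: str) -> str:
    """Look up a registered loader by name and call it."""
    if name not in LOADERS:
        raise KeyError(f"unknown name {name!r}; known: {sorted(LOADERS)}")
    return LOADERS[name]()
```

With a registry like this, a new entry becomes selectable by name without editing the code that performs the lookup, which is presumably the motivation for the change.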


# Installation