Conversation

@LZHgrla (Contributor) commented Aug 9, 2023

No description provided.

@LZHgrla changed the title from [WIP] [Docs] Improve README to [WIP] [Docs] Improve Docs on Aug 9, 2023
@LZHgrla changed the title from [WIP] [Docs] Improve Docs to [Docs] Improve Docs on Aug 9, 2023
@LZHgrla merged commit 4238102 into InternLM:main on Aug 10, 2023
@LZHgrla deleted the lzh/improve_readme branch on Aug 10, 2023 at 05:05

llkn-2 pushed a commit to llkn-2/xtuner that referenced this pull request on Jul 31, 2024

* Update README.md

* fix bugs in plugins template

* Update README.md

* Update README.md

* fix pre-commit

* Update README.md

* Update README.md

* fix pre-commit

* modify chat hyp

* add docs

* Update finetune.md

* Update chat.md

* fix pre-commit

* Update README.md

* Update chat.md

* fix pre-commit

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* add CONTRIBUTING.md

* Update CONTRIBUTING.md

* Create LICENSE

* Update README.md

* Update README.md

* Update README.md

* add copyright

* fix typo

* remove unused configs

* fix bugs

* Create README_zh-CN.md

* Create chat.md

* Create finetune.md

* fix pre-commit

* Update README.md

* Update README.md

* Update README.md

hhaAndroid pushed a commit to hhaAndroid/xtuner that referenced this pull request on Dec 16, 2024

pppppM added a commit that referenced this pull request on Dec 27, 2024

* minimum dependency sft

* fix dispatch

* add timer

* add tgs

* internlm2 tp

* rms support tp

* gradient checkpointing

* lazy load pretrain

* temp

* fix bugs

* add data pipeline example

* fix lints

* remove useless code

* fix hard pack bug

* add comments

* clean code

* add shard strategy

* support cpu offload

* support cpu offload

* trust remote code

* fix soft packer bug

* fix soft packer bug

* fix soft packer bug

* refactor data pipeline

* fixup

* fix pad tokens bug

* check input_ids and labels

* check input_ids and labels in collator

* fix load local datasets bug

* fix load cached datasets

* restore dset order

* save cached infos

* accelerate start up

* avoid all gather cached datasets

* fixup

* fix cache bug

* Support group length (#4)

* replace rmsnorm kernel

* support ftdp ds

* support load_bin

* support group by maxlen

* add fsdp_ftdp_sft and fix fsdp_sft

* support ftdp ds

* add lr min

* fix bugs

* fix bugs

* delete

* support llava

* support packer cache

* refactor dist load

* Add sp tp (#5)

* support sp and tp

* add fsdp_tp_sft and modify fsdp_sft

* move chat_template

* fix load_ds

* delete useless codes

* delete useless codes

* fix jsonl load

* refactor

* fix bug

* fix lr scheduler

* refactor setup parallel

* update data load

* fix bugs

* move fsdp

* adapt new parallel load

* fix setup_parallel (#7)

* fix some bugs

* add remote codes

* add convert script

* support load image from ceph

* support load image from ceph

* fix cache dataset bugs

* support multiple images

* support llava interleave

* fix load timeout

* refactor datasets: optimize the cache mechanism and clean up code

* distinguish dataset components based on algorithms

* support fsdp2+3d parallel

* fix lints

* support contiguous batching

* refactor parallel

* zero wasting ppo

* support ascend npu

* fix openai convert

* fix npu bugs

* fix npu bug

* dispatch npu flash attn

* adapt ascend npu

* fix ppo losses

* steady increase in reward

* faster ppo

* fix top-p generate

* support internlm3

* baseline 2.5

* fix internlm3

* (WIP) support hard pack

* support qwen2

* fix dataset bugs

* baseline

* del ppo.py

* fixup

* support hybrid sp

* fix hybrid sp

* qwen2 + hybrid sp

* fix requirements

* avoid re-initialize dist

* support group pack

* pretrain (#13)

* first commit: support internlm3 moe streaming dataset

* move codes

* Moe pretrain (#14)

* first commit: support internlm3 moe streaming dataset

* move codes

* rmsnorm kernel support low version flash_attn

* add barrier

* support prompt length control (#15)

* support VLM Base (#16)

* add internvl

* fix bug

* remove dup code

* support liger of internvl

* fix bug

* add get_repo_git_info

* fix

* add minicpmv

* add minicpmv dispatch

* accelerate tokenize

* Update InternVL (#17)

* fix dpo error

* fix sp error

* update dataset

* fix

* fix rand sampler (#18)

* llama support transformers >= 4.45 (#19)

* convert fsdp1 to fsdp2 in sft.py

* [Feature] Support Liger Kernel (#20)

* filter data by max length (#21)

* fix causal forward, prefetch, and remote code (#22)

* [Enhancement] Accelerating Data Pipeline (#23)

* sample ratio greater than 1.0 and trunc max len

* accelerating the counting of tokens

* log reduced loss

* fix micro bs greater than 1

* [Enhancement] Ensure data integrity when the sampling ratio is more than 1 (#24)

* repeat dataset

* fixup

* fix typos

* fix typos

* [Fix] Pass in temperature during generation (#25)

* Support Janus and fix some errors (#27)

* add prefetch

* update prefetch

* add janus

* add janus

* fix

* fix

* fix llama position id error

* fix ProcessPoolExecutor

* update

* fix llama

* delete cache

* remove useless code

---------

Co-authored-by: whcao <41630003+HIT-cwh@users.noreply.github.com>
Co-authored-by: Happy <lsb19@tsinghua.org.cn>
Co-authored-by: Haian Huang(深度眸) <1286304229@qq.com>

pppppM added a commit that referenced this pull request on Jan 21, 2025

* [Feature] XTuner Lite (#974) (commit list identical to the Dec 27, 2024 commit above)

* support mlu (#984)

* cleanup

* add internlm3 remote code

* cleanup

* auto patch

* remove useless code

---------

Co-authored-by: whcao <41630003+HIT-cwh@users.noreply.github.com>
Co-authored-by: Happy <lsb19@tsinghua.org.cn>
Co-authored-by: Haian Huang(深度眸) <1286304229@qq.com>
Co-authored-by: Lantian Zhang <50076473+DoorKickers@users.noreply.github.com>

HAOCHENYE pushed a commit that referenced this pull request on Sep 8, 2025

(commit message identical to the Jan 21, 2025 commit above)