Add more datasets #1065

Merged: 113 commits merged into modelscope:main on Jun 18, 2024

@tastelikefeet (Collaborator) commented Jun 4, 2024

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

  1. Support 70+ datasets, about one third of which are multi-modal
  2. Refactor template.py so that all multi-modal templates support the 'images' key
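
To illustrate item 2, here is a minimal sketch of what a uniform 'images' key might look like from the data side. It is not code from this PR: the helper `normalize_images` and the `query`/`response` field names are assumptions made for the example, and only the 'images' key itself comes from the description above.

```python
from typing import Any, Dict


def normalize_images(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `sample` whose 'images' value is always a list.

    Hypothetical helper for illustration only; it is not part of swift.
    """
    out = dict(sample)  # shallow copy so the caller's dict is untouched
    images = out.get("images")
    if images is None:
        out["images"] = []            # text-only sample
    elif isinstance(images, str):
        out["images"] = [images]      # single path or URL
    else:
        out["images"] = list(images)  # already a sequence of paths or URLs
    return out


# Assumed sample layout: 'query' and 'response' are illustrative field names.
sample = {"query": "What is in the picture?", "response": "A cat.", "images": "cat.jpg"}
normalized = normalize_images(sample)
text_only = normalize_images({"query": "Hello", "response": "Hi"})
```

With every sample carrying a list under one key, a template could read images the same way regardless of which multi-modal model it targets.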

Experiment results

Paste your experiment result here (if needed).

* main:
  Fix code format and docs (modelscope#847)
  update (modelscope#846)
  fix xcomposer device_map (modelscope#844)
  fix merge_lora_dtype (modelscope#842)
  Fix infer default dtype (modelscope#834)
  fix ui (modelscope#830)
  support Internvl-chat-v1.5 model (modelscope#824)
* commit 'bdc8f54848daad335e513183482e16cc5da17c88': (36 commits)
  fix export self-cognition (modelscope#929)
  fix deepseek2(modelscope#924)
  Add 34b quantized model (modelscope#920)
  yi1.5 quantized model (modelscope#917)
  update readme&doc (modelscope#916)
  init (modelscope#915)
  fix unsloth import (modelscope#912)
  add more models local repo support (modelscope#911)
  lint
  DeepseekVL add local_repo_path argument AND infer support delete truncation_strategy (modelscope#883)
  Support Hqq and Eetq quantization  (modelscope#900)
  fix val_sample (modelscope#909)
  Add val_dataset argument (modelscope#906)
  Refactor sequence parallel (modelscope#823)
  replace dataset name with modelscope dataset id (modelscope#899)
  replace dataset name with dataset path from modelscope (modelscope#897)
  fix doc link
  enable longlora and adalora merge (modelscope#892)
  fix lisa show bug (modelscope#891)
  update doc (modelscope#888)
  ...

# Conflicts:
#	swift/llm/utils/dataset.py
* commit '6e5b58a8af8e1fb92b1498d5c45cfbea11da1b36':
  fix Internvl-int8 device map (modelscope#937)
  support ms-agent-roleplay dataset (modelscope#936)
  Fix eval url (modelscope#941)
  update doc (modelscope#934)
  fix Internvl-int8 sft bug (modelscope#932)
* main:
  pass pre-commit
  fix hf space
  Fix/studio (modelscope#947)
  support LLava-Next(Stronger) model (modelscope#933)
  Fix studio (modelscope#946)
  A bunch of small features (modelscope#944)
* commit '70abbe70e990cada15c54420d7eaa2f03b117a94':
  Support peft 0.11.0 (modelscope#953)
  fix torch_dtype (modelscope#954)
  fix some ui components
  remove useless web-ui component
  add more note
  fix hf_space
* main: (23 commits)
  fix gr limit (modelscope#1016)
  fix minicpm-v (modelscope#1010)
  fix cogvlm2 history (modelscope#1005)
  Updated a link in Command-line-parameters.md (modelscope#1001)
  fix template example copy (modelscope#1003)
  Feat/phi3 paligemma (modelscope#998)
  fix pt deploy lora (modelscope#999)
  fix args (modelscope#996)
  fix val_dataset (modelscope#992)
  update custom_val_dataset (modelscope#991)
  [TorchAcc][Experimental] Integrate more model in torchacc (modelscope#683)
  fix cpu 'torch._C' has no attribute '_cuda_resetPeakMemoryStats' (modelscope#914)
  refactor readme web-ui (modelscope#983)
  support  transformers==4.41 (modelscope#979)
  support more models (modelscope#971)
  Fix minicpm device map (modelscope#978)
  fix typing (modelscope#974)
  fix vllm eos_token (modelscope#973)
  Support minicpm-v-v2_5-chat (modelscope#970)
  support cogvlm2-en-chat-19b (modelscope#967)
  ...
* commit 'e3f0f741a0f093062ca92f3709118bd5082fb1af':
  fix deepseek-vl (modelscope#1046)
  update arguments (modelscope#1044)
  update arguments (modelscope#1043)
  fix phi3-vision bug
  fix phi3-vision bug (modelscope#1039)
  bump version
  Support SimPO Algorithm (modelscope#1037)
  support multimodal deploy (modelscope#1029)
  support mini-internvl (modelscope#1032)
  fix bugs (modelscope#1038)
  fix app_ui (modelscope#1036)
  fix vllm==0.4.* slower than vllm==0.3.* (modelscope#1035)
  fix custom (modelscope#1028)
  fix arguments (modelscope#1026)
  fix docs and a bug (modelscope#1023)
  Fix docs table (modelscope#1024)
  update docs table (modelscope#1021)
* commit 'f6c9e84e11ecdad667eb6ce92aeb603147573b06':
  fix phi3-small (modelscope#1148)
  fix eval strategy compat (modelscope#1143)
  fix eval args (modelscope#1141)
  support customizable loss scale (modelscope#997)
  compat with vllm==0.5 (modelscope#1136)
* commit '40ca72b47b5a6ed33bbd9a49c88dae7a931b1cae':
  Fix/web UI 0617 (modelscope#1158)
  refactor rlhf (modelscope#1090)
  fix py38 (modelscope#1152)

# Conflicts:
#	swift/llm/utils/dataset.py
#	swift/llm/utils/template.py
#	swift/trainers/dpo_trainer.py
#	swift/trainers/orpo_trainer.py
#	swift/trainers/simpo_trainer.py

@tastelikefeet merged commit 5818a85 into modelscope:main on Jun 18, 2024
1 of 2 checks passed
hjh0119 pushed a commit to hjh0119/swift that referenced this pull request on Jul 22, 2024