v0.2.1: Variant Models, NEFTune Trick
New features
Support NEFTune trick for supervised fine-tuning by @anvie in #1252
Support loading datasets in the sharegpt format (see data/readme for details)
Support generating multiple responses in demo API via the n parameter
Support caching the pre-processed dataset files via the cache_path argument
Better LLaMA Board (pagination, controls, etc.)
Support push_to_hub argument #1088
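As a rough illustration of the NEFTune trick listed above: during supervised fine-tuning, uniform noise is added to the token embeddings, scaled by alpha / sqrt(seq_len * dim). The sketch below uses only the Python standard library and plain nested lists. The function name `neftune_noise`, its parameters, and the default `alpha` are hypothetical and chosen for illustration; this is not the code from #1252.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of a (seq_len x dim) embedding matrix.

    Hypothetical helper: each entry receives noise drawn from
    Uniform(-1, 1) and scaled by alpha / sqrt(seq_len * dim).
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    # Build a new matrix so the input embeddings are left untouched.
    return [
        [value + rng.uniform(-1.0, 1.0) * scale for value in row]
        for row in embeddings
    ]


# Toy input: 4 tokens with 8-dimensional embeddings, all zeros,
# so the output shows the noise alone.
embeddings = [[0.0] * 8 for _ in range(4)]
noisy = neftune_noise(embeddings, alpha=5.0, rng=random.Random(0))
```

The noise is meant to be applied only while training; at evaluation or inference time the embeddings would be used unchanged.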
New models
Base models
ChatGLM3-6B-Base
Yi (6B/34B)
Mistral-7B
BlueLM-7B-Base
Skywork-13B-Base
XVERSE-65B
Falcon-180B
Deepseek-Coder-Base (1.3B/6.7B/33B)
Instruct/Chat models
ChatGLM3-6B
Mistral-7B-Instruct
BlueLM-7B-Chat
Zephyr-7B
OpenChat-3.5
Yayi (7B/13B)
Deepseek-Coder-Instruct (1.3B/6.7B/33B)
New datasets
Pre-training datasets
Supervised fine-tuning datasets
OpenPlatypus
ShareGPT Hyperfiltered
ShareGPT4
UltraChat 200k
AgentInstruct
LMSYS Chat 1M
Evol Instruct V2
Bug fix