# llm-efficiency-submission

## Models

All adapters and merged models are available on Hugging Face: model collection. The table below summarizes all experiments.

| Base model | Axolotl config | Training log | QLoRA | r/alpha | Target modules |
| --- | --- | --- | --- | --- | --- |
| Llama-2-7b | yml file | W&B logs | 1 | 16/16 | gate, up, down* |
| Llama-2-13b | yml file | W&B logs | 1 | 16/16 | gate, up, down |
| Llama-2-13b | yml file | W&B logs | 1 | 64/64 | gate, up, down |
| Mistral-7b-v0.1 | yml file | W&B logs | 2 | 32/16 | all modules |
| Mistral-7b-v0.1 | yml file | W&B logs | 2 | 32/16 | gate, up, down |
| Mistral-7b-v0.1 | yml file | W&B logs | 2 | 64/32 | gate, up, down |

\* Targeting the gate_proj, down_proj, and up_proj modules follows the fine-tuning recipes reported by He et al. (2022) and Lee, Hunter, et al. (2023).
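
For readers who want the gist of these settings without opening the yml files, here is a minimal PEFT/bitsandbytes sketch corresponding to the first table row (Llama-2-7b, r/alpha = 16/16, MLP projections only). The dropout value and quantization details below are assumptions for illustration; the authoritative hyperparameters are in the linked Axolotl configs.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, the standard QLoRA setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# r/alpha = 16/16, targeting only the MLP projections (first table row).
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,  # assumption; see the yml files for the actual value
    target_modules=["gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```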

## Dataset

Inspired by the performance of the Open-Platypus dataset, we have curated a similar dataset while limiting the contributing subsets to datasets with permissive licences. Please see HF for additional information on the Open-Otter dataset.
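
Loading the dataset follows the usual `datasets` pattern; a minimal sketch below, where the repository ID is a placeholder (use the ID from the HF dataset card):

```python
from datasets import load_dataset

# Placeholder repo ID for illustration; see the HF dataset card for the actual path.
dataset = load_dataset("cx0/open-otter", split="train")
print(dataset)
```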

## Future experiments

- LoRA vs QLoRA vs VeRA vs full FT
- NEFTune (see issue; a minimal sketch follows this list)
- Data mixture (à la DoReMi)
- Curriculum learning (no compelling evidence that this works?)
- Model fusion/merging
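
Since NEFTune is on the list above, here is a minimal PyTorch sketch of the idea: add uniform noise to the embedding outputs during training, scaled by alpha / sqrt(L * d) as in the NEFTune paper (Jain et al. 2023). The hook-based wiring into a Hugging Face model is an assumption about one possible integration, not this repo's implementation.

```python
import functools
import torch

def neftune_forward_hook(module, inputs, output, alpha=5.0):
    """Perturb embedding outputs with uniform noise during training (NEFTune).

    Noise magnitude is alpha / sqrt(L * d), where L is the sequence length
    and d is the embedding dimension, following Jain et al. 2023.
    """
    if module.training:
        dims = output.size(1) * output.size(2)  # L * d
        scale = alpha / dims**0.5
        output = output + torch.empty_like(output).uniform_(-scale, scale)
    return output

# Hypothetical wiring for a Hugging Face causal LM named `model`:
# hook = functools.partial(neftune_forward_hook, alpha=5.0)
# model.get_input_embeddings().register_forward_hook(hook)
```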
