LoRA Soup for GSM-8k

Train the math skill and the code skill.

For example, we train the code skill on top of the GPT-Neo 125M model:

python train_experts.py -c configs/models/gptneo_125m.json -k dataset=alpaca_code_train_epochs=3 output_dir=debug_alpaca_code
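The -k flag in the command above passes key=value overrides on top of the JSON config given with -c. How train_experts.py actually parses them is not shown here. The following is only a minimal sketch of that general pattern, and the helper name `apply_overrides` is hypothetical:

```python
import json


def apply_overrides(config: dict, overrides: list) -> dict:
    """Return a copy of `config` with key=value overrides applied.

    Hypothetical helper for illustration; not taken from train_experts.py.
    """
    merged = dict(config)
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        try:
            # Numbers, booleans and null are parsed as JSON literals.
            merged[key] = json.loads(raw)
        except json.JSONDecodeError:
            # Anything else (paths, model names) is kept as a plain string.
            merged[key] = raw
    return merged
```

With this sketch, `apply_overrides({"lr": 0.1}, ["lr=0.001", "output_dir=debug_alpaca_code"])` would replace the learning rate with a float and add the output directory as a string.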

Evaluate the model on GSM

Evaluate the dense model

First, generate the Python code:

python gsm_evaluator_with_lora_soup.py -k model=EleutherAI/gpt-neo-125m dataset=gsm gsm_template=python max_input_length=2048 max_output_length=128 output_dir=gpt_125m_dense

This writes a JSONL file (predict_python_code.jsonl) to the "gpt_125m_dense" directory.

Then evaluate the accuracy on GSM8K:

python eval_gsm_mttl.py --file=gpt_125m_dense/predict_python_code.jsonl

This gives an accuracy of 0.0015.
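For readers unfamiliar with this style of evaluation, the accuracy check can be sketched roughly as follows: run each generated program, read off its answer, and compare it with the reference. This is an illustration only, not the actual eval_gsm_mttl.py. The field names `prediction` and `answer`, and the convention that the generated code stores its answer in a variable called `result`, are assumptions.

```python
import json
import math


def run_generated_code(code: str):
    """Execute generated Python and return the value bound to `result`.

    Assumed convention: the program leaves its answer in `result`.
    Model output is untrusted, so real use should run this in a sandbox
    with a timeout; this sketch does neither.
    """
    scope = {}
    try:
        exec(code, scope)
    except Exception:
        return None  # programs that crash count as wrong
    return scope.get("result")


def is_correct(predicted, gold, tol: float = 1e-4) -> bool:
    """Compare numerically, treating non-numeric values as incorrect."""
    try:
        return math.isclose(float(predicted), float(gold), abs_tol=tol)
    except (TypeError, ValueError):
        return False


def accuracy(jsonl_lines) -> float:
    """Fraction of records whose executed program matches the gold answer."""
    records = [json.loads(line) for line in jsonl_lines if line.strip()]
    if not records:
        return 0.0
    hits = sum(
        is_correct(run_generated_code(r["prediction"]), r["answer"])
        for r in records
    )
    return hits / len(records)
```

Under these assumptions, `accuracy(open("gpt_125m_dense/predict_python_code.jsonl"))` would return the fraction of problems solved, provided the file uses the field names assumed here.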

Evaluate the alpaca_code skill

Generate the Python code:

python gsm_evaluator_with_lora_soup.py -k model=EleutherAI/gpt-neo-125m dataset=gsm gsm_template=python max_input_length=2048 max_output_length=128 output_dir=gpt_125m_alpaca_code checkpoint=projects/modular_llm/debug_alpaca_code/best_mode_min_metric_val-loss_value_1.1037_step_1239.ckpt

Evaluate on gsm8k-hard

We got the same score. It seems the alpaca_code skill does not help GPT-Neo 125M.
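To put the score in perspective, the accuracy can be converted back into a count of solved problems. The split used is not stated above. The sketch below assumes the standard GSM8K test split of 1,319 problems:

```python
def solved_count(accuracy: float, n_problems: int) -> int:
    """Convert an accuracy back into an approximate number of solved problems."""
    return round(accuracy * n_problems)


# Assumption: evaluation on the 1,319-problem GSM8K test split.
print(solved_count(0.0015, 1319))  # 2
```

Under that assumption, both runs solve about two problems, so any difference between the dense model and the alpaca_code expert would be within noise at this scale.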

shuishen112/mttl

Folders and files

Latest commit

History

Repository files navigation

LoRA Soup for GSM-8k

train the math skill and code skill.

Evaluate the model on GSM

About

Resources

License

Code of conduct

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages