
Cannot approach the performance of the uploaded self-rag ckpt when finetuning meta/Llama-2 myself #57

Open
HazekiahWon opened this issue Mar 16, 2024 · 5 comments

Comments

@HazekiahWon

HazekiahWon commented Mar 16, 2024

Thanks for your inspiring work @AkariAsai .

I tried to run the script_finetune_7b.sh script myself (using meta-llama/Llama-2-7b-hf and your provided generator data), which should produce a checkpoint that matches the uploaded one in performance, since both the base checkpoint and the training data are the same.

However, the resulting model shows a significant performance gap with respect to the uploaded Self-RAG checkpoint. For example, on TriviaQA my checkpoint reaches only 0.503 accuracy, compared with 0.679 for the uploaded checkpoint.

I do notice some differences between my final checkpoint directory and the uploaded one:

  1. My reproduced checkpoint is saved as *.safetensors files, whereas the uploaded checkpoint uses *.bin files.
  2. I encounter the same issue as in #21 ("The saved embed_tokens is empty"): checkpointing saves both a single model.safetensors (which lacks the embedding parameters) and the sharded checkpoints (see the screenshot below). #21 suggested removing model.safetensors. I guess you did not encounter this issue.
(screenshot of the checkpoint directory showing both the single model.safetensors and the sharded checkpoint files)

I was wondering whether the underlying cause of these differences might also explain the performance gap. Do you have any idea regarding this matter?
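
A quick way to check whether the single-file checkpoint is actually missing the embedding weights is to list its tensor names. A minimal sketch, assuming the `safetensors` library is available and with a placeholder path for my output directory:

```python
from safetensors import safe_open

# Placeholder path to the consolidated checkpoint produced by fine-tuning.
path = "output/self-rag-7b-repro/model.safetensors"

with safe_open(path, framework="pt") as f:
    keys = list(f.keys())

# If the embeddings were dropped (as in #21), this prints False.
print("model.embed_tokens.weight" in keys)
```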

@Jack-ZC8

It seems I have the same issue; I would appreciate any possible solution! @HazekiahWon @AkariAsai

@hummingbird2030

Hello, I encountered the same problem. I wonder how you load the model. I trained a model and got the same set of files, including model.safetensors. When I try to load the model for evaluation by running run_short_form.py, the error below occurs. I deleted model.safetensors as #21 suggested, but that did not solve the problem. @HazekiahWon @Jack-ZC8 @AkariAsai

File "run_short_form.py", line 302, in main
model = LLM(model=gpt, download_dir=args.download_dir,
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/entrypoints/llm.py", line 105, in init
self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 250, in from_engine_args
engine = cls(*engine_configs,
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 110, in init
self._init_workers(distributed_init_method)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 146, in _init_workers
self._run_workers(
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 755, in _run_workers
self._run_workers_in_batch(workers, method, *args, **kwargs))
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 729, in _run_workers_in_batch
output = executor(*args, **kwargs)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/worker.py", line 79, in load_model
self.model_runner.load_model()
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/model_runner.py", line 57, in load_model
self.model = get_model(self.model_config)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/model_loader.py", line 72, in get_model
model.load_weights(model_config.model, model_config.download_dir,
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/models/llama.py", line 340, in load_weights
weight_loader(param, loaded_weight)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 80, in weight_loader
assert loaded_weight.shape[parallel_dim] == self.num_embeddings
AssertionError
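
The assertion compares the row count of the saved embedding tensor against the vocabulary size vLLM expects (Llama-2's 32,000 entries plus the Self-RAG special tokens). A minimal sketch of how to confirm the mismatch, assuming a placeholder checkpoint directory:

```python
from safetensors import safe_open
from transformers import AutoTokenizer

ckpt_dir = "output/self-rag-7b-repro"  # placeholder for the fine-tuned checkpoint

# Vocabulary size the model is expected to have after adding special tokens.
tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)
print("tokenizer vocab size:", len(tokenizer))

# Shape of the embedding actually stored in the consolidated file.
with safe_open(f"{ckpt_dir}/model.safetensors", framework="pt") as f:
    shape = f.get_slice("model.embed_tokens.weight").get_shape()
print("saved embed_tokens shape:", shape)  # a mismatch here triggers the AssertionError
```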

@Jack-ZC8

I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...
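
In script form, the workaround is just to remove the incomplete consolidated file so the weight loader picks up the sharded checkpoint files instead. A sketch, with the directory name as a placeholder:

```python
import os

ckpt_dir = "output/self-rag-7b-repro"  # placeholder checkpoint directory

# Delete the consolidated file with the truncated embeddings; the sharded
# files next to it still contain the full weights and get loaded instead.
single = os.path.join(ckpt_dir, "model.safetensors")
if os.path.exists(single):
    os.remove(single)
```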

@hummingbird2030

> I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...

Thanks, I solved the problem the same way!

@fate-ubw

@HazekiahWon Hello~
Recently, I also trained selfrag-7b based on llama2 and encountered the same problem as you: using the training data and scripts provided by Self-RAG, I obtained selfrag-7b-myversion by fine-tuning llama2-7b. When I evaluated selfrag-7b-myversion, I found that its performance metrics were not as good as those of the officially released selfrag-7b. I saw that you reported 0.503 accuracy for your fine-tuned model on the TQA dataset. Could you please share your model's results on the PopQA, ARC-Challenge, and PubHealth datasets as well? Thank you very much, and I look forward to your reply.
