
Cannot approach the performance of the uploaded self-rag ckpt when finetuning meta/Llama-2 myself #57

Open
HazekiahWon opened this issue Mar 16, 2024 · 5 comments

Comments

@HazekiahWon

HazekiahWon commented Mar 16, 2024

Thanks for your inspiring work @AkariAsai .

I tried to run the script_finetune_7b.sh script myself (using meta-llama/Llama-2-7b-hf and your provided generator data), which should produce a checkpoint that matches the uploaded one in performance, since both the base checkpoint and the training data are the same.

However, the resulting model shows a significant performance gap with respect to the uploaded Self-RAG checkpoint. For example, on TriviaQA my checkpoint reaches only 0.503 accuracy, compared with 0.679 for the uploaded checkpoint.

I do notice some differences between my final checkpoint directory and the uploaded one:

  1. My reproduced checkpoint is saved as *.safetensors files, whereas the uploaded checkpoint uses *.bin files.
  2. I encounter the same issue as in #21 ("The saved embed_tokens is empty"): checkpointing saves both a single model.safetensors (which lacks the embedding parameters) and the sharded checkpoints (see the screenshot below). #21 suggested removing model.safetensors. I guess you did not encounter this issue.
(screenshot of the checkpoint directory showing both the single model.safetensors and the sharded checkpoint files)

I was wondering whether the underlying cause of these differences might also explain the performance gap. Do you have any idea regarding this matter?
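
A quick way to check whether the single-file checkpoint is actually missing the embedding weights is to list its tensor names. A minimal sketch, assuming the `safetensors` library is available and with a placeholder path for my output directory:

```python
from safetensors import safe_open

# Placeholder path to the consolidated checkpoint produced by fine-tuning.
path = "output/self-rag-7b-repro/model.safetensors"

with safe_open(path, framework="pt") as f:
    keys = list(f.keys())

# If the embeddings were dropped (as in #21), this prints False.
print("model.embed_tokens.weight" in keys)
```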

@Jack-ZC8

It seems I have the same issue; I would appreciate any possible solution! @HazekiahWon @AkariAsai

@hummingbird2030

Hello, I encountered the same problem. I wonder how you load the model. I trained a model and got the same set of files, including model.safetensors. When I try to load the model for evaluation by running run_short_form.py, the error below occurs. I deleted model.safetensors as #21 suggested, but that did not solve the problem. @HazekiahWon @Jack-ZC8 @AkariAsai

File "run_short_form.py", line 302, in main
model = LLM(model=gpt, download_dir=args.download_dir,
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/entrypoints/llm.py", line 105, in init
self.llm_engine = LLMEngine.from_engine_args(engine_args)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 250, in from_engine_args
engine = cls(*engine_configs,
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 110, in init
self._init_workers(distributed_init_method)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 146, in _init_workers
self._run_workers(
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 755, in _run_workers
self._run_workers_in_batch(workers, method, *args, **kwargs))
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/engine/llm_engine.py", line 729, in _run_workers_in_batch
output = executor(*args, **kwargs)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/worker.py", line 79, in load_model
self.model_runner.load_model()
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/worker/model_runner.py", line 57, in load_model
self.model = get_model(self.model_config)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/model_loader.py", line 72, in get_model
model.load_weights(model_config.model, model_config.download_dir,
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/models/llama.py", line 340, in load_weights
weight_loader(param, loaded_weight)
File "/home/xxx/anaconda3/envs/rag/lib/python3.8/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 80, in weight_loader
assert loaded_weight.shape[parallel_dim] == self.num_embeddings
AssertionError
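
The assertion compares the row count of the saved embedding tensor against the vocabulary size vLLM expects (Llama-2's 32,000 entries plus the Self-RAG special tokens). A minimal sketch of how to confirm the mismatch, assuming a placeholder checkpoint directory:

```python
from safetensors import safe_open
from transformers import AutoTokenizer

ckpt_dir = "output/self-rag-7b-repro"  # placeholder for the fine-tuned checkpoint

# Vocabulary size the model is expected to have after adding special tokens.
tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)
print("tokenizer vocab size:", len(tokenizer))

# Shape of the embedding actually stored in the consolidated file.
with safe_open(f"{ckpt_dir}/model.safetensors", framework="pt") as f:
    shape = f.get_slice("model.embed_tokens.weight").get_shape()
print("saved embed_tokens shape:", shape)  # a mismatch here triggers the AssertionError
```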

@Jack-ZC8

I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...
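
In script form, the workaround is just to remove the incomplete consolidated file so the weight loader picks up the sharded checkpoint files instead. A sketch, with the directory name as a placeholder:

```python
import os

ckpt_dir = "output/self-rag-7b-repro"  # placeholder checkpoint directory

# Delete the consolidated file with the truncated embeddings; the sharded
# files next to it still contain the full weights and get loaded instead.
single = os.path.join(ckpt_dir, "model.safetensors")
if os.path.exists(single):
    os.remove(single)
```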

@hummingbird2030

> I encountered the problem (assert loaded_weight.shape[parallel_dim] == self.num_embeddings), but I solved it by deleting the file model.safetensors ...

Thanks, I solved the problem the same way!

@fate-ubw

@HazekiahWon Hello~
Recently, I also trained selfrag-7b based on llama2 and encountered the same problem as you: using the training data and scripts provided by Self-RAG, I obtained selfrag-7b-myversion by fine-tuning llama2-7b. When I evaluated selfrag-7b-myversion, I found that its performance metrics were not as good as those of the officially released selfrag-7b. I saw that you reported 0.503 accuracy for your fine-tuned model on the TQA dataset. Could you please share your model's results on the PopQA, ARC-Challenge, and PubHealth datasets as well? Thank you very much, and I look forward to your reply.
