
My results in open-domain QA are much lower using the given checkpoint for CEPE-LLaMA-2-7B. Could you provide some insights into the potential causes for this decline? #1

Closed
sunnynexus opened this issue Mar 5, 2024 · 3 comments

Comments

@sunnynexus
I'm curious about the discrepancies between my results (in red) and the results reported in your paper (in black), both obtained with the default parameters via the run_qa.sh script.

[image: table comparing the reproduced results with the paper's reported results]

Could there be any potential errors on my end that could explain these differences?

@howard-yen (Collaborator)

Hi, thanks for your interest in our work.
For CEPE at k = 10, we put all the passages in the decoder model only, so the results should match those of LLaMA-2. There may be a mistake in the config file, which I will look into.
Are you also using the QA files from the Google Drive?

@sunnynexus (Author)

> Hi, thanks for your interest in our work. For CEPE at k = 10, we only use and put all the passages in the decoder model, which should match the results for LLaMA-2. There might have been a mistake in the config file, which I will look into. Are you also using the QA files from the google drive?

Thank you for your reply. Yes, I used the QA files from the Google Drive.

@sunnynexus (Author)

I have tried running it multiple times, but the results are still no better than those of the base LLaMA-2-7B model.
