Inconsistent performance on REC task #49

Open
ZhanYang-nwpu opened this issue Oct 13, 2023 · 4 comments

@ZhanYang-nwpu

The REC performance I get from Shikra is surprisingly low.
I built the shikra-7b model by applying the shikra-7b-delta-v1 delta weights to vicuna-7b as the base model.
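
A note on the delta step, since it matters for reproduction: a delta checkpoint only reconstructs the intended weights when it is added to the exact base it was computed against, so which base the delta expects is worth checking in the repository README. The toy sketch below illustrates this under the assumption that the delta is purely additive (target = base + delta). Plain Python lists stand in for tensors, and every name in it is hypothetical rather than taken from the Shikra code.

```python
# Toy illustration of additive delta weights (assumption: target = base + delta).
# Plain lists stand in for tensors; all names here are hypothetical.

def apply_delta(base, delta):
    """Return merged weights, assuming an elementwise additive delta."""
    if base.keys() != delta.keys():
        raise KeyError(f"parameter names differ: {sorted(base.keys() ^ delta.keys())}")
    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


def max_abs_diff(a, b):
    """Largest elementwise difference between two weight dicts with the same layout."""
    return max(abs(x - y) for name in a for x, y in zip(a[name], b[name]))


# The checkpoint the delta was computed against, and the fine-tuned weights.
intended_base = {"layer.weight": [0.5, -1.0, 2.0]}
finetuned = {"layer.weight": [0.75, -0.5, 2.0]}
delta = {k: [f - b for f, b in zip(finetuned[k], intended_base[k])] for k in finetuned}

# A different checkpoint with the same layout, e.g. another fine-tune of the same architecture.
other_base = {"layer.weight": [0.25, -1.0, 2.5]}

recovered = apply_delta(intended_base, delta)   # equals the fine-tuned weights
mismatched = apply_delta(other_base, delta)     # loads without error but is not the same model
```

The second merge succeeds without any error because the shapes agree, which is why a wrong base can go unnoticed until the evaluation numbers come out low.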

I evaluated the shikra-7b model on RefCOCO testA and RefCOCO testB, but got only 79.64% and 64.54% overall accuracy, respectively.
This does not match the results reported in Table 3.

Do you have any suggestions?
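
When reproduced numbers disagree with reported ones, the metric itself is one thing worth ruling out. REC accuracy is commonly computed as the fraction of predicted boxes whose IoU with the ground-truth box exceeds 0.5. The sketch below shows that computation under stated assumptions: predictions are normalized `[x1, y1, x2, y2]` boxes, ground truth is COCO-style pixel `[x, y, w, h]`, and the threshold is 0.5. The helper names are hypothetical and not from the Shikra repository, and the data is a toy example.

```python
# Sketch of REC accuracy at an IoU threshold. Helper names are hypothetical.

def denormalize_xyxy(box, width, height):
    """Scale a normalized [x1, y1, x2, y2] box to pixel coordinates."""
    x1, y1, x2, y2 = box
    return (x1 * width, y1 * height, x2 * width, y2 * height)


def xywh_to_xyxy(box):
    """Convert a COCO-style [x, y, w, h] box to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def iou(a, b):
    """Intersection over union of two xyxy boxes in the same coordinate system."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def rec_accuracy(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of predictions whose IoU with the ground truth exceeds the threshold."""
    hits = sum(iou(p, g) > threshold for p, g in zip(pred_boxes, gt_boxes))
    return hits / len(gt_boxes)


width, height = 200, 100                       # toy image size
gt_xywh = (50, 25, 100, 50)                    # toy ground truth, pixel xywh
preds_norm = [(0.25, 0.25, 0.75, 0.75),        # matches the ground truth exactly
              (0.0, 0.0, 0.5, 0.5)]            # overlaps only one corner

preds = [denormalize_xyxy(p, width, height) for p in preds_norm]
gts = [xywh_to_xyxy(gt_xywh)] * len(preds)
accuracy = rec_accuracy(preds, gts)            # one hit out of two

# Pitfall: reading the xywh ground truth as if it were xyxy turns the exact match into a miss.
misread_iou = iou(preds[0], gt_xywh)
```

A box-format or coordinate-scale mix-up like the one in the last line lowers accuracy without raising any error, so it is a cheap thing to check before suspecting the weights.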

@niiickZ

niiickZ commented Dec 12, 2023

Hi, I'm facing the same problem when reproducing the method. Have you solved it?

@ZhanYang-nwpu
Author

> Hi, I'm facing the same problem when reproducing the method. Have you solved it?

Sorry, I still haven't solved the problem. I couldn't figure out the cause, and I stopped using the Shikra model after that.

@Yeemkt

Yeemkt commented Dec 27, 2023

> The REC performance I get from Shikra is surprisingly low. I built the shikra-7b model by applying the shikra-7b-delta-v1 delta weights to vicuna-7b as the base model.
>
> I evaluated the shikra-7b model on RefCOCO testA and RefCOCO testB, but got only 79.64% and 64.54% overall accuracy, respectively. This does not match the results reported in Table 3.
>
> Do you have any suggestions?

Can you tell me how to evaluate the model on RefCOCO or other custom datasets?
Thanks!

@sunsmarterjie

Could you provide the test code for RefCOCO?
