I am trying to reproduce the InstructBLIP paper's results on GQA and TextVQA. With both the HuggingFace and the LAVIS versions of the models, my numbers are consistently below those reported in the paper's table, by up to 10%. Specifically, InstructBLIP Vicuna 7B is 1% worse on GQA and 10% worse on TextVQA, while InstructBLIP Vicuna 13B is 6% worse on GQA and also 10% worse on TextVQA.
I have made sure to match the prompting strategy described in Appendix E of the paper. I have also tried a number of decoding strategies (number of beams, sampling hyperparameters), and these change performance by only 1-2% in total. The dataset loading and scoring code has been validated on other open-source models (e.g. LLaVA), whose reported numbers it reproduces.
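To make the scoring step concrete, the sketch below shows one common way to compute the two metrics: exact-match accuracy for GQA and the simplified VQA soft accuracy, min(matches / 3, 1), for TextVQA. It is illustrative only and is not the exact code behind the numbers above. The function names and the answer normalization (lowercasing, stripping punctuation and articles) are assumptions, and the official VQA evaluation applies additional normalization and averaging over annotator subsets.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Assumed normalization: lowercase, drop punctuation and articles, collapse spaces."""
    text = text.lower().strip()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """GQA-style score: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def vqa_soft_accuracy(prediction: str, human_answers: list[str]) -> float:
    """Simplified VQA accuracy: min(number of matching human answers / 3, 1)."""
    pred = normalize_answer(prediction)
    matches = sum(normalize_answer(a) == pred for a in human_answers)
    return min(1.0, matches / 3.0)


def mean_score(scores: list[float]) -> float:
    """Average per-question scores into a dataset-level accuracy in [0, 1]."""
    return sum(scores) / len(scores) if scores else 0.0
```

Small differences in normalization (for example, whether trailing punctuation or articles are stripped from the model output) can shift these scores, so it may be worth checking this step against whatever the paper's evaluation used.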
Do you have any suggestions on missing hyperparameters that could cause this? Even better would be a script that reproduces the paper's numbers for InstructBLIP; I currently cannot find such a script in LAVIS.
Thank you!