Great work! I've read the paper, and it seems that LLaVA+S^2 is implemented with an OpenCLIP vision encoder, and the LLM is fine-tuned with LoRA. However, the LLaVA baseline you compare against uses the OpenAI CLIP vision encoder, and its LLM is fully fine-tuned (without LoRA).
If I'm right, I wonder whether you have tried using the same vision encoder, or fully fine-tuning the LLM, and what the results are under those settings? Thank you.
Hi @jungle-gym-ac, yeah, good question. In the scaling experiment on LLaVA (Fig. 3 in the paper), all the models, including the baselines, use OpenCLIP. The experiment comparing LLaVA-S^2 to the official LLaVA (Table 11 in the Appendix) uses OpenAI CLIP.
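For readers who want to see the encoder difference concretely, here's a minimal sketch of loading the two kinds of vision towers; the specific OpenCLIP checkpoint name is an illustrative assumption, not necessarily the one used in the paper.

```python
# Minimal sketch of the two vision-encoder setups discussed in this thread.
# The OpenCLIP checkpoint below is an assumption for illustration.
import torch
import open_clip                              # OpenCLIP tower (LLaVA+S^2 scaling experiments)
from transformers import CLIPVisionModel      # OpenAI CLIP tower (official LLaVA baseline)

# OpenAI CLIP vision tower, as used by the official LLaVA checkpoint.
openai_tower = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14-336")

# An OpenCLIP vision tower trained on LAION-2B (illustrative choice).
openclip_model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-L-14", pretrained="laion2b_s32b_b82k"
)

x = torch.randn(1, 3, 224, 224)               # dummy image batch
with torch.no_grad():
    feats = openclip_model.encode_image(x)    # pooled image features
print(feats.shape)                            # torch.Size([1, 768]) for ViT-L/14
```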
And you are right: all the models we trained on LLaVA use LoRA, while the official LLaVA checkpoint we compare to uses full fine-tuning. According to the official LLaVA repo, LLaVA's performance with full fine-tuning vs. LoRA doesn't differ much on average, but yeah, comparing to a LoRA checkpoint would be fairer. We will include this in a later version of the paper. And we didn't try LLaVA-S^2 with full fine-tuning.
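As a rough illustration of the LoRA vs. full fine-tuning distinction, here is a sketch using HuggingFace PEFT; the base model and the LoRA hyperparameters (rank, alpha, target modules) are assumptions for illustration, not the paper's exact recipe.

```python
# Sketch of LoRA fine-tuning an LLM with HuggingFace PEFT. The base model
# and LoRA hyperparameters are illustrative assumptions, not the paper's
# exact training configuration.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

llm = AutoModelForCausalLM.from_pretrained("lmsys/vicuna-7b-v1.5")

lora_cfg = LoraConfig(
    r=128,                     # adapter rank (assumed value)
    lora_alpha=256,            # scaling factor (assumed value)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
llm = get_peft_model(llm, lora_cfg)
# Only the low-rank adapters are trainable; the base LLM weights stay frozen,
# unlike full fine-tuning, where every parameter is updated.
llm.print_trainable_parameters()
```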
The training recipe is exactly the same as LLaVA's. A loss of 2.5 seems weird; that is almost the same loss as an untrained model. Did you change anything from the llava repo that may cause unexpected behavior?
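One quick way to check whether 2.5 is really "untrained-level" on your data (a suggestion, not part of the LLaVA recipe) is to measure the base checkpoint's loss on a training sample before any fine-tuning and compare; the model name here is an assumption.

```python
# Sanity check: evaluate the base LLM checkpoint's loss on a sample and
# compare it to the ~2.5 observed during training. Model name is assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "lmsys/vicuna-7b-v1.5"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

batch = tok("A short text sample drawn from your training data.", return_tensors="pt")
with torch.no_grad():
    out = model(**batch, labels=batch["input_ids"])
print(out.loss.item())  # baseline loss before fine-tuning
```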