Thanks for providing the code.
I have a question about the training process. While the LLM is learning to generate images, it is trained to produce the same text as its input. Nevertheless, at inference time the model produces sensible responses to the input. So I'm wondering: are any additional instruction datasets used to fine-tune the model's instruction-following ability? If so, could those instruction fine-tuning resources be made publicly available?
We do not use any instruction datasets; the model is only finetuned on the CC3M image + caption data. I do think that GILL would benefit greatly from finetuning on instructions, though!
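To illustrate why training text "echoes" the input: in caption-only finetuning, the caption is both the model input and the next-token prediction target (teacher forcing). Here is a minimal, hypothetical sketch of that setup using a small causal LM (`facebook/opt-125m` stands in for the larger OPT model GILL uses; this is not GILL's actual training code, and it omits the learned image tokens and visual embeddings):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch of caption-only finetuning (assumed setup, not GILL's
# actual code). Because labels == input_ids, the training target is
# literally the input text, which is why the model appears to generate
# the same text as the input during training.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

caption = "a dog playing fetch in a park"  # a CC3M-style caption
ids = tokenizer(caption, return_tensors="pt").input_ids

# Teacher forcing: the LM is trained to reproduce the caption token by
# token (the label shift is handled internally by the model).
loss = model(input_ids=ids, labels=ids).loss
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"caption LM loss: {loss.item():.4f}")
```

At inference, the same LM weights are simply prompted with new text, so the model falls back on its pretrained language-modeling ability to produce sensible responses even though no instruction data was used.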