Hi LLaVA team, thanks for the great work. What you have done is really impressive!
I am trying to understand what we can do with LLaVA, so I tried to extract information from a photo, but the answer was a bit interesting.
LLaVA is able to tell that the uploaded photo is a certificate of incorporation, but it reports the wrong UEN. Is this because the training is insufficient, or is it a limitation of the transformer? Thank you for your great effort!


