Could we use LLaVA to extract information from a given photo?

Hi LLaVA team, thanks for the great work. What you have done is really impressive!
I am trying to understand what we can do with LLaVA, so I tried to extract information from a photo, but the answer is a bit interesting.

LLaVA is able to tell that the uploaded photo is a certificate of incorporation, but it reports the wrong UEN. Is this because the model has not been trained enough, or is it a limitation of the transformer? Thank you for your great effort!

![image](https://user-images.githubusercontent.com/17609528/236123433-9d2c7de6-8bbd-4244-b4ea-6a85ae19fa1b.png)
![image](https://user-images.githubusercontent.com/17609528/236123633-dd58f7cf-e413-402e-81db-c64da568e478.png)
![Screenshot 2023-05-04 135147](https://user-images.githubusercontent.com/17609528/236123649-13df1e3e-39d8-4940-b096-7fb98761104b.png)

