Regarding "<ImageHere>" #225

Closed
thiner opened this issue Mar 20, 2024 · 4 comments

thiner commented Mar 20, 2024

  1. Is <ImageHere> a fixed placeholder in the text prompt?
  2. What kind of value does the VL model expect: a path, a URL, or a base64-encoded image?

yuhangzang (Collaborator) commented

Hi thiner, you may refer to this line and this line for your questions.

thiner commented Mar 26, 2024

@yhcao6 Thanks for your answer.
I'd like to summarize what I learned from the code; please correct me if I have misunderstood the logic.

  1. <ImageHere> is a fixed placeholder that separates the image from the text prompt.
  2. XComposer-VL expects the image input to be either a path that PIL.Image.open can open or a torch.Tensor instance.
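
To illustrate point 1, here is a minimal sketch of how a fixed placeholder can separate a prompt into text segments. It is not the repository's code: `split_prompt` is a hypothetical helper, and the assumption that the prompt is split on the literal string `<ImageHere>` should be checked against the lines the collaborator linked.

```python
from typing import List

# Literal placeholder string discussed in this issue.
IMAGE_PLACEHOLDER = "<ImageHere>"


def split_prompt(prompt: str) -> List[str]:
    """Hypothetical helper: return the text segments around each placeholder.

    A prompt containing n placeholders yields n + 1 segments, so an image
    could be slotted in between every pair of neighbouring segments.
    """
    return prompt.split(IMAGE_PLACEHOLDER)


segments = split_prompt("<ImageHere>Describe this image.")
# One placeholder gives two segments: the (empty) text before it
# and the text after it.
print(segments)
```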

Based on the summary above, I have a further question: does XComposer-VL support multiple images as input? I think it is not currently supported, is it?

yuhangzang (Collaborator) commented Mar 28, 2024

XComposer-VL supports multiple images as input, e.g., query = '<ImageHere> <ImageHere> balabala', img_path = ['a.jpg', 'b.jpg']
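
The multi-image inputs above can be assembled and sanity-checked with a short sketch like the one below. `build_query` and `check_inputs` are hypothetical helpers, not part of XComposer-VL, and the rule that the number of placeholders must equal the number of image paths is an assumption inferred from the example, not something confirmed in this thread.

```python
from typing import List

# Literal placeholder string discussed in this issue.
IMAGE_PLACEHOLDER = "<ImageHere>"


def build_query(text: str, image_paths: List[str]) -> str:
    """Hypothetical helper: prefix the text with one placeholder per image."""
    placeholders = " ".join(IMAGE_PLACEHOLDER for _ in image_paths)
    return f"{placeholders} {text}".strip()


def check_inputs(query: str, image_paths: List[str]) -> None:
    """Assumed rule: placeholder count must match the number of images."""
    found = query.count(IMAGE_PLACEHOLDER)
    if found != len(image_paths):
        raise ValueError(
            f"query has {found} placeholder(s) but {len(image_paths)} image(s) were given"
        )


img_path = ["a.jpg", "b.jpg"]
query = build_query("balabala", img_path)
check_inputs(query, img_path)  # passes: two placeholders, two images
print(query)
```

The resulting `query` and `img_path` have the same shape as the values in the reply above; how they are passed to the model is left out because the call signature is not shown in this thread.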

yuhangzang (Collaborator) commented

Kindly reopen this issue if you have any further questions.
