LLaVA-based architecture #11

Open
silence143 opened this issue Mar 27, 2024 · 1 comment

@silence143

How is the Octopus dataset organized and trained on the LLaVA architecture? LLaVA doesn't support in-context learning, and if we merge all subtasks into a multi-turn conversation, another problem arises: LLaVA would input all of the subtasks' image embeddings at once, which seems hard to avoid.
So how do you deal with that? Do you input no images and only use the environment information? Could you provide a demo.json showing how the dataset is organized for the LLaVA architecture? Thanks a lot.
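For reference, standard LLaVA instruction tuning expects a JSON list of entries, each carrying an image and a multi-turn `conversations` list whose human turns contain an `<image>` placeholder token. Below is a minimal Python sketch of what a merged multi-subtask entry might look like; the subtask text, plan strings, and the multi-image `image` list are hypothetical (vanilla LLaVA uses a single image string per entry), and this is not the actual Octopus format.

```python
import json

# Hypothetical merged multi-subtask entry in a LLaVA-style "conversations"
# layout. Field contents are illustrative only and do not reflect the real
# Octopus dataset; vanilla LLaVA uses a single image string, not a list.
demo_entry = {
    "id": "episode_0001",
    # One image per subtask; this is what would make LLaVA embed all
    # subtask images for the sample at once.
    "image": ["subtask_0.png", "subtask_1.png"],
    "conversations": [
        {"from": "human",
         "value": "<image>\nEnvironment: kitchen. Subtask 1: find the mug."},
        {"from": "gpt",
         "value": "plan: walk_to(counter); grasp(mug)"},
        {"from": "human",
         "value": "<image>\nSubtask 2: place the mug in the sink."},
        {"from": "gpt",
         "value": "plan: walk_to(sink); place(mug, sink)"},
    ],
}

# Write a one-entry demo.json in the layout sketched above.
with open("demo.json", "w") as f:
    json.dump([demo_entry], f, indent=2)
```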

@Jingkang50
Collaborator

Jingkang50 commented Mar 27, 2024

Thank you for your interest in our work!

The LLaVA version of Octopus will be released once the official video version of LLaVA is released, as we used some internal code from that project. The release should be soon, but I am not quite sure about the exact date.
