Dear Developers: we have some basic questions! #23
Comments
Thank you for your interest in our work!
How long did you train your models on A100 GPUs?
For Bunny-v1.0-3B, pretraining takes about 13 hours and fine-tuning about 12 hours.
Is it possible to add more multimodal data: not just text and images, but also intermediate states of processes that cannot be described with language or images?
@QiaoTuCodes Regarding the second question (launching the controller, Web-UI server, and Model Worker with one bash command): you may refer to the HuggingFace Space.
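For readers who want a single entry point locally, here is a minimal sketch of a launcher that starts all three services from one script. The module paths and flags (`bunny.serve.controller`, `bunny.serve.model_worker`, `bunny.serve.gradio_web_server`, the ports, and the model path) are assumptions modeled on LLaVA-style serving layouts, not confirmed Bunny entry points; check the repo's README for the real commands.

```python
import subprocess
import sys
import time

# NOTE: module paths and flags below are assumptions (LLaVA-style layout);
# substitute the actual Bunny serving commands from the repo documentation.
COMMANDS = [
    # 1. Controller: registers workers and routes requests.
    [sys.executable, "-m", "bunny.serve.controller",
     "--host", "0.0.0.0", "--port", "10000"],
    # 2. Model worker: loads the model and serves inference.
    [sys.executable, "-m", "bunny.serve.model_worker",
     "--controller", "http://localhost:10000",
     "--model-path", "BAAI/Bunny-v1_0-3B"],
    # 3. Gradio web server: the user-facing UI.
    [sys.executable, "-m", "bunny.serve.gradio_web_server",
     "--controller", "http://localhost:10000"],
]

procs = []
try:
    for cmd in COMMANDS:
        procs.append(subprocess.Popen(cmd))
        time.sleep(10)  # crude startup gap so the controller is up first
    for p in procs:
        p.wait()  # keep the script alive while the services run
except KeyboardInterrupt:
    for p in procs:
        p.terminate()
```

Wrapping the three processes this way does not change the microservices design; it only gives a single command to start and stop them together.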
@hxypqr We use a vision tower to encode the images and then map the vision embeddings into the LLM embedding space with an MLP. So you can bring in another kind of data with a suitable encoder and projector.
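To make the encoder-plus-projector idea concrete, here is a minimal PyTorch sketch. The two-layer GELU MLP and the dimensions (roughly a SigLIP-style encoder feeding a Phi-2-sized LLM) are illustrative assumptions, not Bunny's actual configuration, and the `ModalityProjector` name is hypothetical.

```python
import torch
import torch.nn as nn

class ModalityProjector(nn.Module):
    """Two-layer MLP mapping encoder embeddings into the LLM embedding space.
    Dimensions and depth are illustrative, not Bunny's actual config."""
    def __init__(self, encoder_dim: int = 1152, llm_dim: int = 2560):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(encoder_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, embeds: torch.Tensor) -> torch.Tensor:
        # embeds: (batch, num_tokens, encoder_dim) from the modality encoder
        return self.proj(embeds)

# Vision path: SigLIP-style patch embeddings -> LLM token space.
vision_proj = ModalityProjector(encoder_dim=1152, llm_dim=2560)
image_tokens = vision_proj(torch.randn(1, 729, 1152))   # -> (1, 729, 2560)

# A new modality (e.g. intermediate process states) plugs in the same way
# with its own encoder and projector; the resulting tokens are concatenated
# with the text embeddings before being fed to the LLM.
state_proj = ModalityProjector(encoder_dim=256, llm_dim=2560)  # hypothetical
state_tokens = state_proj(torch.randn(1, 16, 256))      # -> (1, 16, 2560)
```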
Closing the issue for now since there is no further discussion. Feel free to reopen it if there are any other questions.
Dear Developers:

Thank you to the BAAI team for open-sourcing the Bunny model. I've been actively exploring it over the past few days and have a few questions about deploying the model that I hope the BAAI technical team can answer. I'm extremely grateful in any case!

1. What GPU resources are required to run the various versions of the model, for example the full-parameter Bunny-v1_0-3B and the bunny-phi-2-siglip-lora version? Could you provide a comparison list of the officially recommended GPU models and VRAM sizes?
2. Can this model integrate the controller, Web-UI server, and Model Worker into one bash command? Currently it seems that three separate bash commands must be executed to start the controller, the Web UI, and model inference, which I assume reflects a microservices or distributed-system architecture. Is my understanding correct? And if we deploy with Docker containers and manage them with Kubernetes, could an official post explain the standard deployment process in more detail?