
LLAVA Configuration #737

Closed
hswlab opened this issue May 13, 2024 · 4 comments
hswlab (Contributor) commented May 13, 2024

Description

I'm having difficulty figuring out how to correctly configure the LLava example.

First, I initialized the backend with the paths to libllama.dll and llava_shared.dll:
NativeLibraryConfig.Instance.WithLibrary(llamaPath, llavaPath);

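For context, the initialization (shown in a screenshot in the original post) can be sketched like this. The two paths are placeholders for wherever your native binaries live; `NativeLibraryConfig.Instance.WithLibrary` is the call quoted above, and in LLamaSharp it must run before any other call that loads the backend:

```csharp
using LLama.Native;

// Placeholder paths: point these at your own copies of the native libraries.
string llamaPath = @"runtimes\win-x64\native\libllama.dll";
string llavaPath = @"runtimes\win-x64\native\llava_shared.dll";

// Must be configured before the native backend is loaded;
// calling it after the first model load has no effect.
NativeLibraryConfig.Instance.WithLibrary(llamaPath, llavaPath);
```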

Then I tried to implement something like this example shows. I don't understand where to find the suitable models I need for:

string multiModalProj = UserSettings.GetMMProjPath();
string modelPath = UserSettings.GetModelPath();


modelPath, I believe, is the model I can download here. But what is a clipModel, and where can I get it?
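For reference, the sample in question looks roughly like the sketch below. It is modeled on the LLamaSharp LLava example (exact constructor and method signatures may differ between versions); the "clipModel" is the multimodal projector file (mmproj-*.gguf) loaded alongside the quantized language model:

```csharp
using LLama;
using LLama.Common;

// From the sample: two files, not one.
string multiModalProj = UserSettings.GetMMProjPath(); // e.g. mmproj-model-f16.gguf
string modelPath      = UserSettings.GetModelPath();  // e.g. llava-v1.6-mistral-7b.Q3_K_XS.gguf

var parameters = new ModelParams(modelPath);
using var model     = LLamaWeights.LoadFromFile(parameters);      // language model
using var clipModel = LLavaWeights.LoadFromFile(multiModalProj);  // CLIP + projector weights
using var context   = model.CreateContext(parameters);

// The executor takes both: the text model (via the context)
// and the clip/projector weights for image input.
var executor = new InteractiveExecutor(context, clipModel);
```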

SignalRT (Collaborator) commented May 13, 2024

You can see the URLs of the models used in the example and unit tests in LLama.Unittest.csproj:

https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/llava-v1.6-mistral-7b.Q3_K_XS.gguf
https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/mmproj-model-f16.gguf

You will find both files in any vision model. Example:

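The two URLs above can be fetched directly. A minimal console sketch (assumes .NET 6+ top-level statements; the URLs are the ones linked above, everything else is illustrative):

```csharp
using System;
using System.IO;
using System.Net.Http;

// Both halves of the vision model: the quantized LLM and the mmproj file.
string[] urls =
{
    "https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/llava-v1.6-mistral-7b.Q3_K_XS.gguf",
    "https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf/resolve/main/mmproj-model-f16.gguf",
};

using var http = new HttpClient();
foreach (var url in urls)
{
    var fileName = Path.GetFileName(new Uri(url).AbsolutePath);
    await using var download = await http.GetStreamAsync(url);
    await using var file = File.Create(fileName);
    await download.CopyToAsync(file);
    Console.WriteLine($"Downloaded {fileName}");
}
```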

SignalRT self-assigned this May 13, 2024
hswlab (Contributor, Author) commented May 13, 2024

Ah, thank you. So both models can be found on Hugging Face. That's completely new to me; usually I'm using just a single model ^^'

SignalRT (Collaborator) commented
Yes, you should download both files for the model you choose to use. Normally there are several quantized variants of the language model and one projection model.

LLaVA uses a CLIP vision encoder with a multimodal projection (mmproj).

You can find the details in this paper:

https://arxiv.org/pdf/2310.03744
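To show how the two files come together at inference time, here is a hedged sketch of image-prompted generation, continuing the executor from earlier in the thread. It follows the shape of the LLamaSharp LLava sample; the `Images` property and prompt format are assumptions that may differ between library versions:

```csharp
using System;
using System.IO;
using LLama.Common;

// Assumes `executor` was built from both the model and the mmproj weights,
// as in the earlier sketch. "demo.jpg" is a placeholder image path.
byte[] image = await File.ReadAllBytesAsync("demo.jpg");
executor.Images.Add(image); // queue the image for the next inference call

// The <image> tag marks where the projected image embeddings are spliced
// into the prompt (prompt format varies by model).
var prompt = "<image>\nUSER: What is in this picture?\nASSISTANT:";
await foreach (var token in executor.InferAsync(prompt, new InferenceParams { MaxTokens = 256 }))
    Console.Write(token);
```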

AsakusaRinne (Collaborator) commented
Maybe some documentation is necessary. :D

hswlab closed this as completed May 14, 2024