If I want to use another LLM model, which parts of the code do I need to customize? #70

Closed
liuweie opened this issue May 15, 2024 · 5 comments

Comments

@liuweie

liuweie commented May 15, 2024

Hi, I am trying to use llm2vec with another LLM (not Mistral or Llama). Which parts of the code should I customize?
I think the code in the models section of the llm2vec library needs to be modified, for example by adding a file similar to bidirectional_llama.py. Am I right?

thx

@vaibhavad
Collaborator

Yes, that is correct. Some portions of the model-loading code will also need to be changed:
https://github.com/McGill-NLP/llm2vec/blob/main/llm2vec/llm2vec.py#L59

Finally, if you want to define a custom template, it will require changes here. In future versions of the library, the template will be specified outside the package (#56).

Let me know if you have any further questions.
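
As a rough illustration of the kind of loading change described above, the sketch below maps a config class name to a bidirectional model class and fails loudly for unsupported architectures. It is a sketch under assumptions, not the library's actual code: `BIDIRECTIONAL_MODELS`, `get_model_class`, `MyModelConfig`, and `MyModelBiModel` are hypothetical names, and the empty classes are stand-ins for real model subclasses such as the one in bidirectional_llama.py.

```python
# Stand-in classes for illustration only. In a real fork these would be
# full model subclasses, like the one defined in bidirectional_llama.py.
class LlamaBiModel:
    pass


class MistralBiModel:
    pass


class MyModelBiModel:  # hypothetical class for the newly added architecture
    pass


# Hypothetical registry: config class name -> bidirectional model class.
BIDIRECTIONAL_MODELS = {
    "LlamaConfig": LlamaBiModel,
    "MistralConfig": MistralBiModel,
    "MyModelConfig": MyModelBiModel,  # the one entry a new model would add
}


def get_model_class(config_class_name: str):
    """Return the bidirectional class registered for a config class name."""
    try:
        return BIDIRECTIONAL_MODELS[config_class_name]
    except KeyError:
        supported = ", ".join(sorted(BIDIRECTIONAL_MODELS))
        raise ValueError(
            f"{config_class_name} is not supported yet; supported: {supported}"
        ) from None
```

With a lookup like this, supporting another model would come down to writing the new bidirectional class and registering it, while unsupported architectures raise a clear error instead of silently loading a causal model.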

@liuweie
Author

liuweie commented May 20, 2024

> Yes, that is correct, some portions of loading the model will also need to be changed https://github.com/McGill-NLP/llm2vec/blob/main/llm2vec/llm2vec.py#L59
>
> Finally, if you want to define a custom template, it will require changes here. In the future versions of the library, the template will be specified outside the package (#56 )
>
> Let me know if you have any further questions.

Thanks for your reply! By the way, I found that if I want to use my own customized dataset, I also need to add dataset code to the llm2vec package, just like E5data.py in the dataset module. I don't think this is a very flexible approach. Is there another way to do this?

@ciekawy

ciekawy commented May 20, 2024

I tried phi3, and while I managed to convert it, I see that the tensor names changed, e.g. the "model." prefix is missing.
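
One way to inspect this kind of mismatch is to list the checkpoint keys that lack the expected prefix and, if that turns out to be the only difference, re-add it before loading. The sketch below works on a plain dict standing in for a state dict. The helper names and the example tensor names are hypothetical, and whether restoring the prefix is the right fix depends on how the checkpoint was saved.

```python
def keys_missing_prefix(state_dict, prefix="model."):
    """Return the sorted keys that do not start with the expected prefix."""
    return sorted(key for key in state_dict if not key.startswith(prefix))


def add_prefix(state_dict, prefix="model."):
    """Return a copy of state_dict in which every key starts with prefix.

    Keys that already carry the prefix are left unchanged.
    """
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in state_dict.items()
    }


# Hypothetical tensor names; the string values are placeholders for tensors.
converted = {
    "embed_tokens.weight": "tensor-0",
    "layers.0.mlp.down_proj.weight": "tensor-1",
    "model.norm.weight": "tensor-2",
}

print(keys_missing_prefix(converted))
fixed = add_prefix(converted)
print(sorted(fixed))
```

Comparing the output of `keys_missing_prefix` against the key names the target model expects shows whether the prefix is the only difference before any renaming is applied.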

@vaibhavad
Collaborator

Do you have a fork where I can see the code?

@vaibhavad
Collaborator

Closing as it is stale. Feel free to re-open if you have any more questions.
