
llama.cpp and gguf files #1760

Open
denijane opened this issue Apr 20, 2024 · 8 comments
Labels
question Further information is requested

Comments

@denijane

I'm trying to create a flow using a locally run llama3 model. I tried using ollama to run the llama3 model, but I'm getting strange responses. (The response simply doesn't stop, and I'm watching the AI talk to itself.) Also, it is very slow, about 10 times slower than what I get from running ollama directly.

Then I decided to use the downloaded model directly with LlamaCpp to see if it works better. First thing: LLM->LlamaCPP accepts only .bin files, while the newer format is .gguf.

Then I downloaded an older model that is in .bin format, and I'm getting:
"ValueError: Error building node LlamaCpp(ID:LlamaCpp-BzhwI): Could not import llama-cpp-python library. Please install the llama-cpp-python library to use this embedding model: pip install llama-cpp-python"

I spent the night debugging this and I'm scratching my head. llama-cpp-python imports Llama, while LlamaCpp, the class the error mentions, should be imported from langchain_community, not from llama-cpp-python.

I made a test in Python, and after importing LlamaCpp from langchain_community, I was able to run Meta-Llama-3-8B-Instruct-Q6_K.gguf fine, but not llama-2-7b-chat.ggmlv3.q3_K_L.bin, which returns error(type=value_error).
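
A minimal sketch of that test (the model paths are whatever you have locally):

```python
# Sketch: loading both formats through langchain_community's LlamaCpp.
from langchain_community.llms import LlamaCpp

# The GGUF model loads and answers normally:
llm = LlamaCpp(model_path="./Meta-Llama-3-8B-Instruct-Q6_K.gguf")
print(llm.invoke("What does GGUF stand for?"))

# The older GGML .bin model fails pydantic validation with
# error(type=value_error), since llama.cpp dropped GGML support:
# llm = LlamaCpp(model_path="./llama-2-7b-chat.ggmlv3.q3_K_L.bin")
```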

I also made a test with from llama_cpp import Llama - again, it works with .gguf files but not with .bin files.
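
The same check against the bindings directly, again as a sketch:

```python
# Sketch: calling llama-cpp-python itself, bypassing langchain entirely.
from llama_cpp import Llama

llm = Llama(model_path="./Meta-Llama-3-8B-Instruct-Q6_K.gguf")
out = llm("Q: What is GGUF? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])

# The same call with the .ggmlv3 .bin file raises an error at load time.
```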

So I'm not sure which library LangFlow uses. Maybe it's just a naming convention, calling it LlamaCPP when it's actually calling Llama, or it really is LlamaCpp and the error message about the library is wrong. Either way, the file format restriction is definitely wrong, and LlamaCPP simply doesn't work.

@YamonBot
Contributor

I recommend using Ollama, which has focused community support. Recently, I have been working on improvements to this component and expect to complete them within 2-3 days; I have verified that it works correctly. In the meantime, I suggest you review my draft and make your own modifications to get it running temporarily. (In my draft, remove the "buildConfig" section before using it, due to an incorrect implementation of the buildConfig method.)

#1701

@denijane
Author

Hi, I managed to make LlamaCpp work by editing the Python code in the LlamaCpp component to allow .gguf files (I also played with some source files, but I don't think that's what did it). So it runs.

Now the problem is similar to the one with using ollama: 1) it's very slow (compared to just doing >ollama run llama3 and talking to it), and 2) it doesn't stop. It starts generating human responses and then replying to them, and it goes on forever.

So if you fixed at least 2) in the new version, that would be a significant update.
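
For what it's worth, issue 2) is usually a matter of missing stop sequences: without them, the model happily completes both sides of the conversation. A hedged sketch of the workaround (the stop strings and GPU offload setting here are assumptions for a Llama 3 instruct GGUF, not the component's actual defaults):

```python
# Sketch: stop sequences cut generation at turn boundaries; n_gpu_layers=-1
# offloads all layers to the GPU, which also addresses the speed complaint.
from langchain_community.llms import LlamaCpp

llm = LlamaCpp(
    model_path="./Meta-Llama-3-8B-Instruct-Q6_K.gguf",  # assumed local path
    n_gpu_layers=-1,                # -1 = offload every layer
    stop=["<|eot_id|>", "Human:"],  # Llama 3 end-of-turn token + a fallback
)
print(llm.invoke("Explain GGUF in one sentence."))
```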

@anovazzi1
Contributor

Hello @denijane,
Sorry for the delay. Did you try using the new version? Does the error still persist?

@carlosrcoelho
Contributor

Hi @denijane

We hope you're doing well. Just a friendly reminder that if we do not hear back from you within the next 3 days, we will close this issue. If you need more time or further assistance, please let us know.


Thank you for your understanding!

@denijane
Author

Hi, I've been on work trips, so I can't update right now and do proper testing. I'll return home this Friday and I'll have the time to test the issue during the weekend, if this is ok with you.
I have one test script with CustomLlama which seems to be working with both LlamaCpp and Llama from langflow, but I can't test more than this today (moreover, my nvidia GPU seems not to be in the mood to work today; I probably need to restart).

@carlosrcoelho
Contributor

@denijane

No worries at all, we totally understand! Take your time to get everything set up and test. We'll keep the issue open and look forward to your update.

Thanks for letting us know!

@carlosrcoelho carlosrcoelho added the question Further information is requested label Jul 22, 2024
@carlosrcoelho
Contributor

Hi @denijane

We hope you're doing well. Just a friendly reminder that if we do not hear back from you within the next 3 days, we will close this issue. If you need more time or further assistance, please let us know.


Thank you for your understanding!

@denijane
Author

Hi, just a quick note. I tested a RAG flow with ollama - it worked quite nicely. There was a bug when reading ollama models that didn't let me select the model and kept asking for llama2, but after I modified the code to set the URL and the model, everything worked, and it's quite quick. I had a chat about a text document and it worked well.

I'm not sure where the llamacpp option went in this version of langflow or how you use gguf models, so if you can give me a hint, I could test that as well. But ollama with llama3.1:8b worked very well.
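
For anyone hitting the same model-selection bug, a sketch of the manual fix described above (the base_url matches ollama's default local port; the model name is whatever you have pulled):

```python
# Sketch: pointing langchain's Ollama wrapper at a local server and a
# specific model, analogous to hard-coding the url and model in the component.
from langchain_community.llms import Ollama

llm = Ollama(base_url="http://localhost:11434", model="llama3.1:8b")
print(llm.invoke("Summarize retrieval-augmented generation in two sentences."))
```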
