
[Bug]: I am trying to run ollama/dolphin-phi available on ollama, but /chat/{chat_id}/question throws {"error":"model 'llama2' not found, try pulling it first"} #2056

Open
shubham-dayma opened this issue Jan 22, 2024 · 8 comments
Labels
area: backend Related to backend functionality or under the /backend directory bug Something isn't working type: dependencies

Comments

@shubham-dayma

What happened?

I am trying to run the ollama/dolphin-phi model on Ollama, but /chat/{chat_id}/question throws the error {"error":"model 'llama2' not found, try pulling it first"}. I don't want to pull llama2 into my Ollama instance because of the limited resources on my system.
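For anyone debugging the same symptom, a quick way to confirm which models the Ollama server actually has pulled is its /api/tags endpoint. The sketch below is not part of Quivr; it assumes the default Ollama port 11434 and uses the requests library.

import requests

OLLAMA_BASE_URL = "http://localhost:11434"  # assumption: default Ollama endpoint

# List the models the Ollama server has pulled; 'llama2' is absent here,
# which is why the embedding call in the traceback below returns a 404.
resp = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=10)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])  # e.g. "dolphin-phi:latest"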

Relevant log output

INFO:     192.168.29.176:50344 - "POST /chat/9ba4c7d0-80a0-4798-a2b0-5749748775c7/question?brain_id=f92b9784-847c-48df-a016-fb0933fc2323 HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 428, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 91, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/cors.py", line 146, in simple_response
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 163, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/modules/chat/controller/chat_routes.py", line 171, in create_question_handler
    chat_answer = gpt_answer_generator.generate_answer(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/llm/knowledge_brain_qa.py", line 122, in generate_answer
    model_response = qa(
                     ^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/base.py", line 312, in __call__
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/base.py", line 306, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/conversational_retrieval/base.py", line 151, in _call
    docs = self._get_docs(new_question, inputs, run_manager=_run_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chains/conversational_retrieval/base.py", line 308, in _get_docs
    docs = self.retriever.get_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 211, in get_relevant_documents
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 204, in get_relevant_documents
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 656, in _get_relevant_documents
    docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/vectorstore/supabase.py", line 32, in similarity_search
    vectors = self._embedding.embed_documents([query])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/ollama.py", line 190, in embed_documents
    embeddings = self._embed(instruction_pairs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/ollama.py", line 175, in _embed
    embeddings = self._process_emb_response(prompt)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/ollama.py", line 160, in _process_emb_response
    raise ValueError(
ValueError: Error raised by inference API HTTP code: 404, {"error":"model 'llama2' not found, try pulling it first"}


@shubham-dayma shubham-dayma added the bug Something isn't working label Jan 22, 2024

dosubot bot commented Jan 22, 2024

🤖

Hello @shubham-dayma! 👋 I'm Dosu, a friendly bot here to help while a human maintainer gets to your issue. I'm great at solving bugs, answering questions, and helping folks like you become contributors.

Consider me a friendly helper while we wait for a human touch. I'm currently diving into the issue you've posted and will have a full response for you shortly. Hang tight! 🚀

@dosubot dosubot bot added area: backend Related to backend functionality or under the /backend directory type: dependencies labels Jan 22, 2024
@StanGirard
Collaborator

ollama pull dolphin-phi my friend ;)

@shubham-dayma
Author

@StanGirard Thanks for looking into this. I already had dolphin-phi pulled in Ollama. I checked the Ollama logs, and it turned out that when initializing langchain.embeddings.ollama.OllamaEmbeddings we were not passing the model param, so it defaults to llama2. I changed this code as below and it worked.

@property
def embeddings(self):
    if self.brain_settings.ollama_api_base_url:
        return OllamaEmbeddings(
            base_url=self.brain_settings.ollama_api_base_url,
            model=self.model.replace("ollama/", ""),
        )  # pyright: ignore reportPrivateUsage=none
    else:
        return OpenAIEmbeddings()

Now I am wondering how I should get the brain's model in backend/models/settings.py and update the initialization of OllamaEmbeddings there.
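As a side note for anyone reproducing the root cause outside Quivr, a minimal sketch like the one below (assuming the langchain version from the traceback, where OllamaEmbeddings defaults to model="llama2") makes the failure mode and the fix easy to see.

from langchain.embeddings.ollama import OllamaEmbeddings

# Without an explicit model, langchain falls back to "llama2", which is the
# model name the 404 from the Ollama server complains about.
default_embeddings = OllamaEmbeddings(base_url="http://localhost:11434")
print(default_embeddings.model)  # -> "llama2"

# Passing the brain's model (with the "ollama/" prefix stripped) avoids the 404,
# as long as that model has been pulled on the Ollama server.
fixed_embeddings = OllamaEmbeddings(
    base_url="http://localhost:11434",
    model="ollama/dolphin-phi".replace("ollama/", ""),  # -> "dolphin-phi"
)
print(fixed_embeddings.model)  # -> "dolphin-phi"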


dosubot bot commented Jan 24, 2024

🤖

Based on the context provided, it seems like you've already found a solution to the issue by modifying the OllamaEmbeddings initialization in quivr_rag.py. To apply a similar fix in backend/models/settings.py, you would need to have access to the model name in the get_embeddings function.

One way to achieve this could be to modify the BrainSettings class to include the model name, and then use this when initializing OllamaEmbeddings. Here's an example of how you could do this:

from langchain.embeddings import OllamaEmbeddings, OpenAIEmbeddings
from pydantic import BaseSettings  # v1-style BaseSettings, matching the Config class below


class BrainSettings(BaseSettings):
    ...
    model: str = "llama2"  # Add this line

    class Config:
        env_file = ".env"


def get_embeddings():
    settings = BrainSettings()
    if settings.ollama_api_base_url:
        embeddings = OllamaEmbeddings(
            base_url=settings.ollama_api_base_url,
            model=settings.model  # Use the model from settings here
        )
    else:
        embeddings = OpenAIEmbeddings()
    return embeddings
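If you go this route, a usage sketch might look like the following. The MODEL and OLLAMA_API_BASE_URL environment variable names are assumptions here (pydantic's BaseSettings maps the fields above to those names), and it presumes an Ollama server running locally with dolphin-phi already pulled.

import os

# Hypothetical .env / environment values; adjust to your own setup.
os.environ["OLLAMA_API_BASE_URL"] = "http://host.docker.internal:11434"
os.environ["MODEL"] = "dolphin-phi"

embeddings = get_embeddings()
vector = embeddings.embed_documents(["hello quivr"])[0]
print(len(vector))  # dimensionality of the dolphin-phi embedding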

This solution assumes that the OllamaEmbeddings class accepts a model parameter during initialization. If it doesn't, you might need to modify the OllamaEmbeddings class to accept and use this parameter.

Please note that this is a potential solution based on the information provided and the context of the Quivr repository. It might not be the best or the only solution. If you have any further questions or need more help, feel free to ask.

For more information, you can refer to the Quivr repository and the FastAPI documentation on settings management.



This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@weihang-wills

> [quotes @shubham-dayma's comment above about passing the brain's model to OllamaEmbeddings]

embeddings = OllamaEmbeddings(model="XXX")

Set it to your own model, e.g. "qwen:7b" or "llama2", and that solves it.
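For readers who want the prefix handling and the explicit model in one place, here is a small hedged sketch; the helper name is ours, not Quivr's, and it assumes the same langchain OllamaEmbeddings class used above.

from langchain.embeddings.ollama import OllamaEmbeddings

def ollama_embeddings_for(model_id: str, base_url: str = "http://localhost:11434") -> OllamaEmbeddings:
    """Build embeddings for a Quivr-style model id such as "ollama/dolphin-phi"."""
    # Strip the "ollama/" prefix so the Ollama server sees the bare model name.
    model_name = model_id.replace("ollama/", "", 1)
    return OllamaEmbeddings(base_url=base_url, model=model_name)

embeddings = ollama_embeddings_for("ollama/dolphin-phi")  # or "qwen:7b", "llama2", ...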

@StanGirard
Collaborator

StanGirard commented Mar 18, 2024 via email

@wwjCMP

wwjCMP commented May 13, 2024

Exactly!


https://github.com/StanGirard/quivr/blob/31ce6c05660db8088da96f7bd6f5a0f5f60ffaf7/backend/llm/rags/quivr_rag.py#L67

How can I set up a local embedding model if the latest version does not have this file?

@mkhludnev

I came up with a fix: #2690.
