
[Bug]: Issue with EmptyIndex and streaming. #11680

Closed
felipearosr opened this issue Mar 6, 2024 · 5 comments
Labels
bug Something isn't working triage Issue needs to be triaged/prioritized

Comments

@felipearosr

Bug Description

I'm trying to create a simple intent-detection agent. The basic expected functionality is to select between two query engines with a RouterQueryEngine: one query engine built on an EmptyIndex, whose purpose is to have the LLM answer directly any questions that are not related to the data, and another built on a vector index. This is just to decrease cost and improve latency for basic responses that don't require going into the vector DB.

The router is working correctly, and when the vector_query_engine is selected it also works correctly. The only problem arises when it selects the empty_query_engine.

It gives me the following error:
AttributeError: 'Response' object has no attribute 'response_gen'

Streaming is set to true:

from llama_index.core import EmptyIndex

empty_index = EmptyIndex()

simple_query_engine = empty_index.as_query_engine(
    streaming=True,
)

This query engine was also tested with the Pinecone index and worked correctly, so I believe the problem is with EmptyIndex().

If you print the response, it gives me Empty Response.

Version

0.10.16

Steps to Reproduce

1. Create an EmptyIndex().
2. Create a query_engine with this empty index.
3. Stream the response (see the sketch below).
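
A minimal sketch of those steps, assuming llama-index 0.10.x (the query string is arbitrary):

from llama_index.core import EmptyIndex

empty_index = EmptyIndex()
query_engine = empty_index.as_query_engine(streaming=True)

# Expected: a StreamingResponse exposing response_gen.
# Actual: a plain Response ("Empty Response"), so this raises
# AttributeError: 'Response' object has no attribute 'response_gen'
response = query_engine.query("Hello!")
for token in response.response_gen:
    print(token, end="")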

Relevant Logs/Tracebacks

No response

@felipearosr felipearosr added bug Something isn't working triage Issue needs to be triaged/prioritized labels Mar 6, 2024

dosubot bot commented Mar 6, 2024

Hey @felipearosr! 👋 I'm Dosu, a friendly bot here to assist with bugs, answer questions, and guide you towards becoming an awesome contributor while we wait for a human maintainer. Looking forward to solving puzzles together! 🤖

Based on the error message you're encountering, it seems like you're trying to access an attribute (response_gen) that exists only in the StreamingResponse class, but you're attempting to access it on an instance of the Response class. The Response class does not have an attribute named response_gen; this attribute is defined only in the StreamingResponse class, which is designed to be used when streaming=True.

When you select the empty_query_engine with streaming set to true, you should be working with an instance of StreamingResponse rather than Response. The error suggests that somewhere in your code, you are either mistakenly working with a Response object when you should be working with a StreamingResponse object, or there is a logic path that incorrectly leads to treating a StreamingResponse as a Response.

To resolve this error, ensure that your code correctly handles instances of StreamingResponse when streaming is set to true, and does not attempt to access StreamingResponse-specific attributes (like response_gen) on Response objects. You can refer to the schema.py file for more details.
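
A defensive handler along those lines might look like this (a minimal sketch; the import path matches the one used later in this thread):

from llama_index.core.base.response.schema import Response, StreamingResponse

response = query_engine.query("Hello!")
if isinstance(response, StreamingResponse):
    # Streaming path: consume the token generator
    for token in response.response_gen:
        print(token, end="")
else:
    # Non-streaming path (what EmptyIndex actually returns here)
    print(str(response))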

Regarding your question about the EmptyIndex class: it does support streaming, but only in a specific mode. The as_query_engine method of EmptyIndex sets response_mode to "generation" by default if it is not specified in the kwargs, and raises a ValueError if response_mode is set to anything other than "generation". In other words, EmptyIndex is designed for pure generation mode, where the LLM produces a text response without any retrieved context. You can refer to the base.py file for more details.

I hope this helps! If you have any other questions or need further clarification, feel free to ask.



@logan-markewich
Collaborator

@felipearosr the empty index is a tad deprecated, I guess?

It returns zero nodes (since it's empty), and when zero nodes are returned, an Empty Response is returned by the synthesizer.

Even if this check wasn't there, the prompts for the synthesizer don't make sense without context from nodes.

I would probably just make a custom query engine that wraps an LLM call, to have more control over what's going on:

from llama_index.core.query_engine import CustomQueryEngine
from llama_index.llms.openai import OpenAI


prompt = "Given the user query, respond as best as possible.\nQuery: {query}"

class MyQueryEngine(CustomQueryEngine):
    """My Query Engine."""

    llm: OpenAI
    prompt: str

    def custom_query(self, query_str: str):
        llm_prompt = self.prompt.format(query=query_str)
        # Use the engine's llm field to run the completion
        llm_response = self.llm.complete(llm_prompt)
        return str(llm_response)

query_engine = MyQueryEngine(llm=OpenAI(), prompt=prompt)
response = query_engine.query("Hello!")
print(str(response))

https://docs.llamaindex.ai/en/stable/examples/query_engine/custom_query_engine.html#defining-a-custom-query-engine

@felipearosr
Author

felipearosr commented Mar 6, 2024

Hi, thank you for your reply. How can I stream the response of this? Also, I'm getting the following error with the following implementation of the code:

class LlmQueryEngine(CustomQueryEngine):
    """Custom query engine for direct calls to the LLM model."""

    llm: OpenAI
    prompt: str

    def custom_query(self, query_str: str):
        llm_prompt = self.prompt.format(query=query_str)
        llm_response = self.llm.complete(llm_prompt, formatted=True)
        # Returns the raw CompletionResponse instead of a str or Response,
        # which is what trips up the RouterQueryEngine below
        return llm_response

I get the following error from the RouterQueryEngine; any help would be greatly appreciated.

AttributeError: 'CompletionResponse' object has no attribute 'metadata'

Extra code:

from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool

simple_query_engine = LlmQueryEngine(llm=OpenAI(), prompt=prompt)

list_tool = QueryEngineTool.from_defaults(
    query_engine=simple_query_engine,
    description=("..."),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=("..."),
)

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        list_tool,
        vector_tool,
    ],
)
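
For context, a hypothetical call that triggers the error above (query string made up):

# When the router selects simple_query_engine, the raw CompletionResponse
# returned by custom_query has no .metadata, so the query call raises:
# AttributeError: 'CompletionResponse' object has no attribute 'metadata'
response = query_engine.query("Hello!")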

@logan-markewich
Collaborator

@felipearosr For non-streaming, you should cast the response to a string. Otherwise, you can use the code below to stream:

from llama_index.core.base.response.schema import StreamingResponse

class LlmQueryEngine(CustomQueryEngine):
    """Custom query engine for direct calls to the LLM model."""

    llm: OpenAI
    prompt: str

    def custom_query(self, query_str: str):
        llm_prompt = self.prompt.format(query=query_str)
        llm_response = self.llm.stream_complete(llm_prompt, formatted=True)

        # Adapt the CompletionResponse stream into the token generator
        # StreamingResponse expects; each chunk's .delta is the new text
        def response_gen(llm_response):
            for r in llm_response:
                yield r.delta or ""

        return StreamingResponse(response_gen=response_gen(llm_response))
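
A quick usage sketch for the streaming version (print_response_stream comes from StreamingResponse in schema.py):

query_engine = LlmQueryEngine(llm=OpenAI(), prompt=prompt)
response = query_engine.query("Hello!")
response.print_response_stream()  # or iterate response.response_gen yourself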
