[Bug] No debug context available when setting CE_DEBUG_INFO=true #297

Closed
coreation opened this issue Feb 15, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@coreation (Contributor)

Is this a new bug?

  • I believe this is a new bug
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Hello,

I want to use the context and knowledge base results of a chat_engine.chat execution. To do this, I've put the following in my .env file, which I load using load_dotenv():

CE_DEBUG_INFO="true"

To confirm that this is read:

import os
from dotenv import load_dotenv
load_dotenv()  # must run before os.getenv reads the variable
print(os.getenv("CE_DEBUG_INFO"))  # true
CE_DEBUG_INFO = os.getenv("CE_DEBUG_INFO", "FALSE").lower() == "true"
print(CE_DEBUG_INFO)  # True

However, when I try to access the context, I get an empty object, even though the response content is a message built from RAG-retrieved pieces, i.e. the response itself is non-empty.

    import os

    from canopy.tokenizer import Tokenizer
    from canopy.knowledge_base import KnowledgeBase
    from canopy.context_engine import ContextEngine
    from canopy.chat_engine import ChatEngine
    from canopy.llm import OpenAILLM

    Tokenizer.initialize()

    pinecone_index = os.environ['PINECONE_INDEX']
    pinecone_namespace = os.environ['PINECONE_NAMESPACE']

    kb = KnowledgeBase(index_name=pinecone_index)
    kb.connect()
    # results = kb.query([Query(text="What is the outlook of the EV market?")])
    # print(results)

    context_engine = ContextEngine(kb)

    llm = OpenAILLM()
    chat_engine = ChatEngine(context_engine=context_engine, llm=llm)

    # `messages` is the chat history (a list of Messages), built elsewhere
    response = chat_engine.chat(messages=messages, stream=False, namespace=pinecone_namespace)
    print(response.debug_info)  # This is empty

Expected Behavior

I would expect response.debug_info to contain the full context/KB results when the CE_DEBUG_INFO variable is set.

Steps To Reproduce

The notebook used in the "library" part of Canopy covers the basic steps; just add the CE_DEBUG_INFO variable and check for the debug context. I hope that suffices :)

Relevant log output

No response

Environment

- **OS**: OS X
- **Language version**: Python 3.9.2
- **Canopy version**: 0.7.0

Additional Context

No response

coreation added the bug ("Something isn't working") label on Feb 15, 2024
coreation changed the title from "[Bug] No CE_DEBUG_INFO available" to "[Bug] No debug context available when setting CE_DEBUG_INFO=true" on Feb 15, 2024
@igiloh-pinecone (Collaborator)

@coreation I see you got unblocked on your own.
Is there missing documentation that could have made this clearer?

@coreation (Author)

@igiloh-pinecone no, unfortunately I'm not :) I'm running the code described in the ticket, but no context comes back, even though the chat response contains a properly formed answer. I'm now running copies of the code so that I can debug the entire RAG flow and see where it goes wrong.

@izellevy (Collaborator) commented Feb 15, 2024

Hi @coreation, could you try setting CANOPY_DEBUG_INFO=true? We added more debug info to specific classes and renamed CE_DEBUG_INFO (CE meaning ContextEngine) to CANOPY_DEBUG_INFO to better reflect that it is a project-wide config.
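
For example, a minimal sketch of the check (assuming the flag is read at module import time, as module-level config usually is, so it must be set before any canopy import):

    import os

    # Assumption: Canopy reads the flag when its modules are first imported,
    # so set it (or run load_dotenv()) before the canopy imports below.
    os.environ["CANOPY_DEBUG_INFO"] = "true"

    from canopy.chat_engine import ChatEngine  # imported after the flag is set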

@coreation (Author) commented Feb 15, 2024

Hey @izellevy, thanks, I'll give that a try, but it seems I have trouble just getting proper retrieval going. I'm running both the Canopy REST API and the code mentioned in the ticket to compare things side by side. The environment variables are the same, but the custom code, based on what the library documentation describes, isn't able to generate a grounded answer to the same question.

Using the REST API

Q: Is ChatGPT commandeering the mundane tasks that young employees have relied on to advance their careers?
A: Yes, ChatGPT is commandeering the mundane tasks that young employees have relied on to advance their careers. The generative-AI boom has led many companies to automate tasks such as spreadsheet building and generic copywriting in the name of becoming more efficient. These tasks are typically handled by entry-level workers, who were given them as a way to "earn their stripes" and develop in the workplace. However, with the rise of generative AI technology like ChatGPT, organizations are starting to automate these junior tasks, undermining the traditional path of advancement for young employees. This has raised concerns among members of Gen Z, with surveys indicating that 76% of them are worried about losing their jobs to ChatGPT.

Using the code mentioned in the ticket, based on the library.md file

Q: Is ChatGPT commandeering the mundane tasks that young employees have relied on to advance their careers?
A: There is no information in the provided context that directly addresses the impact of ChatGPT on young employees and their reliance on mundane tasks for career advancement. Therefore, I don't have enough information to answer your question.

I'm trying to wrap my head around what I'm doing wrong here...
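
To isolate it, I can query the KB directly and bypass the chat engine, then compare the retrieved chunks with what the REST API returns. A rough sketch (the Query fields used here, text and namespace, and the result attributes are my assumption from Canopy's data models; drop namespace if your version doesn't support it):

    from canopy.models.data_models import Query

    # Query the knowledge base directly to see which chunks come back
    # for the same question the REST API answers correctly.
    results = kb.query([Query(
        text="Is ChatGPT commandeering the mundane tasks that young "
             "employees have relied on to advance their careers?",
        namespace=pinecone_namespace,  # assumption: Query accepts a namespace field
    )])
    for result in results:
        for doc in result.documents:
            print(doc.score, doc.source)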

@coreation (Author) commented Feb 15, 2024

@izellevy @igiloh-pinecone the debug flag works... but the larger issue is that the following code does not deliver any relevant response, whereas the Canopy REST API does, given the exact same configuration.

If I look at the debug info, the documents that the KB retrieves are all... trash... just not relevant, while the REST API endpoint on the same index clearly does return relevant information: its answer contains the sources that are in my index, i.e. not something OpenAI could come up with on its own.

    # (imports and setup as in the first snippet above)
    Tokenizer.initialize()

    pinecone_index = os.environ['PINECONE_INDEX']
    pinecone_namespace = os.environ['PINECONE_NAMESPACE']

    kb = KnowledgeBase(index_name=pinecone_index)
    kb.connect()
    # results = kb.query([Query(text="What is the outlook of the EV market?")])
    # print(results)

    context_engine = ContextEngine(kb)

    llm = OpenAILLM()
    chat_engine = ChatEngine(context_engine=context_engine, llm=llm)

    response = chat_engine.chat(messages=messages, stream=False, namespace=pinecone_namespace)
    print(response.debug_info)

Is there anything I should watch out for here? My goal (not unimportant :)) is to capture all the used sources so that I can fetch more metadata for those sources to use in the UI that our end users see.
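
For now I'm dumping the whole debug payload, since debug_info is a plain dict whose exact keys seem to vary between Canopy versions (so this is just a sketch, not a stable contract):

    import json

    # debug_info is populated only when the debug flag is set; inspect its
    # structure before depending on specific keys.
    print(json.dumps(response.debug_info, indent=2, default=str))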

@coreation (Author)

@izellevy @igiloh-pinecone I'm going to make a dedicated issue out of the last comment, as the original issue has been solved.
