File search only works on first thread message #95

Open
herrbenesch opened this issue May 2, 2024 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@herrbenesch
Contributor

herrbenesch commented May 2, 2024

When a file is uploaded to the assistant's vector storage, it can only be accessed in newly created threads (new thread IDs), not in threads that already existed.
This is discussed here: https://community.openai.com/t/thread-unable-to-access-newer-assistant-files/689247/2

With Sam (as with assistants in general) we want to reuse threads, because each thread represents the Slack conversation with an assistant.

@herrbenesch herrbenesch added the bug Something isn't working label May 2, 2024
@herrbenesch
Contributor Author

The actual problem: if the first message in a thread was about something other than files, none of the later messages in the same thread will use file_search.
Compare: https://community.openai.com/t/only-searches-files-if-first-message-uses-file-search/730992

@herrbenesch herrbenesch changed the title Files newly added to the assistants vector storage are not considered in existing threads File search only works on first thread message May 2, 2024
@herrbenesch
Contributor Author

We can't do much here but wait for OpenAI to work on a bugfix.

@herrbenesch herrbenesch self-assigned this May 3, 2024
@herrbenesch
Contributor Author

I created a ticket through their chat bot, hopefully successfully.

@herrbenesch
Contributor Author

I tried attaching the assistant's vector storage to the thread upon creation, as described here: https://platform.openai.com/docs/assistants/tools/file-search/attaching-vector-stores
This gave better results. However, we allow users to upload files. Those files are then added to the assistant's own vector storage, so they leak across all threads that use the same assistant.
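
For reference, the thread-creation approach above might look roughly like this with the official `openai` Python SDK. This is only a sketch: `client` is assumed to be an `openai.OpenAI()` instance, the helper names are made up for illustration, and the vector store ID comes from wherever the assistant's store is tracked.

```python
from typing import Any


def thread_tool_resources(vector_store_id: str) -> dict[str, Any]:
    """Build the tool_resources payload that attaches one vector store to a thread."""
    return {"file_search": {"vector_store_ids": [vector_store_id]}}


def create_thread_with_store(client: Any, vector_store_id: str) -> Any:
    """Create a thread with the given vector store attached for file_search.

    Assumption: `client` is an `openai.OpenAI()` instance (hypothetical wiring).
    """
    return client.beta.threads.create(
        tool_resources=thread_tool_resources(vector_store_id)
    )
```

Passing the assistant's own store ID here is exactly what causes the leak described above, since every thread then reads from the same store.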

@herrbenesch
Contributor Author

herrbenesch commented May 13, 2024

I followed up in the chat and hope there will be a response.
I initially received a reply asking for more information. I provided it, but haven't heard back from them since.

@herrbenesch
Contributor Author

Response from the OpenAI service agent:

Thank you for providing additional details about the issue you're encountering with the Assistants API, specifically regarding file search functionality within threads. Based on the information you've shared, it seems like the core of the issue is related to how file search is triggered within a thread after initial interactions that do not involve file search. Given the setup you've described, here are a few suggestions and clarifications that might help:
Explicitly Invoking File Search: While it's understood that you cannot predict whether a user's request will require file search, ensuring that the file search tool is enabled for every thread, as you've tried, is a good approach. According to the Assistants API documentation, once the file_search tool is enabled, the model decides when to retrieve content based on user messages. However, the behavior you're experiencing suggests there might be limitations or specific conditions under which the file search is triggered.
Thread-Specific Vector Stores: To address the issue of file leakage across threads when users upload files, consider utilizing thread-specific vector stores. When a user uploads a file, you can create a new vector store for that thread or add the file to an existing thread-specific vector store. This way, files uploaded by users will only be accessible within the context of their respective threads. The documentation on File Search provides guidance on creating vector stores and attaching them to threads.
Reviewing File Search Tool Configuration: Ensure that the file search tool's configuration within your assistant and threads is set up to support your use case. As per the documentation, the file search tool uses specific settings by default, such as chunk size and embedding model. Reviewing these settings might provide insights into how file search is being triggered.
Investigating Message Structure: The structure of the messages sent by users and how they're processed by your assistant could influence whether file search is triggered. Reviewing the structure of successful versus unsuccessful file search attempts might reveal patterns or requirements for triggering file search.
Given the complexity of the issue and the potential need for a more tailored solution, I recommend reaching out to OpenAI support with the details you've provided here, including the assistant and vector store IDs, the model you're using, and specific examples of the issue. While I understand this might not provide an immediate solution, sharing these details with the support team could facilitate a more in-depth investigation into why file search isn't being triggered as expected in certain scenarios.
Lastly, I want to assure you that your query is being read and taken seriously. We aim to provide the best support possible, and your feedback is invaluable in improving our services.
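
One way to follow the thread-specific suggestion for user uploads could be message-level attachments. As far as I understand the File Search docs, a file attached to a message with the `file_search` tool goes into a vector store that belongs to that thread rather than to the assistant. A rough sketch, assuming `client` is an `openai.OpenAI()` instance and the file was already uploaded (the helper names are hypothetical):

```python
from typing import Any


def file_search_attachment(file_id: str) -> dict[str, Any]:
    """Describe one uploaded file as a message attachment for file_search."""
    return {"file_id": file_id, "tools": [{"type": "file_search"}]}


def post_message_with_file(client: Any, thread_id: str, text: str, file_id: str) -> Any:
    """Add a user message whose attached file should stay scoped to this thread.

    Assumption: `file_id` comes from a prior upload with purpose "assistants".
    """
    return client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=text,
        attachments=[file_search_attachment(file_id)],
    )
```

Whether this also fixes the "first message" behavior is untested; it only addresses the leak across threads.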

@herrbenesch
Contributor Author

I'll need to dig deeper into the vector storage topic again.
Perhaps we can create one vector storage per thread that inherits all files from the assistant's vector storage.
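
A first, untested sketch of that idea: copy the file IDs from the assistant's store into a new per-thread store, then attach it to the existing thread. Assumptions: `client` is an `openai.OpenAI()` instance, vector stores live under `client.beta.vector_stores` (true for the SDK at the time of writing, possibly different in newer versions), and the helper names are hypothetical. Large stores may need batching, which this does not handle.

```python
from typing import Any


def create_thread_store_from(client: Any, source_store_id: str, name: str) -> Any:
    """Create a new vector store containing the files of an existing one."""
    # Assumption: each listed vector store file exposes the underlying file ID as `.id`,
    # and iterating the list result walks through all pages.
    listed = client.beta.vector_stores.files.list(vector_store_id=source_store_id)
    file_ids = [f.id for f in listed]
    return client.beta.vector_stores.create(name=name, file_ids=file_ids)


def attach_store_to_thread(client: Any, thread_id: str, vector_store_id: str) -> Any:
    """Point an existing thread's file_search tool at the given vector store."""
    return client.beta.threads.update(
        thread_id,
        tool_resources={"file_search": {"vector_store_ids": [vector_store_id]}},
    )
```

Files added to the assistant's store later would not show up in the copies automatically, so reused Slack threads would need some form of re-sync.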
