
0.1.126

@pchalasani pchalasani released this 18 Nov 19:20
· 501 commits to main since this release

OpenAIAssistant: Support caching via spoofed assistant messages inserted with the user role. Details below.

πŸ”₯ πŸ€– 0.1.126 OpenAI Assistants API: Caching Support

For those wondering about the limitation in the Assistants API that only messages with the user role may be added to a thread: this was a blocker to caching assistant responses, which is critical for cost savings when you repeatedly run the same conversation, e.g. in a test.

Then I realized -- if the role parameter must always be user, we can instead simply embed our own role label in the content field, and it works just fine! With this ability, we can now cache assistant responses in Langroid; see https://github.com/langroid/langroid/blob/main/langroid/agent/openai_assistant.py#L447
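The trick can be sketched as a small helper (a minimal illustration with a hypothetical function name, not the actual Langroid code): any non-user role is folded into the content as an uppercase prefix, while the API-visible role stays `user`.

```python
def spoof_message(role: str, content: str) -> dict:
    """Build a thread message dict; the Assistants API only accepts
    role="user", so any other role is encoded as a content prefix."""
    if role == "user":
        return {"role": "user", "content": content}
    # e.g. an assistant response becomes role=user, content="ASSISTANT: ..."
    return {"role": "user", "content": f"{role.upper()}: {content}"}

msg = spoof_message("assistant", "The capital of France is Paris.")
# msg["role"] is "user"; msg["content"] starts with "ASSISTANT: "
```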

I keep an updated hash H of the conversation in the thread metadata, and cache the assistant response R at each stage, so that C[H] = R. Before starting an assistant run, I check whether there is a cached response R = C[H]; if there is, I append R (the cached assistant response) onto the thread as a message with user role (since that is the only role allowed), prefixing the content with "ASSISTANT:...". The assistant handles this spoofing in the expected way, i.e. it treats ASSISTANT-labeled messages as if it had generated them itself. E.g. see this test (https://github.com/langroid/langroid/blob/main/tests/main/test_openai_assistant.py#L58)
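The cache-lookup flow described above can be sketched as follows (a simplified, self-contained illustration: the thread is a plain list of strings, `llm` is a stand-in for the actual assistant run, and the SHA-256 choice is an assumption, not necessarily what Langroid uses):

```python
import hashlib

def conv_hash(messages: list[str]) -> str:
    """Hash H of the conversation so far (keyed on message contents)."""
    h = hashlib.sha256()
    for m in messages:
        h.update(m.encode())
    return h.hexdigest()

cache: dict[str, str] = {}  # C: maps conversation hash H -> cached response R

def run_assistant(thread: list[str], llm) -> str:
    """Before a run, look up R = C[H]; on a hit, append the cached response
    to the thread as an ASSISTANT-prefixed message instead of calling the LLM."""
    H = conv_hash(thread)
    if H in cache:
        R = cache[H]
    else:
        R = llm(thread)  # the real (expensive) assistant run
        cache[H] = R
    # only user-role messages are allowed, so spoof the role via a prefix
    thread.append(f"ASSISTANT: {R}")
    return R
```

A repeated run over the same conversation prefix then hits the cache and never re-invokes the model, which is exactly what makes repeated test runs cheap.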