
feat(streaming): add logger to include status-based messages in chat history file #537

Merged
merged 2 commits into main on Sep 8, 2023

Conversation

Contributor

@douglas-reid douglas-reid commented Aug 29, 2023

This PR provides an approach for surfacing "status"-like events inline, in the same "stream" as other File-based (or, in the case of agents, History-based) events.

This PR uses a StreamHandler to capture existing logs and redirect them into the ChatHistory as blocks tagged with a role of Agent, Tool, or LLM. Those messages are then explicitly ignored when assembling conversation history for LLM interactions (in an attempt to prevent prompt contamination).

The StreamHandler is added/removed on a request-by-request basis by adding context manager capabilities to AgentContext.
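The per-request add/remove lifecycle can be sketched as a context manager. This is a minimal illustration only — the names (`ChatHistoryHandler`, `request_context`) are hypothetical and the real AgentContext implementation differs:

```python
import logging
from contextlib import contextmanager


class ChatHistoryHandler(logging.StreamHandler):
    """Hypothetical handler: appends each log record to a chat-history list."""

    def __init__(self, history):
        super().__init__()
        self.history = history

    def emit(self, record):
        # Capture the record instead of writing it to a stream.
        self.history.append((record.levelname, record.getMessage()))


@contextmanager
def request_context(history):
    """Attach the handler for the duration of one request, then detach it."""
    handler = ChatHistoryHandler(history)
    root = logging.getLogger()
    root.addHandler(handler)
    try:
        yield history
    finally:
        # Guarantees the handler is removed even if the request errors out.
        root.removeHandler(handler)
```

Using `with request_context(history):` around a single agent run means any `logging` call made inside the block lands in `history`, and logs emitted after the block are unaffected.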

Here is a BEFORE snapshot of an agent history (without status-like messages in stream):

----- Agent Chat History 6CEBF384-381D-40DE-9746-9A47D7F52138 -----

[user] draw a cat with a funny hat
[assistant] Here is the image you requested: 
[assistant] FD36377D-8C86-4CEC-A259-40FEAD220847 (image/png)
[user] who is the president of Taiwan?
[assistant] The current president of Taiwan is Tsai Ing-wen.
[user] draw them at the podium
[assistant] Here is the image you requested: 
[assistant] 7BEE023D-112F-4FBE-AF43-A6440A0A0FCB (image/png)

Here is the AFTER:

----- Agent Chat History 51A59B73-A398-4D53-8CEC-FE9280D5C043 -----

[user] draw a cat with a funny hat
[agent] Selecting next action...
[llm] OpenAI ChatComplete (draw a cat with a funny hat)
[agent] Selected next action: DalleTool
[agent] Running Tool DalleTool (a cat with a funny hat)
[agent] Tool DalleTool: (Block(8691E180-8216-46DE-B1B4-997429236BDA))
[agent] Selecting next action...
[llm] OpenAI ChatComplete (Block(8691E180-8216-46DE-B1B4-997429236BDA))
[agent] Selected next action: Agent-FinishAction
[agent] Next Tool: Agent-FinishAction
[assistant] Here is the image you requested: 
[assistant] 4D557AD0-FA0E-40F9-BB90-6EF95B898CB8 (image/png)
[user] who is the president of Taiwan?
[agent] Selecting next action...
[llm] OpenAI ChatComplete (who is the president of Taiwan?)
[agent] Selected next action: SearchTool
[agent] Running Tool SearchTool (current president of Taiwan)
[tool] Executing search: current president of Taiwan
[agent] Tool SearchTool: (Tsai Ing-wen)
[agent] Selecting next action...
[llm] OpenAI ChatComplete (Tsai Ing-wen)
[agent] Selected next action: Agent-FinishAction
[agent] Next Tool: Agent-FinishAction
[assistant] The current president of Taiwan is Tsai Ing-wen.
[user] draw them at the podium
[agent] Selecting next action...
[llm] OpenAI ChatComplete (draw them at the podium)
[agent] Selected next action: DalleTool
[agent] Running Tool DalleTool (Tsai Ing-wen, the president of Taiwan, standing at the podium)
[agent] Tool DalleTool: (Block(E0BA53EA-09E8-41F0-8248-380799BD95BD))
[agent] Selecting next action...
[llm] OpenAI ChatComplete (Block(E0BA53EA-09E8-41F0-8248-380799BD95BD))
[agent] Selected next action: Agent-FinishAction
[agent] Next Tool: Agent-FinishAction
[assistant] Here is the image you requested: 
[assistant] EC6B831C-754A-4DEC-9E63-03FD7BC9A4F9 (image/png)

@douglas-reid
Contributor Author

There are some considerations here to add:

  • this could dramatically increase the number of blocks in the history file (and require more time to scan/skip)
  • this duplicates information already present in the logs stream.

@eob eob requested a review from maxwfreu August 30, 2023 17:23
Contributor

eob commented Aug 30, 2023

Comments just based on the description (which is really helpful, thanks) before diving into code:

Persona: Front End

  • This checks the box of what would produce a Chat-GPT style interface. The main thing here is that the front-end agent is provided with information to meaningfully keep the user engaged while long-latency operations prevent streaming the actual response ("Running tool..", "Ran tool...")

  • I don't think, from the front-end perspective, it's 100% necessary to know the input/output of intermediate actions. It would probably be fine just to know that they were run and what their eventual status was. This isn't a change request, just an observation about what could be trimmed if the size of ChatHistory is a concern.

Persona: Performance Engineer

  • Very much read you on the concern for ChatHistory bloat. Thinking out loud:
    • If we trimmed down the ChatHistory metadata blocks to just which actions ran and their status, and eliminated the FinishAction reporting as sort of a given, that might be a meaningful diet to put the reporting on.
    • If we had an AgentContextHistory file that sat alongside ChatHistory and contained a fuller picture of the interaction, that could be a way to keep a parallel set of books that was more detailed but not consumed by default. ChatHistory & ContextHistory could be kept 1:1

Persona: Agent SDK Design

  • The logging StreamHandler trick feels really clever, but if we went that route, we'd probably want to make a point of being incredibly explicit (as you allude to above) with rigid expectations on logging style. Maybe helper functions that pass down to logging operations?
  • The logging duplication problem does seem challenging, and the AgentContextHistory idea above exacerbates it. Is AgentContextHistory really just a stream over ElasticSearch with a filter applied?

Will keep thinking... let me know what's useful/not useful feedback wise, but this is really awesome to read through so far 🚀🚀🚀🚀🚢🚢🚢🚢🚢

Contributor

dkolas commented Aug 30, 2023

I am a big fan of this approach generally. I think having all of the inner 'reasoning' steps in the history of the chat makes total sense, and then having them appropriately filtered back out based on tags etc should work really well.

I am not particularly concerned about chat history bloat, personally. I think that can be handled via nice filter options.
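A minimal sketch of what such tag-based filtering might look like, assuming blocks are plain dicts with a `role` key (the real Block/Tag model is richer):

```python
# Roles that mark internal status messages rather than conversation turns.
STATUS_ROLES = {"agent", "llm", "tool"}


def conversation_blocks(blocks):
    """Drop status-role blocks, keeping only user/assistant turns."""
    return [b for b in blocks if b.get("role") not in STATUS_ROLES]
```

The same predicate could be inverted to extract only status messages for a debugging view.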

…at history file

This PR adds context manager capabilities to AgentContext, such that a logs
stream of agent, llm, and tool messages will be redirected into a ChatHistory (and
appropriately tagged). This will allow our single-file streaming approach to include
status messages in the output.
@douglas-reid douglas-reid force-pushed the doug/playing-with-interwoven-histories branch from fd56108 to 26092c5 on September 6, 2023 22:55
@douglas-reid douglas-reid changed the title wip (streaming): sketch for status-based messages in chat history file feat(streaming): add logger to include status-based messages in chat history file Sep 6, 2023
@douglas-reid douglas-reid marked this pull request as ready for review September 6, 2023 23:01
dkolas previously approved these changes Sep 7, 2023

@dkolas dkolas left a comment


This looks good to me! The approach makes sense, and nothing jumped out at me in the code.

eob previously approved these changes Sep 7, 2023

@eob eob left a comment


Sprinkled comments but mostly just talking out loud to myself while reading. This looks great!!

LGTM

"stack_trace": "%(exc_text)s",
"message_type": "%(message_type)s",
"component_name": "%(component_name)s",
AgentLogging.IS_MESSAGE: f"%({AgentLogging.IS_MESSAGE})s", # b doesn't work. Unsure how to make a bool
Contributor


Naive response is the way you probably were avoiding because it felt silly: 'true' if AgentLogging.IS_MESSAGE else 'false'
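One workaround that sidesteps %-style formatting's lack of a boolean conversion is to normalize the extra to the strings `'true'`/`'false'` before formatting, e.g. via a `logging.Filter`. A sketch, with `is_message` standing in for `AgentLogging.IS_MESSAGE`:

```python
import logging


class BoolAsJson(logging.Filter):
    """Rewrite a boolean extra as 'true'/'false' so that %-style
    formatting emits valid JSON boolean literals."""

    def filter(self, record):
        if isinstance(getattr(record, "is_message", None), bool):
            record.is_message = "true" if record.is_message else "false"
        return True  # never suppress the record
```

With this filter attached, a format string like `'{"is_message": %(is_message)s}'` renders `true` rather than Python's `True`.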

}


class StreamingOpts(BaseModel):
Contributor


I like how you've started with the idea that what gets added to the stream can be selected.

One thing that would help me (and therefore, I think, others?) is to actively add comments here about what distinguishes an agent, llm, and tool message.

For me: I find myself wanting to be reassured about my guess about agent versus llm specifically.

For ex:

  • Is agent_message something the agent says that gets added to the chat history?
  • Is llm_message something that the LLM emits to a tool or agent, but doesn't tend to get added to the chat history?

Contributor Author


Added some docs. One thing that I wanted to make more distinct in this PR is the boundary between Agent and LLM. There is nothing in Agent that requires LLM integration or backing. You could have a purely random tool-selection agent, for instance. So, I wanted to allow LLMs to be responsible for LLM-scoped messages, and Agents to focus on the level above in logs. This might come at the cost of a bit more verbosity, however.
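To make the distinction concrete, per-type toggles might look roughly like the following (field names are guesses, and a plain dataclass stands in for pydantic's `BaseModel`):

```python
from dataclasses import dataclass


@dataclass
class StreamingOpts:
    # Agent-level status, e.g. "Selecting next action...", "Running Tool X"
    include_agent_messages: bool = True
    # LLM-level status, e.g. "OpenAI ChatComplete (<prompt>)"
    include_llm_messages: bool = True
    # Tool-level status emitted while a tool runs, e.g. "Executing search: ..."
    include_tool_messages: bool = True

    def allows(self, role: str) -> bool:
        """Whether a status message with this role should be streamed."""
        return {
            "agent": self.include_agent_messages,
            "llm": self.include_llm_messages,
            "tool": self.include_tool_messages,
        }.get(role, True)
```

Non-status roles (user, assistant) fall through the lookup and are always allowed.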

@@ -49,14 +50,13 @@ def messages_to_prompt_history(messages: List[Block]) -> str:
as_strings = []
for block in messages:
role = block.chat_role
# DON'T RETURN AGENT MESSAGES -- THOSE ARE INTERNAL STATUS MESSAGES
Contributor


Oh interesting. It's not what I guessed above. Yeah little doc notes would def be helpful then :)
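Simplified, the skip amounts to something like this, with messages modeled as `(role, text)` tuples rather than real Blocks:

```python
def messages_to_prompt_history(messages):
    """Render (role, text) pairs as prompt lines, skipping internal
    agent status messages so they never reach the LLM prompt."""
    lines = []
    for role, text in messages:
        if role == "agent":
            continue  # internal status message, not conversation
        lines.append(f"{role}: {text}")
    return "\n".join(lines)
```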

return self.append_message_with_role(text, RoleTag.LLM, tags, content, url, mime_type)


class ChatHistoryLoggingHandler(StreamHandler):
Contributor


I think it's really cool that we're hooking on to the logging system in this way.

# don't bother doing anything if level is below logging level
return

message_dict = cast(dict, self.format(record))
Contributor


[not high pri; feel free to ignore]

The code below feels like an opportunity to collapse into one block, since so much is repeated.

Contributor Author


Refactored a bit. The slight diffs between the various modes make a full collapse uglier than a small bit of near-repetition (imho), but the refactor hopefully makes things clearer/more readable.

src/steamship/agents/service/agent_service.py (outdated comment; resolved)
url=output_block.raw_data_url or output_block.url or output_block.content_url,
mime_type=output_block.mime_type,
)
with self.build_default_context(context_id, **kwargs) as context:
Contributor


+1. nice

@douglas-reid douglas-reid dismissed stale reviews from eob and dkolas via 1b119f8 September 7, 2023 18:37
@douglas-reid douglas-reid force-pushed the doug/playing-with-interwoven-histories branch from 1b119f8 to 9ad363d on September 8, 2023 01:07
@douglas-reid douglas-reid force-pushed the doug/playing-with-interwoven-histories branch from 9ad363d to 8e58a45 on September 8, 2023 20:41
@douglas-reid douglas-reid added this pull request to the merge queue Sep 8, 2023
Merged via the queue into main with commit ba42bb2 Sep 8, 2023
4 checks passed
@douglas-reid douglas-reid deleted the doug/playing-with-interwoven-histories branch September 28, 2023 02:18