Skip to content

[FEATURE] Expose input messages to BeforeInvocationEvent hook #1006

@athewsey

Description

@athewsey

Problem Statement

To implement standalone input-side guardrails (for e.g. PII, toxic content, prompt attack prevention), we'd like to place a Hook as early as possible in the invocation. In particular, we want to make sure it runs (and has the opportunity to redact the user's input message) before the message gets added to memory by e.g. AgentCoreMemorySessionManager, which hooks on MessageAddedEvent.

However, the current BeforeInvocationEvent hook only receives a reference to the agent and has no visibility of the incoming messages because they haven't been added to agent.messages yet.

Proposed Solution

Extend BeforeInvocationEvent to also include the messages received during _run_loop() start-up

  • The simplest implementation of this would implicitly allow hook authors to edit the messages in-place, giving them the opportunity to redact the message but go ahead with the invocation. 👍
  • A guardrail hook could also choose to just raise error and abort the whole agent invocation, which could save some money in cases where the manner of redaction leaves nothing useful the agent could do (see also [FEATURE] Add ability to bypass LLM invocation and provide custom responses in hooks #758)

In structured_output_async(), this event is also currently raised but before agent.messages and the optional prompt argument have been combined to form the temp_messages that'll ultimately be used in the invocation.

  • AFAICT there's nothing stateful about the construction of temp_messages and its only failure modes seem to be for malformed inputs - so I'd suggest to move the invocation of the BeforeInvocationEvent hooks to straight after temp_messages is set up.
  • Specifically, I'd suggest to pull temp_messages and the if not self.messages and not prompt guard clause forward out of the tracing span - treating them as input validation activities that don't count towards the span duration (negligible anyway) but also wouldn't trigger AfterInvocationEvent in case they fail due to malformed input.

Use Case

As mentioned above, primary use-case here is for input guardrail checks to prevent PII / toxic / prompt-attack content from entering the agent as early as possible in the invocation lifecycle.

Alternatives Solutions

Today I think the next-earliest workaround is for an input guardrail to hook on to MessageAddedEvent instead (since this'll get called as soon as the agent initializes its messages list, before BeforeModelCallEvent)... But this is not ideal because MessageAdded is a typical place for session/memory managers (like AgentCoreMemorySessionManager) to hook - so relies on users to connect their guardrail and memory hooks in the right sequence to avoid leakage of sensitive/malicious input into memory. It should work, but is easy to mis-configure without realizing.

Additional Context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions