Skip to content

Observability Hooks

skobeltsyn edited this page May 23, 2026 · 6 revisions

Observability Hooks

Monitor what your agents do at runtime -- tool calls, knowledge fetches, skill selections, errors, and audit events -- without modifying business logic.


Overview

Agents.KT provides hooks at two levels: agent-level hooks on individual Agent instances, and composition-level hooks on structures like Forum.

For retained audit logs, add the agents-kt-observability module and use the first-party JSONL exporter. It writes one JSON object per event, one line at a time, with request/session/manifest correlation fields and no raw tool arguments or results by default.

Agent-Level Hooks

Hook Fires When Signature
onToolUse After a tool executor completes (name: String, args: Map<String, Any?>, result: Any?) -> Unit
onKnowledgeUsed When the LLM fetches a knowledge entry (name: String, content: String) -> Unit
onSkillChosen After skill selection resolves (name: String) -> Unit
onError Before an infrastructure error propagates (Throwable) -> Unit
Agent.observe Unified sealed event view (PipelineEvent) -> Unit

Composition-Level Hooks

Hook Fires When Signature
Forum.onMentionEmitted After each forum agent produces its output (agentName: String, output: Any?) -> Unit

All hooks are optional. They do not affect execution -- the agent runs identically with or without them. They are purely observational.


JSONL Audit Exporter

Use :agents-kt-observability when you need a durable, grep/jq-friendly audit record instead of a custom in-memory list or ad-hoc logger.

dependencies {
    implementation("ai.deep-code:agents-kt:0.6.0")
    implementation("ai.deep-code:agents-kt-observability:0.6.0")
}
import agents_engine.observability.JsonlRotation
import agents_engine.observability.events

val assistant = agent<String, String>("assistant") {
    skills {
        skill<String, String>("echo", "Echo input") {
            implementedBy { it }
        }
    }
}

val exporters = assistant.events.export {
    jsonl(
        file("build/agents/audit.jsonl"),
        rotation = JsonlRotation.Size(maxBytes = 50L * 1024 * 1024),
    )
}

try {
    assistant("hello")
} finally {
    exporters.forEach { it.close() }
}

Each line has the same field set:

requestId, sessionId, manifestHash, agentId, skillId, toolId, eventType,
timestamp, inputType, outputType, budgetState, guardrailDecision,
mcpClientId, provider, model

Important defaults:

  • Append-only, line-buffered writes.
  • SkillCompleted and Failed trigger a buffered-line flush attempt.
  • Rotation can be size-based (JsonlRotation.Size) or day-based (JsonlRotation.Daily).
  • Write failures never throw into the agent path. The exporter buffers, drops the oldest line under sustained backpressure, and logs the drop.
  • Raw tool arguments, tool results, generated output, streamed text, and exception messages are not serialized. This avoids automatic API-key or PII leakage into audit logs.
  • manifestHash is populated when the runtime event carries one, tying runtime behavior back to the signed-off capability manifest.

Ad-hoc analysis:

jq -c 'select(.eventType == "ToolCalled") | {requestId, agentId, toolId}' build/agents/audit.jsonl
jq -s 'group_by(.eventType) | map({eventType: .[0].eventType, count: length})' build/agents/audit.jsonl
tail -f build/agents/audit.jsonl | jq -r '[.timestamp, .requestId, .agentId, .eventType] | @tsv'

Session Events

The exporter can also write AgentEvent rows from agent.session(input).events directly:

import agents_engine.observability.JsonlAuditExporter
import agents_engine.observability.JsonlRotation
import agents_engine.runtime.events.session
import java.nio.file.Path

val audit = JsonlAuditExporter(
    Path.of("build/agents/session-audit.jsonl"),
    rotation = JsonlRotation.Daily(),
)

agent.session(input).events.collect { event ->
    audit.write(event)
}
audit.close()

Session rows include the same request/session/manifest fields and add provider/model values when TokenUsage is available on SkillCompleted or Completed.


Unified Agent.observe

Agent.observe { event -> } bridges skill, tool, knowledge, and error hooks into one sealed PipelineEvent stream. It composes with existing hooks, so you can keep a local debug hook and also attach the JSONL exporter or another telemetry sink.

agent.observe { event ->
    when (event) {
        is PipelineEvent.SkillChosen ->
            logger.info("skill=${event.skillName} request=${event.requestId}")
        is PipelineEvent.ToolCalled ->
            logger.info("tool=${event.toolName} request=${event.requestId}")
        is PipelineEvent.KnowledgeLoaded ->
            logger.info("knowledge=${event.entryName} chars=${event.contentLength}")
        is PipelineEvent.ErrorOccurred ->
            logger.warn("agent_error request=${event.requestId}", event.error)
    }
}

Every PipelineEvent carries requestId, sessionId, and manifestHash.


onToolUse

Fires after every tool execution within the agentic loop. You receive the tool name, the arguments the LLM provided, and the result the executor returned.

val agent = agent<String, String>("file-ops") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 10 }

    skills {
        skill<String, String>("manage", "Manage files") {
            tools("read_file", "write_file")

            tool("read_file", "Read a file") { args ->
                File(args["path"] as String).readText()
            }

            tool("write_file", "Write a file") { args ->
                val path = args["path"] as String
                val content = args["content"] as String
                File(path).writeText(content)
                "Written ${content.length} bytes"
            }
        }
    }

    onToolUse { name, args, result ->
        println("TOOL [$name] args=$args result=$result")
    }
}

Output when the agent reads and writes a file:

TOOL [read_file] args={path=/tmp/input.txt} result=Hello, World!
TOOL [write_file] args={path=/tmp/output.txt, content=HELLO, WORLD!} result=Written 13 bytes

Use Cases

Structured logging:

onToolUse { name, args, result ->
    logger.info(
        "tool_call",
        mapOf(
            "tool" to name,
            "args" to args,
            "result_type" to result?.javaClass?.simpleName,
            "result_length" to result?.toString()?.length
        )
    )
}

Metrics collection:

onToolUse { name, args, result ->
    metrics.counter("agent.tool.calls", "tool" to name).increment()
    metrics.timer("agent.tool.duration", "tool" to name).record(duration)
}

Audit trail:

val auditLog = mutableListOf<AuditEntry>()

onToolUse { name, args, result ->
    auditLog.add(AuditEntry(
        timestamp = Instant.now(),
        tool = name,
        args = args,
        result = result?.toString()
    ))
}

onKnowledgeUsed

Fires when the LLM decides to fetch a knowledge entry during an agentic skill. You receive the knowledge entry's name and its content.

val agent = agent<String, String>("support-bot") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 5 }

    skills {
        skill<String, String>("answer", "Answer questions") {
            tools("search_db")

            knowledge("faq", "Frequently asked questions") {
                loadText("/data/faq.txt")
            }

            knowledge("pricing", "Current pricing information") {
                loadText("/data/pricing.txt")
            }

            tool("search_db", "Search the support database") { args ->
                supportDb.search(args["query"] as String)
            }
        }
    }

    onKnowledgeUsed { name, content ->
        println("KNOWLEDGE [$name] loaded ${content.length} chars")
    }
}

Output when the LLM fetches the FAQ:

KNOWLEDGE [faq] loaded 4823 chars

Use Cases

Track knowledge relevance:

val knowledgeHits = mutableMapOf<String, Int>()

onKnowledgeUsed { name, content ->
    knowledgeHits.merge(name, 1, Int::plus)
}

// After many runs:
// knowledgeHits = {faq=142, pricing=87, policies=23}
// "policies" is rarely used -- consider removing or restructuring it

Monitor knowledge freshness:

onKnowledgeUsed { name, content ->
    val lastUpdated = knowledgeMetadata[name]?.lastUpdated
    if (lastUpdated != null && lastUpdated.isBefore(Instant.now().minus(30, ChronoUnit.DAYS))) {
        logger.warn("Knowledge '$name' is over 30 days old -- consider refreshing")
    }
}

onSkillChosen

Fires after skill selection resolves, regardless of which strategy was used (predicate, LLM routing, or first-match). You receive the name of the chosen skill.

val agent = agent<String, String>("router") {
    model { ollama("qwen2.5:7b") }

    skills {
        skill<String, String>("billing", "Billing questions") { /* ... */ }
        skill<String, String>("technical", "Technical support") { /* ... */ }
        skill<String, String>("general", "General inquiries") { /* ... */ }
    }

    onSkillChosen { name ->
        println("ROUTING -> $name")
    }
}

Output:

ROUTING -> billing

Use Cases

Routing analytics:

onSkillChosen { name ->
    metrics.counter("agent.skill.chosen", "skill" to name).increment()
}

Debugging routing decisions:

onSkillChosen { name ->
    logger.debug("Skill '$name' chosen for input: ${currentInput.take(100)}...")
}

Forum.onMentionEmitted

Fires after each agent in a Forum produces its output -- both participants and the captain. You receive the agent's name and its output. Participant mentions may fire concurrently (from different coroutines); the captain mention fires last.

val forum = analyst * critic * captain
forum.onMentionEmitted { agentName, output ->
    println("[$agentName]: $output")
}

forum("evaluate this proposal")
// [analyst]: The proposal has strong market fit...
// [critic]: The cost projections are unrealistic...
// [captain]: Proceed with revised budget estimates.

Use Cases

Debate progress counter:

val counter = AtomicInteger(0)
forum.onMentionEmitted { name, _ ->
    println("  mention #${counter.incrementAndGet()} by $name")
}

Collecting all debate contributions:

val transcript = CopyOnWriteArrayList<Pair<String, Any?>>()
forum.onMentionEmitted { name, output ->
    transcript.add(name to output)
}

forum(input)
// transcript contains every agent's contribution in completion order

Live streaming to UI:

forum.onMentionEmitted { name, output ->
    websocket.send(ForumEvent(agent = name, content = output.toString()))
}

Combining Hooks

An agent can combine local hooks with Agent.observe or the JSONL exporter for fuller observability:

val agent = agent<String, String>("observable-agent") {
    model { ollama("qwen2.5:7b") }
    budget { maxTurns = 10 }

    skills {
        skill<String, String>("research", "Research a topic") {
            tools("search", "fetch_page")

            knowledge("guidelines", "Research guidelines") {
                loadText("/data/guidelines.txt")
            }

            tool("search", "Web search") { args ->
                webSearch(args["query"] as String)
            }

            tool("fetch_page", "Fetch a web page") { args ->
                httpClient.get(args["url"] as String)
            }
        }

        skill<String, String>("summarize", "Summarize content") {
            implementedBy { input -> summarize(input) }
        }
    }

    onSkillChosen { name ->
        logger.info("skill_selected: $name")
    }

    onKnowledgeUsed { name, content ->
        logger.info("knowledge_fetched: $name (${content.length} chars)")
    }

    onToolUse { name, args, result ->
        logger.info("tool_executed: $name args=$args")
    }
}

A typical execution trace:

skill_selected: research
knowledge_fetched: guidelines (1205 chars)
tool_executed: search args={query=Kotlin coroutines best practices}
tool_executed: fetch_page args={url=https://example.com/article}

The raw hooks fire in execution order: skill selection first, then knowledge and tools as the agentic loop runs. The unified Agent.observe stream sees the same lifecycle through typed PipelineEvent variants.


Testing with Hooks

Hooks are powerful testing tools. They let you assert on agent behavior without needing a live LLM.

Assert on Tool Calls

@Test
fun `agent calls search before summarize`() {
    val toolCalls = mutableListOf<String>()

    val mockClient = ModelClient { messages ->
        when (toolCalls.size) {
            0 -> LlmResponse.ToolCalls(listOf(ToolCall("search", mapOf("query" to "test"))))
            1 -> LlmResponse.ToolCalls(listOf(ToolCall("summarize", mapOf("text" to "data"))))
            else -> LlmResponse.Text("Summary: test data")
        }
    }

    val agent = agent<String, String>("test-agent") {
        model { ollama("unused"); client = mockClient }
        budget { maxTurns = 5 }

        skills {
            skill<String, String>("work", "Do work") {
                tools("search", "summarize")
                tool("search", "Search") { "results" }
                tool("summarize", "Summarize") { "summary" }
            }
        }

        onToolUse { name, _, _ ->
            toolCalls.add(name)
        }
    }

    agent("analyze this")

    assertEquals(listOf("search", "summarize"), toolCalls)
}

Assert on Skill Selection

@Test
fun `billing questions route to billing skill`() {
    var selectedSkill = ""

    val agent = agent<String, String>("router") {
        skillSelection { input ->
            if (input.contains("charge")) "billing" else "general"
        }

        skills {
            skill<String, String>("billing", "Billing") {
                implementedBy { "Billing response" }
            }
            skill<String, String>("general", "General") {
                implementedBy { "General response" }
            }
        }

        onSkillChosen { name ->
            selectedSkill = name
        }
    }

    agent("Why was I charged twice?")

    assertEquals("billing", selectedSkill)
}

Assert on Knowledge Access

@Test
fun `agent uses FAQ knowledge for common questions`() {
    val knowledgeAccessed = mutableListOf<String>()

    val mockClient = ModelClient { messages ->
        // Simulate the LLM fetching knowledge, then answering
        if (knowledgeAccessed.isEmpty()) {
            LlmResponse.ToolCalls(listOf(ToolCall("knowledge_faq", emptyMap())))
        } else {
            LlmResponse.Text("Based on the FAQ: yes, we offer refunds.")
        }
    }

    val agent = agent<String, String>("support") {
        model { ollama("unused"); client = mockClient }
        budget { maxTurns = 3 }

        skills {
            skill<String, String>("answer", "Answer questions") {
                tools("knowledge_faq")

                knowledge("faq", "FAQ content") {
                    "Q: Do you offer refunds? A: Yes, within 30 days."
                }
            }
        }

        onKnowledgeUsed { name, _ ->
            knowledgeAccessed.add(name)
        }
    }

    agent("Do you offer refunds?")

    assertTrue(knowledgeAccessed.contains("faq"))
}

Full Integration Test Pattern

Combine all hooks for comprehensive behavior verification:

@Test
fun `full agent behavior test`() {
    var skill = ""
    val tools = mutableListOf<String>()
    val knowledge = mutableListOf<String>()

    val mockClient = ModelClient { messages ->
        when (tools.size) {
            0 -> LlmResponse.ToolCalls(listOf(ToolCall("search", mapOf("q" to "test"))))
            else -> LlmResponse.Text("answer")
        }
    }

    val agent = agent<String, String>("full-test") {
        model { ollama("unused"); client = mockClient }
        budget { maxTurns = 5 }

        skills {
            skill<String, String>("research", "Research") {
                tools("search")
                tool("search", "Search") { "results" }

                knowledge("docs", "Documentation") { "doc content" }
            }
        }

        onSkillChosen { skill = it }
        onToolUse { name, _, _ -> tools.add(name) }
        onKnowledgeUsed { name, _ -> knowledge.add(name) }
    }

    val result = agent("find info")

    assertEquals("research", skill)
    assertEquals(listOf("search"), tools)
    assertEquals("answer", result)
}

This pattern gives you deterministic, fast, LLM-free tests that verify the agent's orchestration logic: which skill was chosen, which tools were called, in what order, and what the final result was.


Next Steps

Clone this wiki locally