---
title: View AIM data in the UI
metaDescription: 'AI Monitoring lets you observe the AI layer of your tech stack, giving you a holistic overview of the health and performance of your AI-powered app.'
freshnessValidatedDate: never
---

AI Monitoring gives you a holistic overview of the health and performance of your AI-powered app. This page shows you how to explore your AI responses in the UI, filter for the data that matters most to you, and drill into individual responses and their traces.

## Explore your AI responses [#explore]

To access your AI data, click **AI Responses** from the APM Summary page. The **AI Responses** page aggregates metrics about your AI's performance, helping you understand trends over time in response performance, quality, and cost.

    <img
        title="AI Response page"
        alt="The AI Responses page is made up of the filter bar, time series graphs, and the Responses table."
        src={aiResponseAggregate}
    />
    <figcaption>Your AI Response page lets you query your data, view time series graphs about your app's behavior, and investigate interactions between your end user and AI.</figcaption>

Following the screenshot above, you can use this page to:

  1. Add a query to the filter bar to look at specific kinds of data.
  2. Use the time series graphs to pinpoint exactly when a change occurred. Click the dropdown to toggle between token usage, total responses, and average response time. (A query sketch of these metrics follows this list.)
  3. Investigate individual responses in the table to dig into trace-level data.
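
The time series graphs are the quickest way to spot a trend, but you can also chart the same metrics yourself with NRQL. This is a minimal sketch: the event and attribute names (`LlmChatCompletionSummary`, `LlmChatCompletionMessage`, `duration`, `token_count`) are assumptions for illustration, so confirm the names your agent actually reports in the data explorer before relying on them.

```sql
// Total responses over time (event name is an assumption)
SELECT count(*) FROM LlmChatCompletionSummary TIMESERIES SINCE 1 day ago

// Average response time over time (duration attribute is an assumption)
SELECT average(duration) FROM LlmChatCompletionSummary TIMESERIES SINCE 1 day ago

// Token usage over time (event and attribute names are assumptions)
SELECT sum(token_count) FROM LlmChatCompletionMessage TIMESERIES SINCE 1 day ago
```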

## Expose the AI data that matters most to you [#filter]

You can filter your data by attributes or keywords to narrow your total responses down to only the ones you want to see. When you add a query, your tiles, time series charts, and table update to reflect only that data.

    <img
        title="AI Response filter bar with query"
        alt="The filter bar lets you expose the data you need to troubleshoot a problem with your AI stack."
        src={aiFilterBar}
    />
    <figcaption>**AI Response > filter bar**: Query your data with various attributes to drill down into specific responses.</figcaption>

Because AI Monitoring lets you view all of your AI's responses, you may want to isolate certain kinds of data. You can query by attributes, which you can find by clicking the **+** icon next to the filter bar. You can also search your requests and responses by keyword. Some examples are:

  • Testing different models: If you're testing different models, filter by model with a query like `Response model IN anthropic.claude-instant-v1`. This query shows only data related to that particular AI model.
  • Querying for keywords: If you want to view responses or requests that ask about pricing, update the query to read `Request LIKE cheapest` or `Response LIKE expensive`. This exposes inputs and outputs where those keywords appear. (The sketch after this list shows rough NRQL equivalents.)
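
If you want to reproduce these filters outside the filter bar, a rough NRQL equivalent might look like the sketch below. The event and attribute names (`LlmChatCompletionSummary`, `response.model`, `LlmChatCompletionMessage`, `content`) are assumptions; swap in the attributes you actually see when you click the **+** icon.

```sql
// Only data for a particular model (event and attribute names are assumptions)
SELECT * FROM LlmChatCompletionSummary WHERE response.model = 'anthropic.claude-instant-v1' SINCE 1 day ago

// Requests or responses that mention a keyword (content attribute is an assumption)
SELECT * FROM LlmChatCompletionMessage WHERE content LIKE '%cheapest%' OR content LIKE '%expensive%' SINCE 1 day ago
```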

## Investigate individual responses [#investigate-response]

The **Responses** table pairs every request made to your AI with the response it returned.

    <img
        title="The AI Responses table"
        alt="The Responses table lists out information about a request/response interaction. You can see details like number of errors, completions, total tokens, and the model used."
        src={aiResponseTable}
    />
    <figcaption>The **Responses** table is your entry point to going deeper into a request's lifecycle.</figcaption>

  • Sort by ascending or descending: If your overall token usage or error count has increased, click a column name to sort the table and find which responses contributed to the spike. (For another way to find the source of a spike, see the query sketch after this list.)
  • Open an individual response: Click a row to track your AI's progress as it interacts with other AI models, API calls, database lookups, and custom application code. This opens the trace view.
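
Before you start sorting rows one column at a time, you can also facet the underlying events to see which model is behind a spike. As with the earlier sketches, the event and attribute names here are assumptions, not guaranteed AI Monitoring attributes.

```sql
// Rank models by total token usage (event and attribute names are assumptions)
SELECT sum(token_count) FROM LlmChatCompletionMessage FACET request.model SINCE 1 day ago

// Count completions per model to see what's behind a jump in total responses
SELECT count(*) FROM LlmChatCompletionSummary FACET response.model SINCE 1 day ago
```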

## Explore traces to go deeper into an AI response [#traces]

Click a response from the table to open the trace view. This gives you specific information about a particular response.

    <img
        title="Trace view for an individual AI response"
        alt="The trace view shows how your AI interacts with other AI models, API calls, database lookups, and custom application code."
        src={aiTracePage}
    />

    <figcaption>See how your AI interacts with other AI models, API calls, database lookups, and custom application code. Click **Trace** to view trace data, or **Logs** to view your logs.</figcaption>

Following the screenshot above, you can use this page to:

  1. Get an overview of request duration and total tokens. For example, in this screenshot, you can see that the response had high latency at 21.7 seconds.
  2. Look at summary metrics for all sub-processes called when constructing a response. You can use this section to see how an operation contributed to a higher overall duration.
  3. Follow an end user's request until it reaches completion. The colored bars above a trace indicate the relative duration of each step. Click a span to view its details.
  4. See additional context about your AI's response. If you're concerned about total tokens, check the ratio between request and completion tokens. If you have multiple models in your toolchain, check which model generated a particular response. (A query sketch for the token ratio follows this list.)
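
If you want to track the request-to-completion token ratio across all of your responses rather than inspecting one trace at a time, a faceted query along these lines may help. The `response.usage.*` attribute names are assumptions; substitute whatever token attributes your agent reports.

```sql
// Ratio of completion tokens to prompt tokens per model (attribute names are assumptions)
SELECT sum(response.usage.completion_tokens) / sum(response.usage.prompt_tokens)
  FROM LlmChatCompletionSummary FACET response.model SINCE 1 day ago
```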