telemetry: log request_id per interaction #1571

abeatrix · 2023-10-31T23:39:17Z

add interactionID to codebaseContext and recipes

Pass interactionID to codebaseContext getContextMessages
Add interactionID to RecipeContext
Add interactionID parameter to relevant recipe methods
Log interactionID as request_id for each chat question submitted

A new interactionID is assigned each time executeRecipe runs, so events can be traced per individual webview/view.
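
A minimal sketch of that flow, not the actual implementation: `logEvent`, `loggedEvents`, and `newInteractionID` are hypothetical stand-ins (the PR itself uses `uuid.v4()`), and the event names are shortened for illustration.

```typescript
// Hypothetical event shape; the real telemetry service is not shown here.
interface LoggedEvent {
    name: string
    properties: Record<string, unknown>
}

const loggedEvents: LoggedEvent[] = []

function logEvent(name: string, properties: Record<string, unknown>): void {
    loggedEvents.push({ name, properties })
}

// Stand-in for uuid.v4(), used only to keep the sketch dependency-free.
let interactionCounter = 0
function newInteractionID(): string {
    interactionCounter += 1
    return `${interactionCounter}-${Math.random().toString(36).slice(2)}`
}

// Each call generates one ID and attaches it to every event it emits.
function executeRecipe(recipeID: string): string {
    const interactionID = newInteractionID()
    logEvent('keywordContext:searchDuration', { searchDuration: 0, request_id: interactionID })
    logEvent(`recipe:${recipeID}:executed`, { request_id: interactionID })
    return interactionID
}

const first = executeRecipe('chat-question')
const second = executeRecipe('chat-question')
```

Events emitted by the same `executeRecipe` call share one `request_id`, and the two calls get different IDs.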

Test plan

Submit a question in chat. All logEvents for a given chat question should have the same request_id.

I wanted to create an id for each interaction, but an interaction is not created until context is fetched so that didn't work out.

Example:

Question 1:

█ logEvent (telemetry disabled): CodyVSCodeExtension:keywordContext:searchDuration {"properties":{"searchDuration":839.5423329919577,"request_id":"d6fd00bd-7559-456c-8bea-ae58b197a95c"}}
█ logEvent (telemetry disabled): CodyVSCodeExtension:recipe:chat-question:executed {"properties":{"contextSummary":{"embeddings":0,"local":7},"request_id":"d6fd00bd-7559-456c-8bea-ae58b197a95c"},"opts":{"hasV2Event":true}}

Question 2:

█ logEvent (telemetry disabled): CodyVSCodeExtension:keywordContext:searchDuration {"properties":{"searchDuration":900.0559170097113,"request_id":"b5e1f00b-96d1-470a-9b0d-d9238550af71"}}
█ logEvent (telemetry disabled): CodyVSCodeExtension:recipe:chat-question:executed {"properties":{"contextSummary":{"embeddings":0,"local":8},"request_id":"b5e1f00b-96d1-470a-9b0d-d9238550af71"},"opts":{"hasV2Event":true}}

When inserting a code block:

█ logEvent (telemetry disabled): CodyVSCodeExtension:insertButton:clicked {"properties":{"op":"insert","charCount":98,"lineCount":5,"source":"chat-question","request_id":"b5e1f00b-96d1-470a-9b0d-d9238550af71"}}

valerybugakov

Left inline comments. Functionality-wise, it's ✅, but we can also improve the docs and safety related to the request_id field added to analytics events.

valerybugakov · 2023-11-01T06:56:12Z

lib/shared/src/chat/prompts/utils.ts

+            [],
+            [],
+            request_id


For ease of readability, it would be useful to use the object as the only constructor argument. It can be in a follow-up PR if it affects many call sites.
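
To illustrate the suggestion, here is a rough sketch of a constructor that takes a single options object. The `Message` and `InteractionOptions` shapes and all field names are assumptions made for this example, not the real `Interaction` class.

```typescript
// Hypothetical message and option shapes; the real Interaction class has its own.
interface Message {
    speaker: 'human' | 'assistant'
    text: string
}

interface InteractionOptions {
    humanMessage: Message
    assistantMessage: Message
    fullContext?: string[]
    usedContextFiles?: string[]
    requestID?: string
}

class Interaction {
    public readonly humanMessage: Message
    public readonly assistantMessage: Message
    public readonly fullContext: string[]
    public readonly usedContextFiles: string[]
    public readonly requestID: string | undefined

    // A single options object: every value is named at the call site,
    // and optional fields can be omitted instead of being passed as [].
    constructor(options: InteractionOptions) {
        this.humanMessage = options.humanMessage
        this.assistantMessage = options.assistantMessage
        this.fullContext = options.fullContext ?? []
        this.usedContextFiles = options.usedContextFiles ?? []
        this.requestID = options.requestID
    }
}

const interaction = new Interaction({
    humanMessage: { speaker: 'human', text: 'What does this function do?' },
    assistantMessage: { speaker: 'assistant', text: '' },
    requestID: 'example-request-id',
})
```

Compared with positional `[], [], request_id` arguments, the call site no longer depends on argument order.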

valerybugakov · 2023-11-01T06:57:48Z

lib/shared/src/chat/prompts/utils.ts

@@ -21,14 +21,17 @@ export async function newInteraction(args: {
    assistantText?: string
    assistantDisplayText?: string
    source?: ChatEventSource
+    request_id?: string


Would it make sense to use interactionID here to keep the consistent name (for both the casing convention and the semantic meaning)?

valerybugakov · 2023-11-01T07:06:28Z

vscode/src/chat/MessageProvider.ts

@@ -625,6 +623,7 @@ export abstract class MessageProvider extends MessageHandler implements vscode.D
        const customInteraction = await newInteraction({
            displayText: humanInput,
            assistantDisplayText: assistantResponse,
+            request_id: this.currentInteractionID,


I'm confused. We rename the same value (this.currentInteractionID = uuid.v4()) multiple times from interaction ID to request ID and the other way around. Would it make sense to keep the same name everywhere?

The PR description says:

I wanted to create an id for each interaction, but an interaction is not created until context is fetched so that didn't work out.

Does that mean we do not have the "true" interaction ID yet?

Yea sorry for the confusion, I'll update to just use request_id instead because I'd assume InteractionID is created by interaction, but in this case it is not because it's an ID we assign for the whole request, which includes an interaction 😅

Got it, let's use requestID or requestId for consistent casing.

vscode/src/chat/MessageProvider.ts

valerybugakov · 2023-11-01T07:30:36Z

vscode/src/local-context/local-keyword-context-fetcher.ts

@@ -128,7 +127,7 @@ export class LocalKeywordContextFetcher implements KeywordContextFetcher {
            })
        )
        const searchDuration = performance.now() - startTime
-        telemetryService.log('CodyVSCodeExtension:keywordContext:searchDuration', { searchDuration })
+        telemetryService.log('CodyVSCodeExtension:keywordContext:searchDuration', { searchDuration, request_id })


With the current implementation, it's easy to call interaction-related events without the request_id. Also, it's unclear what request_id means here and where to get it. The only way to do that is to follow the call chain to the executeRecipe function, where the interaction ID is initialized (let me know if I misread the code here).

One option for making it safe is to implement an interaction logger similar to the autocomplete logger, which would define the interaction ID and emit all the interaction-related events from the exposed methods. This way, we can require the interaction ID argument and make it easy to determine where to get it from.

I wasn't sure how to get it to work since we have multiple services that can execute recipes asynchronously while sharing one event logger but what you shared seems like a good solution (instead of sharing one eventlogger each interaction will have its own?)

Does this mean instead of passing down a request_id, we will pass the logger into recipe and context fetcher interaction instead? 🤔

Does this mean instead of passing down a request_id, we will pass the logger into recipe and context fetcher interaction instead?

In autocomplete, we import the logger from the module scope and pass the completion ID to its methods. It's a collection of functions that are reused for all the completions (even if they are generated in parallel). The catch is having interaction logger methods with required arguments (requestID and others if needed).
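
A rough sketch of what such a module-scope interaction logger could look like. The `interactionLogger` object, its method names, and the `emit` sink are hypothetical (here `emit` stands in for `telemetryService.log`); only the event names come from the logs in the PR description.

```typescript
type Properties = Record<string, unknown>

interface RecordedEvent {
    name: string
    properties: Properties
}

// Stand-in for telemetryService.log so the sketch runs on its own.
const recorded: RecordedEvent[] = []
function emit(name: string, properties: Properties): void {
    recorded.push({ name, properties })
}

// Module-scope logger shared by all interactions. It holds no per-interaction
// state; callers must pass requestID, so an event cannot be logged without one.
const interactionLogger = {
    searchDuration(requestID: string, searchDuration: number): void {
        emit('CodyVSCodeExtension:keywordContext:searchDuration', { searchDuration, requestID })
    },
    recipeExecuted(requestID: string, recipeID: string, contextSummary: Properties): void {
        emit(`CodyVSCodeExtension:recipe:${recipeID}:executed`, { contextSummary, requestID })
    },
    insertButtonClicked(requestID: string, charCount: number, lineCount: number): void {
        emit('CodyVSCodeExtension:insertButton:clicked', { op: 'insert', charCount, lineCount, requestID })
    },
}

// Two interactions running in parallel share the logger without sharing IDs.
interactionLogger.searchDuration('request-a', 839.5)
interactionLogger.searchDuration('request-b', 900.1)
interactionLogger.recipeExecuted('request-a', 'chat-question', { embeddings: 0, local: 7 })
```

Because `requestID` is a required parameter on every method, the compiler flags any call site that omits it.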

Just chiming in to mention this again - the tracing approach will negate the need for all this clientside lift, as clientside you can just describe a span when something starts and automagically propagate that to the server, and the server will own linking it to the corresponding trace - I cc'd everyone over in this thread to discuss the possibility: https://sourcegraph.slack.com/archives/C04MSD3DP5L/p1698783422921649?thread_ts=1697566943.698939&cid=C04MSD3DP5L

Co-authored-by: Valery Bugakov <skymk1@gmail.com>

bobheadxi · 2023-11-01T16:51:30Z

Another note - isn't the goal of this work to be able to tie clientside interactions into the "real" cost of completions? Right now, this ID has no way of being extracted in the backend to include in backend events or forward to Cody Gateway events, as it's embedded in event metadata.

sourcegraph/sourcegraph#58016 proposes an approach for propagating manual request IDs better (basically this, but update the Sourcegraph GraphQL client to send this as a request header universally), as the backend can own adding request IDs to event metadata and be able to forward them elsewhere, but per this thread, tracing is a far more sustainable way to do this well.
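
As a very rough illustration of the header idea only: the header name `X-Request-ID` and the `withRequestID` helper are assumptions for this sketch, not what sourcegraph/sourcegraph#58016 or the Sourcegraph GraphQL client actually define.

```typescript
// Hypothetical header name, chosen only for illustration.
const REQUEST_ID_HEADER = 'X-Request-ID'

// Returns a copy of the headers with the request ID attached,
// leaving the caller's original headers object untouched.
function withRequestID(headers: Record<string, string>, requestID: string): Record<string, string> {
    return { ...headers, [REQUEST_ID_HEADER]: requestID }
}

const baseHeaders: Record<string, string> = { 'Content-Type': 'application/json' }
const requestHeaders = withRequestID(baseHeaders, 'example-request-id')
```

With the ID on the request itself rather than inside event metadata, the backend could read it and attach it to its own events.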

abeatrix · 2023-11-01T17:01:14Z

Another note - isn't the goal of this work to be able to tie clientside interactions into the "real" cost of completions? Right now, this ID has no way of being extracted in the backend to include in backend events or forward to Cody Gateway events, as it's embedded in event metadata.

Oh that's something @valerybugakov is going to work on. This is for linking chat/commands actions with an ID: https://sourcegraph.slack.com/archives/C05AGQYD528/p1698355247771849?thread_ts=1697738493.299929&cid=C05AGQYD528

bobheadxi · 2023-11-01T18:50:55Z

Oh that's something @valerybugakov is going to work on. This is for linking chat/commands actions with an ID: https://sourcegraph.slack.com/archives/C05AGQYD528/p1698355247771849?thread_ts=1697738493.299929&cid=C05AGQYD528

It looks like they solve for the same thing: each interaction (chat/commands actions) should end up with an ID that we can use to link related events. If Valery adds OpenTelemetry on top and we add that universally as metadata on top of event logs and events (the current plan), we now have duplicated ways of aligning data.

What Kevin asked for in that thread - being able to link chains of events - is exactly what tracing was designed for: https://opentelemetry.io/docs/concepts/signals/traces/

abeatrix · 2023-11-01T19:17:28Z

Oh that's something @valerybugakov is going to work on. This is for linking chat/commands actions with an ID: https://sourcegraph.slack.com/archives/C05AGQYD528/p1698355247771849?thread_ts=1697738493.299929&cid=C05AGQYD528

It looks like they solve for the same thing: each interaction (chat/commands actions) should end up with an ID that we can use to link related events. If Valery adds OpenTelemetry on top and we add that universally as metadata on top of event logs and events (the current plan), we now have duplicated ways of aligning data.

What Kevin asked for in that thread - being able to link chains of events - is exactly what tracing was designed for: https://opentelemetry.io/docs/concepts/signals/traces/

Does this cover cases where a user copies code from a code block that was generated in an old conversation?

abeatrix · 2023-11-01T20:05:47Z

moved to #1586

add interactionID to codebaseContext and recipes

b6ee11f

abeatrix requested review from kelsey-brown and a team October 31, 2023 23:43

add request_id to code block

4bc640d

abeatrix changed the title ~~add interactionID to codebaseContext and recipes~~ telemetry: add interactionID to codebaseContext and recipes Nov 1, 2023

abeatrix changed the title ~~telemetry: add interactionID to codebaseContext and recipes~~ telemetry: log request_id per interaction Nov 1, 2023

valerybugakov reviewed Nov 1, 2023

Update MessageProvider.ts

7791ad6

Co-authored-by: Valery Bugakov <skymk1@gmail.com>

abeatrix closed this Nov 1, 2023

valerybugakov deleted the bee/request-id branch November 2, 2023 01:36
