Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

telemetry: log request_id per interaction #1571

Closed
wants to merge 3 commits into from
Closed

Conversation

abeatrix
Copy link
Contributor

@abeatrix abeatrix commented Oct 31, 2023

add interactionID to codebaseContext and recipes

  • Pass interactionID to codebaseContext getContextMessages
  • Add interactionID to RecipeContext
  • Add interactionID parameter to relevant recipe methods
  • Log interactionID as request_id for each chat question submitted

A new interactionID is assigned whenever executeRecipe is run, which allows tracing in individual webview/view.

Test plan

Submit a question in chat, you should see all the logEvents has the same request_id for each chat question submitted.

I wanted to create an id for each interaction, but an interaction is not created until context is fetched so that didn't work out.

Example:

Question 1

█ logEvent (telemetry disabled): CodyVSCodeExtension:keywordContext:searchDuration {"properties":{"searchDuration":839.5423329919577,"request_id":"d6fd00bd-7559-456c-8bea-ae58b197a95c"}}
█ logEvent (telemetry disabled): CodyVSCodeExtension:recipe:chat-question:executed {"properties":{"contextSummary":{"embeddings":0,"local":7},"request_id":"d6fd00bd-7559-456c-8bea-ae58b197a95c"},"opts":{"hasV2Event":true}}

Question 2:

█ logEvent (telemetry disabled): CodyVSCodeExtension:keywordContext:searchDuration {"properties":{"searchDuration":900.0559170097113,"request_id":"b5e1f00b-96d1-470a-9b0d-d9238550af71"}}
█ logEvent (telemetry disabled): CodyVSCodeExtension:recipe:chat-question:executed {"properties":{"contextSummary":{"embeddings":0,"local":8},"request_id":"b5e1f00b-96d1-470a-9b0d-d9238550af71"},"opts":{"hasV2Event":true}}

When paste a insert a code block:

█ logEvent (telemetry disabled): CodyVSCodeExtension:insertButton:clicked {"properties":{"op":"insert","charCount":98,"lineCount":5,"source":"chat-question","request_id":"b5e1f00b-96d1-470a-9b0d-d9238550af71"}}
``

@abeatrix abeatrix requested review from kelsey-brown and a team October 31, 2023 23:43
@abeatrix abeatrix changed the title add interactionID to codebaseContext and recipes telemetry: add interactionID to codebaseContext and recipes Nov 1, 2023
@abeatrix abeatrix changed the title telemetry: add interactionID to codebaseContext and recipes telemetry: log request_id per interaction Nov 1, 2023
Copy link
Member

@valerybugakov valerybugakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left inline comments. Functionality-wise, it's ✅, but we can also improve the docs and safety related to the request_id field added to analytics events.

Comment on lines +32 to +34
[],
[],
request_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ease of readability, it would be useful to use the object as the only constructor argument. It can be in a follow-up PR if it affects many call sites.

@@ -21,14 +21,17 @@ export async function newInteraction(args: {
assistantText?: string
assistantDisplayText?: string
source?: ChatEventSource
request_id?: string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to use interactionID here to keep the consistent name (for both the casing convention and the semantic meaning)?

@@ -625,6 +623,7 @@ export abstract class MessageProvider extends MessageHandler implements vscode.D
const customInteraction = await newInteraction({
displayText: humanInput,
assistantDisplayText: assistantResponse,
request_id: this.currentInteractionID,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. We rename the same value (this.currentInteractionID = uuid.v4()) multiple times from interaction ID to request ID and the other way around. Would it make sense to keep the same name everywhere?

The PR description says:

I wanted to create an id for each interaction, but an interaction is not created until context is fetched so that didn't work out.

Does that mean we do not have the "true" interaction ID yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea sorry for the confusion, I'll update to just use request_id instead because I'd assume InteractionID is created by interaction, but in this case it is not because it's an ID we assign for the whole request, which includes an interaction 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, let's use requestID or requestId for consistent casing.

vscode/src/chat/MessageProvider.ts Outdated Show resolved Hide resolved
@@ -128,7 +127,7 @@ export class LocalKeywordContextFetcher implements KeywordContextFetcher {
})
)
const searchDuration = performance.now() - startTime
telemetryService.log('CodyVSCodeExtension:keywordContext:searchDuration', { searchDuration })
telemetryService.log('CodyVSCodeExtension:keywordContext:searchDuration', { searchDuration, request_id })
Copy link
Member

@valerybugakov valerybugakov Nov 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current implementation, it's easy to call interaction-related events without the request_id. Also, it's unclear what request_id means here and where to get it. The only way to do that is to follow the call chain to the executeRecipe function, where the interaction ID is initialized (let me know if I misread the code here).

One option for making it safe is to implement an interaction logger similar to the autocomplete logger, which would define the interaction ID and emit all the interaction-related events from the exposed methods. This way, we can require the interaction ID argument and make it easy to determine where to get it from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure how to get it to work since we have multiple services that can execute recipes asynchronously while sharing one event logger but what you shared seems like a good solution (instead of sharing one eventlogger each interaction will have its own?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean instead of passing down a request_id, we will pass the logger into recipe and context fetcher interaction instead? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean instead of passing down a request_id, we will pass the logger into recipe and context fetcher interaction instead?

In autocomplete, we import the logger from the module scope and pass the completion ID to its methods. It's a collection of functions that are reused for all the completions (even if they are generated in parallel). The catch is having interaction logger methods with required arguments (requestID and others if needed).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just chiming in to mention this again - the tracing approach will negate the need for all this clientside lift, as clientside you can just describe a span when something starts and automagically propagate that to the server, and the server will own linking it to the corresponding trace - I cc'd everyone over in this thread to discuss the possibility: https://sourcegraph.slack.com/archives/C04MSD3DP5L/p1698783422921649?thread_ts=1697566943.698939&cid=C04MSD3DP5L

Co-authored-by: Valery Bugakov <skymk1@gmail.com>
@bobheadxi
Copy link
Member

bobheadxi commented Nov 1, 2023

Another note - isn't the goal of this work to be able to tie clientside interactions into the "real" cost of completions? Right now, this ID has no way of being extracted in the backend to include in backend events or forward to Cody Gateway events, as it's embedded in event metadata.

sourcegraph/sourcegraph#58016 proposes an approach for propagating manual request IDs better (basically this, but update the Sourcegraph GraphQL client to be send this as a request header universally), as the backend can own adding request IDs to event metadata and be able to forward it elsewhere, but per this thread tracing is a far more sustainable way to do this well

@abeatrix
Copy link
Contributor Author

abeatrix commented Nov 1, 2023

Another note - isn't the goal of this work to be able to tie clientside interactions into the "real" cost of completions? Right now, this ID has no way of being extracted in the backend to include in backend events or forward to Cody Gateway events, as it's embedded in event metadata.

Oh that's something @valerybugakov going to work on. This is for linking chat/commands actions with an ID: https://sourcegraph.slack.com/archives/C05AGQYD528/p1698355247771849?thread_ts=1697738493.299929&cid=C05AGQYD528

@bobheadxi
Copy link
Member

bobheadxi commented Nov 1, 2023

Oh that's something @valerybugakov going to work on. This is for linking chat/commands actions with an ID: https://sourcegraph.slack.com/archives/C05AGQYD528/p1698355247771849?thread_ts=1697738493.299929&cid=C05AGQYD528

It looks like they solve for the same thing: each interaction (chat/commands actions) should end up with an ID that we can use to link related events. If Valery adds OpenTelemetry on top and we add that universally as metadata on top of event logs and events (the current plan), we not have duplicated ways of aligning data

What Kevin asked for in that thread - being able to link chains of events - is exactly what tracing was designed for: https://opentelemetry.io/docs/concepts/signals/traces/

@abeatrix
Copy link
Contributor Author

abeatrix commented Nov 1, 2023

Oh that's something @valerybugakov going to work on. This is for linking chat/commands actions with an ID: https://sourcegraph.slack.com/archives/C05AGQYD528/p1698355247771849?thread_ts=1697738493.299929&cid=C05AGQYD528

It looks like they solve for the same thing: each interaction (chat/commands actions) should end up with an ID that we can use to link related events. If Valery adds OpenTelemetry on top and we add that universally as metadata on top of event logs and events (the current plan), we not have duplicated ways of aligning data

What Kevin asked for in that thread - being able to link chains of events - is exactly what tracing was designed for: https://opentelemetry.io/docs/concepts/signals/traces/

Does this cover cases where a user copy code from a code block that was generated from an old conversation?

@abeatrix
Copy link
Contributor Author

abeatrix commented Nov 1, 2023

moved to #1586

@abeatrix abeatrix closed this Nov 1, 2023
@valerybugakov valerybugakov deleted the bee/request-id branch November 2, 2023 01:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants