-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
telemetry: log request_id per interaction #1571
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left inline comments. Functionality-wise, it's ✅, but we can also improve the docs and safety related to the request_id
field added to analytics events.
[], | ||
[], | ||
request_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ease of readability, it would be useful to use the object as the only constructor argument. It can be in a follow-up PR if it affects many call sites.
@@ -21,14 +21,17 @@ export async function newInteraction(args: { | |||
assistantText?: string | |||
assistantDisplayText?: string | |||
source?: ChatEventSource | |||
request_id?: string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use interactionID
here to keep the consistent name (for both the casing convention and the semantic meaning)?
@@ -625,6 +623,7 @@ export abstract class MessageProvider extends MessageHandler implements vscode.D | |||
const customInteraction = await newInteraction({ | |||
displayText: humanInput, | |||
assistantDisplayText: assistantResponse, | |||
request_id: this.currentInteractionID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused. We rename the same value (this.currentInteractionID = uuid.v4()
) multiple times from interaction ID to request ID and the other way around. Would it make sense to keep the same name everywhere?
The PR description says:
I wanted to create an id for each interaction, but an interaction is not created until context is fetched so that didn't work out.
Does that mean we do not have the "true" interaction ID yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea sorry for the confusion, I'll update to just use request_id instead because I'd assume InteractionID is created by interaction, but in this case it is not because it's an ID we assign for the whole request, which includes an interaction 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, let's use requestID
or requestId
for consistent casing.
@@ -128,7 +127,7 @@ export class LocalKeywordContextFetcher implements KeywordContextFetcher { | |||
}) | |||
) | |||
const searchDuration = performance.now() - startTime | |||
telemetryService.log('CodyVSCodeExtension:keywordContext:searchDuration', { searchDuration }) | |||
telemetryService.log('CodyVSCodeExtension:keywordContext:searchDuration', { searchDuration, request_id }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the current implementation, it's easy to call interaction-related events without the request_id
. Also, it's unclear what request_id
means here and where to get it. The only way to do that is to follow the call chain to the executeRecipe
function, where the interaction ID is initialized (let me know if I misread the code here).
One option for making it safe is to implement an interaction logger similar to the autocomplete logger, which would define the interaction ID and emit all the interaction-related events from the exposed methods. This way, we can require the interaction ID argument and make it easy to determine where to get it from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure how to get it to work since we have multiple services that can execute recipes asynchronously while sharing one event logger but what you shared seems like a good solution (instead of sharing one eventlogger each interaction will have its own?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean instead of passing down a request_id, we will pass the logger into recipe and context fetcher interaction instead? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean instead of passing down a request_id, we will pass the logger into recipe and context fetcher interaction instead?
In autocomplete, we import the logger from the module scope and pass the completion ID to its methods. It's a collection of functions that are reused for all the completions (even if they are generated in parallel). The catch is having interaction logger methods with required arguments (requestID
and others if needed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just chiming in to mention this again - the tracing approach will negate the need for all this clientside lift, as clientside you can just describe a span when something starts and automagically propagate that to the server, and the server will own linking it to the corresponding trace - I cc'd everyone over in this thread to discuss the possibility: https://sourcegraph.slack.com/archives/C04MSD3DP5L/p1698783422921649?thread_ts=1697566943.698939&cid=C04MSD3DP5L
Co-authored-by: Valery Bugakov <skymk1@gmail.com>
Another note - isn't the goal of this work to be able to tie clientside interactions into the "real" cost of completions? Right now, this ID has no way of being extracted in the backend to include in backend events or forward to Cody Gateway events, as it's embedded in event metadata. sourcegraph/sourcegraph#58016 proposes an approach for propagating manual request IDs better (basically this, but update the Sourcegraph GraphQL client to be send this as a request header universally), as the backend can own adding request IDs to event metadata and be able to forward it elsewhere, but per this thread tracing is a far more sustainable way to do this well |
Oh that's something @valerybugakov going to work on. This is for linking chat/commands actions with an ID: https://sourcegraph.slack.com/archives/C05AGQYD528/p1698355247771849?thread_ts=1697738493.299929&cid=C05AGQYD528 |
It looks like they solve for the same thing: each interaction (chat/commands actions) should end up with an ID that we can use to link related events. If Valery adds OpenTelemetry on top and we add that universally as metadata on top of event logs and events (the current plan), we not have duplicated ways of aligning data What Kevin asked for in that thread - being able to link chains of events - is exactly what tracing was designed for: https://opentelemetry.io/docs/concepts/signals/traces/ |
Does this cover cases where a user copy code from a code block that was generated from an old conversation? |
moved to #1586 |
add interactionID to codebaseContext and recipes
A new interactionID is assigned whenever executeRecipe is run, which allows tracing in individual webview/view.
Test plan
Submit a question in chat, you should see all the logEvents has the same
request_id
for each chat question submitted.I wanted to create an id for each interaction, but an interaction is not created until context is fetched so that didn't work out.
Example:
Question 1
Question 2:
When paste a insert a code block: