Autocomplete: Improve sampling code and prepare for Honeycomb export #3034

Merged: philipp-spiess merged 21 commits into main from ps/improve-tracing on Feb 9, 2024

@philipp-spiess philipp-spiess commented Feb 5, 2024

This PR improves the client-side traces for autocomplete and prepares them for export to Honeycomb. An example trace can be found here: https://ui.honeycomb.io/sourcegraph/environments/cody/datasets/cody-client/result/z87pN9yLSan/trace/rv3ajXLDNV7?fields[]=s_name&fields[]=s_serviceName&span=2edb453047488acc

[Screenshot 2024-02-07 at 18 16 09]

The changes can be summarized as:

  • We add a suggested event that is set when a completion callback runs to the end and its result will be visible to the user
  • We add all exposed feature flags after the completion is ready (so flags that are only evaluated as part of the completion request are included), as well as all the metadata that is also added to analytics pings
  • We fix a bug that made spans much shorter than the operations they measured. Whoopsie!

These changes were tested with the following OTel collector processor config:

processors:
  tail_sampling:
    decision_wait: 30s
    num_traces: 100000
    expected_new_traces_per_sec: 10
    policies:
    - name: and
      type: and
      and:
        and_sub_policy:
          # We only want to sample traces for completions that are displayed to the user. We track
          # this with a suggested event on the root span
          - name: completion-was-suggested
            type: ottl_condition
            ottl_condition:
              error_mode: ignore
              spanevent:
                - "name == \"suggested\""
          # Only track events that originate from the completion provider
          - name: is-completion-event
            type: ottl_condition
            ottl_condition:
              error_mode: ignore
              span:
                - "name == \"autocomplete.provideInlineCompletionItems\""
          # Only track spans for users that are exposed to the tracing experiment
          - name: is-tracing-enabled
            type: ottl_condition
            ottl_condition:
              error_mode: ignore
              span:
                - "attributes[\"cody-autocomplete-tracing\"] == true"
          # Ignore completion callbacks that return the last candidate. We only care about traces that
          # show a new completion
          - name: is-not-last-candidate
            type: ottl_condition
            ottl_condition:
              error_mode: ignore
              spanevent:
                - "attributes[\"source\"] != \"LastCandidate\""
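
As a rough mental model (not the collector's implementation), the and policy keeps a trace only when every sub-policy matches at least one span or span event in that trace. The TypeScript sketch below mirrors the four sub-policies above. The Trace, Span, and SpanEvent shapes and the shouldSample helper are hypothetical, and treating an event without a source attribute as "not LastCandidate" is an assumption.

```typescript
// Hypothetical, simplified trace shapes for illustration only.
interface SpanEvent {
    name: string
    attributes: Record<string, unknown>
}
interface Span {
    name: string
    attributes: Record<string, unknown>
    events: SpanEvent[]
}
type Trace = Span[]

// In this model a sub-policy matches if any span (or span event) in the trace satisfies it.
type SubPolicy = (trace: Trace) => boolean

const anySpan = (pred: (span: Span) => boolean): SubPolicy => trace => trace.some(pred)
const anyEvent = (pred: (event: SpanEvent) => boolean): SubPolicy => trace =>
    trace.some(span => span.events.some(pred))

const subPolicies: SubPolicy[] = [
    // completion-was-suggested
    anyEvent(event => event.name === 'suggested'),
    // is-completion-event
    anySpan(span => span.name === 'autocomplete.provideInlineCompletionItems'),
    // is-tracing-enabled
    anySpan(span => span.attributes['cody-autocomplete-tracing'] === true),
    // source is not LastCandidate (assumption: a missing attribute also counts as a match)
    anyEvent(event => event.attributes['source'] !== 'LastCandidate'),
]

// The "and" policy: keep the trace only if all sub-policies match.
function shouldSample(trace: Trace): boolean {
    return subPolicies.every(policy => policy(trace))
}
```

Calling shouldSample on a trace whose root span has the expected name, the tracing attribute, and a suggested event returns true in this model; dropping any one of those makes it return false.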

Test plan

@philipp-spiess philipp-spiess changed the title WIP Autocomplete: Improve sampling code and prepare for Honeycomb export Feb 7, 2024
    }

    return handleSuccess(response)
} catch (error) {
    return catchError(error)
} finally {
    span.end()

Contributor Author

This caused the issue we saw with traces: for async responses, span.end() was called synchronously after the promise was created but before it resolved, which left all spans very short 😬

Member

would be called synchronously after creating the promise but before it resolved, which left all spans very short

TY for catching this! Makes sense.
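
The bug described in this thread can be reproduced without OpenTelemetry. The sketch below uses a hypothetical FakeSpan and two hypothetical wrappers, wrapBuggy and wrapFixed. It is not the PR's actual code, only an illustration of why ending the span in finally is too early for async callbacks and one way to defer it.

```typescript
// Minimal stand-in for an OpenTelemetry span; only end() matters here.
class FakeSpan {
    public ended = false
    public end(): void {
        this.ended = true
    }
}

// Buggy pattern: `finally` runs as soon as the callback returns, so for an async
// callback the span ends while the returned promise is still pending.
function wrapBuggy<T>(span: FakeSpan, callback: () => T): T {
    try {
        return callback()
    } finally {
        span.end()
    }
}

// Fixed pattern: if the callback returns a promise, end the span once it settles.
function wrapFixed<T>(span: FakeSpan, callback: () => T): T {
    let result: T
    try {
        result = callback()
    } catch (error) {
        // Synchronous failure: the work is over, so the span can end right away.
        span.end()
        throw error
    }
    if (result instanceof Promise) {
        // Async result: defer end() until the promise resolves or rejects.
        return result.finally(() => span.end()) as unknown as T
    }
    span.end()
    return result
}
```

With a promise that is still pending, wrapBuggy has already ended the span when it returns, while wrapFixed leaves it open until the promise settles.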

@philipp-spiess philipp-spiess requested review from a team and valerybugakov February 7, 2024 17:23
@philipp-spiess philipp-spiess marked this pull request as ready for review February 7, 2024 17:23

philipp-spiess commented Feb 8, 2024

I made some more changes. We now have:

  • More granular logging on the Cody Gateway upstream request for Fireworks (every yielded response chunk is now logged)
  • A sampling decision made on the client

This also changes the tail sampler config, but I now feel more confident that this provides actual value. See:

[Screenshot 2024-02-08 at 13 47 44]

With the following sampler config:

processors:
  tail_sampling:
    decision_wait: 30s
    num_traces: 100000
    expected_new_traces_per_sec: 10
    policies:
    - name: and
      type: and
      and:
        and_sub_policy:
          # Only track events that originate from the completion provider
          - name: is-completion-event
            type: ottl_condition
            ottl_condition:
              error_mode: ignore
              span:
                - "name == \"autocomplete.provideInlineCompletionItems\""
          # Only track spans that the client explicitly marked to be sampled
          - name: is-sampled
            type: ottl_condition
            ottl_condition:
              error_mode: ignore
              span:
                - "attributes[\"sampled\"] == true"
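
On the client side, the sampling decision then comes down to setting a sampled attribute on the span that the collector filters on. The sketch below is a guess at what such a decision could look like, not the PR's code: the SpanLike interface, the CompletionOutcome shape, and the criteria (borrowed from the earlier collector config) are all assumptions.

```typescript
// Minimal stand-in for the subset of the OpenTelemetry Span API used here.
interface SpanLike {
    setAttribute(key: string, value: string | number | boolean): void
}

// Hypothetical inputs to the decision; the PR does not spell out the exact criteria.
interface CompletionOutcome {
    tracingFlagEnabled: boolean
    wasSuggested: boolean
    source: string
}

// Decide on the client whether this trace is worth keeping and, if so, mark the span.
function markSampled(span: SpanLike, outcome: CompletionOutcome): boolean {
    const sampled =
        outcome.tracingFlagEnabled && outcome.wasSuggested && outcome.source !== 'LastCandidate'
    if (sampled) {
        // The collector config above filters on attributes["sampled"] == true.
        span.setAttribute('sampled', true)
    }
    return sampled
}
```

Moving the decision to the client keeps the collector config down to two conditions, at the cost of having to ship a new client to change the criteria.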

@philipp-spiess
Contributor Author

Did some more cleanup now. Here's a recent example: https://ui.honeycomb.io/sourcegraph/environments/cody/datasets/cody-client/result/FbSAQhtzuid/trace/96XAibG4QyM?fields[]=s_name&fields[]=s_serviceName&span=aded4798e8db6458

I’m very happy with this now. Let's try enabling this for client traces to Honeycomb and see how far we get by manually following the trace headers into the Cody Gateway traces.

@philipp-spiess
Contributor Author

@valerybugakov This one is ready to review now for reals :D

@philipp-spiess
Contributor Author

Accompanying OTel collector change here: https://github.com/sourcegraph/deploy-sourcegraph-cloud/pull/18413

@philipp-spiess
Contributor Author

The OTel collector changes are deployed, so we can now test this against dotcom 😎
https://ui.honeycomb.io/sourcegraph/environments/cody/datasets/cody-client/result/wB[…]1prv/trace/xxaYoFdmQJX?fields[]=s_name&fields[]=s_serviceName

@valerybugakov valerybugakov left a comment

NICE!

@@ -519,6 +520,9 @@ export function suggested(id: CompletionLogID): void {
    if (!event.suggestedAt) {
        event.suggestedAt = performance.now()

        span?.setAttributes(getSharedParams(event) as any)

Member

Why is any required here?

Contributor Author

Same as #3034 (comment)
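
The linked comment is not reproduced in this thread, so the reason for the cast is not shown here. If the cause is that the shared params contain values outside the primitive types that span attributes accept (an assumption), one way to avoid a cast is to normalize the object first. The toSpanAttributes helper and the AttributeValueLike type below are hypothetical and only loosely modeled on OpenTelemetry's attribute value types.

```typescript
// Simplified stand-in for the value types a span attribute may hold (assumption).
type AttributeValueLike = string | number | boolean

// Convert an arbitrary params object into a flat, attribute-safe record:
// primitives pass through, null/undefined are dropped, and arrays or nested
// objects are serialized to JSON strings.
function toSpanAttributes(params: Record<string, unknown>): Record<string, AttributeValueLike> {
    const attributes: Record<string, AttributeValueLike> = {}
    for (const [key, value] of Object.entries(params)) {
        if (value === undefined || value === null) {
            continue
        }
        if (typeof value === 'string' || typeof value === 'number' || typeof value === 'boolean') {
            attributes[key] = value
            continue
        }
        // JSON.stringify yields undefined for functions and symbols; skip those.
        const json = JSON.stringify(value)
        if (json !== undefined) {
            attributes[key] = json
        }
    }
    return attributes
}
```

The result could then be passed to span?.setAttributes(...) without a cast, provided the real attribute type accepts such a record.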

Comment on lines +333 to +336

return tracer.startActiveSpan(
    `POST ${url}`,
    async function* (span): CompletionResponseGenerator {

Member

Do we want similar changes in vscode/src/completions/client.ts?

Comment on lines +67 to +68
// Disable default process logging. We do not care about the VS Code extension process
autoDetectResources: false,

Member

TIL

@philipp-spiess philipp-spiess merged commit 423938a into main Feb 9, 2024
15 checks passed
@philipp-spiess philipp-spiess deleted the ps/improve-tracing branch February 9, 2024 15:58