Skip to content

[Error Counting] Add ID to graphql::Error #7712

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: dev
Choose a base branch
from

Conversation

rregitsky
Copy link
Contributor

@rregitsky rregitsky commented Jun 17, 2025

This makes two major changes to graphql::Error:

  1. Adds an apollo_id UUID to errors to allow us to uniquely identify Errors. These IDs are randomly generated on construction unless one is specified by the input. Its setter is private so that Error construction is forced through new() where we randomly generate the ID. The getter is public through apollo_id(). This value is not serialized as we'd prefer not to expose the value the user. This is not for security reasons, but more because it has no value outside of the router context.
  2. Converts all Error construction into builders. The apollo_id setter is private so we are forced to avoid using the constructor.

Other minor, but notable changes:

  • apollo.router.operations.error measurements with the _entity path will now emit using _entity path rather than being replaced with the entity's name. This is to prevent double counting when the error counting migration is introduced in part 3. We will attempt to fix this behavior in a fast follow.
  • Removes Default trait from Error to force a random apollo_id to be generated
  • Makes extension_code no longer required when building an Error as many places weren't setting one with the constructor. We can revisit making it required once all of those places have proper error codes.

Part 2 of a split from #7357. Part 1 can be found here: #7699


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Metrics and logs are added3 and documented
  • Tests added and passing4
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. A lot of (if not most) features benefit from built-in observability and debug-level logs. Please read this guidance on metrics best-practices.

  4. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

Copy link
Contributor

@rregitsky, please consider creating a changeset entry in /.changesets/. These instructions describe the process and tooling.

@rregitsky rregitsky changed the title [Error Counting] Use Builder for all graphql::Error Creation [Error Counting] Add ID and Use Builder for all graphql::Errors Jun 17, 2025
@rregitsky rregitsky changed the title [Error Counting] Add ID and Use Builder for all graphql::Errors [Error Counting] Add ID to graphql::Error Jun 17, 2025
@apollo-librarian
Copy link

apollo-librarian bot commented Jun 17, 2025

✅ Docs preview ready

The preview is ready to be viewed. View the preview

File Changes

0 new, 2 changed, 0 removed
* (developer-tools)/apollo-mcp-server/(latest)/limitations.mdx
* graphos/routing/(latest)/self-hosted/containerization/kubernetes/other-considerations.mdx

Build ID: 103874ef904d501652521a52

URL: https://www.apollographql.com/docs/deploy-preview/103874ef904d501652521a52

Copy link
Contributor

@timbotnik timbotnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behavioral changes seem good, just wondering if we can get abstract away some of the boilerplate overwriting of null IDs in tests, which seems right now a bit heavy-handed and could be more DRY.

Copy link
Contributor

@timbotnik timbotnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest changes look great!

rregitsky and others added 3 commits June 20, 2025 09:02
Co-authored-by: timbotnik <tim@apollographql.com>
…LSR_1504' into rreg/err_count_error_refactor_PULSR_1504
@rregitsky rregitsky marked this pull request as ready for review June 20, 2025 13:05
@rregitsky rregitsky requested review from a team as code owners June 20, 2025 13:05
@goto-bus-stop
Copy link
Member

Is there some background documentation/discussion on what these IDs are for, and why graphql::Error is the right place to keep them?

Copy link
Contributor

@BrynCooke BrynCooke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple of comments!

@timbotnik
Copy link
Contributor

timbotnik commented Jun 26, 2025

Is there some background documentation/discussion on what these IDs are for, and why graphql::Error is the right place to keep them?

We will need a “strong identity” with which errors to count / not / count / redact in each router execution layer that is maintained in the response context. This means we’ll need to mark each error with some kind of internal error id which can then be added to a list of “already counted” errors.

…efactor_PULSR_1504

# Conflicts:
#	apollo-router/src/plugins/connectors/handle_responses.rs
#	apollo-router/src/services/layers/persisted_queries/mod.rs
@timbotnik timbotnik requested a review from BrynCooke June 26, 2025 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants