Fix several hot reload issues with subscriptions #7746

Draft
wants to merge 6 commits into
base: dev

goto-bus-stop
Member

@goto-bus-stop goto-bus-stop commented Jun 24, 2025

When a hot reload is triggered by a configuration change, the router attempted to apply updated configuration to open subscriptions. But this could cause excessive logging.

When a hot reload is triggered by a schema change, the router closed subscriptions with a SUBSCRIPTION_SCHEMA_RELOAD error. But this happened before the new schema was fully active and warmed up, so clients could reconnect to the old schema.

To fix these issues, a configuration and a schema change now have the same behavior. The router waits for the new configuration and schema to be active, and then closes all subscriptions with a SUBSCRIPTION_SCHEMA_RELOAD error, so clients can reconnect.
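The ordering described above can be sketched in a few lines of Rust. This is a toy model under stated assumptions, not the router's actual code: `Router`, `Subscription`, `ReloadCause`, `activate_new_pipeline` and `hot_reload` are hypothetical names invented for illustration, and a simple generation counter stands in for "the new configuration and schema are active". Only the `SUBSCRIPTION_SCHEMA_RELOAD` code comes from this PR.

```rust
// Hypothetical model; none of these types are the router's real ones.
const SUBSCRIPTION_SCHEMA_RELOAD: &str = "SUBSCRIPTION_SCHEMA_RELOAD";

/// What triggered the hot reload. Both causes are handled identically.
#[derive(Debug, Clone, Copy)]
enum ReloadCause {
    Configuration,
    Schema,
}

/// A subscription remembers which pipeline generation it was opened on.
#[derive(Debug)]
struct Subscription {
    generation: u64,
    closed_with: Option<&'static str>,
}

#[derive(Debug, Default)]
struct Router {
    active_generation: u64,
    subscriptions: Vec<Subscription>,
}

impl Router {
    /// Open a subscription against whichever pipeline is currently active.
    fn subscribe(&mut self) -> usize {
        self.subscriptions.push(Subscription {
            generation: self.active_generation,
            closed_with: None,
        });
        self.subscriptions.len() - 1
    }

    /// Stand-in for building, warming up and swapping in the new pipeline.
    fn activate_new_pipeline(&mut self) {
        self.active_generation += 1;
    }

    fn hot_reload(&mut self, _cause: ReloadCause) {
        // Step 1: the new configuration and schema become active first.
        self.activate_new_pipeline();
        // Step 2: only then are subscriptions from older generations closed,
        // always with the same extension code regardless of the cause.
        let active = self.active_generation;
        for sub in self.subscriptions.iter_mut() {
            if sub.generation < active && sub.closed_with.is_none() {
                sub.closed_with = Some(SUBSCRIPTION_SCHEMA_RELOAD);
            }
        }
    }
}

fn main() {
    let mut router = Router::default();
    let old = router.subscribe();
    router.hot_reload(ReloadCause::Configuration);
    // A client that reconnects after the close lands on the new generation.
    let new = router.subscribe();
    println!("{:?}", router.subscriptions[old]);
    println!("{:?}", router.subscriptions[new]);
}
```

Because the close happens after activation in this model, a client that reconnects immediately cannot land on the old generation, which is the property the change is aiming for.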

Drafted pending a test that verifies that the old behaviour is wrong and the new behaviour is correct. No strong indication yet that this is causing serious problems today, but it should be an improvement.


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • PR description explains the motivation for the change and relevant context for reviewing
  • PR description links appropriate GitHub/Jira tickets (creating when necessary)
  • Changeset is included for user-facing changes
  • Changes are compatible [1]
  • Documentation [2] completed
  • Performance impact assessed and acceptable
  • Metrics and logs are added [3] and documented
  • Tests added and passing [4]
    • Unit tests
    • Integration tests
    • Manual tests, as necessary

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. A lot of (if not most) features benefit from built-in observability and debug-level logs. Please read this guidance on metrics best-practices.

  4. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

…iptions

Otherwise:
- Configuration applies to subscriptions *before* it applies to new
  requests
- Subscriptions may be closed *before* a new schema is warmed up, and
  reopen while the old pipeline is still active
@goto-bus-stop goto-bus-stop added the backport-1.x Backport this PR to 1.x label Jun 24, 2025

Signed-off-by: Benjamin <5719034+bnjjj@users.noreply.github.com>
@apollo-librarian

apollo-librarian bot commented Jun 25, 2025

✅ Docs preview ready

The preview is ready to be viewed.

File Changes

0 new, 27 changed, 1 removed
* (developer-tools)/apollo-mcp-server/(latest)/best-practices.mdx
* (developer-tools)/apollo-mcp-server/(latest)/command-reference.mdx
* (developer-tools)/apollo-mcp-server/(latest)/guides/index.mdx
* graphos/routing/(latest)/configuration/cli.mdx
* graphos/routing/(latest)/customization/coprocessor/index.mdx
* graphos/routing/(latest)/customization/coprocessor.mdx
* graphos/routing/(latest)/observability/telemetry/instrumentation/instruments.mdx
* graphos/routing/(latest)/observability/telemetry/instrumentation/events.mdx
* graphos/routing/(latest)/observability/index.mdx
* graphos/routing/(latest)/operations/subscriptions/overview.mdx
* graphos/routing/(latest)/operations/subscriptions/configuration.mdx
* graphos/routing/(latest)/operations/subscriptions/api-gateway.mdx
* graphos/routing/(latest)/performance/caching/distributed.mdx
* graphos/routing/(latest)/performance/caching/entity.mdx
* graphos/routing/(latest)/performance/caching/index.mdx
* graphos/routing/(latest)/performance/query-batching.mdx
* graphos/routing/(latest)/security/demand-control.mdx
* graphos/routing/(latest)/security/jwt.mdx
* graphos/routing/(latest)/security/authorization.mdx
* graphos/routing/(latest)/security/authorization-overview.mdx
* graphos/routing/(latest)/security/persisted-queries.mdx
* graphos/routing/(latest)/security/request-limits.mdx
* graphos/routing/(latest)/security/router-authentication.mdx
* graphos/routing/(latest)/self-hosted/containerization/docker.mdx
* graphos/routing/(latest)/self-hosted/containerization/index.mdx
* graphos/routing/(latest)/self-hosted/index.mdx
* graphos/routing/(latest)/_sidebar.yaml
- graphos/routing/(latest)/self-hosted/containerization/docker-router-only.mdx

Build ID: 440fee943fa08dd92c48c43f

URL: https://www.apollographql.com/docs/deploy-preview/440fee943fa08dd92c48c43f

@goto-bus-stop goto-bus-stop changed the title Wait for the router to complete a reload before notifying open subscriptions Fix memory spikes and ordering with subscriptions and hot reload Jun 25, 2025
@goto-bus-stop goto-bus-stop changed the title Fix memory spikes and ordering with subscriptions and hot reload Fix hot reload issues with subscriptions Jun 25, 2025
@goto-bus-stop goto-bus-stop changed the title Fix hot reload issues with subscriptions Fix several hot reload issues with subscriptions Jun 25, 2025
fetch_service_factory,
};
}
Some(_new_configuration) = configuration_updated_rx.next() => {
Member Author

@goto-bus-stop goto-bus-stop Jun 25, 2025

@bnjjj It seems that we're now handling these in the same way. It's not useful to clients to know whether a subscription was closed due to a schema or a configuration change (and maybe even desirable that they don't know). Should we just have a single update channel for both updates, and use the SUBSCRIPTION_SCHEMA_RELOAD extension code for both?

Contributor

Yes, I think we should, but let's do this in a follow-up PR if you agree. I just want to minimize the impact if possible.

Member Author

Sure, but clients might treat SUBSCRIPTION_SCHEMA_RELOAD specially, so I think we should at least use the same extension code for both to get the correct behaviour.

Member Author

I've made that change. I'm fine with keeping the notification system intact and just having that duplicate code for the time being.
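To illustrate why a shared extension code matters to clients, here is a minimal sketch of a client that resubscribes only when it sees `SUBSCRIPTION_SCHEMA_RELOAD`. The client policy, the `ClientAction` enum, the `on_subscription_error` function and the `"SOME_OTHER_CODE"` placeholder are all assumptions made for illustration. They do not describe any particular client library or the router's API.

```rust
/// Extension code named in this thread.
const SUBSCRIPTION_SCHEMA_RELOAD: &str = "SUBSCRIPTION_SCHEMA_RELOAD";

/// Hypothetical client-side decision after a subscription ends with an error.
#[derive(Debug, PartialEq)]
enum ClientAction {
    Resubscribe,
    SurfaceError,
}

/// Assumed client policy: only the reload code triggers an automatic resubscribe.
/// `extension_code` is the value of `extensions.code` on the error, if present.
fn on_subscription_error(extension_code: Option<&str>) -> ClientAction {
    match extension_code {
        Some(code) if code == SUBSCRIPTION_SCHEMA_RELOAD => ClientAction::Resubscribe,
        _ => ClientAction::SurfaceError,
    }
}

fn main() {
    // Reload-triggered close: the client reconnects transparently.
    println!("{:?}", on_subscription_error(Some(SUBSCRIPTION_SCHEMA_RELOAD)));
    // Any other code (placeholder value) is shown to the user as a failure.
    println!("{:?}", on_subscription_error(Some("SOME_OTHER_CODE")));
    println!("{:?}", on_subscription_error(None));
}
```

Under this assumed policy, a configuration reload that used a different code would surface as an error rather than a silent reconnect, which is the concern raised in the thread.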

Labels
backport-1.x Backport this PR to 1.x