release-21.2: ui, server: db console requests can proxy to any node #76694

Merged 1 commit on Feb 25, 2022

dhartunian (Collaborator)

Backport 1/1 commits from #72659.

/cc @cockroachdb/release

Note: This PR differs slightly from what was merged on master because of the refactor of the HTTP server code in #75724.


Previously, a DB Console connection to a node would only communicate
with that specific node. Certain requests could fan out to the rest of
the cluster if they were configured to, but otherwise, a user who wanted
to connect to another node needed that node's URL and an open HTTP port.
This was challenging in certain on-prem
configurations where not all nodes are easily reachable over the network,
or where an HTTP load balancer proxies connections to
the nodes and does not offer routing to a specific node in the cluster.

This change introduces a feature that lets any CRDB node proxy HTTP
connections to any other node, so DB Console requests can reach
arbitrary nodes as long as a connection to a single node in the cluster
is available. This enables more rapid troubleshooting and response
in the event of a cluster issue.

The feature works by inspecting an HTTP cookie named `remote_node_id` in
each request. If the cookie is present and contains a value matching a
nodeID in the cluster, the request is routed to that node. This enables
comprehensive proxying of all HTTP requests as long as the cookie is
attached.

In addition, proxying can be triggered via a query param with the
same name as the cookie (`remote_node_id`) to override the desired nodeID
for that request.

In order to proxy to a node's HTTP port, an additional field has been
added to the `NodeDescriptor` proto: `http_address`, which holds the
`HTTPAdvertiseAddr` configuration field.

If the request encounters an error during parsing or proxy setup,
the response will include a header that clears the `remote_node_id` cookie
to prevent a browser from getting stuck repeating an invalid request.

The DB Console has been updated with some simple UI around this feature
in the top right corner of the Advanced Debug page. A dropdown now
lets the operator choose the nodeID to connect to, and once the
proxying cookie is set, the UI changes to show a "Reset" button
that removes the proxying behavior. This makes the cookie visible in situations where
it would otherwise be difficult to "discover", for example if it happens to
be set without the operator's knowledge.


Resolves #73285

Release note (ui change): DB Console requests can be routed to arbitrary
nodes in the cluster. Users can select a node from a dropdown in the
Advanced Debug page of the DB Console UI to route their UI to that node.
Manually initiated requests can either add a `remote_node_id` query param
to the request or set a `remote_node_id` HTTP cookie in order to
manage the routing of the request.

Release note (ops change): Operators who wish to access HTTP endpoints
of the cluster through a proxy can now request specific nodeIDs through
a `remote_node_id` query param or cookie with the value set to the
nodeID they would like to proxy the connection to.
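For manually initiated requests, a client could attach the query param like this. The helper name `newRoutedRequest` and the URLs in the usage note are illustrative assumptions; only the `remote_node_id` parameter name comes from this PR.

```go
package main

import (
	"net/http"
	"net/url"
	"strconv"
)

// newRoutedRequest is a hypothetical helper that builds a GET request for
// rawURL and adds a remote_node_id query param so the receiving node can
// proxy the request to the given node.
func newRoutedRequest(rawURL string, nodeID int) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	// Preserve any existing query params and set (or replace) ours.
	q := u.Query()
	q.Set("remote_node_id", strconv.Itoa(nodeID))
	u.RawQuery = q.Encode()
	return http.NewRequest(http.MethodGet, u.String(), nil)
}
```

For example, `newRoutedRequest("https://lb.example.invalid:8080/_status/vars", 4)` would produce a request that asks whichever node the load balancer picks to forward it to node 4. Setting a `remote_node_id` cookie on the request with `req.AddCookie` is the alternative described above.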

@dhartunian dhartunian requested review from andreimatei and a team February 16, 2022 17:21
@dhartunian dhartunian requested a review from a team as a code owner February 16, 2022 17:21
blathers-crl bot commented Feb 16, 2022

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity
Member

This change is Reviewable

@rimadeodhar (Collaborator) left a comment

Reviewed 11 of 11 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
