fix: use grpc load-balancing when connecting to trustd #3069

Merged (1 commit) on Feb 2, 2021

smira (Member) commented Jan 29, 2021

Instead of our homegrown "try all the endpoints" method,
use gRPC load-balancing across the configured endpoints.

Generalize the gRPC-resolver-based load balancer we had in the Talos API
client and use it in the remote certificate generator code. The generalized
resolver still lives under machinery/, because pkg/grpc is not in machinery/
and code in machinery/ can't depend on the rest of the Talos code.

Related to: #3068

A full fix for #3068 requires dynamic updates to the control plane endpoints
while apid is running; that is coming in the next PR.

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
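
To illustrate the idea of rotating across configured endpoints rather than trying each in turn on every call, here is a minimal, standard-library-only sketch in Go. It is a conceptual model, not the code from this PR and not the gRPC API: the `roundRobin` type, its methods, and the endpoint addresses are all hypothetical. In grpc-go the rotation is handled by a resolver that supplies the address list together with the `round_robin` load-balancing policy.

```go
package main

import (
	"fmt"
	"sync"
)

// roundRobin is a hypothetical, simplified stand-in for a round-robin
// balancer: it hands out the configured endpoints in rotation.
type roundRobin struct {
	mu        sync.Mutex
	endpoints []string
	next      int
}

// newRoundRobin copies the endpoint list so later changes by the caller
// do not affect the picker.
func newRoundRobin(endpoints []string) *roundRobin {
	return &roundRobin{endpoints: append([]string(nil), endpoints...)}
}

// Pick returns the next endpoint in rotation, or false if none are configured.
func (r *roundRobin) Pick() (string, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()

	if len(r.endpoints) == 0 {
		return "", false
	}

	ep := r.endpoints[r.next%len(r.endpoints)]
	r.next = (r.next + 1) % len(r.endpoints)

	return ep, true
}

func main() {
	// Addresses are made up for illustration only.
	rr := newRoundRobin([]string{"10.0.0.1:50001", "10.0.0.2:50001", "10.0.0.3:50001"})

	for i := 0; i < 4; i++ {
		ep, _ := rr.Pick()
		fmt.Println(ep) // the fourth pick wraps around to the first endpoint
	}
}
```

Successive calls spread across all endpoints, so a single unreachable control plane node does not sit in front of every request the way it can with a fixed try-in-order loop.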

smira added this to the v0.9 milestone Jan 29, 2021
smira (Member, Author) commented Jan 29, 2021

/approve

AlekSi (Contributor) commented Jan 30, 2021

There is a typo: "locad" instead of "load"

smira changed the title from "fix: use grpc locad-balancing when connecting to trustd" to "fix: use grpc load-balancing when connecting to trustd" Jan 30, 2021
andrewrynhard (Member) commented

/rebase

smira (Member, Author) commented Feb 1, 2021

/rebase

andrewrynhard (Member) commented

/rebase

andrewrynhard (Member) commented

/rebase

andrewrynhard (Member) left a comment

/lgtm

andrewrynhard (Member) commented

/LGTM

andrewrynhard (Member) commented

/lgtm

talos-bot merged commit 389349c into siderolabs:master Feb 2, 2021
smira added a commit to smira/talos that referenced this pull request Feb 3, 2021
This moves endpoint refresh from the context of the `apid` service in
`machined` into the `apid` service itself for the workers. `apid` does
an initial poll for the endpoints when it boots, and also periodically
polls for new endpoints to make sure it has an accurate list of `trustd`
endpoints to talk to. This handles cases where control plane endpoints
change (e.g. a rolling replacement of control plane nodes with new IPs).

Related to siderolabs#3069

Fixes siderolabs#3068

Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
talos-bot pushed a commit that referenced this pull request Feb 3, 2021