Decide on service discovery strategy for Boulder #5307

Closed
jsha opened this issue Feb 24, 2021 · 1 comment

@jsha
Contributor

jsha commented Feb 24, 2021

Right now, Boulder discovers its gRPC backends via DNS, using gRPC's default "dns" resolver. This looks up a hostname, and sends traffic to all of the resulting IP addresses. Boulder's VA does not have gRPC backends, but has DNS backends. Right now these are statically configured via JSON and never updated except on restart.

We'd like to dynamically discover DNS backends for VA (for instance, to allow autoscaling). Also, we'd like to update our gRPC service discovery. We've noticed that gRPC never updates its backends after starting up.

According to grpc/grpc-go#1663 (comment), gRPC re-resolves on connection errors, and we can intentionally trigger periodic connection errors by setting keepalive.MaxConnectionAge. This should be non-disruptive since connections are closed cleanly, with extant RPCs given time to complete. Also relevant: grpc/grpc#12295

If that works, we can stick with DNS-based service discovery for gRPC. For consistency, we would probably also want to use DNS-based service discovery for DNS backends.

For autoscaling our remote VA DNS backends, we can use AWS Cloud Map. It supports DNS-based discovery. For on-prem, we would use our current DNS setup, which doesn't have any concept of auto-scaling. Using DNS for both would allow us to use the same code for on-prem and cloud.

Related: #5306

@jsha jsha added this to the Sprint 2021-02-23 milestone Feb 24, 2021
@jsha jsha self-assigned this Feb 24, 2021
@jsha jsha added the layer/rpc label Feb 25, 2021
jsha added a commit that referenced this issue Mar 2, 2021
This allows servers to tell clients to go away after some period of time, which triggers the clients to re-resolve DNS.

Per grpc/grpc#12295, this is the preferred way to do this.

Related: #5307.
beautifulentropy pushed a commit that referenced this issue Mar 12, 2021
This allows servers to tell clients to go away after some period of time, which triggers the clients to re-resolve DNS.

Per grpc/grpc#12295, this is the preferred way to do this.

Related: #5307.
@jsha
Contributor Author

jsha commented May 11, 2021

We decided to use DNS with SRV records for discovering our Unbound backends, and to continue to use DNS with A records to discover our gRPC backends. For gRPC we're using MaxConnectionAge to discover changes faster.

@jsha jsha closed this as completed May 11, 2021