Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow the allocator service to use the go http/2 server #2263

Closed
roberthbailey opened this issue Sep 14, 2021 · 33 comments · Fixed by #2272
Closed

Allow the allocator service to use the go http/2 server #2263

roberthbailey opened this issue Sep 14, 2021 · 33 comments · Fixed by #2272
Labels
kind/feature New features for Agones
Milestone

Comments

@roberthbailey
Copy link
Member

roberthbailey commented Sep 14, 2021

Is your feature request related to a problem? Please describe.
#1902 added support for REST to the allocator service, with a neat trick to reuse the same port for both gRPC and REST using a mux. However, in doing so, the allocator service switched from using grpc.(*Server).Serve to grpc.(*Server).ServeHTTP, the latter of which is experimental and has noticeably worse performance.

There are certainly cases where performance isn't an issue and it's convenient to run the protocols on the same port (only one exposed port / firewall rule).

However, we have a use case where we only need to run gRPC and the experimental implementation used by grpc.(*Server).ServeHTTP is not compatible with the gRPC client (whereas the implementation in grpc.(*Server).Serve works as expected).

Describe the solution you'd like
I'd like to add a flag to the allocator service to restore the old behavior: if set to true, the flag would disable the mux (and therefore the REST API) and use grpc.(*Server).Serve. The flag would default to false, keeping the current behavior for compatibility.

Describe alternatives you've considered
I've looked into https://github.com/soheilhy/cmux as a replacement mux that might still allow us to use grpc.(*Server).Serve. It seems a bit complicated to set up, and I'm not sure how well it would work with our setup with client certs. So I'm opting for the simpler solution and we can revisit using cmux later if it makes sense.

@roberthbailey roberthbailey added the kind/feature New features for Agones label Sep 14, 2021
@roberthbailey
Copy link
Member Author

roberthbailey commented Sep 14, 2021

@markmandel
Copy link
Member

The only thought I had:

Rather than having an on/off switch - maybe we should make two options to configure the ports for the gRPC and REST service:

So:

  1. If the ports are the same (default), then things are the same as now
  2. If the ports are different, then you there is no muxing, and is served on different ports (maybe on different services)

Looking at our config settings, I think we actually could align on this nicely, since we have:

  • agones.allocator.http.port
  • agones.allocator.http.serviceType
  • agones.allocator.http.loadBalancerIP
  • agones.allocator.http.loadBalancerSourceRanges
    etc

We could do the same thing, but as agones.allocator.grpc.

Probably some details to work out, but thought I would float the idea, see if it would stick. WDYT?

@roberthbailey
Copy link
Member Author

The only difference is if someone wants to only install one handler (as in our case) it would be nice to have a way to also enable / disable the http (or grpc) service entirely. In particular, if I don't want to use http at all, it's unnecessary to create a second load balancer for it.

@roberthbailey
Copy link
Member Author

Here is the relevant block from the helm configuration:

  allocator:
    install: true
    ...
    http:
      port: 443
      serviceType: LoadBalancer
      loadBalancerIP: ""
      loadBalancerSourceRanges: []
      annotations: {}
    disableSecretCreation: false
    generateTLS: true
    generateClientTLS: true
    disableMTLS: false
    disableTLS: false

@roberthbailey
Copy link
Member Author

There are a few fields outside of the http block that affect how the service endpoint (which is muxed right now) is configured. Since we would allow http / grpc to run on the same port, I think that means we would need to keep the settings (disabling TLS / mTLS) independent of the protocol block (as we do today). If the need arises, we could later move them into the protocol blocks but for that that just makes things more complicated without any real use case.

It would be easy to add an install: true for both http and grpc to enable / disable the loadbalancer (right now you could set the serviceType to ClusterIP which means the ELB isn't created but there is still a k8s service created) which addresses my other concern.

@markmandel
Copy link
Member

markmandel commented Sep 16, 2021

It would be easy to add an install: true for both http and grpc to enable / disable the loadbalancer (right now you could set the serviceType to ClusterIP which means the ELB isn't created but there is still a k8s service created) which addresses my other concern.

agones.allocator.http.serviceType - If the user sets this to "None" then no Service is created ? Just trying to keep option sprawl at bay as much as possible.

(this would be instead of a install: true/false option)

WDYT?

@ilkercelikyilmaz
Copy link
Contributor

I agree with Robert that we should not create 2 LBs default. Hopefully many users will choose/use one and we keep the option of mux/cmux where we allow using the same port.

Also I think it is OK to have mTLS & TLS disabled/enabled for both REST & gRPC.

Regarding Mark's comment : #2263 (comment), I am OK as long as we can configure the Allocator service to run without non-http proxy but real/proper grpc service.

@roberthbailey
Copy link
Member Author

It looks like we can create a single service that uses two ports for http / grpc. https://kubernetes.io/docs/concepts/services-networking/service/ says:

As many Services need to expose more than one port, Kubernetes supports multiple port definitions on a Service object. > Each port definition can have the same protocol, or a different one.

@markmandel - https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types makes it sound like "none" isn't an option (why would you create a service that doesn't have a type - then you should just not create the service). Are you suggesting that we use None as a sentinel value to skip that part of the helm chart?

@markmandel
Copy link
Member

Are you suggesting that we use None as a sentinel value to skip that part of the helm chart?

Yes, exactly 👍🏻

We could do the same with an install boolean, but I figured it might be more concise to add our own value to serviceType instead of a whole extra new property.

It looks like we can create a single service that uses two ports for http / grpc.

Ahh, that sounds perfect then.

@roberthbailey
Copy link
Member Author

Assuming that I'm reading the documentation correctly and we can use a single k8s service, I think we might want something like this (but this moves around helm values which is a breaking change for anyone who is customizing helm):

  allocator:
    install: true
    ...
    service:
      httpPort: 443
      grpcPort: 443 
      serviceType: LoadBalancer
      loadBalancerIP: ""
      loadBalancerSourceRanges: []
      annotations: {}
    disableSecretCreation: false
    generateTLS: true
    generateClientTLS: true
    disableMTLS: false
    disableTLS: false

The only thing missing is if we want a way to disable a listener entirely. Maybe we could use 0 to do that?

@roberthbailey
Copy link
Member Author

With this organization, I don't think there is a need to disable the service. If you install the allocator but have no way to reach it, then why install the pod in the first place?

But, it would be useful to be able to enable/disable http / grpc separately (both in the service definition and in the allocator pod). It'll be a bit tricky to write the helm for this, since it'll also need to check to see if the ports match (and then only add one port to the service) and otherwise add two, but it means that in all cases we will only get a single service / ELB created.

@ilkercelikyilmaz
Copy link
Contributor

 httpPort: 443
 grpcPort: 443 

This will mean that Agones will use muc/cmux?
and similarly only grpcPort: 443 mean that we will use regular grpc.serve()

@roberthbailey
Copy link
Member Author

Btw, IMO the new values will make more sense:

allocator.http.loadBalancerIP -> allocator.service.loadBalancerIP and
allocator.http.annotations -> allocation.service.annotations

The old names were a bit confusing to me (especially the annotations one) and I think this would clear that up.

@ilkercelikyilmaz
Copy link
Contributor

+1

@roberthbailey
Copy link
Member Author

roberthbailey commented Sep 16, 2021

Here's what I would expect:

Default (existing behavior).

  httpPort: 443
  grpcPort: 443

This will mean that Agones will use the existing mux (or cmux) and that the service will have a single port on the LB.

Both services enabled, two different ports

  httpPort: 443
  grpcPort: 8443

Agones will use grpc.(*Server).Serve for the grpc port and there will be a single service that has two ports exposed on the LB.

Only grpc (my feature request)

  httpPort: 0
  grpcPort: 443

Agones will use grpc.(*Server).Serve for the grpc port and the service will have a single port exposed on the LB. The REST endpoint will not be reachable. The same would be possible for only REST.

@roberthbailey
Copy link
Member Author

I suppose that instead of setting the port to 0 to disable we could have unspecified be disabling. Do you have an opinion on one vs. the other (for the helm configuration)?

@ilkercelikyilmaz
Copy link
Contributor

ilkercelikyilmaz commented Sep 16, 2021

we could have unspecified be disabling

I like this one but I am OK with 0 as well (which is not that intuitive)

@markmandel
Copy link
Member

I suppose that instead of setting the port to 0 to disable we could have unspecified be disabling. Do you have an opinion on one vs. the other (for the helm configuration)?

The only objection I have to "0" being disabled, is 0 is nomenclature for an ephemeral port:
https://stackoverflow.com/questions/5895751/a-bind-with-a-port-of-zero-will-bind-you-to-a-free-port-is-this-portable

unspecified be disabling

Are you saying that by default it would be disabled? Or I would have to explicitly set it the value to ""

@roberthbailey
Copy link
Member Author

unspecified would be disabled, e.g.

  allocator:
    install: true
    ...
    service:
      grpcPort: 443 
      serviceType: LoadBalancer
      loadBalancerIP: ""
      loadBalancerSourceRanges: []
      annotations: {}
    disableMTLS: false
    disableTLS: false

@markmandel
Copy link
Member

Sorry, not quite following.

I'm assuming that both the gRPC and Http ports would be always be specified, because they will retain the default values of 443 for both ports to maintain backward compatibility?

So I'm not sure how could they be unspecified? (Or I'm missing something obvious, which is possible.)

@roberthbailey
Copy link
Member Author

We would specify them in the helm chart for compatibility (which will also put them into install.yaml). But if you removed them from the helm chart, then we could generate configs without those pieces.

@roberthbailey
Copy link
Member Author

Also, if the ports aren't specified in a flag to the allocator service pod, it would also default to having them as they are now, which would also ensure compatibility (if anyone is not using the helm chart).

@markmandel
Copy link
Member

But if you removed them from the helm chart, then we could generate configs without those pieces.

How would you remove them, without them being set to the default value?

@roberthbailey
Copy link
Member Author

How would you remove them, without them being set to the default value?

Good point.... I'll tinker with helm a little and see what is possible. WDYT about using 0 to mean unspecified? I understand that it means ephemeral port in some situations, but I don't think it is a valid value for a LB port (but in a quick search I couldn't find anything to confirm).

@roberthbailey
Copy link
Member Author

After a bit of tinkering, it looks like running helm with --set agones.allocator.service.grpcPort="" does what I was looking for. Also, commenting out the field in values.yaml works, which makes me think that using a custom values.yaml file (without the field) as described at the bottom of the configuration section on the helm install page would do the trick as well.

@markmandel
Copy link
Member

which makes me think that using a custom values.yaml file (without the field) as described at the bottom of the configuration section on the helm install page would do the trick as well.

I would test this. I expect it will mean that helm will use the default value if not specified. From my understanding, Helm doesn't force you to declare all parameters in a values.yaml that is passed in on the command line. I think you would have to explicitly set the value, and have it be an explicitly blank value.

Possible other idea: Set the port to -1 (or any video that's less than zero?)?

Or if this is too confusing, should we should break this down more?

  allocator:
    install: true
    # ...
    service:
      http:
        enabled: false
        port: 443
      grpc:
        enabled: true
        port: 443
   # ... 

Thoughts?

@rcreasey
Copy link
Collaborator

  allocator:
    install: true
    # ...
    service:
      http:
        enabled: false
        port: 443
      grpc:
        enabled: true
        port: 443
   # ... 

Thoughts?

FWIW, I'd 👍 having finer granularity on these kinds of configuration. I would rather have production defaults specified in the values.yaml and minimize the number of default helm calls inside of a template doing some kind of inference logic.

@cindy52
Copy link
Collaborator

cindy52 commented Sep 20, 2021

which makes me think that using a custom values.yaml file (without the field) as described at the bottom of the configuration section on the helm install page would do the trick as well.

I would test this. I expect it will mean that helm will use the default value if not specified. From my understanding, Helm doesn't force you to declare all parameters in a values.yaml that is passed in on the command line. I think you would have to explicitly set the value, and have it be an explicitly blank value.

Possible other idea: Set the port to -1 (or any video that's less than zero?)?

Or if this is too confusing, should we should break this down more?

  allocator:
    install: true
    # ...
    service:
      http:
        enabled: false
        port: 443
      grpc:
        enabled: true
        port: 443
   # ... 

Thoughts?

Only one thing concerns me with this approach, we're not able to disable grpc and only enable http right? Also with both grpc and http enabled, the ports need to stay the same. It might not be intuitive and requires more documents to delimitate it.

@roberthbailey
Copy link
Member Author

Or if this is too confusing, should we should break this down more?

  allocator:
    install: true
    # ...
    service:
      http:
        enabled: false
        port: 443
      grpc:
        enabled: true
        port: 443
   # ... 

Thoughts?

One thing I noticed while editing the helm configs is that we currently allow the externally exposed port to be configured (defaulting to 443) but we do not allow the target port to be configured (set to 8443).

If we want the services on separate ports, we need to separate both the externally configured port and the target port to be distinct. If we want it to run as it does today we want the ports to be the same. But if you allow all 4 ports in the config, you can also get degenerate configurations (like the two external ports are the same but the target ports are different or the two external ports are different but the target ports are the same).

How valuable is it to be able to set all of these ports vs. just having a toggle for http and grpc on/off (as originally proposed) and keeping the ports as they are today)?

@roberthbailey
Copy link
Member Author

I've prototyped the change in #2272 (it still needs some documentation, but I don't want to write that until we agree on the structure of the helm changes).

I went with

  allocator:
    install: true
    ...
    service:
      name: agones-allocator
      serviceType: LoadBalancer
      loadBalancerIP: ""
      loadBalancerSourceRanges: []
      annotations: {}
      http:
        enabled: true
        port: 443
        targetPort: 8443
      grpc:
        enabled: true
        port: 443
        targetPort: 8443

because it seemed easiest / clearest having a boolean toggle instead of relying on the existence / non-existence of a field and/or using negative values in the helm configs (I still used negative values in the flags in the code to indicate not set). It also meant that I didn't have to use default values in the helm template, to @rcreasey's point above.

Let me know what you think.

@roberthbailey
Copy link
Member Author

Only one thing concerns me with this approach, we're not able to disable grpc and only enable http right? Also with both grpc and http enabled, the ports need to stay the same. It might not be intuitive and requires more documents to delimitate it.

@cindy52 - the way I implemented allows you to enable grpc, rest, or both. If you do both then they can be on the same port (using the mux) or on two different ports (so you still get grpc.(*Server).Serve for the grpc port):

  • If you set both to the same port/targetport then you get one externalized port with the existing mux in the allocator service (this is the default and is the same as what we've had since Agones 1.11.0)
  • If you use two ports then the k8s service has one IP with two different ports mapping to two different target ports
  • If you only enable one or the other you get one externalized port that will only accept one request type

This will require some documentation for some edge cases:

  • If you disable both http and grpc but don't disable the allocator service the pod will just crash loop with no ports to listen on
  • If you set the target port to a low numbered port (e.g. 443) then the pod won't be able to bind to the port to listen since it doesn't run as a privileged user
  • If you set the ports the same and the target ports different I'm not sure what happens
  • If you set the ports different and the target ports the same I'm not sure what happens

@roberthbailey
Copy link
Member Author

@cindy52 has verified that my change works for our use case and I'd like to get this merged prior to the 1.18.0 release. Does anyone have any further comments on the helm field layout?

@markmandel
Copy link
Member

@cindy52 has verified that my change works for our use case and I'd like to get this merged prior to the 1.18.0 release. Does anyone have any further comments on the helm field layout?

LGTM!

@roberthbailey roberthbailey added this to the 1.18.0 milestone Oct 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature New features for Agones
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants