New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Admin endpoint security #2763

Open
htuch opened this Issue Mar 8, 2018 · 13 comments

Comments

Projects
None yet
5 participants
@htuch
Copy link
Member

htuch commented Mar 8, 2018

The admin endpoint today is unsecured (no authentication or TLS), with the assumption that it is only available to localhost or accessible on a trusted network. Ideally:

  • We want to be able to restrict access to only trusted IPs, client certificates and ensure we have transport security.
  • We want to have some ability to distinguish roles and access to the admin console, i.e. distinct identities might be allowed to operate /quitquitquit vs. stats monitoring.

Beyond just security, there's also the question of what the admin console is. Is it just a curlable utility, an interactive web console or is it a first-class API intended for programatic use? Should it offer gRPC endpoints (in particular as we are moving towards a proto definition of its contents in places such as #2172). Answers to this affect the framing of security considerations.

Opening this issue to start the design discussion here.

@mattklein123

This comment has been minimized.

Copy link
Member

mattklein123 commented Mar 8, 2018

A few initial points are in the comment here: envoyproxy/data-plane-api#523 (comment)

@htuch

This comment has been minimized.

Copy link
Member

htuch commented Mar 9, 2018

One thing I wonder about is whether we should be offering a traditional web interface at all. Here's one alternative design; implement only gRPC or REST endpoints, have folks build out Javascript client side interfaces which can be served from a listener via Envoy's direct response (https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/route/route.proto#envoy-api-field-route-route-direct-response) mechanism. This removes Envoy from being in the business of worrying about XSS/CSRF/other web security concerns, while providing the convenience of browser admin capability.

@mattklein123

This comment has been minimized.

Copy link
Member

mattklein123 commented Mar 9, 2018

For reference the web stuff was only recently added by @jmarantz. I objected to this slightly at the time because realistically I think the only production use for the admin endpoint is either curl or other automated tools, but it didn't seem like a big deal to me to serve the landing page in HTML so I didn't worry about it that much.

My overall view of things right now is that the admin endpoint is not secure it all and must be accessed only over a trusted network. In the future I think we should do the following things:

  1. Generally provide only gRPC/REST endpoints codified in the data-plane-api repo (already tracked by various issues).
  2. Promote the admin endpoint to a fuller listener config allowing for both TLS/mTLS and inline RBAC security on a per-endpoint basis (once we have the inline RBAC filter). This will allow operators to configure things as they like it.
  3. Per @htuch we should consider having a "secure by default" admin config which locks admin down to localhost only and then requires operators to open up individual endpoints (.e.g., /stats) as they see fit.

In general I think that worrying about things like XSS/CSRF/etc. is kind of a waste of time for this. I will defer to @jmarantz who knows substantially more about this on what we should do on that front security-wise (hopefully optimizing for realistic usage scenarios).

@mattklein123

This comment has been minimized.

Copy link
Member

mattklein123 commented Mar 10, 2018

P.S., it would be great if someone in the community who is passionate about admin security might want to own this. There will be a non-trivial amount of work here to get to where we want to ultimately be.

@jmarantz

This comment has been minimized.

Copy link
Contributor

jmarantz commented Mar 10, 2018

One clarification: the http handlers for mutating operations were pre-existing. The change added a web home-page with proper escaping, and reduced XSS through proper http content types, and AFAIK added no additional exposure.

I agree that more restricted access by default would help.

@mattklein123

This comment has been minimized.

Copy link
Member

mattklein123 commented Mar 10, 2018

One clarification: the http handlers for mutating operations were pre-existing. The change added a web home-page with proper escaping, and reduced XSS through proper http content types, and AFAIK added no additional exposure.

Sorry I spoke incorrectly. Before your change we had no HTML. I know basically nothing about web security. My real point was that if the existence of HTML is going to cause security consternation, I don't think it's worth maintaining because IMHO the use of the HTML endpoints is not going to happen in production. Assuming there is no additional exposure, than it's fine by me. I just wanted to point out that we should be careful about the HTML stuff if that is going to cause us additional maintenance burden.

@ofek

This comment has been minimized.

Copy link
Contributor

ofek commented Mar 12, 2018

Additionally, I would strongly recommend Envoy have a separate listener/endpoint that only serves /stats with optional basic auth.

@ofek

This comment has been minimized.

Copy link
Contributor

ofek commented Mar 13, 2018

@DataDog is recommending https://gist.github.com/ofek/6051508cd0dfa98fc6c13153b647c6f8 until this is solved.

Idea courtesy of @ggreenway
Config courtesy of @bndw (with this modification from @htuch)

htuch added a commit to envoyproxy/data-plane-api that referenced this issue Mar 13, 2018

admin: add security warning (#534)
Fixes envoyproxy/envoy#2769
References envoyproxy/envoy#2763

Signed-off-by: Matt Klein <mklein@lyft.com>
@taiki45

This comment has been minimized.

Copy link
Member

taiki45 commented Mar 14, 2018

I vote to disabling web admin interface. Alternatively:

  • Move admin operations like /cpuprofiler or /logging to runtime configuration flags or simple Unix domain socket commands like haproxy one. This allows us to manage permissions via traditional file system permissions.
  • Promote pull-based endpoints like /stats to gRPC/REST API and let them have fuller listener config.

In additon, these features might be disabled by default. All of programatic use goes gRPC/REST API to supoprt user extendability, and rest of admin interface gets on FS permissions.

Personally, I like Envoy's curl-able interface so I prefer Unix domain socket commands which is similar to the current web admin interface. I have a little passion to move admin operations from web interface to Unix domain socket one.

@ofek

This comment has been minimized.

Copy link
Contributor

ofek commented Mar 14, 2018

What's the difference between a "pull-based endpoint" like /stats and a REST API version?

@jmarantz

This comment has been minimized.

Copy link
Contributor

jmarantz commented Mar 14, 2018

I think it makes sense to control access via configuration, including disabling the http admin interface completely or restricting it to an IP, with separate controls for read-access vs POSTed mutations.

@taiki45

This comment has been minimized.

Copy link
Member

taiki45 commented Mar 14, 2018

What's the difference between a "pull-based endpoint" like /stats and a REST API version?

The main difference that I thought is the API one will have a dedicated listener and will be properly schema controlled. For example, /stats already has a json format for programmatic access.

@taiki45

This comment has been minimized.

Copy link
Member

taiki45 commented Mar 14, 2018

It's not a strong objection but, to say source IP base restriction or local loopback binding, we want more detailed permission control in some deployment cases: developers can login a host in which Envoy runs but do not want to allow them to take admin operations of the Envoy instance, but want to allow only administrators to do that operations for easy debbuging.

jmarantz added a commit to jmarantz/data-plane-api that referenced this issue Apr 3, 2018

Document that admin mutations should be POSTs
envoyproxy/envoy#2971 adds warning-checks that mutations should be POSTed.  This documents that status.  In a future PR, mutations will fail if they are not POSTs.

See envoyproxy/envoy#2763 for more detail.

Also adds a doc stub for @jsedgewick to fill in for /runtime_modify.

htuch added a commit to envoyproxy/data-plane-api that referenced this issue Apr 4, 2018

doc that admin mutations should be posts (#602)
envoyproxy/envoy#2971 adds warning-checks that mutations should be POSTed. This documents that status. In a future PR, mutations will fail if they are not POSTs.

See envoyproxy/envoy#2763 for more detail.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment