Postgres filter: implement Postgres SSL termination and monitoring #10942

Closed
ahachete opened this issue Apr 24, 2020 · 15 comments · Fixed by #14634

Comments

@ahachete

Previous work

This issue elaborates on the general design of a Postgres filter proposed in #9107.

Background

Encrypting the communications with the database is a hard requirement in many environments. And while cryptography is currently very fast on modern hardware, it still imposes some penalty where it is executed. Particularly, establishing SSL connections is expensive.

RDBMSs like Postgres have a primary-replicas architecture, where the primary is the only instance that takes writes. Offloading SSL from the primary instance can help reduce its workload and leave more room for vertical scaling of these services. Similarly, some connection poolers frequently used in combination with Postgres, like PgBouncer, are single-threaded and may quickly saturate the CPU if they have to deal with frequent SSL connection establishment.

Those are good enough reasons to implement SSL termination in the Postgres filter for the Envoy proxy. It is also convenient because certificate management can be offloaded too, and potentially handled via Envoy APIs (and existing tools that leverage them), without having to change the Postgres configuration.

Moreover, the current version of the Postgres Envoy Filter implements several traffic inspection metrics that are useful for monitoring. But they (obviously) cannot peek into the SSL traffic, exposing these metrics only for unencrypted connections (as of today).

It might be argued that the CPU offloading advantage is not present in scenarios where Envoy is deployed as a sidecar within the same Pod as Postgres. While true, this does not negate the other advantages: monitoring metrics for the encrypted traffic, separation of concerns, and API-based management of certificates.

Goals

  • Implement Postgres SSL Termination at the existing Envoy Proxy’s Postgres filter.
  • Adapt the filter to expose the same metrics it exposes now for unencrypted traffic also for the encrypted traffic terminated at Envoy.
  • Provide SSL configuration capabilities both via the config file and the Envoy APIs.
  • Optional (or next phase refinements). Provide (equivalent) support to Postgres’ advanced SSL configuration capabilities, like ssl_ciphers, ssl_ecdh_curve or ssl_min_protocol_version, among several others.

Non-goals

  • Support encrypted communications from Envoy to the upstream Postgres server. It is assumed that this communication will happen unencrypted.
  • Support client authentication via SSL certificates (this would be a goal for a future issue).

Implementation notes

Envoy does support SSL termination at the TCP layer. However, Postgres SSL support does not happen at the TCP level, but rather at the application layer (Frontend/Backend Protocol). This raises the question of how much of the existing infrastructure can be reused. In any case, the same SSL library that Envoy uses, BoringSSL, will be used. It is a key requirement that TLS levels and general compatibility with Postgres SSL are appropriately tested.

The following diagram (note that the Postgres terminology of the FrontEnd-BackEnd protocol is used, instead of the usual Downstream/Upstream at Envoy) illustrates how the encrypted/unencrypted flows may work:

[Diagram: Diagram-PGClient_Envoy_PGServer-v1]

Note that the “fake response” happens after SSL Request, where we will “imitate” the backend sending back the byte “S” to the frontend.
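The exchange described above can be sketched in a few lines of Python. This is only an illustration of the wire format, not the filter's actual C++ code: the SSLRequest layout (an Int32 length of 8 followed by the Int32 code 80877103) comes from the Postgres Frontend/Backend Protocol, while the function names `is_ssl_request` and `proxy_reply` are made up for this example.

```python
import struct

# SSLRequest is 8 bytes: Int32 length (8) followed by the Int32 request
# code 80877103, which is 1234 in the high 16 bits and 5679 in the low 16 bits.
SSL_REQUEST_CODE = (1234 << 16) | 5679
SSL_REQUEST = struct.pack("!II", 8, SSL_REQUEST_CODE)


def is_ssl_request(data: bytes) -> bool:
    """Return True if the first 8 bytes form a Postgres SSLRequest."""
    if len(data) < 8:
        return False
    length, code = struct.unpack("!II", data[:8])
    return length == 8 and code == SSL_REQUEST_CODE


def proxy_reply(first_packet: bytes) -> bytes:
    """Hypothetical helper: the bytes the proxy writes back to the frontend.

    b"S" imitates the backend accepting SSL, after which the TLS handshake
    starts. An empty reply means the proxy answers nothing itself and the
    packet is simply forwarded upstream.
    """
    return b"S" if is_ssl_request(first_packet) else b""
```

The key point is that the "S" byte is sent in plain text, and only then does the connection switch to TLS, which is why plain TCP-level TLS termination cannot be used as is.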

Ideally, we should leverage as much as possible all the SSL infrastructure available already in Envoy, and not create different configuration files/keys/APIs. Here is an example of how Envoy TLS is currently configured:

tls_context:
  common_tls_context:
    tls_certificates:
      - certificate_chain:
          filename: "/etc/example-com.crt"
        private_key:
          filename: "/etc/example-com.key"

Limitations

  • Potentially, some SSL versions or encryption mechanisms may not work, and a reduced set of options may be exposed. This should be fine as Postgres clients negotiate with the server the mechanisms to use. There could be, potentially, some limitations with some specific clients, but it is not expected to be a relevant issue.

  • SCRAM authentication with channel binding may not be used when proxying through Envoy, as the Postgres server will not be running in SSL mode, and Postgres implementation of channel binding uses tls-server-end-point.
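To make the channel binding limitation more concrete, here is a rough Python sketch of how the tls-server-end-point binding value is derived, following my reading of RFC 5929: it is a hash of the DER-encoded server certificate, using the hash of the certificate's signature algorithm, except that MD5 and SHA-1 are replaced by SHA-256. The function name and the placeholder certificate bytes are assumptions for illustration only; a real client would extract the signature hash from the certificate itself.

```python
import hashlib


def tls_server_end_point(cert_der: bytes, signature_hash: str) -> bytes:
    """Compute the tls-server-end-point binding value (sketch, per RFC 5929).

    cert_der is the DER-encoded server certificate; signature_hash is the
    name of the hash used by the certificate's signature algorithm.
    """
    name = signature_hash.lower()
    # RFC 5929: certificates signed with MD5 or SHA-1 are hashed with SHA-256.
    if name in ("md5", "sha1"):
        name = "sha256"
    return hashlib.new(name, cert_der).digest()


# Placeholder bytes standing in for the certificate Envoy presents to the client.
envoy_cert_der = b"placeholder-for-envoy-certificate-der"
client_binding = tls_server_end_point(envoy_cert_der, "sha256")
```

The client would compute this value from the certificate presented by Envoy, but the Postgres server behind it is not running SSL and has no certificate to compute a matching value from, so the binding cannot be verified.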

References

@royantman

It would be nice if this could be used to let workloads behind Envoy talk to an SSL-enforced Postgres RDS instance, so that client code doesn't need to care about TLS. Currently this is impossible because of the "SSL request" and the expected "S" response.

@ahachete
Author

Hi @royantman. So if I understand it correctly, the use case you are suggesting is an unencrypted client -> Envoy connection and then an encrypted Envoy -> upstream Postgres connection? This scenario would certainly not be supported by the proposed design. But I'd like to understand the use case better, as:

  • Normally the Envoy->Postgres connection is equally or more trusted than the client -> Envoy, not the reverse.
  • This prevents the use case for using SSL certificates for authentication.
  • It kind of defeats the validation the client may do of the server's certificate, since the client will be SSL-unaware.

I guess implementing this would require some non-trivial additional effort, so I'd like to understand if there's a strong use case behind it. Thank you!

@lizan
Member

lizan commented Jun 3, 2020

FYI: #9577 was requested earlier to properly support STARTTLS, and there is a PoC for terminating STARTTLS.

@davidfetter

With utmost respect, I ask you to reconsider the first non-goal of supporting encrypted communication to the PostgreSQL server.

Terminating TLS in the hope that the onward network is free of attackers is pretty similar to not using TLS at all. Is there some way you could, say, make reconnecting with TLS an optional feature with opt-in so that you're not obligating people to choose soft chewy center as the price for using this system?

@ahachete
Author

I see very valid use cases for terminating SSL and having a plain text upstream connection, @davidfetter. For example, when using Envoy as a sidecar, and connecting to the upstream Postgres server via Unix Domain Sockets. Also offloading SSL certificate management to Envoy (and the management layers and software above) brings significant advantages (for example avoid Postgres restarts).

That doesn't prevent, however, other use cases from establishing a new SSL connection to upstream. In that case it is still beneficial to decode the protocol metrics, which the current version of the extension doesn't support (it handles only plain-text traffic).

@ghost

ghost commented Sep 24, 2020

Where can I find a PoC for this topic?

@fabriziomello
Contributor

Where can I find a PoC for this topic?

Not sure if it still works, but the PoC code is here: https://github.com/cpakulski/envoy/tree/issue/10942

@mjacobs

mjacobs commented Oct 16, 2020

I see very valid use cases for terminating SSL and having a plain text upstream connection, @davidfetter. For example, when using Envoy as a sidecar, and connecting to the upstream Postgres server via Unix Domain Sockets. Also offloading SSL certificate management to Envoy (and the management layers and software above) brings significant advantages (for example avoid Postgres restarts).

That doesn't prevent, however, other use cases from establishing a new SSL connection to upstream. In that case it is still beneficial to decode the protocol metrics, which the current version of the extension doesn't support (it handles only plain-text traffic).

+1 to these example use cases

lizan pushed a commit that referenced this issue Feb 11, 2021
Adds ability to use _starttls_ transport socket to terminate SSL at Envoy and pass unencrypted traffic upstream to Postgres server.

Additional Description:
Risk Level: Low
Testing: Added unit and integration tests.
Docs Changes: Yes.
Release Notes: Yes. 
Fixes #10942

Signed-off-by: Christoph Pakulski <christoph@tetrate.io>
Co-authored-by: Fabrízio de Royes Mello <fabrizio@ongres.com>

@caleblloyd
Contributor

I am successfully using this. It currently lets through both encrypted client sessions opened with sslmode=require and unencrypted client sessions opened with sslmode=disable. Unfortunately, this is the nature of STARTTLS.

Is there any way to set a network filter to enforce TLS and drop connections that don't send the initial SSLRequest packet?

Or would the Postgres filter need another option, such as require_ssl?
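A minimal sketch of what such enforcement could look like, written in Python rather than Envoy's C++. The `require_ssl` flag is the hypothetical option named above, not an existing filter setting, and the `Action` enum and `decide` function are invented for this example. Only the SSLRequest code (80877103) is taken from the Postgres protocol.

```python
import struct
from enum import Enum

SSL_REQUEST_CODE = (1234 << 16) | 5679  # 80877103, sent in an SSLRequest


class Action(Enum):
    START_TLS = "start_tls"                  # reply "S" and switch to TLS
    FORWARD_PLAINTEXT = "forward_plaintext"  # pass the session through as is
    CLOSE = "close"                          # drop the connection


def decide(first_packet: bytes, require_ssl: bool) -> Action:
    """Decide what to do with a new connection based on its first packet."""
    # A real filter would buffer until 8 bytes have arrived; this sketch
    # treats anything shorter as invalid.
    if len(first_packet) < 8:
        return Action.CLOSE
    _length, code = struct.unpack("!II", first_packet[:8])
    if code == SSL_REQUEST_CODE:
        return Action.START_TLS
    # Anything else (for example a plain StartupMessage) did not ask for SSL.
    return Action.CLOSE if require_ssl else Action.FORWARD_PLAINTEXT
```

With `require_ssl=False` this reproduces the behavior described above, where both sslmode=require and sslmode=disable sessions get through; with `require_ssl=True`, a session that starts with a plain StartupMessage would be dropped.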

@ahachete
Author

ahachete commented Feb 24, 2021

That's great news, @caleblloyd, that it is working well for you. Your question is quite relevant.

Normally, this is enforced at the PostgreSQL level by pg_hba.conf. In this case, however, it can't be done, as the upstream connection is (always) unencrypted, so PostgreSQL isn't aware of whether the client used SSL.

One option would be to implement at Envoy level some directive, as you say. To make it more general, it could be similar to PostgreSQL's host, hostssl and hostnossl values, which represent respectively "either SSL or not", "required SSL" and "required not SSL".

However, there are also ongoing discussions about going further and implementing a mechanism equivalent to PostgreSQL's HBA in Envoy, which would also allow accepting or rejecting connections at the Envoy level based on the user, database and/or source IP of the connection.
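To illustrate the idea, here is a rough Python sketch of HBA-style matching that combines the host, hostssl and hostnossl connection types with user, database and source IP checks. None of this exists in Envoy: the `Rule` fields and the `evaluate` function are hypothetical, and pg_hba.conf's authentication methods are reduced to a simple allow/reject flag. What it keeps from pg_hba.conf is that the first matching rule wins and that a connection matching no rule is rejected.

```python
import ipaddress
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rule:
    conn_type: str  # "host", "hostssl" or "hostnossl", as in pg_hba.conf
    database: str   # database name, or "all"
    user: str       # role name, or "all"
    cidr: str       # source network, e.g. "10.0.0.0/8"
    allow: bool     # simplification: True = accept, False = reject


def type_matches(conn_type: str, uses_ssl: bool) -> bool:
    """host = either SSL or not, hostssl = SSL only, hostnossl = non-SSL only."""
    if conn_type == "host":
        return True
    if conn_type == "hostssl":
        return uses_ssl
    if conn_type == "hostnossl":
        return not uses_ssl
    raise ValueError(f"unknown connection type: {conn_type}")


def evaluate(rules: Sequence[Rule], database: str, user: str,
             source_ip: str, uses_ssl: bool) -> bool:
    """Return True if the connection is accepted; the first matching rule wins."""
    addr = ipaddress.ip_address(source_ip)
    for rule in rules:
        if not type_matches(rule.conn_type, uses_ssl):
            continue
        if rule.database not in ("all", database):
            continue
        if rule.user not in ("all", user):
            continue
        if addr not in ipaddress.ip_network(rule.cidr):
            continue
        return rule.allow
    return False  # no rule matched: reject
```

For example, a first rule `Rule("hostnossl", "all", "all", "0.0.0.0/0", False)` would reject every unencrypted session, which covers the "required SSL" case as a special instance of the more general mechanism.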

@caleblloyd
Contributor

@ahachete thanks for your work on this, and thanks for the quick response!

I'd like to continue the discussion in the appropriate issues. Is there already an issue open for requiring SSL and/or the HBA?

@ahachete
Author

There's no specific issue created yet. I just raised this question on Envoy's Slack #envoy-postgres channel, to understand first if this should be related to Envoy's RBAC module, or rather a separate functionality of the current Postgres filter. Once this is discussed, I will open an issue with the approach to implement this.

@sandangel

sandangel commented Apr 24, 2022

Hi @caleblloyd, could you share your full config for the Postgres proxy with TLS? I'm quite new to Envoy. I followed the docs to add the postgres filter, but they also mention the starttls filter, and I don't know where to put it or what a proper config for it looks like. I would really appreciate it.

@fabriziomello
Contributor

@sandangel have a look at this config example:

admin:
  access_log_path: /dev/null
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8000

static_resources:
  clusters:
  - name: postgres_cluster
    connect_timeout: 1s
    type: STRICT_DNS
    load_assignment:
      cluster_name: postgres_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 0.0.0.0
                port_value: 5432

  listeners:
  - name: listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 54322
    filter_chains:
    - filters:
      - name: envoy.filters.network.postgres_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.postgres_proxy.v3alpha.PostgresProxy
          stat_prefix: egress_postgres
          enable_sql_parsing: false
          terminate_ssl: true
      - name: envoy.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tcp_postgres
          cluster: postgres_cluster
          idle_timeout: 10s
      transport_socket:
        name: "starttls"
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.starttls.v3.StartTlsConfig
          tls_socket_config:
            common_tls_context:
              tls_certificates:
                certificate_chain:
                  filename: "/d/fabrizio/ongres/etc/.creds/ssl/server.crt"
                private_key:
                  filename: "/d/fabrizio/ongres/etc/.creds/ssl/server.key"

@Sindhunavale

There's no specific issue created yet. I just raised this question on Envoy's Slack #envoy-postgres channel, to understand first if this should be related to Envoy's RBAC module, or rather a separate functionality of the current Postgres filter. Once this is discussed, I will open an issue with the approach to implement this.

Is there any implementation similar to HBA in Envoy?
