Add GSS/KRB5 Encryption support when connecting to a backend server #743

Open
wants to merge 17 commits into
base: master
from

Conversation

coryastronomer

Hello, we have used this code to successfully connect to a GSSAPI/Kerberos authenticated and encrypted PostgreSQL server from pgbouncer, while allowing a non-GSSAPI/Kerberos client to avail itself of a kerberized PostgreSQL in addition to the other pgbouncer functions.

I added a couple fields to pgbouncer.ini:

server_gssencmode (need to flesh this out; I should be able to do that this week; the goal is to allow requiring GSSAPI authentication and encryption from pgbouncer's side, not just having "hostgssenc" in pg_hba.conf)
gssapi_spn (GSSAPI/Kerberos Service Principal Name; per-database in [databases] section)

I initially created a synchronous I/O version of this patch. Right now sending and receiving data uses asynchronous I/O, except for the initial connection; I could probably re-engineer that at some point. Most of the async code is a modified copy of how libpq does a GSSAPI authenticated and encrypted connection. I considered using the libpq-fe.h API directly, but it doesn't seem to be a great match for pgbouncer as-is. I imagine there is some history related to libpq that I'm not aware of.
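As an illustration of the asynchronous send path described above, here is a minimal sketch of flushing a pending buffer over a nonblocking descriptor. It is not the code from this patch: `WANT_POLLOUT`, `struct send_state`, `flush_pending` and `budget_write` are hypothetical names, and `budget_write` only stands in for a socket that accepts a limited number of bytes before reporting `EAGAIN`.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical return code, standing in for a "poll for writability" signal. */
#define WANT_POLLOUT (-2)

/* Low-level writer; returns bytes written, or -1 with errno set. */
typedef ssize_t (*raw_write_fn)(void *ctx, const void *buf, size_t len);

struct send_state {
    const char *buf;   /* encrypted bytes waiting to be sent */
    size_t length;     /* end of valid data in buf */
    size_t next;       /* index of the next byte to send */
};

/*
 * Try to flush the pending bytes. Returns 0 once everything is sent,
 * WANT_POLLOUT when the descriptor would block (the caller should retry
 * after the socket becomes writable), or -1 on a hard error.
 */
static int flush_pending(struct send_state *st, raw_write_fn wr, void *ctx)
{
    while (st->next < st->length) {
        ssize_t n = wr(ctx, st->buf + st->next, st->length - st->next);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return WANT_POLLOUT;
            return -1;
        }
        if (n == 0)
            return WANT_POLLOUT;   /* no progress; wait instead of spinning */
        st->next += (size_t)n;
    }
    st->next = 0;
    st->length = 0;
    return 0;
}

/* Stand-in for a nonblocking socket: accepts up to *budget bytes, then EAGAIN. */
static ssize_t budget_write(void *ctx, const void *buf, size_t len)
{
    size_t *budget = ctx;
    (void)buf;
    if (*budget == 0) {
        errno = EAGAIN;
        return -1;
    }
    if (len > *budget)
        len = *budget;
    *budget -= len;
    return (ssize_t)len;
}
```

The point of keeping `next` and `length` in the state is that a partial write leaves enough information to resume exactly where the previous attempt stopped.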

At this time it does not support, say, making a server connection with GSSAPI auth alone or with any other method (i.e. TLS). It does not add any abilities to client connections.

I have this working in both Vagrant and docker-compose with a minimal KDC; after some cleanup I should be able to share those as well, for automated testing purposes.

I am sure there are bugs and some rough edges in the code, but so far I can't get my connections to fail or crash. I'd be happy to help smooth anything out, if this patch is of any value to the community.

N.B. So far I've only tested with pkt_buf = 16384; per the postgresql.org spec, that is how long GSSAPI encrypted messages can be.
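To make the size limit concrete, here is a small sketch of the framing used for GSSAPI-encrypted traffic in the PostgreSQL protocol: a 4-byte length in network byte order followed by the encrypted payload. The function names and the `GSS_MAX_PACKET` macro are hypothetical, and the sketch assumes the 16384-byte figure covers the length header plus the payload.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <stdint.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Assumed limit, mirroring the 16384-byte figure mentioned above. */
#define GSS_MAX_PACKET 16384

/*
 * Write a 4-byte network-order length followed by the (already encrypted)
 * payload into out. Returns the total number of bytes written, or -1 if the
 * payload exceeds the limit or does not fit into out.
 */
static ssize_t frame_gss_packet(const uint8_t *payload, uint32_t len,
                                uint8_t *out, size_t out_cap)
{
    uint32_t netlen;

    if (len > GSS_MAX_PACKET - sizeof(netlen))
        return -1;
    if (out_cap < sizeof(netlen) + len)
        return -1;
    netlen = htonl(len);
    memcpy(out, &netlen, sizeof(netlen));
    memcpy(out + sizeof(netlen), payload, len);
    return (ssize_t)(sizeof(netlen) + len);
}

/*
 * Read the length prefix of an incoming packet. Returns the payload length,
 * or -1 if the header is incomplete or the announced length is too large.
 */
static ssize_t peek_gss_packet_len(const uint8_t *in, size_t avail)
{
    uint32_t netlen, len;

    if (avail < sizeof(netlen))
        return -1;
    memcpy(&netlen, in, sizeof(netlen));
    len = ntohl(netlen);
    if (len > GSS_MAX_PACKET - sizeof(netlen))
        return -1;
    return (ssize_t)len;
}
```

A receive buffer smaller than the largest possible packet cannot hold a complete encrypted message, which is one reason the buffer size matters here.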

@coryastronomer
Author

FYI you have to configure it thus, after installing the relevant krb5 dev package(s) depending on distro:

./configure --with-server-gssenc

You can see all the details in my test bench (KDC included); I forgot I had already committed most of this:

https://github.com/coryastronomer/docker-kerberos

@coryastronomer
Author

coryastronomer commented Aug 1, 2022

Some testing details (queries taken from the PostgreSQL Kerberos test suite; timings are slow because I'm running in the debugger on my local machine):

vagrant@kdbclient:~$ psql -U houston -h kpgbouncer postgres -c 'SELECT pid, gss_authenticated, encrypted, principal from pg_stat_gssapi where pid = pg_backend_pid();';
  pid  | gss_authenticated | encrypted |      principal
-------+-------------------+-----------+---------------------
 40228 | t                 | t         | houston@EXAMPLE.COM
(1 row)

vagrant@kdbclient:~$ time psql -U houston -h kpgbouncer postgres -c 'select * from generate_series(1, 100000);' | wc -l
100004

real	0m1.966s
user	0m0.066s
sys	0m0.027s
vagrant@kdbclient:~$ time perl -e 'print join("\n", (1 .. 100000))' | psql -U houston -h kpgbouncer postgres -c 'create temp table mytab (f1 int primary key); copy mytab from stdin; select count(*) from mytab;'
 count
--------
 100000
(1 row)


real	0m1.874s
user	0m0.061s
sys	0m0.017s

@JelteF
Member

JelteF commented Aug 11, 2022

I haven't looked much at the code and I'm not really familiar with GSSAPI/Kerberos. In any case this needs some tests in the test suite.

Member

@JelteF JelteF left a comment

needs tests before I can look at it further

@coryastronomer
Author

Will work on the tests more this coming week.

@coryastronomer coryastronomer changed the title from "Cory/gss authentication" to "Add GSS/KRB5 Encryption support when connecting to a backend server" on Aug 27, 2022
@coryastronomer
Author

> needs tests before I can look at it further

I've got the testing working.

@andriisoldatenko

@JelteF PTAL

@JelteF
Member

JelteF commented Sep 6, 2022

I'll try to take a look in the somewhat near future, but it's a big enough PR that it'll take me some time to review.

@coryastronomer
Author

coryastronomer commented Sep 7, 2022

  • Nomenclature: GSS/GSSAPI/KRB5/Kerberos are used interchangeably in this document
  • History
    • Sources used in development
    • Testing notes
      • Added an Ubuntu 22.04 container with “./configure --with-server-gssenc”
        • Extra packages: krb5-kdc krb5-admin-server krb5-user libkrb5-dev
        • Extra script: test/gss/newkdc.sh - this sets up a KDC in the testing container
      • Added test/gss/test.{sh,ini} and hooked them into test/Makefile
        • Tests GSS functionality, based on test/ssl folder
        • Run by “make check” if GSS support is enabled.
      • env KRB5_TRACE=/dev/stderr can be useful in troubleshooting
  • Code
    • include/bouncer.h
      • Added extern int cf_server_gssencmode and enum GssEncMode: store the pgbouncer.ini server_gssencmode setting at runtime
        • GSSENCMODE_DISABLE
        • GSSENCMODE_PREFER
        • GSSENCMODE_REQUIRE
      • struct PgDatabase
        • Added char *gssapi_spn: store the [databases] gssapi_spn setting at runtime
      • struct PgSocket
        • Added bool wait_gssencchar: used while waiting for G/N from postgresql server during GSSAPI Encryption negotiation
    • include/pktbuf.h
      • Added macro pktbuf_write_GSSEncRequest(buf) to request GSSAPI Encryption negotiation
    • include/proto.h and src/proto.c
      • Added send_gssencreq_packet(PgSocket *server) function to wrap around pktbuf_write_GSSEncRequest(buf)
    • include/sbuf.h and src/sbuf.c
      • typedef enum SBufEvent
        • Added SBUF_EV_GSSENC_READY to show that GSS Encryption has been established
      • Added GSSENC_WANT_POLLOUT and GSSENC_WANT_POLLIN for async IO
      • Added typedef unsigned int uint32 for libpq code
      • struct SBuf
        • Added uint8_t gssenc_state to store the progress of GSS Encryption, per-connection
        • Added struct gss_ctx_id_struct *gss to store a GSS context, per-connection
        • if server_gssencmode is configured, compile in these variables from libpq
          • char *gss_SendBuffer - Encrypted data waiting to be sent
          • int gss_SendLength - End of data available in gss_SendBuffer
          • int gss_SendNext - Next index to send a byte from gss_SendBuffer
          • int gss_SendConsumed - Number of unencrypted bytes consumed for current contents of gss_SendBuffer
          • char *gss_RecvBuffer - Received, encrypted data
          • int gss_RecvLength - End of data available in gss_RecvBuffer
          • char *gss_ResultBuffer - Decryption of data in gss_RecvBuffer
          • int gss_ResultLength - End of data available in gss_ResultBuffer
          • int gss_ResultNext - Next index to read a byte from gss_ResultBuffer
          • uint32 gss_MaxPktSize - Maximum size we can encrypt and fit the results into our output buffer
          • bool write_failed - have we had a write failure on sock?
      • Added extern int server_connect_gssencmode to store the pgbouncer.ini server_gssencmode setting at runtime
      • Added bool sbuf_gssenc_connect(SBuf *sbuf, char *gssapi_spn) function
        • Called by src/server.c handle_gssencchar function to initiate GSS Encryption after an affirmative “G” response has been received from the server
        • This function handles the entire GSS Encryption connection synchronously, then it turns on asynchronous IO mode for the rest of the connection
        • Main sources of this function are from: sbuf_tls_connect, krb5’s gss-sample/gss-client.c’s client_establish_context function, postgres libpq’s fe-secure-gssapi.c's pqsecure_open_gss function, RFC7546’s do_initiator function
        • Does a no-op and returns false if GSS Encryption is not compiled in
      • Added bool sbuf_gssenc_setup(void) function
        • Called by src/admin.c admin_set and admin_cmd_reload functions to hot reload the server_connect_gssencmode variable
        • Called by src/main.c main and handle_sighup functions to load the server_connect_gssencmode variable at startup and handle a SIGHUP for a hot reload of server_connect_gssencmode
        • Does a no-op and returns true if GSS Encryption is not compiled in
      • Added static ssize_t pg_GSS_write(SBuf *conn, const void *ptr, size_t len) function
        • Created by modifying the pg_GSS_write(PGconn *conn, const void *ptr, size_t len) function from postgres/src/interfaces/libpq/fe-secure-gssapi.c
        • conn->gctx becomes conn->gss
        • appendPQExpBufferStr and pg_GSS_error calls were converted into log_error calls
        • Instead of pg_hton32, htonl was used (htonl converts to network byte order on both little- and big-endian hosts)
      • Added static ssize_t pqsecure_raw_write(SBuf *conn, const void *ptr, size_t len) function
        • Created by modifying the pqsecure_raw_write(PGconn *conn, const void *ptr, size_t len) function from postgres/src/interfaces/libpq/fe-secure.c
        • Removed signal handling as pgbouncer already has signal handling
        • Changed libpq_gettext call to log_error
        • Modified to use GSSENC_WANT_POLLOUT in case bytes did not fully send, supporting async IO
      • Added static ssize_t pg_GSS_read(SBuf *conn, void *ptr, size_t len) function
        • Created by modifying the pg_GSS_read(PGconn *conn, void *ptr, size_t len) function from postgres/src/interfaces/libpq/fe-secure-gssapi.c
        • Modified to use GSSENC_WANT_POLLIN in case bytes were not fully received, supporting async IO
        • conn->gctx becomes conn->gss
        • appendPQExpBufferStr and pg_GSS_error calls were converted into log_error calls
        • Instead of pg_ntoh32, ntohl was used (ntohl converts from network byte order on both little- and big-endian hosts)
      • Added static ssize_t pqsecure_raw_read(SBuf *conn, void *ptr, size_t len) function
        • Created by modifying the pqsecure_raw_read(PGconn *conn, void *ptr, size_t len) function from postgres/src/interfaces/libpq/fe-secure.c
        • Removed signal handling as pgbouncer already has signal handling
        • Changed libpq_gettext call to log_error
        • Modified to use GSSENC_WANT_POLLIN in case bytes were not fully received, supporting async IO
      • Added static ssize_t gssenc_sbufio_recv(struct SBuf *sbuf, void *dst, size_t len) and static ssize_t gssenc_sbufio_send(struct SBuf *sbuf, const void *data, size_t len) functions
        • Analogous to the tls_sbufio_recv and tls_sbufio_send functions
          • gssenc_state instead of tls_state
          • SBUF_GSSENC_OK instead of SBUF_TLS_OK
          • pg_GSS_read instead of tls_read
          • pg_GSS_write instead of tls_write
          • GSSENC_WANT_POLLIN instead of TLS_WANT_POLLIN
          • GSSENC_WANT_POLLOUT instead of TLS_WANT_POLLOUT
      • Added static int gssenc_sbufio_close(struct SBuf *sbuf) function
        • Closes connection, like tls_sbufio_close
      • Added static const SBufIO gssenc_sbufio_ops structure
        • gssenc_sbufio_recv function
        • gssenc_sbufio_send function
        • gssenc_sbufio_close function
        • This is used by pgbouncer to call the right encrypt/decrypt functionality for GSSAPI
      • Added static int recv_token(int s, int *flags, gss_buffer_t tok) and static int send_token(int s, int flags, gss_buffer_t tok)
        • Primarily taken from gss-misc.c
    • src/admin.c
      • Modified admin_set function to load server_connect_gssencmode
      • Modified send_one_fd function to treat GSS Encrypted connections like TLS connections, i.e. do not send data cleartext, but instead use the sbufio functions that pertain to TLS/GSSEnc
    • src/client.c
      • Added case SBUF_EV_GSSENC_READY to client_proto function - it will disconnect a client that tries to use GSS Encryption, as this codebase is for connecting to servers only
    • src/janitor.c
      • Modified kill_database function to free the gssapi_spn variable
    • src/loader.c
      • Modified parse_database function to add gssapi_spn configuration support to the individual database specified in pgbouncer.ini’s [databases] section
    • src/main.c
      • Added logic to load and store server_gssencmode
    • src/proto.c
      • See above
    • src/sbuf.c
      • See above
    • src/server.c
      • Added pg_GSS_have_cred_cache function
      • Added handle_gssencchar function
      • Modified handle_connect to support GSSAPI
      • Modified server_proto to support GSSAPI
  • Configuration
    • ./configure --with-server-gssenc: necessary to enable GSSAPI Encryption
    • pgbouncer.ini [pgbouncer] server_gssencmode
      • disable: only try a non-GSSAPI-encrypted connection
      • prefer (default): if there are GSSAPI credentials present (i.e., in a credentials cache), first try a GSSAPI-encrypted connection; if that fails or there are no credentials, try a non-GSSAPI-encrypted connection. This is the default when pgbouncer has been compiled with GSSAPI support.
      • require: only try a GSSAPI-encrypted connection
      • server_gssencmode is ignored for Unix domain socket communication. If pgbouncer is compiled without GSSAPI support, using the require option will cause an error, while prefer will be accepted but pgbouncer will not actually attempt a GSSAPI-encrypted connection.
    • pgbouncer.ini [databases] gssapi_spn
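The negotiation and mode handling described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this patch: the enum and function names are hypothetical. The GSSENCRequest layout (Int32 length 8, Int32 code 80877104) and the single-byte G/N reply come from the PostgreSQL wire protocol.

```c
#include <arpa/inet.h>   /* htonl */
#include <stdint.h>
#include <string.h>

/* Request code from the PostgreSQL wire protocol: (1234 << 16) | 5680. */
#define GSSENC_REQUEST_CODE 80877104u

/* Hypothetical names mirroring disable / prefer / require. */
enum gssenc_mode { MODE_DISABLE, MODE_PREFER, MODE_REQUIRE };
enum next_step { STEP_PLAINTEXT, STEP_START_GSS, STEP_FAIL };

/* Fill an 8-byte GSSENCRequest: Int32 length (8), then Int32 request code. */
static void build_gssenc_request(uint8_t out[8])
{
    uint32_t len = htonl(8);
    uint32_t code = htonl(GSSENC_REQUEST_CODE);

    memcpy(out, &len, 4);
    memcpy(out + 4, &code, 4);
}

/*
 * Decide whether to send a GSSENCRequest at all. The mode is ignored for
 * Unix domain sockets, and prefer only tries GSS when credentials exist.
 */
static int should_request_gssenc(enum gssenc_mode mode, int is_unix_socket,
                                 int have_creds)
{
    if (is_unix_socket || mode == MODE_DISABLE)
        return 0;
    if (mode == MODE_PREFER)
        return have_creds;
    return 1;   /* require: always ask; the handshake fails later if it cannot proceed */
}

/* Interpret the server's one-byte answer to the request. */
static enum next_step handle_gssenc_reply(enum gssenc_mode mode, char reply)
{
    if (reply == 'G')
        return STEP_START_GSS;     /* server accepts GSSAPI encryption */
    if (reply == 'N' && mode == MODE_PREFER)
        return STEP_PLAINTEXT;     /* fall back to a non-GSSAPI connection */
    return STEP_FAIL;              /* refused under require, or unexpected byte */
}
```

Keeping the "should we ask" decision separate from the "what did the server say" decision makes the prefer fallback explicit in one place.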

@coryastronomer
Author

@JelteF PTAL - I've posted a bunch of documentation about the patch

@danielhoherd

danielhoherd commented May 1, 2023

@JelteF is this still on the radar? It would be really unfortunate for this branch to get so far out of sync with the main branch that it is too difficult to merge. I work at Astronomer, same place that @coryastronomer worked. I understand that this is a large PR and takes time to go through, and I wouldn't want you to take any shortcuts on reviewing it, but at the same time this is an enterprise feature that was high priority enough that we hired somebody specifically to implement it, and it would be a shame for that to go to waste.

Member

@JelteF JelteF left a comment

Sorry that it took so long to actually take a look at this PR. Thanks a lot for this work. I now took some time to go over the new code and left a few small comments. But the main thing that I'm thinking is that this is quite complex code and it's hard to review. Especially since I have pretty much zero knowledge on how GSS actually works.

I think the problem of me being unable to review everything is relatively easy to work around though, because what I can fairly easily review is the changes needed to go from the GSS implementation in libpq to the implementation in PgBouncer. Lots of the code is clearly copied from fe-secure-gssapi.c in the Postgres codebase. I think if we structure the PR in the following way I will actually be able to review it:

  1. Create a single commit in which we create an sbuf_gssapi.c file which is an exact copy of fe-secure-gssapi.c
  2. Do any changes necessary to make this code work with PgBouncer on top of that.

Afaict this would greatly reduce the amount of code that needs to be reviewed. And the code that does need review would mostly be code specific to PgBouncer, instead of code specific to GSSAPI.

Comment on lines +114 to +115
;; disable, allow, require
;server_gssencmode = disable
Member

Suggested change
-;; disable, allow, require
-;server_gssencmode = disable
+;; disable, prefer, require
+;server_gssencmode = prefer

@@ -29,6 +29,8 @@
#include <usual/safeio.h>
#include <usual/slab.h>

//#include <postgresql/libpq-fe.h>
Member

can be removed

Comment on lines 291 to +292
CF_ABS("server_idle_timeout", CF_TIME_USEC, cf_server_idle_timeout, 0, "600"),
CF_ABS("server_gssencmode", CF_LOOKUP(gssencmode_map), cf_server_gssencmode, 0, "prefer"), /* libpq default */
Member

Items in this list are ordered alphabetically

Suggested change
-CF_ABS("server_idle_timeout", CF_TIME_USEC, cf_server_idle_timeout, 0, "600"),
-CF_ABS("server_gssencmode", CF_LOOKUP(gssencmode_map), cf_server_gssencmode, 0, "prefer"), /* libpq default */
+CF_ABS("server_gssencmode", CF_LOOKUP(gssencmode_map), cf_server_gssencmode, 0, "prefer"), /* libpq default */
+CF_ABS("server_idle_timeout", CF_TIME_USEC, cf_server_idle_timeout, 0, "600"),

Comment on lines +296 to +297
} else if (!!gssapi_spn != !!db->gssapi_spn
|| (gssapi_spn && strcmp(gssapi_spn, db->gssapi_spn) != 0)) {
Member

There's now the strings_equal function in the codebase to do this check.
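For reference, the check in the two quoted lines amounts to a NULL-safe string comparison. Below is a minimal sketch of such a helper; `nullable_str_equal` is a hypothetical name used for illustration, and the codebase's own strings_equal may differ in signature or detail.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: two strings are equal if both are NULL, or if both
 * are non-NULL and strcmp() reports no difference.
 */
static bool nullable_str_equal(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}
```

With a helper of this shape, the condition above would reduce to a single negated call on gssapi_spn and db->gssapi_spn.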

@@ -1242,6 +1282,7 @@ static int tls_sbufio_close(struct SBuf *sbuf)
return 0;
}

// TODO: handle gssapi somehow, respecting macros etc
Member

What's up with this TODO? What needs to still happen here?

@@ -0,0 +1,223 @@
#! /bin/sh
Member

I think the tests should be rewritten to use the Python testing framework. But before doing that, let's first focus on the new code itself.

{
log_noise("gss_close");
if (sbuf->gss) {
// TODO: free gss memory
Member

Seemingly important TODO
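One possible shape for that cleanup, as a sketch only: the struct below is a hypothetical subset of the per-connection state, reusing the buffer field names from the PR description, and `gss_buffers_release` is an invented name. It covers only the heap buffers; the security context itself would need to be released through the GSSAPI library, as noted in the comment.

```c
#include <stdlib.h>

/* Hypothetical subset of the per-connection GSS state. */
struct gss_buffers {
    char *gss_SendBuffer;     /* encrypted data waiting to be sent */
    char *gss_RecvBuffer;     /* received, still-encrypted data */
    char *gss_ResultBuffer;   /* decrypted data */
    int gss_SendLength, gss_SendNext, gss_SendConsumed;
    int gss_RecvLength;
    int gss_ResultLength, gss_ResultNext;
};

/*
 * Free the buffers and reset all bookkeeping so that a second call is
 * harmless. With a real context, a call such as
 * gss_delete_sec_context(&minor, &ctx, GSS_C_NO_BUFFER) from
 * <gssapi/gssapi.h> would be made here before the buffers are freed.
 */
static void gss_buffers_release(struct gss_buffers *g)
{
    free(g->gss_SendBuffer);
    free(g->gss_RecvBuffer);
    free(g->gss_ResultBuffer);
    g->gss_SendBuffer = NULL;
    g->gss_RecvBuffer = NULL;
    g->gss_ResultBuffer = NULL;
    g->gss_SendLength = g->gss_SendNext = g->gss_SendConsumed = 0;
    g->gss_RecvLength = 0;
    g->gss_ResultLength = g->gss_ResultNext = 0;
}
```

Resetting the pointers to NULL keeps the close path safe if it is reached more than once for the same connection.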
