Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pgwire: migrate auth/conn logs to notable events #57839

Merged
merged 4 commits into from Jan 8, 2021

Conversation

knz
Copy link
Contributor

@knz knz commented Dec 11, 2020

Fixes #56154

Release note (backward-incompatible change): The connection and
authentication logging enabled by the cluster settings
server.auth_log.sql_connections.enabled and
server.auth_log.sql_sessions.enabled was previously using a text
format which was hard to parse and integrate with external monitoring
tools. This has been changed to use the standard notable event
mechanism, with standardized payloads. The output format is now
structured; see its reference documentation for details about
the supported event types and payloads.

@knz knz requested a review from itsbilal December 11, 2020 20:05
@knz knz requested a review from a team as a code owner December 11, 2020 20:05
@knz knz added this to In progress in DB Server & Security via automation Dec 11, 2020
@knz knz requested a review from a team as a code owner December 11, 2020 20:05
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz
Copy link
Contributor Author

knz commented Dec 11, 2020

cc @thtruo @logston

@knz
Copy link
Contributor Author

knz commented Dec 21, 2020

friendly ping to the reviewers

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 16 of 16 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @itsbilal, and @knz)


pkg/sql/pgwire/auth.go, line 358 at r1 (raw file):

func (p *authPipe) AuthFail(err error) {
	p.readerDone <- authRes{err: err}

do we want to log here? seems easy to miss if AuthOK implictly logs but not AuthFail.


pkg/sql/pgwire/hba/hba.go, line 94 at r1 (raw file):

		return "host"
	default:
		panic(errors.Newf("unimplemented: %v", int(t)))

nit: "unimplemented conn type" (or something to identify better since i've been scared by panics omitting stack traces when "wrapped" by something else)


pkg/util/log/eventpb/session_events.proto, line 57 at r1 (raw file):

  // The connection type after transport negotiation.
  string transport = 1 [(gogoproto.jsontag) = ",omitempty"];
  // The username the session is for.

should we specify whether this is normalised or not?
i notice we normalise this above -- is this something we want? (feels like we may lose information if we normalise them)

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @itsbilal, and @knz)


pkg/sql/pgwire/auth.go, line 114 at r1 (raw file):

	tlsState, hbaEntry, methodFn, err := c.findAuthenticationMethod(authOpt)
	if err != nil {
		ac.LogAuthFailed(ctx, "authentication method lookup error", err)

This should be a warning and not an ordinary auth failed, right? or is this a user-specified auth method that might not exist?


pkg/sql/pgwire/auth.go, line 126 at r1 (raw file):

	if err != nil {
		ac.LogAuthFailed(ctx, "authentication pre-hook failed", err)

what is this? should it be an warning instead? shouldn't AuthFailed be reserved for cases where users gave wrong credentials, and not just arbitrary errors?

Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @itsbilal, @jordanlewis, and @otan)


pkg/sql/pgwire/auth.go, line 114 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This should be a warning and not an ordinary auth failed, right? or is this a user-specified auth method that might not exist?

To answer this point and the one below: we want an authn event logged in all authn failure cases, regardless of cause.

As to how this error can occur: This can naturally happen if none of the rules in the HBA configuration happen to match the incoming connection. That's a valid configuration (and pg logs a similar message in the same situation).


pkg/sql/pgwire/auth.go, line 126 at r1 (raw file):

what is this?

This happens when e.g. the method fails to recognize the parameters it needs in the connection, before the authn starts.

should it be an warning instead?

I'm not sure what you mean with “warning”? There's no notion of warning here: authentication either succeeds or fails, there's nothing in-between.

shouldn't AuthFailed be reserved for cases where users gave wrong credentials, and not just arbitrary errors?

Ok now that's an interesting discussion: do we want more granular events types than authentication pass/fail with an "Info" field that describes the failure?
This requirement had not been communicated to me earlier. I was assuming we only needed authn ok/fail.


pkg/sql/pgwire/auth.go, line 358 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

do we want to log here? seems easy to miss if AuthOK implictly logs but not AuthFail.

Yeah that was not a great idea. Moved the logging for the OK case side-by-side with the fail case instead, to improve legibility.


pkg/sql/pgwire/hba/hba.go, line 94 at r1 (raw file):

Previously, otan (Oliver Tan) wrote…

nit: "unimplemented conn type" (or something to identify better since i've been scared by panics omitting stack traces when "wrapped" by something else)

Done.


pkg/util/log/eventpb/session_events.proto, line 57 at r1 (raw file):

should we specify whether this is normalised or not?

Yes sure, done.

i notice we normalise this above -- is this something we want? (feels like we may lose information if we normalise them)

the normalization occurs in fact very early during the conn handshake. the .Normalized() call merely returns the stored normalized username. By this time it's too late to log the original.

(I'll ask Aaron if logging the original usernam eis important)

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 8 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @itsbilal, @jordanlewis, and @otan)

Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @arulajmani, @itsbilal, @jordanlewis, and @otan)


pkg/sql/pgwire/auth.go, line 126 at r1 (raw file):

Previously, knz (kena) wrote…

what is this?
This happens when e.g. the method fails to recognize the parameters it needs in the connection, before the authn starts.

should it be an warning instead?
I'm not sure what you mean with “warning”? There's no notion of warning here: authentication either succeeds or fails, there's nothing in-between.

shouldn't AuthFailed be reserved for cases where users gave wrong credentials, and not just arbitrary errors?
Ok now that's an interesting discussion: do we want more granular events types than authentication pass/fail with an "Info" field that describes the failure?
This requirement had not been communicated to me earlier. I was assuming we only needed authn ok/fail.

Discussed with @aaron-crl : the Reason field is now populated with an enum with documented + stable values, instead of an unscrupulous string.


pkg/util/log/eventpb/session_events.proto, line 57 at r1 (raw file):

Previously, knz (kena) wrote…

should we specify whether this is normalised or not?

Yes sure, done.

i notice we normalise this above -- is this something we want? (feels like we may lose information if we normalise them)

the normalization occurs in fact very early during the conn handshake. the .Normalized() call merely returns the stored normalized username. By this time it's too late to log the original.

(I'll ask Aaron if logging the original usernam eis important)

Discussed with the team - this can be added in a separate PR.

@knz knz force-pushed the 20201211-auth-events branch 2 times, most recently from 886d79b to 27adee7 Compare January 6, 2021 19:34
The "crdb-v1" format is brittle and certainly not designed to be
unambiguous and reliably parseable. Nevertheless, we can do a little
bit more to allow IPv6 addresses in logging tags, to enable parsing
the subset of messages that's needed for an authentication log.

Release note: None
Prior to this patch, when `FetchEntriesFromFiles` was fetching entries
from multiple file groups (previously known as "secondary loggers"),
it mistakenly stopped after processing entries from just one logger.

This patch fixes it.

Note: this API is deprecated and should be replaced by something that
knows about file groups.

Release note: None
Release note (security update): When using a SQL proxy, in the default
configuration CockroachDB only knows about the network address of the
proxy. That *peer* address is then used for logging, authentication
rules, etc. This is undesirable, as security logging and authentication
rules need to operate on the actual (final) client address instead.

CockroachDB can now be configured to solve this problem (conf
mechanism detailed below).

When so configured, a SQL proxy can inform the CockroachDB server of
the real address of the client via a server status parameter called
`crdb:remote_addr`. The value must be the IP address of the client,
followed by a colon, followed by the port number, using the standard
Go syntax (e.g. `11.22.33.44:5566` for IPv4, `[11:22::33]:4455` for
IPv6). When provided, this value overrides the SQL proxy's address
for logging and authentication purposes.

In any case, the original peer address is also logged alongside
the client address (overridden or not), via the new logging tag `peer`.

Security considerations:

- enabling this feature allows the peer to spoof its address wrt
  authentication and thus bypass authentication rules that would
  otherwise apply to its address, which can introduce a serious security
  vulnerability if the peer is not trusted. This is why this feature is
  not enabled by default, and must only be enabled when using a trusted
  SQL proxy.

- this feature should only be used with SQL proxies which actively
  scrub a `crdb:remote_addr` parameter received by a remote client,
  and replaces it by its own. If the proxy mistakenly forwards
  the  parameter  as provided by the client, it opens the door
  to the aforementioned security vulnerability.

- care must be taken in HBA rules: TLS client cert validation, if
  requested by a rule, is still performed using the certificate
  presented by the proxy, not that presented by the client.
  This means that this new feature is not sufficient to forward
  TLS client cert authn through a proxy. (If TLS client cert authn
  is required, it must be performed by the proxy directly.)

- care must be taken in HBA rules: the 'protocol' field (first column)
  continues to apply to the connection type between CockroachDB and the
  proxy, not between the proxy and the client. Only the 4th column
  (the CIDR pattern) is matched against the proxy-provided remote
  address override.

  Therefore, it is not possible to apply different rules to different
  client address when proxying TCP connections via a unix socket,
  because HBA rules for unix connections don't use the address column.

  Also when proxying client SSL connections via a non-SSL proxy
  connection, or proxying client non-SSL connections via a SSL proxy
  connection, care must be taken to configure address-based rule
  matching using the proper connection type. A reliable way
  to bypass this complexity is to only use the `host` connection
  type which applies equally to SSL and non-SSL connections.

As of this implementation, the feature is enabled using the
non-documented environment variable
`COCKROACH_TRUST_CLIENT_PROVIDED_SQL_REMOTE_ADDR`. The use of an env
var is a stop-gap so that this feature can be used in CC SQL pods
which do not have access to cluster settings. The env var will be
eventually removed and replaced by another mechanism.
Release note (backward-incompatible change): The connection and
authentication logging enabled by the cluster settings
`server.auth_log.sql_connections.enabled` and
`server.auth_log.sql_sessions.enabled` was previously using a text
format which was hard to parse and integrate with external monitoring
tools. This has been changed to use the standard notable event
mechanism, with standardized payloads. The output format is now
structured; see its reference documentation for details about
the supported event types and payloads.
@knz
Copy link
Contributor Author

knz commented Jan 8, 2021

TFYRs!

bors r=otan,jordanlewis

@craig
Copy link
Contributor

craig bot commented Jan 8, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jan 8, 2021

Build succeeded:

@craig craig bot merged commit 802b074 into cockroachdb:master Jan 8, 2021
DB Server & Security automation moved this from In progress to Done 21.1 Jan 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

logging: enhance cockroach auth logging
4 participants