Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASSANDRA-19366 trunk - Expose auth mode in system_views.clients, clientstats, metrics #3085

Closed
wants to merge 21 commits into from

Conversation

tolbertam
Copy link
Contributor

Adds 'mode' and 'metadata' fields to AuthenticatedUser to add context about how the user was authenticated and updates system_views.clients, nodetool client stats to include this information.

Also adds new metrics to ClientMetrics to help operators identify which authentication modes are being used.

for CASSANDRA-19366

Adds 'mode' and 'metadata' fields to AuthenticatedUser to add context
about how the user was authenticated and updates system_views.clients,
nodetool client stats to include this information.

Also adds new metrics to ClientMetrics to help operators identify which
authentication modes are being used.

for CASSANDRA-19366
@tolbertam tolbertam marked this pull request as draft February 5, 2024 15:53
int countConnectedClients(Predicate<ServerConnection> predicate)
{
int count = 0;
for (Channel c : allChannels)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens when connection is closed while you iterate over the channels? what if channels are closed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect that even if the Channel is closed, the attributes will still be present. If the ServerConnection does happen to be null, the instanceof check will fail. The ClientState on the connection (which the predicates evaluate) will never be null, so it should generally be safe.

Also, if an exception is thrown here, the calling thread will be a JMX rmi connection thread, which will simply bubble the exception up to the client, so I think it will behave ok even in exceptional states.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have some basic test around this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 let me think about how I can test this. The challenge will be that I think to reproduce it the channel will need to be closed. This approach was borrowed from other methods on the class (countConnectedClientsByUser) so maybe there is some existing test precedence I can pull from.

Alternatively I could probably inject an unexpected failure using byteman..will keep thinking on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tolbertam you could also probably mock that when c.attr(Connection.attributeKey).get() is called that get() returns null or throws.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added ConnectionTrackerTest that includes:

  • Validate that Channel lacking a Connection.attributeKey get skipped over.
  • Validate that Channel having a Connection.attributeKey that is not a ServerConnection get skipped.
  • Validate that if c.attr(Connection.attributeKey).get() throws an unhandled exception it gets propagated up to the caller (ultimately preventing the calculation from working, which I think is the behavior we want, but let me know if you think that should be different).
  • Also added tests that validate the expected basic functionality of countConnectedClients(predicate) and countConnectedClientsByUser (which I didn't change in this PR but figured would be nice to have at a unit level).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great! that is enough I guess ...

@tolbertam tolbertam marked this pull request as ready for review February 5, 2024 19:00
* Introduce 'AuthenticationMode' enum that encapsulates the supported
  authentication modes in the included IAuthenticator implementations in
  Cassandra.
* Rename AuthenticatedUser.getMode() -> getAuthenticatedMode()
* AuthenticatedUser.getMetadata() is now a Map<String, Object>
* Update
  MutualTlsWithPasswordFallbackAuthenticator.getSupportedAuthenticationModes()
  to be a union of its two authenticator's modes.
* ClientsTable: Rename 'mode' and 'metadata' columns to 'authentication_mode'
  and 'authentication_metadata' respectively.
* ClientMetrics: pass Map in markAuthMeter instead of determining based
  on provided string.  Simplify ConnectedNativeClients gauge init.
* ClientStats: Remove --metadata in favor of --verbose, which includes
  all available fields.  Mode -> Auth-Mode, Metadata -> Auth-Metadata
* ClientsTableTest: Update to use mTLS authenticated user and verify
  more fields.
@smiklosovic smiklosovic changed the title Expose auth mode in system_views.clients, clientstats, metrics CASSANDRA-19366 trunk - Expose auth mode in system_views.clients, clientstats, metrics Feb 6, 2024
smiklosovic and others added 5 commits February 6, 2024 08:38
Copy link
Contributor

@frankgh frankgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general, I've left some minor comments

src/java/org/apache/cassandra/auth/AuthenticatedUser.java Outdated Show resolved Hide resolved
src/java/org/apache/cassandra/auth/AuthenticatedUser.java Outdated Show resolved Hide resolved
src/java/org/apache/cassandra/auth/IAuthenticator.java Outdated Show resolved Hide resolved
/**
* User authenticated using a trusted identity in their client certificate.
*/
public static final AuthenticationMode MTLS = new AuthenticationMode("Mtls") {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we spell out the label?

Suggested change
public static final AuthenticationMode MTLS = new AuthenticationMode("Mtls") {};
public static final AuthenticationMode MTLS = new AuthenticationMode("Mutual TLS") {};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would break JMX as I am not sure how a metric with a space in its name would behave.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably check that the display name does not contain space

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I'm not sure what's the behavior for jmx. But if spaces break JMX, we should add some guards around it

Copy link
Contributor Author

@tolbertam tolbertam Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll test whether spaces work w/ JMX, I suspect it does, but isn't desirable as might require clients to escape; If it doesn't work, I suggest we throw a RuntimeException in the AuthenticationMode constructor.

As far as the name, regardless of whether spaces are allowed, how about MutualTls (for consistency with how the authenticator is named)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it does work:

beans
...
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Encrypted,type=Client
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mutual Tls,type=Client
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Password,type=Client
org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Unencrypted,type=Client
org.apache.cassandra.metrics:name=ConnectedNativeClients,type=Client

But it isn't easy to use, and I imagine will give some tooling trouble. For it to work with jmxterm for example I had to double \ escape the space:

$>bean org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mutual Tls,type=Client
#IllegalArgumentException: Please specify domain using either -d option or domain command
$>bean org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mutual\\ Tls,type=Client
#bean is set to org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mutual Tls,type=Client
$>get *
#mbean = org.apache.cassandra.metrics:name=ConnectedNativeClients,scope=Mutual Tls,type=Client:
Value = 5;

I think we should document that spaces in the mode name aren't preferred as it may not work well with some tooling. I will do that!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 8f082a1 and renamed Mtls -> MutualTls. I also realized was still using String in ClientMetrics instead of AuthenticationMode so I updated that as well, hopefully that looks good!

Comment on lines 118 to 120
Optional.ofNullable(mode)
.flatMap((m) -> Optional.ofNullable(authSuccessByMode.get(m)))
.ifPresent(Meter::mark);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a bit hard to read, but it is the equivalent of:

        Meter meterByMode;
        if (mode != null && (meterByMode = authSuccessByMode.get(mode)) != null) {
            meterByMode.mark();
        }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 avoids creating some additional Optional objects as well, will change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tolbertam you dont need the braces :) just super nit but when I see it I just say it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated 👍

/**
* @deprecated by {@link #markAuthFailure(String)}
*/
@Deprecated(since="5.1", forRemoval = true)
public void markAuthFailure()
{
authFailure.mark();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this call markAuthFailure(null) just like we do for the success case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, this is definitely wrong, will fix it 👍

tolbertam and others added 12 commits February 7, 2024 13:38
Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
Co-authored-by: Francisco Guerrero <frank.guerrero@gmail.com>
* Use AuthenticationMode instead of string in ClientMetrics
* Fix some imports from suggestion changes
* null check instead of Optional in mark methods
* markAuthFailure should call markAuthFailure(null)
Previously when providing --all --client-options to nodetool clientstats
it'd print two tables, which was probably not intended.  Adjusts the
behavior that establishes the following precedence:

--verbose
--client-options
--all

If any combination of these three are present, the one with the most
precedence will be used.
Anything additive should be put on the end going forward
Copy link
Contributor

@frankgh frankgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I have no further comments. Looks good to me

To prevent test flakiness, we should block on cassandra user existence
in tests.  Add AuthTestUtils.waitForExistingRoles() which is adapted
from the Auth package which relies on dtest code.
Don't change non-touched tests, we should change this in a separate PR
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants