Added maximum sequential login failures to the quota. #54737

Demilivor · 2023-09-18T07:20:34Z

Added maximum sequential login failures to the quota.
Related to #54450

Changelog category (leave one):

New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Added maximum sequential login failures to the quota.

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

Demilivor · 2023-09-18T07:27:45Z

src/Access/IAccessStorage.cpp

+            auto new_current_roles = user->granted_roles.findGranted(user->default_roles);
+            const auto & access = Context::getGlobalContextInstance()->getAccessControl();
+            auto roles_info = access.getEnabledRolesInfo(new_current_roles, {});
+            const std::string custom_quota_key = ""; // TODO: Where do we get it?
+
+            assert(assert);
+            auto enabled_quota = access.getEnabledQuota(*id,
+                                                        credentials.getUserName(),
+                                                        roles_info->enabled_roles,
+                                                        address,
+                                                        forwarded_address,


How can I get a quota key from the TCP client here?

At the moment of the authentication, we have not received client_key (quota_key) from the client.

I created a function that gets the quota for authentication with a constant quota key:

std::shared_ptr<const EnabledQuota> getAuthenticationQuota(
    UUID user_id, const UserPtr & user, const Poco::Net::IPAddress & address, const std::string & forwarded_address)
{
    /// During the authentication process, client_key has not yet been received from the TCP client.
    /// Use a predefined authentication quota key and always receive the same Interval object
    /// to avoid throwing exceptions in case of the QuotaKeyType::CLIENT_KEY key type.
    constexpr auto AUTHENTICATION_QUOTA_KEY = "_AUTHENTICATION_QUOTA_KEY_";
    auto new_current_roles = user->granted_roles.findGranted(user->default_roles);
    const auto & access = Context::getGlobalContextInstance()->getAccessControl();
    auto roles_info = access.getEnabledRolesInfo(new_current_roles, {});
    auto enabled_quota = access.getEnabledQuota(
        user_id, user->getName(), roles_info->enabled_roles, address, forwarded_address, AUTHENTICATION_QUOTA_KEY);
    return enabled_quota;
}

This solution ignores any quota key received later over the TCP connection. It works with QuotaKeyType::CLIENT_KEY.

_AUTHENTICATION_QUOTA_KEY_ looks too scary; we shouldn't have it in the code...

Good day, Alexey

I updated the code. It no longer has conflicts and no longer uses _AUTHENTICATION_QUOTA_KEY_.
However, we still have to pass a quota_key to AccessControl::getEnabledQuota. Over a TCP connection, the quota_key (i.e. client_key) is sent after authentication, but the failed login attempt count has to be checked before authentication finishes, so we can't get the quota key from the user.

Without anything passed to the quota, we will get an exception in case of QuotaKeyType::CLIENT_KEY from this code:

String QuotaCache::QuotaInfo::calculateKey(const EnabledQuota & enabled) const
{
    ...
    throw Exception(
        ErrorCodes::QUOTA_REQUIRES_CLIENT_KEY,
        "Quota {} (for user {}) requires a client supplied key.",
        quota->getName(),
        params.user_name);
    ...
}

As a workaround, I added a parameter throw_if_client_key_empty for the authentication quota case.

String QuotaCache::QuotaInfo::calculateKey(const EnabledQuota & enabled, bool throw_if_client_key_empty) const
{
    ...
    if (throw_if_client_key_empty)
        throw Exception(
            ErrorCodes::QUOTA_REQUIRES_CLIENT_KEY,
            "Quota {} (for user {}) requires a client supplied key.",
            quota->getName(),
            params.user_name);
    ...
}

The issue is solved, but I'm not sure that this solution is good enough.

Please advise me if you have a better idea of how to solve this.
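For readers following the thread, here is a minimal sketch of the workaround described above. It is not the real QuotaCache code: KeyType, KeyParams and calculateKeySketch are hypothetical stand-ins, and returning an empty key when the flag is off is an assumption, since the elided parts of calculateKey are not shown in this thread.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical, simplified stand-ins for the real quota key types.
enum class KeyType { None, UserName, ClientKey };

struct KeyParams
{
    std::string user_name;
    std::string client_key; // still empty while a TCP client is authenticating
};

// Picks the key that selects the quota interval.
// With throw_if_client_key_empty == false (the authentication case), an empty
// client key yields an empty key instead of an exception (assumed behavior).
std::string calculateKeySketch(KeyType type, const KeyParams & params, bool throw_if_client_key_empty)
{
    switch (type)
    {
        case KeyType::None:
            return "";
        case KeyType::UserName:
            return params.user_name;
        case KeyType::ClientKey:
            if (!params.client_key.empty())
                return params.client_key;
            if (throw_if_client_key_empty)
                throw std::runtime_error("Quota requires a client supplied key.");
            return "";
    }
    assert(false && "unhandled KeyType");
    return "";
}
```

The regular query path would keep passing true, so only the authentication path tolerates a missing client key.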

Demilivor · 2023-09-21T01:57:35Z

Good day everyone! Please assign a reviewer to this pull request and add the "can be tested" label.

Please also keep in mind this consequence of the changes:
The quota key used for authentication is hardcoded because, when authentication starts, the TCP client has not yet sent client_key.

alexey-milovidov · 2023-12-30T12:04:03Z

I reviewed it and found the code somewhat scary...

alexey-milovidov · 2023-12-30T12:04:49Z

src/Access/IAccessStorage.cpp

+        constexpr auto AUTHENTICATION_QUOTA_KEY = "_AUTHENTICATION_QUOTA_KEY_";
+        const auto new_current_roles = user->granted_roles.findGranted(user->default_roles);
+        const auto & access = Context::getGlobalContextInstance()->getAccessControl();


Scary in this line.

@alexey-milovidov I discussed this with @Demilivor: line No 57 is really scary. It should be refactored.
It seems that many parts of the quota-related code have changed during the past 3 months. We probably need more guidance from the core ClickHouse, Inc developers on how to implement the "maximum sequential login failures" feature.

Right now @Demilivor is on another project, but he will return to this PR after we discuss it with our management. The task is still relevant for our company.

I moved this code to another place, and now my code does not call Context::getGlobalContextInstance()->getAccessControl()

rvasin · 2024-01-25T08:51:12Z

@rschu1ze and @tavplubix, could you set the "can be tested" label for this PR? And could you please assign yourselves as co-reviewers for this PR?

The PR was created in September 2023, and we have resumed work on it.

I see in git history that @pufit, @kitaisreal, @yakov-olkhovskiy and @vitlibar also worked on AccessControl/quotas. We need some guidance to finally implement this feature.

We can do a complete refactoring or implement a new approach from scratch if you suggest so. For example, we are considering creating a new system table system.user_status with the MergeTree engine to track unsuccessful login attempts (maybe with a TTL to handle expiration). This approach would solve the persistence problem. Alternatively, we could use a map in the context to track them (persistence is not solved in this case, but that's OK for us too). Any ideas or suggestions are welcome.
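As an illustration of the in-memory map alternative mentioned above, a minimal sketch could look like the following. FailedLoginTracker and all of its members are hypothetical names, the class is not thread-safe, and its state is lost on restart, which is exactly the persistence gap noted in the comment.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical in-memory tracker of consecutive failed logins per user.
// Not thread-safe and not persistent; a sketch only.
class FailedLoginTracker
{
public:
    explicit FailedLoginTracker(unsigned max_failures) : max_failures_(max_failures)
    {
        assert(max_failures > 0);
    }

    // True once the user has reached the allowed number of consecutive failures.
    bool isLocked(const std::string & user) const
    {
        auto it = failures_.find(user);
        return it != failures_.end() && it->second >= max_failures_;
    }

    void recordFailure(const std::string & user) { ++failures_[user]; }

    // A successful login clears the consecutive-failure counter.
    void recordSuccess(const std::string & user) { failures_.erase(user); }

private:
    unsigned max_failures_;
    std::unordered_map<std::string, unsigned> failures_;
};
```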

robot-clickhouse-ci-2 · 2024-01-25T13:11:59Z

This is an automated comment for commit 076fb1d with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Successful checks

Check name	Description	Status
AST fuzzer	Runs randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help	✅ success
ClickBench	Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with instant-attach table	✅ success
ClickHouse build check	Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often has enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log, grepping for cmake. Use these options and follow the general build process	✅ success
Compatibility check	Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help	✅ success
Docker server and keeper images	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
Docs check	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
Fast tests	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
Flaky tests	Checks if new added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integrational tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc	✅ success
Install packages	Checks that the built packages are installable in a clear environment	✅ success
Mergeable Check	Checks if all other necessary checks are successful	✅ success
Performance Comparison	Measure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests	✅ success
SQLancer	Fuzzing tests that detect logical bugs with SQLancer tool	✅ success
Sqllogic	Run clickhouse on the sqllogic test set against sqlite and checks that all statements are passed	✅ success
Stateful tests	Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	✅ success
Stateless tests	Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	✅ success
Stress test	Runs stateless functional tests concurrently from several clients to detect concurrency-related errors	✅ success
Style check	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
Unit tests	Runs the unit tests for different release types	✅ success
Upgrade check	Runs stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts	✅ success

Check name	Description	Status
CI running	A meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR	⏳ pending
Integration tests	The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests	❌ failure

tests/queries/0_stateless/02884_authentication_quota.sh

src/Access/EnabledQuota.cpp

vitlibar · 2024-01-25T15:48:16Z

src/Access/AccessControl.cpp

 {
    try
    {
-        return MultipleAccessStorage::authenticate(credentials, address, *external_authenticators, allow_no_password,
-                                                   allow_plaintext_password);
+        const auto auth_result =  MultipleAccessStorage::authenticate(credentials, address, *external_authenticators, allow_no_password,


It's not quite logically correct to always try to authenticate even if the quota is already exceeded. Also IAccessStorage::authenticate() can do some complex stuff, like connecting to another server. I think it's better to check the quota first, then call MultipleAccessStorage::authenticate(), and then reset the quota counter if everything is ok.

I would agree with you, but how do we get the quota without user_id? Do you know a way to get the user_id and quota for an authenticated user?

Now AuthResult returns user_id even if the user exists and user_id has been read but authentication failed (the password is incorrect).

Do you know a way to get the user_id?

auto user_id = find<User>(credentials.getUserName())

Thanks. If that will also work with users authenticated using LDAP, then we can avoid modifying AuthResult. I will try to implement it that way.

If that will also work with users authenticated using LDAP

Well, LDAPAccessStorage::find<User>(credentials.getUserName()) returns nullopt if that user has not logged in before since the server started. So we cannot check the quota this time. But since it's the first time the quota must not be already exceeded. And when the same user logs in again later, find<User>(credentials.getUserName()) will return a valid user_id, so the quota can be checked.

It seems for a user who has never logged in successfully using LDAP the quota won't work. Because with only failed attempts LDAPAccessStorage::authenticate() will never assign user_id to that user and we can't check the quota.

I changed the code according to the suggestion to use find<User>, and I added a comment for the specific LDAP case.
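The LDAP limitation discussed above can be modeled with a small sketch. LazyUserStorage, registerUser and canCheckQuota are hypothetical names, not ClickHouse APIs; the only assumption taken from the thread is that the storage learns a user's id on the first successful login, so find returns nothing before that.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical model of a storage that assigns a user id only after
// the first successful login (as described for LDAP in this thread).
struct LazyUserStorage
{
    std::unordered_map<std::string, int> known_ids;
    int next_id = 1;

    std::optional<int> find(const std::string & name) const
    {
        auto it = known_ids.find(name);
        if (it == known_ids.end())
            return std::nullopt;
        return it->second;
    }

    // Called on a successful authentication; returns the (possibly new) id.
    int registerUser(const std::string & name)
    {
        assert(!name.empty());
        auto [it, inserted] = known_ids.try_emplace(name, next_id);
        if (inserted)
            ++next_id;
        return it->second;
    }
};

// The quota can be consulted only when the id is already known.
bool canCheckQuota(const LazyUserStorage & storage, const std::string & name)
{
    return storage.find(name).has_value();
}
```

In this model, a user who has only ever failed to log in never gets an id, so the failure counter is never consulted for them.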

src/Access/IAccessStorage.h

rvasin · 2024-01-29T06:18:33Z

@vitlibar Do you have more suggestions for this PR? I see the performance degradation. If needed, we can add a new option to the config.xml or users.xml file (a profile-based option?):

<enable_failed_sequential_authentications>false</enable_failed_sequential_authentications>

So it will be false by default and will not affect the performance of existing customers who do not need this feature.

I also discussed this with management: partial LDAP support is OK for us. In the future, other approaches (with corresponding new options to enable them) could be implemented in addition to this one.

vitlibar · 2024-01-29T17:54:09Z

src/Access/AccessControl.cpp

 {
+    std::shared_ptr<const EnabledQuota> authentication_quota;
    try


This function looks a bit too complicated now. It's not necessary to check the quota twice (in checkExceeded and in used); the first checkExceeded should be enough. I believe this function can be written in a less complicated way:

auto authentication_quota = getAuthenticationQuota(credentials.getUserName());
if (authentication_quota)
    checkAuthenticationQuota(*authentication_quota, credentials.getUserName()); /// throws QUOTA_EXCEEDED with user_name if exceeded

AuthResult auth_result;
try
{
    auth_result = MultipleAccessStorage::authenticate(...);
}
catch (...)
{
    tryLogCurrentException();
    if (authentication_quota)
        authentication_quota->used(... /* check_exceeded */ false); /// already checked before
    throw ErrorCodes::AUTHENTICATION_FAILED
}

if (authentication_quota)
    authentication_quota->reset(QuotaType::FAILED_SEQUENTIAL_AUTHENTICATIONS);
return auth_result;

Thanks for the suggestion!

Let's imagine the situation without checking quota->used(throw_exception = true):
We have a quota with FAILED_SEQUENTIAL_AUTHENTICATIONS = 1.
The user tries to login with the wrong password the first time:
* Call check_exceed (0 > 1 - FALSE)
* throw AUTHENTICATION_FAILED error (used_tries becomes 1)
The user tries to login with the wrong password the second time:
* Call check_exceed (1 > 1 - FALSE)
* throw AUTHENTICATION_FAILED error from throw ErrorCodes::AUTHENTICATION_FAILED (used_tries becomes 2) // <<-- This looks like the wrong behavior to me. I would expect to see the 'QUOTA_EXCEED' error here.

So the line:

if (authentication_quota) authentication_quota->used(... /* check_exceeded */ false);

It looks more correct if check_exceeded is true. Then the sequence will be:

The user tries to login with the wrong password the first time:
* Call check_exceed (0 > 1 - FALSE)
* throw AUTHENTICATION_FAILED error (used_tries becomes 1)
The user tries to login with the wrong password the second time:
* Call check_exceed (1 > 1 - FALSE)
* throw QUOTA_EXCEED from authentication_quota->used(... /* check_exceeded */ true); (used_tries becomes 2)

Though it looks a bit strange that with FAILED_SEQUENTIAL_AUTHENTICATIONS = 1 we allow two login attempts. And there is no way to allow only one login attempt.

I don't like it either, but it's a feature of quotas where the condition is violated when used > max.
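The two sequences walked through above can be reproduced with a small simulation. CounterSketch and failedAttempt are hypothetical names, not the real EnabledQuota interface; the sketch only assumes the strict used > max comparison and the "check first, then count the failure" flow discussed in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified counter; not the real EnabledQuota.
struct CounterSketch
{
    int used = 0;
    int max;

    explicit CounterSketch(int max_value) : max(max_value) { assert(max_value >= 0); }

    // Strict comparison, as in the quota code discussed in this thread.
    bool exceeded() const { return used > max; }

    // Counts one failure. Reports "exceeded" only when check_exceeded is set.
    bool addFailure(bool check_exceeded)
    {
        ++used;
        return check_exceeded && exceeded();
    }
};

// One login attempt with a wrong password: check the limit, then count the failure.
std::string failedAttempt(CounterSketch & counter, bool check_exceeded)
{
    if (counter.exceeded())
        return "QUOTA_EXCEEDED";
    if (counter.addFailure(check_exceeded))
        return "QUOTA_EXCEEDED";
    return "AUTHENTICATION_FAILED";
}
```

With max = 1 and check_exceeded = false, the first two wrong attempts both report AUTHENTICATION_FAILED and only the third reports QUOTA_EXCEEDED. With check_exceeded = true, the second wrong attempt already reports QUOTA_EXCEEDED.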

In this specific case, the condition used >= max would look better from the user's perspective. I would change the code inside EnabledQuota.cpp in these lines:

ClickHouse/src/Access/EnabledQuota.cpp

Lines 56 to 66 in dfc761c

    if (used > max)
    {
        bool counters_were_reset = false;
        auto end_of_interval = interval.getEndOfInterval(current_time, counters_were_reset);
        if (counters_were_reset)
            used = (interval.used[quota_type_i] += value);
        if (check_exceeded && (used > max))
            throwQuotaExceed(user_name, intervals.quota_name, quota_type, used, max, interval.duration, end_of_interval);
    }
}

ClickHouse/src/Access/EnabledQuota.cpp

Lines 83 to 89 in dfc761c

if (used > max)
{
    bool counters_were_reset = false;
    auto end_of_interval = interval.getEndOfInterval(current_time, counters_were_reset);
    if (!counters_were_reset)
        throwQuotaExceed(user_name, intervals.quota_name, quota_type, used, max, interval.duration, end_of_interval);
}

Instead we can use different conditions depending on QuotaType quota_type like:

static bool compareQuotaValue(QuotaValue used, QuotaValue max, QuotaType quota_type)
{
    return quota_type == QuotaType::FAILED_SEQUENTIAL_AUTHENTICATIONS ? used >= max : used > max;
}

and use compareQuotaValue(used, max, quota_type) instead of just used > max.
But I need confirmation that such a change is acceptable to ClickHouse.

Now the code works as you proposed but with authentication_quota->used(... /* check_exceeded */ true);

The default-generated exception from authentication_quota->checkExceed(...) already contains the user name:

Code: 201. DB::Exception: Received from localhost:9000. DB::Exception: Quota for user `2884_user_579116` for 3155695200s has been exceeded: failed_sequential_authentications = 2/1. Interval will end at 2069-12-31 06:00:00. Name of quota template: `2884_quota_579116`. (QUOTA_EXCEEDED)

and use compareQuotaValue(used, max, quota_type) instead of just used > max.
But I need confirmation that such a change is acceptable to ClickHouse.

I have another idea: let's just remove the call authentication_quota->used() after catch (...) and call it at the beginning instead of checkExceeded():

auto authentication_quota = getAuthenticationQuota(credentials.getUserName());
if (authentication_quota)
    authentication_quota->used(FAILED_SEQUENTIAL_AUTHENTICATIONS, 1);

try
{
    auto auth_result = MultipleAccessStorage::authenticate(...);
    if (authentication_quota)
        authentication_quota->reset(QuotaType::FAILED_SEQUENTIAL_AUTHENTICATIONS);
    return auth_result;
}
catch (...)
{
    tryLogCurrentException();
    throw ErrorCodes::AUTHENTICATION_FAILED
}

This way is not very intuitive; however, it seems it will work exactly as we want it to. And it's also shorter.

We need a comment for used() here, though, with a detailed description: that we increase the counter of authentication failures at the beginning and reset it after a successful authentication, and that we do so because if there is no quota left for a failed authentication, then we shouldn't try to authenticate at all.

The implementation variant you suggested will work, with this side effect:
authentication_quota->used will increase the failed authentication counter even if the quota was already exceeded; that looks fine to me.
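A minimal sketch of the "count up front, reset on success" flow may help to see why it behaves as intended. FailureCounterSketch, authenticateSketch and the two exception types are hypothetical stand-ins; the sketch assumes, as in the EnabledQuota.cpp lines quoted earlier, that the counter is incremented before the used > max check.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct QuotaExceeded : std::runtime_error { using std::runtime_error::runtime_error; };
struct AuthenticationFailed : std::runtime_error { using std::runtime_error::runtime_error; };

// Hypothetical, simplified counter of sequential failed authentications.
struct FailureCounterSketch
{
    unsigned used = 0;
    unsigned max = 1;

    // Increments first, then checks, so the counter keeps growing
    // even after the limit has been exceeded.
    void use()
    {
        ++used;
        if (used > max)
            throw QuotaExceeded("QUOTA_EXCEEDED");
    }

    void reset() { used = 0; }
};

// Every attempt is counted as a potential failure before the password is checked;
// a successful authentication resets the counter.
void authenticateSketch(FailureCounterSketch & counter, bool password_ok)
{
    counter.use(); // may throw QuotaExceeded without checking the password at all
    if (!password_ok)
        throw AuthenticationFailed("AUTHENTICATION_FAILED");
    counter.reset();
    assert(counter.used == 0);
}
```

With max = 1, the first wrong password yields AUTHENTICATION_FAILED, and every later attempt yields QUOTA_EXCEEDED (even with the correct password) until the interval ends or the counter is reset.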

The code is updated and now works as desired.

src/Access/AccessControl.cpp

vitlibar · 2024-01-31T15:05:04Z

Test failures:

test_storage_kafka/test.py::test_system_kafka_consumers_rebalance_mv is known - ThreadSanitizer data race in librdkafka (test_storage_kafka) #56043

Demilivor mentioned this pull request Sep 18, 2023

User account locking after given number of consecutive login failures #54450

Closed

Demilivor force-pushed the ADQM-1150 branch from 8b2e85f to caefa19 Compare September 21, 2023 01:48

Demilivor changed the title ~~[DRAFT] Added maximum sequential login failures to the quota.~~ Added maximum sequential login failures to the quota. Sep 21, 2023

alexey-milovidov self-assigned this Dec 4, 2023

alexey-milovidov marked this pull request as draft December 4, 2023 17:15

alexey-milovidov reviewed Dec 30, 2023

Demilivor force-pushed the ADQM-1150 branch from 43d6025 to e041cfa Compare January 25, 2024 02:53

Demilivor marked this pull request as ready for review January 25, 2024 07:33

vitlibar self-assigned this Jan 25, 2024

tavplubix added the can be tested Allows running workflows for external contributors label Jan 25, 2024

robot-clickhouse-ci-2 added the pr-feature Pull request with new product feature label Jan 25, 2024

vitlibar reviewed Jan 25, 2024

Demilivor force-pushed the ADQM-1150 branch from e041cfa to fc6764a Compare January 26, 2024 04:16

vitlibar reviewed Jan 29, 2024

vitlibar reviewed Jan 29, 2024

Demilivor added 2 commits January 29, 2024 23:20

Implemented failed login attempt counting using quota

559de08

minor update

b1d2c0d

Demilivor force-pushed the ADQM-1150 branch from fc6764a to b1d2c0d Compare January 30, 2024 02:31

Updated the authentication failures counter logic

076fb1d

vitlibar approved these changes Jan 31, 2024

vitlibar merged commit a193e01 into ClickHouse:master Jan 31, 2024
250 of 253 checks passed
