Added maximum sequential login failures to the quota. #54737

Merged
vitlibar merged 3 commits into ClickHouse:master on Jan 31, 2024

Conversation

Contributor

@Demilivor Demilivor commented Sep 18, 2023

Added maximum sequential login failures to the quota.
Related to #54450

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Added maximum sequential login failures to the quota.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Comment on lines 542 to 552
auto new_current_roles = user->granted_roles.findGranted(user->default_roles);
const auto & access = Context::getGlobalContextInstance()->getAccessControl();
auto roles_info = access.getEnabledRolesInfo(new_current_roles, {});
const std::string custom_quota_key = ""; // TODO: Where do we get it?

assert(assert);
auto enabled_quota = access.getEnabledQuota(*id,
credentials.getUserName(),
roles_info->enabled_roles,
address,
forwarded_address,
Contributor Author

How can I get a quota key from the TCP client here?

At the moment of the authentication, we have not received client_key (quota_key) from the client.

Contributor Author

I created a function that gets the authentication quota using a constant quota key:

    std::shared_ptr<const EnabledQuota> getAuthenticationQuota(UUID user_id,
                                                               const UserPtr & user,
                                                               const Poco::Net::IPAddress & address,
                                                               const std::string & forwarded_address)
    {
        /// During authentication, client_key has not yet been received from the TCP client.
        /// Use a predefined authentication quota key so that we always get the same Interval object
        /// and avoid throwing exceptions for the QuotaKeyType::CLIENT_KEY key type.
        constexpr auto AUTHENTICATION_QUOTA_KEY = "_AUTHENTICATION_QUOTA_KEY_";
        auto new_current_roles = user->granted_roles.findGranted(user->default_roles);
        const auto & access = Context::getGlobalContextInstance()->getAccessControl();
        auto roles_info = access.getEnabledRolesInfo(new_current_roles, {});

        auto enabled_quota = access.getEnabledQuota(user_id,
                                                    user->getName(),
                                                    roles_info->enabled_roles,
                                                    address,
                                                    forwarded_address,
                                                    AUTHENTICATION_QUOTA_KEY);

        return enabled_quota;
    }

This solution ignores any quota key received later over the TCP connection. It works with QuotaKeyType::CLIENT_KEY.

Member

_AUTHENTICATION_QUOTA_KEY_ looks too scary; we shouldn't have it in the code...

Contributor Author

Good day, Alexey

I updated the code. It no longer has conflicts and no longer uses _AUTHENTICATION_QUOTA_KEY_.
However, we still have to provide a quota_key to AccessControl::getEnabledQuota. Over a TCP connection, quota_key (i.e. client_key) is sent after authentication, but the failed login attempt count must be checked before authentication finishes, so we can't get the quota key from the user.

Without anything passed to the quota, we will get an exception in case of QuotaKeyType::CLIENT_KEY from this code:

String QuotaCache::QuotaInfo::calculateKey(const EnabledQuota & enabled) const
{
...
        throw Exception(
            ErrorCodes::QUOTA_REQUIRES_CLIENT_KEY,
            "Quota {} (for user {}) requires a client supplied key.",
            quota->getName(),
            params.user_name);
...
}

As a workaround, I added a parameter throw_if_client_key_empty for the authentication quota case.

String QuotaCache::QuotaInfo::calculateKey(const EnabledQuota & enabled, bool throw_if_client_key_empty) const
{
...
   if (throw_if_client_key_empty)
        throw Exception(
            ErrorCodes::QUOTA_REQUIRES_CLIENT_KEY,
            "Quota {} (for user {}) requires a client supplied key.",
            quota->getName(),
            params.user_name);
...
}

The issue is solved, but I'm not sure that this solution is good enough.

Please advise me if you have a better idea of how to solve this.
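
The workaround described above can be illustrated with a minimal, self-contained sketch. It is not the ClickHouse code: `KeyType`, `calculateKeySketch`, and the use of `std::runtime_error` are hypothetical stand-ins, and only the parameter name `throw_if_client_key_empty` and the error text come from the comment above. What the real function returns when the flag is false is not shown in this thread; the sketch assumes an empty key.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for QuotaKeyType; only two cases are modelled.
enum class KeyType { UserName, ClientKey };

// Toy version of the key calculation.
// Assumption: with the flag off and no client key, an empty key is returned.
std::string calculateKeySketch(KeyType type,
                               const std::string & user_name,
                               const std::string & client_key,
                               bool throw_if_client_key_empty = true)
{
    if (type == KeyType::UserName)
        return user_name;

    if (!client_key.empty())
        return client_key;

    if (throw_if_client_key_empty)
        throw std::runtime_error("Quota requires a client supplied key.");

    return {}; // authentication path: no client key has arrived yet
}
```

With the flag left at its default, a missing client key still raises the error, so only the authentication path (which passes `false`) changes behavior.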

@Demilivor Demilivor changed the title [DRAFT] Added maximum sequential login failures to the quota. Added maximum sequential login failures to the quota. Sep 21, 2023
@Demilivor
Contributor Author

Good day everyone! Please assign a reviewer to this pull request and add the "can be tested" label.

Please also keep in mind this consequence of the changes:
The quota key used for authentication is hardcoded, because the TCP client has not yet sent client_key when authentication starts.

@alexey-milovidov alexey-milovidov self-assigned this Dec 4, 2023
@alexey-milovidov alexey-milovidov marked this pull request as draft December 4, 2023 17:15
@alexey-milovidov
Member

I reviewed it and found the code somewhat scary...

Comment on lines 55 to 57
constexpr auto AUTHENTICATION_QUOTA_KEY = "_AUTHENTICATION_QUOTA_KEY_";
const auto new_current_roles = user->granted_roles.findGranted(user->default_roles);
const auto & access = Context::getGlobalContextInstance()->getAccessControl();
Member

Scary in this line.

Contributor

@alexey-milovidov I discussed it with @Demilivor: line 57 is really scary. It should be refactored.
It seems that many parts of the quota-related code have changed during the past 3 months. We probably need more guidance from the core ClickHouse, Inc developers on implementing the "maximum sequential login failures" feature.

Right now @Demilivor is on another project, but he will return to this PR after we discuss it with our management. The task is still relevant for our company.

Contributor Author

I moved this code to another place, and now my code does not call Context::getGlobalContextInstance()->getAccessControl()

@Demilivor Demilivor marked this pull request as ready for review January 25, 2024 07:33
@rvasin
Contributor

rvasin commented Jan 25, 2024

@rschu1ze and @tavplubix, could you set the "can be tested" label for this PR? And could you please assign yourselves as co-reviewers?

The PR was created in September 2023, and we have resumed work on it.

I see in the git history that @pufit, @kitaisreal, @yakov-olkhovskiy and @vitlibar also worked on AccessControl/quotas. We need some guidance to finally implement this feature.

We can do a complete refactoring or implement a new approach from scratch if you suggest so. For example, we are considering a new system table, system.user_status, with the MergeTree engine to track unsuccessful login attempts (maybe with a TTL to handle expiration). This approach would solve the persistence problem. Alternatively, we could use a map in the context to track them (this does not solve persistence, but that's OK for us too). Any ideas/suggestions are welcome.
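
To make the second, non-persistent alternative concrete, here is a rough sketch of an in-memory map that tracks consecutive failures per user name. The class `FailedLoginTracker` and its methods are hypothetical names invented for illustration; nothing like it is claimed to exist in ClickHouse, and all state is lost on restart.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical in-memory tracker; state is lost on server restart.
class FailedLoginTracker
{
public:
    // Records one failed attempt and returns the new consecutive count.
    std::size_t recordFailure(const std::string & user_name)
    {
        std::lock_guard<std::mutex> lock(mutex);
        return ++failures[user_name];
    }

    // A successful login clears the streak.
    void recordSuccess(const std::string & user_name)
    {
        std::lock_guard<std::mutex> lock(mutex);
        failures.erase(user_name);
    }

    // Returns 0 for users with no recorded failures.
    std::size_t consecutiveFailures(const std::string & user_name) const
    {
        std::lock_guard<std::mutex> lock(mutex);
        auto it = failures.find(user_name);
        return it == failures.end() ? 0 : it->second;
    }

private:
    mutable std::mutex mutex;
    std::unordered_map<std::string, std::size_t> failures;
};
```

A caller would compare the value returned by `recordFailure` against a configured limit; expiration of old entries is deliberately left out of this sketch.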

@vitlibar vitlibar self-assigned this Jan 25, 2024
@tavplubix tavplubix added the can be tested Allows running workflows for external contributors label Jan 25, 2024
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-feature Pull request with new product feature label Jan 25, 2024
@robot-clickhouse-ci-2
Contributor

robot-clickhouse-ci-2 commented Jan 25, 2024

This is an automated comment for commit 076fb1d with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Successful checks
Check name | Description | Status
AST fuzzer | Runs randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help | ✅ success
ClickBench | Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with instant-attach table | ✅ success
ClickHouse build check | Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often has enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log, grepping for cmake. Use these options and follow the general build process | ✅ success
Compatibility check | Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help | ✅ success
Docker server and keeper images | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Docs check | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Fast tests | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Flaky tests | Checks if new added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integrational tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc | ✅ success
Install packages | Checks that the built packages are installable in a clear environment | ✅ success
Mergeable Check | Checks if all other necessary checks are successful | ✅ success
Performance Comparison | Measure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests | ✅ success
SQLancer | Fuzzing tests that detect logical bugs with SQLancer tool | ✅ success
Sqllogic | Run clickhouse on the sqllogic test set against sqlite and checks that all statements are passed | ✅ success
Stateful tests | Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc | ✅ success
Stateless tests | Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc | ✅ success
Stress test | Runs stateless functional tests concurrently from several clients to detect concurrency-related errors | ✅ success
Style check | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Unit tests | Runs the unit tests for different release types | ✅ success
Upgrade check | Runs stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts | ✅ success

Check name | Description | Status
CI running | A meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR | ⏳ pending
Integration tests | The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests | ❌ failure

tests/queries/0_stateless/02884_authentication_quota.sh (outdated, resolved)
src/Access/EnabledQuota.cpp (outdated, resolved)
{
try
{
return MultipleAccessStorage::authenticate(credentials, address, *external_authenticators, allow_no_password,
allow_plaintext_password);
const auto auth_result = MultipleAccessStorage::authenticate(credentials, address, *external_authenticators, allow_no_password,
Member

@vitlibar vitlibar Jan 25, 2024

It's not quite logically correct to always try to authenticate even if the quota is already exceeded. Also IAccessStorage::authenticate() can do some complex stuff, like connecting to another server. I think it's better to check the quota first, then call MultipleAccessStorage::authenticate(), and then reset the quota counter if everything is ok.

Contributor Author

@Demilivor Demilivor Jan 25, 2024

I would agree with you, but how do we get the quota without user_id? Do you know a way to get the user_id and quota for an authenticated user?

Currently, AuthResult returns user_id even when the user exists and user_id has been read but authentication has failed (the password is incorrect).

Member

Do you know a way to get the user_id?

auto user_id = find<User>(credentials.getUserName());

Contributor Author

Thanks. If that also works for users authenticated using LDAP, then we can avoid modifying AuthResult. I will try to implement it that way.

Member

If that will also work with users authenticated using LDAP

Well, LDAPAccessStorage::find<User>(credentials.getUserName()) returns nullopt if that user has not logged in since the server started. So we cannot check the quota this time. But since it's the first time, the quota cannot already be exceeded. And when the same user logs in again later, find<User>(credentials.getUserName()) will return a valid user_id, so the quota can be checked.

Member

It seems that the quota won't work for a user who has never logged in successfully using LDAP: with only failed attempts, LDAPAccessStorage::authenticate() will never assign a user_id to that user, so we can't check the quota.
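
The LDAP limitation described in the two comments above can be sketched with a toy model. `LdapStorageSketch` and `canCheckQuota` are hypothetical names for illustration only; the sketch assumes, as stated above, that a user id exists only after that user's first successful login.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical model of a storage that learns about a user only after
// that user's first successful login (as described for LDAP above).
class LdapStorageSketch
{
public:
    // Mirrors the described find<User>(): empty until a login has succeeded.
    std::optional<int> find(const std::string & user_name) const
    {
        auto it = ids.find(user_name);
        if (it == ids.end())
            return std::nullopt;
        return it->second;
    }

    // Toy authentication: a user id is assigned only on success.
    bool authenticate(const std::string & user_name, bool password_ok)
    {
        if (!password_ok)
            return false;
        ids.emplace(user_name, next_id++);
        return true;
    }

private:
    std::unordered_map<std::string, int> ids;
    int next_id = 1;
};

// The quota can be looked up only when the user id is known.
bool canCheckQuota(const LdapStorageSketch & storage, const std::string & user_name)
{
    return storage.find(user_name).has_value();
}
```

In this model, any number of failed attempts leaves `find` empty, which is why a user who has never logged in successfully is not covered by the quota.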

Contributor Author

I changed the code according to the suggestion to use find<User>, and I added a comment about the LDAP-specific case.

@rvasin
Contributor

rvasin commented Jan 29, 2024

@vitlibar Do you have more suggestions for this PR? I see the performance degradation. If needed, we can add a new option to config.xml or users.xml (a profile-based option?):

<enable_failed_sequential_authentications>false</enable_failed_sequential_authentications>

It would be false by default, so it would not affect performance for existing customers who do not need this feature.

I also discussed this with management: partial LDAP support is OK for us. In the future, other approaches (with corresponding new options to enable them) could be implemented in addition to this one.

{
std::shared_ptr<const EnabledQuota> authentication_quota;
try
Member

This function looks a bit too complicated now. It's not necessary to check the quota twice (in checkExceeded and in used); the first checkExceeded should be enough. I believe this function can be written in a simpler way:

auto authentication_quota = getAuthenticationQuota(credentials.getUserName());
if (authentication_quota)
    checkAuthenticationQuota(*authentication_quota, credentials.getUserName()); /// throws QUOTA_EXCEEDED with user_name if exceeded

AuthResult auth_result;
try
{
    auth_result = MultipleAccessStorage::authenticate(...);
}
catch (...)
{
    tryLogCurrentException();
    if (authentication_quota)
        authentication_quota->used(... /* check_exceeded */ false); /// already checked before
    throw ErrorCodes::AUTHENTICATION_FAILED
}

if (authentication_quota)
    authentication_quota->reset(QuotaType::FAILED_SEQUENTIAL_AUTHENTICATIONS);

return auth_result;

Contributor Author

Thanks for the suggestion!

Let's imagine the situation where used() is called without the check (check_exceeded = false):
We have a quota with FAILED_SEQUENTIAL_AUTHENTICATIONS = 1.
The user tries to log in with the wrong password the first time:
      * checkExceeded is called (0 > 1 is false)
      * AUTHENTICATION_FAILED is thrown (used_tries becomes 1)
The user tries to log in with the wrong password the second time:
      * checkExceeded is called (1 > 1 is false)
      * AUTHENTICATION_FAILED is thrown again from throw ErrorCodes::AUTHENTICATION_FAILED (used_tries becomes 2) // <<-- This looks like the wrong behavior to me. I would expect the QUOTA_EXCEEDED error here.
So the line:

  if (authentication_quota)
        authentication_quota->used(... /* check_exceeded */ false);

looks more correct if check_exceeded is true. Then the sequence will be:

The user tries to log in with the wrong password the first time:
      * checkExceeded is called (0 > 1 is false)
      * AUTHENTICATION_FAILED is thrown (used_tries becomes 1)
The user tries to log in with the wrong password the second time:
      * checkExceeded is called (1 > 1 is false)
      * QUOTA_EXCEEDED is thrown from authentication_quota->used(... /* check_exceeded */ true); (used_tries becomes 2)
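
The two sequences above can be replayed with a small simulation. This is only a sketch under stated assumptions: `QuotaSketch`, `Outcome`, and `attemptLogin` are hypothetical names, and the only rule taken from the thread is that a quota counts as exceeded when used > max.

```cpp
#include <cassert>

enum class Outcome { Ok, AuthFailed, QuotaExceeded };

// Toy counter using the same "used > max" rule quoted in this thread.
struct QuotaSketch
{
    unsigned max = 1;
    unsigned used = 0;

    bool exceeded() const { return used > max; }
};

// One login attempt in the "check, authenticate, then count the failure" order.
// check_in_used models the check_exceeded argument of used().
Outcome attemptLogin(QuotaSketch & quota, bool password_ok, bool check_in_used)
{
    if (quota.exceeded())
        return Outcome::QuotaExceeded;   // up-front checkExceeded

    if (password_ok)
    {
        quota.used = 0;                  // success resets the streak
        return Outcome::Ok;
    }

    ++quota.used;                        // count the failed attempt
    if (check_in_used && quota.exceeded())
        return Outcome::QuotaExceeded;

    return Outcome::AuthFailed;
}
```

With max = 1 and `check_in_used` set to false, two wrong passwords both yield `AuthFailed` and only a third attempt is rejected by the quota; with it set to true, the second wrong password already yields `QuotaExceeded`.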
      

Member

Though it looks a bit strange that with FAILED_SEQUENTIAL_AUTHENTICATIONS = 1 we allow two login attempts. And there is no way to allow only one login attempt.

Contributor Author

I don't like it either, but that's how quotas work: the condition is violated when used > max.

In this specific case, the condition used >= max would look better from the user's perspective. I would change the code in EnabledQuota.cpp at these lines:

    if (used > max)
    {
        bool counters_were_reset = false;
        auto end_of_interval = interval.getEndOfInterval(current_time, counters_were_reset);
        if (counters_were_reset)
            used = (interval.used[quota_type_i] += value);
        if (check_exceeded && (used > max))
            throwQuotaExceed(user_name, intervals.quota_name, quota_type, used, max, interval.duration, end_of_interval);
    }
}

if (used > max)
{
    bool counters_were_reset = false;
    auto end_of_interval = interval.getEndOfInterval(current_time, counters_were_reset);
    if (!counters_were_reset)
        throwQuotaExceed(user_name, intervals.quota_name, quota_type, used, max, interval.duration, end_of_interval);
}

Instead, we could use different conditions depending on the QuotaType quota_type, for example:

static bool compareQuotaValue(QuotaValue used, QuotaValue max, QuotaType quota_type)
{
    return quota_type == QuotaType::FAILED_SEQUENTIAL_AUTHENTICATIONS ? used >= max : used > max;
}

and use compareQuotaValue(used, max, quota_type) instead of just used > max.
But I need confirmation that such a change is acceptable to ClickHouse.

Contributor Author

Now the code works as you proposed, but with authentication_quota->used(... /* check_exceeded */ true);

The exception generated by default from authentication_quota->checkExceeded(...) already contains the user name:

Code: 201. DB::Exception: Received from localhost:9000. DB::Exception: Quota for user `2884_user_579116` for 3155695200s has been exceeded: failed_sequential_authentications = 2/1. Interval will end at 2069-12-31 06:00:00. Name of quota template: `2884_quota_579116`. (QUOTA_EXCEEDED)

Member

@vitlibar vitlibar Jan 30, 2024

and use CompareQuotaValue(used, max, quota_type) instead of just used > max.
But I need the confirmation that making such change satisfies ClickHouse.

I have another idea: let's just remove the call authentication_quota->used() after catch (...) and call it at the beginning instead of checkExceeded():

auto authentication_quota = getAuthenticationQuota(credentials.getUserName());
if (authentication_quota)
    authentication_quota->used(FAILED_SEQUENTIAL_AUTHENTICATIONS, 1);

try
{
    auto auth_result = MultipleAccessStorage::authenticate(...);

    if (authentication_quota)
        authentication_quota->reset(QuotaType::FAILED_SEQUENTIAL_AUTHENTICATIONS);

    return auth_result;
}
catch (...)
{
    tryLogCurrentException();
    throw ErrorCodes::AUTHENTICATION_FAILED
}

This approach is not very intuitive; however, it seems it will work exactly as we want. And it's also shorter.

We need a comment with a detailed description for used() here, though. It should explain that we increase the counter of authentication failures at the beginning and reset it after a successful authentication, and that we do so because if there is no quota left for a failed authentication, we shouldn't try to authenticate at all.
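
As a rough illustration of this "count first, reset on success" order, here is a toy simulation. `AuthStateSketch`, `LoginResult`, and `login` are hypothetical names; the sketch assumes the same used > max rule discussed earlier and models the authentication backend as a simple boolean.

```cpp
#include <cassert>

enum class LoginResult { Ok, AuthFailed, QuotaExceeded };

// Toy state: a failure counter plus a count of calls to the (possibly expensive) backend.
struct AuthStateSketch
{
    unsigned max_failures = 1;
    unsigned used = 0;
    unsigned backend_calls = 0;
};

// Order suggested above: count the attempt first, authenticate, reset on success.
LoginResult login(AuthStateSketch & state, bool password_ok)
{
    ++state.used;                          // models used(FAILED_SEQUENTIAL_AUTHENTICATIONS, 1)
    if (state.used > state.max_failures)
        return LoginResult::QuotaExceeded; // the backend is never contacted

    ++state.backend_calls;                 // models MultipleAccessStorage::authenticate(...)
    if (!password_ok)
        return LoginResult::AuthFailed;

    state.used = 0;                        // models reset(...) after success
    return LoginResult::Ok;
}
```

In this model a limit of 1 allows exactly one failed attempt before the quota error, and once the limit is hit even a correct password is rejected without calling the backend. The counter also keeps growing on rejected attempts.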

Contributor Author

@Demilivor Demilivor Jan 30, 2024

The implementation variant you suggested will work, with this side effect:
authentication_quota->used will increase the failed authentication counter even if the quota was already exceeded; that looks fine to me.

Contributor Author

The code is updated.
Now it works as desired.

@vitlibar vitlibar merged commit a193e01 into ClickHouse:master Jan 31, 2024
250 of 253 checks passed
Labels
can be tested Allows running workflows for external contributors pr-feature Pull request with new product feature

6 participants