Throttling repeated login requests & account lock #3037
Thank you for this design proposal! It is well written! I do have some points here:
The problem with this is that you can effectively DoS accounts and lock them by trying to repeatedly log in. To avoid this, it is recommended to set a maximum wait time (e.g. 10s) which makes brute forcing impractical, but also prevents DoSing.
Which process activates the identity and how does it know whether the rate limiting has disabled it, or whether it was disabled by an administrator? I would recommend to not use the active field for this but instead use the additional table.
My recommendation would be to have a table with the following layout:
And then you would do At scale, range queries are very expensive and it's always better to try and do single selects whenever possible! |
Thank you for your feedback and suggestions!
What about a config like this?
rate_limit:
  max_duration: 30s
  backoff_factor: 2
  max_wait_time: 10s
  max_login_attempts: 5
  login_attempt_window: 1m
If we introduce a
We could use a table with a design as follows:
Option 1
CREATE TABLE lockouts (
  id UUID PRIMARY KEY,
  identity_id UUID NOT NULL REFERENCES ...,
  lock_time TIMESTAMP NOT NULL,
  reason ENUM('rate_limiting', 'admin_action') NOT NULL
);
it may be beneficial to consider creating a separate table if the relationship between the
Option 2
locked BOOLEAN,
lock_time TIMESTAMP NOT NULL,
reason ENUM('rate_limiting', 'admin_action') NOT NULL,
As for the process for unlocking the identity after the lockout, we could add an admin-only endpoint such as
Thank you for providing the table design and sample query 🙇🏼♂️ I was also considering the same concept. Could you please provide some clarification regarding the following 🙏🏼
|
I think this is great! For the first step, I think it makes sense to just choose sensible defaults. Adding more configuration options is something we'll consider if enough people want to change this (for good reason). Maybe there's a blog article from a security expert that can help us choose correct values?
I would recommend a linear/exponential backoff with a fixed maximum - so something along the lines of:
That sounds good! Keep in mind that not all databases support ENUMs.
I think it makes sense to keep the business logic around this in one table. This will help with efficiency!
Could you elaborate on this a bit more, I'm not sure in what context to put this?
That's a great idea! It would be great to keep the naming consistent. So for example use "lock/unlock" everywhere, or "rate limit", or "throttling" or something else. Do you have a preference?
This is an internal value which will help (in the future) migrate between ory self-hosted and the ory network.
I think that makes sense, I don't think we need a 1:n relation for it - 1:n relations also are range queries which are significantly slower than unique (secondary) key reads.
I'd probably use fixed values for now, to not complicate the kratos config further |
@atreya2011 from Slack 20230128 "What are Ory's ways of challenging suspicious logins for example: These features will require some advanced data analysis, perhaps ML, and should be included in this issue. |
https://developers.cloudflare.com/waf/rate-limiting-rules/best-practices/ I found 2 articles that discuss best practices for rate limiting in Cloudflare WAF and Okta. They emphasize the importance of customizing rate-limiting settings to match the specific needs of a website or API and suggest using a combination of IP and user identity to establish unique rate limits. Both articles advise on how to monitor the effectiveness of rate limiting and make adjustments as needed to prevent DDoS attacks and other malicious traffic. Unfortunately, they don't give specific advice on default values. I will keep searching...
I understand the logic you've presented. Additionally, I have come across two articles that suggest setting a lockout threshold of 10 attempts or more as best practice for account lockout in Active Directory. Based on the logic discussed, it would be advisable to set the threshold at or above 10 to prevent the risk of a malicious user causing a lockout. We could keep the default value of 0 to signify no lockout. https://www.netwrix.com/account_lockout_best_practices.html
Thank you! If I understand correctly, Kratos utilizes go-buffalo/pop as the underlying package for handling database-related operations. Therefore, it would be appropriate to design the table as a Go struct, correct?
I apologize for any confusion. I had suggested adding the lockout-related parameters to the
CREATE TABLE selfservice_rate_limits (
  id UUID PRIMARY KEY,
  nid UUID,
  identity_id UUID NOT NULL REFERENCES ...,
  attempts int NOT NULL,
  last_attempt TIMESTAMP NOT NULL,
  locked BOOLEAN,
  lock_time TIMESTAMP NOT NULL,
  reason varchar NOT NULL
)
I don't have a particular preference for this. But having an
Thank you for the clarification!
I understand, that seems perfectly reasonable. |
Thank you for bringing these points to my attention 🙏🏼 Regarding point A, we could enhance security by combining Two-factor authentication (2FA), which is already available in Kratos, with monitoring login activity from new locations and setting up alerts / requiring a one-time code input for any unusual behaviour. However, I believe that addressing this point deserves a separate discussion (issue), as it does not directly relate to rate-limiting and account lockout discussed in this design proposal 🙇🏼♂️ Regarding point B, I have submitted a design proposal in this issue to implement account lockout after a specified number of failed login attempts, as well as rate-limiting consecutive login attempts. Would you be able to provide additional details regarding point C? Is it related to creating an endpoint to counteract point A, in order for an admin to blacklist devices/IPs and prevent malicious activities? |
That sounds good to me!
That's correct! Please keep in mind that enum is not supported in all databases, so you'll probably need to implement the enum in Go code :)
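One way to read "implement the enum in Go code" is a string-backed type with a validation method, stored in a plain varchar column. The sketch below is an assumption, not Kratos code: the type and field names are hypothetical, the columns follow the `selfservice_rate_limits` table proposed above, the `db` struct tags follow the pop convention, and the UUID columns are shown as plain strings to keep the example free of external packages.

```go
package main

import (
	"fmt"
	"time"
)

// Reason records why an identity is limited. It is kept as a string so that
// no database-level ENUM is required.
type Reason string

const (
	ReasonRateLimiting Reason = "rate_limiting"
	ReasonAdminAction  Reason = "admin_action"
)

// Valid reports whether r is one of the known reasons.
func (r Reason) Valid() bool {
	switch r {
	case ReasonRateLimiting, ReasonAdminAction:
		return true
	}
	return false
}

// RateLimit mirrors the proposed selfservice_rate_limits table.
// UUID columns are plain strings here (an assumption for self-containment).
type RateLimit struct {
	ID          string    `db:"id"`
	NID         string    `db:"nid"`
	IdentityID  string    `db:"identity_id"`
	Attempts    int       `db:"attempts"`
	LastAttempt time.Time `db:"last_attempt"`
	Locked      bool      `db:"locked"`
	LockTime    time.Time `db:"lock_time"`
	Reason      Reason    `db:"reason"`
}

// Validate enforces in application code what the ENUM would have enforced
// in the database.
func (r RateLimit) Validate() error {
	if !r.Reason.Valid() {
		return fmt.Errorf("unknown reason %q", r.Reason)
	}
	return nil
}

func main() {
	ok := RateLimit{Reason: ReasonRateLimiting, LastAttempt: time.Now()}
	bad := RateLimit{Reason: Reason("typo")}
	fmt.Println(ok.Validate(), bad.Validate())
}
```

Calling `Validate` before every write keeps invalid values out of the column on databases that have no native ENUM type.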
That looks very good to me! Please don't forget to add an index over
Good question! I think we have several options. From a pure REST perspective, we try to modify resources in plural:
Alternatively we could use verbs like but that would not be fully "REST"y:
I think that "block" and "unblock" are misleading, because "blocking" a user typically involves administrative intervention (user is malicious -> block them) whereas here we throttle/limit user authentication. What do you think of the following:
and a table of
or alternatively a table with the design above and an additional field to deal with rate limits in the future for recovery or verification flows:
|
Super excited about this being implemented. Let me know if you need any help implementing it. @jossbnd and I can help if needed ✋ |
Thank you for all the feedback! Let me summarize our discussion and, if everything looks good, I'll proceed with the implementation 🙂
I really like the idea of
CREATE TABLE selfservice_rate_limits (
  id UUID PRIMARY KEY,
  nid UUID,
  identity_id UUID NOT NULL REFERENCES identities,
  attempts int NOT NULL,
  last_attempt TIMESTAMP NOT NULL,
  locked BOOLEAN,
  reason ENUM('rate_limiting', 'admin_action') NOT NULL,
  flow_type ENUM('login', 'recovery', ...)
);
CREATE INDEX idx_identity_id ON selfservice_rate_limits (identity_id);
CREATE INDEX idx_nid ON selfservice_rate_limits (nid);
I have also simplified the configuration related to rate limiting, based on your feedback. Here is the updated version:
rate_limit:
  max_duration: 30s # default
  backoff_factor: 2 # default
  max_attempts_before_lockout: 0 # default
|
@supercairos |
Awesome, that's a great summary! A few more points (but we're close!) I'm not quite sure if I understand what I thought a bit more about the table. What do you think about this:
I don't think we need the enums right now, In business logic:
My recommendation would be the following configuration layout:
This would make the semantics very clear - this rate limit config is for login :) |
Thank you for the feedback! While The purpose of We could prevent a malicious user from making too many failed login attempts this way. The following articles suggest that a value of 10 or higher would be appropriate for https://www.manageengine.com/products/active-directory-audit/kb/best-practices/account-lockout-best-practices.html#:~:text=The%20recommended%20threshold%20is%2015%20to%2050 Please let me clarify how
|
Thank you very much for the explanation around
Since that involves a lot of work, my suggestion is to keep this feature out of scope for now and focus on the pure implementation of this feature. If an administrator wants to lock out a user, they can also set the time quite high (an hour or more) which effectively achieves the same thing. WDYT? |
I understood points 1 and 3.
We can use the courier in kratos to automatically send an email to the administrator regarding the lockout.
If we add an additional
Can you confirm if my understanding regarding point 2 is correct?
Yea this is also possible. While manual tracking by an administrator is possible, implementing a
Thank you for your input! Once I have completed the implementation of the rate-limiting feature, I will take your suggestions into consideration and prepare a revised proposal for implementing the account lockout feature, including all three points you mentioned (if that is okay with you 🙂). Shall I go ahead with the implementation of the rate-limiting feature? |
Epic, I think we're fully aligned. Thank you for your patience in this process - it definitely helps align on the vision for the feature and will make implementation much easier! Let's go! :) |
Awesome! Likewise, I thoroughly enjoyed the brainstorming process and am grateful for your feedback and approval 🙏🏼 |
Hi, I'm kinda interested in the feature, do you guys have any ETA for shipping it? |
@Dparty |
Awesome you are working on this @atreya2011 - I think it'll be a very valuable feature, and the org I work for also needs it. I'm not that experienced with go but if I can help with implementing or testing anything, let me know. |
@credcore-dave |
Hello all. |
@Robert-Bosse I think this issue might be what you're looking for? |
@Robert-Bosse Happy to continue this in a dedicated discussion, to not derail this issue further |
What is the status of the implementation for this? I can't find any PR or work in progress. Would love this feature and would happily contribute if assistance is needed. |
@Oscmage Work is progressing on this slowly. My apologies for the wait 🙇♂️ |
@atreya2011 No worries at all! Sorry if it came out like that. Happy to wait/assist :) Was just curious since this was one of the features that we were comparing our current Firebase/Google Identity Platform solution with. |
Similar situation here for us. Looking forward to this functionality. We've been trying to solve this problem for a couple of months but there hasn't been an elegant solution externally from Kratos. Is there anything that I could do to provide assistance, we've got a few developers that would be happy to contribute :) |
@SeanTasker |
No problem at all! And apologies for my slow response. Our internal issue tracker was down and I wanted to check a couple of things as we originally discussed a number of options for our login page. In the end we settled on using this template https://github.com/timalanfarrow/kratos-selfservice-ui-vue3-typescript to build our own login and account creation pages. So originally we tried to solve this with a captcha, however it turned out to be insufficient as we needed to store some state server side. We haven't forked Kratos (yet) - that's why I am here, to determine if forking will get us closer to a solution sooner. We were thinking about adding some additional hooks that would let us implement additional login restriction checks. Let me know if we can help. We're not super familiar with the Kratos code base right now so would just need some direction to help us know where to focus our attention. |
@atreya2011 any update on this functionality? Thanks! |
I think this should be prioritized. We have had problems with this in the past few days. This is so necessary. We are thinking of moving away from Kratos for this reason. We are very scared. |
@bradleyball @frederikhors |
@atreya2011 I think this should be done by Ory Kratos staff. It is not a nice-to-have feature. This is a security MUST. |
@frederikhors |
No, I didn't explain myself correctly. I mean it's urgent to do. Because without it there is a BIG RISK of someone trying to log in many times until they find the correct password. |
@frederikhors Yes. I second that. It is something that is urgently needed. |
This is a top priority for the Ory team and we're prioritizing protections. I'll update this issue with any upcoming changes. |
@kmherrmann any ETA? Even if not exact? |
It would be great to hear back on this. |
@kmherrmann any news for our top priority project here? |
Hello, we have decided not to work on this as part of Ory Kratos. Rate limiting, credentials stuffing, IP rate limiting across multiple nodes, and DoS prevention are very difficult problems to solve (typically cat and mouse type problems) and it makes much more sense to solve them on an operational level with things like Gateway Ratelimiters, JA3 Fingerprinting, Anti Bot detection, API firewalls or services like Cloudflare or Akamai. We have solved these things in Ory Network using a variety of tools, but that also implies that we are not implementing this in Ory Kratos itself. Sorry for anyone who was waiting for this here! Thank you for your understanding. |
Thank you for the clarification! Does this mean any PR related to this feature won’t be accepted? Or does it mean that ORY won’t officially work on this feature? If ORY won’t officially work on this feature but is willing to accept a PR based on the design proposal above, I can pick up my implementation from where I left it off. |
Wow. This is huge. I think we should at least add guides on docs for these serious problems. If not, using Kratos or others is the same at this point. |
We would certainly accept PRs if the code changes are not too huge and do not require significant maintenance on our end. They should also be scoped to tackle one problem only (eg account locking on repeatedly failed attempts). Anything that is IP related should be out of scope (such as credentials stuffing) in my view as that should be dealt with on a rate limiter service. |
Thank you for the swift response. That’s good to hear. However, the original design proposal was for the rate limiting feature based on identity ID (not IP address) and then implement account lock in another PR. Is that still okay? |
@aeneasr How do we solve credential stuffing by a bot net where no IP is used more than a handful of times? |
What are you using right now? I mean, Cloudflare? What else? |
no cloudflare. custom login solution. right now we check ratio of good to bad logins for activating a captcha if the ratio gets too low. |
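The ratio check described in this comment could look roughly like the following Go sketch. The commenter did not share their implementation, so the type, the threshold, and the minimum sample size are all hypothetical.

```go
package main

import "fmt"

// LoginStats counts successful and failed logins over some recent window.
// How the window is maintained is outside the scope of this sketch.
type LoginStats struct {
	Good int
	Bad  int
}

// NeedsCaptcha reports whether the good-to-bad ratio has dropped below
// minRatio. With fewer than minSamples logins in total, the ratio is
// considered too noisy and no captcha is requested.
func (s LoginStats) NeedsCaptcha(minRatio float64, minSamples int) bool {
	total := s.Good + s.Bad
	if total < minSamples || s.Bad == 0 {
		return false
	}
	return float64(s.Good)/float64(s.Bad) < minRatio
}

func main() {
	normal := LoginStats{Good: 90, Bad: 10}
	attack := LoginStats{Good: 5, Bad: 50}
	// Assumed policy: require a captcha when there are fewer good logins
	// than bad ones, once at least 20 logins have been observed.
	fmt.Println(normal.NeedsCaptcha(1.0, 20), attack.NeedsCaptcha(1.0, 20))
}
```

Since the check does not depend on the source IP, it can still trigger when the failed attempts are spread across many addresses.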
@aeneasr |
Preflight checklist
Context and scope
Goals and non-goals
The design
Based on the discussion here: #654
Example rate-limit config
Example lockout config
APIs
No response
Data storage
Create a new table to store the login history of a specific identity. The schema is as follows:
One identity shall have many login history records
One login history record belongs to one identity
Code and pseudo-code
When an identity attempts to log in:
How to track consecutive login attempts:
Degree of constraint
No response
Alternatives considered
No response