Proposal: HTML passwordrules attribute #3518

Open
dbates-wk opened this Issue Mar 1, 2018 · 40 comments

Comments

@dbates-wk

dbates-wk commented Mar 1, 2018

HTML passwordrules attribute

Motivation

Some user agents offer to generate random per-site passwords on behalf of the user. Safari has built-in support for this, and add-on password managers such as 1Password add this functionality. This feature improves user security by guaranteeing high-entropy passwords and avoiding reuse of the same password on multiple sites.

One challenge with this approach is that sites have different rules for valid passwords. Many sites require characters from specific sets to be present, or have other constraints. The best known solution is to have a generator rule that matches the password requirements of many sites, plus a curated list of per-site quirks for sites with unusual requirements.

A better solution would be for the website to express its password requirements in machine-readable form, and in a format that is suited for use with a generation algorithm. While the pattern attribute allows expressing many value constraints, it's very hard to use it to drive a generator. It's also tricky to express many popular password constraints (such as a limit on the number of consecutive repeated characters) in a regexp.

Proposed Solution

We propose a new content attribute on the HTML input element called passwordrules and define a mini syntax for web authors to use to express their requirements (rules). We describe how a user agent will make use of these rules and the minimum requirements for the user agent to honor these rules below.

Extensions to HTML

We propose the following new content attribute be added to the HTML input element:

	passwordrules

Using the passwordrules attribute

The passwordrules attribute, when specified, describes the set of extra restrictions on the value of the element's value attribute that a user agent must consider when generating a password and performing client-side form validation. Its value is a semicolon delimited string of one or more property/value pairs and has the form:

required: (<identifier> | <character-class>), ..., (<identifier> | <character-class>); allowed: (<identifier> | <character-class>), ..., (<identifier> | <character-class>); max-consecutive: <non-negative-integer>

An <identifier> must case-insensitively match one of the following strings: upper, lower, digit, special, ascii-printable, and unicode. These identifiers correspond to the set of ASCII uppercase letters (A-Z), lowercase letters (a-z), digits (0-9), all other ASCII printable characters - including the space character - (-~!@#$%^&*_+=`|(){}[:;"'<>,.? ]), all ASCII printable characters, and all Unicode characters, respectively.

A <character-class> is a custom character class.

A <non-negative-integer> is a valid non-negative integer.

The missing value default for passwordrules is allowed: ascii-printable. There is no invalid value default.

The values of multiple allowed properties are concatenated together, and multiple max-consecutive properties behave as if a single max-consecutive property was specified whose value is the minimum of all max-consecutive properties. Duplicate property values are ignored. (Struck through in the updated proposal: Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes.) Empty character classes are ignored. Properties without a value are ignored. The following examples illustrate the aforementioned equivalences:

(Struck through in the updated proposal) required: upper; required: lower <=> required: upper, lower
allowed: upper; allowed: lower <=> allowed: upper, lower
max-consecutive: 4; max-consecutive: 2 <=> max-consecutive: 2
required: upper, lower, upper <=> required: upper, lower
required: [abc], [def] <=> required: [abcdef]
allowed: upper, [] <=> allowed: upper
required: ; allowed: upper <=> allowed: upper

NOTE: The expression required: upper; required: lower is NOT equivalent to required: upper, lower. See Requiring that a password contain certain characters.
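
As a rough illustration of the syntax described above, here is a minimal parsing sketch. The function names `split_outside_brackets` and `parse_password_rules` and the shape of the returned dictionary are assumptions made for this example, not part of the proposal. The sketch keeps each required property as its own group, concatenates allowed properties, takes the minimum of all max-consecutive values, and skips unknown identifiers (the proposal does not say how those should be handled). It expects the attribute value after HTML unescaping, so `&quot;` has already become `"`.

```python
IDENTIFIERS = {"upper", "lower", "digit", "special", "ascii-printable", "unicode"}


def split_outside_brackets(text, sep):
    """Split text on sep, ignoring separators inside [...] character classes."""
    parts, current, i, in_class = [], [], 0, False
    while i < len(text):
        ch = text[i]
        if not in_class and ch == "[":
            in_class = True
        elif in_class and ch == "]":
            # A ']' directly followed by another ']' is a literal;
            # the second one closes the class.
            if i + 1 < len(text) and text[i + 1] == "]":
                current.append(ch)
                i += 1
                ch = text[i]
            in_class = False
        elif not in_class and ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def parse_password_rules(value):
    """Return {'required': [[...], ...], 'allowed': [...], 'max-consecutive': int or None}."""
    rules = {"required": [], "allowed": [], "max-consecutive": None}
    for prop in split_outside_brackets(value, ";"):
        name, _, raw = prop.partition(":")
        name, raw = name.strip().lower(), raw.strip()
        if not raw:
            continue  # properties without a value are ignored
        if name == "max-consecutive" and raw.isascii() and raw.isdigit():
            current = rules["max-consecutive"]
            rules["max-consecutive"] = int(raw) if current is None else min(current, int(raw))
        elif name in ("required", "allowed"):
            tokens = []
            for token in split_outside_brackets(raw, ","):
                token = token.strip()
                if not token.startswith("["):
                    token = token.lower()  # identifiers match case-insensitively
                    if token not in IDENTIFIERS:
                        continue  # assumption: unknown identifiers are skipped
                if token != "[]" and token not in tokens:
                    tokens.append(token)  # empty classes and duplicates are ignored
            if name == "required" and tokens:
                rules["required"].append(tokens)  # one group per required property
            elif name == "allowed":
                rules["allowed"] += [t for t in tokens if t not in rules["allowed"]]
    return rules
```

Custom character classes are kept as raw tokens here; expanding them into sets of characters is a separate step.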

If you do not specify the max-consecutive property then it defaults to being unbounded. That is, the user agent can generate a password with one or more arbitrary length runs of the same character (e.g. ooops).

If you specify the required property and do not specify the allowed property then the user agent will infer the value of the allowed property according to the rules in How a user agent determines the allowed characters.

For example, to require a password have at least 8 characters consisting of a mix of uppercase and lowercase letters, at least one number, and at most two consecutive identical characters, add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; max-consecutive: 2">

To require at least one digit or one of -().&@?'#,/"+ (not necessarily both), add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">

Or to require at least one of -().&@?'#,/"+, add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; required: [-().&@?'#,/&quot;+]; max-consecutive: 2">

Alternatively, to optionally allow one of -().&@?'#,/"+, add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; allowed: [-().&@?'#,/&quot;+]; max-consecutive: 2">

Another example, to allow a password to contain an arbitrary mix of letters, numbers, and -().&@?'#,/"+, add this to your markup:

<input type="password" minlength="8" passwordrules="allowed: upper, lower, digit, [-().&@?'#,/&quot;+]">

WARNING: With the exception of the NOTE below, each property/value pair reduces the entropy of a user agent generated password and makes the password more likely to be guessed or brute-forced. The more characters that are required the more likely the user agent generated password can be guessed or brute-forced.

NOTE: Setting the passwordrules attribute to allowed: unicode provides the most entropy for a user agent generated password. Omitting the passwordrules attribute or setting it to the empty string provides the second most entropy for a user agent generated password.

Custom character classes

A custom character class is a list of ASCII characters that are surrounded by square brackets (e.g. [abc]). Any characters in the set that are not ASCII printable characters are ignored. The dash character (-) is reserved as a special character. To list '-' as a literal character it must appear immediately after the opening square bracket '['. The right square bracket (]) is also reserved as a special character. To list ']' as a literal character it must appear immediately before the closing square bracket ']'.
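
A minimal sketch of how these rules could be turned into a set of characters. The function name `parse_custom_class` is hypothetical. The proposal reserves '-' and ']' but does not say what they mean in other positions, so this sketch simply skips them there; that behavior is an assumption.

```python
def parse_custom_class(token):
    """Expand a custom character class such as '[abc]' into a set of characters."""
    if len(token) < 2 or token[0] != "[" or token[-1] != "]":
        raise ValueError("not a custom character class: %r" % token)
    body = token[1:-1]
    chars = set()
    for i, ch in enumerate(body):
        if not (" " <= ch <= "~"):
            continue  # characters outside printable ASCII are ignored
        if ch == "-" and i != 0:
            continue  # assumption: a reserved '-' not right after '[' is skipped
        if ch == "]" and i != len(body) - 1:
            continue  # assumption: a reserved ']' not right before the closing ']' is skipped
        chars.add(ch)
    return chars


print(sorted(parse_custom_class("[*]]")))
```

The call at the end expands the class `[*]]`, which contains a literal ']' in the only position where the proposal allows one.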

Specifying the characters allowed to be in a password

The value of the allowed property is a comma-separated list of character class identifiers or custom character classes, or both. Each custom character class represents a set of characters that are allowed to be in the generated password. For example, if the allowed property is set to [*]] then the generated password is allowed to contain ']' and '*', but it is not allowed to contain '[' among other non-listed characters. If the allowed property is set to digit, [@!] then the generated password is allowed to contain one or more ASCII digits, one or more '@'s and one or more '!'s, but it is not allowed to contain '[' among other non-listed characters.

Requiring that a password contain certain characters

You can require that a password contain certain characters or classes of characters by setting the value of the required property to a comma-separated list of character class identifiers or custom character classes, or both. (Struck through in the updated proposal: For example, if the required property is set to upper, digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter and at least one digit. If required is set to upper, [@!] then the user agent MUST generate a password that contains at least one ASCII uppercase letter and either '@' or '!'.)
A user agent must generate a password that contains at least one character from each required property. For example, if the passwordrules attribute is set to required: upper; required: digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter and at least one digit. If there is a single required property that is set to upper, digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter or at least one digit. If there is a single required property that is set to upper, [@!] then the user agent MUST generate a password that contains at least one ASCII uppercase letter or '@' or '!'.
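
The "at least one character from each required property" rule can be sketched as a simple check. The name `satisfies_required` and the input representation (one list per required property, whose entries are either an identifier or a set of characters) are assumptions for illustration, and only the upper, lower, and digit identifiers are modeled.

```python
import string

CLASSES = {
    "upper": set(string.ascii_uppercase),
    "lower": set(string.ascii_lowercase),
    "digit": set(string.digits),
}


def satisfies_required(password, required_groups):
    """True if the password has at least one character from every required property.

    required_groups holds one list per `required` property; each entry is an
    identifier from CLASSES or a set of characters (a custom character class).
    """
    present = set(password)
    for group in required_groups:
        chars = set()
        for entry in group:
            chars |= CLASSES[entry] if isinstance(entry, str) else set(entry)
        if not chars & present:
            return False  # nothing from this property appears in the password
    return True


# required: upper; required: lower  (two properties, both must be satisfied)
print(satisfies_required("abcd", [["upper"], ["lower"]]))
# required: upper, lower  (one property, either class satisfies it)
print(satisfies_required("abcd", [["upper", "lower"]]))
```

The two calls differ only in how the classes are grouped, which is exactly the distinction between separate required properties and a single comma-separated one.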

Limiting the number of consecutive repeated characters

The value of max-consecutive is a non-negative integer that represents the maximum length of a run of consecutive identical characters that can be present in the generated password. For example, set max-consecutive to 2 to disallow a user agent from generating a password that contains a run of more than 2 of the same character (e.g. "ooops" - contains three consecutive o's).
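
A small sketch of the corresponding check; the function names `longest_run` and `respects_max_consecutive` are hypothetical, and `None` is used here to stand for an unspecified (unbounded) max-consecutive.

```python
from itertools import groupby


def longest_run(password):
    """Length of the longest run of consecutive identical characters."""
    return max((len(list(run)) for _, run in groupby(password)), default=0)


def respects_max_consecutive(password, max_consecutive=None):
    """True if no run of identical characters is longer than max_consecutive."""
    return max_consecutive is None or longest_run(password) <= max_consecutive


print(longest_run("ooops"))                   # the run of o's has length 3
print(respects_max_consecutive("ooops", 2))   # rejected with max-consecutive: 2
```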

How a user agent determines the allowed characters

The set of required characters MUST always be a subset of the set of allowed characters. If the value of passwordrules violates this constraint then the user agent MUST adjust the value of allowed to satisfy it. The following implications immediately fall out from this constraint:

  1. If you specify the required property and do not specify the allowed property then the allowed property is inferred to be the value of the required property.
  2. If you set both the required property and the allowed property then the user agent behaves as if the allowed property were set to the union of the value of the allowed property and the value of the required property. For example, if the required property is set to lower and the allowed property is set to [abc0123] then the user agent MUST behave as if the allowed property were set to lower, [0123]. Another example, if the required property is set to lower and the allowed property is set to upper then the user agent MUST behave as if the allowed property were set to lower, upper.
  3. If neither the required property nor the allowed property is specified then the user agent behaves as if the allowed property were set to ascii-printable.
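
The three implications above can be sketched as one function. The name `effective_allowed` and the use of `None` for "allowed was not specified" are assumptions for this example; character classes are assumed to have been expanded into sets already.

```python
import string

# Printable ASCII, including the space character (0x20 through 0x7E).
ASCII_PRINTABLE = {chr(code) for code in range(0x20, 0x7F)}


def effective_allowed(required_groups, allowed):
    """Compute the allowed set a user agent behaves as if it had been given.

    required_groups: list of character sets, one per `required` property.
    allowed: set of characters, or None if no `allowed` property was specified.
    """
    required = set().union(*required_groups)
    if allowed is None:
        # Rule 1: infer allowed from required; rule 3: fall back to ascii-printable.
        return required if required else set(ASCII_PRINTABLE)
    # Rule 2: the union of allowed and required.
    return allowed | required


lower = set(string.ascii_lowercase)
# required: lower; allowed: [abc0123]  behaves like  allowed: lower, [0123]
print(effective_allowed([lower], set("abc0123")) == lower | set("0123"))
```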

How a user agent generates a password based on passwordrules

A user agent will generate a password using an algorithm or heuristic of its choice that respects the following attributes of a password element (not necessarily in order): minlength, maxlength, and passwordrules. If the set of constraints imposed by the aforementioned attributes fails to meet the following minimum restrictions then they are considered nonconforming and the user agent is REQUIRED to ignore them:

  1. The maximum password length cannot be less than 12.
  2. Allowed characters must consist of at least two of the following character classes: ASCII uppercase letters, ASCII lowercase letters, digits.
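
A sketch of the two minimum restrictions above, under stated assumptions: the function name `rules_are_conforming` is hypothetical, `None` stands for an absent maxlength, and "consist of at least two character classes" is read here as "contain every character of at least two of the classes". The proposal does not spell out that reading, so a user agent could reasonably interpret it differently.

```python
import string


def rules_are_conforming(maxlength, allowed_chars):
    """Return False if the constraints are too restrictive and must be ignored."""
    # Restriction 1: the maximum password length cannot be less than 12.
    if maxlength is not None and maxlength < 12:
        return False
    # Restriction 2: at least two of upper, lower, digit must be allowed
    # (assumption: a class counts only if all of its characters are allowed).
    classes = (string.ascii_uppercase, string.ascii_lowercase, string.digits)
    fully_allowed = sum(1 for cls in classes if set(cls) <= allowed_chars)
    return fully_allowed >= 2


print(rules_are_conforming(8, set(string.ascii_letters)))    # maxlength too small
print(rules_are_conforming(None, set(string.ascii_letters)))  # upper and lower allowed
```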

Characters in the generated password MUST be expressed in Normalization Form C and must conform to the following UAX31 profile:

Interaction with client-side form validation

It is not recommended to specify both the pattern attribute and the passwordrules attribute.

The passwordrules attribute participates in constraint validation. If the element's value attribute does not satisfy the criteria specified by the value of the passwordrules attribute then the element is in the "suffering from a passwordrules mismatch" validity state and the element is invalid for the purposes of constraint validation.

Confirmation password field

Some web pages have both a password field ("primary password field") and a confirmation password field. The passwordrules attribute needs only to appear on one of these fields. If both fields have the passwordrules attribute then you must ensure that they have the same value. Otherwise, the user agent will behave as if both fields have set their passwordrules attribute to the result of the union of both fields' required property (if any) and the intersection of both fields' allowed property (if any) after simplifying the passwordrules attribute of both fields according to rules in Using the passwordrules attribute. For example, if a page contains the following markup:

<input type="password" name="password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
<input type="password" name="confirmation-password" minlength="8" passwordrules="required: upper; allowed: [!]; max-consecutive: 3">

Then the user agent must behave as if the markup was:

<input type="password" name="password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
<input type="password" name="confirmation-password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
@dbates-wk

Author

dbates-wk commented Mar 1, 2018

@mikewest

Member

mikewest commented Mar 2, 2018

/cc @battre

@bsittler

bsittler commented Mar 2, 2018

possible refinement, since a perverse required could apparently shrink the space a lot: "the password constraints will be ignored if they would reduce the number of possible passwords below 2**60" or something like that -- otherwise I think there are some very low-entropy edge-cases that come about due to too many required elements effectively turning the generated password into a mere permutation of those elements

@othermaciej

othermaciej commented Mar 2, 2018

@bsittler Good catch! I agree that it's bad if perverse password rules limit the number of possibilities to an overly low number.

For an implementation requirement like "the password constraints will be ignored if they would reduce the number of possible passwords below 2**60", the standard would need to include an algorithm to calculate the number of possible passwords to enable interoperable behavior. If browsers calculated it slightly differently, it would be a significant interop problem.

We tried to evade the need for a full entropy calculation by having higher-level rules to ensure a wide enough range of passwords. Specifically, passwordrules must be ignored if the max length is too low, or the set of allowed characters is too small a range. You are right that excessive "required" directives could also overly limit the passwords. In the spirit of an easier-to-determine rule for rejecting overly restrictive "passwordrules", how about setting an upper limit on the number of "required" directives that may be present?

@annevk

Member

annevk commented Mar 2, 2018

First of all, I really like this! Giving declarative credential generation more love is great.

My main worry here is the complexity of the attribute and requiring another custom parser for it. Can we consolidate that with something somehow? Perhaps just having more attributes or going full JSON?

Should we also integrate this with https://w3c.github.io/webappsec-credential-management/ somehow? I understand that has adoption due to WebAuthn so presumably it's something that'll stick around and we need to account for?

(The other thing we should include in the examples advocating this technique is autocomplete=current-password and autocomplete=new-password. This is only needed for the latter (and only for the first of its kind on a page, per OP).)

@bsittler

bsittler commented Mar 2, 2018

@othermaciej indeed, and I actually considered including such a wrinkle in my original comment but realized that some required values aren't particularly bad this way while others are, and evaluating them this way is a little problematic (approaching the complexity of overall entropy computation.) A very rough approximation might be: maxlength may be no less than 12 + the number of "trivial" required elements. To be considered trivial, a required element must permit no more than 35 possibilities in the printable ASCII range. Correction: the cutoff was supposed to be 31. This means allowing punctuation as a non-trivial required element, which should satisfy lots of existing rules without undue penalty.

@bsittler

bsittler commented Mar 2, 2018

Another question: is character class merging the intended behavior for required? It seems like it shouldn't be, but this suggests otherwise:

Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes

Otherwise required only ever has at most one character of influence

@othermaciej

othermaciej commented Mar 2, 2018

@bsittler I think what this proposal says is right for allowed but probably not for required. required: upper, lower should require at least one ASCII alphabetic character, while required: upper; required: lower should require at least one uppercase character and at least one lowercase. I am not sure what @dbates-wk 's intent was when writing this but I think that's how it should work. For allowed, multiple directives and a single directive with commas would be equivalent under any reasonable interpretation.

On the "trivial character class" rule, that makes sense to me as an approach, but the specific proposal would require a minimum length of 15 instead of 12 for passwords with the typical "must include at least one uppercase, at least one lowercase, at least one number" restriction. If in addition a special character is required, that would be a minimum length of 16. That seems excessive, as adequate entropy is possible for 12-character passwords with either of these common restrictions.
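
As a rough worked check of that entropy claim, the number of conforming passwords can be counted with inclusion-exclusion, assuming the required character sets are disjoint and every allowed character is equally likely. The function name `count_passwords` is hypothetical, and this is only one way such a count could be done.

```python
from itertools import combinations
from math import log2


def count_passwords(length, alphabet_size, required_sizes):
    """Count strings of the given length that contain at least one character
    from each required set (sets assumed disjoint subsets of the alphabet)."""
    total = 0
    for k in range(len(required_sizes) + 1):
        for subset in combinations(required_sizes, k):
            # Strings that avoid every set in `subset`, with alternating sign.
            total += (-1) ** k * (alphabet_size - sum(subset)) ** length
    return total


# 12 characters over upper + lower + digit (62 characters), at least one of each.
n = count_passwords(12, 62, [26, 26, 10])
print(round(log2(n), 2), "bits, versus", round(12 * log2(62), 2), "unrestricted")
```

Under these assumptions the three "at least one" requirements cost only a fraction of a bit at length 12, leaving roughly 71 bits, comfortably above the 2**60 figure mentioned earlier in the thread.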

@annevk We care more about the capabilities than the syntax. That said:

  • Multiple attributes is possible, but it would result in three attributes of which two have (similar) nontrivial syntax, so it would not avoid the need for an extra mini-parser.
  • JSON seems like overkill.
  • Credential Management is programmatic, while this is declarative (and that's part of the use case). So not clear how they could be integrated. I don't think the parts of Credential Management that aren't required for WebAuthN are likely to get wide traction.
@tabatkins

Collaborator

tabatkins commented Mar 2, 2018

I disagree with the fundamental premise of this. :( Restrictions on passwords beyond minimum length (and maybe a large maximum length) are all fundamentally bad, particularly restricted characters - such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database). Required characters are also generally a bad restriction - it's much better to simply increase the minimum length and let people use whatever characters they (or their pw generators) want.

Do we really want to be adding a feature whose primary use-case is making it easier for already-broken sites to continue being broken?

@othermaciej

othermaciej commented Mar 2, 2018

Restrictions on passwords are indeed bad. I agree it would be best if they went away. But it also seems unlikely they will go away any time soon.

Password generators are extremely good. About the safest thing anyone can do for their online security is to use a unique randomly generated password for each site.

If password generators can't work with the existing password restrictions of websites, then that leads to a bad user experience (user counts on generator, then the site rejects their password) and poor security (user makes up a weak or reused password on the spot). The current state of the art is to maintain a list of site-specific quirks to get the password generator to do its job right. Safari has a pretty extensive set. We'd like password generators (including ours) to be able to do a good job without needing a quirks list.

Thus, even though password restrictions are likely harmful on net (other than minimum length), the most practical harm reduction is for sites with restrictions to make it obvious and machine readable what those restrictions are.

@js-choi

js-choi commented Mar 2, 2018

@tabatkins:

Restrictions on passwords beyond minimum length (and maybe a large maximum length) are all fundamentally bad, particularly restricted characters - such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database). Required characters are also generally a bad restriction - it's much better to simply increase the minimum length and let people use whatever characters they (or their pw generators) want.

If the WHATWG decides to add a passwordrules attribute, the attribute’s specification could include an informative note stressing that password restrictions are Useless and Bad and that storing passwords as plain text is Very Bad. This news still has not percolated through to many IT organizations; any opportunity to forcefully communicate this to them is valuable. As long as password restrictions remain a common practice on the web, for better and for worse, the new attribute could be a good opportunity for the WHATWG to emphatically recommend that web developers not use password restrictions at all.

@domenic

Member

domenic commented Mar 2, 2018

It seems like many, but not all, use cases in the OP can be covered by the existing pattern attribute. (For example, specifying allowed or disallowed characters.) Could we consider scoping this down to only the use cases that cannot be accomplished with today's technology?

@dbates-wk

Author

dbates-wk commented Mar 2, 2018

Another question: is character class merging the intended behavior for required? It seems like it shouldn't be, but this suggests otherwise:

Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes
Otherwise required only ever has at most one character of influence

You're right! I updated my proposal to remove this sentence (indicated by a strikethrough).

@dbates-wk

Author

dbates-wk commented Mar 2, 2018

If the WHATWG decides to add a passwordrules attribute, the attribute’s specification could include an informative note stressing that password restrictions are Useless

I take it you feel that the WARNING paragraph in the proposal is not sufficient?

@dbates-wk

Author

dbates-wk commented Mar 2, 2018

It seems like many, but not all, use cases in the OP can be covered by the existing pattern attribute. (For example, specifying allowed or disallowed characters.) Could we consider scoping this down to only the use cases that cannot be accomplished with today's technology?

Although some of the use cases could be accomplished with today's technology, they cannot be accomplished easily or succinctly. For instance, consider the following common variant of the first example in the proposal that disregards the consecutive character requirement: a password that has at least 8 characters consisting of a mix of uppercase and lowercase letters and at least one number. This can be accomplished with today's technology, but it is non-trivial to do so. This task and variants of it are exemplified by the regexps in https://stackoverflow.com/questions/19605150/regex-for-password-must-contain-at-least-eight-characters-at-least-one-number-a.

@othermaciej

othermaciej commented Mar 3, 2018

@js-choi I don't think password restrictions are related to storing passwords in plaintext. They are either because of dumb legacy system limitations (max lengths, very restricted set of allowed characters), actually good (minimum length limit) or well-intentioned attempts to get users to make handmade passwords that are resilient to guessing or offline dictionary attack against a leaked hashed password database (for example, the popular "one letter, one number, one special" requirement).

@js-choi

js-choi commented Mar 3, 2018

@othermaciej: I agree insofar as many cases of password restrictions are due to dumb legacy system limitations or well-intentioned encouragement of better handmade passwords. I was mostly responding to @tabatkins’s saying that “such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database)”, which may well also be sometimes true.

@dbates-wk: The currently worded warning:

WARNING: With the exception of the NOTE below, each property/value pair reduces the entropy of a user agent generated password and makes the password more likely to be guessed or brute-forced. The more characters that are required the more likely the user agent generated password can be guessed or brute-forced.

…is not quite forceful or emphatic in discouraging password restrictions in general, a discouragement that @tabatkins probably believes ought to be done. I personally am sympathetic to his view, but I am also sympathetic to making usability better for users of password managers. From my own field, bad password restrictions are a particularly pernicious problem in healthcare/clinical applications.

Addressing password restrictions at all may be seen by developers as a general statement from WHATWG on its disposition toward password restrictions, for better or for worse. Care should therefore be taken in how its specification is worded: it probably would not hurt for that warning above to be more forceful and emphatic against password restrictions in general. Such force may somewhat ameliorate @tabatkins’s general reservations against addressing password restrictions at all.

@othermaciej

othermaciej commented Mar 3, 2018

@domenic We thought about just using the pattern attribute, but there are two challenges:

(1) Consider a common limitation like: "must contain at least one letter and at least one number, and may contain !@#$%^&*()_+-=". It's possible to do with a regexp but it's pretty non-obvious.

Here is the clearest regexp I could come up with that implements this rule: (([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[A-Za-z]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[0-9]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*)|(([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[0-9]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[A-Za-z]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*). That is a lot harder to write correctly and a lot harder to understand than "required: upper, lower; required: digit; allowed: [-!@#$%^&*()_+=]". Using regexps to represent this rule is likely too hard for web developers to do correctly.
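
To make the comparison concrete, here is a small sketch that restates the regexp above (built from one repeated piece, with non-capturing groups) next to a direct check of the same rule. The names `matches_regexp` and `matches_rules` are hypothetical, and the direct check is only an illustration of how little logic the rule needs when it is not encoded as a pattern.

```python
import re
import string

SPECIAL = "-!@#$%^&*()_+="

# "Any run of allowed characters", the piece repeated throughout the regexp.
X = r"(?:[A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*"
PATTERN = re.compile(f"({X}[A-Za-z]{X}[0-9]{X})|({X}[0-9]{X}[A-Za-z]{X})")


def matches_regexp(password):
    """Check the rule by matching the whole string against the regexp."""
    return PATTERN.fullmatch(password) is not None


def matches_rules(password):
    """Check the same rule directly: only allowed characters, a letter, a digit."""
    allowed = set(string.ascii_letters + string.digits + SPECIAL)
    return (
        set(password) <= allowed
        and any(c in string.ascii_letters for c in password)
        and any(c in string.digits for c in password)
    )


for sample in ["a1", "1a", "a", "a1!", "a 1", "Zz9_+"]:
    print(repr(sample), matches_regexp(sample), matches_rules(sample))
```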

(2) In theory it's possible to use a regexp to drive generation rather than matching, but it's pretty hard. Getting a password generator to produce a uniformly random password that matches an arbitrary regexp is possible in theory, but way harder than getting it to produce a random password that matches rules of the type that passwordrules supports. Also, password generators may try to be clever and make passwords that are easy to type (for cases where you have to log into another device without benefit of autofill), at least when rules are flexible enough to allow it. It's straightforward to do this with the limited kinds of rules that passwordrules supports but infeasible to do with a generator that can be driven by a regexp. You could argue that maybe only a subset of regexps should be supported, but how do you decide what subset? It can take a very complex regexp just to represent a simple rule. It's also not very good for web devs if they are supposed to use pattern but must be very careful what they put in it or it will be ignored.

So even though passwordrules is technically redundant with pattern, it's still a practical addition because it makes the password requirements easier to write, easier to understand, easier to verify, and easier to feed to a generator. This is what made us conclude that we need a new feature and can't just reuse pattern.

@othermaciej

othermaciej commented Mar 3, 2018

@js-choi I am fine with having a more assertive warning. I think the wording in the spec will have very little influence on prevalence of password restrictions one way or the other, but we should do our best to avoid proliferating restrictions even a little.

@Zirro

Contributor

Zirro commented Mar 4, 2018

Are there many sites out there which restrict passwords in this way, yet still receive enough attention from developers who would be likely to add this attribute? It's anecdotal, but the only sites on which I've encountered restrictive password limitations are ones which have not seen updates for years.

I'm also a bit concerned that adding an attribute - despite warnings in the spec - might encourage more sites to introduce restrictions. Do you have data showing how common password restrictions are today, and if their usage is declining? It seems like this might become a smaller problem within a few years, as older systems get replaced.

@Zirro

Contributor

Zirro commented Mar 4, 2018

Getting a password generator to produce a uniformly random password that matches an arbitrary regexp is possible in theory...

It seems like it should be possible to cover 99% of cases by generating ~50 passwords according to different rules and transparently match them against the pattern attribute until you find the most preferable one which is allowed. How long is the list of sites with unusual rules, and how quirky are they?

Unless we foresee other uses for it besides covering the remaining cases for password rules, the added complexity of introducing a unique syntax ought to be avoided.

@othermaciej

othermaciej commented Mar 5, 2018

@Zirro Many sites have password restrictions, including ones that are popular and actively maintained.

For example, here are the restrictions from etrade.com (as stated by the site):

  • Needs 8-32 characters with no spaces
  • Needs at least one number
  • Needs at least one uppercase and one lowercase letter
  • Cannot be the same as your user ID

Other sites have hidden restrictions. They don't name any up front, but reject some passwords in practice.

It seems like it should be possible to cover 99% of cases by generating ~50 passwords according to different rules and transparently match them against the pattern attribute until you find the most preferable one which is allowed.

This is inefficient and likely to still fail in edge cases, so I doubt we'd adopt this over a quirks list. Also, the bigger problem with pattern is that it's very hard to write regexps that correctly implement many popular password limitations. Site authors could use pattern today but they don't.

While I am sympathetic to the desire to avoid technically redundant features, I think framing password rules in a more direct way will solve a real practical problem that can't be solved just by pushing existing features harder.

@dbates-wk

Author

dbates-wk commented Mar 12, 2018

@othermaciej:

@bsittler I think what this proposal says is right for allowed but probably not for required. required: upper, lower should require at least one ASCII alphabetic character, while required: upper; required: lower should require at least one uppercase character and at least one lowercase.

Fixed this up to match your expectation.
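
The difference between the two forms can be sketched as a small checker. This is an illustration only: the `CLASSES` table and `satisfies_required` are hypothetical names, and only three of the named classes are modelled.

```python
import string

# Named classes used in this sketch; the proposal defines more than these.
CLASSES = {
    "upper": set(string.ascii_uppercase),
    "lower": set(string.ascii_lowercase),
    "digit": set(string.digits),
}


def satisfies_required(password: str, required_rules: list[list[str]]) -> bool:
    """Check a password against a list of `required:` rules.

    Each inner list is one `required:` rule. The classes inside one rule are
    unioned, and the password needs at least one character from that union.
    """
    for rule in required_rules:
        union = set().union(*(CLASSES[name] for name in rule))
        if not any(ch in union for ch in password):
            return False
    return True


# "required: upper, lower" is a single rule over the union of both classes.
one_rule = [["upper", "lower"]]
# "required: upper; required: lower" is two rules that must both be met.
two_rules = [["upper"], ["lower"]]
```

With these definitions, an all-lowercase password satisfies `one_rule` but not `two_rules`.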

@dbates-wk

Author

dbates-wk commented Mar 12, 2018

I updated the proposal. Except in the example sections, I marked removals from and additions to the original proposal using strikethrough and italic, respectively.

@bsittler

bsittler commented Mar 13, 2018

@dbates-wk the updates are improvements from my point of view. A few issues still concern me:

  • So far as I can tell there is no limit on how many narrow-character-class required: limitations a site can impose, which means:
    • unacceptable entropy-reduction is possible; I believe target minimum entropy levels for generators should be clearly stated in the proposal, even if no formula is given to compute the level based on the passwordrules
    • in these cases selecting conforming password candidates from a less-restricted candidate list may take too long to terminate - e.g. a list generated naively based on random selection (for each character) from the printable-ASCII subset of allowed:; to overcome this I think the proposal needs to include at least a rough outline or pseudocode for a conforming generator with guaranteed termination
  • The character class syntax differs from JS regular expressions; is this intentional? If so, it should be noted more prominently; if not, the gaps should be closed
  • As an example: how would a requirement for one of [, - or ] be expressed? I believe in JS regular expressions it would be [-[\]]
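
One possible outline of a generator with guaranteed termination, as a rough sketch rather than anything from the proposal: place one character per required rule, fill the rest from the allowed alphabet, then shuffle. The function name and parameters are hypothetical, and the sketch assumes every required alphabet is also allowed. It does not sample uniformly from the set of conforming passwords, so its entropy is somewhat lower than a rejection-based generator would give.

```python
import secrets
import string


def generate_password(required: list[str], allowed: str, length: int) -> str:
    """Build a password that meets every required rule without retrying.

    `required` holds one alphabet per `required:` rule; `allowed` is the
    alphabet used for the remaining positions.
    """
    if length < len(required):
        raise ValueError("length is too short to satisfy every required rule")
    if any(not alphabet for alphabet in required) or not allowed:
        raise ValueError("every alphabet must be non-empty")

    # Step 1: one character per required rule, so each rule is satisfied.
    chars = [secrets.choice(alphabet) for alphabet in required]
    # Step 2: fill the remaining positions from the allowed alphabet.
    chars += [secrets.choice(allowed) for _ in range(length - len(required))]
    # Step 3: shuffle so the required characters are not in fixed positions.
    secrets.SystemRandom().shuffle(chars)
    return "".join(chars)


example = generate_password(
    [string.ascii_uppercase, string.ascii_lowercase, string.digits, "-"],
    string.ascii_letters + string.digits + "-",
    12,
)
```

There is no retry loop, so the work is bounded by `length` regardless of how narrow the required classes are.
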
@prlbr

prlbr commented Mar 18, 2018

@othermaciej

Many sites have password restrictions, including ones that are popular and actively maintained.

The actively maintained sites could be educated to lessen their technical restrictions. They could still give their users recommendations for choosing a good password without actually reducing the space of possible passwords.

@dbates-wk

Author

dbates-wk commented Apr 20, 2018

@bsittler

So far as I can tell there is no limit on how many narrow-character-class required: limitations a site can impose, which means:
unacceptable entropy-reduction is possible; I believe target minimum entropy levels for generators should be clearly stated in the proposal, even if no formula is given to compute the level based on the passwordrules

Do you have a particular minimum entropy level in mind? Otherwise, I will think about it and get back to you.

in these cases selecting conforming password candidates from a less-restricted candidate list may take too long to terminate - e.g. a list generated naively based on random selection (for each character) from the printable-ASCII subset of allowed:; to overcome this I think the proposal needs to include at least a rough outline or pseudocode for a conforming generator with guaranteed termination

The character class syntax differs from JS regular expressions; is this intentional?

Yes, this is intentional. We do not need to represent arbitrary character ranges: a custom character class is designed to contain only ASCII printable characters, and we expose literals for all the common ASCII printable character ranges (e.g. "lowercase" is equivalent to regex [a-z]+). The current proposal reserves '-' in case we need to support arbitrary character ranges later. See section "Custom character classes" or my reply to your last question for details on how to express '-' using the proposed syntax.

If so, it should be noted more prominently;

OK. I can add a remark about this.

[...] As an example: how would a requirement for one of [, - or ] be expressed? I believe in JS regular expressions it would be [-[\]]

No escaping is necessary to express '[': [[]. The third from the last sentence and last sentence of section "Custom character classes" explain how to express '-' and ']', respectively. Quoting the proposal:

To list '-' as a literal character it must appear immediately after the opening square bracket '['. The right square bracket (]) is also reserved as a special character. To list ']' as a literal character it must appear immediately before the closing square bracket ']'.
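
The placement rules quoted above can be sketched as a small parser. This is a simplified illustration, not the proposal's parsing algorithm: `parse_custom_class` is a hypothetical name, misplaced '-' or ']' characters are simply rejected, and "ASCII printable" is assumed to mean U+0020 through U+007E.

```python
def parse_custom_class(text: str) -> set[str]:
    """Return the literal characters listed in a custom character class."""
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError("a custom character class must be wrapped in [ and ]")
    inner = text[1:-1]
    chars: set[str] = set()
    for index, ch in enumerate(inner):
        # '-' is only a literal directly after the opening '['.
        if ch == "-" and index != 0:
            raise ValueError("'-' is reserved unless it directly follows '['")
        # ']' is only a literal directly before the closing ']'.
        if ch == "]" and index != len(inner) - 1:
            raise ValueError("']' is reserved unless it directly precedes the closing ']'")
        # Assumption: ASCII printable is the range U+0020..U+007E.
        if not (" " <= ch <= "~"):
            raise ValueError("only ASCII printable characters are allowed")
        chars.add(ch)
    return chars
```

Under these rules `[-[]]` lists the three characters '-', '[' and ']', while `[a-z]` is rejected because '-' appears in a reserved position.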

@bsittler

bsittler commented Apr 20, 2018

I think 90 bits is a reasonable minimum, but will happily defer to real cryptographers.

And thank you for addressing the rest of those questions! It might be worth including the [-[]] example as its syntax may be a bit surprising for someone familiar with regular expressions

@othermaciej

othermaciej commented Apr 20, 2018

90 bits of entropy is excessive. For the "repeatedly guess" threat model, a much lower number of bits will stop the attacker (so long as the website has reasonable rate limits and/or an attempt limit). Even 20 bits is reasonably effective for this case (though obviously not ideal). 20 bits is equivalent to a 6-digit numeric passcode.
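
The 6-digit comparison can be checked with a short calculation. This sketch assumes each character is drawn uniformly and independently from the alphabet; `entropy_bits` is a hypothetical helper name.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of `length` independent uniform draws from an alphabet."""
    return length * math.log2(alphabet_size)


# A 6-digit numeric passcode: 10 possible symbols in each of 6 positions.
six_digit_bits = entropy_bits(10, 6)
```

Here `six_digit_bits` comes out just under 20, since 10^6 is slightly less than 2^20.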

For the "offline attack against leaked database" threat model, the number of bits needed depends on the quality of password hashing used by the website. I did the math on this a while ago based on fastest known password cracking and then assuming a few power of two speedups on top of that:

Strong (bcrypt, PBKDF, scrypt): ~47 bits needed
Decent (SHA512): ~49 bits needed
Poor (SHA1): ~66 bits needed
Terrible (NTLM, DES CRYPT, MD5): just give up

So it's probably not right to have a hard limit significantly higher than 47 bits. Note that if the site allows entropy somewhat below the limit, it's probably still better to know their password rules and make a generated password, instead of ignoring them and forcing the user to make a manual one.

As an extra safety margin, Safari tries to generate passwords with >70 bits of entropy, but we would still want to generate something on sites that won't allow our full format.

Also, entropy calculations are nontrivial, especially in the presence of multiple required character classes. We can't just make it a vague requirement without including the calculation algorithm. Based on this I don't think we should have a direct entropy limit at all. Instead, we should have limitations that are more readily checkable.
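
To give a sense of why the calculation is nontrivial, here is a sketch that counts conforming passwords by inclusion-exclusion. It only covers a restricted case and rests on several assumptions that are not part of the proposal: the required classes are pairwise disjoint, each is a subset of the allowed alphabet, the length is fixed, and the generator samples uniformly from the conforming passwords. The function names are hypothetical.

```python
import math
from itertools import combinations


def conforming_count(allowed_size: int, required_sizes: list[int], length: int) -> int:
    """Count strings of `length` containing a character from every required class.

    Inclusion-exclusion over subsets of required classes: for each subset,
    count the strings that avoid all classes in it, with alternating sign.
    """
    total = 0
    for k in range(len(required_sizes) + 1):
        for subset in combinations(required_sizes, k):
            remaining = allowed_size - sum(subset)
            total += (-1) ** k * remaining ** length
    return total


def conforming_entropy_bits(allowed_size: int, required_sizes: list[int], length: int) -> float:
    """Entropy in bits, assuming uniform sampling over conforming passwords."""
    return math.log2(conforming_count(allowed_size, required_sizes, length))


# Example: 8 characters from letters and digits (62 symbols), requiring at
# least one digit (10), one uppercase (26) and one lowercase (26) letter.
restricted = conforming_entropy_bits(62, [10, 26, 26], 8)
unrestricted = 8 * math.log2(62)
```

Overlapping classes, length ranges, or limits on consecutive repeats would each need a different counting argument, and a generator that is not uniform over conforming passwords yields less entropy than this count suggests.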

@bsittler

bsittler commented Apr 20, 2018

90 bits is assuming "just give up"-quality hashing (unfortunately still widely used), an offline attack, a well-funded attacker, and cheap hardware on a large scale (e.g. botnet or dedicated cryptomining-style hardware farms)

@othermaciej

othermaciej commented Apr 20, 2018

90 bits of entropy is not enough to defend against a parallelized offline attack against a leaked database that uses garbage-tier hashing. I don't think there is any practical number of bits that works for those cases. They can be computed very quickly on commodity hardware and most have practical preimage attacks (not quite yet for MD5 but it's close). In my belief it is not possible to defend against an offline database attack for a site that uses very weak hashing no matter how strong your password is. In practice all you can do is make sure to not reuse passwords between sites, which password generators facilitate.

Requiring 90 bits is also more than most password generators will use, and more than many sites with mildly silly restrictions can support. So setting that as the floor would make this feature useless, and will result in failed password generation (and therefore human-generated passwords) on most sites.

Note that this feature is a harm reduction feature (reducing the collateral damage of dumb password restrictions by still letting password generators do the best they can), not a best practices feature. It's best if password generators can work on as many sites as possible, even if some sites are individually not defensible due to bad hashes or excessive password limitations.

@bsittler

bsittler commented Apr 20, 2018

(it also builds in a significant safety margin to account for so-far-unknown structural flaws, as seen e.g. in triple DES encryption, and to make online attacks with malicious code executing on the same CPU sharing cache/speculative execution byproducts more expensive)

@bsittler

bsittler commented Apr 20, 2018

Depends on which garbage-tier hash :)

Overall, though, I actually agree. Any number chosen is a compromise and 70 bits seems like a very reasonable one to me

@laukstein

laukstein commented Jun 8, 2018

Apple is releasing passwordrules attribute in Safari 12 as feature "Automatic Strong Passwords", see
https://webkit.org/blog/8327/safari-technology-preview-58-with-safari-12-features-is-now-available/
https://developer.apple.com/password-rules/

dbates-wk added a commit to dbates-wk/html that referenced this issue Sep 5, 2018

HTML passwordrules attribute
whatwg#3518

Add HTML content attribute passwordrules to the HTML standard.
@annevk

Member

annevk commented Sep 5, 2018

So there's now a PR for this issue. It's unclear to me we have more implementers interested than Safari at this point though. It's also not really clear to me if we reached agreement that if we added this, whether we should recommend against using it except for sites that have legacy backends and such. I'd appreciate help.

cc @mnoorenberghe

@domenic

Member

domenic commented Sep 5, 2018

I do want to say that even if the feature doesn't accumulate enough multi-implementer interest to land, I'm really happy to see the PR. Full-fledged spec PRs (and tests) are a great way of concretizing a proposal and making it easier for that multi-implementer interest to appear later, even if they end up hanging for a while. (We have many such awaiting-interest PRs.)

@othermaciej

othermaciej commented Sep 6, 2018

It looks like Chrome has a built-in password generator now too. Does anyone know of a good contact person for this feature? (I figure browsers that feature password generation are more likely to be interested in this feature).

I can't find any indication of Firefox or IE having built-in password generation.

Would interest from implementors of add-ons or extensions that feature password generation be relevant for this feature?

@annevk

Member

annevk commented Sep 6, 2018

It'd be good to know for sure, but per https://whatwg.org/working-mode#changes and how we generally talk about implementers, we'd need two browsers on board as well.

@domenic

Member

domenic commented Sep 6, 2018

I'd be open to revisiting that sort of thing though, in some way. We had similar discussions about #3870; see #2945 (comment) and following comments. Maybe a good discussion for whatwg/meta (or whatwg/sg?).

I'll try to ask around inside Chrome to find the appropriate folks.

@battre

battre commented Sep 7, 2018

Dominic from Chrome's password manager team here.

I took a look at the proposed specification and it seems to be generally sound. A few thoughts:

  1. I think it is a good idea to use a very simple language rather than one that tries to cover every possible corner case of password requirements imaginable (i.e. no 'it cannot look like a birthday').
  2. I have some concerns that if these passwordrules are only intended for password generation but not for validation of user-input, it will be abused to deceive password managers. But note: If it became used for input validation, a default of "allowed: ascii-printable" would be problematic of course because it would exclude non-ascii alphabets.
  3. Unfortunately, I don't expect a lot of impact by this spec.

Looking at password fields from sign-in forms and sign-up forms in the wild (a sample of a mix of very popular sites and the long tail), I see the following statistics for the current use of autocomplete attributes (not weighted by visits, every site counts equally):

Sign-in forms:

  • empty: 66.9%
  • off: 28.8%
  • current-password: 1.5%
  • on: 1.1%
  • new-password: 1.1%
  • other strings ("nope", "nothing", "foo", ...): 0.8%

Sign-up forms:

  • empty: 96.1%
  • off: 3.1%
  • new-password: 0.5%
  • current-password: 0.0% (but >0)
  • false: 0.0% (but >0)
  • on: 0.0% (but >0)

This means that the autocomplete attribute is used correctly for sign-in forms in 1.5% of cases and in a deceiving way in 1.1% (+30% to say 'off' in various ways) and correctly on sign-up forms in 0.5% of cases, to disable filling in ~3% and not used at all in the vast majority of cases.

With this I am currently not really convinced that this will have enough positive impact to implement it. In particular I expect that those sites that impose any password requirements won't use it.
