Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #16860 -- Added password validation to django.contrib.auth #4276

Closed
wants to merge 1 commit into from

Conversation

@mxsasha
Copy link
Member

@mxsasha mxsasha commented Mar 8, 2015

  • Aggregate all validation errors.
  • Aggregate all requirements with HTML lists.
  • Call a method on each validator after a successful change.
  • Validator for most common passwords.
  • See whether similarity errors can be made more specific.
  • Look into performance impact of current similarity validator.
  • Tests
  • Docs
  • Look into alternative way for validators to return their failures per Aymeric's suggestion.
  • Look into alternative design which does not require a global setting.
  • Ensure the common password list is cached.
  • Add validator to check for fully numerical passwords.
  • Release notes

https://code.djangoproject.com/ticket/16860
Mailing list discussion on: https://groups.google.com/forum/#!topic/django-developers/9GBhgGXmEKs

@@ -534,6 +534,8 @@
'django.contrib.auth.hashers.CryptPasswordHasher',
]

PASSWORD_VALIDATORS = []

This comment has been minimized.

@charettes

charettes Mar 8, 2015
Member

Since this is part of a contrib application I think this should be named AUTH_PASSWORD_VALIDATORS.

from django.utils.translation import ugettext as _


def get_password_backends():

This comment has been minimized.

@charettes

charettes Mar 8, 2015
Member

I'm not sure about the backend terminology here. I think naming this function get_password_validators would be more consistent with the rest of the the code.

if len(password) < self.min_length:
raise ValidationError(_("This password is too short."))

def help_text(self):

This comment has been minimized.

@charettes

charettes Mar 8, 2015
Member

Could we either make this a property or rename the method to get_help_text?


forbidden_values = [getattr(user, attr, None) for attr in self.user_attributes]
forbidden_values = [v for v in forbidden_values if v and isinstance(v, str)]
if any([value in password or password in value for value in forbidden_values]):

This comment has been minimized.

@charettes

charettes Mar 8, 2015
Member

You don't need a list comprehension here.

@mxsasha mxsasha assigned kmtracey and unassigned kmtracey Mar 14, 2015
@mxsasha
Copy link
Member Author

@mxsasha mxsasha commented Mar 14, 2015

Didn't actually mean to assign this to Karen, must have misclicked somewhere :)

@mxsasha
Copy link
Member Author

@mxsasha mxsasha commented Apr 11, 2015

I don't know entirely what's going on, but those build failures do not appear to be related to the patch:

 > git checkout -f fe70cd410fbc419a4881099ab3034d0f72d629ed
FATAL: Could not checkout null with start point fe70cd410fbc419a4881099ab3034d0f72d629ed
hudson.plugins.git.GitException: Could not checkout null with start point fe70cd410fbc419a4881099ab3034d0f72d629ed

Most platforms do seem to succeed.

@MarkusH
Copy link
Member

@MarkusH MarkusH commented Apr 11, 2015

That's one of Jenkins occasional hiccups. No need to worry.

from django.utils.translation import ugettext as _


def get_password_validators():

This comment has been minimized.

@jarshwah

jarshwah Apr 11, 2015
Member

Should this be cached? The number of times validators are instantiated, and the associated cost with loading in the 1000 most common passwords each time strongly suggests that it should be.

This comment has been minimized.

@mxsasha

mxsasha Apr 11, 2015
Author Member

Yes, actually, definitely.

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

I'm theoretically reluctant to this being a global module rather than something tied to the User model.

In practice AUTH_USER_MODEL is already a global singleton so this doesn't change the picture.

(Sorry if I brought this up before.)

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

@jarshwah Agreed. The cache must be cleared by a setting_changed receiver.

This comment has been minimized.

@jezdez

jezdez Apr 12, 2015
Contributor

I'd like to second @aaugustin's point about this conceptually fitting very well into the custom user models. This does not require yet another setting, so let's not.

On the bigger picture I really would like to prevent settings.py to continue to turn into a file full of dictionaries with dotted Python paths and initialization parameters. Let's use the extension API we've built into the auth app for such a case.

This comment has been minimized.

@mxsasha

mxsasha Apr 13, 2015
Author Member

I get your point, but I'm not entirely seeing what you are proposing. Could you make a short example? As this is a more fundamental design discussion, perhaps this would fit best on the django-developers list.

(I'll defer working/responding on other comments until we've settled this issue, as we need to get this right to have a mergeable patch.)

"""
def __init__(self):
common_passwords_file = os.path.dirname(os.path.realpath(__file__)) + '/common-passwords.txt'
self.passwords = [p.strip() for p in open(common_passwords_file).readlines()]

This comment has been minimized.

@jarshwah

jarshwah Apr 11, 2015
Member

This would be better as a set rather than a list.

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

Since the list is in order of most common use, the code detects incorrect passwords slightly faster if you preserve the order.

:bikeshed:

This comment has been minimized.

@jarshwah

jarshwah Apr 11, 2015
Member

Scanning a list will not be faster than a membership test of a set (unless the list of words is very small).

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

Oh I missed that. Sorry!

Returns an HTML string with all help texts of all configured validators in an <ul>.
"""
help_texts = password_validators_help_texts()
return '<ul>%s</ul>' % unordered_list(help_texts)

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

unordered_list handles nesting which you don't seem to need here. A pedestrian implementation with format_html would be more readable:

help_items = [format_html('<li>{}</li>', help_text) for help_text in help_texts]
return format_html('<ul>{}</ul>', ''.join(help_items))

Furthermore, this implementation marks the result as safe, which is useful here.

(Truth be told, I'm reluctant to use template tags or filters in Python code, for ideological reasons.)

for validator in settings.AUTH_PASSWORD_VALIDATORS:
try:
klass = import_string(validator['NAME'])
validators.append(klass(**validator.get('OPTIONS', {})))

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

this line could go in "else" of try/except/else

klass = import_string(validator['NAME'])
validators.append(klass(**validator.get('OPTIONS', {})))
except ImportError:
raise ImproperlyConfigured("Invalid NAME for a password validator: %s. Check "

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

prefer hanging indent to allow longer lines


def validate_password(password, user=None):
"""
Validates whether the password meets all validator requirements.

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

we're now using pep8 style for docstrings "Validate whether ..." "return None", "raise ValidationError", etc.

against either part of an e-mail address, as well as the full address.
"""
def __init__(self, user_attributes=None, max_similarity=0.7):
self.user_attributes = user_attributes if user_attributes else ('username', 'first_name', 'last_name', 'email')

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

If user_attributes is set to () or [], the default set of attributes will be used. That may be surprising. Can you make a strict check for user_attributes is None?

I understand that the validator doesn't do anything then and can simply be removed from the settings in that case, but I can imagine situations where someone would control user_attributes (e.g. through an UI) and not AUTH_PASSWORD_VALIDATORS.

This comment has been minimized.

@wimfeijen

wimfeijen Apr 12, 2015
Contributor

I would code it like this:

DEFAULT_USER_ATTRIBUTES = ('username', 'first_name', 'last_name', 'email')

def __init__(self, user_attributes=DEFAULT_USER_ATTRIBUTES, max_similarity=0.7):
    self.user_attributes = user_attributes 

def password_changed(password, user=None):
"""
Informs all validators, that have implemented this feature, that the password has been changed.

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

chop first comma
"this feature" -> "a password_changed() method"?
wrap docstrings at 79 chars

@@ -236,3 +236,173 @@ from the ``User`` model.

Checks if the given string is a hashed password that has a chance
of being verified against :func:`check_password`.

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

Extra blank line :-)

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

There are a few other double line breaks below.

This comment has been minimized.

@mxsasha

mxsasha May 16, 2015
Author Member

Fixed.


def password_validators_help_text_html():
"""
Returns an HTML string with all help texts of all configured validators in an <ul>.

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

I would read it as "validators in a 'U' 'L'" so I would say "in a" rather than "in an" but if you read it as "unordered list" than "an" is fine.

Validators can also have optional settings to fine tune their behavior.

Validation is controlled by the :setting:`AUTH_PASSWORD_VALIDATORS` setting.
By default, validators are used in the forms used to reset or change

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

Repetition of "used"? (We French have an OCD about not reusing the same word less than two lines apart.)


class MinimumLengthValidator(object):
"""
Validates whether the password has a minimum length.

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

has a -> is of a


def validate(self, password, user=None):
if len(password) < self.min_length:
return _("This password is too short.")

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

I'd include the min length in the error message

Validates whether the password is sufficiently different from the user's attributes.
If no specific attributes are provided, this looks at a sensible list of defaults.
Attributes that don't exist are gracefully ignored. Comparison is made to not only

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

chop "gracefully"?


Validates whether the password is not a common password, by checking
against a list of 1000 most common password created by
`Mark Burnett <https://xato.net/passwords/more-top-worst-passwords/>`_.

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

xato.net 502s at this time. Perhaps that's temporary.

This comment has been minimized.

@mxsasha

mxsasha May 16, 2015
Author Member

I've seen somewhat recent references to it, but it still fails for me. Should we link to the archived version? Attribution does appear to be required.

If no specific attributes are provided, this looks at a sensible list of defaults.
Attributes that don't exist are gracefully ignored. Comparison is made to not only
the full attribute value, but also it's components, so that e.g. a password is validated

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

its
so that, for example, a ...

If no specific attributes are provided, this looks at a sensible list of defaults.
Attributes that don't exist are gracefully ignored. Comparison is made to not only
the full attribute value, but also it's components, so that e.g. a password is validated
against either part of an e-mail address, as well as the full address.

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

no dash in "email"


* ``validate(self, password, user=None)``: validate a password. Return
``None`` if the password is valid, or a string with an error message if the
password is not valid. You must be able to deal with ``user`` being

This comment has been minimized.

@aaugustin

aaugustin Apr 11, 2015
Member

I'm worried about returning a string to announce a security-sensitive condition.

It would be easy for a developer to get confused with the general validate_password function and write code like:

class MyValidator(SomeValidator):
    def validate(self, password, user=None):
        try:
            super().validate(password, user)
        except ValidationError:
            # do stuff...
            raise

Of couse that wouldn't pass tests, but I believe the principle of least astonishement is very important for security-sensitive APIs and that the discrepancy between validate_password and validate is sub-optimal in this regard.

(Again, apologies if I already brought this up during my previous review of this PR.)

The list used was created by Mark Burnett: https://xato.net/passwords/more-top-worst-passwords/
"""
def __init__(self):
common_passwords_file = os.path.dirname(os.path.realpath(__file__)) + '/common-passwords.txt'

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

Not sure how much a difference it makes, but it seems better to store this in Python rather than having to read from a text file. Worth it to make the file location customizable? If so, it might be nice to make "common passwords" a separate package so we don't have to include that list in Django. I guess users might not care for the additional setup tasks though.

This comment has been minimized.

@mxsasha

mxsasha May 16, 2015
Author Member

With the discussion leaning towards shipping a gzipped list, so that the Django codebase doesn't contain so many bad words in cleartext, I guess the python option is no longer advantageous. Making it customisable would be useful though. I'm not in favour of shipping a separate list, due to the increased effort required from our users.


AUTH_PASSWORD_VALIDATORS
------------------------

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

Add "Default: ..." like other settings. Also ".. versionadded:: 1.9"

This comment has been minimized.

@mxsasha

mxsasha May 16, 2015
Author Member

Fixed.



Enabling password validation
----------------------------

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

a blank line should follow each heading


Included validators
-------------------
Django includes three validators:

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

need a .. module:: directive hear, otherwise you inherit the last one which is django.contrib.auth.hashers.

validation. This can be useful if you use custom forms for password setting,
or if you have API calls that allow passwords to be set, for example.

.. function:: django.contrib.auth.password_validation.validate_password(password, user=None)

This comment has been minimized.

@timgraham

timgraham Apr 11, 2015
Member

use .. currentmodule:: django.contrib.auth and then omit the prefix on the functions

@MarkusH
Copy link
Member

@MarkusH MarkusH commented Apr 11, 2015

As @aaugustin mentioned elsewhere, compression the common passwords list sounds like a good idea in combination with customizing its location.

How hard would it be to add some autodetection if the file needs to be unzipped or is already a plain-text file? Plain text files are easier to maintain for project specific lists, I think.

'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',

This comment has been minimized.

@wimfeijen

wimfeijen Apr 12, 2015
Contributor

Why is NAME in capitals?

For comparison, DATABASES and CACHES use lower-case 'default' and then capitals, LOGGING uses lower-case only.

This comment has been minimized.

@MarkusH

MarkusH Apr 12, 2015
Member

Because it's a predefined identifier, similar to ENGINE and NAME in the DATABASES setting and BACKEND and DIRS in TEMPLATES.

@jezdez
Copy link
Contributor

@jezdez jezdez commented Apr 12, 2015

Ough, is a new setting really required for this?

@timgraham timgraham changed the title WIP - Added password validation to django.contrib.auth [WIP] Fixed #16860 -- Added password validation to django.contrib.auth Apr 16, 2015
@mxsasha
Copy link
Member Author

@mxsasha mxsasha commented Apr 26, 2015

@jezdez, @aaugustin: as both of you have proposed to tie this to the User model, could you write up a short example? I may have some concerns about that design, but I'd rather form an opinion on an example rather than my imagination of what I think your idea might be. Perhaps I'm entirely mistaken.

As this is a more fundamental design discussion, perhaps this would fit best on the django-developers list rather than this PR.

@mxsasha mxsasha changed the title [WIP] Fixed #16860 -- Added password validation to django.contrib.auth Fixed #16860 -- Added password validation to django.contrib.auth Jun 5, 2015

def get_password_validators(validator_config=None):
validators = []
if validator_config is None:

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

If you're going to provide get_default_password_validators, why not move this into that function and make get_password_validators entirely settings-independent, with the config arg required?

try:
klass = import_string(validator['NAME'])
except ImportError:
msg = "Invalid NAME for a password validator: %s. Check your AUTH_PASSWORD_VALIDATORS setting."

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

Wouldn't it be a bit more helpful for this error message to specifically note that the module with the given path couldn't be imported? "Invalid" is a very vague term, which could mean all sorts of things - it seems unhelpful to silence an ImportError and replace it with a much vaguer message.

return validators


def validate_password(password, user=None, password_validators=None):

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

If I were writing this from scratch, I'd consider "a bunch of functions that all take an optional password_validators arg and call get_default_password_validators() if it's not given" to be an indication that perhaps a higher-level class is in order that you can instantiate with a set of password validators and then call methods on (and a default instance of that class based on settings could be provided at module level).

But I don't feel strongly enough about this to suggest actually rewriting it on that model.

This comment has been minimized.

@mxsasha

mxsasha Jun 6, 2015
Author Member

That's a sensible design. However, the custom validator setup is expected to be quite uncommon: you only need this if you have multiple different sets of validators within the same Django project. So for most people, it's now as simple as calling validate_password() - which would get slightly more complicated in your proposed design. I also don't feel strongly, so I'll leave it as is.

if password_validators is None:
password_validators = get_default_password_validators()
for validator in password_validators:
if hasattr(validator, 'password_changed'):

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

hasattr is ugly because of its propensity to silently hide AttributeError, and because of its look-before-you-leap inefficiency. I would avoid it and instead use something like:

password_changed = getattr(validator, 'password_changed', lambda *a: None)
password_changed(password, user)

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

(This is also more "food for thought" than an actual request for change - whatever you prefer is fine.)

return _("Your password must contain at least %(min_length)d characters.") % {'min_length': self.min_length}


class UserAttributeSimilarityValidator(object):

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

I suppose there's no reasonable way around the fact that this provides no protection if the attributes are later changed after the password is chosen. Unless validators gained an optional method where they could be notified of any change to the user model (optionally; this would have to be an API called explicitly by the user since Django doesn't provide an edit-user form).

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

Never mind, even such an API would be useless since the raw password is no longer available after its initially set. So there really is no way around that limitation of this validator.

This comment has been minimized.

@mxsasha

mxsasha Jun 6, 2015
Author Member

Yes. The only thing you could check later is an exact match, but that's much less likely to catch anything. I don't think it's worth the effort for such a tiny gain. Then again, the attributes that I expect to be most often used as a password are less often changed, like the user's name.

Password validation
===================

Users often to choose poor passwords. To help mitigate this problem, Django

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

extraneous to, should be "Users often choose poor passwords."

requires the minimum length to be nine characters, instead of the default
eight.
* ``CommonPasswordValidator``, which checks whether the password occurs in a
list of common passwords. By default, it compares to an included list of

This comment has been minimized.

@carljm

carljm Jun 5, 2015
Member

"an included list of the 1000 most common passwords."

Or if that's too strong a claim, then "most" should be removed: "an included list of 1000 common passwords."

@carljm
Copy link
Member

@carljm carljm commented Jun 5, 2015

This is great! Very excited to have this in Django 1.9; I think I'm about to backport it into a 1.8 project I'm working on currently :-) Thanks for working on it!

@carljm
Copy link
Member

@carljm carljm commented Jun 5, 2015

Of the comments I just left, the only one I would consider a merge blocker is the vague error message issue.

@alexbecker
Copy link
Contributor

@alexbecker alexbecker commented Jun 21, 2015

Is there a reason why password validation is not performed on UserCreationForm?

@timgraham
Copy link
Member

@timgraham timgraham commented Jun 22, 2015

This was merged in 1daae25.

@timgraham timgraham closed this Jun 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

You can’t perform that action at this time.