Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passwords: use better algorithm than md5 hash, use salts and maintain BC #5728

Closed
anonymous-piwik-user opened this issue Jul 19, 2008 · 52 comments
Closed

Comments

@anonymous-piwik-user
Copy link

@anonymous-piwik-user anonymous-piwik-user commented Jul 19, 2008

What I can see from code is that passwords are hashed with md5, but without a salt. That’s not secure enough, take look at here: http://ilia.ws/archives/68-MD5-Dictionary-Attacks.html

See also the request to use different salts for different hashes: #3051

@robocoder

This comment has been minimized.

Copy link
Contributor

@robocoder robocoder commented Jan 11, 2009

Might want to consider http://www.openwall.com/phpass (public domain password framework already adopted by other open source projects).

@anonymous-piwik-user

This comment has been minimized.

Copy link
Author

@anonymous-piwik-user anonymous-piwik-user commented Jan 13, 2009

All you need to do is add a salt before hashing the password (the longer the better), as reported in the summary.

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Dec 28, 2010

We should have a salt for Password to avoid dictionnary attacks or prevent password leakages on other websites to impact a user login to Piwik.

@robocoder

This comment has been minimized.

Copy link
Contributor

@robocoder robocoder commented Dec 28, 2010

Note: salting is a compat buster. We can't salt existing passwords because they're already hashed.

To migrate:

  • add a flag column to each user; if true, user must change their password when he/she next logins; regenerate token_auth at that time
  • move current Login auth to a migration plugin that hooks into the authenticator; this plugin would be deactivated after everyone has changed their passwords

For more robustness:

  • configureable hash() algorihtm (defaults to md5); increase md5password and token_auth size to 128 chars (hexits)
  • implement double salt in the new Login auth
@alexlehm

This comment has been minimized.

Copy link

@alexlehm alexlehm commented Jul 24, 2012

I would like to add two concerns about md5 unsalted passwords

a hash is resolvable by googling if the password is a plain word, you do not even need a dictionary, e.g. https://www.google.com/search?q=14c4b06b824ec593239362517f538b29

all hashes for passwords up to a given length can be resolved by a rainbow table attack, so a leaked password table can be processed without any brute force operation

(if you need additional arguments, take a look at a few recent pw leaks e.g. on linkedin)

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Dec 16, 2012

When we implement per user salt, we should also add in

  • when user change the password, also invalidate all other currently logged in sessions for this users (ie. change the user's salt?)
@halfdan

This comment has been minimized.

Copy link
Member

@halfdan halfdan commented Feb 4, 2013

We should use the new Password API from PHP 5.5 for this! At the end of the article there is a link to a compatibility implementation for earlier PHP versions.

https://gist.github.com/3707231

If we agree on something I am keen on implementing this one.

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Feb 5, 2013

This new API looks great! But it will need to work of course for 5.1.x to 5.4.x as well, also we need to plan if "crypt" extension is not available (prior to 5.3 it seems crypt might not be available always)

@alexlehm

This comment has been minimized.

Copy link

@alexlehm alexlehm commented Feb 1, 2014

I cannot believe that this is still buried as a low priority ticket and has been ignored for 6 years.

md5 password storage is completely insufficient, this poses a significant security risk.

md5 passwords with salt are also insufficient since md5 cannot be considered secure at all anymore.

To achieve an acceptable level of security, it is necessary to use a newer hash method like bcrypt, scrypt or another key derivation function, most previous pw methods are too weak and have been for some time.

Sacrificing this with the argument of backward compatibility is criminally stupid, sorry.

@robocoder

This comment has been minimized.

Copy link
Contributor

@robocoder robocoder commented Feb 1, 2014

@alexlehm In future, make your comments constructive. The challenge is migrating existing users.

@alexlehm

This comment has been minimized.

Copy link

@alexlehm alexlehm commented Feb 1, 2014

I'm sorry if that is non-constructive, however the current pw scheme is such a major issue that this must be addressed sooner rather than later.

Serendipity used a rather smart method to migrate user pws to a new scheme a while back where they defined a cut-off date 180 days after the migration was activated (based on the installation, not on the release of the software) so that all active users migrated themselves without even knowing and inactive users were locked out after the grace period.

E.g. migrating the admin user would work like, since the user will log in anyway.
I am not sure how the auth mechanisms work internally, but for anything that is authenticating on a client API, an API hash would be better then using a real password.

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Feb 1, 2014

Increasing priority... Yes it's an important ticket, but as you can see, there are a lot of important tickets to work on, the challlenge is that we have a very small team... Are you familiar with PHP? If you are interested and familiar with php, please consider joining the project or helping with this!

@alexlehm

This comment has been minimized.

Copy link

@alexlehm alexlehm commented Feb 1, 2014

Actually I could do that, have to read up on the piwik source code first, even though I am using this for quite a while I have never looked at anything except the config file.

@halfdan

This comment has been minimized.

Copy link
Member

@halfdan halfdan commented Feb 1, 2014

@alexlehm It would be nice if you'd join up for sure :)

@mattab We discussed this a few days ago - I'd very much like to do this with the new PHP 5.5 password API (+ the 5.3 fallback).

The way this could work is that for all installations with PHP >= 5.3.7 we automatically update the passwords to bcrypt once they log in (the 5.3 fallback only works with >= 5.3.7 since there were issues in earlier PHP versions). This would leave quite a number of people out of the loop, but if they cared about security they wouldn't be running 5.3.x in the first place :)

Cheers!

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Feb 3, 2014

Another challenge: maintain backward compatibility for token_auth

  • Since token = md5($userLogin . $md5Password) (see UsersManager\API) then changing password will change the token. this may cause some problem and break some tools/apps using the previous token...
  • we have to clarify all places where md5password is used, this will break some usersManager APIs. Can we keep BC in these APIs by keeping the md5Password parameter?
@alexlehm

This comment has been minimized.

Copy link

@alexlehm alexlehm commented Feb 4, 2014

using the hashed password to create the token without any additional data is not so good, this makes it impossible to change the token without changing the password as well, it should be possible to invalidate the api tokens by changing the password to the same value (or to create a new token by clicking a button without affecting the user account at all).

Have to look into this a bit further, it should be possible to keep both but disable the old functionality when the user chooses so.

@halfdan

This comment has been minimized.

Copy link
Member

@halfdan halfdan commented Feb 4, 2014

I agree, the token should be decoupled from the password completely. We could simply generate a unique string with some random data.

It would be really cool to invalidate the token without having to change the password, so I'll be looking into that.

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Mar 7, 2014

Things to think about:

@mattab mattab removed the P: normal label Aug 3, 2014
@mattab mattab changed the title Security: use salt for md5 passwords and maintain BC Security: use salts, use better algorithm than md5 hash, and maintain BC Sep 10, 2014
@mattab mattab changed the title Security: use salts, use better algorithm than md5 hash, and maintain BC Passwords: use better algorithm than md5 hash, use salts and maintain BC Sep 10, 2014
@etjossem

This comment has been minimized.

Copy link

@etjossem etjossem commented Sep 11, 2015

Please try to auto migrate users to bcrypt, etc. Unsalted MD5 is kind of a scary vulnerability.

See: #8753

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Sep 11, 2016

Upgrading the user performing the upgrade

I was thinking of the following:

  • User logs in, we check if password string is 32 char. If so, we hash original password and save the newly hashed password in the DB in the same column (hashing using password_hash()).

I don't think it's needed to email users and so far I think we have not done this. This would be basically up to each Piwik Administrator whether they want to email their users and ask them to log in again so their password is updated.

API Tokens

Sounds like #6559

Here we basically have the raw token in the DB and could simply run password_hash on them. Problem is when a user wants to authenticate using a token we can't simply do a query like select * from users where api_token = password_hash($passedApiToken). The problem is, in API calls we do not have a username where we can get the hashed password and then use password_verify($passedApiToken). We would need to compare it individually against each user (where there could be > 30k users).

It may work to generate the original hash using something like https://github.com/ircmaxell/password_compat/blob/v1.0.4/lib/password.php#L235 and then we are able to query for that user in the DB but not sure if it works. Also it might be a problem as on some installations crypt might not be available. So we would not be able to rely on the crypt extension and would need an alternative there.

In the UI we would not be able to show the original token anymore on the API page which is good. Instead users would be able to add / revoke tokens etc.

API only users

That would be really great to have if not too much effort

@mneudert

This comment has been minimized.

Copy link
Member

@mneudert mneudert commented Sep 12, 2016

So the basic target is "a silent upgrade with full compatibility". Sounds fine 👍

Silent password update

Comparing the hash length and then silently switching over is probably the most suitable way without having to add separate database field, option keys, ....

Probably also the most common way to deal with that issue. And as already mentioned a solved problem since long ago.

That would keep everything working for several releases until support is completely dropped from the core. (Or potentially moved to a separate plugin for large installations that cannot notify users even if they haven't used their password for "a long time" and therefore skipped the upgrade.)

API tokens

I might miss something but it should not be a big problem at all.

While upgrading the release all existing users could get one token auto generated using the old method. Whether new users get a default token or not is a different thing. As the token_auth database field already has a unique key we could just take that over to the new storage without running into problems with duplicates. That makes a frictionless upgrade path leaving it the administrator's choice to force users to generate new API tokens or not.

For generating/validating the tokens I think the part about password_hash was a bit confusing. The hashing was only intended to generate a random string. It was not my intention to say any incoming request for token authentication should hash the token and then compare it.

Tokens should be no problem to be stored in plaintext as they are already. For a usable token management interface they have to be available in a readable form anyway. Cannot think of a way to manage a token someone cannot read :D.

And as we have to store them that way it sounds suitable to also compare them that way.

password_compat seems to be a perfect resource for the parts to use/"copy". The source range #L104-L138 shows a really portable way to generate new tokens. And at line #L241 is the timing attack resisting comparison.

With some special care it might also be possible to keep the upgrade process smooth for tools like python log importer. Like saying "please do not revoke your default key until you have updated your tools"...

API only users

Have not looked at the implications that idea would have. But having disconnected tokens and credentials should make it at least a candidate for a 3.1.0 release if not earlier.

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Sep 12, 2016

For generating/validating the tokens I think the part about password_hash was a bit confusing. The hashing was only intended to generate a random string. It was not my intention to say any incoming request for token authentication should hash the token and then compare it.

For improved security it would be really good to have the API tokens hashed. Just in case for some reasons someone gets access to DB etc. Of course they could then also just overwrite the token with a new one but it would be still good security to have it hashed. There are probably solutions for this but I haven't checked yet. Usually authentications consist of a key and a secret which makes it easier as only having the secret.

Tokens should be no problem to be stored in plaintext as they are already. For a usable token management interface they have to be available in a readable form anyway. Cannot think of a way to manage a token someone cannot read :D.

When creating a token eg on Google they show it once when the token is generated. They ask you to "save" it eg in a password manage or to use it in your mail client etc. There is no way to see that token again. If you "lose" it, you need to generate a new one (and ideally delete/revoke the old one).

If we'd still keep the token in plain text in the DB there is not really a big reason to change anything (for now). It would be still a benefit to generate the token more randomly though. So if we decide to keep the token in plain text in the DB until we have a REST API we'd only need to improve the token generation IMO.

I just noticed if we decide to hash the API token we would need to also fix eg #10185 as the token is currently used for the export feature and for other things. However, we would not be able to get the "original" token anymore because it is hashed.

@ircmaxell

This comment has been minimized.

Copy link

@ircmaxell ircmaxell commented Sep 12, 2016

For improved security it would be really good to have the API tokens hashed. Just in case for some reasons someone gets access to DB etc. Of course they could then also just overwrite the token with a new one but it would be still good security to have it hashed. There are probably solutions for this but I haven't checked yet. Usually authentications consist of a key and a secret which makes it easier as only having the secret.

For this use-case, a simple SHA-256 on DB-insert should suffice.

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Sep 12, 2016

For this use-case, a simple SHA-256 on DB-insert should suffice.

That's a good point 👍

@J0WI

This comment has been minimized.

Copy link

@J0WI J0WI commented Sep 12, 2016

It would be great to have the plain md5 hashes mitigated during the update even if the users will not log in for a while.

We could hash the current md5 hashes a second time with a stronger algorithm and with salt and pepper. These hashes can be flagged (so the login script knows that it need to hash them twice to verify) and updated during the next successful log in. If the hashes become public for any reason, we have no plain md5 hashes in DB.

The md5 hashes are the main issue here. So if the above solution won't be implemented, I would go with forcing a password reset for every user.
History showed that plain md5 password hashes are always evil, even for inactive users or hosts.

Edit:
If something similar is done with the API tokens, it might be possible to be backwards compatible. But for security reasons this should be behind a pref and removed completely in future updates.

@mneudert

This comment has been minimized.

Copy link
Member

@mneudert mneudert commented Sep 12, 2016

Is there any feature that could not be modified to use the API proxy defined in ee3bc9c? Or at least not short-term modified?

A search for token_auth usage find results like the DocumentationGenerator (example URLs for RSS feeds if I get it right?) or UserCountryMapController.

Hashing stuff is always good but if a feature breaks it might be a deal breaker.

If these are "user facing" features for "URLs without login" we could go with hashing and update the messages/documentation to point at the token management. Users without any tokens available would get displayed a nice message to create a token. And going back to a string length check we could have the legacy token only available for existing URLs or "people who know" while not including it in the check for potentially available tokens.

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Sep 13, 2016

We cannot use the solution in ee3bc9c as it is missing an nonce to prevent CSRF attacks. We would need to pass along an nonce token or a token/access key that is only valid for limited time but ideally an nonce. We also currently use the token_auth in API requests from JavaScript. This is pretty much the same case and here we would need to use something like nonce as well.

I think the most critical things for now are:

  1. Migrate password via password_hash on login
  2. Generate token auth randomly and not based on login / password

This will already be a great improvement.

Next step that will be needed eventually is to hash the token_auth which we will have to do for sure. If not in 3.0, then in 3.1 as it shouldn't break anything (I reckon).

It would be great to focus for now on the first two things. Once this is done we could do another evaluation on the steps.

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Sep 25, 2016

@mneudert or @sgiehl will you work on this?

@sgiehl

This comment has been minimized.

Copy link
Member

@sgiehl sgiehl commented Sep 25, 2016

@J0WI

This comment has been minimized.

Copy link

@J0WI J0WI commented Sep 26, 2016

How do you think about my proposal in #5728 (comment) to migrate plain md5 hashes during the update procedure and not only on login?

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Sep 26, 2016

be worth having a look into it for sure. Would probably only need to check how to flag it.

@mneudert

This comment has been minimized.

Copy link
Member

@mneudert mneudert commented Sep 26, 2016

Currently I plan to rehash every password with SHA256 during the upgrade combined with setting a "_legacy_password"-Option stored for each user. After a first valid login this option will get deleted and a proper rehashing will update the password.

Having all passwords rehashed also nicely shows where the password/token_auth is in use with all the tests breaking :D

@HybridAU

This comment has been minimized.

Copy link

@HybridAU HybridAU commented Sep 27, 2016

I might be misunderstanding the process you are using here, and for what it's worth I really appreciate you doing this work so I don't want this to come across as just complaints from the peanut gallery.

But I think if your going to go through all the effort of changing the password storage method you should move to something like PBKDF2 or bcrypt. They are designed for password storage and have things like a salt and configurable iterations built in.

Rather than moving to SHA256 which while better than MD5 still shares many of it's faults (e.g. With out a salt two passwords which are the same will have the same hash and so rainbow tables can be made. Also there is no easy way to increase the number of required iterations and so the time to create a hash which is already quite low, will only ever get lower.)

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Sep 27, 2016

I'm not quite sure but I think the SHA256 is only for auth tokens and not for end user passwords see discussion further above. end user passwords should be using http://us2.php.net/manual/en/function.password-hash.php (bcrypt)

@mneudert

This comment has been minimized.

Copy link
Member

@mneudert mneudert commented Sep 27, 2016

Yeah, that's right. I think however that needs some deeper refactorings for example regarding the tests matching password hashes in their XML fixtures. Plan is to change SHA256 with bcrypt for old password once new password are hashed that way. Then everything should be in place to handle that switch a lot easier than right of the bat.

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Oct 6, 2016

Hi @mneudert @sgiehl - what is the status for this security improvement? if possible, we would love to include it in the Piwik 3.0.0-beta2 or beta3 in coming days. We've just released the first public beta of Piwik 3.0.0: https://piwik.org/changelog/piwik-3-0-0-beta1/

@mneudert

This comment has been minimized.

Copy link
Member

@mneudert mneudert commented Oct 9, 2016

Currently I have switched the password hashing to bcrypt (re-hashing of old passwords yet to be added) and worked on the token separation/extraction. If it helps I could open a WIP pull request for that part (with a list of caveats attached as some things might seem somewhat "quirky" otherwise).

There are still some things to sort out like a proper handling of md5 hashed passwords used in API requests or what should happen if a user has no API token (last one deleted or none ever created).

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Oct 9, 2016

PR with caveats sounds good 👍

For now we wouldn't need the feature to create multiple tokens etc and be good to only migrate existing tokens (each user has one token) but if you're keen on implementing this as well with multi tokens 👍 . Maybe it would be possible to split it into 2 PRs as just making passwords and token more secure by hashing them could be much more simple compared to managing multiple tokens maybe. Ideally the UI would not rely on using API tokens and only use nonce or generated short-term tokens that are always different per session or so for better security. The UI should eventually not use an API token and therefore, when a user deletes all tokens the API would be basically not usable which is a good feature.

Main feature would be: On login bcrypt password if not done yet, on update hash all tokens with SHA256 and generate the tokens in the future more randomly. Those three things would be main improvement for now. Everything else, like app-specific tokens is great to have :)

@mattab

This comment has been minimized.

Copy link
Member

@mattab mattab commented Dec 2, 2016

Awesome work @mneudert @tsteur - this important security improvement merged in #10926 🎉

@mattab mattab closed this Dec 2, 2016
@mlissner

This comment has been minimized.

Copy link

@mlissner mlissner commented Dec 7, 2016

I've only skimmed this thread, but I don't see anywhere that we're keeping track of the bcrypt configuration parameters. Ideally when we upgrade people's existing passwords, we will save all of the following:

  • Algorithm
  • Iterations
  • Salt
  • Hashed password

For example, this is the value that Django stores for every user. It's a $ delimited field in the database:

pbkdf2_sha256$20000$random-salt-here$hashed-password-here

This says:

The password was hashed using 20000 rotations of the pdkdf2 sha256 algorithm using the salt random-salt-here and resulted in the hash hashed-password-here.

Apologies if this is all being taken care of already (I didn't check the code), but we need to store ALL of this so that we can later upgrade the algorithm, the number of rotations, etc. For example, Django regularly upgrades the number of rotations as computer hardware and software become faster. If we don't do this now, we'll have another headache down the road when we need to upgrade things again.

Here's Django's documentation. It's very good.

@tsteur

This comment has been minimized.

Copy link
Member

@tsteur tsteur commented Dec 7, 2016

The algorithm, cost and salt etc is stored in the password see http://us2.php.net/manual/en/function.password-hash.php

@mlissner

This comment has been minimized.

Copy link

@mlissner mlissner commented Dec 7, 2016

Fantastic! I'll go on my now. Apologies for the drive by. I'm very happy to see this improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.