New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement New Password Hashing #796
Comments
Little to add except we might want to put these functions inside a class, so we can add more hashing mechanisms in the future, without changing the logic at all. As I assume this will not be in 2.3, I will write a proposal for this for 2.4. And, as Michael pointed out, I am not a big fan of using a "standard" method of hashing, because it is more likely somebody will create a huge rainbowtable (big enough to include the salts) for LDAP than it is for Symphony. The bad thing here is, if such a table exists they will be relatively cheap compared to having to create one yourself. |
I've just learned about PBKDF2, an iterative key derivation function. It is, similarily to most other hashing-functions, basically creating a derived/hashed key from a given plaintext password. The difference to SHA1, MD5 etc. is that one can decide
The main advantage is that the number of iterations can be increased at will, increasing the algorithm's computational requirements to complete the calculation (obviously requiring to migrate all passwords each time you do so). So if nowadays 10.000 iterations are regarded as sufficient, next year that can be increased to 20.000, then 40.000 etc. There are quite a number of popular projects making use of PBKDF2, too: WPA/WPA2, iOS, Django, FileVault, TrueCrypt, GRUB2 etc. and there exists a simple PHP implementation, too. Also, I now think we should in fact automatically migrate legacy passwords. :-) |
Let's assume we're using a 10-character salt and the average password length is 5 characters. That means there are at least 36^15 possible combinations, each 160 bit long. Thus your rainbow table would be 4.000.000.000.000.000.000 Megabytes (or 3 Yottabytes) in size. It is entirely impossible to calculate and save that much data with current hardware. And if that changes you can simply increase the length of your salt to make the range even bigger. So rainbow tables currently aren't a problem as long as you're doing things right. (The advantage of PBKDF2 over SHA1 is the ability to increase the computational complexity of those roughly 1 quadrillion operations, making them take a longer time to finish) On the other hand picking a non-standard hashing methods could result in a huge security issue without us even noticing. There is huge research and knowledge in cryptography to make sure current algorithms are still in fact secure. And if a security breach has been found, people will notice. |
OK, I've implemented two things:
Currently passwords in the database are simply prefixed by a 5-character handle describing the algorithm used: I need feedback on how to handle different salt lengths/key lengths/iteration counts. To make those parameters completely independent and migratable those need to be saved too. An idea would be to give 5 chars for the algorithm-identifier, 2-3 characters to the salt length and 6-8 characters to the iteration count, i.e.
|
Looks great. You could use a separation character between the salt and the hash to keep them apart. |
OK, now SHA1 and PBKDF2 both support settings-migrating meaning you can change the salt-length or number of iterations at will and old passwords will automatically be migrated. I don't think we should make those values configurable though as weak settings can severely compromise a site's security. Instead we can change those settings in the core directly if we see the need without having to worry about upgrading users messing up their passwords. What I would also like to implement is a version-aware algorithm identifier. Meaning that if we find an issue with salted SHA1 or with PBKDF2 we can fix it, roll out the new algorithm as a separate class and silently migrate passwords to the new one. Prefixes should always be of the same length, i.e.
|
Excellent work @nils-werner. |
Just for the record: I have compared my PBKDF2 implementation with an open source JS implementation (http://anandam.name/pbkdf2/) using a few examples and the results were identical. |
I'd like to just add that maybe it would be better to have 2 salts: one per user, like You described, and one global that would be set in configuration (randomly generated at install time?). That way, if someone gets data from database alone (there were such accidents reported lately, not Symphony related, but it's better to be safe than sorry later :), he/she still will be missing that one salt from the configuration file. Maybe it would be a good idea to have the position of the salt stored inside hash configurable too? So it would be harder for an attacker to read it from a hash. Admin could configure it to be at 5th char, or 18th char, etc... and code would cut it out of hash before starting verification process. |
I think there are a few misconceptions as to why salts exists. The reason they exist is not to hide some information from the hacker but to make it harder to either launch a dictionary-attack or compute an entire rainbow table since the number of possible combinations grows by the factor 2^n where n is the number of bits used as the random salt. If you now traded some of those random salt bits by m secret static salt bits the uncertainty would still be the same as if all passwords in your database had random hashes, thus nothing gained overall. It might however prove advantageous if the hacker was only interested in one single password as he can no longer create a rainbow-table for a specific random yet known salt. Now if those bits were in fact known (and you should never assume the hacker did miss something) to your attacker you're decreasing the possible range of combinations by a whopping 1/(2^m), compromising the cryptology-measures for your entire database. So, for a hashing algorithm to be effective it's important to make the salt as random and as evenly distributed as possible but not to try to hide anything. |
But if you have that "static" salt hashed very early, and then keep hashing with per-user salt, then it might make attacker's life a bit harder, whithout decrasing range of combinations? So:
could be:
|
Yes, but if you do so you could also chose a random hash of length |
Wouldn't creating random hash each time prevent us from validating the password later :)? |
The salt is well known to us as well as potential attackers as it's saved alongside the hash. The fact that algorithm, parameters and salt are known is part of the principle behind hashing. The concern that only time is our shield led others to the invention of PBKDF2. MD5 and SHA1 were designed to be quick so you are able to run them on entire .iso images. PBKDF2 is designed to be slowed down at will by requiring to run the algorithm thousands of times. So you can increase the number of iterations proportional to each year's average computing power and you'll end up with roughly the same effort of cracking a password no matter how advanced your computers are (you're obviously required to constantly upgrade to the new standard). You might see having a "secret salt" as an additional hurdle for the attacker but that only applies if you manage to keep it secret. Once it's secrecy is compromised it imposes a much greater threat than the initial additional advantage. |
I know, but "computer farms", botnets, etc... can grow much faster. So depending only on the number of iterations is not that safe as it looks. Especially when we know that the process of increasing number of iterations will not be happening that often. We also know that we cannot set number of iterations too high, because login process happens on a single server, so we cannot make it crawl too often, while attackers can have whole bunch of servers to re-create hash.
Same applies to salt+algo+iterations+hash without added separate secret salt. Adding separate salt kept outside (in a different place) makes it harder for anyone to get all information at the same time. The only way to read that "static salt" from the configuration file would be to read that file. As soon as someone can do that, he/she will not have to crack any passwords, because he/she will already know the password to the database, have access to the filesystem, etc... So, if anyone can get that salt form a file, you have much bigger problem already :).
How so? Aside from the situation i described above (if someone gets that secret salt, he/she will already have control over whole site, and won't need to read any password), what additional threat there can be? Even assuming attacker got secret salt without accessing the filesystem, or any other information from the configuration file, how will that make a threat greater than reading all the information needed from a single hashed password? In such case, attacker will have both salts, so it will not get worse - just the same as without a secret salt. So i think that with additional secret-salt, that's stored elsewhere (e.g., configuration file, so in the same place that password to the database is stored in) we get more security for free, without compromising anything. |
The site doesn't become any more secure when you do password hashing. Chances are at the point where you can read the passwords you can modify other stuff as well. So we are not worrying about hackers controlling the site here but trying to crack the passwords from the database. Those could be much more important, i.e. login info to online banking, credit cards, insurances etc. and are to be protected. Please don't confuse those two.
It's actually the opposite. Say SHA1 takes 0,0001ms to compute. Configuring PBKDF2 to take 50 to 100ms doesn't make a big difference on our end as the users will barely notice it. But without our users noticing we've increased the complexity by the factor 100,000. This means barely a change for us but 100,000 times as much time or 100,000 times as many computers in the attacker's cluster. |
I did some example calucations, just for the heck of it. :-) Assume
This means 2^200 operations needed for all possibilities, each taking 50ms to complete. On a comparable environment it would take you 2.5478e+51 (that's a 1 with 53 zeros) years to do that OK, now assume the botnet is 1.000.000x as fast. Still 2.5478e+45 years. Now let's assume the computational power of your botnet grows exponentially by the factor of 100 each year. Doing some math using geometric series... That's still approx. 24 years! 24 years of exponentially growing investment and power bills. Oh and at the time the calculations finished the cluster would be 1e+75 times as fast as your server and would've produced 4.7852e+38 Yottabytes of data (or 1.9141e+45 tonnes of microSD-cards). |
While it may seem a great idea to add extra hashes (and it's always good to question these things @ahwayakchih), it is totally unrequired for our needs at present. The need was to add better hashing into Symphony, and @nils-werner has achieved that. If we find it a need in the future due to a security breach that gets reported, then we can re-address the extra hash. As for now, I think this will get added as is when @brendo gets some time. |
I'm not. I was just trying to find out, how You saw it as additional threat, so i tried to think about potential security issues :).
I know that too. I'm just saying that depending on number of iterations alone may be not enough, especially when attacker already has all the information needed (i'm talking about a case in which attacker somehow reads hashed password). Thanks for calculations, but remember that attacker does not have to try all possibilities. Just the most common ones (latest news about leaked passwords show that there are a lot of common passwords used by people everywhere).
I'm not denying that :). I just thought, that we can secure it a tiny bit better, without much more work (a single random salt generation at installation time, added to list of configuration variables stored in configuration file, then just used in password hashing - just a few additional lines of code overall). |
Keep in mind that an attacker will probably try and attack not only one but all passwords at once. So giving all of those passwords a different random salt will keep them as safe as possible. Now for the attacker a lot of random salts are just as bad as one common yet unknown salt. Nothing gained, nothing lost. If however the static salt is in fact known to the attacker you will make cracking the passwords a lot easier. The attack-case in wich a hacker is only interested in one single password of one single person is just a lot more unlikely. And if it did happen you would only gain something if the static salt remains secret. So:
|
By the way, the concept you're proposing is called a "cryptographic pepper". Interestingly enough there seem to be no security research papers covering that topic, all they're talking about are salts. |
Per-user salt just makes it harder to use rainbow tables for an attack. Nothing more. At least not without keeping salts somewhere else than hashes.
That would be true, if You used just one of them - either "salt" or "pepper" :). But using both may make an attack a bit harder to succeed than with only one of them being used to generate a hash.
How so? That would be true if per-user salts were also kept elsewhere ("secret"), because then, attacker would have part of the sercret needed to generate the hash. But You keep salts along with hashes, so they're already known. Even if attacker somehow gets the pepper used for hashing (i've just read that it's not hash when using both salt+pepper, but MAC or HMAC, but that's just naming :), he/she will not have any more info than if there was no pepper used at all. You keep writing about situation in which pepper is known to the attacker. But such situation is less likely to happen than the one with attacker reading all the hashes from the database. "Leaking" hashes from databases was often done with SQL injections, which does not need access to the filesystem. "Leaking" variable from configuration file is a lot less likely to happen (that's why i was writing about worst-case scenario, in which attacker has access to the filesystem - but by then it's game over anyway).
I'm not writing about an attack on a single password. Salts already make it harder to make an attack on all passwords at the same time. Pepper just makes it even harder (at least in case of salts being kept along with hashes in the database).
I guess i did not explain myself enough: i'm not arguing for using just one of "salt" or "pepper" (using pepper alone would be very weak security measure indeed)! I'm just saying that using both is better than using salt alone (at least for this implementation: where "per-user salt" is kept in the same place as the generated hash). Here's an interesting explanation voting for using bigger "pepper": I would agree with You, that using pepper is unnecessary, if "per-user salts" were kept separate (e.g., in the filesystem). In that case, using a single pepper could indeed weaken the hashes. But that's not the case. Here we have a salt kept together with a generated hash. If i were an attacker and got read access to the list of hashes, i would try the most common passwords first (along with user names and other user info, because it would be easy to get too, especially names which are kept in the same table as hashes :), which would make "guessing game" much shorter than trying all the possible combinations. Well... at least in case of a hit, but users are really bad at selecting their passwords, so there would be a big chance for that. Now, if passwords were hashed with both salt and pepper, i would have much harder time guessing the passwords, because i would have to guess the pepper too. In a post i linked above, there's an example of registering as a user, and then reading hashed password. In case of this implementation here (again: salts and hashes kept together in one place) + pepper, that would give me "salt", my password and a hashed password. Which would make it easier to guess the pepper. That's why the author proposes using bigger pepper. Even in such scenario, using pepper will not weaken other hashes at all - because attacker already knows all salts and hashes. So, at worst, pepper will not change anything, and at best it will make guessing all the passwords a bit harder to do.
Thanks, i've read that name somewhere before, but i forgot about it :).
That does not proove either "for" or "against" it, does it? I've found also this article, which is against using pepper: But comments below it seem to show that its author is wrong ;). |
No. Imagine being the hacker. You have 10 bits of known random salt and 10 bits of unknown pepper. Because the pepper is unknown to you you can assume those bits to be random too so in order to create a rainbow table for all passwords all you can do is create one for every possible salt-combination and every possible pepper-combination. Thus, you treat the pepper as if it was additional salt. Regardless if 20bits of salt or 10+10bits of salt+pepper, you need 2^20 times as many attempts for a guaranteed password retrieval.
Also incorrect, since your salt is now shorter. Say again you have 10 bits of salt and 10 bits of pepper. By having a fixed pepper you're giving up randomness in your salt: The attacker can be sure what the value of those 10 bits is and doesn't have to try every possible combination: For a guaranteed compromise of the entire database you'd need 2^10 times less attempts than without a pepper.
You keep writing about situations in which he doesn't. Don't assume best-case scenarios when designing security measures! You should always think about every possible thing that could go wrong and try to still keep that scenario safe. As I said before: The only scenario in which a pepper could help is when the pepper is safe and the attacker isn't trying to create a table for all x bits (salt+password) but for only the salts he can find in the DB and the most common passwords. This advantage converges against zero the more users you have though: the more salts there are the more tries you need until you end up with as many salts as possible pepper-combinations. Also, this assumes that you've been lucky and the system has only been partially compromised, which is not a scenario you should design security measures for. I mean even if you've only got access to the DB there is stuff like MySQL's
It's certainly no proof but I see it as a additional hint that it's of no much use.
I see it differently, in my eyes the comments are wrong. |
Why? I never wrote that salt should be shorter. I would actually opt for longer hashes, salts and pepper.
Again: that would be true, if a salt was not stored together with a hash. Right now, You keep hash and salt in the same place, making salt known to anyone that knows a hash. So, following Your example and current implementation, attacker already knows all 20 bits of salt. With pepper there is at least a big chance that attacker does not know 10 bits (assuming You actually sacrifice part of salt for pepper).
I don't. That's why i described worst case scenarios too.
Indeed.
That's only if MySQL is on the same server as filesystem used by Symphony (or scripts in general). Even less expensive shared hosts here often have databases on separate servers, so such trick wouldn't work anyway.
Well... one can see it both ways. That's why it's not really a hint in any direction.
Oh, so SQL injections do not happen at all :)? Also worrying is the attitude that if there's no paper on something, then it's of no use. Same attitude was often a reason of big breaches, because admins didn't believe that attack is possible at all or possible in practice, because no one wrote a paper on that (or paper assumed that attack is not possible for practical reasons ;). He does however propose to encrypt salts with some password. Same idea is mentioned as one of the two alternatives (using additional pepper, that they call "key", or encrypting salt) described here: |
One more article about advantage of using HMAC: http://rdist.root.org/2009/10/29/stop-using-unsafe-keyed-hashes-use-hmac/ Few years old, but does describe advantage of using additional secret. |
This is getting ridiculous now. Can we please stop this discussion! Enough is enough already! |
Yep, that's the reason why I am suggesting the use of PBKDF2. |
Just so you know: I've merged the PBKDF2 implementation into the hashing-branch and rebased the whole thing ontop of the newest integration. @brendo do you have a timeframe in wich you want to have this task finished? |
Great! Does this also mean that if we - in the future - choose to adapt another algorithm this is possible without any pain? Just add a new default class, and every password will be updated as soon as the user logs in? |
I assume the existing Forgot Password functionality still works though?
Cool, so verifying the hash is a faster procedure then? |
Yep. Also settings (saltlength, number of iterations etc.) in the most recent class is easily changeable, migrating "old hashes" on the fly.
Yes, that's an entirely different feature. Same with login-by-hash-URL. They have nothing to do with password hashing.
No. 0.01 to 0.1 seconds is in fact very slow. For comparison: SHA1 takes about 0.00000000005 seconds to compute. The user won't care but calculating 2 million hashes per second becomes impossible. Once the hash has been confirmed the authenticity of the user is being saved in the session, the hash is never being compared again while the session is valid. |
That's not entirely true. See here, which will log a user in from the |
Yeah, but that's |
So it is! I guess the only change then will be that after logging in to 2.3.1 (ie. the user's password hash updating), their old author token will not be valid. It will regenerate though, it's just for any scripts that have hardcoded the token value. |
Correct. Maybe that part could be rewritten as well, changing the token only when you enable/disable it. This new kind of token could then be generated upon upgrade, using the old values that are still in the database. When people log in afterwards and migrate their hashes, the token would remain the same as before as it is now decoupled from the password hash (at least until the point when somebody disables and re-enables the token access). |
I would appreciate that very much. It would actually be great to have the token completely separated from the usual login hash and have it changeable by the user if they see fit (think application tokens at github). |
I spoke to a few people and most were OK with changing the token on upgrade (including you @designermonkey, what changed your mind? :-)). Plus: The token only changes once you've logged back in and migrated your password by doing so. If the upgrade procedure warns users that this will happen I think we should be OK. Having as many tokens as you like to authorize single applications indeed sounds nice but won't be necessary for most of our users. I think we should look into it more once we have i.e. a full featured REST interface baked into the core. |
We could also log failed login attempts to make it more obvious to dev's that one of their tokens isnt working. |
Oh, and we might only show the "tokens will be changed" message if one or more tokens are actually enabled. We don't want to confuse our users, do we? :-) |
I think this is fine too. From what I understand most agencies setup this 'cron' account under a different username, so until they actually log into Symphony as this user the token will still be valid right? That is of course unless the token is updated upon the first login via Lets keep it simple. |
Maybe a strange suggestion, but isn't it possible to create the key once, and leave the password out of it? So: as soon as the user is created a hash (based on a random number, a timestamp, your favourite pet name, etc) the key is stored in the DB, next to the username and password hash. Of course, you should be able to re-set this key - like when the password is updated (via the backend) - for security reasons. |
|
Yes
I will test it to make sure. |
I can confirm that Symphony does not migrate anything when you use login tokens. Also, how would it? You're not entering your password. |
I thought perhaps if they logged in via usual means, and then went to login with a prior token. Is this nearly ready for a PR? |
That will be impossible. An update followed by a regular login changes the token. An update with a token-login doesn't.
I'd say so, yes. Maybe @michael-e wants to try and break it a bit first though ;-) |
@nils-werner's stuff is over my head (while I simply trust in it), and I haven't really followed this discussion. Additionally, I don't have the time, really. So I think we might just give it a try in the next beta. |
Ok, I've changed the syntax a bit:
I've introduced the Algorithm version part a few weeks ago, this time I gave "salt length" one more character, making the maximum possible salt length 9999 chars, that should be enough. Also the saved iterations count is now divided by 10000. Before I was saving Thinking more about those characters... It might be handy to just use a special character to split the parts by, as @creativedutchmen suggested. |
Ok, splitting makes the whole thing a lot nicer. I opted against splitting algorithm and version though because the |
Using a delimiter will make the code a whole lot cleaner; instead of using lots of substr calls, a single explode will do.
|
Funny coincidence: PHP's developers just accepted a proposal for improved password hashing that looks almost exactly like my implementation. |
Nice! What's holding this from going into this release? |
I don't see any reason to not pull this. |
Go for it! |
Yeah, nothing really. Do you want a "proper Pull Request" or can you do it yourself? |
Proper PR (and I also need to speak with you about a potential security issue too when we meet on Saturday) |
There has been a discussion about Symphony's security at the Symposium 2011. This is an abstract of one of the topics.
Currently passwords for Members are stored using a SHA1 hash with a fixed salt. Author/developer passwords are stored using SHA1 without a salt.
In the future all password storage should use the same hashing function. It should use a random salt, similar to the 'SSHA' password scheme (e.g. in OpenLDAP). (We will use an even longer salt, however.)
(An explanation of password schemes can be found here: http://wiki.dovecot.org/Authentication/PasswordSchemes)
Hashes might basically look like this (50 character string):
Updating password hashes
When Symphony switched from MD5 to SHA1, it was necessary to reset all the author passwords. This should not be necessary now (or in the future) if we use a prefix like in the example above. It's quite easy then to see how a specific password has been hashed and execute the corresponding action (e.g. choose the validation function that fits or initiate the hash update method.)
The logic to update the hash would be quite simple: If the old hash matches the hash which is generated using the password, the new hash will be generated to overwrite the old hash.
It must still be discussed if updating the hashes should be done automatically upon user login. Especially @Phoque (a.k.a. @nils-werner) felt bad about this.
Suggested hash
It has been discussed to generate the salt from the password itself (so there is no need to store it), but as @Phoque pointed out, the salt would not be random then. In that case a rainbow table which fits one Symphony installation (i.e. password hashes) would fit any installation in the world. So we decided to go for a random salt.
We don't use the SSHA mechanism "as is", but choose a longer random salt. Although that may not increase the security as much as one might expect, it's definitely not bad to do this. As @creativedutchmen pointed out, this will be useful if SSHA rainbow tables occur one day since these wouldn't fit to Symphony's method.
(In OpenLDAP the default encoding of the hash and the salt is base64 for SHA-type password schemes. Authentication software like e.g. Dovecot will automatically detect base64 vs. HEX encoding.)
Hashing functions
Currently in
class-general.php
we have:And in Members (
field.memberpassword.php
):It is suggested to add new hashing/validation funtions to
class.general.php
, which can be used everywhere, e.g.:Discussion
The above should be discussed, then implemented. For everyone who wants to participate here is a simple starter test file:
Questions:
The text was updated successfully, but these errors were encountered: