Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement New Password Hashing #796

Closed
michael-e opened this issue Oct 4, 2011 · 60 comments
Closed

Implement New Password Hashing #796

michael-e opened this issue Oct 4, 2011 · 60 comments

Comments

@michael-e
Copy link
Member

There has been a discussion about Symphony's security at the Symposium 2011. This is an abstract of one of the topics.

Currently passwords for Members are stored using a SHA1 hash with a fixed salt. Author/developer passwords are stored using SHA1 without a salt.

In the future all password storage should use the same hashing function. It should use a random salt, similar to the 'SSHA' password scheme (e.g. in OpenLDAP). (We will use an even longer salt, however.)

(An explanation of password schemes can be found here: http://wiki.dovecot.org/Authentication/PasswordSchemes)

Hashes might basically look like this (50 character string):

{SYMSSHA1}P4jeTuq8sLBuXyv0BH/jRcFMquoyRlltZVd6emJa

Updating password hashes

When Symphony switched from MD5 to SHA1, it was necessary to reset all the author passwords. This should not be necessary now (or in the future) if we use a prefix like in the example above. It's quite easy then to see how a specific password has been hashed and execute the corresponding action (e.g. choose the validation function that fits or initiate the hash update method.)

The logic to update the hash would be quite simple: If the old hash matches the hash which is generated using the password, the new hash will be generated to overwrite the old hash.

It must still be discussed if updating the hashes should be done automatically upon user login. Especially @Phoque (a.k.a. @nils-werner) felt bad about this.

Suggested hash

It has been discussed to generate the salt from the password itself (so there is no need to store it), but as @Phoque pointed out, the salt would not be random then. In that case a rainbow table which fits one Symphony installation (i.e. password hashes) would fit any installation in the world. So we decided to go for a random salt.

We don't use the SSHA mechanism "as is", but choose a longer random salt. Although that may not increase the security as much as one might expect, it's definitely not bad to do this. As @creativedutchmen pointed out, this will be useful if SSHA rainbow tables occur one day since these wouldn't fit to Symphony's method.

  • suggested prefix: {SYMSSHA1} (which means 'Symphony's Salted SHA1')
  • suggested hashing algorithm: SHA1
  • suggested salt: random, 10 characters
  • suggested encoding of the hash: base64

(In OpenLDAP the default encoding of the hash and the salt is base64 for SHA-type password schemes. Authentication software like e.g. Dovecot will automatically detect base64 vs. HEX encoding.)

Hashing functions

Currently in class-general.php we have:

/**
 * Uses SHA1 or MD5 to create a hash based on some input
 * This function is currently very basic, but would allow
 * future expansion. Salting the hash comes to mind.
 *
 * @param string $input
 * the string to be hashed
 * @param string $algorithm
 * a valid PHP function handle
 * @return string
 * the hashed string
 */
public static function hash($input, $algorithm='sha1'){
    return call_user_func($algorithm, $input);
}

And in Members (field.memberpassword.php):

/**
 * Given a string, this function will encode it using the
 * field's salt and the sha1 algorithm
 *
 * @param string $password
 * @return string
 */
public function encodePassword($password) {
    return General::hash($this->get('salt') . $password, 'sha1');
}

It is suggested to add new hashing/validation funtions to class.general.php, which can be used everywhere, e.g.:

/**
 * Hash a password using SHA1 and a 10 bit random salt
 * 
 * Example output: {SYMSSHA1}P4jeTuq8sLBuXyv0BH/jRcFMquoyRlltZVd6emJa
 *
 * @param string $password 
 * @return string
 */
public static function hashPasswordSYMSSHA1($password){
    $salt = substr(base64_encode(sha1(mt_rand(), true)), 0, 10);
    $hash = "{SYMSSHA1}" . base64_encode(sha1($password . $salt, true) . $salt);
    return $hash;
}

/**
 * Verify a password (SYMSSHA1 hash given)
 *
 * @param string $password 
 * @param string $hash 
 * @return boolean
 */
public static function verifyPasswordSYMSSHA1($password, $hash){
    $hash = base64_decode(substr($hash, 10));
    $original_hash = substr($hash, 0, 20);
    $salt = substr($hash, 20);
    $new_hash = sha1($password . $salt, true);
    return $original_hash == $new_hash;
}

Discussion

The above should be discussed, then implemented. For everyone who wants to participate here is a simple starter test file:

<?php

    $pass = 'hello';
    $hash = hashpasswordSYMSSHA1($pass);
    echo("Hash:          ");
    echo $hash;
    echo "\r\n";
    echo("Verification:  ");
    echo var_dump(verifyPasswordSYMSSHA1($pass, $hash));

    function hashPasswordSYMSSHA1($password){
        $salt = substr(base64_encode(sha1(mt_rand(), true)), 0, 10);
        $hash = "{SYMSSHA1}" . base64_encode(sha1($password . $salt, true) . $salt);
        return $hash;
    }

    function verifyPasswordSYMSSHA1($password, $hash){
        $hash = base64_decode(substr($hash, 10));
        $original_hash = substr($hash, 0, 20);
        $salt = substr($hash, 20);
        $new_hash = sha1($password . $salt, true);
        return $original_hash == $new_hash;
    }

Questions:

  1. Do we all agree that we roll out our own mechanism as described above? (Or would you rather like Symphony to use a widely adopted standard mechanism? As described above, wide adoption may also be a disadvantage.)
  2. Is the above mechanism considered secure? Are there any mistakes? (This question mainly goes to @creativedutchmen and @nils-werner)
  3. Shall we update the hashes automatically when a user/author logs in (without telling him)? Or should we wait until the user changes his password (in which case the user/author will expect that something stored in the system will change)?
@creativedutchmen
Copy link
Member

Little to add except we might want to put these functions inside a class, so we can add more hashing mechanisms in the future, without changing the logic at all. As I assume this will not be in 2.3, I will write a proposal for this for 2.4.

And, as Michael pointed out, I am not a big fan of using a "standard" method of hashing, because it is more likely somebody will create a huge rainbowtable (big enough to include the salts) for LDAP than it is for Symphony.

The bad thing here is, if such a table exists they will be relatively cheap compared to having to create one yourself.

@nils-werner
Copy link
Contributor

I've just learned about PBKDF2, an iterative key derivation function.

It is, similarily to most other hashing-functions, basically creating a derived/hashed key from a given plaintext password. The difference to SHA1, MD5 etc. is that one can decide

  1. how long the derived key will be
  2. how many iterations the algorithm has to run to obtain the key

The main advantage is that the number of iterations can be increased at will, increasing the algorithm's computational requirements to complete the calculation (obviously requiring to migrate all passwords each time you do so).

So if nowadays 10.000 iterations are regarded as sufficient, next year that can be increased to 20.000, then 40.000 etc.

There are quite a number of popular projects making use of PBKDF2, too: WPA/WPA2, iOS, Django, FileVault, TrueCrypt, GRUB2 etc. and there exists a simple PHP implementation, too.

Also, I now think we should in fact automatically migrate legacy passwords. :-)

@nils-werner
Copy link
Contributor

And, as Michael pointed out, I am not a big fan of using a "standard" method of hashing, because it is more likely somebody will create a huge rainbowtable (big enough to include the salts) for LDAP than it is for Symphony.

Let's assume we're using a 10-character salt and the average password length is 5 characters. That means there are at least 36^15 possible combinations, each 160 bit long. Thus your rainbow table would be 4.000.000.000.000.000.000 Megabytes (or 3 Yottabytes) in size.

It is entirely impossible to calculate and save that much data with current hardware. And if that changes you can simply increase the length of your salt to make the range even bigger. So rainbow tables currently aren't a problem as long as you're doing things right.

(The advantage of PBKDF2 over SHA1 is the ability to increase the computational complexity of those roughly 1 quadrillion operations, making them take a longer time to finish)

On the other hand picking a non-standard hashing methods could result in a huge security issue without us even noticing. There is huge research and knowledge in cryptography to make sure current algorithms are still in fact secure. And if a security breach has been found, people will notice.

@nils-werner
Copy link
Contributor

OK, I've implemented two things:

  1. A more abstract system of calculating and comparing hashed passwords. There is a separate implementation for MD5, SHA1 and salted SHA1 and one unifying Cryptography class. The Cryptography-class is "holding the actual implementations together" and is capable of deciding what algorithm to use and whether a hash has to be migrated. All old hashes, including MD5 are automatically being migrated upon login. The code resides in the hashing branch.
  2. An implementation of PBKDF2 also using the new abstracted Cryptography class. Again, old passwords (including salted SHA1 this time) will automatically be migrated to PBKDF2. The additional code for this algorithm is in the pbkdf2 branch.

Currently passwords in the database are simply prefixed by a 5-character handle describing the algorithm used: SSHA1 and PBKDF. The salt-length is fixed to 10 characters while the keylength is fixed to 40 characters. PBKDF iterations are fixed to 10000.

I need feedback on how to handle different salt lengths/key lengths/iteration counts. To make those parameters completely independent and migratable those need to be saved too. An idea would be to give 5 chars for the algorithm-identifier, 2-3 characters to the salt length and 6-8 characters to the iteration count, i.e.

PBKDF0100010000098f6bcd4621d373cade4e832627b4f6....
|    |  |       |         ^begin of hash
|    |  |       ^begin of salt
|    |  ^iterations count
|    ^salt length
^Algorithm-identifier

@creativedutchmen
Copy link
Member

Looks great. You could use a separation character between the salt and the hash to keep them apart.

@nils-werner
Copy link
Contributor

OK, now SHA1 and PBKDF2 both support settings-migrating meaning you can change the salt-length or number of iterations at will and old passwords will automatically be migrated.

I don't think we should make those values configurable though as weak settings can severely compromise a site's security. Instead we can change those settings in the core directly if we see the need without having to worry about upgrading users messing up their passwords.

What I would also like to implement is a version-aware algorithm identifier. Meaning that if we find an issue with salted SHA1 or with PBKDF2 we can fix it, roll out the new algorithm as a separate class and silently migrate passwords to the new one. Prefixes should always be of the same length, i.e.

  • SSHA1Xv1 for salted SHA1, version 1
  • SSHA1Xv2 salted SHA1, version 2
  • PBKDF2v1 PBKDF2, version 1
  • PBKDF2v2 PBKDF2, version 1

@designermonkey
Copy link
Member

Excellent work @nils-werner.

@nils-werner
Copy link
Contributor

Just for the record: I have compared my PBKDF2 implementation with an open source JS implementation (http://anandam.name/pbkdf2/) using a few examples and the results were identical.

@ahwayakchih
Copy link
Contributor

I'd like to just add that maybe it would be better to have 2 salts: one per user, like You described, and one global that would be set in configuration (randomly generated at install time?). That way, if someone gets data from database alone (there were such accidents reported lately, not Symphony related, but it's better to be safe than sorry later :), he/she still will be missing that one salt from the configuration file.
If someone gets access to the filesystem, then it'll be a lost case anyway.

Maybe it would be a good idea to have the position of the salt stored inside hash configurable too? So it would be harder for an attacker to read it from a hash. Admin could configure it to be at 5th char, or 18th char, etc... and code would cut it out of hash before starting verification process.

@nils-werner
Copy link
Contributor

I think there are a few misconceptions as to why salts exists. The reason they exist is not to hide some information from the hacker but to make it harder to either launch a dictionary-attack or compute an entire rainbow table since the number of possible combinations grows by the factor 2^n where n is the number of bits used as the random salt.

If you now traded some of those random salt bits by m secret static salt bits the uncertainty would still be the same as if all passwords in your database had random hashes, thus nothing gained overall. It might however prove advantageous if the hacker was only interested in one single password as he can no longer create a rainbow-table for a specific random yet known salt.

Now if those bits were in fact known (and you should never assume the hacker did miss something) to your attacker you're decreasing the possible range of combinations by a whopping 1/(2^m), compromising the cryptology-measures for your entire database.

So, for a hashing algorithm to be effective it's important to make the salt as random and as evenly distributed as possible but not to try to hide anything.

@ahwayakchih
Copy link
Contributor

But if you have that "static" salt hashed very early, and then keep hashing with per-user salt, then it might make attacker's life a bit harder, whithout decrasing range of combinations?

So:

"{SYMSSHA1}" . base64_encode(sha1($password . $salt, true) . $salt);

could be:

"{SYMSSHA1}" . base64_encode(sha1($password . $salt . $staticsalt, true) . $salt);

@nils-werner
Copy link
Contributor

Yes, but if you do so you could also chose a random hash of length strlen($salt) + strlen($staticsalt) and all the above still applies.

@ahwayakchih
Copy link
Contributor

Wouldn't creating random hash each time prevent us from validating the password later :)?
Also $staticsalt is a randomly generated string. It's just generated once and stored outside of database.
Anyway, i guess i just do not like leaving everything in one place, including description of algorithm (function, iterations, etc...) that was used to generate the data, because then we have only a time as our "shield", and computers/computer-farms are getting faster as we speak here (so they will be able to re-generate data sooner then we think they will). New versions of Symphony are introduced much slower than that, not to mention that sites are updated even less often.
User passwords tend to be short and some are very common, that is why it would be better to have that additional "secret" stored outside of database - to make re-creating a password hash a bit harder, even if only by a tiny bit.

@nils-werner
Copy link
Contributor

The salt is well known to us as well as potential attackers as it's saved alongside the hash. The fact that algorithm, parameters and salt are known is part of the principle behind hashing.

The concern that only time is our shield led others to the invention of PBKDF2. MD5 and SHA1 were designed to be quick so you are able to run them on entire .iso images. PBKDF2 is designed to be slowed down at will by requiring to run the algorithm thousands of times. So you can increase the number of iterations proportional to each year's average computing power and you'll end up with roughly the same effort of cracking a password no matter how advanced your computers are (you're obviously required to constantly upgrade to the new standard).

You might see having a "secret salt" as an additional hurdle for the attacker but that only applies if you manage to keep it secret. Once it's secrecy is compromised it imposes a much greater threat than the initial additional advantage.

@ahwayakchih
Copy link
Contributor

PBKDF2 is designed to be slowed down at will by requiring to run the algorithm thousands of times.

I know, but "computer farms", botnets, etc... can grow much faster. So depending only on the number of iterations is not that safe as it looks. Especially when we know that the process of increasing number of iterations will not be happening that often. We also know that we cannot set number of iterations too high, because login process happens on a single server, so we cannot make it crawl too often, while attackers can have whole bunch of servers to re-create hash.
So, without any additional secret (that's not some textbook example of most common passwords ever ;), when attacker reads hashes from database, he/she will get everything that's needed to crack the password along with our step-by-step instruction how to do that faster.

if you manage to keep it secret

Same applies to salt+algo+iterations+hash without added separate secret salt. Adding separate salt kept outside (in a different place) makes it harder for anyone to get all information at the same time. The only way to read that "static salt" from the configuration file would be to read that file. As soon as someone can do that, he/she will not have to crack any passwords, because he/she will already know the password to the database, have access to the filesystem, etc... So, if anyone can get that salt form a file, you have much bigger problem already :).

Once it's secrecy is compromised it imposes a much greater threat than the initial additional advantage.

How so? Aside from the situation i described above (if someone gets that secret salt, he/she will already have control over whole site, and won't need to read any password), what additional threat there can be? Even assuming attacker got secret salt without accessing the filesystem, or any other information from the configuration file, how will that make a threat greater than reading all the information needed from a single hashed password? In such case, attacker will have both salts, so it will not get worse - just the same as without a secret salt.

So i think that with additional secret-salt, that's stored elsewhere (e.g., configuration file, so in the same place that password to the database is stored in) we get more security for free, without compromising anything.
What's more is that algorithms may use that secret-salt, but do not have to, so it does not lessen flexibility of the system you guys described.

@nils-werner
Copy link
Contributor

As soon as someone can do that, he/she will not have to crack any passwords, because he/she will already know the password to the database, have access to the filesystem, etc... So, if anyone can get that salt form a file, you have much bigger problem already :).

The site doesn't become any more secure when you do password hashing. Chances are at the point where you can read the passwords you can modify other stuff as well. So we are not worrying about hackers controlling the site here but trying to crack the passwords from the database. Those could be much more important, i.e. login info to online banking, credit cards, insurances etc. and are to be protected.

Please don't confuse those two.

We also know that we cannot set number of iterations too high, because login process happens on a single server, so we cannot make it crawl too often, while attackers can have whole bunch of servers to re-create hash.

It's actually the opposite. Say SHA1 takes 0,0001ms to compute. Configuring PBKDF2 to take 50 to 100ms doesn't make a big difference on our end as the users will barely notice it. But without our users noticing we've increased the complexity by the factor 100,000. This means barely a change for us but 100,000 times as much time or 100,000 times as many computers in the attacker's cluster.

@nils-werner
Copy link
Contributor

I know, but "computer farms", botnets, etc... can grow much faster.

I did some example calucations, just for the heck of it. :-) Assume

  • We're using PBKDF2
  • 50ms per hash on your Server
  • 160bit (20 char) salt
  • 40bit (5 char) password

This means 2^200 operations needed for all possibilities, each taking 50ms to complete. On a comparable environment it would take you 2.5478e+51 (that's a 1 with 53 zeros) years to do that

OK, now assume the botnet is 1.000.000x as fast. Still 2.5478e+45 years.

Now let's assume the computational power of your botnet grows exponentially by the factor of 100 each year. Doing some math using geometric series... That's still approx. 24 years! 24 years of exponentially growing investment and power bills.

Oh and at the time the calculations finished the cluster would be 1e+75 times as fast as your server and would've produced 4.7852e+38 Yottabytes of data (or 1.9141e+45 tonnes of microSD-cards).

@designermonkey
Copy link
Member

While it may seem a great idea to add extra hashes (and it's always good to question these things @ahwayakchih), it is totally unrequired for our needs at present. The need was to add better hashing into Symphony, and @nils-werner has achieved that.

If we find it a need in the future due to a security breach that gets reported, then we can re-address the extra hash. As for now, I think this will get added as is when @brendo gets some time.

@ahwayakchih
Copy link
Contributor

Please don't confuse those two.

I'm not. I was just trying to find out, how You saw it as additional threat, so i tried to think about potential security issues :).

This means barely a change for us but 100,000 times as much time or 100,000 times as many computers in the attacker's cluster.

I know that too. I'm just saying that depending on number of iterations alone may be not enough, especially when attacker already has all the information needed (i'm talking about a case in which attacker somehow reads hashed password).
I'm not saying that it will somehow horribly slow down Symphony (slow down will be almost invisible, and only at the login).

Thanks for calculations, but remember that attacker does not have to try all possibilities. Just the most common ones (latest news about leaked passwords show that there are a lot of common passwords used by people everywhere).

The need was to add better hashing into Symphony, and @nils-werner has achieved that.

I'm not denying that :). I just thought, that we can secure it a tiny bit better, without much more work (a single random salt generation at installation time, added to list of configuration variables stored in configuration file, then just used in password hashing - just a few additional lines of code overall).

@nils-werner
Copy link
Contributor

Keep in mind that an attacker will probably try and attack not only one but all passwords at once. So giving all of those passwords a different random salt will keep them as safe as possible. Now for the attacker a lot of random salts are just as bad as one common yet unknown salt. Nothing gained, nothing lost.

If however the static salt is in fact known to the attacker you will make cracking the passwords a lot easier.

The attack-case in wich a hacker is only interested in one single password of one single person is just a lot more unlikely. And if it did happen you would only gain something if the static salt remains secret.

So:

  • Attacker interested in one password, salt hidden: Your concept keeps one hash more secure
  • Attacker interested in one password, salt compromised: Your concept is equal to the normal method
  • Attacker interested in all passwords, salt hidden: Your concept is equal to the normal method
  • Attacker interested in all passwords, salt compromised: Your concept compromises the security of all hashes

@nils-werner
Copy link
Contributor

By the way, the concept you're proposing is called a "cryptographic pepper". Interestingly enough there seem to be no security research papers covering that topic, all they're talking about are salts.

@ahwayakchih
Copy link
Contributor

Keep in mind that an attacker will probably try and attack not only one but all passwords at once. So giving all of those passwords a different random salt will keep them as safe as possible.

Per-user salt just makes it harder to use rainbow tables for an attack. Nothing more. At least not without keeping salts somewhere else than hashes.

Now for the attacker a lot of random salts are just as bad as one common yet unknown salt. Nothing gained, nothing lost.

That would be true, if You used just one of them - either "salt" or "pepper" :). But using both may make an attack a bit harder to succeed than with only one of them being used to generate a hash.

If however the static salt is in fact known to the attacker you will make cracking the passwords a lot easier.

How so? That would be true if per-user salts were also kept elsewhere ("secret"), because then, attacker would have part of the sercret needed to generate the hash. But You keep salts along with hashes, so they're already known. Even if attacker somehow gets the pepper used for hashing (i've just read that it's not hash when using both salt+pepper, but MAC or HMAC, but that's just naming :), he/she will not have any more info than if there was no pepper used at all.

You keep writing about situation in which pepper is known to the attacker. But such situation is less likely to happen than the one with attacker reading all the hashes from the database. "Leaking" hashes from databases was often done with SQL injections, which does not need access to the filesystem. "Leaking" variable from configuration file is a lot less likely to happen (that's why i was writing about worst-case scenario, in which attacker has access to the filesystem - but by then it's game over anyway).
Additionally, when the database and files are on separate servers, there is even less chance for an attacker to have access to the filesystem if he/she gets access to read the database (of course, if an attacker gets access to the filesystem first, then a separate database server will not help at all, but again: in such case all is lost anyway), so pepper stored in the configuration file is probably much harder to read by an attacker than hashes kept in the database.

The attack-case in wich a hacker is only interested in one single password of one single person is just a lot more unlikely.

I'm not writing about an attack on a single password. Salts already make it harder to make an attack on all passwords at the same time. Pepper just makes it even harder (at least in case of salts being kept along with hashes in the database).

So: [...]

I guess i did not explain myself enough: i'm not arguing for using just one of "salt" or "pepper" (using pepper alone would be very weak security measure indeed)! I'm just saying that using both is better than using salt alone (at least for this implementation: where "per-user salt" is kept in the same place as the generated hash).

Here's an interesting explanation voting for using bigger "pepper":
http://forums.devnetwork.net/viewtopic.php?f=34&t=127891

I would agree with You, that using pepper is unnecessary, if "per-user salts" were kept separate (e.g., in the filesystem). In that case, using a single pepper could indeed weaken the hashes. But that's not the case. Here we have a salt kept together with a generated hash.

If i were an attacker and got read access to the list of hashes, i would try the most common passwords first (along with user names and other user info, because it would be easy to get too, especially names which are kept in the same table as hashes :), which would make "guessing game" much shorter than trying all the possible combinations. Well... at least in case of a hit, but users are really bad at selecting their passwords, so there would be a big chance for that. Now, if passwords were hashed with both salt and pepper, i would have much harder time guessing the passwords, because i would have to guess the pepper too.

In a post i linked above, there's an example of registering as a user, and then reading hashed password. In case of this implementation here (again: salts and hashes kept together in one place) + pepper, that would give me "salt", my password and a hashed password. Which would make it easier to guess the pepper. That's why the author proposes using bigger pepper. Even in such scenario, using pepper will not weaken other hashes at all - because attacker already knows all salts and hashes. So, at worst, pepper will not change anything, and at best it will make guessing all the passwords a bit harder to do.

By the way, the concept you're proposing is called a "cryptographic pepper".

Thanks, i've read that name somewhere before, but i forgot about it :).

Interestingly enough there seem to be no security research papers covering that topic, all they're talking about are salts.

That does not proove either "for" or "against" it, does it?

I've found also this article, which is against using pepper:
http://blog.ircmaxell.com/2012/04/properly-salting-passwords-case-against.html

But comments below it seem to show that its author is wrong ;).

@nils-werner
Copy link
Contributor

Now for the attacker a lot of random salts are just as bad as one common yet unknown salt. Nothing gained, nothing lost.

That would be true, if You used just one of them - either "salt" or "pepper" :). But using both may make an attack a bit harder to succeed than with only one of them being used to generate a hash.

No. Imagine being the hacker. You have 10 bits of known random salt and 10 bits of unknown pepper. Because the pepper is unknown to you you can assume those bits to be random too so in order to create a rainbow table for all passwords all you can do is create one for every possible salt-combination and every possible pepper-combination. Thus, you treat the pepper as if it was additional salt. Regardless if 20bits of salt or 10+10bits of salt+pepper, you need 2^20 times as many attempts for a guaranteed password retrieval.

If however the static salt is in fact known to the attacker you will make cracking the passwords a lot easier.

How so? That would be true if per-user salts were also kept elsewhere ("secret"), because then, attacker would have part of the sercret needed to generate the hash. But You keep salts along with hashes, so they're already known. Even if attacker somehow gets the pepper used for hashing (i've just read that it's not hash when using both salt+pepper, but MAC or HMAC, but that's just naming :), he/she will not have any more info than if there was no pepper used at all.

Also incorrect, since your salt is now shorter. Say again you have 10 bits of salt and 10 bits of pepper. By having a fixed pepper you're giving up randomness in your salt: The attacker can be sure what the value of those 10 bits is and doesn't have to try every possible combination: For a guaranteed compromise of the entire database you'd need 2^10 times less attempts than without a pepper.

You keep writing about situation in which pepper is known to the attacker.

You keep writing about situations in which he doesn't. Don't assume best-case scenarios when designing security measures! You should always think about every possible thing that could go wrong and try to still keep that scenario safe.

As I said before: The only scenario in which a pepper could help is when the pepper is safe and the attacker isn't trying to create a table for all x bits (salt+password) but for only the salts he can find in the DB and the most common passwords. This advantage converges against zero the more users you have though: the more salts there are the more tries you need until you end up with as many salts as possible pepper-combinations.

Also, this assumes that you've been lucky and the system has only been partially compromised, which is not a scenario you should design security measures for. I mean even if you've only got access to the DB there is stuff like MySQL's LOAD_FILE that can potentially give you access to almost any file you want.

Interestingly enough there seem to be no security research papers covering that topic, all they're talking about are salts.

That does not proove either "for" or "against" it, does it?

It's certainly no proof but I see it as a additional hint that it's of no much use.

I've found also this article, which is against using pepper:
http://blog.ircmaxell.com/2012/04/properly-salting-passwords-case-against.html

But comments below it seem to show that its author is wrong ;).

I see it differently, in my eyes the comments are wrong.

@ahwayakchih
Copy link
Contributor

Also incorrect, since your salt is now shorter.

Why? I never wrote that salt should be shorter. I would actually opt for longer hashes, salts and pepper.

Say again you have 10 bits of salt and 10 bits of pepper. By having a fixed pepper you're giving up randomness in your salt: The attacker can be sure what the value of those 10 bits is and doesn't have to try every possible combination: For a guaranteed compromise of the entire database you'd need 2^10 times less attempts than without a pepper.

Again: that would be true, if a salt was not stored together with a hash. Right now, You keep hash and salt in the same place, making salt known to anyone that knows a hash. So, following Your example and current implementation, attacker already knows all 20 bits of salt. With pepper there is at least a big chance that attacker does not know 10 bits (assuming You actually sacrifice part of salt for pepper).

Don't assume best-case scenarios when designing security measures!

I don't. That's why i described worst case scenarios too.

You should always think about every possible thing that could go wrong and try to still keep that scenario safe.

Indeed.

I mean even if you've only got access to the DB there is stuff like MySQL's LOAD_FILE that can potentially give you access to almost any file you want.

That's only if MySQL is on the same server as filesystem used by Symphony (or scripts in general). Even less expensive shared hosts here often have databases on separate servers, so such trick wouldn't work anyway.
And, as i keep writing, even in worst case scenario, knowing pepper will not make hash weaker than with current implementation, where attacker knows salt and hash. But without pepper, hash is tiny bit harder to break, because there's more random data (i.e., attacker has to assume that password is stronger, because from his/her point of view password is now something short that users keep using plus something long and random that was generated by server).

It's certainly no proof but I see it as a additional hint that it's of no much use.

Well... one can see it both ways. That's why it's not really a hint in any direction.

I see it differently, in my eyes the comments are wrong.

Oh, so SQL injections do not happen at all :)? Also worrying is the attitude that if there's no paper on something, then it's of no use. Same attitude was often a reason of big breaches, because admins didn't believe that attack is possible at all or possible in practice, because no one wrote a paper on that (or paper assumed that attack is not possible for practical reasons ;).

He does however propose to encrypt salts with some password. Same idea is mentioned as one of the two alternatives (using additional pepper, that they call "key", or encrypting salt) described here:

http://crackstation.net/hashing-security.htm

@ahwayakchih
Copy link
Contributor

One more article about advantage of using HMAC:

http://rdist.root.org/2009/10/29/stop-using-unsafe-keyed-hashes-use-hmac/

Few years old, but does describe advantage of using additional secret.

@designermonkey
Copy link
Member

This is getting ridiculous now. Can we please stop this discussion! Enough is enough already!

@nils-werner
Copy link
Contributor

Yep, that's the reason why I am suggesting the use of PBKDF2.

@nils-werner
Copy link
Contributor

Just so you know: I've merged the PBKDF2 implementation into the hashing-branch and rebased the whole thing ontop of the newest integration.

@brendo do you have a timeframe in wich you want to have this task finished?

@creativedutchmen
Copy link
Member

Great! Does this also mean that if we - in the future - choose to adapt another algorithm this is possible without any pain? Just add a new default class, and every password will be updated as soon as the user logs in?

@brendo
Copy link
Member

brendo commented Aug 12, 2012

One sideffect that is worth mentioning: If you've forgotten your password but have access to the database you can set it to MD5(hello) and you'll have "hello" PBKDF2-ified once you log in again. I find that pretty neat as calculating the PBKDF2 and setting all properties in the table is pretty hard to do.

I assume the existing Forgot Password functionality still works though?

Approximately 0.01 to 0.1 second during login. The algorithm only runs during login, once logged in there will be no further delays.

Cool, so verifying the hash is a faster procedure then?

@nils-werner
Copy link
Contributor

Great! Does this also mean that if we - in the future - choose to adapt another algorithm this is possible without any pain?

Yep. Also settings (saltlength, number of iterations etc.) in the most recent class is easily changeable, migrating "old hashes" on the fly.

I assume the existing Forgot Password functionality still works though?

Yes, that's an entirely different feature. Same with login-by-hash-URL. They have nothing to do with password hashing.

Cool, so verifying the hash is a faster procedure then?

No. 0.01 to 0.1 seconds is in fact very slow. For comparison: SHA1 takes about 0.00000000005 seconds to compute. The user won't care but calculating 2 million hashes per second becomes impossible.

Once the hash has been confirmed the authenticity of the user is being saved in the session, the hash is never being compared again while the session is valid.

@brendo
Copy link
Member

brendo commented Aug 13, 2012

Yes, that's an entirely different feature. Same with login-by-hash-URL. They have nothing to do with password hashing.

That's not entirely true. See here, which will log a user in from the ?auth-token=xxxxx and it assumes the hash is SHA1. Unless your PR has already considered that.

@nils-werner
Copy link
Contributor

Yeah, but that's SHA1(username, hash)

@brendo
Copy link
Member

brendo commented Aug 13, 2012

So it is! I guess the only change then will be that after logging in to 2.3.1 (ie. the user's password hash updating), their old author token will not be valid. It will regenerate though, it's just for any scripts that have hardcoded the token value.

@nils-werner
Copy link
Contributor

Correct. Maybe that part could be rewritten as well, changing the token only when you enable/disable it. This new kind of token could then be generated upon upgrade, using the old values that are still in the database.

When people log in afterwards and migrate their hashes, the token would remain the same as before as it is now decoupled from the password hash (at least until the point when somebody disables and re-enables the token access).

@designermonkey
Copy link
Member

Maybe that part could be rewritten as well, changing the token only when you enable/disable it.

I would appreciate that very much.

It would actually be great to have the token completely separated from the usual login hash and have it changeable by the user if they see fit (think application tokens at github).

@nils-werner
Copy link
Contributor

I spoke to a few people and most were OK with changing the token on upgrade (including you @designermonkey, what changed your mind? :-)). Plus: The token only changes once you've logged back in and migrated your password by doing so. If the upgrade procedure warns users that this will happen I think we should be OK.

Having as many tokens as you like to authorize single applications indeed sounds nice but won't be necessary for most of our users. I think we should look into it more once we have i.e. a full featured REST interface baked into the core.

@nils-werner
Copy link
Contributor

We could also log failed login attempts to make it more obvious to dev's that one of their tokens isnt working.

@nils-werner
Copy link
Contributor

Oh, and we might only show the "tokens will be changed" message if one or more tokens are actually enabled. We don't want to confuse our users, do we? :-)

@brendo
Copy link
Member

brendo commented Aug 14, 2012

I spoke to a few people and most were OK with changing the token on upgrade (including you @designermonkey, what changed your mind? :-)). Plus: The token only changes once you've logged back in and migrated your password by doing so. If the upgrade procedure warns users that this will happen I think we should be OK.

I think this is fine too. From what I understand most agencies setup this 'cron' account under a different username, so until they actually log into Symphony as this user the token will still be valid right? That is of course unless the token is updated upon the first login via ?auth-token.

Lets keep it simple.

@creativedutchmen
Copy link
Member

Maybe a strange suggestion, but isn't it possible to create the key once, and leave the password out of it? So: as soon as the user is created a hash (based on a random number, a timestamp, your favourite pet name, etc) the key is stored in the DB, next to the username and password hash.

Of course, you should be able to re-set this key - like when the password is updated (via the backend) - for security reasons.

@brendo
Copy link
Member

brendo commented Aug 14, 2012

Lets keep it simple.

@nils-werner
Copy link
Contributor

so until they actually log into Symphony as this user the token will still be valid right?

Yes

That is of course unless the token is updated upon the first login via ?auth-token.

I will test it to make sure.

@nils-werner
Copy link
Contributor

I can confirm that Symphony does not migrate anything when you use login tokens. Also, how would it? You're not entering your password.

@brendo
Copy link
Member

brendo commented Aug 19, 2012

I thought perhaps if they logged in via usual means, and then went to login with a prior token. Is this nearly ready for a PR?

@nils-werner
Copy link
Contributor

I thought perhaps if they logged in via usual means, and then went to login with a prior token.

That will be impossible. An update followed by a regular login changes the token. An update with a token-login doesn't.

Is this nearly ready for a PR?

I'd say so, yes. Maybe @michael-e wants to try and break it a bit first though ;-)

@michael-e
Copy link
Member Author

@nils-werner's stuff is over my head (while I simply trust in it), and I haven't really followed this discussion. Additionally, I don't have the time, really. So I think we might just give it a try in the next beta.

@nils-werner
Copy link
Contributor

Ok, I've changed the syntax a bit:

PBKDFv100200000000198f6bcd4621d373cade4e832627b4f6....
|    | |   |       |                  ^begin of hash
|    | |   |       ^begin of salt
|    | |   ^iterations count divided by 10000
|    | ^salt length
|    ^ Algorithm version
^Algorithm-identifier

I've introduced the Algorithm version part a few weeks ago, this time I gave "salt length" one more character, making the maximum possible salt length 9999 chars, that should be enough.

Also the saved iterations count is now divided by 10000. Before I was saving 00010000 for 10000, now I'm saving 00000001. This means less redundant zeros and more headroom for more iterations, should we see the need to increase them.

Thinking more about those characters... It might be handy to just use a special character to split the parts by, as @creativedutchmen suggested.

@nils-werner
Copy link
Contributor

Ok, splitting makes the whole thing a lot nicer. I opted against splitting algorithm and version though because the Cryptography class needs to be able to parse them easily. A single 8-character long identifier suits this purpose well.

@creativedutchmen
Copy link
Member

Using a delimiter will make the code a whole lot cleaner; instead of using lots of substr calls, a single explode will do.

Ok, splitting makes the whole thing a lot nicer. I opted against splitting algorithm and version though because the Cryptography class needs to be able to parse them easily. A single 8-character long identifier suits this purpose well.


Reply to this email directly or view it on GitHub.

@nils-werner
Copy link
Contributor

Funny coincidence: PHP's developers just accepted a proposal for improved password hashing that looks almost exactly like my implementation.

@brendo
Copy link
Member

brendo commented Sep 14, 2012

Nice! What's holding this from going into this release?

@michael-e
Copy link
Member Author

I don't see any reason to not pull this.

@designermonkey
Copy link
Member

Go for it!

@nils-werner
Copy link
Contributor

Yeah, nothing really. Do you want a "proper Pull Request" or can you do it yourself?

@brendo
Copy link
Member

brendo commented Sep 19, 2012

Proper PR (and I also need to speak with you about a potential security issue too when we meet on Saturday)

@nils-werner nils-werner mentioned this issue Sep 23, 2012
@brendo brendo closed this as completed Sep 23, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants