-
Notifications
You must be signed in to change notification settings - Fork 39
/
hashing-security.php
executable file
·1046 lines (906 loc) · 49 KB
/
hashing-security.php
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<h1>Salted Password Hashing - Doing it Right</h1>
<p>
If you're a web developer, you've probably had to make a user account system.
The most important aspect of a user account system is how user passwords are
protected. User account databases are hacked frequently, so you absolutely must
do something to protect your users' passwords if your website is ever breached.
The best way to protect passwords is to employ <b>salted password hashing</b>.
This page will explain why it's done the way it is.
</p>
<p>
There are a lot of conflicting ideas and misconceptions on how to do password
hashing properly, probably due to the abundance of misinformation on the web.
Password hashing is one of those things that's so simple, but yet so many people
get wrong. With this page, I hope to explain not only the correct way to do it,
but why it should be done that way.
</p>
<div style="background-color: #FFCCCC; border: solid red 1px; padding: 10px;">
<strong>IMPORTANT WARNING:</strong> If you are thinking of writing your own
password hashing code, <strong>please don't!</strong>. It's too easy to screw
up. No, that cryptography course you took in university doesn't make you exempt
from this warning. This applies to everyone: <strong>DO NOT WRITE YOUR OWN
CRYPTO!</strong> The problem of storing passwords has already been solved. Use
either use either <a href="http://www.openwall.com/phpass/">phpass</a>, the PHP,
C#, Java, and Ruby implementations in <a
href="https://github.com/defuse/password-hashing">defuse/password-hashing</a>,
or <a
href="https://download.libsodium.org/doc/password_hashing/index.html">libsodium</a>.
</div>
<p>
If for some reason you missed that big red warning note, please go read it now.
Really, this guide is <b>not</b> meant to walk you through the process of
writing your own storage system, it's to explain the reasons why passwords
should be stored a certain way.
</p>
<p>You may use the following links to jump to the different sections of this page.</p>
<table class="shortcuts">
<tbody><tr>
<td><a href="#normalhashing" class="sca" title="What are hash functions and why are they used?">1. What is password hashing?</a></td>
<td><a href="#attacks" class="sca" title="Methods for making hash cracking more efficient">2. How Hashes are Cracked</a></td>
<td><a href="#salt" class="sca" title="Adding salt to render hash cracking attacks less effective">3. Adding Salt</a></td>
</tr>
<tr>
<td><a href="#ineffective" class="sca" title="The wrong way to do password hashing">4. Ineffective Hashing Methods</a></td>
<td><a href="#properhashing" class="sca" title="The right way to do password hashing, with salt">5. How to hash properly</a></td>
<td><a href="#faq" class="sca" title="Frequently asked questions about password hashing and salt">6. Frequently Asked Questions</a></td>
</tr>
</tbody></table>
<a name="normalhashing"></a>
<h2>What is password hashing?</h2>
<div class="passcrack" style="text-align: center;">
hash("hello") = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824<br />
hash("hbllo") = 58756879c05c68dfac9866712fad6a93f8146f337a69afe7dd238f3364946366<br />
hash("waltz") = c0e81794384491161f1777c232bc6bd9ec38f616560b120fda8e90f383853542<br />
</div>
<p>
Hash algorithms are one way functions. They turn any amount of data into
a fixed-length "fingerprint" that cannot be reversed. They also have the
property that if the input changes by even a tiny bit, the resulting hash is
completely different (see the example above). This is great for protecting
passwords, because we want to store passwords in a form that protects them even
if the password file itself is compromised, but at the same time, we need to be
able to verify that a user's password is correct.
</p>
<p>
The general workflow for account registration and authentication in a hash-based
account system is as follows:
</p>
<ol class="moveul">
<li>The user creates an account.</li>
<li>Their password is hashed and stored in the database. At no point is the plain-text (unencrypted) password ever written to the hard drive.</li>
<li>When the user attempts to login, the hash of the password they entered is checked against the hash of their real password (retrieved from the database).</li>
<li>If the hashes match, the user is granted access. If not, the user is told they entered invalid login credentials.</li>
<li>Steps 3 and 4 repeat every time someone tries to login to their account.</li>
</ol>
<p>
In step 4, never tell the user if it was the username or password they got wrong. Always display
a generic message like "Invalid username or password." This prevents attackers from enumerating
valid usernames without knowing their passwords.
</p>
<p>
It should be noted that the hash functions used to protect passwords are not the
same as the hash functions you may have seen in a data structures course. The
hash functions used to implement data structures such as hash tables are
designed to be fast, not secure. Only <b>cryptographic hash functions</b> may be
used to implement password hashing. Hash functions like SHA256, SHA512, RipeMD,
and WHIRLPOOL are cryptographic hash functions.
</p>
<p>
It is easy to think that all you have to do is run the password through a
cryptographic hash function and your users' passwords will be secure. This is
far from the truth. There are many ways to recover passwords from plain hashes
very quickly. There are several easy-to-implement techniques that make these
"attacks" much less effective. To motivate the need for these techniques,
consider this very website. On the front page, you can submit a list of hashes
to be cracked, and receive results in less than a second. Clearly, simply
hashing the password does not meet our needs for security.
</p>
<p>The next section will discuss some of the common attacks used to crack plain password hashes.</p>
<a name="attacks"></a>
<h2>How Hashes are Cracked</h2>
<ul class="moveul" >
<li>
<h3>Dictionary and Brute Force Attacks</h3>
<table style="margin: 0 auto;">
<tbody><tr>
<td>
<div class="passcrack" title="Cracking a hash with a wordlist">
<center>Dictionary Attack</center><br />
Trying apple : failed<br />
Trying blueberry : failed<br />
Trying justinbeiber : failed<br />
<center>...</center>
Trying letmein : failed<br />
<span style="color: green;">Trying s3cr3t : success!</span><br />
</div>
</td>
<td>
<div class="passcrack" title="Cracking a hash by brute force">
<center>Brute Force Attack</center><br />
Trying aaaa : failed<br />
Trying aaab : failed<br />
Trying aaac : failed<br />
<center>...</center>
Trying acdb : failed<br />
<span style="color: green;">Trying acdc : success!</span><br />
</div>
</td>
</tr>
</tbody></table>
<p>
The simplest way to crack a hash is to try to guess the password, hashing each guess, and checking if the guess's hash equals the hash being cracked. If the hashes are equal, the guess is the password.
The two most common ways of guessing passwords are <b>dictionary attacks</b> and <b>brute-force attacks</b>.
</p>
<p>
A dictionary attack uses a file containing words, phrases, common passwords,
and other strings that are likely to be used as a password. Each word in the
file is hashed, and its hash is compared to the password hash. If they
match, that word is the password. These dictionary files are constructed by
extracting words from large bodies of text, and even from real databases of
passwords. Further processing is often applied to dictionary files, such as
replacing words with their "leet speak" equivalents ("hello" becomes
"h3110"), to make them more effective.
</p>
<p>
A brute-force attack tries every possible combination of characters up to a
given length. These attacks are very computationally expensive, and are
usually the least efficient in terms of hashes cracked per processor time,
but they will always eventually find the password. Passwords should be long
enough that searching through all possible character strings to find it will
take too long to be worthwhile.
</p>
<p>
There is no way to prevent dictionary attacks or brute force attacks. They
can be made less effective, but there isn't a way to prevent them
altogether. If your password hashing system is secure, the only way to crack
the hashes will be to run a dictionary or brute-force attack on each hash.
</p>
</li>
<li>
<h3>Lookup Tables</h3>
<center>
<div class="passcrack" style="text-align: center;" title="Cracking many hashes with a pre-computed lookup table">
<span style="color: green;">Searching: 5f4dcc3b5aa765d61d8327deb882cf99: FOUND: password5</span><br />
Searching: 6cbe615c106f422d23669b610b564800: not in database<br />
<span style="color: green;">Searching: 630bf032efe4507f2c57b280995925a9: FOUND: letMEin12 </span><br />
<span style="color: green;">Searching: 386f43fab5d096a7a66d67c8f213e5ec: FOUND: mcd0nalds</span><br />
<span style="color: green;">Searching: d5ec75d5fe70d428685510fae36492d9: FOUND: p@ssw0rd!</span><br />
</span>
</center>
<p>
Lookup tables are an extremely effective method for cracking many hashes of
the same type very quickly. The general idea is to <b>pre-compute</b> the
hashes of the passwords in a password dictionary and store them, and their
corresponding password, in a lookup table data structure. A good
implementation of a lookup table can process hundreds of hash lookups per
second, even when they contain many billions of hashes.
</p>
<p>
If you want a better idea of how fast lookup tables can be, try cracking the
following sha256 hashes with CrackStation's <a href="/">free hash
cracker</a>.
</p>
<div class="passcrack" style="text-align: center;" title="Example hashes to be cracked">
c11083b4b0a7743af748c85d343dfee9fbb8b2576c05f3a7f0d632b0926aadfc<br />
08eac03b80adc33dc7d8fbe44b7c7b05d3a2c511166bdb43fcb710b03ba919e7<br />
e4ba5cbd251c98e6cd1c23f126a3b81d8d8328abc95387229850952b3ef9f904<br />
5206b8b8a996cf5320cb12ca91c7b790fba9f030408efe83ebb83548dc3007bd<br />
</div>
</li>
<li>
<h3>Reverse Lookup Tables</h3>
<center>
<div class="passcrack" style="text-align: left;" title="Cracking many hashes with a pre-computed lookup table">
<span style="color: green;">Searching for hash(apple) in users' hash list... : Matches [alice3, 0bob0, charles8]</span><br />
<span style="color: green;">Searching for hash(blueberry) in users' hash list... : Matches [usr10101, timmy, john91]</span><br />
<span style="color: green;">Searching for hash(letmein) in users' hash list... : Matches [wilson10, dragonslayerX, joe1984]</span><br />
<span style="color: green;">Searching for hash(s3cr3t) in users' hash list... : Matches [bruce19, knuth1337, john87]</span><br />
<span>Searching for hash(z@29hjja) in users' hash list... : No users used this password</span><br />
</span>
</center>
<p>
This attack allows an attacker to apply a dictionary or brute-force attack to many hashes at the same time, without having to pre-compute a lookup table.
</p>
<p>
First, the attacker creates a lookup table that maps each password hash from
the compromised user account database to a list of users who had that hash.
The attacker then hashes each password guess and uses the lookup table to
get a list of users whose password was the attacker's guess. This attack is
especially effective because it is common for many users to have the same
password.
</p>
</li>
<li>
<h3>Rainbow Tables</h3>
<p>
Rainbow tables are a time-memory trade-off technique. They are like lookup
tables, except that they sacrifice hash cracking speed to make the lookup
tables smaller. Because they are smaller, the solutions to more hashes can
be stored in the same amount of space, making them more effective. Rainbow
tables that can crack any md5 hash of a password up to 8 characters long <a
href="http://www.freerainbowtables.com/en/tables2/">exist</a>.
</p>
</li>
</ul>
<p>
Next, we'll look at a technique called salting, which makes it impossible to use
lookup tables and rainbow tables to crack a hash.
</p>
<a name="salt"></a>
<h2>Adding Salt</h2>
<div class="passcrack" style="text-align: center;" title="Salt example">
hash("hello") = 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824<br />
hash("hello" + "QxLUF1bgIAdeQX") = 9e209040c863f84a31e719795b2577523954739fe5ed3b58a75cff2127075ed1<br />
hash("hello" + "bv5PehSMfV11Cd") = d1d3ec2e6f20fd420d50e2642992841d8338a314b8ea157c9e18477aaef226ab<br />
hash("hello" + "YYLmfY6IehjZMQ") = a49670c3c18b9e079b9cfaf51634f563dc8ae3070db2c4a8544305df1b60f007
</div>
<p>
Lookup tables and rainbow tables only work because each password is hashed the
exact same way. If two users have the same password, they'll have the same
password hashes. We can prevent these attacks by randomizing each hash, so that
when the same password is hashed twice, the hashes are not the same.
</p>
<p>
We can randomize the hashes by appending or prepending a random string, called a
<b>salt</b>, to the password before hashing. As shown in the example above, this
makes the same password hash into a completely different string every time. To
check if a password is correct, we need the salt, so it is usually stored in the
user account database along with the hash, or as part of the hash string itself.
</p>
<p>
The salt does not need to be secret. Just by randomizing the hashes, lookup
tables, reverse lookup tables, and rainbow tables become ineffective. An
attacker won't know in advance what the salt will be, so they can't pre-compute
a lookup table or rainbow table. If each user's password is hashed with a
different salt, the reverse lookup table attack won't work either.
</p>
<p>
In the next section, we'll look at how salt is commonly implemented incorrectly.
</p>
<a name="ineffective"></a>
<h2>The <span style="color: red;">WRONG</span> Way: Short Salt & Salt Reuse</h2>
<p>
The most common salt implementation errors are reusing the same salt in multiple
hashes, or using a salt that is too short.
</p>
<h3>Salt Reuse</h3>
<p>
A common mistake is to use the same salt in each hash. Either the salt is
hard-coded into the program, or is generated randomly once. This is ineffective
because if two users have the same password, they'll still have the same hash.
An attacker can still use a reverse lookup table attack to run a dictionary
attack on every hash at the same time. They just have to apply the salt to each
password guess before they hash it. If the salt is hard-coded into a popular
product, lookup tables and rainbow tables can be built for that salt, to make it
easier to crack hashes generated by the product.
</p>
<p>
A new random salt must be generated each time a user creates an account or changes their password.
</p>
<h3>Short Salt</h3>
<p>
If the salt is too short, an attacker can build a lookup table for every
possible salt. For example, if the salt is only three ASCII characters, there
are only 95x95x95 = 857,375 possible salts. That may seem like a lot, but if
each lookup table contains only 1MB of the most common passwords, collectively
they will be only 837GB, which is not a lot considering 1000GB hard drives can
be bought for under $100 today.
</p>
<p>
For the same reason, the username shouldn't be used as a salt. Usernames may be
unique to a single service, but they are predictable and often reused for
accounts on other services. An attacker can build lookup tables for common
usernames and use them to crack username-salted hashes.
</p>
<p>
To make it impossible for an attacker to create a lookup table for every
possible salt, the salt must be long. A good rule of thumb is to use a salt that
is the same size as the output of the hash function. For example, the output of
SHA256 is 256 bits (32 bytes), so the salt should be at least 32 random bytes.
</p>
<h2>The <span style="color: red;">WRONG</span> Way: Double Hashing & Wacky Hash Functions</h2>
<p>
This section covers another common password hashing misconception: wacky
combinations of hash algorithms. It's easy to get carried away and try to
combine different hash functions, hoping that the result will be more secure. In
practice, though, there is very little benefit to doing it. All it does is
create interoperability problems, and can sometimes even make the hashes less
secure. Never try to invent your own crypto, always use a standard that has
been designed by experts. Some will argue that using multiple hash functions
makes the process of computing the hash slower, so cracking is slower, but
there's a better way to make the cracking process slower as we'll see later.
</p>
<p>Here are some examples of poor wacky hash functions I've seen suggested in forums on the internet.</p>
<ul>
<li><span class="ic">md5(sha1(password))</span></li>
<li><span class="ic">md5(md5(salt) + md5(password))</span></li>
<li><span class="ic">sha1(sha1(password))</span></li>
<li><span class="ic">sha1(str_rot13(password + salt))</span></li>
<li><span class="ic">md5(sha1(md5(md5(password) + sha1(password)) + md5(password)))</span></li>
</ul>
<p>
Do not use any of these.
</p>
<p>
Note: This section has proven to be controversial. I've received a number of
emails arguing that wacky hash functions are a good thing, because it's better
if the attacker doesn't know which hash function is in use, it's less
likely for an attacker to have pre-computed a rainbow table for the wacky hash
function, and it takes longer to compute the hash function.
</p>
<p>
An attacker cannot attack a hash when he doesn't know the algorithm, but note <a
href="https://en.wikipedia.org/wiki/Kerckhoffs%27s_principle"> Kerckhoffs's
principle</a>, that the attacker will usually have access to the source code
(especially if it's free or open source software), and that given a few
password-hash pairs from the target system, it is not difficult to reverse
engineer the algorithm. It does take longer to compute wacky hash functions, but
only by a small constant factor. It's better to use an iterated algorithm that's
designed to be extremely hard to parallelize (these are discussed below). And,
properly salting the hash solves the rainbow table problem.
</p>
<p>
If you really want to use a standardized "wacky" hash function like HMAC, then it's OK.
But if your reason for doing so is to make the hash computation slower, read the section below about key stretching first.
</p>
<p>
Compare these minor benefits to the risks of accidentally implementing a
completely insecure hash function and the interoperability problems wacky hashes
create. It's clearly best to use a standard and well-tested algorithm.
</p>
<h2>Hash Collisions</h2>
<p>
Because hash functions map arbitrary amounts of data to fixed-length strings,
there must be some inputs that hash into the same string. Cryptographic hash
functions are designed to make these collisions incredibly difficult to find.
From time to time, cryptographers find "attacks" on hash functions that make
finding collisions easier. A recent example is the MD5 hash function, for which
collisions have actually been found.
</p>
<p>
Collision attacks are a sign that it may be more likely for a string other than
the user's password to have the same hash. However, finding collisions in even a
weak hash function like MD5 requires a lot of dedicated computing power, so it
is very unlikely that these collisions will happen "by accident" in practice. A
password hashed using MD5 and salt is, for all practical purposes, just as
secure as if it were hashed with SHA256 and salt. Nevertheless, it is a good
idea to use a more secure hash function like SHA256, SHA512, RipeMD, or
WHIRLPOOL if possible.
</p>
<a name="properhashing"></a>
<h2>The <span style="color: green;">RIGHT</span> Way: How to Hash Properly</h2>
<p>
This section describes exactly how passwords should be hashed. The first
subsection covers the basics—everything that is absolutely necessary. The
following subsections explain how the basics can be augmented to make the hashes
even harder to crack.
</p>
<h3>The Basics: Hashing with Salt</h3>
<p>
<b>Warning: Do not just read this section. You absolutely must implement the
stuff in the next section: "Making Password Cracking Harder: Slow Hash
Functions".</b>
</p>
<p>
We've seen how malicious hackers can crack plain hashes very quickly using
lookup tables and rainbow tables. We've learned that randomizing the hashing
using salt is the solution to the problem. But how do we generate the salt, and
how do we apply it to the password?
</p>
<p>
Salt should be generated using a <b>Cryptographically Secure Pseudo-Random
Number Generator</b> (CSPRNG). CSPRNGs are very different than ordinary
pseudo-random number generators, like the "C" language's
<span class="ic">rand()</span> function. As the name suggests, CSPRNGs are
designed to be cryptographically secure, meaning they provide a high level of
randomness and are completely unpredictable. We don't want our salts to be
predictable, so we must use a CSPRNG. The following table lists some CSPRNGs
that exist for some popular programming platforms.
</p>
<table id="rnglist">
<tr><th>Platform</th><th>CSPRNG</th></tr>
<tr><td>PHP</td><td><a href="http://php.net/manual/en/function.mcrypt-create-iv.php">mcrypt_create_iv</a>, <a href="http://php.net/manual/en/function.openssl-random-pseudo-bytes.php">openssl_random_pseudo_bytes</a></td></tr>
<tr><td>Java</td><td><a href="http://docs.oracle.com/javase/6/docs/api/java/security/SecureRandom.html">java.security.SecureRandom</a></td></tr>
<tr><td>Dot NET (C#, VB)</td><td><a href="http://msdn.microsoft.com/en-us/library/system.security.cryptography.rngcryptoserviceprovider.aspx">System.Security.Cryptography.RNGCryptoServiceProvider</a></td></tr>
<tr><td>Ruby</td><td><a href="http://rubydoc.info/stdlib/securerandom/1.9.3/SecureRandom">SecureRandom</a></td></tr>
<tr><td>Python</td><td><a href="http://docs.python.org/library/os.html">os.urandom</a></td></tr>
<tr><td>Perl</td><td><a href="http://search.cpan.org/~mkanat/Math-Random-Secure-0.06/lib/Math/Random/Secure.pm">Math::Random::Secure</a></td></tr>
<tr><td>C/C++ (Windows API)</td><td><a href="http://en.wikipedia.org/wiki/CryptGenRandom">CryptGenRandom</a></td></tr>
<tr><td>Any language on GNU/Linux or Unix</td><td>Read from <a href="http://en.wikipedia.org/wiki//dev/random">/dev/random</a> or /dev/urandom</td></tr>
</table>
<p> The salt needs to be unique per-user per-password. Every time a user creates an account or
changes their password, the password should be hashed using a new random salt. Never reuse a salt.
The salt also needs to be long, so that there are many possible salts. As a rule of thumb, make your
salt is at least as long as the hash function's output. The salt should be stored in the user
account table alongside the hash. </p>
<h4>To Store a Password</h4>
<ol>
<li>Generate a long random salt using a CSPRNG.</li>
<li>Prepend the salt to the password and hash it with a <b>standard</b> password hashing function like Argon2, bcrypt, scrypt, or PBKDF2.</li>
<li>Save both the salt and the hash in the user's database record.</li>
</ol>
<h4>To Validate a Password</h4>
<ol>
<li>Retrieve the user's salt and hash from the database.</li>
<li>Prepend the salt to the given password and hash it using the same hash function.</li>
<li>Compare the hash of the given password with the hash from the database. If they match, the password is correct. Otherwise, the password is incorrect.</li>
</ol>
<h4>In a Web Application, <b>always</b> hash on the server</h4>
<p>
If you are writing a web application, you might wonder <em>where</em> to hash.
Should the password be hashed in the user's browser with JavaScript, or should
it be sent to the server "in the clear" and hashed there?
</p>
<p> Even if you are hashing the user's passwords in JavaScript, you still have
to hash the hashes on the server. Consider a website that hashes users'
passwords in the user's browser without hashing the hashes on the server. To
authenticate a user, this website will accept a hash from the browser and check
if that hash exactly matches the one in the database. This seems more secure
than just hashing on the server, since the users' passwords are never sent to
the server, but it's not. </p>
<p>
The problem is that the client-side hash logically <em>becomes</em> the user's
password. All the user needs to do to authenticate is tell the server the hash
of their password. If a bad guy got a user's <em>hash</em> they could use it to
authenticate to the server, without knowing the user's password! So, if the bad
guy somehow steals the database of hashes from this hypothetical website,
they'll have immediate access to everyone's accounts without having to guess any
passwords.
</p>
<p>
This isn't to say that you <em>shouldn't</em> hash in the browser, but if you
do, you absolutely have to hash on the server too. Hashing in the browser is
certainly a good idea, but consider the following points for your implementation:
</p>
<ul>
<li>
<p>
Client-side password hashing is <b>not</b> a substitute for HTTPS
(SSL/TLS). If the connection between the browser and the server is
insecure, a man-in-the-middle can modify the JavaScript code as it is
downloaded to remove the hashing functionality and get the user's
password.
</p>
</li>
<li>
<p>
Some web browsers don't support JavaScript, and some users disable
JavaScript in their browser. So for maximum compatibility, your app
should detect whether or not the browser supports JavaScript and emulate
the client-side hash on the server if it doesn't.
</p>
</li>
<li>
<p>
You need to salt the client-side hashes too. The obvious solution is to
make the client-side script ask the server for the user's salt. Don't do
that, because it lets the bad guys check if a username is valid without
knowing the password. Since you're hashing and salting (with a good
salt) on the server too, it's OK to use the username (or email)
concatenated with a site-specific string (e.g. domain name) as the
client-side salt.
</p>
</li>
</ul>
<h3>Making Password Cracking Harder: Slow Hash Functions</h3>
<p>
Salt ensures that attackers can't use specialized attacks like lookup tables
and rainbow tables to crack large collections of hashes quickly, but it
doesn't prevent them from running dictionary or brute-force attacks on each
hash individually. High-end graphics cards (GPUs) and custom hardware can
compute billions of hashes per second, so these attacks are still very
effective. To make these attacks less effective, we can use a technique
known as <b>key stretching</b>.
</p>
<p>
The idea is to make the hash function very slow, so that even with a fast
GPU or custom hardware, dictionary and brute-force attacks are too slow to
be worthwhile. The goal is to make the hash function slow enough to impede
attacks, but still fast enough to not cause a noticeable delay for the user.
</p>
<p>
Key stretching is implemented using a special type of CPU-intensive hash
function. Don't try to invent your own–simply iteratively hashing the
hash of the password isn't enough as it can be parallelized in hardware and
executed as fast as a normal hash. Use a standard algorithm like <a
href="http://en.wikipedia.org/wiki/PBKDF2">PBKDF2</a> or <a href="http://en.wikipedia.org/wiki/Bcrypt">bcrypt</a>.
You can find a PHP implementation of <a href="https://defuse.ca/php-pbkdf2.htm">PBKDF2 here</a>.
</p>
<p>
These algorithms take a security factor or iteration count as an argument.
This value determines how slow the hash function will be. For desktop
software or smartphone apps, the best way to choose this parameter is to run
a short benchmark on the device to find the value that makes the hash take
about half a second. This way, your program can be as secure as possible
without affecting the user experience.
</p>
<p>
If you use a key stretching hash in a web application, be aware that you
will need extra computational resources to process large volumes of
authentication requests, and that key stretching may make it easier to run a
Denial of Service (DoS) attack on your website. I still recommend using key
stretching, but with a lower iteration count. You should calculate the
iteration count based on your computational resources and the expected
maximum authentication request rate. The denial of service threat can be
eliminated by making the user solve a CAPTCHA every time they log in.
Always design your system so that the iteration count can be increased or
decreased in the future.
</p>
<p>
If you are worried about the computational burden, but still want to use key
stretching in a web application, consider running the key stretching
algorithm in the user's browser with JavaScript. The <a href="http://crypto.stanford.edu/sjcl/">Stanford JavaScript Crypto
Library</a> includes PBKDF2. The iteration count should be set low enough
that the system is usable with slower clients like mobile devices, and the
system should fall back to server-side computation if the user's browser
doesn't support JavaScript. Client-side key stretching does not remove the
need for server-side hashing. You must hash the hash generated by the client
the same way you would hash a normal password.
</p>
<h3>Impossible-to-crack Hashes: Keyed Hashes and Password Hashing Hardware</h3>
<p>
As long as an attacker can use a hash to check whether a password guess is
right or wrong, they can run a dictionary or brute-force attack on the hash.
The next step is to add a <b>secret key</b> to the hash so that only someone
who knows the key can use the hash to validate a password. This can be
accomplished two ways. Either the hash can be encrypted using a cipher like
AES, or the secret key can be included in the hash using a keyed hash
algorithm like <a href="http://en.wikipedia.org/wiki/HMAC">HMAC</a>.
</p>
<p>
This is not as easy as it sounds. The key has to be kept secret from an
attacker even in the event of a breach. If an attacker gains full access to
the system, they'll be able to steal the key no matter where it is stored.
The key must be stored in an external system, such as a physically separate
server dedicated to password validation, or a special hardware device
attached to the server such as the <a href="https://www.yubico.com/YubiHSM">YubiHSM</a>.
</p>
<p>
I highly recommend this approach for any large scale (more than 100,000
users) service. I consider it necessary for any service hosting more than
1,000,000 user accounts.
</p>
<p>
If you can't afford multiple dedicated servers or special hardware devices,
you can still get some of the benefits of keyed hashes on a standard web
server. Most databases are breached using <a
href="http://en.wikipedia.org/wiki/SQL_injection">SQL Injection Attacks</a>,
which, in most cases, don't give attackers access to the local filesystem
(disable local filesystem access in your SQL server if it has this feature).
If you generate a random key and store it in a file that isn't accessible
from the web, and include it into the salted hashes, then the hashes won't
be vulnerable if your database is breached using a simple SQL injection
attack. Don't hard-code a key into the source code, generate it randomly
when the application is installed. This isn't as secure as using a separate
system to do the password hashing, because if there are SQL injection
vulnerabilities in a web application, there are probably other types, such
as Local File Inclusion, that an attacker could use to read the secret key
file. But, it's better than nothing.
</p>
<p>
Please note that keyed hashes do not remove the need for salt. Clever
attackers will eventually find ways to compromise the keys, so it is
important that hashes are still protected by salt and key stretching.
</p>
<a name="othersecurity"></a>
<h2>Other Security Measures</h2>
<p>
Password hashing protects passwords in the event of a security breach. It does
not make the application as a whole more secure. Much more must be done to
prevent the password hashes (and other user data) from being stolen in the first
place.
</p>
<p>
Even experienced developers must be educated in security in order to write secure applications.
A great resource for learning about web application vulnerabilities is
<a href="https://www.owasp.org/index.php/Main_Page">The Open Web Application
Security Project (OWASP)</a>. A good introduction is the
<a href="https://www.owasp.org/images/7/72/OWASP_Top_10-2017_%28en%29.pdf.pdf">OWASP Top Ten Vulnerability List</a>.
Unless you understand all the vulnerabilities on the list, do not attempt to
write a web application that deals with sensitive data. It is the employer's
responsibility to ensure all developers are adequately trained in secure
application development.
</p>
<p>
Having a third party "penetration test" your application is a good idea. Even
the best programmers make mistakes, so it always makes sense to have a security
expert review the code for potential vulnerabilities. Find a trustworthy
organization (or hire staff) to review your code on a regular basis. The
security review process should begin early in an application's life and continue
throughout its development.
</p>
<p>
It is also important to monitor your website to detect a breach if one does
occur. I recommend hiring at least one person whose full time job is detecting
and responding to security breaches. If a breach goes undetected, the attacker
can make your website infect visitors with malware, so it is extremely important
that breaches are detected and responded to promptly.
</p>
<a name="faq"></a>
<h2>Frequently Asked Questions</h2>
<h3>What hash algorithm should I use?</h3>
<span style="color: green;"><b>DO</b></span> use:
<ul class="moveul">
<li>Well-designed key stretching algorithms such as <a href="http://en.wikipedia.org/wiki/PBKDF2">PBKDF2</a>, <a href="http://en.wikipedia.org/wiki/Bcrypt">bcrypt</a>, and <a href="http://www.tarsnap.com/scrypt.html">scrypt</a>.</li>
<li>OpenWall's <a href="http://www.openwall.com/phpass/">Portable PHP password hashing
framework</a></li>
<li><a href="https://github.com/defuse/password-hashing">My implementations of PBKDF2 in PHP, C#, Java, and Ruby.</a></li>
<li>Secure versions of <a href="http://en.wikipedia.org/wiki/Crypt_(Unix)#Library_Function_crypt.283.29">crypt</a> ($2y$, $5$, $6$)</li>
</ul>
<br />
<span style="color: red;"><b>DO NOT</b></span> use:
<ul class="moveul">
<li>Fast cryptographic hash functions such as MD5, SHA1, SHA256, SHA512, RipeMD, WHIRLPOOL, SHA3, etc.</li>
<li>Insecure versions of crypt ($1$, $2$, $2x$, $3$).</li>
<li>Any algorithm that you designed yourself. Only use technology that is in the public domain and has been well-tested by experienced cryptographers.</li>
</ul>
<p>
Even though there are no cryptographic attacks on MD5 or SHA1 that make
their hashes easier to crack, they are old and are widely considered
(somewhat incorrectly) to be inadequate for password storage. So I don't
recommend using them. An exception to this rule is PBKDF2, which is
frequently implemented using SHA1 as the underlying hash function.
</p>
<h3>How should I allow users to reset their password when they forget it?</h3>
<p>
It is my personal opinion that all password reset mechanisms in widespread
use today are insecure. If you have high security requirements, such as an
encryption service would, do not let the user reset their password.
</p>
<p>
Most websites use an email loop to authenticate users who have forgotten their
password. To do this, generate a random <b>single-use</b> token that is strongly
tied to the account. Include it in a password reset link sent to the user's
email address. When the user clicks a password reset link containing a valid
token, prompt them for a new password. Be sure that the token is strongly tied
to the user account so that an attacker can't use a token sent to his own email
address to reset a different user's password.
</p>
<p>
The token must be set to expire in 15 minutes or after it is used, whichever
comes first. It is also a good idea to expire any existing password tokens when
the user logs in (they remembered their password) or requests another reset
token. If a token doesn't expire, it can be forever used to break into the
user's account. Email (SMTP) is a plain-text protocol, and there may be
malicious routers on the internet recording email traffic. And, a user's email
account (including the reset link) may be compromised long after their password
has been changed. Making the token expire as soon as possible reduces the
user's exposure to these attacks.
</p>
<p>
Attackers will be able to modify the tokens, so don't store the user account
information or timeout information in them. They should be an unpredictable
random binary blob used only to identify a record in a database table.
</p>
<p>
Never send the user a new password over email. Remember to pick a new random
salt when the user resets their password. Don't re-use the one that was used to
hash their old password.
</p>
<h3>What should I do if my user account database gets leaked/hacked?</h3>
<p>
Your first priority is to determine how the system was compromised and patch
the vulnerability the attacker used to get in. If you do not have experience
responding to breaches, I highly recommend hiring a third-party security firm.
</p>
<p>
It may be tempting to cover up the breach and hope nobody notices. However, trying to cover up a breach makes you look worse, because you're putting
your users at further risk by not informing them that their passwords and other
personal information may be
compromised. You must inform your users as soon as possible—even if you don't yet fully understand what happened. Put a notice on the
front page of your website that links to a page with more detailed information,
and send a notice to each user by email if possible.
</p>
<p>
Explain to your users exactly how their passwords were protected—hopefully
hashed with salt—and that even though they were protected with a salted
hash, a malicious hacker can still run dictionary and brute force attacks on the
hashes. Malicious hackers will use any passwords they find to try to login to a
user's account on a different website, hoping they used the same password on
both websites. Inform your users of this risk and recommend that they change
their password on any website or service where they used a similar password.
Force them to change their password for your service the next time they log in.
Most users will try to "change" their password to the original password to get
around the forced change quickly. Use the current password hash to ensure that
they cannot do this.
</p>
<p>
It is likely, even with salted slow hashes, that an attacker will be able to
crack some of the weak passwords very quickly. To reduce the attacker's window of opportunity to use these passwords, you should require, in
addition to the current password, an email loop for authentication until the
user has changed their password. See the previous question, "How should I allow
users to reset their password when they forget it?" for tips on implementing
email loop authentication.
</p>
<p>
Also tell your users what kind of personal information was stored on the
website. If your database includes credit card numbers, you should instruct your
users to look over their recent and future bills closely and cancel their
credit card.
</p>
<h3>What should my password policy be? Should I enforce strong passwords?</h3>
<p>
If your service doesn't have strict security requirements, then don't limit your
users. I recommend showing users information about the strength of their
password as they type it, letting them decide how secure they want their
password to be. If you have special security needs, enforce a minimum length of
12 characters and require at least two letters, two digits, and two symbols.
</p>
<p>
Do not force your users to change their password more often than once every six
months, as doing so creates "user fatigue" and makes users less likely to choose
good passwords. Instead, train users to change their password whenever they feel
it has been compromised, and to never tell their password to anyone. If it is a
business setting, encourage employees to use paid time to memorize and practice
their password.
</p>
<h3>If an attacker has access to my database, can't they just replace the hash of my password with their own hash and login?</h3>
<p>
Yes, but if someone has access to your database, they probably already have
access to everything on your server, so they wouldn't need to login to your
account to get what they want. The purpose of password hashing (in the context
of a website) is not to protect the website from being breached, but to protect
the passwords if a breach does occur.
</p>
<p>
You can prevent hashes from being replaced during a SQL injection attack by
connecting to the database with two users with different permissions. One for
the 'create account' code and one for the 'login' code. The 'create account'
code should be able to read and write to the user table, but the 'login' code
should only be able to read.
</p>
<h3>Why do I have to use a special algorithm like HMAC? Why can't I just append
the password to the secret key?</h3>
<p> Hash functions like MD5, SHA1, and SHA2 use the <a href="http://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_construction">
Merkle–Damgård construction</a>, which makes them vulnerable to what are known
as length extension attacks. This means that given a hash H(X), an attacker can
find the value of H(pad(X) + Y), for any other string Y, without knowing X.
pad(X) is the padding function used by the hash.
</p>
<p>
This means that given a hash H(key + message), an attacker can compute H(pad(key +
message) + extension), without knowing the key. If the hash was being used as a
message authentication code, using the key to prevent an attacker from being
able to modify the message and replace it with a different valid hash, the
system has failed, since the attacker now has a valid hash of message +
extension.
</p>
<p>
It is not clear how an attacker could use this attack to crack a password hash
quicker. However, because of the attack, it is considered bad practice to
use a plain hash function for keyed hashing. A clever cryptographer may one day
come up with a clever way to use these attacks to make cracking faster, so use
HMAC.
</p>
<h3>Should the salt come before or after the password?</h3>
<p>
It doesn't matter, but pick one and stick with it for interoperability's sake.
Having the salt come before the password seems to be more common.
</p>
<h3>Why does the hashing code on this page compare the hashes in
"length-constant"
time?</h3>
<p>
Comparing the hashes in "length-constant" time ensures that an
attacker cannot extract the hash of a password in an on-line system using a
timing attack, then crack it off-line.
</p>
<p>
The standard way to check if two sequences of bytes (strings) are the same is to
compare the first byte, then the second, then the third, and so on. As soon as
you find a byte that isn't the same for both strings, you know they are
different and can return a negative response immediately. If you make it through
both strings without finding any bytes that differ, you know the strings are the
same and can return a positive result. This means that comparing two strings can
take a different amount of time depending on how much of the strings match.
</p>
<p>
For example, a standard comparison of the strings "xyzabc" and
"abcxyz" would immediately see that the first character is different
and wouldn't bother to check the rest of the string. On the other hand, when the
strings "aaaaaaaaaaB" and "aaaaaaaaaaZ" are compared, the
comparison algorithm scans through the block of "a" before it determines the
strings are unequal.
</p>
<p>
Suppose an attacker wants to break into an on-line system that rate limits
authentication attempts to one attempt per second. Also suppose the attacker
knows all of the parameters to the password hash (salt, hash type, etc), except
for the hash and (obviously) the password. If the attacker can get a precise
measurement of how long it takes the on-line system to compare the hash of the
real password with the hash of a password the attacker provides, he can use the
timing attack to extract part of the hash and crack it using an offline attack,
bypassing the system's rate limiting.
</p>
<p>
First, the attacker finds 256 strings whose hashes begin with every possible
byte. He sends each string to the on-line system, recording the amount of time
it takes the system to respond. The string that takes the longest will be the
one whose hash's first byte matches the real hash's first byte. The attacker now
knows the first byte, and can continue the attack in a similar manner on the
second byte, then the third, and so on. Once the attacker knows enough of the
hash, he can use his own hardware to crack it, without being rate limited by the
system.
</p>
<p>
It might seem like it would be impossible to run a timing attack over a network.
However, it has been done, and has been
<a href="https://crypto.stanford.edu/~dabo/papers/ssl-timing.pdf">shown to be practical</a>.
That's why the code on this page compares strings in a way that takes the same
amount of time no matter how much of the strings match.
</p>
<a name="slowequals"></a>
<h3>How does the SlowEquals code work?</h3>
<p>
The previous question explains why SlowEquals is necessary, this one explains
how the code actually works.
</p>
<div class="passcrack">
1. private static boolean slowEquals(byte[] a, byte[] b)<br />
2. {<br />
3. int diff = a.length ^ b.length;<br />
4. for(int i = 0; i < a.length && i < b.length; i++)<br />
5. diff |= a[i] ^ b[i];<br />
6. return diff == 0;<br />
7. }<br />
</div>
<p>
The code uses the XOR "^" operator to compare integers for equality, instead of
the "==" operator. The reason why is explained below. The result of XORing
two integers will be zero if and only if they are exactly the same. This is
because 0 XOR 0 = 0, 1 XOR 1 = 0, 0 XOR 1 = 1, 1 XOR 0 = 1. If we apply that to
all the bits in both integers, the result will be zero only if all the bits
matched.
</p>
<p>
So, in the first line, if <code>a.length</code> is equal to
<code>b.length</code>, the diff variable will get a zero value, but if not, it
will get some non-zero value. Next, we compare the bytes using XOR, and OR the
result into diff. This will set diff to a non-zero value if the bytes differ.
Because ORing never un-sets bits, the only way diff will be zero at the end of
the loop is if it was zero before the loop began (a.length == b.length) and all
of the bytes in the two arrays match (none of the XORs resulted in a non-zero
value).
</p>
<p>
The reason we need to use XOR instead of the "==" operator to compare integers
is that "==" is usually translated/compiled/interpreted as a branch. For example,