From 2f7537755f1b24b2f7050602a61595d5abad1a95 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Wed, 3 Nov 2021 18:28:45 -0700 Subject: [PATCH 01/18] Update secure-hash-algorithms.md for grammar and spelling. --- .../secure-hash-algorithms.md | 28 +++++++++---------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/cryptographic-hash-functions/secure-hash-algorithms.md b/cryptographic-hash-functions/secure-hash-algorithms.md index 427d3d7..32f3ab5 100644 --- a/cryptographic-hash-functions/secure-hash-algorithms.md +++ b/cryptographic-hash-functions/secure-hash-algorithms.md @@ -1,6 +1,6 @@ # Secure Hash Algorithms -In the past, many **cryptographic hash algorithms** were proposed and used by software developers. Some of them was **broken** \(like **MD5** and **SHA1**\), some are still considered secure \(like **SHA-2**, **SHA-3** and **BLAKE2**\). Let's review the most widely used cryptographic hash functions \(algorithms\). +In the past, many **cryptographic hash algorithms** were proposed and used by software developers. Some of them were **broken** \(like **MD5** and **SHA1**\), some are still considered secure \(like **SHA-2**, **SHA-3** and **BLAKE2**\). Let's review the most widely used cryptographic hash functions \(algorithms\). ## Secure Hash Functions @@ -8,9 +8,9 @@ In the past, many **cryptographic hash algorithms** were proposed and used by so ### SHA-2, SHA-256, SHA-512 -[**SHA-2**](https://en.wikipedia.org/wiki/SHA-2) is a family of strong cryptographic hash functions: **SHA-256** \(256 bits hash\), **SHA-384** \(384 bits hash\), **SHA-512** \(512 bits hash\), etc. It is based on the cryptographic concept "[**Merkle–Damgård construction**](https://en.wikipedia.org/wiki/Merkle–Damgård_construction)" and is considered **highly secure**. SHA-2 is published as official crypto standard in the United States. 
+[**SHA-2**](https://en.wikipedia.org/wiki/SHA-2) is a family of strong cryptographic hash functions: **SHA-256** \(256 bits hash\), **SHA-384** \(384 bits hash\), **SHA-512** \(512 bits hash\), etc. It is based on the cryptographic concept "[**Merkle–Damgård construction**](https://en.wikipedia.org/wiki/Merkle–Damgård_construction)" and is considered **highly secure**. SHA-2 is published as an official crypto standard in the United States. -**SHA-2** is widely used by developers and in cryptography and is considered cryptographically strong enough for modern commercial applications. +**SHA-2** is widely used by developers in cryptography and is considered cryptographically strong enough for modern commercial applications. **SHA-256** is widely used in the **Bitcoin** blockchain, e.g. for identifying the transaction hashes and for the proof-of-work mining performed by the miners. @@ -26,17 +26,17 @@ SHA-512('hello') = 9b71d224bd62f3785d96d46ad3ea3d73319bfbc2890caadae2dff72519673 By design, **more bits at the hash output** are expected to achieve **stronger security** and higher collision resistance \(with some exceptions\). As general rule, 128-bit hash functions are weaker than 256-bit hash functions, which are weaker than 512-bit hash functions. -Thus, SHA-512 is stronger than SHA-256, so we can expect that for SHA-512 it is more unlikely to practically find a collision than for SHA-256. +Thus, SHA-512 is stronger than SHA-256, so we can expect that when using SHA-512 you are less likely to find a collision than when using SHA-256. ### SHA-3, SHA3-256, SHA3-512, Keccak-256 [**SHA-3**](https://en.wikipedia.org/wiki/SHA-3) \(and its variants SHA3-224, SHA3-256, SHA3-384, SHA3-512\), is considered **more secure than SHA-2** \(SHA-224, SHA-256, SHA-384, SHA-512\) for the same hash length. For example, SHA3-256 provides **more cryptographic strength than SHA-256** for the same hash length \(256 bits\). 
-The **SHA-3** family of functions are representatives of the "**Keccak**" hashes family, which are based on the cryptographic concept "[**sponge construction**](https://en.wikipedia.org/wiki/Sponge_function)". Keccak is the winner of the [SHA-3 NIST competition](https://en.wikipedia.org/wiki/NIST_hash_function_competition#Finalists). +The **SHA-3** family of functions are representatives of the "**Keccak**" hashes family, which are based on the cryptographic concept "[**sponge construction**](https://en.wikipedia.org/wiki/Sponge_function)". Keccak was the winner of the [SHA-3 NIST competition](https://en.wikipedia.org/wiki/NIST_hash_function_competition#Finalists). Unlike **SHA-2**, the **SHA-3** family of cryptographic hash functions are not vulnerable to the "[**length extension attack**](https://en.wikipedia.org/wiki/Length_extension_attack)". -**SHA-3** is considered **highly secure** and is published as official recommended crypto standard in the United States. +**SHA-3** is considered **highly secure** and is published as an official recommended crypto standard in the United States. The hash function **Keccak-256**, which is used in the **Ethereum** blockchain, is a variant of SHA3-256 with some constants changed in the code. @@ -62,7 +62,7 @@ The **BLAKE2** function is an improved version of **BLAKE**. **BLAKE2b** \(typically **512-bit**\) is BLAKE2 implementation, performance-optimized for 64-bit microprocessors. -The **BLAKE2** hash function has similar security strength like SHA-3, but is less used by developers than SHA2 and SHA3. +The **BLAKE2** hash function is similar in strength to SHA-3, but is less frequently used by developers compared with SHA2 and SHA3. 
Examples of BLAKE hashes: @@ -100,15 +100,15 @@ Avoid using of the following hash algorithms, which are considered **insecure** The below functions are popular strong cryptographic hash functions, alternatives to SHA-2, SHA-3 and BLAKE2: -* [**Whirlpool**](https://en.wikipedia.org/wiki/Whirlpool_%28hash_function) is secure cryptographic hash function, which produces 512-bit hashes. -* [**SM3**](https://tools.ietf.org/id/draft-oscca-cfrg-sm3-01.html) is the crypto hash function, officialy standartized by the **Chinese government**. It is similar to SHA-256 \(based on the Merkle–Damgård construction\) and produces 256-bit hashes. -* [**GOST**](https://en.wikipedia.org/wiki/GOST_%28hash_function) \(GOST R 34.11-94\) is secure cryptographic hash function, the Russian national standard, described in [RFC 4357](https://tools.ietf.org/html/rfc4357). It produces 256-bit hashes. +* [**Whirlpool**](https://en.wikipedia.org/wiki/Whirlpool_%28hash_function) is a secure cryptographic hash function, which produces 512-bit hashes. +* [**SM3**](https://tools.ietf.org/id/draft-oscca-cfrg-sm3-01.html) is the crypto hash function, officially standardized by the **Chinese government**. It is similar to SHA-256 \(based on the Merkle–Damgård construction\) and produces 256-bit hashes. +* [**GOST**](https://en.wikipedia.org/wiki/GOST_%28hash_function) \(GOST R 34.11-94\) is a secure cryptographic hash function, the Russian national standard, described in [RFC 4357](https://tools.ietf.org/html/rfc4357). It produces 256-bit hashes. The below functions are less popular alternatives to SHA-2, SHA-3 and BLAKE, finalists at the [SHA-3 NIST competition](https://en.wikipedia.org/wiki/NIST_hash_function_competition#Finalists): -* [**Skein**](https://en.wikipedia.org/wiki/Skein_%28hash_function) is secure cryptographic hash function, capable to derive 128, 160, 224, 256, 384, 512 and 1024-bit hashes. 
-* [**Grøstl**](https://en.wikipedia.org/wiki/Grøstl) is secure cryptographic hash function, capable to derive 224, 256, 384 and 512-bit hashes. -* [**JH**](https://en.wikipedia.org/wiki/JH_%28hash_function) is secure cryptographic hash function, capable to derive 224, 256, 384 and 512-bit hashes. +* [**Skein**](https://en.wikipedia.org/wiki/Skein_%28hash_function) is a secure cryptographic hash function, capable of deriving 128, 160, 224, 256, 384, 512 and 1024-bit hashes. +* [**Grøstl**](https://en.wikipedia.org/wiki/Grøstl) is a secure cryptographic hash function, capable of deriving 224, 256, 384 and 512-bit hashes. +* [**JH**](https://en.wikipedia.org/wiki/JH_%28hash_function) is a secure cryptographic hash function, capable of deriving 224, 256, 384 and 512-bit hashes. ### No Collisions for SHA-256, SHA3-256, BLAKE2s and RIPEMD-160 are Known @@ -116,7 +116,7 @@ As of Oct 2018, **no collisions are known** for: **SHA256**, **SHA3-256**, **Kec **Brute forcing** to find hash function collision as general costs: 2128 for SHA256 / SHA3-256 and 280 for RIPEMD160. -Respectively, on a powerful enough **quantum computer**, it will cost less time: 2256/3 and 2160/3 respectively. Still \(as of September 2018\) so powerful quantum computers are not known to exist. +On a powerful enough **quantum computer**, it will cost less time: 2256/3 and 2160/3 respectively. Still \(as of September 2018\) such powerful quantum computers are not known to exist. 
Learn more about cryptographic hash functions, their strength and **attack resistance** at: [https://z.cash/technology/history-of-hash-function-attacks.html](https://z.cash/technology/history-of-hash-function-attacks.html) From 92fb3c80ef838bad0f96fd0dc923f28f3a824867 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Wed, 3 Nov 2021 18:30:13 -0700 Subject: [PATCH 02/18] Update hash-functions-examples.md for grammar and spelling --- cryptographic-hash-functions/hash-functions-examples.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/cryptographic-hash-functions/hash-functions-examples.md b/cryptographic-hash-functions/hash-functions-examples.md index d885b43..dc994f4 100644 --- a/cryptographic-hash-functions/hash-functions-examples.md +++ b/cryptographic-hash-functions/hash-functions-examples.md @@ -1,10 +1,10 @@ # Hash Functions - Examples -In this section we shall provide a few **examples** about calculating cryptographic hash functions in Python. +In this section we shall provide a few **examples** of calculating cryptographic hash functions in Python. ## Calculating Cryptographic Hash Functions in Python -We shall use the standard Python library `hashlib`. The input data for hashing should be given as **bytes sequence** \(bytes object\), so we need to **encode the input string** using some text encoding, e.g. `utf8`. The produced **output data** is also a bytes sequence, which can be printed as hex digits using `binascii.hexlify()` as shown below: +We shall use the standard Python library `hashlib`. The input data for hashing should be given as a **bytes sequence** \(bytes object\), so we need to **encode the input string** using some text encoding, e.g. `utf8`. 
The produced **output data** is also a bytes sequence, which can be printed as hex digits using `binascii.hexlify()` as shown below: ```python import hashlib, binascii From 7527212d3a3d96fca265091193a80b28e965f5b1 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Wed, 3 Nov 2021 18:31:00 -0700 Subject: [PATCH 03/18] Update exercises-calculate-hashes.md for grammar and spelling --- cryptographic-hash-functions/exercises-calculate-hashes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cryptographic-hash-functions/exercises-calculate-hashes.md b/cryptographic-hash-functions/exercises-calculate-hashes.md index c6c6b35..0e7f06c 100644 --- a/cryptographic-hash-functions/exercises-calculate-hashes.md +++ b/cryptographic-hash-functions/exercises-calculate-hashes.md @@ -1,6 +1,6 @@ # Exercises: Calculate Hashes -In this exercise session, you are assigned to write some code to **calculate cryptographic hashes**. Write a program to **calculate hashes** of given text message: **SHA-224**, **SHA-256**, **SHA3-224**, **SHA3-384**, **Keccak-384** and **Whirlpool**. Write your code in programming language of choice. +In this exercise session, you are assigned to write some code to **calculate cryptographic hashes**. Write a program to **calculate hashes** of a given text message using **SHA-224**, **SHA-256**, **SHA3-224**, **SHA3-384**, **Keccak-384** and **Whirlpool**. Write your code in a programming language of your choice. 
## Calculate **SHA-224 Hash** From 3aef09c3f99e044330f0ad233f07ae8972f455e7 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Wed, 3 Nov 2021 18:34:59 -0700 Subject: [PATCH 04/18] Update proof-of-work-hash-functions.md for spelling and grammar --- .../proof-of-work-hash-functions.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/cryptographic-hash-functions/proof-of-work-hash-functions.md b/cryptographic-hash-functions/proof-of-work-hash-functions.md index c8a70cf..cf5a39f 100644 --- a/cryptographic-hash-functions/proof-of-work-hash-functions.md +++ b/cryptographic-hash-functions/proof-of-work-hash-functions.md @@ -1,6 +1,6 @@ # Proof-of-Work Hash Functions -Blockchain **proof-of-work mining** algorithms use a special class of hash functions which are **computational-intensive** and **memory-intensive**. These hash functions are designed to consume a lot of computational resources and a lot of memory and to be very hard to be implemented in a hardware devices \(such as [FPGA](https://en.wikipedia.org/wiki/Field-programmable_gate_array) integrated circuits or [ASIC](https://en.wikipedia.org/wiki/Application-specific_integrated_circuit) miners\). Such hash functions are known as "**ASIC-resistant**". +Blockchain **proof-of-work mining** algorithms use a special class of hash functions which are **computationally-intensive** and **memory-intensive**. These hash functions are designed to consume a lot of computational resources and memory, and to be very hard to implement in a hardware device \(such as [FPGA](https://en.wikipedia.org/wiki/Field-programmable_gate_array) integrated circuits or [ASIC](https://en.wikipedia.org/wiki/Application-specific_integrated_circuit) miners\). Such hash functions are known as "**ASIC-resistant**". Many hash functions are designed for proof-of-work mining algorithms, e.g. **ETHash**, **Equihash**, **CryptoNight** and **Cookoo Cycle**. 
These hash functions are **slow to calculate**, and usually use **GPU** hardware \([rigs](https://en.bitcoin.it/wiki/Mining_rig) of graphics cards like NVIDIA GTX 1080\) or powerful **CPU** hardware \(like Intel Core i7-8700K\) and a lot of fast **RAM** memory \(like DDR4 chips\). The goal of these mining algorithms is to **minimize the centralization of mining** by stimulating the small miners \(home users and small mining farms\) and limit the power of big players in the mining industry \(who can afford to build giant mining facilities and data centers\). A big number of **small players means better decentralization** than a small number of big players. @@ -29,8 +29,8 @@ Let's explain in briefly the idea behind the **Equihash** proof-of-work mining h How does **Equihash** work? -* Uses **BLAKE2b** to compute **50 MB hash dataset** from the previous blocks in the blockchain \(until the current block\). -* Solves the "**Generalized Birthday Problem**" over the generated hash dataset \(pick 512 different strings from 2097152, such that the binary XOR of them is zero\). The best known solution \(Wagner's algorithm\) runs in exponential time, so it requires a lot of memory-intensive and computing-intensive calculations. +* Uses **BLAKE2b** to compute a **50 MB hash dataset** from the previous blocks in the blockchain \(until the current block\). +* Solves the "**Generalized Birthday Problem**" over the generated hash dataset \(pick 512 different strings from 2097152, such that the binary XOR of them is zero\). The best known solution \(Wagner's algorithm\) runs in exponential time, so it requires a lot of memory-intensive and computationally-intensive calculations. * **Double SHA256** the solution to compute the final hash. Learn more about **Equihash** at: [https://www.cryptolux.org/images/b/b9/Equihash.pdf](https://www.cryptolux.org/images/b/b9/Equihash.pdf), [https://github.com/tromp/equihash](https://github.com/tromp/equihash). 
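The last step in the Equihash list above, double SHA-256, can be sketched with the standard `hashlib` module. This is only an illustration of hashing twice, not Equihash itself; the function name `double_sha256` and the sample input are assumptions made for this example.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Return SHA256(SHA256(data)) as raw bytes."""
    first_pass = hashlib.sha256(data).digest()
    return hashlib.sha256(first_pass).digest()


# Hypothetical placeholder standing in for a serialized solution
solution = b"example solution bytes"
print("Double SHA-256:", double_sha256(solution).hex())
```

The second pass hashes the raw 32-byte digest of the first pass, not its hex text, which is the usual convention for double hashing.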
From 422780019a040d3da93d6c35be1cbb11639ee4cb Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 09:19:49 -0700 Subject: [PATCH 05/18] Update hmac-and-key-derivation.md for spelling and grammar --- mac-and-key-derivation/hmac-and-key-derivation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mac-and-key-derivation/hmac-and-key-derivation.md b/mac-and-key-derivation/hmac-and-key-derivation.md index bc97c12..5dca573 100644 --- a/mac-and-key-derivation/hmac-and-key-derivation.md +++ b/mac-and-key-derivation/hmac-and-key-derivation.md @@ -10,7 +10,7 @@ Simply calculating `hash_func(key + msg)` to obtain a MAC \(message authenticati HMAC(key, msg, hash_func) -> hash ``` -The results MAC code is a **message hash** mixed with a secret key. It has the cryptographic properties of hashes: **irreversible**, **collision resistant**, etc. +The resulting MAC code is a **message hash** mixed with a secret key. It has the cryptographic properties of hashes: it is **irreversible**, **collision resistant**, etc. The `hash_func` can be any cryptographic hash function like `SHA-256`, `SHA-512`, `RIPEMD-160`, `SHA3-256` or `BLAKE2s`. From 497121811a7a97a808e14f9a8ca412528cbad3bb Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 09:20:46 -0700 Subject: [PATCH 06/18] Update exercises-calculate-hmac.md for spelling and grammar --- mac-and-key-derivation/exercises-calculate-hmac.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mac-and-key-derivation/exercises-calculate-hmac.md b/mac-and-key-derivation/exercises-calculate-hmac.md index 83d12e0..ff1e72d 100644 --- a/mac-and-key-derivation/exercises-calculate-hmac.md +++ b/mac-and-key-derivation/exercises-calculate-hmac.md @@ -1,6 +1,6 @@ # Exercises: Calculate HMAC -Write a program to **calculate HMAC-SHA-384** of given text **message** by given **key**. Write your code in programming language of choice. 
+Write a program to **calculate the HMAC-SHA-384** of given text **message** by given **key**. Write your code in programming language of choice. | **Input** | **Output** | | :--- | :--- | From cb402217501face237d42f536bf40d8f3a8201f0 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 09:23:00 -0700 Subject: [PATCH 07/18] Update kdf-deriving-key-from-password.md for spelling and grammar --- mac-and-key-derivation/kdf-deriving-key-from-password.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mac-and-key-derivation/kdf-deriving-key-from-password.md b/mac-and-key-derivation/kdf-deriving-key-from-password.md index 44208a4..da4bfd5 100644 --- a/mac-and-key-derivation/kdf-deriving-key-from-password.md +++ b/mac-and-key-derivation/kdf-deriving-key-from-password.md @@ -24,7 +24,7 @@ To calculate a secure KDF it takes some **CPU time** to derive the key \(e.g. 0. When a modern KDF function is used with appropriate config parameters, **cracking passwords** will be **slow** \(e.g. 5-10 attempts per second, instead of thousands or millions attempts per second\). -All of the above mentioned key-derivation algorithms \([PBKDF2](https://en.wikipedia.org/wiki/PBKDF2), [Bcrypt](https://en.wikipedia.org/wiki/Bcrypt), [Scrypt](https://en.wikipedia.org/wiki/Scrypt) and [Argon2](https://en.wikipedia.org/wiki/Argon2)\) are not patented and **royalty-free** for public use. +All of the above mentioned key-derivation algorithms \([PBKDF2](https://en.wikipedia.org/wiki/PBKDF2), [Bcrypt](https://en.wikipedia.org/wiki/Bcrypt), [Scrypt](https://en.wikipedia.org/wiki/Scrypt) and [Argon2](https://en.wikipedia.org/wiki/Argon2)\) are not patented and are **royalty-free** for public use. Let's learn more about these modern KDF. 
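To make the cost of key derivation concrete, here is a rough sketch that times PBKDF2 from Python's standard `hashlib` at two iteration counts. The helper name `derive_key`, the sample password and the iteration counts are assumptions chosen for illustration, not recommended settings.

```python
import hashlib
import os
import time


def derive_key(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


password = b"example password"  # hypothetical input
salt = os.urandom(16)           # 128-bit random salt

for iterations in (1_000, 100_000):
    start = time.perf_counter()
    key = derive_key(password, salt, iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations} iterations: {elapsed:.4f} s, key = {key.hex()}")
```

The running time grows roughly linearly with the iteration count, and that per-guess cost is what slows down password cracking.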
From 9ea59b40adf262686144bb7fe7e6ca2106268853 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 09:29:15 -0700 Subject: [PATCH 08/18] Update modern-key-derivation-functions.md for spelling and grammar --- mac-and-key-derivation/modern-key-derivation-functions.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mac-and-key-derivation/modern-key-derivation-functions.md b/mac-and-key-derivation/modern-key-derivation-functions.md index ab232a3..832b65c 100644 --- a/mac-and-key-derivation/modern-key-derivation-functions.md +++ b/mac-and-key-derivation/modern-key-derivation-functions.md @@ -1,12 +1,12 @@ # Modern Key Derivation Functions -**PBKDF2** has a major weakness: it is **not GPU-resistant** and **not ASIC-resistant**, because it uses relatively small amount of RAM and can be efficiently implemented on GPU \(graphics cards\) or **ASIC** \(specialized hardware\). +**PBKDF2** has a major weakness: it is **not GPU- or ASIC-resistant**, because it uses a relatively small amount of RAM and can be efficiently implemented on GPU \(graphics cards\) or **ASIC** \(specialized hardware\). Modern key-derivation functions \(KDF\) like [**Scrypt**](https://en.wikipedia.org/wiki/Scrypt) and [**Argon2**](https://en.wikipedia.org/wiki/Argon2) are designed to be **resistant** to **dictionary attacks**, **GPU attacks** and **ASIC attacks**. These functions derive a key \(of fixed length\) from a password \(text\) and need a lot memory \(RAM\), which does not allow fast parallel computations on GPU or ASIC hardware. -Algorithms like **Bcrypt**, **Scrypt** and **Argon2** are considered more **secure** KDF functions. They use **salt** + many **iterations** + a lot of **CPU** + a lot of **RAM** memory and this makes very hard to design a custom hardware to significantly speed up password cracking. +Algorithms like **Bcrypt**, **Scrypt** and **Argon2** are considered more **secure** KDF functions. 
They use **salt**, many **iterations**, a lot of **CPU**, and a lot of **RAM**. This makes it very hard to design custom hardware to significantly speed up password cracking. -It takes a lot of **CPU time** to derive the key \(e.g. 0.2 sec\) + a lot of **RAM memory** \(e.g. 1GB\). The calculation process is memory-dependent, so **the memory access is the bottleneck** of the calculations. Faster RAM access will speed-up the calculations. +It takes a lot of **CPU time** to derive the key \(e.g. 0.2 sec\) and a lot of **RAM** \(e.g. 1GB\). The calculation process is memory-dependent, so **the memory access is the bottleneck** of the calculations. Faster RAM access will speed-up the calculations. When a lot of CPU and RAM is used to derive the key from given password, **cracking passwords is slow** and inefficient \(e.g. 5-10 attempts / second\), even when using very good password cracking hardware and software. The goal of the modern KDF functions is to make practically infeasible to perform a brute-force attack to reverse the password from its hash. From 3b847906a087b41e75056aec73002dec4b281b56 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 09:36:58 -0700 Subject: [PATCH 09/18] Update scrypt.md for grammar --- mac-and-key-derivation/scrypt.md | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/mac-and-key-derivation/scrypt.md b/mac-and-key-derivation/scrypt.md index 363fa59..efb4c0e 100644 --- a/mac-and-key-derivation/scrypt.md +++ b/mac-and-key-derivation/scrypt.md @@ -12,11 +12,11 @@ key = Scrypt(password, salt, N, r, p, derived-key-len) The **Scrypt config parameters** are: +* `password`– the input password \(8-10 chars minimal length is recommended\) +* `salt` – securely-generated random bytes \(64 bits minimum, 128 bits recommended\) * `N` – iterations count \(affects memory and CPU usage\), e.g. 16384 or 2048 * `r` – block size \(affects memory and CPU usage\), e.g. 
8 * `p` – parallelism factor \(threads to run in parallel - affects the memory, CPU usage\), usually 1 -* `password`– the input password \(8-10 chars minimal length is recommended\) -* `salt` – securely-generated random bytes \(64 bits minimum, 128 bits recommended\) * `derived-key-length` - how many bytes to generate as output, e.g. 32 bytes \(256 bits\) The **memory** in Scrypt is accessed in strongly **dependent order** at each step, so the memory access speed is the algorithm's bottleneck. The **memory required** to compute Scrypt key derivation is calculated as follows: @@ -25,12 +25,16 @@ The **memory** in Scrypt is accessed in strongly **dependent order** at each ste Memory required = 128 * N * r * p bytes ``` -Example: e.g. 128 \* N \* r \* p = 128 \* 16384 \* 8 \* 1 = 16 MB -\(or 128 \* N \* r \* p = 128 \* 2048 \* 8 \* 1 = 2 MB\) +Here are a couple of examples: + +```text +128 * 16384 * 8 * 1 = 16 MB +128 * 2048 * 8 * 1 = 2 MB +``` -**Choosing parameters** depends on how much you want to wait and what level of security \(password cracking resistance\) do you want to achieve: +**Choosing parameters** depends on how long you want to wait and what level of security \(password cracking resistance\) you want to achieve: -* Sample parameters for **interactive login**: N=16384, r=8, p=1 \(RAM = 2 MB\). For interactive login you most probably do not want to wait more than a 0.5 seconds, so the computations should be very slow. Also at the server side, it is usual that many users can login in the same time, so slow Scrypt computation will slow down the entire system. +* Sample parameters for **interactive login**: N=16384, r=8, p=1 \(RAM = 16 MB\). For interactive login you most probably do not want to wait more than 0.5 seconds, so the computations should be reasonably fast. If many users are logging in at the same time, a slow Scrypt computation can slow down the entire system. * Sample parameters for **file encryption**: N=1048576, r=8, p=1 \(RAM = 1 GB\).
When you encrypt your hard drive, you will unlock the encrypted data in rare cases, usually not more than 2-3 times per day, so you may want to wait for 2-3 seconds to increase the security. You can perform tests and choose the Scrypt parameters yourself during the design and development of your app or system. Always try to use the **fastest possible implementation of Scrypt** for your language and platform, because crackers will definitely use it. Some implementations \(e.g. in Python\) may be 100 times slower than the fastest ones! @@ -56,7 +60,7 @@ pip install scrypt Note that the `scrypt` package depends on OpenSSL, so first install it in its default location \(e.g. in `C:\OpenSSL-Win64` in Windows\), then install the **scrypt** Python package. Now, after the `scrypt` package is successfully installed, write the Python code to calculate a Scrypt hash: -\(_Note, we have chosen smaller number for iterations count. We did that just to increase the following example execution speed. In common usage, a higher iterations count is recommended, e.g. 16384 - see above._\) +\(_Note, we have chosen a smaller number for iterations count. We did that just to increase the following example execution speed. In common usage, a higher iterations count is recommended, e.g. 16384 - see above._\) ```python import pyscrypt @@ -77,7 +81,7 @@ The **output** from the above code execution is the following: Derived key: b'e813a6f6ccc4e9110193bf9efb7c0a489d76655f9e36629dccbeaf2a73bc0c6f' ``` -Try to change the number of **iterations** or the **block size** and see how they affect the **execution time**. Have in mind that the above Python implementation is not very fast. You may find fast Scrypt implementation in Internet. +Try to change the number of **iterations** or the **block size** and see how they affect the **execution time**. Have in mind that the above Python implementation is not very fast. You may find a faster Scrypt implementation on the Internet. 
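As one alternative to third-party packages, the sketch below uses `hashlib.scrypt` from the Python standard library and the memory formula given earlier. It assumes a Python build linked against OpenSSL 1.1 or newer; the helper `scrypt_memory_bytes` and the sample password are assumptions made for this example.

```python
import hashlib
import os


def scrypt_memory_bytes(n: int, r: int, p: int) -> int:
    """Approximate Scrypt memory use: 128 * N * r * p bytes."""
    return 128 * n * r * p


password = b"example password"  # hypothetical input
salt = os.urandom(16)           # 128-bit random salt

# Small N keeps the demo fast; a higher N (e.g. 16384) is typical in real use
n, r, p = 2048, 8, 1
key = hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=32)

print("Memory required:", scrypt_memory_bytes(n, r, p) // (1024 * 1024), "MB")
print("Derived key:", key.hex())
```

With the default memory limit of `hashlib.scrypt`, larger values of N may raise an error unless the `maxmem` argument is increased accordingly.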
## Storing Algorithm Settings + Salt + Hash Together From 8bd67c5f2b1759731d7c7858850a93aaaf91495c Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 14:47:16 -0700 Subject: [PATCH 10/18] Update password-encryption.md for grammar --- mac-and-key-derivation/password-encryption.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/mac-and-key-derivation/password-encryption.md b/mac-and-key-derivation/password-encryption.md index d03f52f..26393c2 100644 --- a/mac-and-key-derivation/password-encryption.md +++ b/mac-and-key-derivation/password-encryption.md @@ -20,18 +20,18 @@ Let's review these **password storage methods** and discuss their **level of sec The easiest and **most highly insecure** method for password storage and password-based authentication is to use [**clear-text passwords**](https://en.wikipedia.org/wiki/Plaintext) written directly in the database. * In this scenario to **check the password**, developers just compare the password for checking with the password from the database. -* **Never do this!!!** It is anti-pattern for software development. It is **bad for many reasons**. +* **Never do this!!!** It is an anti-pattern for software development. It is **bad for many reasons**. * Admins will be able to see user's passwords and this is really bad, because many users use **the same password for several sites / apps**, e.g. the same password for GMail, Facebook and Twitter. * **Admins should never know user's passwords**, but should be able to change them in case of emergency. - * If someone hacks the server and gains access to the database, he will **see all user's passwords** in plaintext. + * If someone hacks the server and gains access to the database, he will **see all users' passwords** in plaintext. * It is **very bad practice** to keep plaintext passwords in any information system / app in the world! * Just don't do it! 
## Simple Password Hash - Highly Insecure -A relatively easy and **relatively insecure** method for password storage and password-based authentication is to use **simple password hash** like **SHA-256**\(password\), written directly in the database. +A relatively easy and **relatively insecure** method for password storage and password-based authentication is to use **simple password hash** like `SHA-256(password)`, written directly in the database. -* In this scenario to **check the password**, developers just compare the **hash**\(_password for checking_\) with the password hash from the database. +* In this scenario to **check the password**, developers just compare the `hash(_password for checking_)` with the password hash from the database. * **Avoid this!** It is highly **insecure** method. * Why? Because hashes are vulnerable to [**dictionary attacks**](https://en.wikipedia.org/wiki/Dictionary_attack). * Crackers who gain access to the database, can use a **dictionary** holding the hashes of the most commonly used 10-20 million passwords and most of the passwords will be decrypted. @@ -41,7 +41,7 @@ A relatively easy and **relatively insecure** method for password storage and pa ## Salted Hashed Passwords - Secure, but Not Enough -More complicated and **relatively secure** method for password storage and password-based authentication is to use **salted hashed passwords**, written in the database as pair { **salt** + **hash\(password + salt\)** }. The hash function can be any strong cryptographic hash like SHA-256. +A more complicated and **relatively secure** method for password storage and password-based authentication is to use **salted hashed passwords**, written in the database as pair { **salt** + **hash\(password + salt\)** }. The hash function can be any strong cryptographic hash like SHA-256. * The idea is to keep different random **salt**, along with different **password hash**, changed every time, when the password is written in the database. 
Thus the same password is encrypted every time as different ciphertext { **salt** + **hash** }. * To **check the password**, developers **calculate the hash**\(password for checking\) using the **salt** from the database and compare the **calculated hash** with the **hash from the database**. @@ -64,24 +64,24 @@ The most complicated and **most secure** method for secure password storage and ## Password-Based Authentication -Using a **secure password storage** is only one of the components of the process of **secure password-based authentication** for Web apps, mobile apps and Internet services. Systems, that use password-based-authentication are subject of many attacks: +Using a **secure password storage** is only one of the components of the process of **secure password-based authentication** for Web apps, mobile apps and Internet services. Systems that use password-based authentication are subject to many attacks: * **Password guessing attack**: attacker tries to guess / brute-force the user's password by attempting many logins in parallel. * Solved easily by adding **increasing login delay** \(wait time before the login is available again\) after each wrong login attempt or even temporary account locking. Delays / locking should be done by **IP address + username**, to avoid login problems for the legitimate users. * Secure **KDF-based password storage** delays the password guessing process, so it is highly recommended. * Using a [**CAPTCHA**](https://en.wikipedia.org/wiki/CAPTCHA) after a 2-3 unsuccessful login attempts provides quite good protection. * **Denial of service attack**: attacker may attempt to login too many times to overload the system or can try to lock some user account with too many invalid login attempts for the same user. - * The **protection** from this attack is similar to the previous attack: use a **CAPTCHA** and **delay the login** process for certain IP address after each login attempt. 
+ * The **protection** from this attack is similar to the previous attack: use a **CAPTCHA** and **delay the login** process for certain IP addresses after each login attempt. * **Intercept and replay attack**: attacker may intercept the authentication communication \(to sniff the login / password / auth ticket / other credentials\) and use the intercepted credentials to login later. * Most systems solve this problem by using [**TLS**](https://en.wikipedia.org/wiki/Transport_Layer_Security) \(encrypted connection\) to securely send the authentication credentials \(password / authentication ticket\) to the server. * Other solutions include [challenge-response](https://en.wikipedia.org/wiki/Challenge–response_authentication) based **cryptographic authentication scheme**, such as the scheme used in the [Kerberos](https://en.wikipedia.org/wiki/Kerberos_%28protocol%29) protocol. * **Man-in-the-middle attack**: attacker can intercept and modify the intercepted traffic between the server and the client to trick the user to reveal its login credentials. * This is solved by using a [**TLS**](https://en.wikipedia.org/wiki/Transport_Layer_Security) secure connection with server certificate, which **authenticates the server**. * In some scenarios \(e.g. online banking\) **clients are also authenticated** by a digital certificate or OTP \(one-time password\). -* **Compromised server attack**: if the authentication server and its database is compromised \(hacked\) and all its authentication data is leaked, the attacker should be unable to reveal user's plaintext passwords. +* **Compromised server attack**: if the authentication server and its database is compromised \(hacked\) and all its authentication data is leaked, the attacker should be unable to reveal users' plaintext passwords. 
* First, it should be clear that if the authentication server is compromised, in all cases the **attackers will get unauthorised access**, because they will be able to intercept user's legitimate sessions \(their login and communication after successful login\) and use them to impersonate the user. - * Using a strongly **secure password storage** mechanism mitigates the risk for users' passwords to be revealed as plaintext. Still, attackers who gain access to the authentication server may inject password interception [backdoor](https://en.wikipedia.org/wiki/Backdoor_%28computing%29) and steal each user's plaintext credentials \(username + password\) during the login. - * The **backdoored server attack** can be stopped like this: the client generates a random number **r** and sends as authentication **HMAC\(password, r\)**; the server compares the HMAC with its stored password. This process may be combined with **client-side Scrypt or Argon2** computation and securely stored password at the server side \(Scrypt or Argon2 hashed\). In this scenario, unless the client software is not compromised, attackers who gained access to the authentication server will not obtain user's passwords in plaintext. + * Using a strongly **secure password storage** mechanism mitigates the risk for users' passwords to be revealed as plaintext. Still, attackers who gain access to the authentication server may inject a password interception [backdoor](https://en.wikipedia.org/wiki/Backdoor_%28computing%29) and steal each user's plaintext credentials \(username + password\) during the login. + * The **backdoored server attack** can be stopped like this: the client generates a random number **r** and sends as authentication **HMAC\(password, r\)**; the server compares the HMAC with its stored password. This process may be combined with **client-side Scrypt or Argon2** computation and securely stored password at the server side \(Scrypt or Argon2 hashed\). 
In this scenario, unless the client software is compromised, attackers who gain access to the authentication server will not obtain user's passwords in plaintext. * In Web applications, if the server is compromised it can **inject JavaScript code** to compromise the client itself. In desktop and mobiles apps, the client is more safe in case of compromised server. **Conclusions** about how to implement a secure password-based authentication for Web sites, apps and services: From 26737870ac6b33af52878f7264b17ff2d4a6601d Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 15:57:04 -0700 Subject: [PATCH 11/18] Update pseudo-random-numbers-examples.md for grammar --- secure-random-generators/pseudo-random-numbers-examples.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/secure-random-generators/pseudo-random-numbers-examples.md b/secure-random-generators/pseudo-random-numbers-examples.md index b97775e..b037b93 100644 --- a/secure-random-generators/pseudo-random-numbers-examples.md +++ b/secure-random-generators/pseudo-random-numbers-examples.md @@ -1,6 +1,6 @@ # Pseudo-Random Numbers - Examples -To get a better idea **how pseudo-random numbers are generated** in computer programming, let's play with at the following Python code, which generates 5 pseudo-random numbers in the range \[10...20\]: +To get a better idea **how pseudo-random numbers are generated** in computer programming, let's play with the following Python code, which generates 5 pseudo-random numbers in the range \[10...20\]: ```python import hashlib, time @@ -78,7 +78,7 @@ f8a4eaceb16156b1a23f4b6d08e54665ffa4822949b22e01d6de4c5daae965e3|3 3011033199204 f8a4eaceb16156b1a23f4b6d08e54665ffa4822949b22e01d6de4c5daae965e3|4 100466094724924763659843669256673300207383922129676800217664465341535622195997 --> 16 ``` -Note that the **collected entropy is very hard to be predicted**. 
The cracker should guess all the text entered by the user and also guess the exact time for each of the 5 inputs. If the above is repeated 20 instead of 5 times, it will be even harder to predict \(the collected entropy will be bigger\). +Note that the **collected entropy is very hard to predict**. The cracker should guess all the text entered by the user and also guess the exact time for each of the 5 inputs. If the above is repeated 20 instead of 5 times, it will be even harder to predict \(the collected entropy will be bigger\). Some cryptographical software use similar techniques like in the above code example when generating keys, password and randomness as general and now you know why: to collect entropy in an unpredictable way. From faf94d1130f2c29bae1768dd77a44985d6526319 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 16:07:25 -0700 Subject: [PATCH 12/18] Update secure-random-generators-csprng.md for grammar and spelling --- .../secure-random-generators-csprng.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/secure-random-generators/secure-random-generators-csprng.md b/secure-random-generators/secure-random-generators-csprng.md index d4f93d5..049fe45 100644 --- a/secure-random-generators/secure-random-generators-csprng.md +++ b/secure-random-generators/secure-random-generators-csprng.md @@ -1,17 +1,17 @@ # Secure Random Generators \(CSPRNG\) -Cryptography secure pseudo-random number generators \(**CSPRNG**\) are random generators, which guarantee that the random numbers coming from them are **absolutely unpredictable**. **CSPRNG** satisfy the [**next-bit test**](https://en.wikipedia.org/wiki/Next-bit_test) and withstand the [**state compromise extensions**](https://www.owasp.org/index.php/PRNG_state_compromise_extension_attack) and are typically part of the operating system or come from secure external source. 
Depending on the level of security required, CSPRNG can be implemented as **software** components or as **hardware** devices or as combination of both. +Cryptography secure pseudo-random number generators \(**CSPRNG**\) are random generators, which guarantee that the random numbers coming from them are **absolutely unpredictable**. **CSPRNG** satisfy the [**next-bit test**](https://en.wikipedia.org/wiki/Next-bit_test) and withstand the [**state compromise extensions**](https://www.owasp.org/index.php/PRNG_state_compromise_extension_attack) and are typically part of the operating system or come from a secure external source. Depending on the level of security required, CSPRNG can be implemented as **software** components or as **hardware** devices or as a combination of both. -For example, in the credit card printing centers the formal security regulations require certified **hardware random generators** to be used to generate credit card PIN codes, private keys and other data, designed to remain private. +For example, in credit card printing centers the formal security regulations require certified **hardware random generators** to be used to generate credit card PIN codes, private keys and other data, designed to remain private. -Modern operating systems \(OS\) **collect entropy** \(initial seed\) from the **environmental noise**: keyboard clicks, mouse moves, network activity, system I/O interruptions, hard disk activity, etc. Sources of randomness from the environment in Linux, for example, include inter-keyboard timings, inter-interrupt timings from some interrupts, and other events which are both non-deterministic and hard to measure for an outside observer. +Modern operating systems \(OS\) **collect entropy** \(initial seed\) from **environmental noise**: keyboard clicks, mouse moves, network activity, system I/O interruptions, hard disk activity, etc. 
Sources of randomness from the environment in Linux, for example, include inter-keyboard timings, inter-interrupt timings from some interrupts, and other events which are both non-deterministic and hard to measure for an outside observer. -The collected in the OS randomness is usually accessible from `/dev/random` and `/dev/urandom`. +Collected OS randomness is usually accessible from `/dev/random` and `/dev/urandom`. * Reading from the `/dev/random` file \(the limited blocking random generator\) **returns entropy** from the kernel's entropy pool \(collected noise\) and **blocks** when the entropy pool is empty until additional environmental noise is gathered. * Reading the `/dev/urandom` file \(the unlimited non-blocking random generator\) returns entropy from the kernel's entropy pool or a pseudo-random data, generated from previously collected environmental noise, which is also unpredictable, but is based on secure entropy "stretching" algorithm. -Usually a **CSPRNG** should start from an **unpredictable random seed** from the operating system, from a specialized hardware or from external source. Random numbers after the seed initialization are typically produces by a **pseudo-random computation**, but this does not compromise the security. Most algorithms often "**reseed**" the CSPRNG random generator when a new entropy comes, to make their work even more unpredictable. +Usually a **CSPRNG** should start from an **unpredictable random seed** from the operating system, from a specialized hardware or from external source. Random numbers after the seed initialization are typically produced by a **pseudo-random computation**, but this does not compromise the security. Most algorithms often "**reseed**" the CSPRNG random generator when a new entropy comes, to make their work even more unpredictable. 
Typically modern OS CSPRNG APIs combine the constantly collected **entropy** from the environment with the **internal state** of their built-in pseudo-random algorithm with continuous **reseeding** to guarantee maximal **unpredictability** of the generated randomness with high **speed** and **non-blocking** behavior in the same time. @@ -23,9 +23,9 @@ Modern **microprocessors** \(CPU\) ****provide a built-in hardware random genera Most cryptographic applications today do not require a hardware random generator, because the entropy in the operating system is secure enough for general cryptographic purposes. Using a **TRNG** is needed for systems with higher security requirements, such as banking and finance applications, certification authorities and high volume payment processors. -## How as a Developer to Access the CSPRNG? +## How a Developer can Access the CSPRNG? -Typically developers access the cryptographically strong random number generators \(**CSPRNG**\) for their OS from a **cryptography library** for their language and platform. +Typically developers access the cryptographically strong random number generators \(**CSPRNG**\) for their OS by using a **cryptography library** for their language and platform. * In **Linux** and **macOS**, it is considered that both `/dev/random` and `/dev/urandom` sources of randomness are **secure enough for most cryptographic purposes** and most cryptographic libraries access them internally. * In **Windows**, random numbers for cryptographic purposes can be securely generated using the `BCryptGenRandom` function from the [Cryptography API: Next Generation \(CNG\)](https://docs.microsoft.com/windows/desktop/SecCNG/cng-portal) or higher level crypto libraries. 
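As an illustration of the access paths listed above, here is a minimal Python sketch. It assumes the standard-library `secrets` and `os` modules, neither of which is named in the text; both draw on the operating system's randomness source rather than on a general-purpose PRNG. The variable names are illustrative only.

```python
import os
import secrets

# 32 random bytes (256 bits) taken from the operating system's CSPRNG
key = secrets.token_bytes(32)

# The same OS source, exposed at a lower level by the os module
raw = os.urandom(32)

# An integer in the range [10...20]: randbelow(11) returns a value in 0..10
n = 10 + secrets.randbelow(11)

print(key.hex())
print(raw.hex())
print(n)
```

The output differs on every run, which is the point: no seed is chosen or stored by the application code.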
From 5721707b10c621bf65a0e7455c6d45577a489a9c Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 16:08:07 -0700 Subject: [PATCH 13/18] Update exercises-pseudo-random-generator.md for grammar --- secure-random-generators/exercises-pseudo-random-generator.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/secure-random-generators/exercises-pseudo-random-generator.md b/secure-random-generators/exercises-pseudo-random-generator.md index ccfa37f..440b045 100644 --- a/secure-random-generators/exercises-pseudo-random-generator.md +++ b/secure-random-generators/exercises-pseudo-random-generator.md @@ -1,6 +1,6 @@ # Exercises: Pseudo-Random Generator -Write a code to generate **30 pseudo-random integers** in the range **\[1...10\]**, starting from certain **entropy**, taken as input, using **HMAC key derivation**. +Write some code to generate **30 pseudo-random integers** in the range **\[1...10\]**, starting from certain **entropy**, taken as input, using **HMAC key derivation**. From the **entropy** generate a **seed** \(256-bit binary sequence\) using **SHA-256**: From 6cf106e08fe5d0fd9d1fdccd551af3a4530ecd7c Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Thu, 4 Nov 2021 16:23:33 -0700 Subject: [PATCH 14/18] Update secure-random-generators for grammar --- secure-random-generators/README.md | 39 +++++++++++++++++------------- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/secure-random-generators/README.md b/secure-random-generators/README.md index 15b9f3f..4a05ba8 100644 --- a/secure-random-generators/README.md +++ b/secure-random-generators/README.md @@ -2,15 +2,15 @@ ## Secure Random Number Generators, PRNG and CSPRNG -In cryptography the **randomness** \(entropy\) plays very important role. In many algorithms, we need **random \(i.e. unpredictable\) numbers**. If these numbers are not unpredictable, the algorithms will be compromised. +In cryptography **randomness** \(entropy\) plays very important role. 
In many algorithms, we need **random \(i.e. unpredictable\) numbers**. If these numbers are not unpredictable, the algorithms will be compromised. For example, assume we need a **secret key**, that will protect our financial assets. This secret key should be **randomly generated** in a way that nobody else should be able to generate or have the same key. If we generate the key from a **secure random generator**, the it will be **unpredictable** and the system will be secure. Therefore "secure random" means simply "**unpredictable random**". -Let's discuss in bigger detail the **random numbers** in computer science and their role in **cryptography**, as well as pseudo-random numbers generators \(**PRNG**\), secure pseudo-random generators \(**CSPRNG**\) and some guidelines about how developers should generate and use random numbers in their code. +Let's discuss in more detail **random numbers** in computer science and their role in **cryptography**, as well as pseudo-random number generators \(**PRNG**\), secure pseudo-random generators \(**CSPRNG**\) and some guidelines about how developers should generate and use random numbers in their code. ## Random Generators -In computer science **random numbers** usually come from a **pseudo-random number generators** \(PRNG\), initialized by some unpredictable initial randomness \(**entropy**\). In cryptography secure PRNGs are used, known as **CSPRNG**, which typically combined entropy with PRNG and other techniques to make the generated randomness **unpredictable**. +In computer science **random numbers** usually come from a **pseudo-random number generator** \(PRNG\), initialized by some unpredictable initial randomness \(**entropy**\). In cryptography secure PRNGs are used, known as **CSPRNG**, which typically combine entropy with PRNG and other techniques to make the generated randomness **unpredictable**.
### Pseudo-Random Number Generators \(PRNG\) @@ -26,32 +26,37 @@ Pseudo-random functions \(which are not secure for cryptography\) usually use an This process in its simplest form can be implemented as follows: ```python -init(entropy): - state = entropy, counter = 0 -netNum(): - state = HMAC(state, ++counter) - return state +class RandomNumberGenerator(object): + def __init__(self, entropy): + self.state = entropy + self.counter = 0 + + def get_num(self): + self.counter += 1 + self.state = HMAC(self.state, self.counter) + + return self.state ``` Of course, the **HMAC** function can be changed by some **cryptographic hash** function or another mathematical transformation like the [**Mersenne Twister**](https://en.wikipedia.org/wiki/Mersenne_Twister) \(which is not cryptographically secure\), but the main idea stays the same: pseudo-random generators have internal **state**, initialized with some **initial randomness** and over the time **change** their internal state and **generate pseudo-random numbers**, based on the current state. -Good random number generators should be **fast** and should generate **statistical randomness** \(see the [Diehard tests](https://en.wikipedia.org/wiki/Diehard_tests)\), i.e. all numbers should have the same chance to be generated over the time. This is not sufficient cryptography, so CSPRNG have higher requirements. +Good random number generators should be **fast** and should generate **statistical randomness** \(see the [Diehard tests](https://en.wikipedia.org/wiki/Diehard_tests)\), i.e. all numbers should have the same chance to be generated over the time. This however is not sufficient for cryptography, so CSPRNG have higher requirements. 
-The above idea to generate random pseudo-numbers based on **HMAC\(key + counter\)**, with some complications, is known as the [**HMAC\_DRGB algorithm**](https://www.cs.cmu.edu/~kqy/resources/thesis.pdf), described in the security standard [NIST 800-90A](https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-90a.pdf). +The above idea of generating random pseudo-numbers based on **HMAC\(key + counter\)**, with some complications, is known as the [**HMAC\_DRGB algorithm**](https://www.cs.cmu.edu/~kqy/resources/thesis.pdf), described in the security standard [NIST 800-90A](https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-90a.pdf). ### Initial Entropy \(Seed\) -To be secure, a **PRNG** \(which is statistically random\) should start by a **truly random initial seed**, which is absolutely **unpredictable**. If the seed is predictable, it will generate predictable sequence of random numbers and the entire random generation process will be **insecure**. That's why having unpredictable randomness at the start \(secure seed\) is very important. +To be secure, a **PRNG** \(which is statistically random\) should start with a **truly random initial seed**, which is absolutely **unpredictable**. If the seed is predictable, it will generate a predictable sequence of random numbers and the entire random generation process will be **insecure**. That's why having unpredictable randomness at the start \(secure seed\) is very important. How to initialize the pseudo-random generator in a secure way? The answer is simple: **collect randomness \(entropy\)**. ### Entropy -In computer science "**entropy**" means **unpredictable randomness**, and is usually measured in bits. For example, if you move your computer's mouse, it will generate some hard-to-predict events, like the start location and the end location of the mouse cursor. 
If we assume that the mouse has changed its position in the range of \[0...255\] pixels, the entropy collected from this mouse movement should be about 8 bits \(because 2^8 = 255\). Another example: if the user is asked to think of a number in the range \[0...1000\], this number will hold about 9-10 bits of entropy \(because 2^10 = 1024\). To collect 256 bits of entropy \(e.g. to securely generate a 256-bit integer\), you will need to take into account a sequence of several such events \(like mouse movements and keyboard interracions from the user\). +In computer science "**entropy**" means **unpredictable randomness**, and is usually measured in bits. For example, if you move your computer's mouse, it will generate some hard-to-predict events, like the start location and the end location of the mouse cursor. If we assume that the mouse has changed its position in the range of \[0...255\] pixels, the entropy collected from this mouse movement should be about 8 bits \(because 2^8 = 256\). Another example: if the user is asked to think of a number in the range \[0...1000\], this number will hold about 9-10 bits of entropy \(because 2^10 = 1024\). To collect 256 bits of entropy \(e.g. to securely generate a 256-bit integer\), you will need to take into account a sequence of several such events \(like mouse movements and keyboard activity\). ### Collecting Entropy -Entropy can be collected from many **hard-to-predict events** in the computer: keyboard clicks, mouse moves, network activity, camera activity, microphone activity and others, combined with the time at which they occur. This collection of initial randomness is usually performed internally by the **operating system** \(OS\), which provides standard **API** to access it \(e.g. reading from the `/dev/random` file in Linux\).
In desktop system, laptop or mobile phone entropy is easy to collect, while on some limited hardware devices \(such as simple microcontrollers\) entropy is hard or impossible to be collected. +Entropy can be collected from many **hard-to-predict events** in the computer: keyboard clicks, mouse moves, network activity, camera activity, microphone activity and others, combined with the time at which they occur. This collection of initial randomness is usually performed internally by the **operating system** \(OS\), which provides a standard **API** to access it \(e.g. reading from the `/dev/random` file in Linux\). In desktop system, laptop or mobile phone entropy is easy to collect, while on some limited hardware devices \(such as simple microcontrollers\) entropy is hard or impossible to collect. Application software can **collect entropy explicitly**, by asking the user to move the mouse, type something at the keyboard, say something at the microphone or move in front of the camera for a while. A great example of this is the [**bitaddress.org**](https://www.bitaddress.org) wallet app, which combines mouse moves with keyboard events to collect entropy: @@ -61,11 +66,11 @@ Once enough entropy is collected, it is used to initialize the random generator. ### Insecure Randomness -Insecure / compromised randomness can compromise cryptography. A good example to learn from is the story of the stolen Bitcoins, due to **broken random generator in Android**: [https://goo.gl/PFE1kr](https://goo.gl/PFE1kr). That's why developers should care about randomness, when they use cryptography and ensure their **random generators are secure**. +Insecure / compromised randomness can compromise cryptography. A good example to learn from is the story of the stolen Bitcoins, due to a **broken random generator in Android**: [https://goo.gl/PFE1kr](https://goo.gl/PFE1kr). 
That's why developers should care about randomness, when they use cryptography and ensure their **random generators are secure**. +Insecure / compromised randomness can compromise cryptography. A good example to learn from is the story of the stolen Bitcoins, due to a **broken random generator in Android**: [https://goo.gl/PFE1kr](https://goo.gl/PFE1kr). That's why developers should care about randomness when they use cryptography and ensure their **random generators are secure**. ### Insecure Randomness - Examples -As example how easy it is to **compromise the random number security in Python** \(in its old versions\), we shall give this code example: +Below is an example of how easy it is to **compromise the random number security in Python** \(in its old versions\): ```python import random @@ -119,9 +124,9 @@ By definition **CSPRNG** \(Cryptography Secure Random Number Generators\) are a The **entropy** in the operating system is usually of **limited amount** and waiting for more entropy is slow and unpractical. Most cryptographic applications use **CSPRNG**, which "stretch" the available entropy from the operating system into **more bits**, required for cryptographic purposes and comply to the above CSPRNG requirements. -Many design have been proposed to construct CSPRNG algorithms: +Many designs have been proposed to construct CSPRNG algorithms: -* **CSPRNG** based on secure **block ciphers** in counter mode, on **stream ciphers** or on secure **secure hash functions**. +* **CSPRNG** based on secure **block ciphers** in counter mode, **stream ciphers**, or **secure hash functions**. * **CSPRNG** based on number theory, relying on the difficulty of the integer factorization problem \(IFP\), the discrete logarithm problem \(DLP\) or the elliptic-curve discrete logarithm problem \(ECDLP\). * **CSPRNG** based on special design for cryptographic secure randomness, such as [Yarrow algorithm](https://en.wikipedia.org/wiki/Yarrow_algorithm) and [Fortuna](https://en.wikipedia.org/wiki/Fortuna_%28PRNG%29), which were used in MacOS and and FreeBSD.
From f9a2cba365a92af8bca3905687b18863fb155938 Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Fri, 12 Nov 2021 08:56:37 -0800 Subject: [PATCH 15/18] Update diffie-hellman-key-exchange.md for grammar and spelling --- key-exchange/diffie-hellman-key-exchange.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/key-exchange/diffie-hellman-key-exchange.md b/key-exchange/diffie-hellman-key-exchange.md index 15f8b78..0afbd5d 100644 --- a/key-exchange/diffie-hellman-key-exchange.md +++ b/key-exchange/diffie-hellman-key-exchange.md @@ -14,25 +14,25 @@ The **Diffie–Hellman Key Exchange** protocol can be implemented using **discre ### Key Exchange by Mixing Colors -The Diffie–Hellman Key Exchange protocol is very similar to the concept of "**key exchanging by mixing colors**", which has a good visual representation, which simplifies its understanding. This is why we shall first explain how to exchange a secret color by **color mixing**. +The Diffie–Hellman Key Exchange protocol is very similar to the concept of "**visually mixing colors**", a helpful illustration for understanding **key exchange**. -The design of color mixing key exchange scheme assumes that if we have two liquids of different colors, we can **easily mix the colors** and obtain a new color, but the reverse operation is almost impossible: **no way to separate the mixed colors** back to their original color components. +The design of color mixing key exchange scheme assumes that if we have two liquids of different colors, we can **easily mix the colors** and obtain a new color, but the reverse operation is almost impossible: there is **no way to separate the mixed colors** into their original color components. This is the color exchange **scenario**, step by step: * **Alice** and **Bob**, agree on an arbitrary **starting \(shared\) color** that does not need to be kept secret \(e.g. _yellow_\). 
* **Alice** and **Bob** separately select a **secret color** that they keep to themselves \(e.g. _red_ and _sea green_\). -* Finally **Alice** and **Bob** **mix** their secret color together with their mutually shared color. The obtained mixed colors area ready for public exchange \(in our case _orange_ and _light sky blue_\). +* Finally **Alice** and **Bob** **mix** their secret color together with their mutually shared color. The resulting colors are ready for public exchange \(in our case _orange_ and _light sky blue_\). ![](../.gitbook/assets/key-exchange-by-color-mixing-part-1.png) The next steps in the color exchanging scenario are as follows: * **Alice** and **Bob** publicly **exchange** their two **mixed colors**. - * We assume that there is no efficient way to extract \(separate\) the secret color from the mixed color, so third parties who know the mixed colors cannot reveal the secret colors. + * There is no efficient way to extract \(separate\) the secret color from the mixed color, so third parties who know the mixed colors cannot discover the secret colors. * Finally, **Alice** and **Bob** mix together the color they received from the partner with their own secret color. * The result is the **final color mixture** \(_yellow-brown_\) which is identical to the partner's color mixture. - * It is the **securely exchanged shared key**. + * This is the **securely exchanged shared key**. ![](../.gitbook/assets/key-exchange-by-color-mixing-part-2.png) @@ -64,7 +64,7 @@ there is no efficient \(fast\) algorithm to find the secret exponent **s**. This The **Discrete Logarithm Problem \(DLP\)** in computer science is defined as follows: -* By given element _**b**_ and value _**a**_ = _**bx**_ find the exponent _**x**_ \(if it exists\) +* Given element _**b**_ and value _**a**_ = _**bx**_ find the exponent _**x**_ \(if it exists\) The exponent _**x**_ is called [**discrete logarithm**](https://en.wikipedia.org/wiki/Discrete_logarithm), i.e. 
**x** = _log_**b**\(**a**\). The elements _**a**_ and _**b**_ can be simple integers modulo _**p**_ \(from the [group ℤ/pℤ](https://en.wikipedia.org/wiki/Multiplicative_group_of_integers_modulo_n)\) or elements of [finite cyclic multiplicative group **G**](https://en.wikipedia.org/wiki/Cyclic_group) \(modulo _**p**_\), where _**p**_ is typically a prime number. @@ -98,7 +98,7 @@ In the most common implementation of DHKE \(following the [RFC 3526](https://too ### Security of the DHKE Protocol -The DHKE protocol is based on the practical difficulty of the [Diffie–Hellman problem](https://en.wikipedia.org/wiki/Diffie–Hellman_problem), which is a variant of the well known in the computer science [DLP \(discrete logarithm problem\)](https://en.wikipedia.org/wiki/Discrete_Logarithm_Problem_%28DLP%29), for which no efficient algorithm still exists. +The DHKE protocol is based on the practical difficulty of the [Diffie–Hellman problem](https://en.wikipedia.org/wiki/Diffie–Hellman_problem), which is a well-known variant of the [DLP \(discrete logarithm problem\)](https://en.wikipedia.org/wiki/Discrete_Logarithm_Problem_%28DLP%29), for which no efficient algorithm still exists. DHKE exchanges a **non-secret sequence of integer numbers** over insecure, public \(sniffable\) channel \(such as signal going through a cable or propagated by waves in the air\), but does not reveal the secretly-exchanged shared private key. @@ -116,5 +116,5 @@ As live example, you can play with this online DHKE tool: [http://www.irongeek.c The [**Elliptic-Curve Diffie–Hellman \(ECDH\)**](https://en.wikipedia.org/wiki/Elliptic-curve_Diffie–Hellman) is an anonymous key agreement protocol that allows two parties, each having an **elliptic-curve public–private key pair**, to establish a shared secret over an insecure channel. 
-**ECDH** is a variant of the classical **DHKE** protocol, where the **modular exponentiation** calculations are replaced with **elliptic-curve** calculations for improved security. We shall explain in details the **elliptic-curve cryptography \(ECC\)** section later. +**ECDH** is a variant of the classical **DHKE** protocol, where the **modular exponentiation** calculations are replaced with **elliptic-curve** calculations for improved security. We shall explain **elliptic-curve cryptography \(ECC\)** in detail in a later section. From 275fd2c2a365211d2618dffccdfb5c3d99edd55d Mon Sep 17 00:00:00 2001 From: Jon Staab Date: Wed, 1 Dec 2021 13:10:22 -0800 Subject: [PATCH 16/18] Update diffie-hellman-key-exchange.md for spelling and grammar --- key-exchange/diffie-hellman-key-exchange.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/key-exchange/diffie-hellman-key-exchange.md b/key-exchange/diffie-hellman-key-exchange.md index 0afbd5d..20f48d6 100644 --- a/key-exchange/diffie-hellman-key-exchange.md +++ b/key-exchange/diffie-hellman-key-exchange.md @@ -4,7 +4,7 @@ [**Diffie–Hellman Key Exchange**](https://en.wikipedia.org/wiki/Diffie–Hellman_key_exchange) \(DHKE\) is a cryptographic method to **securely exchange cryptographic keys** \(key agreement protocol\) over a public \(insecure\) channel in a way that overheard communication does not reveal the keys. The exchanged keys are used later for encrypted communication \(e.g. using a symmetric cipher like AES\). -**DHKE** was one of the first **public-key protocols**, which allows two parties to exchange data securely, so that is someone sniffs the communication between the parties, the information exchanged can be revealed. +**DHKE** was one of the first **public-key protocols**, which allows two parties to exchange data securely, so that if someone sniffs the communication between the parties, the information exchanged will not be revealed.
The Diffie–Hellman \(DH\) method is **anonymous key agreement scheme**: it allows two parties that have no prior knowledge of each other to jointly establish a **shared secret key over an insecure channel**.

From fc5543322869a7bb1d72201f89edefbd502c0daf Mon Sep 17 00:00:00 2001
From: Jon Staab
Date: Wed, 1 Dec 2021 13:26:02 -0800
Subject: [PATCH 17/18] Update encryption-symmetric-and-asymmetric.md for grammar

---
 encryption-symmetric-and-asymmetric.md | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/encryption-symmetric-and-asymmetric.md b/encryption-symmetric-and-asymmetric.md
index 667ed22..2dc6ca0 100644
--- a/encryption-symmetric-and-asymmetric.md
+++ b/encryption-symmetric-and-asymmetric.md
@@ -99,7 +99,7 @@ Digital signatures are widely used in the **finance industry** for authorizing p

### Key Pairs

-The **public key cryptography** uses a **pair of keys**: **public key** + **private key**. These keys are mathematically connected and are used together as **pair**.
+The **public key cryptography** uses a **pair of keys**: **public key** + **private key**. These keys are mathematically connected and are used together as a **pair**.

In some public key cryptosystems \(like the Elliptic-Curve Cryptography - **ECC**\), the public key can be calculated from the private key. In other cryptosystems \(like **RSA**\), the public key and the private key are generated together but cannot be directly calculated from each other.

@@ -114,7 +114,7 @@ pubKey: 02c324648931b89e3e8a0fc42c96e8e3be2e42812986573a40d46563bceaf75110

### Private Keys

-Message **encryption** and **signing** is done by a **private key**. The private keys are always kept **secret** by their owner, just like passwords. In the server infrastructure, private key usually stay in an encrypted and protected **keystore**. In the blockchain systems the private keys usually stay in specific software or hardware apps or devices called "**crypto wallets**", which store securely a set of private keys.
+Message **encryption** and **signing** is done by a **private key**. The private keys are always kept **secret** by their owner, just like passwords. In the server infrastructure, private keys usually stay in an encrypted and protected **keystore**. In the blockchain systems the private keys usually stay in specific software or hardware apps or devices called "**crypto wallets**", which securely store a set of private keys.

**Example** of 256-bit private key:

@@ -124,9 +124,9 @@ Message **encryption** and **signing** is done by a **private key**. The private

### Public Keys

-Message **decryption** and **signature verification** is done by the **public key**. Public keys are by design public information \(not a secret\). It is mathematically infeasible to calculate the private key from its corresponding public key.
+Message **decryption** and **signature verification** is done using the **public key**. Public keys are by design public information \(not a secret\). It is mathematically infeasible to calculate the private key from its corresponding public key.

-In many systems the **public key** is encapsulated in a **digital certificate**, which binds certain identity \(e.g. person or Internet domain name\) to certain public key. In blockchain systems public keys are usually published as parts of the blockchain transactions to help identify who has signed each transaction. In systems like PGP and SSH the public key is downloaded from the server once \(after manual user verification\) and is remembered for further use.
+In many systems the **public key** is encapsulated in a **digital certificate**, which binds a certain identity \(e.g. person or Internet domain name\) to a certain public key. In blockchain systems public keys are usually published as parts of the blockchain transactions to help identify who has signed each transaction. In systems like PGP and SSH the public key is downloaded from the server once \(after manual user verification\) and is remembered for further use.

**Example** of 256-bit public key:

@@ -136,29 +136,29 @@ In many systems the **public key** is encapsulated in a **digital certificate**,

In most blockchain systems the **blockchain address** is derived from the public key \(by hashing and other transformations\), so if you have someone's public key, you are assumed to have his blockchain address as well.

-A certain **public key** can be connected to certain **person** or **organization** or is used anonymously. You can never know who is the owner of the private key, corresponding to certain public key, unless you have additional proof, e.g. a [**digital certificate**](https://en.wikipedia.org/wiki/Public_key_certificate).
+A certain **public key** can be connected to a certain **person** or **organization** or is used anonymously. You can never know who is the owner of the private key, corresponding to certain public key, unless you have additional proof, e.g. a [**digital certificate**](https://en.wikipedia.org/wiki/Public_key_certificate).

## Popular Public Key Cryptosystems

-**Public key cryptosystems** provide mathematical framework and algorithms to generate public + private key pairs, to **sign**, **verify**, **encrypt** and **decrypt** messages and **exchange keys**, in a cryptographically secure way.
+**Public key cryptosystems** provide a mathematical framework and algorithms to generate public + private key pairs, to **sign**, **verify**, **encrypt** and **decrypt** messages and **exchange keys**, in a cryptographically secure way.
Well-known public-key cryptosystems are: [**RSA**](https://en.wikipedia.org/wiki/RSA_%28cryptosystem%29), [**ECC**](https://en.wikipedia.org/wiki/Elliptic-curve_cryptography) and [**ElGamal**](https://en.wikipedia.org/wiki/ElGamal_encryption). Many **crypto algorithms** are based on the primitives from these cryptosystems like **RSA sign**, **RSA encrypt / decrypt**, **ECDH** key exchange and **ECDSA** and **EdDSA** signatures.

### The RSA Cryptosystem

-The [**RSA public-key cryptosystem**](https://en.wikipedia.org/wiki/RSA_%28cryptosystem%29) is based on the math of **modular exponentiations** \(numbers raised to a power by modulus\) and some additional assumptions, together with the computational difficulty of the [**integer factorization problem**](https://en.wikipedia.org/wiki/RSA_problem). We shall discuss it later in details, along with examples.
+The [**RSA public-key cryptosystem**](https://en.wikipedia.org/wiki/RSA_%28cryptosystem%29) is based on the math of **modular exponentiations** \(numbers raised to a power by modulus\) and some additional assumptions, together with the computational difficulty of the [**integer factorization problem**](https://en.wikipedia.org/wiki/RSA_problem). We shall discuss it later in detail, along with examples.

### The ECC Cryptosystem

-The [**elliptic-curve cryptography \(ECC\) public-key cryptosystem**](https://en.wikipedia.org/wiki/Elliptic-curve_cryptography) is based on the math of the on the algebraic structure of the **elliptic curves** over finite fields and the difficulty of the [**elliptic curve discrete logarithm problem \(ECDLP\)**](https://en.wikipedia.org/wiki/Elliptic-curve_cryptography#Rationale). The **ECC** usually comes together with the [**ECDSA** algorithm](https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm) \(elliptic-curve digital signature algorithm\). We shall discuss ECC and ECDSA in details, along with examples.
+The [**elliptic-curve cryptography \(ECC\) public-key cryptosystem**](https://en.wikipedia.org/wiki/Elliptic-curve_cryptography) is based on the math of the on the algebraic structure of the **elliptic curves** over finite fields and the difficulty of the [**elliptic curve discrete logarithm problem \(ECDLP\)**](https://en.wikipedia.org/wiki/Elliptic-curve_cryptography#Rationale). The **ECC** usually comes together with the [**ECDSA** algorithm](https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm) \(elliptic-curve digital signature algorithm\). We shall discuss ECC and ECDSA in detail, along with examples.

### ECC is Recommended in the General Case

-**ECC uses smaller keys**, ciphertexts and signatures than RSA and is recommended for most applications. It is mathematically proven that a **3072-bit RSA key** has similar cryptographic strength to a **256-bit ECC key**. Key generation is in ECC is significantly faster than with RSA.
+**ECC uses smaller keys**, ciphertexts and signatures than RSA and is recommended for most applications. It is mathematically proven that a **3072-bit RSA key** has similar cryptographic strength to a **256-bit ECC key**. Key generation in ECC is significantly faster than with RSA.

Due to the above reasons most blockchain networks \(like Bitcoin and Ethereum\) use elliptic-curve-based cryptography \(ECC\) to secure the transactions.

-Note that both **RSA** and **ECC** cryptosystems are **not quantum-safe**, which means that if someone has enough powerful quantum computer, he will be able to derive the private key from given public key in just few seconds.
+Note that both **RSA** and **ECC** cryptosystems are **not quantum-safe**, which means that if someone has enough powerful quantum computer, he will be able to derive the private key from given public key in just a few seconds.
## Asymmetric Encryption in Practice

From 00ed2855b3d4349c783f2072f124ef95cd8c948c Mon Sep 17 00:00:00 2001
From: Jon Staab
Date: Mon, 3 Jan 2022 10:44:48 -0800
Subject: [PATCH 18/18] Update rsa-signatures.md for grammar

---
 digital-signatures/rsa-signatures.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/digital-signatures/rsa-signatures.md b/digital-signatures/rsa-signatures.md
index fd9b2e9..3fcfd95 100644
--- a/digital-signatures/rsa-signatures.md
+++ b/digital-signatures/rsa-signatures.md
@@ -4,7 +4,7 @@ The **RSA** public-key cryptosystem provides a **digital signature scheme** \(si

## Key Generation

-The RSA algorithm uses **keys** of size 1024, 2048, 4096, ..., 16384 bits. RSA supports also longer keys \(e.g. 65536 bits\), but the performance is too slow for practical use \(some operations may take several minutes or even hours\). For 128-bit security level, a 3072-bit key is required.
+The RSA algorithm uses **keys** of size 1024, 2048, 4096, ..., 16384 bits. RSA supports also longer keys \(e.g. 65536 bits\), but the performance is too slow for practical use \(some operations may take several minutes or even hours\). For a 128-bit security level, a 3072-bit key is required.

The **RSA key-pair** consists of: