# Cryptography





## Understand the Fundamental Concepts of Cryptography
```
Explain the fundamental concepts of cryptography and how they are used. Cryptography is the process of obscuring or hiding the meaning of information so that unauthorized persons or processes cannot read it or make a useful copy of it. The original information is called plaintext (no matter what form of data it is), which is encrypted to produce ciphertext, which can be transmitted to a recipient or stored for later retrieval. Upon receipt or retrieval, the ciphertext is decrypted to recover the original meaning and the original form of the plaintext. The encryption and decryption processes (or algorithms) require keys; without the keys, no encryption or decryption can occur. Symmetric encryption uses the same key (or a simple transform of it) for encryption and decryption, whereas asymmetric encryption uses different keys that are nearly impossible to derive from each other.
```
Cryptography brings many capabilities to the information systems designer, builder, user, and owner:

* **Confidentiality**: Protect the meaning of information and restrict its use to authorized users.
* **Utility**: Map very large sets of possible messages, values, or data items to much smaller, more useful sets.
* **Uniqueness**: Generate and manage identifiers for use in access control and privilege management systems.
* **Identity**: Validate that a person or process is who and what they claim to be.
* **Privacy**: Ensure that information related to the identity of a person or process is kept confidential, and its integrity is maintained throughout.
* **Nonrepudiation**: Provide ways to sign messages, documents, and even software executables so that recipients can be assured of their authenticity.
* **Integrity**: Ensure that the content of the information has not been changed in any way except by authorized, trustworthy processes.

**Encryption** is the process of taking a message written in one set of symbols (and its syntax and semantics) and hiding or obscuring its meaning by changing the way the message is written. **Decryption** is then the process of unobscuring or revealing the meaning of an encrypted message and restoring it so that its original meaning is intact and revealed. The original **plaintext** message or information is encrypted into **ciphertext**, which is then decrypted back to its plaintext form and meaning.

**Cleartext** can either mean plaintext or data that is never intended to be transmitted, stored, or used in anything but an unencrypted form with its meaning and value available to anyone to read.

**Cryptographic algorithms** are the formal definition of the processes we use to encrypt plaintext into ciphertext and then decrypt ciphertext back to plaintext.

![](images/encoding-encrypting-decrypting-decoding.png)

At its heart, all cryptography uses substitution and transposition to take the input plaintext and rewrite it in a different set of symbols so that its meaning is hidden. Simple **substitution** encrypts by replacing every occurrence of one symbol in the plaintext with its cipher value (from a table); the symbols can be individual letters, digits, short fixed-length strings of characters, or entire words. Decryption takes each symbol in the ciphertext and uses the same table to look up its plaintext value. **Transposition** changes the order of symbols in the plaintext message (as in the scytale cipher used in ancient Greece).
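
The two building blocks can be sketched in a few lines of Python. This is a minimal illustration, not a secure cipher: the function names are hypothetical, the substitution table is a simple shifted alphabet (a Caesar cipher), and the transposition writes the text in rows and reads it out by columns.

```python
import string

def caesar_substitute(text: str, shift: int) -> str:
    """Substitution: replace each letter using a shifted-alphabet lookup table."""
    upper = string.ascii_uppercase
    shifted = upper[shift % 26:] + upper[:shift % 26]
    return text.upper().translate(str.maketrans(upper, shifted))

def columnar_transpose(text: str, columns: int) -> str:
    """Transposition: reorder symbols by reading every `columns`-th character."""
    return "".join(text[i::columns] for i in range(columns))

def columnar_untranspose(ciphertext: str, columns: int) -> str:
    """Undo columnar_transpose by rebuilding the columns and interleaving them."""
    n = len(ciphertext)
    # Column i originally received characters i, i + columns, i + 2*columns, ...
    lengths = [len(range(i, n, columns)) for i in range(columns)]
    cols, pos = [], 0
    for length in lengths:
        cols.append(ciphertext[pos:pos + length])
        pos += length
    return "".join(cols[i % columns][i // columns] for i in range(n))

substituted = caesar_substitute("ATTACKATDAWN", 3)   # "DWWDFNDWGDZQ"
transposed = columnar_transpose("ATTACKATDAWN", 4)   # "ACDTKATAWATN"
```

Decrypting the substitution uses the same table in reverse (`caesar_substitute(substituted, -3)`), while the transposition leaves every symbol unchanged and only restores its position.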

**Cryptography** is the art and science of transforming plaintext information by means of suitable encryption techniques into ciphertext, which can then be decrypted back into matching plaintext. 

A **cryptographic system** is the sum total of all the elements we need to make a specific application of cryptography part of our information systems. It includes the algorithm for encrypting and decrypting our information; the control parameters, keys, and procedural information necessary to use the algorithm correctly; and any other specialized support hardware, software, or procedures necessary to make a complete solution.

* **Cryptography** refers specifically to the use and practice of cryptographic techniques.
* **Cryptanalysis** refers to the study of vulnerabilities (theoretical or practical) in cryptographic algorithms and systems and the use of exploits against those vulnerabilities to break such systems.
* **Cryptology** refers to the combined study of cryptography (the secret writing) and cryptanalysis (trying to break other people’s secret writing systems or find weaknesses in your own).
* **Cryptolinguistics**, however, refers to translating between human languages to produce useful information, insight, or actionable intelligence (and has little to do with cryptography).

By defining how the cryptographic systems process the input plaintext, we can get:

* **Character or symbol ciphers** use individual symbols in the plaintext as the unit to encrypt and decrypt, much like the simple, classical substitution and transposition ciphers did.


* **Block ciphers** take the input plaintext as a stream of symbols and break it up into fixed-length blocks; each block is then encrypted and decrypted as if it were a single (larger) symbol. A block of $64$ bits ($8$ eight-bit bytes) can be thought of as a $64$-digit binary number, which is what the encryption algorithm would then work on. Block ciphers typically have to pad the last block of a plaintext message (such as a file or an email) so that each block has the required length.


* **Stream ciphers** are symmetric encryption processes that work on a single byte (sometimes even a single bit) of plaintext at a time, but they use a pseudorandom string (or keystream) of cipher digits to encrypt the input plaintext with. Stream ciphers typically use simple operations, such as exclusive-or, to encrypt each bit or byte. These operations run very fast (perhaps each encryption taking a few nanoseconds). Stream ciphers by design can work on any length of input plaintext. The keystream generator is a function (implemented in hardware, software, or both) that uses a seed value (the encryption key itself) as input, producing encryption values to be combined with each bit or byte of the input plaintext. Stream ciphers like RC4 find widespread use in mobile communications systems such as cell phones, Wi-Fi, and others, in which the plaintext input is often of unbounded length.
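
The stream cipher idea (a seed expanded into a keystream, then combined with the plaintext by exclusive-or) can be sketched as follows. This is a toy under stated assumptions: the keystream generator here simply hashes the key with a running counter using SHA-256, which is not RC4 and not a vetted stream cipher design, and the function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream generator: expand the key (seed) into `length` pseudorandom bytes."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])

def xor_stream(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing each byte with the matching keystream byte."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

ciphertext = xor_stream(b"any length of plaintext works", b"shared secret")
recovered = xor_stream(ciphertext, b"shared secret")
```

Because exclusive-or is its own inverse, the same function both encrypts and decrypts, and no padding is needed for any input length.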

A **cryptographic algorithm** defines or specifies a series of steps (some mathematical, some logical, some grouping or ungrouping of symbols, or other kinds of operations) that must be applied, in the specified sequence, to achieve the required operation of the system. Think of the algorithm as the total set of swap rules that you need to use, and the correct order to apply those rules in, to make the cryptographic system work properly. The basic processes of substitution and transposition can be repetitively or iteratively applied in a given cryptographic process. The number of rounds that an algorithm iterates over is a measure of this repetition. A combination of hardware and software features can implement this repetition.

Encryption and decryption processes can suffer from what we call a **collision**, which can render them unusable. A collision occurs if either of the following happens:

* Two different plaintext phrases map (encrypt) to the same ciphertext phrase; you then lose the difference in meaning between the two plaintext phrases.
* Two different ciphertext phrases map (decrypt) to the same plaintext phrase; you then have no idea which plaintext meaning was intended.

We talk about the **key strength** as a way to measure or assert how much effort would be required to break (illicitly decrypt) a message encrypted by a given algorithm using such a key. In most cases, this is directly related to the key size, defined as how many bits make up a key. Another way to think of this is that the key size determines the size of the key space: the total number of values that such a key can take on. Thus, an $8$-bit key can represent the decimal numbers $0$ through $255$, which is like saying that an $8$-bit key space has $256$ unique values in it. SSL uses a $256$-bit key as its session key (to encrypt and decrypt all exchanges of information during a session), which would mean that someone trying to brute force crack your session would need to try up to $2^{256}$ possible values (that's a $78$-digit base-$10$ number) of a key to decrypt packets they’ve sniffed from your session.
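
The key space arithmetic is easy to check directly. The sketch below is illustrative only; the function names and the guessing rate of $10^{12}$ keys per second are assumptions chosen for the example, not measurements.

```python
def key_space_size(bits: int) -> int:
    """Number of distinct values a key of the given bit length can take."""
    return 2 ** bits

def worst_case_years(bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key at a fixed (assumed) guessing rate."""
    seconds = key_space_size(bits) / guesses_per_second
    return seconds / (365 * 24 * 3600)

small = key_space_size(8)                  # 256 possible keys
digits = len(str(key_space_size(256)))     # 78 decimal digits
years = worst_case_years(256, 1e12)        # assumed rate: 10**12 guesses per second
```

Each additional key bit doubles the key space, so the brute-force effort grows exponentially with key length rather than linearly.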

**Key distribution and management** become the biggest challenges in running almost any cryptographic system. **Keying material** is a term that collectively refers to all materials and information that govern how keys are generated and distributed to users in a cryptographic system, and how those users validate that the keys are legitimate. **Key management** processes govern how long a key can be used and what users and systems managers must do if a key has been compromised (presumably by falling into the wrong hands). **Key distribution** describes how newly generated keys are issued to each legitimate user, along with any updates to the rules for their period of use and their safe disposal. 

The term **cryptographic protocols** can refer to two different sets of processes and techniques. The first is the use of cryptography itself in the operation of a cryptographic system, which typically can refer to key management and key distribution techniques. The second usage refers to the use of cryptographic systems and techniques to solve a particular problem.

A **cryptographic module** is any combination of hardware, firmware, or software that implements cryptographic functions. 

### Hashing

```
Differentiate between hashing and encryption. Hashing is a one-way encryption process: plaintext goes in, a hash value comes out, but you cannot reverse this to “un-hash” a hash value to get back to the original plaintext. Hashing takes a plaintext message and uses an encrypting hash algorithm to transform the plaintext into a smaller, shorter value (called the hash or hash value), which must be unique to the input plaintext. The hash algorithm should make it impossible to decrypt the hash value back into the plaintext, without any way to determine the meaning of a particular hash value. By contrast, the purpose of non-hashing encryption is to safely store or communicate plaintext with its meaning hidden for storage and transmission so that the meaning can later be derived by means of the right decryption algorithm and key. Encryption for storage and communication is thus part of a two-way process.
```

Hashing provides a way to take a very large set of expressions (messages, names, values, etc.) and map them down to a much smaller set of values.

Hashing provides many advantages in information systems design that stem from its ability to uniquely generate a numeric value that can represent arbitrary alphanumeric data (such as individual names or street addresses, part numbers, or drug names). These hash values can be stored in tables as relative offsets or pointers into very large files, eliminating the need to read every record to see if it's the one you actually need to use. Hashing the entire contents of a file produces a long-form error detection code: reapplying the hash function later and comparing the resultant hash value to the one stored with the file reveals a mismatch if the file has been corrupted or changed. These are sometimes called **digital fingerprints** or **checksums** when used to detect (and possibly correct) errors in file storage or transmission. Hashing can also be applied to an entire message, producing a secure message hash or message digest. Although messages are typically of variable length, the message digest is of fixed length, which makes digests easy to use in file systems, communications systems, and security systems.
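
A digital fingerprint of this kind can be sketched with Python's standard `hashlib` module. SHA-256 is chosen here only as an example algorithm, and the `fingerprint` helper is a hypothetical name.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a fixed-length hex digest for input of any length."""
    return hashlib.sha256(data).hexdigest()

original = b"quarterly report, final version"
stored_digest = fingerprint(original)          # kept alongside the file

# Later: recompute and compare to detect corruption or tampering.
unchanged = fingerprint(original) == stored_digest
altered = fingerprint(original + b"!") == stored_digest
```

The digest is always 64 hex characters (256 bits) whether the input is empty or gigabytes long, which is what makes it convenient as an index or integrity check.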

Hash algorithms transform the long key into a hash key or short key, where the long keys can be drawn from some arbitrarily large set of values (such as personal names) and the short key or hash key needs to fit within a more constrained space. The hash key, also called the hash, the hash value, the hash sum, or other term, is then used in place of the long key as a pointer value, an index, or an identifier. Two main properties of hash functions are similar to those of a good encryption function:

* The hash function must be one way: there should be no computationally feasible way to take a hash value and back-compute or derive the long key from which it was produced.
* The hash function must produce unique values for all possible inputs; it should be computationally infeasible to have two valid long keys as input that produce the same hash value as a result of applying the hash function.

Notice that both hashing and encryption must be one-to-one mappings or functions - no two input values can produce the same output value. But encryption must be able to decrypt the ciphertext back into one and only one plaintext (the identical one you started with).

![Hashing vs. Encryption as Functions](images/hashing-vs-encryption-as-functions.png)

Like encryption algorithms, hash algorithms need to deal with collisions (situations where two different long key inputs can hash to the same hash value). These are typically addressed with additional processing stages to detect and resolve the collision.

Hash algorithms may make use of a salt value to initialize the calculations. This is typically a random (well, pseudorandom) value that is included with the input long key; if the hash algorithm is dealing with a $256$-byte long key, a two-byte salt value effectively has the algorithm work on $258$ bytes. This offers significant protection against rainbow table or dictionary-based attacks on hashes by making the attacker have to precompute a significantly larger table of hashed values.
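
The effect of a salt can be sketched as follows. This is a minimal illustration, assuming a 16-byte salt and a single SHA-256 pass; the `salted_hash` helper is hypothetical, and real password storage would normally use a deliberately slow key-derivation function instead.

```python
import hashlib
import secrets
from typing import Optional, Tuple

def salted_hash(secret: bytes, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Hash salt + secret; generate a fresh random salt if none is supplied."""
    if salt is None:
        salt = secrets.token_bytes(16)
    return salt, hashlib.sha256(salt + secret).digest()

# The same input hashed twice gets two different salts, hence two different hashes.
salt_a, hash_a = salted_hash(b"correct horse")
salt_b, hash_b = salted_hash(b"correct horse")

# Re-using the stored salt reproduces the stored hash, so verification still works.
_, hash_again = salted_hash(b"correct horse", salt_a)
```

Because the salt is stored next to the hash, it is not a secret; its job is to force an attacker to build a separate precomputed table for every salt value.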

A number of published standards define secure hash functions for use in various kinds of information security systems. The SHA series of Secure Hash Algorithms, published by NIST (with the earlier members designed by the NSA), is one such series; the original SHA-0 and SHA-1 standards have been shown to be vulnerable to collision attacks and are being phased out for use with SSL.

An algorithm is **deterministic** if, given the same input and series of events, it always produces the same result.

If we use a deterministic algorithm to produce a set of numbers, using a seed value as a key input, we call such sets of numbers **pseudorandom**: the set as a whole exhibits statistical randomness, but given the $n$th value of the sequence and knowing the algorithm and the seed, the next element of the sequence, the $(n + 1)$th value, can be determined.
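
Determinism in a pseudorandom generator is easy to demonstrate. The sketch below uses Python's `random.Random`, a general-purpose generator that is not suitable for cryptographic keys; it is used here only to show that the seed fully determines the sequence. The helper name is hypothetical.

```python
import random

def pseudorandom_sequence(seed: int, count: int) -> list:
    """Produce `count` byte-sized values from a generator initialized with `seed`."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(count)]

first_run = pseudorandom_sequence(42, 8)
second_run = pseudorandom_sequence(42, 8)   # same seed: identical sequence
other_seed = pseudorandom_sequence(43, 8)   # different seed: different sequence
```

Anyone who learns the algorithm and the seed can regenerate the whole sequence, which is why the seed of a keystream generator has to be protected like a key.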

**Entropy** is a measure of the randomness of a system.
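
One common way to put a number on this, assumed here because the text does not name a specific measure, is Shannon entropy: $H = -\sum_i p_i \log_2 p_i$, where $p_i$ is the observed frequency of each symbol. A minimal sketch for byte strings:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Average bits of information per byte, from observed byte frequencies."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())

constant = shannon_entropy(b"aaaaaaaa")          # no uncertainty at all
two_symbols = shannon_entropy(b"abababab")       # one bit per byte
all_bytes = shannon_entropy(bytes(range(256)))   # the maximum for bytes: 8 bits
```

A high score on a sample does not prove a source is unpredictable, since a deterministic generator can also produce evenly distributed output.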

### Salting

```
Explain the basic hashing algorithms and the role of salting in hashing. Hashing algorithms treat all input plaintext as if it is a series of numbers and use techniques such as modulo arithmetic to transform potentially large, variable-length inputs into fixed-length hash values. When the function is chosen correctly, the change of a single bit in the input will produce a significantly different hash value. This provides a fast way to demonstrate that two sets of input (two files, for example) are either bit-for-bit identical or they are not. It should not be possible to take a hash value and reverse-calculate what the input plaintext was that produced it. To improve the strength of a hash function, a large random number is added to the input plaintext as additional bytes of input. This makes it much harder for brute force attacks to attempt to break a hash value back to its original plaintext.
```
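
The claim that a one-bit change in the input produces a very different hash value can be checked with a short sketch. SHA-256 is an assumed example algorithm, and the two messages are made up for illustration; they differ only in their final character (`0` is `0x30`, `1` is `0x31`), which is a single-bit difference.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"pay Alice 100"
msg2 = b"pay Alice 101"

input_difference = differing_bits(msg1, msg2)
digest_difference = differing_bits(
    hashlib.sha256(msg1).digest(),
    hashlib.sha256(msg2).digest(),
)
```

For a well-behaved hash function, roughly half of the 256 output bits are expected to flip, so the two digests look unrelated even though the inputs are nearly identical.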
### Symmetric/Asymmetric Encryption/Elliptic Curve Cryptography (ECC)

```
Explain the important differences between symmetric and asymmetric encryption algorithms. Symmetric encryption uses the same key (or a simple transform of it) for encryption and decryption. The underlying mathematical operations are ones that can run in reverse so that the ciphertext can be decrypted back to the form and content of the original plaintext. Once compromised, this key can be used to decrypt all previously encrypted ciphertext—there is no forward privacy or secrecy. Asymmetric encryption uses a very different mathematical construct to encrypt than it does for decrypting; it is required that there be no computationally feasible or doable way to take ciphertext and solve for the original plaintext without having both the corresponding decryption algorithm and the decryption key. There should also be no way to mathematically derive the decryption key from the encryption key. Asymmetric encryption, when implemented with computationally difficult algorithms using very large numbers as factors and keys, provides inherently better security than symmetric encryption can, given the same size keys. It can also provide forward secrecy (protect previously encrypted ciphertext from being decrypted) when keys are changed or compromised. Asymmetric encryption and decryption are compute-intensive, using a lot of processing time, whereas symmetric encryption can be built to run very fast in hardware, software, or both. Thus most public key infrastructures use asymmetric encryption while establishing a session key and then use symmetric encryption, using that session key for the bulk of the session’s communication.
```

Symmetric encryption algorithms have the greatest challenges with key management and key distribution. Symmetric encryption not only uses the same key (or a simple transform of that key) for encryption and decryption; it also provides no forward secrecy - which means that when a key is compromised, that compromised key can always be used to decrypt any ciphertext that was produced with that key.

As you'll see later, asymmetric encryption still uses keys; those keys still must be protected. And even though you publish your public key (when using hybrid encryption systems and the public key infrastructure for key exchange), your private key still represents the single most important secret that you must keep.

Any cryptographic system has to deal with key **revocation** - informing all users that a particular key is no longer valid and that it should not continue to be used.

**Zeroization** is the process by which cryptologic systems are cleared of all keying materials, plaintext, ciphertext, control parameters, and sometimes even their software and firmware. This process serves two main purposes: it restores the device to a clean initial state, and it removes any information that might possibly be used to break the encryption scheme, decrypt previously encrypted messages, or derive the encryption key to use for later decryption of subsequent messages.

#### Symmetric Key Cryptography

Symmetric key cryptography uses the same key to encrypt and decrypt the data being exchanged or protected. The algorithms for symmetric key cryptography typically run very fast - this type is suitable for encrypting high data rate streaming services, for example, or for protecting very large databases at rest or in motion.

Key distribution and key management are the Achilles’ heel of symmetric key encryption strategies. Every sender-receiver pair needs to exchange keys, which means that for $n$ users in a key exchange system you have on the order of $n^2$ keys (one per pair, or $n(n-1)/2$) to manage, and to update when you have to retire one key and replace it with another. With so many keys in motion, it becomes probable that keys may be intercepted and surreptitiously copied in transit, storage, or use. Brute force or other computational techniques can defeat these encryption schemes given sufficient computing resources.
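
The growth in key count is simple to work out: counting one shared key per unordered pair of users gives $n(n-1)/2$ keys, which grows on the order of $n^2$. A minimal sketch (the function name is hypothetical):

```python
def pairwise_keys(users: int) -> int:
    """Shared keys needed so that every pair of users has its own key."""
    return users * (users - 1) // 2

for n in (2, 10, 100, 1000):
    print(f"{n:>5} users need {pairwise_keys(n):>7} pairwise keys")
```

Going from 10 users to 1000 users multiplies the user count by 100 but the key count by roughly 10,000, which is the scaling problem that public key methods were designed to relieve.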

#### Asymmetric Key (or Public Key) Cryptography

The asymmetric key (or public key) cryptographic set of algorithms and systems uses one key for encrypting the plaintext, and a very different key for decrypting the resultant ciphertext back to useful plaintext. This typically means that very different algorithms are used to encrypt and decrypt. The strength of asymmetric key cryptography rests on the assertion that it is computationally infeasible to use the encryption key to calculate the decryption key or to use the decryption key to calculate the encryption key, even if the details of the algorithms are known. The asymmetric encryption algorithms are often called **trapdoor functions**, in that you can fall down through an open trapdoor, but you cannot fall back up through it.

Public key distribution systems rely on the near impossibility of computing one of a pair of keys given the other. This lets users publish (or make publicly available) one key (the public key) while keeping the corresponding key secret and protected (or private). 
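
A key pair of this kind can be illustrated with textbook RSA on deliberately tiny numbers. This is a toy under loud assumptions: the primes 61 and 53 are far too small to be secure, there is no padding, and the function names are hypothetical. Real systems use vetted libraries and keys thousands of bits long.

```python
# Toy RSA with tiny primes (illustration only, trivially breakable).
p, q = 61, 53
n = p * q                      # 3233, part of both keys
phi = (p - 1) * (q - 1)        # 3120, must stay secret
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)            # safe to publish
private_key = (d, n)           # must be protected

def rsa_encrypt(message: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)

def rsa_decrypt(ciphertext: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)

ciphertext = rsa_encrypt(65, public_key)         # 2790
plaintext = rsa_decrypt(ciphertext, private_key)  # 65
```

Here deriving `d` from the public key is easy only because `n` can be factored by hand; with a modulus of realistic size, that factoring step is what is believed to be computationally infeasible.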

#### Hybrid Cryptosystems

Hybrid cryptosystems use multiple approaches to encrypt and decrypt the plaintext they are protecting. The most common hybrid systems are ones that combine asymmetric and symmetric algorithms using:

* **Key encapsulation** processes, which are typically built with public key infrastructures (PKIs) to handle key exchange.
* **Data (or payload) encapsulation** processes, which use more runtime-efficient symmetric key algorithms.
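
The two encapsulation steps can be sketched together. Everything in this example is a toy and an assumption made for illustration: the asymmetric part is textbook RSA with tiny primes, the symmetric part is an exclusive-or against a SHA-256-derived keystream, and all names are hypothetical. It shows only the shape of a hybrid scheme, not a usable one.

```python
import hashlib
import secrets

# Toy RSA key pair (tiny primes, illustration only).
P, Q, E = 61, 53, 17
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))

def _keystream(session_key: int, length: int) -> bytes:
    """Expand the session key into `length` bytes (toy construction)."""
    out, counter = bytearray(), 0
    while len(out) < length:
        out.extend(hashlib.sha256(f"{session_key}:{counter}".encode()).digest())
        counter += 1
    return bytes(out[:length])

def _xor(data: bytes, session_key: int) -> bytes:
    return bytes(a ^ b for a, b in zip(data, _keystream(session_key, len(data))))

def hybrid_encrypt(plaintext: bytes) -> tuple:
    session_key = secrets.randbelow(N - 2) + 2     # fresh symmetric key per message
    wrapped_key = pow(session_key, E, N)           # key encapsulation (asymmetric)
    return wrapped_key, _xor(plaintext, session_key)  # data encapsulation (symmetric)

def hybrid_decrypt(wrapped_key: int, ciphertext: bytes) -> bytes:
    session_key = pow(wrapped_key, D, N)           # only the private key holder can unwrap
    return _xor(ciphertext, session_key)

wrapped, payload = hybrid_encrypt(b"bulk data goes through the fast symmetric path")
recovered = hybrid_decrypt(wrapped, payload)
```

The slow asymmetric operation is applied once, to a short session key, while the fast symmetric operation handles the payload regardless of its size.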



### Non-Repudiation (e.g., Digital Signatures/Certificates, HMAC, Audit Trail)

```
Know how to use cryptography to provide nonrepudiation. Digitally signing documents, files, or emails makes it exceptionally difficult for a sender to claim that the file the recipient has is not the file that they sent or to deny sending it at all. Using digital signatures to prove receipt and use of files by the addressee or recipient, however, requires some form of digitally signed receipt process, which most email systems cannot support. However, add-on systems for email do provide this, and EU standards have been supporting their adoption and use as part of secure e-commerce. Some national postal systems and a growing number of Internet service providers now make such capabilities available to users.
```
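
The section heading lists HMAC, which can be sketched with Python's standard `hmac` module. The key and messages below are made up for illustration, and the helper names are hypothetical. Note one limitation: an HMAC uses a key shared by both parties, so it shows that the message is intact and came from someone holding the key, but it cannot by itself prove which of the key holders produced it. Full nonrepudiation needs a signature made with a private key that only the signer holds.

```python
import hashlib
import hmac

def make_tag(message: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message under a shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(message: bytes, key: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(message, key), tag)

shared_key = b"example shared key"
tag = make_tag(b"transfer 100 to account 42", shared_key)

genuine = verify_tag(b"transfer 100 to account 42", shared_key, tag)
tampered = verify_tag(b"transfer 900 to account 42", shared_key, tag)
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading characters of a forged tag were correct.
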
### Encryption Algorithms (e.g., AES, RSA)

```
Explain the difference between character, block, and stream ciphers. Character ciphers encrypt and decrypt each single character or symbol in the input plaintext, such as is done by a simple alphabetic substitution cipher; the encryption key is used to encrypt (and decrypt) each character. Block ciphers encrypt and decrypt fixed-length groups (blocks) of symbols or bytes from the input plaintext, using transposition, substitution, or both; block ciphers may also transpose blocks, and multistage block encryption can do that at any stage in the process. The keys for block ciphers are applied to each block for encryption and decryption. Stream ciphers treat the input plaintext and the key as if they were continuous streams of symbols, and they use one element of the key to encrypt one element of the plaintext. Stream ciphers must use a key whose length is longer than the input plaintext and is random across that length to prevent attacks against the ciphertext.
```

```
Understand how encryption strength depends on the size of keys and other parameters. The simplest way to break an encryption system is to capture some ciphertext outputs from it, and using its known or assumed decryption algorithm, try every possible key and see if a presumed cleartext output is a meaningful message. Since even pure binary cleartext files (executable programs, for example) contain a lot of error checking and parity information, if a presumed cleartext output is error free, it probably is meaningful and might even be what the attacker is looking for. Key length determines how many possible keys must be tried—keys of 8-bit length require trying only 256 possible keys, for example. The larger the key, the larger the search space of possible keys. Using large, random salt or seed values as part of the encryption and decryption effectively enlarges that search space again. If the encryption and decryption algorithms depend on numbers, such as integer factors or exponents, the larger these values, again, the larger the search space.
```
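
The try-every-key attack described above can be sketched against a deliberately weak cipher. The cipher (exclusive-or with a single byte, giving an 8-bit key), the message, and the "looks meaningful" test are all assumptions made for this example.

```python
def xor_single_byte(data: bytes, key: int) -> bytes:
    """Toy cipher with an 8-bit key: XOR every byte with the same key byte."""
    return bytes(b ^ key for b in data)

def looks_meaningful(candidate: bytes) -> bool:
    """Crude plausibility test: only ASCII letters and spaces."""
    return all(chr(b).isalpha() or b == 0x20 for b in candidate) and candidate.isascii()

def brute_force(ciphertext: bytes) -> list:
    """Try all 256 keys and keep the candidates that pass the plausibility test."""
    hits = []
    for key in range(256):
        candidate = xor_single_byte(ciphertext, key)
        if looks_meaningful(candidate):
            hits.append((key, candidate))
    return hits

secret_key = 0x5A
intercepted = xor_single_byte(b"MEET AT NOON", secret_key)
hits = brute_force(intercepted)
```

With only 256 keys the search finishes instantly; the same loop over a 256-bit key space would never finish, which is the practical meaning of key strength.
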
### Key Strength (e.g., 256, 512, 1024, 2048 Bit Keys)
### Cryptographic Attacks, Cryptanalysis, & Counter Measures

White hat cryptanalysis can help pinpoint weaknesses in key generation, key management and distribution, or even in the algorithms themselves. This might lead us to redesign these systems and processes or to provide other processes to reduce the risk of harm if we cannot affordably strengthen our cryptosystems. 

Black hats may use many of the same tools and techniques and read many of the same technical journals, wiki pages, and books that the white hats depend on as they try to find and exploit vulnerabilities in our cryptosystems and their use.

## Understand the Reasons & Requirements for Cryptography

```
Understand the reasons for using cryptography as part of a secure information system. Unique identification of users, processes, files, or other information assets is a fundamental cornerstone of building any secure information system. Cryptographic techniques, from hashes through digital signatures and to encryption and decryption of data at rest, in motion, and in use, can provide a wide range of confidentiality, integrity, authentication, nonrepudiation, and availability benefits to systems designers. Modern cryptographic systems provide a wide range of choices, which allows systems builders to achieve the protection they need for costs (in money, time, effort, runtime resources, and operational complexity) commensurate with the risk.
```

```
Explain why cryptography does not answer all information security needs. Most information systems security incidents occur because of flaws in business process design, implementation, and use; this includes the training, education, and proficiency of the human users and other workers within the organization as much as it includes the IT systems and components. Cryptography can strengthen access control, enhance the integrity and confidentiality of information, and add nonrepudiation as well—but it cannot prevent the unanticipated. Cryptography helps implement hierarchies of trust, but these are reliable only insofar as the human or supply chain aspects of those hierarchies are as trustworthy as is required.
```

```
Explain the major vulnerabilities in various cryptographic systems and processes. The encryption and decryption keys are the most critical elements of any cryptographic system, be it symmetric, asymmetric, or hybrid, paper or electronic. If the keys cannot be protected, then all is lost. Keys can be stolen. Algorithmic weaknesses can be discovered and exploited to enable partial or complete attacks on ciphertext. Physical characteristics, such as mechanical or electrical noise, timing, stray emanations, or data remaining after part or all of an encryption operation, can be accessed, analyzed, and used to identify exploitable weaknesses.
```

### Confidentiality

Suitably encrypting cleartext information makes it difficult for unauthorized readers to view, understand, or use the meaning contained in that plaintext. Encrypting information provides for its confidentiality at rest or in motion. If the information must be decrypted for use, other means must be employed to protect the information where and when it is in use.

### Integrity & Authenticity

Any request by a subject for access to or use of an information asset needs to be authenticated. We must be able to prove that the subject is who (or what) they claim to be, and then compare that to our controlled and protected lists or rosters of capabilities and privileges. In almost all circumstances, doing so requires the subject to send credentialing information of some kind to our systems; while in transit, that information can be intercepted for later reuse by an otherwise unauthorized subject. It can also be altered while in transit. Credentialing information is also stored (in some form) by subjects and by our authentication systems; encrypting that stored information provides protection at rest.

We don't have to decrypt the credentials in order to validate that they are correct. If our authentication system stores only the encrypted (ciphertext) versions of the credentials, then a simple comparison of the ciphertext sent by the subject to the ciphertext kept on file validates or invalidates the identity of the subject. This use of digital signatures in their ciphertext form provides information protection while in use.
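
Comparing protected values instead of decrypting them can be sketched as follows. One common way to realize this, assumed here rather than taken from the text, is to store a salted one-way hash of the credential. The sketch uses `hashlib.pbkdf2_hmac` from the standard library; the iteration count and helper names are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor for this example

def enroll(password: str) -> tuple:
    """Create the (salt, derived value) pair the authentication system keeps on file."""
    salt = secrets.token_bytes(16)
    stored = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, stored

def check(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-derive from the submitted password and compare; nothing is ever decrypted."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)

salt, stored = enroll("s3cret passphrase")
accepted = check("s3cret passphrase", salt, stored)
rejected = check("wrong guess", salt, stored)
```

If the stored table is stolen, the attacker gets salts and derived values rather than the credentials themselves.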

Every communications or information storage technology is subject to error, and yet every purpose for which we use communication and information requires that information to be as error-free as possible. This fact has led to developing error detection and correction techniques - adding a parity bit to each byte or calculating a checksum digit for a block of symbols, for example. As data blocks (or messages or files) get larger and larger, error correction code (ECC) must become more complex if it is to comprehensively provide information integrity assurance. ECC can identify where an error in the associated data has occurred - which bit got flipped from a $1$ to a $0$, or which symbol was changed into another symbol - and then show us what the correct bit or symbol ought to be. 

ECC works by having the sender calculate the ECC ciphertext value of the message, transmitting it along with the message content (in plaintext). The receiver calculates their own ECC ciphertext value, using the same agreed-to protocol or algorithm for that ECC process, and compares it to the ECC sent with the message. Differences in sent and received ECC can then be used to find and fix the error (often by notifying the sender to resend the block).
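
A small code that can both locate and fix a single flipped bit is the Hamming(7,4) code, chosen here as an example because the text does not name a specific ECC. It adds three parity bits to four data bits; the pattern of failed parity checks (the syndrome) spells out the position of the bad bit. The function names are hypothetical.

```python
def hamming74_encode(data: list) -> list:
    """Encode 4 data bits as 7 bits laid out as [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(code: list) -> tuple:
    """Return (corrected codeword, 1-based error position or 0 if none found)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4
    if position:
        c[position - 1] ^= 1   # flip the bit the syndrome points at
    return c, position

def hamming74_decode(code: list) -> list:
    """Extract the 4 data bits from a (corrected) codeword."""
    return [code[2], code[4], code[5], code[6]]

sent = hamming74_encode([1, 0, 1, 1])    # [0, 1, 1, 0, 0, 1, 1]
received = list(sent)
received[4] ^= 1                         # simulate a single-bit error in transit
fixed, where = hamming74_correct(received)
```

This code corrects exactly one flipped bit per 7-bit block; two or more errors in the same block are miscorrected, which is why larger messages need stronger codes or a resend.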

We use different names to refer to this use of cryptography to protect (or validate) the integrity of information, whether that information is at rest, in transit, or in use:

* **Hashing** is the general process of using an algorithm to compute a smaller, unique value that represents the contents of the plaintext in some way. This hash value can have many uses, depending on our needs. Database systems, for example, often need to take very long strings of text (such as personal names) and map or convert them to a logical record number in a file.


* A **digital signature** asserts that the file or message it is associated with is in fact what its name or circumstances claim it to be. Digital signatures attest to the integrity of software distribution files, for example. Digital signatures can be generated using hash algorithms or more complex encryption techniques; recipients then use the same agreed-to algorithms to validate that the signature and the file agree with each other.

Nonrepudiation requires that:

* The identities of all parties have been authenticated.


* All parties have proven that they have the authority or privilege to participate in the transaction.


* The terms and conditions of the transaction exist in a form that can be recorded.


* All of this information can be collectively or separately verified and validated to be true and correct, free from any attempts to tamper with or alter it.

We assess the availability of an information system (in security terms) at two levels:

* Is the system itself, and the services it provides, available and ready to perform when subjects (users or processes at their behest) request objects or other services?


* Is the information needed by the user or requesting subject available when needed, and can it be completely and correctly output, displayed, or provided to that user or subject?

Cryptography supports both of these functional needs by providing for stronger authentication and information integrity control systems. Cryptography directly contributes to making the requested information available where it is needed, when it is needed, without compromise or loss of integrity. This offers protection for information at rest and in motion.

Cryptography also contributes to overall systems availability, typically as a component of strong access controls. It helps prevent or limit resource exhaustion (as in a denial-of-service attack) and can protect key systems functions by making it much harder for unauthorized subjects to perform disruptive actions.

### Data Sensitivity (e.g., PII, Intellectual Property, PHI)
### Regulatory
```
Know the regulatory and legal considerations for using cryptography in private business. Private businesses, in almost all jurisdictions, are subject to a variety of legal, government, and financial and insurance regulations regarding their safekeeping of information; these requirements are best summarized as CIANA, or confidentiality, integrity, availability, nonrepudiation, and authentication. Taken together, these should establish high-level, strategic needs for information security processes and systems, including cryptographic systems where applicable, for that business. Failing to do so puts customers, employees, owners, and the business at risk.
```

## Understand & Support Secure Protocols
### Services & Protocols (e.g., IPSec, TLS, S/MIME, DKIM)
### Common Use Cases
### Limitations & Vulnerabilities



## Understand Public Key Infrastructure (PKI) Systems

Three main factors separate the modern from the classical era of cryptography. The first is the switch from lexical analysis as the focus of cryptography to computationally hard problems - problems that are fairly easy to compute in one direction (given an $x$, find the corresponding $y$), but very difficult if not impossible to do in the reverse (given that $y$, find the $x$ that would generate it). The second is the near-simultaneous development, in the United States and United Kingdom, of what have been called **public key exchange protocols**. The third and perhaps most significant factor has been the explosive growth in the population of cryptographers.
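
A minimal sketch of the "easy one way, hard the other" idea, using modular exponentiation as the forward direction. The modulus `P`, the base `G`, and the function names are toy assumptions chosen for illustration; real systems use numbers hundreds of digits long, where the exhaustive search below becomes infeasible.

```python
# Toy parameters (assumptions for illustration only).
P = 10007   # a small prime modulus
G = 5       # the base being raised to a power

def forward(x: int) -> int:
    """Easy direction: given x, compute y = G**x mod P."""
    return pow(G, x, P)

def reverse(y: int):
    """Hard direction: given y, search for an x with G**x mod P == y.

    This tries exponents one at a time, so its cost grows with P,
    while forward() stays fast even for enormous P.
    """
    value = 1
    for x in range(P - 1):
        if value == y:
            return x
        value = (value * G) % P
    return None

y = forward(1234)
x = reverse(y)
print(x is not None and forward(x) == y)
```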

These factors led to the widespread adoption of hybrid approaches to cryptography, which are what make **public key encryption**, **public key infrastructures**, and our modern e-commerce world possible.

```
Explain how public key infrastructures (PKIs) are used. Public key infrastructures provide two important benefits. First, by providing a secure means to generate, distribute, authenticate, and use public and private encryption keys, PKI has made widespread use of cryptographic protection a fundamental part of business, personal, and government use of the Web and the Internet. Second, by providing a scalable, decentralized capability to digitally sign documents, files, email, or other content, PKI provides not only enhanced confidentiality and integrity of information, but also nonrepudiation protection. It also strengthens authentication mechanisms. The net effect is that PKI makes secure, reliable information more available when it is needed, where it is needed.
```

### Fundamental Key Management Concepts (e.g., Key Rotation, Key Composition, Key Creation, Exchange, Revocation, Escrow)

```
Explain what key management is, what different approaches can be used, and the issues with key management. Key management is the process of creating encryption and decryption keys and then issuing, distributing, or sending them to users of the cryptographic system in question. The cryptographic keys are the fundamental secret that must be protected; all else, from systems design and usage through its fundamental algorithms, is known or will be easily known by one’s adversaries. Keys must be distributed in ways that prevent loss or disclosure, and they need to be destroyed or zeroized if users leave the network, if keys are partially compromised, or as a routine security measure. Keys can be distributed as physical documents or in electronic message format; both are subject to compromise, corruption, and loss, and typically such key systems (if based on symmetric algorithms) cannot self-authenticate a sender or recipient. Public key infrastructures do not actually distribute keys; rather, they provide for sender and recipient to co-generate a unique, private session key, which is used only for that session’s communication; this requires that asymmetric (public and private) keys have been generated for each user, typically authenticated by certificates.
```

#### Diffie-Hellman-Merkle Public Key Exchange

Key exchange is not about exchanging secret information between the parties; rather, it is about "creating" a shared key to use for subsequent encrypted sharing of secrets. Furthermore, it's important to realize that the "public" part of public key exchange is that you can quite literally publish parts of that key exchange without compromising the security of the encryption it supports. Whitfield Diffie and Martin Hellman first showed that public key exchange requires the use of what they called trapdoor functions - a class of mathematical problems that are easy to do in one direction (like falling through a trapdoor in the floor) but extremely difficult if not impossible to do in the other direction.

* Classical cryptographic systems depend upon **key distribution** systems to ensure that all known, authenticated, and trustworthy parties on the system have current encryption keys. Key distribution is the passing of secret information - the keys - from the key originator and controller to the parties who will use it.


* **Key exchange** systems start with the presumption that parties do not know each other, and have no a priori reason to trust each other. They achieve this trust, and therefore can share in a secure, encrypted conversation, by generating their session key together, and keeping that session key secret to themselves.

In both cases, the underlying **key infrastructure** is the collection of systems, communications pathways, protocols, algorithms, and processes (people-facing or built into software and hardware) that make key distribution or exchange work effectively and reliably.
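
To make the "generate the session key together" idea concrete, here is a toy Diffie-Hellman exchange. It is a sketch under loud assumptions: the modulus `P` (the Mersenne prime 2^127 - 1) and base `G = 3` are picked only so the example runs with the standard library, and they are far too small and unvetted for real use. The function names are hypothetical.

```python
import secrets

# Toy public parameters (assumptions); both parties and any eavesdropper know them.
P = 2**127 - 1   # a known Mersenne prime, much too small for real security
G = 3

def make_keypair():
    """Pick a private exponent and derive the value that may be published."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def shared_secret(my_private: int, their_public: int) -> int:
    """Combine my private exponent with the other party's published value."""
    return pow(their_public, my_private, P)

alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Only the public values cross the network; the private exponents never do.
alice_key = shared_secret(alice_private, bob_public)
bob_key = shared_secret(bob_private, alice_public)
print(alice_key == bob_key)
```

Note that this bare exchange does not authenticate either party; that is the gap certificates are meant to fill.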

#### RSA Encryption & Key Exchange

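Like Diffie-Hellman, RSA uses the properties of modulo arithmetic applied to exponentiation of very large integers. In RSA, however, the modulus is not itself prime; it is the product of two very large prime numbers, and the difficulty of factoring that product is what protects the private key.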
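
A worked toy example of the RSA arithmetic, using deliberately tiny primes so the numbers can be checked by hand. The values `p = 61`, `q = 53`, and `e = 17` are illustrative assumptions; this is unpadded "textbook" RSA and is not secure as written.

```python
# Toy key generation (tiny primes, for illustration only).
p, q = 61, 53
n = p * q                    # public modulus: 3233
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, coprime to phi
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+): 2753

def encrypt(m: int) -> int:
    """Encrypt an integer message m < n with the public key (e, n)."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Decrypt with the private key (d, n)."""
    return pow(c, d, n)

message = 65
ciphertext = encrypt(message)
print(decrypt(ciphertext) == message)
```

Anyone who can factor `n` back into `p` and `q` can recompute `d`, which is why real keys use primes that are hundreds of digits long.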

#### ElGamal Encryption

ElGamal provides for asymmetric encryption of keys previously used in symmetric encryption schemes. ElGamal also proposed a digital signature mechanism that allows third parties to confirm the authenticity of a message signed with it.

Some hybrid encryption systems use ElGamal to encrypt the symmetric keys used to encrypt message content. It is vulnerable to the chosen-ciphertext attack, in which the attacker somehow tricks or spoofs a legitimate user (an oracle) into decrypting an arbitrary message block and then sharing those results with the attacker. (Variations on this kind of attack were first known as **lunchtime attacks**, since the user's machine was assumed to be available while they were at lunch.) Implementations of ElGamal can add padding and other means to limit this vulnerability.
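
The hybrid use described above, and the reason unpadded ElGamal is exposed to ciphertext manipulation, can be sketched as follows. The parameters `P` and `G` and all function names are toy assumptions; the last lines show that anyone can alter the ciphertext so that it decrypts to a predictable multiple of the original value.

```python
import secrets

# Toy public parameters (assumptions); far too small for real use.
P = 2**127 - 1
G = 3

def keygen():
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)

def encrypt(public: int, m: int):
    """Encrypt integer m (0 <= m < P) under the recipient's public value."""
    k = secrets.randbelow(P - 3) + 2          # fresh secret for every message
    c1 = pow(G, k, P)
    c2 = (m * pow(public, k, P)) % P
    return c1, c2

def decrypt(private: int, c1: int, c2: int) -> int:
    s = pow(c1, private, P)                   # shared mask
    return (c2 * pow(s, -1, P)) % P           # remove the mask

private, public = keygen()
# A 15-byte symmetric key fits below the 127-bit modulus.
session_key = int.from_bytes(secrets.token_bytes(15), "big")
c1, c2 = encrypt(public, session_key)
recovered = decrypt(private, c1, c2)

# Malleability: doubling c2 doubles the decrypted value, without knowing any key.
tampered = decrypt(private, c1, (2 * c2) % P)
print(recovered == session_key, tampered == (2 * session_key) % P)
```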

#### Digital Signature

```
Explain how cryptography is used to support digital signatures and what benefits you gain from using digital signatures. Asymmetric keys provide a way to digitally sign a file, an email, or a document. Typically this involves calculating a cryptographic hash of the input file, and combining it with the originator’s private key via a decryption process; the result is called the sender’s digital signature of that file or document. Recipients use the matching encryption process on that digital signature, using the sender’s public key, to produce a received hash value, while also locally computing a hash of the received file. If these match, then the sender’s identity has been validated. Digitally signing files assures recipients that software updates, transaction files, or important documents have not been altered in storage or transmission. This provides enhanced data integrity and nonrepudiation and can do so across space (sender to recipient) and across time (validating that files placed in storage have not been corrupted between the time they were created and the time they are retrieved for use, be that milliseconds or months).
```

Suppose our friend Carol wishes to send a message to Bob, but in doing so, she needs to prove to Bob that the message is inarguably from her and not from some imposter:

1. Carol produces a strong hash of the message content.
2. Carol decrypts that hash value, using the trapdoor function and her private key. This new value is her digital signature.
3. Carol sends the message and her digital signature to Bob.
4. Bob encrypts Carol's digital signature, using the same trapdoor algorithm and Carol's public key, to produce the signed hash value.
5. Bob uses the same hash function to produce a comparison hash of the message he received (not including the signature). If this matches the value he computed in step 4, he has proven that Carol (who is the only one who knows her private key) is the only one who could have sent that message.
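
The five steps above can be sketched with textbook RSA and SHA-256. Everything specific here is an assumption for illustration: the two Mersenne primes, the exponent 65537, the reduction of the hash modulo `N`, and the function names. Production systems use vetted libraries and a padded scheme such as RSA-PSS instead.

```python
import hashlib

# Carol's toy key pair (assumed values; far too small for real use).
P, Q = 2**127 - 1, 2**89 - 1           # two known Mersenne primes
N = P * Q                              # public modulus
E = 65537                              # Carol's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # Carol's private exponent

def message_hash(message: bytes) -> int:
    """Steps 1 and 5: hash the message, reduced to fit below N (toy shortcut)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Step 2: apply the private-key operation to the hash."""
    return pow(message_hash(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Steps 4 and 5: apply the public-key operation and compare hashes."""
    return pow(signature, E, N) == message_hash(message)

message = b"Bob, the meeting is at noon. -- Carol"
signature = sign(message)              # step 3: Carol sends both to Bob

print(verify(message, signature))
print(verify(b"Bob, the meeting is at nine. -- Carol", signature))
```

The second check fails because the altered message hashes to a different value than the one locked inside Carol's signature.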

HTTPS means "use HTTP over secure sockets," which originally meant "over SSL" and now means "over TLS."

### Web of Trust (WOT) (e.g., PGP, GPG)

```
Explain the difference between hierarchies of trust and webs of trust. Both concepts strive to establish associations or logical networks of entities. The topmost node of such a network, its trust anchor, confers trust upon intermediaries, which can then assert their trust to end (leaf) nodes. In hierarchies of trust, certificate authorities are the trusted anchors, which can issue certificates to intermediaries, which can issue certificates to the leaf nodes. End users, seeking to validate the trustworthiness of a certificate, infer that a certificate from a trusted end (leaf) node is trustworthy if the intermediary that issued it is, on up to the anchor. Webs of trust, by contrast, involve peer-to-peer trust relationships that do not rely on central certificate authorities as the anchors. Hierarchies of trust are much more scalable (to billions of certificates in use) than webs of trust. Both systems have drawbacks and issues, particularly with respect to certificate revocation, expiration, or the failure of a node to maintain trustworthiness.
```

In information and communications systems terms, the foremost token of trust is a **certificate** that asserts that the identity of the certificate holder and the public key associated with that certificate are linked or bound with each other. This gives rise to two different concepts of how trust conferred by one node upon another can be scaled up to larger numbers of nodes:

* 