# Chapter 3: Cryptography Essentials

In Chapter 2, we secured the pathways—the networks and systems that transport and process our data. But securing the pipes is insufficient if the message itself remains plaintext. When data traverses untrusted networks, rests on potentially compromised servers, or faces insider threats, we need a mechanism that renders stolen information useless to the thief. This mechanism is **cryptography**.

Cryptography is the mathematical foundation of cybersecurity. It transforms readable data (plaintext) into unintelligible formats (ciphertext) and provides mechanisms to verify authenticity and integrity. However, cryptography is also a domain where subtle errors—an incorrect mode of operation, a reused nonce, or a weak random number generator—can catastrophically undermine an otherwise secure system.

In this chapter, we will explore the cryptographic primitives that power modern security: symmetric and asymmetric encryption, hashing, and digital signatures. We will learn not just *how* to use algorithms like AES and RSA, but *when* to use them, how to manage the keys that protect our data, and how to implement TLS correctly to secure communications. We will also examine the common cryptographic failures that plague applications today (OWASP A04) and how to avoid them.

---

## 3.1 Core Concepts: Symmetric vs. Asymmetric Encryption, Hashing, and Digital Signatures

Cryptography is not a monolithic tool; it is a toolkit. Each tool serves a specific purpose, and understanding their distinct roles is fundamental to secure design.

### 3.1.1 Symmetric Encryption: Shared Secrets
**Symmetric encryption** uses a single key for both encryption and decryption. It is fast and efficient, making it ideal for encrypting large volumes of data (bulk encryption).

**Key Characteristics:**
*   **Speed:** Extremely fast (hardware-accelerated AES can encrypt gigabytes per second).
*   **Key Distribution Problem:** Both parties must possess the same secret key. Securely sharing this key over an insecure channel is the primary challenge.
*   **Use Cases:** Encrypting databases, file systems, TLS session data, and message payloads.

**Core Primitives:**
*   **Block Ciphers:** Encrypt fixed-size blocks of data (e.g., AES uses 128-bit blocks).
*   **Stream Ciphers:** Encrypt data bit-by-bit or byte-by-byte (e.g., ChaCha20).

**Important Warning: Never Roll Your Own Crypto**
Implementing encryption algorithms from scratch is dangerous. Side-channel attacks, timing attacks, and subtle mathematical flaws can compromise custom implementations. Always use vetted libraries like OpenSSL, libsodium, or the Python `cryptography` module.

### 3.1.2 Asymmetric Encryption (Public-Key Cryptography): Two Keys
**Asymmetric encryption** uses a mathematically related pair of keys:
*   **Public Key:** Can be shared openly. Used to encrypt data or verify signatures.
*   **Private Key:** Must be kept secret. Used to decrypt data or create signatures.

**Key Characteristics:**
*   **Speed:** Slow (thousands of times slower than symmetric encryption). Not suitable for bulk data.
*   **Key Distribution:** Solves the key distribution problem. You can publish your public key on a billboard.
*   **Use Cases:** Key exchange (establishing symmetric keys), digital signatures, certificate validation.

**The Mathematical Foundation:**
Asymmetric cryptography relies on "trapdoor functions": mathematical operations that are easy to compute in one direction but computationally infeasible to reverse without the private key. RSA relies on the difficulty of factoring the product of two large prime numbers, while Elliptic Curve Cryptography (ECC) relies on the discrete logarithm problem over elliptic curves.

### 3.1.3 Hashing: One-Way Functions
Hashing transforms data of arbitrary size into a fixed-size string (the "digest" or "hash"). It is a **one-way function**: you cannot derive the input from the output.

**Properties of Cryptographic Hash Functions:**
1.  **Deterministic:** The same input always produces the same output.
2.  **Pre-image Resistance:** Given a hash `h`, it should be infeasible to find input `m` such that `hash(m) = h`.
3.  **Collision Resistance:** It should be infeasible to find two different inputs `m1` and `m2` such that `hash(m1) = hash(m2)`.
4.  **Avalanche Effect:** A tiny change in input drastically changes the output.
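
The properties above can be observed directly with the standard library's `hashlib`. The following is a minimal sketch; the two sample messages and the helper names `sha256_hex` and `differing_bits` are illustrative assumptions, not part of any library.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"transfer $100 to alice"
msg2 = b"transfer $900 to alice"  # a single character changed

d1 = hashlib.sha256(msg1).digest()
d2 = hashlib.sha256(msg2).digest()

# Deterministic: hashing the same input twice gives the same digest.
print(sha256_hex(msg1) == sha256_hex(msg1))

# Fixed size: the digest is 32 bytes regardless of input length.
print(len(d1), len(hashlib.sha256(b"x" * 10_000).digest()))

# Avalanche effect: for a good hash, roughly half of the 256 output bits flip.
print(f"{differing_bits(d1, d2)} of 256 bits differ")
```

Pre-image and collision resistance cannot be demonstrated by running code; they are claims about what no known attack can do efficiently.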

**Use Cases:**
*   **Integrity Verification:** Ensuring files haven't been tampered with.
*   **Password Storage:** Storing password hashes instead of plaintext (with salt!).
*   **Digital Signatures:** Signing the hash of a message rather than the message itself (more efficient).

### 3.1.4 Digital Signatures: Authentication and Non-Repudiation
Digital signatures provide three security services:
1.  **Authenticity:** Proves the message came from the claimed sender.
2.  **Integrity:** Proves the message wasn't modified in transit.
3.  **Non-repudiation:** The sender cannot deny they sent the message (unlike MACs, which use symmetric keys).

**How it works:**
1.  Sender creates a hash of the message.
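2.  Sender signs the hash with their **private key**; the result is the signature. With RSA this step is often described as "encrypting" the hash, while ECDSA and EdDSA use a different mathematical construction.
3.  Recipient processes the signature with the sender's **public key** (for RSA, this recovers the signed hash).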
4.  Recipient hashes the received message and compares. If they match, the signature is valid.
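
To make the four steps concrete, here is a deliberately insecure toy version of RSA hash-then-sign written with plain integer arithmetic. Everything in it is an assumption chosen for illustration: the modulus is built from two small Mersenne primes, there is no padding scheme, and the function names are hypothetical. Production code should call a vetted library instead.

```python
import hashlib

# TOY PARAMETERS, for illustration only: two Mersenne primes give a modulus
# of roughly 216 bits, far too small for real use, and no padding is applied.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """Step 1: hash the message and map the digest into the range [0, N)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Step 2: apply the private-key operation to the hash."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Steps 3 and 4: recover the hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(message)

signature = sign(b"pay bob 10 EUR")
valid = verify(b"pay bob 10 EUR", signature)
tampered = verify(b"pay bob 10000 EUR", signature)
print(valid, tampered)
```

Only the holder of `D` can produce a value that verifies under `(E, N)`, which is what gives signatures their non-repudiation property.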

---

## 3.2 Common Algorithms & When to Use Them

Selecting the right algorithm is critical. Outdated algorithms (MD5, SHA-1, DES) have known vulnerabilities and must not be used in production.

### 3.2.1 Symmetric Algorithms

#### **AES (Advanced Encryption Standard)**
The global standard for symmetric encryption. AES is a block cipher with a fixed 128-bit block size and three key sizes: 128, 192, and 256 bits. AES-256 is approved by the NSA for protecting classified information up to TOP SECRET.

**Critical Concept: Modes of Operation**
AES itself just encrypts blocks. **Modes** define how to apply AES to data larger than one block.

*   **ECB (Electronic Codebook):** **NEVER USE.** Identical plaintext blocks produce identical ciphertext blocks. Leaks patterns.
*   **CBC (Cipher Block Chaining):** Older standard. Uses an Initialization Vector (IV) and chains blocks together. Encryption is sequential (only decryption can be parallelized), and it provides no integrity on its own; it is vulnerable to padding oracle attacks if not implemented carefully.
*   **GCM (Galois/Counter Mode):** **Recommended.** Provides authenticated encryption (confidentiality + integrity) simultaneously. Supports parallel processing.

**Code Implementation: AES-256-GCM**
```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend
import os

def encrypt_aes_gcm(plaintext: bytes, key: bytes) -> bytes:
    """
    Encrypts data using AES-256-GCM.
    Returns: nonce (12 bytes) + tag (16 bytes) + ciphertext
    """
    if len(key) != 32:
        raise ValueError("Key must be 32 bytes (256 bits)")
    
    # GCM requires a unique nonce for every encryption with the same key
    # 96 bits (12 bytes) is the recommended size
    nonce = os.urandom(12)
    
    cipher = Cipher(algorithms.AES(key), modes.GCM(nonce), backend=default_backend())
    encryptor = cipher.encryptor()
    
    ciphertext = encryptor.update(plaintext) + encryptor.finalize()
    
    # Return nonce + auth_tag + ciphertext
    return nonce + encryptor.tag + ciphertext

def decrypt_aes_gcm(encrypted_data: bytes, key: bytes) -> bytes:
    """
    Decrypts AES-256-GCM data.
    Format expected: nonce (12) + tag (16) + ciphertext
    """
    if len(key) != 32:
        raise ValueError("Key must be 32 bytes")
    
    nonce = encrypted_data[:12]
    tag = encrypted_data[12:28]
    ciphertext = encrypted_data[28:]
    
    cipher = Cipher(algorithms.AES(key), modes.GCM(nonce, tag), backend=default_backend())
    decryptor = cipher.decryptor()
    
    # finalize() raises cryptography.exceptions.InvalidTag if the data or tag was modified
    return decryptor.update(ciphertext) + decryptor.finalize()

# Usage example
key = os.urandom(32)  # Generate a secure 256-bit key
message = b"Secret API Key: sk-123456789"

encrypted = encrypt_aes_gcm(message, key)
decrypted = decrypt_aes_gcm(encrypted, key)

print(f"Original: {message}")
print(f"Decrypted: {decrypted}")
```

#### **ChaCha20-Poly1305**
An alternative to AES designed by Daniel Bernstein. Often faster than AES on mobile devices without AES hardware acceleration. Used in TLS for mobile clients and in WireGuard VPN.

### 3.2.2 Asymmetric Algorithms

#### **RSA (Rivest–Shamir–Adleman)**
The traditional public-key algorithm. Key sizes of 2048 bits (minimum) or 4096 bits (recommended) are standard.

**Use Cases:** Key exchange, digital signatures (though ECC is replacing it), legacy systems.

**Code Example: RSA Key Generation and Encryption**
```python
from cryptography.hazmat.primitives import serialization, hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Generate RSA key pair (4096 bits recommended for long-term security)
private_key = rsa.generate_private_key(
    public_exponent=65537,  # Standard value
    key_size=4096
)
public_key = private_key.public_key()

# Serialize keys for storage (PEM format)
pem_private = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption()  # In production, encrypt with password!
)

pem_public = public_key.public_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PublicFormat.SubjectPublicKeyInfo
)

# Encryption (using OAEP padding - Optimal Asymmetric Encryption Padding)
# Note: RSA can only encrypt data smaller than key size minus padding overhead
message = b"Symmetric key: 32 bytes of data..."
encrypted = public_key.encrypt(
    message,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)

# Decryption
decrypted = private_key.decrypt(
    encrypted,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)
```

#### **ECC (Elliptic Curve Cryptography)**
Provides equivalent security to RSA with much smaller keys (e.g., 256-bit ECC ≈ 3072-bit RSA). Faster and more efficient.

**Curves:**
*   **P-256, P-384:** NIST curves (widely supported).
*   **Curve25519:** High-security, designed to resist side-channel attacks. Used in modern TLS and SSH.

### 3.2.3 Hashing Algorithms

**SHA-256 (Secure Hash Algorithm)**
Part of the SHA-2 family. Produces a 256-bit (32-byte) hash. The current standard for integrity verification.

**SHA-3**
The newest standard (Keccak algorithm). Different internal structure than SHA-2. Useful for diversity (if SHA-2 is ever broken, SHA-3 likely won't be vulnerable the same way).
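
As a sketch of integrity verification in practice, the snippet below hashes a file in chunks and compares the result with a published digest. The helper names and the temporary demo file are assumptions for illustration; `hashlib.new` accepts both `"sha256"` and `"sha3_256"`.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_published_digest(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the file's digest with the value published by the distributor."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()

# Demo with a throwaway file standing in for a downloaded release.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"release-1.0 contents")
    path = tmp.name

published = file_digest(path)                 # what a vendor would publish
ok_before = matches_published_digest(path, published)

with open(path, "ab") as f:                   # simulate tampering
    f.write(b"!")
ok_after = matches_published_digest(path, published)

sha3_hex_length = len(file_digest(path, "sha3_256"))
os.remove(path)

print(ok_before, ok_after, sha3_hex_length)
```

A plain digest only detects tampering if the published value itself reaches you over a trusted channel; otherwise an attacker can replace both the file and its hash.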

**Code Example: Secure Password Hashing (Argon2)**
Never store passwords with SHA-256 alone (too fast; vulnerable to brute force). Use **Argon2** (winner of the Password Hashing Competition) or **bcrypt**.

```python
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

# Initialize Argon2id (resistant to side-channel and GPU attacks)
ph = PasswordHasher(
    time_cost=3,      # Iterations
    memory_cost=65536, # 64 MB
    parallelism=4,    # Parallel threads
    hash_len=32,
    salt_len=16
)

# Hashing a password
password = "user_password_123"
hash_string = ph.hash(password)
print(f"Stored hash: {hash_string}")

# Verifying
try:
    ph.verify(hash_string, password)
    print("Password valid")
    # Check if rehashing needed (parameters upgraded)
    if ph.check_needs_rehash(hash_string):
        new_hash = ph.hash(password)
        # Update database with new_hash
except VerifyMismatchError:
    print("Invalid password")
```

---

## 3.3 Key Management: Generation, Storage, Rotation, and HSMs

Cryptographic keys are the "crown jewels." If an attacker obtains your encryption key, the ciphertext might as well be plaintext. **Key Management** is often the hardest part of cryptography.

### 3.3.1 Key Generation
Keys must be generated using **Cryptographically Secure Pseudo-Random Number Generators (CSPRNGs)**. Standard random libraries (like `random` in Python) are predictable.

```python
import os
import secrets

# Bad (predictable)
import random
bad_key = random.getrandbits(256)

# Good (CSPRNG)
good_key = os.urandom(32)  # 32 bytes = 256 bits
good_token = secrets.token_urlsafe(32)  # For URL-safe tokens
```
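
To see why `random` is unsuitable, consider this small sketch. It assumes an application that seeds the generator with a guessable value such as a timestamp; the `weak_key` helper and the seed value are hypothetical.

```python
import random
import secrets

def weak_key(seed: int) -> bytes:
    """Derive a 'key' from the Mersenne Twister PRNG. Do NOT do this."""
    rng = random.Random(seed)
    return rng.getrandbits(256).to_bytes(32, "big")

# Anyone who learns or guesses the seed regenerates exactly the same key.
victim_key = weak_key(seed=1700000000)
attacker_key = weak_key(seed=1700000000)
print(victim_key == attacker_key)

# A CSPRNG draws from the operating system's entropy source instead.
k1 = secrets.token_bytes(32)
k2 = secrets.token_bytes(32)
print(k1 == k2)
```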

### 3.3.2 Key Storage: The Hierarchy
**Never hardcode keys in source code.** Use this hierarchy:

1.  **Hardware Security Modules (HSMs):** Physical devices that generate, store, and process keys in tamper-resistant hardware. Keys never leave the HSM. Examples: AWS CloudHSM, Azure Dedicated HSM, Thales Luna.
2.  **Cloud KMS (Key Management Service):** Managed services (AWS KMS, Azure Key Vault, GCP Cloud KMS) that handle key storage and rotation. Depending on the service and tier, key material is protected by provider-operated HSMs or by software.
3.  **Environment Variables:** Acceptable for development, risky for production. Better than hardcoding.
4.  **Configuration Files:** Encrypted at rest (e.g., Ansible Vault, SOPS).
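
For the environment-variable tier, a loader that fails fast on a missing or malformed key avoids silently running with a bad one. This is a minimal sketch: the variable name `APP_ENCRYPTION_KEY` and the base64 encoding convention are assumptions, not a standard.

```python
import base64
import os

ENV_VAR = "APP_ENCRYPTION_KEY"  # hypothetical variable name

def load_key_from_env(name: str = ENV_VAR) -> bytes:
    """Read a base64-encoded 256-bit key from the environment and validate it."""
    encoded = os.environ.get(name)
    if encoded is None:
        raise RuntimeError(f"{name} is not set; refusing to start without a key")
    key = base64.b64decode(encoded, validate=True)
    if len(key) != 32:
        raise ValueError(f"{name} must decode to 32 bytes, got {len(key)}")
    return key

# Demo only: in a real deployment the platform injects this value.
os.environ[ENV_VAR] = base64.b64encode(os.urandom(32)).decode("ascii")
key = load_key_from_env()
print(len(key))
```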

**Code Example: AWS KMS Integration**
```python
import base64

import boto3
from cryptography.fernet import Fernet

kms_client = boto3.client('kms')

# Generate a data key from KMS
# This gives you a plaintext key for local use and an encrypted version to store
response = kms_client.generate_data_key(
    KeyId='alias/my-app-key',
    KeySpec='AES_256'
)

# Plaintext key for encryption
plaintext_key = response['Plaintext']
# Encrypted key to store in database/config
encrypted_key = response['CiphertextBlob']

# Encrypt data locally (fast). Fernet expects a URL-safe base64-encoded 32-byte key.
fernet = Fernet(base64.urlsafe_b64encode(plaintext_key))
encrypted_data = fernet.encrypt(b"Sensitive user data")

# Drop references to the plaintext key as soon as possible.
# Note: this does not securely wipe memory; Python may keep copies until they are collected.
del fernet
plaintext_key = None

# Later, to decrypt:
# 1. Decrypt the data key using KMS
decrypt_response = kms_client.decrypt(CiphertextBlob=encrypted_key)
plaintext_key = decrypt_response['Plaintext']
# 2. Decrypt data locally
fernet = Fernet(base64.urlsafe_b64encode(plaintext_key))
decrypted_data = fernet.decrypt(encrypted_data)

### 3.3.3 Key Rotation
Regularly rotating keys limits the impact of a compromise.

**Strategies:**
*   **Time-based:** Rotate every 90 days (compliance requirement in some sectors).
*   **Event-based:** Rotate after suspected compromise or employee departure.
*   **Versioning:** Support multiple key versions simultaneously to allow graceful rotation without downtime.
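
The versioning strategy can be sketched with a small key ring that tags every output with the key version that produced it. The example uses HMAC tokens to stay within the standard library; the `VersionedKeyRing` class and the `v<N>.<tag>` token format are hypothetical, and the same idea applies to encryption keys.

```python
import hashlib
import hmac
import secrets

class VersionedKeyRing:
    """Hypothetical key ring: new tokens use the newest key, older ones still verify."""

    def __init__(self) -> None:
        self._keys = {}      # version number -> key bytes
        self._current = 0

    def rotate(self) -> int:
        """Add a fresh key and make it the one used for new tokens."""
        self._current += 1
        self._keys[self._current] = secrets.token_bytes(32)
        return self._current

    def retire(self, version: int) -> None:
        """Drop an old key once nothing signed with it should be accepted."""
        if version == self._current:
            raise ValueError("cannot retire the active key")
        self._keys.pop(version, None)

    def sign(self, message: bytes) -> bytes:
        tag = hmac.new(self._keys[self._current], message, hashlib.sha256).hexdigest()
        return f"v{self._current}.{tag}".encode("ascii")

    def verify(self, message: bytes, token: bytes) -> bool:
        try:
            prefix, tag = token.decode("ascii").split(".", 1)
            key = self._keys[int(prefix[1:])]
        except (KeyError, ValueError):   # unknown version or malformed token
            return False
        expected = hmac.new(key, message, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected.encode(), tag.encode())

ring = VersionedKeyRing()
v1 = ring.rotate()
old_token = ring.sign(b"session=alice")
v2 = ring.rotate()
new_token = ring.sign(b"session=alice")

# During the overlap window, tokens from both versions verify.
both_valid = ring.verify(b"session=alice", old_token) and ring.verify(b"session=alice", new_token)

ring.retire(v1)
old_valid_after_retire = ring.verify(b"session=alice", old_token)
print(both_valid, old_valid_after_retire)
```

Keeping the previous version available until existing data has been re-protected is what allows rotation without downtime.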

### 3.3.4 Key Destruction
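When rotating or decommissioning, ensure keys are securely destroyed (overwritten, or deleted inside the HSM). Simply deleting a file may not remove its contents from disk. Destroying every copy of a key also renders the data encrypted under it unrecoverable, a technique known as **cryptographic erasure**.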

---

## 3.4 TLS/SSL Deep Dive: Handshakes, Certificates, PKI, and Common Misconfigurations

Transport Layer Security (TLS) 1.3 is the current standard (2026). It supersedes SSL (which is deprecated and insecure) and TLS 1.2 (which has legacy baggage).

### 3.4.1 The TLS 1.3 Handshake
TLS 1.3 simplified the handshake for speed and security (removed support for obsolete algorithms).

**Simplified Flow:**
1.  **Client Hello:** Client sends supported cipher suites, a random value, and its ephemeral Diffie-Hellman key share.
2.  **Server Hello:** Server chooses a cipher suite (e.g., `TLS_AES_256_GCM_SHA384`) and sends its random value and its own key share.
3.  **Key Exchange and Authentication:** Both sides derive a shared secret from the exchanged Diffie-Hellman key shares (forward secrecy ensures past sessions aren't compromised if private keys are stolen later). The server then sends its certificate and a signature over the handshake, already encrypted.
4.  **Encrypted Application Data:** All subsequent communication is encrypted with symmetric keys derived from the shared secret.
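
The Diffie-Hellman exchange in the flow above can be sketched with modular arithmetic. This is a toy model under stated assumptions: the 127-bit prime `P`, the base `G`, and the helper names are chosen for readability, whereas real TLS 1.3 uses groups such as X25519 and derives keys with HKDF rather than a single SHA-256 call.

```python
import hashlib
import secrets

# TOY GROUP, for illustration only: far too small for real use.
P = 2**127 - 1
G = 3

def keypair():
    """Generate an ephemeral private exponent and the matching public value."""
    private = secrets.randbelow(P - 3) + 2      # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def shared_key(my_private: int, their_public: int) -> bytes:
    """Combine my private value with the peer's public value, then hash."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()

# Only the public values cross the network.
client_private, client_public = keypair()
server_private, server_public = keypair()

client_key = shared_key(client_private, server_public)
server_key = shared_key(server_private, client_public)
print(client_key == server_key)

# Forward secrecy: a later session uses fresh key pairs and a different key.
c2_private, c2_public = keypair()
s2_private, s2_public = keypair()
second_session_key = shared_key(c2_private, s2_public)
print(second_session_key == client_key)
```

Because the ephemeral private values are discarded after each session, stealing the server's long-term certificate key later does not reveal past session keys.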

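**Key Improvement in TLS 1.3:** A full handshake completes in one round trip instead of two. An optional **0-RTT (Zero Round Trip Time)** mode lets a client send data immediately when resuming a previous connection, but with a trade-off: that early data can be replayed by an attacker. Use it only for idempotent requests.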

### 3.4.2 Public Key Infrastructure (PKI)
PKI is the trust system that allows us to verify that `bank.com` is actually the bank.

**Components:**
*   **Certificate Authority (CA):** Trusted third party (Let's Encrypt, DigiCert) that signs certificates.
*   **Certificate:** Binds a public key to an identity (domain name). Contains:
    *   Subject (domain)
    *   Issuer (CA)
    *   Public Key
    *   Validity dates
    *   SANs (Subject Alternative Names - for multiple domains)
*   **Chain of Trust:** Your certificate is signed by an Intermediate CA, which is signed by a Root CA. Your browser trusts the Root CA.

**Certificate Validation Snippet:**
```python
import ssl
import certifi
import socket

def verify_certificate(hostname, port=443):
    context = ssl.create_default_context()
    context.load_verify_locations(certifi.where())  # Use Mozilla's CA bundle
    
    with socket.create_connection((hostname, port)) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as ssock:
            cert = ssock.getpeercert()
            cipher = ssock.cipher()
            protocol = ssock.version()
            
            print(f"TLS Version: {protocol}")  # Should be TLSv1.3
            print(f"Cipher: {cipher[0]}")
            print(f"Subject: {cert['subject']}")
            print(f"Issuer: {cert['issuer']}")
            print(f"Not After: {cert['notAfter']}")
            
            if protocol != "TLSv1.3":
                print("WARNING: Connection did not negotiate TLS 1.3")

verify_certificate("google.com")
```

### 3.4.3 Common Misconfigurations (OWASP A04)
These errors constitute **Cryptographic Failures**, the #4 risk in OWASP Top 10.

**1. Using Weak Protocols**
*   **SSL 2.0/3.0:** Broken (SSL 2.0 by DROWN, SSL 3.0 by POODLE).
*   **TLS 1.0/1.1:** Deprecated (RFC 8996). TLS 1.0 is vulnerable to BEAST; CRIME exploits TLS-level compression, which must be disabled.
*   **Fix:** Enforce TLS 1.3 only, or TLS 1.2+ with secure cipher suites.
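
On the client side, the same fix can be expressed with Python's standard `ssl` module. This is a small sketch; the `make_client_context` helper and its `allow_tls12` flag are hypothetical names, and setting a TLS 1.3 floor assumes Python is linked against an OpenSSL build that supports it.

```python
import ssl

def make_client_context(allow_tls12: bool = False) -> ssl.SSLContext:
    """Client-side context that refuses anything older than the chosen floor."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = (
        ssl.TLSVersion.TLSv1_2 if allow_tls12 else ssl.TLSVersion.TLSv1_3
    )
    return context

strict = make_client_context()
compatible = make_client_context(allow_tls12=True)

print(strict.minimum_version.name)
print(strict.check_hostname, strict.verify_mode == ssl.CERT_REQUIRED)
```

A handshake with a server that only offers an older protocol then fails instead of silently negotiating it, which also blunts downgrade attempts.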

**2. Weak Cipher Suites**
Avoid:
*   RC4 (broken)
*   DES (broken) and 3DES (deprecated; its 64-bit block size exposes it to Sweet32)
*   MD5/SHA1 signatures (weak)
*   CBC mode without proper HMAC (padding oracle risks)

**3. Certificate Issues**
*   **Expired Certificates:** Cause browser warnings and downtime.
*   **Self-Signed in Production:** Users bypass warnings, enabling MITM attacks.
*   **Wildcard Certificates:** `*.example.com` is convenient but increases blast radius if stolen.
*   **Missing Certificate Pinning:** Mobile apps should pin specific certificates to prevent rogue CA attacks.
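
Expired certificates are usually a monitoring failure, so a simple expiry check is worth automating. The sketch below works on the `notAfter` string format returned by `SSLSocket.getpeercert()`; the helper names and the 30-day threshold are assumptions.

```python
import ssl
import time
from typing import Optional

def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days remaining before a certificate's notAfter time (negative if expired)."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400

def needs_renewal(not_after: str, threshold_days: int = 30,
                  now: Optional[float] = None) -> bool:
    """True when the certificate expires within the threshold."""
    return days_until_expiry(not_after, now) < threshold_days

# Fixed timestamps keep the demo reproducible.
not_after = "Jan  5 09:34:43 2018 GMT"
one_day_before = ssl.cert_time_to_seconds("Jan  4 09:34:43 2018 GMT")

remaining = days_until_expiry(not_after, now=one_day_before)
print(remaining, needs_renewal(not_after, now=one_day_before))
```

In practice the `not_after` value would come from `cert['notAfter']` on a live connection, and the check would run on a schedule with alerting.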

**4. Improper Certificate Validation**
*   **Hostname Verification:** Always verify the certificate matches the hostname you're connecting to.
*   **Certificate Revocation:** Check OCSP (Online Certificate Status Protocol) or use OCSP Stapling.

**Configuration: Secure Nginx SSL (TLS 1.3)**
```nginx
server {
    listen 443 ssl;
    
    # Certificates
    ssl_certificate /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;
    
    # Only TLS 1.3 (most secure)
    ssl_protocols TLSv1.3;
    
    # If you must support legacy (not recommended):
    # ssl_protocols TLSv1.2 TLSv1.3;
    # ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:...';
    # ssl_prefer_server_ciphers off;
    
    # Enable OCSP Stapling
    ssl_stapling on;
    ssl_stapling_verify on;
    ssl_trusted_certificate /etc/nginx/ssl/chain.pem;
    resolver 8.8.8.8 8.8.4.4 valid=300s;
    
    # HSTS (force HTTPS for 1 year)
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
}
```

---

## 3.5 Cryptographic Failures: Case Studies and Mitigation (OWASP A04)

Even with strong algorithms, implementation errors lead to breaches.

### 3.5.1 Case Study: The Adobe Password Breach (2013)
**Failure:** Adobe encrypted passwords instead of hashing them, using symmetric encryption (3DES in ECB mode) with a single key for all passwords, so identical passwords produced identical ciphertexts.
**Lesson:** Never encrypt passwords. Hash them with Argon2/bcrypt. Encryption is reversible; hashing is not (with proper algorithms).

### 3.5.2 Case Study: Nonce Reuse in GCM (AES-GCM)
AES-GCM requires a unique nonce (IV) for every encryption with the same key. If a nonce is reused:
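*   The keystream repeats, so XORing the two ciphertexts reveals the XOR of the two plaintexts.
*   An attacker can recover the authentication key.
*   Integrity guarantees are void, because the attacker can then forge valid tags.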

**Mitigation:**
*   Use 96-bit random nonces (probability of collision is negligible with proper RNG).
*   Or use a counter if you can ensure atomicity.
*   Rotate keys before $2^{32}$ encryptions (birthday bound).
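
The numbers behind these mitigations can be checked with a few lines of arithmetic, alongside a sketch of a counter-based nonce source. The `CounterNonce` class and its layout (4-byte sender ID followed by an 8-byte counter) are hypothetical, and the probability uses the standard birthday approximation.

```python
import math
import threading

def nonce_collision_probability(messages: int, nonce_bits: int = 96) -> float:
    """Birthday approximation: P(collision) is about n(n-1) / 2^(bits+1)."""
    return messages * (messages - 1) / 2 ** (nonce_bits + 1)

class CounterNonce:
    """Hypothetical deterministic nonce source: 4-byte sender ID + 8-byte counter."""

    def __init__(self, sender_id: bytes) -> None:
        if len(sender_id) != 4:
            raise ValueError("sender_id must be 4 bytes")
        self._sender_id = sender_id
        self._counter = 0
        self._lock = threading.Lock()

    def next(self) -> bytes:
        with self._lock:  # atomic increment so two threads never share a value
            if self._counter >= 2**64:
                raise OverflowError("counter exhausted; rotate the key")
            value = self._counter
            self._counter += 1
        return self._sender_id + value.to_bytes(8, "big")

# With random 96-bit nonces, 2^32 messages under one key give about a 2^-33 risk.
p = nonce_collision_probability(2**32)
print(f"log2(P) = {math.log2(p):.2f}")

source = CounterNonce(b"\x00\x00\x00\x01")
n1, n2 = source.next(), source.next()
print(n1.hex(), n2.hex())
```

A counter only helps if its state survives restarts and is never shared between processes using the same key; otherwise random nonces with a key rotation limit are the safer choice.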

### 3.5.3 Case Study: Timing Attacks
Comparing MACs (Message Authentication Codes) or passwords with standard string comparison (`==`) leaks timing information.
```python
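import hmac

def insecure_verify(provided_mac: bytes, calculated_mac: bytes) -> bool:
    # Vulnerable to timing attacks: == can return as soon as it finds a differing byte
    return provided_mac == calculated_mac

def secure_verify(provided_mac: bytes, calculated_mac: bytes) -> bool:
    # Secure constant-time comparison
    return hmac.compare_digest(provided_mac, calculated_mac)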
```
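
For completeness, here is a sketch of where such a MAC comes from in the first place: an HMAC-SHA256 tag computed over a message with a shared key, then checked in constant time. The function names and the sample payload are assumptions for illustration.

```python
import hashlib
import hmac
import secrets

def compute_mac(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, provided_mac: bytes) -> bool:
    """Recompute the tag and compare it without leaking timing information."""
    expected = compute_mac(key, message)
    return hmac.compare_digest(expected, provided_mac)

shared_key = secrets.token_bytes(32)      # known to both sender and receiver
payload = b'{"order_id": 42, "amount": 100}'
tag = compute_mac(shared_key, payload)

accepted = verify_mac(shared_key, payload, tag)
rejected = verify_mac(shared_key, b'{"order_id": 42, "amount": 999}', tag)
print(accepted, rejected)
```

Because both parties hold the same key, either of them could have produced the tag, which is why a MAC provides integrity and authenticity but not non-repudiation.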

### 3.5.4 Case Study: Downgrade Attacks
An attacker forces a connection to use TLS 1.0 instead of 1.3, exploiting known vulnerabilities.
**Mitigation:** Use `TLS_FALLBACK_SCSV` (for legacy) or enforce TLS 1.3 only. Implement HSTS to prevent SSL stripping.

---

### Chapter Summary

In this chapter, we established the cryptographic foundations essential for secure development. We distinguished between **symmetric encryption** (AES-256-GCM) for bulk data and **asymmetric encryption** (RSA/ECC) for key exchange and signatures. We emphasized that **hashing** (SHA-256) provides integrity, while **password hashing** requires specialized slow algorithms (Argon2). We explored the critical importance of **key management**—generation, storage in HSMs/KMS, and rotation—and dissected **TLS 1.3** as the standard for transport security, highlighting common **OWASP A04** failures like weak cipher suites and improper certificate validation.

Cryptography provides the technical controls to enforce **Confidentiality** and **Integrity**, but technology alone cannot secure an organization. Without governance structures, risk management frameworks, and strategic oversight—the ability to **Identify**, **Protect**, **Detect**, **Respond**, and **Recover**—even the strongest encryption fails to address systemic vulnerabilities. We must now zoom out from technical implementation to organizational strategy, examining how modern enterprises structure their security programs to govern these technical controls effectively.

**Next Up: Chapter 4: GOVERN (GV) – Cybersecurity Governance & Strategy**

