# 🧑‍💻 SHA-3 (Keccak) Bit-Level Implementation in Python

Welcome to this educational project where SHA-3 (Keccak) and its variants are implemented **entirely from scratch at the bit level** in Python! 🚀

## 📚 About

- This notebook provides a **bitwise, step-by-step implementation** of the SHA-3 family, including:
    - SHA3-224
    - SHA3-256
    - SHA3-384
    - SHA3-512
    - SHAKE128
    - SHAKE256
    - Keccak-224/256/384/512 (the original Keccak padding, without the SHA-3 domain-separation suffix)

- **Every function** closely follows the pseudocode and specifications in the NIST FIPS 202 standard:  
    [SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf) 📄

- The implementation is **bit-level** (not byte-level or word-level), which makes each of the five step mappings (Theta, Rho, Pi, Chi, Iota) transparent and easy to follow for learning and research purposes.

## 🛠️ Features

- **From Scratch:** No external cryptographic libraries are used for the core logic; everything is built from the ground up!
- **Educational:** Each step of the Keccak permutation and sponge construction is implemented and documented for clarity.
- **Variants Supported:** SHA3, Keccak, and SHAKE (XOF) functions are all included.
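- **Validation:** The SHA-3 and SHAKE outputs are compared with Python's `hashlib` to check correctness ✅ (`hashlib` does not provide the legacy Keccak variants, so those are not cross-checked).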
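
A minimal sketch of how such a reference digest can be obtained, assuming a hypothetical helper named `reference_digest` that is not part of the notebook. One detail worth noting: `hashlib` takes SHAKE output lengths in bytes, while the functions in this notebook take them in bits.

```python
import hashlib

def reference_digest(name, msg, out_bits=None):
    """Hex digest from hashlib; out_bits (SHAKE only) is given in bits and converted to bytes."""
    h = getattr(hashlib, name)(msg.encode("utf-8"))
    if out_bits is None:
        return h.hexdigest()
    return h.hexdigest(out_bits // 8)

print(reference_digest("sha3_256", "hello"))
print(reference_digest("shake_128", "hello", 256))
```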

## 📖 References

- Main reference:  
    [NIST FIPS 202: SHA-3 Standard](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf)

- For more details, see the code cells below, where each transformation and function is implemented and explained.

## 💡 Why Bit-Level?

Implementing SHA-3 at the bit level provides a **deep understanding** of how each transformation works internally. This project was a fascinating journey into the mechanics of modern cryptographic hash functions! 🧩

## 🧩 Extensions

- The project is extended to support **SHAKE128** and **SHAKE256** (extendable-output functions), as well as all standard SHA-3 hash sizes.

---

Enjoy exploring and learning from this bitwise SHA-3 implementation!  
Feel free to experiment with the code and see how each step transforms the state. 🧑‍🔬✨

In [25]:
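def reverse_bits_in_byte(b):
    """Reverses the bit order of a byte value (0..255)."""
    return int('{:08b}'.format(b)[::-1], 2)

def str_to_keccak_bits(msg):
    """Encodes msg as UTF-8 and returns its bits, least significant bit of each byte first."""
    return ''.join(format(reverse_bits_in_byte(b), '08b') for b in msg.encode('utf-8'))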
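
The cell above encodes the FIPS 202 convention that bit 0 of each byte is its least significant bit, so every byte is written LSB-first. A small sketch of the same convention using shifts instead of string reversal (the names `byte_to_lsb_first_bits` and `bytes_to_keccak_bits` are hypothetical, chosen for illustration):

```python
def byte_to_lsb_first_bits(b):
    """Return the 8 bits of byte value b as a string, least significant bit first."""
    return ''.join(str((b >> i) & 1) for i in range(8))

def bytes_to_keccak_bits(data):
    """Concatenate the LSB-first bit strings of every byte in data."""
    return ''.join(byte_to_lsb_first_bits(b) for b in data)

# 'a' is 0x61 = 0b01100001, so LSB-first it reads 10000110.
print(bytes_to_keccak_bits(b'a'))
```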

In [26]:
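def keccak_pad(msg_bits, r, suffix_bits):
    """
    Appends suffix_bits and completes pad10*1 so the result is a multiple of r bits.
    suffix_bits must already contain the leading '1' of pad10*1; this function
    only adds the run of zeros and the final '1'.
    """
    m = list(msg_bits)
    m += list(suffix_bits)
    pad_len = (-len(m) - 1) % r
    m += ['0'] * pad_len
    m.append('1')
    return ''.join(m)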
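
To make the padding rule visible, here is a sketch with a toy rate of 16 bits. It assumes a hypothetical function `pad_with_suffix` that, unlike the cell above, takes only the domain-separation suffix (`'01'` for SHA-3 in FIPS 202) and adds both `1` bits of pad10*1 itself:

```python
def pad_with_suffix(msg_bits, r, suffix_bits):
    """Append the domain-separation suffix, then pad10*1 up to a multiple of r bits."""
    m = msg_bits + suffix_bits
    zeros = (-len(m) - 2) % r   # zeros between the two '1' bits of pad10*1
    return m + '1' + '0' * zeros + '1'

# Toy rate of 16 bits so the padding is easy to read.
padded = pad_with_suffix('10000110', 16, '01')
print(padded, len(padded))
```

With real parameters the only change is the rate, for example `r=1088` for SHA3-256.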

In [27]:
def string_to_state_array(S, x=5, y=5, w=64):
    """
    Converts a string S of b = x * y * w bits to a 3D state array A[x][y][w].
    Each A[x][y][z] is a bit (0 or 1).
    """
    if len(S) != x * y * w:
        raise ValueError("Length of input S must be x * y * w bits")

    A = [[[0 for z in range(w)] for y_ in range(y)] for x_ in range(x)]

    for i in range(x):
        for j in range(y):
            for k in range(w):
                A[i][j][k] = int(S[w * (5 * j + i) + k])

    return A

def state_array_to_string(A, x=5, y=5, w=64):
    """
    Converts a 3D state array A into a string S as per Keccak specification.
    """
    Lane_i_j = [['' for _ in range(y)] for _ in range(x)]  # 5x5 matrix of empty strings

    for i in range(x):
        for j in range(y):
            for k in range(w):
                # integer bit to string before concatenation
                Lane_i_j[i][j] += str(A[i][j][k])

    plane_j = ['' for _ in range(y)]  # one string for each plane j

    for j in range(y):
        for i in range(x):
            plane_j[j] += Lane_i_j[i][j]

    S = ''.join(plane_j)  # concatenate Plane(0) || Plane(1) || ...
    return S
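
The two conversions above are inverses of each other because both use the same index mapping, `S[w * (5 * y + x) + z] = A[x][y][z]`. A compact sketch of that round trip (the helper names `bit_index`, `to_state`, and `to_string` are hypothetical):

```python
W = 64  # lane length for Keccak-f[1600]

def bit_index(x, y, z, w=W):
    """Position in the flat string S of state bit A[x][y][z]."""
    return w * (5 * y + x) + z

def to_state(S, w=W):
    return [[[int(S[bit_index(x, y, z, w)]) for z in range(w)]
             for y in range(5)] for x in range(5)]

def to_string(A, w=W):
    return ''.join(str(A[x][y][z]) for y in range(5) for x in range(5) for z in range(w))

# Set the single bit at string position 64, which is the first bit of lane (x=1, y=0).
pos = bit_index(1, 0, 0)
S = '0' * pos + '1' + '0' * (1600 - pos - 1)
A = to_state(S)
print(A[1][0][0], to_string(A) == S)
```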

In [28]:
def theta(A, w=64):
    """
    Keccak Theta step: returns a new state array in which every bit is XORed
    with the parities of two neighbouring columns. The input A is not modified.

    Args:
        A: 3D state array A[x][y][z] where x, y in [0..4], z in [0..w-1]
        w: lane size in bits (64 for Keccak-f[1600], which all variants here use)

    Returns:
        A_prime: new state array after applying the theta step.
    """
    # Compute C[x][z]
    C = [[0] * w for _ in range(5)]
    for x in range(5):
        for z in range(w):
            C[x][z] = A[x][0][z]
            for y in range(1, 5):
                C[x][z] ^= A[x][y][z]

    # Compute D[x][z]
    D = [[0] * w for _ in range(5)]
    for x in range(5):
        for z in range(w):
            D[x][z] = C[(x - 1) % 5][z] ^ C[(x + 1) % 5][(z - 1) % w]

    A_prime = [[[0] * w for _ in range(5)] for _ in range(5)]
    for x in range(5):
        for y in range(5):
            for z in range(w):
                A_prime[x][y][z] = A[x][y][z] ^ D[x][z]

    return A_prime
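
One way to see the diffusion that theta provides is to feed it a state with a single bit set: the bit itself survives, and two whole columns (five bits each) are flipped. The sketch below re-implements theta compactly under the hypothetical name `theta_bits` so that it runs on its own:

```python
def theta_bits(A, w=64):
    """Bit-level theta: XOR each bit with the parities of two neighbouring columns."""
    C = [[A[x][0][z] ^ A[x][1][z] ^ A[x][2][z] ^ A[x][3][z] ^ A[x][4][z]
          for z in range(w)] for x in range(5)]
    D = [[C[(x - 1) % 5][z] ^ C[(x + 1) % 5][(z - 1) % w]
          for z in range(w)] for x in range(5)]
    return [[[A[x][y][z] ^ D[x][z] for z in range(w)]
             for y in range(5)] for x in range(5)]

A = [[[0] * 64 for _ in range(5)] for _ in range(5)]
A[2][3][10] = 1
out = theta_bits(A)
set_bits = sum(out[x][y][z] for x in range(5) for y in range(5) for z in range(64))
print(set_bits)  # 11: the original bit plus columns (x=3, z=10) and (x=1, z=11)
```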

In [29]:
def rho(A, w=64):
    """
    Keccak Rho step: performs bitwise rotations on the lanes of A according to fixed offsets.

    Args:
        A: 3D state array A[x][y][z], where x, y in [0..4], z in [0..w-1]
        w: lane size in bits (64 for Keccak-f[1600])

    Returns:
        A_prime: new state array after applying rho step.
    """
    A_prime = [[[0] * w for _ in range(5)] for _ in range(5)]

    for z in range(w):
        A_prime[0][0][z] = A[0][0][z]

    x, y = 1, 0

    for t in range(24):
        offset = ((t + 1) * (t + 2)) // 2

        for z in range(w):
            rotated_index = (z - offset) % w
            A_prime[x][y][z] = A[x][y][rotated_index]

        x, y = y, (2 * x + 3 * y) % 5

    return A_prime
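
The offsets that rho applies can be tabulated by running the same `(x, y)` walk without touching any state. A short sketch (the function name `rho_offsets` is hypothetical):

```python
def rho_offsets(w=64):
    """Map each lane (x, y) to its rotation offset modulo the lane size w."""
    offsets = {(0, 0): 0}
    x, y = 1, 0
    for t in range(24):
        offsets[(x, y)] = ((t + 1) * (t + 2) // 2) % w
        x, y = y, (2 * x + 3 * y) % 5
    return offsets

offs = rho_offsets()
print(offs[(1, 0)], offs[(0, 2)], offs[(1, 4)])  # 1 3 2
```

The walk visits each of the 24 lanes other than `(0, 0)` exactly once, so the table has 25 entries.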

In [30]:
def pi(A, w=64):
    """
    Keccak Pi step: rearranges the lanes.

    Args:
        A: 3D state array A[x][y][z], where x, y in [0..4], z in [0..w-1]
        w: lane size

    Returns:
        A_prime: new state array after applying pi step.
    """
    A_prime = [[[0] * w for _ in range(5)] for _ in range(5)]

    for x in range(5):
        for y in range(5):
            for z in range(w):
                A_prime[x][y][z] = A[(x + 3 * y) % 5][x][z]

    return A_prime
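
Pi only moves whole lanes, so it can be checked on the 5x5 grid of lane coordinates alone. The sketch below (with the hypothetical helper `pi_source`) confirms that every lane position receives a distinct source lane and that lane `(0, 0)` stays in place:

```python
def pi_source(x, y):
    """Lane of the input state that pi moves to position (x, y)."""
    return ((x + 3 * y) % 5, x)

sources = {pi_source(x, y) for x in range(5) for y in range(5)}
print(len(sources), pi_source(0, 0), pi_source(1, 1))  # 25 (0, 0) (4, 1)
```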

In [31]:
def chi(A, w=64):
    """
    Keccak Chi step: applies non-linear mixing of bits within each row.

    Args:
        A: 3D state array A[x][y][z], where x, y in [0..4], z in [0..w-1]
        w: lane size

    Returns:
        A_prime: new state array after applying chi step.
    """
    A_prime = [[[0] * w for _ in range(5)] for _ in range(5)]

    for x in range(5):
        for y in range(5):
            for z in range(w):
                A_prime[x][y][z] = A[x][y][z] ^ ((A[(x + 1) % 5][y][z] ^ 1) & A[(x + 2) % 5][y][z])

    return A_prime
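
Because chi acts on each row of five bits independently, it can be studied as a 5-bit substitution. The following sketch (helper names `chi_row` and `to_bits` are hypothetical) applies the same formula to a single row and checks that all 32 possible rows map to distinct outputs:

```python
def chi_row(row):
    """Apply chi to one row of five bits (list indexed by x)."""
    return [row[x] ^ ((row[(x + 1) % 5] ^ 1) & row[(x + 2) % 5]) for x in range(5)]

def to_bits(v):
    """Five-bit list for the integer v, bit x having weight 2**x."""
    return [(v >> i) & 1 for i in range(5)]

images = {tuple(chi_row(to_bits(v))) for v in range(32)}
print(len(images))                 # 32: chi permutes the possible rows
print(chi_row([1, 0, 1, 0, 0]))    # [0, 0, 1, 1, 0]
```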

In [32]:
def rc_bit(t):
    if t % 255 == 0:
        return 1

    R = ['1', '0', '0', '0', '0', '0', '0', '0']  # '10000000'

    for i in range(1, t % 255 + 1):
        R = ['0'] + R  # Prepend 0 to make it 9 bits

        # XOR operations (bitwise xor with R[8])
        R[0] = str(int(R[0]) ^ int(R[8]))
        R[4] = str(int(R[4]) ^ int(R[8]))
        R[5] = str(int(R[5]) ^ int(R[8]))
        R[6] = str(int(R[6]) ^ int(R[8]))

        R = R[:8]  # Truncate to 8 bits

    return int(R[0])

def iota(A, ir, w=64):
    """
    Keccak Iota step: adds round constant to A[0][0]

    Args:
        A: 3D state array A[x][y][z], where x, y in [0..4], z in [0..w-1]
        ir: round index (0 to 23)
        w: lane size

    Returns:
        A_prime: state array with round constant added to A[0][0]
    """
    A_prime = [[[A[x][y][z] for z in range(w)] for y in range(5)] for x in range(5)]

    RC = [0] * w
    l = w.bit_length() - 1  # log2(w); 6 for w = 64

    for j in range(l + 1):
        bit_position = (1 << j) - 1
        RC[bit_position] = rc_bit(j + 7 * ir)

    for z in range(w):
        A_prime[0][0][z] ^= RC[z]

    return A_prime
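
A quick sanity check for this step is to assemble the round constants as integers and compare them with the widely published Keccak-f[1600] table. The sketch below repeats the LFSR under the hypothetical name `rc_bit_lfsr` so it runs standalone, and treats bit `z` of the lane as having weight `2**z`:

```python
def rc_bit_lfsr(t):
    """Bit rc(t) from the 8-bit LFSR used to generate the round constants."""
    if t % 255 == 0:
        return 1
    R = [1, 0, 0, 0, 0, 0, 0, 0]
    for _ in range(t % 255):
        R = [0] + R               # shift in a zero, making R nine bits long
        for i in (0, 4, 5, 6):
            R[i] ^= R[8]          # feedback taps
        R = R[:8]                 # truncate back to eight bits
    return R[0]

def round_constant(ir, w=64):
    """Round constant for round ir as an integer, where bit z has weight 2**z."""
    l = w.bit_length() - 1
    return sum(rc_bit_lfsr(j + 7 * ir) << ((1 << j) - 1) for j in range(l + 1))

print([hex(round_constant(ir)) for ir in range(3)])
# ['0x1', '0x8082', '0x800000000000808a']
```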

In [33]:
def Rnd(A, ir):
    """
    One round of the Keccak-f[1600] permutation: theta, rho, pi, chi, iota.

    Args:
        A: 3D state array A[x][y][z], where x, y in [0..4], z in [0..63]; each element is a bit.
        ir: round index (0 to 23)

    Returns:
        A: new state array after applying one round.
    """
    A = theta(A)
    A = rho(A)
    A = pi(A)
    A = chi(A)
    A = iota(A, ir)
    return A

def KECCAK_p_1600_24(S):
    """Applies all 24 rounds to a 1600-bit string S and returns the resulting 1600-bit string."""
    A = string_to_state_array(S)
    for ir in range(24):
        A = Rnd(A, ir)
    return state_array_to_string(A)

In [34]:
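def sponge(f, P, r, d):
    """
    Sponge construction over a 1600-bit state.

    Args:
        f: permutation on 1600-bit strings (e.g., KECCAK_p_1600_24)
        P: padded message as a bit string whose length is a multiple of r
        r: rate in bits
        d: number of output bits
    """
    n = len(P) // r
    blocks = [P[i*r:(i+1)*r] for i in range(n)]
    c = 1600 - r  # capacity
    S = '0' * 1600
    # Absorb: XOR each r-bit block (zero-extended to 1600 bits) into the state, then permute
    for Pi in blocks:
        Pi_padded = Pi + '0'*c
        S = bin(int(S, 2) ^ int(Pi_padded, 2))[2:].zfill(1600)
        S = f(S)
    # Squeeze: output r bits at a time, permuting between outputs
    Z = ''
    while len(Z) < d:
        Z += S[:r]
        if len(Z) >= d:
            break
        S = f(S)
    return Z[:d]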
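
The absorb and squeeze phases are easier to trace on a tiny state. The sketch below is a toy version with an 8-bit state and a stand-in permutation that merely rotates the state; it is not cryptographic and only illustrates the data flow. The names `toy_sponge` and `rotate_left_3` are hypothetical.

```python
def toy_sponge(f, padded_bits, r, b, d):
    """Sponge over b-bit states: absorb r-bit blocks, then squeeze d output bits."""
    assert len(padded_bits) % r == 0
    S = [0] * b
    for i in range(0, len(padded_bits), r):
        block = [int(ch) for ch in padded_bits[i:i + r]] + [0] * (b - r)
        S = f([s ^ p for s, p in zip(S, block)])   # XOR block into state, then permute
    Z = []
    while len(Z) < d:
        Z += S[:r]                                  # emit the rate part of the state
        if len(Z) < d:
            S = f(S)
    return ''.join(map(str, Z[:d]))

def rotate_left_3(S):
    """Stand-in permutation (NOT cryptographic): rotate the state by 3 positions."""
    return S[3:] + S[:3]

print(toy_sponge(rotate_left_3, '10110001', r=4, b=8, d=6))  # 101001
```

Working on lists of bits also avoids the round trip through `int` and `zfill` when XORing a block into the state.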

In [35]:
def bin_to_hex_keccak(bits):
    """Converts a Keccak bit string (LSB-first within each byte) to a hex string."""
    out = []
    for i in range(0, len(bits), 8):
        b = bits[i:i+8]
        if len(b) < 8:
            b = b.ljust(8, '0')
        out.append(reverse_bits_in_byte(int(b, 2)))
    return ''.join('{:02x}'.format(x) for x in out)

def keccak_hash(msg, r, d, suffix_bits):
    """Hashes the string msg (UTF-8 encoded) with rate r, output length d bits, and the given suffix."""
    msg_bin = str_to_keccak_bits(msg)
    padded_bits = keccak_pad(msg_bin, r, suffix_bits)
    digest_bits = sponge(KECCAK_p_1600_24, padded_bits, r, d)
    return bin_to_hex_keccak(digest_bits)
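
The output conversion is the mirror image of the input conversion: each group of eight bits is read LSB-first and packed back into a byte. A standalone sketch, using the hypothetical name `keccak_bits_to_hex`:

```python
def keccak_bits_to_hex(bits):
    """Pack an LSB-first bit string into bytes and return them as hex."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8].ljust(8, '0')   # zero-fill a short final chunk
        out.append(int(chunk[::-1], 2))       # reverse to MSB-first before parsing
    return out.hex()

print(keccak_bits_to_hex('10000110'))           # 61
print(keccak_bits_to_hex('1000000000000001'))   # 0180
```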

In [36]:
# suffix_bits = domain-separation suffix followed by the first '1' of pad10*1:
#   SHA-3:  '01'   + '1' -> '011'
#   SHAKE:  '1111' + '1' -> '11111'
#   Keccak: (none) + '1' -> '1'
# For byte-aligned messages '0110' and '011' pad identically; '011' matches FIPS 202.
def sha3_256(msg):
    return keccak_hash(msg, 1088, 256, '011')
def sha3_512(msg):
    return keccak_hash(msg, 576, 512, '011')
def sha3_384(msg):
    return keccak_hash(msg, 832, 384, '011')
def sha3_224(msg):
    return keccak_hash(msg, 1152, 224, '011')

def shake128(msg, outbits):
    return keccak_hash(msg, 1344, outbits, '11111')
def shake256(msg, outbits):
    return keccak_hash(msg, 1088, outbits, '11111')

def keccak_512(msg):
    return keccak_hash(msg, 576, 512, '1')
def keccak_256(msg):
    return keccak_hash(msg, 1088, 256, '1')
def keccak_384(msg):
    return keccak_hash(msg, 832, 384, '1')
def keccak_224(msg):
    return keccak_hash(msg, 1152, 224, '1')
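
The rate arguments passed above are not arbitrary: in FIPS 202 the capacity is twice the digest length for the SHA-3 hashes and twice the security level for SHAKE, and the rate is whatever remains of the 1600-bit state. A small sketch that recomputes them (the function names are hypothetical):

```python
B = 1600  # width of Keccak-f[1600] in bits

def sha3_rate(digest_bits):
    """Rate r = b - c, with capacity c = 2 * digest length for the SHA-3 hashes."""
    return B - 2 * digest_bits

def shake_rate(security_bits):
    """Rate for SHAKE128/SHAKE256, whose capacity is twice the security level."""
    return B - 2 * security_bits

print({d: sha3_rate(d) for d in (224, 256, 384, 512)})
print(shake_rate(128), shake_rate(256))  # 1344 1088
```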

In [37]:
import hashlib

msg = "hello"
print("SHA3-256:", sha3_256(msg))
print("SHA3-512:", sha3_512(msg))
print("SHA3-384:", sha3_384(msg))
print("SHA3-224:", sha3_224(msg))
print("SHAKE128 (256 bits):", shake128(msg, 256))
print("SHAKE256 (512 bits):", shake256(msg, 512))
print("Keccak-512:", keccak_512(msg))
print("keccak-384: ", keccak_384(msg))
print("hashlib.sha3_256:", hashlib.sha3_256(msg.encode('utf-8')).hexdigest())
print("hashlib.sha3_512:", hashlib.sha3_512(msg.encode('utf-8')).hexdigest())
print("hashlib.sha3_384:", hashlib.sha3_384(msg.encode('utf-8')).hexdigest())
print("hashlib.sha3_224:", hashlib.sha3_224(msg.encode('utf-8')).hexdigest())
print("hashlib.shake_128 (32 bytes):", hashlib.shake_128(msg.encode('utf-8')).hexdigest(32))
print("hashlib.shake_256 (64 bytes):", hashlib.shake_256(msg.encode('utf-8')).hexdigest(64))
print("is match?", sha3_256(msg) == hashlib.sha3_256(msg.encode('utf-8')).hexdigest())
print("is match?", sha3_512(msg) == hashlib.sha3_512(msg.encode('utf-8')).hexdigest())
print("is match?", sha3_384(msg) == hashlib.sha3_384(msg.encode('utf-8')).hexdigest())
print("is match?", sha3_224(msg) == hashlib.sha3_224(msg.encode('utf-8')).hexdigest())
print("is match?", shake128(msg, 256) == hashlib.shake_128(msg.encode('utf-8')).hexdigest(32))

SHA3-256: 3338be694f50c5f338814986cdf0686453a888b84f424d792af4b9202398f392
SHA3-512: 75d527c368f2efe848ecf6b073a36767800805e9eef2b1857d5f984f036eb6df891d75f72d9b154518c1cd58835286d1da9a38deba3de98b5a53e5ed78a84976
SHA3-384: 720aea11019ef06440fbf05d87aa24680a2153df3907b23631e7177ce620fa1330ff07c0fddee54699a4c3ee0ee9d887
SHA3-224: b87f88c72702fff1748e58b87e9141a42c0dbedc29a78cb0d4a5cd81
SHAKE128 (256 bits): 8eb4b6a932f280335ee1a279f8c208a349e7bc65daf831d3021c213825292463
SHAKE256 (512 bits): 1234075ae4a1e77316cf2d8000974581a343b9ebbca7e3d1db83394c30f221626f594e4f0de63902349a5ea5781213215813919f92a4d86d127466e3d07e8be3
Keccak-512: 52fa80662e64c128f8389c9ea6c73d4c02368004bf4463491900d11aaadca39d47de1b01361f207c512cfa79f0f92c3395c67ff7928e3f5ce3e3c852b392f976
keccak-384:  dcef6fb7908fd52ba26aaba75121526abbf1217f1c0a31024652d134d3e32fb4cd8e9c703b8f43e7277b59a5cd402175
hashlib.sha3_256: 3338be694f50c5f338814986cdf0686453a888b84f424d792af4b9202398f392
hashlib.sha3_512: 75d527c368f2efe848ecf6b0

In [38]:
import random
import string
import hashlib

def generate_random_word(length):
    """Generates a random word of a given length."""
    letters = string.ascii_lowercase
    return ''.join(random.choice(letters) for i in range(length))

def compare_hashes_for_random_words(num_words=50, word_length=10):
    """
    Generates random words and compares custom SHA3 and SHAKE hashes with hashlib's,
    printing each word and indicating a match with a checkmark.
    The word length starts at word_length and grows by one with each iteration.
    """
    print(f"Comparing hashes for {num_words} random words...")

    # (label, custom function, hashlib reference) for every variant under test.
    # SHAKE output is 256 bits, which is 32 bytes on the hashlib side.
    variants = [
        ("SHA3-256", sha3_256, lambda data: hashlib.sha3_256(data).hexdigest()),
        ("SHA3-224", sha3_224, lambda data: hashlib.sha3_224(data).hexdigest()),
        ("SHA3-384", sha3_384, lambda data: hashlib.sha3_384(data).hexdigest()),
        ("SHA3-512", sha3_512, lambda data: hashlib.sha3_512(data).hexdigest()),
        ("SHAKE128 (256 bits)", lambda m: shake128(m, 256),
         lambda data: hashlib.shake_128(data).hexdigest(32)),
        ("SHAKE256 (256 bits)", lambda m: shake256(m, 256),
         lambda data: hashlib.shake_256(data).hexdigest(32)),
    ]
    all_match = {label: True for label, _, _ in variants}

    for i in range(num_words):
        # Increase word length with each iteration
        current_word_length = word_length + i
        word = generate_random_word(current_word_length)
        print(f"\nWord '{word}' (Length: {current_word_length}):")

        data = word.encode('utf-8')
        for label, custom_fn, reference_fn in variants:
            custom_hash = custom_fn(word)
            hashlib_hash = reference_fn(data)
            print(f"  {label}:")
            print(f"    Custom: {custom_hash}")
            print(f"    Hashlib: {hashlib_hash}")
            if custom_hash == hashlib_hash:
                print("    ✅ Match")
            else:
                print("    Mismatch")
                all_match[label] = False

    print("\n--- Summary ---")
    for label, _, _ in variants:
        if all_match[label]:
            print(f"All generated {label} hashes matched hashlib's output.")
        else:
            print(f"Some {label} hashes did not match hashlib's output.")


# Run the comparison
compare_hashes_for_random_words()

Comparing hashes for 50 random words...

Word 'lcfflabxpv' (Length: 10):
  SHA3-256:
    Custom: 26b05ebbca04355a2f964e6869e07a5ca020878e80b86a48e29e470c420672d4
    Hashlib: 26b05ebbca04355a2f964e6869e07a5ca020878e80b86a48e29e470c420672d4
    ✅ Match
  SHA3-224:
    Custom: b92ea2aa16ae19f2020a44931114bb8bd00dd347b6a91f7c2b2a7b29
    Hashlib: b92ea2aa16ae19f2020a44931114bb8bd00dd347b6a91f7c2b2a7b29
    ✅ Match
  SHA3-384:
    Custom: d3d4689701dffa526fa9a78ec1b8ca983224b6dc2af377d0895920fce13b572e928a0661879c4d0eb2cecaf4ccb2bf0f
    Hashlib: d3d4689701dffa526fa9a78ec1b8ca983224b6dc2af377d0895920fce13b572e928a0661879c4d0eb2cecaf4ccb2bf0f
    ✅ Match
  SHA3-512:
    Custom: 81719c4f3d131c24974fc025f3cf3c0b42f13a3be51640eacb6d47b70cbf634b443949909cdddc2c1622d314e3d706992d701e38190a78a8df513f4ac843411c
    Hashlib: 81719c4f3d131c24974fc025f3cf3c0b42f13a3be51640eacb6d47b70cbf634b443949909cdddc2c1622d314e3d706992d701e38190a78a8df513f4ac843411c
    ✅ Match
  SHAKE128 (256 bits):
    Custom: 