# 🛡️ SHA-1 Hash Implementation in Python (From Scratch)

Welcome to this Jupyter Notebook project, where we **build the SHA-1 cryptographic hash function from scratch** in Python! 🚀

## 📚 Overview

This notebook demonstrates a step-by-step construction of the SHA-1 algorithm, including:

- Bitwise operations and message preprocessing
- Message scheduling and block processing
- Implementation of SHA-1's core functions and rounds
- Final hash computation and validation

All code is written **from scratch** without using any external cryptographic libraries, providing a clear educational insight into how SHA-1 works under the hood.

## 🧩 Features

- 📦 **Pure Python**: No dependencies on external hash libraries.
- 📝 **Well-documented**: Each function and step is explained for clarity.
- 🧪 **Tested**: Includes test vectors to verify correctness.
- 🔍 **Debugging Aids**: Intermediate values and comments for learning and troubleshooting.

## 🏗️ Structure

- **Initialization**: SHA-1 constants and initial hash values.
- **Preprocessing**: Padding and length encoding of the input message.
- **Message Schedule**: Expansion of 16 initial 32-bit words to 80 words.
- **Main Loop**: 80 rounds of SHA-1 compression.
- **Final Output**: Concatenation of hash values to produce the final digest.

## 🧑‍💻 Usage

Simply run the notebook cells in order. You can test the implementation with your own input strings or use the provided test cases.

## 🧠 Learning Goals

- Understand the internal workings of SHA-1.
- Learn about bitwise operations, binary encoding, and modular arithmetic in cryptography.
- Gain experience in building cryptographic primitives from the ground up.

## ⚠️ Disclaimer

This implementation is for **educational purposes only**. Do **not** use it for production or security-critical applications: SHA-1 itself is no longer considered collision-resistant.

---

Happy hashing! ✨

In [1]:
h0 = 0x67452301
h1 = 0xEFCDAB89
h2 = 0x98BADCFE
h3 = 0x10325476
h4 = 0xC3D2E1F0

In [2]:
# Round constants K_t for rounds 0-19, 20-39, 40-59 and 60-79
kt1 = 0x5A827999
kt2 = 0x6ED9EBA1
kt3 = 0x8F1BBCDC
kt4 = 0xCA62C1D6
# Padding: with l the message length in bits, k is the smallest non-negative
# solution of  l + 1 + k ≡ 448 (mod 512)
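
The padding rule in the comment above can be checked numerically. This is a minimal sketch in which `zero_pad_bits` is a hypothetical helper name: for a message of `l` bits it returns the smallest non-negative `k`, and the loop shows how many 512-bit blocks the padded message occupies once the 64-bit length field is added.

```python
def zero_pad_bits(l):
    """Smallest k >= 0 such that (l + 1 + k) % 512 == 448."""
    return (448 - (l + 1)) % 512

for l in (0, 24, 447, 448):
    k = zero_pad_bits(l)
    # l + 1 + k + 64 is always a whole number of 512-bit blocks
    print(f"l={l:4d}  k={k:3d}  blocks={(l + 1 + k + 64) // 512}")
```

A 448-bit message (56 ASCII characters) leaves no room for the `1` bit before the length field, so `k` is 511 and the padded message spills into a second block.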

In [3]:
def number_to_binary(n):
    """Convert an integer to a binary string, zero-padded on the left to 64 bits."""
    if not isinstance(n, int):
        raise ValueError("Input must be an integer.")
    padded = (64 - len(bin(n)[2:])) * '0' + bin(n)[2:]  # Drop the '0b' prefix, then pad
    print(len(padded))  # Debug check: should print 64
    return padded

In [4]:
number_to_binary(2**64-1)

64


'1111111111111111111111111111111111111111111111111111111111111111'

In [5]:
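def preprocessing_input(input):
    """First draft: print only the padding suffix ('1', k zeros, 64-bit length).

    len(input) counts characters, so this is correct only for ASCII input.
    """
    n = len(input)
    k = (448 - (1 + n * 8)) % 512
    padding = '1' + k * '0' + number_to_binary(n * 8)
    print(padding)
    print(len(padding))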

In [6]:
def number_to_binary(n):
    """Convert an integer to a 64-bit binary string."""
    if not isinstance(n, int):
        raise ValueError("Input must be an integer.")
    return bin(n)[2:].zfill(64)  # Ensures exactly 64 bits

def preprocessing_input(input):
    """Prepares input for SHA-1 by adding padding and length encoding."""
    input_bytes = input.encode('utf-8')  # Convert to bytes
    n = len(input_bytes)  # Length in bytes
    # n=len(input)
    # print(n)
    message_bits = ''.join(format(byte, '08b') for byte in input_bytes)  # Convert each byte to 8-bit binary
    message_bits += '1'  # Append single '1' bit
    # Compute k (number of zero bits required)
    k = (448 - ((n * 8 + 1) % 512)) % 512
    message_bits += '0' * k  # Append k zero bits
    # Append original length in 64-bit binary
    message_bits += number_to_binary(n * 8)
    return message_bits
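
The same padding can be done on bytes rather than on a string of `'0'`/`'1'` characters, which is more compact for long inputs. The following is a sketch under that assumption; `pad_message_bytes` is a hypothetical name and is not used elsewhere in this notebook.

```python
def pad_message_bytes(data: bytes) -> bytes:
    """Apply SHA-1 padding to a byte string and return the padded bytes."""
    bit_length = len(data) * 8
    padded = data + b"\x80"                         # the '1' bit followed by seven '0' bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # zero-fill up to 56 bytes mod 64
    padded += bit_length.to_bytes(8, "big")         # 64-bit big-endian message length
    return padded

padded = pad_message_bytes(b"abc")
print(len(padded), padded[:4].hex(), padded[-8:].hex())
```

Because the message is byte-aligned, appending `0x80` covers the `1` bit and the first seven zero bits in one step.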

In [7]:
def message_schedule_16_blocks(padded_message_block):
    """
    Extracts W₀ to W₁₅ from a 512-bit padded message block and converts to hex.
    Assumes input is a string of '0' and '1' (binary representation).
    """
    W = []
    for i in range(16):  # 16 words of 32 bits each
        word = padded_message_block[i * 32: (i + 1) * 32]  # Extract 32-bit chunk
        word_int = int(word, 2)  # Convert binary string to integer
        # word_hex = hex(word_int)[2:].zfill(8)  # Convert integer to hex (8 characters)
        # W.append(word_hex)
        W.append(word_int)
    return W
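
If a block is held as 64 bytes instead of a bit string, the sixteen big-endian words can be read in one call with the standard `struct` module. This is a sketch under that assumption; `block_to_words` is a hypothetical name.

```python
import struct

def block_to_words(block: bytes) -> list:
    """Split a 64-byte block into sixteen 32-bit big-endian integers."""
    if len(block) != 64:
        raise ValueError("a SHA-1 block is exactly 64 bytes")
    return list(struct.unpack(">16I", block))

# "abc" padded by hand: message bytes, 0x80, zero fill, 64-bit length (24 bits)
block = b"abc" + b"\x80" + b"\x00" * 52 + (24).to_bytes(8, "big")
words = block_to_words(block)
print(hex(words[0]), words[15])
```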

In [8]:
def left_rotate(n, b):
    """Left rotate a 32-bit integer n by b bits."""
    return ((n << b) | (n >> (32 - b))) & 0xFFFFFFFF  # Ensure 32-bit result

def words_17_to_80(W):
    """
    Generates W₁₆ to W₇₉ using:
    W_t = (W_{t-3} ⊕ W_{t-8} ⊕ W_{t-14} ⊕ W_{t-16}) <<< 1
    """
    # W already holds integers; the new words are appended to it in place

    for i in range(16, 80):
        new_word = W[i-3] ^ W[i-8] ^ W[i-14] ^ W[i-16]  # XOR operation
        new_word = left_rotate(new_word, 1)  # Left rotate by 1
        W.append(new_word & 0xFFFFFFFF)  # Ensure it's 32-bit
    return W
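
The recurrence above only ever looks back 16 words, so the schedule can also be produced with a rolling 16-word window instead of an 80-element list. This is a sketch of that variant, not the approach used in this notebook; `rotl32` and `schedule_words` are hypothetical names.

```python
def rotl32(x, n):
    """Rotate a 32-bit integer left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def schedule_words(first16):
    """Yield W_0..W_79 while storing only the 16 most recent words."""
    window = list(first16)
    for t in range(80):
        if t >= 16:
            # W_t = ROTL1(W_{t-3} ^ W_{t-8} ^ W_{t-14} ^ W_{t-16}), indices taken mod 16
            window[t % 16] = rotl32(
                window[(t - 3) % 16] ^ window[(t - 8) % 16]
                ^ window[(t - 14) % 16] ^ window[(t - 16) % 16], 1)
        yield window[t % 16]

# Example input: fifteen zero words followed by a length word of 448
words = list(schedule_words([0] * 15 + [448]))
print(words[15:25])
```

Slot `t % 16` holds `W_{t-16}` just before it is overwritten, which is why the window never needs more than 16 entries.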

In [9]:
def f1(b, c, d):
    return (b & c) | ((~b) & d)  # Ch (choose function)

def f2(b, c, d):
    return b ^ c ^ d  # Parity function

def f3(b, c, d):
    return (b & c) | (b & d) | (c & d)  # Majority function

def f4(b, c, d):
    return b ^ c ^ d  # Parity function again
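
The names Ch, Parity and Maj are easiest to see on single bits. The sketch below restates the three functions under hypothetical names (`ch`, `maj`, `parity`) and prints their truth table: Ch picks `c` where `b` is 1 and `d` where `b` is 0, Maj returns the majority bit, and Parity is 1 when an odd number of inputs are 1.

```python
from itertools import product

def ch(b, c, d):
    return (b & c) | (~b & d)

def maj(b, c, d):
    return (b & c) | (b & d) | (c & d)

def parity(b, c, d):
    return b ^ c ^ d

print("b c d | Ch Maj Parity")
for b, c, d in product((0, 1), repeat=3):
    print(b, c, d, "|", ch(b, c, d), " ", maj(b, c, d), " ", parity(b, c, d))
```

In Python `~b` is a negative integer, but AND-ing it with a non-negative `d` still gives a non-negative result, so no extra 32-bit mask is needed here.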

In [10]:
def Nth_block_output(input, h0_temp, h1_temp, h2_temp, h3_temp, h4_temp):
    """Run the 80-round compression on one 512-bit block and return the updated state."""
    a, b, c, d, e = h0_temp, h1_temp, h2_temp, h3_temp, h4_temp
    W = words_17_to_80(message_schedule_16_blocks(input))
    for i in range(80):
        if 0 <= i <= 19:
            f = f1(b, c, d)
            k = kt1
        elif 20 <= i <= 39:
            f = f2(b, c, d)
            k = kt2
        elif 40 <= i <= 59:
            f = f3(b, c, d)
            k = kt3
        elif 60 <= i <= 79:
            f = f4(b, c, d)
            k = kt4
        # Calculate temp value (mod 2^32 to keep within 32 bits)
        temp = (left_rotate(a, 5) + f + e + k + W[i]) & 0xFFFFFFFF
        # Update values for next round
        e = d
        d = c
        c = left_rotate(b, 30)
        b = a
        a = temp
    # Add the working variables back into the incoming state (mod 2^32)
    h0_temp = (h0_temp + a) & 0xFFFFFFFF
    h1_temp = (h1_temp + b) & 0xFFFFFFFF
    h2_temp = (h2_temp + c) & 0xFFFFFFFF
    h3_temp = (h3_temp + d) & 0xFFFFFFFF
    h4_temp = (h4_temp + e) & 0xFFFFFFFF
    return h0_temp, h1_temp, h2_temp, h3_temp, h4_temp
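
The four-way `if`/`elif` chain can also be written as a lookup table indexed by `t // 20`. The sketch below is a self-contained, bytes-based variant of the compression step under that assumption; `ROUND_TABLE`, `rotl32`, `compress` and `INITIAL_STATE` are hypothetical names, not part of this notebook.

```python
import struct

ROUND_TABLE = (
    (lambda b, c, d: (b & c) | (~b & d), 0x5A827999),           # rounds 0-19: Ch
    (lambda b, c, d: b ^ c ^ d, 0x6ED9EBA1),                    # rounds 20-39: Parity
    (lambda b, c, d: (b & c) | (b & d) | (c & d), 0x8F1BBCDC),  # rounds 40-59: Maj
    (lambda b, c, d: b ^ c ^ d, 0xCA62C1D6),                    # rounds 60-79: Parity
)

def rotl32(x, n):
    """Rotate a 32-bit integer left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def compress(state, block):
    """Fold one 64-byte block into a 5-word state and return the new state."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        f, k = ROUND_TABLE[t // 20]
        temp = (rotl32(a, 5) + f(b, c, d) + e + k + w[t]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, rotl32(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))

INITIAL_STATE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
# "abc" padded by hand into a single block
block = b"abc" + b"\x80" + b"\x00" * 52 + (24).to_bytes(8, "big")
print("".join(f"{word:08x}" for word in compress(INITIAL_STATE, block)))
```

Taking the state as an argument and returning a new one makes chaining explicit: the output for block N is the input state for block N+1.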

In [11]:
def generalized_SHA1(input):
    """Return the SHA-1 digest of a string (hashed as UTF-8) as 40 hex characters."""
    # Start from the standard initial hash values. The round constants kt1..kt4
    # are read as globals inside Nth_block_output, so they are not redefined here.
    h0_temp = 0x67452301
    h1_temp = 0xEFCDAB89
    h2_temp = 0x98BADCFE
    h3_temp = 0x10325476
    h4_temp = 0xC3D2E1F0

    input_temp = preprocessing_input(input)  # Pad once and reuse the result
    for i in range(len(input_temp) // 512):
        # Each 512-bit block updates the state produced by the previous block
        h0_temp, h1_temp, h2_temp, h3_temp, h4_temp = Nth_block_output(
            input_temp[i * 512:(i + 1) * 512],
            h0_temp, h1_temp, h2_temp, h3_temp, h4_temp)

        # Print intermediate values in hex for debugging
        # print(f"After block {i+1}:")
        # print(f"h0 = {h0_temp:08x}")
        # print(f"h1 = {h1_temp:08x}")
        # print(f"h2 = {h2_temp:08x}")
        # print(f"h3 = {h3_temp:08x}")
        # print(f"h4 = {h4_temp:08x}")
        # print("-" * 50)  # Separator for readability

    # Convert each value to a zero-padded 8-character hex string before concatenating
    return ''.join(f"{h:08x}" for h in (h0_temp, h1_temp, h2_temp, h3_temp, h4_temp))

# Two-block test vector; the expected digest is 84983e441c3bd26ebaae4aa1f95129e5e54670f1
print(generalized_SHA1("abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq"))

# Non-ASCII input exercises the UTF-8 encoding path
# print(generalized_SHA1("PRASANNA IS A DIVINE HEAVENLY PERSON❤"))

# Optional: time the one-million-'a' test vector (slow in pure Python)
# import time
# t1 = time.perf_counter()
# print(generalized_SHA1("a" * 1000000))
# t2 = time.perf_counter()
# print(t2 - t1)

84983e441c3bd26ebaae4aa1f95129e5e54670f1


In [12]:
# One-million-'a' test vector: compare the computed digest with the known answer
print(generalized_SHA1("a" * 1000000) == "34aa973cd4c4daa4f61eeb2bdbad27316534016f")

True
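
The standard test vectors can be collected into a small known-answer table so that every digest is actually recomputed rather than compared by eye. This is a sketch: `KNOWN_ANSWERS`, `failing_vectors` and `reference` are hypothetical names, and `hashlib` stands in for the function under test only so the block runs on its own.

```python
import hashlib

# Well-known SHA-1 test vectors (message -> hex digest)
KNOWN_ANSWERS = {
    "": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
    "abc": "a9993e364706816aba3e25717850c26c9cd0d89d",
    "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq":
        "84983e441c3bd26ebaae4aa1f95129e5e54670f1",
    "a" * 1000000: "34aa973cd4c4daa4f61eeb2bdbad27316534016f",
}

def failing_vectors(sha1_func):
    """Return the messages whose computed digest differs from the known answer."""
    return [msg for msg, digest in KNOWN_ANSWERS.items() if sha1_func(msg) != digest]

# Stand-in for the function under test; in the notebook, pass generalized_SHA1 instead.
def reference(msg):
    return hashlib.sha1(msg.encode("utf-8")).hexdigest()

print(failing_vectors(reference))  # an empty list means every vector matched
```

With the pure-Python implementation the one-million-character vector takes noticeably longer than the others, so it can be dropped from the table for quick checks.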


In [13]:
input_2="abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq"

In [14]:
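padded_2 = preprocessing_input(input_2)  # The 56-character input pads to two 512-bit blocks
print(len(padded_2[512:]))  # Length of the second block
W11 = message_schedule_16_blocks(padded_2[512:])  # W0..W15 of the second block
W22 = words_17_to_80(W11)  # Extends W11 in place to all 80 words
print(W22)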

512
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 448, 0, 0, 896, 0, 0, 1792, 0, 896, 3584, 0, 0, 7168, 0, 3456, 14336, 1152, 0, 28672, 3584, 13824, 57344, 3584, 0, 118272, 0, 55296, 229376, 20224, 0, 462336, 60928, 226560, 917504, 57344, 13824, 1895936, 0, 913408, 3670016, 312832, 57344, 7400960, 974848, 3579392, 14680064, 972288, 57344, 30320128, 0, 14155776, 58777600, 5177344, 0, 118358016, 15597568, 58006528, 234881024, 14737408, 3553280, 485396480, 0, 233869312, 939581440, 80180224]


In [15]:
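# Intermediate hash values after the first block of input_2
# (the starting state for the second block, not the standard SHA-1 initial values)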
h0 = 0xf4286818
h1 = 0xc37b27ae
h2 = 0x0408f581
h3 = 0x84677148
h4 = 0x4a566572

# SHA-1 Constants
kt1 = 0x5A827999
kt2 = 0x6ED9EBA1
kt3 = 0x8F1BBCDC
kt4 = 0xCA62C1D6

def left_rotate(n, b):
    """Left rotate a 32-bit integer n by b bits."""
    return ((n << b) | (n >> (32 - b))) & 0xFFFFFFFF  # Ensure 32-bit result

# SHA-1 Functions
def f1(b, c, d):
    return (b & c) | ((~b) & d)  # Ch (choose function)

def f2(b, c, d):
    return b ^ c ^ d  # Parity function

def f3(b, c, d):
    return (b & c) | (b & d) | (c & d)  # Majority function

def f4(b, c, d):
    return b ^ c ^ d  # Parity function again

# SHA-1 Main Loop
a, b, c, d, e = h0, h1, h2, h3, h4

for i in range(80):
    if 0 <= i <= 19:
        f = f1(b, c, d)
        k = kt1    
    elif 20 <= i <= 39:
        f = f2(b, c, d)
        k = kt2
    elif 40 <= i <= 59:
        f = f3(b, c, d)
        k = kt3
    elif 60 <= i <= 79:
        f = f4(b, c, d)
        k = kt4

    # Calculate temp value (mod 2^32 to keep within 32 bits)
    temp = (left_rotate(a, 5) + f + e + k + W22[i]) & 0xFFFFFFFF

    # Update values for next round
    e = d
    d = c
    c = left_rotate(b, 30)
    b = a
    a = temp

# Final hash values (Add to initial hash values)
h0 = (h0 + a) & 0xFFFFFFFF
h1 = (h1 + b) & 0xFFFFFFFF
h2 = (h2 + c) & 0xFFFFFFFF
h3 = (h3 + d) & 0xFFFFFFFF
h4 = (h4 + e) & 0xFFFFFFFF
# print(h0,h1,h2,h3,h4)
# Final SHA-1 Hash (Concatenate final hash values)
final_hash = ' '.join(f'{x:08x}' for x in [h0, h1, h2, h3, h4])

print("SHA-1 Hash:", final_hash)
print(final_hash=='84983e44 1c3bd26e baae4aa1 f95129e5 e54670f1')

SHA-1 Hash: 84983e44 1c3bd26e baae4aa1 f95129e5 e54670f1
True
