## Problem 3 – Padding

Author: Michael Ferry  
Date: November 2025

This problem looks at how SHA-256 prepares messages before hashing.  
According to the Secure Hash Standard (FIPS PUB 180-4), every message must be
padded so that its total length is a multiple of 512 bits and it can be parsed
into **512-bit blocks**.  
Padding appends a single `1` bit, then enough `0` bits to leave room for the
length field, and finally the length of the original message in bits as a
64-bit big-endian integer.
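
To make the zero-fill count concrete: Section 5.1.1 of the standard defines the number of zero bits `k` as the smallest non-negative solution of `l + 1 + k ≡ 448 (mod 512)`, where `l` is the message length in bits. A minimal sketch of that arithmetic follows; `zero_bits` is a hypothetical helper name used only for illustration and is not part of the assignment.

```python
def zero_bits(l: int) -> int:
    """Smallest k >= 0 with l + 1 + k ≡ 448 (mod 512) (FIPS 180-4, Section 5.1.1)."""
    return (448 - (l + 1)) % 512


# "abc" is 24 bits long, so 423 zero bits follow the single 1 bit.
print(zero_bits(24))                 # 423

# Message + 1 bit + zero bits + 64-bit length field fills exactly one block.
print(24 + 1 + zero_bits(24) + 64)   # 512
```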

We’ll:

1. Write a generator function `block_parse(msg)` that accepts a bytes object.
2. Apply SHA-256 padding rules from Sections 5.1.1 and 5.2.1.
3. Ensure that the final block (or final two blocks) contain the correct padding.
4. Yield each 512-bit (64-byte) block as a bytes object.
5. Test the generator with messages of different sizes to confirm everything works.


### Step 1 – Start the `block_parse` generator

The first step is to set up the basic structure of the generator that will
split the message into 512-bit blocks. At this point the function applies no
padding; it simply yields the message unchanged so the generator can be tested
and expanded in the next steps.



In [4]:
def block_parse(msg: bytes):
    """
    Basic generator structure for processing a bytes message.
    The padding and block logic will be added in the next steps.
    """
    # Temporary placeholder output
    yield msg


# Quick check
list(block_parse(b"abc"))

[b'abc']

### Step 2 – Add SHA-256 padding

SHA-256 has a specific padding format that every message must follow before it
can be processed. The rules are:

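1. Append a single `1` bit. Because the input is a whole number of bytes, this is the byte `0x80` (a `1` bit followed by seven `0` bits).
2. Append `0x00` bytes until the total length is congruent to 56 modulo 64 bytes (448 modulo 512 bits).
3. Append the original message length in bits as a 64-bit big-endian integer.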

After this, the message will always split cleanly into 512-bit (64-byte) blocks.
This step adds the padding logic so the generator can output properly padded
blocks in the next step.
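
The rules above also determine when a second block is needed: the `0x80` byte and the 8-byte length field need 9 bytes, so a final partial block holding 56 or more message bytes spills into an extra block. The sketch below applies the three rules as byte arithmetic; `padded_len` is a hypothetical helper introduced only for this illustration.

```python
def padded_len(n: int) -> int:
    """Total length in bytes after padding an n-byte message."""
    zeros = (55 - n) % 64       # 0x00 bytes needed after the 0x80 byte
    return n + 1 + zeros + 8    # message + 0x80 + zeros + 64-bit length


# Columns: message bytes, padded bytes, number of 64-byte blocks
for n in (0, 3, 55, 56, 64):
    total = padded_len(n)
    print(n, total, total // 64)
# 0 64 1
# 3 64 1
# 55 64 1
# 56 128 2
# 64 128 2
```

A 55-byte message is the largest that still fits in a single block, which makes 55, 56, and 64 bytes useful sizes to test later.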


In [None]:
def block_parse(msg: bytes):
    """
    Pad a message according to the SHA-256 rules (padding only in this step).
    Splitting into 64-byte blocks is added in the next step.
    """

    # Length of original message in bits
    bit_len = len(msg) * 8

    # Step 1: append the 0x80 byte
    padded = msg + b"\x80"

    # Step 2: pad with zeros until length ≡ 56 mod 64
    # (computes the zero count directly instead of appending one byte at a time)
    padded += b"\x00" * ((56 - len(padded)) % 64)

    # Step 3: append original length as 64-bit big-endian integer
    padded += bit_len.to_bytes(8, "big")

    # Yield the whole padded message for now (block splitting in Step 3)
    yield padded


# Quick check
list(block_parse(b"abc"))


[b'abc\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x18']
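
The padded output is easier to check when split into its fields. The sketch below rebuilds the expected block for `b"abc"` by hand from the three padding rules rather than calling `block_parse`, so it can serve as an independent reference; the name `expected` is only illustrative.

```python
msg = b"abc"

expected = (
    msg                                    # 3 bytes of message
    + b"\x80"                              # the 1 bit followed by seven 0 bits
    + b"\x00" * 52                         # zero fill up to byte offset 56
    + (len(msg) * 8).to_bytes(8, "big")    # length in bits: 24 == 0x18
)

print(len(expected))                         # 64
print(expected[3] == 0x80)                   # True
print(int.from_bytes(expected[-8:], "big"))  # 24
```

In the notebook, comparing `expected` with `next(block_parse(b"abc"))` is a reasonable extra check on the padding logic.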