# Metadata

**L1 Taxonomy** - Security

**L2 Taxonomy** - Data Encryption

**Subtopic** - Computing a secure hash (SHA-256) of a message and verifying integrity

**Use Case** - Develop a Python module that computes a secure hash (SHA-256) of a given message and another function that verifies the integrity of the message by comparing it with the computed hash. The module should only use the Python standard library and not rely on any external dependencies.

**Programming Language** - Python

**Target Model** - o1

# Setup

```requirements.txt
```


# Prompt
Problem Statement:
- You are tasked with implementing a secure message verification system for a messaging platform.
- Each message must be hashed with the SHA-256 cryptographic hash function, and its integrity must be verified by a second function that uses constant-time comparison to prevent timing attacks.
- Please make sure not to include unnecessary explanations or extra details. Just provide the complete implementation from start to end as per the prompt.

Function signature:
```python
compute_sha256_hash(message: str) -> str
verify_sha256_hash(message: str, expected_hash: str) -> bool
```

Input Format:
- message (str): The input message (UTF-8 string of at most 10^5 characters).
- expected_hash (str): A 64-character lowercase hexadecimal string.

Output Format:
- compute_sha256_hash: hash_val -> The 64-character lowercase hexadecimal representation of the SHA-256 hash.
- verify_sha256_hash: Returns True if the SHA-256 hash of the input message matches expected_hash, otherwise False.

Example:
```python
msg = "Hello, world!"
hash_val = compute_sha256_hash(msg)
print("Hash:", hash_val)  # Output: Hash: 315f5bdb76d078c43b8ac0064e4a0164612b1fce77c869345bfc94c75894edd3

print(verify_sha256_hash(msg, hash_val))  # Output: True

print(verify_sha256_hash("Hacked message", hash_val))  # Output: False
```


# Requirements
Explicit and Implicit Requirements:
- compute_sha256_hash should compute the SHA-256 hash of the input message and return it as a lowercase hexadecimal string.
- verify_sha256_hash should verify the integrity of a message by checking whether its SHA-256 hash matches the given expected_hash.
- The comparison must be done in constant time to prevent timing attacks.
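
A minimal sketch of the constant-time comparison requirement, assuming the standard library's `hmac.compare_digest` is an acceptable way to meet it. The helper name `digests_match` is hypothetical and not part of the required interface.

```python
import hashlib
import hmac


def digests_match(computed_hex: str, expected_hex: str) -> bool:
    """Hypothetical helper: compare two ASCII hex digests.

    hmac.compare_digest is designed so that its running time does not
    reveal the position of the first differing character, unlike ==.
    """
    return hmac.compare_digest(computed_hex, expected_hex)


digest = hashlib.sha256("Hello, world!".encode("utf-8")).hexdigest()
same = digests_match(digest, digest)        # identical digests
different = digests_match(digest, "0" * 64)  # same length, different content
```

When both arguments are `str`, `hmac.compare_digest` accepts only ASCII text, which a hex digest always is.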

Solution Expectations:
- Both functions should only use the Python standard library and not rely on any external dependencies.
- compute_sha256_hash function should strictly return a string.
- verify_sha256_hash function should strictly return a boolean value.

Edge Cases and Behaviour:
- Empty String Message: An empty string is a valid input.
- Unicode and Non-ASCII Characters: Unicode and non-ASCII characters are valid input.
- Very Long String ("a" * 100000): Supported, since it is within the input constraints.
- Case sensitivity in hash matching: An expected_hash containing uppercase hex digits does not match and returns False.
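
The edge cases above can be sketched with `hashlib` alone. The function `sha256_hex` is a hypothetical stand-in for `compute_sha256_hash`, and the sample strings are illustrative assumptions.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Hypothetical stand-in for compute_sha256_hash: UTF-8 encode, then hash."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Empty string: valid input with a well-known digest.
empty_digest = sha256_hex("")

# Unicode / non-ASCII: encoding to UTF-8 first avoids encoding errors.
unicode_digest = sha256_hex("héllo, wörld ✓")

# Very long string at the upper bound of the input constraints.
long_digest = sha256_hex("a" * 100000)

# Case sensitivity: hexdigest() is lowercase, so an uppercased copy differs.
uppercase_differs = empty_digest.upper() != empty_digest
```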

Solution Constraints:
- Do not use outdated or deprecated functions.
- Use SHA-256 as the only hashing algorithm.
- verify_sha256_hash function should prevent attackers from learning partial matches by measuring response times.
- Ensure the compute_sha256_hash function can handle Unicode inputs without any errors.
- All inputs should come from function arguments and not from global variables or user input.
- Do not use insecure algorithms like MD5 or SHA1.
- verify_sha256_hash function should internally call compute_sha256_hash function instead of duplicating the logic.

In [None]:
# code
"""Module for computing and verifying SHA-256 hashes with constant-time comparison."""
import hashlib
import hmac
import re


def compute_sha256_hash(message: str) -> str:
    """
    Compute the SHA-256 hash of a given message.

    Args:
        message (str): The input message to hash.

    Returns:
        str: The SHA-256 hexadecimal hash of the message.
    """
    if not isinstance(message, str):
        raise TypeError("message must be a string")

    sha256 = hashlib.sha256()
    sha256.update(message.encode('utf-8'))
    return sha256.hexdigest()


def verify_sha256_hash(message: str, expected_hash: str) -> bool:
    """
    Verify that a message matches the expected SHA-256 hash.

    Uses constant-time comparison to prevent timing attacks.

    Args:
        message (str): The original message to verify.
        expected_hash (str): The expected SHA-256 hexadecimal hash.

    Returns:
        bool: True if the hash matches the expected hash else False.
    """
    if not isinstance(message, str):
        raise TypeError("message must be a string")
    if not isinstance(expected_hash, str):
        raise TypeError("expected_hash must be a string")
    if not re.fullmatch(r'[0-9a-f]{64}', expected_hash):
        return False

    actual_hash = compute_sha256_hash(message)
    return hmac.compare_digest(actual_hash, expected_hash)


if __name__ == "__main__":
    msg = "Hello, world!"
    hash_val = compute_sha256_hash(msg)
    print("Hash:", hash_val)

    is_valid = verify_sha256_hash(msg, hash_val)
    print("Integrity check:", "Passed" if is_valid else "Failed")

    is_valid_tampered = verify_sha256_hash("Hacked message", hash_val)
    print("Tampered check:", "Passed" if is_valid_tampered else "Failed")


In [None]:
# tests

"""Unittest cases for testing compute and verify function for SHA256 hashes"""

import unittest
from main import compute_sha256_hash, verify_sha256_hash

class TestSha256HashVerification(unittest.TestCase):
    """Unit tests for SHA-256 hash computation and verification functions."""

    def test_empty_string(self):
        """Test hashing and verifying an empty string."""
        msg = ""
        hash_val = compute_sha256_hash(msg)
        self.assertTrue(verify_sha256_hash(msg, hash_val))
        self.assertFalse(verify_sha256_hash(" ", hash_val))

    def test_basic_ascii_string(self):
        """Test basic ASCII string hashing and verification."""
        msg = "Hello, world!"
        hash_val = compute_sha256_hash(msg)
        self.assertTrue(verify_sha256_hash(msg, hash_val))
        self.assertFalse(verify_sha256_hash("hello, world!", hash_val))

    def test_long_string(self):
        """Test hashing and verification of a very long string."""
        msg = "a" * 100000
        hash_val = compute_sha256_hash(msg)
        self.assertTrue(verify_sha256_hash(msg, hash_val))
        self.assertFalse(verify_sha256_hash(msg + "b", hash_val))

    def test_case_sensitivity_hash(self):
        """Test hash verification with case difference in message."""
        msg = "TestCase"
        hash_val = compute_sha256_hash(msg)
        self.assertTrue(verify_sha256_hash(msg, hash_val))
        self.assertFalse(verify_sha256_hash(msg.lower(), hash_val))

    def test_expected_hash_case_sensitivity(self):
        """Verify that hash matching is case-sensitive on expected_hash."""
        msg = "CaseTest"
        hash_val = compute_sha256_hash(msg)
        self.assertFalse(verify_sha256_hash(msg, hash_val.upper()))
        self.assertTrue(verify_sha256_hash(msg, hash_val))

    def test_different_messages(self):
        """Verify different messages produce different hashes."""
        msg1 = "Message One"
        msg2 = "Message Two"
        hash1 = compute_sha256_hash(msg1)
        hash2 = compute_sha256_hash(msg2)
        self.assertNotEqual(hash1, hash2)
        self.assertTrue(verify_sha256_hash(msg1, hash1))
        self.assertFalse(verify_sha256_hash(msg1, hash2))
        self.assertTrue(verify_sha256_hash(msg2, hash2))
        self.assertFalse(verify_sha256_hash(msg2, hash1))

    def test_invalid_hash_length(self):
        """Verify that invalid expected hash length leads to False."""
        msg = "Test message"
        hash_val = compute_sha256_hash(msg)
        self.assertFalse(verify_sha256_hash(msg, "abc123"))  # too short
        self.assertFalse(verify_sha256_hash(msg, hash_val + "00"))  # too long

    def test_non_hex_characters_in_expected_hash(self):
        """Verify expected_hash with non-hex characters returns False."""
        msg = "Test message"
        invalid_hash = "g" * 64  # invalid hex characters
        self.assertFalse(verify_sha256_hash(msg, invalid_hash))

    def test_repeated_verification_consistency(self):
        """Verify repeated calls to verify return consistent results."""
        msg = "Consistent message"
        hash_val = compute_sha256_hash(msg)
        for _ in range(10):
            self.assertTrue(verify_sha256_hash(msg, hash_val))
            self.assertFalse(verify_sha256_hash(msg + "x", hash_val))

    def test_whitespace_variations(self):
        """Verify messages differing by whitespace fail verification."""
        msg1 = "Whitespace test"
        msg2 = "Whitespace  test"  # two spaces
        hash1 = compute_sha256_hash(msg1)
        self.assertTrue(verify_sha256_hash(msg1, hash1))
        self.assertFalse(verify_sha256_hash(msg2, hash1))

    def test_multiline_string(self):
        """Test hashing and verifying a multiline string."""
        msg = "Line one\nLine two\nLine three"
        hash_val = compute_sha256_hash(msg)
        self.assertTrue(verify_sha256_hash(msg, hash_val))
        self.assertFalse(verify_sha256_hash(msg.replace("\n", " "), hash_val))

    def test_message_with_numbers_and_symbols(self):
        """Test message containing numbers and symbols."""
        msg = "1234567890!@#$%^&*()_+-=[]{}|;:',.<>/?"
        hash_val = compute_sha256_hash(msg)
        self.assertTrue(verify_sha256_hash(msg, hash_val))
        self.assertFalse(verify_sha256_hash(msg + "!", hash_val))

    def test_message_with_only_spaces(self):
        """Test message consisting of only spaces."""
        msg = "     "
        hash_val = compute_sha256_hash(msg)
        self.assertTrue(verify_sha256_hash(msg, hash_val))
        self.assertFalse(verify_sha256_hash("", hash_val))

    def test_non_string_message_type(self):
        """Verify that passing non-string type raises an error."""
        with self.assertRaises(TypeError):
            compute_sha256_hash(12345)  # int instead of str


# Model Breaking Proof

#### Model Breaking Task URL: https://rlhf-v3.turing.com/prompt/22139050-1bc1-425e-8563-00f31748164c?origin=https%3A%2F%2Flabeling-ms.turing.com&redirect_url=https%3A%2F%2Flabeling-ms.turing.com%2Fconversations%2F95872%2Fview

#### Model code:

```python
# code generated by the model
import hashlib

def compute_sha256_hash(message: str) -> str:
    return hashlib.sha256(message.encode('utf-8')).hexdigest()

def verify_sha256_hash(message: str, expected_hash: str) -> bool:
    def constant_time_compare(val1: str, val2: str) -> bool:
        if len(val1) != len(val2):
            return False
        result = 0
        for x, y in zip(val1, val2):
            result |= ord(x) ^ ord(y)
        return result == 0
    # <Issue>: constant_time_compare is a hand-rolled attempt at constant-time comparison, but it remains vulnerable to timing attacks. A more robust standard-library function such as hmac.compare_digest should be used instead.

    computed_hash = compute_sha256_hash(message)
    
    return constant_time_compare(computed_hash, expected_hash)
```
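
One way to apply the fix named in the issue comment is sketched below. It is an illustration rather than the model's output. It assumes that encoding both digests to bytes before calling `hmac.compare_digest` is acceptable, so that a non-ASCII `expected_hash` yields False instead of raising `TypeError`.

```python
import hashlib
import hmac


def compute_sha256_hash(message: str) -> str:
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def verify_sha256_hash(message: str, expected_hash: str) -> bool:
    computed_hash = compute_sha256_hash(message)
    # Compare as bytes: compare_digest on str inputs rejects non-ASCII text,
    # while bytes-like inputs of any content are accepted.
    return hmac.compare_digest(
        computed_hash.encode("utf-8"),
        expected_hash.encode("utf-8"),
    )


known = "315f5bdb76d078c43b8ac0064e4a0164612b1fce77c869345bfc94c75894edd3"
ok = verify_sha256_hash("Hello, world!", known)
tampered = verify_sha256_hash("Hacked message", known)
non_ascii = verify_sha256_hash("Hello, world!", "é" * 64)
```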