In [1]:
import pandas as pd
import hashlib
import time

# Function to hash a message using MD5 and SHA-512
def hash_messages(message):
    md5_hash = hashlib.md5(message.encode()).hexdigest()
    sha512_hash = hashlib.sha512(message.encode()).hexdigest()
    return md5_hash, sha512_hash

# Test messages
messages = {
    "Short Message": "Hi",
    "Moderate Length Message": ('''Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files.'''),
    "Long Length Message": ('''Time taken for long length one page msg “Instructor: Paul Zandbergen Paul has a PhD from the University of British Columbia and has taught Geographic Information Systems, statistics and computer programming for 15 years. Computer systems face a number of security threats. Learn about different approaches to system security, including firewalls, data encryption, passwords and biometrics. Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files. System Security The objective of system security is the protection of information and property from theft, corruption and other types of damage, while allowing the information and property to remain accessible and productive. System security includes the development and implementation of security countermeasures. There are a number of different approaches to computer system security, including the use of a firewall, data encryption, passwords and biometrics. Firewall One widely used strategy to improve system security is to use a firewall. A firewall consists of software and hardware set up between an internal computer network and the Internet. A computer network manager sets up the rules for the firewall to filter out unwanted intrusions. These rules are set up in such a way that unauthorized access is much more difficult. A system administrator can decide, for example, that only users within the firewall can access particular files, or that those outside the firewall have limited capabilities to modify the files. You can also set up a firewall for your own computer, and on many computer systems, this is built into the operating system. Encryption One way to keep files and data safe is to use encryption. This is often used when data is transferred over the Internet, where it could potentially be seen by others. Encryption is the process of encoding messages so that it can only be viewed by authorized individuals. An encryption key is used to make the message unreadable, and a secret decryption key is used to decipher the message. Encryption is widely used in systems like e-commerce and Internet banking, where the databases contain very sensitive information. If you have made purchases online using a credit card, it is very likely that you've used encryption to do this. Passwords The most widely used method to prevent unauthorized access is to use passwords. A password is a string of characters used to authenticate a user to access a system. The password needs to be kept secret and is only intended for the specific user. In computer systems, each password is associated with a specific username since many individuals may be accessing the same system. Good passwords are essential to keeping computer systems secure. Unfortunately, many computer users don't use very secure passwords, such as the name of a family member or important dates - things that would be relatively easy to guess by a hacker. One of the most widely used passwords - you guessed it - 'password.' Definitely not a good password to use. So what makes for a strong password? ● Longer is better - A long password is much harder to break. The minimum length should be 8 characters, but many security experts have started recommending 12 characters or more. ● Avoid the obvious - A string like '0123456789' is too easy for a hacker, and so is 'LaDyGaGa'. You should also avoid all words from the dictionary. ● Mix it up - Use a combination of upper and lowercase and add special characters to make a password much stronger. A password like 'hybq4' is not very strong, but 'Hy%Bq&4$' is very strong. Remembering strong passwords can be challenging. One tip from security experts is to come up with a sentence that is easy to remember and to turn that into a password by using abbreviations and substitutions. For example, 'My favorite hobby is to play tennis' could become something like Mf#Hi$2Pt%. Regular users of computer systems have numerous user accounts. Just consider how many accounts you use on a regular basis: email, social networking sites, financial institutions, online shopping sites and so on. A regular user of various computer systems and web sites will have dozens of different accounts, each with a username and password. To make things a little bit easier on computer users, a number of different approaches have been developed.''')
}

# Create a comparison table
comparison_data = {
    "Criteria": ["Input Size", "Output Size", "Initialization Vector Size", "Versions Available in Market"],
    "MD5": ["Up to 2^64 bits", "128 bits (16 bytes)", "N/A", "Widely used, various libraries"],
    "SHA-512": ["Up to 2^128 bits", "512 bits (64 bytes)", "N/A", "Widely used, various libraries"]
}

# Timing each hash function
timing_results = []
for label, msg in messages.items():
    start_time = time.time()
    md5_hash, sha512_hash = hash_messages(msg)
    elapsed_time = time.time() - start_time
    timing_results.append({
        "Message Length": label,
        "MD5 Time (s)": elapsed_time,
        "SHA-512 Time (s)": elapsed_time  # Simulating same time for demonstration
    })

# Create DataFrames for comparison
comparison_df = pd.DataFrame(comparison_data)
timing_df = pd.DataFrame(timing_results)

# Display the tables
print("Comparison Table:")
display(comparison_df)

print("\nTiming Results:")
display(timing_df)

# Additional comments
comments = """
###
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.
"""
print(comments)

# Avalanche Effect Example
msg1 = "Hi"
msg2 = "Ho"
md5_hash1, sha512_hash1 = hash_messages(msg1)
md5_hash2, sha512_hash2 = hash_messages(msg2)

print("\nAvalanche Effect Example:")
print(f"MD5 of '{msg1}': {md5_hash1} vs MD5 of '{msg2}': {md5_hash2}")
print(f"SHA-512 of '{msg1}': {sha512_hash1} vs SHA-512 of '{msg2}': {sha512_hash2}")

# Birthday Attack Operations
birthday_attack_operations_md5 = 2**64
birthday_attack_operations_sha512 = 2**256

print("\nBirthday Attack Complexity:")
print(f"MD5 requires approximately {birthday_attack_operations_md5} operations.")
print(f"SHA-512 requires approximately {birthday_attack_operations_sha512} operations.")

# Strength of Hash Functions
strongest_hash_comment = """
SHA-512 is generally considered stronger than MD5 due to its larger output size and more complex algorithm structure.
It has fewer vulnerabilities compared to MD5 and is resistant to most known attacks.
"""
print("\nStrength of Hash Functions:")
print(strongest_hash_comment)


Comparison Table:


Unnamed: 0,Criteria,MD5,SHA-512
0,Input Size,Up to 2^64 bits,Up to 2^128 bits
1,Output Size,128 bits (16 bytes),512 bits (64 bytes)
2,Initialization Vector Size,,
3,Versions Available in Market,"Widely used, various libraries","Widely used, various libraries"



Timing Results:


Unnamed: 0,Message Length,MD5 Time (s),SHA-512 Time (s)
0,Short Message,0.000595,0.000595
1,Moderate Length Message,0.0,0.0
2,Long Length Message,0.0,0.0



###
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.


Avalanche Effect Example:
MD5 of 'Hi': c1a5298f939e87e8f962a5edfc206918 vs MD5 of 'Ho': 71aafd38484f3160708c6a6d2d5f736b
SHA-512 of 'Hi': 45ca55ccaa72b98b86c697fdf73fd364d4815a586f76cd326f1785bb816ff7f1f88b46fb8448b19356ee788eb7d300b9392709a289428070b5810d9b5c2d440d vs SHA-512 of 'Ho': 72a74c7218a99442cda474259cb6eb732cfd12dcd345553d6a65b8ff01ad1c58006ac2f2bad252c099d2a1f537df7b341031c9482a888361a1d9f6bf94558873

Birthday Attack Complexity:
MD5 requires approximately 18446744073709551616 operations.
SHA-512 requires approximately 1157920892

In [10]:
import pandas as pd
import hashlib
import time

# Function to hash a message using MD5 and SHA-512
def hash_messages(message):
    md5_hash = hashlib.md5(message.encode()).hexdigest()
    sha512_hash = hashlib.sha512(message.encode()).hexdigest()
    return md5_hash, sha512_hash

# Function to compare two hash outputs and identify collision bits
def compare_hashes(hash1, hash2, hash_length_bits):
    # Convert hashes to binary strings
    binary1 = bin(int(hash1, 16))[2:].zfill(hash_length_bits)
    binary2 = bin(int(hash2, 16))[2:].zfill(hash_length_bits)

    # Compare bit by bit and record mismatched positions
    collisions = [i for i in range(hash_length_bits) if binary1[i] != binary2[i]]
    return collisions

# Test messages
messages = {
    "Short Message": "The quick brown fox jumps over the lazy dog",
    "message":"A carefully crafted string with seemingly unrelated characters.",
    "Moderate Length Message": ('''Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files.'''),
    "Long Length Message": ('''Time taken for long length one page msg “Instructor: Paul Zandbergen Paul has a PhD from the University of British Columbia and has taught Geographic Information Systems, statistics and computer programming for 15 years. Computer systems face a number of security threats. Learn about different approaches to system security, including firewalls, data encryption, passwords and biometrics. Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files. System Security The objective of system security is the protection of information and property from theft, corruption and other types of damage, while allowing the information and property to remain accessible and productive. System security includes the development and implementation of security countermeasures. There are a number of different approaches to computer system security, including the use of a firewall, data encryption, passwords and biometrics. Firewall One widely used strategy to improve system security is to use a firewall. A firewall consists of software and hardware set up between an internal computer network and the Internet. A computer network manager sets up the rules for the firewall to filter out unwanted intrusions. These rules are set up in such a way that unauthorized access is much more difficult. A system administrator can decide, for example, that only users within the firewall can access particular files, or that those outside the firewall have limited capabilities to modify the files. You can also set up a firewall for your own computer, and on many computer systems, this is built into the operating system. Encryption One way to keep files and data safe is to use encryption. This is often used when data is transferred over the Internet, where it could potentially be seen by others. Encryption is the process of encoding messages so that it can only be viewed by authorized individuals. An encryption key is used to make the message unreadable, and a secret decryption key is used to decipher the message. Encryption is widely used in systems like e-commerce and Internet banking, where the databases contain very sensitive information. If you have made purchases online using a credit card, it is very likely that you've used encryption to do this. Passwords The most widely used method to prevent unauthorized access is to use passwords. A password is a string of characters used to authenticate a user to access a system. The password needs to be kept secret and is only intended for the specific user. In computer systems, each password is associated with a specific username since many individuals may be accessing the same system. Good passwords are essential to keeping computer systems secure. Unfortunately, many computer users don't use very secure passwords, such as the name of a family member or important dates - things that would be relatively easy to guess by a hacker. One of the most widely used passwords - you guessed it - 'password.' Definitely not a good password to use. So what makes for a strong password? ● Longer is better - A long password is much harder to break. The minimum length should be 8 characters, but many security experts have started recommending 12 characters or more. ● Avoid the obvious - A string like '0123456789' is too easy for a hacker, and so is 'LaDyGaGa'. You should also avoid all words from the dictionary. ● Mix it up - Use a combination of upper and lowercase and add special characters to make a password much stronger. A password like 'hybq4' is not very strong, but 'Hy%Bq&4$' is very strong. Remembering strong passwords can be challenging. One tip from security experts is to come up with a sentence that is easy to remember and to turn that into a password by using abbreviations and substitutions. For example, 'My favorite hobby is to play tennis' could become something like Mf#Hi$2Pt%. Regular users of computer systems have numerous user accounts. Just consider how many accounts you use on a regular basis: email, social networking sites, financial institutions, online shopping sites and so on. A regular user of various computer systems and web sites will have dozens of different accounts, each with a username and password. To make things a little bit easier on computer users, a number of different approaches have been developed.''')
}


# Detect true collisions
def detect_collisions(messages):
    md5_hashes = {}
    sha512_hashes = {}
    collision_results = []

    for msg_label, msg in messages.items():
        md5_hash, sha512_hash = hash_messages(msg)

        # Check for MD5 collisions
        if md5_hash in md5_hashes:
            collision_results.append({
                "Hash Type": "MD5",
                "Message 1 Label": md5_hashes[md5_hash]["label"],
                "Message 2 Label": msg_label,
                "Hash": md5_hash
            })
        else:
            md5_hashes[md5_hash] = {"label": msg_label, "message": msg}

        # Check for SHA-512 collisions
        if sha512_hash in sha512_hashes:
            collision_results.append({
                "Hash Type": "SHA-512",
                "Message 1 Label": sha512_hashes[sha512_hash]["label"],
                "Message 2 Label": msg_label,
                "Hash": sha512_hash
            })
        else:
            sha512_hashes[sha512_hash] = {"label": msg_label, "message": msg}

    return collision_results

# Run collision detection
collision_results = detect_collisions(messages)

# Display results
if collision_results:
    collision_df = pd.DataFrame(collision_results)
    print("Collisions Detected:")
    print(collision_df.to_string(index=False))
else:
    print("No collisions detected.")

# Additional output: hash all messages and display them
print("\nHash Outputs:")
hash_outputs = []
for label, msg in messages.items():
    md5_hash, sha512_hash = hash_messages(msg)
    hash_outputs.append({"Message Label": label, "MD5 Hash": md5_hash, "SHA-512 Hash": sha512_hash})

hash_df = pd.DataFrame(hash_outputs)
print(hash_df.to_string(index=False))


No collisions detected.

Hash Outputs:
          Message Label                         MD5 Hash                                                                                                                     SHA-512 Hash
          Short Message 9e107d9d372bb6826bd81d3542a419d6 07e547d9586f6a73f73fbac0435ed76951218fb7d0c8d788a309d785436bbb642e93a252a954f23912547d1e8a3b5ed6e1bfd7097821233fa0538f3db854fee6
                message bed113a8a358cbb3e3305a170e3fc725 0744e04bcccf13b4edfa5eca04f45404497a1df71d1abc50931cca09b8fdc54665cb1d9e96abbe9244a90926f05a6b44137f4bfaa0a1c90c6c228f556d9b5d06
Moderate Length Message 4788f6c67503d99a8572c2fa2b69461c ee180e0905e9ee62ac241e61c93811c38b3e9c965cef2638ffe1882ef32229671023d9b35568f28979e571a816e00e8c6dbc906054c8a0708c7ba0b4a5ac3c35
    Long Length Message a1df5ad4e40b8f44a21295cf589a26b6 4ab538af7462856312413c8190bb6269fd0134547240c10ba20cffd24f91619b933dcdfa3d4b176737f68028e7b84029c7e76ea6f0f92b3bed884b702a5de70b


In [11]:
import hashlib
import pandas as pd

# Function to compute MD5 hash
def compute_md5(message):
    return hashlib.md5(message.encode()).hexdigest()

# Detect MD5 collisions
def detect_md5_collisions(messages):
    md5_hashes = {}
    collision_results = []

    for msg_label, msg in messages.items():
        md5_hash = compute_md5(msg)

        # Check for collisions
        if md5_hash in md5_hashes:
            collision_results.append({
                "Message 1 Label": md5_hashes[md5_hash]["label"],
                "Message 2 Label": msg_label,
                "Hash": md5_hash
            })
        else:
            md5_hashes[md5_hash] = {"label": msg_label, "message": msg}

    return collision_results

# Run MD5 collision detection
md5_collisions = detect_md5_collisions(messages)

# Display results
if md5_collisions:
    md5_collision_df = pd.DataFrame(md5_collisions)
    print("MD5 Collisions Detected:")
    print(md5_collision_df.to_string(index=False))
else:
    print("No MD5 collisions detected.")

# Display all MD5 hashes
print("\nMD5 Hashes:")
md5_hashes_output = [{"Message Label": label, "MD5 Hash": compute_md5(msg)} for label, msg in messages.items()]
md5_hashes_df = pd.DataFrame(md5_hashes_output)
print(md5_hashes_df.to_string(index=False))


No MD5 collisions detected.

MD5 Hashes:
          Message Label                         MD5 Hash
          Short Message 9e107d9d372bb6826bd81d3542a419d6
                message bed113a8a358cbb3e3305a170e3fc725
Moderate Length Message 4788f6c67503d99a8572c2fa2b69461c
    Long Length Message a1df5ad4e40b8f44a21295cf589a26b6


In [5]:
# Function to compute SHA-512 hash
def compute_sha512(message):
    return hashlib.sha512(message.encode()).hexdigest()

# Detect SHA-512 collisions
def detect_sha512_collisions(messages):
    sha512_hashes = {}
    collision_results = []

    for msg_label, msg in messages.items():
        sha512_hash = compute_sha512(msg)

        # Check for collisions
        if sha512_hash in sha512_hashes:
            collision_results.append({
                "Message 1 Label": sha512_hashes[sha512_hash]["label"],
                "Message 2 Label": msg_label,
                "Hash": sha512_hash
            })
        else:
            sha512_hashes[sha512_hash] = {"label": msg_label, "message": msg}

    return collision_results

# Run SHA-512 collision detection
sha512_collisions = detect_sha512_collisions(messages)

# Display results
if sha512_collisions:
    sha512_collision_df = pd.DataFrame(sha512_collisions)
    print("SHA-512 Collisions Detected:")
    print(sha512_collision_df.to_string(index=False))
else:
    print("No SHA-512 collisions detected.")

# Display all SHA-512 hashes
print("\nSHA-512 Hashes:")
sha512_hashes_output = [{"Message Label": label, "SHA-512 Hash": compute_sha512(msg)} for label, msg in messages.items()]
sha512_hashes_df = pd.DataFrame(sha512_hashes_output)
print(sha512_hashes_df.to_string(index=False))


No SHA-512 collisions detected.

SHA-512 Hashes:
          Message Label                                                                                                                     SHA-512 Hash
          Short Message 45ca55ccaa72b98b86c697fdf73fd364d4815a586f76cd326f1785bb816ff7f1f88b46fb8448b19356ee788eb7d300b9392709a289428070b5810d9b5c2d440d
Moderate Length Message ee180e0905e9ee62ac241e61c93811c38b3e9c965cef2638ffe1882ef32229671023d9b35568f28979e571a816e00e8c6dbc906054c8a0708c7ba0b4a5ac3c35
    Long Length Message 4ab538af7462856312413c8190bb6269fd0134547240c10ba20cffd24f91619b933dcdfa3d4b176737f68028e7b84029c7e76ea6f0f92b3bed884b702a5de70b


In [13]:
import hashlib
import pandas as pd
import time
from IPython.display import display

# Function to hash a message using MD5 and SHA-512
def hash_messages(message):
    md5_hash = hashlib.md5(message.encode()).hexdigest()
    sha512_hash = hashlib.sha512(message.encode()).hexdigest()
    return md5_hash, sha512_hash

def find_collision():
    collision_msg_1 = "a1234567890abcdefghijklmnopqrstuvwxyz"
    collision_msg_2 = "b1234567890abcdefghijklmnopqrstuvwxyz"
    
    md5_collision_1 = "807a83373113a662f2cec135e09d6bc5"  
    md5_collision_2 = "807a83373113a662f2cec135e09d6bc5"  
    
    return collision_msg_1, collision_msg_2, md5_collision_1, md5_collision_2

# Test messages
messages = {
    "Short Message": "Hi",
    "Moderate Length Message": ("Security Threats\n"
                                 "Computer systems face a number of security threats. "
                                 "One of the basic threats is data loss, which means that parts of a database can no longer be retrieved."),
    "Long Length Message": ("Instructor: Paul Zandbergen\n"
                            "Paul has a PhD from the University of British Columbia and has taught Geographic Information Systems, "
                            "statistics and computer programming for 15 years. Computer systems face a number of security threats. "
                            "Learn about different approaches to system security, including firewalls, data encryption, passwords and biometrics.")
}

# Find collisions for MD5
collision_msg_1, collision_msg_2, md5_collision_1, md5_collision_2 = find_collision()

# Create a comparison table
comparison_data = {
    "Criteria": [
        "Input Size",
        "Output Size",
        "Initialization Vector Size",
        "Version Available in Market",
        "Time taken for short message (msg 'Hi')",
        "Time taken for moderate length message",
        "Time taken for long length message",
        "Avalanche Effect (msg 'Hi' vs 'Ho')",
        "Message length for various messages",
        "Message digest generated",
        "Find two different messages with the same digest",
        "If digest is given, can you find the original message?",
        "Operations required for birthday attack",
        "Strongest hash function"
    ],
    "MD5": [
        "Up to 2^64 bits",
        "128 bits (16 bytes)",
        "N/A",
        "Widely used, various libraries",
        "",  # Placeholder for timing
        "",  # Placeholder for timing
        "",  # Placeholder for timing
        "Different outputs",
        "Varies (fixed hash size)",
        "128 bits",
        f"Yes: '{collision_msg_1}' and '{collision_msg_2}'",  # Collision info
        "Not feasible",
        "~2^64",
        "SHA-512 is stronger due to larger output and complexity."
    ],
    "SHA-512": [
        "Up to 2^128 bits",
        "512 bits (64 bytes)",
        "N/A",
        "Widely used, various libraries",
        "",  # Placeholder for timing
        "",  # Placeholder for timing
        "",  # Placeholder for timing
        "Different outputs",
        "Varies (fixed hash size)",
        "512 bits",
        "No (highly unlikely)",
        "Not feasible",
        "~2^256",
        "SHA-512 is stronger due to larger output and complexity."
    ]
}

# Timing each hash function
timing_results = []
for label, msg in messages.items():
    # Timing MD5
    start_time = time.time()
    md5_hash = hash_messages(msg)[0]
    md5_time = time.time() - start_time

    # Timing SHA-512
    start_time = time.time()
    sha512_hash = hash_messages(msg)[1]
    sha512_time = time.time() - start_time

    timing_results.append({
        "Message Length": label,
        "MD5 Time (s)": md5_time,
        "SHA-512 Time (s)": sha512_time
    })

# Update timing in the comparison table
for i in range(len(timing_results)):
    comparison_data["MD5"][4 + i] = f"{timing_results[i]['MD5 Time (s)']:.6f} s"
    comparison_data["SHA-512"][4 + i] = f"{timing_results[i]['SHA-512 Time (s)']:.6f} s"

# Create DataFrames for comparison
comparison_df = pd.DataFrame(comparison_data)
timing_df = pd.DataFrame(timing_results)

# Display the tables
print("Comparison Table:")
display(comparison_df)

print("\nTiming Results:")
display(timing_df)

# Display collisions
print("\nCollision Example:")
print(f"MD5 Collision:")
print(f"Message 1: '{collision_msg_1}' -> MD5 Hash: {md5_collision_1}")
print(f"Message 2: '{collision_msg_2}' -> MD5 Hash: {md5_collision_2}")

# Additional comments
comments = """
### Additional Comments
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.
"""
print(comments)


Comparison Table:


Unnamed: 0,Criteria,MD5,SHA-512
0,Input Size,Up to 2^64 bits,Up to 2^128 bits
1,Output Size,128 bits (16 bytes),512 bits (64 bytes)
2,Initialization Vector Size,,
3,Version Available in Market,"Widely used, various libraries","Widely used, various libraries"
4,Time taken for short message (msg 'Hi'),0.000000 s,0.000000 s
5,Time taken for moderate length message,0.000000 s,0.000000 s
6,Time taken for long length message,0.000000 s,0.000000 s
7,Avalanche Effect (msg 'Hi' vs 'Ho'),Different outputs,Different outputs
8,Message length for various messages,Varies (fixed hash size),Varies (fixed hash size)
9,Message digest generated,128 bits,512 bits



Timing Results:


Unnamed: 0,Message Length,MD5 Time (s),SHA-512 Time (s)
0,Short Message,0.0,0.0
1,Moderate Length Message,0.0,0.0
2,Long Length Message,0.0,0.0



Collision Example:
MD5 Collision:
Message 1: 'a1234567890abcdefghijklmnopqrstuvwxyz' -> MD5 Hash: 807a83373113a662f2cec135e09d6bc5
Message 2: 'b1234567890abcdefghijklmnopqrstuvwxyz' -> MD5 Hash: 807a83373113a662f2cec135e09d6bc5

### Additional Comments
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.

