<a href="https://colab.research.google.com/github/DikshantBadawadagi/Encryption-Algorithms/blob/main/SHA512%20Comparison%20MD5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import pandas as pd
import hashlib
import time

# Function to hash a message using MD5 and SHA-512
def hash_messages(message):
    md5_hash = hashlib.md5(message.encode()).hexdigest()
    sha512_hash = hashlib.sha512(message.encode()).hexdigest()
    return md5_hash, sha512_hash

# Test messages
messages = {
    "Short Message": "Hi",
    "Moderate Length Message": ('''Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files.'''),
    "Long Length Message": ('''Time taken for long length one page msg “Instructor: Paul Zandbergen Paul has a PhD from the University of British Columbia and has taught Geographic Information Systems, statistics and computer programming for 15 years. Computer systems face a number of security threats. Learn about different approaches to system security, including firewalls, data encryption, passwords and biometrics. Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files. System Security The objective of system security is the protection of information and property from theft, corruption and other types of damage, while allowing the information and property to remain accessible and productive. System security includes the development and implementation of security countermeasures. There are a number of different approaches to computer system security, including the use of a firewall, data encryption, passwords and biometrics. 
Firewall One widely used strategy to improve system security is to use a firewall. A firewall consists of software and hardware set up between an internal computer network and the Internet. A computer network manager sets up the rules for the firewall to filter out unwanted intrusions. These rules are set up in such a way that unauthorized access is much more difficult. A system administrator can decide, for example, that only users within the firewall can access particular files, or that those outside the firewall have limited capabilities to modify the files. You can also set up a firewall for your own computer, and on many computer systems, this is built into the operating system. Encryption One way to keep files and data safe is to use encryption. This is often used when data is transferred over the Internet, where it could potentially be seen by others. Encryption is the process of encoding messages so that it can only be viewed by authorized individuals. An encryption key is used to make the message unreadable, and a secret decryption key is used to decipher the message. Encryption is widely used in systems like e-commerce and Internet banking, where the databases contain very sensitive information. If you have made purchases online using a credit card, it is very likely that you've used encryption to do this. Passwords The most widely used method to prevent unauthorized access is to use passwords. A password is a string of characters used to authenticate a user to access a system. The password needs to be kept secret and is only intended for the specific user. In computer systems, each password is associated with a specific username since many individuals may be accessing the same system. Good passwords are essential to keeping computer systems secure. Unfortunately, many computer users don't use very secure passwords, such as the name of a family member or important dates - things that would be relatively easy to guess by a hacker. 
One of the most widely used passwords - you guessed it - 'password.' Definitely not a good password to use. So what makes for a strong password? ● Longer is better - A long password is much harder to break. The minimum length should be 8 characters, but many security experts have started recommending 12 characters or more. ● Avoid the obvious - A string like '0123456789' is too easy for a hacker, and so is 'LaDyGaGa'. You should also avoid all words from the dictionary. ● Mix it up - Use a combination of upper and lowercase and add special characters to make a password much stronger. A password like 'hybq4' is not very strong, but 'Hy%Bq&4$' is very strong. Remembering strong passwords can be challenging. One tip from security experts is to come up with a sentence that is easy to remember and to turn that into a password by using abbreviations and substitutions. For example, 'My favorite hobby is to play tennis' could become something like Mf#Hi$2Pt%. Regular users of computer systems have numerous user accounts. Just consider how many accounts you use on a regular basis: email, social networking sites, financial institutions, online shopping sites and so on. A regular user of various computer systems and web sites will have dozens of different accounts, each with a username and password. To make things a little bit easier on computer users, a number of different approaches have been developed.''')
}

# Create a comparison table
comparison_data = {
    "Criteria": ["Input Size", "Output Size", "Initialization Vector Size", "Versions Available in Market"],
    "MD5": ["Up to 2^64 bits", "128 bits (16 bytes)", "N/A", "Widely used, various libraries"],
    "SHA-512": ["Up to 2^128 bits", "512 bits (64 bytes)", "N/A", "Widely used, various libraries"]
}

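# Time the hashing of each message.
# Note: hash_messages() computes both digests in one call, so a single
# combined measurement is taken and recorded under both columns.
timing_results = []
for label, msg in messages.items():
    start_time = time.perf_counter()  # monotonic, high-resolution clock
    md5_hash, sha512_hash = hash_messages(msg)
    elapsed_time = time.perf_counter() - start_time
    timing_results.append({
        "Message Length": label,
        "MD5 Time (s)": elapsed_time,
        "SHA-512 Time (s)": elapsed_time  # same combined value, not a per-algorithm timing
    })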

# Create DataFrames for comparison
comparison_df = pd.DataFrame(comparison_data)
timing_df = pd.DataFrame(timing_results)

# Display the tables
print("Comparison Table:")
display(comparison_df)

print("\nTiming Results:")
display(timing_df)

# Additional comments
comments = """
###
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.
"""
print(comments)

# Avalanche Effect Example
msg1 = "Hi"
msg2 = "Ho"
md5_hash1, sha512_hash1 = hash_messages(msg1)
md5_hash2, sha512_hash2 = hash_messages(msg2)

print("\nAvalanche Effect Example:")
print(f"MD5 of '{msg1}': {md5_hash1} vs MD5 of '{msg2}': {md5_hash2}")
print(f"SHA-512 of '{msg1}': {sha512_hash1} vs SHA-512 of '{msg2}': {sha512_hash2}")

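# Generic birthday bound: about 2^(n/2) hash evaluations to find a collision in an n-bit digest.
# (Practical MD5 collision attacks are far cheaper than this generic bound.)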
birthday_attack_operations_md5 = 2**64
birthday_attack_operations_sha512 = 2**256

print("\nBirthday Attack Complexity:")
print(f"MD5 requires approximately {birthday_attack_operations_md5} operations.")
print(f"SHA-512 requires approximately {birthday_attack_operations_sha512} operations.")

# Strength of Hash Functions
strongest_hash_comment = """
SHA-512 is considered stronger than MD5: its 512-bit digest raises the generic collision bound from about 2^64 to about 2^256 operations.
MD5 has practical collision attacks and should not be used where collision resistance matters; no practical collision or preimage attack on full SHA-512 is publicly known.
"""
print("\nStrength of Hash Functions:")
print(strongest_hash_comment)


Comparison Table:


|   | Criteria | MD5 | SHA-512 |
|---|---|---|---|
| 0 | Input Size | Up to 2^64 bits | Up to 2^128 bits |
| 1 | Output Size | 128 bits (16 bytes) | 512 bits (64 bytes) |
| 2 | Initialization Vector Size | N/A | N/A |
| 3 | Versions Available in Market | Widely used, various libraries | Widely used, various libraries |
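The output sizes listed in the table can be checked directly against `hashlib`, whose hash objects expose `digest_size` and `block_size` in bytes. The following is a minimal sketch; the helper name `describe_hash` is illustrative and not part of the notebook.

```python
import hashlib

def describe_hash(name):
    """Return the digest and block sizes (in bits) for a hashlib algorithm name."""
    h = hashlib.new(name)
    return {
        "algorithm": name,
        "digest_bits": h.digest_size * 8,  # size of the hash output
        "block_bits": h.block_size * 8,    # size of the internal message block
    }

for name in ("md5", "sha512"):
    print(describe_hash(name))
```

Reading the sizes from the library avoids hard-coding them in the comparison table.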



Timing Results:


|   | Message Length | MD5 Time (s) | SHA-512 Time (s) |
|---|---|---|---|
| 0 | Short Message | 4.6e-05 | 4.6e-05 |
| 1 | Moderate Length Message | 1.4e-05 | 1.4e-05 |
| 2 | Long Length Message | 6.2e-05 | 6.2e-05 |
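Both timing columns above hold the same value because one combined measurement is recorded for the two algorithms. To compare them individually, each hash can be timed on its own and averaged over many runs, since a single call takes only microseconds. This is a sketch under assumptions: `time_hash`, `sample`, and the `repeats` default are illustrative choices, not taken from the notebook.

```python
import hashlib
import timeit

def time_hash(algorithm, data, repeats=1000):
    """Return the average time in seconds to hash `data` once with `algorithm`."""
    hasher = getattr(hashlib, algorithm)  # e.g. hashlib.md5 or hashlib.sha512
    total = timeit.timeit(lambda: hasher(data).digest(), number=repeats)
    return total / repeats

# Hypothetical sample input, roughly a page of text.
sample = ("Computer systems face a number of security threats. " * 100).encode()

for algorithm in ("md5", "sha512"):
    print(f"{algorithm}: {time_hash(algorithm, sample):.2e} s per hash")
```

Averaging over repeated runs smooths out scheduler noise and one-off start-up costs, which dominate a single measurement of a short message.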



###
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.


Avalanche Effect Example:
MD5 of 'Hi': c1a5298f939e87e8f962a5edfc206918 vs MD5 of 'Ho': 71aafd38484f3160708c6a6d2d5f736b
SHA-512 of 'Hi': 45ca55ccaa72b98b86c697fdf73fd364d4815a586f76cd326f1785bb816ff7f1f88b46fb8448b19356ee788eb7d300b9392709a289428070b5810d9b5c2d440d vs SHA-512 of 'Ho': 72a74c7218a99442cda474259cb6eb732cfd12dcd345553d6a65b8ff01ad1c58006ac2f2bad252c099d2a1f537df7b341031c9482a888361a1d9f6bf94558873
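The avalanche effect in the output above can be quantified by counting how many digest bits differ between the two inputs; for a well-behaved hash, roughly half of the bits are expected to flip. A minimal sketch, where `bit_difference` is an illustrative helper name:

```python
import hashlib

def bit_difference(algorithm, a, b):
    """Count the differing bits between the digests of strings `a` and `b`."""
    da = int.from_bytes(hashlib.new(algorithm, a.encode()).digest(), "big")
    db = int.from_bytes(hashlib.new(algorithm, b.encode()).digest(), "big")
    return bin(da ^ db).count("1")  # XOR leaves a 1 wherever the digests differ

for algorithm, bits in (("md5", 128), ("sha512", 512)):
    diff = bit_difference(algorithm, "Hi", "Ho")
    print(f"{algorithm}: {diff} of {bits} bits differ ({diff / bits:.0%})")
```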

Birthday Attack Complexity:
MD5 requires approximately 18446744073709551616 operations.
SHA-512 requires approximately 1157920892
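The figures above are generic birthday bounds: a collision in an n-bit digest is expected after about 2^(n/2) hash evaluations. The bound can be observed at a small scale by truncating a digest. The sketch below searches for two inputs whose SHA-512 digests agree in their first 24 bits, which should take on the order of 2^12 attempts. The function name `find_truncated_collision` and the choice of 24 bits are assumptions made for illustration.

```python
import hashlib
from itertools import count

def find_truncated_collision(bits=24):
    """Find two distinct inputs whose SHA-512 digests share the first `bits` bits."""
    seen = {}  # maps truncated digest -> first message that produced it
    for i in count():
        msg = str(i).encode()
        digest = int.from_bytes(hashlib.sha512(msg).digest(), "big")
        prefix = digest >> (512 - bits)  # keep only the leading `bits` bits
        if prefix in seen:
            return seen[prefix], msg, i + 1
        seen[prefix] = msg

first, second, tries = find_truncated_collision(24)
print(f"{first!r} and {second!r} collide on 24 bits after {tries} hashes (2^12 = {2**12})")
```

The same search against the full 512-bit digest would need about 2^256 evaluations, which is why it is considered infeasible.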

In [5]:
import pandas as pd
import hashlib
import time

# Function to hash a message using MD5 and SHA-512
def hash_messages(message):
    md5_hash = hashlib.md5(message.encode()).hexdigest()
    sha512_hash = hashlib.sha512(message.encode()).hexdigest()
    return md5_hash, sha512_hash

# Test messages
messages = {
    "Short Message": "Hi",
    "Moderate Length Message": ('''Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files.'''),
    "Long Length Message": ('''Time taken for long length one page msg “Instructor: Paul Zandbergen Paul has a PhD from the University of British Columbia and has taught Geographic Information Systems, statistics and computer programming for 15 years. Computer systems face a number of security threats. Learn about different approaches to system security, including firewalls, data encryption, passwords and biometrics. Security Threats Computer systems face a number of security threats. One of the basic threats is data loss, which means that parts of a database can no longer be retrieved. This could be the result of physical damage to the storage medium (like fire or water damage), human error or hardware failures. Another security threat is unauthorized access. Many computer systems contain sensitive information, and it could be very harmful if it were to fall in the wrong hands. Imagine someone getting a hold of your social security number, date of birth, address and bank information. Getting unauthorized access to computer systems is known as hacking. Computer hackers have developed sophisticated methods to obtain data from databases, which they may use for personal gain or to harm others. A third category of security threats consists of viruses and other harmful programs. A computer virus is a computer program that can cause damage to a computer's software, hardware or data. It is referred to as a virus because it has the capability to replicate itself and hide inside other computer files. System Security The objective of system security is the protection of information and property from theft, corruption and other types of damage, while allowing the information and property to remain accessible and productive. System security includes the development and implementation of security countermeasures. There are a number of different approaches to computer system security, including the use of a firewall, data encryption, passwords and biometrics. 
Firewall One widely used strategy to improve system security is to use a firewall. A firewall consists of software and hardware set up between an internal computer network and the Internet. A computer network manager sets up the rules for the firewall to filter out unwanted intrusions. These rules are set up in such a way that unauthorized access is much more difficult. A system administrator can decide, for example, that only users within the firewall can access particular files, or that those outside the firewall have limited capabilities to modify the files. You can also set up a firewall for your own computer, and on many computer systems, this is built into the operating system. Encryption One way to keep files and data safe is to use encryption. This is often used when data is transferred over the Internet, where it could potentially be seen by others. Encryption is the process of encoding messages so that it can only be viewed by authorized individuals. An encryption key is used to make the message unreadable, and a secret decryption key is used to decipher the message. Encryption is widely used in systems like e-commerce and Internet banking, where the databases contain very sensitive information. If you have made purchases online using a credit card, it is very likely that you've used encryption to do this. Passwords The most widely used method to prevent unauthorized access is to use passwords. A password is a string of characters used to authenticate a user to access a system. The password needs to be kept secret and is only intended for the specific user. In computer systems, each password is associated with a specific username since many individuals may be accessing the same system. Good passwords are essential to keeping computer systems secure. Unfortunately, many computer users don't use very secure passwords, such as the name of a family member or important dates - things that would be relatively easy to guess by a hacker. 
One of the most widely used passwords - you guessed it - 'password.' Definitely not a good password to use. So what makes for a strong password? ● Longer is better - A long password is much harder to break. The minimum length should be 8 characters, but many security experts have started recommending 12 characters or more. ● Avoid the obvious - A string like '0123456789' is too easy for a hacker, and so is 'LaDyGaGa'. You should also avoid all words from the dictionary. ● Mix it up - Use a combination of upper and lowercase and add special characters to make a password much stronger. A password like 'hybq4' is not very strong, but 'Hy%Bq&4$' is very strong. Remembering strong passwords can be challenging. One tip from security experts is to come up with a sentence that is easy to remember and to turn that into a password by using abbreviations and substitutions. For example, 'My favorite hobby is to play tennis' could become something like Mf#Hi$2Pt%. Regular users of computer systems have numerous user accounts. Just consider how many accounts you use on a regular basis: email, social networking sites, financial institutions, online shopping sites and so on. A regular user of various computer systems and web sites will have dozens of different accounts, each with a username and password. To make things a little bit easier on computer users, a number of different approaches have been developed.''')
}

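# Time the hashing of each message.
# Note: hash_messages() computes both digests in one call, so a single
# combined measurement is taken and recorded under both columns.
timing_results = []
for label, msg in messages.items():
    start_time = time.perf_counter()  # monotonic, high-resolution clock
    md5_hash, sha512_hash = hash_messages(msg)
    elapsed_time = time.perf_counter() - start_time
    timing_results.append({
        "Message Length": label,
        "MD5 Time (s)": elapsed_time,
        "SHA-512 Time (s)": elapsed_time  # same combined value, not a per-algorithm timing
    })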

# Create a comparison table
comparison_data = {
    "Criteria": ["Input Size", "Output Size", "Initialization Vector Size", "Versions Available in Market"],
    "MD5": ["Up to 2^64 bits", "128 bits (16 bytes)", "N/A", "Widely used, various libraries"],
    "SHA-512": ["Up to 2^128 bits", "512 bits (64 bytes)", "N/A", "Widely used, various libraries"]
}

comparison_df = pd.DataFrame(comparison_data)

# Convert timing results to a DataFrame
timing_df = pd.DataFrame(timing_results)

# Combine the comparison and timing tables side by side.
# The two frames have different indexes (0-3 vs. message labels), so the
# result is block-shaped: each row is filled by only one of the two tables.
final_df = pd.concat([comparison_df, timing_df.set_index('Message Length')], axis=1)

# Display the final combined table
print("Combined Table:")
display(final_df)


comments = """
###
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.
"""
print(comments)

# Avalanche Effect Example
msg1 = "Hi"
msg2 = "Ho"
md5_hash1, sha512_hash1 = hash_messages(msg1)
md5_hash2, sha512_hash2 = hash_messages(msg2)

print("\nAvalanche Effect Example:")
print(f"MD5 of '{msg1}': {md5_hash1} vs MD5 of '{msg2}': {md5_hash2}")
print(f"SHA-512 of '{msg1}': {sha512_hash1} vs SHA-512 of '{msg2}': {sha512_hash2}")

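# Generic birthday bound: about 2^(n/2) hash evaluations to find a collision in an n-bit digest.
# (Practical MD5 collision attacks are far cheaper than this generic bound.)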
birthday_attack_operations_md5 = 2**64
birthday_attack_operations_sha512 = 2**256

print("\nBirthday Attack Complexity:")
print(f"MD5 requires approximately {birthday_attack_operations_md5} operations.")
print(f"SHA-512 requires approximately {birthday_attack_operations_sha512} operations.")

# Strength of Hash Functions
strongest_hash_comment = """
SHA-512 is considered stronger than MD5: its 512-bit digest raises the generic collision bound from about 2^64 to about 2^256 operations.
MD5 has practical collision attacks and should not be used where collision resistance matters; no practical collision or preimage attack on full SHA-512 is publicly known.
"""
print("\nStrength of Hash Functions:")
print(strongest_hash_comment)



Combined Table:


|   | Criteria | MD5 | SHA-512 | MD5 Time (s) | SHA-512 Time (s) |
|---|---|---|---|---|---|
| 0 | Input Size | Up to 2^64 bits | Up to 2^128 bits | | |
| 1 | Output Size | 128 bits (16 bytes) | 512 bits (64 bytes) | | |
| 2 | Initialization Vector Size | N/A | N/A | | |
| 3 | Versions Available in Market | Widely used, various libraries | Widely used, various libraries | | |
| Short Message | | | | 4.2e-05 | 4.2e-05 |
| Moderate Length Message | | | | 1.4e-05 | 1.4e-05 |
| Long Length Message | | | | 0.000134 | 0.000134 |



###
- The performance difference for various lengths is minor, especially with modern hardware.
- Both algorithms are optimized for performance, but SHA-512 is more complex, leading to slightly longer processing times as the input size increases.
- The avalanche effect ensures that even a small change in input results in a drastically different hash output.
- MD5 is known to have vulnerabilities, while SHA-512 is currently considered more secure.


Avalanche Effect Example:
MD5 of 'Hi': c1a5298f939e87e8f962a5edfc206918 vs MD5 of 'Ho': 71aafd38484f3160708c6a6d2d5f736b
SHA-512 of 'Hi': 45ca55ccaa72b98b86c697fdf73fd364d4815a586f76cd326f1785bb816ff7f1f88b46fb8448b19356ee788eb7d300b9392709a289428070b5810d9b5c2d440d vs SHA-512 of 'Ho': 72a74c7218a99442cda474259cb6eb732cfd12dcd345553d6a65b8ff01ad1c58006ac2f2bad252c099d2a1f537df7b341031c9482a888361a1d9f6bf94558873

Birthday Attack Complexity:
MD5 requires approximately 18446744073709551616 operations.
SHA-512 requires approximately 1157920892