# Homework 2

## Exercise 1:

**1. Suppose a password is chosen as a concatenation of seven lower-case dictionary words. Each word is selected uniformly at random from a dictionary of size 50,000. An example of such a password is "mothercathousefivenextcrossroom". How many bits of entropy does this have?**

In [1]:
import math

# Dictionary size
dictionary_size = 50000

# Number of words in the password
num_words = 7

# Calculate the total possibilities
total_possibilities = dictionary_size ** num_words

# Calculate the entropy in bits
entropy_bits = math.log2(total_possibilities)

print("Entropy (bits):", entropy_bits)

Entropy (bits): 109.26748332105768


**2. Consider an alternative scheme where a password is chosen as a sequence of 10 random alphanumeric characters (including both lower-case and upper-case letters). An example is "dA3mG67Rrs". How many bits of entropy does this have?**

In [2]:
# Number of possible characters for each position
num_options = 62  # 26 lowercase letters + 26 uppercase letters + 10 digits

# Number of characters in the password
num_characters = 10

# Calculate the total possibilities
total_possibilities = num_options ** num_characters

# Calculate the entropy in bits
entropy_bits = math.log2(total_possibilities)

print("Entropy (bits):", entropy_bits)

Entropy (bits): 59.54196310386875


**3. Which password is better, the one from 1. or 2.?**

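Password 1 is better: it has about 109.3 bits of entropy versus about 59.5 bits for password 2, so an attacker who knows both schemes needs roughly 2^50 times as many guesses on average to find it.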
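To make the comparison concrete, the sketch below recomputes both entropies with a small helper and estimates the average brute-force time. The helper names and the guessing rate of 10^12 guesses per second are illustrative assumptions, not part of the exercise.

```python
import math

def entropy_bits_for(num_options: int, length: int) -> float:
    """Entropy in bits of `length` independent, uniform picks from `num_options` choices."""
    return length * math.log2(num_options)

# Assumed attacker speed (hypothetical figure, for illustration only)
GUESSES_PER_SECOND = 1e12
SECONDS_PER_YEAR = 365 * 24 * 3600

def average_years_to_guess(bits: float) -> float:
    """On average an attacker has to search half of the 2**bits candidates."""
    return (2 ** bits / 2) / GUESSES_PER_SECOND / SECONDS_PER_YEAR

passphrase_bits = entropy_bits_for(50_000, 7)   # seven dictionary words
random_chars_bits = entropy_bits_for(62, 10)    # ten alphanumeric characters

print(f"Passphrase:   {passphrase_bits:.2f} bits, ~{average_years_to_guess(passphrase_bits):.3g} years")
print(f"Random chars: {random_chars_bits:.2f} bits, ~{average_years_to_guess(random_chars_bits):.3g} years")
```

Both figures assume the attacker knows how the password was generated and only has to guess the random choices.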

## Exercise 2:

**1. Design a data verification system using hash functions. Explain the steps involved in the process.**

<img src="./exercise2.png" alt="Image" width="700" height="450">

1. Choose hash function: Select a cryptographic hash function known for its collision resistance and pre-image resistance, for example SHA-256 or SHA-3.

2. Data hashing: When data is created or modified, apply the chosen hash function to generate a fixed-length hash value from the data. This value (the digest) acts as the data's fingerprint. It is not a digital signature, because anyone can compute it without a key.

3. Transmit or store data and its hash: Store or send the data together with its hash value. To detect deliberate tampering, the hash must be protected separately from the data, for example by keeping it in a secure location or by sending it to the authorized verifier over a trusted channel.

4. Data Retrieval or Verification:  When the data needs to be retrieved or verified, recalculate the hash value from the retrieved data.

5. Comparison: Compare the recalculated hash value with the original hash value (the one sent or stored). If they match, the data is considered intact. It can be considered authentic only if the original hash came from a trusted source, since anyone can recompute an unkeyed hash. If they differ, the data has been tampered with or corrupted.

6. Respond to Mismatch:  If the hash values do not match, take appropriate action, such as rejecting the data, investigating possible tampering, or seeking a trusted source for the correct data.

7. Periodic Verification:  For stored data, perform regular verification to ensure its integrity over time.
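The hashing, recalculation, and comparison steps above can be sketched with Python's standard library. This is a minimal illustration, assuming SHA-256 and in-memory data; the function names and sample data are made up for the example.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Step 2: compute the fixed-length digest of the data."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, expected_hex: str) -> bool:
    """Steps 4-5: recompute the digest and compare it with the stored one."""
    # compare_digest takes the same time wherever the first difference is
    return hmac.compare_digest(sha256_hex(data), expected_hex)

original = b"quarterly report v1"
stored_hash = sha256_hex(original)          # step 3: keep this somewhere trusted

print(verify(original, stored_hash))                 # True
print(verify(b"quarterly report v2", stored_hash))   # False
```

The check is only as trustworthy as the place `stored_hash` comes from: if an attacker can replace both the data and the stored hash, the comparison still succeeds.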


Additional Considerations:

Salting (Optional): In some cases, you may add a salt (a random value) to the data before hashing it. The salt is stored alongside the data and used during verification. This technique is often used in password hashing to defend against rainbow table attacks.

Keyed Hashes: Some systems use a secret key along with the data for hashing (HMAC, for example). The key adds an additional layer of security by making it challenging for unauthorized parties to generate valid hash values.

Security: Ensure the secure transmission or storage of the hash values to prevent attackers from modifying the data and the corresponding hash.

Secure Hash Function: Always choose a cryptographically secure hash function that is resistant to common attacks, such as collision attacks.

Documentation: Document the process and keep records of the data and hash values to ensure data integrity.
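The salting consideration can be illustrated with PBKDF2 from the standard library, which combines a random salt with a keyed, iterated hash. This is a sketch under assumptions: the function names, the 16-byte salt, and the iteration count are illustrative choices, not values given in the exercise.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor; tune for the target hardware

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); the salt is stored next to the digest."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check_password(password: str, salt: bytes, expected_digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected_digest)

salt, digest = hash_password("correct horse")
print(check_password("correct horse", salt, digest))  # True
print(check_password("wrong guess", salt, digest))    # False
```

Because every password gets its own salt, two users with the same password end up with different digests, which is what defeats precomputed tables.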


**2. Discuss the advantages and disadvantages of using hash functions for data verification.**

- Advantages:

Data Integrity Verification: Hash functions are excellent for detecting any changes or corruption in data. If the hash value of the received data matches the expected hash value, it indicates that the data is likely intact and has not been tampered with.

Speed: Hash functions are computationally efficient and fast. Verifying data integrity using hash functions can be done quickly, making them suitable for real-time or near-real-time applications.

Deterministic: Hash functions produce consistent results for the same input. This property ensures that the verification process is repeatable and consistent.

Fixed-Length Output: Hash functions generate a fixed-length hash value, regardless of the input size. This makes it easy to handle and store hash values.

Checksums and Data Deduplication: Hash functions are used in checksums and data deduplication to identify duplicate data quickly. This is valuable for saving storage space and optimizing data transfer.

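Password Storage: Combined with a per-password salt and a deliberately slow construction (e.g., PBKDF2, bcrypt, scrypt, or Argon2), hash functions are used to store and verify passwords without keeping the plaintext. This prevents attackers who obtain the stored values from easily determining the original passwords.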

- Disadvantages:

No Data Recovery: Hash functions are one-way functions, meaning they cannot be reversed to obtain the original data. While this is advantageous for security, it also means that you cannot recover the original data from the hash.

Collision Vulnerabilities: Because the output has a fixed length, different inputs that produce the same hash value (collisions) must exist. For a secure hash function they are infeasible to find, but practical collisions have been demonstrated for older functions such as MD5 and SHA-1, and attackers can exploit them to undermine data integrity.

No Localization of Changes: Because of the avalanche effect, a single-bit change in the input results in a completely different hash value. A mismatch therefore shows that something changed, but not where or how much, which makes it hard to pinpoint minor alterations in large datasets unless the data is hashed in smaller blocks.

Pre-image Attacks: Weak or broken hash functions may be vulnerable to pre-image attacks, where an attacker finds an input that produces a given hash value. Even with a secure hash function, inputs drawn from a small set (such as short passwords) can be recovered by hashing every candidate.

Protection of Hash Values: The security of the verification process heavily depends on the protection of the hash values. If an attacker can modify both the data and its hash, the verification process is compromised.

Dependence on Hash Function Security: The security of the entire data verification process depends on the cryptographic strength of the chosen hash function. Weak hash functions can lead to vulnerabilities.
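The point about single-bit changes can be observed directly. The sketch below flips one bit of a message and counts how many of the 256 output bits of SHA-256 differ; the message and the helper name are illustrative assumptions.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = b"transfer 100 EUR to account 12345"
# Flip the lowest bit of the first byte to get a one-bit change
altered = bytes([message[0] ^ 0x01]) + message[1:]

digest_a = hashlib.sha256(message).digest()
digest_b = hashlib.sha256(altered).digest()

print("Input bits changed: ", differing_bits(message, altered))
print("Digest bits changed:", differing_bits(digest_a, digest_b), "of 256")
```

For a well-behaved hash function roughly half of the digest bits are expected to change, so the two digests reveal that the inputs differ but nothing about where.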

**3. Provide an example of a real-world application where a data verification system using hash functions is used.**

A real-world application of data verification using hash functions is software or file distribution. Before the software or files are packaged and distributed, a hash function is applied to each file to generate a fixed-size hash value. The hash values are made available on a trusted site, and users compare them with the hash values they compute for the files they downloaded. If the hash values match, the files are intact and have not been tampered with. If they do not match, the files have been corrupted or tampered with and the user should not use them.
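A download check of this kind might look like the following sketch. It hashes the file in chunks so that large downloads do not need to fit in memory; the temporary file stands in for a real download, and the function names are assumptions made for the example.

```python
import hashlib
import hmac
import os
import tempfile

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file chunk by chunk and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare the local file's digest with the one published by the vendor."""
    return hmac.compare_digest(file_sha256(path), published_hex.strip().lower())

# Stand-in for a downloaded installer (hypothetical content)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is an installer")
    download_path = tmp.name

published = hashlib.sha256(b"pretend this is an installer").hexdigest()
download_ok = matches_published(download_path, published)
tampered_ok = matches_published(download_path, "0" * 64)
os.remove(download_path)

print(download_ok)   # True
print(tampered_ok)   # False
```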

## Exercise 3: 

**1. Define what a Message Authentication Code (MAC) is and how it is used in cryptography.**

A message authentication code (MAC) is a keyed cryptographic checksum: a short, fixed-size tag computed from a message and a secret session key. It detects both accidental and intentional modifications of the data, which usually comes from a message or data transmission. The sender generates the tag from the secret key and the content of the message and attaches it to the message. The receiver, who holds the same session key, recomputes the MAC on the received data and compares it to the MAC sent by the sender. If the two MACs match, the data is considered authentic.

**2. Explain the process of generating and verifying a MAC.**

1. Select a session key: Choose a secret session key that is known only to the sender and receiver. This key is used both to generate the MAC and to verify it.

2. Select a message to authenticate: The sender has a message or data that they want to transmit.

3. MAC Generation: The sender applies a MAC algorithm to the message and session key to generate a MAC. 

4. Attach the MAC: The MAC is a fixed-length string of characters that is attached to the message so both are sent to the receiver.

5. Receiver gets the message and MAC: The receiver receives the message and MAC.

6. MAC Verification: The receiver recalculates the MAC using the received data and session key.

7. Comparison: Compare the recalculated MAC with the original MAC. If they match, the data is considered intact and authentic. If they differ, it indicates that the data has been tampered with or corrupted.

8. Respond to Mismatch: If the MACs do not match, take appropriate action, such as rejecting the data, investigating possible tampering, or seeking a trusted source for the correct data.
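The generation and verification steps above can be sketched with HMAC-SHA256 from Python's standard library. HMAC is one possible MAC algorithm, chosen here as an assumption; the function names and the sample message are illustrative.

```python
import hashlib
import hmac
import secrets

def generate_mac(key: bytes, message: bytes) -> bytes:
    """Step 3: compute the tag from the session key and the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Steps 6-7: recompute the tag and compare it in constant time."""
    expected = generate_mac(key, message)
    return hmac.compare_digest(expected, received_mac)

session_key = secrets.token_bytes(32)   # step 1: shared secret key
message = b"pay 50 EUR to Bob"          # step 2: message to authenticate
tag = generate_mac(session_key, message)

print(verify_mac(session_key, message, tag))                 # True
print(verify_mac(session_key, b"pay 500 EUR to Bob", tag))   # False
```

Without the session key an attacker cannot compute a matching tag for a modified message, which is the difference from the unkeyed hash in Exercise 2.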

**3. Discuss the importance of using MACs in secure communication systems.**

Message Authentication Codes (MACs) play a crucial role in ensuring the security and integrity of data in secure communication systems. They are used in various applications, including secure web communication (TLS), IPsec, and SSH. MACs provide the following benefits:

- Data Integrity: MACs provide a means to verify that the data transmitted between parties has not been tampered with during transmission. By calculating a MAC over the data and sending it along with the data, the recipient can recompute the MAC and compare it with the received MAC. If the MACs match, it's highly likely that the data hasn't been altered.

- Data Authentication: MACs can confirm the origin of a message. Only someone who knows the secret key can compute a valid MAC, so when the recipient verifies the MAC with the same key, it confirms that the data comes from a holder of that key and hasn't been spoofed.

- Protection Against Replay Attacks: A MAC on its own does not stop replay attacks, because a captured message and its valid MAC can simply be resent. Replay protection comes from including a sequence number, nonce, or timestamp in the data covered by the MAC, so that the recipient can reject duplicate or out-of-order messages.

- Cryptographic Hash Functions: MACs are often constructed from cryptographic hash functions (e.g., HMAC), which are designed to resist pre-image attacks and collisions; others are built from block ciphers (e.g., CMAC). The strength of the MAC depends on the underlying primitive and on the secrecy of the key.

- Confidentiality (when combined with encryption): MACs are often used in combination with encryption, providing both confidentiality and data integrity. Encrypting the data ensures its secrecy, while the MAC verifies its integrity.

- Secure Key Management: MACs require a shared secret key between the sender and the recipient. This key is crucial for generating and verifying the MAC. Secure key management is essential for the security of the communication system.

- Non-Repudiation: MACs do not provide non-repudiation. Because the sender and the recipient share the same key, either of them could have produced a given MAC, so it cannot prove to a third party who sent the data. Non-repudiation requires digital signatures.

- Wide Applicability: MACs are used in many secure communication protocols, including secure web communication (HTTPS/TLS), IPsec, SSH, and many other applications.
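One common way to obtain replay protection on top of a MAC is to authenticate a message counter together with the message. The sketch below assumes HMAC-SHA256 and a strictly increasing 64-bit counter; the `Receiver` class and `make_tag` helper are hypothetical names for illustration.

```python
import hashlib
import hmac
import secrets

def make_tag(key: bytes, counter: int, message: bytes) -> bytes:
    """MAC over a fixed-width counter followed by the message."""
    return hmac.new(key, counter.to_bytes(8, "big") + message, hashlib.sha256).digest()

class Receiver:
    """Accepts a message only if its tag is valid and its counter is new."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.last_seen = -1

    def accept(self, counter: int, message: bytes, tag: bytes) -> bool:
        if not hmac.compare_digest(make_tag(self.key, counter, message), tag):
            return False            # forged or corrupted
        if counter <= self.last_seen:
            return False            # replayed or out of order
        self.last_seen = counter
        return True

key = secrets.token_bytes(32)
receiver = Receiver(key)
msg = b"pay 50 EUR to Bob"
tag0 = make_tag(key, 0, msg)

print(receiver.accept(0, msg, tag0))  # True: first delivery
print(receiver.accept(0, msg, tag0))  # False: replay of the same message
```

An attacker cannot simply bump the counter, because the counter is part of the authenticated data and the tag would no longer match.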

## Exercise 4:

Given the values of p = 17 and q = 23, generate a pair of keys for RSA.

In [13]:
# Given the values of p = 17 and q = 23, generate a pair of keys for RSA.

# p and q are prime numbers
p = 17
q = 23

# Calculate the modulus n
n = p * q
print("Modulus n:", n, end="\n")

# Calculate the totient phi(n)
phi_n = (p - 1) * (q - 1)
print("Totient phi(n):", phi_n)

Modulus n: 391
Totient phi(n): 352


In [14]:
# Choose an integer e that is relatively prime (coprime) to the totient
e = 3
print("e:", e) 
# Check 1 < e < phi_n
if 1 < e < phi_n:
    print("e is between 1 and phi_n")
# Check if gcd(e, phi_n) = 1
if math.gcd(e, phi_n) == 1:
    print("e is a relative prime to phi_n")

e: 3
e is between 1 and phi_n
e is a relative prime to phi_n


In [15]:
# Calculate d which is the modular multiplicative inverse of e
d = pow(e, -1, phi_n)
print("d:", d, end="\n\n")

# Print the public key
print("Public key:", (n, e))

# Print the private key
print("Private key:", (n, d))

d: 235

Public key: (391, 3)
Private key: (391, 235)
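As a sanity check, the key pair can be exercised with a toy encryption round trip. The values of n, e, d, and phi(n) are the ones computed above; the message 42 is an arbitrary example, and unpadded RSA with such small primes is for illustration only.

```python
# Key pair derived above from p = 17 and q = 23
n, e, d = 391, 3, 235
phi_n = 352

# d must be the inverse of e modulo phi(n)
inverse_ok = (e * d) % phi_n == 1

message = 42                        # any integer with 0 <= message < n
ciphertext = pow(message, e, n)     # encrypt with the public key (n, e)
recovered = pow(ciphertext, d, n)   # decrypt with the private key (n, d)

print("e * d mod phi(n) == 1:", inverse_ok)
print("Ciphertext:", ciphertext)
print("Recovered: ", recovered)
```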


## Exercise 5:

**1. Design a public key infrastructure (PKI) system. Explain the components and their roles in the system.**

A public key infrastructure (PKI) is a system of hardware, software, policies, and procedures that creates, manages, distributes, and revokes digital certificates. It is used to verify the identity of users and devices in a network and ensure the confidentiality and integrity of data.

A PKI system consists of the following components:

- Certificate Authority (CA): A trusted entity that issues digital certificates to users and devices. The CA is responsible for verifying the identity of the certificate holder and signing the certificate using its private key. The CA's public key is used to verify the certificate's signature. The CA is also responsible for revoking certificates when necessary, and it manages certificate databases and certificate stores.

- Registration Authority (RA): An optional entity that assists the CA in verifying the identity of certificate holders. The RA is responsible for authenticating users and devices and forwarding the authentication information to the CA.

- Certificate Database: A database that stores the issued certificates and their associated information. The database is used to retrieve certificates and verify their authenticity. It also stores the certificate revocation list (CRL).

- Certificate Policy: A set of rules and guidelines that govern the issuance and use of certificates. The certificate policy is used to ensure the security and integrity of the PKI system.

- Certification Practice Statement (CPS): A document that describes the practices and procedures the CA follows to implement its certificate policy. The CPS lets relying parties judge how much trust to place in the certificates the CA issues.

- Certificate: A digital document that contains the public key of a user or device and is signed by a trusted CA. The certificate is used to verify the identity of the certificate holder and ensure the confidentiality and integrity of data.

- Private Key: A secret key that is known only to the certificate holder. The private key is used to decrypt data encrypted with the certificate holder's public key and sign data using the certificate holder's digital signature.

- Public Key: A key that is made publicly available and is used to encrypt data for the certificate holder and verify the certificate holder's digital signature.

- Certification Path: A chain of certificates that links the certificate holder to a trusted CA. The certification path is used to verify the authenticity of the certificate holder.
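How the certification path, the trusted CA, and the revocation list fit together can be sketched with a simplified model. This example only walks the issuer links and checks revocation; it deliberately omits signature and validity-period checks, and the `Cert` class, the names, and the serial numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    serial: int

def build_path(leaf: Cert, by_subject: Dict[str, Cert],
               trusted_roots: Set[str], revoked_serials: Set[int]) -> Optional[List[Cert]]:
    """Follow issuer links from the leaf up to a trusted root, or return None."""
    path: List[Cert] = []
    visited: Set[str] = set()
    current = leaf
    while current.subject not in visited:
        if current.serial in revoked_serials:
            return None                      # listed on the CRL
        path.append(current)
        visited.add(current.subject)
        if current.subject in trusted_roots:
            return path                      # reached a trust anchor
        parent = by_subject.get(current.issuer)
        if parent is None:
            return None                      # issuer certificate unknown
        current = parent
    return None                              # loop without a trust anchor

root = Cert("Example Root CA", "Example Root CA", 1)
intermediate = Cert("Example Issuing CA", "Example Root CA", 2)
leaf = Cert("www.example.org", "Example Issuing CA", 3)
by_subject = {c.subject: c for c in (root, intermediate, leaf)}
trusted_roots = {"Example Root CA"}

path = build_path(leaf, by_subject, trusted_roots, set())
print([c.subject for c in path])
```

A real validator would additionally verify each certificate's signature with the public key of its issuer, which is what makes the chain tamper-evident.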

**2. Discuss the advantages and challenges of implementing a PKI system.**

- Advantages of Implementing a PKI System:

Enhanced Security: A PKI system provides strong security mechanisms, including encryption, digital signatures, and identity verification, ensuring the confidentiality, integrity, and authenticity of data.

Secure Communication: PKI enables secure communication over insecure networks (e.g., the internet). It ensures that data remains confidential and that tampering is detected, even if the traffic is intercepted.

Authentication: PKI allows for user and entity authentication, which is crucial for ensuring that users are who they claim to be. It's fundamental in various applications, including online banking, secure email, and secure access to systems.

Non-Repudiation: Digital signatures provided by PKI offer non-repudiation, meaning that senders cannot deny the authenticity of their digital signatures. This is vital in legal and financial contexts.

Data Integrity: PKI helps verify the integrity of transmitted data, ensuring that it has not been tampered with during transmission.

Scalability: PKI systems are highly scalable, accommodating a large number of users, devices, and entities while maintaining security and efficiency.

User Privacy: PKI allows for the secure exchange of data without revealing the content to unauthorized parties.

Certificate-Based Access Control: Certificates can be used to control access to resources, enabling fine-grained access permissions.

Regulatory Compliance: Many regulations and standards require the use of PKI for data protection and compliance, making it essential for businesses in regulated industries.

- Challenges of Implementing a PKI System:

Complexity: PKI systems are inherently complex, involving various components such as Certificate Authorities (CAs), Registration Authorities (RAs), and certificate policies. Managing and maintaining these components can be challenging.

Key Management: Secure key management is crucial for PKI systems. Protecting and managing private keys is a significant challenge, and a compromise can lead to security breaches.

Cost: Implementing and maintaining a PKI system can be costly. This includes expenses related to hardware, software, personnel, and security measures.

Interoperability: Ensuring that PKI components from different vendors work together seamlessly can be complex, particularly in heterogeneous environments.

Scalability Challenges: Managing a growing number of certificates, especially in large enterprises or across multiple organizations, requires careful planning and resources.

Trust Issues: Trusting the root CA is critical. If the root CA is compromised or not trusted, the entire PKI system is at risk.

Revocation Management: Managing and disseminating Certificate Revocation Lists (CRLs) can be resource-intensive and may lead to performance issues.

User Education: Users must be educated on how to use PKI features properly, including certificate handling and digital signature verification.

Certificate Lifecycle Management: Managing the lifecycle of certificates, including issuance, renewal, and revocation, is complex and requires rigorous processes.

Distributed Environments: Implementing PKI in distributed or cloud environments may introduce additional challenges due to the dynamic nature of these environments.

**3. Provide an example of a real-world application where a PKI system is used.**

A real-world application where a Public Key Infrastructure (PKI) system is widely used is in securing web communication through HTTPS (Hypertext Transfer Protocol Secure). Here's how a PKI system is applied in the context of HTTPS:

Application: Secure Web Browsing (HTTPS)

Overview: HTTPS is a secure version of the HTTP protocol used for web communication. It ensures that data transmitted between a web browser and a web server remains confidential, unaltered, and authentic. PKI plays a fundamental role in this security.

Components and Roles:

Certificate Authority (CA): Trusted CAs issue digital certificates to websites. They validate the identity of the website owner and attest to the association between the website's public key and its domain name.

Digital Certificates: Website owners obtain digital certificates containing their public key and domain information. These certificates are digitally signed by the CA.

Web Browser: When a user accesses a secure website, the web browser checks the website's digital certificate. If the certificate is valid and signed by a trusted CA, the browser establishes a secure connection.

Encryption: PKI enables secure encryption of data exchanged between the user's browser and the web server, preventing eavesdropping.

Digital Signatures: Digital certificates include digital signatures to ensure that the certificate and its public key have not been tampered with during transmission.
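The browser-side check can be approximated with Python's `ssl` module, whose default context loads the system's trusted CA certificates and requires a valid certificate for the requested host name. This is a sketch: the helper `fetch_certificate_names` is a hypothetical name, and calling it needs network access to a real HTTPS server.

```python
import socket
import ssl

def fetch_certificate_names(hostname: str, port: int = 443, timeout: float = 5.0):
    """Connect over TLS and return the (subject, issuer) fields of the server certificate.

    The handshake raises ssl.SSLCertVerificationError if the certificate does not
    chain to a trusted CA or does not match the host name.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
    return cert.get("subject"), cert.get("issuer")

# The default context already enforces the checks described above
context = ssl.create_default_context()
print("Certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("Host name checked:   ", context.check_hostname)
```

Calling `fetch_certificate_names` with a host name of your choice shows which CA issued that site's certificate.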

## Exercise 6:

Design a system for digital signatures based on public-key cryptography. Explain the steps involved in the process and the role of each component.

Key Pair Generation:

- Role: The key pair generation component creates public and private key pairs.
- Steps:
  1. Generate a public-private key pair using a secure algorithm (e.g., RSA, DSA, ECDSA).
  2. Safeguard the private key.

Signing Component:

- Role: The signing component uses the private key to create a digital signature.
- Steps:
  1. Hash the data to be signed using a secure hash function (e.g., SHA-256).
  2. Apply the signing algorithm to the hash value with the private key to create the digital signature. In textbook RSA this is often described as encrypting the hash with the private key.
  3. Include the digital signature with the data.

Verification Component:

- Role: The verification component uses the public key to verify the digital signature.
- Steps:
  1. Receive the data and its associated digital signature.
  2. Hash the data using the same hash function used during signing.
  3. Apply the verification algorithm to the digital signature using the sender's public key. In textbook RSA this recovers the hash value from the signature.
  4. Compare the calculated hash value with the recovered value. If they match, the signature is valid.

Public Key Distribution:

- Role: Ensuring that the public key is available to anyone who needs to verify the signature.
- Steps:
  1. Publish the sender's public key in a public key repository or distribute it with the data.
  2. Ensure that the public key is up to date and secure.

Private Key Protection:

- Role: Safeguarding the private key to prevent unauthorized signing.
- Steps:
  1. Store the private key in a secure hardware module or secure storage.
  2. Implement access control and strong authentication for key usage.

Data Transmission:

- Role: Facilitating the exchange of data with digital signatures.
- Steps:
  1. The sender signs the data with their private key.
  2. The recipient receives the data and the digital signature.

Verification Decision:

- Role: Determining whether the digital signature is valid.
- Steps:
  1. The recipient verifies the digital signature using the sender's public key.
  2. If the verification is successful, the data is considered authentic and unaltered.

Steps Involved in the Process:

1. Sign Data: The sender signs the data using their private key, creating a digital signature.

2. Send Data and Signature: The sender transmits the data and its associated digital signature to the recipient.

3. Verify Signature: The recipient receives the data and digital signature and uses the sender's public key to verify the digital signature.

4. Validation: If the verification is successful, the data is considered authentic and unaltered. If the verification fails, the data may be tampered with or not from the claimed sender.

5. Secure Key Management: Safeguard the private key to prevent unauthorized signing. Ensure that the public key is distributed securely and is up to date.
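
The sign and verify steps can be tried out with the toy RSA key pair from Exercise 4 (n = 391, e = 3, d = 235). This is a textbook sketch, not a secure scheme: the digest is reduced modulo the tiny n and no padding is used, both of which are simplifications assumed for illustration. The function names are made up for the example.

```python
import hashlib

# Toy key pair from Exercise 4 (far too small for real use)
n, e, d = 391, 3, 235

def toy_digest(data: bytes) -> int:
    """Hash the data with SHA-256 and reduce it modulo n so it fits the toy key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes) -> int:
    """Sender: apply the private exponent d to the digest."""
    return pow(toy_digest(data), d, n)

def verify(data: bytes, signature: int) -> bool:
    """Recipient: apply the public exponent e and compare with a fresh digest."""
    return pow(signature, e, n) == toy_digest(data)

document = b"I agree to the terms."
signature = sign(document)

print(verify(document, signature))             # True
print(verify(document, (signature + 1) % n))   # False: altered signature
```

With only 391 possible digest values, a modified document has a small chance of producing the same reduced digest and passing verification; real systems avoid this by using keys of 2048 bits or more with a padding scheme such as RSA-PSS, or a different algorithm such as ECDSA.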