### Week 1: Introduction to Cryptography Basics


## Difference between bytes and strings

`str`: Represents text as a sequence of Unicode code points. Literals are enclosed in single or double quotes (`'hello'` or `"hello"`).

`bytes`: Represents binary data as an immutable sequence of integers, each between 0 and 255. Literals use the same quotes prefixed with a `b` (`b'hello'` or `b"hello"`).

| Feature                  | `bytes`                                          | `str`                                       |
|--------------------------|--------------------------------------------------|---------------------------------------------|
| **Data Representation**  | Sequence of immutable binary data (0–255).       | Sequence of Unicode characters.             |
| **Encoding/Decoding**    | Can be decoded to `str` using `.decode()`.        | Must be encoded to `bytes` using `.encode()`. |
| **Content Type**         | Binary data, not human-readable.                 | Human-readable text.                        |
| **Usage**                | Typically used for binary file operations, networking, and cryptographic data. | Used for text processing and display.       |
| **Concatenation**        | Must use `bytes` objects only.                   | Can concatenate with other `str` objects.   |
| **Iteration**            | Iterates over integers (byte values).            | Iterates over Unicode characters.           |
| **Search Methods**       | `.find()`, `.rfind()`, `.index()`, `.count()`.    | `.find()`, `.rfind()`, `.index()`, `.count()`. |
| **Case Transformations** | `.upper()`, `.lower()`, `.title()`, etc. (ASCII letters only). | `.upper()`, `.lower()`, `.title()`, etc.    |
| **Hexadecimal Support**  | Supports `.hex()` and `.fromhex()`.              | Not supported.                              |
| **Padding**              | Can pad with specific bytes.                    | Can pad with specific characters.           |
| **Character Type Checks**| `.isalpha()`, `.isdigit()`, `.isalnum()`, etc. (ASCII only). | `.isalpha()`, `.isdigit()`, `.isalnum()`, etc. |
| **File Operations**      | Used for binary file read/write operations.      | Used for text file read/write operations.   |
| **Iterating Output**     | Produces integers (e.g., `b'abc' -> [97, 98, 99]`). | Produces characters (e.g., `'abc' -> ['a', 'b', 'c']`). |


In [87]:
data_str = "Hello"
data_bytes = b"Hello"

encoded = data_str.encode('utf-8') # converts str to bytes
# many encodings available, utf-8 is the most common, some others are ascii, utf-16, utf-32
decoded = data_bytes.decode('utf-8')
print(b'\x48\x65\x6c\x6c\x6f') # Hello
print(b'\110\145\154\154\157') # Hello

print(f"Original String: {data_str} (type: {type(data_str)})")
print(f"Encoded Bytes: {encoded} (type: {type(encoded)})")
print(f"Decoded String: {decoded} (type: {type(decoded)})")
try:
    invalid_bytes = data_str + data_bytes  # This will raise a TypeError
except TypeError as e:
    print(f"Error: {e}")

# Proper Conversion
valid_bytes = data_str.encode('utf-8') + data_bytes
print(f"Concatenated Bytes: {valid_bytes}")

b'Hello'
b'Hello'
Original String: Hello (type: <class 'str'>)
Encoded Bytes: b'Hello' (type: <class 'bytes'>)
Decoded String: Hello (type: <class 'str'>)
Error: can only concatenate str (not "bytes") to str
Concatenated Bytes: b'HelloHello'
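
A few rows of the table above can be checked directly. The snippet below is a minimal sketch; the sample string is an arbitrary choice, picked because it contains one non-ASCII character, so the character count and the byte count differ.

```python
# Round-trip between str, bytes, and hex, and show what iteration yields.
text = "h\u00e9llo"                 # 'h', 'e' with acute accent, 'l', 'l', 'o'
raw = text.encode("utf-8")          # str -> bytes

print(len(text), len(raw))          # number of characters vs number of bytes
print(list(raw[:3]))                # iterating over bytes yields integers
print(raw.hex())                    # bytes -> hexadecimal string
print(bytes.fromhex(raw.hex()) == raw)   # hex string -> bytes round-trip
print(raw.decode("utf-8") == text)       # bytes -> str round-trip
```

The accented character occupies two bytes in UTF-8, which is why cryptographic code should settle on an explicit encoding before operating on text.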


# 1. Classical Ciphers


## 1.1 Caesar Cipher

 It is believed to have been used by Julius Caesar around 55 B.C. to secure his military communication, and is therefore often considered one of the earliest documented encryption ciphers.
 
 It’s a type of substitution cipher, i.e., each letter of a given text is replaced by the letter a fixed number of positions further down the alphabet. For example, with a shift of 3, A would be replaced by D, B would become E, and so on, i.e., E<sub>n</sub>(x) = (x + <i>n</i>) mod 26
<p align="center">
  <img src="./ceaserCipher.png" alt="image not in the same folder">
</p>





In [64]:
def caesar_encrypt(plaintext, shift):
    encrypted = ""
    for char in plaintext:
        if char.isalpha():
            shift_base = ord('A') if char.isupper() else ord('a') # checking if character is uppercase or lowercase
            # ord('A') = 65
            # chr(65) = 'A'
            encrypted += chr((ord(char) - shift_base + shift) % 26 + shift_base) # (x+shift)%26
        else:
            encrypted += char # non-alphabetic characters (digits, spaces, punctuation) are left unchanged
    return encrypted

plaintext = "WELCOME TO CRYPTOGUE"
shift = 3
ciphertext = caesar_encrypt(plaintext, shift)
print(f"Ciphertext: {ciphertext}")

Ciphertext: ZHOFRPH WR FUBSWRJXH


### It is very weak: decryption only requires negating the shift, and with just 25 usable shifts an attacker can simply try them all

In [65]:
def caesar_decrypt(ciphertext, shift):
    return caesar_encrypt(ciphertext, -shift)

decrypted = caesar_decrypt(ciphertext, shift)
print(f"Decrypted: {decrypted}")

Decrypted: WELCOME TO CRYPTOGUE
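
Because the key space is so small, an attacker does not need the shift at all. The following is a minimal brute-force sketch; `caesar_shift` and `caesar_brute_force` are illustrative helper names, not part of the cells above, and the shift logic is repeated so the snippet runs on its own.

```python
def caesar_shift(text, shift):
    """Shift ASCII letters by `shift` positions; leave everything else untouched."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

def caesar_brute_force(ciphertext):
    """Return every candidate plaintext, keyed by the shift that was undone."""
    return {shift: caesar_shift(ciphertext, -shift) for shift in range(26)}

candidates = caesar_brute_force("ZHOFRPH WR FUBSWRJXH")
for shift, guess in candidates.items():
    print(f"{shift:2d}: {guess}")
```

Only one of the 26 printed lines is readable English, so picking out the right shift takes a glance rather than any real analysis.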


## 1.2 Vigenère Cipher

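Proposed sometime in the 16<sup>th</sup> century, it was widely regarded as unbreakable until the mid-19<sup>th</sup> century, when Kasiski published a technique that recovers the key length from repeated ciphertext segments; once the key length is known, each key position can be attacked with ordinary frequency analysis. The cipher is unbreakable only in the limiting case where the key is truly random, as long as the text, and never reused; it can be exploited when the key is short in comparison to the text.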

It uses a simple form of polyalphabetic substitution. A polyalphabetic cipher is any cipher based on substitution, using multiple substitution alphabets.

Example: Plaintext: CRYPTOGUEWEEKONE, Keyword: MNPCSEC. The keyword is repeated cyclically until it matches the length of the plaintext, giving Key: MNPCSECMNPCSECMN. Each letter is then encrypted as E<sub>i</sub> = (P<sub>i</sub> + K<sub>i</sub>) mod 26, with letters numbered A = 0 through Z = 25.
<p align="center">
  <img src="./vigenere.png" alt="image not in the same folder">
</p>
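
The key expansion and the first few additions from the example can be traced by hand. This is a small sketch assuming uppercase A-Z input only; `key_stream` is an illustrative variable name.

```python
from itertools import cycle

plaintext = "CRYPTOGUEWEEKONE"
keyword = "MNPCSEC"

# Repeat the keyword cyclically until it covers the whole plaintext.
key_stream = ''.join(k for _, k in zip(plaintext, cycle(keyword)))
print(key_stream)  # MNPCSECMNPCSECMN

# Apply E_i = (P_i + K_i) mod 26 to the first three letters.
for p, k in list(zip(plaintext, key_stream))[:3]:
    p_i, k_i = ord(p) - ord('A'), ord(k) - ord('A')
    e_i = (p_i + k_i) % 26
    print(f"{p}({p_i}) + {k}({k_i}) = {e_i} -> {chr(e_i + ord('A'))}")
```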

In [66]:
def vigenere_encrypt(plaintext, key):
    encrypted = ""
    key = key.lower()
    key_index = 0
    for char in plaintext:
        if char.isalpha():
            shift_base = ord('A') if char.isupper() else ord('a') # character is uppercase or lowercase
            shift = ord(key[key_index]) - ord('a')
            encrypted += chr((ord(char) - shift_base + shift) % 26 + shift_base)
            key_index = (key_index + 1) % len(key) # ensures the cyclic loop in the key
        else:
            encrypted += char # non-alphabetic characters (digits, spaces, punctuation) are left unchanged
    return encrypted

plaintext = "CRYPTOGUEWEEKONE"
key = "MNPCSEC"
ciphertext = vigenere_encrypt(plaintext, key)

print(f"Ciphertext: {ciphertext}")

Ciphertext: OENRLSIGRLGWOQZR


In [67]:
def vigenere_decrypt(ciphertext, key):
    decrypted_key = ''.join(chr((26 - (ord(k.lower()) - ord('a'))) % 26 + ord('a')) for k in key)
    return vigenere_encrypt(ciphertext, decrypted_key)

decrypted = vigenere_decrypt(ciphertext, key)
print(f"Decrypted: {decrypted}")

Decrypted: CRYPTOGUEWEEKONE


## 1.3 Atbash Cipher

The Atbash cipher is a substitution cipher with a single fixed key: the alphabet is reversed, so A maps to Z, B to Y, and so on. It does not provide any real cryptographic security, since there is no secret key and decryption is just the encryption applied again.
<p align="center">
  <img src="./atbash.png" alt="image not in the same folder">
</p>

In [68]:
def atbash_encrypt(plaintext):
    encrypted = ""
    for char in plaintext:
        if char.isalpha():
            shift_base = ord('A') if char.isupper() else ord('a') # character is uppercase or lowercase
            encrypted += chr(shift_base + (25 - (ord(char) - shift_base))) # x = 25 - x
        else:
            encrypted += char # non-alphabetic characters (digits, spaces, punctuation) are left unchanged
    return encrypted


plaintext = "WAYTOOEASY"
ciphertext = atbash_encrypt(plaintext)

print(f"Ciphertext: {ciphertext}")


Ciphertext: DZBGLLVZHB


In [69]:
def atbash_decrypt(ciphertext):
    return atbash_encrypt(ciphertext)

decrypted = atbash_decrypt(ciphertext)

print(f"Decrypted: {decrypted}")

Decrypted: WAYTOOEASY
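
Since Atbash is a fixed letter-for-letter mapping, it can also be written as a translation table. The sketch below is one possible alternative using `str.maketrans`; `ATBASH_TABLE` and `atbash` are illustrative names.

```python
import string

# Map A..Z onto Z..A and a..z onto z..a in a single translation table.
ATBASH_TABLE = str.maketrans(
    string.ascii_uppercase + string.ascii_lowercase,
    string.ascii_uppercase[::-1] + string.ascii_lowercase[::-1],
)

def atbash(text):
    """Apply Atbash; the same call both encrypts and decrypts."""
    return text.translate(ATBASH_TABLE)

message = "WAYTOOEASY"
print(atbash(message))                     # DZBGLLVZHB
print(atbash(atbash(message)) == message)  # applying it twice restores the input
```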


## 1.4 Monoalphabetic Cipher

In a monoalphabetic cipher, each letter is mapped to another letter by an arbitrary substitution, so the offset between a letter and its replacement is not uniform. The key is a 26-letter permutation of the alphabet, and each plaintext letter is replaced by the letter at the same position in that permutation. Like the Vigenère cipher with a short key, it is prone to frequency analysis attacks when the ciphertext is long.
<p align="center">
  <img src="./monoalphabetic.png" alt="image not in the same folder">
</p>

In [70]:
def monoalphabetic_encrypt(plaintext, key):
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    key_map = {alphabet[i]: key[i] for i in range(len(alphabet))} # mapping the alphabet to the key
    encrypted = ''.join(key_map[char] if char in key_map else char for char in plaintext.upper())
    return encrypted


key = "KEYWORDGAFCBHIJNLMQSPTUZXV"
plaintext = "ALL JUMBLED UP"
ciphertext = monoalphabetic_encrypt(plaintext, key)

print(f"Ciphertext: {ciphertext}")


Ciphertext: KBB FPHEBOW PN


In [71]:
def monoalphabetic_decrypt(ciphertext, key):
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    reverse_key_map = {key[i]: alphabet[i] for i in range(len(alphabet))} # mapping the key to the alphabet
    decrypted = ''.join(reverse_key_map[char] if char in reverse_key_map else char for char in ciphertext.upper())
    return decrypted

decrypted = monoalphabetic_decrypt(ciphertext, key)
print(f"Decrypted: {decrypted}")

Decrypted: ALL JUMBLED UP
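
Frequency analysis starts from a simple letter count. The sketch below shows only that first step on the ciphertext from the example; `letter_frequencies` is an illustrative helper, and a message this short is far too small for a real attack.

```python
from collections import Counter

def letter_frequencies(text):
    """Count the A-Z letters in `text`, most common first."""
    letters = [ch for ch in text.upper() if 'A' <= ch <= 'Z']
    return Counter(letters).most_common()

ciphertext = "KBB FPHEBOW PN"
for letter, count in letter_frequencies(ciphertext)[:3]:
    print(letter, count)
```

Here the most frequent ciphertext letter, B, stands for L, the most frequent letter of the plaintext. With long English ciphertexts, an attacker matches the most common ciphertext letters against the most common English letters in the same way.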


## 1.5 Vernam Cipher - A One-Time Pad (OTP) Cipher

The Vernam Cipher is a cryptographic method invented by Gilbert Vernam in 1917. It is a symmetric cipher and the first known implementation of the **One-Time Pad (OTP)** encryption technique. When implemented correctly with a truly random key, the Vernam Cipher achieves **perfect secrecy**—the ciphertext gives no information about the plaintext.

**How It Works:**
1. **Plaintext and Key:**
    - The plaintext is the message to encrypt.
    - The key is a random sequence of characters, equal in length to the plaintext.

2. **Encryption:**
    - Each character of the plaintext is XORed with the corresponding character of the key.
    - The result is the ciphertext.

3. **Decryption:**
    - The ciphertext is XORed with the same key to retrieve the plaintext.
    - This works because XOR is its own inverse: $ (a \oplus b) \oplus b = a $.

**Key Properties:**
- The key must be truly random, as long as the plaintext, and used only once.
- The key must remain secret to ensure security.


**Mathematical Representation:**
- Encryption: $ C_i = P_i \oplus K_i $
- Decryption: $ P_i = C_i \oplus K_i $
where $ P $, $ C $, and $ K $ are the plaintext, ciphertext, and key respectively.

<p align="center">
  <img src="./vernam.jpg" alt="image not in the same folder">
</p>
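
The "used only once" requirement can be seen with a few lines of byte-level XOR. This is a minimal sketch: `xor_bytes` is an illustrative helper, the two messages are made up, and the key comes from the standard-library `secrets` module.

```python
import secrets

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

p1 = b"ATTACK AT DAWN"
p2 = b"RETREAT AT TEN"
key = secrets.token_bytes(len(p1))   # random key, as long as the message

c1 = xor_bytes(p1, key)
c2 = xor_bytes(p2, key)              # key reuse: this is the mistake

# Decrypting with the right key recovers the plaintext.
print(xor_bytes(c1, key))

# Without the key, an eavesdropper can still cancel it out:
# c1 XOR c2 equals p1 XOR p2, which leaks how the two messages relate.
print(xor_bytes(c1, c2) == xor_bytes(p1, p2))
```

Once the key cancels out, perfect secrecy is gone: knowing or guessing part of one message reveals the corresponding part of the other.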


In [72]:
def vernam_cipher_otp_encrypt(plaintext, key):
    """
    Encrypts the plaintext using the Vernam Cipher (OTP).

    Parameters:
    - plaintext: The message to encrypt (string).
    - key: A random string of the same length as the plaintext.

    Returns:
    - Ciphertext: The encrypted message as a string.
    """
    if len(plaintext) != len(key):
        raise ValueError("Key must be the same length as the plaintext.")
    
    ciphertext = ''.join(
        chr(ord(p) ^ ord(k))  # XOR each character in plaintext and key
        for p, k in zip(plaintext, key)
    )
    return ciphertext

def vernam_cipher_otp_decrypt(ciphertext, key):
    """
    Decrypts the ciphertext using the Vernam Cipher (OTP).

    Parameters:
    - ciphertext: The encrypted message (string).
    - key: The same random string used for encryption.

    Returns:
    - Plaintext: The decrypted original message.
    """
    return vernam_cipher_otp_encrypt(ciphertext, key)  # XOR is symmetric

# Example Usage
plaintext = "QWERTYUIOP"
key = "1234567890"  # Demo key only; a real one-time pad key must be random and as long as the plaintext
# Encryption
ciphertext = vernam_cipher_otp_encrypt(plaintext, key)
print("ciphertext: ", ciphertext)
cipher_text_bytes = [ord(c) for c in ciphertext]
print("cipher_text_bytes: ", cipher_text_bytes)

# Decryption
decrypted_text = vernam_cipher_otp_decrypt(ciphertext, key)
print("Decrypted text:", decrypted_text)


ciphertext:  `evfaobqv`
cipher_text_bytes:  [96, 101, 118, 102, 97, 111, 98, 113, 118, 96]
Decrypted text: QWERTYUIOP


## 1.6 ChaCha Stream cipher
**Description:**
The ChaCha cipher is a modern and secure stream cipher designed by Daniel J. Bernstein in 2008 as an improvement over the Salsa20 cipher. It is widely used in secure communications, providing both high performance and strong cryptographic security. ChaCha20, usually paired with the Poly1305 authenticator, is used in many protocols, including TLS (Transport Layer Security).

**Key Features:**
1. **Secure:** Resistant to known cryptographic attacks.
2. **Fast:** Performs well in software, even on hardware without dedicated AES instructions.
3. **Simple:** Uses basic operations like addition, XOR, and bitwise rotations.

**How It Works:**
1. **Initialization:**
   - A 256-bit key and a 96-bit nonce (number used once) are used.
   - A 512-bit initial state is created, combining constants, the key, the nonce, and a counter.

2. **Quarter Round Function:**
   - The cipher uses a "quarter round" function to mix the state. This function applies operations (addition, XOR, rotation) on four 32-bit words.

3. **ChaCha Rounds:**
   - The state is transformed through 20 rounds (10 "double rounds"). Each round applies the quarter round function four times, alternating between the columns and the diagonals of the 4×4 state.
   - After these rounds, the transformed state is added back to the initial state.

4. **Output Generation:**
   - The resulting state is serialized and used as a keystream.
   - Plaintext is XORed with the keystream to produce the ciphertext.
  
<p align="center">
  <img src="./ChaCha.png" alt="image not in the same folder">
</p>
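
The quarter round described in step 2 is small enough to write out in full. The sketch below follows the operation order given in RFC 8439 and checks it against the quarter-round test vector from that document; `rotl32` and `quarter_round` are illustrative names, and this is only one building block, not a usable cipher.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, shift):
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) & MASK32) | (value >> (32 - shift))

def quarter_round(a, b, c, d):
    """One ChaCha quarter round on four 32-bit words."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d

# Quarter-round test vector from RFC 8439, section 2.1.1
result = quarter_round(0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567)
print([hex(word) for word in result])
# ['0xea2a92f4', '0xcb1cf8ce', '0x4581472e', '0x5881c4bb']
```

Only addition modulo 2<sup>32</sup>, XOR, and fixed rotations appear, which is the "Simple" property listed under the key features.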

In [73]:
from Crypto.Cipher import ChaCha20
from Crypto.Random import get_random_bytes

# ChaCha Stream Cipher Example

# Key and Nonce generation
key = get_random_bytes(32)  # 256-bit key
nonce = get_random_bytes(12)  # 96-bit nonce
print("key: ", key)
print("nonce: ", nonce)

# Plaintext message
plaintext = b"Confidential message using ChaCha cipher"

# Encrypt
cipher = ChaCha20.new(key=key, nonce=nonce)
ciphertext = cipher.encrypt(plaintext)
print("Ciphertext:", ciphertext.hex())

# Decrypt
decipher = ChaCha20.new(key=key, nonce=nonce)
decrypted_message = decipher.decrypt(ciphertext)
print("Decrypted Message:", decrypted_message.decode())


key:  b'*\xb3@\xad\xf1\xbb\xd9O\xf7E\xf7\xe8\x92@f\xaa\x07R\x9d\xdc\xa3$\x01G\x0cV\xa4_\x85\xf3\xdb~'
nonce:  b'F|\xdb\x1b\xd0\xf3-q-\x16\xc9\xcf'
Ciphertext: afb8473f66d976b2236afeae023f8dc06703124c6fe7e433ab1c828ed0f383c13890df222d8515f5
Decrypted Message: Confidential message using ChaCha cipher


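# 2. Pseudo-Random Generators

A pseudo-random number generator (PRNG) is a deterministic algorithm that uses mathematical formulas to produce a sequence of numbers approximating the properties of random numbers. A PRNG starts from an initial state determined by a seed, so the same seed always reproduces the same sequence. Python's `random` module, used in the cell that follows, is not designed for cryptographic purposes; the standard-library `secrets` module is intended for that.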

<p align="center">
  <img src="./prng.png" alt="image not in the same folder">
</p>

In [74]:
import random

def prg(seed, length):
    random.seed(seed)
    return [random.randint(0, 1) for _ in range(length)]

seed = 42
length = 10
output = prg(seed, length)
print(f"PRG Output: {output}")

PRG Output: [0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
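
To make "mathematical formulas" concrete, here is a toy linear congruential generator, one of the simplest PRNG constructions. It is a sketch for illustration only: the class name and default constants are assumptions chosen for the example, and an LCG is predictable and must never be used for cryptography.

```python
class LinearCongruentialGenerator:
    """Toy PRNG: state_{n+1} = (a * state_n + c) mod m."""

    def __init__(self, seed, a=1103515245, c=12345, m=2**31):
        # The constants are illustrative defaults, not a recommendation.
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next_int(self):
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def next_bits(self, count):
        # Use the top bit of the 31-bit state: with a power-of-two modulus,
        # the low-order bits of an LCG repeat with very short periods.
        return [(self.next_int() >> 30) & 1 for _ in range(count)]

first = LinearCongruentialGenerator(seed=42).next_bits(10)
second = LinearCongruentialGenerator(seed=42).next_bits(10)
print(first)
print(first == second)   # same seed, same sequence
```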


# 3. AES

AES is a symmetric block cipher, meaning it uses the same key for both encryption and decryption. It encrypts data in fixed-size blocks of 128 bits (16 bytes) and can use key sizes of 128, 192, or 256 bits. AES is widely used due to its strong security and efficiency.

## 3.1 AES, with ECB

ECB (Electronic Codebook) is one of the simplest block cipher modes. In this mode, the plaintext is divided into fixed-size blocks, and each block is independently encrypted using the cipher and the secret key.
Encryption:
- Divide the input plaintext into blocks of 16 bytes.
- Encrypt each block using AES with the same key.
- Concatenate the encrypted blocks to produce the final ciphertext.
<p align="center">
  <img src="./aes_ebc.png" alt="image not in the same folder">
</p>

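Each block is encrypted independently of the other blocks, so the blocks can be processed in parallel. The drawback is that identical plaintext blocks always produce identical ciphertext blocks, which reveals patterns in the data.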
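
The structure of ECB can be sketched without any library. To keep the snippet dependency-free, the block cipher here is a deliberately insecure stand-in (`toy_encrypt_block`, a hypothetical helper that XORs with the key and reverses the bytes), not AES; only the way the mode handles blocks is the point.

```python
BLOCK_SIZE = 16

def toy_encrypt_block(block, key):
    """INSECURE stand-in for AES: XOR with the key, then reverse the bytes."""
    return bytes(b ^ k for b, k in zip(block, key))[::-1]

def split_blocks(data):
    """Split block-aligned data into 16-byte blocks."""
    if len(data) % BLOCK_SIZE:
        raise ValueError("length must be a multiple of 16 bytes")
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def ecb_encrypt(plaintext, key):
    """Encrypt every block independently with the same key."""
    return b''.join(toy_encrypt_block(block, key) for block in split_blocks(plaintext))

key = bytes(range(16))
plaintext = b"SAME BLOCK HERE!" * 2 + b"a different one!"
c_blocks = split_blocks(ecb_encrypt(plaintext, key))

print(c_blocks[0] == c_blocks[1])   # identical plaintext blocks stay identical
print(c_blocks[0] == c_blocks[2])   # a different block encrypts differently
```

Repeated plaintext blocks remain visible as repeated ciphertext blocks, and the same happens when the stand-in is replaced by real AES.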

In [75]:
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad, unpad

key = b'mnp csec'.ljust(16, b'\0') # because the key needs to be 16, 24 or 32 bytes long
# the 8-byte key is therefore padded with 8 null bytes to reach 16 bytes (demo only; real keys should be random)
data = b'all hail the aes encryption'

cipher_ecb = AES.new(key, AES.MODE_ECB)
ciphertext_ecb = cipher_ecb.encrypt(pad(data, AES.block_size))

print(f"ECB Ciphertext: {ciphertext_ecb}")


ECB Ciphertext: b'\x85t\xfa|\xb9\xaa\xcd\xd7\xf6\xd4{_?\x1a>\xb2U\xa4/\xc3\x08M\xf3ij\xf4vR\xc3\xf5\xcfc'


### For decryption, each ciphertext block is decrypted independently using the same key.

In [76]:
plaintext_ecb = unpad(cipher_ecb.decrypt(ciphertext_ecb), AES.block_size)
print(f"ECB Decrypted: {plaintext_ecb}")

ECB Decrypted: b'all hail the aes encryption'


## 3.2 AES, with CBC

CBC (Cipher Block Chaining) mode is a block cipher mode that enhances security by introducing randomness and dependency between blocks. Each message starts with a unique, random IV of the same size as the AES block size (16 bytes for AES). This ensures that even identical plaintexts encrypted with the same key result in different ciphertexts.
Encryption:
- The plaintext is divided into fixed-size blocks (e.g., 16 bytes for AES).
- Each block of plaintext is XORed with the ciphertext of the previous block (or IV for the first block).
- The result is then encrypted using the AES algorithm and the secret key.
- C<sub>i</sub> = Encrypt(P<sub>i</sub> &oplus; C<sub>i-1</sub>)

Where:
- P<sub>i</sub> is the plaintext block.
- C<sub>i-1</sub> is the previous ciphertext block.
- C<sub>0</sub> = IV (Initialization Vector).
<p align="center">
  <img src="./aes_cbc.png" alt="image not in the same folder">
</p>

The encryption runs in series rather than in parallel as in ECB, because each block depends on the previous ciphertext block. This chaining hides repeated plaintext blocks, making CBC comparatively stronger than ECB, at the cost of encryption that cannot be parallelized.
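
The chaining can be sketched in pure Python. As a loudly labeled assumption, the block cipher is an insecure invertible stand-in (`toy_encrypt_block` and `toy_decrypt_block`, hypothetical helpers), not AES; the snippet only illustrates C<sub>i</sub> = Encrypt(P<sub>i</sub> ⊕ C<sub>i-1</sub>) and its inverse.

```python
import secrets

BLOCK_SIZE = 16

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(block, key):
    """INSECURE stand-in for AES: XOR with the key, then reverse the bytes."""
    return xor_bytes(block, key)[::-1]

def toy_decrypt_block(block, key):
    """Inverse of toy_encrypt_block."""
    return xor_bytes(block[::-1], key)

def split_blocks(data):
    if len(data) % BLOCK_SIZE:
        raise ValueError("length must be a multiple of 16 bytes")
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def cbc_encrypt(plaintext, key, iv):
    previous, out = iv, []
    for block in split_blocks(plaintext):
        previous = toy_encrypt_block(xor_bytes(block, previous), key)  # C_i
        out.append(previous)
    return b''.join(out)

def cbc_decrypt(ciphertext, key, iv):
    previous, out = iv, []
    for block in split_blocks(ciphertext):
        out.append(xor_bytes(toy_decrypt_block(block, key), previous))  # P_i
        previous = block
    return b''.join(out)

key = bytes(range(16))
iv = secrets.token_bytes(BLOCK_SIZE)
plaintext = b"SAME BLOCK HERE!" * 2   # two identical plaintext blocks

ciphertext = cbc_encrypt(plaintext, key, iv)
blocks = split_blocks(ciphertext)
print(blocks[0] == blocks[1])                         # chaining hides the repetition
print(cbc_decrypt(ciphertext, key, iv) == plaintext)  # round-trip succeeds
```

In decryption, each block needs only the ciphertext block before it, which is why CBC decryption can be parallelized even though encryption cannot.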


In [77]:
from Crypto.Random import get_random_bytes

iv = get_random_bytes(16) 

cipher_cbc_encrypt = AES.new(key, AES.MODE_CBC, iv)
ciphertext_cbc = cipher_cbc_encrypt.encrypt(pad(data, AES.block_size))
print(f"CBC Ciphertext: {ciphertext_cbc}")

CBC Ciphertext: b"m\xc4\xe8\x05\x9a\xee+\xfb?\x99%\xd5\x1cmK\x82\x00\x89\xce\x84qlvu\x9fB\x9e'\xc78\xd0\xd2"


The ciphertext is decrypted block by block. The output of the decryption is XORed with the previous ciphertext block (or IV for the first block) to retrieve the original plaintext.
Decryption formula:
P<sub>i</sub> = Decrypt(C<sub>i</sub>) ⊕ C<sub>i-1</sub>

**In CBC mode the AES cipher object is stateful: once it has been used for encryption, its internal state has changed and it cannot be used for decryption. For decryption, you therefore need to create a new AES cipher object with the same key and IV.**

In [78]:
cipher_cbc_decrypt = AES.new(key, AES.MODE_CBC, iv)
plaintext_cbc = unpad(cipher_cbc_decrypt.decrypt(ciphertext_cbc), AES.block_size)
print(f"CBC Decrypted: {plaintext_cbc}")

CBC Decrypted: b'all hail the aes encryption'
