# Notebook 11.b: The Caesar Cipher

Welcome back, agent! In our last notebook, you learned the fundamental vocabulary of cryptography, including the critical difference between **encoding** and **encrypting**. Now, it's time to put that knowledge into practice by building our first real encryption program.

Today, we'll implement the famous and ancient **Caesar Cipher**. We will treat this as an **encryption pipeline**, breaking the problem down into distinct encoding and encrypting steps.

**Learning Objectives:**
*   Reinforce the difference between encoding and encrypting through implementation.
*   Use `ord()` and `chr()` to build a character-to-number codec.
*   Apply the modulo operator (`%`) to perform a mathematical shift on numerical data.
*   Combine codec and cipher functions to build a complete encryption/decryption pipeline.

**Estimated Time:** 45-60 minutes

**Prerequisites/Review:**
*   Concepts from [Notebook 11.a: An Introduction to Secret Messages](https://colab.research.google.com/github/sguy/programming-and-problem-solving/blob/main/notebooks/11.a-introduction-to-secret-messages.ipynb).
*   Basic Python functions, loops, and conditional statements.

Let's begin building our pipeline!

## 🐍 New Concept: What is the Caesar Cipher?

The Caesar Cipher is one of the simplest and most widely known encryption techniques. It's named after Julius Caesar, who used it to communicate with his generals.

**How it works:**
It's a type of **substitution cipher** where each letter in the plaintext (the original message) is replaced by a letter some fixed number of positions down the alphabet. This fixed number is called the **shift** or the **key**.

For example, with a **shift of 3**:

| Original Letter | Shifted by +3 | Encrypted Letter |
| :-------------- | :------------ | :------------- |
| A               | A → B → C → D | D              |
| B               | B → C → D → E | E              |
| ...             | ...           | ...            |
| Z               | Z → A → B → C | C              |

So, the message "HELLO" with a shift of 3 would be encrypted as "KHOOR". To decrypt it, the receiver simply shifts each letter back by 3 positions.
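
To see the whole substitution table at once, here is a minimal sketch that builds the shifted alphabet with string slicing. It is only a preview, not the approach this notebook builds step by step, and the names `alphabet`, `shifted`, and `substitution` are illustrative.

```python
import string

alphabet = string.ascii_uppercase              # 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
shift = 3

# Move the first `shift` letters to the end to get the shifted alphabet.
shifted = alphabet[shift:] + alphabet[:shift]  # 'DEFGHIJKLMNOPQRSTUVWXYZABC'

# Pair each original letter with its replacement.
substitution = dict(zip(alphabet, shifted))

encrypted = ""
for letter in "HELLO":
    encrypted += substitution[letter]

print(shifted)
print(encrypted)  # KHOOR
```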

## 🐍 New Concept: Our Plan: The Encryption Pipeline

To build our cipher, we will follow the vocabulary we learned in the last notebook. Instead of one big, confusing function, we will build an **encryption pipeline** by creating a series of small, single-purpose functions that work together.

Our pipeline has three main stages:

1.  **Encode:** Convert each letter of our plaintext message into a number. The resulting list of numbers is still a form of **plaintext**!
2.  **Encrypt:** Take each number and apply the secret key. The result is a list of encrypted numbers. Now the data is **ciphertext**.
3.  **Decode:** Convert the encrypted numbers back into letters to form the final ciphertext message.

To decrypt a message, the pipeline runs the same three stages, but the middle stage subtracts the key instead of adding it.

![The Encryption Pipeline](https://raw.githubusercontent.com/sguy/programming-and-problem-solving/main/notebooks/images/encryption_pipeline.svg)
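
The three stages can be sketched in a few lines for the message "HELLO" and a shift of 3. This is a rough preview that assumes the message is already uppercase A-Z. The variable names are illustrative, and the rest of the notebook builds the same idea out of functions.

```python
message = "HELLO"
shift = 3

# Stage 1 - Encode: letters -> numbers 0-25 (still plaintext).
encoded = []
for letter in message:
    encoded.append(ord(letter) - ord("A"))

# Stage 2 - Encrypt: apply the key, wrapping around with % 26.
encrypted = []
for number in encoded:
    encrypted.append((number + shift) % 26)

# Stage 3 - Decode: numbers -> letters (the ciphertext message).
ciphertext = ""
for number in encrypted:
    ciphertext += chr(number + ord("A"))

print(encoded)     # [7, 4, 11, 11, 14]
print(encrypted)   # [10, 7, 14, 14, 17]
print(ciphertext)  # KHOOR
```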

### 🎯 Stage 1: The Codec (Letters <-> Numbers)

In Notebook 11.a, you built `encode_to_ascii` and `decode_from_ascii` functions. For our Caesar cipher, we need a similar codec that maps uppercase English letters (A-Z) to numbers 0-25 and vice-versa.

We'll use the same logic you learned, but we'll put it into two helper functions: `letter_to_number` and `number_to_letter`. For convenience, the solutions are provided here, but remember the core idea from Notebook 11.a!

In [None]:
def letter_to_number(letter):
    """Encodes a single uppercase letter into a number from 0-25."""
    return ord(letter.upper()) - ord('A')

def number_to_letter(number):
    """Decodes a number from 0-25 back into an uppercase letter."""
    return chr(number + ord('A'))

# --- Test the Codec ---
test_letter = 'C'
test_number = letter_to_number(test_letter)
print(f"'{test_letter}' encodes to the number {test_number}") # Expected: 2

decoded_letter = number_to_letter(test_number)
print(f"The number {test_number} decodes back to '{decoded_letter}'") # Expected: C
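
A codec is only useful if it round-trips. As a quick sanity check (a sketch that repeats the two helpers so it runs on its own), the loop below confirms that every letter A-Z survives encoding and decoding.

```python
def letter_to_number(letter):
    """Encodes a single letter into a number from 0-25."""
    return ord(letter.upper()) - ord('A')

def number_to_letter(number):
    """Decodes a number from 0-25 back into an uppercase letter."""
    return chr(number + ord('A'))

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
all_round_trip = True
for letter in alphabet:
    # Encode, then decode, and compare with the starting letter.
    if number_to_letter(letter_to_number(letter)) != letter:
        all_round_trip = False

print("Every letter round-trips:", all_round_trip)   # True
print(letter_to_number('A'), letter_to_number('Z'))  # 0 25
```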

## 🐍 New Concept: Helper Functions for Message Preparation

Before we build our full encryption pipeline, we need a way to clean up our messages. The Caesar cipher traditionally only works on letters, and we've decided to focus on uppercase letters (A-Z) for simplicity. This means we need to:

1.  Convert all letters to uppercase.
2.  Remove any characters that are not letters (spaces, punctuation, numbers, etc.).

Let's create a helper function, `prepare_message`, to do this for us. This keeps our main encryption functions clean and focused on their primary task.

In [None]:
def prepare_message(message):
    """Converts a message to uppercase and filters out non-alphabetic characters."""
    cleaned_message = ""
    for char in message:
        if 'A' <= char.upper() <= 'Z': # Check if it's an English letter
            cleaned_message += char.upper()
    return cleaned_message

# --- Test the Helper Function ---
test_message = "Hello, World! 123"
prepared = prepare_message(test_message)
print(f"Original: '{test_message}'")
print(f"Prepared: '{prepared}'") # Expected: 'HELLOWORLD'
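
One edge case worth knowing: `char.upper()` can turn some non-English characters into English letters (for example, `'ß'.upper()` is `'SS'`), so they would pass the check above. If that matters for your messages, a stricter variant, sketched here under the hypothetical name `prepare_message_strict`, tests the original character against `string.ascii_letters` before uppercasing.

```python
import string

def prepare_message_strict(message):
    """Keeps only the letters a-z / A-Z and returns them in uppercase."""
    cleaned_message = ""
    for char in message:
        if char in string.ascii_letters:   # only plain English letters
            cleaned_message += char.upper()
    return cleaned_message

print(prepare_message_strict("Hello, World! 123"))  # HELLOWORLD
print(prepare_message_strict("Straße"))             # STRAE
```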

💡 **Tip:** You'll notice that 'A', 'B', 'C', ... 'Z' have consecutive numerical values. This is key to how we'll implement the Caesar cipher shift!

For uppercase letters 'A' through 'Z':
*   `ord('A')` is 65
*   `ord('Z')` is 90

If we have a character, say 'C' (`ord('C')` is 67), and we want to shift it by 3:
1.  Get its number: `ord('C')` -> `67`
2.  Add the shift: `67 + 3` -> `70`
3.  Convert back to character: `chr(70)` -> `'F'`

This seems to work! But what about wrapping around from 'Z' back to 'A'?

### Wrapping Around: The Modulo Operator (`%`)

If we have 'Y' (`ord('Y')` is 89) and we shift by 3, `89 + 3 = 92`. `chr(92)` is '\\'. That's not what we want! We want 'B'.

This is where a very useful tool called the **modulo operator (`%`)** comes in. The modulo operator gives you the remainder of a division.

Example: `10 % 3` is `1` (because 10 divided by 3 is 3 with a remainder of 1).
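
A few more remainders can help build intuition before applying `%` to the alphabet. The built-in `divmod()` returns the quotient and the remainder together.

```python
print(10 % 3)         # 1
print(divmod(10, 3))  # (3, 1) -> quotient 3, remainder 1
print(25 % 26)        # 25 (smaller than 26, so unchanged)
print(26 % 26)        # 0  (exactly one full lap)
print(27 % 26)        # 1  (one full lap plus 1)
```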

For our cipher, we're working with 26 letters in the alphabet.

Let's think about the letters as having positions 0 through 25. We can calculate this by taking the `ord()` value of a letter and subtracting the `ord()` value of 'A'.

| Character | `ord(Character)` | `ord(Character) - ord('A')` | Position (0-25) |
| :-------- | :--------------- | :-------------------------- | :-------------- |
| 'A'       | 65               | `65 - 65`                   | 0               |
| 'B'       | 66               | `66 - 65`                   | 1               |
| 'C'       | 67               | `67 - 65`                   | 2               |
| ...       | ...              | ...                         | ...             |
| 'Z'       | 90               | `90 - 65`                   | 25              |

To get the 0-25 position of a character `char`:
`position = ord(char) - ord('A')`

Example for 'C':
* `ord('C')` is 67.
* `ord('A')` is 65.
* `position = 67 - 65 = 2`. So 'C' is at position 2 (0-indexed).

Now, to shift and wrap:
1.  Get the 0-25 position: `original_pos = ord(char) - ord('A')`
2.  Add the shift: `shifted_pos_temp = original_pos + shift`
3.  Apply modulo 26 to wrap around: `final_pos = shifted_pos_temp % 26`
4.  Convert back to the ASCII range by adding `ord('A')`: `final_char_code = final_pos + ord('A')`
5.  Convert code to character: `final_char = chr(final_char_code)`

Let's try 'Y' (position 24) with a shift of 3:
1.  `original_pos = ord('Y') - ord('A') = 89 - 65 = 24`
2.  `shifted_pos_temp = 24 + 3 = 27`
3.  `final_pos = 27 % 26 = 1` (This is the 0-25 position for 'B'!)
4.  `final_char_code = 1 + ord('A') = 1 + 65 = 66`
5.  `final_char = chr(66)` which is 'B'. It works!
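
The difference between the naive shift and the wrapped shift can be checked directly. Here is a small sketch comparing both for 'Y' with a shift of 3 (the variable names are illustrative):

```python
char = "Y"
shift = 3

# Naive: add the shift straight to the character code (runs past 'Z').
naive = chr(ord(char) + shift)

# Wrapped: shift the 0-25 position, wrap with % 26, then convert back.
wrapped = chr((ord(char) - ord("A") + shift) % 26 + ord("A"))

print(repr(naive))    # '\\'  (a single backslash character)
print(repr(wrapped))  # 'B'
```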

⚠️ **Heads Up!** This logic assumes we are only dealing with uppercase English letters 'A' through 'Z'.

### Helper Function: Checking for Uppercase

To make our code cleaner when we check if a character is an uppercase letter, let's define a small helper function.

In [None]:
def is_uppercase(char):
    """Checks if a character is an uppercase English letter (A-Z)."""
    return "A" <= char <= "Z"

# Test it
print("'C' is uppercase:", is_uppercase("C"))  # Expected: True
print("'c' is uppercase:", is_uppercase("c"))  # Expected: False
print("'!' is uppercase:", is_uppercase("!"))  # Expected: False

### Handling Non-Alphabetic Characters

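What about spaces, punctuation, or numbers in our message? The Caesar cipher traditionally only applies to letters.
For our character-level function, we'll make a design choice: **If a character is not an uppercase letter (A-Z), we'll leave it unchanged.** This way, a message that was not run through `prepare_message` first keeps its spaces and punctuation instead of causing errors.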

We can use our `is_uppercase(char)` helper function for this check:
```python
if is_uppercase(char):
    # It's an uppercase letter, so encrypt it
else:
    # It's not an uppercase letter, so keep it as is
```

## 🎯 Your Turn to Code: The `encode_char()` Function

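Let's create a function `encode_char(char, shift)` that takes a single character and a shift value, and returns the shifted character. Note that in the vocabulary of Notebook 11.a, this function actually *encrypts*, because its result depends on the secret shift.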

**Remember our logic:**
1.  Check if `char` is an uppercase letter (between 'A' and 'Z').
2.  If it is:
    a.  Calculate its 0-25 position (e.g., 'A' is 0, 'B' is 1, ...).
    b.  Add the `shift`.
    c.  Use the modulo (`% 26`) operator to wrap around.
    d.  Convert the new 0-25 position back to an ASCII character code (by adding `ord('A')`).
    e.  Convert the code back to a character using `chr()`.
    f.  Return this new character.
3.  If it's NOT an uppercase letter, just return the original `char` unchanged.

In [None]:
def encode_char(char, shift):
    """Encodes a single character using the Caesar cipher.
    Only encrypts uppercase English letters (A-Z).
    Other characters are returned unchanged.
    """
    if is_uppercase(char): # Check if it's an uppercase letter
        # It's an uppercase letter, so encrypt it
        start_code = ord("A")
        # 1. Get the 0-25 position from 'A'
        original_pos = ord(char) - start_code # YOUR CODE HERE (hint: ord(char) - start_code)

        # 2. Add the shift
        shifted_pos_temp = original_pos + shift # YOUR CODE HERE

        # 3. Apply modulo 26 to wrap around
        final_pos = shifted_pos_temp % 26 # YOUR CODE HERE

        # 4. Convert back to the ASCII range by adding start_code (ord('A'))
        final_char_code = final_pos + start_code # YOUR CODE HERE

        # 5. Convert code to character
        encoded_character = chr(final_char_code) # YOUR CODE HERE

        return encoded_character
    else:
        # It's not an uppercase letter, so keep it as is
        return char # YOUR CODE HERE (return the original character)

# Let's test it!
print("Encoding 'A' with shift 3:", encode_char("A", 3))  # Expected: D
print("Encoding 'X' with shift 5:", encode_char("X", 5))  # Expected: C
print("Encoding 'H' with shift 7:", encode_char("H", 7))  # Expected: O
print("Encoding ' ' with shift 3:", encode_char(" ", 3))  # Expected: ' ' (a space, unchanged)
print("Encoding '!' with shift 5:", encode_char("!", 5))  # Expected: !
print("Encoding 'm' with shift 3:", encode_char("m", 3))  # Expected: m (because it's lowercase)
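
Because of `% 26`, the same steps also cope with shifts that are negative or larger than 25. The sketch below uses a condensed, hypothetical `shift_letter` helper (the same arithmetic as `encode_char`, repeated so the cell runs on its own) to show this.

```python
def shift_letter(char, shift):
    """Condensed version of the encode_char arithmetic for A-Z."""
    if "A" <= char <= "Z":
        return chr((ord(char) - ord("A") + shift) % 26 + ord("A"))
    return char

print(shift_letter("A", 26))  # A (a full lap changes nothing)
print(shift_letter("A", 29))  # D (29 behaves like 3)
print(shift_letter("A", -1))  # Z (negative shifts wrap backwards)
```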

## 🐍 Python Tool: Loops for Repetition (`for` loops)

Great! We can now encode a single character. But messages usually have many characters!
We need a way to go through each character in our message string and apply our `encode_char` function to it.

This is where **loops** come in. A loop is a way to repeat a block of code multiple times.
Python's `for` loop is perfect for iterating over sequences, like the characters in a string.

**Basic `for` loop syntax with a string:**
To go through each character, we can get the length of the string and then loop from 0 up to (but not including) the length. Inside the loop, we use the current number (index) to get the character at that position.
```python
my_string = "PYTHON"
string_length = len(my_string)
for i in range(string_length):  # i will go from 0, 1, 2, ..., up to length-1
    letter = my_string[i]       # Get the character at index i
    # This block of code will run for each character in my_string
    print(letter)               # 'letter' holds the current character
```
This would print:
P
Y
T
H
O
N

The `range(number)` function generates a sequence of numbers starting from 0 up to (but not including) `number`.
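
To see exactly which indexes `range()` produces, you can wrap it in `list()`. The second half of this sketch shows an equivalent loop style that visits the characters directly, without indexes. Which one to use is largely a matter of preference here.

```python
my_string = "PYTHON"

print(list(range(4)))               # [0, 1, 2, 3]
print(list(range(len(my_string))))  # [0, 1, 2, 3, 4, 5]

# Visit each character directly instead of going through an index.
letters = []
for letter in my_string:
    letters.append(letter)
print(letters)  # ['P', 'Y', 'T', 'H', 'O', 'N']
```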

✅ **Check Your Understanding:**

Consider the following Python code:
```python
word = "LOOP"
for i in range(len(word)):
    character = word[i]
    print(character + "!")
```
What would be printed to the screen when this code is run?

<details>
  <summary>Click to see the answer</summary>
  The code would print:
  L!
  O!
  O!
  P!
</details>

In [None]:
example_message = "CODE"
print("Iterating through the message:", example_message)

message_length = len(example_message)
for index in range(message_length):
    current_char = example_message[index]
    print("Character at index", index, "is:", current_char)
    # Later, we'll call encode_char(current_char, shift) here

## 🎯 Your Turn to Code: The `encode_message()` Function

Now, let's write `encode_message(message, shift)`.

**Logic:**
1.  First, convert the entire input `message` to uppercase using `.upper()`.
2.  Create an empty list (e.g., `encoded_chars = []`) to store our resulting characters.
3.  Use a `for` loop with `range(len(uppercase_message))` to get each index `i`.
4.  Inside the loop, get the character `char_to_encode = uppercase_message[i]`.
5.  Call your `encode_char(char_to_encode, shift)` function for the current character and the given shift.
6.  Append the result from `encode_char` to your `encoded_chars` list.
7.  After the loop finishes, join the `encoded_chars` list into a single string and `return` it.

In [None]:
def encode_message(message, shift):
    """Encodes an entire message using the Caesar cipher.
    Converts message to uppercase first.
    """
    uppercase_message = message.upper() # 1. Convert to uppercase
    # 2. Initialize an empty LIST to store the encoded characters
    encoded_chars = [] # YOUR CODE HERE

    # 3. Loop through each character in the uppercase_message
    for i in range(len(uppercase_message)): # YOUR CODE HERE (complete the for loop header using range and len)
        # 4. Get the character at the current index i
        char_to_encode = uppercase_message[i] # YOUR CODE HERE
        # 5. Encode the current character using encode_char()
        encoded_char = encode_char(char_to_encode, shift) # YOUR CODE HERE

        # 6. Add the encoded character to our list of characters
        encoded_chars.append(encoded_char) # YOUR CODE HERE

    # 7. Join the list of characters back into a single string
    encoded_text = "".join(encoded_chars) # YOUR CODE HERE
    # 8. Return the fully encoded message
    return encoded_text # YOUR CODE HERE

# Let's test encode_message!
secret_shift = 3
plain_message = "Hello World"
cipher_text = encode_message(plain_message, secret_shift)
print("Original: '" + plain_message + "'")
print("Shift: " + str(secret_shift))
print("Encoded:  '" + cipher_text + "'") # Expected: KHOOR ZRUOG

another_message = "PYTHON IS FUN!"
another_shift = 7
encoded_another = encode_message(another_message, another_shift)
print("Original: '" + another_message + "'")
print("Shift: " + str(another_shift))
print("Encoded:  '" + encoded_another + "'") # Expected: WFAOVU PZ MBU!

## 🕵️‍♀️ Decoding the Message

Fantastic! You can now encode messages. But a secret message isn't much good if the recipient can't decode it!

How do we reverse the process?
If encoding involves shifting letters *forward* by `shift` positions, decoding should involve shifting them *backward* by `shift` positions.

### 🎯 Your Turn to Code: The `decode_char()` Function

Let's write `decode_char(char, shift)`.
The logic is very similar to `encode_char`, but instead of *adding* the shift, you'll be *subtracting* it.

**Think about the 0-25 position calculation:**
`original_pos = ord(char) - ord('A')`
`shifted_pos_temp = original_pos - shift`  <-- Notice the minus!
`final_pos = shifted_pos_temp % 26`
`final_char_code = final_pos + ord('A')`
`decoded_character = chr(final_char_code)`

⚠️ **A small math detail for modulo with negative numbers:**
In Python, `(-5) % 26` gives `21`. This is good! It means if we are at 'A' (pos 0) and shift back by 5, we get `(0 - 5) % 26 = -5 % 26 = 21`, which is 'V'. This is the correct behavior for wrapping backwards.
So, the formula `(original_pos - shift) % 26` works correctly for decoding too!
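
You can confirm this behavior in a cell. Keep in mind that it is specific to how Python defines `%`; some other languages return a negative remainder for a negative left operand.

```python
print(-5 % 26)                       # 21
print((0 - 5) % 26)                  # 21
print(chr((0 - 5) % 26 + ord("A")))  # V
print((2 - 5) % 26)                  # 23 -> 'C' shifted back by 5 is 'X'
```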

In [None]:
def decode_char(char, shift):
    """Decodes a single character using the Caesar cipher.
    Only decrypts uppercase English letters (A-Z).
    Other characters are returned unchanged.
    """
    if is_uppercase(char):
        start_code = ord("A")
        original_pos = ord(char) - start_code

        # Subtract the shift for decoding
        shifted_pos_temp = original_pos - shift # YOUR CODE HERE

        final_pos = shifted_pos_temp % 26
        final_char_code = final_pos + start_code
        decoded_character = chr(final_char_code)
        return decoded_character
    else:
        return char

# Let's test decode_char!
print("Decoding 'D' with shift 3:", decode_char("D", 3))  # Expected: A
print("Decoding 'C' with shift 5:", decode_char("C", 5))  # Expected: X
print("Decoding 'O' with shift 7:", decode_char("O", 7))  # Expected: H
print("Decoding ' ' with shift 3:", decode_char(" ", 3))  # Expected: ' ' (a space, unchanged)
print("Decoding '!' with shift 5:", decode_char("!", 5))  # Expected: !

### 🤔 Stop and Think: `encode_char` vs. `decode_char`

Look at your `encode_char` and `decode_char` functions. They are very similar!

*   How are they different?
*   Could `decode_char(char, shift)` be implemented by calling `encode_char(char, some_modified_shift)`?
    *   Hint: If encoding is shifting by `+shift`, decoding is like encoding by `-shift`.
    *   So, `decode_char(char, shift)` could potentially be `encode_char(char, -shift)` or `encode_char(char, 26 - shift)` (because `(X - S) % 26` is the same as `(X + (26 - S)) % 26`).
    *   This is an example of the **DRY (Don't Repeat Yourself)** principle in programming. If you find yourself writing very similar code, see if you can reuse parts of it!

For now, having two separate functions is fine for learning, but it's good to think about these connections.
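
One way to act on that hint is sketched below with hypothetical names (`shift_char` and `unshift_char`) so it doesn't clash with your own functions: write the shifting logic once and let decoding call it with the negated shift.

```python
def shift_char(char, shift):
    """Shifts an uppercase letter A-Z; other characters are unchanged."""
    if "A" <= char <= "Z":
        return chr((ord(char) - ord("A") + shift) % 26 + ord("A"))
    return char

def unshift_char(char, shift):
    """Decoding is just encoding with the opposite shift."""
    return shift_char(char, -shift)

print(shift_char("X", 5))    # C
print(unshift_char("C", 5))  # X
```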

💡 **Tip for Testing:** A good way to test if your encode and decode functions work together is to see if decoding an encoded character gets you back to the original:
`original_char == decode_char(encode_char(original_char, shift), shift)` should be `True`.

In [None]:
test_char = "P"
test_shift = 10
encoded_p = encode_char(test_char, test_shift)
decoded_back = decode_char(encoded_p, test_shift)

print("Original: " + test_char)
print("Encoded with shift " + str(test_shift) + ": " + encoded_p)
print("Decoded back with shift " + str(test_shift) + ": " + decoded_back)
print("Do they match? " + str(test_char == decoded_back))

### 🎯 Your Turn to Code: The `decode_message()` Function

Now, create `decode_message(message, shift)`.
This will be very similar to `encode_message`, but it will call `decode_char` inside the loop.

In [None]:
def decode_message(message, shift):
    """Decodes an entire message using the Caesar cipher.
    Assumes the ciphertext is in the format produced by encode_message
    (uppercase letters, other characters unchanged). To accept mixed-case
    input as well, you could add: message = message.upper()
    """
    # Initialize an empty LIST to store the decoded characters
    decoded_chars = [] # YOUR CODE HERE

    # Loop through each character in the message
    for i in range(len(message)): # YOUR CODE HERE (complete the for loop header using range and len)
        # Get the character at the current index i
        char_to_decode = message[i] # YOUR CODE HERE
        # Decode the current character using decode_char()
        decoded_char = decode_char(char_to_decode, shift) # YOUR CODE HERE

        # Add the decoded character to our list of characters
        decoded_chars.append(decoded_char) # YOUR CODE HERE

    # Join the list of characters back into a single string
    decoded_text = "".join(decoded_chars) # YOUR CODE HERE

    return decoded_text # YOUR CODE HERE

# Let's test decode_message!
cipher_to_decode = "KHOOR ZRUOG"
key_shift = 3
original_plaintext = decode_message(cipher_to_decode, key_shift)
print("Ciphertext: '" + cipher_to_decode + "'")
print("Shift: " + str(key_shift))
print("Decoded:    '" + original_plaintext + "'") # Expected: HELLO WORLD

another_cipher = "WFAOVU PZ MBU!"
another_key = 7
decoded_another_plain = decode_message(another_cipher, another_key)
print("Ciphertext: '" + another_cipher + "'")
print("Shift: " + str(another_key))
print("Decoded:    '" + decoded_another_plain + "'") # Expected: PYTHON IS FUN!

### 🤔 Stop and Think: `encode_message` vs. `decode_message`

*   Is the relationship between `encode_message` and `decode_message` similar to the one between `encode_char` and `decode_char`?
*   Could you reuse code here too? (e.g., `decode_message(message, shift)` could call `encode_message(message, -shift)` or `encode_message(message, 26 - shift)` if `encode_message` correctly handles negative/alternative shifts for its `encode_char` calls).

**Testing the whole system:**

In [None]:
my_secret_message = "Meet me at midnight!"
my_secret_key = 5

print("Original Message: " + my_secret_message)

encrypted_version = encode_message(my_secret_message, my_secret_key)
print("Encoded Version:  " + encrypted_version)

decrypted_version = decode_message(encrypted_version, my_secret_key)
print("Decoded Version:  " + decrypted_version)

# Check if the final decoded version matches the original (after converting original to uppercase for fair comparison)
if my_secret_message.upper() == decrypted_version:
    print("\n🎉 Success! The message was encoded and decoded correctly!")
else:
    print("\n🤔 Hmm, something went wrong. The decoded message doesn't match the original uppercase message.")

✅ **Check Your Understanding:**

What would the following Python code display?
```python
print(chr(ord('M') + 2))
```

<details>
  <summary>Click to see the answer</summary>
  The code would print:
  `O`
  
  (Because `ord('M')` gives the numerical value for 'M'. Adding 2 to it gives the numerical value for 'O'. `chr()` then converts this new number back to the character 'O'.)
</details>

## 🎉 Notebook 11.b Wrap-up & What's Next! 🎉

Congratulations, Agent! You've successfully implemented the Caesar Cipher in Python!

**Here's a recap of what you learned and built:**
*   The **Caesar Cipher** works by shifting letters a fixed number of places in the alphabet.
*   Python's `ord()` and `chr()` functions are essential for converting between characters and their numerical (ASCII/Unicode) values, which allows us to do math with letters.
*   The **modulo operator (`%`)** is crucial for handling the "wrap-around" effect in the alphabet (e.g., 'Z' + 1 = 'A').
*   String methods like `.upper()` help simplify text processing.
*   **`for` loops** allow you to iterate through each character of a string to perform operations like encoding or decoding.
*   You built functions (`encode_char`, `encode_message`, `decode_char`, `decode_message`) to create a modular and reusable cipher tool.
*   You thought about the **DRY (Don't Repeat Yourself)** principle when comparing encoding and decoding logic.

**Key Takeaways:**
*   Complex problems can be broken down into smaller, manageable functions.
*   Understanding how characters are represented as numbers opens up many possibilities for text manipulation.
*   Loops are fundamental for processing items in a sequence (like characters in a string or items in a list).

### Next Up: Prime Numbers 🔢

In our next notebook, [🔢 Notebook 8: Prime Numbers - A Problem-Solving Adventure!](https://colab.research.google.com/github/sguy/programming-and-problem-solving/blob/main/notebooks/09-prime-numbers.ipynb), we'll switch gears from secret codes to number theory as we explore **Prime Numbers**. We'll learn how to determine if a number is prime and generate lists of primes, using and reinforcing concepts like loops and conditional logic in a new context.

**Going Further (Optional Challenges):**
*   Can you modify your Caesar cipher to handle lowercase letters as well (encrypting 'a' to 'd' with shift 3, etc.)?
*   What about numbers? Should they be shifted too, or left alone?
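*   The Caesar cipher is quite easy to break. Can you think why? (Hints: there are only 25 useful shifts to try, and letter frequencies survive the shift.) Research other simple ciphers like the Vigenère cipher, which is considerably harder to break, though not unbreakable.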

[Return to Table of Contents](https://colab.research.google.com/github/sguy/programming-and-problem-solving/blob/main/notebooks/table-of-contents.ipynb)