
# Introduction to Cybersecurity: Why Security Matters in Code

Cybersecurity is the practice of protecting systems, networks, and programs from digital attacks. In today's interconnected world, nearly every aspect of our lives involves digital information that needs protection.

For a beginning programmer, understanding cybersecurity fundamentals will help you:

* Write safer code that protects user data
* Understand potential vulnerabilities in applications
* Develop habits that prevent security breaches
  * Validating user input
  * Securing sensitive information
  * Testing for common security flaws
* Prepare for careers in an increasingly security-conscious industry

**Cybersecurity** refers to the body of technologies, processes, and practices designed to protect networks, devices, programs, and data from attack, damage, or unauthorized access.

**Malicious actors** are individuals or groups who attempt to exploit vulnerabilities in software and hardware for various purposes, including stealing data, causing disruption, or gaining unauthorized access to systems.

Remember: security isn't something added at the end of development; it should be considered from the very beginning of any project!

# Understanding Text in Python: Strings, Unicode, and ASCII

Before we can work with cybersecurity concepts, we need to understand how computers store and process text. In Python, text is represented as **strings**, which are sequences of characters.

Each character in a string is represented by a numeric code:

* **ASCII** (American Standard Code for Information Interchange) is an older standard that uses 7 bits to represent 128 different characters
  * Includes unaccented English letters, digits, punctuation, and non-printing control codes
  * Cannot represent accented letters or characters from other writing systems

* **Unicode** is a modern standard that can represent virtually every character from all writing systems worldwide
  * Includes characters from all languages, mathematical symbols, emojis, and more
  * Python 3 uses Unicode for all strings by default: every `str` is a sequence of Unicode characters

* **UTF-8** (Unicode Transformation Format 8-bit) is the most common encoding for Unicode
  * A variable-width encoding that uses between 1 and 4 bytes per character
  * ASCII characters use just 1 byte (efficient for English text)
  * All other characters use 2 to 4 bytes (for example, Cyrillic letters use 2, most Japanese characters use 3, and most emojis use 4)
  * This makes UTF-8 both compact and universal
  * It's the dominant encoding for the web and most software

In [3]:
# Let's examine how UTF-8 uses different numbers of bytes per character
for char in "ABCйあ😎":
    char_bytes = char.encode('utf-8')
    print(f"Character: {char}, UTF-8 bytes: {char_bytes}, Length: {len(char_bytes)} bytes")


Character: A, UTF-8 bytes: b'A', Length: 1 bytes
Character: B, UTF-8 bytes: b'B', Length: 1 bytes
Character: C, UTF-8 bytes: b'C', Length: 1 bytes
Character: й, UTF-8 bytes: b'\xd0\xb9', Length: 2 bytes
Character: あ, UTF-8 bytes: b'\xe3\x81\x82', Length: 3 bytes
Character: 😎, UTF-8 bytes: b'\xf0\x9f\x98\x8e', Length: 4 bytes
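
Because characters can take more than one byte, the number of characters in a string and the size of its encoded form can differ. The following is a minimal sketch of that idea using only the built-in `str.encode()` and `bytes.decode()` methods; the sample text and variable names are chosen here for illustration.

```python
# Illustrative sample: 3 ASCII letters plus 3 non-ASCII characters
text = "ABCйあ😎"

encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

print(f"Characters: {len(text)}")       # len() on a str counts characters
print(f"UTF-8 bytes: {len(encoded)}")   # len() on bytes counts bytes: 1+1+1+2+3+4
print(f"Round trip matches: {decoded == text}")

# ASCII covers only 128 characters, so encoding this text as ASCII fails
try:
    text.encode("ascii")
except UnicodeEncodeError as error:
    print(f"ASCII encoding failed: {error.reason}")
```

Decoding with the same encoding that produced the bytes returns the original string, which is why both sides of a communication need to agree on the encoding.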


Here's how characters map to their numeric values:

| Character | ASCII Value | Unicode Code Point (decimal) |
|-----------|-------------|------------------------------|
| 'A' | 65 | 65 |
| 'B' | 66 | 66 |
| 'Z' | 90 | 90 |
| 'a' | 97 | 97 |
| 'z' | 122 | 122 |
| '!' | 33 | 33 |
| 'й' (Cyrillic) | N/A (not in ASCII) | 1081 |
| '東' (Japanese) | N/A (not in ASCII) | 26481 |

Let's explore working with character codes in Python:

In [6]:
# Working with character codes - Spy Communication System Basics

# The ord() function gets the numeric value of a character
print("Character to code conversion:")
print(f"The code for 'A' is: {ord('A')}")
print(f"The code for 'a' is: {ord('a')}")
print(f"The code for '!' is: {ord('!')} \n")


Character to code conversion:
The code for 'A' is: 65
The code for 'a' is: 97
The code for '!' is: 33 



In [7]:

# The chr() function converts a numeric code back to a character
print("Code to character conversion:")
print(f"The character for code 77 is: {chr(77)}")  # M
print(f"The character for code 105 is: {chr(105)}")  # i
print(f"The character for code 54 is: {chr(54)} \n")  # 6


Code to character conversion:
The character for code 77 is: M
The character for code 105 is: i
The character for code 54 is: 6 



In [5]:
# A spy might examine a message character by character
secret_codename = "Agent007"
print(f"Examining codename: {secret_codename}")
for char in secret_codename:
    print(f"Character: {char}, Code: {ord(char)}")

Examining codename: Agent007
Character: A, Code: 65
Character: g, Code: 103
Character: e, Code: 101
Character: n, Code: 110
Character: t, Code: 116
Character: 0, Code: 48
Character: 0, Code: 48
Character: 7, Code: 55
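
Since `ord()` and `chr()` are inverses, a character's code can be treated as an ordinary number, changed with arithmetic, and converted back. The sketch below is one possible illustration of that, not part of the original examples; the variable names (`offset`, `codes`, `rebuilt`) are assumptions made for this demonstration.

```python
# Uppercase and lowercase ASCII letters sit a fixed distance apart
offset = ord('a') - ord('A')   # 97 - 65
print(f"Offset between 'A' and 'a': {offset}")

# chr() reverses ord(), so a string survives the round trip through codes
codename = "Agent007"   # same sample codename as above
codes = [ord(char) for char in codename]

rebuilt = ""
for code in codes:
    rebuilt += chr(code)   # build a new string one character at a time

print(f"Codes: {codes}")
print(f"Rebuilt: {rebuilt}")

# Adding the offset to an uppercase letter's code gives its lowercase form
print(f"chr(ord('M') + offset) = {chr(ord('M') + offset)}")
```

This kind of arithmetic on character codes only behaves predictably within a known range, such as the ASCII letters used here.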


# Essential String Methods for Cryptography

Python's built-in string methods provide powerful tools for manipulating text, which will be essential for our cybersecurity work. Let's explore some key methods through examples:

* **upper()** and **lower()**: Convert text to uppercase or lowercase
  * Useful for normalizing text
  * Example: `"Secret".upper()` returns `"SECRET"`

* **join()**: Combines a list of strings, using the string it is called on as the separator
  * Great for reassembling processed characters
  * Example: `"-".join(['C', 'I', 'A'])` returns `"C-I-A"`

* **split()**: Divides a string into a list based on a delimiter (any run of whitespace when no delimiter is given)
  * Helpful for breaking messages into processable chunks
  * Example: `"Operation Midnight".split()` returns `['Operation', 'Midnight']`

* **replace()**: Returns a new string with every occurrence of the specified text substituted with new text
  * Essential for substitution operations
  * Example: `"Agent".replace('A', '4')` returns `"4gent"`

Let's explore these methods:

In [8]:
# A classified message
message = "Meet Agent X at the Blue Parrot Cafe"

# 1. Converting case
upper_message = message.upper()
lower_message = message.lower()

print("Original message:", message)
print("Uppercase:", upper_message)
print("Lowercase:", lower_message)

Original message: Meet Agent X at the Blue Parrot Cafe
Uppercase: MEET AGENT X AT THE BLUE PARROT CAFE
Lowercase: meet agent x at the blue parrot cafe
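
One common use of case conversion is normalizing text before comparing it. The following is a small sketch under the assumption that two strings should count as equal when they differ only in capitalization; the sample values are hypothetical.

```python
# Hypothetical inputs: the same location typed with different capitalization
expected = "Blue Parrot Cafe"
typed = "BLUE parrot cafe"

# A direct comparison is case-sensitive
print("Exact match:", typed == expected)

# Converting both sides to lowercase first makes the comparison case-insensitive
print("Normalized match:", typed.lower() == expected.lower())
```

For text that goes beyond English letters, Python also provides `str.casefold()`, a more aggressive form of lowercasing intended for caseless comparison.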


In [9]:
# 2. Splitting strings
words = message.split()
print("\nSplit into words:", words)
print("Number of words:", len(words))

# Split by a specific character
parts = message.split('a')
print("Split by 'a':", parts)


Split into words: ['Meet', 'Agent', 'X', 'at', 'the', 'Blue', 'Parrot', 'Cafe']
Number of words: 8
Split by 'a': ['Meet Agent X ', 't the Blue P', 'rrot C', 'fe']


In [10]:
# 3. Joining strings
code_name = ['S', 'P', 'E', 'C', 'T', 'R', 'E']
joined_name = "".join(code_name)
print("\nJoined characters:", joined_name)

# Join with a separator
dash_name = "-".join(code_name)
print("Joined with dashes:", dash_name)


Joined characters: SPECTRE
Joined with dashes: S-P-E-C-T-R-E
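
`split()` and `join()` are often used together: split a string into pieces, then reassemble the pieces with a chosen separator. The sketch below uses that pairing to tidy up irregular spacing; the `messy` string and the names `pieces` and `clean` are illustrative assumptions.

```python
# Hypothetical message with irregular spacing
messy = "  Meet   Agent X  at the   Blue Parrot Cafe "

# split() with no argument splits on runs of whitespace and drops empty pieces
pieces = messy.split()

# Joining with a single space rebuilds a clean message
clean = " ".join(pieces)

print("Pieces:", pieces)
print("Cleaned:", repr(clean))   # repr() makes leading/trailing spaces visible
```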


In [11]:
# 4. Replacing text
redacted = message.replace("Agent X", "[REDACTED]")
print("\nRedacted message:", redacted)

# Multiple replacements can be chained
coded = message.replace('e', '3').replace('a', '4').replace('t', '7')
print("Basic letter substitution:", coded)


Redacted message: Meet [REDACTED] at the Blue Parrot Cafe
Basic letter substitution: M337 Ag3n7 X 47 7h3 Blu3 P4rro7 C4f3
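
Chained `replace()` calls run one after another, so a later call can also rewrite characters produced by an earlier one. The sketch below shows that pitfall and one possible way around it using the built-in `str.maketrans()` and `str.translate()` methods, which are not used elsewhere in this section; the sample word is illustrative.

```python
word = "abba"   # illustrative sample text

# Attempt to swap 'a' and 'b' with chained replace():
# the first call turns every 'a' into 'b', then the second turns every 'b' into 'a'
chained = word.replace('a', 'b').replace('b', 'a')
print("Chained replace:", chained)

# str.maketrans() builds a character mapping that translate() applies in one pass,
# so each original character is substituted exactly once
swap_table = str.maketrans("ab", "ba")
translated = word.translate(swap_table)
print("Single-pass translate:", translated)
```

In the substitution example above, none of the replacement characters ('3', '4', '7') is itself a target of a later `replace()`, so the chaining order does not change the result there.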


**String immutability** means that a string cannot be modified after it is created; each operation returns a new string instead. This is why we need to capture the result when using these methods:


In [14]:
# Demonstrating string immutability
codename = "SKYFALL"
print(f"Original codename: {codename}")

# This doesn't change codename!
codename.replace('S', '$')
print(f"After replace without assignment: {codename}")  # Still "SKYFALL"

Original codename: SKYFALL
After replace without assignment: SKYFALL


In [15]:

# This works because we create a new string and reassign
codename = codename.replace('S', '$')
print(f"After replace with assignment: {codename}")  # Now "$KYFALL"

After replace with assignment: $KYFALL
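
Immutability also means that individual characters cannot be reassigned by index. As a brief sketch (the variable `original` is introduced here only for illustration), attempting to do so raises a `TypeError`, while building a new string from slices works:

```python
codename = "SKYFALL"

# Strings do not support item assignment, so Python raises TypeError
try:
    codename[0] = '$'
except TypeError as error:
    print(f"Cannot modify in place: {error}")

# Keep a second name pointing at the original string
original = codename

# Build a new string from a new first character plus a slice of the old one
codename = '$' + codename[1:]

print(f"New string: {codename}")
print(f"Original string is unchanged: {original}")
```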


These string methods provide the building blocks for text manipulation that we'll use in more complex cryptographic operations later.