# Lab 01: Python for Security Fundamentals

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/depalmar/ai_for_the_win/blob/main/notebooks/lab01_python_security.ipynb)

**Difficulty: Easy | Time: 3-4 hours | No Prerequisites**

Welcome to AI for the Win! This introductory lab teaches Python basics through security-focused examples. No prior programming experience required.

## Learning Objectives

By the end of this lab, you will:
1. **Write and run Python scripts** - Your first security-focused code
2. **Work with data types** - Strings, numbers, booleans, lists, dictionaries
3. **Control program flow** - If statements, loops, functions
4. **Read and write files** - Logs, CSVs, JSON
5. **Parse security data** - Regular expressions for IOC extraction
6. **Make HTTP requests** - API interactions for threat intelligence

## Why Python for Security?

```
                  WHY PYTHON FOR SECURITY?
                                                             
   Industry Standard: Many security tools are written in Python
   Rich Libraries: pandas, requests, scikit-learn         
   Quick Prototyping: Rapid tool development              
   AI/ML Ready: All major frameworks support Python       
                                                             
   Common Uses:                                              
   - Log parsing and analysis                                
   - IOC extraction and enrichment                           
   - Automation of security tasks                            
   - Machine learning for threat detection                   
   - API integrations (VirusTotal, MISP, etc.)              
```

**Next:** Lab 02 (Prompt Engineering) to learn how to communicate with AI effectively

---

# Part 1: Python Basics

This section covers the fundamental building blocks of Python programming.

## 1.1 Your First Python Code

Python reads code from top to bottom. The `print()` function displays output.

In [None]:
# This is a comment - Python ignores lines starting with #
# Comments help explain your code to others (and your future self!)

print("Welcome to AI for the Win!")
print("Let's learn Python for security!")
print()  # Empty print for spacing

# Try changing these messages and running the cell again!

## 1.2 Variables and Data Types

Variables store data. Python figures out the type automatically.

```
┌─────────────────────────────────────────────────────────────┐
│                  PYTHON DATA TYPES                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   STRING (str)      │ Text in quotes    │ "192.168.1.1"     │
│   INTEGER (int)     │ Whole numbers     │ 443, -1, 0        │
│   FLOAT (float)     │ Decimal numbers   │ 7.5, 3.14159      │
│   BOOLEAN (bool)    │ True/False        │ True, False       │
│   LIST (list)       │ Ordered sequence  │ [1, 2, 3]         │
│   DICTIONARY (dict) │ Key-value pairs   │ {"ip": "1.2.3.4"} │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

In [None]:
# STRINGS - Text data (use quotes, single or double)
ip_address = "192.168.1.100"
hostname = "workstation-01"
alert_message = "Suspicious login detected"

# NUMBERS - Integers (whole) and floats (decimal)
port = 443
failed_attempts = 5
risk_score = 7.5

# BOOLEANS - True or False (note capitalization!)
is_malicious = True
is_whitelisted = False

# f-strings let you embed variables in text (put f before the quotes)
print(f"Alert: {alert_message}")
print(f"Source IP: {ip_address}:{port}")
print(f"Failed attempts: {failed_attempts}")
print(f"Risk score: {risk_score}")
print(f"Is malicious? {is_malicious}")

In [None]:
# Check the type of variables
print(f"ip_address is type: {type(ip_address)}")
print(f"port is type: {type(port)}")
print(f"risk_score is type: {type(risk_score)}")
print(f"is_malicious is type: {type(is_malicious)}")
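
Values parsed from logs usually arrive as strings, so they often need converting before you can compare them or do arithmetic. A minimal sketch of type conversion (the names `raw_port`, `raw_score`, and `candidate` are made up for this example):

```python
# Values read from a log line arrive as text
raw_port = "443"
raw_score = "7.5"

port_number = int(raw_port)      # str -> int
score_value = float(raw_score)   # str -> float

print(port_number + 1)           # arithmetic works on the int: 444
print(type(port_number))         # <class 'int'>

# Going the other way: int -> str, e.g. to join text with +
label = "port-" + str(port_number)
print(label)                     # port-443

# int() fails on text that is not a whole number, so check first
candidate = "not-a-port"
if candidate.isdigit():
    print(int(candidate))
else:
    print(f"{candidate!r} is not a whole number")
```

`str.isdigit()` is a simple guard for non-negative whole numbers; it returns `False` for values such as `"-1"` or `"7.5"`.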

## 1.3 Lists - Collections of Items

Lists hold multiple items in order. Use square brackets `[]`.

In [None]:
# List of suspicious IPs (IOCs - Indicators of Compromise)
suspicious_ips = ["10.0.0.5", "192.168.1.100", "172.16.0.50"]

# Access items by index (starts at 0, not 1!)
first_ip = suspicious_ips[0]    # "10.0.0.5"
second_ip = suspicious_ips[1]   # "192.168.1.100"
last_ip = suspicious_ips[-1]    # "172.16.0.50" (negative index = from end)

print(f"First IP: {first_ip}")
print(f"Last IP: {last_ip}")
print(f"All IPs: {suspicious_ips}")

# Add items to the list
suspicious_ips.append("10.10.10.10")
print(f"After append: {suspicious_ips}")

# Check if item exists
if "192.168.1.100" in suspicious_ips:
    print("⚠️ IP 192.168.1.100 is in the suspicious list!")

# Get count
print(f"Total suspicious IPs: {len(suspicious_ips)}")
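
Beyond indexing and `append`, a few more list operations come up constantly when triaging indicators. The sketch below uses a made-up `observed_ips` list to show slicing, counting, and a list comprehension:

```python
# Hypothetical sample data for this sketch
observed_ips = ["10.0.0.5", "8.8.8.8", "10.0.0.5", "203.0.113.7", "10.0.0.9"]

# Slicing: items from index 0 up to (not including) index 2
first_two = observed_ips[:2]
print(first_two)          # ['10.0.0.5', '8.8.8.8']

# Count how many times a value appears
duplicate_count = observed_ips.count("10.0.0.5")
print(duplicate_count)    # 2

# List comprehension: build a new list from the items that pass a test
internal = [ip for ip in observed_ips if ip.startswith("10.")]
print(internal)           # ['10.0.0.5', '10.0.0.5', '10.0.0.9']
```

Slicing and comprehensions return new lists, so `observed_ips` itself is left unchanged.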

## 1.4 Dictionaries - Key-Value Pairs

Dictionaries map keys to values (like a lookup table). Use curly braces `{}`.

In [None]:
# Security event as a dictionary
event = {
    "timestamp": "2024-01-15T10:30:00Z",
    "source_ip": "192.168.1.100",
    "destination_ip": "10.0.0.5",
    "port": 443,
    "action": "blocked",
    "severity": "high"
}

# Access values by key
print(f"Event severity: {event['severity']}")
print(f"Source: {event['source_ip']}")

# Add new key
event["analyst"] = "alice"
print(f"Assigned analyst: {event['analyst']}")

# Check if key exists
if "severity" in event:
    print("✓ Severity is defined")

# Loop through keys and values
print("\n📋 Full event details:")
for key, value in event.items():
    print(f"  {key}: {value}")
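
Real event data is rarely complete: one record may have a `severity` field and the next may not. Accessing a missing key with square brackets raises a `KeyError`, while `.get()` returns a fallback instead. A small sketch with two made-up events:

```python
# Hypothetical events; the second one has no "severity" key
events = [
    {"source_ip": "192.168.1.100", "action": "blocked", "severity": "high"},
    {"source_ip": "10.0.0.5", "action": "allowed"},
]

severities = []
for e in events:
    # e["severity"] would raise KeyError on the second event;
    # .get() returns the fallback value "unknown" instead
    severity = e.get("severity", "unknown")
    severities.append(severity)
    print(f"{e['source_ip']} -> {e['action']} (severity: {severity})")
```

A list of dictionaries like `events` is also the shape you typically get back after loading JSON from a file or an API.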

---

# Part 2: Control Flow

Control flow determines what code runs and when.

## 2.1 If Statements - Making Decisions

In [None]:
# Threat classification based on score
threat_score = 8.5

# if-elif-else chain - Python checks each condition in order
# and runs the FIRST one that's True
if threat_score >= 9:
    severity = "CRITICAL"
    color = "🔴"
elif threat_score >= 7:
    severity = "HIGH"
    color = "🟠"
elif threat_score >= 4:
    severity = "MEDIUM"
    color = "🟡"
else:
    severity = "LOW"
    color = "🟢"

print(f"Score {threat_score} -> {color} Severity: {severity}")

# Try changing threat_score to see different results!

In [None]:
# Multiple conditions with AND / OR
failed_logins = 10
severity = "critical"

# AND - both conditions must be true
if failed_logins > 5 and severity in ["high", "critical"]:
    print("⚠️ Account lockout recommended")

# OR - either condition can be true
if severity == "critical" or failed_logins > 20:
    print("🚨 Immediate investigation required")

# Ternary (one-liner if/else) - useful for quick assignments
status = "blocked" if failed_logins > 3 else "allowed"
print(f"Login status: {status}")

## 2.2 Loops - Repeating Actions

Loops let you run code multiple times.

In [None]:
# FOR LOOP - iterate over a sequence (list, string, range)
iocs = ["malware.exe", "evil.dll", "backdoor.ps1"]

print("🔍 Analyzing IOCs:")
for ioc in iocs:
    print(f"  Analyzing: {ioc}")
    if ioc.endswith(".exe"):
        print("    ⚠️ Executable detected!")
    elif ioc.endswith(".ps1"):
        print("    ⚠️ PowerShell script detected!")

# FOR with range - repeat a specific number of times
print("\n📊 Counting failed attempts:")
for i in range(1, 6):  # range(1, 6) yields 1, 2, 3, 4, 5 (the stop value 6 is excluded)
    print(f"  Attempt {i}")

# FOR with enumerate - get both index and value
print("\n📋 Alert queue:")
alerts = ["Malware detected", "Port scan", "Brute force"]
for index, alert in enumerate(alerts):
    print(f"  Alert #{index + 1}: {alert}")

In [None]:
# WHILE LOOP - repeat until condition is false
attempts = 0
max_attempts = 3

print("🔐 Login simulation:")
while attempts < max_attempts:
    print(f"  Login attempt {attempts + 1}")
    attempts += 1
print("❌ Max attempts reached - account locked")
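
Two keywords give you finer control inside a loop: `continue` skips to the next item and `break` stops the loop early. The sketch below scans some made-up log lines, skips the empty ones, and stops at the first critical entry:

```python
# Hypothetical log lines for this sketch
log_lines = [
    "INFO  service started",
    "",
    "WARN  disk usage at 85%",
    "CRITICAL  ransomware signature matched",
    "INFO  this line is never reached",
]

checked = 0
first_critical = None

for line in log_lines:
    if not line:          # an empty string counts as False
        continue          # skip to the next line
    checked += 1
    if line.startswith("CRITICAL"):
        first_critical = line
        break             # stop looping entirely

print(f"Checked {checked} non-empty lines")
print(f"First critical entry: {first_critical}")
```

Stopping early with `break` matters on large logs, where there is no point reading millions of lines after you have found what you were looking for.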

---

# Part 3: Functions - Reusable Code

Functions let you package code for reuse. They take inputs (arguments), do something, and optionally return outputs.

In [None]:
# DEFINING A FUNCTION
#
# def function_name(argument1, argument2):  <-- name and inputs
#     """Docstring - explains what function does"""  <-- documentation
#     # code here
#     return result  <-- output (optional)

def calculate_risk_score(failed_logins: int, is_admin: bool, is_after_hours: bool) -> float:
    """
    Calculate risk score based on login behavior.

    This docstring explains:
    - What the function does
    - What arguments it takes
    - What it returns

    Args:
        failed_logins: Number of failed login attempts
        is_admin: Whether the account is an admin
        is_after_hours: Whether the attempt is outside business hours

    Returns:
        Risk score from 0.0 to 10.0
    """
    score = 0.0

    # Base score from failed logins (cap at 5 points)
    score += min(failed_logins, 5)

    # Admin accounts are higher risk
    if is_admin:
        score += 3.0

    # After-hours activity is suspicious
    if is_after_hours:
        score += 2.0

    return min(score, 10.0)  # Cap at 10

# USING THE FUNCTION
risk = calculate_risk_score(failed_logins=4, is_admin=True, is_after_hours=True)
print(f"Risk score: {risk}/10")

if risk >= 7:
    print("🚨 HIGH RISK - Investigate immediately")
elif risk >= 4:
    print("⚠️ MEDIUM RISK - Review within 24 hours")
else:
    print("✅ LOW RISK - Log for reference")
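
Functions can also declare default values for arguments and return more than one value at once. The sketch below is a hypothetical helper, `classify_risk`, that reuses the 7 and 4 thresholds from the cell above as defaults:

```python
def classify_risk(score, high=7.0, medium=4.0):
    """Return (label, escalate) for a 0-10 risk score.

    `high` and `medium` are default thresholds; callers can override them.
    """
    if score >= high:
        return "HIGH", True
    if score >= medium:
        return "MEDIUM", False
    return "LOW", False

# Uses the default thresholds
label, escalate = classify_risk(9.0)
print(label, escalate)    # HIGH True

# Override one threshold by name
label, escalate = classify_risk(5.0, medium=6.0)
print(label, escalate)    # LOW False
```

Returning two values actually returns one tuple, which Python unpacks into `label` and `escalate` on the left-hand side.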

In [None]:
# Another example: IP classification function
def is_private_ip(ip: str) -> bool:
    """
    Check if an IP address is in a private range (RFC 1918).

    Private ranges:
        - 10.0.0.0/8     (10.0.0.0 - 10.255.255.255)
        - 172.16.0.0/12  (172.16.0.0 - 172.31.255.255)
        - 192.168.0.0/16 (192.168.0.0 - 192.168.255.255)

    Args:
        ip: IPv4 address as string (e.g., "192.168.1.1")

    Returns:
        True if the address is in one of the private ranges above, False otherwise
    """
    # Split IP into octets (the 4 numbers)
    octets = [int(x) for x in ip.split(".")]

    # Check private ranges
    # 10.0.0.0/8
    if octets[0] == 10:
        return True
    # 172.16.0.0/12 (172.16.x.x through 172.31.x.x)
    if octets[0] == 172 and 16 <= octets[1] <= 31:
        return True
    # 192.168.0.0/16
    if octets[0] == 192 and octets[1] == 168:
        return True

    return False

# Test with different IPs
test_ips = ["192.168.1.1", "8.8.8.8", "10.0.0.1", "172.16.50.1", "1.1.1.1"]
print("🌐 IP Classification:")
for ip in test_ips:
    ip_type = "🏠 Private" if is_private_ip(ip) else "🌍 Public"
    print(f"  {ip}: {ip_type}")
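
The hand-written octet check above is good practice, but it raises an error on malformed input such as `"not-an-ip"`. As an alternative sketch, Python's standard `ipaddress` module can validate the address and test network membership for you. The function name `is_rfc1918` is made up for this example:

```python
import ipaddress

# The three RFC 1918 private networks
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(ip: str) -> bool:
    """Return True if `ip` is a valid IPv4 address inside an RFC 1918 range."""
    try:
        addr = ipaddress.IPv4Address(ip)
    except ValueError:
        # Not a valid IPv4 address (e.g. "999.1.1.1" or "not-an-ip")
        return False
    return any(addr in net for net in RFC1918_NETWORKS)

for candidate in ["192.168.1.1", "172.32.0.1", "999.1.1.1", "8.8.8.8"]:
    print(f"{candidate}: {is_rfc1918(candidate)}")
```

Address objects also expose an `is_private` attribute, but it covers more than RFC 1918 (loopback and link-local addresses, for example), so the explicit network list keeps the behavior aligned with the function above.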

---

# Part 3.5: Modules and Libraries - Reusing Code from Others

**Modules** are Python files containing functions and variables you can reuse. **Libraries** (also called packages) are collections of related modules.

Instead of writing everything from scratch, you can import and use code that others have written.

```
┌─────────────────────────────────────────────────────────────┐
│                  MODULES & LIBRARIES                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   WITHOUT MODULES:                                          │
│   def calculate_md5(data):                                  │
│       # ... hundreds of lines of cryptographic code ...     │
│       pass                                                  │
│                                                             │
│   WITH MODULES:                                             │
│   import hashlib                                            │
│   hash_value = hashlib.md5(b"data").hexdigest()  # 1 line!  │
│                                                             │
│   TYPES:                                                    │
│   • Standard Library - Built-in (json, csv, re, os)         │
│   • Third-Party - Install with pip (requests, pandas)       │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

## Import Syntax

There are several ways to import modules:

In [None]:
# =========================================
# IMPORT SYNTAX - 4 Ways to Import
# =========================================

# METHOD 1: Import entire module
import json
data = json.loads('{"key": "value"}')  # Use as module.function()
print(f"Method 1: {data}")

# METHOD 2: Import specific functions
from json import loads, dumps
data2 = loads('{"key": "value2"}')  # Use function directly
print(f"Method 2: {data2}")

# METHOD 3: Import with alias (shorthand)
import hashlib as h  # Any name works; for well-known libraries, use the usual alias (e.g. pandas as pd)
hash_val = h.md5(b"test").hexdigest()
print(f"Method 3 - MD5 hash: {hash_val}")

# METHOD 4: Import everything (NOT RECOMMENDED!)
# from json import *  # DON'T DO THIS - you don't know what you're importing!
# Better to be explicit about what you need

print("\n" + "="*50)

In [None]:
# =========================================
# STANDARD LIBRARY - Always Available
# =========================================
# These modules come with Python - no installation needed!

# os - Operating system operations
import os
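# Get environment variable (keeps API keys out of your source code)
api_key = os.getenv("API_KEY", "default-key-if-not-set")
# Show only a short prefix; avoid printing full secrets to notebook output or logs
print(f"🔑 API key loaded: {api_key[:4]}...")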

# pathlib - Modern file path operations  
from pathlib import Path
current_file = Path("security_tool.py")
print(f"📁 File exists: {current_file.exists()}")

# datetime - Working with dates and times
from datetime import datetime
now = datetime.now()
print(f"⏰ Current time: {now.strftime('%Y-%m-%d %H:%M:%S')}")

# hashlib - Cryptographic hashing (for file hashes, IOCs)
import hashlib
file_content = b"malicious payload"
file_hash = hashlib.sha256(file_content).hexdigest()
print(f"🔐 SHA256 hash: {file_hash[:16]}...")  # First 16 chars

# collections.Counter - Count occurrences (common in security)
from collections import Counter
ips = ["1.1.1.1", "2.2.2.2", "1.1.1.1", "1.1.1.1", "3.3.3.3"]
ip_counts = Counter(ips)
print(f"\n📊 IP frequency: {dict(ip_counts)}")
print(f"   Most common: {ip_counts.most_common(1)[0]}")

# base64 - Encode/decode (common in malware obfuscation)
import base64
encoded = base64.b64encode(b"hidden payload")
print(f"\n🔤 Base64 encoded: {encoded.decode()}")

print("\n💡 All these modules are built-in - no 'pip install' needed!")
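
Analysts usually need the reverse direction too: taking a suspicious string and checking whether it decodes as Base64. A minimal sketch, where `try_decode_base64` is a hypothetical helper name:

```python
import base64
import binascii

def try_decode_base64(candidate: str):
    """Return the decoded bytes, or None if `candidate` is not valid Base64."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        return base64.b64decode(candidate, validate=True)
    except binascii.Error:
        return None

# Round trip: encode a payload, then decode it again
encoded_text = base64.b64encode(b"hidden payload").decode()
print(try_decode_base64(encoded_text))    # b'hidden payload'

# Text with spaces and punctuation is not valid Base64
print(try_decode_base64("not base64!"))   # None
```

Without `validate=True`, `b64decode` silently discards characters outside the Base64 alphabet, which can make ordinary text look like it decoded successfully.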

## Third-Party Packages with pip

Third-party packages are NOT included with Python - you must install them first using `pip` (Python's package installer).

```bash
# Run these commands in your terminal (NOT in Python code!)
pip install requests      # HTTP requests for APIs
pip install pandas        # Data analysis and manipulation
pip install scikit-learn  # Machine learning library

# Install specific version
pip install requests==2.31.0

# Install from requirements.txt (common in projects)
pip install -r requirements.txt
```

**Common Security Packages:**

| Package | Purpose | Used In Labs |
|---------|---------|-------------|
| `requests` | Make HTTP requests to APIs | Labs 08, 10-18 |
| `pandas` | Data analysis and manipulation | Labs 10-13, 19+ |
| `numpy` | Numerical operations, arrays | Labs 10-13 |
| `scikit-learn` | Machine learning algorithms | Labs 10-13, 19+ |
| `matplotlib` | Data visualization | Labs 06, 10-13 |

**How to Check What's Installed:**

```bash
pip list              # Show all installed packages
pip show requests     # Show details about a specific package
```

> 💡 **Important**: When you see `import requests` or `import pandas` in later labs, remember these are third-party packages you may need to install first!
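
You can also check from inside Python whether a module is available before importing it. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `is_installed` is made up for this example:

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None

for name in ["json", "requests", "pandas"]:
    status = "installed" if is_installed(name) else "missing (install it with pip)"
    print(f"{name}: {status}")
```

Note that the import name can differ from the package name on pip: `scikit-learn` is installed with `pip install scikit-learn` but imported as `sklearn`.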

---

# Part 4: Regular Expressions for Security

Regular expressions (regex) are patterns for matching text. They're essential for extracting IOCs (Indicators of Compromise) from logs and reports.

```
┌─────────────────────────────────────────────────────────────┐
│              COMMON REGEX PATTERNS FOR SECURITY             │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   \d       → Any digit (0-9)                                │
│   \w       → Any word character (a-z, A-Z, 0-9, _)          │
│   \s       → Any whitespace (space, tab, newline)           │
│   .        → Any character (except newline)                 │
│   +        → One or more of the previous                    │
│   *        → Zero or more of the previous                   │
│   ?        → Zero or one of the previous                    │
│   {n}      → Exactly n of the previous                      │
│   {n,m}    → Between n and m of the previous                │
│   [abc]    → Any character in the set                       │
│   [a-z]    → Any character in the range                     │
│   \b       → Word boundary                                  │
│   ^        → Start of string                                │
│   $        → End of string                                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

In [None]:
import re  # The regex module

# EXTRACTING IPs FROM TEXT
# The r"" means "raw string" - treats backslashes literally
# Pattern breakdown:
#   \b           - word boundary (so we don't match partial numbers)
#   (?:\d{1,3}\.){3}  - three groups of 1-3 digits followed by a dot
#   \d{1,3}      - final group of 1-3 digits
#   \b           - word boundary

log_line = "Failed login from 192.168.1.100 to 10.0.0.5 at 2024-01-15 10:30:00"

ip_pattern = r"\b(?:\d{1,3}\.){3}\d{1,3}\b"
ips = re.findall(ip_pattern, log_line)  # findall returns ALL matches

print(f"📍 Log line: {log_line}")
print(f"🔍 Found IPs: {ips}")
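
One limitation of this pattern: `\d{1,3}` accepts any one to three digits, so it also matches impossible addresses such as `999.300.1.1`. A common follow-up is to check each octet after the regex has found candidates. A minimal sketch, with a made-up `sample` string and a hypothetical helper `is_valid_ipv4`:

```python
import re

ip_pattern = r"\b(?:\d{1,3}\.){3}\d{1,3}\b"

def is_valid_ipv4(candidate: str) -> bool:
    """Return True if every octet is in the range 0-255."""
    return all(0 <= int(octet) <= 255 for octet in candidate.split("."))

sample = "Connections from 192.168.1.100, 999.300.1.1 and 10.0.0.5"

candidates = re.findall(ip_pattern, sample)
valid_ips = [ip for ip in candidates if is_valid_ipv4(ip)]

print(candidates)   # ['192.168.1.100', '999.300.1.1', '10.0.0.5']
print(valid_ips)    # ['192.168.1.100', '10.0.0.5']
```

Splitting the work this way keeps the regex readable: the pattern finds anything shaped like an IP, and plain Python decides whether it is a real one.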

In [None]:
# EXTRACTING FILE HASHES
# Hashes are fixed-length hexadecimal strings:
#   MD5:     32 characters (e.g., d41d8cd98f00b204e9800998ecf8427e)
#   SHA1:    40 characters
#   SHA256:  64 characters

text = """
Malware analysis report:
MD5: d41d8cd98f00b204e9800998ecf8427e
SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Additional sample: MD5: abc123def456abc123def456abc12345
"""

# Patterns: [a-fA-F0-9] matches hex characters, {32} means exactly 32
md5_pattern = r"\b[a-fA-F0-9]{32}\b"
sha256_pattern = r"\b[a-fA-F0-9]{64}\b"

md5_hashes = re.findall(md5_pattern, text)
sha256_hashes = re.findall(sha256_pattern, text)

print(f"🔐 MD5 hashes found ({len(md5_hashes)}):")
for h in md5_hashes:
    print(f"    {h}")

print(f"\n🔐 SHA256 hashes found ({len(sha256_hashes)}):")
for h in sha256_hashes:
    print(f"    {h}")
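
Regex can rewrite text as well as search it. A common convention when sharing indicators is to "defang" them so nobody clicks a live link by accident: dots become `[.]` and `http` becomes `hxxp`. The sketch below does this with `re.sub`; the function name `defang` is made up for this example:

```python
import re

def defang(ioc: str) -> str:
    """Make an IP, domain, or URL non-clickable for safe sharing."""
    ioc = re.sub(r"^http", "hxxp", ioc)   # http:// -> hxxp://, https:// -> hxxps://
    return re.sub(r"\.", "[.]", ioc)      # every dot -> [.]

print(defang("45.33.32.156"))
# 45[.]33[.]32[.]156
print(defang("https://malware.example.com/beacon"))
# hxxps://malware[.]example[.]com/beacon
```

The dot is escaped as `\.` in the pattern because an unescaped `.` means "any character", which would replace the entire string.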

---

# Part 5: Working with Files

Security work involves reading logs, writing reports, and parsing structured data.

```
┌─────────────────────────────────────────────────────────────┐
│                  FILE OPERATIONS                            │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   OPEN MODES:                                               │
│   "r"  → Read (file must exist)                             │
│   "w"  → Write (creates new or overwrites)                  │
│   "a"  → Append (adds to end of file)                       │
│   "r+" → Read and write                                     │
│                                                             │
│   COMMON FORMATS:                                           │
│   .txt  → Plain text (logs, blocklists)                     │
│   .csv  → Comma-separated values (alerts, events)           │
│   .json → Structured data (API responses, configs)          │
│                                                             │
│   THE 'with' STATEMENT:                                     │
│   with open("file.txt", "r") as f:                          │
│       content = f.read()                                    │
│   # File automatically closed after the block               │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

In [None]:
import json

# =========================================
# WRITING AND READING JSON FILES
# =========================================
# JSON is the most common format for security data (API responses, configs)

# Create sample threat data
threats = [
    {"ip": "45.33.32.156", "type": "c2", "score": 9.5},
    {"ip": "185.220.101.1", "type": "scanner", "score": 6.0},
]

# WRITE to JSON file
# indent=2 makes it human-readable with nice formatting
with open("threats.json", "w") as f:
    json.dump(threats, f, indent=2)
print("✅ Wrote threats.json")

# READ from JSON file
with open("threats.json", "r") as f:
    loaded = json.load(f)

print(f"📂 Loaded {len(loaded)} threats from file:")
print(json.dumps(loaded, indent=2))

# =========================================
# WORKING WITH TEXT FILES
# =========================================

# Write a blocklist
blocked_ips = ["10.0.0.5", "172.16.0.50", "192.168.1.100"]
with open("blocklist.txt", "w") as f:
    for ip in blocked_ips:
        f.write(f"{ip}\n")  # \n adds newline
print("\n✅ Wrote blocklist.txt")

# Read and process line by line
print("📋 Reading blocklist:")
with open("blocklist.txt", "r") as f:
    for line in f:
        ip = line.strip()  # Remove whitespace/newline
        print(f"  Blocking: {ip}")
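
CSV is the third format listed in the diagram above, and the standard library's `csv` module handles it. The sketch below writes and reads a small alert file; the file name `alerts.csv` and its columns are made up for this example:

```python
import csv

# Hypothetical alert records
alerts = [
    {"timestamp": "2024-01-15T10:30:00Z", "source_ip": "192.168.1.100", "severity": "high"},
    {"timestamp": "2024-01-15T10:31:12Z", "source_ip": "10.0.0.5", "severity": "low"},
]

# WRITE: DictWriter maps dictionary keys to CSV columns
# newline="" is recommended by the csv module documentation when opening files
with open("alerts.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["timestamp", "source_ip", "severity"])
    writer.writeheader()
    writer.writerows(alerts)

# READ: DictReader yields one dictionary per row (all values are strings)
with open("alerts.csv", "r", newline="") as f:
    rows = list(csv.DictReader(f))

high = [row["source_ip"] for row in rows if row["severity"] == "high"]
print(f"Loaded {len(rows)} rows; high-severity sources: {high}")
```

Because every value read from a CSV file is a string, numeric columns such as ports or scores need `int()` or `float()` before you compare them.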

---

# Part 6: Making API Requests

APIs (Application Programming Interfaces) let you interact with external services programmatically.

```
┌─────────────────────────────────────────────────────────────┐
│                  HTTP BASICS                                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   HTTP METHODS:                                             │
│   GET    → Retrieve data (read-only)                        │
│   POST   → Submit/create data                               │
│   PUT    → Update existing data                             │
│   DELETE → Remove data                                      │
│                                                             │
│   STATUS CODES:                                             │
│   200 → OK (success)                                        │
│   201 → Created                                             │
│   400 → Bad Request (your error)                            │
│   401 → Unauthorized (need auth)                            │
│   403 → Forbidden (not allowed)                             │
│   404 → Not Found                                           │
│   429 → Too Many Requests (rate limited)                    │
│   500 → Server Error                                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

In [None]:
import requests

# =========================================
# MAKING API REQUESTS
# =========================================

# SIMPLE GET REQUEST
response = requests.get("https://httpbin.org/ip", timeout=5)
print(f"Status Code: {response.status_code}")
print(f"Response: {response.json()}")

# GET WITH QUERY PARAMETERS
# ?key=value&key2=value2
params = {"query": "python security", "limit": 10}
response = requests.get("https://httpbin.org/get", params=params, timeout=5)
print(f"\n📤 Sent parameters: {response.json()['args']}")

# =========================================
# ERROR HANDLING (IMPORTANT!)
# =========================================

def safe_api_request(url: str, timeout: int = 5) -> dict:
    """
    Make an API request with proper error handling.

    Always handle errors - APIs can fail for many reasons:
    - Network issues
    - Rate limiting
    - Server errors
    - Invalid responses
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # Raises exception for 4xx/5xx status codes
        return {"success": True, "data": response.json()}
    except requests.Timeout:
        return {"success": False, "error": "Request timed out"}
    except requests.HTTPError as e:
        return {"success": False, "error": f"HTTP {e.response.status_code}"}
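    except requests.RequestException as e:
        return {"success": False, "error": str(e)}
    except ValueError:
        # response.json() can raise ValueError when the body is not valid JSON
        return {"success": False, "error": "Response was not valid JSON"}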

# Test with a working URL
result = safe_api_request("https://httpbin.org/ip")
print(f"\n✅ Good request: {result}")

# Test with a bad URL
result = safe_api_request("https://httpbin.org/status/404")
print(f"❌ Bad request: {result}")
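
Rate limiting (status 429 in the table above) is usually temporary, so a common approach is to wait and try again, doubling the wait each time. The sketch below shows that retry logic on its own, with no network access: `get_with_retries` and its `fetch` argument are hypothetical names, and in real use `fetch` would wrap a call such as `requests.get(...)`.

```python
import time

def get_with_retries(fetch, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """
    Call fetch() until it returns a status code other than 429.

    fetch: a zero-argument function returning (status_code, body)
    sleep: the function used to wait; replaceable so demos run instantly
    """
    for attempt in range(max_retries + 1):
        status, body = fetch()
        if status != 429:
            return status, body
        if attempt < max_retries:
            delay = base_delay * (2 ** attempt)   # 1s, 2s, 4s, ...
            print(f"Rate limited; retrying in {delay:.0f}s")
            sleep(delay)
    return status, body   # still 429 after all retries


# Simulated API: rate-limits twice, then succeeds
responses = iter([(429, None), (429, None), (200, {"ok": True})])
result = get_with_retries(lambda: next(responses), sleep=lambda seconds: None)
print(result)   # (200, {'ok': True})
```

Passing `sleep` as an argument is a design choice for this demo only: it lets the example finish immediately instead of actually waiting three seconds.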

---

# Part 7: Putting It All Together - IOC Extractor

Now let's combine everything you've learned to build a real security tool!

In [None]:
# =========================================
# IOC EXTRACTOR - A REAL SECURITY TOOL!
# =========================================
# This combines: functions, regex, dictionaries, and loops

def extract_iocs(text: str) -> dict:
    """
    Extract Indicators of Compromise (IOCs) from text.

    This is a common task in security operations:
    - Parsing threat reports
    - Analyzing phishing emails
    - Processing incident tickets

    Args:
        text: Any text that might contain IOCs

    Returns:
        Dictionary of IOC types and their values
    """
    iocs = {
        # IPv4 addresses
        "ips": re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", text),

        # MD5 hashes (32 hex characters)
        "md5": re.findall(r"\b[a-fA-F0-9]{32}\b", text),

        # SHA256 hashes (64 hex characters)
        "sha256": re.findall(r"\b[a-fA-F0-9]{64}\b", text),

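        # Domains (simplified pattern: one or more dot-separated labels plus a TLD;
        # file names such as "payload.exe" will also match)
        "domains": re.findall(r"\b(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\b", text),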

        # URLs (http or https)
        "urls": re.findall(r'https?://[^\s<>"]+', text),

        # Email addresses
        "emails": re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text),
    }

    # Remove duplicates and empty categories
    # set() removes duplicates; sorted() turns the set back into a list with a stable order
    return {k: sorted(set(v)) for k, v in iocs.items() if v}


# Test with a sample threat report
threat_report = """
THREAT INTELLIGENCE REPORT
==========================

The APT group deployed a new variant of their malware toolkit.

Initial Access: Phishing email from attacker@malicious-corp.com
Subject: "Invoice #12345 - Payment Required"

The malware connects to these C2 servers:
- 45.33.32.156 (primary)
- 185.220.101.1 (backup)
- malware-c2.evil-domain.com

Malware Hashes:
- MD5: d41d8cd98f00b204e9800998ecf8427e
- SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

The payload downloads additional tools from:
https://malware.example.com/beacon
https://update-service.evil-domain.com/payload.exe

Contact security@your-company.com if you observe this activity.
"""

# Extract and display IOCs
print("🔍 EXTRACTING IOCs FROM THREAT REPORT")
print("=" * 50)

extracted = extract_iocs(threat_report)

for ioc_type, values in extracted.items():
    print(f"\n📌 {ioc_type.upper()} ({len(values)} found):")
    for v in values:
        print(f"    • {v}")
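
Not everything the extractor finds is malicious: the report's own contact address, `security@your-company.com`, gets picked up as well. A typical next step is to filter results against an allowlist of known-good values. The sketch below is one simple way to do that; `filter_allowlisted`, `sample_iocs`, and `allowlist` are made-up names, and the sample data stands in for the extractor's output.

```python
def filter_allowlisted(iocs: dict, allowlist: set) -> dict:
    """Drop any IOC value that contains an allowlisted string."""
    filtered = {}
    for ioc_type, values in iocs.items():
        kept = [v for v in values if not any(ok in v for ok in allowlist)]
        if kept:  # skip categories that end up empty
            filtered[ioc_type] = kept
    return filtered

# Hypothetical extractor output and allowlist
sample_iocs = {
    "domains": ["malicious-corp.com", "your-company.com"],
    "emails": ["attacker@malicious-corp.com", "security@your-company.com"],
}
allowlist = {"your-company.com"}

cleaned = filter_allowlisted(sample_iocs, allowlist)
print(cleaned)
# {'domains': ['malicious-corp.com'], 'emails': ['attacker@malicious-corp.com']}
```

Substring matching keeps the sketch short but is deliberately crude: it would also drop a lookalike such as `not-your-company.com`, so a production tool would compare whole domains instead.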

---

# 🎉 Congratulations!

You've learned Python basics with security context! You can now:

- ✅ Write Python scripts with variables, functions, and control flow
- ✅ Work with lists, dictionaries, and files
- ✅ Use regular expressions to extract IOCs
- ✅ Make API requests with error handling

## Quick Reference

```python
# STRINGS
text = "Hello"
text.lower()           # "hello"
text.upper()           # "HELLO"
text.split(",")        # Split into list
"x" in text            # Check if contains
f"Value: {var}"        # f-string formatting

# LISTS
items = [1, 2, 3]
items.append(4)        # Add item
items[0]               # First item
items[-1]              # Last item
len(items)             # Length
[x*2 for x in items]   # List comprehension

# DICTIONARIES
d = {"key": "value"}
d["key"]               # Get value
d.get("key", "default") # Get with default
d.keys()               # All keys
d.values()             # All values
d.items()              # Key-value pairs

# FILES
with open("file.txt", "r") as f:  # Read
    content = f.read()
with open("file.txt", "w") as f:  # Write
    f.write("text")
json.load(f)           # Read JSON (f is an open file)
json.dump(data, f)     # Write JSON (f is an open file)

# REGEX
import re
re.findall(pattern, text)  # Find all matches
re.search(pattern, text)   # Find first match
re.sub(pattern, repl, text) # Replace

# API REQUESTS
import requests
response = requests.get(url, timeout=5)
response.status_code   # HTTP status
response.json()        # Parse JSON response
```

## Next Steps

Continue your learning journey:

| Lab | Topic | What You'll Build |
|-----|-------|-------------------|
| **02** | Intro to Prompt Engineering | Learn to communicate with AI |
| **04** | ML Concepts Primer | Understanding machine learning |
| **07** | Hello World ML | Your first classifier |
| **08** | Working with APIs | API integration skills |
| **10** | Phishing Classifier | Email threat detection |

## Practice Exercises

Try these on your own:

1. **Failed Login Analyzer**: Read login events, count failures per user, flag users with >3 failures
2. **IOC Blocklist Generator**: Validate IPs, filter private addresses, write blocklist file
3. **Log Monitor**: Parse log file, extract ERROR/WARN messages, group by hour

---

*You're ready for more advanced security AI labs! 🚀*