# PyGuard Notebook Security Demo

This notebook demonstrates various security issues that PyGuard can detect in Jupyter notebooks.

**DO NOT run this notebook in production!** It contains intentional security vulnerabilities for demonstration purposes.

## 1. Hardcoded Secrets (HIGH/CRITICAL)

PyGuard detects hardcoded credentials and secrets:

In [None]:
# SECURITY ISSUE: Hardcoded secrets
api_key = "sk-1234567890abcdef1234567890abcdef"
password = "SuperSecret123"
github_token = "ghp_abcdefghijklmnopqrstuvwxyz1234567890"
aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"
aws_secret_access_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# FIX: Use environment variables
# import os
# api_key = os.getenv('API_KEY')
# password = os.getenv('PASSWORD')
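One caveat with `os.getenv` is that it silently returns `None` when a variable is unset, so a missing secret may only surface later as a confusing authentication error. A minimal sketch of a fail-fast lookup follows; the helper name `require_env` is hypothetical and not part of PyGuard.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        # Fail early with a clear message instead of passing None downstream.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Hypothetical usage inside a notebook cell:
# api_key = require_env("API_KEY")
```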

## 2. PII Exposure (HIGH)

Personal data should never be hardcoded:

In [None]:
# SECURITY ISSUE: PII in code
user_email = "john.doe@company.com"
ssn = "987-65-4321"
phone = "555-123-4567"
credit_card = "4532-1234-5678-9010"

# FIX: Use placeholder values or environment variables
# user_email = os.getenv('USER_EMAIL')
# ssn = '***-**-****'
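When real records have to pass through a notebook, masking them before they are printed or logged keeps PII out of saved outputs. The sketch below is illustrative only: the function name `redact_pii` and both regular expressions are assumptions, and the patterns are deliberately simple rather than exhaustive.

```python
import re

# Simplified patterns for demonstration; real-world PII detection needs more care.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_pii(text: str) -> str:
    """Replace SSN-like and email-like substrings with fixed placeholders."""
    text = SSN_PATTERN.sub("***-**-****", text)
    text = EMAIL_PATTERN.sub("[email redacted]", text)
    return text
```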

## 3. Dangerous Magic Commands (HIGH)

Shell magics such as `!` and `%system` run arbitrary commands with the kernel user's privileges:

In [None]:
# SECURITY ISSUE: Shell command execution
# !rm -rf /tmp/data
# %system cat /etc/passwd
# %%writefile /tmp/dangerous.sh

# FIX: Validate the target path, then pass an argument list to subprocess (no shell)
# import subprocess
# subprocess.run(['rm', '-rf', '/tmp/data'], check=True)
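For deleting files, a pure-Python alternative avoids spawning a process at all. The sketch below shows one possible validation step: it refuses to delete anything outside an allowed root directory. The helper name `remove_scratch_dir` and the `allowed_root` parameter are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def remove_scratch_dir(target, allowed_root) -> None:
    """Delete `target` recursively, but only if it lies strictly inside `allowed_root`."""
    root = Path(allowed_root).resolve()
    path = Path(target).resolve()
    # Resolving first collapses '..' segments and symlinks before the containment check.
    if path == root or root not in path.parents:
        raise ValueError(f"Refusing to delete {path}: not inside {root}")
    shutil.rmtree(path)
```

A call such as `remove_scratch_dir('/tmp/data', '/tmp')` would be accepted, while a target of `/etc` would raise `ValueError`.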

## 4. Code Injection (CRITICAL)

Never use eval/exec with user input:

In [None]:
# SECURITY ISSUE: Code injection
# user_input = input('Enter expression: ')
# result = eval(user_input)  # Can execute arbitrary code!

# FIX: Use ast.literal_eval, which parses only Python literals and rejects calls and names
# import ast
# result = ast.literal_eval(user_input)
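To make the difference concrete, the sketch below wraps `ast.literal_eval` so that anything other than a plain literal is reported as a single, predictable error. The wrapper name `parse_literal` is hypothetical.

```python
import ast


def parse_literal(text: str):
    """Parse `text` as a Python literal (number, string, list, dict, ...) without executing code."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # Function calls, attribute access, and names are rejected here.
        raise ValueError(f"Not a plain Python literal: {text!r}") from exc
```

With this wrapper, `parse_literal("[1, 2, 3]")` returns a list, while a payload such as `"__import__('os').system('id')"` raises `ValueError` instead of running.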

## 5. Command Injection (CRITICAL)

Avoid shell=True with user input:

In [None]:
# SECURITY ISSUE: Command injection
# import subprocess
# filename = input('Enter filename: ')
# subprocess.run(f'cat {filename}', shell=True)  # Injection possible!

# FIX: Use command lists with shell=False
# subprocess.run(['cat', filename], shell=False, check=True)
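The reason an argument list is safer is that no shell ever parses the string, so characters like `;` and `|` reach the program as ordinary text. The following sketch demonstrates this with a child Python process that simply prints its first argument; the helper name `echo_argument` is hypothetical, and the child interpreter stands in for `cat` so the example does not depend on a Unix toolchain.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass `value` as a single argument to a child process and return what the child received."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.rstrip("\n")


# The shell metacharacters arrive verbatim; nothing after ';' is executed.
received = echo_argument("notes.txt; echo pwned")
```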

## 6. Unsafe Deserialization (HIGH)

Unpickling untrusted data can execute arbitrary code:

In [None]:
# SECURITY ISSUE: Unsafe deserialization
# import pickle
# with open('untrusted.pkl', 'rb') as f:
#     data = pickle.load(f)  # Can execute arbitrary code!

# FIX: Use JSON or other safe formats
# import json
# with open('data.json', 'r') as f:
#     data = json.load(f)
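JSON parsing does not execute code, but the parsed structure still comes from an untrusted source, so it is worth checking its shape before use. A minimal sketch, assuming the expected payload is an array of objects (the function name `load_records` is hypothetical):

```python
import json


def load_records(raw: str) -> list:
    """Parse `raw` as JSON and require a top-level array of objects."""
    data = json.loads(raw)
    if not isinstance(data, list) or not all(isinstance(item, dict) for item in data):
        raise ValueError("Expected a JSON array of objects")
    return data
```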

## 7. ML Pipeline Security (CRITICAL)

Model files are often pickle-based, so loading an untrusted one can execute arbitrary code:

In [None]:
# SECURITY ISSUE: Unsafe model loading
# import torch
# model = torch.load('untrusted_model.pth')  # Arbitrary code execution risk!

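# FIX: Verify the model checksum before loading, and restrict unpickling
# import hashlib
# with open('model.pth', 'rb') as f:
#     checksum = hashlib.sha256(f.read()).hexdigest()
# if checksum != KNOWN_GOOD_CHECKSUM:  # assert statements are skipped under python -O
#     raise ValueError('Model checksum mismatch')
# model = torch.load('model.pth', weights_only=True)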
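Model files can be several gigabytes, so reading the whole file into memory to hash it may not be practical. The sketch below hashes the file in chunks and raises an explicit exception on mismatch. The names `sha256_of_file`, `verify_checksum`, and `expected_hex` are assumptions for illustration; the expected digest would have to come from a trusted source.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex: str) -> None:
    """Raise ValueError unless the file's SHA-256 digest equals `expected_hex`."""
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise ValueError(f"Checksum mismatch for {path}")
```

A checksum only proves the file is the one that was published; it does not make the contents safe, so it complements rather than replaces restricted loading.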

## 8. Data Validation (MEDIUM)

Always validate input data to prevent data poisoning:

In [None]:
# SECURITY ISSUE: No data validation
# import pandas as pd
# df = pd.read_csv('untrusted_data.csv')  # No dtype validation!

# FIX: Specify dtypes and validate schema
# df = pd.read_csv('data.csv', dtype={
#     'column1': 'int64',
#     'column2': 'float64'
# })
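The same idea, checking column names and types before the data is used, can be sketched without pandas. The example below uses only the standard library's `csv` module; the function name `read_validated_rows` and the `SCHEMA` mapping are hypothetical and simply mirror the two columns shown above.

```python
import csv
import io

# Hypothetical schema: column name -> converter that raises on bad input.
SCHEMA = {"column1": int, "column2": float}


def read_validated_rows(stream, schema) -> list:
    """Read CSV rows from `stream`, converting each schema column or raising ValueError."""
    reader = csv.DictReader(stream)
    missing = set(schema) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            rows.append({name: convert(row[name]) for name, convert in schema.items()})
        except (TypeError, ValueError) as exc:
            raise ValueError(f"Line {line_no}: value does not match schema") from exc
    return rows


rows = read_validated_rows(io.StringIO("column1,column2\n1,2.5\n"), SCHEMA)
```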

## 9. XSS Vulnerabilities (HIGH)

Raw HTML display can introduce XSS risks:

In [None]:
# SECURITY ISSUE: XSS vulnerability
# from IPython.display import HTML, display
# user_input = input('Enter HTML: ')
# display(HTML(user_input))  # User input not sanitized!

# FIX: Escape untrusted text before rendering it as HTML
# import html
# display(HTML(html.escape(user_input)))
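Escaping converts markup characters into HTML entities, so the browser shows the text instead of interpreting it. A minimal sketch using the standard library's `html.escape` follows; the wrapper name `render_untrusted` is hypothetical.

```python
import html


def render_untrusted(text: str) -> str:
    """Return an HTML fragment that displays `text` literally instead of interpreting it."""
    return f"<pre>{html.escape(text)}</pre>"


fragment = render_untrusted("<script>alert(1)</script>")
# fragment == "<pre>&lt;script&gt;alert(1)&lt;/script&gt;</pre>"
```

The resulting string could then be passed to `IPython.display.HTML` without the embedded script running.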

## 10. Execution Order Issues (MEDIUM)

Variables should be defined before use:

In [None]:
# SECURITY ISSUE: Using variable before definition
# This cell should be run AFTER the next cell
# print(result)

In [None]:
# This cell defines the variable
result = 42
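One rough way to spot this kind of problem in a saved notebook is to compare the `execution_count` values stored with each code cell against the order of the cells in the file. The sketch below is an illustration of that idea only, not a description of how PyGuard performs the check; the function name `out_of_order_cells` is hypothetical, and the input is the notebook's JSON already loaded into a dict.

```python
def out_of_order_cells(notebook: dict) -> list:
    """Return indices of code cells that were executed before an earlier cell in the file."""
    flagged = []
    highest_seen = 0
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        count = cell.get("execution_count")
        if count is None:
            continue  # the cell was never run
        if count < highest_seen:
            flagged.append(index)
        highest_seen = max(highest_seen, count)
    return flagged
```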

## How to Use PyGuard

To scan this notebook for security issues:

```python
from pyguard.lib.notebook_security import scan_notebook

# Scan the notebook
issues = scan_notebook('notebook_security_demo.ipynb')

# Print all issues
for issue in issues:
    print(f"{issue.severity}: {issue.message}")
    print(f"  Cell {issue.cell_index}, Line {issue.line_number}")
    print(f"  Fix: {issue.fix_suggestion}\n")
```

Or use the fixer to automatically remediate issues:

```python
from pathlib import Path
from pyguard.lib.notebook_security import NotebookSecurityAnalyzer, NotebookFixer

# Analyze
analyzer = NotebookSecurityAnalyzer()
issues = analyzer.analyze_notebook(Path('notebook_security_demo.ipynb'))

# Fix
fixer = NotebookFixer()
success, fixes = fixer.fix_notebook(
    Path('notebook_security_demo.ipynb'),
    issues
)

print(f"Applied {len(fixes)} fixes")
```

## Summary

PyGuard detects 15+ categories of security issues in Jupyter notebooks:

1. ✅ Hardcoded Secrets (API keys, passwords, tokens)
2. ✅ PII Exposure (SSN, email, credit cards, phone numbers)
3. ✅ Dangerous Magic Commands (shell execution, file operations)
4. ✅ Code Injection (eval, exec, compile)
5. ✅ Command Injection (shell=True)
6. ✅ Unsafe Deserialization (pickle)
7. ✅ ML Pipeline Security (unsafe model loading, data poisoning)
8. ✅ XSS Vulnerabilities (raw HTML display)
9. ✅ Information Disclosure (paths in outputs)
10. ✅ Execution Order Issues (variables used before definition)
11. ✅ PII in Outputs (sensitive data in cell outputs)
12. ✅ Untrusted Notebooks (metadata warnings)
13. ✅ Non-Standard Kernels (security warnings)
14. ✅ Data Validation Issues (missing type checks)
15. ✅ Path Traversal (unsafe file operations)

For more information, see: [Notebook Security Guide](../docs/guides/notebook-security-guide.md)