# Module 08: Windows Security & Permissions

**Difficulty**: ⭐⭐⭐ (Advanced)

**Estimated Time**: 60 minutes

**Prerequisites**: 
- Completed Modules 00-07
- Administrator access recommended
- Understanding of user accounts and permissions
- Familiarity with security concepts

## Learning Objectives

By the end of this notebook, you will be able to:

1. **Check** file and directory permissions
2. **Modify** access control lists (ACLs)
3. **Detect** whether a script is running with admin privileges
4. **Request** elevation to administrator when needed
5. **Manage** credentials securely
6. **Apply** security best practices to data science workflows

## Introduction: Why Security Matters for Data Scientists

Security isn't just for IT departments. Data scientists routinely handle sensitive information, and the main risks fall into four groups.

### Security Risks in Data Science

**1. Data Breaches**
- Customer PII (personally identifiable information)
- Financial records
- Proprietary datasets
- Model architectures and weights

**2. Unauthorized Access**
- Other users reading your data
- Malicious processes accessing files
- Network shares with weak permissions
- Accidental exposure of credentials

**3. Privilege Escalation**
- Scripts requiring admin rights
- Installing packages system-wide
- Modifying system configurations
- Accessing restricted resources

**4. Credential Management**
- API keys in code (bad!)
- Database passwords in notebooks (worse!)
- Shared credentials (dangerous!)
- Unencrypted credential storage (critical!)

### Windows Security Model

Windows uses:
- **Users and Groups**: Identity management
- **ACLs (Access Control Lists)**: File/folder permissions
- **UAC (User Account Control)**: Privilege elevation
- **DPAPI (Data Protection API)**: Credential encryption

This module covers practical security for data science workflows.

In [None]:
# Setup: Import required libraries
import os
import sys
import subprocess
from pathlib import Path
import ctypes
import platform

# Verify we're on Windows
if platform.system() != 'Windows':
    print("⚠ This module requires Windows")
    print("  Security features are Windows-specific")
else:
    print("✓ Running on Windows")
    print("Setup complete!")

## 1. Detecting Administrator Privileges

Many operations require administrator (elevated) privileges. Always check first.

In [None]:
# Check if running as administrator
def is_admin():
    """
    Check if current process has administrator privileges.
    
    Returns:
        bool: True if running as administrator
    """
    try:
        return ctypes.windll.shell32.IsUserAnAdmin() != 0
    except (AttributeError, OSError):
        # ctypes.windll does not exist on non-Windows platforms
        return False

# Check current privileges
if is_admin():
    print("✓ Running with administrator privileges")
    print("  Can perform elevated operations")
else:
    print("ℹ Running with standard user privileges")
    print("  Some operations may require elevation")

# Show current user
current_user = os.getenv('USERNAME')
print(f"\nCurrent user: {current_user}")
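
The "always check first" advice can be enforced in one place rather than repeated before every call. The sketch below is one possible approach, not part of the module's API: `requires_admin` and `install_system_package` are hypothetical names, and `is_admin()` is redefined here only so the block runs on its own.

```python
import ctypes
import functools


def is_admin():
    """Return True when the process is elevated (always False off Windows)."""
    try:
        return ctypes.windll.shell32.IsUserAnAdmin() != 0
    except (AttributeError, OSError):
        return False


def requires_admin(check=is_admin):
    """Hypothetical decorator: refuse to run a function unless check() is True."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not check():
                raise PermissionError(
                    f"{func.__name__}() requires administrator privileges"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires_admin()
def install_system_package(name):
    # Placeholder for an operation that genuinely needs elevation
    return f"installed {name}"
```

Calling `install_system_package("example")` from a standard-user session raises `PermissionError` before any work is attempted. The `check` parameter exists so the guard can be exercised in tests without an elevated session.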

### 1.1 Requesting Administrator Privileges

If your script needs admin rights, you can request elevation (UAC prompt).

In [None]:
# Request elevation to administrator
def run_as_admin(script_path):
    """
    Re-run the current script with administrator privileges.
    
    Args:
        script_path: Path to Python script to run elevated
    
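    Note:
        This will show a UAC prompt to the user.
        The elevated script runs in a NEW process; the current process
        keeps running, so the caller should exit after a successful call.
    """
    if is_admin():
        print("Already running as administrator")
        return True
    
    print("Requesting administrator privileges...")
    print("You will see a UAC prompt.")
    
    try:
        # Use ShellExecute with 'runas' verb to elevate
        result = ctypes.windll.shell32.ShellExecuteW(
            None,
            "runas",
            sys.executable,
            f'"{script_path}"',
            None,
            1  # SW_SHOWNORMAL
        )
    except (AttributeError, OSError) as e:
        print(f"Failed to elevate: {e}")
        return False
    
    # ShellExecuteW reports failure through its return value, not an
    # exception: values of 32 or less are error codes (e.g., UAC declined)
    if result <= 32:
        print(f"Failed to elevate (ShellExecuteW returned {result})")
        return False
    
    print("Elevated process started")
    return True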

# Example usage pattern
print("Example: Checking privileges before admin operation\n")
print("if not is_admin():")
print("    print('This operation requires administrator privileges')")
print("    # Option 1: Exit gracefully")
print("    sys.exit(1)")
print("    # Option 2 (instead of Option 1): request elevation, then")
print("    # exit this unelevated process")
print("    # run_as_admin(__file__)")
print("    # sys.exit(0)")
print("")
print("# Continue with admin operations...")
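
Two details of the elevation call are easy to get wrong: quoting a script path or arguments that contain spaces, and interpreting the value `ShellExecuteW` returns. The sketch below isolates both as pure functions. `build_elevation_request` and `elevation_succeeded` are hypothetical helper names, and the script path is made up for illustration. `subprocess.list2cmdline` is a long-standing but undocumented standard-library helper that applies the Windows quoting rules.

```python
import subprocess
import sys


def build_elevation_request(script_path, args=()):
    """Return (executable, parameters) to pass to ShellExecuteW with 'runas'."""
    # list2cmdline quotes each item as needed, so paths and arguments
    # containing spaces arrive in the elevated process intact
    params = subprocess.list2cmdline([str(script_path), *map(str, args)])
    return sys.executable, params


def elevation_succeeded(return_code):
    """ShellExecuteW signals success with a value greater than 32."""
    return return_code > 32


exe, params = build_elevation_request(
    r"C:\My Scripts\task.py", ["--input", "raw data.csv"]
)
print(params)  # "C:\My Scripts\task.py" --input "raw data.csv"
```

A return value of 5, for example, typically means access was denied, which is what the caller sees when the user declines the UAC prompt.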

## 2. File and Directory Permissions

Check and understand file permissions to ensure data security.

In [None]:
# Check file permissions using icacls
def check_file_permissions(file_path):
    """
    Check permissions on a file or directory.
    
    Args:
        file_path: Path to file or directory
    
    Returns:
        str: Permission information
    """
    file_path = Path(file_path)
    
    if not file_path.exists():
        return f"Path does not exist: {file_path}"
    
    try:
        # Use icacls to query permissions
        result = subprocess.run(
            ['icacls', str(file_path)],
            capture_output=True,
            text=True
        )
        
        if result.returncode == 0:
            return result.stdout
        else:
            return f"Error: {result.stderr}"
    
    except Exception as e:
        return f"Error checking permissions: {e}"

# Example: Check permissions on current directory
current_dir = Path.cwd()
print(f"Checking permissions for: {current_dir}\n")

permissions = check_file_permissions(current_dir)
print(permissions)

print("\nPermission abbreviations:")
print("  F  = Full control")
print("  M  = Modify")
print("  RX = Read and execute")
print("  R  = Read-only")
print("  W  = Write-only")
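
The raw `icacls` text is fine for reading, but checks such as "who has full control?" are easier on structured data. The following is a minimal parsing sketch, assuming the usual English-language layout of one `principal:(flags)` entry per line. `parse_icacls_output` is a hypothetical helper, and the sample path and accounts are invented for illustration.

```python
import re


def parse_icacls_output(output, path):
    """Return a list of (principal, [flags]) tuples from icacls text output."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        # The first entry shares its line with the path; drop that prefix
        if line.startswith(path):
            line = line[len(path):].strip()
        # Entries look like 'DOMAIN\user:(I)(F)'; other lines (blank lines,
        # the summary line) contain no ':(' and are skipped
        principal, sep, flags = line.rpartition(":(")
        if not sep:
            continue
        entries.append((principal, re.findall(r"\(([^)]+)\)", "(" + flags)))
    return entries


# Illustrative sample only; real output depends on the machine
SAMPLE_PATH = r"C:\data\customers.csv"
SAMPLE_OUTPUT = (
    "C:\\data\\customers.csv NT AUTHORITY\\SYSTEM:(I)(F)\n"
    "                      BUILTIN\\Administrators:(I)(F)\n"
    "                      DESKTOP-01\\ana:(I)(RX,W)\n"
    "\n"
    "Successfully processed 1 files; Failed processing 0 files\n"
)

for principal, flags in parse_icacls_output(SAMPLE_OUTPUT, SAMPLE_PATH):
    print(f"{principal}: {flags}")
```

In the sample, `I` marks an entry inherited from the parent folder, which is why the same accounts tend to appear on every file in a directory.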

### 2.1 Setting Secure Permissions

Restrict access to sensitive data files to the current user only.

In [None]:
# Set restrictive permissions (current user only)
def set_private_permissions(file_path, dry_run=True):
    """
    Set file/directory to be accessible only by current user.
    
    Args:
        file_path: Path to secure
        dry_run: If True, only show what would be done
    
    Returns:
        bool: Success status
    """
    file_path = Path(file_path)
    current_user = os.getenv('USERNAME')
    
    if not file_path.exists():
        print(f"✗ Path does not exist: {file_path}")
        return False
    
    print(f"{'DRY RUN: Would set' if dry_run else 'Setting'} private permissions:")
    print(f"  Path: {file_path}")
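    print(f"  User: {current_user}")
    print(f"  Access: Full control for {current_user} only")
    
    if not dry_run:
        try:
            # Drop inherited entries and grant the current user full control
            # in a single icacls call, so the path is never left without an
            # entry for us. /grant:r replaces any explicit entry we already
            # had. Note: /inheritance:r removes INHERITED entries only;
            # explicit entries for other accounts stay in place, so verify
            # the result with check_file_permissions().
            subprocess.run(
                ['icacls', str(file_path), '/inheritance:r',
                 '/grant:r', f'{current_user}:F'],
                capture_output=True,
                check=True
            )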
            
            print("✓ Permissions set successfully")
            return True
        
        except subprocess.CalledProcessError as e:
            print(f"✗ Failed to set permissions: {e}")
            return False
    else:
        print("\nSet dry_run=False to actually modify permissions")
        return True

# Example: Create a test file and secure it
test_file = Path('temp_secure_test.txt')
test_file.write_text('Sensitive data here')

print("Example: Securing a sensitive file\n")
set_private_permissions(test_file, dry_run=True)

# Cleanup
test_file.unlink()
print("\n(Test file removed)")
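
When the target is a directory, a plain `F` grant applies to the folder itself, and new files inside it only receive the permission if the entry carries the inheritance flags `(OI)(CI)`. The sketch below builds the `icacls` argument list without running it, so it can be inspected or logged first. `build_private_acl_command` and `current_account` are hypothetical helpers, and the account name is illustrative.

```python
from pathlib import Path


def current_account(env):
    """Build a DOMAIN\\user account name from environment-style variables."""
    user = env.get("USERNAME", "")
    domain = env.get("USERDOMAIN", "")
    return f"{domain}\\{user}" if domain else user


def build_private_acl_command(path, account, is_directory=None):
    """Return the icacls argument list that restricts `path` to `account`."""
    path = Path(path)
    if is_directory is None:
        is_directory = path.is_dir()
    # (OI)(CI) makes the grant apply to files and subfolders created later
    rights = "(OI)(CI)F" if is_directory else "F"
    return [
        "icacls", str(path),
        "/inheritance:r",               # stop inheriting from the parent
        "/grant:r", f"{account}:{rights}",
    ]


account = current_account({"USERNAME": "ana", "USERDOMAIN": "DESKTOP-01"})
print(build_private_acl_command("data", account, is_directory=True))
```

Passing `os.environ` as `env` gives the real account on Windows. Qualifying the user with its domain or machine name avoids ambiguity when a local and a domain account share the same username.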

## 3. Secure Credential Management

**Never** store credentials in code! Use environment variables or secure storage.

### Credential Storage Options

| Method | Security | Convenience | Best For |
|--------|----------|-------------|----------|
| **Environment Variables** | Medium | High | Development, local scripts |
| **Windows Credential Manager** | High | Medium | Production, user-specific |
| **Azure Key Vault** | Very High | Low | Enterprise, cloud |
| **.env Files** | Low | High | Local dev only (gitignored!) |
| **Hardcoded** | None | High | NEVER DO THIS |

This module covers environment variables and .env files.

In [None]:
# Secure credential loading from environment
import os
from pathlib import Path

class SecureConfig:
    """
    Load configuration securely from environment variables.
    """
    
    @staticmethod
    def load_from_env(key_name, required=True, default=None):
        """
        Load value from environment variable.
        
        Args:
            key_name: Environment variable name
            required: If True, raise error if not found
            default: Default value if not found
        
        Returns:
            str: Value from environment
        """
        value = os.getenv(key_name, default)
        
        if required and value is None:
            raise ValueError(
                f"Required environment variable not set: {key_name}\n"
                f"Set it with: setx {key_name} \"your_value\"\n"
                f"(setx only affects newly started processes; "
                f"restart the terminal or kernel afterwards)"
            )
        
        return value
    
    @staticmethod
    def load_from_dotenv(env_file='.env'):
        """
        Load environment variables from .env file.
        
        Args:
            env_file: Path to .env file
        
        Note:
            .env files should NEVER be committed to git!
            Add .env to .gitignore
        """
        env_path = Path(env_file)
        
        if not env_path.exists():
            print(f"⚠ .env file not found: {env_path}")
            return False
        
        # Check if in .gitignore
        gitignore_path = Path('.gitignore')
        if gitignore_path.exists():
            gitignore_content = gitignore_path.read_text()
            if '.env' not in gitignore_content:
                print("⚠ WARNING: .env file exists but not in .gitignore!")
                print("  Add '.env' to .gitignore to prevent accidental commits")
        
        # Parse .env file
        # Read as UTF-8 explicitly; the Windows default encoding may differ
        with open(env_path, encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                
                # Skip comments and empty lines
                if not line or line.startswith('#'):
                    continue
                
                # Parse KEY=VALUE
                if '=' in line:
                    key, value = line.split('=', 1)
                    key = key.strip()
                    value = value.strip().strip('"').strip("'")
                    os.environ[key] = value
        
        print(f"✓ Loaded environment variables from {env_path}")
        return True

# Example usage
print("Example: Secure credential loading\n")
print("# In your .env file (NOT committed to git):")
print("API_KEY=your_secret_key_here")
print("DB_PASSWORD=your_db_password")
print("")
print("# In your Python code:")
print("config = SecureConfig()")
print("config.load_from_dotenv('.env')")
print("")
print("api_key = config.load_from_env('API_KEY', required=True)")
print("db_password = config.load_from_env('DB_PASSWORD', required=True)")
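
Environment variables are always strings, so settings such as a port number need converting and validating before use. The following sketch shows one way to do that. `get_int_env` is a hypothetical helper, not a method of `SecureConfig`, and `DB_PORT` is set inside the example only to make it runnable.

```python
import os


def get_int_env(name, default=None, minimum=None, maximum=None):
    """Read an integer setting from the environment and range-check it."""
    raw = os.getenv(name)
    if raw is None:
        if default is None:
            raise ValueError(f"Required environment variable not set: {name}")
        return default
    try:
        value = int(raw)
    except ValueError:
        # Do not echo the raw value: it could be a secret pasted by mistake
        raise ValueError(f"{name} must be an integer") from None
    too_small = minimum is not None and value < minimum
    too_large = maximum is not None and value > maximum
    if too_small or too_large:
        raise ValueError(f"{name} is outside the allowed range")
    return value


os.environ["DB_PORT"] = "5432"  # set here only for the demonstration
print(get_int_env("DB_PORT", minimum=1, maximum=65535))  # 5432
```

The error messages deliberately name the variable but never include its value, in line with the rule about not printing sensitive data.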

### 3.1 Creating Secure .env Files

Best practices for .env file management.

In [None]:
# Create .env template
def create_env_template(template_path='.env.template', env_path='.env'):
    """
    Create .env template and check for actual .env file.
    
    Args:
        template_path: Path for template file (committed to git)
        env_path: Path for actual .env file (gitignored)
    """
    template_content = """# Environment Configuration Template
# Copy this file to .env and fill in your actual values
# NEVER commit .env to git! (it should be in .gitignore)

# API Keys
API_KEY=your_api_key_here
SECRET_KEY=your_secret_key_here

# Database
DB_HOST=localhost
DB_PORT=5432
DB_NAME=your_database
DB_USER=your_username
DB_PASSWORD=your_password

# Email
SMTP_SERVER=smtp.example.com
SMTP_PORT=587
SMTP_USERNAME=your_email@example.com
SMTP_PASSWORD=your_email_password
"""
    
    # Create template
    template_path = Path(template_path)
    template_path.write_text(template_content)
    print(f"✓ Created template: {template_path}")
    print("  This file CAN be committed to git")
    
    # Check if actual .env exists
    env_path = Path(env_path)
    if not env_path.exists():
        print(f"\nℹ No .env file found")
        print(f"  Copy {template_path} to {env_path}")
        print(f"  Then fill in your actual credentials")
    else:
        print(f"\n✓ Found existing .env file: {env_path}")
    
    # Check .gitignore
    gitignore_path = Path('.gitignore')
    if gitignore_path.exists():
        gitignore_content = gitignore_path.read_text()
        if '.env' in gitignore_content:
            print("✓ .env is in .gitignore (safe)")
        else:
            print("⚠ WARNING: .env NOT in .gitignore!")
            print("  Add this line to .gitignore:")
            print("  .env")
    else:
        print("\nℹ No .gitignore found")
        print("  Create .gitignore with: .env")

# Demo
print("Creating .env template...\n")
create_env_template(
    template_path='demo.env.template',
    env_path='demo.env'
)

# Cleanup
Path('demo.env.template').unlink()
print("\n(Demo template removed)")
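
A template is most useful when something checks the real `.env` against it. The sketch below compares key names only (never values) and reports which keys from the template are missing. `read_env_keys` and `missing_env_keys` are hypothetical helpers, and the demonstration writes to a temporary directory so that no real `.env` file is touched.

```python
import tempfile
from pathlib import Path


def read_env_keys(path):
    """Return the set of KEY names defined in a .env-style file."""
    keys = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_env_keys(template_path, env_path):
    """Return template keys that the actual .env file does not define."""
    return sorted(read_env_keys(template_path) - read_env_keys(env_path))


with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp, ".env.template")
    env_file = Path(tmp, ".env")
    template.write_text(
        "# Template\nAPI_KEY=your_api_key_here\nDB_PASSWORD=your_password\n",
        encoding="utf-8",
    )
    env_file.write_text("API_KEY=abc123\n", encoding="utf-8")
    print(missing_env_keys(template, env_file))  # ['DB_PASSWORD']
```

Running a check like this at startup gives a teammate a precise list of what to fill in, instead of a failure deep inside the first database call.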

## 4. Security Best Practices for Data Science

Practical security guidelines for your data science work.

### 4.1 Data File Security Checklist

**For Sensitive Datasets:**

✅ **DO:**
- Store in user-only directories (set with `set_private_permissions()`)
- Use descriptive names without exposing content (e.g., `customer_data.csv` not `john_smith_medical_records.csv`)
- Encrypt files at rest for highly sensitive data
- Add to .gitignore if it contains PII or proprietary data
- Use relative paths, never absolute paths with usernames
- Delete or anonymize when no longer needed

❌ **DON'T:**
- Store in shared or easily exposed locations (Public folders, open network shares, cloud-synced Desktop or Downloads folders)
- Commit sensitive data to git repositories
- Email raw sensitive datasets
- Keep unnecessary copies
- Use predictable filenames for secret data
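
For the "delete or anonymize" item, one common first step is replacing direct identifiers with keyed hashes so that records can still be joined without exposing the original value. The sketch below uses HMAC-SHA256 from the standard library. `pseudonymize` is a hypothetical helper, and the key shown is a placeholder; in practice it would come from an environment variable. Note that this is pseudonymization, not full anonymization: anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Map an identifier to a stable, keyed 16-character token."""
    # Normalize so 'Ana@Example.com ' and 'ana@example.com' match
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256)
    return digest.hexdigest()[:16]


key = b"placeholder-key-load-from-env"  # assumption: real key lives outside the code
token_a = pseudonymize("ana@example.com", key)
token_b = pseudonymize("Ana@Example.com ", key)
print(token_a == token_b)  # True
```

A keyed hash is used instead of a plain SHA-256 because unkeyed hashes of emails or phone numbers can often be reversed by hashing a list of likely values.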

### 4.2 Code Security Checklist

**For Scripts and Notebooks:**

✅ **DO:**
- Use environment variables for credentials
- Validate all user inputs (prevent injection attacks)
- Use parameterized queries for databases (prevent SQL injection)
- Log security events (failed auth, unauthorized access)
- Handle exceptions without exposing sensitive info
- Use virtual environments to isolate dependencies

❌ **DON'T:**
- Hardcode passwords, API keys, tokens in code
- Print sensitive data to console/logs
- Use `eval()` or `exec()` on user input
- Ignore security warnings from linters
- Run untrusted code without sandboxing
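
The parameterized-query item is easiest to see with a concrete case. This sketch uses the standard-library `sqlite3` module with an in-memory database; the `customers` table and `find_customer` function are invented for illustration. The `?` placeholder makes the driver treat the input strictly as data, so a classic injection string matches nothing.

```python
import sqlite3


def find_customer(conn, name):
    """Look up customers by exact name using a bound parameter."""
    cursor = conn.execute(
        "SELECT id, name FROM customers WHERE name = ?", (name,)
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)", [("Ana",), ("Ben",)]
)

print(find_customer(conn, "Ana"))           # [(1, 'Ana')]
print(find_customer(conn, "x' OR '1'='1"))  # []
```

Had the query been assembled with an f-string, the second call would have produced `WHERE name = 'x' OR '1'='1'` and returned every row.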

### 4.3 Credential Security Checklist

**For API Keys, Passwords, Tokens:**

✅ **DO:**
- Store in environment variables or .env file
- Add .env to .gitignore
- Use different credentials for dev/prod
- Rotate credentials periodically
- Revoke credentials when no longer needed
- Use least privilege (minimum required permissions)

❌ **DON'T:**
- Commit credentials to git (even private repos)
- Share credentials via email/chat
- Reuse personal credentials for work
- Use default/weak passwords
- Store credentials in browser autofill for work accounts
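
Two items on this checklist lend themselves to small helpers: generating a strong replacement when rotating a credential, and showing a credential in logs or chat without revealing it. The sketch below uses the standard-library `secrets` module. `generate_api_token` and `mask_secret` are hypothetical names chosen for this example.

```python
import secrets


def generate_api_token(nbytes=32):
    """Return a URL-safe random token built from `nbytes` of randomness."""
    return secrets.token_urlsafe(nbytes)


def mask_secret(value, visible=4):
    """Hide all but the last `visible` characters of a secret."""
    if len(value) <= visible:
        # Too short to show anything without revealing most of it
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


token = generate_api_token()
print(mask_secret(token))
print(mask_secret("abcdefghijkl"))  # ********ijkl
```

Keeping the last few characters visible is usually enough to tell two keys apart when confirming which one is deployed, without giving an observer anything usable.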

## 5. Practice Exercises

### Exercise 1: Secure Data Directory Setup

Create a secure data directory structure:
1. Create `data/sensitive/` directory
2. Set permissions to current user only
3. Create `.gitignore` to exclude `data/sensitive/`
4. Create `README.md` with security guidelines
5. Verify permissions are set correctly

**Hint**: Use `set_private_permissions()` and verify with `check_file_permissions()`

In [None]:
# Exercise 1: Your solution here

def setup_secure_data_directory(base_dir='data'):
    """
    Create secure data directory with proper permissions.
    
    Args:
        base_dir: Base directory for data
    """
    # TODO: Create directory structure
    # TODO: Set private permissions
    # TODO: Create .gitignore
    # TODO: Create README with guidelines
    # TODO: Verify permissions
    pass

# Test your solution
# setup_secure_data_directory('test_data')

### Exercise 2: Credential Validator

Create a tool to audit code for security issues:
1. Scan Python files for hardcoded credentials
2. Check for common patterns (API_KEY=, password=, token=)
3. Look for suspicious eval/exec usage
4. Verify .env files are in .gitignore
5. Generate security audit report

**Hint**: Use regex patterns and file scanning

In [None]:
# Exercise 2: Your solution here

class SecurityAuditor:
    """
    Audit code for common security issues.
    """
    
    def __init__(self, project_dir):
        # TODO: Initialize auditor
        pass
    
    def scan_for_hardcoded_credentials(self):
        # TODO: Scan Python files for credentials
        pass
    
    def check_gitignore(self):
        # TODO: Verify sensitive files are ignored
        pass
    
    def generate_report(self):
        # TODO: Generate audit report
        pass

# Test your auditor
# auditor = SecurityAuditor('.')
# auditor.generate_report()

### Exercise 3: Secure Configuration Manager

Build a configuration management system:
1. Load config from multiple sources (env, .env, defaults)
2. Validate required settings are present
3. Mask sensitive values in logs
4. Support environment-specific configs (dev, prod)
5. Provide clear error messages for missing credentials

**Hint**: Extend `SecureConfig` class

In [None]:
# Exercise 3: Your solution here

class ConfigurationManager:
    """
    Manage application configuration securely.
    """
    
    def __init__(self, env='development'):
        # TODO: Initialize config manager
        pass
    
    def load_all_configs(self):
        # TODO: Load from all sources (env, .env, defaults)
        pass
    
    def validate_required(self, required_keys):
        # TODO: Check all required settings present
        pass
    
    def get_safe_summary(self):
        # TODO: Return config summary with masked secrets
        pass

# Test your manager
# config = ConfigurationManager(env='development')
# config.load_all_configs()
# print(config.get_safe_summary())

## 6. Summary

### Key Concepts

1. **Administrator Privileges**
   - Check with `is_admin()` before elevated operations
   - Request elevation with UAC when needed
   - Follow principle of least privilege

2. **File Permissions**
   - Use `icacls` to check and modify ACLs
   - Set private permissions for sensitive data
   - Understand the `icacls` permission codes (F, M, RX, R, W)

3. **Credential Management**
   - **Never** hardcode credentials
   - Use environment variables or .env files
   - Keep .env in .gitignore
   - Provide .env.template for team

4. **Security Best Practices**
   - Validate all inputs
   - Use parameterized queries
   - Log security events
   - Encrypt sensitive data at rest
   - Rotate credentials regularly

### Real-World Applications

- **Data Protection**: Secure customer PII and proprietary datasets
- **API Integration**: Manage API keys and tokens safely
- **Database Access**: Store connection strings securely
- **Team Collaboration**: Share configs without exposing secrets
- **Compliance**: Meet security requirements (GDPR, HIPAA, etc.)

### What's Next?

In **Module 09: Advanced Scripting Techniques**, you'll learn:
- Parallel processing with multiprocessing
- Advanced error handling patterns
- Structured logging with Python logging
- Testing automation scripts
- Performance optimization

### Self-Assessment

Before moving on, make sure you can:
- [ ] Check if running as administrator
- [ ] Request elevation when needed
- [ ] Check and modify file permissions
- [ ] Load credentials from environment variables
- [ ] Create and use .env files safely
- [ ] Apply security best practices to data science workflows

---

**Continue to Module 09** when ready!