# Module 3: Python Libraries

This module covers **essential Python standard library modules** for interacting with the operating system and Python interpreter.

## What you'll be able to do after this module
- Navigate and manipulate the **file system** programmatically.
- Access and modify **environment variables**.
- Work with **file paths** in a cross-platform way.
- Understand how Python interacts with the **interpreter** and **system**.
- Handle **command-line arguments** and **standard I/O streams**.

## How to use this notebook
- Run cells top-to-bottom once.
- Each section has: **theory → examples**.

## Table of Contents
1. The `os` Module: Operating system interface
2. Environment Variables: Reading and setting system variables
3. Path Operations: Cross-platform path handling with `os.path` and `pathlib`
4. The `sys` Module: Python interpreter interaction
5. Command-Line Arguments: Building CLI tools
6. Standard I/O Streams: stdin, stdout, stderr

---

In [None]:
# Setup: Helper function for section banners

def banner(title: str) -> None:
    print(f"\n{'=' * 10} {title} {'=' * 10}")

banner("Module 3: Python Libraries")
print("Ready to explore os and sys modules!")

## 1. The `os` Module

### Theory
The `os` module provides a **portable way** to use operating system-dependent functionality.

It allows you to:
- Get information about the current working directory
- Create, rename, and remove files/directories
- List directory contents
- Execute system commands

**Key principle:** Use `os` for cross-platform compatibility instead of hardcoding shell commands.
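
As a minimal sketch of the operations listed above, the following creates, renames, and removes a file inside a temporary directory, so nothing in the working directory is touched. The file names `draft.txt` and `final.txt` are illustrative only.

```python
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    draft = os.path.join(tmp, "draft.txt")  # hypothetical file name
    final = os.path.join(tmp, "final.txt")  # hypothetical file name

    # Create an empty file.
    with open(draft, "w", encoding="utf-8"):
        pass

    # Rename the file, then list the directory to confirm the new name.
    os.rename(draft, final)
    after_rename = sorted(os.listdir(tmp))
    print(f"After rename: {after_rename}")

    # Delete the file; the directory is empty again.
    os.remove(final)
    after_remove = os.listdir(tmp)
    print(f"After remove: {after_remove}")
```

The same calls behave identically on Linux, macOS, and Windows, which is the point of going through `os` rather than shelling out to `mv` or `del`.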

In [None]:
import os

banner("os Module Basics")

# Current working directory
cwd = os.getcwd()
print(f"Current directory: {cwd}")

# Operating system name ('posix' for Linux/Mac, 'nt' for Windows)
print(f"OS name: {os.name}")

# List directory contents
print("\nFiles in current directory:")
for item in os.listdir('.')[:5]:  # Show first 5 items
    print(f"  - {item}")

In [None]:
banner("Directory Operations")

# Create a directory (mkdir) - creates single directory
# os.mkdir('new_folder')

# Create nested directories (makedirs) - creates all intermediate directories
# os.makedirs('parent/child/grandchild', exist_ok=True)

# Remove a directory (only if empty)
# os.rmdir('new_folder')

# Remove nested empty directories
# os.removedirs('parent/child/grandchild')

# Check if path exists
print(f"Does '.' exist? {os.path.exists('.')}")
print(f"Is '.' a directory? {os.path.isdir('.')}")
print(f"Is 'os' a file? {os.path.isfile('os')}")

# Safe directory creation example
# exist_ok=True avoids an error if the directory already exists, and it
# avoids the gap between a separate "exists?" check and the mkdir call
demo_dir = 'demo_module3'
os.makedirs(demo_dir, exist_ok=True)
print(f"Directory ready: {demo_dir}")

In [None]:
banner("Walking Directory Trees")

# os.walk() - recursively traverse directories
# Returns: (dirpath, dirnames, filenames) for each directory

print("Directory structure (first 5 directories visited):")
count = 0
for root, dirs, files in os.walk('.'):
    # Skip hidden directories and common non-essential folders
    dirs[:] = [d for d in dirs if not d.startswith('.') and d != '__pycache__']
    
    level = root.count(os.sep)  # '.' is level 0, './a' is level 1, ...
    indent = ' ' * 2 * level
    print(f"{indent}{os.path.basename(root)}/")
    
    # Show files
    subindent = ' ' * 2 * (level + 1)
    for file in files[:3]:  # Limit files shown
        print(f"{subindent}{file}")
    
    count += 1
    if count >= 5:  # Stop after 5 directories to keep the demo short
        break

## 2. Environment Variables

### Theory
Environment variables are **key-value pairs** stored by the operating system.

Common uses:
- Configuration (database URLs, API keys)
- System paths (PATH, HOME)
- Application settings

**Best practice:** Never hardcode sensitive data; use environment variables instead.
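
The sketch below shows one way to read configuration from the environment with defaults and type conversion. The helper names `env_flag` and `env_int`, and the variables `DEMO_APP_DEBUG` and `DEMO_APP_PORT`, are hypothetical and exist only for this illustration.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Environment values are always strings, so compare text, not truthiness.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(name: str, default: int) -> int:
    """Read an integer setting, falling back to the default if unset or invalid."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        return default


# Hypothetical variable names, set here only so the demo is self-contained.
os.environ["DEMO_APP_DEBUG"] = "TRUE"
os.environ["DEMO_APP_PORT"] = "8080"

debug = env_flag("DEMO_APP_DEBUG")
port = env_int("DEMO_APP_PORT", default=5000)
print(f"debug={debug}, port={port}")

# Remove the demo variables again.
del os.environ["DEMO_APP_DEBUG"]
del os.environ["DEMO_APP_PORT"]
```

Because every environment value is a string, a check such as `if os.environ.get('DEBUG'):` treats the text `'false'` as true; converting explicitly avoids that trap.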

In [None]:
banner("Environment Variables")

# Access all environment variables (dict-like)
print(f"Total env vars: {len(os.environ)}")

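# Get specific variables
# (HOME, USER, and SHELL are Unix conventions; Windows typically uses
# USERPROFILE and USERNAME instead, so the defaults below may be shown there)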
home = os.environ.get('HOME', 'Not set')
user = os.environ.get('USER', 'Not set')
shell = os.environ.get('SHELL', 'Not set')

print(f"HOME: {home}")
print(f"USER: {user}")
print(f"SHELL: {shell}")

# Safe access with default value
debug_mode = os.environ.get('DEBUG', 'false')
print(f"DEBUG: {debug_mode}")

In [None]:
banner("Setting Environment Variables")

# Set an environment variable (affects this process and any child processes it starts)
os.environ['MY_APP_CONFIG'] = 'production'
print(f"MY_APP_CONFIG: {os.environ['MY_APP_CONFIG']}")

# Alternative: os.putenv(), but it does not update os.environ,
# so assigning to os.environ is preferred

# Check if variable exists
if 'MY_APP_CONFIG' in os.environ:
    print("Config variable is set!")

# Delete an environment variable
del os.environ['MY_APP_CONFIG']
print(f"After deletion: {os.environ.get('MY_APP_CONFIG', 'Not found')}")

# Show PATH variable (split into readable format)
print("\nFirst 3 PATH entries:")
path_entries = os.environ.get('PATH', '').split(os.pathsep)
for entry in path_entries[:3]:
    print(f"  - {entry}")

## 3. Path Operations

### Theory
File paths differ between operating systems:
- **Unix/Mac:** `/home/user/file.txt`
- **Windows:** `C:\Users\user\file.txt`

Python provides two approaches:
1. **`os.path`** - Traditional string-based path manipulation
2. **`pathlib`** - Modern object-oriented path handling (Python 3.4+)

**Recommendation:** Use `pathlib` for new code; it's more readable and powerful.
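
To make the comparison concrete, here is a small side-by-side sketch that takes the same path apart with both approaches. It uses `PurePosixPath` so the `pathlib` half gives the same result on any operating system; the path itself is just the Unix example from above.

```python
import os.path
from pathlib import PurePosixPath

raw = "/home/user/documents/report.pdf"

# os.path: free functions that take and return plain strings.
dirname, filename = os.path.split(raw)
stem, ext = os.path.splitext(filename)
print(f"os.path -> dir={dirname}, name={filename}, stem={stem}, ext={ext}")

# pathlib: one object whose attributes expose the same pieces.
p = PurePosixPath(raw)
print(f"pathlib -> dir={p.parent}, name={p.name}, stem={p.stem}, ext={p.suffix}")
```

Both approaches yield the same components; the difference is that `pathlib` keeps them on a single object instead of threading strings through several function calls.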

In [None]:
banner("os.path Operations")

# Join paths (cross-platform)
path = os.path.join('folder', 'subfolder', 'file.txt')
print(f"Joined path: {path}")

# Split path into directory and filename
dirname, filename = os.path.split('/home/user/documents/report.pdf')
print(f"Directory: {dirname}")
print(f"Filename: {filename}")

# Get file extension
name, ext = os.path.splitext('report.pdf')
print(f"Name: {name}, Extension: {ext}")

# Get absolute path
print(f"Absolute path of '.': {os.path.abspath('.')}")

# Normalize path (collapse .. and . textually, without touching the file system)
messy_path = 'folder/../folder/./file.txt'
print(f"Normalized: {os.path.normpath(messy_path)}")

In [None]:
from pathlib import Path

banner("pathlib - Modern Path Handling")

# Create Path objects
current = Path('.')
home = Path.home()
cwd = Path.cwd()

print(f"Current: {current.resolve()}")
print(f"Home: {home}")
print(f"CWD: {cwd}")

# Join paths using / operator (very readable!)
config_path = home / '.config' / 'myapp' / 'settings.json'
print(f"Config path: {config_path}")

# Path properties
demo_file = Path('/home/user/documents/report.pdf')
print(f"\nPath properties of {demo_file}:")
print(f"  name: {demo_file.name}")
print(f"  stem: {demo_file.stem}")
print(f"  suffix: {demo_file.suffix}")
print(f"  parent: {demo_file.parent}")
print(f"  parts: {demo_file.parts}")

In [None]:
banner("pathlib - Practical Examples")

# Check existence and type
p = Path('.')
print(f"Exists: {p.exists()}")
print(f"Is directory: {p.is_dir()}")
print(f"Is file: {p.is_file()}")

# List directory contents
print("\nPython files in current directory:")
for py_file in Path('.').glob('*.py'):
    print(f"  - {py_file}")

# Recursive glob (find all .ipynb files)
print("\nNotebook files:")
for notebook in Path('.').glob('**/*.ipynb'):
    print(f"  - {notebook}")

# Read/write files (pathlib way)
demo_file = Path('demo_module3') / 'test.txt'
demo_file.parent.mkdir(exist_ok=True)  # Ensure directory exists
demo_file.write_text('Hello from pathlib!')
content = demo_file.read_text()
print(f"\nFile content: {content}")

## 4. The `sys` Module

### Theory
The `sys` module provides access to **Python interpreter** variables and functions.

Common uses:
- Get Python version and platform info
- Access command-line arguments
- Modify module search paths
- Control standard I/O streams
- Exit programs with status codes
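
One common use of the version information is guarding a script against an interpreter that is too old. The following is a minimal sketch; the minimum version `(3, 8)` and the function name `python_is_supported` are assumptions chosen for illustration.

```python
import sys

MIN_VERSION = (3, 8)  # assumed minimum, chosen only for illustration


def python_is_supported(min_version: tuple = MIN_VERSION) -> bool:
    """Return True if the running interpreter is at least min_version."""
    # sys.version_info compares like a tuple: (major, minor, micro, ...).
    return sys.version_info >= min_version


current = f"{sys.version_info.major}.{sys.version_info.minor}"
if python_is_supported():
    print(f"Python {current} meets the assumed minimum.")
else:
    # Report problems on stderr so they do not mix with normal output.
    print(f"Python {current} is older than the assumed minimum.", file=sys.stderr)
```

Comparing `sys.version_info` against a tuple is more reliable than parsing the `sys.version` string, whose format is meant for display.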

In [None]:
import sys

banner("sys Module - Interpreter Info")

# Python version
print(f"Python version: {sys.version}")
print(f"Version info: {sys.version_info}")
print(f"Major.Minor: {sys.version_info.major}.{sys.version_info.minor}")

# Platform
print(f"\nPlatform: {sys.platform}")

# Executable path
print(f"Python executable: {sys.executable}")

# Default encoding
print(f"Default encoding: {sys.getdefaultencoding()}")

In [None]:
banner("sys.path - Module Search Paths")

# Python searches these paths for modules (in order)
print("Module search paths:")
for i, path in enumerate(sys.path[:5]):
    print(f"  {i}: {path or '(current directory)'}")
print(f"  ... ({len(sys.path)} total paths)")

# Add custom path (useful for project imports)
# sys.path.insert(0, '/path/to/my/modules')

# Check loaded modules
print(f"\nLoaded modules: {len(sys.modules)}")
print("Some loaded modules:")
for name in list(sys.modules.keys())[:5]:
    print(f"  - {name}")

In [None]:
banner("sys - Memory and Recursion")

# Shallow object size in bytes (the container itself, not the objects it references)
small_list = [1, 2, 3]
big_list = list(range(1000))
print(f"Size of [1,2,3]: {sys.getsizeof(small_list)} bytes")
print(f"Size of range(1000) list: {sys.getsizeof(big_list)} bytes")
print(f"Size of empty dict: {sys.getsizeof({})} bytes")
print(f"Size of empty string: {sys.getsizeof('')} bytes")

# Recursion limit
print(f"\nRecursion limit: {sys.getrecursionlimit()}")
# sys.setrecursionlimit(2000)  # Increase if needed (be careful!)

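# Reference count (CPython specific); the result is typically one higher than
# expected because the call itself holds a temporary reference to its argument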
x = []
print(f"Reference count of x: {sys.getrefcount(x)}")

## 5. Command-Line Arguments

### Theory
`sys.argv` is a list containing command-line arguments passed to the script.

- `sys.argv[0]` is the script name
- `sys.argv[1:]` are the arguments

For complex CLI tools, consider using `argparse` (standard library) or `click` (third-party).
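
A pattern that works well in both scripts and notebooks is a `main(argv)` function that accepts the argument list explicitly and returns an exit code. The sketch below assumes a hypothetical program called `greet`; the program name and its options are invented for illustration.

```python
import argparse


def main(argv=None) -> int:
    """Parse the given arguments and return an exit code."""
    parser = argparse.ArgumentParser(prog="greet")  # hypothetical program name
    parser.add_argument("name", help="Who to greet")
    parser.add_argument("-s", "--shout", action="store_true", help="Use upper case")

    # Passing None makes argparse fall back to sys.argv[1:].
    args = parser.parse_args(argv)

    message = f"Hello, {args.name}!"
    print(message.upper() if args.shout else message)
    return 0  # 0 signals success


# In a notebook or a test, pass the arguments as a list.
exit_code = main(["World", "--shout"])
print(f"Exit code: {exit_code}")

# In a standalone script, the usual entry point would be:
#     if __name__ == "__main__":
#         sys.exit(main())
```

Keeping `sys.argv` out of the function body means the same code can be called from tests with any argument list, without patching global state.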

In [None]:
banner("sys.argv - Command Line Arguments")

# In a notebook, sys.argv contains Jupyter-related args
print(f"sys.argv: {sys.argv}")
print(f"Script name: {sys.argv[0]}")

# Simulating command-line parsing
# If this were a script run as: python script.py --verbose input.txt
# sys.argv would be: ['script.py', '--verbose', 'input.txt']

# Example: Simple argument parsing
def parse_args(args: list[str]) -> dict:
    """Simple argument parser."""
    result = {'verbose': False, 'files': []}
    
    for arg in args[1:]:  # Skip script name
        if arg in ('-v', '--verbose'):
            result['verbose'] = True
        elif not arg.startswith('-'):
            result['files'].append(arg)
    
    return result

# Demo
demo_args = ['script.py', '--verbose', 'input.txt', 'output.txt']
parsed = parse_args(demo_args)
print(f"\nParsed args: {parsed}")

In [None]:
import argparse

banner("argparse - Professional CLI")

# argparse provides automatic help, type conversion, and validation

def create_parser():
    parser = argparse.ArgumentParser(
        description='Process some files.',
        epilog='Example: python script.py -v input.txt'
    )
    
    parser.add_argument('files', nargs='+', help='Input files to process')
    parser.add_argument('-v', '--verbose', action='store_true', help='Increase output verbosity')
    parser.add_argument('-o', '--output', default='result.txt', help='Output file (default: result.txt)')
    parser.add_argument('-n', '--count', type=int, default=1, help='Number of iterations')
    
    return parser

# Demo (in notebook, we parse a list instead of sys.argv)
parser = create_parser()

# Simulate: python script.py -v -n 5 file1.txt file2.txt
demo_args = ['-v', '-n', '5', 'file1.txt', 'file2.txt']
args = parser.parse_args(demo_args)

print(f"Verbose: {args.verbose}")
print(f"Output: {args.output}")
print(f"Count: {args.count}")
print(f"Files: {args.files}")

# Show auto-generated help
print("\n--- Help Message ---")
parser.print_help()

## 6. Standard I/O Streams

### Theory
Every program has three standard streams:
- **`sys.stdin`**: Standard input (keyboard by default)
- **`sys.stdout`**: Standard output (console by default)
- **`sys.stderr`**: Standard error (console by default, but separate)

These can be **redirected** for logging, testing, or piping data.
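
Because the streams are ordinary file-like objects, a function can accept them as parameters and be tested with in-memory buffers. The following is a minimal sketch; the function `count_lines` is hypothetical and simply counts non-blank input lines.

```python
import io
import sys


def count_lines(source=None, out=None, err=None) -> int:
    """Count non-blank lines from source; report the total on out."""
    # Resolve defaults at call time so later redirection is respected.
    source = sys.stdin if source is None else source
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err

    total = 0
    for line in source:
        if line.strip():
            total += 1
        else:
            # Diagnostics go to the error stream, not the result stream.
            print("skipping blank line", file=err)

    print(total, file=out)
    return total


# Substitute in-memory buffers for the three streams.
fake_in = io.StringIO("alpha\n\nbeta\n")
fake_out, fake_err = io.StringIO(), io.StringIO()
count_lines(fake_in, fake_out, fake_err)

print(f"stdout received: {fake_out.getvalue()!r}")
print(f"stderr received: {fake_err.getvalue()!r}")
```

Keeping results on `stdout` and diagnostics on `stderr` lets a caller pipe the output into another program while still seeing the warnings on the console.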

In [None]:
banner("Standard I/O Streams")

# Standard streams
print(f"stdin: {sys.stdin}")
print(f"stdout: {sys.stdout}")
print(f"stderr: {sys.stderr}")

# Write to stderr (useful for error messages)
print("This is an error message", file=sys.stderr)

# Flush output (important for real-time logging)
print("Flushing output...", flush=True)

In [None]:
from io import StringIO

banner("Redirecting Output")

# Capture stdout to a string (useful for testing)
old_stdout = sys.stdout
sys.stdout = captured = StringIO()

# This print goes to our StringIO buffer
print("This is captured!")
print("So is this!")

# Restore stdout
sys.stdout = old_stdout

# Get captured content
output = captured.getvalue()
print(f"Captured output:\n{output}")

# Better approach: use contextlib.redirect_stdout
from contextlib import redirect_stdout

buffer = StringIO()
with redirect_stdout(buffer):
    print("Safely captured with context manager")

print(f"Buffer content: {buffer.getvalue()}")

In [None]:
banner("sys.exit - Program Termination")

# sys.exit(code) terminates the program
# code 0 = success, non-zero = error

def process_data(data):
    """Example function with exit codes."""
    if not data:
        print("Error: No data provided", file=sys.stderr)
        # sys.exit(1)  # Would exit with error code
        return 1  # Return code instead for demo
    
    print(f"Processing {len(data)} items...")
    return 0  # Success

# Demo
result = process_data([])
print(f"Exit code: {result}")

result = process_data([1, 2, 3])
print(f"Exit code: {result}")

# Common exit code conventions (126, 127, and 130 come from Unix shells):
# 0 - Success
# 1 - General error
# 2 - Command line usage error
# 126 - Command not executable
# 127 - Command not found
# 130 - Script terminated by Ctrl+C

## Summary

### `os` Module
| Function | Purpose |
|----------|--------|
| `os.getcwd()` | Get current working directory |
| `os.listdir(path)` | List directory contents |
| `os.mkdir(path)` | Create directory |
| `os.makedirs(path)` | Create nested directories |
| `os.remove(path)` | Delete file |
| `os.environ` | Access environment variables |
| `os.path.join()` | Join path components |
| `os.walk(path)` | Recursively traverse directories |

### `pathlib` Module
| Feature | Example |
|---------|--------|
| Create path | `Path('/home/user')` |
| Join paths | `path / 'subdir' / 'file.txt'` |
| Get filename | `path.name` |
| Get extension | `path.suffix` |
| Check exists | `path.exists()` |
| Glob files | `path.glob('*.py')` |

### `sys` Module
| Variable/Function | Purpose |
|-------------------|--------|
| `sys.version` | Python version string |
| `sys.argv` | Command-line arguments |
| `sys.path` | Module search paths |
| `sys.stdin/stdout/stderr` | Standard I/O streams |
| `sys.exit(code)` | Exit program |
| `sys.getsizeof(obj)` | Object size in bytes |

---

**Next Module:** File Systems & Module Architecture - Building robust packages and project structure.

In [None]:
# Cleanup demo files
import shutil

banner("Cleanup")

demo_dir = Path('demo_module3')
if demo_dir.exists():
    shutil.rmtree(demo_dir)
    print(f"Removed: {demo_dir}")
else:
    print("Nothing to clean up")

print("\nModule 3 complete! 🐍")