# Chapter 28: Environment and Configuration

This notebook covers how CLI tools interact with the operating system environment: reading environment variables with `os.environ`, inspecting command-line arguments via `sys.argv`, integrating configuration files, and using exit codes for process communication.

## Key Concepts
- **`os.environ`**: A mapping of environment variables
- **`sys.argv`**: The raw list of command-line arguments
- **Config file integration**: Layering file-based config with CLI arguments
- **Exit codes**: Communicating success or failure to the calling process

## Section 1: Environment Variables with os.environ

`os.environ` is a dictionary-like object that provides access to environment variables. Many CLI tools use environment variables for configuration (e.g., `DATABASE_URL`, `API_KEY`).

In [None]:
import os

# os.environ behaves like a dict[str, str]
print(f"Type: {type(os.environ).__name__}")
print(f"HOME: {os.environ.get('HOME', 'not set')}")
print(f"PATH (first 80 chars): {os.environ.get('PATH', '')[:80]}...")

In [None]:
# Setting and reading environment variables
os.environ["MY_APP_DEBUG"] = "true"
os.environ["MY_APP_PORT"] = "8080"

debug_flag: str = os.environ["MY_APP_DEBUG"]
port_str: str = os.environ["MY_APP_PORT"]

print(f"MY_APP_DEBUG: {debug_flag!r} (type: {type(debug_flag).__name__})")
print(f"MY_APP_PORT: {port_str!r} (type: {type(port_str).__name__})")
print()
print("Note: Environment variables are always strings.")
print(f"Port as int: {int(port_str)}")

# Clean up
del os.environ["MY_APP_DEBUG"]
del os.environ["MY_APP_PORT"]

In [None]:
# Safe access with .get() and defaults
# This is the recommended pattern for optional environment variables
database_url: str = os.environ.get("DATABASE_URL", "sqlite:///default.db")
log_level: str = os.environ.get("LOG_LEVEL", "info")
max_workers: int = int(os.environ.get("MAX_WORKERS", "4"))

print(f"Database URL: {database_url}")
print(f"Log level: {log_level}")
print(f"Max workers: {max_workers} (type: {type(max_workers).__name__})")
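
Because every value arrives as a string, a bare `bool(...)` or `int(...)` cast can either give the wrong answer (`bool("false")` is `True`) or raise an error that does not name the offending variable. The sketch below shows one way to wrap the conversion. The helper names `env_bool` and `env_int`, the accepted spellings, and the `DEMO_*` variable names are assumptions made for illustration, not part of any library.

```python
import os


def env_bool(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in {"1", "true", "yes", "on"}:
        return True
    if value in {"0", "false", "no", "off", ""}:
        return False
    raise ValueError(f"{name} must be a boolean-like value, got {raw!r}")


def env_int(name: str, default: int) -> int:
    """Interpret an environment variable as an integer, naming it on failure."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


# DEMO_* names are made up for this example
os.environ["DEMO_DEBUG"] = "Yes"
os.environ["DEMO_WORKERS"] = "eight"

print(f"DEMO_DEBUG as bool: {env_bool('DEMO_DEBUG')}")
try:
    env_int("DEMO_WORKERS", default=4)
except ValueError as e:
    print(f"Error: {e}")

# Clean up
del os.environ["DEMO_DEBUG"]
del os.environ["DEMO_WORKERS"]
```

Raising an error on unrecognised spellings is a design choice; some tools prefer to fall back to the default silently.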

In [None]:
# Checking for required environment variables
def require_env(name: str) -> str:
    """Get a required environment variable or raise an error."""
    value: str | None = os.environ.get(name)
    if value is None:
        # EnvironmentError is an alias of OSError, kept for backward compatibility
        raise EnvironmentError(f"Required environment variable '{name}' is not set")
    return value


# Set a variable, then require it
os.environ["API_KEY"] = "sk-test-12345"
key: str = require_env("API_KEY")
print(f"API_KEY: {key}")

# Missing variable raises an error
try:
    require_env("MISSING_SECRET")
except EnvironmentError as e:
    print(f"Error: {e}")

# Clean up
del os.environ["API_KEY"]
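
The cells above set a variable and then delete it by hand, which leaves the variable behind if an exception occurs in between. One alternative, sketched here, is `unittest.mock.patch.dict` from the standard library, which restores `os.environ` when the block exits. The variable name `DEMO_API_KEY` is an assumption chosen for the demo.

```python
import os
from unittest.mock import patch

VAR = "DEMO_API_KEY"  # Hypothetical variable name used only for this demo
before = os.environ.get(VAR)

# patch.dict applies the overrides on entry and restores the mapping on exit,
# even if the body raises
with patch.dict(os.environ, {VAR: "sk-test-12345"}):
    inside = os.environ.get(VAR)
    print(f"Inside the block: {inside}")

after = os.environ.get(VAR)
print(f"After the block:  {after}")
```

This pattern is mainly useful in tests and notebooks; application code normally only reads the environment.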

## Section 2: sys.argv - Raw Command-Line Arguments

`sys.argv` is a plain list of strings representing the command-line arguments. `sys.argv[0]` is the script name, and `sys.argv[1:]` are the user-provided arguments. While `argparse` is preferred for real CLI tools, `sys.argv` is useful for understanding how argument parsing works underneath.

In [None]:
import sys

# sys.argv is always a list with at least one element
print(f"Type: {type(sys.argv).__name__}")
print(f"Length: {len(sys.argv)}")
print(f"Script name (argv[0]): {sys.argv[0]}")
print(f"Is list: {isinstance(sys.argv, list)}")

In [None]:
# Simulating manual argument parsing (for educational purposes)
# In real code, always use argparse instead

ParsedArgs = dict[str, str | bool | list[str]]


def manual_parse(argv: list[str]) -> ParsedArgs:
    """Parse arguments manually from a list of strings."""
    result: ParsedArgs = {}
    i: int = 0
    while i < len(argv):
        arg: str = argv[i]
        if arg.startswith("--"):
            key: str = arg[2:].replace("-", "_")
            # Check if next arg is a value or another flag.
            # Caveat: a positional right after a boolean flag is taken as its value.
            if i + 1 < len(argv) and not argv[i + 1].startswith("--"):
                result[key] = argv[i + 1]
                i += 2
            else:
                result[key] = True
                i += 1
        else:
            # Collect everything that is not a --flag under "positional"
            positionals = result.setdefault("positional", [])
            assert isinstance(positionals, list)
            positionals.append(arg)
            i += 1
    return result


# Simulate parsing
fake_argv: list[str] = ["app.py", "input.txt", "--verbose", "--output", "result.csv"]
parsed: ParsedArgs = manual_parse(fake_argv[1:])  # Skip script name

print(f"Input: {fake_argv}")
print(f"Parsed: {parsed}")
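
For comparison, here is a sketch of the same command line handled by `argparse`. The argument names mirror the simulated `fake_argv` above; the `help` texts are made up. Because `--verbose` is declared as a flag that takes no value, a positional that follows it is not swallowed, which is one of the ambiguities the manual parser cannot resolve.

```python
import argparse

parser = argparse.ArgumentParser(prog="app.py")
parser.add_argument("input", help="Input file")
parser.add_argument("--verbose", action="store_true", help="Enable verbose output")
parser.add_argument("--output", default=None, help="Output file")

# Same arguments as the simulated command line (script name already removed)
args = parser.parse_args(["input.txt", "--verbose", "--output", "result.csv"])
print(f"Parsed: {vars(args)}")

# The flag is declared as valueless, so the positional is still recognised
args_reordered = parser.parse_args(["--verbose", "input.txt"])
print(f"Reordered: {vars(args_reordered)}")
```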

## Section 3: Combining Environment Variables with argparse

A common pattern is to use environment variables as fallback defaults for CLI arguments. This allows configuration via both the command line and the environment.

In [None]:
import argparse


def build_env_aware_parser() -> argparse.ArgumentParser:
    """Build a parser that uses environment variables as defaults."""
    parser = argparse.ArgumentParser(prog="webapp")

    # CLI args override env vars, which override hardcoded defaults
    parser.add_argument(
        "--host",
        default=os.environ.get("APP_HOST", "127.0.0.1"),
        help="Server host (env: APP_HOST, default: %(default)s)",
    )
    parser.add_argument(
        "--port",
        type=int,
        default=int(os.environ.get("APP_PORT", "8000")),
        help="Server port (env: APP_PORT, default: %(default)s)",
    )
    parser.add_argument(
        "--debug",
        action="store_true",
        default=os.environ.get("APP_DEBUG", "").lower() in ("1", "true", "yes"),
        help="Enable debug mode (env: APP_DEBUG)",
    )

    return parser


# Without env vars: uses hardcoded defaults
parser = build_env_aware_parser()
args = parser.parse_args([])
print(f"Without env vars: host={args.host}, port={args.port}, debug={args.debug}")

# With env vars: uses environment as defaults
os.environ["APP_HOST"] = "0.0.0.0"
os.environ["APP_PORT"] = "3000"
os.environ["APP_DEBUG"] = "true"

parser = build_env_aware_parser()
args = parser.parse_args([])
print(f"With env vars:    host={args.host}, port={args.port}, debug={args.debug}")

# CLI args still override environment
args = parser.parse_args(["--host", "localhost", "--port", "9000"])
print(f"CLI override:     host={args.host}, port={args.port}, debug={args.debug}")

# Clean up
del os.environ["APP_HOST"]
del os.environ["APP_PORT"]
del os.environ["APP_DEBUG"]
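
One detail of this pattern: calling `int(...)` on the environment value while building the parser raises a bare `ValueError` if the variable is malformed. A possible refinement, sketched below, is to pass the raw string as the default and let `argparse` apply `type=` to it, so a bad value is reported like any other usage error. The function name `build_port_parser` is hypothetical.

```python
import argparse
import os


def build_port_parser() -> argparse.ArgumentParser:
    """Build a parser whose --port default is the raw env string (sketch)."""
    parser = argparse.ArgumentParser(prog="webapp")
    parser.add_argument(
        "--port",
        type=int,
        # Left as a string on purpose: argparse applies type= to string defaults
        default=os.environ.get("APP_PORT", "8000"),
        help="Server port (env: APP_PORT, default: %(default)s)",
    )
    return parser


os.environ["APP_PORT"] = "not-a-number"
try:
    build_port_parser().parse_args([])
except SystemExit as e:
    # argparse prints a usage message to stderr before exiting
    print(f"argparse exited with code {e.code}")
finally:
    del os.environ["APP_PORT"]
```

The conversion only runs when the default is actually used, so an explicit `--port` on the command line bypasses a malformed `APP_PORT`.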

## Section 4: Config File Integration

Many CLI tools support configuration files (TOML, JSON, INI). A robust tool layers configuration with this priority:
1. Command-line arguments (highest priority)
2. Environment variables
3. Config file values
4. Hardcoded defaults (lowest priority)

In [None]:
import json
import tempfile
from pathlib import Path


def load_config_file(path: Path) -> dict[str, object]:
    """Load configuration from a JSON file."""
    if not path.exists():
        return {}
    with path.open() as f:
        config: dict[str, object] = json.load(f)
    return config


def merge_config(
    defaults: dict[str, object],
    file_config: dict[str, object],
    env_config: dict[str, object],
    cli_args: dict[str, object],
) -> dict[str, object]:
    """Merge configuration sources with proper priority."""
    merged: dict[str, object] = {}
    merged.update(defaults)
    merged.update({k: v for k, v in file_config.items() if v is not None})
    merged.update({k: v for k, v in env_config.items() if v is not None})
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged


# Create a temporary config file
config_data: dict[str, object] = {
    "host": "config-host.example.com",
    "port": 5000,
    "debug": False,
    "log_level": "warning",
}

with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    json.dump(config_data, f)
    config_path: Path = Path(f.name)

# Load and display
file_config: dict[str, object] = load_config_file(config_path)
print(f"Config file: {config_path}")
print(f"File config: {json.dumps(file_config, indent=2)}")

# Clean up temp file
config_path.unlink()
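
The loader above handles JSON only. As a sketch of how another of the formats mentioned could plug into the same layering, here is an INI loader built on the standard library `configparser`. The section name `server`, the function name `load_ini_config`, and the file contents are assumptions for illustration.

```python
import configparser
import tempfile
from pathlib import Path


def load_ini_config(path: Path, section: str = "server") -> dict[str, object]:
    """Load typed values from one INI section; missing keys are omitted."""
    parser = configparser.ConfigParser()
    # read() skips files it cannot open and returns the list it did parse
    if not parser.read(path) or not parser.has_section(section):
        return {}
    values: dict[str, object] = {
        "host": parser.get(section, "host", fallback=None),
        "port": parser.getint(section, "port", fallback=None),
        "debug": parser.getboolean(section, "debug", fallback=None),
    }
    return {k: v for k, v in values.items() if v is not None}


with tempfile.TemporaryDirectory() as tmp:
    ini_path = Path(tmp) / "app.ini"
    ini_path.write_text(
        "[server]\n"
        "host = config-host.example.com\n"
        "port = 5000\n"
        "debug = no\n"
    )
    ini_config = load_ini_config(ini_path)
    missing_config = load_ini_config(Path(tmp) / "absent.ini")

print(f"INI config: {ini_config}")
print(f"Missing file: {missing_config}")
```

Unlike JSON, INI values are untyped text, so the loader has to decide which keys are integers or booleans.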

In [None]:
# Demonstrate the full configuration layering
defaults: dict[str, object] = {
    "host": "127.0.0.1",
    "port": 8000,
    "debug": False,
    "log_level": "info",
}

file_config = {
    "host": "config-host.example.com",
    "port": 5000,
}

env_config: dict[str, object] = {
    "port": 3000,  # Overrides file config
}

cli_args_config: dict[str, object] = {
    "debug": True,  # Overrides everything
}

final: dict[str, object] = merge_config(defaults, file_config, env_config, cli_args_config)

print("Configuration layering:")
print(f"  Defaults:    {defaults}")
print(f"  File config: {file_config}")
print(f"  Env config:  {env_config}")
print(f"  CLI args:    {cli_args_config}")
print(f"  Final:       {final}")
print()
print(f"  host comes from file_config: {final['host']}")
print(f"  port comes from env_config:  {final['port']}")
print(f"  debug comes from cli_args:   {final['debug']}")
print(f"  log_level comes from defaults: {final['log_level']}")
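
The same priority order can also be expressed with `collections.ChainMap`, which searches its maps from left to right. This is an alternative sketch rather than a replacement for `merge_config`; the helper names `without_none` and `source_of` are hypothetical. One side benefit is that the individual layers remain available, so it is easy to report where a value came from.

```python
from collections import ChainMap


def without_none(layer: dict[str, object]) -> dict[str, object]:
    """Drop unset (None) entries so they do not shadow lower-priority layers."""
    return {k: v for k, v in layer.items() if v is not None}


defaults = {"host": "127.0.0.1", "port": 8000, "debug": False, "log_level": "info"}
file_config = {"host": "config-host.example.com", "port": 5000}
env_config = {"port": 3000}
cli_args = {"debug": True, "host": None}  # host was not given on the command line

# Highest priority first: the first map containing a key wins
layer_names = ["cli", "env", "file", "defaults"]
layered = ChainMap(
    without_none(cli_args),
    without_none(env_config),
    without_none(file_config),
    defaults,
)


def source_of(key: str) -> str:
    """Return the name of the layer that supplies the effective value."""
    for name, layer in zip(layer_names, layered.maps):
        if key in layer:
            return name
    raise KeyError(key)


final = dict(layered)
for key in sorted(final):
    print(f"  {key} = {final[key]!r} (from {source_of(key)})")
```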

## Section 5: Exit Codes

Exit codes communicate a program's result to the calling process (shell, CI pipeline, etc.).
- `0` means success
- Non-zero means failure (by convention, `1` is a general error and `2` is a usage error)

`sys.exit()` raises a `SystemExit` exception, which can be caught for testing.

In [None]:
import sys

# sys.exit raises SystemExit, which can be caught
try:
    sys.exit(0)
except SystemExit as e:
    print(f"Exit code: {e.code} (success)")

try:
    sys.exit(1)
except SystemExit as e:
    print(f"Exit code: {e.code} (general error)")

try:
    sys.exit(2)
except SystemExit as e:
    print(f"Exit code: {e.code} (usage error)")

In [None]:
# sys.exit can also accept a string message; if uncaught, it is printed to stderr and the exit code is 1
try:
    sys.exit("Fatal error: configuration file not found")
except SystemExit as e:
    print(f"Exit message: {e.code}")
    print(f"Type of code: {type(e.code).__name__}")
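
Catching `SystemExit` shows what the program requested, but not what the calling process sees. The following sketch launches small child interpreters with `subprocess.run` and inspects `returncode`, which is how a shell or CI pipeline would observe the result. The inline `-c` scripts are made up for the demo.

```python
import subprocess
import sys

# Each child is a fresh Python process that exits with a specific code
ok = subprocess.run([sys.executable, "-c", "import sys; sys.exit(0)"])
custom = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
message = subprocess.run(
    [sys.executable, "-c", "import sys; sys.exit('Fatal error: config missing')"],
    capture_output=True,
    text=True,
)

print(f"Exit codes: {ok.returncode}, {custom.returncode}, {message.returncode}")
print(f"Message on stderr: {message.stderr.strip()}")

# check=True turns a non-zero exit code into an exception in the parent
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
except subprocess.CalledProcessError as e:
    checked_code = e.returncode
    print(f"CalledProcessError with return code {checked_code}")
```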

In [None]:
# Common pattern: a main function that returns an exit code
from enum import IntEnum


class ExitCode(IntEnum):
    """Standard exit codes for the application."""
    SUCCESS = 0
    GENERAL_ERROR = 1
    USAGE_ERROR = 2
    CONFIG_ERROR = 3
    IO_ERROR = 4


def main(args: list[str]) -> ExitCode:
    """Application entry point that returns an exit code."""
    if not args:
        print("Error: no arguments provided", file=sys.stderr)
        return ExitCode.USAGE_ERROR

    filename: str = args[0]
    if not filename.endswith(".txt"):
        print(f"Error: unsupported file type: {filename}", file=sys.stderr)
        return ExitCode.GENERAL_ERROR

    print(f"Processing {filename}...")
    return ExitCode.SUCCESS


# Simulate different invocations
for test_args in [[], ["data.csv"], ["report.txt"]]:
    code: ExitCode = main(test_args)
    print(f"  args={test_args} -> exit code {code} ({code.name})\n")
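
To turn the returned value into a real process exit code, one common arrangement is a thin wrapper that is the only place `sys.exit()` is called. The sketch below uses a simplified `main` and a hypothetical `entry_point` function; in a script, the wrapper would be invoked under the usual `if __name__ == "__main__":` guard.

```python
import sys


def main(argv: list[str]) -> int:
    """Simplified stand-in for the main() above: returns an exit code."""
    if not argv:
        print("Error: no arguments provided", file=sys.stderr)
        return 2
    print(f"Processing {argv[0]}...")
    return 0


def entry_point() -> None:
    """Thin wrapper: the only place that calls sys.exit()."""
    sys.exit(main(sys.argv[1:]))


# In a script, the last lines of the file would be:
#     if __name__ == "__main__":
#         entry_point()

# Tests call main() directly and check the return value
assert main([]) == 2
assert main(["report.txt"]) == 0
```

Keeping `sys.exit()` out of `main` means tests never have to catch `SystemExit`.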

## Section 6: Putting It All Together

The following example combines argparse, environment variables, config file loading, and exit codes in a single CLI tool.

In [None]:
import argparse
import json
import os
import sys
import tempfile
from pathlib import Path


def build_full_parser() -> argparse.ArgumentParser:
    """Build a parser for a complete CLI tool."""
    parser = argparse.ArgumentParser(
        prog="appctl",
        description="Application control tool with layered configuration",
    )
    parser.add_argument(
        "--config",
        type=Path,
        default=Path(os.environ.get("APPCTL_CONFIG", "config.json")),
        help="Config file path (env: APPCTL_CONFIG)",
    )
    parser.add_argument(
        "--host",
        default=os.environ.get("APPCTL_HOST"),
        help="Server host (env: APPCTL_HOST)",
    )
    parser.add_argument(
        "--port",
        type=int,
        default=int(os.environ.get("APPCTL_PORT", "0")) or None,
        help="Server port (env: APPCTL_PORT)",
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Enable verbose output",
    )
    return parser


def run_app(argv: list[str] | None = None) -> int:
    """Run the application with full configuration layering."""
    parser: argparse.ArgumentParser = build_full_parser()
    args: argparse.Namespace = parser.parse_args(argv)

    # Start with defaults
    config: dict[str, object] = {
        "host": "127.0.0.1",
        "port": 8000,
        "verbose": False,
    }

    # Layer file config
    if args.config.exists():
        with args.config.open() as f:
            file_cfg: dict[str, object] = json.load(f)
        config.update(file_cfg)
        if args.verbose:
            print(f"Loaded config from {args.config}")

    # Layer CLI args and their env var fallbacks (None means neither was supplied)
    if args.host is not None:
        config["host"] = args.host
    if args.port is not None:
        config["port"] = args.port
    if args.verbose:
        config["verbose"] = True

    print(f"Final config: {json.dumps(config, indent=2, default=str)}")
    return 0


# Create a temp config file for demonstration
demo_config: dict[str, object] = {"host": "0.0.0.0", "port": 5000}
with tempfile.NamedTemporaryFile(
    mode="w", suffix=".json", delete=False
) as f:
    json.dump(demo_config, f)
    tmp_config_path: str = f.name

# Run with config file and CLI override
print("--- With config file and CLI port override ---")
exit_code: int = run_app([
    "--config", tmp_config_path,
    "--port", "9090",
    "--verbose",
])
print(f"Exit code: {exit_code}")

# Clean up
Path(tmp_config_path).unlink()

In [None]:
# Run with only defaults (assumes no config.json in the working directory, no APPCTL_* env vars, no CLI args)
print("--- With only defaults ---")
exit_code = run_app([])
print(f"Exit code: {exit_code}")
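
`run_app` always returns `0`, and a malformed config file would surface as an uncaught exception. The sketch below shows one way such failures could be mapped to a dedicated exit code. The function name `run_with_config` is hypothetical, and the value `3` is chosen to match `ExitCode.CONFIG_ERROR` from Section 5.

```python
import json
import sys
import tempfile
from pathlib import Path

EXIT_SUCCESS = 0
EXIT_CONFIG_ERROR = 3  # Assumed value, matching ExitCode.CONFIG_ERROR in Section 5


def run_with_config(config_path: Path) -> int:
    """Load an optional JSON config and report problems via the exit code."""
    config: dict[str, object] = {"host": "127.0.0.1", "port": 8000}
    if config_path.exists():
        try:
            with config_path.open() as f:
                loaded = json.load(f)
        except (OSError, json.JSONDecodeError) as e:
            print(f"Error: cannot read {config_path}: {e}", file=sys.stderr)
            return EXIT_CONFIG_ERROR
        if not isinstance(loaded, dict):
            print(f"Error: {config_path} must contain a JSON object", file=sys.stderr)
            return EXIT_CONFIG_ERROR
        config.update(loaded)
    print(f"Final config: {json.dumps(config)}")
    return EXIT_SUCCESS


with tempfile.TemporaryDirectory() as tmp:
    broken = Path(tmp) / "broken.json"
    broken.write_text("{not valid json")
    broken_code = run_with_config(broken)
    missing_code = run_with_config(Path(tmp) / "absent.json")

print(f"Broken file -> exit code {broken_code}")
print(f"Missing file -> exit code {missing_code}")
```

Treating a missing file as success and a malformed file as an error is a design choice; a tool that requires its config file would return an error in both cases.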

## Section 7: Best Practices for CLI Configuration

When building production CLI tools, validate the merged configuration once, at startup, so that bad values fail fast with a clear message instead of surfacing later at runtime.

In [None]:
# Best practice: use a dataclass to hold validated configuration
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Validated, immutable application configuration."""
    host: str
    port: int
    debug: bool
    log_level: str

    def __post_init__(self) -> None:
        """Validate configuration values."""
        if not 1 <= self.port <= 65535:
            raise ValueError(f"Port must be 1-65535, got {self.port}")
        valid_levels: set[str] = {"debug", "info", "warning", "error", "critical"}
        if self.log_level not in valid_levels:
            raise ValueError(f"Invalid log level: {self.log_level}")


# Valid configuration
config = AppConfig(host="0.0.0.0", port=8080, debug=True, log_level="debug")
print(f"Valid config: {config}")

# Invalid port is caught immediately
try:
    bad_config = AppConfig(host="localhost", port=99999, debug=False, log_level="info")
except ValueError as e:
    print(f"Validation error: {e}")

# Invalid log level
try:
    bad_config = AppConfig(host="localhost", port=8080, debug=False, log_level="verbose")
except ValueError as e:
    print(f"Validation error: {e}")
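
To connect the dataclass to the layering from Section 4, the merged dictionary has to be coerced into typed fields. The sketch below repeats a compact `AppConfig` so that it runs on its own. The names `config_from_mapping` and `validate_config` and the exit code `3` are assumptions for illustration.

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    """Compact copy of the validated configuration class above."""
    host: str
    port: int
    debug: bool
    log_level: str

    def __post_init__(self) -> None:
        if not 1 <= self.port <= 65535:
            raise ValueError(f"Port must be 1-65535, got {self.port}")
        if self.log_level not in {"debug", "info", "warning", "error", "critical"}:
            raise ValueError(f"Invalid log level: {self.log_level}")


def config_from_mapping(merged: dict[str, object]) -> AppConfig:
    """Coerce a merged config dict into AppConfig (raises KeyError or ValueError)."""
    debug = merged["debug"]
    if not isinstance(debug, bool):
        # bool("false") would be True, so refuse anything that is not already a bool
        raise ValueError(f"debug must be a bool, got {debug!r}")
    return AppConfig(
        host=str(merged["host"]),
        port=int(str(merged["port"])),  # Accepts 3000 as well as "3000"
        debug=debug,
        log_level=str(merged["log_level"]).lower(),
    )


def validate_config(merged: dict[str, object]) -> int:
    """Return 0 if the merged config is valid, 3 (assumed code) otherwise."""
    try:
        app_config = config_from_mapping(merged)
    except (KeyError, ValueError) as e:
        print(f"Configuration error: {e}", file=sys.stderr)
        return 3
    print(f"Validated: {app_config}")
    return 0


merged = {"host": "0.0.0.0", "port": "3000", "debug": True, "log_level": "INFO"}
print(f"Exit code: {validate_config(merged)}")
print(f"Exit code: {validate_config({**merged, 'port': 99999})}")
```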

## Summary

### os.environ
- **`os.environ[key]`** reads a variable (raises `KeyError` if missing)
- **`os.environ.get(key, default)`** reads safely with a fallback
- Environment variables are always strings; cast them explicitly

### sys.argv
- **`sys.argv`** is a `list[str]` with the script name at index 0
- Use `argparse` instead of parsing `sys.argv` manually

### Config File Integration
- Layer configuration with this priority: **CLI args > env vars > config file > defaults**
- Use `argparse` defaults that read from `os.environ.get(...)` for env var integration
- Validate final configuration with a dataclass or similar structure

### Exit Codes
- **`sys.exit(0)`** signals success; non-zero signals failure
- **`sys.exit()`** raises `SystemExit`, which can be caught in tests
- Define exit codes as an `IntEnum` for clarity and consistency
- Return exit codes from `main()` instead of calling `sys.exit()` directly for testability