# Chapter 33: Subprocess

This notebook covers Python's `subprocess` module for running external commands from within Python. You will learn how to execute commands, capture their output, handle errors, and work with pipes.

## Key Concepts
- **`subprocess.run()`**: The recommended high-level function for running commands
- **`capture_output`** / **`text`**: Capturing stdout and stderr as strings
- **`check=True`** / **`check_returncode()`**: Raising exceptions on command failure
- **`subprocess.Popen`**: Low-level process management with direct pipe control
- **Piping**: Connecting the output of one process to the input of another
- **Timeouts**: Preventing commands from running indefinitely

## Section 1: Running Commands with `subprocess.run()`

`subprocess.run()` is the primary way to run external commands. It waits for the command to complete and returns a `CompletedProcess` object.

In [None]:
import subprocess

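# Run a simple command. `echo` is a standalone program on Unix-like systems;
# on Windows it is a shell builtin, so this call fails there unless an echo.exe is on PATH.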
result: subprocess.CompletedProcess[str] = subprocess.run(
    ["echo", "hello"],
    capture_output=True,
    text=True,
)

print(f"Return code: {result.returncode}")
print(f"Stdout:      {result.stdout.strip()!r}")
print(f"Stderr:      {result.stderr.strip()!r}")
print(f"Success:     {result.returncode == 0}")

In [None]:
import subprocess
import sys

# args is the command as a list of strings.
# Each element is passed as exactly one argument, so no shell splitting or quoting is needed.
# sys.executable is the running interpreter, which avoids relying on "python3" being on PATH.
result: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", "print('Hello from Python subprocess!')"],
    capture_output=True,
    text=True,
)

print(f"Output: {result.stdout.strip()}")
print(f"Args used: {result.args}")
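
To make the "one list element, one argument" point concrete, here is a minimal sketch. The child script and the variable names (`child_script`, `single`, `split`) are illustrative and not part of the original notebook. It also shows `shlex.split()`, which can turn a shell-style string into such a list when that is all you have.

```python
import shlex
import subprocess
import sys

# Child script that reports how many arguments it received and what they were.
child_script = "import sys; print(len(sys.argv) - 1, sys.argv[1:])"

# One list element stays one argument, even though it contains a space.
single = subprocess.run(
    [sys.executable, "-c", child_script, "two words"],
    capture_output=True,
    text=True,
)

# shlex.split() parses a shell-style string into a list; quotes group words.
extra_args = shlex.split("'two words' third")
split = subprocess.run(
    [sys.executable, "-c", child_script, *extra_args],
    capture_output=True,
    text=True,
)

print(single.stdout.strip())
print(split.stdout.strip())
```

The first call reports a single argument and the second reports two, because the quoted phrase survives `shlex.split()` as one element.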

## Section 2: Capturing Output

Use `capture_output=True` (shorthand for `stdout=subprocess.PIPE, stderr=subprocess.PIPE`) and `text=True` to get string output instead of bytes.

In [None]:
import subprocess
import sys

# capture_output=True captures both stdout and stderr
result: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", "print('standard output')\nimport sys; sys.stderr.write('standard error\\n')"],
    capture_output=True,
    text=True,
)

print(f"stdout: {result.stdout.strip()!r}")
print(f"stderr: {result.stderr.strip()!r}")

In [None]:
import subprocess
import sys

# Without text=True, output is bytes
result_bytes: subprocess.CompletedProcess[bytes] = subprocess.run(
    [sys.executable, "-c", "print('bytes output')"],
    capture_output=True,
)

print(f"Type: {type(result_bytes.stdout)}")
print(f"Raw:  {result_bytes.stdout}")

# With text=True, output is str
result_text: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", "print('text output')"],
    capture_output=True,
    text=True,
)

print(f"\nType: {type(result_text.stdout)}")
print(f"Text: {result_text.stdout.strip()!r}")
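
The shorthand described above can also be spelled out. The sketch below (variable names are illustrative) uses the explicit `PIPE` form, and then `stderr=subprocess.STDOUT` to merge both streams into one, which `capture_output=True` cannot express because it may not be combined with explicit `stdout` or `stderr` arguments.

```python
import subprocess
import sys

child_script = "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')"

# Explicit form of capture_output=True: one pipe per stream.
separate = subprocess.run(
    [sys.executable, "-c", child_script],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

# stderr=subprocess.STDOUT sends the child's stderr into the stdout pipe.
merged = subprocess.run(
    [sys.executable, "-c", child_script],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
)

print(f"separate stdout: {separate.stdout!r}")
print(f"separate stderr: {separate.stderr!r}")
print(f"merged stdout:   {merged.stdout!r}")
print(f"merged stderr:   {merged.stderr!r}")  # None: nothing was captured separately
```

The relative order of the two messages in the merged stream depends on how the child buffers its output, so do not rely on it.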

## Section 3: Error Handling and Return Codes

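Commands indicate success or failure via their return code. By convention, `0` means success and any non-zero value means failure. On POSIX systems, a negative `returncode` of `-N` means the child was terminated by signal `N`. You can check the code manually or let Python raise an exception.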

In [None]:
import subprocess
import sys

# A command that exits with a non-zero code
result: subprocess.CompletedProcess[bytes] = subprocess.run(
    [sys.executable, "-c", "raise SystemExit(1)"],
    capture_output=True,
)

print(f"Return code: {result.returncode}")
print(f"Success: {result.returncode == 0}")

# check_returncode() raises CalledProcessError on non-zero exit
try:
    result.check_returncode()
    print("No error raised")
except subprocess.CalledProcessError as e:
    print("\nCalledProcessError raised!")
    print(f"  Command:     {e.cmd}")
    print(f"  Return code: {e.returncode}")

In [None]:
import subprocess
import sys

# check=True automatically raises CalledProcessError on non-zero exit
try:
    subprocess.run(
        [sys.executable, "-c", "raise SystemExit(42)"],
        capture_output=True,
        text=True,
        check=True,
    )
except subprocess.CalledProcessError as e:
    print(f"Command failed with return code {e.returncode}")
    print(f"Stderr: {e.stderr.strip()!r}")

# Successful command with check=True works normally
result: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", "print('all good')"],
    capture_output=True,
    text=True,
    check=True,
)
print(f"\nSuccess: {result.stdout.strip()!r}")
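
When output is captured, the `CalledProcessError` raised by `check=True` carries that output with it. The following sketch (the child script and the `failure` variable are illustrative) shows which attributes are available for logging or diagnostics.

```python
import subprocess
import sys

# Child writes to both streams, then exits with code 3.
child_script = (
    "import sys; print('partial result'); "
    "sys.stderr.write('something broke\\n'); sys.exit(3)"
)

failure = None
try:
    subprocess.run(
        [sys.executable, "-c", child_script],
        capture_output=True,
        text=True,
        check=True,
    )
except subprocess.CalledProcessError as exc:
    failure = exc  # keep a reference; `exc` is unbound after the except block

if failure is not None:
    print(f"returncode: {failure.returncode}")
    print(f"stdout:     {failure.stdout.strip()!r}")
    print(f"stderr:     {failure.stderr.strip()!r}")
    print(f"output is stdout: {failure.output == failure.stdout}")
```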

## Section 4: Sending Input to a Process

The `input` parameter lets you send data to a command's stdin.

In [None]:
import subprocess
import sys

# Send input to a Python script via stdin
script: str = "import sys; data = sys.stdin.read(); print(f'Received: {data.strip()}')"

result: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", script],
    input="hello from parent process",
    capture_output=True,
    text=True,
)

print(f"Output: {result.stdout.strip()}")

In [None]:
import subprocess
import sys

# Process multi-line input
script: str = """
import sys
lines = sys.stdin.readlines()
for i, line in enumerate(lines, 1):
    print(f"Line {i}: {line.strip()}")
"""

input_data: str = "apple\nbanana\ncherry"

result: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", script],
    input=input_data,
    capture_output=True,
    text=True,
)

print(result.stdout.strip())
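
The examples above send text. Without `text=True`, `input` must be a bytes object, which is useful for binary data. A minimal sketch, with an illustrative child script that reverses whatever bytes it receives:

```python
import subprocess
import sys

# Child reads raw bytes from stdin and writes them back reversed.
child_script = (
    "import sys; data = sys.stdin.buffer.read(); "
    "sys.stdout.buffer.write(data[::-1])"
)

result = subprocess.run(
    [sys.executable, "-c", child_script],
    input=b"\x00\x01\x02abc",  # bytes, because text=True is not set
    capture_output=True,
)

print(result.stdout)
```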

## Section 5: Timeouts

The `timeout` parameter (in seconds) prevents commands from running indefinitely. If the command runs longer than that, `run()` kills the child process, waits for it to exit, and then raises `TimeoutExpired`.

In [None]:
import subprocess
import sys

# A command that completes quickly
result: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", "print('fast')"],
    capture_output=True,
    text=True,
    timeout=5,
)
print(f"Fast command: {result.stdout.strip()!r}")

# A command that takes too long
try:
    subprocess.run(
        [sys.executable, "-c", "import time; time.sleep(10)"],
        capture_output=True,
        timeout=1,
    )
except subprocess.TimeoutExpired as e:
    print(f"\nTimeoutExpired after {e.timeout}s")
    print(f"Command: {e.cmd}")

## Section 6: Low-Level Process Control with `Popen`

`subprocess.Popen` provides direct control over the process. Unlike `run()`, it does not wait for the process to finish -- you manage the lifecycle yourself.

In [None]:
import subprocess
import sys

# Start a process without waiting
proc: subprocess.Popen[str] = subprocess.Popen(
    [sys.executable, "-c", "print('from popen')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

# Wait for it and get output
stdout, stderr = proc.communicate()

print(f"PID:         {proc.pid}")
print(f"Return code: {proc.returncode}")
print(f"Stdout:      {stdout.strip()!r}")
print(f"Stderr:      {stderr.strip()!r}")

In [None]:
import subprocess
import sys

# Popen with input via communicate()
proc: subprocess.Popen[str] = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

stdout, stderr = proc.communicate(input="hello world")

print("Input:  'hello world'")
print(f"Output: {stdout.strip()!r}")
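
Managing the lifecycle yourself usually involves `poll()` and `wait()`. The sketch below (variable names are illustrative) starts a short-lived child, checks on it without blocking, and then waits for it to exit.

```python
import subprocess
import sys

# Start a child that stays alive for about half a second.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])

# poll() never blocks: it returns None while the process is still running.
status_while_running = proc.poll()

# wait() blocks until the process exits (or the timeout elapses) and returns the exit code.
exit_code = proc.wait(timeout=10)

# After exit, poll() returns the same exit code.
status_after_exit = proc.poll()

print(f"While running: {status_while_running}")
print(f"Exit code:     {exit_code}")
print(f"After exit:    {status_after_exit}")
```

Prefer `communicate()` over `wait()` when the child's stdout or stderr is a pipe, since a child that fills an unread pipe would otherwise block forever.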

## Section 7: Piping Between Processes

You can connect the stdout of one process to the stdin of another, similar to shell pipes (`cmd1 | cmd2`). Use `Popen` for this pattern.

In [None]:
import subprocess
import sys

# Simulate: echo "hello world" | python -c "upper()"
# Process 1: generates output
producer: subprocess.Popen[str] = subprocess.Popen(
    [sys.executable, "-c", "print('hello world')"],
    stdout=subprocess.PIPE,
    text=True,
)

# Process 2: reads from process 1's stdout
consumer: subprocess.Popen[str] = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().strip().upper())"],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
    text=True,
)

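# Close the parent's copy of the pipe so the producer can receive SIGPIPE
# if the consumer exits first
if producer.stdout:
    producer.stdout.close()

output, _ = consumer.communicate()
producer.wait()  # reap the producer so it does not linger as a zombie process
print(f"Piped result: {output.strip()!r}")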

In [None]:
import subprocess
import sys

# For simple piping, subprocess.run with input is often cleaner
# Step 1: generate data
step1: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", "print('line one\\nline two\\nline three')"],
    capture_output=True,
    text=True,
)

# Step 2: process the data from step 1
step2: subprocess.CompletedProcess[str] = subprocess.run(
    [sys.executable, "-c", "import sys; lines = sys.stdin.readlines(); print(f'Got {len(lines)} lines')"],
    input=step1.stdout,
    capture_output=True,
    text=True,
)

print(f"Step 1 output: {step1.stdout.strip()!r}")
print(f"Step 2 output: {step2.stdout.strip()!r}")
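
The two-step pattern generalizes to any number of stages. The helper below, `run_pipeline`, is a hypothetical name introduced here for illustration. It is a sketch, assuming every stage reads text from stdin and writes text to stdout.

```python
import subprocess
import sys


def run_pipeline(commands: list[list[str]], data: str = "") -> str:
    """Run commands in order, feeding each one's stdout to the next one's stdin."""
    for args in commands:
        result = subprocess.run(
            args,
            input=data,
            capture_output=True,
            text=True,
            check=True,  # stop the chain at the first failing stage
        )
        data = result.stdout
    return data


upper = [sys.executable, "-c", "import sys; print(sys.stdin.read().strip().upper())"]
reverse = [sys.executable, "-c", "import sys; print(sys.stdin.read().strip()[::-1])"]

output = run_pipeline([upper, reverse], data="hello")
print(output.strip())
```

Unlike a `Popen` pipe, the stages here run one after another and each stage's full output is held in memory, so this suits small data rather than long-running or streaming commands.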

## Section 8: Practical Patterns

Common patterns for working with subprocess in real applications.

In [None]:
import subprocess
import sys


def run_command(args: list[str], timeout: int = 30) -> tuple[bool, str, str]:
    """Run a command and return (success, stdout, stderr)."""
    try:
        result: subprocess.CompletedProcess[str] = subprocess.run(
            args,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return (result.returncode == 0, result.stdout, result.stderr)
    except subprocess.TimeoutExpired:
        return (False, "", f"Command timed out after {timeout}s")
    except FileNotFoundError:
        return (False, "", f"Command not found: {args[0]}")


# Test with a successful command
success, stdout, stderr = run_command([sys.executable, "--version"])
print(f"Success: {success}")
print(f"Output:  {stdout.strip() or stderr.strip()}")

# Test with a non-existent command
success, stdout, stderr = run_command(["nonexistent_command"])
print(f"\nSuccess: {success}")
print(f"Error:   {stderr}")

In [None]:
import subprocess
import sys


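def get_python_module_version(module_name: str) -> str | None:
    """Get the version of an installed Python module via subprocess.

    module_name is interpolated into the child's source code, so pass only
    trusted, valid module names.
    """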
    script: str = f"""
try:
    import {module_name}
    version = getattr({module_name}, '__version__', 'unknown')
    print(version)
except ImportError:
    print('NOT_INSTALLED')
"""
    result: subprocess.CompletedProcess[str] = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
    )
    output: str = result.stdout.strip()
    if output == "NOT_INSTALLED":
        return None
    return output


# Check some modules
modules: list[str] = ["os", "sys", "json", "subprocess"]
for mod in modules:
    version: str | None = get_python_module_version(mod)
    status: str = version if version else "not installed"
    print(f"{mod:>12}: {status}")

## Section 9: Shell Mode and Security

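Passing `shell=True` runs the command through the shell, enabling shell features like globbing and pipes. However, if any part of the command string comes from untrusted input, shell metacharacters in that input (such as `;`, `|`, or `$(...)`) are interpreted by the shell, which allows command injection.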

In [None]:
import subprocess

# shell=True allows shell syntax (use with caution!)
result: subprocess.CompletedProcess[str] = subprocess.run(
    "echo 'hello' | tr 'a-z' 'A-Z'",
    shell=True,
    capture_output=True,
    text=True,
)
print(f"Shell pipe result: {result.stdout.strip()!r}")

# PREFERRED: Avoid shell=True by using list args
# This is safer because arguments are not parsed by the shell
result_safe: subprocess.CompletedProcess[str] = subprocess.run(
    ["echo", "hello world"],
    capture_output=True,
    text=True,
)
print(f"List args result: {result_safe.stdout.strip()!r}")

print("\nRule of thumb: avoid shell=True unless you need shell features")
print("and you fully control the command string.")
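
To see why list arguments are the safer default, the sketch below passes a string containing shell metacharacters as a single list element. The `untrusted` value is made up for illustration. It also shows `shlex.quote()`, which escapes a value for a POSIX shell in cases where a shell command string cannot be avoided.

```python
import shlex
import subprocess
import sys

# Hypothetical untrusted value containing a shell command separator.
untrusted = "hello; echo INJECTED"

# As a list element, the value reaches the child as one literal argument.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
)
print(f"Child received: {result.stdout.strip()!r}")

# If a shell string is unavoidable, quote each untrusted piece first (POSIX shells only).
quoted = shlex.quote(untrusted)
print(f"Quoted for a POSIX shell: {quoted}")
```

No shell is involved in the first call, so the `;` has no special meaning and nothing extra is executed.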

## Summary

### `subprocess.run()` -- High-Level API
- **`subprocess.run(args, ...)`**: Run a command, wait for completion, return `CompletedProcess`
- **`capture_output=True`**: Capture stdout and stderr (equivalent to `stdout=PIPE, stderr=PIPE`)
- **`text=True`**: Decode output as strings instead of bytes
- **`input="..."`**: Send data to the command's stdin
- **`check=True`**: Raise `CalledProcessError` on non-zero exit code
- **`timeout=N`**: Raise `TimeoutExpired` after N seconds

### `CompletedProcess` Attributes
- **`.returncode`**: Exit code (`0` = success)
- **`.stdout`** / **`.stderr`**: Captured output
- **`.args`**: The command that was run
- **`.check_returncode()`**: Raises `CalledProcessError` if returncode is non-zero

### `subprocess.Popen` -- Low-Level API
- **`Popen(args, stdin, stdout, stderr)`**: Start a process without waiting
- **`.communicate(input=...)`**: Send input, wait for completion, return `(stdout, stderr)`
- **`.pid`**: Process ID
- Use `Popen` when you need to pipe between processes or manage process lifecycles

### Best Practices
- Prefer `subprocess.run()` over `Popen` for simple use cases
- Pass commands as **lists** (`["cmd", "arg"]`) rather than single strings, so that no shell parsing or quoting is involved
- Avoid `shell=True` unless you need shell features and fully control the input
- Use `check=True` or `check_returncode()` to catch failures early
- Set `timeout` to prevent runaway processes