# Command-Line Programs

**Teaching:** 30 min  
**Exercises:** 15 min

## Learning Objectives

- Use the values of command-line arguments in a program.
- Handle flags and files separately in a command-line program.
- Read data from standard input in a program so that it can be used in a pipeline.
- Create robust command-line interfaces with proper error handling.
- Understand the difference between running and importing Python scripts.

## Questions

- How can I write Python programs that will work like Unix command-line tools?
- How do I handle command-line arguments and flags?
- How can I make my programs work in pipelines?

---

The Jupyter Notebook and other interactive tools are great for prototyping code and exploring data, but sooner or later we will want to use our program in a pipeline or run it in a shell script to process thousands of data files.

In this lesson, we'll learn how to make our programs work like other Unix command-line tools. For example, we may want a program that reads a dataset and prints the average inflammation per patient.

## Setup

Let's start by importing the libraries we'll need:

In [None]:
import sys
import numpy as np
import argparse
import os
import glob

## Introduction to sys.argv

The `sys` library connects a Python program to the system it is running on. One of its most important features is `sys.argv`, which contains the command-line arguments that a program was run with.

Let's explore how `sys.argv` works:

In [None]:
# In a Jupyter notebook, sys.argv holds the arguments used to launch the kernel, not those of a script
print(f"sys.argv contains: {sys.argv}")
print(f"sys.argv[0] (script name): {sys.argv[0]}")
print(f"Number of arguments: {len(sys.argv)}")

In a real command-line script, `sys.argv` would contain:
- `sys.argv[0]`: The name of the script
- `sys.argv[1:]`: The command-line arguments

For example, if we ran:
```bash
python my_script.py --mean inflammation-01.csv inflammation-02.csv
```

Then `sys.argv` would be:
```python
['my_script.py', '--mean', 'inflammation-01.csv', 'inflammation-02.csv']
```
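Because `sys.argv` is an ordinary list of strings, it can be taken apart with normal list operations. The sketch below uses a hypothetical helper, `split_argv`, and a hard-coded list standing in for `sys.argv`, so it behaves the same in a notebook as in a script.

```python
def split_argv(argv):
    """Split an argv-style list into the script name and its arguments."""
    script, *arguments = argv
    return script, arguments


# A hard-coded list standing in for sys.argv
example = ['my_script.py', '--mean', 'inflammation-01.csv', 'inflammation-02.csv']
script, arguments = split_argv(example)
print(script)
print(arguments)
```

Every element is a string, so a numeric argument such as `'3'` has to be converted with `int()` or `float()` before it is used in arithmetic.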

## Creating Our First Command-Line Script

Let's create a simple command-line script to analyze inflammation data. First, we'll simulate what `sys.argv` would look like:

In [None]:
# Simulate command-line arguments for demonstration
def simulate_command_line(args):
    """Simulate sys.argv for demonstration purposes."""
    return ['inflammation_analyzer.py'] + args

# Basic version - always prints mean for a single file
def analyze_inflammation_v1(argv):
    """Version 1: Basic inflammation analyzer."""
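    script = argv[0]
    # Guard against a missing filename instead of failing with an IndexError
    if len(argv) < 2:
        print(f"Usage: python {script} filename")
        return
    filename = argv[1]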
    
    print(f"Script: {script}")
    print(f"Processing file: {filename}")
    
    try:
        data = np.loadtxt(f'../data/{filename}', delimiter=',')
        for row_mean in np.mean(data, axis=1):
            print(f"{row_mean:.3f}")
    except FileNotFoundError:
        print(f"Error: File {filename} not found")
    except Exception as e:
        print(f"Error processing {filename}: {e}")

# Test the basic version
test_args = simulate_command_line(['inflammation-01.csv'])
print("=== Testing Basic Version ===")
analyze_inflammation_v1(test_args)

## Handling Multiple Files

Now let's improve our script to handle multiple files:

In [None]:
def analyze_inflammation_v2(argv):
    """Version 2: Handle multiple files."""
    script = argv[0]
    filenames = argv[1:]  # All arguments after script name
    
    if not filenames:
        print("Error: No filenames provided")
        return
    
    print(f"Script: {script}")
    print(f"Processing {len(filenames)} file(s)")
    
    for filename in filenames:
        print(f"\n--- Processing {filename} ---")
        try:
            data = np.loadtxt(f'../data/{filename}', delimiter=',')
            means = np.mean(data, axis=1)
            print(f"Patients: {len(means)}")
            print(f"Average inflammation per patient (first 5):")
            for i, row_mean in enumerate(means[:5]):
                print(f"  Patient {i+1}: {row_mean:.3f}")
            if len(means) > 5:
                print(f"  ... and {len(means)-5} more patients")
        except FileNotFoundError:
            print(f"Error: File {filename} not found")
        except Exception as e:
            print(f"Error processing {filename}: {e}")

# Test with multiple files
test_args = simulate_command_line(['inflammation-01.csv', 'inflammation-02.csv'])
print("=== Testing Multiple Files Version ===")
analyze_inflammation_v2(test_args)

## Adding Command-Line Flags

Let's add support for different statistical operations using flags like `--min`, `--mean`, and `--max`:

In [None]:
def analyze_inflammation_v3(argv):
    """Version 3: Add support for different statistical operations."""
    script = argv[0]
    
    if len(argv) < 2:
        print("Usage: python script.py (--min|--mean|--max) filename1 [filename2 ...]")
        return
    
    action = argv[1]
    filenames = argv[2:]
    
    # Validate action
    valid_actions = ['--min', '--mean', '--max']
    if action not in valid_actions:
        print(f"Error: Action must be one of {valid_actions}, got '{action}'")
        return
    
    if not filenames:
        print("Error: No filenames provided")
        return
    
    print(f"Script: {script}")
    print(f"Action: {action}")
    print(f"Processing {len(filenames)} file(s)")
    
    for filename in filenames:
        print(f"\n--- {action.replace('--', '').upper()} values for {filename} ---")
        try:
            data = np.loadtxt(f'../data/{filename}', delimiter=',')
            
            if action == '--min':
                values = np.amin(data, axis=1)
            elif action == '--mean':
                values = np.mean(data, axis=1)
            elif action == '--max':
                values = np.amax(data, axis=1)
            
            print(f"Per-patient {action.replace('--', '')} (first 5 patients):")
            for i, value in enumerate(values[:5]):
                print(f"  Patient {i+1}: {value:.3f}")
            if len(values) > 5:
                print(f"  ... and {len(values)-5} more patients")
                
        except FileNotFoundError:
            print(f"Error: File {filename} not found")
        except Exception as e:
            print(f"Error processing {filename}: {e}")

# Test with different actions
print("=== Testing with --mean ===")
test_args = simulate_command_line(['--mean', 'inflammation-01.csv'])
analyze_inflammation_v3(test_args)

print("\n=== Testing with --max ===")
test_args = simulate_command_line(['--max', 'inflammation-01.csv'])
analyze_inflammation_v3(test_args)
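
Version 3 assumes the flag is always the first argument. One way to relax that assumption is to sort the arguments into flags and filenames before doing anything else. The helper below, `split_flags_and_files`, is a hypothetical name used only for this sketch.

```python
def split_flags_and_files(arguments):
    """Separate arguments that start with '--' from everything else."""
    flags = [arg for arg in arguments if arg.startswith('--')]
    files = [arg for arg in arguments if not arg.startswith('--')]
    return flags, files


flags, files = split_flags_and_files(
    ['inflammation-01.csv', '--mean', 'inflammation-02.csv']
)
print(flags)
print(files)
```

This simple rule would misclassify a file whose name begins with `--`, which is one reason to prefer a dedicated parser such as `argparse` once the interface grows.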

## Handling Standard Input

Command-line tools often read from standard input when no files are specified, allowing them to work in pipelines. Let's add this capability:

In [None]:
def process_data(data_source, action):
    """Process data from a file or stdin."""
    try:
        # data_source can be a filename or file-like object (e.g., sys.stdin)
        data = np.loadtxt(data_source, delimiter=',')
        
        if action == '--min':
            values = np.amin(data, axis=1)
        elif action == '--mean':
            values = np.mean(data, axis=1)
        elif action == '--max':
            values = np.amax(data, axis=1)
        
        return values
    except Exception as e:
        print(f"Error processing data: {e}")
        return None

def analyze_inflammation_v4(argv):
    """Version 4: Add support for standard input."""
    script = argv[0]
    
    if len(argv) < 2:
        print("Usage: python script.py (--min|--mean|--max) [filename1 filename2 ...]")
        print("       If no filenames provided, reads from standard input")
        return
    
    action = argv[1]
    filenames = argv[2:]
    
    # Validate action
    valid_actions = ['--min', '--mean', '--max']
    if action not in valid_actions:
        print(f"Error: Action must be one of {valid_actions}, got '{action}'")
        return
    
    if len(filenames) == 0:
        # Read from standard input
        print(f"Reading from standard input, computing {action.replace('--', '')}...")
        print("(In a real command-line environment, this would read from stdin)")
        # In notebook environment, we'll simulate with a sample file
        try:
            values = process_data('../data/small-01.csv', action)
            if values is not None:
                for val in values:
                    print(f"{val:.3f}")
        except Exception as e:
            print(f"Error: {e}")
    else:
        # Process specified files
        for filename in filenames:
            print(f"\n--- {action.replace('--', '').upper()} for {filename} ---")
            try:
                values = process_data(f'../data/{filename}', action)
                if values is not None:
                    for i, val in enumerate(values[:5]):
                        print(f"Patient {i+1}: {val:.3f}")
                    if len(values) > 5:
                        print(f"... and {len(values)-5} more")
            except Exception as e:
                print(f"Error: {e}")

# Test without filenames (simulates reading from stdin)
print("=== Testing Standard Input (simulated) ===")
test_args = simulate_command_line(['--mean'])
analyze_inflammation_v4(test_args)
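
The key idea is that the processing code should accept any file-like object, so the same function works on an open file, on `sys.stdin`, or on a stand-in used for testing. The sketch below is a standard-library illustration of that idea; `row_means` is a hypothetical helper, and `io.StringIO` plays the role of standard input so the example runs in a notebook.

```python
import io


def row_means(stream):
    """Return the mean of each comma-separated row read from a file-like object."""
    means = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        numbers = [float(field) for field in line.split(',')]
        means.append(sum(numbers) / len(numbers))
    return means


# io.StringIO stands in for sys.stdin here
fake_stdin = io.StringIO("0,1,2\n3,4,5\n")
print(row_means(fake_stdin))
```

In a standalone script, the call would be `row_means(sys.stdin)`, which lets the program sit at the end of a pipeline such as `cat small-01.csv | python script.py`.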

## Using argparse for Professional Command-Line Interfaces

While handling `sys.argv` directly works for simple cases, Python's `argparse` library provides a much more robust way to handle command-line arguments:

In [None]:
def create_inflammation_parser():
    """Create an argument parser for the inflammation analyzer."""
    parser = argparse.ArgumentParser(
        description='Analyze patient inflammation data',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  python script.py mean inflammation-01.csv
  python script.py max inflammation-*.csv
  cat inflammation-01.csv | python script.py min
        """
    )
    
    # Action argument (required)
    parser.add_argument(
        'action',
        choices=['min', 'mean', 'max'],
        help='Statistical operation to perform'
    )
    
    # File arguments (optional)
    parser.add_argument(
        'files',
        nargs='*',
        help='Input CSV files (if none provided, reads from stdin)'
    )
    
    # Optional arguments
    parser.add_argument(
        '--output-format',
        choices=['simple', 'detailed', 'csv'],
        default='simple',
        help='Output format (default: simple)'
    )
    
    parser.add_argument(
        '--precision',
        type=int,
        default=3,
        help='Number of decimal places (default: 3)'
    )
    
    parser.add_argument(
        '--quiet',
        action='store_true',
        help='Suppress informational messages'
    )
    
    return parser

def analyze_inflammation_professional(args):
    """Professional version using argparse."""
    parser = create_inflammation_parser()
    
    # Parse arguments
    try:
        parsed_args = parser.parse_args(args)
    except SystemExit:
        # argparse calls sys.exit() on error, catch it for demo
        return
    
    if not parsed_args.quiet:
        print(f"Inflammation Data Analyzer")
        print(f"Action: {parsed_args.action}")
        print(f"Output format: {parsed_args.output_format}")
        print(f"Precision: {parsed_args.precision}")
    
    # Determine data sources
    if parsed_args.files:
        data_sources = [(f'../data/{f}', f) for f in parsed_args.files]
    else:
        # Would read from stdin in real environment
        data_sources = [('../data/small-01.csv', 'stdin')]
        if not parsed_args.quiet:
            print("Reading from standard input (simulated)...")
    
    # Process each data source
    for data_path, display_name in data_sources:
        try:
            if not parsed_args.quiet and len(data_sources) > 1:
                print(f"\n--- Processing {display_name} ---")
            
            data = np.loadtxt(data_path, delimiter=',')
            
            # Calculate statistics
            if parsed_args.action == 'min':
                values = np.amin(data, axis=1)
            elif parsed_args.action == 'mean':
                values = np.mean(data, axis=1)
            elif parsed_args.action == 'max':
                values = np.amax(data, axis=1)
            
            # Format output
            format_string = f"{{:.{parsed_args.precision}f}}"
            
            if parsed_args.output_format == 'simple':
                for val in values:
                    print(format_string.format(val))
            elif parsed_args.output_format == 'detailed':
                for i, val in enumerate(values):
                    print(f"Patient {i+1:3d}: {format_string.format(val)}")
            elif parsed_args.output_format == 'csv':
                print(','.join(format_string.format(val) for val in values))
            
        except FileNotFoundError:
            print(f"Error: File {display_name} not found", file=sys.stderr)
        except Exception as e:
            print(f"Error processing {display_name}: {e}", file=sys.stderr)

# Test the professional version
print("=== Testing Professional Version ===")
test_args = ['mean', 'small-01.csv', '--output-format', 'detailed']
analyze_inflammation_professional(test_args)

print("\n=== Testing with CSV Output ===")
test_args = ['max', 'small-01.csv', '--output-format', 'csv', '--precision', '2']
analyze_inflammation_professional(test_args)
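
The `try`/`except SystemExit` in the cell above exists because `argparse` reports bad input by printing a usage message to standard error and exiting. The minimal sketch below shows both outcomes with a cut-down parser. The names `build_parser` and `demo` are illustrative only.

```python
import argparse


def build_parser():
    """Build a small parser with one required choice, files, and one option."""
    parser = argparse.ArgumentParser(prog='demo')
    parser.add_argument('action', choices=['min', 'mean', 'max'])
    parser.add_argument('files', nargs='*')
    parser.add_argument('--precision', type=int, default=3)
    return parser


parser = build_parser()

# A valid command line yields a Namespace with one attribute per argument
args = parser.parse_args(['mean', 'a.csv', 'b.csv', '--precision', '2'])
print(args.action, args.files, args.precision)

# An invalid choice makes argparse print a usage message and exit
try:
    parser.parse_args(['median', 'a.csv'])
except SystemExit as exc:
    exit_code = exc.code
    print(f"argparse exited with status {exit_code}")
```

Note that `--precision` arrives as the integer `2`, not the string `'2'`, because of `type=int`. In a real script there is usually no need to catch `SystemExit`. Letting the program stop with a non-zero status is the desired behavior.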

## Creating Reusable Script Templates

Let's create a template that can be saved as a standalone Python script:

In [None]:
# This is the structure of a complete command-line script
template_script = """
#!/usr/bin/env python3
'''
Inflammation Data Analyzer

A command-line tool for analyzing patient inflammation data.
'''

import sys
import numpy as np
import argparse

def main():
    '''Main function - entry point for the script.'''
    parser = argparse.ArgumentParser(
        description='Analyze patient inflammation data'
    )
    
    parser.add_argument(
        'action',
        choices=['min', 'mean', 'max'],
        help='Statistical operation to perform'
    )
    
    parser.add_argument(
        'files',
        nargs='*',
        help='Input CSV files'
    )
    
    args = parser.parse_args()
    
    # Process files or stdin
    if args.files:
        for filename in args.files:
            process_file(filename, args.action)
    else:
        process_stdin(args.action)

def process_file(filename, action):
    '''Process a single file.'''
    try:
        data = np.loadtxt(filename, delimiter=',')
        values = calculate_stats(data, action)
        for val in values:
            print(f'{val:.3f}')
    except Exception as e:
        print(f'Error processing {filename}: {e}', file=sys.stderr)

def process_stdin(action):
    '''Process data from standard input.'''
    try:
        data = np.loadtxt(sys.stdin, delimiter=',')
        values = calculate_stats(data, action)
        for val in values:
            print(f'{val:.3f}')
    except Exception as e:
        print(f'Error processing stdin: {e}', file=sys.stderr)

def calculate_stats(data, action):
    '''Calculate statistics for the given action.'''
    if action == 'min':
        return np.amin(data, axis=1)
    elif action == 'mean':
        return np.mean(data, axis=1)
    elif action == 'max':
        return np.amax(data, axis=1)

if __name__ == '__main__':
    main()
"""

print("Complete script template:")
print(template_script)

# Save the script to a file for demonstration
script_filename = '../files/inflammation_analyzer.py'
try:
    with open(script_filename, 'w') as f:
        f.write(template_script.strip())
    print(f"\nScript saved to: {script_filename}")
    print("You could run it with: python inflammation_analyzer.py mean inflammation-01.csv")
except Exception as e:
    print(f"Could not save script: {e}")
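
The `if __name__ == '__main__':` line at the end of the template is what lets the same file be both run as a program and imported as a module. The sketch below prints the value of `__name__` to make the mechanism visible. The file name `my_tool.py` in the comments is hypothetical.

```python
def main():
    """Entry point: only meant to run when the file is executed as a script."""
    print("Running as a script")


# __name__ is '__main__' when the file is run directly (python my_tool.py)
# and the module's name (for example 'my_tool') when it is imported.
print(f"__name__ is {__name__!r}")

if __name__ == '__main__':
    main()
```

Inside a notebook cell `__name__` is also `'__main__'`, so `main()` runs there too. When another program does `import my_tool`, the guard is false, `main()` is skipped, and the functions defined in the file can be reused or tested individually.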

## Exercises

Let's practice creating command-line tools with some exercises:

### Exercise 1: Simple Calculator

Create a command-line calculator that performs basic arithmetic operations:

In [None]:
def calculator(argv):
    """Simple command-line calculator.
    
    Usage: calculator.py --add 1 2
           calculator.py --subtract 5 3
           calculator.py --multiply 4 7
           calculator.py --divide 10 2
    """
    if len(argv) != 4:
        print("Usage: script.py [--add|--subtract|--multiply|--divide] num1 num2")
        return
    
    script, operation, num1_str, num2_str = argv
    
    # Validate operation
    valid_ops = ['--add', '--subtract', '--multiply', '--divide']
    if operation not in valid_ops:
        print(f"Error: Operation must be one of {valid_ops}")
        return
    
    # Convert numbers
    try:
        num1 = float(num1_str)
        num2 = float(num2_str)
    except ValueError:
        print("Error: Arguments must be numbers")
        return
    
    # Perform calculation
    if operation == '--add':
        result = num1 + num2
    elif operation == '--subtract':
        result = num1 - num2
    elif operation == '--multiply':
        result = num1 * num2
    elif operation == '--divide':
        if num2 == 0:
            print("Error: Division by zero")
            return
        result = num1 / num2
    
    print(f"{result}")

# Test the calculator
print("=== Testing Calculator ===")
test_cases = [
    ['calc.py', '--add', '1', '2'],
    ['calc.py', '--subtract', '10', '3'],
    ['calc.py', '--multiply', '4', '5'],
    ['calc.py', '--divide', '15', '3']
]

for test in test_cases:
    print(f"Command: {' '.join(test)}")
    calculator(test)
    print()
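
As the number of operations grows, the `if`/`elif` chain can be replaced by a dictionary that maps each flag to a function, the same pattern the production script later in this lesson uses for its statistics. This is a sketch of that alternative, with `OPERATIONS` and `calculate` as hypothetical names.

```python
import operator

# Map each command-line flag to the function that implements it
OPERATIONS = {
    '--add': operator.add,
    '--subtract': operator.sub,
    '--multiply': operator.mul,
    '--divide': operator.truediv,
}


def calculate(operation, num1, num2):
    """Look up the function for a flag and apply it to two numbers."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    return OPERATIONS[operation](num1, num2)


print(calculate('--add', 1.0, 2.0))
print(calculate('--divide', 15.0, 3.0))
```

Adding a new operation then means adding one dictionary entry. Division by zero still raises `ZeroDivisionError`, so the caller needs to handle it as the original `calculator` does.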

### Exercise 2: File Checker

Create a script that checks if inflammation data files have the same dimensions:

In [None]:
def file_checker(argv):
    """Check that inflammation data files have consistent dimensions."""
    script = argv[0]
    filenames = argv[1:]
    
    if len(filenames) < 2:
        print("Usage: script.py file1.csv file2.csv [file3.csv ...]")
        print("Need at least 2 files to compare")
        return
    
    print(f"Checking consistency of {len(filenames)} files...")
    
    # Get dimensions of first file as reference
    try:
        first_data = np.loadtxt(f'../data/{filenames[0]}', delimiter=',')
        reference_shape = first_data.shape
        print(f"Reference file {filenames[0]}: {reference_shape[0]} rows, {reference_shape[1]} columns")
    except Exception as e:
        print(f"Error reading reference file {filenames[0]}: {e}")
        return
    
    # Check other files
    all_consistent = True
    for filename in filenames[1:]:
        try:
            data = np.loadtxt(f'../data/{filename}', delimiter=',')
            if data.shape == reference_shape:
                print(f"✓ {filename}: {data.shape[0]} rows, {data.shape[1]} columns - CONSISTENT")
            else:
                print(f"✗ {filename}: {data.shape[0]} rows, {data.shape[1]} columns - INCONSISTENT")
                all_consistent = False
        except Exception as e:
            print(f"✗ {filename}: Error reading file - {e}")
            all_consistent = False
    
    print(f"\nResult: {'All files consistent' if all_consistent else 'Files have inconsistent dimensions'}")
    return all_consistent

# Test the file checker
print("=== Testing File Checker ===")
test_args = ['check.py', 'inflammation-01.csv', 'inflammation-02.csv', 'small-01.csv']
file_checker(test_args)

### Exercise 3: Advanced Inflammation Analyzer with argparse

Create a more sophisticated analyzer using argparse:

In [None]:
def create_advanced_analyzer():
    """Create an advanced inflammation analyzer with full argument parsing."""
    parser = argparse.ArgumentParser(
        description='Advanced Inflammation Data Analyzer',
        epilog='Examples:\n'
               '  %(prog)s mean --per-patient inflammation-01.csv\n'
               '  %(prog)s max --per-day --output summary.txt inflammation-*.csv',
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    
    # Positional arguments
    parser.add_argument(
        'statistic',
        choices=['min', 'max', 'mean', 'std'],
        help='Statistical measure to calculate'
    )
    
    parser.add_argument(
        'files',
        nargs='*',
        help='Input inflammation data files'
    )
    
    # Optional arguments
    parser.add_argument(
        '--per-patient',
        action='store_true',
        help='Calculate statistics per patient (default)'
    )
    
    parser.add_argument(
        '--per-day',
        action='store_true',
        help='Calculate statistics per day'
    )
    
    parser.add_argument(
        '--output', '-o',
        help='Output file (default: stdout)'
    )
    
    parser.add_argument(
        '--format',
        choices=['simple', 'table', 'csv', 'json'],
        default='simple',
        help='Output format'
    )
    
    parser.add_argument(
        '--verbose', '-v',
        action='store_true',
        help='Verbose output'
    )
    
    return parser

def advanced_analyzer(args_list):
    """Run the advanced analyzer with given arguments."""
    parser = create_advanced_analyzer()
    
    try:
        args = parser.parse_args(args_list)
    except SystemExit:
        return
    
    # Default behavior
    if not args.per_patient and not args.per_day:
        args.per_patient = True  # Default to per-patient
    
    if args.verbose:
        print(f"Advanced Inflammation Analyzer")
        print(f"Statistic: {args.statistic}")
        print(f"Mode: {'per-patient' if args.per_patient else 'per-day'}")
        print(f"Format: {args.format}")
        print(f"Files: {len(args.files) if args.files else 0}")
    
    # Process files
    results = []
    
    if not args.files:
        print("No files provided, would read from stdin in real environment")
        return
    
    for filename in args.files:
        try:
            data = np.loadtxt(f'../data/{filename}', delimiter=',')
            
            # Calculate statistics
            if args.per_patient:
                axis = 1  # Along days for each patient
                labels = [f"Patient {i+1}" for i in range(data.shape[0])]
            else:
                axis = 0  # Along patients for each day
                labels = [f"Day {i+1}" for i in range(data.shape[1])]
            
            if args.statistic == 'min':
                values = np.min(data, axis=axis)
            elif args.statistic == 'max':
                values = np.max(data, axis=axis)
            elif args.statistic == 'mean':
                values = np.mean(data, axis=axis)
            elif args.statistic == 'std':
                values = np.std(data, axis=axis)
            
            results.append((filename, labels, values))
            
        except Exception as e:
            print(f"Error processing {filename}: {e}")
    
    # Format and display results
    output_lines = []
    
    for filename, labels, values in results:
        if args.format == 'simple':
            if len(results) > 1:
                output_lines.append(f"\n--- {filename} ---")
            for val in values:
                output_lines.append(f"{val:.3f}")
        
        elif args.format == 'table':
            if len(results) > 1:
                output_lines.append(f"\n{filename}:")
            for label, val in zip(labels[:5], values[:5]):
                output_lines.append(f"{label:12}: {val:8.3f}")
            if len(values) > 5:
                output_lines.append(f"... and {len(values)-5} more")
        
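        elif args.format == 'csv':
            # Show at most the first five columns and mark truncation once
            header = ','.join(labels[:5])
            data_row = ','.join(f"{val:.3f}" for val in values[:5])
            if len(values) > 5:
                header += ',...'
                data_row += ',...'
            output_lines.extend([header, data_row])

        elif args.format == 'json':
            import json  # local import keeps this cell self-contained
            output_lines.append(json.dumps(
                {label: round(float(val), 3) for label, val in zip(labels, values)}))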
    
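    # Output results to the file given with --output, or to stdout by default
    if args.output:
        with open(args.output, 'w') as out:
            out.write('\n'.join(output_lines) + '\n')
    else:
        for line in output_lines:
            print(line)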

# Test the advanced analyzer
print("=== Testing Advanced Analyzer ===")
test_cases = [
    ['mean', '--per-patient', '--format', 'table', '--verbose', 'inflammation-01.csv'],
    ['max', '--per-day', '--format', 'csv', 'small-01.csv'],
    ['std', '--format', 'simple', 'inflammation-01.csv']
]

for i, test in enumerate(test_cases, 1):
    print(f"\n=== Test {i}: {' '.join(test)} ===")
    advanced_analyzer(test)
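
In the analyzer above nothing stops a user from passing both `--per-patient` and `--per-day`. In that case the per-patient branch silently wins. `argparse` can reject the combination itself through `add_mutually_exclusive_group()`. The sketch below shows one way to wire that up. `build_parser` and `chosen_axis` are hypothetical names.

```python
import argparse


def build_parser():
    """Build a parser in which the two mode flags cannot be combined."""
    parser = argparse.ArgumentParser(prog='analyzer')
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument('--per-patient', action='store_true',
                      help='One value per patient (default)')
    mode.add_argument('--per-day', action='store_true',
                      help='One value per day')
    return parser


def chosen_axis(argument_list):
    """Return the NumPy-style axis implied by the flags: 1 per patient, 0 per day."""
    args = build_parser().parse_args(argument_list)
    return 0 if args.per_day else 1


print(chosen_axis([]))
print(chosen_axis(['--per-day']))

# Passing both flags is reported as a usage error by argparse
try:
    chosen_axis(['--per-patient', '--per-day'])
except SystemExit as exc:
    conflict_status = exc.code
    print(f"conflicting flags rejected with status {conflict_status}")
```

With the group in place, the manual "default behavior" check is no longer needed: the absence of `--per-day` is enough to select per-patient mode.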

## Best Practices for Command-Line Scripts

Let's create a checklist of best practices for writing command-line programs:

In [None]:
best_practices = {
    "Argument Handling": [
        "Use argparse for complex argument parsing",
        "Provide clear help messages and examples",
        "Validate arguments early and fail fast",
        "Support both files and stdin when appropriate",
        "Use meaningful exit codes (0 for success, non-zero for errors)"
    ],
    "Error Handling": [
        "Catch and handle specific exceptions",
        "Print error messages to stderr, not stdout",
        "Provide helpful error messages",
        "Don't let programs fail silently",
        "Include file names in error messages"
    ],
    "Output Format": [
        "Keep output format simple by default",
        "Support multiple output formats when useful",
        "Make output easy to parse by other programs",
        "Use consistent field separators",
        "Consider providing headers for tabular data"
    ],
    "Code Structure": [
        "Use 'if __name__ == \"__main__\":' pattern",
        "Separate argument parsing from business logic",
        "Make functions reusable and testable",
        "Include docstrings and comments",
        "Follow PEP 8 style guidelines"
    ],
    "Unix Philosophy": [
        "Do one thing well",
        "Work with text streams when possible",
        "Be composable with other tools",
        "Follow the principle of least surprise",
        "Make programs easy to test and debug"
    ]
}

print("=== COMMAND-LINE PROGRAM BEST PRACTICES ===")
for category, practices in best_practices.items():
    print(f"\n{category.upper()}:")
    for i, practice in enumerate(practices, 1):
        print(f"  {i}. {practice}")
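
Two items from the checklist, exit codes and writing errors to stderr, can be sketched together. The example below is a hypothetical line counter, not part of the inflammation tools. The specific status values (2 for a usage error, 1 for a file error) are a convention chosen for this sketch.

```python
import sys


def main(argv):
    """Return 0 on success and a non-zero status on failure."""
    if len(argv) < 2:
        # Diagnostics go to stderr so they do not pollute piped output
        print(f"usage: {argv[0]} FILE [FILE ...]", file=sys.stderr)
        return 2
    for filename in argv[1:]:
        try:
            with open(filename) as handle:
                line_count = sum(1 for _ in handle)
        except OSError as error:
            print(f"{argv[0]}: {filename}: {error.strerror}", file=sys.stderr)
            return 1
        # Results go to stdout so the next program in a pipeline can read them
        print(f"{filename}\t{line_count}")
    return 0


# In a script the last line would be: sys.exit(main(sys.argv))
status = main(['count_lines.py', 'no-such-file-for-demo.csv'])
print(f"exit status: {status}")
```

Returning the status from `main` instead of calling `sys.exit` inside it keeps the function easy to test, since a test can simply compare the return value.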

## Example: Complete Production Script

Here's an example of what a complete production-ready command-line script looks like:

In [None]:
production_script = '''
#!/usr/bin/env python3
"""
Inflammation Data Analyzer - Production Version

A robust command-line tool for analyzing patient inflammation data files.
Supports multiple statistical operations, input/output formats, and error handling.

Author: Your Name
Version: 1.0.0
"""

import sys
import os
import argparse
import numpy as np
import logging
from pathlib import Path


def setup_logging(verbose=False):
    """Setup logging configuration."""
    level = logging.DEBUG if verbose else logging.INFO
    logging.basicConfig(
        level=level,
        format='%(levelname)s: %(message)s'
    )


def parse_arguments():
    """Parse command-line arguments."""
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter
    )
    
    parser.add_argument(
        'operation',
        choices=['min', 'max', 'mean', 'std'],
        help='Statistical operation to perform'
    )
    
    parser.add_argument(
        'files',
        nargs='*',
        help='Input CSV files (reads from stdin if none provided)'
    )
    
    parser.add_argument(
        '--axis',
        choices=['patients', 'days'],
        default='patients',
        help='Calculate one value per patient or one value per day'
    )
    
    parser.add_argument(
        '--output', '-o',
        type=argparse.FileType('w'),
        default=sys.stdout,
        help='Output file (default: stdout)'
    )
    
    parser.add_argument(
        '--format',
        choices=['plain', 'csv', 'json'],
        default='plain',
        help='Output format'
    )
    
    parser.add_argument(
        '--precision',
        type=int,
        default=3,
        help='Decimal precision for output'
    )
    
    parser.add_argument(
        '--verbose', '-v',
        action='store_true',
        help='Enable verbose output'
    )
    
    parser.add_argument(
        '--version',
        action='version',
        version='%(prog)s 1.0.0'
    )
    
    return parser.parse_args()


def load_data(source):
    """Load data from file or stdin."""
    try:
        if isinstance(source, str):
            if not Path(source).exists():
                raise FileNotFoundError(f"File not found: {source}")
            data = np.loadtxt(source, delimiter=',')
            logging.debug(f"Loaded {source}: shape {data.shape}")
        else:
            data = np.loadtxt(source, delimiter=',')
            logging.debug(f"Loaded from stdin: shape {data.shape}")
        return data
    except Exception as e:
        raise ValueError(f"Error loading data: {e}") from e


def calculate_statistics(data, operation, axis):
    """Calculate statistics on the data."""
    axis_num = 1 if axis == 'patients' else 0
    
    operations = {
        'min': np.min,
        'max': np.max,
        'mean': np.mean,
        'std': np.std
    }
    
    func = operations[operation]
    return func(data, axis=axis_num)


def format_output(values, output_format, precision):
    """Format output according to specified format."""
    if output_format == 'plain':
        return '\\n'.join(f"{val:.{precision}f}" for val in values)
    elif output_format == 'csv':
        return ','.join(f"{val:.{precision}f}" for val in values)
    elif output_format == 'json':
        import json
        return json.dumps([round(float(val), precision) for val in values])


def main():
    """Main function."""
    try:
        args = parse_arguments()
        setup_logging(args.verbose)
        
        # Process input sources
        sources = args.files if args.files else [sys.stdin]
        
        for source in sources:
            try:
                data = load_data(source)
                values = calculate_statistics(data, args.operation, args.axis)
                output = format_output(values, args.format, args.precision)
                print(output, file=args.output)
                
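            except Exception as e:
                # Use a readable name when the source is the stdin stream
                name = source if isinstance(source, str) else 'stdin'
                logging.error(f"Error processing {name}: {e}")
                return 1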
        
        return 0
        
    except KeyboardInterrupt:
        logging.info("Interrupted by user")
        return 1
    except Exception as e:
        logging.error(f"Unexpected error: {e}")
        return 1


if __name__ == '__main__':
    sys.exit(main())
'''

print("=== PRODUCTION-READY SCRIPT TEMPLATE ===")
print("This script includes:")
print("- Comprehensive argument parsing")
print("- Proper error handling and logging")
print("- Multiple output formats")
print("- Support for stdin and multiple files")
print("- Proper exit codes")
print("- Documentation and version info")
print("\nThe script would be saved to a .py file and made executable.")

## Summary

In this lesson, we learned how to create command-line programs in Python:

### Key Concepts

1. **sys.argv** - Contains command-line arguments passed to the script
2. **Argument parsing** - Processing and validating command-line arguments
3. **Standard input/output** - Reading from stdin and writing to stdout/stderr
4. **Error handling** - Graceful handling of errors with meaningful messages
5. **Multiple file processing** - Handling one or more input files

### Tools and Libraries

- **sys module** - Basic system interface and argument access
- **argparse module** - Professional argument parsing with help generation
- **File I/O** - Reading from files or standard input
- **Error handling** - Using try/except blocks and proper error reporting

### Best Practices

- Validate arguments early and fail fast
- Provide clear help messages and error messages
- Support standard input for pipeline compatibility
- Use appropriate exit codes
- Follow Unix philosophy: do one thing well
- Make scripts testable and maintainable

### Key Points

- The `sys` library connects a Python program to the system it is running on
- The list `sys.argv` contains the command-line arguments that a program was run with
- Avoid silent failures - always provide meaningful error messages
- The pseudo-file `sys.stdin` connects to a program's standard input
- Use `argparse` for robust command-line argument parsing
- Design programs to work well with other Unix tools and in pipelines

Command-line programs are essential for automating data analysis workflows and integrating Python scripts into larger data processing pipelines. They provide a way to make your analysis reproducible and scalable.