# CLI Architecture with Click

**Module 10: Building Professional CLIs with Click**

---

## Learning Objectives

- Understand CLI architecture patterns
- Build commands with Click
- Implement config validation workflows
- Create rich error messages
- Handle exceptions gracefully
- Test CLI commands

---

## Why Click?

**Problems with argparse:**
- Verbose boilerplate
- Fewer built-in parameter types and validators
- Limited composability
- Poor testing support

**Click advantages:**
- Decorator-based API
- Automatic help generation
- Nested command groups
- Built-in testing utilities
- Type validation
- File path handling
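
To make the comparison concrete, here is a minimal sketch of one small command written both ways. The function names `argparse_greet` and `click_greet` are illustrative assumptions, not part of Odibi. Note how the argparse version builds a parser and unpacks a namespace, while the Click version declares its interface in decorators.

```python
import argparse

import click
from click.testing import CliRunner


def argparse_greet(argv):
    """argparse version: build the parser, parse, then read the namespace."""
    parser = argparse.ArgumentParser(prog="greet")
    parser.add_argument("name")
    parser.add_argument("--greeting", default="Hello", help="Greeting to use")
    args = parser.parse_args(argv)
    return f"{args.greeting}, {args.name}!"


@click.command()
@click.argument("name")
@click.option("--greeting", default="Hello", help="Greeting to use")
def click_greet(name, greeting):
    """Click version: decorators declare the interface, the body only acts."""
    click.echo(f"{greeting}, {name}!")


argparse_output = argparse_greet(["Ada", "--greeting", "Hi"])
click_result = CliRunner().invoke(click_greet, ["Ada", "--greeting", "Hi"])

print(argparse_output)
print(click_result.output, end="")
```

Both versions print the same greeting; the difference is in how much wiring sits between the declaration and the logic.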

---

## Part 1: Click Basics

### Simple Command

In [None]:
import click

@click.command()
@click.argument('name')
@click.option('--greeting', default='Hello', help='Greeting to use')
def greet(name, greeting):
    """Simple greeting command."""
    click.echo(f"{greeting}, {name}!")

# Simulate CLI call
from click.testing import CliRunner
runner = CliRunner()

result = runner.invoke(greet, ['World'])
print(result.output)

result = runner.invoke(greet, ['Alice', '--greeting', 'Hi'])
print(result.output)

### Arguments vs Options

**Arguments:**
- Required by default (pass `required=False` to make one optional)
- Positional
- No -- prefix

**Options:**
- Optional by default
- Named with -- prefix
- Can have short forms (-v)
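
Both defaults can be overridden with the `required` parameter. The sketch below uses a hypothetical `deploy` command (not part of Odibi) with an optional argument and a mandatory option, and checks how Click reacts when the option is missing.

```python
import click
from click.testing import CliRunner


@click.command()
@click.argument("target", required=False, default="all")  # optional positional argument
@click.option("--token", required=True, help="API token")  # mandatory named option
def deploy(target, token):
    """Hypothetical command used only to illustrate `required`."""
    click.echo(f"Deploying {target} with token {token}")


runner = CliRunner()
ok = runner.invoke(deploy, ["--token", "abc"])
missing = runner.invoke(deploy, [])

print(ok.output, end="")
print(missing.exit_code)  # Click reports usage errors with exit code 2
```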

In [None]:
@click.command()
@click.argument('config_file', type=click.Path(exists=True))
@click.option('--env', type=click.Choice(['dev', 'prod']), default='dev')
@click.option('--verbose', '-v', is_flag=True, help='Enable verbose output')
@click.option('--retries', type=int, default=3, help='Number of retries')
def process(config_file, env, verbose, retries):
    """Process config file."""
    if verbose:
        click.echo(f"Loading {config_file}...")
    click.echo(f"Environment: {env}")
    click.echo(f"Max retries: {retries}")

# Test
import tempfile
with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml', delete=False) as f:
    f.write("test: data")
    temp_path = f.name

runner = CliRunner()
result = runner.invoke(process, [temp_path, '--env', 'prod', '-v', '--retries', '5'])
print(result.output)

import os
os.unlink(temp_path)

---

## Part 2: Command Groups

### Building Multi-Command CLIs

In [None]:
@click.group()
def odibi():
    """Odibi Data Pipeline Framework."""
    pass

@odibi.command()
@click.argument('config')
def validate(config):
    """Validate configuration file."""
    click.echo(f"Validating {config}...")
    click.secho("✓ Config is valid", fg='green')

@odibi.command()
@click.argument('config')
@click.option('--env', default='development')
def run(config, env):
    """Execute pipeline."""
    click.echo(f"Running {config} in {env} mode...")
    click.secho("✓ Pipeline completed", fg='green')

# Test
runner = CliRunner()

# Show help
result = runner.invoke(odibi, ['--help'])
print(result.output)
print()

# Run commands
result = runner.invoke(odibi, ['validate', 'config.yaml'])
print(result.output)

result = runner.invoke(odibi, ['run', 'config.yaml', '--env', 'production'])
print(result.output)
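
Groups can also be nested, and a parent group can share state with its subcommands through the Click context. The following is a minimal sketch, assuming a hypothetical `story generate` subcommand layout and a `--verbose` flag stored on `ctx.obj`; the names are illustrative rather than Odibi's actual implementation.

```python
import click
from click.testing import CliRunner


@click.group()
@click.option("--verbose", "-v", is_flag=True, help="Enable verbose output")
@click.pass_context
def cli(ctx, verbose):
    """Top-level group; stores shared settings on the context."""
    ctx.ensure_object(dict)
    ctx.obj["verbose"] = verbose


@cli.group()
def story():
    """Documentation commands (a group nested inside `cli`)."""


@story.command()
@click.argument("config")
@click.pass_context
def generate(ctx, config):
    """Generate documentation for CONFIG."""
    # ctx.obj is inherited from the parent context
    if ctx.obj["verbose"]:
        click.echo("Verbose mode on")
    click.echo(f"Generating story for {config}")


runner = CliRunner()
result = runner.invoke(cli, ["-v", "story", "generate", "config.yaml"])
print(result.output, end="")
```

Group-level options such as `-v` go before the subcommand name, because they belong to the group rather than to `generate`.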

---

## Part 3: Odibi's CLI Architecture

### Current Implementation (argparse)

In [None]:
# From odibi/cli/main.py
import argparse

def create_parser():
    """Odibi's current argparse-based CLI."""
    parser = argparse.ArgumentParser(
        description="Odibi Data Pipeline Framework",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  odibi run config.yaml                    Run a pipeline
  odibi validate config.yaml               Validate configuration
  odibi story generate config.yaml         Generate documentation
        """,
    )
    subparsers = parser.add_subparsers(dest="command", help="Available commands")

    # Run command
    run_parser = subparsers.add_parser("run", help="Execute pipeline")
    run_parser.add_argument("config", help="Path to YAML config file")
    run_parser.add_argument(
        "--env", default="development", help="Environment (development/production)"
    )

    # Validate command
    validate_parser = subparsers.add_parser("validate", help="Validate config")
    validate_parser.add_argument("config", help="Path to YAML config file")

    return parser

parser = create_parser()
args = parser.parse_args(['validate', 'config.yaml'])
print(f"Command: {args.command}")
print(f"Config: {args.config}")

### Validation Command Implementation

In [None]:
# From odibi/cli/validate.py
import yaml
from typing import Dict, Any

def validate_config_simple(config_path: str) -> int:
    """Simple validation - load YAML and check structure."""
    try:
        with open(config_path, 'r') as f:
            config_data = yaml.safe_load(f)
        
        # Basic validation
        if not isinstance(config_data, dict):
            raise ValueError("Config must be a dictionary")
        
        if 'connections' not in config_data:
            raise ValueError("Missing 'connections' section")
        
        if 'pipelines' not in config_data:
            raise ValueError("Missing 'pipelines' section")
        
        print("✓ Config is valid")
        return 0
    
    except FileNotFoundError:
        print(f"✗ Config file not found: {config_path}")
        return 1
    
    except yaml.YAMLError as e:
        print(f"✗ YAML parse error: {e}")
        return 1
    
    except Exception as e:
        print(f"✗ Validation failed: {e}")
        return 1

# Create test config
test_config = """
connections:
  db:
    type: databricks
    catalog: main

pipelines:
  sales_etl:
    steps:
      - extract:
          sql: "SELECT * FROM sales"
"""

with tempfile.NamedTemporaryFile(mode='w', suffix='.yaml', delete=False) as f:
    f.write(test_config)
    config_path = f.name

# Test validation
result = validate_config_simple(config_path)
print(f"Exit code: {result}")

os.unlink(config_path)
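
A validator that returns an integer still needs that value turned into the process exit code. One way to do this in Click is `ctx.exit()`. The sketch below assumes a hypothetical `validate_stub` function standing in for the validator above (it only checks that the file exists), so the example stays self-contained.

```python
import os

import click
from click.testing import CliRunner


def validate_stub(config_path: str) -> int:
    """Hypothetical stand-in for a validator: 0 if the file exists, 1 otherwise."""
    return 0 if os.path.isfile(config_path) else 1


@click.command()
@click.argument("config")
@click.pass_context
def validate_cmd(ctx, config):
    """Run the validator and use its return value as the exit code."""
    ctx.exit(validate_stub(config))


runner = CliRunner()
with runner.isolated_filesystem():
    with open("config.yaml", "w") as f:
        f.write("pipelines: {}\n")
    found = runner.invoke(validate_cmd, ["config.yaml"])
    missing = runner.invoke(validate_cmd, ["missing.yaml"])

print(found.exit_code, missing.exit_code)
```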

---

## Part 4: Custom Exceptions

### Exception Hierarchy

In [None]:
# From odibi/exceptions.py
from typing import Optional, List

class OdibiException(Exception):
    """Base exception for all ODIBI errors."""
    pass

class ConfigValidationError(OdibiException):
    """Configuration validation failed."""

    def __init__(self, message: str, file: Optional[str] = None, line: Optional[int] = None):
        self.message = message
        self.file = file
        self.line = line
        super().__init__(self._format_error())

    def _format_error(self) -> str:
        """Format error message with location info."""
        parts = ["Configuration validation error"]
        if self.file:
            parts.append(f"\n  File: {self.file}")
        if self.line:
            parts.append(f"\n  Line: {self.line}")
        parts.append(f"\n  Error: {self.message}")
        return "".join(parts)

# Test
try:
    raise ConfigValidationError(
        message="Missing required field 'name'",
        file="config.yaml",
        line=15
    )
except ConfigValidationError as e:
    print(e)
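
If an exception should also control how the CLI exits, one option is to derive it from `click.ClickException`, which Click catches, prints, and converts into an exit code. This is a sketch under that assumption; `ConfigError` and its exit code of 3 are illustrative choices, not Odibi's classes.

```python
from typing import Optional

import click
from click.testing import CliRunner


class ConfigError(click.ClickException):
    """Hypothetical Click-aware counterpart to a config validation error."""

    exit_code = 3  # illustrative value; Click exits with this code

    def __init__(self, message: str, file: Optional[str] = None, line: Optional[int] = None):
        super().__init__(message)
        self.file = file
        self.line = line

    def format_message(self) -> str:
        # Click prints "Error: " followed by this string
        location = self.file or "<unknown>"
        if self.line is not None:
            location += f":{self.line}"
        return f"{location}: {self.message}"


@click.command()
def check():
    raise ConfigError("Missing required field 'name'", file="config.yaml", line=15)


result = CliRunner().invoke(check)
print(result.exit_code)
print(result.output, end="")
```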

### Rich Connection Errors

In [None]:
# Note: this class shadows Python's built-in ConnectionError wherever it is defined or imported.
class ConnectionError(OdibiException):
    """Connection failed or invalid."""

    def __init__(self, connection_name: str, reason: str, suggestions: Optional[List[str]] = None):
        self.connection_name = connection_name
        self.reason = reason
        self.suggestions = suggestions or []
        super().__init__(self._format_error())

    def _format_error(self) -> str:
        """Format connection error with suggestions."""
        parts = [
            f"✗ Connection validation failed: {self.connection_name}",
            f"\n  Reason: {self.reason}",
        ]

        if self.suggestions:
            parts.append("\n\n  Suggestions:")
            for i, suggestion in enumerate(self.suggestions, 1):
                parts.append(f"\n    {i}. {suggestion}")

        return "".join(parts)

# Test
try:
    raise ConnectionError(
        connection_name="prod_db",
        reason="Catalog 'analytics' not found",
        suggestions=[
            "Check catalog name spelling",
            "Verify catalog access permissions",
            "Use 'odibi list catalogs' to see available catalogs"
        ]
    )
except ConnectionError as e:
    print(e)

### Execution Context for Better Errors

In [None]:
from dataclasses import dataclass

@dataclass
class ExecutionContext:
    """Runtime context for error reporting."""
    node_name: str
    config_file: Optional[str] = None
    config_line: Optional[int] = None
    step_index: Optional[int] = None
    total_steps: Optional[int] = None
    input_schema: Optional[List[str]] = None
    input_shape: Optional[tuple] = None
    previous_steps: Optional[List[str]] = None

class NodeExecutionError(OdibiException):
    """Node execution failed."""

    def __init__(
        self,
        message: str,
        context: ExecutionContext,
        original_error: Optional[Exception] = None,
        suggestions: Optional[List[str]] = None,
    ):
        self.message = message
        self.context = context
        self.original_error = original_error
        self.suggestions = suggestions or []
        super().__init__(self._format_error())

    def _format_error(self) -> str:
        """Generate rich error message with context."""
        parts = [f"✗ Node execution failed: {self.context.node_name}"]

        # Location info
        if self.context.config_file:
            parts.append(f"\n  Location: {self.context.config_file}")
            if self.context.config_line:
                parts.append(f":{self.context.config_line}")

        # Step info
        if self.context.step_index is not None and self.context.total_steps:
            parts.append(f"\n  Step: {self.context.step_index + 1} of {self.context.total_steps}")

        # Error message
        parts.append(f"\n\n  Error: {self.message}")

        # Context information
        if self.context.input_schema:
            parts.append(f"\n\n  Available columns: {self.context.input_schema}")

        if self.context.previous_steps:
            parts.append("\n\n  Previous steps:")
            for step in self.context.previous_steps:
                parts.append(f"\n    ✓ {step}")

        # Suggestions
        if self.suggestions:
            parts.append("\n\n  Suggestions:")
            for i, suggestion in enumerate(self.suggestions, 1):
                parts.append(f"\n    {i}. {suggestion}")

        return "".join(parts)

# Test
try:
    ctx = ExecutionContext(
        node_name="calculate_margin",
        config_file="sales_pipeline.yaml",
        config_line=42,
        step_index=2,
        total_steps=5,
        input_schema=['order_id', 'amount', 'discount'],
        previous_steps=['extract_orders', 'clean_data']
    )
    
    raise NodeExecutionError(
        message="Column 'revenue' not found",
        context=ctx,
        suggestions=[
            "Check column name spelling",
            "Verify upstream transformation produces 'revenue'",
            "Add revenue calculation in previous step"
        ]
    )
except NodeExecutionError as e:
    print(e)
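
`NodeExecutionError` accepts an `original_error`, but the test above never passes one. A related technique is Python's exception chaining, which keeps the low-level error attached to the high-level one. The sketch below uses hypothetical `PipelineError` and `StepError` classes to stay self-contained.

```python
class PipelineError(Exception):
    """Hypothetical base class for this sketch."""


class StepError(PipelineError):
    """Raised when a single pipeline step fails."""

    def __init__(self, step: str, message: str):
        self.step = step
        super().__init__(f"Step '{step}' failed: {message}")


def run_step(step: str, row: dict) -> float:
    try:
        return row["revenue"] - row["cost"]
    except KeyError as exc:
        # `from exc` stores the original error on __cause__ for tracebacks
        raise StepError(step, f"column {exc} not found") from exc


try:
    run_step("calculate_margin", {"cost": 10.0})
except StepError as err:
    caught = err
    print(err)
    print(f"Caused by: {type(err.__cause__).__name__}")
```

With chaining, the user-facing message stays short while the full traceback still shows where the `KeyError` originated.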

---

## Part 5: Validation Workflow

### Explanation Linter

In [None]:
# From odibi/validation/explanation_linter.py
import re
from dataclasses import dataclass

@dataclass
class LintIssue:
    """A linting issue found in an explanation."""
    severity: str  # "error", "warning", "info"
    message: str
    rule: str

    def __str__(self):
        symbol = {"error": "❌", "warning": "⚠️", "info": "ℹ️"}[self.severity]
        return f"{symbol} {self.message} [{self.rule}]"


class ExplanationLinter:
    """
    Lints explanation text for quality issues.

    Checks:
    - Minimum length
    - Required sections (Purpose, Details, Result)
    - Generic/lazy phrases
    - TODO placeholders
    - Formula formatting
    """

    REQUIRED_SECTIONS = ["Purpose", "Details", "Result"]
    LAZY_PHRASES = [
        "calculates stuff",
        "does things",
        "processes data",
        "handles records",
        "TODO",
        "[placeholder]",
        "TBD",
    ]
    MIN_LENGTH = 50

    def __init__(self):
        self.issues: List[LintIssue] = []

    def lint(self, explanation: str, operation_name: str = "unknown") -> List[LintIssue]:
        """Lint an explanation and return issues."""
        self.issues = []

        if not explanation or not explanation.strip():
            self.issues.append(
                LintIssue(
                    severity="error",
                    message=f"Explanation for '{operation_name}' is empty",
                    rule="E001",
                )
            )
            return self.issues

        # Check length
        if len(explanation.strip()) < self.MIN_LENGTH:
            self.issues.append(
                LintIssue(
                    severity="error",
                    message=f"Explanation for '{operation_name}' too short ({len(explanation.strip())} chars)",
                    rule="E002",
                )
            )

        # Check required sections
        for section in self.REQUIRED_SECTIONS:
            pattern = f"\\*\\*{section}:?\\*\\*"
            if not re.search(pattern, explanation, re.IGNORECASE):
                self.issues.append(
                    LintIssue(
                        severity="error",
                        message=f"Explanation for '{operation_name}' missing section: {section}",
                        rule="E003",
                    )
                )

        # Check for lazy phrases
        text_lower = explanation.lower()
        for phrase in self.LAZY_PHRASES:
            if phrase.lower() in text_lower:
                self.issues.append(
                    LintIssue(
                        severity="error",
                        message=f"Explanation for '{operation_name}' contains generic phrase: '{phrase}'",
                        rule="E004",
                    )
                )

        return self.issues

    def has_errors(self) -> bool:
        """Check if any errors were found."""
        return any(issue.severity == "error" for issue in self.issues)

# Test
linter = ExplanationLinter()

# Bad explanation
bad_explanation = "This step processes data."
issues = linter.lint(bad_explanation, "extract")
print("Bad explanation issues:")
for issue in issues:
    print(f"  {issue}")
print()

# Good explanation
good_explanation = """
**Purpose:** Calculate profit margin for each sale
**Details:** Uses the formula (revenue - cost) / revenue * 100
**Result:** Adds 'margin_pct' column with percentage values
"""
issues = linter.lint(good_explanation, "calculate_margin")
if not issues:
    print("✅ Good explanation - no issues")
else:
    for issue in issues:
        print(f"  {issue}")
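
The lazy-phrase check above is a plain substring test, so a phrase such as "TODO" also matches inside a longer word ("mastodon" contains "todo"). A possible refinement, shown here as a sketch with a hypothetical `find_lazy_phrases` helper, is to require that the phrase is not embedded in a larger word.

```python
import re
from typing import List


def find_lazy_phrases(text: str, phrases: List[str]) -> List[str]:
    """Return the phrases that occur in `text` as whole tokens, case-insensitively."""
    found = []
    for phrase in phrases:
        # (?<!\w) and (?!\w) reject matches embedded in a longer word;
        # re.escape keeps characters such as "[" literal.
        pattern = rf"(?<!\w){re.escape(phrase)}(?!\w)"
        if re.search(pattern, text, re.IGNORECASE):
            found.append(phrase)
    return found


phrases = ["TODO", "[placeholder]", "processes data"]
no_hits = find_lazy_phrases("Loads the mastodon export table", phrases)
hits = find_lazy_phrases("todo: describe [placeholder] logic", phrases)

print(no_hits)
print(hits)
```

Lookarounds are used instead of `\b` because `\b` does not match next to characters like `[`, which would break the `[placeholder]` entry.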

---

## Part 6: Click with Rich Output

### Colored Output and Progress

In [None]:
import click
import time

@click.command()
@click.argument('config', type=click.Path(exists=True))
@click.option('--verbose', '-v', is_flag=True)
def validate_with_progress(config, verbose):
    """Validate config with progress indication."""
    
    checks = [
        ("YAML syntax", 0.3),
        ("Schema validation", 0.5),
        ("Connection validation", 0.7),
        ("Pipeline structure", 0.4),
        ("Explanation quality", 0.6),
    ]
    
    click.echo(f"\nValidating {config}...\n")
    
    with click.progressbar(
        checks,
        label='Validation',
        show_eta=False,
        item_show_func=lambda x: x[0] if x else ''
    ) as bar:
        for check_name, duration in bar:
            if verbose:
                click.echo(f"  Checking {check_name}...")
            time.sleep(duration)
    
    click.echo()
    click.secho("✓ All validation checks passed", fg='green', bold=True)
    click.echo()
    click.secho("Summary:", fg='cyan')
    click.echo(f"  • {len(checks)} checks completed")
    click.echo("  • 0 errors, 0 warnings")

# Note: Progress bar won't show in notebook, but works in terminal
# Simulating output:
print("\nValidating config.yaml...\n")
print("Validation  [####################################]  100%  Explanation quality")
print()
print("\033[92m✓ All validation checks passed\033[0m")
print()
print("\033[96mSummary:\033[0m")
print("  • 5 checks completed")
print("  • 0 errors, 0 warnings")
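
Colored output interacts with testing: `CliRunner` strips ANSI codes by default, and keeps them only when `color=True` is passed to `invoke`. The sketch below uses a hypothetical `status` command to show both cases, plus `click.unstyle` for removing the codes again.

```python
import click
from click.testing import CliRunner


@click.command()
def status():
    """Hypothetical command that prints one green line."""
    click.secho("All checks passed", fg="green")


runner = CliRunner()
plain = runner.invoke(status)                # ANSI codes stripped by default
colored = runner.invoke(status, color=True)  # ANSI codes preserved

print(repr(plain.output))
print(repr(colored.output))
print(click.unstyle(colored.output) == plain.output)
```

Asserting on the plain output keeps tests readable; `color=True` is only needed when the styling itself is under test.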

---

## Part 7: Testing CLI Commands

### Click's Test Runner

In [None]:
from click.testing import CliRunner
import click  # The tests below are plain functions with asserts, so pytest is optional

@click.command()
@click.argument('name')
@click.option('--uppercase', is_flag=True)
def greet(name, uppercase):
    """Greet someone."""
    greeting = f"Hello, {name}!"
    if uppercase:
        greeting = greeting.upper()
    click.echo(greeting)

# Test suite
def test_greet_basic():
    runner = CliRunner()
    result = runner.invoke(greet, ['Alice'])
    assert result.exit_code == 0
    assert result.output == "Hello, Alice!\n"

def test_greet_uppercase():
    runner = CliRunner()
    result = runner.invoke(greet, ['Bob', '--uppercase'])
    assert result.exit_code == 0
    assert result.output == "HELLO, BOB!\n"

def test_greet_missing_arg():
    runner = CliRunner()
    result = runner.invoke(greet, [])
    assert result.exit_code != 0
    assert "Error" in result.output

# Run tests
test_greet_basic()
print("✓ test_greet_basic passed")

test_greet_uppercase()
print("✓ test_greet_uppercase passed")

test_greet_missing_arg()
print("✓ test_greet_missing_arg passed")
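
`CliRunner.invoke` can also simulate environment variables and interactive input through its `env` and `input` parameters. The following sketch assumes two hypothetical commands, `show_env` and `clean`, and a made-up variable name `ODIBI_ENV`.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--env", envvar="ODIBI_ENV", default="development")  # ODIBI_ENV is a hypothetical name
def show_env(env):
    """Print the environment resolved from the option or the variable."""
    click.echo(f"env={env}")


@click.command()
def clean():
    """Ask for confirmation; abort=True stops the command on a 'no'."""
    click.confirm("Delete cached runs?", abort=True)
    click.echo("Cache deleted")


runner = CliRunner()
from_env = runner.invoke(show_env, env={"ODIBI_ENV": "production"})
confirmed = runner.invoke(clean, input="y\n")
declined = runner.invoke(clean, input="n\n")

print(from_env.output, end="")
print(confirmed.exit_code, declined.exit_code)
```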

### Testing with Temp Files

In [None]:
@click.command()
@click.argument('config', type=click.Path(exists=True))
def validate_config(config):
    """Validate YAML config."""
    import yaml
    
    try:
        with open(config) as f:
            data = yaml.safe_load(f)
        
        # Validate structure
        if 'pipelines' not in data:
            raise ValueError("Missing 'pipelines' key")
        
        click.secho("✓ Config is valid", fg='green')
        
    except Exception as e:
        click.secho(f"✗ Validation failed: {e}", fg='red', err=True)
        raise click.Abort()

def test_validate_config_valid():
    runner = CliRunner()
    
    with runner.isolated_filesystem():
        # Create valid config
        with open('config.yaml', 'w') as f:
            f.write('pipelines:\n  - name: test\n')
        
        result = runner.invoke(validate_config, ['config.yaml'])
        assert result.exit_code == 0
        assert "✓" in result.output

def test_validate_config_invalid():
    runner = CliRunner()
    
    with runner.isolated_filesystem():
        # Create invalid config
        with open('config.yaml', 'w') as f:
            f.write('connections:\n  - name: db\n')
        
        result = runner.invoke(validate_config, ['config.yaml'])
        assert result.exit_code != 0
        assert "Missing 'pipelines'" in result.output

# Run tests
test_validate_config_valid()
print("✓ test_validate_config_valid passed")

test_validate_config_invalid()
print("✓ test_validate_config_invalid passed")
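
Tests can tell parameter errors apart from validation failures by exit code: a path rejected by `click.Path(exists=True)` is a usage error (exit code 2), while `click.Abort` exits with code 1. The sketch below illustrates this with a hypothetical `check_nonempty` command.

```python
import click
from click.testing import CliRunner


@click.command()
@click.argument("config", type=click.Path(exists=True, dir_okay=False))
def check_nonempty(config):
    """Hypothetical check: fail when the config file is empty."""
    with open(config) as f:
        if not f.read().strip():
            click.echo("Config is empty", err=True)
            raise click.Abort()
    click.echo("Config has content")


runner = CliRunner()
with runner.isolated_filesystem():
    open("empty.yaml", "w").close()
    usage_error = runner.invoke(check_nonempty, ["missing.yaml"])  # rejected by click.Path
    check_failed = runner.invoke(check_nonempty, ["empty.yaml"])   # rejected by our own check

print(usage_error.exit_code, check_failed.exit_code)
```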

---

## Part 8: Complete Validation Command

### Production-Ready Implementation

In [None]:
@click.command()
@click.argument('config', type=click.Path(exists=True))
@click.option('--strict', is_flag=True, help='Fail on warnings')
@click.option('--verbose', '-v', is_flag=True, help='Show detailed output')
def validate(
    config: str,
    strict: bool,
    verbose: bool
) -> None:
    """Validate Odibi configuration file.
    
    Performs comprehensive validation:
    - YAML syntax
    - Schema structure
    - Connection configs
    - Pipeline definitions
    - Explanation quality
    """
    import yaml
    
    errors = []
    warnings = []
    
    try:
        # 1. Load YAML
        if verbose:
            click.echo("📄 Loading config file...")
        
        with open(config) as f:
            data = yaml.safe_load(f)
        
        # 2. Validate structure
        if verbose:
            click.echo("🔍 Validating structure...")
        
        if not isinstance(data, dict):
            errors.append("Config must be a dictionary")
            data = {}  # Nothing left to inspect; the section checks below are skipped
        else:
            if 'connections' not in data:
                errors.append("Missing 'connections' section")
            
            if 'pipelines' not in data:
                errors.append("Missing 'pipelines' section")
        
        # 3. Validate connections
        if verbose:
            click.echo("🔌 Validating connections...")
        
        if 'connections' in data:
            for name, conn in data['connections'].items():
                if 'type' not in conn:
                    errors.append(f"Connection '{name}' missing 'type'")
        
        # 4. Validate pipelines
        if verbose:
            click.echo("⚙️  Validating pipelines...")
        
        if 'pipelines' in data:
            for name, pipeline in data['pipelines'].items():
                if 'steps' not in pipeline:
                    errors.append(f"Pipeline '{name}' missing 'steps'")
                else:
                    # Check explanations
                    for i, step in enumerate(pipeline['steps']):
                        step_name = f"{name}.step{i}"
                        
                        # Get explanation from step
                        explanation = None
                        for key, value in step.items():
                            if isinstance(value, dict) and 'explanation' in value:
                                explanation = value['explanation']
                        
                        if not explanation:
                            warnings.append(f"Step '{step_name}' missing explanation")
                        else:
                            # Lint explanation
                            linter = ExplanationLinter()
                            issues = linter.lint(explanation, step_name)
                            
                            for issue in issues:
                                if issue.severity == 'error':
                                    errors.append(f"{step_name}: {issue.message}")
                                elif issue.severity == 'warning':
                                    warnings.append(f"{step_name}: {issue.message}")
        
        # Report results
        click.echo()
        
        if errors:
            click.secho("❌ Validation failed", fg='red', bold=True)
            click.echo()
            click.secho(f"Errors ({len(errors)}):", fg='red')
            for error in errors:
                click.echo(f"  • {error}")
            click.echo()
        
        if warnings:
            click.secho(f"Warnings ({len(warnings)}):", fg='yellow')
            for warning in warnings:
                click.echo(f"  • {warning}")
            click.echo()
        
        if not errors and not warnings:
            click.secho("✅ Configuration is valid", fg='green', bold=True)
        elif not errors:
            click.secho("✅ Configuration is valid (with warnings)", fg='green', bold=True)
        
        # Exit code
        if errors:
            raise click.Abort()
        
        if strict and warnings:
            click.echo("Failing due to --strict mode")
            raise click.Abort()
    
    except yaml.YAMLError as e:
        click.secho(f"❌ YAML parse error: {e}", fg='red', err=True)
        raise click.Abort()
    
    except click.Abort:
        raise
    
    except Exception as e:
        click.secho(f"❌ Unexpected error: {e}", fg='red', err=True)
        if verbose:
            import traceback
            click.echo(traceback.format_exc())
        raise click.Abort()

# Test
runner = CliRunner()

with runner.isolated_filesystem():
    # Create test config
    config_content = """
connections:
  db:
    type: databricks
    catalog: main

pipelines:
  sales_etl:
    steps:
      - extract:
          sql: "SELECT * FROM sales"
          explanation: |
            **Purpose:** Extract raw sales data
            **Details:** Pulls all columns from sales table
            **Result:** Returns complete sales dataset
"""
    
    with open('config.yaml', 'w') as f:
        f.write(config_content)
    
    result = runner.invoke(validate, ['config.yaml', '--verbose'])
    print(result.output)
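
The exit logic in the command above (fail on errors, and on warnings only under `--strict`) can be separated from the output code and tested on its own. The following is a sketch of that idea; `exit_code_for` is a hypothetical helper, not part of the command as written.

```python
from typing import List


def exit_code_for(errors: List[str], warnings: List[str], strict: bool) -> int:
    """Return 1 on any error, or on warnings when strict; otherwise 0."""
    if errors:
        return 1
    if strict and warnings:
        return 1
    return 0


clean_run = exit_code_for([], [], strict=False)
warn_only = exit_code_for([], ["Step missing explanation"], strict=False)
warn_strict = exit_code_for([], ["Step missing explanation"], strict=True)
with_errors = exit_code_for(["Missing 'pipelines' section"], [], strict=False)

print(clean_run, warn_only, warn_strict, with_errors)
```

A command could pass the result to `ctx.exit()`, which avoids the "Aborted!" message that Click prints when `click.Abort` is raised.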

---

## Summary

### Key Concepts

1. **Click Architecture**
   - Decorator-based commands
   - Command groups for multi-command CLIs
   - Automatic help and validation

2. **Rich Error Messages**
   - Custom exception hierarchy
   - Context-aware error formatting
   - Actionable suggestions

3. **Validation Workflow**
   - Multi-stage validation
   - Explanation quality linting
   - Progress indication

4. **Testing**
   - CliRunner for isolated tests
   - Temporary filesystem
   - Exit code verification

### Click vs argparse

| Feature | argparse | Click |
|---------|----------|-------|
| Syntax | Verbose | Concise |
| Testing | Manual | Built-in runner |
| Validation | Manual | Automatic |
| Composability | Limited | Excellent |
| File handling | Manual | Built-in types |

### Next Steps

Complete the exercises to practice:
- Building multi-command CLIs
- Adding progress bars
- Creating config linters
- Implementing dry-run mode

---