# Verification and Validation for Functional Safety

**ISO 26262 Part 6 (Software) and Part 4 (System) Compliance**

Author: Milin Patel  
Institution: Hochschule Kempten - University of Applied Sciences

---

## Learning Objectives

1. Understand the V-Model development lifecycle
2. Apply verification methods per ASIL level
3. Design validation test cases for safety requirements
4. Implement test coverage metrics

## 1. V-Model Overview

### ISO 26262 Development Phases

```
System Design  ─────────────────────────────>  System Integration Test
      \                                              /
   SW Architecture  ─────────────────>  SW Integration Test
          \                                    /
      SW Unit Design  ─────────────>  SW Unit Test
              \                          /
                    Implementation
```

### Verification vs Validation

| Aspect | Verification | Validation |
|--------|--------------|------------|
| Question | Are we building it right? | Are we building the right thing? |
| Focus | Conformance to specification | Fitness for intended use |
| Methods | Reviews, analysis, testing | System testing, field trials |
| Reference | Design specifications | User/safety requirements |

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from dataclasses import dataclass, field
from typing import List, Dict, Optional, Set
from enum import Enum
import json

print("V&V Tools Loaded")

## 2. Verification Methods per ASIL

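ISO 26262-6 states, per ASIL, how strongly each verification method is recommended. The standard's method tables use three symbols: `++` (highly recommended), `+` (recommended), and `o` (no recommendation for or against use), and they cover ASIL A to D only. The matrix below is an illustrative simplification with its own four-level scale and an added QM column; consult the standard for the normative entries.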

In [None]:
class ASIL(Enum):
    QM = 0
    A = 1
    B = 2
    C = 3
    D = 4

class MethodRecommendation(Enum):
    NOT_RECOMMENDED = "--"
    RECOMMENDED = "o"
    HIGHLY_RECOMMENDED = "+"
    STRONGLY_RECOMMENDED = "++"

# ISO 26262-6 Table 9 - Verification methods for software unit testing
VERIFICATION_METHODS = {
    "Requirements-based test": {
        ASIL.QM: MethodRecommendation.RECOMMENDED,
        ASIL.A: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.B: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.C: MethodRecommendation.STRONGLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.STRONGLY_RECOMMENDED
    },
    "Interface test": {
        ASIL.QM: MethodRecommendation.RECOMMENDED,
        ASIL.A: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.B: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.C: MethodRecommendation.STRONGLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.STRONGLY_RECOMMENDED
    },
    "Fault injection test": {
        ASIL.QM: MethodRecommendation.NOT_RECOMMENDED,
        ASIL.A: MethodRecommendation.RECOMMENDED,
        ASIL.B: MethodRecommendation.RECOMMENDED,
        ASIL.C: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.HIGHLY_RECOMMENDED
    },
    "Resource usage test": {
        ASIL.QM: MethodRecommendation.RECOMMENDED,
        ASIL.A: MethodRecommendation.RECOMMENDED,
        ASIL.B: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.C: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.STRONGLY_RECOMMENDED
    },
    "Back-to-back test": {
        ASIL.QM: MethodRecommendation.NOT_RECOMMENDED,
        ASIL.A: MethodRecommendation.RECOMMENDED,
        ASIL.B: MethodRecommendation.RECOMMENDED,
        ASIL.C: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.HIGHLY_RECOMMENDED
    }
}

# Coverage metrics per ASIL
COVERAGE_REQUIREMENTS = {
    "Statement coverage": {
        ASIL.QM: MethodRecommendation.RECOMMENDED,
        ASIL.A: MethodRecommendation.STRONGLY_RECOMMENDED,
        ASIL.B: MethodRecommendation.STRONGLY_RECOMMENDED,
        ASIL.C: MethodRecommendation.STRONGLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.STRONGLY_RECOMMENDED
    },
    "Branch coverage": {
        ASIL.QM: MethodRecommendation.NOT_RECOMMENDED,
        ASIL.A: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.B: MethodRecommendation.STRONGLY_RECOMMENDED,
        ASIL.C: MethodRecommendation.STRONGLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.STRONGLY_RECOMMENDED
    },
    "MC/DC coverage": {
        ASIL.QM: MethodRecommendation.NOT_RECOMMENDED,
        ASIL.A: MethodRecommendation.NOT_RECOMMENDED,
        ASIL.B: MethodRecommendation.RECOMMENDED,
        ASIL.C: MethodRecommendation.HIGHLY_RECOMMENDED,
        ASIL.D: MethodRecommendation.STRONGLY_RECOMMENDED
    }
}

def get_recommended_methods(asil: ASIL) -> Dict[str, str]:
    """Get recommended verification methods for given ASIL."""
    methods = {}
    for method, recommendations in VERIFICATION_METHODS.items():
        if recommendations[asil] != MethodRecommendation.NOT_RECOMMENDED:
            methods[method] = recommendations[asil].value
    return methods

def display_method_matrix():
    """Display verification method matrix."""
    print("Verification Methods per ASIL (ISO 26262-6 Table 9)")
    print("=" * 70)
    print(f"{'Method':<30} {'QM':^6} {'A':^6} {'B':^6} {'C':^6} {'D':^6}")
    print("-" * 70)
    
    for method, recs in VERIFICATION_METHODS.items():
        row = f"{method:<30}"
        for asil in ASIL:
            row += f" {recs[asil].value:^6}"
        print(row)
    
    print("\nLegend: ++ strongly recommended, + highly recommended, o recommended, -- not recommended")

display_method_matrix()

## 3. Test Case Design

In [None]:
@dataclass
class SafetyRequirement:
    """Safety requirement from HARA/Safety Concept."""
    id: str
    description: str
    asil: ASIL
    source: str  # e.g., "SG-001" (Safety Goal)
    acceptance_criteria: str
    
@dataclass 
class TestCase:
    """Test case for verification."""
    id: str
    name: str
    requirement_id: str
    preconditions: List[str]
    test_steps: List[str]
    expected_results: List[str]
    test_type: str  # "unit", "integration", "system", "acceptance"
    status: str = "not_run"  # "not_run", "pass", "fail", "blocked"
    actual_results: Optional[str] = None

@dataclass
class TestSuite:
    """Collection of test cases for a component/feature."""
    name: str
    component: str
    requirements: List[SafetyRequirement] = field(default_factory=list)
    test_cases: List[TestCase] = field(default_factory=list)
    
    def add_requirement(self, req: SafetyRequirement):
        self.requirements.append(req)
    
    def add_test_case(self, tc: TestCase):
        self.test_cases.append(tc)
    
    def get_traceability_matrix(self) -> pd.DataFrame:
        """Generate requirements-to-tests traceability matrix."""
        matrix = []
        for req in self.requirements:
            tests = [tc.id for tc in self.test_cases if tc.requirement_id == req.id]
            matrix.append({
                'Requirement': req.id,
                'ASIL': req.asil.name,
                'Description': (req.description[:50] + '...') if len(req.description) > 50 else req.description,
                'Test Cases': ', '.join(tests) if tests else 'MISSING',
                'Coverage': 'Covered' if tests else 'Not Covered'
            })
        return pd.DataFrame(matrix)
    
    def get_coverage_summary(self) -> Dict:
        """Calculate requirement coverage and test execution summary."""
        total_reqs = len(self.requirements)
        # Count only requirements that belong to this suite, so a test case
        # pointing at an unknown requirement ID cannot inflate coverage.
        known_ids = {req.id for req in self.requirements}
        covered_reqs = len(known_ids & {tc.requirement_id for tc in self.test_cases})

        total_tests = len(self.test_cases)
        passed = sum(tc.status == 'pass' for tc in self.test_cases)
        failed = sum(tc.status == 'fail' for tc in self.test_cases)
        not_run = sum(tc.status == 'not_run' for tc in self.test_cases)
        blocked = sum(tc.status == 'blocked' for tc in self.test_cases)

        return {
            'total_requirements': total_reqs,
            'covered_requirements': covered_reqs,
            'requirement_coverage': covered_reqs / total_reqs * 100 if total_reqs > 0 else 0,
            'total_tests': total_tests,
            'passed': passed,
            'failed': failed,
            'not_run': not_run,
            'blocked': blocked,
            # Pass rate is relative to all test cases, including those not yet run
            'pass_rate': passed / total_tests * 100 if total_tests > 0 else 0
        }
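
A traceability matrix is only useful if gaps in both directions are visible: requirements without tests, and tests that point at a requirement the suite does not know. The following is a minimal sketch of such a check on plain ID lists; `find_traceability_gaps` is a hypothetical helper, and the IDs `TC-OD-099` and `SR-OD-999` are made up for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def find_traceability_gaps(
    requirement_ids: Iterable[str],
    test_links: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Return uncovered requirements and orphan tests.

    test_links holds (test_case_id, requirement_id) pairs.
    """
    known: Set[str] = set(requirement_ids)
    links = list(test_links)
    referenced = {req_id for _, req_id in links}
    return {
        # Requirements that no test case points at
        "uncovered_requirements": sorted(known - referenced),
        # Test cases whose requirement_id is not a known requirement
        "orphan_tests": sorted(tc_id for tc_id, req_id in links if req_id not in known),
    }


gaps = find_traceability_gaps(
    ["SR-OD-001", "SR-OD-002", "SR-OD-003"],
    [
        ("TC-OD-001", "SR-OD-001"),
        ("TC-OD-002", "SR-OD-001"),
        ("TC-OD-099", "SR-OD-999"),  # hypothetical orphan test
    ],
)
print(gaps)
```

Reporting orphan tests separately keeps a mistyped requirement ID from silently counting as coverage.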

## 4. Example: Object Detection V&V

In [None]:
# Create test suite for object detection component
detection_suite = TestSuite(
    name="Object Detection Verification",
    component="Camera-based Object Detection"
)

# Define safety requirements
safety_requirements = [
    SafetyRequirement(
        id="SR-OD-001",
        description="The system shall detect pedestrians within 100m range with recall >= 99%",
        asil=ASIL.D,
        source="SG-001",
        acceptance_criteria="Recall >= 99% on validation dataset (>10000 pedestrian instances)"
    ),
    SafetyRequirement(
        id="SR-OD-002",
        description="The system shall report detection confidence with calibration error < 5%",
        asil=ASIL.C,
        source="SG-002",
        acceptance_criteria="Expected Calibration Error (ECE) < 0.05"
    ),
    SafetyRequirement(
        id="SR-OD-003",
        description="The system shall complete detection processing within 50ms per frame",
        asil=ASIL.B,
        source="SG-003",
        acceptance_criteria="99th percentile latency < 50ms over 1000 frames"
    ),
    SafetyRequirement(
        id="SR-OD-004",
        description="The system shall indicate reduced confidence in adverse weather",
        asil=ASIL.C,
        source="SG-004",
        acceptance_criteria="Mean confidence decreases by >= 20% in rain/fog conditions"
    ),
    SafetyRequirement(
        id="SR-OD-005",
        description="The system shall detect OOD inputs with AUROC >= 0.95",
        asil=ASIL.C,
        source="SG-005",
        acceptance_criteria="AUROC >= 0.95 on OOD benchmark dataset"
    )
]

for req in safety_requirements:
    detection_suite.add_requirement(req)

print(f"Loaded {len(detection_suite.requirements)} safety requirements")
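
The acceptance criterion for SR-OD-003 (99th percentile latency under 50 ms over 1000 frames) could be checked along the following lines. This is a sketch under assumptions: `measure_latencies_ms`, `meets_p99_budget`, and the stand-in inference callable are hypothetical names, and host-side timing with `time.perf_counter` is used only for illustration; on target hardware a platform timer would normally be preferred.

```python
import time
from typing import Callable, List, Sequence, Tuple

import numpy as np


def measure_latencies_ms(infer: Callable, frames: Sequence, warmup: int = 100) -> List[float]:
    """Time one inference per frame after a warm-up phase (milliseconds)."""
    for i in range(warmup):
        infer(frames[i % len(frames)])  # warm-up runs are not timed
    latencies = []
    for frame in frames:
        start = time.perf_counter()
        infer(frame)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def meets_p99_budget(latencies_ms, budget_ms: float = 50.0) -> Tuple[bool, float]:
    """Return (pass/fail, P99) for a strict '< budget' criterion."""
    p99 = float(np.percentile(latencies_ms, 99))
    return p99 < budget_ms, p99


# Hypothetical stand-in for the real model: sums a small array.
frames = [np.ones(16) for _ in range(1000)]
latencies = measure_latencies_ms(lambda f: f.sum(), frames)
ok, p99 = meets_p99_budget(latencies)
print(f"P99 = {p99:.3f} ms, within budget: {ok}")
```

Separating measurement from evaluation lets the pass/fail rule be unit-tested with fixed numbers, independent of timing noise.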

In [None]:
# Define test cases
test_cases = [
    # Tests for SR-OD-001 (Pedestrian Detection)
    TestCase(
        id="TC-OD-001",
        name="Pedestrian recall on KITTI validation set",
        requirement_id="SR-OD-001",
        preconditions=["Model loaded", "KITTI validation set available"],
        test_steps=[
            "Load all images with pedestrian annotations",
            "Run inference on each image",
            "Match predictions to ground truth (IoU > 0.5)",
            "Calculate recall"
        ],
        expected_results=["Recall >= 99%"],
        test_type="system",
        status="pass"
    ),
    TestCase(
        id="TC-OD-002",
        name="Pedestrian detection in low light",
        requirement_id="SR-OD-001",
        preconditions=["Model loaded", "Low-light test set available"],
        test_steps=[
            "Load images with scene illuminance < 50 lux",
            "Run inference",
            "Calculate recall for pedestrians"
        ],
        expected_results=["Recall >= 95% in low light (degraded mode accepted)"],
        test_type="system",
        status="pass"
    ),
    # Tests for SR-OD-002 (Calibration)
    TestCase(
        id="TC-OD-003",
        name="Confidence calibration measurement",
        requirement_id="SR-OD-002",
        preconditions=["Model loaded", "Validation set with labels"],
        test_steps=[
            "Collect predictions with confidence scores",
            "Bin predictions by confidence (10 bins)",
            "Calculate accuracy per bin",
            "Compute ECE = sum(|accuracy - confidence| * bin_weight)"
        ],
        expected_results=["ECE < 0.05"],
        test_type="system",
        status="fail",
        actual_results="ECE = 0.08 - calibration needed"
    ),
    # Tests for SR-OD-003 (Latency)
    TestCase(
        id="TC-OD-004",
        name="Inference latency measurement",
        requirement_id="SR-OD-003",
        preconditions=["Model deployed on target hardware", "Test images loaded"],
        test_steps=[
            "Warm up model with 100 inferences",
            "Measure latency for 1000 frames",
            "Calculate 99th percentile"
        ],
        expected_results=["P99 latency < 50ms"],
        test_type="integration",
        status="pass"
    ),
    # Tests for SR-OD-004 (Weather awareness)
    TestCase(
        id="TC-OD-005",
        name="Confidence reduction in rain",
        requirement_id="SR-OD-004",
        preconditions=["Model loaded", "Rain simulation dataset"],
        test_steps=[
            "Compare confidence on clear vs rain images",
            "Calculate mean confidence difference"
        ],
        expected_results=["Confidence reduced by >= 20%"],
        test_type="system",
        status="pass"
    ),
    # Tests for SR-OD-005 (OOD Detection)
    TestCase(
        id="TC-OD-006",
        name="OOD detection AUROC measurement",
        requirement_id="SR-OD-005",
        preconditions=["OOD detector deployed", "ID and OOD test sets"],
        test_steps=[
            "Compute OOD scores for in-distribution samples",
            "Compute OOD scores for out-of-distribution samples",
            "Calculate AUROC"
        ],
        expected_results=["AUROC >= 0.95"],
        test_type="system",
        status="not_run"
    )
]

for tc in test_cases:
    detection_suite.add_test_case(tc)

print(f"Loaded {len(detection_suite.test_cases)} test cases")
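
The ECE computation listed in the steps of TC-OD-003 can be sketched as follows. This is a minimal illustration assuming equal-width bins and one confidence score plus one correct/incorrect flag per prediction; `expected_calibration_error` is a hypothetical helper, and the sample values are made up.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """ECE = sum over bins of |accuracy - mean confidence| * bin weight.

    Bins are equal-width half-open intervals (lo, hi], so a confidence of
    exactly 0.0 falls outside every bin.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue  # empty bins contribute nothing
        bin_weight = in_bin.mean()  # fraction of all samples in this bin
        accuracy = correct[in_bin].mean()
        avg_confidence = confidences[in_bin].mean()
        ece += abs(accuracy - avg_confidence) * bin_weight
    return float(ece)


# Made-up predictions: four at 0.95 confidence (three correct),
# two at 0.55 confidence (one correct).
conf = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
hit = [1, 1, 1, 0, 1, 0]
ece = expected_calibration_error(conf, hit)
print(f"ECE = {ece:.3f}")  # 0.2 * 4/6 + 0.05 * 2/6 = 0.150
```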

In [None]:
# Display traceability matrix
print("Requirements Traceability Matrix")
print("=" * 90)
traceability = detection_suite.get_traceability_matrix()
print(traceability.to_string(index=False))

In [None]:
# Display coverage summary
summary = detection_suite.get_coverage_summary()

print("\nTest Coverage Summary")
print("=" * 50)
print(f"Requirement Coverage: {summary['covered_requirements']}/{summary['total_requirements']} ({summary['requirement_coverage']:.1f}%)")
print(f"Test Execution: {summary['passed'] + summary['failed']}/{summary['total_tests']} executed")
print(f"  Passed: {summary['passed']}")
print(f"  Failed: {summary['failed']}")
print(f"  Not Run: {summary['not_run']}")
print(f"Pass Rate: {summary['pass_rate']:.1f}%")

## 5. Code Coverage Analysis

In [None]:
@dataclass
class CoverageReport:
    """Simulated code coverage report."""
    module: str
    statement_coverage: float
    branch_coverage: float
    mcdc_coverage: float
    uncovered_lines: List[int] = field(default_factory=list)

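def evaluate_coverage_compliance(report: CoverageReport, asil: ASIL) -> Dict:
    """Evaluate coverage against project targets for the given ASIL.

    ISO 26262 recommends coverage metrics per ASIL but does not fix a numeric
    threshold; the 100% targets below are a project choice. Shortfalls are
    normally closed with additional tests or a documented rationale.
    """
    results = {
        'module': report.module,
        'asil': asil.name,
        'checks': []
    }
    
    # Statement coverage (checked for every ASIL)
    stmt_required = 100  # Project target for safety-critical code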
    stmt_pass = report.statement_coverage >= stmt_required
    results['checks'].append({
        'metric': 'Statement Coverage',
        'required': f'>= {stmt_required}%',
        'actual': f'{report.statement_coverage:.1f}%',
        'status': 'PASS' if stmt_pass else 'FAIL'
    })
    
    # Branch coverage (ASIL A and above)
    if asil.value >= ASIL.A.value:
        branch_required = 100
        branch_pass = report.branch_coverage >= branch_required
        results['checks'].append({
            'metric': 'Branch Coverage',
            'required': f'>= {branch_required}%',
            'actual': f'{report.branch_coverage:.1f}%',
            'status': 'PASS' if branch_pass else 'FAIL'
        })
    
    # MC/DC coverage (ASIL C and D)
    if asil.value >= ASIL.C.value:
        mcdc_required = 100
        mcdc_pass = report.mcdc_coverage >= mcdc_required
        results['checks'].append({
            'metric': 'MC/DC Coverage',
            'required': f'>= {mcdc_required}%',
            'actual': f'{report.mcdc_coverage:.1f}%',
            'status': 'PASS' if mcdc_pass else 'FAIL'
        })
    
    results['overall'] = all(c['status'] == 'PASS' for c in results['checks'])
    return results

# Example coverage reports
coverage_reports = [
    CoverageReport("preprocessing.py", 98.5, 95.2, 88.3, [45, 67, 89]),
    CoverageReport("detection_model.py", 100.0, 100.0, 95.5, []),
    CoverageReport("postprocessing.py", 97.8, 92.1, 85.0, [23, 56, 78, 112]),
    CoverageReport("safety_monitor.py", 100.0, 100.0, 100.0, [])
]

print("Coverage Compliance Analysis (ASIL C)")
print("=" * 70)

for report in coverage_reports:
    result = evaluate_coverage_compliance(report, ASIL.C)
    status = "COMPLIANT" if result['overall'] else "NON-COMPLIANT"
    print(f"\n{report.module}: {status}")
    for check in result['checks']:
        print(f"  {check['metric']}: {check['actual']} (req: {check['required']}) - {check['status']}")
    if report.uncovered_lines:
        print(f"  Uncovered lines: {report.uncovered_lines}")
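
MC/DC asks that every condition in a decision be shown to independently affect the decision's outcome. The sketch below illustrates the idea in its unique-cause form by brute force over the truth table, which is only practical for a handful of conditions. `mcdc_independence_pairs` and the example decision are hypothetical and not part of any coverage tool.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple


def mcdc_independence_pairs(
    decision: Callable[..., bool], n_conditions: int
) -> Dict[int, List[Tuple[tuple, tuple]]]:
    """For each condition index, list pairs of input vectors that differ only
    in that condition and produce different decision outcomes."""
    pairs: Dict[int, List[Tuple[tuple, tuple]]] = {i: [] for i in range(n_conditions)}
    for vector in product([False, True], repeat=n_conditions):
        for i in range(n_conditions):
            if vector[i]:
                continue  # visit each pair once, starting from the False side
            flipped = vector[:i] + (True,) + vector[i + 1:]
            if decision(*vector) != decision(*flipped):
                pairs[i].append((vector, flipped))
    return pairs


def decision(a: bool, b: bool, c: bool) -> bool:
    # Example decision with three conditions
    return a and (b or c)


pairs = mcdc_independence_pairs(decision, 3)
for index, name in enumerate("abc"):
    print(f"{name}: {len(pairs[index])} independence pair(s)")
```

For this decision, `b` and `c` each have a single independence pair that shares the vector `(True, False, False)`. Adding the pair `(False, True, False)` / `(True, True, False)` for `a` gives four test vectors in total, the n + 1 minimum for three conditions.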

## 6. Fault Injection Testing

In [None]:
@dataclass
class FaultInjectionTest:
    """Fault injection test specification."""
    id: str
    fault_type: str
    injection_point: str
    expected_detection: str
    expected_reaction: str
    result: Optional[str] = None

def simulate_fault_injection_campaign() -> List[FaultInjectionTest]:
    """Define fault injection test campaign for perception system."""
    tests = [
        FaultInjectionTest(
            id="FI-001",
            fault_type="Sensor data loss",
            injection_point="Camera input interface",
            expected_detection="Within 100ms",
            expected_reaction="Switch to degraded mode, use LiDAR only",
            result="Detected in 45ms, failover successful"
        ),
        FaultInjectionTest(
            id="FI-002",
            fault_type="Bit flip in detection output",
            injection_point="Neural network output buffer",
            expected_detection="CRC check failure",
            expected_reaction="Discard corrupted frame, use previous",
            result="CRC detected corruption, frame discarded"
        ),
        FaultInjectionTest(
            id="FI-003",
            fault_type="Processing timeout",
            injection_point="Inference engine",
            expected_detection="Watchdog timeout at 60ms",
            expected_reaction="Signal timeout to planning module",
            result="Watchdog triggered at 60ms, timeout signaled"
        ),
        FaultInjectionTest(
            id="FI-004",
            fault_type="Memory corruption",
            injection_point="Model weights memory",
            expected_detection="Checksum verification failure",
            expected_reaction="Reload model from secure storage",
            result="Checksum mismatch detected, model reloaded"
        ),
        FaultInjectionTest(
            id="FI-005",
            fault_type="Calibration data corruption",
            injection_point="Camera calibration matrix",
            expected_detection="Plausibility check failure",
            expected_reaction="Use backup calibration, request service",
            result="Plausibility check passed (fault not severe enough)"
        )
    ]
    return tests

fi_tests = simulate_fault_injection_campaign()

print("Fault Injection Test Results")
print("=" * 80)
for test in fi_tests:
    print(f"\n{test.id}: {test.fault_type}")
    print(f"  Injection Point: {test.injection_point}")
    print(f"  Expected Detection: {test.expected_detection}")
    print(f"  Expected Reaction: {test.expected_reaction}")
    print(f"  Result: {test.result}")
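
A campaign entry such as FI-002 (bit flip caught by a CRC check) can be exercised in software before moving to the target. The sketch below assumes a frame layout of payload followed by a 4-byte CRC-32, which the campaign above does not specify; `frame_with_crc`, `crc_ok`, and `flip_bit` are hypothetical helpers built on the standard-library `zlib.crc32`.

```python
import zlib


def frame_with_crc(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload (4 bytes, big-endian)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored value."""
    payload, stored = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


frame = frame_with_crc(b"bbox:12,40,88,130;conf:0.97")  # made-up payload
print("clean frame accepted:", crc_ok(frame))

# Inject every possible single-bit fault and count the ones that slip through.
undetected = [i for i in range(len(frame) * 8) if crc_ok(flip_bit(frame, i))]
print("undetected single-bit faults:", len(undetected))
```

Sweeping all injection positions, rather than one hand-picked bit, gives a simple detection-coverage figure for the mechanism under test.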

## 7. V&V Dashboard

In [None]:
def create_vv_dashboard(suite: TestSuite, coverage_reports: List[CoverageReport]):
    """Create V&V status dashboard."""
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    
    # 1. Test Execution Status
    ax1 = axes[0, 0]
    summary = suite.get_coverage_summary()
    statuses = [summary['passed'], summary['failed'], summary['not_run']]
    labels = ['Passed', 'Failed', 'Not Run']
    colors = ['#4caf50', '#f44336', '#9e9e9e']
    ax1.pie(statuses, labels=labels, colors=colors, autopct='%1.0f%%', startangle=90)
    ax1.set_title('Test Execution Status')
    
    # 2. Requirements Coverage by ASIL
    ax2 = axes[0, 1]
    asil_coverage = {}
    for req in suite.requirements:
        asil_name = req.asil.name
        if asil_name not in asil_coverage:
            asil_coverage[asil_name] = {'total': 0, 'covered': 0}
        asil_coverage[asil_name]['total'] += 1
        if any(tc.requirement_id == req.id for tc in suite.test_cases):
            asil_coverage[asil_name]['covered'] += 1
    
    asils = list(asil_coverage.keys())
    covered = [asil_coverage[a]['covered'] for a in asils]
    total = [asil_coverage[a]['total'] for a in asils]
    
    x = np.arange(len(asils))
    width = 0.35
    ax2.bar(x - width/2, total, width, label='Total', color='#2196f3')
    ax2.bar(x + width/2, covered, width, label='Covered', color='#4caf50')
    ax2.set_xticks(x)
    ax2.set_xticklabels(asils)
    ax2.set_ylabel('Requirements')
    ax2.set_title('Requirements Coverage by ASIL')
    ax2.legend()
    
    # 3. Code Coverage by Module
    ax3 = axes[1, 0]
    modules = [r.module for r in coverage_reports]
    stmt_cov = [r.statement_coverage for r in coverage_reports]
    branch_cov = [r.branch_coverage for r in coverage_reports]
    mcdc_cov = [r.mcdc_coverage for r in coverage_reports]
    
    x = np.arange(len(modules))
    width = 0.25
    ax3.bar(x - width, stmt_cov, width, label='Statement', color='#2196f3')
    ax3.bar(x, branch_cov, width, label='Branch', color='#ff9800')
    ax3.bar(x + width, mcdc_cov, width, label='MC/DC', color='#9c27b0')
    ax3.axhline(y=100, color='r', linestyle='--', label='Target')
    ax3.set_xticks(x)
    ax3.set_xticklabels([m.replace('.py', '') for m in modules], rotation=45, ha='right')
    ax3.set_ylabel('Coverage %')
    ax3.set_ylim(0, 110)
    ax3.set_title('Code Coverage by Module')
    ax3.legend(fontsize=8)
    
    # 4. V&V Progress Timeline (simulated)
    ax4 = axes[1, 1]
    phases = ['Unit Test', 'Integration', 'System', 'Acceptance']
    planned = [100, 80, 60, 40]
    actual = [100, 75, 45, 20]
    
    x = np.arange(len(phases))
    ax4.barh(x, planned, height=0.4, label='Planned', color='#bbdefb', align='center')
    ax4.barh(x, actual, height=0.4, label='Completed', color='#1976d2', align='center')
    ax4.set_yticks(x)
    ax4.set_yticklabels(phases)
    ax4.set_xlabel('Progress %')
    ax4.set_title('V&V Phase Progress')
    ax4.legend(loc='lower right')
    ax4.set_xlim(0, 110)
    
    plt.tight_layout()
    plt.show()

create_vv_dashboard(detection_suite, coverage_reports)

## 8. Key Takeaways

### V&V Planning

1. **Start with requirements**: All tests trace to safety requirements
2. **Match methods to ASIL**: Higher ASIL = more rigorous testing
3. **Coverage targets**: Statement, branch, MC/DC as appropriate
4. **Fault injection**: Essential for ASIL C/D safety mechanisms

### Test Design Principles

- **Requirements-based**: Every requirement has test coverage
- **Boundary values**: Test at limits of valid ranges
- **Negative testing**: Verify proper handling of invalid inputs
- **Regression**: Maintain test suite for continuous verification
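
The boundary-value principle above can be illustrated with the 100 m detection range from SR-OD-001. This is a minimal sketch: the valid range of 0 to 100 m inclusive, the 0.5 m step, and the names `boundary_values` and `in_detection_range` are assumptions made for illustration.

```python
from typing import List


def boundary_values(lower: float, upper: float, step: float) -> List[float]:
    """Values just outside, on, and just inside each end of [lower, upper]."""
    return [lower - step, lower, lower + step, upper - step, upper, upper + step]


def in_detection_range(distance_m: float, max_range_m: float = 100.0) -> bool:
    """Assumed range check: 0 m to max_range_m, both ends inclusive."""
    return 0.0 <= distance_m <= max_range_m


for distance in boundary_values(0.0, 100.0, 0.5):
    print(f"{distance:6.1f} m -> in range: {in_detection_range(distance)}")
```

The two out-of-range values double as negative tests, since they check that inputs outside the valid range are rejected.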

### Documentation Requirements

| Document | Purpose |
|----------|--------|
| Test Plan | Overall V&V strategy |
| Test Specification | Detailed test cases |
| Test Report | Execution results |
| Traceability Matrix | Requirements to tests |
| Coverage Report | Code coverage metrics |

## References

1. ISO 26262:2018 - Part 4 (Product development at the system level)
2. ISO 26262:2018 - Part 6 (Product development at the software level)
3. DO-178C - Software Considerations in Airborne Systems (MC/DC reference)
4. ISTQB - Software Testing Body of Knowledge