# VedaRT Deterministic Debugging

This notebook demonstrates VedaRT's deterministic execution mode for debugging parallel code.

## Key Features

- 🎯 **Reproducible execution** - Same seed = same results every time
- 🐛 **Debug parallel bugs** - Eliminate race conditions during debugging
- 📝 **Execution tracing** - Record and replay execution timelines
- ✅ **Testing support** - Write reliable tests for parallel code

---

In [None]:
# Install VedaRT if needed
# !pip install vedart

import vedart as veda
import random
import time
from typing import List

## Example 1: The Problem - Non-Deterministic Parallel Code

Without deterministic mode, parallel code can behave differently on each run due to:
- Thread scheduling variations
- Race conditions
- Random number generation
- System load

In [None]:
def buggy_computation(x: int) -> int:
    """Simulates a computation with hidden randomness."""
    # Simulate variable execution time
    time.sleep(random.uniform(0.001, 0.01))
    
    # Random perturbation (simulates a bug)
    noise = random.randint(-2, 2)
    
    return x * 2 + noise

# Run multiple times - notice different results
print("Without deterministic mode:")
for run in range(3):
    result = veda.par_iter(range(10)).map(buggy_computation).collect()
    print(f"  Run {run+1}: {result[:5]}...")  # First 5 elements

**Notice**: Results vary between runs! A failure you see once may not reappear when you try to inspect it, which makes debugging nearly impossible.

## Example 2: The Solution - Deterministic Mode

With `veda.deterministic(seed=...)`, execution is reproducible:

In [None]:
print("\nWith deterministic mode (seed=42):")
for run in range(3):
    with veda.deterministic(seed=42):
        result = veda.par_iter(range(10)).map(buggy_computation).collect()
    print(f"  Run {run+1}: {result[:5]}...")  # First 5 elements

**Notice**: Identical results on every run! Now we can debug reliably.
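
To build intuition for why a seed can make parallel results repeatable, here is a minimal standard-library sketch. It gives every task its own random generator derived from the run seed and the task index, so the noise a task draws no longer depends on which thread runs first. This illustrates the general idea only and is not a description of how VedaRT implements `veda.deterministic`; the names `noisy_double` and `run_with_seed` and the seed-mixing constant are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import List


def noisy_double(x: int, rng: random.Random) -> int:
    """Same shape as buggy_computation, but noise comes from a task-local RNG."""
    return x * 2 + rng.randint(-2, 2)


def run_with_seed(seed: int, n: int = 10) -> List[int]:
    """Run n tasks in a thread pool; each task's RNG depends only on (seed, index)."""

    def task(i: int) -> int:
        # Hypothetical mixing scheme: any fixed function of (seed, i) would do.
        rng = random.Random(seed * 1_000_003 + i)
        return noisy_double(i, rng)

    with ThreadPoolExecutor(max_workers=4) as pool:
        # Executor.map returns results in input order, regardless of finish order.
        return list(pool.map(task, range(n)))


if __name__ == "__main__":
    for run in range(3):
        print(f"  Run {run+1}: {run_with_seed(42)[:5]}...")
```

Because no task touches the shared global `random` state, thread scheduling cannot change which numbers a given task sees.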

## Example 3: Finding Bugs with Deterministic Mode

Let's use determinism to debug a subtle parallel bug:

In [None]:
# Shared state (bug!)
counter = {"value": 0}

def increment_counter(x: int) -> int:
    """Has a race condition on shared state."""
    # Read-modify-write race condition
    current = counter["value"]
    time.sleep(0.001)  # Simulate work
    counter["value"] = current + 1
    return x * 2

# Without deterministic mode
print("Non-deterministic runs:")
for run in range(3):
    counter["value"] = 0
    result = veda.par_iter(range(10)).map(increment_counter).collect()
    print(f"  Run {run+1}: counter = {counter['value']} (expected 10)")

print("\nDeterministic runs (seed=123):")
for run in range(3):
    counter["value"] = 0
    with veda.deterministic(seed=123):
        result = veda.par_iter(range(10)).map(increment_counter).collect()
    print(f"  Run {run+1}: counter = {counter['value']} (expected 10, bug is now reproducible!)")

**Key Insight**: The race condition is now reproducible! We can:
1. Set breakpoints
2. Add logging
3. Inspect state reliably
4. Fix the bug (use proper synchronization)
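
As a sketch of step 4, the read-modify-write can be guarded with a lock. The example below uses only the standard library and assumes the tasks run as threads inside one process (if tasks ran in separate processes, the dictionary would not be shared and a `threading.Lock` would not help). `increment_counter_safe` and `run_safe` are hypothetical names, and `ThreadPoolExecutor` stands in for the parallel runtime.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

counter = {"value": 0}
counter_lock = threading.Lock()


def increment_counter_safe(x: int) -> int:
    """Increment the shared counter under a lock, then return x * 2."""
    with counter_lock:
        # The whole read-modify-write happens while holding the lock,
        # so no other thread can observe the stale value.
        current = counter["value"]
        time.sleep(0.001)  # Simulate work (kept inside the lock to mirror the bug)
        counter["value"] = current + 1
    return x * 2


def run_safe(n: int = 10) -> int:
    """Reset the counter, run n increments in a thread pool, return the final count."""
    counter["value"] = 0
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(increment_counter_safe, range(n)))
    return counter["value"]


if __name__ == "__main__":
    print(f"counter = {run_safe(10)} (expected 10)")
```

In real code, keep the critical section as small as possible; holding a lock across slow work serializes the tasks.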

## Example 4: Execution Tracing

Record an execution timeline for later analysis:

In [None]:
import tempfile
import os

def traced_computation(x: int) -> int:
    """A computation we want to trace."""
    time.sleep(random.uniform(0.01, 0.05))
    return x ** 2

# Create trace file
trace_file = os.path.join(tempfile.gettempdir(), "vedart_trace.json")

print(f"Recording trace to: {trace_file}")

# Record execution
with veda.deterministic(seed=42, trace_file=trace_file):
    result = veda.par_iter(range(20)).map(traced_computation).collect()

print(f"\nFirst 10 results: {result[:10]}")
print(f"Trace saved to: {trace_file}")
print(f"Trace size: {os.path.getsize(trace_file) if os.path.exists(trace_file) else 'N/A'} bytes")
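
The trace schema is not described in this notebook, so the following sketch only assumes that the file is valid JSON (as the `.json` extension suggests) and reports its top-level shape. `summarize_trace` is a hypothetical helper, not part of VedaRT.

```python
import json
import os
from typing import Any, Dict


def summarize_trace(path: str) -> Dict[str, Any]:
    """Load a JSON file and describe its size and top-level structure."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    summary: Dict[str, Any] = {
        "bytes": os.path.getsize(path),
        "top_level_type": type(data).__name__,
    }
    if isinstance(data, dict):
        summary["keys"] = sorted(data.keys())  # JSON object keys are always strings
    elif isinstance(data, list):
        summary["entries"] = len(data)
    return summary
```

Calling `summarize_trace(trace_file)` after the recording step is a quick way to see what the trace contains before writing any schema-specific analysis.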

## Example 5: Comparing Different Seeds

Different seeds produce different (but reproducible) executions:

In [None]:
def random_computation(x: int) -> int:
    return x + random.randint(0, 10)

seeds = [42, 123, 999]

print("Different seeds produce different (but reproducible) results:\n")

for seed in seeds:
    # Run twice with same seed
    results = []
    for _ in range(2):
        with veda.deterministic(seed=seed):
            result = veda.par_iter(range(5)).map(random_computation).collect()
        results.append(result)
    
    # Verify reproducibility
    assert results[0] == results[1], f"Seed {seed} not reproducible!"
    print(f"Seed {seed:3d}: {results[0]} (reproducible: ✓)")

## Example 6: Testing Parallel Code

Write reliable tests for parallel algorithms:

In [None]:
def parallel_sort_test():
    """Test a parallel sorting algorithm."""
    
    def parallel_partition(items: List[int]) -> List[int]:
        """Partition around a pivot and sort each side in parallel."""
        if len(items) <= 1:
            return items
        
        pivot = items[len(items) // 2]
        
        # Sort each side of the pivot in its own task; a single partition
        # pass alone would leave both sides unsorted and fail the check below.
        with veda.scope() as s:
            left_future = s.spawn(lambda: sorted(x for x in items if x < pivot))
            right_future = s.spawn(lambda: sorted(x for x in items if x > pivot))
            results = s.wait_all()
        
        left, right = results
        middle = [x for x in items if x == pivot]
        
        return left + middle + right
    
    # Test with deterministic mode
    # Fixed-seed input so a failing case can be replayed
    rng = random.Random(42)
    test_data = [rng.randint(0, 100) for _ in range(20)]
    
    with veda.deterministic(seed=42):
        result = parallel_partition(test_data)
    
    expected = sorted(test_data)
    
    if result == expected:
        print("✓ Parallel sort test PASSED")
        print(f"  Input:  {test_data[:10]}...")
        print(f"  Output: {result[:10]}...")
    else:
        print("✗ Parallel sort test FAILED")
        print(f"  Expected: {expected[:10]}...")
        print(f"  Got:      {result[:10]}...")

parallel_sort_test()

## Example 7: Performance Impact

Measure the overhead of deterministic mode:

In [None]:
import time

def benchmark_task(x: int) -> int:
    """Simple computation for benchmarking."""
    return sum(i * i for i in range(x))

task_range = range(100, 200)  # 100 tasks of increasing size

# Benchmark normal mode
start = time.perf_counter()
result_normal = veda.par_iter(task_range).map(benchmark_task).sum()
time_normal = time.perf_counter() - start

# Benchmark deterministic mode
start = time.perf_counter()
with veda.deterministic(seed=42):
    result_det = veda.par_iter(task_range).map(benchmark_task).sum()
time_det = time.perf_counter() - start

overhead_pct = ((time_det - time_normal) / time_normal) * 100

print("Performance Comparison:")
print(f"  Normal mode:        {time_normal:.3f}s")
print(f"  Deterministic mode: {time_det:.3f}s")
print(f"  Overhead:           {overhead_pct:.1f}%")
print(f"\n  Results match: {result_normal == result_det}")
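
A single timed run is noisy, so the overhead figure above can swing widely between executions. A minimal sketch of a steadier measurement, using only the standard library, is to repeat each variant and compare medians. `median_runtime` and `overhead_percent` are hypothetical helpers.

```python
import statistics
import time
from typing import Callable, List


def median_runtime(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Return the median wall-clock time of fn() in seconds over several runs."""
    for _ in range(warmup):
        fn()  # Warm-up runs are not timed
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def overhead_percent(baseline: float, candidate: float) -> float:
    """Relative slowdown of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100
```

Each variant from the cell above can be wrapped in a zero-argument function (the deterministic one entering `veda.deterministic(seed=42)` inside the function body) and passed to `median_runtime`.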

## Example 8: Nested Parallel Operations

Determinism works with nested parallel operations:

In [None]:
def outer_task(x: int) -> int:
    """Task that spawns inner parallel work."""
    # Inner parallel computation
    inner_result = veda.par_iter(range(x)).map(lambda i: i * 2).sum()
    return inner_result + random.randint(0, 5)

print("Nested parallel operations with determinism:\n")

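runs = []
for run in range(3):
    with veda.deterministic(seed=999):
        result = veda.par_iter(range(5)).map(outer_task).collect()
    runs.append(result)
    print(f"  Run {run+1}: {result}")

# Check the claim instead of assuming it
assert all(r == runs[0] for r in runs), "Nested runs diverged!"
print("\n✓ All runs produce identical results!")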

## Best Practices

### ✅ When to Use Deterministic Mode

1. **Debugging parallel bugs** - Make race conditions reproducible
2. **Writing tests** - Ensure test reliability
3. **Performance analysis** - Compare different implementations fairly
4. **Benchmarking** - Reduce variance between runs

### ⚠️ Limitations

1. **Not bitwise deterministic** - Floating-point operations may still vary
2. **External I/O** - Network calls, file timestamps are non-deterministic
3. **System state** - External processes, hardware variations can affect results
4. **Performance overhead** - Adds 10-25% overhead (acceptable for debugging)
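
To illustrate the first limitation: floating-point addition is not associative, so any reduction that combines partial results in a different order can change the last bits of the answer. The sketch below shows the effect with plain Python floats and is independent of VedaRT; `naive_sum` is a hypothetical helper that adds strictly left to right.

```python
import math
from functools import reduce
from operator import add
from typing import List


def naive_sum(values: List[float]) -> float:
    """Add left to right with plain float addition (order-sensitive)."""
    return reduce(add, values, 0.0)


values = [0.1, 0.2, 0.3]
forward = naive_sum(values)                   # (0.1 + 0.2) + 0.3
backward = naive_sum(list(reversed(values)))  # (0.3 + 0.2) + 0.1

print(forward == backward)  # False: the two orders round differently
# math.fsum returns the correctly rounded sum, so it does not depend on order
print(math.fsum(values) == math.fsum(reversed(values)))  # True
```

When a test compares floating-point output from parallel code, `math.isclose` with an explicit tolerance is usually safer than `==`.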

### 💡 Tips

- Use different seeds to explore different execution paths
- Enable tracing only when needed (performance impact)
- Document the seed used in bug reports
- Disable in production (use for dev/test only)

## Summary

VedaRT's deterministic mode transforms parallel debugging:

| Feature | Without Determinism | With Determinism |
|---------|---------------------|------------------|
| **Reproducibility** | ❌ Varies each run | ✅ Same results |
| **Debugging** | 😰 Heisenbug nightmare | 😊 Reliable breakpoints |
| **Testing** | ⚠️ Flaky tests | ✅ Stable tests |
| **Relative speed** | 100% | ~80-90% (10-25% overhead, acceptable for debugging) |

---

## Next Steps

- Try deterministic mode with your own parallel code
- Write tests using `veda.deterministic()`
- Experiment with execution tracing
- Read the [determinism guarantees documentation](../../docs/guarantees.md)

Happy debugging! 🐛🔍