How Async Dashboard Uses Crew and Targets

John Gavin edited this page Apr 13, 2026 · 4 revisions

Educational Deep Dive: Understanding the crew/targets workflow pattern in the randomwalk async dashboard

Note (April 2026): The browser dashboard at dashboard_comprehensive.qmd no longer uses run_simulation() or crew. It uses a vectorized inline engine with JS round-trip batching for non-blocking UI. This page documents the crew patterns used by the native R execution modes (workers > 0). See Approach 5: Vectorized Batched for the WebR architecture.


Table of Contents

  1. Overview
  2. Understanding Workers: Sync vs Async Architecture ⭐ NEW
  3. Crew Integration
  4. Targets Integration
  5. Code Walkthrough
  6. Worker Lifecycle
  7. Data Flow
  8. Educational Examples

Overview

The async dashboard demonstrates production-ready patterns for parallel processing in R using:

  • crew: Distributed worker management
  • targets: (Future) Pipeline orchestration

While the current dashboard doesn't use targets explicitly (it's a Shiny app, not a pipeline), it uses crew patterns that integrate seamlessly with targets in analysis workflows.

Key Concepts

Crew provides:

  • Worker pool management
  • Task distribution
  • Asynchronous execution
  • Result collection

Targets provides:

  • Dependency tracking
  • Caching
  • Pipeline orchestration
  • Reproducibility

Together: They enable scalable, reproducible, parallel workflows.


Understanding Workers: Sync vs Async Architecture

Key Insight: The workers parameter controls more than just parallelism—it determines the entire code architecture.

The Three Modes

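| Mode | workers | Architecture | Execution | Code Path |
|------|---------|--------------|-----------|-----------|
| Sync | 0 | Direct function calls | Sequential | run_sync() |
| Async (1 worker) | 1 | Crew task queue | Sequential* | run_async() |
| Async (parallel) | 2+ | Crew task queue | Parallel | run_async() |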

*With 1 worker, tasks execute sequentially but through the async infrastructure

Fractal Similarity: workers=0 vs workers=1

A fascinating property of the randomwalk simulation is that workers=0 and workers=1 produce structurally similar results through completely different code paths:

# workers=0: True synchronous execution
result_sync <- randomwalk::run_simulation(
  grid_size = 100,
  n_walkers = 6,
  workers = 0  # No crew controller created
)

# workers=1: Async architecture, sequential execution
result_async_1 <- randomwalk::run_simulation(
  grid_size = 100,
  n_walkers = 6,
  workers = 1  # Creates crew controller with 1 worker
)

# Both produce similar fractal patterns!
# But use completely different code paths.

Why This Matters

Educational Value: workers=1 lets you learn the async architecture without parallelism complexity:

Sync (workers=0):
┌──────────────────────────────────────────────────┐
│ for each walker:                                 │
│   result <- simulate_walker(walker_id, grid)     │
│   # Direct function call, no task queue          │
└──────────────────────────────────────────────────┘

Async (workers=1):
┌──────────────────────────────────────────────────┐
│ controller <- crew_controller_local(workers = 1) │
│ controller$start()                               │
│                                                  │
│ for each walker:                                 │
│   controller$push(command = simulate_walker(...))│
│   # Task queued, executed by single worker       │
│                                                  │
│ controller$wait(mode = 'all')                    │
│ results <- controller$collect()                  │
│ controller$terminate()                           │
└──────────────────────────────────────────────────┘

Comparison Table

| Aspect | workers=0 (Sync) | workers=1 (Async) | workers=2+ (Parallel) |
|--------|------------------|-------------------|-----------------------|
| Architecture | Direct function calls | Task queue + worker | Task queue + workers |
| Overhead | Minimal | Crew controller + serialization | Same as workers=1 |
| Execution | Sequential | Sequential (1 worker) | Parallel |
| Pattern | Simple loop | Push/wait/pop | Push/wait/pop |
| Grid Updates | Immediate (shared) | After completion (snapshot) | After completion (snapshot) |
| Use Case | Production baseline | Educational/testing | Performance |
| WebR Compatible | ✅ Yes | ❌ No (needs crew) | ❌ No (needs crew) |

When to Use Each Mode

  • workers=0: WebR/Shinylive deployments, simple testing, baseline benchmarks
  • workers=1: Learning crew patterns, debugging async logic, testing task serialization
  • workers=2+: Production parallel processing, performance-critical workflows

The "Fractal" Insight

The term "fractal similarity" refers to how both approaches produce self-similar random walk patterns—the statistical properties of the output are comparable even though:

  1. Sync mode updates the grid after each walker step
  2. Async mode (even with 1 worker) uses static grid snapshots

This demonstrates that for many applications, the async overhead is acceptable and the patterns remain valid regardless of execution model.
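The difference between a shared grid and static snapshots can be seen in a few lines of base R. This is a minimal sketch under stated assumptions: `toy_walker()`, `run_shared()`, `run_snapshot()` and the stopping rule are hypothetical stand-ins for illustration, not the randomwalk package's algorithm.

```r
# Toy rule (an assumption, not the package's rule): a walker moves at random
# on a wrap-around grid and stops as soon as one of its four neighbours is
# marked, or after max_steps moves.
toy_walker <- function(grid, start, max_steps = 500L) {
  n <- nrow(grid)
  moves <- list(c(-1L, 0L), c(1L, 0L), c(0L, -1L), c(0L, 1L))
  pos <- start
  for (step in seq_len(max_steps)) {
    neighbours <- lapply(moves, function(m) ((pos + m - 1L) %% n) + 1L)
    touching <- any(vapply(neighbours, function(p) grid[p[1], p[2]], logical(1)))
    if (touching) break
    pos <- neighbours[[sample.int(4L, 1L)]]
  }
  pos  # cell where the walker stopped
}

# Shared grid: each walker sees the cells marked by earlier walkers
run_shared <- function(grid, starts) {
  for (s in starts) {
    stop_pos <- toy_walker(grid, s)
    grid[stop_pos[1], stop_pos[2]] <- TRUE
  }
  grid
}

# Static snapshot: every walker sees only the original grid
run_snapshot <- function(grid, starts) {
  stops <- lapply(starts, function(s) toy_walker(grid, s))
  for (stop_pos in stops) grid[stop_pos[1], stop_pos[2]] <- TRUE
  grid
}

set.seed(42)
size <- 15L
grid0 <- matrix(FALSE, size, size)
grid0[8, 8] <- TRUE  # seed cell
starts <- replicate(6, sample.int(size, 2, replace = TRUE), simplify = FALSE)

set.seed(1); shared <- run_shared(grid0, starts)
set.seed(1); snapshot <- run_snapshot(grid0, starts)
c(shared = sum(shared), snapshot = sum(snapshot))  # marked cells per mode
```

In the snapshot version all walkers can only attach next to the seed cell, while in the shared version later walkers can attach to earlier ones, so the two grids usually differ in detail even when their overall statistics are close.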


Crew Integration

How Crew Works in the Dashboard

The randomwalk::run_simulation() function uses crew internally when workers > 0:

# From randomwalk package (R/simulation.R)
run_simulation <- function(grid_size,
                          n_walkers,
                          workers = 0,  # Number of crew workers
                          ...) {

  if (workers == 0) {
    # Sync mode: sequential processing
    result <- run_sync(...)
  } else {
    # Async mode: parallel processing with crew
    result <- run_async(workers = workers, ...)
  }

  return(result)
}

Crew Controller Creation

When using crew, the first step is creating a controller that manages the worker pool:

# From randomwalk package (internal)
run_async <- function(workers, grid_size, n_walkers, ...) {

  # Create crew controller
  controller <- crew::crew_controller_local(
    name = "randomwalk",
    workers = workers,
    seconds_idle = 10  # Kill idle workers after 10s
  )

  # Start the controller
  controller$start()

  # ... use controller for tasks ...

  # Clean up
  controller$terminate()
}

Key parameters:

  • workers: Number of parallel workers
  • seconds_idle: Worker timeout (conserves resources)
  • name: Identifier for this controller

Task Distribution

Crew distributes tasks across workers using push() and pop():

# Simplified example from randomwalk async implementation
run_async <- function(controller, n_walkers, grid, ...) {

  # PUSH: Send tasks to workers
  for (walker_id in seq_len(n_walkers)) {
    controller$push(
      command = simulate_walker(
        id = walker_id,
        grid_state = grid,
        ...
      ),
      # Objects the command needs on the worker
      data = list(walker_id = walker_id, grid = grid)
    )
  }

  # WAIT: Let workers process tasks
  controller$wait(mode = "all")  # Wait for all tasks

  # COLLECT: Gather every finished task (pop() returns only one per call)
  results <- controller$collect()

  return(results)
}

Push-Pop pattern:

  1. Push: Queue tasks to worker pool
  2. Wait: Block until tasks complete
  3. Pop: Retrieve results, one finished task per pop() call or all finished tasks at once with collect()
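The three steps above can be tried in isolation. The sketch below assumes the crew package is installed (it does nothing otherwise) and uses made-up task names; it illustrates that `pop()` hands back a single finished task per call while `collect()` returns all remaining finished tasks, and that results are matched by task name rather than by arrival order.

```r
if (requireNamespace("crew", quietly = TRUE)) {
  controller <- crew::crew_controller_local(workers = 2, seconds_idle = 10)
  controller$start()

  # PUSH: five small tasks, each with an explicit (hypothetical) name
  for (i in 1:5) {
    controller$push(
      name = paste0("square_", i),
      command = i^2,
      data = list(i = i)
    )
  }

  # WAIT: block until every task has finished (give up after 30 seconds)
  controller$wait(mode = "all", seconds_timeout = 30)

  # POP vs COLLECT
  first <- controller$pop()      # one-row tibble: a single finished task
  rest <- controller$collect()   # tibble with every remaining finished task
  controller$terminate()

  # The result column is a list; reorder by task name, not by completion order
  squares <- c(first$result, rest$result)
  names(squares) <- c(first$name, rest$name)
  squares <- unlist(squares[paste0("square_", 1:5)])
}
```

Naming tasks at push time is a design choice that keeps the bookkeeping independent of which worker finishes first.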

Actual Implementation

The real implementation in randomwalk is more sophisticated:

# From R/simulation.R (simplified)
run_async <- function(workers, grid_size, n_walkers,
                     neighborhood, boundary, max_steps) {

  # Initialize controller
  controller <- crew::crew_controller_local(
    name = "randomwalk",
    workers = workers,
    seconds_idle = 10
  )
  controller$start()

  # Create initial grid
  grid <- initialize_grid(grid_size)

  # Create walker starting positions
  walker_positions <- sample_starting_positions(grid, n_walkers)

  # Push walker simulation tasks
  for (i in 1:n_walkers) {
    controller$push(
      command = {
        # This code runs IN THE WORKER
        randomwalk::simulate_single_walker(
          id = walker_id,
          start_pos = start_position,
          grid_snapshot = grid_state,  # Static snapshot
          neighborhood = neighborhood,
          boundary = boundary,
          max_steps = max_steps
        )
      },
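      data = list(
        walker_id = i,
        start_position = walker_positions[[i]],
        grid_state = grid,  # Each worker gets grid copy
        # The command also reads these, so they must be shipped with the task
        neighborhood = neighborhood,
        boundary = boundary,
        max_steps = max_steps
      )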
    )
  }

  # Wait for all walkers to complete
  controller$wait(mode = "all")

  # Collect results (collect() gathers every finished task; pop() returns one per call)
  walker_results <- controller$collect()$result

  # Aggregate results
  final_grid <- aggregate_walker_paths(walker_results, grid)

  # Clean up
  controller$terminate()

  # Return combined results
  list(
    grid = final_grid,
    walkers = walker_results,
    statistics = calculate_stats(final_grid, walker_results),
    parameters = list(workers = workers, grid_size = grid_size, n_walkers = n_walkers)
  )
}

Static Grid Snapshots

Critical design decision: Each worker receives a static copy of the grid state:

# Each worker gets its own grid copy
data = list(
  walker_id = i,
  grid_state = grid  # COPY, not reference
)

Why static snapshots?

  1. Simplicity: No synchronization logic needed
  2. Safety: No race conditions
  3. Predictability: Workers don't interfere with each other
  4. Performance: No locking overhead

Trade-off: Results differ from sync mode (acceptable for this application)


Targets Integration

While the Shiny dashboard doesn't use targets directly, the same crew code works seamlessly in targets pipelines.

Targets Pipeline Example

Here's how you'd use the same run_simulation() function in a targets workflow:

# _targets.R
library(targets)
library(crew)
library(randomwalk)

# Define crew controller for targets
tar_option_set(
  controller = crew_controller_local(
    name = "simulation_pipeline",
    workers = 4
  )
)

# Define pipeline
list(
  # Grid sizes to test
  tar_target(
    grid_sizes,
    c(20, 50, 100, 200)
  ),

  # Walker counts to test
  tar_target(
    walker_counts,
    c(5, 10, 20, 50)
  ),

  # Run simulations (parallelized across grid x walker combinations)
  tar_target(
    simulations,
    run_simulation(
      grid_size = grid_sizes,
      n_walkers = walker_counts,
      workers = 0,  # Targets handles parallelism
      neighborhood = "4-hood",
      boundary = "terminate",
      max_steps = 10000
    ),
    pattern = cross(grid_sizes, walker_counts),  # Cartesian product
    iteration = "list"
  ),

  # Aggregate results
  tar_target(
    summary_stats,
    summarize_simulations(simulations)
  ),

  # Generate plots
  tar_target(
    plots,
    create_plots(summary_stats)
  )
)

How targets uses crew:

  1. Targets creates tasks for each pattern combination
  2. Crew controller distributes tasks to workers
  3. Each worker runs run_simulation() independently
  4. Results cached and aggregated by targets
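To see how many tasks step 1 creates, the Cartesian product behind `cross(grid_sizes, walker_counts)` can be reproduced in base R. This is only an illustration of the combinatorics; targets builds its branches internally and may order them differently.

```r
grid_sizes <- c(20, 50, 100, 200)
walker_counts <- c(5, 10, 20, 50)

# One row per (grid_size, n_walkers) combination, i.e. one dynamic branch each
branches <- expand.grid(grid_size = grid_sizes, n_walkers = walker_counts)

nrow(branches)   # number of run_simulation() calls the pipeline schedules
head(branches, 4)
```

With 4 grid sizes and 4 walker counts there are 16 branches, so a 4-worker controller processes them in roughly four rounds if the branches take similar time.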

Nested Parallelism

Nested parallelism is also possible: targets parallelizes across simulations, and each simulation parallelizes its own walkers:

tar_target(
  simulations,
  run_simulation(
    grid_size = grid_sizes,
    n_walkers = 100,
    workers = 2,  # Each simulation uses 2 crew workers
    ...
  ),
  pattern = map(grid_sizes),  # Targets parallelizes across grids
  iteration = "list"
)

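Result: If targets has 4 workers and each simulation uses 2 workers, you have:

  • 4 simulations running simultaneously (targets-level)
  • Each using 2 workers internally (crew-level)
  • Total: 8 inner workers doing the walker computation, plus the 4 targets-level processes that mostly wait on them, so budget for up to 12 R processes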
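The process budget for nested parallelism is simple arithmetic, sketched below. `nested_processes()` is a hypothetical helper, and it assumes every outer worker is itself an R process that blocks while its inner workers compute.

```r
# Count R processes for nested parallelism (hypothetical helper)
nested_processes <- function(outer_workers, inner_workers) {
  c(
    computing = outer_workers * inner_workers,    # inner workers running walkers
    waiting = outer_workers,                      # outer workers blocked in wait()
    total = outer_workers * (inner_workers + 1)
  )
}

budget <- nested_processes(outer_workers = 4, inner_workers = 2)
budget
```

Comparing `budget[["computing"]]` with the number of available cores is a quick check against oversubscribing the machine.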

Code Walkthrough

Note: The code walkthrough below describes the native R crew-based path (workers > 0). The browser dashboard uses a completely different architecture — see Approach 5.

Let's trace what happens when you click "Run Simulation" with workers = 2 in native R:

Step 1: Dashboard Button Click

# inst/shiny/dashboard_async/app.R (line ~340)
observeEvent(input$run_sim, {

  # Log event
  add_log("=== RUN SIMULATION CLICKED ===")
  add_log(sprintf("Parameters: grid=%d, walkers=%d, workers=%d",
                  input$grid_size, input$n_walkers, input$workers))

  # Call randomwalk package
  result <- randomwalk::run_simulation(
    grid_size = input$grid_size,      # 100
    n_walkers = input$n_walkers,      # 6
    workers = input$workers,           # 2
    neighborhood = input$neighborhood,
    boundary = input$boundary,
    max_steps = input$max_steps
  )

  # Store and display results
  sim_result(result)
  add_log("Simulation completed successfully")
})

Step 2: Package Routing

# R/simulation.R
run_simulation <- function(grid_size, n_walkers, workers = 0, ...) {

  start_time <- Sys.time()

  if (workers == 0) {
    # Route to synchronous implementation
    logger::log_info("Mode: Synchronous")
    result <- run_sync(grid_size, n_walkers, ...)

  } else {
    # Route to asynchronous implementation
    logger::log_info("Mode: Asynchronous ({workers} workers)", workers = workers)
    result <- run_async(grid_size, n_walkers, workers, ...)
  }

  # Add timing information (force seconds; a bare difftime may be in minutes or hours)
  result$statistics$elapsed_time_secs <-
    as.numeric(difftime(Sys.time(), start_time, units = "secs"))

  return(result)
}
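A note on the timing step: subtracting two `Sys.time()` values yields a `difftime` whose units R picks automatically, so converting it with a bare `as.numeric()` can silently report minutes in a field named seconds. The fixed timestamps below are made up to make the effect reproducible.

```r
start_time <- as.POSIXct("2026-01-01 00:00:00", tz = "UTC")
end_time <- as.POSIXct("2026-01-01 00:02:00", tz = "UTC")

elapsed <- end_time - start_time
units(elapsed)                                   # "mins", chosen automatically

naive_secs <- as.numeric(elapsed)                # 2: minutes, not seconds
true_secs <- as.numeric(elapsed, units = "secs") # 120

# Equivalent, with the unit fixed at the point of subtraction
also_secs <- as.numeric(difftime(end_time, start_time, units = "secs"))
```

Requesting `units = "secs"` explicitly keeps `elapsed_time_secs` correct for runs longer than a minute.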

Step 3: Crew Controller Initialization

# R/simulation.R, run_async() (Step 3)
run_async <- function(grid_size, n_walkers, workers, ...) {

  logger::log_info("Creating crew controller with {workers} workers")

  # Create and start controller
  controller <- crew::crew_controller_local(
    name = "randomwalk",
    workers = workers,    # 2 workers
    seconds_idle = 10
  )

  controller$start()
  logger::log_info("Controller started")

  # ... continue to task distribution ...
}

What happens: start() sets up the controller; crew then launches up to 2 R worker processes as tasks are pushed

Step 4: Grid Initialization

# R/simulation.R, run_async() (Step 4)
run_async <- function(...) {
  # ... controller setup ...

  logger::log_info("Initializing grid of size {grid_size}x{grid_size}")

  # Create empty grid
  grid <- matrix(FALSE, nrow = grid_size, ncol = grid_size)

  # Generate starting positions for walkers
  walker_starts <- sample_starting_positions(grid, n_walkers)
  logger::log_info("Created {n_walkers} walkers")

  # ... continue to task distribution ...
}

Step 5: Task Distribution to Workers

# R/simulation.R, run_async() (Step 5)
run_async <- function(...) {
  # ... setup complete ...

  logger::log_info("Distributing {n_walkers} walkers to {workers} workers")

  # Push each walker as a separate task
  for (i in 1:n_walkers) {  # 6 walkers
    controller$push(
      command = {
        # THIS CODE RUNS IN WORKER PROCESS
        randomwalk::simulate_single_walker(
          id = walker_id,
          start_row = start[[1]],
          start_col = start[[2]],
          grid = grid_snapshot,
          neighborhood = neighborhood,
          boundary = boundary,
          max_steps = max_steps
        )
      },
      data = list(
        walker_id = i,
        start = walker_starts[[i]],
        grid_snapshot = grid,  # Copy of grid
        neighborhood = neighborhood,
        boundary = boundary,
        max_steps = max_steps
      )
    )
  }

  logger::log_info("All tasks pushed to workers")

  # ... continue to waiting ...
}

What happens:

  • 6 walker tasks queued
  • 2 workers available, each running one task at a time
  • Worker 1 starts walker 1
  • Worker 2 starts walker 2
  • Each worker takes the next queued task as it finishes (the exact task-to-worker assignment depends on timing)
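The way a small worker pool drains a queue can be sketched without crew. The model below assumes each idle worker simply takes the next queued task; `assign_tasks()` and the task durations are hypothetical, and crew's real dispatch happens inside the package and may differ in detail.

```r
# Greedy queue model: the next task goes to whichever worker frees up first
assign_tasks <- function(durations, workers) {
  free_at <- numeric(workers)              # time at which each worker is idle
  worker_of <- integer(length(durations))
  for (i in seq_along(durations)) {
    w <- which.min(free_at)                # ties go to the lowest worker id
    worker_of[i] <- w
    free_at[w] <- free_at[w] + durations[i]
  }
  list(worker = worker_of, makespan = max(free_at))
}

# Six equally long walkers on two workers: tasks alternate between workers
equal <- assign_tasks(rep(1, 6), workers = 2)

# One slow walker: the other worker absorbs the five short ones
uneven <- assign_tasks(c(5, 1, 1, 1, 1, 1), workers = 2)
```

With equal durations the assignment alternates (worker 1 gets walkers 1, 3, 5) and the total time is 3 units; with one slow walker the second worker picks up everything else and the total time is set by the slow task.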

Step 6: Wait for Completion

# R/simulation.R, run_async() (Step 6)
run_async <- function(...) {
  # ... tasks distributed ...

  logger::log_info("Waiting for all walkers to complete")

  # Block until all 6 tasks finish
  controller$wait(mode = "all")

  logger::log_info("All walkers completed")

  # ... continue to result collection ...
}

Step 7: Collect Results

# R/simulation.R, run_async() (Step 7)
run_async <- function(...) {
  # ... tasks complete ...

  logger::log_info("Collecting results from workers")

  # Gather all completed tasks (pop() would return only one task per call)
  task_results <- controller$collect()

  # Extract walker data: the result column is a list with one element per task
  walker_results <- task_results$result

  logger::log_info("Results collected")

  # ... continue to aggregation ...
}

Step 8: Aggregate and Clean Up

# R/simulation.R, run_async() (Step 8)
run_async <- function(...) {
  # ... results collected ...

  logger::log_info("Aggregating walker paths onto grid")

  # Combine walker paths onto final grid
  final_grid <- grid  # Start with empty grid
  for (walker in walker_results) {
    for (pos in walker$path) {
      final_grid[pos[1], pos[2]] <- TRUE  # Mark visited
    }
  }

  # Calculate statistics
  stats <- list(
    total_steps = sum(sapply(walker_results, function(w) w$steps)),
    black_pixels = sum(final_grid),
    black_percentage = 100 * sum(final_grid) / length(final_grid),
    grid_size = grid_size,
    total_walkers = n_walkers
  )

  logger::log_info("=== SIMULATION COMPLETE ===")
  logger::log_info("Total steps: {stats$total_steps}")

  # Terminate controller
  controller$terminate()

  # Return results
  list(
    grid = final_grid,
    walkers = walker_results,
    statistics = stats,
    parameters = list(workers = workers, ...)
  )
}
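The aggregation in Step 8 can be tried on a small hand-made input. The sketch assumes each walker result is a list with `steps` and a `path` of `c(row, col)` pairs, which is the shape the loop above implies; the values are made up, and the two-column matrix index is a vectorized alternative to the nested loops.

```r
# Hypothetical walker results on a 4x4 grid
walker_results <- list(
  list(steps = 3L, path = list(c(1L, 1L), c(1L, 2L), c(2L, 2L))),
  list(steps = 2L, path = list(c(2L, 2L), c(3L, 2L)))
)
grid <- matrix(FALSE, nrow = 4, ncol = 4)

# Stack every visited position into one two-column (row, col) matrix
all_pos <- do.call(
  rbind,
  lapply(walker_results, function(w) do.call(rbind, w$path))
)

# Indexing with a two-column matrix marks all visited cells in one step
final_grid <- grid
final_grid[all_pos] <- TRUE

stats <- list(
  total_steps = sum(vapply(walker_results, function(w) w$steps, integer(1))),
  black_pixels = sum(final_grid),
  black_percentage = 100 * sum(final_grid) / length(final_grid)
)
```

Cell (2, 2) is visited by both walkers but is counted once, so `black_pixels` reflects coverage rather than the number of steps.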

Step 9: Return to Dashboard

# Back in dashboard (inst/shiny/dashboard_async/app.R)
observeEvent(input$run_sim, {
  # ... simulation call ...

  result <- randomwalk::run_simulation(...)  # Returns from Step 8

  # Store result
  sim_result(result)

  # Update status
  status_msg(sprintf(
    "Simulation complete! %d steps, %.1f%% coverage in %.2fs",
    result$statistics$total_steps,
    result$statistics$black_percentage,
    result$statistics$elapsed_time_secs
  ))

  # Display in tabs (plots auto-update via reactivity)
})

Worker Lifecycle

Timeline Visualization

Time →

Main Process (Dashboard):
├─ [Click "Run"] ──────────────────────────────── [Display Results]
│                                                  ↑
│  Crew Controller:                              │
│  ├─ Start ────┬─ Push Tasks ─── Wait ─── Pop ─┴─ Terminate
│               │                   ↑
│               │                   │
│               ↓                   │
│  Worker 1:    Idle ─ Task1 ─ Task3 ─ Task5 ─ Idle
│  Worker 2:    Idle ─ Task2 ─ Task4 ─ Task6 ─ Idle

Detailed Worker State Machine

Worker States:

[STARTED] ──→ [IDLE] ──→ [BUSY] ──→ [IDLE] ──→ [TERMINATED]
               ↑          │         ↑
               └──────────┴─────────┘
                  (task loop)

State transitions:

  1. STARTED: Worker process spawned
  2. IDLE: Waiting for tasks from controller
  3. BUSY: Executing simulate_single_walker()
  4. IDLE: Task complete, ready for more
  5. TERMINATED: Controller shutdown, worker exits
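The state machine above can be written down as a transition table and used to check a recorded worker history. The state names come from the diagram; `transitions` and `is_valid_history()` are hypothetical helpers for illustration and are not part of crew.

```r
# Allowed next states for each worker state (taken from the diagram above)
transitions <- list(
  STARTED = "IDLE",
  IDLE = c("BUSY", "TERMINATED"),
  BUSY = "IDLE",
  TERMINATED = character(0)
)

# TRUE if every consecutive pair of states is an allowed transition
is_valid_history <- function(states) {
  if (length(states) < 2) return(TRUE)
  from <- head(states, -1)
  to <- tail(states, -1)
  all(mapply(function(f, t) t %in% transitions[[f]], from, to))
}

ok <- is_valid_history(c("STARTED", "IDLE", "BUSY", "IDLE", "BUSY", "IDLE", "TERMINATED"))
bad <- is_valid_history(c("STARTED", "BUSY"))  # skips IDLE
```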

Data Flow

Sync vs Async Data Flow

Sync Mode (workers = 0):

Grid (shared) ──→ Walker 1 ──→ Grid updated
              ──→ Walker 2 ──→ Grid updated
              ──→ Walker 3 ──→ Grid updated
              ...

Sequential: Each walker sees previous walkers' changes

Async Mode (workers > 0):

Grid (original) ──→ Worker 1: [Walker 1] ──→ Paths collected
                │             [Walker 3]
                │             [Walker 5]
                │
                └──→ Worker 2: [Walker 2] ──→ Paths collected
                              [Walker 4]
                              [Walker 6]

All paths ──→ Aggregate ──→ Final Grid

Parallel: Workers use static grid snapshot

Memory Layout

Sync mode:

Main Process Memory:
├─ Grid (1 copy, updated)
└─ Walkers (sequential)

Async mode:

Main Process Memory:
├─ Grid (original)
└─ Controller

Worker 1 Memory:
├─ Grid (snapshot copy)
└─ Walkers 1, 3, 5 (active)

Worker 2 Memory:
├─ Grid (snapshot copy)
└─ Walkers 2, 4, 6 (active)

Memory trade-off: More copies (workers × grid size) for parallelism
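A rough size for this trade-off can be measured directly. The sketch assumes one grid in the main process plus one snapshot per worker at any moment and ignores serialization buffers and walker paths; `estimate_grid_memory()` is a hypothetical helper.

```r
grid <- matrix(FALSE, nrow = 100, ncol = 100)

# R stores logical values in 4 bytes each, plus a small object header
bytes_per_grid <- as.numeric(object.size(grid))

# One original in the main process plus one snapshot per worker
estimate_grid_memory <- function(bytes_per_grid, workers) {
  bytes_per_grid * (workers + 1)
}

total_bytes <- estimate_grid_memory(bytes_per_grid, workers = 2)
total_bytes / 1024  # KiB held in grid copies
```

For a 100 x 100 grid the copies are small, but the cost grows with the square of `grid_size`, so large grids with many workers deserve a quick estimate like this first.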


Educational Examples

Example 1: Basic Crew Usage

Minimal example showing crew pattern:

library(crew)

# Create controller
controller <- crew_controller_local(workers = 2)
controller$start()

# Push 10 tasks
for (i in 1:10) {
  controller$push(
    command = i^2,  # Square each number
    data = list(i = i)
  )
}

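# Wait and collect
controller$wait(mode = "all")
results <- controller$collect()$result  # collect() gathers all finished tasks; pop() returns one

# Squares of 1..10, in completion order (not necessarily push order)
print(unlist(results))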

# Clean up
controller$terminate()

Example 2: Crew with Custom Function

Using crew with a custom computation:

# Define expensive function
expensive_computation <- function(x) {
  Sys.sleep(1)  # Simulate work
  return(x * 2)
}

# Create controller
controller <- crew_controller_local(workers = 4)
controller$start()

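# Push tasks
inputs <- 1:20
for (x in inputs) {
  controller$push(
    command = expensive_computation(value),
    data = list(value = x),
    # Workers are separate R processes, so ship the function with the task
    globals = list(expensive_computation = expensive_computation)
  )
}

# With 4 workers, 20 one-second tasks take roughly 5 seconds rather than 20
# (plus worker startup and serialization overhead)
system.time({
  controller$wait(mode = "all")
  results <- controller$collect()$result
})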

controller$terminate()

Example 3: Monitoring Progress

Crew allows checking progress without blocking:

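# Stand-in for real work
slow_function <- function(x) {
  Sys.sleep(0.1)
  x
}

controller <- crew_controller_local(workers = 2)
controller$start()

# Push long-running tasks
for (i in 1:100) {
  controller$push(
    command = slow_function(x),
    data = list(x = i),
    globals = list(slow_function = slow_function)  # Ship the function to the worker
  )
}

# Monitor progress
n_done <- 0
while (!controller$empty()) {
  # collect() returns NULL when nothing new has finished, so it never blocks
  completed <- controller$collect()
  if (!is.null(completed)) {
    n_done <- n_done + nrow(completed)
    message(sprintf("Completed: %d of 100 tasks", n_done))
  }
  Sys.sleep(0.5)  # Check every 500ms
}

controller$terminate()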

Example 4: Error Handling

Crew captures errors from workers:

risky_function <- function(x) {
  if (x %% 3 == 0) stop("Divisible by 3!")
  return(x^2)
}

controller <- crew_controller_local(workers = 2)
controller$start()

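for (i in 1:10) {
  controller$push(
    command = risky_function(value),
    data = list(value = i),
    # Workers are separate R processes, so ship the function with the task
    globals = list(risky_function = risky_function)
  )
}

controller$wait(mode = "all")
results <- controller$collect()  # One row per task

# Check for errors: the error column is NA for tasks that succeeded
failed <- !is.na(results$error)
successes <- results$result[!failed]

message(sprintf("Successes: %d, Errors: %d",
                length(successes), sum(failed)))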

controller$terminate()


References

  • Repository: https://github.com/JohnGavin/randomwalk
  • Key files:
    • R/simulation.R - All simulation code (sync, async, chunked modes)
    • R/walker.R - Walker creation and movement
    • R/walker_dynamic.R - Dynamic broadcasting walker (deprecated)
    • R/broadcasting.R - Pub/sub socket functions (deprecated)
    • vignettes/articles/dashboard_comprehensive.qmd - Main Shinylive dashboard

Note: All simulation modes (run_simulation(), async variants, chunked mode) are in the single R/simulation.R file - there are no separate run_async.R or run_sync.R files.