How the Async Dashboard Uses Crew and Targets

Educational Deep Dive: Understanding the crew/targets workflow pattern in the randomwalk async dashboard

Related Pages:

Async Dashboard Approaches - All execution modes including vectorized WebR

Note (April 2026): The browser dashboard at dashboard_comprehensive.qmd no longer uses run_simulation() or crew. It uses a vectorized inline engine with JS round-trip batching for non-blocking UI. This page documents the crew patterns used by the native R execution modes (workers > 0). See Approach 5: Vectorized Batched for the WebR architecture.

Overview

The async dashboard demonstrates production-ready patterns for parallel processing in R using:

crew: Distributed worker management
targets: (Future) Pipeline orchestration

While the current dashboard doesn't use targets explicitly (it's a Shiny app, not a pipeline), it uses crew patterns that integrate seamlessly with targets in analysis workflows.

Key Concepts

Crew provides:

Worker pool management
Task distribution
Asynchronous execution
Result collection

Targets provides:

Dependency tracking
Caching
Pipeline orchestration
Reproducibility

Together: They enable scalable, reproducible, parallel workflows.

Understanding Workers: Sync vs Async Architecture

Key Insight: The workers parameter controls more than just parallelism—it determines the entire code architecture.

The Three Modes

Mode	`workers`	Architecture	Execution	Code Path
Sync	`0`	Direct function calls	Sequential	`run_sync()`
Async (1 worker)	`1`	Crew task queue	Sequential*	`run_async()`
Async (parallel)	`2+`	Crew task queue	Parallel	`run_async()`

*With 1 worker, tasks execute sequentially but through the async infrastructure

Fractal Similarity: workers=0 vs workers=1

In the randomwalk simulation, workers=0 and workers=1 take completely different code paths yet produce structurally similar results:

# workers=0: True synchronous execution
result_sync <- randomwalk::run_simulation(
  grid_size = 100,
  n_walkers = 6,
  workers = 0  # No crew controller created
)

# workers=1: Async architecture, sequential execution
result_async_1 <- randomwalk::run_simulation(
  grid_size = 100,
  n_walkers = 6,
  workers = 1  # Creates crew controller with 1 worker
)

# Both produce similar fractal patterns!
# But use completely different code paths.

Why This Matters

Educational Value: workers=1 lets you learn the async architecture without parallelism complexity:

Sync (workers=0):
┌──────────────────────────────────────────────────┐
│ for each walker:                                 │
│   result <- simulate_walker(walker_id, grid)     │
│   # Direct function call, no task queue          │
└──────────────────────────────────────────────────┘

Async (workers=1):
┌──────────────────────────────────────────────────┐
│ controller <- crew_controller_local(workers = 1) │
│ controller$start()                               │
│                                                  │
│ for each walker:                                 │
│   controller$push(command = simulate_walker(...))│
│   # Task queued, executed by single worker       │
│                                                  │
│ controller$wait(mode = 'all')                    │
│ results <- controller$collect()                  │
│ controller$terminate()                           │
└──────────────────────────────────────────────────┘

Comparison Table

Aspect	workers=0 (Sync)	workers=1 (Async)	workers=2+ (Parallel)
Architecture	Direct function calls	Task queue + worker	Task queue + workers
Overhead	Minimal	Crew controller + serialization	Same as workers=1
Execution	Sequential	Sequential (1 worker)	Parallel
Pattern	Simple loop	Push/wait/pop	Push/wait/pop
Grid Updates	Immediate (shared)	After completion (snapshot)	After completion (snapshot)
Use Case	Production baseline	Educational/testing	Performance
WebR Compatible	✅ Yes	❌ No (needs crew)	❌ No (needs crew)

When to Use Each Mode

workers=0: WebR/Shinylive deployments, simple testing, baseline benchmarks
workers=1: Learning crew patterns, debugging async logic, testing task serialization
workers=2+: Production parallel processing, performance-critical workflows
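The guidance above can be turned into a small guard. This is only a sketch: choose_workers() is a hypothetical helper, not part of randomwalk, and it assumes that whether crew is installed is the only thing deciding if the async modes are usable.

```r
# Hypothetical helper: fall back to sync mode (workers = 0) when crew is
# unavailable, as in WebR/Shinylive deployments.
choose_workers <- function(requested,
                           crew_available = requireNamespace("crew", quietly = TRUE)) {
  if (!crew_available) {
    return(0L)
  }
  max(0L, as.integer(requested))
}

choose_workers(4, crew_available = FALSE)  # 0: sync mode
choose_workers(2, crew_available = TRUE)   # 2: parallel mode
```

Passing crew_available explicitly keeps the helper easy to test without installing or removing packages.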

The "Fractal" Insight

The term "fractal similarity" refers to how both approaches produce self-similar random walk patterns—the statistical properties of the output are comparable even though:

Sync mode updates the grid after each walker step
Async mode (even with 1 worker) uses static grid snapshots

This demonstrates that for many applications, the async overhead is acceptable and the patterns remain valid regardless of execution model.
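A toy sketch can make the shared-grid versus snapshot distinction concrete. It is not the randomwalk algorithm: the one-dimensional grid and the settle() walker are assumptions chosen so the effect is easy to trace by hand.

```r
# Toy 1-D "grid": TRUE marks an occupied cell. Cell 10 starts occupied.
initial_grid <- c(rep(FALSE, 9), TRUE)

# Hypothetical walker: moves right from `start` and settles in the last free
# cell before the first occupied one. Returns the index where it settles.
settle <- function(grid, start) {
  pos <- start
  while (pos < length(grid) && !grid[pos + 1]) {
    pos <- pos + 1
  }
  pos
}

starts <- c(1, 2, 3)

# Sync-style: each walker sees the cells settled by earlier walkers.
sync_grid <- initial_grid
sync_positions <- numeric(0)
for (s in starts) {
  p <- settle(sync_grid, s)
  sync_grid[p] <- TRUE
  sync_positions <- c(sync_positions, p)
}

# Snapshot-style: every walker sees only the initial grid.
snapshot_positions <- vapply(
  starts,
  function(s) settle(initial_grid, s),
  numeric(1)
)
snapshot_grid <- initial_grid
snapshot_grid[snapshot_positions] <- TRUE

sync_positions      # 9 8 7: walkers stack up behind each other
snapshot_positions  # 9 9 9: walkers cannot see each other
```

In this toy the two modes give different grids for the same walkers. How much that matters in practice depends on how strongly walkers interact, which is a property of the real simulation rather than of this sketch.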

Crew Integration

How Crew Works in the Dashboard

The randomwalk::run_simulation() function uses crew internally when workers > 0:

# From randomwalk package (R/simulation.R, simplified)
run_simulation <- function(grid_size,
                           n_walkers,
                           workers = 0,  # Number of crew workers
                           ...) {

  if (workers == 0) {
    # Sync mode: sequential processing, no crew controller
    result <- run_sync(grid_size = grid_size, n_walkers = n_walkers, ...)
  } else {
    # Async mode: tasks run through a crew controller
    result <- run_async(workers = workers, grid_size = grid_size,
                        n_walkers = n_walkers, ...)
  }

  return(result)
}

Crew Controller Creation

When using crew, the first step is creating a controller that manages the worker pool:

# From randomwalk package (internal)
run_async <- function(workers, grid_size, n_walkers, ...) {

  # Create crew controller
  controller <- crew::crew_controller_local(
    name = "randomwalk",
    workers = workers,
    seconds_idle = 10  # Kill idle workers after 10s
  )

  # Start the controller
  controller$start()

  # ... use controller for tasks ...

  # Clean up
  controller$terminate()
}

Key parameters:

workers: Number of parallel workers
seconds_idle: Worker timeout (conserves resources)
name: Identifier for this controller

Task Distribution

Crew distributes tasks across workers: push() submits a task, and pop() (one finished task at a time) or collect() (all finished tasks) retrieves results:

# Simplified example from the randomwalk async implementation
run_async <- function(controller, n_walkers, grid) {

  # PUSH: Send tasks to workers
  for (walker_id in seq_len(n_walkers)) {
    controller$push(
      command = simulate_walker(id = walker_id, grid_state = grid),
      # Objects named in `command` must be supplied in `data`;
      # the worker cannot see the caller's variables
      data = list(walker_id = walker_id, grid = grid)
    )
  }

  # WAIT: Let workers process tasks
  controller$wait(mode = "all")  # Wait for all tasks

  # COLLECT: Retrieve every finished task (pop() returns only one at a time)
  results <- controller$collect()

  return(results)
}

Push-Pop pattern:

Push: Queue tasks to worker pool
Wait: Block until tasks complete
Pop: Retrieve results (pop() returns one finished task per call; collect() returns all finished tasks at once)

Actual Implementation

The real implementation in randomwalk is more sophisticated:

# From R/simulation.R (simplified)
run_async <- function(workers, grid_size, n_walkers,
                     neighborhood, boundary, max_steps) {

  # Initialize controller
  controller <- crew::crew_controller_local(
    name = "randomwalk",
    workers = workers,
    seconds_idle = 10
  )
  controller$start()

  # Create initial grid
  grid <- initialize_grid(grid_size)

  # Create walker starting positions
  walker_positions <- sample_starting_positions(grid, n_walkers)

  # Push walker simulation tasks
  for (i in 1:n_walkers) {
    controller$push(
      command = {
        # This code runs IN THE WORKER
        randomwalk::simulate_single_walker(
          id = walker_id,
          start_pos = start_position,
          grid_snapshot = grid_state,  # Static snapshot
          neighborhood = neighborhood,
          boundary = boundary,
          max_steps = max_steps
        )
      },
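      data = list(
        walker_id = i,
        start_position = walker_positions[[i]],
        grid_state = grid,  # Each task gets its own grid copy
        # The command also references these, so they must be shipped too
        neighborhood = neighborhood,
        boundary = boundary,
        max_steps = max_steps
      )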
    )
  }

  # Wait for all walkers to complete
  controller$wait(mode = "all")

  # Collect results from every finished task (pop() would return only one)
  walker_results <- controller$collect()$result

  # Aggregate results
  final_grid <- aggregate_walker_paths(walker_results, grid)

  # Clean up
  controller$terminate()

  # Return combined results
  list(
    grid = final_grid,
    walkers = walker_results,
    statistics = calculate_stats(final_grid, walker_results),
    parameters = list(workers = workers, grid_size = grid_size,
                      n_walkers = n_walkers)
  )
}

Static Grid Snapshots

Critical design decision: Each worker receives a static copy of the grid state:

# Each worker gets its own grid copy
data = list(
  walker_id = i,
  grid_state = grid  # COPY, not reference
)

Why static snapshots?

Simplicity: No synchronization logic needed
Safety: No race conditions
Predictability: Workers don't interfere with each other
Performance: No locking overhead

Trade-off: Results differ from sync mode (acceptable for this application)
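The "copy, not reference" behaviour can be illustrated in base R without crew. The sketch below round-trips a grid through serialize() and unserialize(), which approximates shipping an object to another R process; treating this as a stand-in for crew's transport is an assumption made for illustration.

```r
grid <- matrix(FALSE, nrow = 3, ncol = 3)

# Round-trip through serialization, roughly what happens when an object is
# sent to a separate R process.
grid_snapshot <- unserialize(serialize(grid, connection = NULL))

# A "worker" marks a cell on its private copy.
grid_snapshot[2, 2] <- TRUE

sum(grid)           # 0: the original grid is untouched
sum(grid_snapshot)  # 1: only the copy changed
```

Because the worker's changes never reach the original, the main process has to merge the returned paths itself, which is what the aggregation step does.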

Targets Integration

While the Shiny dashboard doesn't use targets directly, the same crew code works seamlessly in targets pipelines.

Targets Pipeline Example

Here's how you'd use the same run_simulation() function in a targets workflow:

# _targets.R
library(targets)
library(crew)
library(randomwalk)

# Define crew controller for targets
tar_option_set(
  controller = crew_controller_local(
    name = "simulation_pipeline",
    workers = 4
  )
)

# Define pipeline
list(
  # Grid sizes to test
  tar_target(
    grid_sizes,
    c(20, 50, 100, 200)
  ),

  # Walker counts to test
  tar_target(
    walker_counts,
    c(5, 10, 20, 50)
  ),

  # Run simulations (parallelized across grid x walker combinations)
  tar_target(
    simulations,
    run_simulation(
      grid_size = grid_sizes,
      n_walkers = walker_counts,
      workers = 0,  # Targets handles parallelism
      neighborhood = "4-hood",
      boundary = "terminate",
      max_steps = 10000
    ),
    pattern = cross(grid_sizes, walker_counts),  # Cartesian product
    iteration = "list"
  ),

  # Aggregate results
  tar_target(
    summary_stats,
    summarize_simulations(simulations)
  ),

  # Generate plots
  tar_target(
    plots,
    create_plots(summary_stats)
  )
)

How targets uses crew:

Targets creates tasks for each pattern combination
Crew controller distributes tasks to workers
Each worker runs run_simulation() independently
Results cached and aggregated by targets
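To see how many branches the cross() pattern implies, the same Cartesian product can be built in base R with expand.grid(). This only illustrates the combinations; targets builds and orders its branches itself, and the column names here are chosen for readability.

```r
grid_sizes <- c(20, 50, 100, 200)
walker_counts <- c(5, 10, 20, 50)

# One row per (grid_size, n_walkers) pair, i.e. one simulation per row
combos <- expand.grid(grid_size = grid_sizes, n_walkers = walker_counts)

nrow(combos)  # 16 combinations
head(combos, 4)
```

With 4 crew workers and 16 independent branches, each worker handles about four simulations if run times are similar.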

Nested Parallelism

You can also nest parallelism: targets parallelizes across simulations, and each simulation parallelizes across its walkers:

tar_target(
  simulations,
  run_simulation(
    grid_size = grid_sizes,
    n_walkers = 100,
    workers = 2,  # Each simulation uses 2 crew workers
    ...
  ),
  pattern = map(grid_sizes),  # Targets parallelizes across grids
  iteration = "list"
)

Result: If targets has 4 workers and each simulation uses 2 workers, you have:

4 simulations running simultaneously (targets-level)
Each using 2 workers internally (crew-level)
Total: up to 8 worker processes simulating walkers (the 4 targets-level workers mostly wait on them)
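The arithmetic is worth checking against the machine before nesting. A minimal sketch, assuming the worker counts from the example above and using parallel::detectCores() as a rough guide to available cores:

```r
outer_workers <- 4  # targets-level crew workers
inner_workers <- 2  # `workers` passed to each run_simulation() call

# Processes that actually simulate walkers
inner_total <- outer_workers * inner_workers

# Every R process involved, counting the outer workers that wait on them
process_total <- outer_workers + inner_total

cores <- parallel::detectCores()
oversubscribed <- !is.na(cores) && inner_total > cores

inner_total    # 8
process_total  # 12
```

If oversubscribed is TRUE, reducing either level of parallelism is likely to help more than adding workers.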

Code Walkthrough

Note: The code walkthrough below describes the native R crew-based path (workers > 0). The browser dashboard uses a completely different architecture — see Approach 5.

Let's trace what happens when you click "Run Simulation" with workers = 2 in native R:

Step 1: Dashboard Button Click

# inst/shiny/dashboard_async/app.R (line ~340)
observeEvent(input$run_sim, {

  # Log event
  add_log("=== RUN SIMULATION CLICKED ===")
  add_log(sprintf("Parameters: grid=%d, walkers=%d, workers=%d",
                  input$grid_size, input$n_walkers, input$workers))

  # Call randomwalk package
  result <- randomwalk::run_simulation(
    grid_size = input$grid_size,      # 100
    n_walkers = input$n_walkers,      # 6
    workers = input$workers,           # 2
    neighborhood = input$neighborhood,
    boundary = input$boundary,
    max_steps = input$max_steps
  )

  # Store and display results
  sim_result(result)
  add_log("Simulation completed successfully")
})

Step 2: Package Routing

# R/simulation.R
run_simulation <- function(grid_size, n_walkers, workers = 0, ...) {

  start_time <- Sys.time()

  if (workers == 0) {
    # Route to synchronous implementation
    logger::log_info("Mode: Synchronous")
    result <- run_sync(grid_size, n_walkers, ...)

  } else {
    # Route to asynchronous implementation
    logger::log_info("Mode: Asynchronous ({workers} workers)", workers = workers)
    result <- run_async(grid_size, n_walkers, workers, ...)
  }

  # Add timing information
  result$statistics$elapsed_time_secs <- as.numeric(
    difftime(Sys.time(), start_time, units = "secs")
  )

  return(result)
}
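One detail matters when recording elapsed time in a field named in seconds: subtracting two timestamps yields a difftime whose unit is chosen automatically, so longer runs may be reported in minutes. The sketch below uses fixed timestamps (an assumption, so the output is reproducible) to show the effect.

```r
start_time <- as.POSIXct("2026-04-01 12:00:00", tz = "UTC")
end_time <- start_time + 90  # 90 seconds later

# Plain subtraction picks the unit automatically
auto <- end_time - start_time
units(auto)       # "mins"
as.numeric(auto)  # 1.5

# Requesting seconds explicitly keeps the number meaningful
secs <- difftime(end_time, start_time, units = "secs")
as.numeric(secs)  # 90
```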

Step 3: Crew Controller Initialization

# R/simulation.R (Step 3)
run_async <- function(grid_size, n_walkers, workers, ...) {

  logger::log_info("Creating crew controller with {workers} workers")

  # Create and start controller
  controller <- crew::crew_controller_local(
    name = "randomwalk",
    workers = workers,    # 2 workers
    seconds_idle = 10
  )

  controller$start()
  logger::log_info("Controller started")

  # ... continue to task distribution ...
}

What happens: Crew spawns 2 R processes (workers) waiting for tasks

Step 4: Grid Initialization

# R/simulation.R (Step 4)
run_async <- function(...) {
  # ... controller setup ...

  logger::log_info("Initializing grid of size {grid_size}x{grid_size}")

  # Create empty grid
  grid <- matrix(FALSE, nrow = grid_size, ncol = grid_size)

  # Generate starting positions for walkers
  walker_starts <- sample_starting_positions(grid, n_walkers)
  logger::log_info("Created {n_walkers} walkers")

  # ... continue to task distribution ...
}

Step 5: Task Distribution to Workers

# R/simulation.R (Step 5)
run_async <- function(...) {
  # ... setup complete ...

  logger::log_info("Distributing {n_walkers} walkers to {workers} workers")

  # Push each walker as a separate task
  for (i in 1:n_walkers) {  # 6 walkers
    controller$push(
      command = {
        # THIS CODE RUNS IN WORKER PROCESS
        randomwalk::simulate_single_walker(
          id = walker_id,
          start_row = start[[1]],
          start_col = start[[2]],
          grid = grid_snapshot,
          neighborhood = neighborhood,
          boundary = boundary,
          max_steps = max_steps
        )
      },
      data = list(
        walker_id = i,
        start = walker_starts[[i]],
        grid_snapshot = grid,  # Copy of grid
        neighborhood = neighborhood,
        boundary = boundary,
        max_steps = max_steps
      )
    )
  }

  logger::log_info("All tasks pushed to workers")

  # ... continue to waiting ...
}

What happens:

6 walker tasks queued
2 workers available
Each worker runs one task at a time: for example, Worker 1 starts walker 1
and Worker 2 starts walker 2
Workers take more tasks as they finish
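The "take the next task when free" behaviour can be modelled in a few lines of base R. This is a simplified scheduling model, not crew's actual dispatcher, and the task durations are made-up values for illustration.

```r
# Hypothetical task durations in seconds for 6 walkers
durations <- c(3, 1, 2, 2, 1, 3)
n_workers <- 2

free_at <- rep(0, n_workers)            # time at which each worker is next idle
assigned <- integer(length(durations))  # which worker ran each task

for (task in seq_along(durations)) {
  w <- which.min(free_at)               # earliest-available worker takes the task
  assigned[task] <- w
  free_at[w] <- free_at[w] + durations[task]
}

assigned      # 1 2 2 1 2 2
max(free_at)  # 7 seconds of wall-clock time under this model
```

The tasks are not split evenly by count: a worker that draws short tasks simply picks up more of them, so the exact task-to-worker mapping varies from run to run.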

Step 6: Wait for Completion

# R/simulation.R (Step 6)
run_async <- function(...) {
  # ... tasks distributed ...

  logger::log_info("Waiting for all walkers to complete")

  # Block until all 6 tasks finish
  controller$wait(mode = "all")

  logger::log_info("All walkers completed")

  # ... continue to result collection ...
}

Step 7: Collect Results

# R/simulation.R (Step 7)
run_async <- function(...) {
  # ... tasks complete ...

  logger::log_info("Collecting results from workers")

  # Collect results from all completed tasks
  # (collect() returns every finished task; pop() returns one at a time)
  task_results <- controller$collect()

  # Extract walker data (a list with one element per task)
  walker_results <- task_results$result

  logger::log_info("Results collected")

  # ... continue to aggregation ...
}

Step 8: Aggregate and Clean Up

# R/simulation.R (Step 8)
run_async <- function(...) {
  # ... results collected ...

  logger::log_info("Aggregating walker paths onto grid")

  # Combine walker paths onto final grid
  final_grid <- grid  # Start with empty grid
  for (walker in walker_results) {
    for (pos in walker$path) {
      final_grid[pos[1], pos[2]] <- TRUE  # Mark visited
    }
  }

  # Calculate statistics
  stats <- list(
    total_steps = sum(sapply(walker_results, function(w) w$steps)),
    black_pixels = sum(final_grid),
    black_percentage = 100 * sum(final_grid) / length(final_grid),
    grid_size = grid_size,
    total_walkers = n_walkers
  )

  logger::log_info("=== SIMULATION COMPLETE ===")
  logger::log_info("Total steps: {stats$total_steps}")

  # Terminate controller
  controller$terminate()

  # Return results
  list(
    grid = final_grid,
    walkers = walker_results,
    statistics = stats,
    parameters = list(workers = workers, ...)
  )
}
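The nested loops in the aggregation step can also be written with matrix indexing. The sketch below assumes each walker's path is a list of c(row, col) pairs, as the loop above implies; the sample walker_results are invented for the example.

```r
grid_size <- 5

# Hypothetical walker results: each path is a list of c(row, col) positions
walker_results <- list(
  list(path = list(c(1, 1), c(1, 2), c(2, 2)), steps = 3),
  list(path = list(c(2, 2), c(3, 2)), steps = 2)
)

final_grid <- matrix(FALSE, nrow = grid_size, ncol = grid_size)

# Stack every position into a two-column (row, col) matrix
all_positions <- do.call(
  rbind,
  lapply(walker_results, function(w) do.call(rbind, w$path))
)

# Mark all visited cells with a single matrix-indexed assignment
final_grid[all_positions] <- TRUE

sum(final_grid)  # 4 distinct cells; (2, 2) was visited by both walkers
```

Indexing a matrix with a two-column matrix addresses one cell per row, so overlapping paths are handled the same way as in the loop version.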

Step 9: Return to Dashboard

# Back in dashboard (inst/shiny/dashboard_async/app.R)
observeEvent(input$run_sim, {
  # ... simulation call ...

  result <- randomwalk::run_simulation(...)  # Returns from Step 8

  # Store result
  sim_result(result)

  # Update status
  status_msg(sprintf(
    "Simulation complete! %d steps, %.1f%% coverage in %.2fs",
    result$statistics$total_steps,
    result$statistics$black_percentage,
    result$statistics$elapsed_time_secs
  ))

  # Display in tabs (plots auto-update via reactivity)
})

Worker Lifecycle

Timeline Visualization

Time →

Main Process (Dashboard):
├─ [Click "Run"] ──────────────────────────────── [Display Results]
│                                                  ↑
│  Crew Controller:                              │
│  ├─ Start ────┬─ Push Tasks ─── Wait ─── Pop ─┴─ Terminate
│               │                   ↑
│               │                   │
│               ↓                   │
│  Worker 1:    Idle ─ Task1 ─ Task3 ─ Task5 ─ Idle
│  Worker 2:    Idle ─ Task2 ─ Task4 ─ Task6 ─ Idle

Detailed Worker State Machine

Worker States:

[STARTED] ──→ [IDLE] ──→ [BUSY] ──→ [IDLE] ──→ [TERMINATED]
               ↑          │         ↑
               └──────────┴─────────┘
                  (task loop)

State transitions:

STARTED: Worker process spawned
IDLE: Waiting for tasks from controller
BUSY: Executing simulate_single_walker()
IDLE: Task complete, ready for more
TERMINATED: Controller shutdown, worker exits

Data Flow

Sync vs Async Data Flow

Sync Mode (workers = 0):

Grid (shared) ──→ Walker 1 ──→ Grid updated
              ──→ Walker 2 ──→ Grid updated
              ──→ Walker 3 ──→ Grid updated
              ...

Sequential: Each walker sees previous walkers' changes

Async Mode (workers > 0):

Grid (original) ──→ Worker 1: [Walker 1] ──→ Paths collected
                │             [Walker 3]
                │             [Walker 5]
                │
                └──→ Worker 2: [Walker 2] ──→ Paths collected
                              [Walker 4]
                              [Walker 6]

All paths ──→ Aggregate ──→ Final Grid

Parallel: Workers use static grid snapshot

Memory Layout

Sync mode:

Main Process Memory:
├─ Grid (1 copy, updated)
└─ Walkers (sequential)

Async mode:

Main Process Memory:
├─ Grid (original)
└─ Controller

Worker 1 Memory:
├─ Grid (snapshot copy)
└─ Walkers 1, 3, 5 (active)

Worker 2 Memory:
├─ Grid (snapshot copy)
└─ Walkers 2, 4, 6 (active)

Memory trade-off: More copies (workers × grid size) for parallelism
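A rough estimate of the snapshot cost can be made with object.size(). The sketch assumes a logical grid like the one created in the walkthrough and counts one snapshot per task, since each task's data carries its own copy; actual memory use also depends on serialization and on how many tasks are in flight.

```r
grid_size <- 100
grid <- matrix(FALSE, nrow = grid_size, ncol = grid_size)

# R stores logical values in 4 bytes each, plus a small header
bytes_per_grid <- as.numeric(object.size(grid))

n_tasks <- 6  # one snapshot per walker task
approx_total_bytes <- bytes_per_grid * n_tasks

round(approx_total_bytes / 1024)  # approximate total in KiB
```

For a 100 x 100 grid the cost is negligible, but it grows with the square of grid_size, so large grids with many walkers are where the snapshot approach becomes expensive.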

Educational Examples

Example 1: Basic Crew Usage

Minimal example showing crew pattern:

library(crew)

# Create controller
controller <- crew_controller_local(workers = 2)
controller$start()

# Push 10 tasks
for (i in 1:10) {
  controller$push(
    command = i^2,  # Square each number
    data = list(i = i)
  )
}

# Wait and collect
controller$wait(mode = "all")
results <- controller$collect()  # pop() would return only one task

# Squares 1, 4, 9, ..., 100 (tasks finish in no guaranteed order, so sort)
print(sort(unlist(results$result)))

# Clean up
controller$terminate()

Example 2: Crew with Custom Function

Using crew with a custom computation:

# Define expensive function
expensive_computation <- function(x) {
  Sys.sleep(1)  # Simulate work
  return(x * 2)
}

# Create controller
controller <- crew_controller_local(workers = 4)
controller$start()

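# Push tasks and time them through to completion
inputs <- 1:20
system.time({
  for (x in inputs) {
    controller$push(
      command = expensive_computation(value),
      # Ship the function with the task; workers start in a clean session
      data = list(value = x, expensive_computation = expensive_computation)
    )
  }

  # With 4 workers, 20 one-second tasks take roughly 5 seconds (not 20),
  # plus worker startup overhead
  controller$wait(mode = "all")
  results <- controller$collect()$result
})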

controller$terminate()

Example 3: Monitoring Progress

Crew allows checking progress without blocking:

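# Placeholder for real work
slow_function <- function(x) {
  Sys.sleep(0.1)
  x
}

controller <- crew_controller_local(workers = 2)
controller$start()

# Push long-running tasks
for (i in 1:100) {
  controller$push(
    command = slow_function(x),
    # Ship the function with the task so the worker can find it
    data = list(x = i, slow_function = slow_function)
  )
}

# Monitor progress
n_done <- 0
while (!controller$empty()) {
  # pop() does not block: it returns one finished task, or NULL if none is ready
  completed <- controller$pop()
  if (is.null(completed)) {
    Sys.sleep(0.5)  # Nothing ready yet; check again in 500ms
  } else {
    n_done <- n_done + 1
    message(sprintf("Completed: %d of 100 tasks", n_done))
  }
}

controller$terminate()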

Example 4: Error Handling

Crew captures errors from workers:

risky_function <- function(x) {
  if (x %% 3 == 0) stop("Divisible by 3!")
  return(x^2)
}

controller <- crew_controller_local(workers = 2)
controller$start()

for (i in 1:10) {
  controller$push(
    command = risky_function(value),
    # Ship the function with the task so the worker can find it
    data = list(value = i, risky_function = risky_function)
  )
}

controller$wait(mode = "all")
results <- controller$collect()  # One row per task

# The error column is NA for tasks that succeeded
failed <- !is.na(results$error)
successes <- results$result[!failed]

message(sprintf("Successes: %d, Errors: %d",
                length(successes), sum(failed)))

controller$terminate()

References

Repository: https://github.com/JohnGavin/randomwalk
Key files:
- R/simulation.R - All simulation code (sync, async, chunked modes)
- R/walker.R - Walker creation and movement
- R/walker_dynamic.R - Dynamic broadcasting walker (deprecated)
- R/broadcasting.R - Pub/sub socket functions (deprecated)
- vignettes/articles/dashboard_comprehensive.qmd - Main Shinylive dashboard

Note: All simulation modes (run_simulation(), async variants, chunked mode) are in the single R/simulation.R file - there are no separate run_async.R or run_sync.R files.
