How Async Dashboard Uses Crew and Targets
Educational Deep Dive: Understanding the crew/targets workflow pattern in the randomwalk async dashboard
Related Pages:
- Async Dashboard Approaches - All execution modes including vectorized WebR
Note (April 2026): The browser dashboard at dashboard_comprehensive.qmd no longer uses run_simulation() or crew. It uses a vectorized inline engine with JS round-trip batching for non-blocking UI. This page documents the crew patterns used by the native R execution modes (workers > 0). See Approach 5: Vectorized Batched for the WebR architecture.
- Overview
- Understanding Workers: Sync vs Async Architecture ⭐ NEW
- Crew Integration
- Targets Integration
- Code Walkthrough
- Worker Lifecycle
- Data Flow
- Educational Examples
The async dashboard demonstrates production-ready patterns for parallel processing in R using:
- crew: Distributed worker management
- targets: (Future) Pipeline orchestration
While the current dashboard doesn't use targets explicitly (it's a Shiny app, not a pipeline), it uses crew patterns that integrate seamlessly with targets in analysis workflows.
Crew provides:
- Worker pool management
- Task distribution
- Asynchronous execution
- Result collection
Targets provides:
- Dependency tracking
- Caching
- Pipeline orchestration
- Reproducibility
Together: They enable scalable, reproducible, parallel workflows.
Key Insight: The workers parameter controls more than just parallelism; it determines the entire code architecture.
| Mode | workers | Architecture | Execution | Code Path |
|---|---|---|---|---|
| Sync | 0 | Direct function calls | Sequential | run_sync() |
| Async (1 worker) | 1 | Crew task queue | Sequential* | run_async() |
| Async (parallel) | 2+ | Crew task queue | Parallel | run_async() |
*With 1 worker, tasks execute sequentially but through the async infrastructure
A fascinating property of the randomwalk simulation is that workers=0 and workers=1 produce structurally similar results through completely different code paths:
# workers=0: True synchronous execution
result_sync <- randomwalk::run_simulation(
grid_size = 100,
n_walkers = 6,
workers = 0 # No crew controller created
)
# workers=1: Async architecture, sequential execution
result_async_1 <- randomwalk::run_simulation(
grid_size = 100,
n_walkers = 6,
workers = 1 # Creates crew controller with 1 worker
)
# Both produce similar fractal patterns!
# But use completely different code paths.
Educational Value: workers=1 lets you learn the async architecture without parallelism complexity:
Sync (workers=0):
┌──────────────────────────────────────────────────┐
│ for each walker: │
│ result <- simulate_walker(walker_id, grid) │
│ # Direct function call, no task queue │
└──────────────────────────────────────────────────┘
Async (workers=1):
┌──────────────────────────────────────────────────┐
│ controller <- crew_controller_local(workers = 1) │
│ controller$start() │
│ │
│ for each walker: │
│ controller$push(command = simulate_walker(...))│
│ # Task queued, executed by single worker │
│ │
│ controller$wait(mode = 'all') │
│ results <- controller$pop() │
│ controller$terminate() │
└──────────────────────────────────────────────────┘
| Aspect | workers=0 (Sync) | workers=1 (Async) | workers=2+ (Parallel) |
|---|---|---|---|
| Architecture | Direct function calls | Task queue + worker | Task queue + workers |
| Overhead | Minimal | Crew controller + serialization | Same as workers=1 |
| Execution | Sequential | Sequential (1 worker) | Parallel |
| Pattern | Simple loop | Push/wait/pop | Push/wait/pop |
| Grid Updates | Immediate (shared) | After completion (snapshot) | After completion (snapshot) |
| Use Case | Production baseline | Educational/testing | Performance |
| WebR Compatible | ✅ Yes | ❌ No (needs crew) | ❌ No (needs crew) |
- workers=0: WebR/Shinylive deployments, simple testing, baseline benchmarks
- workers=1: Learning crew patterns, debugging async logic, testing task serialization
- workers=2+: Production parallel processing, performance-critical workflows
The term "fractal similarity" refers to how both approaches produce self-similar random walk patterns—the statistical properties of the output are comparable even though:
- Sync mode updates the grid after each walker step
- Async mode (even with 1 worker) uses static grid snapshots
This demonstrates that for many applications, the async overhead is acceptable and the patterns remain valid regardless of execution model.
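The difference between a shared, updated grid and static snapshots can be sketched in base R without crew. This is a toy illustration, not the randomwalk implementation: toy_walk() and mark() are hypothetical helpers, and the rule "stop when landing on a marked cell" is an assumption chosen only to make the walker depend on the grid it is given.

```r
# Hypothetical toy walker: takes up to `steps` random 4-neighbour moves,
# clamped to the grid, and stops early if it lands on a cell that is already
# marked in the grid it was given.
toy_walk <- function(grid, start, steps) {
  n <- nrow(grid)
  moves <- rbind(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))
  path <- matrix(start, nrow = 1)
  pos <- start
  for (s in seq_len(steps)) {
    pos <- pmin(pmax(pos + moves[sample.int(4, 1), ], 1), n)
    path <- rbind(path, pos)
    if (grid[pos[1], pos[2]]) break
  }
  unname(path)
}

# Mark every (row, col) of a path on the grid via two-column matrix indexing.
mark <- function(grid, path) {
  grid[path] <- TRUE
  grid
}

set.seed(1)
n <- 20
steps <- 50
starts <- list(c(5, 5), c(10, 10), c(15, 15))
empty <- matrix(FALSE, n, n)

# Sync style: one shared grid, updated before the next walker starts.
grid_sync <- empty
for (st in starts) {
  grid_sync <- mark(grid_sync, toy_walk(grid_sync, st, steps))
}

# Snapshot style: every walker sees the same untouched grid; merge afterwards.
paths <- lapply(starts, function(st) toy_walk(empty, st, steps))
grid_snap <- Reduce(mark, paths, empty)
```

In the snapshot variant no walker can react to another walker's marks, so every path runs its full length; in the sync variant later walkers may stop early. Both produce a logical grid of the same shape, which is why downstream plotting and statistics work unchanged.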
The randomwalk::run_simulation() function uses crew internally when workers > 0:
# From randomwalk package (R/run_simulation.R)
run_simulation <- function(grid_size,
n_walkers,
workers = 0, # Number of crew workers
...) {
if (workers == 0) {
# Sync mode: sequential processing
result <- run_sync(...)
} else {
# Async mode: parallel processing with crew
result <- run_async(workers = workers, ...)
}
return(result)
}
When using crew, the first step is creating a controller that manages the worker pool:
# From randomwalk package (internal)
run_async <- function(workers, grid_size, n_walkers, ...) {
# Create crew controller
controller <- crew::crew_controller_local(
name = "randomwalk",
workers = workers,
seconds_idle = 10 # Kill idle workers after 10s
)
# Start the controller
controller$start()
# ... use controller for tasks ...
# Clean up
controller$terminate()
}
Key parameters:
- workers: Number of parallel workers
- seconds_idle: Worker timeout (conserves resources)
- name: Identifier for this controller
Crew distributes tasks across workers using push() and pop():
# Simplified example from randomwalk async implementation
run_async <- function(controller, n_walkers, ...) {
# PUSH: Send tasks to workers
for (walker_id in 1:n_walkers) {
controller$push(
command = simulate_walker(
id = walker_id,
grid_state = grid,
...
),
data = list(walker_id = walker_id) # Task metadata
)
}
# WAIT: Let workers process tasks
controller$wait(mode = "all") # Wait for all tasks
# POP: Collect results
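results <- controller$collect() # collect() returns every finished task; pop() returns only one
return(results)
}
Push-Pop pattern:
- Push: Queue tasks to worker pool
- Wait: Block until tasks complete
- Pop: Retrieve results (pop() returns one completed task per call; collect() returns all of them at once)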
The real implementation in randomwalk is more sophisticated:
# From R/run_async.R (simplified)
run_async <- function(workers, grid_size, n_walkers,
neighborhood, boundary, max_steps) {
# Initialize controller
controller <- crew::crew_controller_local(
name = "randomwalk",
workers = workers,
seconds_idle = 10
)
controller$start()
# Create initial grid
grid <- initialize_grid(grid_size)
# Create walker starting positions
walker_positions <- sample_starting_positions(grid, n_walkers)
# Push walker simulation tasks
for (i in 1:n_walkers) {
controller$push(
command = {
# This code runs IN THE WORKER
randomwalk::simulate_single_walker(
id = walker_id,
start_pos = start_position,
grid_snapshot = grid_state, # Static snapshot
neighborhood = neighborhood,
boundary = boundary,
max_steps = max_steps
)
},
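data = list(
walker_id = i,
start_position = walker_positions[[i]],
grid_state = grid, # Each worker gets grid copy
neighborhood = neighborhood, # Every object the command references must be shipped to the worker
boundary = boundary,
max_steps = max_steps
)
)
}
# Wait for all walkers to complete
controller$wait(mode = "all")
# Collect results (collect() returns all finished tasks; pop() returns one at a time)
walker_results <- controller$collect()$result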
# Aggregate results
final_grid <- aggregate_walker_paths(walker_results, grid)
# Clean up
controller$terminate()
# Return combined results
list(
grid = final_grid,
walkers = walker_results,
statistics = calculate_stats(final_grid, walker_results),
parameters = list(workers = workers, grid_size = grid_size, n_walkers = n_walkers)
)
}
Critical design decision: Each worker receives a static copy of the grid state:
# Each worker gets its own grid copy
data = list(
walker_id = i,
grid_state = grid # COPY, not reference
)
Why static snapshots?
- Simplicity: No synchronization logic needed
- Safety: No race conditions
- Predictability: Workers don't interfere with each other
- Performance: No locking overhead
Trade-off: Results differ from sync mode (acceptable for this application)
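Why a worker's grid is a copy rather than a reference can be shown with base R alone. The sketch below assumes, as is typical for multi-process R, that task data reaches the worker by serialization; the round trip through serialize() stands in for that transfer and is not crew's actual transport code.

```r
grid <- matrix(FALSE, nrow = 4, ncol = 4)

# Round-trip through serialization, roughly what happens when task data is
# shipped to another R process.
worker_copy <- unserialize(serialize(grid, connection = NULL))

# The "worker" marks a cell on its copy...
worker_copy[2, 3] <- TRUE

# ...and the original grid in the main process is untouched.
sum(grid)         # 0
sum(worker_copy)  # 1
```

This is the reason the main process has to merge walker paths back onto its own grid after the tasks finish.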
While the Shiny dashboard doesn't use targets directly, the same crew code works seamlessly in targets pipelines.
Here's how you'd use the same run_simulation() function in a targets workflow:
# _targets.R
library(targets)
library(crew)
library(randomwalk)
# Define crew controller for targets
tar_option_set(
controller = crew_controller_local(
name = "simulation_pipeline",
workers = 4
)
)
# Define pipeline
list(
# Grid sizes to test
tar_target(
grid_sizes,
c(20, 50, 100, 200)
),
# Walker counts to test
tar_target(
walker_counts,
c(5, 10, 20, 50)
),
# Run simulations (parallelized across grid x walker combinations)
tar_target(
simulations,
run_simulation(
grid_size = grid_sizes,
n_walkers = walker_counts,
workers = 0, # Targets handles parallelism
neighborhood = "4-hood",
boundary = "terminate",
max_steps = 10000
),
pattern = cross(grid_sizes, walker_counts), # Cartesian product
iteration = "list"
),
# Aggregate results
tar_target(
summary_stats,
summarize_simulations(simulations)
),
# Generate plots
tar_target(
plots,
create_plots(summary_stats)
)
)
How targets uses crew:
- Targets creates tasks for each pattern combination
- Crew controller distributes tasks to workers
- Each worker runs run_simulation() independently
- Results cached and aggregated by targets
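To see how many tasks the cross() pattern above produces, the Cartesian product can be reproduced in base R. This only illustrates the count and the combinations; the order in which targets creates its branches may differ from expand.grid().

```r
grid_sizes <- c(20, 50, 100, 200)
walker_counts <- c(5, 10, 20, 50)

# One row per (grid size, walker count) combination, i.e. one branch each.
branches <- expand.grid(walker_counts = walker_counts, grid_sizes = grid_sizes)

nrow(branches)  # 16
head(branches, 3)
```

With 4 targets workers and 16 branches, each worker handles about 4 simulations over the course of the pipeline.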
You can even use nested parallelism - targets parallelizes simulations, each simulation parallelizes walkers:
tar_target(
simulations,
run_simulation(
grid_size = grid_sizes,
n_walkers = 100,
workers = 2, # Each simulation uses 2 crew workers
...
),
pattern = map(grid_sizes), # Targets parallelizes across grids
iteration = "list"
)
Result: If targets has 4 workers and each simulation uses 2 workers, you have:
- 4 simulations running simultaneously (targets-level)
- Each using 2 workers internally (crew-level)
- Total: up to 8 walker workers busy at once (plus the 4 targets workers, which mostly wait on their inner workers)
Note: The code walkthrough below describes the native R crew-based path (workers > 0). The browser dashboard uses a completely different architecture; see Approach 5.
Let's trace what happens when you click "Run Simulation" with workers = 2 in native R:
# inst/shiny/dashboard_async/app.R (line ~340)
observeEvent(input$run_sim, {
# Log event
add_log("=== RUN SIMULATION CLICKED ===")
add_log(sprintf("Parameters: grid=%d, walkers=%d, workers=%d",
input$grid_size, input$n_walkers, input$workers))
# Call randomwalk package
result <- randomwalk::run_simulation(
grid_size = input$grid_size, # 100
n_walkers = input$n_walkers, # 6
workers = input$workers, # 2
neighborhood = input$neighborhood,
boundary = input$boundary,
max_steps = input$max_steps
)
# Store and display results
sim_result(result)
add_log("Simulation completed successfully")
})
# R/run_simulation.R
run_simulation <- function(grid_size, n_walkers, workers = 0, ...) {
start_time <- Sys.time()
if (workers == 0) {
# Route to synchronous implementation
logger::log_info("Mode: Synchronous")
result <- run_sync(grid_size, n_walkers, ...)
} else {
# Route to asynchronous implementation
logger::log_info("Mode: Asynchronous ({workers} workers)", workers = workers)
result <- run_async(grid_size, n_walkers, workers, ...)
}
# Add timing information
result$statistics$elapsed_time_secs <- as.numeric(difftime(Sys.time(), start_time, units = "secs"))
return(result)
}
# R/run_async.R (Step 3)
run_async <- function(grid_size, n_walkers, workers, ...) {
logger::log_info("Creating crew controller with {workers} workers")
# Create and start controller
controller <- crew::crew_controller_local(
name = "randomwalk",
workers = workers, # 2 workers
seconds_idle = 10
)
controller$start()
logger::log_info("Controller started")
# ... continue to task distribution ...
}
What happens: The controller is now ready to run up to 2 R worker processes; crew launches them on demand as tasks arrive.
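A side note on the elapsed-time line in run_simulation() (Step 2 above): subtracting two date-times in R picks its units automatically, so a field named elapsed_time_secs is only reliable when the units are requested explicitly. The timestamps below are made up for illustration.

```r
start_time <- as.POSIXct("2024-01-01 12:00:00", tz = "UTC")
end_time <- start_time + 90  # 90 seconds later

# Automatic units: differences of a minute or more are reported in minutes.
as.numeric(end_time - start_time)                            # 1.5

# Explicit units: always seconds.
as.numeric(difftime(end_time, start_time, units = "secs"))   # 90
```

For runs shorter than a minute both forms agree, which is why the difference is easy to miss in quick tests.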
# R/run_async.R (Step 4)
run_async <- function(...) {
# ... controller setup ...
logger::log_info("Initializing grid of size {grid_size}x{grid_size}")
# Create empty grid
grid <- matrix(FALSE, nrow = grid_size, ncol = grid_size)
# Generate starting positions for walkers
walker_starts <- sample_starting_positions(grid, n_walkers)
logger::log_info("Created {n_walkers} walkers")
# ... continue to task distribution ...
}
# R/run_async.R (Step 5)
run_async <- function(...) {
# ... setup complete ...
logger::log_info("Distributing {n_walkers} walkers to {workers} workers")
# Push each walker as a separate task
for (i in 1:n_walkers) { # 6 walkers
controller$push(
command = {
# THIS CODE RUNS IN WORKER PROCESS
randomwalk::simulate_single_walker(
id = walker_id,
start_row = start[[1]],
start_col = start[[2]],
grid = grid_snapshot,
neighborhood = neighborhood,
boundary = boundary,
max_steps = max_steps
)
},
data = list(
walker_id = i,
start = walker_starts[[i]],
grid_snapshot = grid, # Copy of grid
neighborhood = neighborhood,
boundary = boundary,
max_steps = max_steps
)
)
}
logger::log_info("All tasks pushed to workers")
# ... continue to waiting ...
}
What happens:
- 6 walker tasks queued
- 2 workers available
- Each worker runs one task at a time, so at most 2 walkers run concurrently (for example, walkers 1 and 2 first)
- Whenever a worker finishes, it takes the next queued task
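The "next free worker takes the next task" behaviour described above can be simulated in a few lines of base R. This is a simplified model with made-up task durations; the real dispatch is handled inside crew and may assign tasks differently.

```r
# Hypothetical task durations in seconds for 6 walkers.
durations <- c(3, 1, 2, 2, 1, 3)
n_workers <- 2

free_at <- numeric(n_workers)           # time at which each worker becomes idle
assigned <- integer(length(durations))  # worker that ran each task

for (task in seq_along(durations)) {
  w <- which.min(free_at)               # first worker to become idle
  assigned[task] <- w
  free_at[w] <- free_at[w] + durations[task]
}

assigned      # 1 2 2 1 2 2
max(free_at)  # 7 seconds in total, versus 12 if run one after another
```

Because walkers finish at different times, the assignment is rarely a neat alternation between workers; the tidy Task1/Task3/Task5 split in timeline diagrams is an idealisation.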
# R/run_async.R (Step 6)
run_async <- function(...) {
# ... tasks distributed ...
logger::log_info("Waiting for all walkers to complete")
# Block until all 6 tasks finish
controller$wait(mode = "all")
logger::log_info("All walkers completed")
# ... continue to result collection ...
}
# R/run_async.R (Step 7)
run_async <- function(...) {
# ... tasks complete ...
logger::log_info("Collecting results from workers")
# Collect results from completed tasks (collect() returns all of them; pop() returns one)
task_results <- controller$collect()
# Extract walker data (one list element per task)
walker_results <- task_results$result
logger::log_info("Results collected")
# ... continue to aggregation ...
}
# R/run_async.R (Step 8)
run_async <- function(...) {
# ... results collected ...
logger::log_info("Aggregating walker paths onto grid")
# Combine walker paths onto final grid
final_grid <- grid # Start with empty grid
for (walker in walker_results) {
for (pos in walker$path) {
final_grid[pos[1], pos[2]] <- TRUE # Mark visited
}
}
# Calculate statistics
stats <- list(
total_steps = sum(sapply(walker_results, function(w) w$steps)),
black_pixels = sum(final_grid),
black_percentage = 100 * sum(final_grid) / length(final_grid),
grid_size = grid_size,
total_walkers = n_walkers
)
logger::log_info("=== SIMULATION COMPLETE ===")
logger::log_info("Total steps: {stats$total_steps}")
# Terminate controller
controller$terminate()
# Return results
list(
grid = final_grid,
walkers = walker_results,
statistics = stats,
parameters = list(workers = workers, ...)
)
}
# Back in dashboard (inst/shiny/dashboard_async/app.R)
observeEvent(input$run_sim, {
# ... simulation call ...
result <- randomwalk::run_simulation(...) # Returns from Step 8
# Store result
sim_result(result)
# Update status
status_msg(sprintf(
"Simulation complete! %d steps, %.1f%% coverage in %.2fs",
result$statistics$total_steps,
result$statistics$black_percentage,
result$statistics$elapsed_time_secs
))
# Display in tabs (plots auto-update via reactivity)
})
Time →
Main Process (Dashboard):
├─ [Click "Run"] ──────────────────────────────── [Display Results]
│ ↑
│ Crew Controller: │
│ ├─ Start ────┬─ Push Tasks ─── Wait ─── Pop ─┴─ Terminate
│ │ ↑
│ │ │
│ ↓ │
│ Worker 1: Idle ─ Task1 ─ Task3 ─ Task5 ─ Idle
│ Worker 2: Idle ─ Task2 ─ Task4 ─ Task6 ─ Idle
Worker States:
[STARTED] ──→ [IDLE] ──→ [BUSY] ──→ [IDLE] ──→ [TERMINATED]
↑ │ ↑
└──────────┴─────────┘
(task loop)
State transitions:
- STARTED: Worker process spawned
- IDLE: Waiting for tasks from controller
- BUSY: Executing simulate_single_walker()
- IDLE: Task complete, ready for more
- TERMINATED: Controller shutdown, worker exits
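The state diagram above can be written down as a small transition table and used to check a sequence of states. The state names are this page's labels, not values reported by crew, and is_valid_sequence() is a hypothetical helper for illustration.

```r
# Allowed next states for each state in the diagram.
transitions <- list(
  STARTED = "IDLE",
  IDLE = c("BUSY", "TERMINATED"),
  BUSY = "IDLE",
  TERMINATED = character(0)
)

# TRUE if every consecutive pair of states is an allowed transition.
is_valid_sequence <- function(states) {
  from <- head(states, -1)
  to <- tail(states, -1)
  all(mapply(function(from_state, to_state) {
    to_state %in% transitions[[from_state]]
  }, from, to))
}

# A worker that runs two tasks and is then shut down.
is_valid_sequence(c("STARTED", "IDLE", "BUSY", "IDLE", "BUSY", "IDLE", "TERMINATED"))  # TRUE

# A worker cannot go straight from STARTED to BUSY in this model.
is_valid_sequence(c("STARTED", "BUSY"))  # FALSE
```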
Sync Mode (workers = 0):
Grid (shared) ──→ Walker 1 ──→ Grid updated
──→ Walker 2 ──→ Grid updated
──→ Walker 3 ──→ Grid updated
...
Sequential: Each walker sees previous walkers' changes
Async Mode (workers > 0):
Grid (original) ──→ Worker 1: [Walker 1] ──→ Paths collected
│ [Walker 3]
│ [Walker 5]
│
└──→ Worker 2: [Walker 2] ──→ Paths collected
[Walker 4]
[Walker 6]
All paths ──→ Aggregate ──→ Final Grid
Parallel: Workers use static grid snapshot
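The aggregate step in the diagram above can be sketched with base R matrix indexing. The example assumes each walker result carries a path stored as a list of c(row, col) pairs, which matches how the walkthrough loops over walker$path; the two walkers here are made-up data.

```r
# Two hypothetical walker results on a 3 x 3 grid.
walker_results <- list(
  list(path = list(c(1, 1), c(1, 2), c(2, 2)), steps = 2),
  list(path = list(c(3, 3), c(2, 3), c(2, 2)), steps = 2)
)

final_grid <- matrix(FALSE, nrow = 3, ncol = 3)
for (walker in walker_results) {
  idx <- do.call(rbind, walker$path)  # n x 2 matrix of (row, col)
  final_grid[idx] <- TRUE             # mark all visited cells in one assignment
}

sum(final_grid)                                       # 5 distinct cells visited
sum(vapply(walker_results, function(w) w$steps, numeric(1)))  # 4 steps in total
```

Cell (2, 2) is visited by both walkers but counted once, which is why black_pixels can be smaller than the total number of path positions.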
Sync mode:
Main Process Memory:
├─ Grid (1 copy, updated)
└─ Walkers (sequential)
Async mode:
Main Process Memory:
├─ Grid (original)
└─ Controller
Worker 1 Memory:
├─ Grid (snapshot copy)
└─ Walkers 1, 3, 5 (active)
Worker 2 Memory:
├─ Grid (snapshot copy)
└─ Walkers 2, 4, 6 (active)
Memory trade-off: More copies (workers × grid size) for parallelism
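A rough way to put numbers on this trade-off is to measure one grid and multiply by the number of live copies. The grid size and the count of three copies (main process plus 2 workers) are illustrative assumptions; serialized task data in transit adds temporary copies on top.

```r
grid <- matrix(FALSE, nrow = 1000, ncol = 1000)

# Bytes for one copy; R stores each logical value in 4 bytes.
per_copy <- as.numeric(object.size(grid))

n_copies <- 1 + 2  # main process + 2 workers
total_mib <- per_copy * n_copies / 1024^2

round(per_copy / 1024^2, 1)  # size of one copy in MiB
round(total_mib, 1)          # size of all copies in MiB
```

For the grid sizes used on this page (up to 200 x 200) the copies are small; the cost only becomes noticeable for much larger grids or many workers.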
Minimal example showing crew pattern:
library(crew)
# Create controller
controller <- crew_controller_local(workers = 2)
controller$start()
# Push 10 tasks
for (i in 1:10) {
controller$push(
command = i^2, # Square each number
data = list(i = i)
)
}
# Wait and collect
controller$wait(mode = "all")
results <- unlist(controller$collect()$result) # collect() returns all finished tasks
# Results: the squares 1, 4, 9, ..., 100 (order may not match push order)
print(results)
# Clean up
controller$terminate()
Using crew with a custom computation:
# Define expensive function
expensive_computation <- function(x) {
Sys.sleep(1) # Simulate work
return(x * 2)
}
# Create controller
controller <- crew_controller_local(workers = 4)
controller$start()
# Push tasks
inputs <- 1:20
for (x in inputs) {
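controller$push(
command = expensive_computation(value),
data = list(value = x),
globals = list(expensive_computation = expensive_computation) # Workers are separate processes, so ship the function
)
}
# With 4 workers, 20 one-second tasks take roughly 5 seconds plus worker startup (not 20)
system.time({
controller$wait(mode = "all")
results <- controller$collect()$result
})
controller$terminate()
Crew allows checking progress without blocking: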
controller <- crew_controller_local(workers = 2)
controller$start()
# Push long-running tasks
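for (i in 1:100) {
controller$push(
command = slow_function(x), # slow_function is a placeholder for your own function
data = list(x = i),
globals = list(slow_function = slow_function)
)
}
# Monitor progress
n_done <- 0
while (!controller$empty()) {
# pop() does not block: it returns one finished task, or NULL if none is ready
completed <- controller$pop()
if (is.null(completed)) {
Sys.sleep(0.5) # Nothing ready yet; check again in 500ms
} else {
n_done <- n_done + 1
message(sprintf("Completed: %d tasks", n_done))
}
}
controller$terminate()
Crew captures errors from workers: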
risky_function <- function(x) {
if (x %% 3 == 0) stop("Divisible by 3!")
return(x^2)
}
controller <- crew_controller_local(workers = 2)
controller$start()
for (i in 1:10) {
controller$push(
command = risky_function(value),
data = list(value = i),
globals = list(risky_function = risky_function) # Ship the function to the workers
)
}
controller$wait(mode = "all")
results <- controller$collect()
# Check for errors: the error column is NA for tasks that succeeded
failed <- !is.na(results$error)
successes <- results$result[!failed]
message(sprintf("Successes: %d, Errors: %d",
length(successes), sum(failed)))
controller$terminate()
- Async Dashboard Approaches - All execution modes
- Live Dashboard - Browser demo (vectorized engine)
- crew documentation - Official crew package docs
- targets documentation - Official targets package docs
- Repository: https://github.com/JohnGavin/randomwalk
- Key files:
  - R/simulation.R - All simulation code (sync, async, chunked modes)
  - R/walker.R - Walker creation and movement
  - R/walker_dynamic.R - Dynamic broadcasting walker (deprecated)
  - R/broadcasting.R - Pub/sub socket functions (deprecated)
  - vignettes/articles/dashboard_comprehensive.qmd - Main Shinylive dashboard
Note: All simulation modes (run_simulation(), async variants, chunked mode) are in the single R/simulation.R file - there are no separate run_async.R or run_sync.R files.