Dynamic Grid Broadcasting Algorithm

Technical Deep Dive: The minimal broadcasting algorithm for grid synchronization

Related Pages:

Async Dashboard Approaches - High-level comparison
How Async Dashboard Uses Crew and Targets - Crew patterns

Problem Statement

The Challenge

In parallel random walk simulation, multiple walkers execute simultaneously across different workers. The challenge is: How do walkers "see" each other's paths?

Static vs Dynamic Approaches

Approach	Grid Updates	Walker Awareness
Static Snapshots	None during simulation	Walkers are blind to each other
Full Synchronization	After every step	Perfect awareness, but O(n²) overhead
Dynamic Broadcasting	On new black pixels	Eventual awareness, O(n) overhead

Algorithm Overview

Core Insight

Not every step needs broadcasting. Only broadcast when a walker creates a new black pixel by terminating adjacent to an existing one.

The Termination Rule (CRITICAL)

A walker on a white pixel checks its neighbors each step:

If any neighbor is black → Walker STOPS, current position becomes black, broadcast occurs
If no black neighbors → Walker moves to a random neighbor

Key point: A walker can NEVER "step onto" an existing black pixel. It terminates when adjacent to one, before it could move onto one.

When Broadcasts Occur

A broadcast happens ONLY when:

Walker is on a white pixel
Walker detects a black neighbor
Walker's current position turns black (termination)

When NO Broadcast Occurs

Walker hits max_steps limit (no black pixel created)
Walker hits boundary (walker leaves grid, no black pixel)
Walker started on black pixel (pixel was already black)

Key Innovation: Minimal Broadcasting

Message Count Analysis

For a simulation with N walkers:

Termination Reason	Creates Black Pixel?	Broadcast?
Black neighbor detected	YES	YES
Hit boundary	NO	NO
Max steps reached	NO	NO
Started on black	NO	NO

Total broadcasts = walkers that create black pixels

This is O(N), where N = total walkers. The number depends on:

Grid density (more black pixels → more terminations near them)
Grid size (larger grid → more boundary exits)
Boundary mode (wrap vs terminate)

Timing Characteristic (IMPORTANT)

Broadcasts are sequential, not simultaneous:

Each walker terminates independently at different times
At most #workers could theoretically broadcast simultaneously
In practice, simultaneous termination is rare
There are rarely 2+ broadcasts happening at exactly the same moment

The Correct Algorithm (Pseudocode)

function walker_step(walker, local_grid):
    current_pos = walker.position

    # STEP 1: Check neighbors for black pixels
    neighbors = get_neighbors(current_pos)

    for neighbor in neighbors:
        if local_grid[neighbor] == BLACK:
            # FOUND BLACK NEIGHBOR!
            # Current position becomes black (NOT next position)
            local_grid[current_pos] = BLACK
            BROADCAST(position=current_pos, walker_id=walker.id)
            return END_STATE.BLACK_NEIGHBOR

    # STEP 2: No black neighbors - move to random neighbor
    next_pos = random_choice(neighbors)

    if is_out_of_bounds(next_pos):
        return END_STATE.BOUNDARY  # No broadcast

    walker.position = next_pos
    return CONTINUE

Why This Works

Black pixels are permanent: Once black, always black
Broadcasts are idempotent: Receiving "pixel X is black" twice is harmless
Order doesn't matter: Any broadcast order produces valid state
Walker cannot step onto black: It terminates when adjacent

Eventual Consistency Model

Definition

Workers don't see updates instantly but eventually reach consistent state.

Timing Diagram

Time →
─────────────────────────────────────────────────────────────
Walker 1: Check neighbors → BLACK found! → Broadcast (50,50) ──→
                                                │
                                                │ Network delay
                                                │ (~1-10ms)
                                                ↓
Walker 2: ──────────────────────── Pop msg → Update grid ────→
                                                │
                                                ↓
                                    Now sees (50,50) as BLACK
─────────────────────────────────────────────────────────────

Consistency Guarantees

Guarantee	Provided?	Notes
Eventual consistency	✅ Yes	All workers converge
Causal ordering	❌ No	Messages may arrive out of order
Total ordering	❌ No	Not needed for this application
Strong consistency	❌ No	Would require locks

Implementation Details

Broadcasting Functions

Located in R/broadcasting.R:

1. Broadcast Black Pixel

broadcast_black_pixel <- function(pub_socket, position, walker_id) {
  message_data <- list(
    type = "black_pixel",
    position = position,  # CURRENT position where walker terminated
    walker_id = walker_id,
    timestamp = Sys.time()
  )
  nanonext::send(pub_socket, serialize(message_data, NULL))
}

2. Update Grid from Broadcasts

update_grid_from_broadcasts <- function(sub_socket, local_grid) {
  repeat {
    msg <- nanonext::recv(sub_socket, mode = "raw", block = FALSE)
    if (is.null(msg)) break  # No more messages

    update <- unserialize(msg)
    if (update$type == "black_pixel") {
      pos <- update$position
      local_grid[pos[1], pos[2]] <- 1  # Mark as black
    }
  }
  return(local_grid)
}

Walker Integration

In R/walker_dynamic.R, the core loop (simulate_walker_dynamic):

# STEP 1: Pop ALL broadcast messages, update local grid
local_grid <- update_grid_from_broadcasts(sub_socket, local_grid)

# STEP 2: Check neighbors for black pixels
has_black_neighbor <- any(sapply(neighbors, function(pos) {
  local_grid[pos[1], pos[2]] == 1
}))

if (has_black_neighbor) {
  # Walker STOPS - CURRENT position becomes BLACK
  local_grid[position[1], position[2]] <- 1
  broadcast_black_pixel(pub_socket, position, walker_id)
  return(list(status = "black_neighbor_detected", ...))
}

# STEP 3: No black neighbors - move
next_position <- choose_next_position(position, local_grid, neighborhood)
position <- next_position

Implementation Status

Phase 1: Broadcasting Infrastructure ✅ COMPLETE

Implemented in PR #83:

init_publisher_socket() - Setup pub socket
init_subscriber_socket() - Setup sub socket
broadcast_black_pixel() - Send update
update_grid_from_broadcasts() - Pop and apply
close_sockets() - Cleanup

Phase 2 & 3: ⚠️ DEPRECATED

The sync_mode parameter exists in run_simulation() but is DEPRECATED:

if (sync_mode %in% c("dynamic", "mirai_dynamic")) {
  logger::log_warn("sync_mode is DEPRECATED: nanonext sockets fail in crew/mirai subprocesses")
  logger::log_warn("Use sync_mode='chunked' for better results (RECOMMENDED)")
}

Why deprecated? nanonext sockets cannot be passed to crew/mirai subprocess contexts. The socket handles become invalid in child processes.

Recommended alternative: Use sync_mode = "chunked" which processes walkers in batches with grid synchronization between batches.

Use Cases and Limitations

Good Use Cases

Educational demonstrations - Understand distributed systems concepts
Research experiments - Study emergent patterns with interaction
Single-process testing - Broadcasting works within same process

Limitations

Not WebR compatible - nanonext requires native sockets
Socket inheritance broken - Sockets fail in crew/mirai subprocesses
~12% overhead - Network serialization cost
Timing-dependent - Results vary based on message delivery timing

Pros and Cons of Full Implementation

Aspect	Pros	Cons
Realism	Walkers "see" each other	Adds complexity
Performance	O(N) messages (acceptable)	~12% overhead
Implementation	Clean pub/sub pattern	Socket inheritance broken
Reproducibility	More realistic patterns	Non-deterministic results
WebR	N/A	Not compatible

Verdict: For most use cases, sync_mode = "chunked" provides better results with simpler implementation. True dynamic broadcasting would require solving the socket inheritance problem in crew/mirai.

References

Pub/Sub Pattern: Wikipedia
Eventual Consistency: Wikipedia
randomwalk source: R/broadcasting.R, R/walker_dynamic.R

Dynamic Grid Broadcasting Algorithm

Dynamic Grid Broadcasting Algorithm

Table of Contents

Problem Statement

The Challenge

Static vs Dynamic Approaches

Algorithm Overview

Core Insight

The Termination Rule (CRITICAL)

When Broadcasts Occur

When NO Broadcast Occurs

Key Innovation: Minimal Broadcasting

Message Count Analysis

Timing Characteristic (IMPORTANT)

The Correct Algorithm (Pseudocode)

Why This Works

Eventual Consistency Model

Definition

Timing Diagram

Consistency Guarantees

Implementation Details

Broadcasting Functions

1. Broadcast Black Pixel

2. Update Grid from Broadcasts

Walker Integration

Implementation Status

Phase 1: Broadcasting Infrastructure ✅ COMPLETE

Phase 2 & 3: ⚠️ DEPRECATED

Use Cases and Limitations

Good Use Cases

Limitations

Pros and Cons of Full Implementation

See Also

References

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally