# Multi-Node Testing for elastic-script

This notebook tests the distributed behavior of elastic-script, specifically:

1. **Execution State Persistence** - Execution state is stored in the `.escript_executions` index
2. **Track Name Lookups** - Named executions can be looked up from any node
3. **Leader Election** - Only one node runs scheduled jobs and triggers
4. **Cross-Node Visibility** - All nodes can see execution history

## Architecture Overview

```
┌────────────────────────────────────────────────────────────────┐
│                     Elasticsearch Cluster                      │
├──────────────┬──────────────┬──────────────────────────────────┤
│    Node 1    │    Node 2    │           Node 3                 │
│              │              │                                  │
│ ExecutionReg │ ExecutionReg │ ExecutionRegistry                │
│   (cache)    │   (cache)    │   (cache)                        │
├──────────────┴──────────────┴──────────────────────────────────┤
│                                                                │
│             .escript_executions (shared index)                 │
│             .escript_jobs (shared index)                       │
│             .escript_triggers (shared index)                   │
│             .escript_leader (leader election)                  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```

## 1. Test Execution State Persistence

When an async procedure runs, its state is persisted to the `.escript_executions` index.
This makes the execution visible from any node in the cluster.

In [None]:
-- Create a test procedure for async execution
CREATE PROCEDURE distributed_test()
BEGIN
    PRINT 'Starting distributed test...';
    
    -- Simulate some work
    DECLARE result VARCHAR;
    SET result = 'Completed from node';
    
    PRINT 'Work completed: ' || result;
END PROCEDURE

In [None]:
-- Start an async execution with tracking
-- The TRACK clause gives the execution a name for easy lookup
ASYNC distributed_test() TRACK 'my_distributed_task'

In [None]:
-- Check the execution status (can be done from any node)
EXECUTION('my_distributed_task') | STATUS

## 2. Verify Execution State in Elasticsearch

The execution state is stored in the `.escript_executions` index.
Let's query it directly to see the structure.

In [None]:
-- Query the executions index directly using ES|QL
FROM .escript_executions
| WHERE name IS NOT NULL
| KEEP execution_id, name, procedure, status, node, started_at
| SORT started_at DESC
| LIMIT 10

## 3. Test Track Name Resolution

Track names provide a human-friendly way to reference async executions.
The name is indexed for fast lookup.
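
As a rough illustration of what a lookup by track name could look like at the index level, the sketch below builds a request body for the Elasticsearch `_search` API. It assumes that `name` is mapped as a `keyword` field and that `started_at` is a date. The helper name `build_track_lookup` is hypothetical, and this is not necessarily how elastic-script resolves names internally.

```python
def build_track_lookup(track_name: str) -> dict:
    """Build a _search body that finds the newest execution with this track name.

    Assumption: `name` is a keyword field, so an exact `term` match applies.
    """
    return {
        "query": {"term": {"name": track_name}},
        # Newest first, in case the same track name was reused.
        "sort": [{"started_at": {"order": "desc"}}],
        "size": 1,
    }
```

Posting `build_track_lookup("batch_1")` to `/.escript_executions/_search` on any node should return the same document, since the search is served from the shared index rather than from a node-local cache.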

In [None]:
-- Create another tracked execution
CREATE PROCEDURE data_processor(batch_id INT)
BEGIN
    PRINT 'Processing batch ' || batch_id;
    -- Simulate processing
    DECLARE i INT = 0;
    WHILE i < 5 LOOP
        SET i = i + 1;
    END LOOP;
    PRINT 'Batch ' || batch_id || ' complete';
END PROCEDURE

In [None]:
-- Start multiple tracked executions
ASYNC data_processor(1) TRACK 'batch_1';
ASYNC data_processor(2) TRACK 'batch_2';
ASYNC data_processor(3) TRACK 'batch_3'

In [None]:
-- Check status of each batch
EXECUTION('batch_1') | STATUS;
EXECUTION('batch_2') | STATUS;
EXECUTION('batch_3') | STATUS

## 4. Test Leader Election State

The `.escript_leader` index tracks which node is the leader for:
- Running scheduled jobs
- Polling event triggers

In [None]:
-- Query the leader document directly
FROM .escript_leader
| KEEP node_id, last_heartbeat, acquired_at
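
To make the "only one leader" guarantee more concrete, here is a minimal in-memory sketch of lease-based leader election. It is an illustration under stated assumptions, not the plugin's implementation: the field names `node_id`, `last_heartbeat`, and `acquired_at` come from the query above, while `LeaderStore`, `try_lead`, and the 30-second `LEASE_TIMEOUT` are hypothetical. The `seq_no` check stands in for Elasticsearch's optimistic concurrency control (`if_seq_no` / `if_primary_term` on index requests).

```python
from dataclasses import dataclass
from typing import Optional

LEASE_TIMEOUT = 30.0  # hypothetical: seconds without a heartbeat before takeover


@dataclass
class LeaderDoc:
    """In-memory stand-in for the single document in .escript_leader."""
    node_id: str
    last_heartbeat: float  # seconds since epoch
    acquired_at: float
    seq_no: int = 0  # bumped on every write, like Elasticsearch's _seq_no


class LeaderStore:
    """Simulates a compare-and-set write guarded by a sequence number."""

    def __init__(self) -> None:
        self.doc: Optional[LeaderDoc] = None

    def write_if_unchanged(self, expected_seq_no: Optional[int], new_doc: LeaderDoc) -> bool:
        current = self.doc.seq_no if self.doc else None
        if current != expected_seq_no:
            return False  # another node wrote first; the caller lost the race
        new_doc.seq_no = 0 if current is None else current + 1
        self.doc = new_doc
        return True


def try_lead(store: LeaderStore, node_id: str, now: float) -> bool:
    """Return True if node_id holds the lease after this call."""
    doc = store.doc
    if doc is None:
        # No leader yet: try to create the lease document.
        return store.write_if_unchanged(None, LeaderDoc(node_id, now, now))
    if doc.node_id == node_id:
        # Renew our own lease by refreshing the heartbeat.
        return store.write_if_unchanged(doc.seq_no, LeaderDoc(node_id, now, doc.acquired_at))
    if now - doc.last_heartbeat > LEASE_TIMEOUT:
        # The previous leader stopped heartbeating: take over.
        return store.write_if_unchanged(doc.seq_no, LeaderDoc(node_id, now, now))
    return False
```

In this model a follower only takes over once the current leader's heartbeat is older than the timeout, and a write based on a stale `seq_no` is rejected, so two nodes cannot both believe they acquired the lease from the same document version.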

## 5. Test Job State Persistence

Scheduled jobs are stored in the `.escript_jobs` index.

In [None]:
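-- Create a scheduled job (disabled, so it never fires during the test)
-- Note: cleanup_procedure() is not defined in this notebook; create it first if CREATE JOB validates the reference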
CREATE JOB test_multinode_job
    SCHEDULE '*/5 * * * *'
    AS cleanup_procedure()
    ENABLED FALSE
    DESCRIPTION 'Test job for multi-node verification'

In [None]:
-- Query jobs from the index
FROM .escript_jobs
| KEEP name, schedule, enabled, description
| SORT name

In [None]:
-- Show jobs using the built-in command
SHOW JOBS

## 6. Test Trigger State Persistence

Event triggers are stored in the `.escript_triggers` index.

In [None]:
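-- Create a disabled trigger for testing
-- Note: error_handler() is not defined in this notebook; create it first if CREATE TRIGGER validates the reference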
CREATE TRIGGER test_multinode_trigger
    ON INDEX "logs-*"
    WHEN level == 'ERROR'
    EVERY '30s'
    AS error_handler()
    ENABLED FALSE
    DESCRIPTION 'Test trigger for multi-node verification'

In [None]:
-- Query triggers from the index
FROM .escript_triggers
| KEEP name, index_pattern, condition, enabled, description
| SORT name

In [None]:
-- Show triggers using the built-in command
SHOW TRIGGERS

## 7. Multi-Node Verification Summary

In a multi-node cluster, all of the above state is:
1. **Persisted to Elasticsearch** - Not lost if a node goes down
2. **Accessible from any node** - Queries work regardless of which node handles the request
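3. **Consistently updated** - Elasticsearch serializes concurrent writes to the same document and exposes sequence numbers (`if_seq_no`, `if_primary_term`) for optimistic concurrency control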

### Key Indices

| Index | Purpose | Key Fields |
|-------|---------|------------|
| `.escript_executions` | Async execution state | execution_id, name, status, node |
| `.escript_jobs` | Scheduled job definitions | name, schedule, enabled |
| `.escript_job_runs` | Job execution history | job_name, started_at, status |
| `.escript_triggers` | Event trigger definitions | name, index_pattern, condition |
| `.escript_trigger_runs` | Trigger execution history | trigger_name, documents_matched |
| `.escript_leader` | Leader election | node_id, last_heartbeat |

In [None]:
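-- List all escript indices that contain at least one document
-- (ES|QL cannot read the _cat/indices REST endpoint, so group documents by the _index metadata field)
FROM .escript* METADATA _index
| STATS doc_count = COUNT(*) BY _index
| SORT _index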

## 8. Cleanup Test Resources

In [None]:
-- Clean up test jobs and triggers
DROP JOB test_multinode_job

In [None]:
DROP TRIGGER test_multinode_trigger

In [None]:
-- Clean up test procedures
DROP PROCEDURE distributed_test;
DROP PROCEDURE data_processor

## Running in a Real Multi-Node Cluster

To test actual multi-node behavior:

1. **Start a multi-node cluster** using Docker Compose or multiple ES instances
2. **Run this notebook** against each node's endpoint
3. **Verify** that state created on one node is visible from another
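
The three steps above can be partly automated. The sketch below asks each node for the tracked executions it can see and reports any node that is missing an ID another node returned. The ports match the Docker Compose example in this notebook; the `localhost` hosts, the absence of authentication, and the helper names `fetch_execution_ids` and `missing_per_node` are assumptions for illustration.

```python
import json
import urllib.request

# Ports from the Docker Compose example; hosts are an assumption.
NODE_URLS = {
    "es01": "http://localhost:9200",
    "es02": "http://localhost:9201",
    "es03": "http://localhost:9202",
}


def fetch_execution_ids(base_url: str) -> set[str]:
    """Return the execution_id values one node reports for tracked executions."""
    body = json.dumps({
        "query": {"exists": {"field": "name"}},
        "_source": ["execution_id"],
        "size": 100,
    }).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/.escript_executions/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        hits = json.load(response)["hits"]["hits"]
    return {hit["_source"]["execution_id"] for hit in hits}


def missing_per_node(views: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each node, list the IDs seen elsewhere in the cluster but not on it."""
    everything = set().union(*views.values()) if views else set()
    return {
        node: everything - seen
        for node, seen in views.items()
        if everything - seen
    }
```

With the cluster running, `missing_per_node({node: fetch_execution_ids(url) for node, url in NODE_URLS.items()})` should return an empty dictionary when every node sees the same executions.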

### Example Docker Compose Setup

```yaml
version: '3.8'
services:
  es01:
    image: elasticsearch-with-escript:latest
    environment:
      - node.name=es01
      - cluster.name=escript-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
    ports:
      - 9200:9200
  es02:
    image: elasticsearch-with-escript:latest
    environment:
      - node.name=es02
      - cluster.name=escript-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
    ports:
      - 9201:9200
  es03:
    image: elasticsearch-with-escript:latest
    environment:
      - node.name=es03
      - cluster.name=escript-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
    ports:
      - 9202:9200
```

### Testing Failover

1. Create a job on node es01 (port 9200)
2. Stop es01
3. Query the job from es02 (port 9201) - should still be visible
4. Check leader election - if es01 held the leader role, es02 or es03 should take over once its heartbeat goes stale; otherwise the leader should be unchanged
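
The outcome of these steps can be checked with a small helper like the one below. It is a sketch only: `check_failover` is a hypothetical name, and the inputs are assumed to have been collected beforehand, for example job names from `.escript_jobs` and `node_id` from `.escript_leader`, once before stopping the node and once afterwards.

```python
from typing import Optional


def check_failover(
    stopped_node: str,
    jobs_before: list[str],
    jobs_after: list[str],
    leader_after: Optional[str],
) -> list[str]:
    """Return a list of problems; an empty list means the failover check passed."""
    problems = []
    # Every job that existed before the node stopped must still be visible.
    lost = set(jobs_before) - set(jobs_after)
    if lost:
        problems.append(f"jobs no longer visible: {sorted(lost)}")
    # The leader document must name a surviving node.
    if leader_after is None:
        problems.append("no leader elected")
    elif leader_after == stopped_node:
        problems.append(f"leader is still the stopped node {stopped_node}")
    return problems
```

A new leader may not appear immediately after es01 stops, so it can take a few retries (depending on the plugin's heartbeat timeout) before the list comes back empty.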