# Remote Execution (PsExec)

This notebook demonstrates how to execute HEC-RAS plans on remote Windows machines using PsExec.

**Features:**
- Distributed execution across multiple remote machines
- Automatic project deployment via network shares
- Parallel execution with configurable workers
- Result collection and consolidation
- **Automatic PsExec.exe download** (no manual setup required)

**Requirements:**
- Remote machine(s) configured per REMOTE_WORKER_SETUP_GUIDE.md (see feature_dev_notes/RasRemote/)
- Network share accessible from control machine
- HEC-RAS installed on remote machine(s)

**Note:** PsExec.exe will be automatically downloaded to `C:\Users\{username}\psexec\` if not found.

**Author:** William (Bill) Katzenmeyer, P.E., C.F.M.

**Date:** 2025-11-24

## 1. Setup and Imports

In [1]:
##### Optional Code Cell For Development/Testing Mode (Local Copy)
##### Run this cell instead of a pip-install cell when working from a local copy of ras-commander

# Optional dependency install for remote workers (only if missing)
import sys
import subprocess

need_paramiko = False
need_docker = False

try:
    import paramiko
except ImportError:
    need_paramiko = True

try:
    import docker
except ImportError:
    need_docker = True

to_install = []
if need_paramiko:
    to_install.append("paramiko")
if need_docker:
    to_install.append("docker")

if to_install:
    print(f"Installing missing packages: {', '.join(to_install)}")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", *to_install])
else:
    print("paramiko and docker already installed; skipping pip install.")

# For Development Mode, add the parent directory to the Python path
import os
from pathlib import Path
import time

current_file = Path(os.getcwd()).resolve()
rascmdr_directory = current_file.parent

# Use insert(0) instead of append() to give highest priority to local version
if str(rascmdr_directory) not in sys.path:
    sys.path.insert(0, str(rascmdr_directory))

print("Loading ras-commander from local dev copy")
from ras_commander import *

paramiko and docker already installed; skipping pip install.
Loading ras-commander from local dev copy


## 2. Configure Remote Workers

Load worker configurations from `RemoteWorkers.json` file using `load_workers_from_json()`.

**First time setup:**
1. Copy `RemoteWorkers.json.template` to `RemoteWorkers.json`
2. Edit `RemoteWorkers.json` with your remote machine details
3. The JSON file is in `.gitignore` for security (credentials won't be committed)

**JSON Format:**
```json
{
  "workers": [
    {
      "name": "Local Compute",
      "worker_type": "local",
      "worker_folder": "C:\\RasRemote",
      "process_priority": "low",
      "queue_priority": 0,
      "cores_total": 4,
      "cores_per_plan": 2,
      "enabled": true
    },
    {
      "name": "Remote Workstation",
      "worker_type": "psexec",
      "hostname": "192.168.1.100",
      "share_path": "\\\\192.168.1.100\\RasRemote",
      "worker_folder": "C:\\RasRemote",
      "username": "your_username",
      "password": "your_password",
      "session_id": 2,
      "process_priority": "low",
      "queue_priority": 1,
      "cores_total": 16,
      "cores_per_plan": 4,
      "enabled": true
    }
  ]
}
```
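A quick sanity check of the file before loading it can catch missing fields early. The sketch below is a hypothetical pre-flight check, not part of ras-commander: `check_worker_entry` and `REQUIRED_BY_TYPE` are illustrative names, and the required-field sets are assumptions inferred from the example JSON above.

```python
# Assumed required fields per worker type, inferred from the example JSON above.
# These sets are NOT taken from ras-commander itself.
REQUIRED_BY_TYPE = {
    "local": {"worker_folder"},
    "psexec": {"hostname", "share_path", "worker_folder", "username", "password"},
}

def check_worker_entry(entry: dict) -> list[str]:
    """Return a list of human-readable problems found in one worker entry."""
    problems = []
    worker_type = entry.get("worker_type")
    if not worker_type:
        return ["missing required field 'worker_type'"]

    # Report every assumed-required field that the entry does not define
    missing = REQUIRED_BY_TYPE.get(worker_type, set()) - set(entry)
    for field in sorted(missing):
        problems.append(f"{worker_type} worker missing '{field}'")

    # queue_priority is documented as an integer from 0 to 9
    priority = entry.get("queue_priority", 0)
    if not isinstance(priority, int) or not 0 <= priority <= 9:
        problems.append("queue_priority must be an integer from 0 to 9")
    return problems

sample_workers = [
    {"name": "Local Compute", "worker_type": "local", "worker_folder": r"C:\RasRemote"},
    {"name": "Broken", "worker_type": "psexec", "hostname": "192.168.1.100", "queue_priority": 12},
]

for entry in sample_workers:
    print(entry["name"], check_worker_entry(entry))
```

The first entry reports no problems; the second reports each missing psexec field plus the out-of-range `queue_priority`.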

**Key Changes (v0.85.0):**
- `ras_exe_path` is no longer required - automatically obtained from the initialized RAS project
- Use `load_workers_from_json()` to load all workers from a JSON file
- `worker_type` field is now required in each worker configuration
- `worker_folder` replaces `local_path` - specifies where temp folders are created

**Configuration Fields:**
- `worker_type`: Required - "psexec", "local", "docker", "ssh", etc.
- `worker_folder`: Path local to the worker machine where temporary worker folders are created during execution
- `share_path`: (psexec/docker) UNC path to the network share that maps to `worker_folder`
- `process_priority`: OS process priority for HEC-RAS execution
  - Valid values: `"low"` (default, recommended), `"below normal"`, `"normal"`
- `queue_priority`: Execution queue priority (0-9)
  - Lower values execute first (0 = highest priority)
- `cores_total`: Total CPU cores on the remote machine (enables parallel execution)
- `cores_per_plan`: Cores allocated to each HEC-RAS plan
- **Parallel plans**: `cores_total // cores_per_plan`, rounded down (e.g., 16 // 4 = 4 plans in parallel)
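The capacity rule above can be sketched in a few lines. `max_parallel_plans` here is a standalone illustrative function, not the library's implementation. The fallback to one plan when `cores_total` is unset mirrors the "Sequential execution" branch in the preview cell of this notebook, and the floor of 1 is an added assumption.

```python
def max_parallel_plans(cores_total=None, cores_per_plan=4):
    """Plans one worker can run at once: floor division, never less than 1."""
    if not cores_total:
        return 1  # cores_total not set: treat as sequential execution
    # Assumption: a worker with fewer cores than cores_per_plan still runs one plan
    return max(1, cores_total // cores_per_plan)

print(max_parallel_plans(16, 4))  # 4
print(max_parallel_plans(8, 2))   # 4
print(max_parallel_plans(4, 4))   # 1
print(max_parallel_plans(None))   # 1
```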

**Session ID:** Run `query user` on the remote machine to find the active session ID (typically 2).

### Docker Worker Configuration

For Docker workers using SSH remote hosts (`docker_host: "ssh://user@host"`):

**Required Setup:**
1. **SSH key-based authentication** (password auth NOT supported by Docker SDK)
   ```bash
   # Generate SSH key (if you don't have one)
   ssh-keygen -t ed25519
   
   # Copy key to remote host
   ssh-copy-id user@192.168.3.8
   
   # Test connection (must work without password prompt)
   ssh user@192.168.3.8 "docker info"
   ```

2. **Docker configuration on remote host:**
   - Docker Desktop or Docker Engine must be running
   - User must be in the `docker` group (Linux) or have Docker Desktop access (Windows)

3. **JSON configuration example:**
   ```json
   {
     "name": "Remote Docker",
     "worker_type": "docker",
     "docker_image": "hecras:6.6",
     "docker_host": "ssh://user@192.168.3.8",
     "share_path": "\\\\192.168.3.8\\RasRemote",
     "remote_staging_path": "/mnt/c/RasRemote",
     "use_ssh_client": true,
     "cores_total": 8,
     "cores_per_plan": 4,
     "preprocess_on_host": true,
     "enabled": true
   }
   ```

**Key Docker Fields:**
- `docker_image`: Docker image with HEC-RAS Linux (e.g., "hecras:6.6")
- `docker_host`: Docker daemon URL - `ssh://user@host` for remote SSH
- `remote_staging_path`: Path on Docker host for volume mounts (use WSL paths like `/mnt/c/...`)
- `use_ssh_client`: Set `true` to use system ssh command (recommended for SSH agent support)
- `preprocess_on_host`: Set `true` to run Windows preprocessing locally before Docker execution
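Because `remote_staging_path` uses WSL-style paths, it can help to derive it from the Windows folder behind the share. The helper below is a minimal sketch: `windows_to_wsl_path` is a hypothetical name, and it assumes the default WSL drive mount point `/mnt/<drive letter>`.

```python
from pathlib import PureWindowsPath

def windows_to_wsl_path(windows_path: str) -> str:
    """Convert a drive-letter Windows path to its default WSL mount path."""
    p = PureWindowsPath(windows_path)
    drive = p.drive.rstrip(":").lower()
    # UNC paths and relative paths have no single-letter drive and are rejected
    if len(drive) != 1 or not drive.isalpha():
        raise ValueError(f"Expected a drive-letter path, got: {windows_path!r}")
    # p.parts[0] is the anchor (e.g. 'C:\\'); the rest are folder names
    return "/".join(["/mnt", drive, *p.parts[1:]])

print(windows_to_wsl_path(r"C:\RasRemote"))  # /mnt/c/RasRemote
```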

In [2]:
# Preview the remote worker configurations stored in RemoteWorkers.json
# Note: Workers are loaded later with load_workers_from_json(), AFTER init_ras_project(), so ras_exe_path is obtained automatically

config_file = Path("RemoteWorkers.json")

if not config_file.exists():
    print("ERROR: RemoteWorkers.json not found!")
    print()
    print("First time setup:")
    print("1. Copy RemoteWorkers.json.template to RemoteWorkers.json")
    print("2. Edit RemoteWorkers.json with your remote machine details")
    print("3. Run this cell again")
    print()
    print("The RemoteWorkers.json file should be in the same folder as this notebook.")
    raise FileNotFoundError("RemoteWorkers.json not found. See instructions above.")

# Preview the JSON configuration (without loading workers yet)
import json
with open(config_file, 'r') as f:
    worker_configs = json.load(f)

# Get enabled workers for display
enabled_configs = [w for w in worker_configs["workers"] if w.get("enabled", True)]

print(f"Found {len(enabled_configs)} enabled worker(s) in RemoteWorkers.json:")
for w in enabled_configs:
    cores_total = w.get('cores_total', 'Not set')
    cores_per_plan = w.get('cores_per_plan', 4)
    process_priority = w.get('process_priority', 'low')
    queue_priority = w.get('queue_priority', 0)
    
    if w.get('cores_total'):
        max_parallel = w['cores_total'] // cores_per_plan
        parallel_info = f"{max_parallel} plans in parallel"
    else:
        parallel_info = "Sequential execution"

    print(f"  - {w.get('name', 'unnamed')} ({w.get('hostname', 'localhost')})")
    print(f"    Type: {w.get('worker_type', 'unknown')}")
    print(f"    Cores: {cores_total} total, {cores_per_plan} per plan → {parallel_info}")
    print(f"    Process Priority: {process_priority}, Queue Priority: {queue_priority}")

print()
print("NOTE: Workers will be loaded after init_ras_project() to get ras_exe_path automatically")

Found 1 enabled worker(s) in RemoteWorkers.json:
  - CLB-04 Docker 6.6 (localhost)
    Type: docker
    Cores: 4 total, 4 per plan → 1 plans in parallel
    Process Priority: low, Queue Priority: 4

NOTE: Workers will be loaded after init_ras_project() to get ras_exe_path automatically


## 3. Example 1: Execute Single Plan (Muncie)

Simple example executing one plan from the Muncie example project.

In [3]:
# Extract Muncie example project
muncie_path = RasExamples.extract_project("Muncie")
print(f"Project extracted to: {muncie_path}")

# Initialize project (updates global ras object)
init_ras_project(muncie_path, "6.6")
print(f"Project initialized: {ras.project_name}")
print(f"Available plans: {list(ras.plan_df.index)}")

2025-12-04 10:51:01 - ras_commander.RasExamples - INFO - Found zip file: c:\GH\ras-commander\examples\Example_Projects_6_6.zip
2025-12-04 10:51:01 - ras_commander.RasExamples - INFO - Loading project data from CSV...
2025-12-04 10:51:01 - ras_commander.RasExamples - INFO - Loaded 68 projects from CSV.
2025-12-04 10:51:01 - ras_commander.RasExamples - INFO - ----- RasExamples Extracting Project -----
2025-12-04 10:51:01 - ras_commander.RasExamples - INFO - Extracting project 'Muncie'
2025-12-04 10:51:02 - ras_commander.RasExamples - INFO - Successfully extracted project 'Muncie' to c:\GH\ras-commander\examples\example_projects\Muncie
2025-12-04 10:51:02 - ras_commander.RasMap - INFO - Successfully parsed RASMapper file: C:\GH\ras-commander\examples\example_projects\Muncie\Muncie.rasmap


Project extracted to: c:\GH\ras-commander\examples\example_projects\Muncie
Project initialized: Muncie
Available plans: [0, 1, 2]


In [4]:
# Load workers from JSON - ras_exe_path is automatically obtained from the ras object
# This must be called AFTER init_ras_project() so the RAS executable path is known

workers = load_workers_from_json("RemoteWorkers.json")

print(f"Loaded {len(workers)} worker(s):")
for w in workers:
    print(f"  - {w.worker_id} ({w.worker_type})")
    print(f"    Hostname: {w.hostname}")
    print(f"    RAS Exe: {w.ras_exe_path}")
    print(f"    Session ID: {getattr(w, 'session_id', 'N/A')}")
    print(f"    Process Priority: {getattr(w, 'process_priority', 'N/A')}")
    print(f"    Queue Priority: {getattr(w, 'queue_priority', 'N/A')}")
    if hasattr(w, 'max_parallel_plans') and w.max_parallel_plans > 1:
        print(f"    Parallel Capacity: {w.max_parallel_plans} plans simultaneously")
    print()

# Use first worker for single-plan examples
if workers:
    worker = workers[0]
    print(f"Using worker for examples: {worker.worker_id}")
else:
    raise ValueError("No workers loaded from RemoteWorkers.json")

2025-12-04 10:51:02 - ras_commander.remote.RasWorker - INFO - Initializing docker worker
2025-12-04 10:51:02 - ras_commander.remote.DockerWorker - INFO - Using system ssh client for Docker connection
2025-12-04 10:51:02 - ras_commander.remote.DockerWorker - INFO - Docker daemon connected: ssh://bill@192.168.3.8
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO - Docker image found: hecras:6.6
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO - DockerWorker initialized:
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Image: hecras:6.6
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Host: ssh://bill@192.168.3.8
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Preprocess on host: True
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Max parallel plans: 1
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Timeout: 60 minutes
2025-12-04 10:51:03 - ras_commander.remote.DockerWo

Loaded 1 worker(s):
  - CLB-04 Docker 6.6 (docker)
    Hostname: None
    RAS Exe: C:\Program Files (x86)\HEC\HEC-RAS\6.6\Ras.exe
    Session ID: N/A
    Process Priority: low
    Queue Priority: 4

Using worker for examples: CLB-04 Docker 6.6


In [5]:
# Execute Plan 01 remotely
# autoclean=True (default) deletes worker folders after execution
# Set autoclean=False for debugging to preserve worker folders on the remote machine

print("Executing Plan 01 on remote machine...")
print("This will take ~30-60 seconds")

start_time = time.time()

results = compute_parallel_remote(
    plan_numbers="01",
    workers=[worker],
    num_cores=4,
    autoclean=True  # Default is True - deletes temp folders after execution
)

elapsed = time.time() - start_time

print(f"\nExecution complete in {elapsed:.1f} seconds ({elapsed/60:.1f} minutes)")
print(f"\nResults:")
for plan_num, result in results.items():
    if result.success:
        print(f"  Plan {plan_num}: SUCCESS")
        print(f"    HDF Path: {result.hdf_path}")
        print(f"    Execution Time: {result.execution_time:.1f}s")
    else:
        print(f"  Plan {plan_num}: FAILED - {result.error_message}")

2025-12-04 10:51:03 - ras_commander.remote.Execution - INFO - Starting distributed execution of 1 plans across 1 workers
2025-12-04 10:51:03 - ras_commander.remote.Execution - INFO - Total worker slots available: 1
2025-12-04 10:51:03 - ras_commander.remote.Execution - INFO - Submitting plan 01 to worker CLB-04 Docker 6.6 (sub-worker #1)
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO - Starting Docker execution: plan 01, sub-worker 1
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO - Remote Docker host: ssh://bill@192.168.3.8
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Local preprocessing: C:\Users\BILLK_~1\AppData\Local\Temp
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Remote share (UNC): \\192.168.3.8\RasRemote
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO -   Docker mounts: /mnt/c/RasRemote
2025-12-04 10:51:03 - ras_commander.remote.DockerWorker - INFO - Copying project to local staging for p

Executing Plan 01 on remote machine...
This will take ~30-60 seconds


2025-12-04 10:51:07 - ras_commander.remote.DockerWorker - INFO - Detected 'Starting Unsteady Flow Computations' in Muncie.bco01
2025-12-04 10:51:07 - ras_commander.remote.DockerWorker - INFO - Preprocessing complete - terminating HEC-RAS before computation starts
2025-12-04 10:51:07 - ras_commander.remote.DockerWorker - INFO - Terminating HEC-RAS process...
2025-12-04 10:51:07 - ras_commander.remote.DockerWorker - INFO - HEC-RAS terminated successfully
2025-12-04 10:51:07 - ras_commander.remote.DockerWorker - INFO - Preprocessing complete: Muncie.p01.tmp.hdf (1.9 MB)
2025-12-04 10:51:07 - ras_commander.remote.DockerWorker - INFO - Copying preprocessed files to remote share...
2025-12-04 10:51:08 - ras_commander.remote.DockerWorker - INFO - Files copied to: \\192.168.3.8\RasRemote\ras_docker_Muncie_p01_sw1_b5d7969b
2025-12-04 10:51:08 - ras_commander.remote.DockerWorker - INFO - Plan 01 uses geometry 01
2025-12-04 10:51:08 - ras_commander.remote.DockerWorker - INFO - Starting container:


Execution complete in 48.3 seconds (0.8 minutes)

Results:
  Plan 01: SUCCESS
    HDF Path: C:\GH\ras-commander\examples\example_projects\Muncie\Muncie.p01.hdf
    Execution Time: 48.3s


In [6]:
# Verify Muncie results using HDF analysis
from ras_commander import HdfResultsPlan

hdf_path = Path(muncie_path) / "Muncie.p01.hdf"

if hdf_path.exists():
    print("=" * 70)
    print("MUNCIE PLAN 01 - RESULT VERIFICATION")
    print("=" * 70)
    print()
    
    # Get basic info
    size_mb = hdf_path.stat().st_size / (1024 * 1024)
    print(f"HDF File: {hdf_path.name}")
    print(f"Size: {size_mb:.2f} MB")
    print()
    
    # Get compute messages (static method)
    msgs = HdfResultsPlan.get_compute_messages(hdf_path)
    
    if "completed successfully" in msgs.lower() or "complete process" in msgs.lower():
        print("Compute Status: ✅ Successful")
    else:
        print("Compute Status: ⚠️ Check messages")
    
    # Show last part of compute messages
    print("\nCompute Messages (last 250 chars):")
    print(msgs[-250:])
    print()
    
    # Get steady flow results
    is_steady = HdfResultsPlan.is_steady_plan(hdf_path)
    if is_steady:
        profiles = HdfResultsPlan.get_steady_profile_names(hdf_path)
        print(f"Steady Flow Profiles: {profiles}")
        
        # Get WSE for first profile
        if profiles:
            wse_df = HdfResultsPlan.get_steady_wse(hdf_path, profiles[0])
            if wse_df is not None and len(wse_df) > 0:
                print(f"Cross Sections: {len(wse_df)}")
                print(f"WSE Range: {wse_df['W.S. Elev'].min():.2f} to {wse_df['W.S. Elev'].max():.2f} ft")
    
    # Get volume accounting
    try:
        vol = HdfResultsPlan.get_volume_accounting(hdf_path)
        if vol is not None:
            print(f"\nVolume Accounting: Available ({len(vol)} entries)")
            print(vol)
    except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
        print("\nVolume Accounting: Not available")
    
    print()
    print("✅ Remote execution verified - HDF results successfully collected!")
    print()
else:
    print("❌ HDF file not found - execution may have failed")

2025-12-04 10:51:51 - ras_commander.hdf.HdfResultsPlan - INFO - Using existing Path object HDF file: c:\GH\ras-commander\examples\example_projects\Muncie\Muncie.p01.hdf
2025-12-04 10:51:51 - ras_commander.hdf.HdfResultsPlan - INFO - Final validated file path: c:\GH\ras-commander\examples\example_projects\Muncie\Muncie.p01.hdf
2025-12-04 10:51:51 - ras_commander.hdf.HdfResultsPlan - INFO - Using existing Path object HDF file: c:\GH\ras-commander\examples\example_projects\Muncie\Muncie.p01.hdf
2025-12-04 10:51:51 - ras_commander.hdf.HdfResultsPlan - INFO - Final validated file path: c:\GH\ras-commander\examples\example_projects\Muncie\Muncie.p01.hdf
2025-12-04 10:51:51 - ras_commander.hdf.HdfResultsPlan - INFO - Using existing Path object HDF file: c:\GH\ras-commander\examples\example_projects\Muncie\Muncie.p01.hdf
2025-12-04 10:51:51 - ras_commander.hdf.HdfResultsPlan - INFO - Final validated file path: c:\GH\ras-commander\examples\example_projects\Muncie\Muncie.p01.hdf


MUNCIE PLAN 01 - RESULT VERIFICATION

HDF File: Muncie.p01.hdf
Size: 3.25 MB

Compute Status: ⚠️ Check messages

Compute Messages (last 250 chars):



Volume Accounting: Available (1 entries)
      Error  Error Percent  Total Boundary Flux of Water In  \
0 -0.277472        0.00075                     36674.503906   

   Total Boundary Flux of Water Out Vol Accounting in  Volume Ending  \
0                      33463.007812         Acre Feet    3533.462646   

   Volume Starting  
0       322.244659  

✅ Remote execution verified - HDF results successfully collected!


## 4. Example 2: Execute Multiple Plans (BaldEagleCrkMulti2D)

Execute several 2D unsteady plans from the BaldEagleCrkMulti2D example project on a single worker.



In [7]:
# Extract BaldEagleCrkMulti2D project
baldeagle_path = RasExamples.extract_project("BaldEagleCrkMulti2D")
print(f"Project extracted to: {baldeagle_path}")

# Initialize project (updates global ras object)
init_ras_project(baldeagle_path, "6.6")
print(f"Project initialized: {ras.project_name}")
print(f"Available plans: {list(ras.plan_df.index)}")

2025-12-04 10:51:51 - ras_commander.RasExamples - INFO - ----- RasExamples Extracting Project -----
2025-12-04 10:51:51 - ras_commander.RasExamples - INFO - Extracting project 'BaldEagleCrkMulti2D'
2025-12-04 10:51:52 - ras_commander.RasExamples - INFO - Successfully extracted project 'BaldEagleCrkMulti2D' to c:\GH\ras-commander\examples\example_projects\BaldEagleCrkMulti2D
2025-12-04 10:51:53 - ras_commander.RasMap - INFO - Successfully parsed RASMapper file: C:\GH\ras-commander\examples\example_projects\BaldEagleCrkMulti2D\BaldEagleDamBrk.rasmap


Project extracted to: c:\GH\ras-commander\examples\example_projects\BaldEagleCrkMulti2D
Project initialized: BaldEagleDamBrk
Available plans: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [8]:
# Execute a few plans to test parallel execution with thread-safe implementation
# Testing with 3 plans to verify thread-safety fix works
# (Reduced from 8 plans to avoid timeout during testing)

test_plans = ["03", "04", "06"]
print(f"Executing {len(test_plans)} plans on remote machine: {test_plans}")
print("These are 2D unsteady models - may take 5-10 minutes total")
print("Watch the logs to observe queue priority and wave scheduling")

start_time = time.time()

results = compute_parallel_remote(
    plan_numbers=test_plans,
    workers=[worker],
    num_cores=4,
    autoclean=True  # Default is True - deletes temp folders after execution
)

elapsed = time.time() - start_time

print(f"\nExecution complete in {elapsed:.1f} seconds ({elapsed/60:.1f} minutes)")
print(f"\nResults:")
success_count = 0
for plan_num, result in results.items():
    if result.success:
        print(f"  Plan {plan_num}: SUCCESS ({result.execution_time:.1f}s)")
        success_count += 1
    else:
        print(f"  Plan {plan_num}: FAILED - {result.error_message}")

print(f"\nSummary: {success_count}/{len(results)} plans succeeded")

2025-12-04 10:51:53 - ras_commander.remote.Execution - INFO - Starting distributed execution of 3 plans across 1 workers
2025-12-04 10:51:53 - ras_commander.remote.Execution - INFO - Total worker slots available: 1
2025-12-04 10:51:53 - ras_commander.remote.Execution - INFO - Submitting plan 03 to worker CLB-04 Docker 6.6 (sub-worker #1)
2025-12-04 10:51:53 - ras_commander.remote.Execution - INFO - Submitting plan 04 to worker CLB-04 Docker 6.6 (sub-worker #1)
2025-12-04 10:51:53 - ras_commander.remote.DockerWorker - INFO - Starting Docker execution: plan 03, sub-worker 1
2025-12-04 10:51:53 - ras_commander.remote.Execution - INFO - Submitting plan 06 to worker CLB-04 Docker 6.6 (sub-worker #1)
2025-12-04 10:51:53 - ras_commander.remote.DockerWorker - INFO - Remote Docker host: ssh://bill@192.168.3.8
2025-12-04 10:51:53 - ras_commander.remote.DockerWorker - INFO -   Local preprocessing: C:\Users\BILLK_~1\AppData\Local\Temp
2025-12-04 10:51:53 - ras_commander.remote.DockerWorker - INFO 

Executing 3 plans on remote machine: ['03', '04', '06']
These are 2D unsteady models - may take 5-10 minutes total
Watch the logs to observe queue priority and wave scheduling


2025-12-04 10:51:53 - ras_commander.remote.DockerWorker - INFO - Running preprocessing locally (not on network share)...
2025-12-04 10:51:53 - ras_commander.RasMap - INFO - Successfully parsed RASMapper file: C:\Users\billk_clb\AppData\Local\Temp\ras_docker_BaldEagleDamBrk_p03_sw1_87b65e71\input\BaldEagleDamBrk.rasmap
2025-12-04 10:51:53 - ras_commander.remote.DockerWorker - INFO - Preprocessing plan 03 for Linux execution...
2025-12-04 10:51:53 - ras_commander.geom.GeomPreprocessor - INFO - Clearing geometry preprocessor file for single plan: 03
2025-12-04 10:51:53 - ras_commander.geom.GeomPreprocessor - INFO - Geometry dataframe updated successfully.
2025-12-04 10:51:53 - ras_commander.remote.DockerWorker - INFO - Plan 03 uses geometry 09
2025-12-04 10:51:53 - ras_commander.RasPlan - INFO - Successfully updated run flags in plan file: C:\Users\BILLK_~1\AppData\Local\Temp\ras_docker_BaldEagleDamBrk_p03_sw1_87b65e71\input\BaldEagleDamBrk.p03 (flags modified: 3)
2025-12-04 10:51:53 - ra


Execution complete in 719.5 seconds (12.0 minutes)

Results:
  Plan 03: SUCCESS (165.6s)
  Plan 04: SUCCESS (144.8s)
  Plan 06: SUCCESS (409.1s)

Summary: 3/3 plans succeeded


In [9]:
# Verify BaldEagle results using HDF analysis
from ras_commander import HdfResultsPlan, HdfResultsMesh

print("=" * 70)
print("BALDEAGLE PLANS - RESULT VERIFICATION")
print("=" * 70)
print()

for plan_num in ["03", "04", "06"]:
    hdf_path = Path(baldeagle_path) / f"BaldEagleDamBrk.p{plan_num}.hdf"
    
    if hdf_path.exists():
        print(f"Plan {plan_num}:")
        size_mb = hdf_path.stat().st_size / (1024 * 1024)
        print(f"  HDF Size: {size_mb:.2f} MB")
        
        # Get compute messages (static method)
        msgs = HdfResultsPlan.get_compute_messages(hdf_path)
        if "completed successfully" in msgs.lower() or "complete process" in msgs.lower():
            print(f"  Status: ✅ Computation successful")
        else:
            print(f"  Status: ⚠️ Check compute messages")
        
        # Get unsteady summary
        try:
            summary = HdfResultsPlan.get_unsteady_summary(hdf_path)
            if summary is not None:
                print(f"  Unsteady Summary: Available")
        except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
            print(f"  Unsteady Summary: Not available")
        
        # Get volume accounting
        try:
            vol = HdfResultsPlan.get_volume_accounting(hdf_path)
            if vol is not None and len(vol) > 0:
                print(f"  Volume Accounting: {len(vol)} entries")
        except Exception:  # volume accounting is optional; skip it if unavailable
            pass
        
        # Get mesh timesteps for 2D
        try:
            mesh_times = HdfResultsMesh.get_output_times(hdf_path)
            if mesh_times is not None:
                print(f"  Output Timesteps: {len(mesh_times)}")
        except Exception:  # mesh output times are optional; skip them if unavailable
            pass
        
        print()
    else:
        print(f"Plan {plan_num}: ❌ HDF file not found")
        print()

print("✅ Remote execution verified - 2D model results successfully collected!")
print()

2025-12-04 11:03:52 - ras_commander.hdf.HdfResultsPlan - INFO - Using existing Path object HDF file: c:\GH\ras-commander\examples\example_projects\BaldEagleCrkMulti2D\BaldEagleDamBrk.p03.hdf
2025-12-04 11:03:52 - ras_commander.hdf.HdfResultsPlan - INFO - Final validated file path: c:\GH\ras-commander\examples\example_projects\BaldEagleCrkMulti2D\BaldEagleDamBrk.p03.hdf
2025-12-04 11:03:52 - ras_commander.hdf.HdfResultsPlan - INFO - Reading computation messages from HDF: BaldEagleDamBrk.p03.hdf
2025-12-04 11:03:52 - ras_commander.hdf.HdfResultsPlan - INFO - Successfully extracted 2014 characters from HDF
2025-12-04 11:03:52 - ras_commander.hdf.HdfResultsPlan - INFO - Using existing Path object HDF file: c:\GH\ras-commander\examples\example_projects\BaldEagleCrkMulti2D\BaldEagleDamBrk.p03.hdf
2025-12-04 11:03:52 - ras_commander.hdf.HdfResultsPlan - INFO - Final validated file path: c:\GH\ras-commander\examples\example_projects\BaldEagleCrkMulti2D\BaldEagleDamBrk.p03.hdf
2025-12-04 11:03:

BALDEAGLE PLANS - RESULT VERIFICATION

Plan 03:
  HDF Size: 59.07 MB
  Status: ✅ Computation successful
  Unsteady Summary: Available
  Volume Accounting: 1 entries

Plan 04:
  HDF Size: 81.72 MB
  Status: ⚠️ Check compute messages
  Unsteady Summary: Available
  Volume Accounting: 1 entries

Plan 06:
  HDF Size: 564.36 MB
  Status: ⚠️ Check compute messages
  Unsteady Summary: Available
  Volume Accounting: 1 entries

✅ Remote execution verified - 2D model results successfully collected!



## 5. Example 3: Multiple Remote Workers (Parallel)

Execute plans across multiple remote machines simultaneously.

**Note:** This example uses ALL enabled workers from `RemoteWorkers.json`.
To use multiple machines, add additional workers to the JSON file and set `enabled: true`.

In [10]:
# Execute multiple plans across all loaded workers
# Plans will be distributed based on queue_priority (0 first, then 1, etc.)

# Workers were already loaded in cell In [4] using load_workers_from_json()
if len(workers) > 1:
    print(f"Executing plans across {len(workers)} worker(s)...")
    for w in workers:
        print(f"  - {w.worker_id} ({w.hostname}) - Queue {getattr(w, 'queue_priority', 0)}")
    
    start_time = time.time()
    
    results = compute_parallel_remote(
        plan_numbers=["06", "19"],
        workers=workers,
        num_cores=4,
        clear_geompre=False,
        autoclean=True  # Default is True - deletes temp folders after execution
    )
    
    elapsed = time.time() - start_time
    
    print(f"\nTotal execution time: {elapsed:.1f} seconds ({elapsed/60:.1f} minutes)")
    print(f"\nResults:")
    for plan_num, result in results.items():
        status = "SUCCESS" if result.success else f"FAILED: {result.error_message}"
        print(f"  Plan {plan_num}: {status}")
    
    # Count successful plans
    successful = sum(1 for r in results.values() if r.success)
    print(f"\nSummary: {successful}/{len(results)} plans succeeded")
else:
    print(f"Only 1 worker loaded - skipping multi-worker example")
    print(f"To test parallel execution:")
    print(f"  1. Add more workers to RemoteWorkers.json")
    print(f"  2. Set enabled=true for each")
    print(f"  3. Re-run the notebook from the beginning")

Only 1 worker loaded - skipping multi-worker example
To test parallel execution:
  1. Add more workers to RemoteWorkers.json
  2. Set enabled=true for each
  3. Re-run the notebook from the beginning


In [11]:
# Alternative: Manually initialize a worker without JSON file
# This demonstrates the init_ras_worker() function directly
# Note: ras_exe_path is automatically obtained from the ras object

manual_worker = init_ras_worker(
    "psexec",
    hostname="192.168.3.8",  # Replace with your hostname
    share_path=r"\\192.168.3.8\RasRemote",  # Replace with your share path
    worker_folder=r"C:\RasRemote",  # Local path on remote machine corresponding to share_path
    credentials={
        "username": ".\\bill",  # Replace with your username
        "password": "YourPassword"  # Replace with your password
    },
    # ras_exe_path is NOT required - obtained from ras object automatically
    session_id=2,
    process_priority="low",
    queue_priority=0,
    cores_total=8,
    cores_per_plan=2
)

print(f"Manual worker initialized:")
print(f"  Worker ID: {manual_worker.worker_id}")
print(f"  Hostname: {manual_worker.hostname}")
print(f"  Worker Folder: {manual_worker.worker_folder}")
print(f"  RAS Exe: {manual_worker.ras_exe_path}")  # Automatically set from ras object
print(f"  Parallel Capacity: {manual_worker.max_parallel_plans} plans")

2025-12-04 11:03:52 - ras_commander.remote.RasWorker - INFO - Initializing psexec worker
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO - Initializing PsExec worker for 192.168.3.8
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO - PsExec worker configured:
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   Hostname: 192.168.3.8
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   Share path: \\192.168.3.8\RasRemote
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   Worker folder: C:\RasRemote
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   User: .\bill
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   System account: False
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   Session ID: 2
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   Process Priority: low
2025-12-04 11:03:52 - ras_commander.remote.PsexecWorker - INFO -   Queue Priority: 0


Manual worker initialized:
  Worker ID: psexec_62a78450
  Hostname: 192.168.3.8
  Worker Folder: C:\RasRemote
  RAS Exe: C:\Program Files (x86)\HEC\HEC-RAS\6.6\Ras.exe
  Parallel Capacity: 4 plans


## 6. Verify Results

Check that HDF files were created and results collected properly.

In [12]:
# List only .pXX.hdf files in results folder (plan result HDFs)
import re

results_path = Path(baldeagle_path).parent / "multi_worker_results" / "BaldEagleDamBrk"

pattern = re.compile(r"\.p\d{2}\.hdf$", re.IGNORECASE)

if results_path.exists():
    hdf_files = [hdf for hdf in results_path.glob("*.hdf") if pattern.search(hdf.name)]
    print(f"Plan HDF files (.pXX.hdf) in results folder: {len(hdf_files)}")
    for hdf in hdf_files:
        size_mb = hdf.stat().st_size / (1024 * 1024)
        print(f"  {hdf.name}: {size_mb:.2f} MB")
else:
    print(f"Results folder not found: {results_path}")

Results folder not found: c:\GH\ras-commander\examples\example_projects\multi_worker_results\BaldEagleDamBrk


## 7. Advanced Configuration

### Session ID Determination

Find the active session ID on a remote machine:

In [13]:
# Query active sessions on remote machine
# Uses the first loaded worker to get psexec_path and credentials
import subprocess

if workers:
    w = workers[0]
    psexec = getattr(w, 'psexec_path', None)
    
    if psexec and hasattr(w, 'credentials') and w.credentials:
        cmd = [
            psexec,
            f"\\\\{w.hostname}",
            "-u", w.credentials.get("username", ""),
            "-p", w.credentials.get("password", ""),
            "-accepteula",
            "cmd", "/c", "query", "user"
        ]

        try:
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
            print("Active sessions on remote machine:")
            print(result.stdout)
            print("\nLook for the ID column - typically 2 for workstations")
        except subprocess.TimeoutExpired:
            print("Timeout querying sessions")
        except Exception as e:
            print(f"Could not query sessions: {e}")
    else:
        print("Worker doesn't have psexec_path or credentials set")
        print("Try session_id=2 (most common for single-user workstations)")
else:
    print("No workers loaded - run previous cells first")

Worker doesn't have psexec_path or credentials set
Try session_id=2 (most common for single-user workstations)


### Process Priority Levels

Control OS process priority for remote HEC-RAS execution:

- `"low"` - Low priority (recommended for background work, minimal impact on remote user)
- `"below normal"` - Below normal priority
- `"normal"` - Normal priority (default Windows priority)

**Note:** Higher priorities (above normal, high, realtime) are NOT supported to avoid impacting remote user operations.
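
The supported levels can be validated before a worker is configured. The following is a minimal sketch, not ras-commander's implementation: `priority_to_psexec_flag` and `ALLOWED_PRIORITIES` are hypothetical names, and mapping the levels onto PsExec's `-low` and `-belownormal` switches is an assumption about how the setting might be passed through.

```python
# Hypothetical helper (not part of ras-commander): validate a process
# priority name and translate it into PsExec priority switches.
ALLOWED_PRIORITIES = {
    "low": ["-low"],
    "below normal": ["-belownormal"],
    "normal": [],  # assumed: normal priority needs no extra switch
}

def priority_to_psexec_flag(priority):
    """Return the PsExec switches for a supported priority name."""
    key = priority.strip().lower()
    if key not in ALLOWED_PRIORITIES:
        supported = ", ".join(sorted(ALLOWED_PRIORITIES))
        raise ValueError(
            f"Unsupported process priority {priority!r}; use one of: {supported}"
        )
    # Return a copy so callers cannot mutate the lookup table
    return list(ALLOWED_PRIORITIES[key])

print(priority_to_psexec_flag("low"))
```

Rejecting anything outside the three listed names keeps a typo or an unsupported level such as `"high"` from silently reaching the remote machine.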

### Queue Priority

Control execution order across workers:

- `queue_priority` is an integer from 0-9 (lower = higher priority)
- Workers at queue level 0 are filled before queue level 1, etc.
- Within each queue level, wave scheduling applies (one plan per machine first, then additional)
- Use for tiered bursting: local workers (queue 0) execute first, then remote (queue 1), then cloud (queue 2)
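
The ordering rules above can be sketched as follows. This illustrates the described behavior only and is not the library's scheduler: `build_slot_order` and the `(name, queue_priority, capacity)` tuple shape are hypothetical.

```python
from collections import defaultdict

def build_slot_order(workers):
    """Return worker names in the order their execution slots are filled.

    workers: list of (name, queue_priority, capacity) tuples (hypothetical shape).
    """
    levels = defaultdict(list)
    for name, queue_priority, capacity in workers:
        levels[queue_priority].append((name, capacity))

    order = []
    for level in sorted(levels):  # lower number = higher priority
        members = levels[level]
        max_capacity = max(cap for _, cap in members)
        # Wave 0 gives every machine one plan; later waves use spare capacity
        for wave in range(max_capacity):
            for name, cap in members:
                if cap > wave:
                    order.append(name)
    return order

# Illustrative tiers: local (queue 0), two remote machines (queue 1), cloud (queue 2)
workers_demo = [("local", 0, 2), ("remote_a", 1, 2), ("remote_b", 1, 1), ("cloud", 2, 1)]
plans = ["01", "02", "03", "04", "05"]
assignment = list(zip(plans, build_slot_order(workers_demo)))
print(assignment)
```

With five plans, both local slots fill first, then each remote machine receives one plan before `remote_a` takes a second; the cloud worker is never reached.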

In [14]:
# Example: Viewing worker configuration with low process priority
# Workers loaded from JSON already have these settings applied

if workers:
    w = workers[0]
    print(f"Worker: {w.worker_id}")
    print(f"  Process Priority: {getattr(w, 'process_priority', 'N/A')}")
    print(f"  Queue Priority: {getattr(w, 'queue_priority', 'N/A')}")
    print(f"  RAS Exe Path: {w.ras_exe_path}")
    print()
    print("To change settings, edit RemoteWorkers.json and reload workers:")
    print("  workers = load_workers_from_json('RemoteWorkers.json')")
else:
    print("No workers loaded - run previous cells first")

Worker: CLB-04 Docker 6.6
  Process Priority: low
  Queue Priority: 4
  RAS Exe Path: C:\Program Files (x86)\HEC\HEC-RAS\6.6\Ras.exe

To change settings, edit RemoteWorkers.json and reload workers:
  workers = load_workers_from_json('RemoteWorkers.json')


## 8. Troubleshooting (Optional)

### Test Remote Connections using psexec

Change the cell below to a code cell and enter your username and password for testing.

Do not leave passwords in this notebook, because they can get synced back to git. Store them in `RemoteWorkers.json` instead, which is already listed in this repo's `.gitignore`.
Use the code cells below for testing only, not as a design pattern for production use:

In [15]:
# Build REMOTE_CONFIG from the first psexec worker in workers list
# This uses the credentials already loaded from RemoteWorkers.json

REMOTE_CONFIG = None

if workers:
    # Find first psexec worker
    for w in workers:
        if w.worker_type == "psexec":
            # Credentials may be missing or None on some workers
            creds = getattr(w, 'credentials', None) or {}
            REMOTE_CONFIG = {
                "hostname": w.hostname,
                "share_path": w.share_path,
                "username": creds.get("username", ""),
                "password": creds.get("password", ""),
                "ras_exe_path": w.ras_exe_path,
                "session_id": getattr(w, 'session_id', 2)
            }
            print(f"REMOTE_CONFIG built from worker: {w.worker_id}")
            print(f"  Hostname: {REMOTE_CONFIG['hostname']}")
            print(f"  Share Path: {REMOTE_CONFIG['share_path']}")
            print(f"  Session ID: {REMOTE_CONFIG['session_id']}")
            break

if REMOTE_CONFIG is None:
    print("WARNING: No psexec workers found in workers list.")
    print("Define REMOTE_CONFIG manually or add psexec workers to RemoteWorkers.json")

Define REMOTE_CONFIG manually or add psexec workers to RemoteWorkers.json


In [16]:
# Test basic PsExec connectivity
import subprocess

if REMOTE_CONFIG is None:
    print("REMOTE_CONFIG not set - run the cell above first")
else:
    # Get psexec path from the initialized worker
    try:
        temp_worker = init_ras_worker(
            "psexec",
            hostname=REMOTE_CONFIG["hostname"],
            share_path=REMOTE_CONFIG["share_path"],
            credentials={
                "username": REMOTE_CONFIG["username"],
                "password": REMOTE_CONFIG["password"]
            },
            session_id=REMOTE_CONFIG["session_id"]
        )
        psexec_path = temp_worker.psexec_path

        test_cmd = [
            psexec_path,
            f"\\\\{REMOTE_CONFIG['hostname']}",
            "-u", REMOTE_CONFIG["username"],
            "-p", REMOTE_CONFIG["password"],
            "-i", str(REMOTE_CONFIG["session_id"]),
            "-accepteula",
            "cmd", "/c", "echo", "SUCCESS"
        ]

        result = subprocess.run(test_cmd, capture_output=True, text=True, timeout=30)
        if "SUCCESS" in result.stdout:
            print("[OK] PsExec connection successful!")
        else:
            print("[WARNING] Unexpected output:")
            print(result.stdout)
            print(result.stderr)
    except subprocess.TimeoutExpired:
        print("[FAIL] Connection timeout - check firewall and services")
    except Exception as e:
        print(f"[FAIL] Connection error: {e}")

REMOTE_CONFIG not set - run the cell above first


### Test Share Access

In [20]:
# Test if share is accessible
from pathlib import Path

if REMOTE_CONFIG is None:
    print("REMOTE_CONFIG not set - run the 'Build REMOTE_CONFIG' cell first")
else:
    share_path = Path(REMOTE_CONFIG["share_path"])

    try:
        # This may fail without authenticated session - that's OK
        if share_path.exists():
            print(f"[OK] Share accessible: {share_path}")
            # List the share once, then report the count and the first few entries
            items = list(share_path.iterdir())
            print(f"     Contents: {len(items)} items")
            for item in items[:5]:
                print(f"       {item.name}")
        else:
            print(f"[INFO] Share not accessible via Path.exists() (authentication may be required)")
            print(f"      This is normal - share will be accessed during execution with credentials")
    except Exception as e:
        print(f"[INFO] Cannot test share access: {e}")
        print(f"      This is normal - share will be accessed during execution with credentials")

REMOTE_CONFIG not set - run the 'Build REMOTE_CONFIG' cell first


## 9. Notes and Best Practices

### Remote Worker Configuration:
- Credentials stored in `RemoteWorkers.json` (not committed to git)
- See **REMOTE_WORKERS_README.md** for JSON format and setup
- Template provided: `RemoteWorkers.json.template`

### Remote Worker Requirements:
1. ✅ Network share created and accessible
2. ✅ User in local Administrators group
3. ✅ Group Policy: User added to network access, local logon, batch job policies
4. ✅ Registry: LocalAccountTokenFilterPolicy = 1
5. ✅ Remote Registry service running
6. ✅ Windows Firewall configured
7. ✅ Machine rebooted after changes

### Session ID:
- Session ID 2 is typical for single-user workstations
- Use `query user` on remote machine to verify
- User must be logged in for session to be active
- Session ID can change if user logs off/on
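
Because the session ID can change, it may help to read it from the `query user` output instead of hard-coding it. The following is a minimal sketch: `find_active_session_ids` is a hypothetical helper, the sample text is made up for illustration (not captured from a real machine), and matching on the English state name `Active` is an assumption that will not hold on localized Windows installs.

```python
def find_active_session_ids(query_user_output):
    """Return (username, session_id) pairs for sessions whose state is Active."""
    sessions = []
    for line in query_user_output.splitlines():
        # The current user's row is prefixed with '>'
        tokens = line.strip().lstrip(">").split()
        # SESSIONNAME may be blank, so search for "<integer> Active"
        # rather than relying on a fixed column position
        for i in range(1, len(tokens) - 1):
            if tokens[i].isdigit() and tokens[i + 1].lower() == "active":
                sessions.append((tokens[0], int(tokens[i])))
                break
    return sessions

# Made-up sample in the general layout of `query user` output
sample = """\
 USERNAME              SESSIONNAME        ID  STATE   IDLE TIME  LOGON TIME
>bill                  console             2  Active      none   12/4/2025 8:01 AM
 other                                     3  Disc        1:05   12/3/2025 4:12 PM
"""
print(find_active_session_ids(sample))
```

Disconnected sessions are skipped because the user must be logged in for the session to be usable.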

### HEC-RAS Considerations:
- HEC-RAS is a GUI application
- MUST use session-based execution (`system_account=False`)
- NEVER use SYSTEM account (`system_account=True`) for HEC-RAS
- The HEC-RAS window opens on the desktop of the remote user's session
- Ensure the HEC-RAS version matches on all workers and that the TOS has been accepted on each.

### Performance:
- Network share speed affects file transfer
- Use Gigabit Ethernet for best performance
- 2-4 workers per machine is typically optimal (depends on cores/RAM)
- Plans execute sequentially on each worker
- Multiple workers enable true parallel execution

### Security:
- Credentials in `RemoteWorkers.json` (in .gitignore)
- Never commit credentials to git
- See setup instructions for required group policy and registry changes

### Debugging:
- Check logs in ras_commander.log
- Inspect compute messages: `project.p##.computeMsgs.txt`
- Verify temp folders on remote share
- Test PsExec manually with provided batch files
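
Compute message files can be scanned in bulk when many plans were run. The following is a minimal sketch: `find_compute_errors` is a hypothetical helper, and searching for the keyword `"error"` is an assumption about what a failed run writes, so adjust the keyword to what your own compute messages contain.

```python
from pathlib import Path
import re

def find_compute_errors(results_dir, keyword="error"):
    """Map each *.pNN.computeMsgs.txt file name to its lines containing keyword."""
    pattern = re.compile(r"\.p\d{2}\.computeMsgs\.txt$", re.IGNORECASE)
    hits = {}
    for path in sorted(Path(results_dir).glob("*.txt")):
        if not pattern.search(path.name):
            continue  # not a plan compute-message file
        text = path.read_text(encoding="utf-8", errors="replace")
        matches = [
            line.strip()
            for line in text.splitlines()
            if keyword.lower() in line.lower()
        ]
        if matches:
            hits[path.name] = matches
    return hits

# Usage (the folder path is a placeholder):
# for name, lines in find_compute_errors("path/to/results").items():
#     print(name, lines)
```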

---

**For complete setup instructions, see:**
- `feature_dev_notes/RasRemote/REMOTE_WORKER_SETUP_GUIDE.md` - Remote machine setup
- `REMOTE_WORKERS_README.md` - JSON credential file format

## 10. Cleanup Remote Worker Folders

The `autoclean=True` parameter (default) automatically deletes worker folders after execution.
However, if you used `autoclean=False` for debugging or if executions were interrupted,
you may have leftover folders on the remote shares.

**All files in the RasRemote share are considered temporary** and can be safely deleted
to preserve disk space on the remote machines.

Run the cells below to manually clean up any remaining worker folders.

In [18]:
# List and optionally clean up worker folders on remote machines
# This cleans ALL files in the RasRemote share - all contents are temporary

def cleanup_remote_shares(workers, dry_run=True):
    """
    Clean up worker folders from remote shares.
    
    Args:
        workers: List of worker objects with share_path attribute
        dry_run: If True, only list folders without deleting (default True for safety)
    
    Returns:
        dict: {hostname: {"folders": count, "size_mb": total_size}}
    """
    import shutil
    
    results = {}
    seen_shares = set()
    
    for w in workers:
        if not hasattr(w, 'share_path') or not w.share_path:
            continue
            
        share_path = Path(w.share_path)
        share_key = str(share_path)
        
        # Skip if we've already processed this share
        if share_key in seen_shares:
            continue
        seen_shares.add(share_key)
        
        hostname = getattr(w, 'hostname', 'unknown')
        
        try:
            if not share_path.exists():
                print(f"Share not accessible: {share_path}")
                continue
                
            folders = [f for f in share_path.iterdir() if f.is_dir()]
            total_size = 0
            
            print(f"\n{'='*60}")
            print(f"Share: {share_path} ({hostname})")
            print(f"{'='*60}")
            
            if not folders:
                print("  No folders found - share is clean")
                results[hostname] = {"folders": 0, "size_mb": 0}
                continue
                
            for folder in folders:
                # Calculate folder size
                folder_size = sum(f.stat().st_size for f in folder.rglob('*') if f.is_file())
                folder_size_mb = folder_size / (1024 * 1024)
                total_size += folder_size_mb
                
                if dry_run:
                    print(f"  [WOULD DELETE] {folder.name} ({folder_size_mb:.1f} MB)")
                else:
                    print(f"  [DELETING] {folder.name} ({folder_size_mb:.1f} MB)")
                    shutil.rmtree(folder, ignore_errors=True)
            
            results[hostname] = {"folders": len(folders), "size_mb": total_size}
            
            if dry_run:
                print(f"\n  Summary: {len(folders)} folders, {total_size:.1f} MB total")
                print(f"  Set dry_run=False to delete these folders")
            else:
                print(f"\n  Deleted: {len(folders)} folders, {total_size:.1f} MB freed")
                
        except Exception as e:
            print(f"Error accessing {share_path}: {e}")
            
    return results

# DRY RUN - List folders without deleting
print("=" * 70)
print("CLEANUP PREVIEW (dry_run=True)")
print("=" * 70)
cleanup_results = cleanup_remote_shares(workers, dry_run=True)

CLEANUP PREVIEW (dry_run=True)

Share: \\192.168.3.8\RasRemote\ (None)
  [WOULD DELETE] BaldEagleDamBrk_04_SW2_1eb53001 (0.0 MB)
  [WOULD DELETE] docker (0.0 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p03_sw1_5a90bd88 (574.4 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p03_sw1_5b00f282 (574.4 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p03_sw1_87b65e71 (574.3 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p03_sw1_a81a8524 (574.4 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p03_sw1_b8ad058f (574.4 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p03_sw1_e9545117 (574.4 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p03_sw1_f6147736 (574.4 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p04_sw1_838f712b (791.0 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p04_sw1_96236c8b (794.3 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p04_sw1_aaf7134c (790.9 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p04_sw1_f9a505f9 (790.9 MB)
  [WOULD DELETE] ras_docker_BaldEagleDamBrk_p0

In [19]:
# ACTUALLY DELETE - Uncomment and run to delete all worker folders
# WARNING: This permanently deletes all folders in the RasRemote shares!

# print("=" * 70)
# print("CLEANUP EXECUTION (dry_run=False)")
# print("=" * 70)
# cleanup_results = cleanup_remote_shares(workers, dry_run=False)
# print("\nCleanup complete!")