Feature: Interacting with a running simulation #4

@jaywonchung

Description

Depends on #2 and #3. #2 provides the WebSocket channel user actions ride back on; without it every click is a POST round-trip, which doesn't compose. #3 provides the things to interact with: PV systems, time-varying loads, training overlays, multi-DC topologies. Without #3 the only meaningful actions are "bump a load somewhere" and "nudge a tap", which doesn't justify the UI work.

Once #2 is in, users see the simulation playing out tick by tick. This issue lets them act on it mid-run: click a bus and add load, scroll a regulator to nudge a tap, drag a slider to override a batch size, click a button to open a line. Each action becomes an openg2g DatacenterCommand or GridCommand dispatched at the next simulated tick. The controller in the simulation reacts to the user-injected disturbance the same way it reacts to PV/load shifts.

Interactive sessions need wall-clock pacing, so pass live=True to the Coordinator constructor. The producer then sleeps between ticks so one simulated second matches one real second, which gives the user enough time to see state change, decide, and click before the next tick. (#2 keeps live=False for the non-interactive playback case where you want the whole scenario to stream as fast as possible.)
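The pacing itself is simple. A minimal, library-independent sketch of what live pacing amounts to (`step_fn` and `tick_s` are placeholders here, not openg2g API):

```python
import time

def run_live(step_fn, total_ticks, tick_s=1.0):
    """Run step_fn once per tick, sleeping so that one simulated
    tick takes roughly tick_s of wall-clock time."""
    start = time.monotonic()
    for i in range(total_ticks):
        step_fn(i)
        # Sleep until the next tick boundary; if the step overran,
        # the deadline is in the past and we don't sleep at all.
        next_deadline = start + (i + 1) * tick_s
        time.sleep(max(0.0, next_deadline - time.monotonic()))
```

Sleeping to a deadline rather than a fixed interval keeps ticks from drifting when individual steps take variable time.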

Backend is the simple part; openg2g.Coordinator.dispatch_commands() takes whatever the user sends. The hard part is frontend UI: making the network graph discoverably interactive, choosing the right input affordances per action type, and giving useful feedback on what landed.

Backend

Extend the WebSocket handler from #2 to receive command messages from the frontend in addition to streaming ticks to it. The frontend sends:

{ "kind": "load_bump", "bus": "671", "kw": 500, "duration_s": 60 }

Backend decodes JSON into an openg2g Command object, puts it on an in-process queue. Build the coordinator with live=True and replace #2's run_iter() loop with a manual coord.step() loop that drains the queue before each step:

coord = build_coordinator(params, live=True)
coord.reset(); coord.start()
try:
    while coord.clock.now_s < coord.total_duration_s:
        # Drain every user command queued since the last tick,
        # then advance the simulation by one step.
        while not cmd_queue.empty():
            coord.dispatch_commands([cmd_queue.get_nowait()])
        tick = coord.step()
        await ws.send_json(serialize_tick(tick))
finally:
    coord.stop()

run_iter() doesn't fit here because it owns the inner loop; for an interactive session we need to interleave queue drain and step ourselves.
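The receive side that feeds `cmd_queue` can run as a separate task alongside the step loop. A hedged sketch, where `recv_json` stands in for the real WebSocket receive call from #2 and `decode_command` is the JSON-to-command translation (both names are placeholders):

```python
import queue

# Shared between the receive task and the step loop.
cmd_queue: "queue.Queue[object]" = queue.Queue()

async def receive_commands(recv_json, decode_command):
    """Read command messages off the socket and enqueue decoded
    commands for the step loop to drain before its next tick."""
    while True:
        msg = await recv_json()
        if msg is None:  # socket closed
            break
        cmd_queue.put(decode_command(msg))
```

A thread-safe `queue.Queue` keeps this agnostic to whether the step loop runs on the event loop or in a worker thread; if everything stays on one event loop, `asyncio.Queue` works equally well.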

Command types live in openg2g.datacenter.command and openg2g.grid.command. Start with a small vocabulary and grow.

Action vocabulary

A reasonable spread of starter actions, mixing grid-side and datacenter-side and ranging from "small nudge" to "drastic disruption":

  • Bump load. Click a bus, enter kW and duration. An external load appears at that bus for N seconds. The simplest "disturb the system" action.
  • Override batch size. Drag a slider on a model card. Issues a SetBatchSize for that model.
  • Nudge tap. Click +/- on a regulator. Issues a SetTapPosition.
  • Trigger a PV cloud. Click a PV system, enter a duration and depth (e.g. drop to 30% for 90 seconds). Forces solar generation down sharply; tests how the controller handles sudden generation loss.
  • Open a line (fault). Click a line, confirm. The line goes open-circuit for the rest of the simulation (or for a user-set window). Voltage profile shifts dramatically downstream.
  • Bump replica count. Drag a slider on a model. Simulates a model becoming popular (or unpopular) mid-run, so power draw shifts.
  • Start a training overlay mid-sim. Click "Add training run", enter n_gpus and end time. A multi-thousand-GPU training spike appears at the configured datacenters, testing big sudden power swings.
  • Datacenter outage. Click a datacenter, "Take offline". Sets all batch sizes to zero for that DC for a window. Demonstrates load loss and the controllers redistributing.

Each one is a thin translation layer from a JSON message to a single openg2g command or a small composite. Most are one or two lines on the backend; the work is the UI affordances.
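The translation layer can be a flat table from `kind` to a builder. The command classes below are local stand-ins for illustration only; the real ones live in openg2g.datacenter.command and openg2g.grid.command and their signatures may differ:

```python
from dataclasses import dataclass

# Stand-ins for the real openg2g command classes (illustrative).
@dataclass
class ExternalLoad:
    bus: str
    kw: float
    duration_s: float

@dataclass
class SetBatchSize:
    model: str
    batch_size: int

@dataclass
class SetTapPosition:
    regulator: str
    delta: int

# One entry per message kind; growing the vocabulary means adding a row.
BUILDERS = {
    "load_bump": lambda m: ExternalLoad(m["bus"], m["kw"], m["duration_s"]),
    "set_batch_size": lambda m: SetBatchSize(m["model"], m["batch_size"]),
    "nudge_tap": lambda m: SetTapPosition(m["regulator"], m["delta"]),
}

def decode_command(msg: dict):
    """Translate one frontend JSON message into a command object."""
    try:
        return BUILDERS[msg["kind"]](msg)
    except KeyError as e:
        raise ValueError(f"bad command message: missing {e}") from None
```

The table form keeps each action at the promised one or two backend lines, and rejects unknown kinds or missing fields in one place.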

Grow further later: GPU thermal throttling (cap a DC's max replicas), inverter trip on a PV system, swap controller mid-run (disable OFO, enable rule-based), pause/resume the simulation.

Frontend

Three decisions to make:

  • Where users click. Network graph (buses, lines, regulators, PV systems) and datacenter panels (models, replica counts, training overlay). Targets need a visible hover state so users see at a glance that things are interactive.
  • Modal vs direct manipulation. Modal (click, form, confirm) is safer for irreversible or parameterized actions: opening a line, starting a training overlay, triggering a PV cloud. Direct manipulation (drag, scroll) is faster for nudges: batch size, tap position, replica count. Mix by action type.
  • Bounds checking and feedback. Validate before send (no negative replica count, no batch size outside the feasible set, no tap position outside +/-16 steps). On every dispatch, append a line to the event log so the user sees what landed. Color-code user actions distinct from controller actions.

Order of operations

  1. Backend: accept one command type (load_bump), wire to dispatch_commands. Test with a hardcoded JSON message from the frontend.
  2. Frontend: make one bus clickable, open a modal, send the message. Confirm the voltage chart reacts.
  3. Add the rest of the action vocabulary one at a time. Modal-driven first, drag-and-scroll later.
  4. Hover affordances, tooltips, event log coloring.
  5. A scripted demo scenario (preset config plus scripted action list) that exercises the whole loop for showcases.
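Step 5 can reuse the command path user clicks take: a scripted scenario is just a timestamped action list replayed into the same dispatch loop. A minimal sketch (the schedule format and message shapes are assumptions, modeled on the vocabulary above):

```python
# Illustrative demo script: (simulated time in seconds, message).
SCRIPT = [
    (30,  {"kind": "load_bump", "bus": "671", "kw": 500, "duration_s": 60}),
    (120, {"kind": "trigger_pv_cloud", "pv": "pv1", "depth": 0.3, "duration_s": 90}),
    (300, {"kind": "open_line", "line": "650-632"}),
]

def due_commands(script, prev_t, now_t):
    """Messages scheduled in the half-open window (prev_t, now_t]."""
    return [msg for t, msg in script if prev_t < t <= now_t]
```

In the step loop, calling `due_commands(SCRIPT, last_t, coord.clock.now_s)` each tick and dispatching the results alongside anything on the command queue exercises the whole loop without a human clicking.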
