An MCP server that lets AI agents safely manage Node-RED flows.
Node-RED stores its entire flow graph in a single flows.json document. Letting an agent edit that file directly is fragile in practice: node IDs drift across runs, wires end up malformed, dependent config nodes go missing, and a "deploy" can land on top of changes someone else made minutes earlier. FlowOtter inserts a typed authoring layer between the agent and flows.json — it compiles, validates, diffs, and (when explicitly enabled) deploys with a snapshot the agent can roll back to. The safety surface — byte-identical idempotent compilation, snapshot-before-deploy with hash-drift refusal, tiered env-gates with read-only as the default, substring-level secret redaction, and ID preservation across runs — is what separates this from a thin Admin-API wrapper.
If you cloned this repo via an agent (Claude Code, Cursor, etc.), the project-scoped .mcp.json registers FlowOtter automatically — npm install && npm run build, restart the client, and the agent has the read tier available. Write/deploy tiers stay gated behind env flags documented in Safety Model.
v1.3.0 — architectural redesign: methodology, catalog, layout, dashboards, ISA-101. Introduces a methodology playbook surfaced in the MCP instructions field, a structured capability catalog (get_authoring_guide), Node-RED version detection with feature gating, a plan_flow methodology spine, a response-side soft-nudge guidance system, named toolsets for progressive disclosure, MCP elicitation gating destructive operations, a dual dagre/elkjs layout engine with auto-selection, authoring schemas for the full Dashboard 2.0 widget catalog (24 widgets), 4 ISA-101 operator-screen validators, and 5 user-facing slash-command MCP prompts. See CHANGELOG.md for the per-item summary and docs/REDESIGN_PLAN.md for the design rationale. Prior lines: v1.2 closed path-traversal exposure on set_target, fixed audit/subflow correctness bugs, hardened HTTP retry semantics. v1.1 fixed lossy roundtrip and wired the previously dead config knobs.
Default visible tools are filtered by toolsets: named groups that progressively disclose tools by intent. The specialist add_<type>_node tools are hidden by default behind the author_specialists toolset; the agent calls enable_toolset('author_specialists') to load them. Dangerous tools require both an env flag and the dangerous toolset (auto-enabled when the env flag is set).
- Discovery + analysis: health_check, get_server_config_summary, set_target, clear_target, list_flows, get_flows_summary, get_flow, get_node, search_nodes, get_subflow, list_installed_node_types (annotates each type with is_core: bool to surface contrib packages), get_runtime_state, list_templates, get_authoring_guide (capability catalog), validate_flow, validate_all_flows, analyze_flow, analyze_all_flows, explain_flow, render_flow_svg, preview_flow_diff, export_snapshot, list_snapshots, get_snapshot, get_staged_change, get_audit_log_recent, get_recent_debug_messages (/comms debug observer), list_available_toolsets / enable_toolset.
- Author: plan_flow (methodology spine), add_node (generic; handles core types AND node-red-contrib-* long tail), add_dashboard_widget (24 Dashboard 2.0 widgets including ui-chart, ui-gauge, ui-table, ui-button, ui-button-group, ui-form, ui-template, ui-notification, ui-control, ui-text), add_subflow_instance, add_group, add_comment, wire_nodes, set_wires, set_links, remove_node, update_node, move_node, create_subflow_definition, instantiate_template.
- Author specialists (opt-in via enable_toolset('author_specialists')): typed conveniences for high-value patterns: add_inject_node, add_debug_node, add_function_node, add_catch_node, add_status_node, add_complete_node, add_mqtt_in_node, add_mqtt_out_node, add_link_in_node, add_link_out_node, add_link_call_node. The generic add_node handles every case these do; specialists exist for per-node-type schema validation when that matters.
- Deploy: deploy_staged_change (with MCP elicitation confirming deploy before push to live runtime), rollback_last_change, set_flows_state.
- Dangerous (env-gated, ENABLE_DANGEROUS_TOOLS=true): prepare_dangerous_operation, replace_flows, delete_tab, reset_runtime, create_flow, update_flow, delete_flow.
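The visibility rule for toolsets can be pictured roughly as follows. This is a minimal sketch, not FlowOtter's implementation: the ToolDef shape, the visibleTools function, and the toolset names "discovery" and "author" are hypothetical; only author_specialists and dangerous are names taken from the text above.

```typescript
// Hypothetical model of toolset-based tool visibility; not FlowOtter's real code.
interface ToolDef {
  name: string;
  toolset: string; // e.g. "discovery", "author", "author_specialists", "dangerous"
}

interface VisibilityState {
  enabledToolsets: string[]; // toolsets currently loaded for the session
  dangerousEnvFlag: boolean; // mirrors ENABLE_DANGEROUS_TOOLS=true
}

function visibleTools(all: ToolDef[], state: VisibilityState): string[] {
  // The env flag auto-enables the "dangerous" toolset.
  const enabled = state.enabledToolsets.slice();
  if (state.dangerousEnvFlag && enabled.indexOf("dangerous") === -1) {
    enabled.push("dangerous");
  }
  return all
    .filter((tool) => {
      if (enabled.indexOf(tool.toolset) === -1) return false;
      // Dangerous tools need the env flag as well as the toolset.
      if (tool.toolset === "dangerous" && !state.dangerousEnvFlag) return false;
      return true;
    })
    .map((tool) => tool.name);
}

const catalog: ToolDef[] = [
  { name: "list_flows", toolset: "discovery" },
  { name: "add_node", toolset: "author" },
  { name: "add_inject_node", toolset: "author_specialists" },
  { name: "replace_flows", toolset: "dangerous" },
];

// Specialists stay hidden until their toolset is enabled.
const defaults = visibleTools(catalog, {
  enabledToolsets: ["discovery", "author"],
  dangerousEnvFlag: false,
});

const withSpecialists = visibleTools(catalog, {
  enabledToolsets: ["discovery", "author", "author_specialists"],
  dangerousEnvFlag: false,
});
```

In this model, enabling the dangerous toolset without the env flag exposes nothing, which matches the "both an env flag and the toolset" rule.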
FlowOtter ships 5 MCP prompts that surface as /mcp__flow-otter__<name> in Claude Code (and equivalent menus in other MCP clients) — they're how the user discovers FlowOtter workflows:
- /mcp__flow-otter__new_flow: full plan → wire → deploy walkthrough.
- /mcp__flow-otter__build_operator_dashboard: composes an ISA-101 operator dashboard from the built-in operator-grade templates.
- /mcp__flow-otter__refactor_to_subflow: folds selected nodes into a reusable subflow.
- /mcp__flow-otter__explain_my_flow: structured walkthrough.
- /mcp__flow-otter__review_my_flow: full review with ISA-101 explanations.
738 unit + 17 property + integration tests. Property tests run fast-check at numRuns:1000 and exercise junctions, tab locked/env, group geometry, comment size, and layout determinism in the round-trip arbitraries.
The three tabs below were authored end-to-end through MCP calls in a single agent session against a sterile Node-RED 3.1 stack. Each section gives the kind of prompt a user might write, the resulting flow, and an honest read on what FlowOtter handled versus what the agent still had to know.
"Stand up a tab that exercises every common author tool. Add inject, function, debug, plus catch, status, complete, mqtt in, mqtt out, link in, link out, link call. Label each node with the MCP tool that produced it. Wire inject → function → debug, and fan function's output to mqtt out via
set_wires. Pair link_call to link_in with set_links. Group the canonical author loop as 'Idempotent compile' and drop comments explaining the staging contract."
What FlowOtter took care of. Each add_*_node call returned a staged_hash and a diff_summary so the agent could reason about the staged change before deploying. Authoring keys (_authoringKey) are preserved across re-runs — running the same prompt twice produces byte-identical flows.json with stable IDs. set_wires rejects cross-tab wiring (point at link nodes instead), and set_links rejects pairing anything that isn't a link in. The validator caught off-grid positions and would have caught duplicate link in names within the tab.
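One way to get stable IDs and byte-identical output is to derive each node ID deterministically from its authoring key and to serialize with a fixed key order. The sketch below illustrates that idea only; how FlowOtter actually derives IDs is not described here, and the FNV-1a-style hash, the stableId scheme, and the compile function are all assumptions.

```typescript
// Illustrative only: derive a node ID from its authoring key so that
// re-running the same authoring steps yields the same serialized bytes.
function fnv1a32(input: string): string {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i); // hashes UTF-16 code units, not bytes
    // Multiply by the FNV prime 16777619 using shifts, kept to 32 bits.
    hash = (hash + ((hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24))) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

interface AuthoredNode {
  authoringKey: string;
  type: string;
  name: string;
}

interface CompiledNode {
  id: string;
  type: string;
  name: string;
  _authoringKey: string;
}

function compile(tabKey: string, nodes: AuthoredNode[]): string {
  const compiled: CompiledNode[] = nodes.map((n) => ({
    id: fnv1a32(tabKey + "/" + n.authoringKey), // same key in, same id out
    type: n.type,
    name: n.name,
    _authoringKey: n.authoringKey,
  }));
  // Fixed key order plus fixed indentation keeps the output byte-stable.
  return JSON.stringify(compiled, null, 2);
}

const authored: AuthoredNode[] = [
  { authoringKey: "inject-1", type: "inject", name: "add_inject_node" },
  { authoringKey: "debug-1", type: "debug", name: "add_debug_node" },
];

const firstRun = compile("tab-1", authored);
const secondRun = compile("tab-1", authored);
```

Because nothing in compile depends on time, randomness, or iteration order of an unordered structure, firstRun and secondRun are the same string.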
What the agent still had to bring. Node-type-specific config the toolkit doesn't model: link call rejected links: [] until linkType: "dynamic" was set; complete failed validation without a scope array pointing at a real node id; inject required repeat (even an empty string). Layout coordinates are agent-supplied — auto-placement stacks new nodes adjacent and the bbox-overlap lint will warn, but the agent has to actually move them.
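The three fixes from this session look roughly like the fragments below. The field names linkType, links, scope, and repeat come from the errors described above; the node id is a placeholder, and the exact object shape each FlowOtter tool accepts is an assumption.

```typescript
// Type-specific fields the agent had to supply in this session.
// "fn-node-id" is a placeholder id, not one from the showcase flows.
const linkCallConfig = {
  type: "link call",
  linkType: "dynamic", // had to be set before links: [] was accepted
  links: [] as string[],
};

const completeConfig = {
  type: "complete",
  scope: ["fn-node-id"], // must point at a real node id in the flow
};

const injectConfig = {
  type: "inject",
  repeat: "", // required even when the inject does not repeat
};

// Hypothetical pre-flight check an agent could run before staging a
// complete node: every scoped id must exist among the known node ids.
function scopeResolves(scope: string[], knownIds: string[]): boolean {
  return scope.length > 0 && scope.every((id) => knownIds.indexOf(id) !== -1);
}
```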
"Lay down an operator console using the six bundled dashboard-2 templates:
dashboard_2_alarm_panel, dashboard_2_mode_banner, dashboard_2_confirmed_button (hold-to-confirm e-stop), dashboard_2_live_value (with stale-data badge), dashboard_2_gauge_grid (four process metrics), dashboard_2_audit_log_tail. All widgets should share one ui-base / ui-page / ui-group skeleton; no duplicate scaffolding."
A handful of the bundled templates use opinionated defaults loosely inspired by industrial-HMI standards — dashboard_2_alarm_panel borrows the ISA-18.2 state-machine vocabulary (UNACK/ACK/RTN/SHELVED), dashboard_2_mode_banner and dashboard_2_confirmed_button lean on ISA-101 ideas (grayscale base, hold-to-confirm destructive actions). v1.3.0 adds four ISA-101 enforcement validators (unbounded-chart-append, screen-clutter, saturated-color-outside-alarm, button-group-color-decoration) that flag the most common operator-screen anti-patterns. Standards conformance was not a goal of this project; these were one author's starting points and would need to be developed considerably further to be appropriate for actual industrial deployment.
What FlowOtter took care of. Six successive instantiate_template calls reused one ui-base + ui-page + ui-group via the ensureSkeleton helper — instead of stamping six separate dashboards, every widget points at the same Operator Console group. The dashboard-2-hierarchy validator confirms every widget reaches an existing ui-group, dashboard-2-required-fields flags widgets missing per-type required fields, and dashboard-2-destructive-needs-confirm would catch a ui-button with destructive labels not paired with a confirmation widget.
What the agent still had to bring. Node-RED runtime semantics that FlowOtter's validators don't replicate: ui-gauge segments must fall within the gauge's own [min, max] (a Pressure gauge with max=10 threw at runtime because the default from: 70 segment was out of range); ui-table v2 requires a numeric maxrows field (FlowOtter let null through, Node-RED rejected it); the mqtt in/mqtt out typed author tools don't auto-create an mqtt-broker config node, so the agent has to add one and wire it via the broker field. These are real-runtime constraints owned by each plugin, not flow-topology rules the toolkit can sense.
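Since these checks live in the plugins, an agent can run its own pre-flight checks before staging a widget. The sketch below is one hedged way to do that; the GaugeConfig and GaugeSegment shapes are simplified assumptions modeled on the from: 70 default mentioned above, and neither helper is part of FlowOtter.

```typescript
// Simplified, assumed shapes; real ui-gauge config carries more fields.
interface GaugeSegment {
  from: number;
  color: string;
}

interface GaugeConfig {
  min: number;
  max: number;
  segments: GaugeSegment[];
}

// Returns the segments whose start falls outside the gauge's own range.
function outOfRangeSegments(gauge: GaugeConfig): GaugeSegment[] {
  return gauge.segments.filter((s) => s.from < gauge.min || s.from > gauge.max);
}

// ui-table v2 wants a numeric maxrows; null or a string would be rejected.
function isValidMaxRows(value: unknown): boolean {
  return typeof value === "number" && isFinite(value);
}

// The failing case from the showcase: max=10 with a segment starting at 70.
const pressureGauge: GaugeConfig = {
  min: 0,
  max: 10,
  segments: [
    { from: 0, color: "green" },
    { from: 70, color: "red" },
  ],
};

const badSegments = outOfRangeSegments(pressureGauge);
```

Here badSegments contains the single from: 70 segment, which is the one that needs rescaling into the 0 to 10 range before deploy.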
"Create a reusable subflow with its instance on a dedicated tab. Add a
link in here that's the cross-tab counterpart of Tab 1's link out; pair them via set_links so the Tab 1 outbound resolves to this tab's entry. Wire the link in to the subflow's debug so we can see what arrives."
What FlowOtter took care of. set_links is the only tool here that crosses tab boundaries — it validates that the source is link out or link call, every target is link in, and every target exists. The reusable_subflow template emits the subflow definition + a workspace instance in one call; the compiler sizes the instance's wire array from the definition's output count, so re-deploying a subflow change keeps the instances consistent.
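The three set_links rules stated above (source is link out or link call, every target exists, every target is link in) can be sketched as a small checker. This is an illustration under assumed types; validateLinks and FlowNode are hypothetical names, not FlowOtter's API.

```typescript
// Minimal node shape for the check; real nodes carry many more fields.
interface FlowNode {
  id: string;
  type: string;
}

// Returns a list of human-readable problems; empty means the pairing is valid.
function validateLinks(nodes: FlowNode[], sourceId: string, targetIds: string[]): string[] {
  const errors: string[] = [];
  const find = (id: string): FlowNode | undefined => nodes.filter((n) => n.id === id)[0];

  const source = find(sourceId);
  if (!source) {
    errors.push("source " + sourceId + " does not exist");
  } else if (source.type !== "link out" && source.type !== "link call") {
    errors.push("source " + sourceId + " must be link out or link call");
  }

  targetIds.forEach((targetId) => {
    const target = find(targetId);
    if (!target) {
      errors.push("target " + targetId + " does not exist");
    } else if (target.type !== "link in") {
      errors.push("target " + targetId + " must be link in");
    }
  });

  return errors;
}

const showcaseNodes: FlowNode[] = [
  { id: "out-1", type: "link out" },
  { id: "in-3", type: "link in" },
  { id: "dbg-3", type: "debug" },
];

const okPairing = validateLinks(showcaseNodes, "out-1", ["in-3"]);
const badPairing = validateLinks(showcaseNodes, "out-1", ["dbg-3", "missing"]);
```

Note that the check never looks at which tab a node is on, which is why a link pairing can cross tab boundaries while ordinary wires cannot.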
What the agent still had to bring. Subflow definitions need optional fields the editor iterates over — missing in: [], env: [], or meta: {} produced a generic Cannot read properties of undefined (reading 'forEach') in the editor (not in the runtime). The reusable_subflow template covers this for the common case, but bespoke subflow definitions authored via create_subflow_definition need these fields supplied.
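For bespoke definitions, a small normalizer that fills in the three fields before calling create_subflow_definition avoids the editor error. The sketch below assumes a simplified SubflowDefinition shape; withEditorDefaults is a hypothetical helper on the agent side, not a FlowOtter function.

```typescript
// Simplified, assumed shape; only the fields discussed above are modeled.
interface SubflowDefinition {
  id: string;
  type: "subflow";
  name: string;
  in?: unknown[];
  env?: unknown[];
  meta?: { [key: string]: unknown };
}

// Fill the optional fields the editor iterates over, keeping any
// values the caller already supplied.
function withEditorDefaults(def: SubflowDefinition): SubflowDefinition {
  return {
    ...def,
    in: def.in !== undefined ? def.in : [],
    env: def.env !== undefined ? def.env : [],
    meta: def.meta !== undefined ? def.meta : {},
  };
}

const bare: SubflowDefinition = { id: "sf-1", type: "subflow", name: "Reusable" };
const normalized = withEditorDefaults(bare);
```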
Everything visible here — four gauges with proportional color bands, the alarm table, mode banner, hold-to-confirm Emergency Stop button, stale-data live value, audit-log tail — was produced by the six template calls in Tab 2. The MQTT-driven widgets show their idle states because no producers are publishing in the sterile stack.
The showcase deliberately ran into FlowOtter's edges so they could be named clearly:
- The toolkit-level validators model flow topology (link resolution, dashboard hierarchy, group consistency, on-grid, label-cap, off-canvas, naming contract). They do not replicate per-node validate() functions owned by individual Node-RED node modules; those constraints surface only in the editor or at flow start.
- The author tools accept a passthrough field for type-specific config. FlowOtter validates passthrough against per-type Zod schemas for ~30 common node types (add_node registry + every typed add_*_node). Outside that list, add_node accepts arbitrary passthrough and returns type_had_schema: false.
- Templates auto-create dependent config nodes (ui-base, ui-page, ui-group, mqtt-broker); bare author tools don't. If the agent calls add_mqtt_in_node directly, the broker config is the agent's responsibility.
- render_flow_svg output is checked in alongside the editor screenshots; it is vector-deterministic for the same input and useful for diff-based regression checks: tab-1, tab-2, tab-3.
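The passthrough behavior in the list above can be modeled as a registry lookup with a fallback. In this sketch, hand-written checks stand in for the Zod schemas, and schemaRegistry, checkPassthrough, and the contrib type name are hypothetical; only the type_had_schema field name comes from the text.

```typescript
type Passthrough = { [key: string]: unknown };
type Validator = (passthrough: Passthrough) => string[];

// Stand-in registry: one hand-written check instead of a real Zod schema.
const schemaRegistry: { [nodeType: string]: Validator | undefined } = {
  inject: (p) => (typeof p["repeat"] === "string" ? [] : ["inject.repeat must be a string"]),
};

interface PassthroughResult {
  type_had_schema: boolean;
  errors: string[];
}

function checkPassthrough(nodeType: string, passthrough: Passthrough): PassthroughResult {
  const known = Object.prototype.hasOwnProperty.call(schemaRegistry, nodeType);
  const validator = known ? schemaRegistry[nodeType] : undefined;
  if (!validator) {
    // Unknown type: accept as-is, but tell the caller nothing was checked.
    return { type_had_schema: false, errors: [] };
  }
  return { type_had_schema: true, errors: validator(passthrough) };
}

const checkedInject = checkPassthrough("inject", { repeat: "" });
const uncheckedContrib = checkPassthrough("some-contrib-node", { anything: 1 });
```

An agent can treat type_had_schema: false as a cue to double-check that node's fields against the plugin's own documentation.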
The division of labor: FlowOtter guarantees the shape of the flow (idempotent compile, stable IDs, topologically valid wiring, snapshot-backed deploys). The agent owns the content (per-plugin field values, runtime config like broker hostnames, layout coordinates).
npm install
npm run build
node dist/bin/flow-otter.js --version

Start the local Node-RED test stack:

docker compose -f deploy/docker-compose.yml up -d

Run the MCP server against that stack:

NODE_RED_BASE_URL=http://localhost:1880 \
ENABLE_WRITE_TOOLS=true \
ENABLE_DEPLOY_TOOLS=true \
READ_ONLY_MODE=false \
npm run dev

Dangerous tools require one more flag:

ENABLE_DANGEROUS_TOOLS=true

Use the built stdio binary from any MCP client:

{
  "mcpServers": {
    "FlowOtter": {
      "command": "node",
      "args": ["/absolute/path/to/FlowOtter/dist/bin/flow-otter.js"],
      "env": {
        "FLOW_SOURCE": "file",
        "READ_ONLY_MODE": "true",
        "ENVIRONMENT_NAME": "default"
      }
    }
  }
}

The server boots without a Node-RED target. To point it at a runtime, the agent calls the set_target tool.

set_target also re-scopes snapshot/staging/audit storage under ~/.flow-otter/<env_name>/ so state from different targets doesn't cross-contaminate. Pass snapshot_dir, staging_dir, or audit_log_path to override.
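The per-target scoping can be sketched as a path resolver. The ~/.flow-otter/<env_name>/ root and the three override names come from the paragraph above; the subpaths "snapshots", "staging", and "audit.log" and the resolveStorage function are assumptions made for illustration.

```typescript
interface StorageOverrides {
  snapshot_dir?: string;
  staging_dir?: string;
  audit_log_path?: string;
}

interface StoragePaths {
  snapshotDir: string;
  stagingDir: string;
  auditLogPath: string;
}

// homeDir is passed in explicitly so the sketch has no platform dependency.
function resolveStorage(homeDir: string, envName: string, overrides: StorageOverrides = {}): StoragePaths {
  const root = homeDir + "/.flow-otter/" + envName;
  return {
    // Subpath names below are assumed, not taken from FlowOtter.
    snapshotDir: overrides.snapshot_dir !== undefined ? overrides.snapshot_dir : root + "/snapshots",
    stagingDir: overrides.staging_dir !== undefined ? overrides.staging_dir : root + "/staging",
    auditLogPath: overrides.audit_log_path !== undefined ? overrides.audit_log_path : root + "/audit.log",
  };
}

const scoped = resolveStorage("/home/me", "plant-a");
const overridden = resolveStorage("/home/me", "plant-a", { snapshot_dir: "/mnt/backups/plant-a" });
```

Two targets with different env names resolve to disjoint directories, so snapshots taken against one runtime are never offered as rollback points for another.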
You can still pre-bind a target at startup with NODE_RED_BASE_URL + FLOW_SOURCE=admin-api if you don't want the agent picking the URL.
npm run typecheck
npm run lint
npm run test:unit
npm run test:property
npm run test:integration
npm run build
npm pack --dry-run

The property suite runs fast-check at numRuns:1000. The integration suite starts Node-RED plus Mosquitto with deploy/docker-compose.yml.
The default config is read-only. Author/deploy/dangerous tools appear in tools/list only when their env gates are enabled:
| Env var | Default | Effect |
|---|---|---|
| READ_ONLY_MODE | true | Blocks all write/deploy/dangerous tiers. |
| ENABLE_WRITE_TOOLS | false | Enables author/stage tools when read-only is off. |
| ENABLE_DEPLOY_TOOLS | false | Enables deploy/rollback when write tools are enabled. |
| ENABLE_DANGEROUS_TOOLS | false | Enables destructive full replace/delete/reset tools. |
| REQUIRE_DRIFT_CHECK_BEFORE_DEPLOY | true | Refuses staged deploy if runtime hash changed. |

set_target is read-tier (always available); it changes which Node-RED the read/write/deploy tools talk to but cannot itself bypass tier gates. Every deploy and dangerous operation snapshots the prior runtime first. rollback_last_change restores the latest pre-deploy/pre-dangerous snapshot.
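The gate table can be read as a small decision function. The sketch below is one interpretation, not FlowOtter's code: resolveTiers and mayDeploy are hypothetical names, and treating the dangerous tier as needing only read-only off plus its own flag is an assumption the table does not spell out.

```typescript
interface GateEnv {
  READ_ONLY_MODE: boolean;
  ENABLE_WRITE_TOOLS: boolean;
  ENABLE_DEPLOY_TOOLS: boolean;
  ENABLE_DANGEROUS_TOOLS: boolean;
  REQUIRE_DRIFT_CHECK_BEFORE_DEPLOY: boolean;
}

interface Tiers {
  read: boolean;
  write: boolean;
  deploy: boolean;
  dangerous: boolean;
}

// Defaults as listed in the table above.
const DEFAULT_ENV: GateEnv = {
  READ_ONLY_MODE: true,
  ENABLE_WRITE_TOOLS: false,
  ENABLE_DEPLOY_TOOLS: false,
  ENABLE_DANGEROUS_TOOLS: false,
  REQUIRE_DRIFT_CHECK_BEFORE_DEPLOY: true,
};

function resolveTiers(env: GateEnv): Tiers {
  const write = !env.READ_ONLY_MODE && env.ENABLE_WRITE_TOOLS;
  const deploy = write && env.ENABLE_DEPLOY_TOOLS;
  // Assumption: dangerous needs read-only off plus its own flag.
  const dangerous = !env.READ_ONLY_MODE && env.ENABLE_DANGEROUS_TOOLS;
  return { read: true, write, deploy, dangerous };
}

interface DeployDecision {
  ok: boolean;
  reason: string;
}

// hashWhenStaged: runtime hash recorded when the change was staged.
// runtimeHashNow: runtime hash read again just before deploying.
function mayDeploy(env: GateEnv, hashWhenStaged: string, runtimeHashNow: string): DeployDecision {
  if (!resolveTiers(env).deploy) {
    return { ok: false, reason: "deploy tier is not enabled" };
  }
  if (env.REQUIRE_DRIFT_CHECK_BEFORE_DEPLOY && hashWhenStaged !== runtimeHashNow) {
    return { ok: false, reason: "runtime changed since staging" };
  }
  return { ok: true, reason: "" };
}

const devEnv: GateEnv = {
  READ_ONLY_MODE: false,
  ENABLE_WRITE_TOOLS: true,
  ENABLE_DEPLOY_TOOLS: true,
  ENABLE_DANGEROUS_TOOLS: false,
  REQUIRE_DRIFT_CHECK_BEFORE_DEPLOY: true,
};
```

With devEnv, a deploy goes through only when the two hashes match; a mismatch means someone else changed the runtime after staging, and the staged change should be rebuilt against the new state.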
- Agent Quickstart — how an AI agent drives FlowOtter
- Tool Reference — every tool, signature, example
- Architecture — layer boundaries + pipelines
- Client Configuration — MCP client setup
- Security — threat model + redaction + tier gates
- Non-Goals — what v1 explicitly does NOT do
- Changelog — per-version change record
Mozilla Public License 2.0 — see LICENSE.
See CONTRIBUTING.md. For security issues, see SECURITY.md.



