Add graph mode backend#664

Merged
leszko merged 3 commits into main from rafal/graph-mode-backend
Mar 13, 2026

Conversation


@leszko leszko commented Mar 11, 2026

Summary

  • Refactor all pipeline execution to be graph-based internally — pipelines are modeled as nodes in a directed graph connected by edges between named input/output ports, replacing the previous linear pipeline chain
  • Remove pipeline throttling — previously, when one pipeline ran faster than another, the faster one was slowed down to match. With the introduction of multiple inputs/outputs per pipeline, reasoning about throttling becomes significantly more complex, so it has been removed for now and will need to be redesigned
  • Add support for multiple inputs and outputs per pipeline processor — each pipeline (processor) can now declare and use multiple named input and output ports (e.g. video, conditioning_video), rather than being limited to a single input and single output
  • Add a new graph API — the frontend can now send a full graph definition (nodes and edges) to describe how pipelines, sources, and sinks are connected, instead of only being able to specify a linear chain of pipelines
  • Add the ability to define inputs and outputs in the pipeline schema — each pipeline can now declare its expected input/output ports in its schema, making the graph structure explicit and validated
  • Clean up VACE enabling — VACE input video routing is now expressed as a different input edge (conditioning_video) on the graph node, replacing the previous special-case runtime logic that handled VACE mode separately

API changes

1. Pipeline load

Each pipeline can now have a node_id and its own load_params. Top-level pipeline_ids and load_params can be null when using the new format.

Example request body:

{
  "pipelines": [
    {
      "node_id": "pipeline",
      "pipeline_id": "passthrough",
      "load_params": {"height": 320, "width": 576}
    },
    {
      "node_id": "pipeline_1",
      "pipeline_id": "longlive",
      "load_params": {"height": 320, "width": 576}
    }
  ],
  "pipeline_ids": null,
  "load_params": null,
  "connection_id": null,
  "connection_info": null,
  "user_id": null
}
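The request shape above maps naturally onto Pydantic models. A minimal sketch, using the PipelineLoadItem name mentioned in the walkthrough below plus a hypothetical LoadRequest wrapper (field details inferred from the example, so they may differ from the actual schema):

```python
from typing import Any, Optional
from pydantic import BaseModel


class PipelineLoadItem(BaseModel):
    node_id: str
    pipeline_id: str
    load_params: Optional[dict[str, Any]] = None


class LoadRequest(BaseModel):
    # New-format requests populate `pipelines`; legacy fields may be null.
    pipelines: Optional[list[PipelineLoadItem]] = None
    pipeline_ids: Optional[list[str]] = None
    load_params: Optional[dict[str, Any]] = None
    connection_id: Optional[str] = None
    connection_info: Optional[dict[str, Any]] = None
    user_id: Optional[str] = None


req = LoadRequest.model_validate({
    "pipelines": [
        {"node_id": "pipeline", "pipeline_id": "passthrough",
         "load_params": {"height": 320, "width": 576}},
        {"node_id": "pipeline_1", "pipeline_id": "longlive",
         "load_params": {"height": 320, "width": 576}},
    ],
    "pipeline_ids": None, "load_params": None,
    "connection_id": None, "connection_info": None, "user_id": None,
})
```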

2. Initial parameters (WebRTC connection start)

Initial parameters sent when the WebRTC connection is started can include a graph field that describes the pipeline graph (nodes and edges). Other fields (e.g. input_mode, prompts, …) are unchanged.

Example graph field:

"graph": {
  "nodes": [
    {"id": "input", "type": "source", "source_mode": "camera", "source_name": null},
    {"id": "output", "type": "sink"},
    {"id": "pipeline", "type": "pipeline", "pipeline_id": "passthrough"},
    {"id": "pipeline_1", "type": "pipeline", "pipeline_id": "longlive"}
  ],
  "edges": [
    {"from": "input", "from_port": "video", "to_node": "pipeline", "to_port": "video", "kind": "stream"},
    {"from": "pipeline", "from_port": "video", "to_node": "pipeline_1", "to_port": "video", "kind": "stream"},
    {"from": "pipeline_1", "from_port": "video", "to_node": "output", "to_port": "video", "kind": "stream"}
  ]
}

(Full initial payload still includes input_mode, prompts, prompt_interpolation_method, noise_scale, etc.)
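A sketch of the models implied by this payload. The class names GraphNode, GraphEdge, and GraphConfig come from the walkthrough below; the field layout is inferred from the example and may differ from the real schema:

```python
from typing import Literal, Optional
from pydantic import BaseModel, Field


class GraphNode(BaseModel):
    id: str
    type: Literal["source", "sink", "pipeline"]
    pipeline_id: Optional[str] = None
    source_mode: Optional[str] = None
    source_name: Optional[str] = None


class GraphEdge(BaseModel):
    # The wire format uses "from", a Python keyword, hence the alias.
    from_node: str = Field(alias="from")
    from_port: str
    to_node: str
    to_port: str
    kind: Literal["stream"] = "stream"


class GraphConfig(BaseModel):
    nodes: list[GraphNode]
    edges: list[GraphEdge]


graph = GraphConfig.model_validate({
    "nodes": [
        {"id": "input", "type": "source", "source_mode": "camera", "source_name": None},
        {"id": "output", "type": "sink"},
        {"id": "pipeline", "type": "pipeline", "pipeline_id": "passthrough"},
    ],
    "edges": [
        {"from": "input", "from_port": "video", "to_node": "pipeline",
         "to_port": "video", "kind": "stream"},
    ],
})
```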

3. Update params

Update requests can include node_id so parameters are applied to a specific graph node (e.g. a pipeline instance).

Example:

{"node_id": "pipeline_1", "prompts": [{"text": "batman", "weight": 100}]}
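Server-side, such an update has to be routed to one node's processor. A sketch with a hypothetical processor registry, broadcasting only when node_id is omitted (matching the review discussion further down):

```python
def route_update(processors_by_node_id, parameters, node_id=None):
    """Apply parameters to one graph node, or broadcast when node_id is None."""
    if node_id is None:
        for proc in processors_by_node_id.values():
            proc.update_parameters(parameters)
    elif node_id in processors_by_node_id:
        processors_by_node_id[node_id].update_parameters(parameters)
    else:
        # Unknown IDs fail loudly instead of silently broadcasting.
        raise KeyError(f"Unknown node_id: {node_id!r}")


class _StubProcessor:
    """Stand-in for a real pipeline processor; records the last parameters."""
    def __init__(self):
        self.params = None

    def update_parameters(self, parameters):
        self.params = parameters


procs = {"pipeline": _StubProcessor(), "pipeline_1": _StubProcessor()}
route_update(procs, {"prompts": [{"text": "batman", "weight": 100}]},
             node_id="pipeline_1")
```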

@leszko leszko mentioned this pull request Mar 11, 2026
4 tasks

coderabbitai bot commented Mar 11, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 03f001b5-36ec-48d4-8fbb-18baae0ead8a

📝 Walkthrough

This PR introduces graph-based pipeline execution with new schema and executor modules, refactors pipeline management to support multi-instance loading via instance keys, redesigns pipeline processing with multi-port queuing, removes WebSocket fal-connection-id tracking and WebRTC endpoints, consolidates Docker workflows, simplifies logging, and eliminates internal caching mechanisms.

Changes

Cohort / File(s) Summary
CI/CD Docker Workflows
.github/workflows/docker-build-image.yml, .github/workflows/docker-build.yml
Removes reusable docker-build-image.yml workflow; consolidates docker-build.yml to use a single build-and-push job with matrix strategy for standard and cloud variants, updating downstream job dependencies and permissions.
Version Downgrade
app/package.json, pyproject.toml
Downgrades project version from 0.1.7 to 0.1.6 in both files.
Graph Architecture Foundation
src/scope/server/graph_schema.py, src/scope/server/graph_state.py
Introduces GraphNode, GraphEdge, GraphConfig Pydantic models with validation and helper methods for DAG structure; adds thread-safe file-backed store for API-configured graph state with get/set/clear operations.
Graph Execution Engine
src/scope/server/graph_executor.py
Implements build_graph() function and GraphRun dataclass to construct and wire graph-based pipelines, including queue creation, processor instantiation, source/sink resolution, and dynamic queue resizing based on pipeline requirements.
Frame & Pipeline Processing
src/scope/server/frame_processor.py, src/scope/server/pipeline_processor.py
Refactors frame processing to support graph mode with per-node routing; redesigns pipeline processor with port-keyed input/output queues, output_consumers tracking, multi-port frame preparation, and per-node parameter routing.
Pipeline Management & Orchestration
src/scope/server/pipeline_manager.py, src/scope/server/schema.py
Introduces instance_key-based multi-pipeline support; refactors load_pipelines() to accept per-node load configurations; adds PipelineLoadItem model and graph/node_id parameters to pipeline configuration schema.
WebSocket & Logging Cleanup
src/scope/cloud/fal_app.py, src/scope/server/cloud_connection.py, src/scope/server/logs_config.py
Removes fal-connection-id tracking, log correlation setup, and WebRTC session lifecycle management; eliminates FalConnectionFilter class and fal-specific logging format.
API Endpoint & Cache Removal
src/scope/server/app.py
Removes PUT/DELETE /api/v1/internal/fal-connection-id and DELETE /api/v1/webrtc/offer/{session_id} endpoints; refactors logging initialization; removes _pipeline_schemas_cache and _plugins_list_cache; normalizes pipeline prewarming with tuple structure.
Pipeline Configuration
src/scope/core/pipelines/base_schema.py, src/scope/core/pipelines/*/schema.py
Adds public inputs/outputs ClassVar declarations to BasePipelineConfig and extends metadata generation; defines port specifications for KreaRealtimeVideoConfig, LongLiveConfig, MemFlowConfig, RewardForcingConfig, and StreamDiffusionV2Config.
Test Cleanup
tests/test_plugin_api.py
Removes explicit cache reset imports and initialization for _pipeline_schemas_cache and _plugins_list_cache in test setup.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as API Endpoint
    participant GraphState as Graph State
    participant GraphExec as Graph Executor
    participant PipelineMgr as Pipeline Manager
    participant FrameProc as Frame Processor

    Client->>API: POST /pipelines/load (with graph)
    API->>GraphState: set_api_graph(GraphConfig)
    GraphState->>GraphState: Persist to disk
    
    Client->>API: POST /process (with frames)
    API->>FrameProc: put(frame)
    API->>GraphExec: build_graph(GraphConfig, PipelineManager)
    
    GraphExec->>PipelineMgr: load_pipelines([(node_id, pipeline_id, params)])
    PipelineMgr->>PipelineMgr: Create instance keys, load per graph
    
    GraphExec->>GraphExec: Wire queues between pipeline nodes
    GraphExec->>GraphExec: Set sink processor and source queues
    GraphExec->>FrameProc: _setup_graph(GraphRun)
    
    FrameProc->>FrameProc: Initialize per-node processors with port queues
    FrameProc->>FrameProc: Connect output_consumers for downstream routing
    
    Client->>API: PUT /parameters (node_id specified)
    API->>FrameProc: update_parameters(node_id, params)
    FrameProc->>FrameProc: Route to specific node processor
    
    Note over FrameProc: Graph execution flow<br/>(frames fan-out through queues)
    FrameProc->>FrameProc: put() → _graph_source_queues
    loop For each pipeline node
        FrameProc->>FrameProc: process_chunk(per-port frames)
        FrameProc->>FrameProc: output to downstream output_queues
    end
    
    Client->>API: GET /output
    API->>FrameProc: get()
    FrameProc->>FrameProc: last_processor() → sink output
    API->>Client: Return frame from sink

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • mjh1
  • emranemran

🐰 Hops with glee through queues so deep,
Graph nodes arranged in order to keep,
Multi-port frames dance without care,
While connection IDs fade in the air,
A pipeline refactored, both sturdy and fair!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: Passed. The title 'Add graph mode backend' is specific and directly reflects the main changes: introduction of graph execution infrastructure including graph_executor.py, graph_schema.py, graph_state.py, and updates to core components to support graph-based pipeline execution.
  • Docstring Coverage: Passed. Docstring coverage is 90.20%, above the required threshold of 80.00%.
  • Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.



…hanges

Introduce server-side graph execution infrastructure including graph schema
definitions, graph executor for processing node graphs, and graph state
management. Update pipeline manager, frame processor, and pipeline processor
to support graph-based execution. Add graph inputs/outputs to pipeline schemas.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Rafał Leszko <rafal@livepeer.org>
@leszko leszko force-pushed the rafal/graph-mode-backend branch from 3a90176 to f466988 Compare March 11, 2026 15:57
@leszko leszko marked this pull request as draft March 11, 2026 15:58

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/scope/server/frame_processor.py (2)

171-191: ⚠️ Potential issue | 🔴 Critical

Allow graph-only sessions to reach graph setup.

start() still hard-fails when self.pipeline_ids is empty, but in graph mode pipeline_ids is only populated later by _setup_pipeline_chain_sync(). A request that sends just the new graph field never reaches graph construction.

Suggested fix
-        if not self.pipeline_ids:
+        if not self.pipeline_ids and not self.parameters.get("graph"):
             error_msg = "No pipeline IDs provided, cannot start"
             logger.error(error_msg)
             self.running = False
             # Publish error for startup failure
             publish_event(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/frame_processor.py` around lines 171 - 191, The startup
currently aborts when self.pipeline_ids is empty inside start(), which blocks
graph-only sessions because pipeline_ids are populated later by
_setup_pipeline_chain_sync(); change the check in start() to only treat missing
pipeline_ids as a fatal error when not running in graph mode (e.g., check
self.mode != "graph" or presence of self.graph), so graph-mode sessions proceed
to call _setup_pipeline_chain_sync() to build pipeline_ids; keep the existing
error logging/publish_event branch for true local-mode failures (referencing
start(), self.pipeline_ids, and _setup_pipeline_chain_sync()).

893-918: ⚠️ Potential issue | 🟠 Major

Route input-source frames through the graph source fan-out too.

put() already uses _graph_source_queues, but the input-source receiver still writes only to first_processor.input_queue. Spout/NDI/camera input will therefore bypass explicit source wiring and miss multi-source graphs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/frame_processor.py` around lines 893 - 918, The input-source
branch is bypassing the graph fan-out by writing directly to
first_processor.input_queue; replace that direct write with the existing put()
path that uses _graph_source_queues so spout/NDI/camera frames reach all wired
sources. Specifically, after preparing frame_tensor (the same tensor you
currently unsqueeze and would send), call self.put(frame_tensor) instead of
first_processor.input_queue.put_nowait(...); remove or adapt the queue.Full
handling around first_processor.input_queue since put() handles fan-out/queue
semantics via _graph_source_queues.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker-build.yml:
- Around line 117-123: The workflow currently injects the user-controlled
github.head_ref directly into an inline shell expression (used to build
BRANCH_TAG) which enables command injection; fix it by first assigning
github.head_ref to an environment variable (e.g., export BRANCH_REF="${{
github.head_ref }}") and then use that safe variable in the shell script (e.g.,
BRANCH_TAG=$(printf '%s' "$BRANCH_REF" | sed 's/\//-/g')), ensuring all
expansions are quoted; update the places referencing BRANCH_TAG and avoid using
github.head_ref directly in the inline script or unquoted expansions.

In `@app/package.json`:
- Line 4: The package.json "version" was incorrectly downgraded from 0.1.7 to
0.1.6; update the "version" field in app/package.json to a higher patch than the
already released 0.1.7 (e.g., 0.1.8) so it is not a downgrade and ensure it is
kept synchronized with the Python package's version in pyproject.toml; change
the "version" string value accordingly.

In `@pyproject.toml`:
- Line 3: The package version was downgraded to version = "0.1.6" which breaks
semantic versioning; verify whether 0.1.7 was already published and if so bump
the version in pyproject.toml to the next appropriate release (e.g., "0.1.8" or
higher) and update any release notes/changelog accordingly so the published
package version monotonically increases.

In `@src/scope/core/pipelines/base_schema.py`:
- Around line 248-250: BasePipelineConfig currently exposes implicit graph ports
by default: change the ClassVar definitions inputs and outputs on
BasePipelineConfig from ["video"] to empty lists (e.g., ClassVar[list[str]] =
[]) so ports are opt-in; then update each pipeline config that should expose a
video port (ControllerVisualizerConfig, GrayConfig, OpticalFlowConfig,
PassthroughConfig, RIFEConfig, ScribbleConfig, VideoDepthAnythingConfig) to
explicitly declare their own inputs and outputs = ["video"]; ensure
get_schema_with_metadata() continues to read these per-class attributes so only
configs that opt in publish graph ports.

In `@src/scope/server/frame_processor.py`:
- Around line 946-955: The current code only runs
GraphConfig.model_validate(...) which checks shape but not graph structure;
after obtaining api_graph (either via GraphConfig.model_validate(...) or via
_build_linear_graph(...)) call api_graph.validate_structure() and raise or
propagate any validation errors so structurally invalid graphs (duplicate node
IDs, dangling edges, etc.) are rejected before _setup_graph(...) or build_graph
runs; update the block that assigns api_graph to ensure validate_structure() is
invoked and its exceptions are forwarded to fail fast.
- Around line 627-633: The current branch treats missing or misspelled node_id
as a broadcast; change the logic so broadcast only occurs when node_id is
explicitly None: if node_id is None iterate self.pipeline_processors and call
update_parameters(parameters); else if node_id exists in
self._processors_by_node_id call that processor's update_parameters(parameters);
otherwise do not broadcast—log or raise a clear error/warning (e.g., mentioning
node_id) so stale/unknown IDs do not turn into global updates; reference the
variables node_id, self._processors_by_node_id, self.pipeline_processors and the
method update_parameters to locate where to change the conditional flow.

In `@src/scope/server/graph_executor.py`:
- Around line 68-73: The code populates stream_queues keyed by (to_node,
to_port) which silently overwrites a previous stream edge when multiple edges
fan into the same input; update the logic in the block that iterates graph.edges
(and the analogous block at lines ~104-109) to detect duplicate stream targets
and either (A) raise a clear validation error (e.g., ValueError) when more than
one edge has the same (e.to_node, e.to_port) to make fan-in explicit and
rejected, or (B) change the keying to be per-edge (use a unique edge id or the
edge object as the key) and implement an explicit merge/dispatcher for multiple
queues into the single destination port; reference the symbols stream_queues,
graph.edges, e.kind, e.to_node, e.to_port and DEFAULT_INPUT_QUEUE_MAXSIZE when
making the change so the fix is applied in both places.
- Around line 154-163: The loop that wires throttlers stops after the first
match because of the trailing break; remove the break so every edge with e.kind
== "stream" and e.from_port == "video" sets
producer.throttler.set_next_processor(consumer) for all matching
producer/consumer pairs (loop over graph.edges) and optionally guard against
overwriting an existing downstream by checking the producer.throttler state
(e.g., only call set_next_processor if not already set) before assigning.
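The duplicate-target check requested in the first graph_executor comment (option A, rejecting implicit fan-in) could be sketched standalone like this; plain dicts stand in for the real edge objects:

```python
def check_stream_fan_in(edges):
    """Raise if two stream edges target the same (to_node, to_port) input."""
    seen = set()
    for e in edges:
        if e["kind"] != "stream":
            continue
        key = (e["to_node"], e["to_port"])
        if key in seen:
            # Silent overwrites in stream_queues would drop a producer;
            # reject the graph up front instead.
            raise ValueError(f"Multiple stream edges into {key}; fan-in must be explicit")
        seen.add(key)
```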

In `@src/scope/server/graph_schema.py`:
- Around line 116-158: validate_structure currently omits cycle detection so
cyclic graphs pass validation; add a DAG check (e.g., Kahn's algorithm or DFS
back-edge detection) inside validate_structure using the existing node id set
and edges (use self.nodes, self.edges, edge.from_node, edge.to_node, node_ids)
and if a cycle is found append an error like "Graph contains cycle: <describe
nodes involved>" to errors and return it along with the other checks; ensure the
algorithm uses node_ids/node_id_set for lookup and is efficient for large
graphs.
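For reference, the requested DAG check can be sketched with Kahn's algorithm; plain node-id strings and (from, to) pairs stand in for the real GraphNode/GraphEdge models:

```python
from collections import deque


def cycle_error(node_ids, edge_pairs):
    """Return an error string if the directed graph has a cycle, else None."""
    indegree = {n: 0 for n in node_ids}
    adjacency = {n: [] for n in node_ids}
    for src, dst in edge_pairs:
        adjacency[src].append(dst)
        indegree[dst] += 1

    # Repeatedly peel off nodes with no remaining incoming edges.
    ready = deque(n for n in node_ids if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in adjacency[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if visited != len(node_ids):
        # Nodes with leftover in-degree sit on or downstream of a cycle.
        stuck = sorted(n for n in node_ids if indegree[n] > 0)
        return f"Graph contains cycle: {stuck}"
    return None
```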

In `@src/scope/server/graph_state.py`:
- Around line 80-89: set_api_graph() and clear_api_graph() currently release
_lock before performing file I/O which allows get_api_graph() to race and
observe or recreate stale files; fix this by moving the disk mutations
(_write_to_file(graph) in set_api_graph and the file deletion call used in
clear_api_graph) inside the same critical section guarded by _lock so the
in-memory update and corresponding file mutation occur atomically; ensure any
call to _write_to_file, _delete_file (or similar file helpers) occurs while
holding _lock and keep logging after releasing the lock if desired.
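The atomicity fix asked for above amounts to keeping the file mutation inside the critical section. A standalone sketch (hypothetical class shape, not the real graph_state.py code):

```python
import json
import tempfile
import threading
from pathlib import Path


class GraphStateStore:
    """File-backed store: memory and disk are mutated under one lock."""

    def __init__(self, path):
        self._path = Path(path)
        self._lock = threading.Lock()
        self._graph = None

    def set_api_graph(self, graph):
        with self._lock:
            self._graph = graph
            # Write inside the critical section so a concurrent get/clear
            # can never observe memory and disk out of sync.
            self._path.write_text(json.dumps(graph))

    def clear_api_graph(self):
        with self._lock:
            self._graph = None
            self._path.unlink(missing_ok=True)

    def get_api_graph(self):
        with self._lock:
            return self._graph


store = GraphStateStore(Path(tempfile.mkdtemp()) / "graph.json")
store.set_api_graph({"nodes": [], "edges": []})
```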

In `@src/scope/server/pipeline_manager.py`:
- Around line 487-491: The logger.info call uses zip(node_ids, pipeline_ids)
without an explicit strict parameter; update that call to pass strict=True
(i.e., zip(node_ids, pipeline_ids, strict=True)) to satisfy the B905 linter rule
and ensure mismatched lengths raise immediately—modify the expression inside the
f-string in the function that builds node_ids/pipeline_ids from pipelines (the
logger.info line referencing node_ids, pipeline_ids, pipelines).

In `@src/scope/server/pipeline_processor.py`:
- Around line 513-550: The code is currently duplicating non-video tensor
outputs into parameters_queue even after they've been enqueued to downstream
output_queues; update the logic that builds extra_params so it skips adding any
port that was already delivered via output_queues (i.e., where
self.output_queues.get(port) exists and self.output_consumers.get(port) is
true), and only add truly persistent parameter outputs to parameters_queue (also
ensure tensors added to parameters_queue are detached/moved to CPU to avoid
keeping GPU memory). Apply the same guard to the second occurrence mentioned
(the block around the 555-567 range) and reference output_dict, output_queues,
output_consumers, parameters_queue, extra_params, and pipeline_id when making
the change.
- Around line 540-549: The fan-out loop in pipeline_processor.py currently
breaks out of the per-port iteration on a queue.Full which prevents later queues
from getting the frame; change the exception handling in the inner loop that
iterates over queues (where q.put_nowait(frame if q is queues[0] else
frame.clone())) so that on queue.Full you log (as you do for port == "video")
but do NOT break — simply continue to the next queue so each downstream consumer
is attempted independently; keep using frame.clone() for non-first queues and
preserve the existing logging via logger.info(f"Output queue full for
{self.pipeline_id}, dropping frame").
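The continue-instead-of-break fan-out can be sketched independently of the real frame type; a stub with a .clone() method stands in for a tensor:

```python
import queue


def fan_out(frame, queues, pipeline_id="pipeline"):
    """Offer the frame to every downstream queue independently.

    A full queue only loses its own copy; later consumers are still attempted.
    """
    for q in queues:
        # Reuse the original frame for the first queue, clone for the rest.
        item = frame if q is queues[0] else frame.clone()
        try:
            q.put_nowait(item)
        except queue.Full:
            print(f"Output queue full for {pipeline_id}, dropping frame")
            continue  # do not break: try the remaining queues


class _Frame:
    """Minimal stand-in for a tensor-like frame."""
    def clone(self):
        return _Frame()


full_q = queue.Queue(maxsize=1)
full_q.put(_Frame())  # pre-fill so this queue overflows
open_q = queue.Queue(maxsize=4)
fan_out(_Frame(), [full_q, open_q])
```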


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c9173233-66dc-428e-b7f0-e78eabaea0b6

📥 Commits

Reviewing files that changed from the base of the PR and between 6fbf39e and 3a90176.

⛔ Files ignored due to path filters (2)
  • app/package-lock.json is excluded by !**/package-lock.json
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (22)
  • .github/workflows/docker-build-image.yml
  • .github/workflows/docker-build.yml
  • app/package.json
  • pyproject.toml
  • src/scope/cloud/fal_app.py
  • src/scope/core/pipelines/base_schema.py
  • src/scope/core/pipelines/krea_realtime_video/schema.py
  • src/scope/core/pipelines/longlive/schema.py
  • src/scope/core/pipelines/memflow/schema.py
  • src/scope/core/pipelines/reward_forcing/schema.py
  • src/scope/core/pipelines/streamdiffusionv2/schema.py
  • src/scope/server/app.py
  • src/scope/server/cloud_connection.py
  • src/scope/server/frame_processor.py
  • src/scope/server/graph_executor.py
  • src/scope/server/graph_schema.py
  • src/scope/server/graph_state.py
  • src/scope/server/logs_config.py
  • src/scope/server/pipeline_manager.py
  • src/scope/server/pipeline_processor.py
  • src/scope/server/schema.py
  • tests/test_plugin_api.py
💤 Files with no reviewable changes (5)
  • src/scope/cloud/fal_app.py
  • tests/test_plugin_api.py
  • .github/workflows/docker-build-image.yml
  • src/scope/server/logs_config.py
  • src/scope/server/cloud_connection.py

Comment on lines +117 to +123
elif [ "${{ github.event_name }}" == "pull_request" ]; then
# Pull request: use sanitized branch name + short SHA
BRANCH_TAG=$(echo "${{ github.head_ref }}" | sed 's/\//-/g')
SHORT_SHA=$(echo "${{ github.event.pull_request.head.sha }}" | cut -c1-7)
TAGS="$DOCKERHUB:$BRANCH_TAG$SUFFIX,$DOCKERHUB:$SHORT_SHA$SUFFIX"
TAGS="$TAGS,$GHCR:$BRANCH_TAG$SUFFIX,$GHCR:$SHORT_SHA$SUFFIX"
fi

⚠️ Potential issue | 🔴 Critical

Script injection vulnerability via github.head_ref.

github.head_ref is user-controlled and can contain shell metacharacters. Using it directly in an inline script allows command injection attacks. For example, a malicious branch name such as `$(id)` (or one wrapped in backquotes, like `` `whoami` ``) could execute arbitrary commands in the workflow.

🔒 Proposed fix: Pass through an environment variable
       - name: Set image tag
         id: tag
+        env:
+          HEAD_REF: ${{ github.head_ref }}
         run: |
           DOCKERHUB="daydreamlive/scope"
           GHCR="ghcr.io/daydreamlive/scope"
           SUFFIX="${{ matrix.suffix }}"

           if [ "${{ github.event_name }}" == "workflow_dispatch" ]; then
             # Manual dispatch: use provided version + latest
             VERSION="${{ github.event.inputs.version }}"
             TAGS="$DOCKERHUB:$VERSION$SUFFIX,$DOCKERHUB:latest$SUFFIX"
             TAGS="$TAGS,$GHCR:$VERSION$SUFFIX,$GHCR:latest$SUFFIX"
           elif [[ "${{ github.ref }}" == refs/tags/* ]]; then
             # Git tag: extract version (e.g., v0.1.0 -> 0.1.0) + latest
             VERSION_TAG="${{ github.ref_name }}"
             VERSION="${VERSION_TAG#v}"  # Remove 'v' prefix if present
             TAGS="$DOCKERHUB:$VERSION$SUFFIX,$DOCKERHUB:latest$SUFFIX"
             TAGS="$TAGS,$GHCR:$VERSION$SUFFIX,$GHCR:latest$SUFFIX"
           elif [ "${{ github.ref }}" == "refs/heads/main" ]; then
             # Main branch
             SHORT_SHA=$(echo "${{ github.sha }}" | cut -c1-7)
             TAGS="$DOCKERHUB:main$SUFFIX,$DOCKERHUB:$SHORT_SHA$SUFFIX"
             TAGS="$TAGS,$GHCR:main$SUFFIX,$GHCR:$SHORT_SHA$SUFFIX"
           elif [ "${{ github.ref }}" == "refs/heads/runpod-serverless" ]; then
             # Runpod serverless branch
             TAGS="$DOCKERHUB:runpod-serverless$SUFFIX"
             TAGS="$TAGS,$GHCR:runpod-serverless$SUFFIX"
           elif [ "${{ github.event_name }}" == "pull_request" ]; then
             # Pull request: use sanitized branch name + short SHA
-            BRANCH_TAG=$(echo "${{ github.head_ref }}" | sed 's/\//-/g')
+            BRANCH_TAG=$(echo "$HEAD_REF" | sed 's/\//-/g')
             SHORT_SHA=$(echo "${{ github.event.pull_request.head.sha }}" | cut -c1-7)
             TAGS="$DOCKERHUB:$BRANCH_TAG$SUFFIX,$DOCKERHUB:$SHORT_SHA$SUFFIX"
             TAGS="$TAGS,$GHCR:$BRANCH_TAG$SUFFIX,$GHCR:$SHORT_SHA$SUFFIX"
           fi
           echo "tags=$TAGS" >> $GITHUB_OUTPUT

Environment variables are expanded by the runner before shell execution, preventing injection. See GitHub's security hardening guide for details.


app/package.json (outdated)
   "name": "daydream-scope-desktop",
   "productName": "Daydream Scope",
-  "version": "0.1.7",
+  "version": "0.1.6",

⚠️ Potential issue | 🟠 Major

Version downgrade detected: 0.1.7 → 0.1.6.

Same issue as in pyproject.toml. The desktop app version should remain synchronized with the Python package, but both should increase rather than decrease. If 0.1.7 was already released/distributed, this should be bumped to 0.1.8 or higher.


pyproject.toml (outdated)
 [project]
 name = "daydream-scope"
-version = "0.1.7"
+version = "0.1.6"

⚠️ Potential issue | 🟠 Major

Version downgrade detected: 0.1.7 → 0.1.6.

Decreasing the version number violates semantic versioning conventions and can cause practical issues:

  • Package managers expect monotonically increasing versions; users on 0.1.7 won't receive this as an upgrade.
  • PyPI rejects re-uploads of existing versions but also won't treat 0.1.6 as newer than 0.1.7.

If 0.1.7 was already released, this should be 0.1.8 (or higher). If 0.1.7 was never published, please verify this is intentional.


Comment on lines +248 to +250
# Graph port declaration: which stream ports this pipeline exposes
inputs: ClassVar[list[str]] = ["video"]
outputs: ClassVar[list[str]] = ["video"]

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

python - <<'PY'
import ast
from pathlib import Path

root = Path("src/scope/core/pipelines")

for path in root.rglob("*.py"):
    try:
        tree = ast.parse(path.read_text())
    except SyntaxError as exc:
        print(f"Syntax error in {path}: {exc}")
        continue

    for node in tree.body:
        if not isinstance(node, ast.ClassDef):
            continue

        base_names = set()
        for base in node.bases:
            if isinstance(base, ast.Name):
                base_names.add(base.id)
            elif isinstance(base, ast.Attribute):
                base_names.add(base.attr)

        if "BasePipelineConfig" not in base_names:
            continue

        assigned = set()
        for stmt in node.body:
            if isinstance(stmt, ast.Assign):
                for target in stmt.targets:
                    if isinstance(target, ast.Name):
                        assigned.add(target.id)
            elif isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                assigned.add(stmt.target.id)

        if "inputs" not in assigned or "outputs" not in assigned:
            print(f"{path}:{node.lineno}: {node.name} missing explicit graph ports")
PY

Repository: daydreamlive/scope

Length of output: 735


Make graph ports opt-in instead of implicit base defaults.

Seven pipeline configs inherit the new video input/output defaults from BasePipelineConfig without explicitly declaring them:

  • ControllerVisualizerConfig (controller_viz/schema.py:6)
  • GrayConfig (gray/schema.py:4)
  • OpticalFlowConfig (optical_flow/schema.py:8)
  • PassthroughConfig (passthrough/schema.py:4)
  • RIFEConfig (rife/schema.py:5)
  • ScribbleConfig (scribble/schema.py:5)
  • VideoDepthAnythingConfig (video_depth_anything/schema.py:5)

This means get_schema_with_metadata() now publishes video ports for all of them, even though they never opted in. Defaulting to empty lists and requiring explicit port declaration would keep the metadata honest and prevent accidental exposure of graph interfaces for configs that haven't been audited for graph mode.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/core/pipelines/base_schema.py` around lines 248 - 250,
BasePipelineConfig currently exposes implicit graph ports by default: change the
ClassVar definitions inputs and outputs on BasePipelineConfig from ["video"] to
empty lists (e.g., ClassVar[list[str]] = []) so ports are opt-in; then update
each pipeline config that should expose a video port
(ControllerVisualizerConfig, GrayConfig, OpticalFlowConfig, PassthroughConfig,
RIFEConfig, ScribbleConfig, VideoDepthAnythingConfig) to explicitly declare
their own inputs and outputs = ["video"]; ensure get_schema_with_metadata()
continues to read these per-class attributes so only configs that opt in publish
graph ports.
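The opt-in scheme suggested above can be sketched with plain classes. The names (`BasePipelineConfig`, `inputs`, `outputs`, `PassthroughConfig`) follow the review comment, but the class bodies here are simplified stand-ins, not the real pydantic configs:

```python
# Sketch: ports default to empty on the base class, so only configs
# that explicitly opt in publish graph ports in their schema metadata.
from typing import ClassVar

class BasePipelineConfig:
    # Empty by default: no implicit video ports.
    inputs: ClassVar[list[str]] = []
    outputs: ClassVar[list[str]] = []

class PassthroughConfig(BasePipelineConfig):
    # Explicit opt-in to the single video port.
    inputs: ClassVar[list[str]] = ["video"]
    outputs: ClassVar[list[str]] = ["video"]

class MetadataOnlyConfig(BasePipelineConfig):
    # Hypothetical config never audited for graph mode: publishes nothing.
    pass

print(PassthroughConfig.inputs)   # ['video']
print(MetadataOnlyConfig.inputs)  # []
```

With this layout, `get_schema_with_metadata()` can keep reading the per-class attributes unchanged; inheritance alone no longer advertises ports.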

Comment on lines +627 to +633
# Route to specific node or broadcast to all pipeline processors
node_id = parameters.pop("node_id", None)
if node_id and node_id in self._processors_by_node_id:
    self._processors_by_node_id[node_id].update_parameters(parameters)
else:
    for processor in self.pipeline_processors:
        processor.update_parameters(parameters)

⚠️ Potential issue | 🟠 Major

Unknown node_id should not fall back to broadcast.

The schema says broadcast happens when node_id is None; this branch also broadcasts on typos or stale node IDs. That turns a targeted update into a global one.

Suggested fix
-        if node_id and node_id in self._processors_by_node_id:
-            self._processors_by_node_id[node_id].update_parameters(parameters)
-        else:
+        if node_id is None:
             for processor in self.pipeline_processors:
                 processor.update_parameters(parameters)
+        elif node_id in self._processors_by_node_id:
+            self._processors_by_node_id[node_id].update_parameters(parameters)
+        else:
+            logger.warning("Ignoring update for unknown node_id '%s'", node_id)
+            return False
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Route to specific node or broadcast to all pipeline processors
node_id = parameters.pop("node_id", None)
if node_id and node_id in self._processors_by_node_id:
    self._processors_by_node_id[node_id].update_parameters(parameters)
else:
    for processor in self.pipeline_processors:
        processor.update_parameters(parameters)
# Route to specific node or broadcast to all pipeline processors
node_id = parameters.pop("node_id", None)
if node_id is None:
    for processor in self.pipeline_processors:
        processor.update_parameters(parameters)
elif node_id in self._processors_by_node_id:
    self._processors_by_node_id[node_id].update_parameters(parameters)
else:
    logger.warning("Ignoring update for unknown node_id '%s'", node_id)
    return False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/frame_processor.py` around lines 627 - 633, The current
branch treats missing or misspelled node_id as a broadcast; change the logic so
broadcast only occurs when node_id is explicitly None: if node_id is None
iterate self.pipeline_processors and call update_parameters(parameters); else if
node_id exists in self._processors_by_node_id call that processor's
update_parameters(parameters); otherwise do not broadcast—log or raise a clear
error/warning (e.g., mentioning node_id) so stale/unknown IDs do not turn into
global updates; reference the variables node_id, self._processors_by_node_id,
self.pipeline_processors and the method update_parameters to locate where to
change the conditional flow.

Comment on lines +116 to +158
def validate_structure(self) -> list[str]:
    """Validate the graph structure and return a list of error messages.

    Checks:
    - No duplicate node IDs
    - At least one source and one sink node
    - Pipeline nodes have a pipeline_id
    - All edge references point to existing nodes
    """
    errors: list[str] = []
    node_ids = [n.id for n in self.nodes]

    # Check for duplicate node IDs
    seen: set[str] = set()
    for nid in node_ids:
        if nid in seen:
            errors.append(f"Duplicate node ID: '{nid}'")
        seen.add(nid)

    # At least one source and one sink
    if not self.get_source_node_ids():
        errors.append("Graph must have at least one source node")
    if not self.get_sink_node_ids():
        errors.append("Graph must have at least one sink node")

    # Pipeline nodes must have pipeline_id
    for node in self.nodes:
        if node.type == "pipeline" and not node.pipeline_id:
            errors.append(f"Pipeline node '{node.id}' is missing pipeline_id")

    # Edge references must point to existing nodes
    node_id_set = set(node_ids)
    for edge in self.edges:
        if edge.from_node not in node_id_set:
            errors.append(
                f"Edge references non-existent source node: '{edge.from_node}'"
            )
        if edge.to_node not in node_id_set:
            errors.append(
                f"Edge references non-existent target node: '{edge.to_node}'"
            )

    return errors

⚠️ Potential issue | 🟠 Major

validate_structure() still accepts cyclic graphs.

The module and docs describe a DAG, but this validator never checks for cycles. Because processors wait for all upstream queues before running, a cyclic graph can deadlock the stream at startup instead of being rejected up front.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/graph_schema.py` around lines 116 - 158, validate_structure
currently omits cycle detection so cyclic graphs pass validation; add a DAG
check (e.g., Kahn's algorithm or DFS back-edge detection) inside
validate_structure using the existing node id set and edges (use self.nodes,
self.edges, edge.from_node, edge.to_node, node_ids) and if a cycle is found
append an error like "Graph contains cycle: <describe nodes involved>" to errors
and return it along with the other checks; ensure the algorithm uses
node_ids/node_id_set for lookup and is efficient for large graphs.
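The missing cycle check can be sketched with Kahn's algorithm over the existing node/edge data. Edges are simplified here to `(from_node, to_node)` pairs standing in for the `GraphConfig` fields named above:

```python
# Sketch: topological-sort-based cycle detection. Any node that never
# reaches indegree 0 sits on (or strictly downstream of) a cycle.
from collections import deque

def find_cycle_nodes(node_ids: list[str], edges: list[tuple[str, str]]) -> list[str]:
    """Return the node IDs stuck on a cycle, or [] for a valid DAG."""
    indegree = {nid: 0 for nid in node_ids}
    adjacency: dict[str, list[str]] = {nid: [] for nid in node_ids}
    for src, dst in edges:
        adjacency[src].append(dst)
        indegree[dst] += 1

    ready = deque(nid for nid, deg in indegree.items() if deg == 0)
    while ready:
        nid = ready.popleft()
        for nxt in adjacency[nid]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    return sorted(nid for nid, deg in indegree.items() if deg > 0)

print(find_cycle_nodes(["a", "b", "c"], [("a", "b"), ("b", "c")]))  # []
print(find_cycle_nodes(["a", "b", "c"], [("a", "b"), ("b", "a")]))  # ['a', 'b']
```

Inside `validate_structure()`, a non-empty result would append an error such as `f"Graph contains cycle involving: {stuck}"` and reject the graph before any processor starts waiting on upstream queues.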

Comment on lines +80 to +89
def set_api_graph(graph: GraphConfig) -> None:
    """Store a graph config set via the API (memory + file)."""
    with _lock:
        global _graph_config
        _graph_config = graph
    # Persist outside the lock to avoid holding it during I/O
    _write_to_file(graph)
    logger.info(
        f"API graph set with {len(graph.nodes)} nodes and {len(graph.edges)} edges"
    )

⚠️ Potential issue | 🟠 Major

Keep the file mutation in the same critical section as the in-memory update.

set_api_graph() and clear_api_graph() release _lock before touching disk. get_api_graph() can interleave in that window and either repopulate memory from a file that clear_api_graph() is about to delete, or observe a stale file recreated after a later clear.

Also applies to: 92-98

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/graph_state.py` around lines 80 - 89, set_api_graph() and
clear_api_graph() currently release _lock before performing file I/O which
allows get_api_graph() to race and observe or recreate stale files; fix this by
moving the disk mutations (_write_to_file(graph) in set_api_graph and the file
deletion call used in clear_api_graph) inside the same critical section guarded
by _lock so the in-memory update and corresponding file mutation occur
atomically; ensure any call to _write_to_file, _delete_file (or similar file
helpers) occurs while holding _lock and keep logging after releasing the lock if
desired.

Comment on lines +487 to +491
node_ids = [node_id for node_id, _, _ in pipelines]
pipeline_ids = [pid for _, pid, _ in pipelines]
logger.info(
    f"Loading {len(pipelines)} pipeline(s): {list(zip(node_ids, pipeline_ids))}"
)

⚠️ Potential issue | 🟡 Minor

Fix linter error: zip() without explicit strict= parameter.

The pipeline failure indicates zip() should have an explicit strict= parameter per B905.

Proposed fix
         logger.info(
-            f"Loading {len(pipelines)} pipeline(s): {list(zip(node_ids, pipeline_ids))}"
+            f"Loading {len(pipelines)} pipeline(s): {list(zip(node_ids, pipeline_ids, strict=True))}"
         )
📝 Committable suggestion


Suggested change
node_ids = [node_id for node_id, _, _ in pipelines]
pipeline_ids = [pid for _, pid, _ in pipelines]
logger.info(
    f"Loading {len(pipelines)} pipeline(s): {list(zip(node_ids, pipeline_ids))}"
)
node_ids = [node_id for node_id, _, _ in pipelines]
pipeline_ids = [pid for _, pid, _ in pipelines]
logger.info(
    f"Loading {len(pipelines)} pipeline(s): {list(zip(node_ids, pipeline_ids, strict=True))}"
)
🧰 Tools
🪛 GitHub Actions: Lint

[error] 490-490: Ruff check failed: B905 'zip()' without an explicit 'strict=' parameter. Add explicit value for parameter 'strict='.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/pipeline_manager.py` around lines 487 - 491, The logger.info
call uses zip(node_ids, pipeline_ids) without an explicit strict parameter;
update that call to pass strict=True (i.e., zip(node_ids, pipeline_ids,
strict=True)) to satisfy the B905 linter rule and ensure mismatched lengths
raise immediately—modify the expression inside the f-string in the function that
builds node_ids/pipeline_ids from pipelines (the logger.info line referencing
node_ids, pipeline_ids, pipelines).

Comment on lines +513 to 550
for port, value in output_dict.items():
    if value is None or not isinstance(value, torch.Tensor):
        continue
    queues = self.output_queues.get(port)
    if not queues:
        continue
    # Resize output queues to meet target max size.
    # Only resize when there are downstream pipeline consumers;
    # the sink has no consumers so its queues stay fixed for
    # frame_processor.get().
    if self.output_consumers.get(port):
        target_size = value.shape[0] * OUTPUT_QUEUE_MAX_SIZE_FACTOR
        self._resize_output_queue(port, target_size)
        # Re-read queues after potential resize – _resize_output_queue
        # may replace self.output_queues[port] with a new list.
        queues = self.output_queues.get(port)
        if not queues:
            continue
    if value.dtype != torch.uint8:
        value = (
            (value * 255.0)
            .clamp(0, 255)
            .to(dtype=torch.uint8)
            .contiguous()
            .detach()
        )
    frames = [value[i].unsqueeze(0) for i in range(value.shape[0])]
    for frame in frames:
        for q in queues:
            try:
                q.put_nowait(frame if q is queues[0] else frame.clone())
            except queue.Full:
                if port == "video":
                    logger.info(
                        f"Output queue full for {self.pipeline_id}, dropping frame"
                    )
                break


⚠️ Potential issue | 🟠 Major

Don't mirror streamed tensor outputs into parameters_queue.

extra_params forwards every non-video output to every downstream processor, even when that same port was already delivered via output_queues above. That duplicates batch tensors into persistent parameter state, routes ports to unrelated consumers, and keeps stale GPU tensors alive after the chunk has been consumed.

Also applies to: 555-567

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/pipeline_processor.py` around lines 513 - 550, The code is
currently duplicating non-video tensor outputs into parameters_queue even after
they've been enqueued to downstream output_queues; update the logic that builds
extra_params so it skips adding any port that was already delivered via
output_queues (i.e., where self.output_queues.get(port) exists and
self.output_consumers.get(port) is true), and only add truly persistent
parameter outputs to parameters_queue (also ensure tensors added to
parameters_queue are detached/moved to CPU to avoid keeping GPU memory). Apply
the same guard to the second occurrence mentioned (the block around the 555-567
range) and reference output_dict, output_queues, output_consumers,
parameters_queue, extra_params, and pipeline_id when making the change.
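The suggested guard amounts to filtering the output dict before it feeds parameter state. This sketch uses simplified stand-ins (strings and dicts) for the processor's real tensors and queue fields:

```python
# Sketch: build extra_params only from ports NOT already delivered
# through output_queues, so streamed tensors never leak into the
# persistent parameter state.
output_dict = {"video": "tensor-A", "depth": "tensor-B", "stats": {"fps": 24}}
output_queues = {"video": ["q1"], "depth": ["q2"]}   # ports with stream queues
output_consumers = {"video": True, "depth": True}     # ports with downstream consumers

extra_params = {
    port: value
    for port, value in output_dict.items()
    if not (output_queues.get(port) and output_consumers.get(port))
}
print(extra_params)  # {'stats': {'fps': 24}}
```

In the real code, any tensor that does belong in `extra_params` should additionally be detached and moved to CPU so stale GPU memory is not kept alive across chunks.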

Comment on lines +540 to +549
for frame in frames:
    for q in queues:
        try:
            q.put_nowait(frame if q is queues[0] else frame.clone())
        except queue.Full:
            if port == "video":
                logger.info(
                    f"Output queue full for {self.pipeline_id}, dropping frame"
                )
            break

⚠️ Potential issue | 🟠 Major

Keep fanning out even when one downstream queue is full.

On Line 549, break exits the per-port fan-out loop, so one slow consumer prevents every later queue on that port from receiving the frame. In a branched graph, that turns localized backpressure into dropped frames for unrelated consumers.

Suggested fix
                 for frame in frames:
                     for q in queues:
                         try:
                             q.put_nowait(frame if q is queues[0] else frame.clone())
                         except queue.Full:
                             if port == "video":
                                 logger.info(
                                     f"Output queue full for {self.pipeline_id}, dropping frame"
                                 )
-                            break
+                            continue
📝 Committable suggestion


Suggested change
for frame in frames:
    for q in queues:
        try:
            q.put_nowait(frame if q is queues[0] else frame.clone())
        except queue.Full:
            if port == "video":
                logger.info(
                    f"Output queue full for {self.pipeline_id}, dropping frame"
                )
            break
for frame in frames:
    for q in queues:
        try:
            q.put_nowait(frame if q is queues[0] else frame.clone())
        except queue.Full:
            if port == "video":
                logger.info(
                    f"Output queue full for {self.pipeline_id}, dropping frame"
                )
            continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scope/server/pipeline_processor.py` around lines 540 - 549, The fan-out
loop in pipeline_processor.py currently breaks out of the per-port iteration on
a queue.Full which prevents later queues from getting the frame; change the
exception handling in the inner loop that iterates over queues (where
q.put_nowait(frame if q is queues[0] else frame.clone())) so that on queue.Full
you log (as you do for port == "video") but do NOT break — simply continue to
the next queue so each downstream consumer is attempted independently; keep
using frame.clone() for non-first queues and preserve the existing logging via
logger.info(f"Output queue full for {self.pipeline_id}, dropping frame").
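The break-vs-continue difference can be demonstrated with bounded stdlib queues standing in for the processor's output queues (the function here is an illustrative sketch, not the project's code):

```python
# Sketch: one backed-up consumer should not starve the others.
import queue

def fan_out(frame: str, queues: list[queue.Queue], stop_on_full: bool) -> int:
    """Deliver frame to each queue; return how many received it."""
    delivered = 0
    for q in queues:
        try:
            q.put_nowait(frame)
            delivered += 1
        except queue.Full:
            if stop_on_full:
                break      # original behavior: later queues are skipped
            continue       # suggested fix: each consumer tried independently
    return delivered

full_q = queue.Queue(maxsize=1)
full_q.put("old")  # this consumer is backed up
healthy = [queue.Queue(maxsize=4) for _ in range(2)]

print(fan_out("frame0", [full_q, *healthy], stop_on_full=True))   # 0
print(fan_out("frame1", [full_q, *healthy], stop_on_full=False))  # 2
```

With `break`, the single full queue at the front of the list causes both healthy consumers to miss the frame; with `continue`, only the congested consumer drops it.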

@github-actions

github-actions bot commented Mar 11, 2026

🚀 fal.ai Preview Deployment

App ID daydream/scope-pr-664--preview
WebSocket wss://fal.run/daydream/scope-pr-664--preview/ws
Commit 1d50040

Testing

Connect to this preview deployment by running this on your branch:

uv run build && SCOPE_CLOUD_APP_ID="daydream/scope-pr-664--preview/ws" uv run daydream-scope

🧪 E2E tests will run automatically against this deployment.

@github-actions

github-actions bot commented Mar 11, 2026

✅ E2E Tests passed

Status passed
fal App daydream/scope-pr-664--preview
Run View logs

Test Artifacts

Check the workflow run for screenshots.

@leszko leszko force-pushed the rafal/graph-mode-backend branch 2 times, most recently from 4e46b52 to 39df79a on March 12, 2026 12:58
Squashed follow-up fixes and improvements:
- Fix graph executor pipeline lookup to use node ID instead of pipeline ID
- Fix review issues in pipeline processor and frame processor
- Validate graph edge ports and clean up _load_events on unload
- Acquire input_queue_lock when resizing graph queues
- Remove pipeline throttler and add graph validation
- Move VACE input video routing from runtime to graph construction
- Remove redundant video-only normalization from pipeline_processor
- Reject duplicate stream edges targeting the same input port
- Fix unknown node_id falling back to broadcast in update_parameters
- Simplify graph_executor.py build_graph wiring
- Remove unused graph_state.py module
- Remove old pipeline chaining remnants
- Move build_linear_graph from frame_processor.py to graph_schema.py
- Fix queue resize
- Refactor pipeline reconciliation logic for readability and add unit tests
- Fix ruff B905 lint error: add strict=True to zip() call

Signed-off-by: Rafał Leszko <rafal@livepeer.org>
@leszko leszko force-pushed the rafal/graph-mode-backend branch from 39df79a to 8a9c6ec on March 12, 2026 13:05
@leszko leszko marked this pull request as ready for review March 12, 2026 13:15

@ryanontheinside ryanontheinside left a comment


lgtm. i couldnt break it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Rafał Leszko <rafal@livepeer.org>
@leszko leszko merged commit a1c65e3 into main Mar 13, 2026
8 checks passed