Skip to content

fix(PLT-1158): uncached operator with UNAVAILABLE parents resolves to UNAVAILABLE#105

Merged
eywalker merged 2 commits intodevfrom
eywalker/plt-1158-operator-node-without-database-caching-incorrectly-resolves
Mar 25, 2026
Merged

fix(PLT-1158): uncached operator with UNAVAILABLE parents resolves to UNAVAILABLE#105
eywalker merged 2 commits intodevfrom
eywalker/plt-1158-operator-node-without-database-caching-incorrectly-resolves

Conversation

@kurodo3
Copy link
Copy Markdown
Contributor

@kurodo3 kurodo3 Bot commented Mar 25, 2026

Summary

Fixes PLT-1158: an implicit operator node (e.g. a Join between two source nodes) with no explicit database caching incorrectly resolved to READ_ONLY on pipeline load, causing downstream function nodes to fail when attempting computation.

  • Root cause: OperatorNode.from_descriptor() promoted any node to READ_ONLY when a pipeline_db was present — regardless of cache_mode. An uncached operator (cache_mode=OFF) never writes records to the database, so there is nothing to read back; UNAVAILABLE is the correct status.
  • Fix: Gate the READ_ONLY promotion on cache_mode != CacheMode.OFF. Only operators that actively persist results (LOG or REPLAY mode) are promoted to READ_ONLY.
  • Cascade: With the operator correctly UNAVAILABLE, _load_function_node enters the existing UNAVAILABLE branch and wires the downstream function pod in CACHE_ONLY mode, allowing it to serve all previously cached results from the DB without touching the unavailable operator or sources.

Changes

File Change
src/orcapod/core/nodes/operator_node.py Guard READ_ONLY promotion with cache_mode != CacheMode.OFF in from_descriptor()
tests/test_pipeline/test_serialization.py Fix 3 tests with wrong expectations (uncached JoinUNAVAILABLE); add TestPLT1158UncachedOperatorStatus with 5 regression tests

Test plan

  • All 2580 existing tests pass
  • 3 tests corrected: test_full_mode_operator_degrades_when_sources_unavailable, test_read_only_operator_with_join, test_load_multi_source_operator_pipeline_read_only
  • 5 new regression tests in TestPLT1158UncachedOperatorStatus:
    • test_uncached_operator_is_unavailable_not_read_only — operator status is UNAVAILABLE
    • test_source_nodes_are_unavailable — pre-condition check
    • test_function_node_gets_cache_only_when_operator_is_unavailable — downstream cascade
    • test_cache_only_function_node_serves_cached_results — end-to-end: cached results served correctly
    • test_read_only_mode_uncached_operator_is_also_unavailable — consistent across load modes
  • Regression test pipeline mirrors the exact reproduction case from PLT-1158 (two DictSources → uncached Joinadder function pod)

Closes PLT-1158

🤖 Generated with Claude Code

… UNAVAILABLE

An OperatorNode whose cache_mode=OFF never writes any records to the
pipeline database.  Previously, from_descriptor() would set
LoadStatus.READ_ONLY whenever pipeline_db was not None — regardless of
whether any data was ever persisted.  This caused the downstream
FunctionNode to treat the operator as a usable stream (FULL mode) and
attempt computation, which failed because the operator had no live
data stream.

The fix: only promote to READ_ONLY when the operator has actually
persisted data (cache_mode=LOG or REPLAY).  A cache_mode=OFF operator
remains UNAVAILABLE even when a pipeline_db is present.

With the operator correctly UNAVAILABLE, _load_function_node enters the
UNAVAILABLE branch and wires the function pod in CACHE_ONLY mode, allowing
it to serve all previously cached results from the DB without touching the
unavailable operator or sources.

Changes:
- OperatorNode.from_descriptor: guard READ_ONLY promotion with
  `cache_mode != CacheMode.OFF`
- Update three existing tests whose expectations were wrong
  (uncached Join in read_only and full mode → UNAVAILABLE, not READ_ONLY)
- Add TestPLT1158UncachedOperatorStatus with five regression tests,
  including the pipeline from the issue (two DictSources → join → adder)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@eywalker eywalker requested a review from Copilot March 25, 2026 01:58
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes PLT-1158 by correcting how deserialized uncached operator nodes (e.g., implicit Joins) resolve their LoadStatus when upstream sources are UNAVAILABLE, ensuring downstream function nodes correctly fall back to CACHE_ONLY instead of attempting computation.

Changes:

  • Update OperatorNode.from_descriptor() to promote to READ_ONLY only when pipeline_db exists and cache_mode != CacheMode.OFF.
  • Update 3 existing serialization tests to expect uncached operators to load as UNAVAILABLE.
  • Add a new regression test suite covering the PLT-1158 reproduction pipeline and the downstream CACHE_ONLY cascade.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/orcapod/core/nodes/operator_node.py Fixes operator LoadStatus derivation on load by gating READ_ONLY on non-OFF cache modes.
tests/test_pipeline/test_serialization.py Aligns existing expectations with corrected semantics and adds PLT-1158 regression coverage.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

loaded = Pipeline.load(json_path, mode="full")

# The implicit join operator has no DB caching → UNAVAILABLE
join_node = loaded.compiled_nodes["adder"]
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In test_uncached_operator_is_unavailable_not_read_only, join_node = loaded.compiled_nodes["adder"] is both unused and misleading (it points to the function node, not the join/operator). Consider removing it or renaming/using it to assert on the function node explicitly, and keep operator assertions scoped to the actual join node if possible to reduce brittleness.

Suggested change
join_node = loaded.compiled_nodes["adder"]

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — the assignment is removed. The operator nodes are now found exclusively via from , with the comment updated to make that clear.

Remove the spurious `join_node = loaded.compiled_nodes["adder"]`
assignment that was both unused and misleading (it referenced the
function node, not the operator/join).  The comment above it is
also tightened to clarify that operator nodes are found via
`node_type`, not by label.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kurodo3
Copy link
Copy Markdown
Contributor Author

kurodo3 Bot commented Mar 25, 2026

Review round addressed

Unused variable removed in test_uncached_operator_is_unavailable_not_read_only

The stale join_node = loaded.compiled_nodes["adder"] assignment (which pointed to the function node, not the operator) is removed. Operator nodes are now found exclusively via node_type == "operator" from _persistent_node_map, and the inline comment is updated to make that clear. All 5 regression tests still pass.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@kurodo3
Copy link
Copy Markdown
Contributor Author

kurodo3 Bot commented Mar 25, 2026

Review round 2 — no new comments

Copilot's second pass produced no additional comments. The PR is ready for human review.

@eywalker eywalker merged commit 0ce6e0e into dev Mar 25, 2026
9 checks passed
@eywalker eywalker deleted the eywalker/plt-1158-operator-node-without-database-caching-incorrectly-resolves branch March 25, 2026 02:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants