fix(pipeline): InputtersFlow predicated-link deadlocks on filtered items — add LinkLeftToNull() discard #344

@Vasar007

Summary: InputtersFlow.InitFlow links each inputFlow to the result transformer with inputFlow.LinkTo(_resultTransformer, FilterInputData), Gridsum's predicated-link overload. Under the default Gridsum.DataflowEx 2.0.0 behaviour, and with no LinkLeftToNull() escape hatch, items the predicate rejects (both duplicates and too-short items) are not discarded: they accumulate in inputFlow's source-block buffer and prevent inputFlow from ever completing. The downstream _resultTransformer, which has RegisterDependency(inputFlow), therefore never completes, OutputBlock.Completion never fires, and any caller awaiting end-to-end completion deadlocks.

Production impact (latent bug)

Any production input that contains duplicates or too-short entries hangs Shell.Run indefinitely. Verified in the integration suite: in the rejection case the run consistently hits the 30-second xUnit timeout with no items received, while the no-rejection happy path (1 unique non-empty item) completes in ~90 ms.

Where

  • Sources/Libraries/ProjectV.DataPipeline/InputtersFlow.cs, InitFlow method: the predicated LinkTo call against FilterInputData.

Suggested fix

Add .LinkLeftToNull() (or an equivalent discard sink) to the predicated LinkTo. After the fix, items the predicate rejects flow to the null sink and inputFlow.Completion fires normally.
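The mechanism can be reproduced outside ProjectV with plain System.Threading.Tasks.Dataflow blocks, which is roughly what the Gridsum wrapper sits on top of. The sketch below is an illustration under assumptions, not the InputtersFlow code: PredicatedLinkDemo, Filter and SourceCompletesAsync are hypothetical names, Filter is only a stand-in for FilterInputData, and DataflowBlock.NullTarget plays the role that LinkLeftToNull() is expected to play in Gridsum.DataflowEx.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PredicatedLinkDemo
{
    // Hypothetical stand-in for FilterInputData: accepts items of length >= 2.
    private static bool Filter(string item) => item.Length >= 2;

    // Returns true if the source block completes within two seconds.
    public static async Task<bool> SourceCompletesAsync(bool addDiscardLink)
    {
        var source = new BufferBlock<string>();
        var sink = new ActionBlock<string>(_ => { });

        // Predicated link: only items accepted by Filter reach the sink.
        source.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true }, Filter);

        if (addDiscardLink)
        {
            // Links are offered items in the order they were added, so this
            // discard target only receives what the predicate rejected.
            source.LinkTo(DataflowBlock.NullTarget<string>());
        }

        source.Post("ok"); // accepted by the predicate
        source.Post("x");  // rejected: stays in the buffer unless a discard link exists
        source.Complete();

        // A source block completes only once its output buffer has drained.
        Task finished = await Task.WhenAny(source.Completion, Task.Delay(TimeSpan.FromSeconds(2)));
        return finished == source.Completion;
    }

    public static async Task Main()
    {
        Console.WriteLine(await SourceCompletesAsync(addDiscardLink: false)); // False
        Console.WriteLine(await SourceCompletesAsync(addDiscardLink: true));  // True
    }
}
```

Without the discard link the rejected item sits at the head of the buffer, so Completion never fires even though Complete() was called. That matches the hang described in the summary. Whether Gridsum's LinkLeftToNull() is chained or called as a separate statement on inputFlow should be checked against the library source.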

Verification today (test-side workaround)

Sources/Tests/ProjectV.DataPipeline.Tests/InputtersFlowTests.cs exercises the dedup + length-filter branches by reflecting on the private FilterInputData(string) predicate directly — a minimal-invariant probe that confirms the production behaviour without depending on Gridsum's deadlocking completion semantics. A separate happy-path smoke test (ProcessAsync_WithSingleUniqueItem_EmitsItDownstream) exercises the no-rejection case end-to-end. The class-level <remarks> block documents the deadlock root cause.

Acceptance

  • InputtersFlow.InitFlow adds .LinkLeftToNull() (or equivalent discard) to the predicated link.
  • The dedup + length-filter tests in ProjectV.DataPipeline.Tests flip from reflection-probe to end-to-end driving (ProcessAsync(..., completeFlowOnFinish: true) + sink.Completion) and complete within the 30-second xUnit timeout.
  • Shell.Run end-to-end test (currently "tested around" in ProjectV.Core.Tests) becomes drivable when fed input with duplicates or filtered-by-length entries.

Surfaced by

Phase 2 Test Coverage (milestone v0.9.8) — Test (Integration) stage on ProjectV.DataPipeline.Tests. Companion to the DataflowPipeline.Execute(string) terminal-pipeline deadlock (separate issue) — both are Gridsum-completion-semantics latent bugs surfaced together by the Phase 2 test build-out.
