Reject cyclic MFD fraction grids instead of returning wrong results or hanging#2884
Merged
Conversation
brendancol
commented
Jun 2, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Reject cyclic MFD fraction grids instead of returning wrong results or hanging
Blockers
None.
Suggestions
None.
Nits
- The same ValueError message is repeated across six sites (the CPU and dask-tile kernels in flow_accumulation_mfd.py and flow_length_mfd.py, plus two in flow_path_mfd.py). Numba njit functions can't pull a raise message from a module-level constant, so the duplication is hard to avoid for the kernels. Not worth refactoring.
What looks good
- The cycle check is the standard Kahn-completion test: after the topological traversal, compare the processed-cell count against the valid-cell count. Correct, with O(1) extra work.
head(processed) andtail(ordered) are equivalent here because the BFS drains its queue. Both stay below n_valid only when a cycle remains.- flow_path_mfd caps the trace at HW steps. A dominant-neighbor path through a DAG visits each cell at most once, so the bound is tight and won't false-positive. A full-length 2-cell valid path (HW=2) does not trip the cap.
- No false positives on valid grids: pits and flats are in_degree-0 sources, and off-grid flow creates no in-grid edges. Confirmed on a random elevation-derived MFD grid across numpy and dask.
- The guard sits in both the eager and dask-tile kernels, so numpy, dask+numpy, and dask+cupy all raise. cupy routes through the CPU kernel.
- Tests cover numpy and dask for all three functions and both flow_length directions. 8 new tests; the existing 82 still pass.
Checklist
- Algorithm matches reference: yes (Kahn completion check)
- All implemented backends consistent: yes
- NaN handling correct: unchanged; n_valid counts non-NaN cells
- Edge cases covered: cycle on numpy and dask; 1x1 and full-length paths checked manually
- Dask chunk boundaries: guard fires per tile; cross-tile cycles are out of scope for this PR
- No premature materialization: just an integer counter
- Benchmark: existing flow benchmarks apply; no new function added
- README matrix: not applicable, no new function or backend change
- Docstrings: unchanged, still accurate
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2866
The MFD flow functions assume their fraction grid is a DAG but never check it. A hand-built or user-supplied grid with a cycle slipped through:
flow_accumulation_mfdandflow_length_mfdreturned wrong numbers (cells in the cycle kept their initial value), andflow_path_mfdlooped forever.Changes
flow_accumulation_mfdandflow_length_mfd: after the Kahn traversal, compare the processed-cell count against the valid-cell count. If a cycle left some cells with nonzero in-degree, raise aValueErrornaming the problem. Applied to the eager CPU kernels and the dask tile kernels (downstream and upstream).flow_path_mfd: the dominant-neighbor trace now caps atH*Wsteps. A path through a DAG visits each cell at most once, so exceeding that bound means a cycle. Applied to the CPUwhile Trueloop and the dask tracing loop. Both raise instead of spinning.Backend coverage
The cycle guard runs on numpy, dask+numpy, and dask+cupy (the cupy and dask+cupy paths route through the same CPU/tile kernels). cupy follows the CPU kernel as well.
Test plan
ValueErrortests for a 2-cell cycle on numpy and dask, for all three functions and both flow_length directionsflow_direction_mfdgrid still computes on every backend