refactor: centralize input validations#13
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 53 minutes and 3 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe PR restructures input validation by moving checks from utility functions to dataclass Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors validation logic by centralizing boundary checks in dataclass __post_init__ methods and the main explain entry point, while removing redundant checks from internal functions. Review feedback identifies a remaining redundant validation in calculate_hyperpixel_deltas and recommends restoring a safety check for class name lookups to prevent potential IndexError when handling auto-detected class indices.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ciao/scoring/hyperpixel.py (1)
133-141:⚠️ Potential issue | 🟠 MajorAdd validation to reject negative
max_hyperpixels.Python slicing with negative integers has special semantics (
[:−1]returns "all but last element"). Without explicit validation, passing-1silently returns an incorrect subset instead of failing. Add a guard before slicing:if max_hyperpixels < 0: raise ValueError( f"max_hyperpixels must be non-negative, got {max_hyperpixels}" )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ciao/scoring/hyperpixel.py` around lines 133 - 141, The function select_top_hyperpixels currently slices sorted results and will behave incorrectly for negative max_hyperpixels; add an explicit guard at the start of select_top_hyperpixels that checks if max_hyperpixels < 0 and raises ValueError(f"max_hyperpixels must be non-negative, got {max_hyperpixels}"), ensuring invalid negative inputs fail fast before the sorted(...)[ :max_hyperpixels] slice is applied.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ciao/algorithm/context.py`:
- Around line 24-38: SearchContext.__post_init__ currently only checks value
ranges but allows wrong types (e.g., floats or bools); add explicit type checks
using isinstance for desired_length, batch_size, seed_idx and optimization_sign
(and ensure optimization_sign is an int but not a bool) and raise TypeError with
a clear message when types are wrong, then keep the existing range checks
(desired_length >=1, batch_size >0, optimization_sign in (1, -1), and 0 <=
seed_idx < self.image_graph.num_segments) and the existing duplicate check
against self.used_segments to prevent runtime TypeErrors later.
In `@ciao/explainer/ciao_explainer.py`:
- Around line 83-95: Add explicit type validation at the public API boundary:
check that max_hyperpixels, desired_length, and batch_size are instances of int
(use isinstance(..., int)) before the existing positive-value checks and raise a
TypeError with a clear message if not; likewise, if target_class_idx is not None
validate it is an int before checking bounds against predictor.class_names and
raise TypeError for non-int inputs. Ensure messages include the parameter name
and the received type/value so downstream code (e.g., range(max_hyperpixels) in
algorithm/builder.py) cannot receive floats or other invalid types.
- Around line 90-96: The code currently only validates caller-supplied
target_class_idx against predictor.class_names but misses validating the
resolved index when target_class_idx is None; after the resolution call to
predictor.get_predicted_class(input_batch) (the place that computes
argmax(outputs, dim=1)), add a bounds check that ensures the resulting
target_class_idx is within 0..len(predictor.class_names)-1 and raise a
ValueError if not, and also ensure predictor.class_names is non-empty (raise a
clear error if len(class_names)==0) before any indexing (used later when
accessing class_names[target_class_idx]).
In `@ciao/scoring/hyperpixel.py`:
- Around line 97-102: Validate the batch_size parameter before it's used in the
range(0, num_masks, batch_size) loop: ensure batch_size is an int and > 0,
otherwise raise a ValueError with a clear message (e.g., "batch_size must be a
positive integer"). Add this check near the start of the function (before the
loop that iterates using range with batch_size) alongside the existing
segment_sets validation so floats or non-positive values fail fast.
---
Outside diff comments:
In `@ciao/scoring/hyperpixel.py`:
- Around line 133-141: The function select_top_hyperpixels currently slices
sorted results and will behave incorrectly for negative max_hyperpixels; add an
explicit guard at the start of select_top_hyperpixels that checks if
max_hyperpixels < 0 and raises ValueError(f"max_hyperpixels must be
non-negative, got {max_hyperpixels}"), ensuring invalid negative inputs fail
fast before the sorted(...)[ :max_hyperpixels] slice is applied.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 41b01c08-2351-412a-8f2c-dfe992ba0a3e
📒 Files selected for processing (6)
ciao/algorithm/context.pyciao/algorithm/graph.pyciao/algorithm/lookahead.pyciao/explainer/ciao_explainer.pyciao/scoring/hyperpixel.pyciao/scoring/segments.py
💤 Files with no reviewable changes (2)
- ciao/algorithm/lookahead.py
- ciao/scoring/segments.py
d0f666e to
d0592d6
Compare
d168c1a to
0451a3f
Compare
d0592d6 to
b5ca1e5
Compare
0451a3f to
7869eb4
Compare
b5ca1e5 to
b1c0118
Compare
7869eb4 to
fdc538b
Compare
b1c0118 to
5faee77
Compare
There was a problem hiding this comment.
Pull request overview
Refactors validation responsibilities to align with a Boundary vs. Core architecture: fail fast at the explainer entrypoint and enforce invariants on immutable context/state objects, while removing redundant checks from deeper algorithm/scoring helpers.
Changes:
- Added strict boundary validations in
CIAOExplainer.explain(path existence, parameter bounds, class index bounds, tensor/device/shape checks). - Enforced invariant validation via
__post_init__inSearchContextandImageGraph. - Removed redundant/internal validations from scoring/algorithm helpers (e.g., hyperpixel and lookahead).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
ciao/explainer/ciao_explainer.py |
Adds centralized boundary validations and simplifies class name handling. |
ciao/algorithm/context.py |
Adds SearchContext.__post_init__ to guarantee context validity. |
ciao/algorithm/graph.py |
Adds ImageGraph.__post_init__ to prevent empty graphs. |
ciao/algorithm/lookahead.py |
Removes local parameter validations from the lookahead builder. |
ciao/scoring/hyperpixel.py |
Removes helper-level input validation and relies on upstream checks. |
ciao/scoring/segments.py |
Removes redundant empty-graph validation (now enforced by ImageGraph). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
fdc538b to
7332b88
Compare
5faee77 to
7c8e95e
Compare
7c8e95e to
bf9df94
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
ciao/algorithm/graph.py (1)
17-20: StrengthenImageGraphinvariants beyond non-emptyadj_list.This validates emptiness, but traversal assumes stronger consistency (segment IDs and neighbor IDs must be within adjacency bounds). Consider enforcing those invariants here to fail fast on malformed graphs instead of failing later during indexing.
Suggested hardening in
__post_init__def __post_init__(self) -> None: if not self.adj_list: raise ValueError("ImageGraph cannot be empty (adj_list is empty).") + n = len(self.adj_list) + + if self.segments.numel() == 0: + raise ValueError("ImageGraph cannot be empty (segments is empty).") + + max_segment_id = int(self.segments.max().item()) + if max_segment_id >= n: + raise ValueError( + f"segments contain id {max_segment_id}, but adj_list has only {n} nodes." + ) + + for node_id, neighbors in enumerate(self.adj_list): + if any(neighbor_id < 0 or neighbor_id >= n for neighbor_id in neighbors): + raise ValueError( + f"adj_list[{node_id}] contains out-of-range neighbor id(s)." + )Based on learnings: in this codebase, keep
__post_init__runtime checks focused on domain-logic validation (valid ranges/values), not explicitisinstance()type enforcement.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ciao/algorithm/graph.py` around lines 17 - 20, The __post_init__ for ImageGraph currently only checks adj_list non-emptiness but should also validate adjacency consistency: in ImageGraph.__post_init__, iterate each segment index i and ensure its neighbor IDs in adj_list[i] are integers within 0..len(self.adj_list)-1 (no negatives, no out-of-range IDs) and optionally ensure neighbors refer to existing segments (i.e., membership in bounds); raise ValueError with a clear message identifying the offending segment/index when any invariant fails so malformed graphs fail fast during construction instead of later in traversal or indexing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@ciao/algorithm/graph.py`:
- Around line 17-20: The __post_init__ for ImageGraph currently only checks
adj_list non-emptiness but should also validate adjacency consistency: in
ImageGraph.__post_init__, iterate each segment index i and ensure its neighbor
IDs in adj_list[i] are integers within 0..len(self.adj_list)-1 (no negatives, no
out-of-range IDs) and optionally ensure neighbors refer to existing segments
(i.e., membership in bounds); raise ValueError with a clear message identifying
the offending segment/index when any invariant fails so malformed graphs fail
fast during construction instead of later in traversal or indexing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8f2e7a46-4eac-465d-82f1-37f6ca36bada
📒 Files selected for processing (6)
ciao/algorithm/context.pyciao/algorithm/graph.pyciao/algorithm/lookahead.pyciao/explainer/ciao_explainer.pyciao/scoring/hyperpixel.pyciao/scoring/segments.py
💤 Files with no reviewable changes (3)
- ciao/scoring/segments.py
- ciao/algorithm/lookahead.py
- ciao/scoring/hyperpixel.py
🚧 Files skipped from review as they are similar to previous changes (2)
- ciao/algorithm/context.py
- ciao/explainer/ciao_explainer.py
Context:
This PR refactors input validations across the codebase to follow the "Boundary vs. Core" architectural principle. It ensures we fail-fast on invalid user inputs at the entrypoint, while removing redundant parameter checks deep inside internal algorithms.
What's Changed:
CIAOExplainer.explain(Boundary): Added strict input validations at the very beginning of the method (checking file existence, positive integers, device matches, and valid tensor shapes).SearchContext(State): Implemented__post_init__to guarantee the immutable context state is always mathematically valid before any search begins.builder.py,region.py, andlookahead.pyto keep the core logic clean and DRY.Related Task:
XAI-29
Summary by CodeRabbit