-
Notifications
You must be signed in to change notification settings - Fork 123
Add dimension-aware long-running test notifications #1067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dimension-aware long-running test notifications #1067
Conversation
- Interactive mode uses dimension-aware thresholds: * 1D tests: warned at 30 seconds * 2D tests: warned at 1 minute * 3D tests: warned at 2 minutes - Headless mode uses fixed milestone warnings at 2min, 10min, 30min - Completion messages printed for long-running tests in interactive mode - Live progress row shows currently long-running tests
|
CodeAnt AI is reviewing your PR. |
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdded long-running task monitoring to the scheduler with dimension-aware thresholds. Introduced Task dataclass, expanded WorkerThreadHolder with milestone tracking, and enhanced WorkerThread with exception handling. Implemented notifications at configurable intervals for interactive and headless modes. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High-level Suggestion
Refactor the notification state management by replacing the multiple boolean flags in WorkerThreadHolder with a single state field. This field would track the last notification milestone, simplifying both the data structure and the notification logic. [High-level, importance: 7]
Solution Walkthrough:
Before:
@dataclasses.dataclass
class WorkerThreadHolder:
# ...
notified_30s: bool = False
notified_2m: bool = False
notified_10m: bool = False
notified_30m: bool = False
def notify_long_running_threads(...):
# ...
if interactive:
if elapsed >= threshold and not holder.notified_30s:
# print notification
holder.notified_30s = True
else: # headless
if not holder.notified_2m and elapsed >= 120:
holder.notified_2m = True
if not holder.notified_10m and elapsed >= 600:
holder.notified_10m = True
# ... and so onAfter:
@dataclasses.dataclass
class WorkerThreadHolder:
# ...
# Tracks the index of the last notification sent. -1 means none.
last_notification_idx: int = -1
def notify_long_running_threads(...):
# ...
if interactive:
if elapsed >= threshold and holder.last_notification_idx == -1:
# print notification
holder.last_notification_idx = 0 # Mark as notified
else: # headless
for i, (threshold, msg) in enumerate(HEADLESS_THRESHOLDS):
if elapsed >= threshold and i > holder.last_notification_idx:
# print notification for this milestone
holder.last_notification_idx = i
Nitpicks 🔍
|
|
CodeAnt AI finished reviewing your PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds dimension-aware long-running test notifications to the MFC test scheduler, improving visibility into test execution time across different dimensional complexity levels.
- Interactive mode now shows dimension-specific warnings (1D at 30s, 2D at 1min, 3D at 2min) and displays currently long-running tests in a live progress row.
- Headless mode provides milestone-based warnings at fixed intervals (2min, 10min, 30min) regardless of test dimensionality.
- Completion messages are printed for previously-notified long-running tests in interactive mode to provide closure on test execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No issues found across 1 file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (3)
toolchain/mfc/sched.py (3)
41-53: Consider consolidating notification state tracking.The pipeline flags this class with "Too many instance attributes (10/7)". As noted in past reviews, consider:
- Renaming
notified_30stonotified_interactivesince the threshold varies by dimensionality (30s, 60s, or 120s).- Replacing the four boolean flags with a single
last_notification_idx: int = -1field to track milestone progression.@dataclasses.dataclass class WorkerThreadHolder: thread: threading.Thread ppn: int load: float devices: typing.Set[int] task: typing.Optional['Task'] = None start: float = 0.0 - # Track which milestones we've already logged - notified_30s: bool = False # for interactive mode - notified_2m: bool = False - notified_10m: bool = False - notified_30m: bool = False + # Track notification state: -1 = none, 0 = interactive/first, 1/2/3 = headless milestones + last_notification_idx: int = -1
199-208: Good UX: completion messages for long-running tests.The completion message provides useful feedback when long-running tests finish. The case extraction pattern (lines 202-204) duplicates lines 103-105 - consider extracting to a small helper if this pattern grows, but acceptable as-is.
94-149: Fix trailing whitespace and useelifchain for headless notifications.Pipeline flags trailing whitespace on lines 110 and 113, and "too-many-branches (13/12)".
Using
eliffor headless mode prevents redundant condition checks when a notification has already been printed for a lower milestone in the same iteration, and reduces branch count.def notify_long_running_threads(progress, running_tracker, interactive: bool) -> None: now = time.time() long_running_for_progress = [] for holder in threads: if not holder.thread.is_alive(): continue elapsed = now - holder.start case = holder.task.args[0] if holder.task and holder.task.args else None case_uuid = case.get_uuid() if hasattr(case, "get_uuid") else "unknown" case_trace = getattr(case, "trace", "") # --- interactive: dimension-aware thresholds --- if interactive: threshold = get_threshold_for_case(case, interactive=True) - + if elapsed >= threshold: long_running_for_progress.append((case_uuid, case_trace)) - + # Print explicit line once when crossing threshold if not holder.notified_30s: dim = get_case_dimensionality(case) dim_label = f"{dim}D" time_label = f"{int(threshold)}s" if threshold < 60 else f"{int(threshold/60)}min" cons.print( f" [italic yellow]Still running[/italic yellow] ({dim_label}, >{time_label}) " f"[bold magenta]{case_uuid}[/bold magenta] {case_trace}" ) holder.notified_30s = True # --- headless: milestone notifications at 2, 10, 30 minutes --- else: - # 2 minutes - if (not holder.notified_2m) and elapsed >= 2 * 60: + # 30 minutes (check highest first) + if (not holder.notified_30m) and elapsed >= 30 * 60: cons.print( - f" {HEADLESS_THRESHOLDS[0][1]} " + f" {HEADLESS_THRESHOLDS[2][1]} " f"[bold magenta]{case_uuid}[/bold magenta] {case_trace}" ) - holder.notified_2m = True - - # 10 minutes - if (not holder.notified_10m) and elapsed >= 10 * 60: + holder.notified_30m = True + # 10 minutes + elif (not holder.notified_10m) and elapsed >= 10 * 60: cons.print( f" {HEADLESS_THRESHOLDS[1][1]} " f"[bold magenta]{case_uuid}[/bold magenta] {case_trace}" ) holder.notified_10m = True - - # 30 minutes - if (not holder.notified_30m) and elapsed >= 30 * 60: + # 2 minutes + elif (not holder.notified_2m) and elapsed >= 2 * 60: cons.print( - f" {HEADLESS_THRESHOLDS[2][1]} " + f" {HEADLESS_THRESHOLDS[0][1]} " f"[bold magenta]{case_uuid}[/bold magenta] {case_trace}" ) - holder.notified_30m = True + holder.notified_2m = True
🧹 Nitpick comments (1)
toolchain/mfc/sched.py (1)
220-281: Implementation is correct; consider future refactoring for complexity.The scheduler loop correctly integrates the new notification system with proper handling of interactive vs headless modes. The pipeline flags "too-many-locals (18/15)" and "too-many-statements (102/50)" - the
schedfunction has grown significantly. For maintainability, consider extracting the thread management and notification logic into separate functions in a future PR.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
toolchain/mfc/sched.py(7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
toolchain/mfc/sched.py (2)
toolchain/mfc/test/case.py (2)
get_uuid(154-155)get_uuid(288-289)toolchain/mfc/test/test.py (1)
test(93-180)
🪛 GitHub Actions: Lint Toolchain
toolchain/mfc/sched.py
[warning] 73-73: Trailing whitespace (trailing-whitespace)
[warning] 77-77: Trailing whitespace (trailing-whitespace)
[warning] 110-110: Trailing whitespace (trailing-whitespace)
[warning] 113-113: Trailing whitespace (trailing-whitespace)
[warning] 42-42: Too many instance attributes (10/7) (too-many-instance-attributes)
[warning] 63-63: Too many local variables (18/15) (too-many-locals)
[warning] 78-78: Unnecessary "elif" after "return", remove the leading "el" from "elif" (no-else-return)
[warning] 87-87: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it (no-else-return)
[warning] 94-94: Too many branches (13/12) (too-many-branches)
[warning] 63-63: Too many statements (102/50) (too-many-statements)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Code Cleanliness Check
- GitHub Check: Github (ubuntu, mpi, no-debug, false)
- GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
- GitHub Check: Github (ubuntu, mpi, debug, false)
- GitHub Check: Github (ubuntu, mpi, no-debug, true)
- GitHub Check: Github (ubuntu, mpi, debug, true)
- GitHub Check: Coverage Test on CodeCov
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Build & Publish
🔇 Additional comments (3)
toolchain/mfc/sched.py (3)
7-20: LGTM! Well-structured threshold constants.The dimension-aware thresholds for interactive mode and milestone-based thresholds for headless mode are clearly defined and appropriately documented.
22-38: LGTM! Improved exception tracking.The addition of
exc_infofor full traceback storage andcompleted_successfullyflag provides better debugging context and allows distinguishing between gracefully handled test failures and unexpected exceptions.
56-61: LGTM!Clean dataclass for task encapsulation with appropriate type annotations.
- Simplify get_threshold_for_case() to only handle interactive mode - Rename notified_30s to notified_interactive for clarity - Add type hints to function parameters for consistency - Improve docstrings for better documentation - Change 'Running (long): -' to 'Running (long): none' - Fix progress tracker spacing for consistent alignment - Improve time_label formatting to handle non-integer minutes
- Remove trailing whitespace - Change elif to if after return statement - Add pylint disable comments for acceptable complexity warnings
The pylint disable comment for too-many-branches needs to be on the function definition line (where 'def' is) rather than the closing paren.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1067 +/- ##
=======================================
Coverage 44.36% 44.36%
=======================================
Files 71 71
Lines 20590 20590
Branches 1994 1994
=======================================
Hits 9134 9134
Misses 10310 10310
Partials 1146 1146 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- Keep typing.Optional[typing.Set[int]] for devices parameter (from our branch) - Integrate dimension-aware long-running test notifications (from upstream PR MFlowCode#1067) - Combine both WorkerThreadHolder fields and sched function signature improvements
User description
User description
Closes #1065
PR Type
Enhancement
Description
Add dimension-aware long-running test notifications for interactive mode
Implement fixed milestone warnings in headless mode at 2, 10, 30 minutes
Display currently long-running tests in live progress row
Print completion messages for long-running tests in interactive mode
Diagram Walkthrough
flowchart LR A["Test Scheduler"] --> B["Determine Test Dimensionality"] B --> C{Interactive Mode?} C -->|Yes| D["Dimension-aware Thresholds<br/>1D:30s, 2D:1m, 3D:2m"] C -->|No| E["Fixed Milestones<br/>2m, 10m, 30m"] D --> F["Notify & Update Progress Row"] E --> F F --> G["Print Completion Message"]File Walkthrough
sched.py
Implement dimension-aware long-running test notificationstoolchain/mfc/sched.py
INTERACTIVE_THRESHOLDSandHEADLESS_THRESHOLDSconstants fordimension-aware and milestone-based notifications
WorkerThreadHolderdataclass withtask,starttimestamp, andnotification flags for tracking long-running tests
get_case_dimensionality()helper to determine testdimensionality from case parameters
get_threshold_for_case()helper to retrieve appropriatethreshold based on mode and dimensionality
notify_long_running_threads()function to monitor runningthreads and emit notifications at appropriate thresholds
join_first_dead_thread()to acceptinteractiveparameter andprint completion messages for long-running tests
interactivemode, createrunning_trackerprogress task, and call notification functionperiodically
WorkerThreadHolderinstantiation to include task reference andstart timestamp
CodeAnt-AI Description
Dimension-aware long-running test alerts now update progress and completion notices
What Changed
Impact
✅ Clearer long-running test alerts per interactive dimensionality✅ Better visibility into currently long-running tests via the running row✅ Faster detection of hanging jobs in headless batches💡 Usage Guide
Checking Your Pull Request
Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.
Talking to CodeAnt AI
Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:
This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.
Example
Preserve Org Learnings with CodeAnt
You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:
This helps CodeAnt AI learn and adapt to your team's coding style and standards.
Example
Retrigger review
Ask CodeAnt AI to review the PR again, by typing:
Check Your Repository Health
To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.