[Enhancement] Stop looping when runtime is stable (CF-934) #967
base: main
…sistent-loop-break
…sistent-loop-break
PR Reviewer Guide 🔍
Here are some key observations to aid the review process:

PR Code Suggestions ✨
Explore these optional code suggestions:
            return 0.2
        if avg < 0.1:  # < 100 ms
            return 0.1
        return 0.03  # > 0.1 s
@mohammedahmed18 how did you set these numbers?
@aseembits93
These are just experimental numbers. The whole idea is that very small runtimes should get a high tolerance value, because the noise is large relative to the measurement.
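As a rough illustration of the idea in this reply, the tiers can be sketched as a small function. The 0.2 / 0.1 / 0.03 values and the 0.1 s boundary come from the diff excerpt under review; the cutoff for the highest tier is not visible in that excerpt, so `tiny_cutoff_s` and its default are hypothetical placeholders, as is the `within_tolerance` helper.

```python
def dynamic_tolerance(avg_runtime_s: float, tiny_cutoff_s: float = 0.001) -> float:
    """Return a relative tolerance that shrinks as the average runtime grows.

    tiny_cutoff_s is an assumed boundary for "very small" runtimes; the real
    value is not shown in the reviewed excerpt.
    """
    if avg_runtime_s < tiny_cutoff_s:  # assumed cutoff, noise dominates here
        return 0.2
    if avg_runtime_s < 0.1:  # < 100 ms
        return 0.1
    return 0.03  # >= 100 ms


def within_tolerance(sample_s: float, avg_runtime_s: float) -> bool:
    """True when the sample deviates from the average by at most the tolerance."""
    if avg_runtime_s <= 0:
        return False  # avoid dividing by zero on degenerate averages
    deviation = abs(sample_s - avg_runtime_s) / avg_runtime_s
    return deviation <= dynamic_tolerance(avg_runtime_s)
```

With these assumed tiers, a 10% deviation is accepted for a 0.5 ms average but a 20% deviation is rejected for a 0.5 s average, which is the behaviour the reply describes: short runs are allowed more relative noise.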
…sistent-loop-break
…sistent-loop-break
The optimization achieves a **437% speedup** by eliminating expensive function calls and using more efficient algorithms for median calculation and min/max operations.

**Key optimizations applied:**

1. **Custom median calculation**: Replaced `statistics.median(recent)` with a custom implementation using `sorted(recent)` and direct indexing. This eliminates the overhead of the statistics module's generic median function.
2. **Reused sorted array**: The sorted array from median calculation is reused for min/max operations (`recent_sorted[0]`, `recent_sorted[-1]`) instead of calling `min(recent)` and `max(recent)` separately, eliminating redundant iterations.
3. **Replaced `statistics.mean()` calls**: Substituted `mean(recent[:half])` and `mean(recent[half:])` with direct `sum()/length` calculations, removing function call overhead.
4. **Early termination optimization**: Changed the `all()` generator expression to an explicit loop with early `break`, avoiding unnecessary iterations when the first non-conforming value is found.

**Performance impact analysis:**

The line profiler shows dramatic improvements in the most expensive operations:

- `first = mean(recent[:half])`: 2.07ms → 47.7μs (98% reduction)
- `second = mean(recent[half:])`: 1.54ms → 35.4μs (98% reduction)
- `m = median(recent)`: 220μs → eliminated, replaced with ~55μs custom implementation

**Hot path significance:**

Based on the function reference, `should_stop()` is called in the main test loop (`pytest_runtestloop`) after every iteration to determine if benchmarking should continue. This makes it a critical hot path function where the 437% speedup directly translates to faster benchmark completion times.

**Test case performance:**

The optimization performs exceptionally well across all test scenarios, with speedups ranging from 400-500% for typical cases involving stable data, fluctuations, and large datasets. Even edge cases with insufficient data show 10-25% improvements.
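The four optimizations described above can be illustrated with a minimal sketch. This is not the PR's `should_stop()`: the function name, the `window` size, all three tolerance parameters, and the exact set of checks are assumptions chosen for illustration. Only the techniques (one `sorted()` call reused for median, min, and max, `sum()/length` means, and an explicit early-exit loop) follow the description.

```python
def should_stop_sketch(
    runtimes: list[float],
    window: int = 6,          # hypothetical number of recent samples to inspect
    center_tol: float = 0.1,  # hypothetical max relative distance from the median
    spread_tol: float = 0.2,  # hypothetical max relative (max - min) spread
    drift_tol: float = 0.05,  # hypothetical max relative drift between halves
) -> bool:
    if window < 2 or len(runtimes) < window:
        return False  # not enough data yet

    recent = runtimes[-window:]
    recent_sorted = sorted(recent)  # sorted once, reused below

    # Custom median via direct indexing instead of statistics.median().
    mid = window // 2
    if window % 2:
        m = recent_sorted[mid]
    else:
        m = (recent_sorted[mid - 1] + recent_sorted[mid]) / 2
    if m <= 0:
        return False  # guard against division by zero

    # Explicit loop with early exit instead of all(<generator>).
    for r in recent:
        if abs(r - m) / m > center_tol:
            return False

    # Min and max come from the already sorted list.
    r_min, r_max = recent_sorted[0], recent_sorted[-1]
    if r_min <= 0 or (r_max - r_min) / r_min > spread_tol:
        return False

    # Half means via sum()/length instead of statistics.mean().
    half = window // 2
    first = sum(recent[:half]) / half
    second = sum(recent[half:]) / (window - half)
    return abs(first - second) / m <= drift_tol
```

The zero guards are included because a relative deviation is undefined when the median or minimum is zero; whether the PR handles that case the same way is not shown in this conversation.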
⚡️ Codeflash found optimizations for this PR
📄 438% (4.38x) speedup for
Code Review for PR #967: Stop looping when runtime is stable

Thanks for this enhancement! Here is my detailed review:

✅ Strengths

🔴 Critical Issues - MUST FIX

1. Type Hint Error (best_summed_runtime.py:1)
2-4. Division by Zero Vulnerabilities (pytest_plugin.py)
…sh/optimize-pr967-2025-12-22T14.11.46
…25-12-22T14.11.46 ⚡️ Speed up function `should_stop` by 438% in PR #967 (`exp/consistent-loop-break`)
This PR is now faster! 🚀 @mohammedahmed18 accepted my optimizations from:
Code Review: Stop looping when runtime is stable (CF-934)

Thank you for this enhancement! The dynamic loop termination based on runtime stability is a smart optimization. Below is my detailed review:

✅ Strengths

🐛 Critical Issues

1. Type annotation bug in
…odeflash into exp/consistent-loop-break
PR Review: Stop Looping When Runtime is Stable (CF-934)

Summary

This PR introduces intelligent loop termination based on runtime stability, replacing fixed iteration counts with dynamic detection. The implementation tracks test durations and stops when measurements stabilize, improving efficiency while maintaining accuracy.

🟢 Strengths

🟡 Issues & Recommendations

Critical: Type Annotation Bug

Location:

    def calculate_best_summed_runtime(grouped_runtime_info: dict[any, list[int]]) -> int:

Issue:

Fix:

    from typing import Any

    def calculate_best_summed_runtime(grouped_runtime_info: dict[Any, list[int]]) -> int:
        return sum([min(usable_runtime_data) for _, usable_runtime_data in grouped_runtime_info.items()])

High Priority: Edge Case Handling

1. Division by Zero in
@misrasaurabh1 @mohammedahmed18 claude reviews are way too frequent, let's do something about it
    N_CANDIDATES_LP = 6

    # pytest loop stability
    STABILITY_WARMUP_LOOPS = 4
how did you determine these magic numbers @mohammedahmed18 ?
PR Type
Enhancement, Tests
Description
Add consistent-loop break based on durations
Track per-loop test durations via hook
Introduce dynamic tolerance by runtime scale
New config: CONSISTENT_LOOP_COUNT
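The description items above can be pictured with a minimal sketch of a consistent-loop break. Only the `CONSISTENT_LOOP_COUNT` name and its default of 3 come from this PR; the `LoopConsistencyTracker` class, its fixed `tolerance`, and the comparison against a running average are hypothetical stand-ins for the actual plugin logic.

```python
CONSISTENT_LOOP_COUNT = 3  # default stated in the file walkthrough


class LoopConsistencyTracker:
    """Counts consecutive loops whose duration stays close to the running average."""

    def __init__(self, required: int = CONSISTENT_LOOP_COUNT, tolerance: float = 0.1) -> None:
        self.required = required
        self.tolerance = tolerance  # hypothetical fixed relative tolerance
        self.durations: list[float] = []
        self.consistent = 0

    def record(self, loop_duration_s: float) -> bool:
        """Record one loop's total duration; return True when the loop should break."""
        if self.durations:
            avg = sum(self.durations) / len(self.durations)
            if avg > 0 and abs(loop_duration_s - avg) / avg <= self.tolerance:
                self.consistent += 1
            else:
                self.consistent = 0  # an outlier resets the streak
        self.durations.append(loop_duration_s)
        return self.consistent >= self.required


# Usage: feed one duration per loop and stop once record() returns True.
tracker = LoopConsistencyTracker()
decisions = [tracker.record(d) for d in [1.0, 1.02, 0.99, 1.01, 1.0]]
```

In a real plugin the durations would come from a reporting hook rather than a hard-coded list, and the fixed tolerance would be replaced by one scaled to the average runtime.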
Diagram Walkthrough
File Walkthrough
config_consts.py: Introduce loop consistency count constant
codeflash/code_utils/config_consts.py
- `CONSISTENT_LOOP_COUNT` default to 3.

pytest_plugin.py: Duration-based consistent loop termination logic
codeflash/verification/pytest_plugin.py
- `dynamic_tolerance` based on avg runtime.
- `CONSISTENT_LOOP_COUNT`.

env_utils.py: Minor formatting cleanup
codeflash/code_utils/env_utils.py