⚡️ Speed up method InjectPerfOnly.visit_ClassDef by 2,017% in PR #617 (alpha-async)
#703
⚡️ This pull request contains optimizations for PR #617
If you approve this dependent PR, these changes will be merged into the original PR branch `alpha-async`.
📄 2,017% (20.17x) speedup for `InjectPerfOnly.visit_ClassDef` in `codeflash/code_utils/instrument_existing_tests.py`
⏱️ Runtime: 4.08 milliseconds → 193 microseconds (best of 20 runs)
📝 Explanation and details
The optimization significantly improves performance by eliminating redundant AST traversals in the `visit_ClassDef` method.
Key optimization: Replace `ast.walk(node)` with direct iteration over `node.body`. The original code uses `ast.walk()`, which performs a deep recursive traversal of the entire AST subtree, visiting every nested node including those inside method bodies, nested classes, and compound statements. This creates O(n²) complexity when combined with the subsequent `visit_FunctionDef` calls.
Why this works: The method only needs to find direct child nodes that are `FunctionDef` or `AsyncFunctionDef` in order to process them. Direct iteration over `node.body` achieves the same result in O(n) time, since it examines only the immediate children of the class.
Performance impact: The line profiler shows the critical bottleneck: the `ast.walk()` call took 88.2% of total execution time (27ms out of 30.6ms) in the original version. The optimized version reduces this to just 10.3% (207μs out of 2ms), achieving a 2017% speedup.
Optimization effectiveness: This change is particularly beneficial for large test classes with many methods (as shown in the annotated tests achieving 800-2500% speedups), where the unnecessary deep traversal of method bodies becomes increasingly expensive. The optimization maintains identical behavior while dramatically reducing computational overhead for AST processing workflows.
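The difference between the two traversals can be illustrated with a minimal, standalone sketch. It is not the code from `instrument_existing_tests.py`: the sample class `TestExample` and the helper names `methods_via_walk` and `methods_via_body` are hypothetical and exist only for illustration. It uses only the standard-library `ast` module.

```python
import ast

# Hypothetical test class used only for this illustration.
SOURCE = '''
class TestExample:
    def test_one(self):
        def helper():
            return 1
        assert helper() == 1

    async def test_two(self):
        return [x for x in range(3)]
'''

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def methods_via_walk(class_node):
    # Deep traversal: visits every node in the class subtree,
    # including nodes inside method bodies.
    return [n.name for n in ast.walk(class_node) if isinstance(n, FUNC_TYPES)]


def methods_via_body(class_node):
    # Shallow traversal: inspects only the statements directly
    # in the class body.
    return [n.name for n in class_node.body if isinstance(n, FUNC_TYPES)]


class_node = ast.parse(SOURCE).body[0]

# Number of nodes each approach has to inspect.
nodes_walked = sum(1 for _ in ast.walk(class_node))
nodes_in_body = len(class_node.body)

print("walk:", methods_via_walk(class_node), "nodes inspected:", nodes_walked)
print("body:", methods_via_body(class_node), "nodes inspected:", nodes_in_body)
```

In this toy example the shallow version inspects two nodes, while the deep walk inspects every node in the subtree and also reaches the nested `helper` function. The sketch does not show how the original method handled such nested definitions.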
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-pr617-2025-09-02T17.56.48` and push.