blockifier: add with-libfunc-profiling feature flag#13755
blockifier: add with-libfunc-profiling feature flag#13755avi-starkware wants to merge 1 commit intoavi/blockifier/benchmarking-instrumentationfrom
Conversation
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
Artifacts upload workflows: |
PR SummaryMedium Risk Overview When enabled, native entrypoint execution conditionally routes through a new Reviewed by Cursor Bugbot for commit 545e953. Bugbot is set up for automated code reviews on this repo. Configure here. |
6a272ae to
ceacfac
Compare
| }; | ||
|
|
||
| libfunc_profiling_old_trace_id = *libfunc_profiling_trace_id; | ||
| *libfunc_profiling_trace_id = counter; |
There was a problem hiding this comment.
Unsafe mutable aliasing enables data race on trace ID
High Severity
The run method takes &self, and the ContractExecutor is shared across threads via Arc in NativeCompiledClassV1. The unsafe block creates a &mut u64 reference to the executor's trace ID symbol pointer. Since nothing prevents concurrent calls to run on the same executor, two threads can simultaneously obtain &mut u64 references to the same memory — a data race and undefined behavior. Additionally, the save/restore pattern for the old trace ID is not atomic, so interleaved concurrent calls corrupt each other's trace IDs.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit ceacfac. Configure here.
| } | ||
| }; | ||
|
|
||
| *libfunc_profiling_trace_id = libfunc_profiling_old_trace_id; |
There was a problem hiding this comment.
Trace ID not restored if unwrap panics
Medium Severity
After setting the trace ID at the unsafe pointer on line 100, the restoration on line 131 is not panic-safe. Several unwrap() calls between these lines (on LIBFUNC_PROFILE.lock(), .remove(), and LIBFUNC_PROFILES_MAP.lock()) can panic — particularly if a mutex becomes poisoned from a prior panic. If any panics, the trace ID is never restored, leaving the executor's symbol in a stale state. An RAII guard or scopeguard pattern would ensure the restore happens on unwind.
Reviewed by Cursor Bugbot for commit ceacfac. Configure here.
ceacfac to
eab2995
Compare
| selector, | ||
| profile: raw_profile, | ||
| program: program.clone(), | ||
| }; |
There was a problem hiding this comment.
Entire Sierra program cloned per entrypoint execution
Medium Severity
Each EntrypointProfile stores a full program.clone() of the Sierra program. Since program comes from the AotWithProgram variant and is the same for every call to the same contract, repeatedly cloning a potentially large Sierra program for every single entrypoint execution causes significant unnecessary memory consumption. For a contract called N times, N full copies of the program are stored in LIBFUNC_PROFILES_MAP.
Reviewed by Cursor Bugbot for commit eab2995. Configure here.
TomerStarkware
left a comment
There was a problem hiding this comment.
@TomerStarkware partially reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: 2 of 3 files reviewed, 3 unresolved discussions (waiting on avi-starkware, noaov1, and Yoni-Starkware).
dadc4c0 to
7293ba2
Compare
eab2995 to
71c0167
Compare
| }; | ||
|
|
||
| let old_trace_id = *trace_id; | ||
| *trace_id = counter; |
There was a problem hiding this comment.
Unsafe data race on shared executor's trace_id symbol
High Severity
The trace_id raw pointer targets a global symbol in the executor's shared library. Since AotContractExecutor is shared across threads via Arc<NativeCompiledClassV1Inner>, concurrent calls to run_profiled for the same contract class race on reading/writing trace_id without synchronization. The save-modify-run-restore pattern on lines 66–67 and 93 is inherently non-atomic, causing undefined behavior and incorrect profile attribution when two threads interleave.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 71c0167. Configure here.
| } | ||
| }; | ||
|
|
||
| *trace_id = old_trace_id; |
There was a problem hiding this comment.
Missing panic safety for trace_id and profiler cleanup
Medium Severity
If executor.run() on line 69 panics (or unwinds), the cleanup code on lines 71–93 never executes. The LIBFUNC_PROFILE entry for counter leaks permanently, trace_id remains set to counter instead of being restored to old_trace_id, and the LIBFUNC_PROFILES_MAP lock is never updated. A RAII guard pattern would ensure cleanup runs even during unwinding.
Reviewed by Cursor Bugbot for commit 71c0167. Configure here.
| type ProfilesByBlockTx = HashMap<String, TransactionProfile>; | ||
|
|
||
| pub static LIBFUNC_PROFILES_MAP: LazyLock<Mutex<ProfilesByBlockTx>> = | ||
| LazyLock::new(|| Mutex::new(HashMap::new())); |
There was a problem hiding this comment.
Unbounded growth of global LIBFUNC_PROFILES_MAP without cleanup
Medium Severity
LIBFUNC_PROFILES_MAP is a global Mutex<HashMap> that accumulates an EntrypointProfile (which includes a cloned Program) for every profiled entrypoint call, keyed by transaction hash. No code in the codebase ever reads from or drains this map. In a long-running node with profiling enabled, this grows without bound, leaking memory proportional to the number of executed transactions.
Reviewed by Cursor Bugbot for commit 71c0167. Configure here.
| #[cfg(feature = "with-libfunc-profiling")] | ||
| program: None, | ||
| casm, | ||
| } |
There was a problem hiding this comment.
Program field always None, profiling path never executes
High Severity
The program field on NativeCompiledClassV1Inner is always initialized to None in the constructor, and no code anywhere in the codebase ever sets it to Some(...). In entry_point_execution.rs, the profiling branch matches on &compiled_class.program — since it's always None, the run_profiled path is dead code and profiling never activates, even with the with-libfunc-profiling feature flag enabled. The entire feature is non-functional.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 71c0167. Configure here.
7293ba2 to
c9f42dd
Compare
71c0167 to
0381287
Compare
Yoni-Starkware
left a comment
There was a problem hiding this comment.
@Yoni-Starkware reviewed 7 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on avi-starkware and noaov1).
Adds the `with-libfunc-profiling` feature flag for execution profiling support. When enabled, wraps AotContractExecutor::run() to collect per-entrypoint libfunc profiling data into a global map keyed by transaction hash. Stores an optional Sierra program alongside the executor in NativeCompiledClassV1Inner to enable profiling instrumentation. Zero behavior change without the flag enabled.
0381287 to
545e953
Compare
c9f42dd to
bfe0a84
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 8 total unresolved issues (including 7 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 545e953. Configure here.
| let mut profiles_map = LIBFUNC_PROFILES_MAP.lock().unwrap(); | ||
|
|
||
| let profile = | ||
| EntrypointProfile { class_hash, selector, profile: raw_profile, program: program.clone() }; |
There was a problem hiding this comment.
Full Sierra Program cloned per entrypoint invocation
Medium Severity
EntrypointProfile stores a full owned program: cairo_lang_sierra::program::Program and each profiled entry point call does program.clone(). A Sierra program can be very large (thousands of type/libfunc declarations and statements). When the same contract is called many times, this creates many redundant deep copies. Storing an Arc<Program> instead would avoid the clones entirely.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 545e953. Configure here.



Adds the
with-libfunc-profilingfeature flag for execution profilingsupport. When enabled, introduces an
AotWithProgramvariant onContractExecutorthat collects per-entrypoint libfunc profiling datainto a global map keyed by transaction hash.
Zero behavior change without the flag enabled.