Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds mutation-version tracking to Dict and exposes it on PyDict; introduces crate-visible fast-path Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
8fc678d to
57cd6b2
Compare
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin specialization |
843794e to
5ac7f95
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
crates/vm/src/builtins/range.rs (1)
663-674: Avoid duplicating step logic betweennext_fastandfast_next.Both methods currently implement the same increment/check path. Please keep one source of truth to prevent divergence.
♻️ Suggested simplification
impl PyRangeIterator { /// Fast path for FOR_ITER specialization. Returns the next isize value /// without allocating PyInt or PyIterReturn. pub(crate) fn fast_next(&self) -> Option<isize> { - let index = self.index.fetch_add(1); - if index < self.length { - Some(self.start + (index as isize) * self.step) - } else { - None - } + self.next_fast() } }As per coding guidelines: "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/range.rs` around lines 663 - 674, The iteration logic in PyRangeIterator is duplicated between fast_next and fast_next (the fast path) — extract the shared increment/check logic into a single helper (e.g., a private method on PyRangeIterator like next_index_or_none or advance_and_get_index) and have both next_fast and fast_next call that helper to compute the next index/value; ensure the helper uses self.index.fetch_add(1), compares to self.length, and returns either the computed isize value (start + index * step) or None so both next_fast and fast_next reuse the same implementation and avoid divergence.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/builtins/function.rs`:
- Around line 611-621: The get_version_for_current_state implementation allows
FUNC_VERSION_COUNTER to wrap and reuse version tags; update it to atomically
advance FUNC_VERSION_COUNTER without ever returning a recycled nonzero version
by using an atomic compare-and-swap/loop (or Atomics::fetch_update) that: loads
the current counter (FUNC_VERSION_COUNTER), if it's 0 or u32::MAX treat as
exhausted and return 0, otherwise compute next = current.wrapping_add(1) and
attempt compare_exchange to set it to next; once you successfully set the global
counter, store that nonzero new version into self.func_version and return it;
reference get_version_for_current_state, FUNC_VERSION_COUNTER, and
self.func_version for locating the change.
In `@crates/vm/src/dict_inner.rs`:
- Around line 263-270: The version() accessor and bump_version() updater must
use proper Acquire/Release ordering and bump_version must be called while
holding the dict write lock: change version() to load with Acquire and
bump_version() to fetch_add with Release (function names: version and
bump_version in dict_inner.rs), and move every call to bump_version so it
executes inside the same write-lock critical section where the dictionary
mutation occurs (the callers in this repo include the mutation sites referenced
in frame.rs that read the version lock-free for LOAD_GLOBAL caching); ensure the
write lock remains held across the mutation and the bump_version call so the
Release store synchronizes with readers that use Acquire.
---
Nitpick comments:
In `@crates/vm/src/builtins/range.rs`:
- Around line 663-674: The iteration logic in PyRangeIterator is duplicated
between fast_next and fast_next (the fast path) — extract the shared
increment/check logic into a single helper (e.g., a private method on
PyRangeIterator like next_index_or_none or advance_and_get_index) and have both
next_fast and fast_next call that helper to compute the next index/value; ensure
the helper uses self.index.fetch_add(1), compares to self.length, and returns
either the computed isize value (start + index * step) or None so both next_fast
and fast_next reuse the same implementation and avoid divergence.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
crates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/object.rscrates/vm/src/builtins/property.rscrates/vm/src/builtins/range.rscrates/vm/src/builtins/tuple.rscrates/vm/src/builtins/type.rscrates/vm/src/dict_inner.rscrates/vm/src/frame.rs
| pub fn version(&self) -> u64 { | ||
| self.version.load(Relaxed) | ||
| } | ||
|
|
||
| /// Bump the version counter after any mutation. | ||
| fn bump_version(&self) { | ||
| self.version.fetch_add(1, Relaxed); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# 1) Show where version bumps happen relative to lock scopes.
rg -n -C4 'bump_version\(' crates/vm/src/dict_inner.rsRepository: RustPython/RustPython
Length of output: 2663
🏁 Script executed:
# 2) Show consumers of dict version values (to validate whether reads are lock-free).
rg -n -C4 '\.version\(\)' crates/vm/srcRepository: RustPython/RustPython
Length of output: 3008
🏁 Script executed:
# 3) Get the full context around lines 263-270 and surrounding code
head -n 300 crates/vm/src/dict_inner.rs | tail -n +250Repository: RustPython/RustPython
Length of output: 1941
Use Acquire/Release ordering for version field and perform version bumps under the write lock.
The version() and bump_version() methods use Relaxed ordering. Currently, bump_version() is called after the write lock is released at lines 319, 383, 441, 465, 499, 536, 738, and 756. Meanwhile, version() is read lock-free in frame.rs (lines 4644, 4680, 7076) to validate LOAD_GLOBAL specialization caches. With Relaxed ordering, a reader may see stale version values after mutations complete, causing cache validation to incorrectly accept invalidated caches.
🔧 Suggested direction
-use core::sync::atomic::{AtomicU64, Ordering::Relaxed};
+use core::sync::atomic::{AtomicU64, Ordering::{Acquire, Release}};
...
pub fn version(&self) -> u64 {
- self.version.load(Relaxed)
+ self.version.load(Acquire)
}
...
fn bump_version(&self) {
- self.version.fetch_add(1, Relaxed);
+ self.version.fetch_add(1, Release);
}Also move each bump_version() call into the same write-lock critical section as the corresponding mutation to ensure version updates are ordered with respect to dict mutations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/dict_inner.rs` around lines 263 - 270, The version() accessor
and bump_version() updater must use proper Acquire/Release ordering and
bump_version must be called while holding the dict write lock: change version()
to load with Acquire and bump_version() to fetch_add with Release (function
names: version and bump_version in dict_inner.rs), and move every call to
bump_version so it executes inside the same write-lock critical section where
the dictionary mutation occurs (the callers in this repo include the mutation
sites referenced in frame.rs that read the version lock-free for LOAD_GLOBAL
caching); ensure the write lock remains held across the mutation and the
bump_version call so the Release store synchronizes with readers that use
Acquire.
9df2a06 to
0f8da0c
Compare
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/doctest.py dependencies:
dependent tests: (33 tests)
Legend:
|
There was a problem hiding this comment.
♻️ Duplicate comments (1)
crates/vm/src/dict_inner.rs (1)
262-270:⚠️ Potential issue | 🟠 MajorUse Acquire/Release ordering for dict version invalidation.
Line 264 and Line 269 currently use
Relaxed. For a version counter read lock-free by specialization paths, this can admit stale version observations on weak memory models.🔧 Suggested fix
-use core::sync::atomic::{AtomicU64, Ordering::Relaxed}; +use core::sync::atomic::{ + AtomicU64, + Ordering::{Acquire, Release}, +}; ... pub fn version(&self) -> u64 { - self.version.load(Relaxed) + self.version.load(Acquire) } ... fn bump_version(&self) { - self.version.fetch_add(1, Relaxed); + self.version.fetch_add(1, Release); }Run this read-only verification to confirm ordering and usage sites:
#!/bin/bash set -euo pipefail # Inspect ordering in dict version API. rg -n -C3 'AtomicU64|Ordering::|fn version\(|fn bump_version\(|fetch_add|load\(' crates/vm/src/dict_inner.rs # Find version readers across VM (including specialization paths). rg -n -C3 '\.version\(\)' crates/vm/src # Review all mutation sites that bump the version. rg -n -C2 'bump_version\(' crates/vm/src/dict_inner.rsAs per coding guidelines "Follow Rust best practices for error handling and memory management".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/dict_inner.rs` around lines 262 - 270, The version counter currently uses Relaxed ordering which can allow stale reads; change the atomic load in fn version(&self) to use Acquire ordering and change the fetch_add in fn bump_version(&self) to use Release ordering on self.version so writers publish updates and readers observe them (i.e., replace Relaxed with Acquire in version() and with Release in bump_version()); after changing, run the suggested searches for .version() and bump_version() to ensure all readers/writers rely on the Acquire/Release semantics and update any comments indicating the stronger memory ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/vm/src/dict_inner.rs`:
- Around line 262-270: The version counter currently uses Relaxed ordering which
can allow stale reads; change the atomic load in fn version(&self) to use
Acquire ordering and change the fetch_add in fn bump_version(&self) to use
Release ordering on self.version so writers publish updates and readers observe
them (i.e., replace Relaxed with Acquire in version() and with Release in
bump_version()); after changing, run the suggested searches for .version() and
bump_version() to ensure all readers/writers rely on the Acquire/Release
semantics and update any comments indicating the stronger memory ordering.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_doctest/test_doctest.pyis excluded by!Lib/**
📒 Files selected for processing (10)
crates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/object.rscrates/vm/src/builtins/property.rscrates/vm/src/builtins/range.rscrates/vm/src/builtins/tuple.rscrates/vm/src/builtins/type.rscrates/vm/src/dict_inner.rscrates/vm/src/frame.rs
🚧 Files skipped from review as they are similar to previous changes (5)
- crates/vm/src/builtins/range.rs
- crates/vm/src/builtins/dict.rs
- crates/vm/src/builtins/list.rs
- crates/vm/src/builtins/type.rs
- crates/vm/src/builtins/tuple.rs
87bd158 to
5b6f414
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
crates/vm/src/dict_inner.rs (1)
22-22:⚠️ Potential issue | 🟠 MajorUse Acquire/Release for version atomics (previously reported, still unresolved).
Line 264 and Line 269 still use
Relaxed. For a mutation-version invalidation signal, this can leave dict writes unsynchronized with lock-free version readers and allow stale cache validation.Proposed fix
-use core::sync::atomic::{AtomicU64, Ordering::Relaxed}; +use core::sync::atomic::{ + AtomicU64, + Ordering::{Acquire, Release}, +}; ... pub fn version(&self) -> u64 { - self.version.load(Relaxed) + self.version.load(Acquire) } ... fn bump_version(&self) { - self.version.fetch_add(1, Relaxed); + self.version.fetch_add(1, Release); }#!/bin/bash # Verify atomic ordering at the definition site and inspect version() consumers. rg -n -C3 'AtomicU64|Ordering::|version\.load|fetch_add\(' crates/vm/src/dict_inner.rs rg -n -C3 '\.version\(\)' crates/vm/srcAlso applies to: 262-270
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/dict_inner.rs` at line 22, The atomic version uses Ordering::Relaxed which can break synchronization between writers and lock-free readers; change the ordering so readers use Acquire on version loads (e.g. any call sites of version().load(...) or version.read paths) and writers use Release on updates (e.g. the AtomicU64.fetch_add(...) or any version.store(...) in DictInner), or use Acquire on loads and Release on the mutation operation in the methods named around version()/fetch_add in dict inner; update the AtomicU64 operations referenced (version(), fetch_add, version.load) to use Ordering::Acquire for reads and Ordering::Release for writes to ensure proper acquire–release synchronization.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/builtins/range.rs`:
- Around line 653-657: fast_next currently uses self.index.fetch_add(1) which
can wrap at usize::MAX and later casts that usize to isize (index as isize),
causing wrong values for huge ranges; fix by replacing the fetch_add with an
atomic fetch_update (or a compare-and-swap loop) that first reads current =
self.index.load(...), returns None if current >= self.length, otherwise attempts
to increment to current + 1 and, on success, uses that non-wrapping current as
the element index; then convert the chosen current index to isize using
isize::try_from(current).ok_or_else(|| ...) and compute the element with checked
arithmetic (use current_isize.checked_mul(self.step) and then
checked_add(self.start)) and return None on overflow — reference the fast_next
function and the fields self.index, self.length, self.start, and self.step when
making these changes.
---
Duplicate comments:
In `@crates/vm/src/dict_inner.rs`:
- Line 22: The atomic version uses Ordering::Relaxed which can break
synchronization between writers and lock-free readers; change the ordering so
readers use Acquire on version loads (e.g. any call sites of version().load(...)
or version.read paths) and writers use Release on updates (e.g. the
AtomicU64.fetch_add(...) or any version.store(...) in DictInner), or use Acquire
on loads and Release on the mutation operation in the methods named around
version()/fetch_add in dict inner; update the AtomicU64 operations referenced
(version(), fetch_add, version.load) to use Ordering::Acquire for reads and
Ordering::Release for writes to ensure proper acquire–release synchronization.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_doctest/test_doctest.pyis excluded by!Lib/**
📒 Files selected for processing (10)
crates/vm/src/builtins/dict.rscrates/vm/src/builtins/function.rscrates/vm/src/builtins/list.rscrates/vm/src/builtins/object.rscrates/vm/src/builtins/property.rscrates/vm/src/builtins/range.rscrates/vm/src/builtins/tuple.rscrates/vm/src/builtins/type.rscrates/vm/src/dict_inner.rscrates/vm/src/frame.rs
🚧 Files skipped from review as they are similar to previous changes (6)
- crates/vm/src/builtins/property.rs
- crates/vm/src/builtins/tuple.rs
- crates/vm/src/builtins/dict.rs
- crates/vm/src/builtins/object.rs
- crates/vm/src/builtins/function.rs
- crates/vm/src/builtins/list.rs
- dict version: Relaxed → Acquire/Release ordering - range iterator: deduplicate fast_next/next_fast
- Add debug_assert preconditions in invoke_exact_args - Add get_version_for_current_state() for lazy version reassignment after func_version invalidation - Document NEXT_TYPE_VERSION overflow policy
- COMPARE_OP: CompareOpInt, CompareOpFloat, CompareOpStr - TO_BOOL: ToBoolBool, ToBoolInt, ToBoolNone, ToBoolList, ToBoolStr - FOR_ITER: ForIterRange, ForIterList, ForIterTuple with fast_next() - LOAD_GLOBAL: LoadGlobalModule, LoadGlobalBuiltin with dict version guard - Add version counter to Dict for mutation tracking
…ation - BinaryOpSubscrListInt, BinaryOpSubscrTupleInt, BinaryOpSubscrDict - ContainsOpDict, ContainsOpSet - UnpackSequenceTwoTuple, UnpackSequenceTuple, UnpackSequenceList - StoreAttrInstanceValue with type_version guard - Deoptimize bytecode for marshal serialization (original_bytes) - Separate co_code (deoptimized) from _co_code_adaptive (quickened)
…Isinstance, CallType1 specialization
…al, ForIterGen, CallListAppend specialization
- LoadAttrNondescriptorNoDict: plain class attr on objects without dict - LoadAttrNondescriptorWithValues: plain class attr with dict fallback - LoadAttrClass: handler for type attribute access (not yet routed) - CallMethodDescriptorNoargs: method descriptor with 0 args - CallMethodDescriptorO: method descriptor with 1 arg - CallMethodDescriptorFast: method descriptor with multiple args - Use HAS_DICT flag instead of obj.dict().is_some() for method/nondescriptor routing
- CallBuiltinFast: native function calls with arbitrary positional args - CallNonPyGeneral: fallback for unmatched callables (custom __call__, etc.) - All builtin function calls now specialize (CallBuiltinFast as default) - specialize_call now always produces a specialized instruction
- SendGen: direct coro.send() for generator/coroutine receivers - Add adaptive counter to Send instruction - specialize_send checks builtin_coro for PyGenerator/PyCoroutine
- LoadAttrSlot: direct obj.get_slot(offset) bypassing descriptor protocol - StoreAttrSlot: direct obj.set_slot(offset, value) bypassing descriptor protocol - Detect PyMemberDescriptor with MemberGetter::Offset in specialize_load_attr/store_attr - Cache slot offset in cache_base+3
…ltinFastWithKeywords, CallMethodDescriptorFastWithKeywords specialization
Fix LoadSuperAttrMethod to push unbound descriptor + self instead of bound method + self which caused double self binding. Fix LoadSuperAttrAttr obj_arg condition for classmethod detection.
5b6f414 to
c9c95c1
Compare
Remove unnecessary CPython references, FIXME→TODO, redundant Note: prefix, and "Same as" cross-references.
c9c95c1 to
51accdb
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/vm/src/builtins/function.rs (1)
614-628: Minor race condition may waste version numbers.Two threads calling
get_version_for_current_state()simultaneously whenfunc_version == 0could each allocate a fresh version, with one being discarded. This is benign for correctness (the function still gets a valid version) but wastes version space.If version exhaustion is a practical concern, consider using
compare_exchangeto atomically set only if still 0:♻️ Optional improvement using CAS
pub fn get_version_for_current_state(&self) -> u32 { let v = self.func_version.load(Relaxed); if v != 0 { return v; } let new_v = next_func_version(); if new_v == 0 { return 0; } - self.func_version.store(new_v, Relaxed); - new_v + match self.func_version.compare_exchange(0, new_v, Relaxed, Relaxed) { + Ok(_) => new_v, + Err(existing) => existing, // Another thread won; return their version + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/function.rs` around lines 614 - 628, get_version_for_current_state currently races when func_version == 0: two threads can both call next_func_version and one result is discarded, wasting version numbers; change the allocation path to perform an atomic compare-and-swap (compare_exchange) on self.func_version so you only install new_v if the current value is still 0 (if compare_exchange fails, discard new_v and read the installed value instead); keep the same zero handling (return 0 if next_func_version() returns 0) and use the same memory ordering (Relaxed) for the CAS/store to preserve the original semantics while preventing wasted allocations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/vm/src/builtins/function.rs`:
- Around line 614-628: get_version_for_current_state currently races when
func_version == 0: two threads can both call next_func_version and one result is
discarded, wasting version numbers; change the allocation path to perform an
atomic compare-and-swap (compare_exchange) on self.func_version so you only
install new_v if the current value is still 0 (if compare_exchange fails,
discard new_v and read the installed value instead); keep the same zero handling
(return 0 if next_func_version() returns 0) and use the same memory ordering
(Relaxed) for the CAS/store to preserve the original semantics while preventing
wasted allocations.
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
crates/vm/src/builtins/function.rscrates/vm/src/builtins/range.rscrates/vm/src/dict_inner.rscrates/vm/src/frame.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/builtins/range.rs
youknowone
left a comment
There was a problem hiding this comment.
This PR still has many missing features and caveats. They will be addressed in follow-up PR
- dict version: Relaxed → Acquire/Release ordering - range iterator: deduplicate fast_next/next_fast
- dict version: Relaxed → Acquire/Release ordering - range iterator: deduplicate fast_next/next_fast
- dict version: Relaxed → Acquire/Release ordering - range iterator: deduplicate fast_next/next_fast
Summary by CodeRabbit