Skip to content

Refactor ci#51

Merged
CppCXY merged 3 commits into
mainfrom
refactor-ci
Jun 3, 2026
Merged

Refactor ci#51
CppCXY merged 3 commits into
mainfrom
refactor-ci

Conversation

@CppCXY
Copy link
Copy Markdown
Owner

@CppCXY CppCXY commented Jun 3, 2026

No description provided.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 3, 2026

Reasoning Details We are given a large diff across many files. The task is to review the code changes for potential issues: code style violations, logical errors, security vulnerabilities, and provide improvement suggestions.

I'll go through each file change systematically.

First, note the overall changes: version bumps, addition of [profile.release] with opt-level = 3 and debug = true in workspace Cargo.toml, changes in GC module to add prepare_object_for_release for threads, changes in object_allocator to add call_info_pool, changes in call_info to introduce CallInfoPtr and use it in lua_state, changes in execute_loop to replace ActiveFrame with direct CallInfo pointer, changes in helper.rs and metamethod.rs to pass &mut CallInfo instead of frame_idx, changes in lua_state to store CallInfoPtr and use Pooled<CallInfo> for stable storage, changes in stdlib to use current_frame() instead of direct call_stack access, and test changes.

Potential issues:

  1. Cargo.toml: Adding debug = true to release profile may be unintended for production. It can increase binary size and decrease performance. Usually release builds have debug symbols only if needed. Possibly a debug optimization for performance profiling? But not typical. Should verify intent.

  2. GC module: prepare_object_for_release handles thread release. However, it is called in release_or_detach_object and release_object. But note that release_or_detach_object takes ownership of GcObjectOwner and passes as mutable ref to prepare_object_for_release. After that, it does std::mem::forget if shared or drops otherwise. That seems okay.

    However, in release_object, it takes mut obj: GcObjectOwner (changed from &mut self to taking ownership). The method signature changed: previously fn release_object(&mut self, obj: GcObjectOwner) now fn release_object(mut obj: GcObjectOwner) — note it's no longer a method of GC, it's a static method? Actually, in the diff, it shows fn release_object(mut obj: GcObjectOwner) { Self::prepare_object_for_release(&mut obj); drop(obj); } — this is inside impl GC, so it's a method that takes self? No, the diff shows removal of &mut self parameter: from fn release_object(&mut self, obj: GcObjectOwner) to fn release_object(mut obj: GcObjectOwner). That means it becomes an associated function (static method) because it no longer takes &self as first argument. But the call sites use Self::release_object(obj) which is fine for static method. However, note that drop(obj) does not require self. But previously release_object was used with self.release_object(obj), now it's Self::release_object(obj). That's consistent. Potential issue: If release_object was meant to modify GC state (like accounting), but now it doesn't have access to &mut self. However, looking at the old code, release_object didn't do anything other than drop(obj)? Actually old code: fn release_object(&mut self, obj: GcObjectOwner) { drop(obj); } — it didn't use self. So making it static is fine. But is it possible that other call sites still use self.release_object? The diff shows all call sites changed to Self::release_object. Good.

  3. Object Allocator: Added call_info_pool and alloc_call_info. This seems fine. But note that CallInfo is being allocated via pool, not directly on heap. However, CallInfo contains a raw pointer chunk_ptr and upvalue_ptrs, and other fields. The pool allocation is stable as long as the pool's memory is not deallocated until the end. The LuaState holds Vec<Pooled<CallInfo>> and Vec<CallInfoPtr>. When a CallInfo is allocated, it is pushed to call_stack_storage and its pointer is stored in call_stack. That's fine.

    Potential issue: The Pooled type likely stores a pointer to the allocated slot. When release_empty_pages is called (e.g., during GC), it may release pages that still have active CallInfo? But call_info_pool.release_empty_pages() is called in release_empty_pages() of ObjectAllocator. However, call_info_pool contains CallInfo objects that are referenced by LuaState's call_stack_storage. If the pool releases empty pages, it might deallocate memory that is still in use. But release_empty_pages typically only frees pages that have no allocations. It should be safe if the pool tracks active allocations. But need to ensure that the CallInfo objects are never freed until the Pooled handle is dropped. The LuaState stores Pooled<CallInfo> in call_stack_storage, so as long as that vector keeps those handles alive, the allocations remain. So release_empty_pages should be fine.

    However, note that the call to release_empty_pages is inside ObjectAllocator::release_empty_pages, which is called from GC? Not shown in diff. But if the GC triggers while a thread's execution is ongoing, the CallInfo objects in use are still referenced, so not released. Should be okay.

  4. CallInfo and CallInfoPtr: New type CallInfoPtr wraps NonNull<CallInfo> with Deref and DerefMut implementations that do unsafe dereference. This is used to avoid borrow checker issues? Previously, LuaState had Vec<CallInfo> and methods returning &mut CallInfo. Now it stores Vec<CallInfoPtr> and Vec<Pooled<CallInfo>>. The get_call_info_mut now returns &mut CallInfo via unsafe deref. The get_call_info_ptr returns raw pointer. This introduces more unsafe code. Need to ensure safety invariants: the CallInfoPtr must always point to valid memory. Since the Pooled<CallInfo> in call_stack_storage owns the allocation, and the CallInfoPtr in call_stack points to it, as long as the corresponding Pooled is not dropped while the pointer is used, it's safe. But call_stack_storage may be cleared in release(), which also clears call_stack. That's fine. However, there could be aliasing issues: multiple &mut aliases via CallInfoPtr DerefMut? The call_stack stores many CallInfoPtr, each pointing to a different CallInfo. But there can be multiple mutable references to the same CallInfo if the same pointer is used from two places? Unlikely, but need to ensure that the code doesn't store two CallInfoPtr to the same CallInfo. It appears each CallInfo is allocated once and stored in both call_stack and the Pooled handle. Only one CallInfoPtr per index. So safe.

    The execute_loop now gets ci_ptr from get_call_info_ptr and dereferences it as &mut CallInfo and holds that reference across loops. It replaces the old ActiveFrame struct which held a clone of frame data. This change reduces data duplication. However, now the code directly mutates the CallInfo in the pool. But there is a risk: if the GC runs and tries to access the same CallInfo via a different path, could cause aliasing? Possibly not, as GC doesn't directly touch CallInfo. However, note that CallInfo contains chunk_ptr and upvalue_ptrs which are raw pointers into GC-managed objects. Those should be stable as long as references to those objects are held. The GC may move or collect objects if not properly rooted. Need to ensure that GC does not collect the chunk or upvalues while a frame is active. The code likely has proper rooting via the LuaState's stack and call stack. But with this change, the CallInfo is now stored in a pooled allocation, but the GC could potentially move the chunk? No, chunk is a GcProto, which is a GC object, but it's not moved; only collected when unreachable. The chunk_ptr is a raw pointer to GcProto. As long as the thread holds a reference to the function value (which points to the proto), the proto is kept alive. The upvalue pointers are UpvaluePtr which are also GC roots. So it should be fine.

  5. Execute loop changes: The main change is removing ActiveFrame and using a &mut CallInfo directly. The reload_after_call and reload_after_return macros now reassign ci from a new frame pointer. This is correct. However, note that after a call to precall that returns true (indicating Lua call), they call reload_after_call! which gets the new frame. But before that, they call flush_ci!(pc) which saves the current pc. That's fine.

    One potential issue: The ci mutable reference is obtained from call_stack via raw pointer. In the main loop, ci is declared as let mut ci: &mut CallInfo = unsafe { &mut *ci_ptr };. This is a mutable reference that can be reborrowed. But then inside macros like savestate!, they call flush_ci!(pc); which does ci.save_pc($saved_pc);. That's fine. However, note that after savestate!, they may call trap = hook_check_instruction(...) which may call reload_ci!() which is currently empty (no-op). That's okay.

    The reload_ci! macro is empty because the ci reference is already pointing to the correct CallInfo; no need to reload. But there is a subtlety: when a call returns, the reload_after_return! macro sets ci to a new CallInfo from the caller's frame. So the mutable reference ci is reassigned. That is allowed because ci is let mut and it's a mutable reference to a different object. However, Rust's borrow checker does not allow reassigning a mutable reference to point to a different object? Actually, you can reassign a mutable reference variable to point to a different object if the variable is mutable. For example: let mut x: &mut i32 = &mut a; x = &mut b; is allowed. So that's fine.

    However, there is a potential aliasing issue: the old mutable reference to the previous frame's CallInfo may still be an active mutable reference. But after the function returns, the previous frame's CallInfo might be popped (by pop_call_frame). But the old mutable reference is not used again. The compiler ensures no use after move. So it's safe.

    One may worry about the stability of the raw pointer when the call stack is modified (push/pop). The CallInfoPtr stored in call_stack remains valid because the underlying Pooled<CallInfo> is owned by call_stack_storage and not removed until release or the state is dropped. When a frame is popped, the call_depth decreases, but the CallInfoPtr remains in the vector (the slot is not cleared). That's fine because future pushes will overwrite the slot. However, the CallInfo inside the pool is still alive. But after popping, the caller's frame is still there. So the pointer remains valid. No issue.

    However, there is a change in release_object from method to static. The calls Self::release_object(obj) are used. In the GC's sweep functions, they previously called drop(obj) directly, now they call Self::release_object(obj). That's consistent.

  6. Helper functions: Changed from passing frame_idx to passing &mut CallInfo directly. This reduces repeated lookups and avoids borrow issues. Good.

  7. Metamethod: call_newindex_tm_fast now takes &mut CallInfo instead of frame_idx. It directly modifies ci in case of yield. Good.

  8. LuaState changes:

    • Added call_stack_storage: Vec<Pooled<CallInfo>>.
    • get_call_info and get_call_info_mut now use CallInfoPtr to deref.
    • get_call_info_ptr returns raw pointer.
    • current_frame() and current_frame_mut() adjusted.
    • alloc_call_info_slot method that allocates from pool and stores both handle and pointer.
    • release method that clears call_depth and clears vectors. Note: release only clears call_stack and call_stack_storage, but does not call pool.release_empty_pages(). That's fine because the pool is global and will be cleaned up later.

    Potential issue: The call_stack_storage vector owns Pooled<CallInfo> handles. When release is called, these handles are dropped, which will deallocate the CallInfo objects in the pool. But the pool may still hold references to those pages? Actually, when Pooled is dropped, it presumably returns the slot to the pool for reuse. That's fine. However, there might be an issue if release is called while the thread is being garbage collected (as a GcThread). The release_object in GC now calls thread.release() which may call release on the LuaState. That's exactly what is intended: when a thread object is released, its call stack is cleaned up.

    But note: release is called from prepare_object_for_release when the object is a thread. That is called before the thread's GC object is dropped. So the thread's call stack is cleared, and then the thread is dropped. This seems correct.

  9. Stdlib changes: Changed direct l.call_stack[l.call_depth()-1] to l.current_frame().expect(...). That is cleaner and safer. The expect will panic if there is no frame, but that should not happen in normal operation. It's okay.

  10. Test: Changed similarly.

  11. Edge cases: In execute_loop.rs, the load_active_frame_header function returns a FrameCtx without frame_idx now. It also no longer caches top and call_status. That means the code relies on the ci mutable reference to read those. That's fine.

    One thing: In the main loop, after a reload_after_call!, the ci is reassigned to the new frame. But the chunk, code, constants, base, pc are also updated. That's correct. However, note that flush_ci!() is called before some operations, which saves the pc to the ci. That's okay.

    Potential bug: In the OpCode::Return0 and Return1 fast paths, they now use ci directly (the let ci = ... lines removed). They previously had let ci = active_frame.current_ci(lua_state); but now they use the outer ci directly. That's fine because ci is the current frame's CallInfo. However, note that in Return0 fast path, they do lua_state.pop_call_frame(); after using ci fields. After pop_call_frame, the ci pointer may become invalid? Actually, pop_call_frame only decrements call_depth; it does not deallocate the CallInfo. The CallInfo slot is still in the pool and the pointer remains valid. But the data inside the CallInfo is no longer the current frame. However, after pop_call_frame, the code still reads nresults and res from the old ci? Let's check:

    let ci = active_frame.current_ci(lua_state);
    let nresults = ci.nresults();
    let res = ci.base - ci.func_offset as usize;
    lua_state.pop_call_frame();
    

    Now, after the change, they use the outer ci (which is the current frame) before popping. Then they pop. After pop, they continue to use nresults and res which were copied. That's fine. The ci reference is not used after pop.

    However, there is a subtlety: In the new code, they have ci as a mutable reference to the CallInfo. After pop, that mutable reference is no longer valid? Actually, it's still valid because the memory is still there. But the semantic meaning of that CallInfo changed: it is no longer the current frame. But if they don't use it after pop, it's fine. The code does not use ci after pop. Good.

  12. Macro definitions: The flush_ci! macro calls ci.save_pc($saved_pc);. That's fine. But note that ci is a mutable reference, and save_pc takes &mut self. The macro is defined inside the function, so it captures the local ci. That's acceptable.

  13. Performance impact: The new approach avoids copying CallInfo fields into an ActiveFrame struct, which may reduce overhead. The use of pool allocation for CallInfo may improve cache locality. The addition of debug = true in release could severely impact performance. Suggest removal unless intentional.

  14. Rust edition 2024: The crate uses edition 2024. That's fine, but ensure that the Rust toolchain supports it (stable? maybe nightly). Could cause CI issues.

  15. Unsafe code: There is a significant increase in unsafe code, particularly in CallInfoPtr dereferences and in LuaState methods. Need to verify safety invariants are documented. The code relies on the invariant that the CallInfoPtr always points to a valid, initialized CallInfo that is not aliased with any other mutable reference. The get_call_info_mut method now returns &mut CallInfo by dereferencing a raw pointer. This is safe as long as the pointer is valid and unique. However, note that call_stack is a Vec<CallInfoPtr>, and get_call_info_mut is used in handle_pending_ops (helper.rs line 978) where it takes &mut CallInfo from the function argument, not from the state. But there are also calls to lua_state.get_call_info_mut(...) and lua_state.get_call_info(...) in other places? The diff shows some calls changed to use ci directly, but some remain? For example, in helper.rs, handle_pending_ops now takes ci: &mut CallInfo and uses it, no longer calls lua_state.get_call_info. That's good.

    But we need to check if any remaining usages of get_call_info_mut might alias with the ci mutable reference in the execute loop. In the execute loop, ci is a mutable reference to the current frame's CallInfo. If anywhere in the execute loop, a call to lua_state.get_call_info_mut(frame_idx) is made for the same frame, it would create two mutable references. I didn't see any such calls in the diff; all previous uses of lua_state.get_call_info in execute_loop have been replaced with direct use of ci. So that seems safe.

    However, there is a risk: in call.rs or other modules not shown in diff, there might still be calls to get_call_info_mut. But the diff only shows changes to the files listed. The review is limited to these changes. So we assume that all necessary changes are made.

  16. Missing bounds check: In execute_loop.rs, the macro reload_after_call! does let frame_idx = lua_state.call_depth() - 1; without checking if call_depth > 0. But it is called only after a successful call, so depth should be at least 1. Could panic if call_depth is 0. However, the logic ensures depth > 0.

  17. Thread release: In GC, prepare_object_for_release calls thread.release() on the thread's LuaState. That clears its call stack. But if the thread is currently executing, this could be dangerous. However, the GC only releases objects that are unreachable, so the thread should not be active. The release method is safe because the thread is being collected.

  18. Potential memory leak: The call_stack_storage vector holds Pooled<CallInfo> handles. When the LuaState is dropped, these handles are dropped, which deallocates the CallInfo in the pool. But the pool itself may still hold the pages. That's fine.

  19. Code style: The code uses macros like flush_ci! and reload_ci! which are defined inside the function. That's unusual but acceptable. There is also a lot of unsafe blocks. The unsafe code should have safety comments. I don't see comments in the diff. It would be good to add safety comments for the new unsafe operations.

  20. Security: No obvious security issues. The changes are internal to VM execution.

Improvement suggestions:

  • Remove debug = true from release profile unless specifically needed.
  • Add safety comments for unsafe blocks.
  • Consider using Option<&mut CallInfo> instead of raw pointers to avoid unsafety, but that would require changing the whole architecture. The current approach is performance-driven.
  • Ensure that the CallInfo pool's release_empty_pages is not called while any CallInfo is in use (i.e., only at safe points). The existing code probably handles that.

Overall, the changes seem well-thought-out and improve performance by reducing duplication and allocation. The main concern is the added unsafe code and the debug symbols in release.

Code Review Summary

🚨 Potential Issues

  1. Release profile has debug symbols enabled
    [profile.release] adds debug = true, which will include debug info in release builds. This increases binary size and reduces performance. Unless this is intentional for profiling/release debugging, it should be removed or set to false.

  2. Unsafe code without safety comments
    Multiple new unsafe blocks (e.g., CallInfoPtr::deref, get_call_info_ptr, get_call_info_mut, execute loop pointer dereferences) lack safety documentation. While invariants appear correct, explicit safety comments are essential for maintainability and to prevent future misuse.

  3. release_object became a static method – no access to GC state
    The method no longer takes &mut self. Previously, it only called drop(obj) and didn't use self, so this is safe. However, if future modifications need GC state (e.g., accounting), this change will break. Consider keeping as a method or documenting the decision.

  4. Possible panic in reload_after_call! if call_depth is zero
    The macro uses lua_state.call_depth() - 1 without checking for non-zero. In normal flow, depth should be ≥1 after a call, but an unexpected state could cause a panic. Add a debug assertion or safe guard.

  5. CallInfo pool allocations may not be freed until LuaState is dropped
    call_stack_storage holds Pooled<CallInfo> handles, keeping allocation slots alive even after frames are popped. This is intentional for reuse, but could lead to memory bloat if many threads are created and released. Not a bug, but worth monitoring.

✅ Positive Changes

  • Reduced data duplication – Replacing ActiveFrame with a direct &mut CallInfo pointer avoids copying and re-reading frame state.
  • Stable CallInfo pointers – Using a pool prevents re-allocation and ensures memory stability, improving safety and performance.
  • Improved code clarity – Passing &mut CallInfo instead of frame_idx reduces index lookups and makes ownership explicit.
  • Cleaner current_frame() usage – Stdlib and test code now use current_frame().expect(...) instead of direct call_stack access, which is safer and more idiomatic.

🛠 Suggestions

  1. Remove or comment out debug = true in [profile.release] unless specifically needed.
  2. Add // Safety: comments to all unsafe blocks (especially in call_info.rs, lua_state.rs, and the execute loop).
  3. Consider adding debug_assert!(lua_state.call_depth() > 0) before subtracting in reload_after_call! and load_active_frame_header.
  4. Evaluate if release_object should remain a method for future use (e.g., GC metrics).
  5. For long-running applications, consider periodically calling release_empty_pages on the call info pool when the thread is idle to free unused memory.

Overall, the changes are clean and logically sound. The main risks are the added unsafe code and the debug profile setting. With minor adjustments, this PR can be merged.

@CppCXY CppCXY merged commit e385113 into main Jun 3, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant