feat(buffer_object): Add resize method to BufferIO #57

Merged
Leahlijuan merged 5 commits into main from feat/bufferresize on Feb 25, 2026

Conversation

@Leahlijuan
Collaborator

Fixes #40

@Leahlijuan Leahlijuan requested review from g-husam and kkkapu February 23, 2026 17:38
@gemini-code-assist
Contributor

Summary of Changes

This pull request adds the ability to resize buffer objects in the BufferIO system. A buffer backed by a memory-mapped file can now have its capacity adjusted after its initial creation. The implementation touches the Python BufferIO class, its underlying C++ BufferObject, and the memory-mapping utilities.

Highlights

  • Dynamic Buffer Resizing: Introduced a resize method to the BufferIO class in Python, allowing dynamic adjustment of buffer capacity for memory-mapped files.
  • C++ Backend Implementation: Implemented the core resize logic in the underlying C++ BufferObject, which handles the resizing of the memory-mapped file.
  • Memory-Mapped File Resizing Utility: Developed a new C++ helper function, resize_mmap, to manage the complex process of unmapping, truncating, and remapping files for resizing.
  • Comprehensive Testing: Added extensive unit tests across both Python and C++ components to ensure the resize functionality works correctly, preserves data, and handles various edge cases and error conditions.
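
The unmap, truncate, and remap sequence described for resize_mmap can be sketched in Python with the standard mmap and os modules. This is only an illustration of the technique, not the project's C++ implementation; the helper name resize_mapping and the temporary file are assumptions made for the example.

```python
import mmap
import os
import tempfile


def resize_mapping(fd: int, old_map: mmap.mmap, new_size: int) -> mmap.mmap:
    """Hypothetical helper: unmap, truncate, then remap a file-backed mapping."""
    if new_size <= 0:
        raise ValueError("new_size must be positive")
    old_map.close()                 # unmap the old region
    os.ftruncate(fd, new_size)      # grow or shrink the backing file
    return mmap.mmap(fd, new_size)  # map the file again at its new length


# Usage: create a 4 KiB mapping, write to it, then grow it to 8 KiB.
fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, 4096)
    mapping = mmap.mmap(fd, 4096)
    mapping[:5] = b"hello"
    mapping = resize_mapping(fd, mapping, 8192)
    preserved = bytes(mapping[:5])  # bytes written before the resize
    new_length = len(mapping)
    mapping.close()
finally:
    os.close(fd)
    os.remove(path)
```

Because the mapping is shared with the file, bytes written before the resize are still readable through the new mapping. Any pointer or view into the old mapping is invalid after the remap, which is why callers have to refresh their references.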

Changelog
  • src/ml_flashpoint/checkpoint_object_manager/buffer_io.py
    • Added resize method to BufferIO class.
    • Implemented logic to release and recreate memoryview and remap metadata after resizing.
    • Included validity checks for write mode and minimum buffer size.
  • src/ml_flashpoint/checkpoint_object_manager/buffer_object/bindings.cpp
    • Exposed BufferObject::resize method to Python via pybind11.
  • src/ml_flashpoint/checkpoint_object_manager/buffer_object/buffer_helper.cpp
    • Implemented resize_mmap function to unmap, ftruncate, and remap memory-mapped files.
    • Added error handling and logging for resize_mmap.
  • src/ml_flashpoint/checkpoint_object_manager/buffer_object/buffer_helper.h
    • Declared resize_mmap function for memory-mapped file resizing.
  • src/ml_flashpoint/checkpoint_object_manager/buffer_object/buffer_object.cpp
    • Implemented BufferObject::resize method with checks for closed, read-only, or zero-capacity buffers.
    • Integrated resize_mmap to perform the actual file resizing.
  • src/ml_flashpoint/checkpoint_object_manager/buffer_object/buffer_object.h
    • Declared resize method within the BufferObject class.
  • tests/checkpoint_object_manager/buffer_object/buffer_helper_test.cpp
    • Added ResizeMmapSucceedsToLargerSize test.
    • Added ResizeMmapSucceedsToSmallerSize test.
    • Added ResizeMmapFailsOnInvalidFd test.
    • Added ResizeMmapFailsOnFtruncateFailure test.
  • tests/checkpoint_object_manager/buffer_object/buffer_object_test.cpp
    • Added ResizeSucceeds test.
    • Added ResizePreservesData test.
    • Added ResizeFailsOnClosedBuffer test.
    • Added ResizeFailsOnReadOnlyBuffer test.
    • Added ResizeFailsOnZeroCapacity test.
  • tests/checkpoint_object_manager/test_buffer_io.py
    • Added TestResizeOperations class with tests for resize_increases_capacity, resize_updates_memoryview, resize_fails_on_closed_buffer, resize_fails_on_readonly_buffer, and resize_fails_on_zero_capacity.
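
The buffer_io.py entries mention releasing and recreating the memoryview around a resize, plus validity checks. A minimal sketch of that pattern follows, assuming a toy wrapper over Python's mmap module. The class ResizableBuffer, its attributes, and the exception types are hypothetical and are not taken from the project's BufferIO.

```python
import mmap
import os
import tempfile


class ResizableBuffer:
    """Hypothetical stand-in for a BufferIO-like wrapper (not the project's class)."""

    def __init__(self, path: str, capacity: int, writable: bool = True) -> None:
        self._writable = writable  # flag only; this sketch always maps read-write
        self._fd = os.open(path, os.O_RDWR | os.O_CREAT)
        os.ftruncate(self._fd, capacity)
        self._map = mmap.mmap(self._fd, capacity)
        self.view = memoryview(self._map)

    def resize(self, new_capacity: int) -> None:
        # Validate first so a rejected call leaves the buffer untouched.
        if self.view is None:
            raise ValueError("buffer is closed")
        if not self._writable:
            raise PermissionError("buffer is not opened for writing")
        if new_capacity <= 0:
            raise ValueError("new capacity must be positive")
        # mmap.close() raises BufferError while a memoryview is still exported,
        # so the view is released before the mapping is closed.
        self.view.release()
        self._map.close()
        os.ftruncate(self._fd, new_capacity)
        self._map = mmap.mmap(self._fd, new_capacity)
        self.view = memoryview(self._map)

    def close(self) -> None:
        if self.view is not None:
            self.view.release()
            self._map.close()
            os.close(self._fd)
            self.view = None


with tempfile.TemporaryDirectory() as tmp:
    buf = ResizableBuffer(os.path.join(tmp, "buf.bin"), 1024)
    buf.view[:3] = b"abc"
    buf.resize(2048)
    size_after = len(buf.view)
    head_after = bytes(buf.view[:3])
    try:
        buf.resize(0)
        rejected = False
    except ValueError:
        rejected = True
    size_after_reject = len(buf.view)  # unchanged by the rejected call
    buf.close()
```

The sketch does not recover from a failure between releasing the view and remapping; a real implementation would need to decide what state the buffer is left in if the truncate or remap step fails.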

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces a resize method to the BufferIO class and the underlying C++ BufferObject. The implementation is generally sound, correctly managing memory-mapped regions and metadata. I have identified a few areas for improvement: specifically, validating the new size upfront in BufferIO.resize to ensure atomicity and adhering to the repository's testing style guide by using structured comments (Given/When/Then) in the new test cases.
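
For readers unfamiliar with the Given/When/Then comment style the review refers to, here is a small sketch using unittest. The function validate_new_size is a hypothetical stand-in for an upfront size check, and the repository's actual test framework and style guide wording may differ.

```python
import unittest


def validate_new_size(new_size: int) -> int:
    """Hypothetical upfront check, run before any state is modified."""
    if new_size <= 0:
        raise ValueError(f"new size must be positive, got {new_size}")
    return new_size


class TestValidateNewSize(unittest.TestCase):
    def test_rejects_zero_capacity(self):
        # Given
        new_size = 0

        # When / Then
        with self.assertRaises(ValueError):
            validate_new_size(new_size)

    def test_accepts_positive_capacity(self):
        # Given
        new_size = 4096

        # When
        result = validate_new_size(new_size)

        # Then
        self.assertEqual(result, 4096)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestValidateNewSize)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running the check before any mapping or view is touched is what makes a rejected resize leave the buffer in its previous state.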

out_data_ptr = ptr;
out_data_size = new_size;

LOG(INFO) << "Successfully resized mmap to " << new_size;
Collaborator

make this debug

Collaborator Author

there is no LOG(DEBUG) for this library

Collaborator

ah true, can we use VLOG(DEBUG_LEVEL) then? that's the recommendation. can set constexpr int DEBUG_LEVEL = 4.

Collaborator Author

but other methods use LOG(INFO) for similar logs. maybe we can keep it for now and change the logging for all of them in a future change

@g-husam g-husam changed the title feat: Add resize method to BufferIO feat(buffer_object): Add resize method to BufferIO Feb 25, 2026
@github-actions

Python Code Coverage Summary

Code Coverage

Package                                       Line Rate           Branch Rate
src.ml_flashpoint                             100%                100%
src.ml_flashpoint.adapter                     100%                100%
src.ml_flashpoint.adapter.megatron            97%                 94%
src.ml_flashpoint.adapter.nemo                98%                 94%
src.ml_flashpoint.adapter.pytorch             99%                 88%
src.ml_flashpoint.checkpoint_object_manager   92%                 91%
src.ml_flashpoint.core                        96%                 92%
src.ml_flashpoint.replication                 81%                 81%
Summary                                       95% (2058 / 2170)   91% (473 / 520)

Minimum allowed line rate is 90%

@github-actions

C++ Code Coverage Summary

Code Coverage

Package                                                      Line Rate          Branch Rate
src.ml_flashpoint.checkpoint_object_manager.buffer_object    93%                54%
src.ml_flashpoint.checkpoint_object_manager.object_manager   70%                37%
src.ml_flashpoint.replication.transfer_service               79%                40%
Summary                                                      81% (916 / 1126)   43% (687 / 1604)

Minimum allowed line rate is 80%

@Leahlijuan Leahlijuan merged commit c45592d into main Feb 25, 2026
5 checks passed
@Leahlijuan Leahlijuan deleted the feat/bufferresize branch February 25, 2026 16:29
Development

Successfully merging this pull request may close these issues.

Reuse mmap buffers when saving checkpoint objects

2 participants