fix(BA-3308): Support multi-GPU fractional allocation in anti-fragmentation guard #10477
seedspirit wants to merge 7 commits into main
Pull request overview
This PR updates the agent’s fractional allocation anti-fragmentation guard to allow multi-device fractional GPU allocations (BA-3308 / issue #275), and adds unit tests intended to validate the revised behavior.
Changes:
- Revise `FractionAllocMap.ensure_slot_not_fragmented()` to evaluate multi-device feasibility using per-device “density” and quantum rounding.
- Add an extensive unit-test matrix for density quantization and expected device usage under FILL/EVENLY strategies (plus occupied-device scenarios and edge cases).
- Add a changelog entry describing the fix.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/ai/backend/agent/alloc_map.py` | Reworks the anti-fragmentation guard to support multi-device fractional allocations. |
| `tests/unit/agent/test_alloc_map.py` | Adds new test suites/cases covering defrag density math, strategies, occupied devices, and edge cases. |
| `changes/10477.fix.md` | Documents the bugfix in the changelog. |
Review notes (blockers):
- The new guard explicitly assumes homogeneous per-device capacity, but the codebase can produce heterogeneous `DeviceSlotInfo.amount` values (e.g., mock accelerator’s `_get_share_raw()` varies by device). This can cause false rejections of otherwise feasible allocations.
- The guard’s “remainder/quantum” reasoning is described as matching `distribute_evenly`, but `distribute_evenly` operates in `self.digits` (0.01) while final rounding uses `quantum_size` (often 0.1 for CUDA shares). This mismatch can allow allocations that later get truncated by `round_down(..., quantum_size)` such that the returned allocation sum no longer equals the requested amount; the newly added tests currently don’t assert “sum allocated == requested” for the parametrized FILL/EVENLY cases.
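
The second blocker can be illustrated with a small standalone sketch. It does not use the project's code: `round_down` and `split_evenly` below are hypothetical stand-ins written only for this example, and the step sizes (0.01 and 0.1) are the values quoted in the note above. It shows how a split that sums correctly at 0.01 granularity can lose part of the request once each share is truncated to a 0.1 quantum.

```python
from decimal import Decimal, ROUND_DOWN


def round_down(value: Decimal, quantum: Decimal) -> Decimal:
    """Truncate value to a multiple of quantum (hypothetical stand-in)."""
    return (value / quantum).to_integral_value(rounding=ROUND_DOWN) * quantum


def split_evenly(total: Decimal, n_devices: int, digits: Decimal) -> list[Decimal]:
    """Split total across devices in steps of `digits` (hypothetical stand-in).

    Leftover steps are handed to the first devices so the shares sum to total.
    """
    base = round_down(total / n_devices, digits)
    leftover_steps = int((total - base * n_devices) / digits)
    return [base + digits if i < leftover_steps else base for i in range(n_devices)]


requested = Decimal("0.5")

# Step 1: distribute at the finer granularity (0.01).
shares = split_evenly(requested, 3, Decimal("0.01"))

# Step 2: truncate each share to the coarser quantum (0.1).
final = [round_down(share, Decimal("0.1")) for share in shares]

print("shares at 0.01:", shares, "sum =", sum(shares))
print("after 0.1 truncation:", final, "sum =", sum(final))
```

A test along the lines of `assert sum(allocated) == requested` for each parametrized FILL/EVENLY case would catch this kind of loss, which is the assertion the note says is currently missing.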
- Replace indirect fixture pattern with explicit device_remaining list
- Split tests into separate FILL and EVENLY strategy methods
- Add strategy parameter to _make_map_with_remaining helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This PR does not address the original issue. (Please also update the PR description; it is not related to #275.)
It only changes the `ensure_slot_not_fragmented()` function, which checks whether a given agent has enough resources without fragmentation.
We have to update the `_allocate_by_filling` / `_allocate_evenly` allocator functions.
resolves #275 (BA-3308)