fix: flush MPS cache in alora test GPU cleanup (#790) #800

Merged

planetf1 merged 2 commits into generative-computing:main from planetf1:fix/mps-cache-cleanup-790 on Apr 10, 2026

Conversation

@planetf1
Contributor

@planetf1 planetf1 commented Apr 9, 2026

Flush MPS cache in alora test GPU cleanup

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

PR #765 added CUDA GPU cleanup to the alora integration tests, but the
MPS equivalent was missing. On Apple Silicon with MPS-capable PyTorch,
GPU memory isn't reclaimed between tests.

Changes in this PR:

  • Added torch.mps.empty_cache() after the CUDA cleanup blocks in both
    test_alora_training_integration and test_lora_training_integration
  • Updated cleanup comment to say "device caches" instead of "CUDA cache"

Mirrors the existing pattern in test/conftest.py:373-374.
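
For reference, the cleanup shape being added looks roughly like the sketch below (a minimal illustration, not the exact code in the test file; variable names, the double `gc.collect()`, and the surrounding teardown differ per test):

```python
import gc

import torch

# Drop Python references first so the backend allocators can actually free memory;
# a second pass helps collect reference cycles that still hold GPU tensors.
gc.collect()
gc.collect()

# Flush the CUDA cache when a CUDA device is present.
if torch.cuda.is_available():
    torch.cuda.empty_cache()

# Added in this PR: flush the MPS cache on Apple Silicon so GPU memory is
# reclaimed between tests (mirrors the pattern in test/conftest.py).
if torch.backends.mps.is_available():
    torch.mps.empty_cache()
```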

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

@github-actions github-actions Bot added the bug (Something isn't working) label Apr 9, 2026
@github-actions
Contributor

github-actions Bot commented Apr 9, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@planetf1 planetf1 marked this pull request as ready for review April 9, 2026 13:58
@planetf1 planetf1 requested a review from a team as a code owner April 9, 2026 13:58
Contributor

@jakelorocco jakelorocco left a comment

Should we extract this into a function and then use it everywhere we currently apply this pattern? For instance, we do a very similar garbage-collection dance in pytest_runtest_setup in test/conftest.py without the MPS cleanup, but other cleanup functions in that same file do the MPS-style cleanup as well.

Or, are there reasons we might not want to do MPS cleanup every time it's available?

Add torch.mps.empty_cache() after CUDA cleanup blocks in both
alora integration tests, matching the existing conftest pattern.
Prevents MPS memory from accumulating between tests on Apple Silicon.

Consolidate the duplicated gc.collect + CUDA/MPS cache flush pattern
into a single flush_device_caches() function in test/conftest.py.

- Replaces 4 inline flush sites with a single call
- Adds MPS support to sites that previously only handled CUDA
  (pytest_runtest_setup backend transitions, memory_cleaner fixture)
- Fixes a bug where gc.collect() was conditional on CUDA availability
  in pytest_runtest_setup (now runs unconditionally)
- Adds torch.mps.synchronize() for parity with CUDA synchronize()
- Enriches cleanup_gpu_backend() VRAM logging: device-aware reporting
  for both CUDA (free/total/allocated/reserved/fragmentation) and MPS
  (allocated/max), with reclaimed bytes on both paths
- Removes unused shutil/sys imports from test_alora_train_integration
@planetf1 planetf1 force-pushed the fix/mps-cache-cleanup-790 branch from a3e95ae to 29922a3 on April 10, 2026 06:37
@planetf1
Contributor Author

planetf1 commented Apr 10, 2026

@jakelorocco Done - I think the cleanup is safe everywhere. I extracted a flush_device_caches() helper and replaced all 5 call sites with it, adding MPS support to the two places that were CUDA-only. cleanup_gpu_backend's VRAM logging is now device-aware for both CUDA and MPS, though the data each backend exposes differs (and is roughly the inverse). Kept it simple.

| Site | Location | Before | After |
|---|---|---|---|
| 1 | conftest.py cleanup_gpu_backend() | Inline gc.collect() x2 + CUDA flush + MPS empty_cache() only; VRAM logging CUDA-only (free/total) | Calls flush_device_caches(); VRAM logging device-aware: CUDA reports free/total/allocated/reserved/fragmentation, MPS reports allocated/max; both report reclaimed bytes |
| 2 | conftest.py pytest_runtest_setup() | gc.collect() inside if torch.cuda.is_available(); no GC or flush on MPS-only machines | Calls flush_device_caches(); GC runs unconditionally, flushes both CUDA and MPS |
| 3 | conftest.py memory_cleaner() | gc.collect() x2 + CUDA flush only; no MPS support | Calls flush_device_caches(); adds MPS flush |
| 4 | test_alora_train_integration.py test_alora_training_integration() | Inline gc.collect() x2 + CUDA flush + MPS empty_cache(); model-specific cleanup (accelerate hooks, del adapter, move to CPU) | Model-specific cleanup unchanged; inline flush replaced by flush_device_caches() (adds mps.synchronize()) |
| 5 | test_alora_train_integration.py test_lora_training_integration() | Inline gc.collect() x2 + CUDA flush + MPS empty_cache() | Calls flush_device_caches() (adds mps.synchronize()) |
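
As a rough sketch of what that helper could look like (assuming only the torch calls named in the commit message; the device-aware VRAM logging described above is omitted here, and the real flush_device_caches() in test/conftest.py may differ in detail):

```python
import gc

import torch


def flush_device_caches() -> None:
    """Drop dead references and flush accelerator caches between tests.

    Sketch only; the actual helper lives in test/conftest.py.
    """
    # Run GC unconditionally so MPS-only machines also release stale tensors.
    gc.collect()

    if torch.cuda.is_available():
        # Wait for in-flight kernels, then return cached blocks to the driver.
        torch.cuda.synchronize()
        torch.cuda.empty_cache()

    if torch.backends.mps.is_available():
        # synchronize() mirrors the CUDA path; empty_cache() releases MPS memory.
        torch.mps.synchronize()
        torch.mps.empty_cache()
```

Each of the five sites in the table then reduces to a single flush_device_caches() call after any model-specific teardown (removing accelerate hooks, deleting adapters, moving models to CPU).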

@planetf1 planetf1 enabled auto-merge April 10, 2026 06:41
@planetf1 planetf1 requested a review from jakelorocco April 10, 2026 06:41
@planetf1
Contributor Author

Have also run this on LSF as well as locally (macOS). Just one qualitative failure, in test/backends/test_ollama.py: one of the 4 raw prompts ("What is 2+2?") returned an empty string, while the other 3 prompts got valid responses. So it's an LLM quality failure rather than a code issue; happy with the test run.

@planetf1 planetf1 added this pull request to the merge queue Apr 10, 2026
Merged via the queue into generative-computing:main with commit 839eead Apr 10, 2026
4 checks passed
@planetf1 planetf1 deleted the fix/mps-cache-cleanup-790 branch April 10, 2026 13:05

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

fix: MPS cache not flushed in alora test GPU cleanup

2 participants