feat: further vram optimizations by avinash2692 · Pull Request #765 · generative-computing/mellea

avinash2692 · 2026-03-30T00:39:54Z

Further optimizations to the GPU VRAM when running tests.

Type of PR

Bug Fix
New Feature
Documentation
Other

Description

### Fix GPU OOM in test runs: ollama lifecycle ownership + VRAM timing

- run_tests_with_ollama.sh starts and warms up all three ollama models unconditionally, even when an external process has already started the server. This fragments VRAM before the HF tests run and triggers OOM.
- Add OLLAMA_EXTERNAL and OLLAMA_SKIP_WARMUP env guards to the shell script so an external orchestrator can own the lifecycle (both default to 0, so standalone use is unaffected).
- conftest.py handles VRAM timing: it loads models right before the ollama test group and evicts them afterwards via keep_alive.
- Fix the hardcoded port in _check_ollama_available() so that it respects the OLLAMA_HOST/OLLAMA_PORT env vars.
- Add extra GPU cleanup for all tests in test_alora_train_integration.py.
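
To illustrate the port fix and the keep_alive-based load/evict timing described above, here is a minimal sketch. It is not the code from this PR. The helper names (`resolve_ollama_base_url`, `keep_alive_payload`, `set_model_residency`) are hypothetical, and the precedence of OLLAMA_PORT over a port embedded in OLLAMA_HOST is an assumption. The sketch relies only on Ollama's documented behaviour: the server listens on port 11434 by default, and a request to `/api/generate` with `keep_alive` set to 0 unloads the model.

```python
import json
import os
import urllib.request
from urllib.parse import urlsplit

DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 11434  # Ollama's default listening port


def resolve_ollama_base_url(env=None):
    """Build the server URL from OLLAMA_HOST / OLLAMA_PORT (hypothetical helper).

    Assumption: OLLAMA_PORT, when set, overrides any port given in OLLAMA_HOST.
    """
    env = os.environ if env is None else env
    raw_host = env.get("OLLAMA_HOST", "").strip()
    port_override = env.get("OLLAMA_PORT", "").strip()

    scheme, host, port = "http", DEFAULT_HOST, DEFAULT_PORT
    if raw_host:
        # Accept "host", "host:port" and "http://host:port" forms.
        parts = urlsplit(raw_host if "://" in raw_host else f"http://{raw_host}")
        scheme = parts.scheme or "http"
        host = parts.hostname or DEFAULT_HOST
        port = parts.port or DEFAULT_PORT
    if port_override:
        port = int(port_override)
    return f"{scheme}://{host}:{port}"


def keep_alive_payload(model, keep_alive):
    """Request body with no prompt: it only changes how long the model stays loaded."""
    return {"model": model, "keep_alive": keep_alive}


def set_model_residency(base_url, model, keep_alive, timeout=60.0):
    """POST to /api/generate; keep_alive=0 evicts, a duration such as "10m" loads."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(keep_alive_payload(model, keep_alive)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

A session-scoped pytest fixture could call `set_model_residency(url, name, "10m")` just before the ollama test group and `set_model_residency(url, name, 0)` afterwards, so the VRAM is free again before the HF tests run. The model names and the exact fixture wiring are left out because the PR description does not state them.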

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

github-actions · 2026-03-30T00:40:05Z

The PR description has been updated. Please fill out the template for your PR to be reviewed.

avinash2692 added 2 commits March 27, 2026 14:24

adding some extra vram cleanup to make end to end tests smoother

d393603

Merge remote-tracking branch 'origin/main' into feat/further-vram-opt…

9c87d95

…imizations

avinash2692 requested a review from a team as a code owner March 30, 2026 00:39

github-actions bot added the enhancement label (New feature or request) Mar 30, 2026

jakelorocco approved these changes Mar 30, 2026

avinash2692 added this pull request to the merge queue Mar 30, 2026

Merged via the queue into main with commit 243a161 Mar 30, 2026
8 checks passed

This was referenced Mar 31, 2026

fix: run_tests_with_ollama.sh proceeds silently when Ollama warmup times out #759

Open

test: agent skills infrastructure and marker taxonomy audit (#727, #728) #742

Open

avinash2692 merged 2 commits into main from feat/further-vram-optimizations
