fix: issues with tests (alora example, rag intrinsics, mistral tool use, vllm auto-skip)#570

Merged
jakelorocco merged 3 commits into main from jal/test-fixes on Mar 4, 2026
Conversation

@jakelorocco
Contributor

@jakelorocco jakelorocco commented Mar 3, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

A few issues with tests that I stumbled on. These were errors in our test code, not in the mellea code the tests exercise.

  • test/backends/test_huggingface_tools.py - we are using a Mistral model that requires the sentencepiece package to be installed -> fixed in pyproject.toml

  • test/stdlib/components/intrinsic/test_rag.py - changes to the adapters for citations / hallucination detection resulted in slightly different values -> fixed the expected data

  • docs/examples/aLora/102_example.py - expected input -> fixed by skipping this example and unskipping 101_example.py, which tests the same functionality

  • test/backends/test_openai_vllm.py - exceptions raised during vllm setup were causing the tests to error out instead of being skipped -> fixed so setup failures now skip the tests
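
The vllm auto-skip change can be sketched roughly as follows. This is an illustrative pattern only: the helper and fixture names (`create_vllm_backend`, `backend_or_skip`, `vllm_backend`) are hypothetical, not the actual code in test_openai_vllm.py.

```python
import pytest


def create_vllm_backend():
    # Stand-in for the real setup helper (hypothetical name); the real
    # setup can raise when no vllm server/GPU is available.
    raise ConnectionError("vllm server not reachable")


def backend_or_skip():
    """Return a backend, or skip the calling test when setup fails."""
    try:
        return create_vllm_backend()
    except Exception as exc:
        # Skip instead of letting the exception propagate as an error:
        # a setup failure means the environment lacks vllm, not that
        # the code under test is broken.
        pytest.skip(f"skipping vllm tests; setup failed: {exc}")


@pytest.fixture(scope="module")
def vllm_backend():
    yield backend_or_skip()
```

With this pattern, pytest reports the vllm tests as skipped (with the setup error in the reason) rather than as errors.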

Tests pass.

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

@jakelorocco jakelorocco requested a review from a team as a code owner March 3, 2026 17:30
@github-actions
Contributor

github-actions bot commented Mar 3, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@mergify

mergify bot commented Mar 3, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:
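
As a quick sanity check, the PR title can be tested against that rule's regex (illustrative snippet, not part of the mergify config):

```python
import re

# Pattern copied from the mergify merge-protection rule above.
pattern = r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:"

# This PR's title matches the conventional-commit form.
title = "fix: issues with tests (alora example, rag intrinsics, mistral tool use, vllm auto-skip)"
assert re.match(pattern, title)

# An optional scope in parentheses is also accepted.
assert re.match(pattern, "feat(core): add thing")

# A capitalized or missing type prefix is rejected.
assert re.match(pattern, "Fix stuff") is None
```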

@jakelorocco jakelorocco force-pushed the jal/test-fixes branch 2 times, most recently from 8795893 to 8ed36c8 Compare March 3, 2026 17:41
@jakelorocco jakelorocco requested a review from planetf1 March 3, 2026 17:41
@jakelorocco jakelorocco changed the title fix: issues with tests (alora example, rag intrinsics, mistral tool use) fix: issues with tests (alora example, rag intrinsics, mistral tool use, vllm auto-skip) Mar 3, 2026
@psschwei
Member

psschwei commented Mar 4, 2026

Two of the test_rag tests failed for me:

=================================================================== FAILURES ====================================================================
________________________________________________________________ test_citations _________________________________________________________________

backend = <mellea.backends.huggingface.LocalHFBackend object at 0x351d82480>

    @pytest.mark.qualitative
    def test_citations(backend):
        """Verify that the citations intrinsic functions properly."""
        context, assistant_response, docs = _read_input_json("citations.json")
        expected = _read_output_json("citations.json")

        # First call triggers adapter loading
        result = rag.find_citations(assistant_response, docs, context, backend)
>       assert result == expected
E       assert [{'citation_b...nion. ", ...}] == [{'citation_b...nion. ", ...}]
E
E         At index 0 diff: {'response_begin': 0, 'response_end': 96, 'response_text': 'Murdoch expanded in Australia and New Zealand by acquiring and expanding local newspapers. ', 'citation_doc_id': '0', 'citation_begin'
E
E         ...Full output truncated (2 lines hidden), use '-vv' to show

test/stdlib/components/intrinsic/test_rag.py:130: AssertionError
------------------------------------------------------------- Captured stdout call --------------------------------------------------------------
=== 14:46:06-INFO ======
passing in model options when generating with an adapter; some model options may be overwritten / ignored
------------------------------------------------------------- Captured stderr call --------------------------------------------------------------
Fetching 1 files: 100%|██████████| 1/1 [00:00<00:00, 32263.88it/s]
Fetching 9 files: 100%|██████████| 9/9 [00:00<00:00, 190650.18it/s]
--------------------------------------------------------------- Captured log call ---------------------------------------------------------------
INFO     fancy_logger:huggingface.py:428 passing in model options when generating with an adapter; some model options may be overwritten / ignored
_________________________________________________________ test_hallucination_detection __________________________________________________________

backend = <mellea.backends.huggingface.LocalHFBackend object at 0x351d82480>

    @pytest.mark.qualitative
    def test_hallucination_detection(backend):
        """Verify that the hallucination detection intrinsic functions properly."""
        context, assistant_response, docs = _read_input_json("hallucination_detection.json")
        expected = _read_output_json("hallucination_detection.json")

        # First call triggers adapter loading
        result = rag.flag_hallucinated_content(assistant_response, docs, context, backend)
        # pytest.approx() chokes on lists of records, so we do this complicated dance.
        for r, e in zip(result, expected, strict=True):  # type: ignore
>           assert pytest.approx(r, abs=2e-2) == e
E           AssertionError: assert approx({'resp...he sentence.}) == {'explanation...end': 31, ...}
E
E             comparison failed. Mismatched elements: 1 / 5:
E             Max absolute difference: 5
E             Max relative difference: 0.1388888888888889
E             Index        | Obtained | Expected
E             response_end | 31       | 36 ± 0.02

test/stdlib/components/intrinsic/test_rag.py:164: AssertionError
------------------------------------------------------------- Captured stdout call --------------------------------------------------------------
=== 14:46:15-INFO ======
passing in model options when generating with an adapter; some model options may be overwritten / ignored
------------------------------------------------------------- Captured stderr call --------------------------------------------------------------
Fetching 1 files: 100%|██████████| 1/1 [00:00<00:00, 26886.56it/s]
Fetching 9 files: 100%|██████████| 9/9 [00:00<00:00, 147456.00it/s]
--------------------------------------------------------------- Captured log call ---------------------------------------------------------------
INFO     fancy_logger:huggingface.py:428 passing in model options when generating with an adapter; some model options may be overwritten / ignored

@psschwei
Member

psschwei commented Mar 4, 2026

The other three all passed (or were skipped) successfully

@jakelorocco
Contributor Author

Two of the test_rag tests failed for me:
...

@psschwei, can you please clarify: did these tests fail when running against this branch and with packages updated?

@psschwei
Member

psschwei commented Mar 4, 2026

@psschwei, can you please clarify. Did these tests fail when running against this branch and with packages updated?

Yes, against this branch with a fresh venv (checked out the branch as a new worktree and ran uv sync --all-groups --all-extras in the worktree dir)

@psschwei
Member

psschwei commented Mar 4, 2026

though I think your force push came after I checked out, let me retry

@psschwei
Member

psschwei commented Mar 4, 2026

more failures now (though all seem to be related to modules not found after the granite-common merge)

@jakelorocco
Contributor Author

Updated the commit to fix the pyproject packages; tests now pass for me locally on macOS and on Linux with a clean environment.

Member

@psschwei psschwei left a comment


tests all pass for me now too

@jakelorocco jakelorocco merged commit 4cc75c8 into main Mar 4, 2026
5 checks passed
@jakelorocco jakelorocco deleted the jal/test-fixes branch March 4, 2026 21:25
planetf1 pushed a commit to planetf1/mellea that referenced this pull request Mar 6, 2026
…se, vllm auto-skip) (generative-computing#570)

* fix: issues with tests (alora example, rag intrinsics, mistral tool use)

* fix: uv lock update after pyproject changes
