Problem

`test_hallucination_detection` in `test/stdlib/components/intrinsic/test_rag.py:152` has two issues:

1. Missing `@pytest.mark.qualitative`: every other LLM-output-quality test in the file is marked qualitative, but this one isn't, so it runs in fast test loops where it shouldn't.
2. Tolerance too tight: the test asserts `pytest.approx(r, abs=3e-2)` on a generative model score. Observed drift of 0.036 causes spurious failures (reported in "fix: evict Ollama models between test modules to prevent memory starvation", #804).

Fix

- Add the `@pytest.mark.qualitative` decorator
- Widen the tolerance from `abs=3e-2` to `abs=5e-2`

Related
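
For illustration, here is a minimal sketch of what the two changes look like together. It is not the actual test body: `EXPECTED_SCORE`, `within_tolerance`, and `test_hallucination_detection_sketch` are hypothetical names, and the expected score of 0.50 is an assumed value. Only the 0.036 drift and the two tolerances come from this issue.

```python
import pytest

# Assumed reference score, for illustration only (not from the real test).
EXPECTED_SCORE = 0.50
# Drift reported in this issue.
OBSERVED_DRIFT = 0.036


def within_tolerance(actual: float, expected: float, abs_tol: float) -> bool:
    """Return True if `actual` matches `expected` under pytest.approx with an absolute tolerance."""
    return actual == pytest.approx(expected, abs=abs_tol)


@pytest.mark.qualitative  # project-specific marker named in this issue
def test_hallucination_detection_sketch():
    # Stand-in for a generative model score that has drifted by 0.036.
    score = EXPECTED_SCORE + OBSERVED_DRIFT
    # The widened tolerance (5e-2) absorbs the drift; 3e-2 would not.
    assert score == pytest.approx(EXPECTED_SCORE, abs=5e-2)
```

With these assumed values, `within_tolerance(0.536, 0.50, 3e-2)` is false while `within_tolerance(0.536, 0.50, 5e-2)` is true, which matches the spurious failure described above. When `abs` is given without `rel`, `pytest.approx` applies only the absolute tolerance.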