Skip to content

[hotfix] Make Ollama embedding test actually run in CI#686

Merged
wenjin272 merged 4 commits into
apache:mainfrom
weiqingy:hotfix-embedding-test-open
May 19, 2026
Merged

[hotfix] Make Ollama embedding test actually run in CI#686
wenjin272 merged 4 commits into
apache:mainfrom
weiqingy:hotfix-embedding-test-open

Conversation

@weiqingy
Copy link
Copy Markdown
Collaborator

@weiqingy weiqingy commented May 17, 2026

Purpose of change

test_ollama_embedding_setup was effectively dead in CI — silently skipped by a broken preload script, and would have failed on _get_connection() if it ever ran. This PR makes it actually run and pass on the Python 3.10 / Ubuntu UT job.

Fix 1 — ab62944 — Add setup.open() calls in the embedding setup tests.

test_ollama_embedding_setup and test_openai_embedding_model both construct a BaseEmbeddingModelSetup subclass with connection set to a resource-name string, then call setup.embed(...) directly. BaseEmbeddingModelSetup.open() is what resolves self.connection from the descriptor-supplied resource name into the actual BaseEmbeddingModelConnection instance; without it, _get_connection() raises TypeError("Expect BaseEmbeddingModelConnection, but is str"). test_tongyi_embedding_model already follows the correct pattern. (The OpenAI test is gated by TEST_API_KEY so the latent bug was masked, but the fix applies identically.)

Fix 2 — 3ade7fa — Drop the failing ollama run line from start_ollama_server.sh.

The CI preload script ran ollama run all-minilm:22m, which exits non-zero against an embedding model with Error: embedding models require input text. test_ollama_embedding_model.py's module-level setup uses subprocess.run(..., check=True), so the failure fell into the except block and set client = None, which made @pytest.mark.skipif(client is None, ...) silently skip the test in every CI run. The preceding ollama pull all-minilm:22m is sufficient to make the model discoverable via client.list(), which is what the gate actually checks; the parallel chat-model start_ollama_server.sh already follows that minimal pattern.

Fix 3 — acfd146 — Move OLLAMA_EMBEDDING_MODEL out of module scope in two e2e tests.
After Fix 2 the test was still being skipped. Root cause was module-level os.environ["OLLAMA_EMBEDDING_MODEL"] = OLLAMA_MODEL in e2e_tests_resource_cross_language/embedding_model_cross_language_test.py:43-44 and vector_store_cross_language_test.py:43-44. Because tools/ut.sh runs pytest flink_agents -k "not e2e_tests" and -k only filters which tests execute (not which files are collected), pytest imported the e2e modules during collection and executed the assignment as a side effect — polluting OLLAMA_EMBEDDING_MODEL to "nomic-embed-text:latest" before test_ollama_embedding_model.py:34 read it with its "all-minilm:22m" default. Fix: removed the module-scope assignment; moved the assignment into the test function body via monkeypatch.setenv, which preserves runtime propagation to the Flink subprocess while auto-restoring after each test.

Tests

  • cd python && uv run pytest flink_agents/integrations/embedding_models/local/tests/test_ollama_embedding_model.py -v — passes locally when an Ollama daemon is reachable (previously raised TypeError).
  • cd python && uv run pytest flink_agents/integrations/embedding_models/tests/test_openai_embedding_model.py -v — gated by TEST_API_KEY; latent bug fixed.
  • ./tools/lint.sh -c clean.
  • Next CI run on this PR will exercise the embedding test for real on Python 3.10 / Ubuntu.

API

No API changes — test/CI fixes only.

Documentation

  • doc-needed
  • doc-not-needed
  • doc-included

`test_ollama_embedding_setup` and `test_openai_embedding_model` both
construct a `BaseEmbeddingModelSetup` subclass with `connection` set to
a resource-name string, then call `setup.embed(...)` directly without
first invoking `setup.open()`.

`BaseEmbeddingModelSetup.open()` is what resolves `self.connection` from
the descriptor-supplied resource name into the actual
`BaseEmbeddingModelConnection` instance. Without it,
`_get_connection()` raises `TypeError("Expect BaseEmbeddingModelConnection,
but is str")` and the test fails. The Ollama test exhibits this failure
in CI on Python 3.10 once the ollama daemon is brought up by
`start_ollama_server.sh`; the OpenAI test is masked by the
`TEST_API_KEY` skipif but would fail identically if the gate were ever
satisfied. `test_tongyi_embedding_model` (the third embedding test in
this repo) already follows the correct pattern.

Add the missing `setup.open()` call to both tests so the connection is
resolved before `embed()`.
@github-actions github-actions Bot added doc-not-needed Your PR changes do not impact docs fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue. labels May 17, 2026
@wenjin272
Copy link
Copy Markdown
Collaborator

Hi, @weiqingy. Good Catch, the fix looks good to me.

In theory, test_ollama_embedding_model.py should run in the UT CI environment with Python 3.10 and Ubuntu, but it appears to have been skipped in the CI. I investigated this issue and found that it seems to be caused by the command ollama run all-minilm:22m in start_ollama_server.sh, which triggers the following error:
Error: embedding models require input text. Usage: ollama run all-minilm:22m "your text here"

I think we may need to fix this issue together.

Per @wenjin272's review on apache#686:

`ollama run all-minilm:22m` cannot run non-interactively against an
embedding model — `ollama` reports

  Error: embedding models require input text.
  Usage: ollama run all-minilm:22m "your text here"

and exits non-zero. That bubbles up through `subprocess.run(check=True)`
in `test_ollama_embedding_model.py`'s module-level setup, falls into
the `except` block, and silently sets `client = None`, which makes the
embedding test skip in CI on Python 3.10 — defeating the purpose of
ever calling the script.

`ollama pull` alone is sufficient to make the model discoverable via
the daemon's `client.list()`; the chat-model sibling script
(`integrations/chat_models/tests/start_ollama_server.sh`) already
follows that minimal pattern. Drop the redundant `ollama run` line so
the gate succeeds and the test actually executes in CI.
@weiqingy
Copy link
Copy Markdown
Collaborator Author

weiqingy commented May 17, 2026

@wenjin272 Thanks for catching this! Confirmed by inspecting both scripts side-by-side: the chat-model sibling at python/flink_agents/integrations/chat_models/tests/start_ollama_server.sh only does ollama pull and nothing else, while the embedding version has the redundant ollama run all-minilm:22m line. As you diagnosed, ollama run against an embedding model exits non-zero with Error: embedding models require input text, and subprocess.run(..., check=True) in test_ollama_embedding_model.py:37–55 swallows that into the except block, leaving client = None and silently skipping the test — defeating the gate's purpose.

Dropped the line in 3ade7fa. The preceding ollama pull all-minilm:22m is sufficient to make the model discoverable via client.list(), which is what the gate actually checks. The next CI run should exercise test_ollama_embedding_setup on the Python 3.10 / Ubuntu UT job for real.

@weiqingy weiqingy changed the title [hotfix] Call open() in embedding setup tests before embed() [hotfix] Make Ollama embedding test actually run in CI May 17, 2026
@github-actions github-actions Bot added doc-not-needed Your PR changes do not impact docs and removed doc-not-needed Your PR changes do not impact docs labels May 17, 2026
@wenjin272
Copy link
Copy Markdown
Collaborator

Hi, @weiqingy. I noticed that the latest CI run still skipped test_ollama_embedding_setup. I tested this in my own repository's CI and found that the value of test_model was not the expected all-minilm:22m.
image

The root cause of this issue is environment variable leakage, and I have provided a detailed explanation below. I noticed that the title of this PR has been updated to “Make Ollama embedding test actually run in CI,” so I believe it is appropriate to address and fix this issue within the same PR.

Alternatively, we can fix the issues with the Open and Ollama scripts in this PR alone, and submit a separate PR to address all environment variable leakage issues, including those related to the embedding model, chat model, and other environment variables.

WDYT?

Root Cause: OLLAMA_EMBEDDING_MODEL env var leaked from e2e test modules

In CI (ut-python), test_model in test_ollama_embedding_model.py resolves to nomic-embed-text:latest even though its hard-coded default is all-minilm:22m:

# python/flink_agents/integrations/embedding_models/local/tests/test_ollama_embedding_model.py:34
test_model = os.environ.get("OLLAMA_EMBEDDING_MODEL", "all-minilm:22m")

The value comes from two e2e test modules that mutate os.environ at module import time (not inside a fixture or test function):

  • python/flink_agents/e2e_tests/e2e_tests_resource_cross_language/embedding_model_cross_language_test.py:43-44
  • python/flink_agents/e2e_tests/e2e_tests_resource_cross_language/vector_store_cross_language_test.py:43-44
OLLAMA_MODEL = os.environ.get("OLLAMA_EMBEDDING_MODEL", "nomic-embed-text:latest")
os.environ["OLLAMA_EMBEDDING_MODEL"] = OLLAMA_MODEL   # ← runs at import time

Why this leaks into ut-python

tools/ut.sh -p runs:

pytest flink_agents -k "not e2e_tests" ...

-k only filters which tests are executed/selected — pytest still imports every matching test file during the collection phase, including the ones under
e2e_tests/. Collection happens in alphabetical order, so e2e_tests/ is imported before integrations/:

  1. pytest collects e2e_tests_resource_cross_language/*_test.py → importing them executes os.environ["OLLAMA_EMBEDDING_MODEL"] = "nomic-embed-text:latest" as a side
    effect.
  2. pytest then collects integrations/embedding_models/local/tests/test_ollama_embedding_model.py → its module-level os.environ.get(...) now reads the polluted value
    instead of falling back to all-minilm:22m.

Net effect: the -k "not e2e_tests" filter prevents the e2e tests from running, but does not prevent their import-time side effects from leaking into the rest of the
test process.

Fix direction

Move the os.environ[...] = ... assignments out of module scope in both e2e files — into a fixture, the test function body, or use monkeypatch.setenv — so importing the
modules has no global side effects.

weiqingy added 2 commits May 18, 2026 22:06
The e2e modules in e2e_tests_resource_cross_language assigned
os.environ["OLLAMA_EMBEDDING_MODEL"] = OLLAMA_MODEL at module scope. Because
pytest -k "not e2e_tests" only filters which tests run (not which files are
collected/imported), pytest collection imported these modules and executed
the assignment as a side effect, polluting OLLAMA_EMBEDDING_MODEL for the
later-collected
integrations/embedding_models/local/tests/test_ollama_embedding_model.py.
Net effect: test_ollama_embedding_setup read the polluted
"nomic-embed-text:latest" value instead of falling back to the hard-coded
"all-minilm:22m" default, so the model_found gate failed and the test was
silently skipped in the ut-python CI job.

Removed the module-scope assignment and moved it into the test functions
via monkeypatch.setenv, which preserves the runtime propagation behavior
(Flink subprocess still sees the env var) and auto-restores after the
test.

Reported and root-caused by @wenjin272 in
apache#686 (comment).
Resolve conflict in vector_store_cross_language_test.py: main (apache#663,
Milvus integration) refactored the test into _run_vector_store_integration
helper with two backend-specific test functions (ELASTICSEARCH, MILVUS).
Reapply our Round 2 env-var fix on top of that structure: keep the
module-scope os.environ["OLLAMA_EMBEDDING_MODEL"] = OLLAMA_MODEL removal,
and add monkeypatch.setenv("OLLAMA_EMBEDDING_MODEL", OLLAMA_MODEL) inside
the helper so both backends get the env var hygiene.
@weiqingy
Copy link
Copy Markdown
Collaborator Author

weiqingy commented May 19, 2026

Thanks for the root-cause analysis, @wenjin272! Went with the in-this-PR narrow fix: moved OLLAMA_EMBEDDING_MODEL out of module scope in both cited files via monkeypatch.setenv inside the test functions (commit acfd146), then merged main to resolve the conflict with #663's Milvus refactor. The same module-level pattern exists in three more e2e files (chat-model side); left out of this PR to keep scope tight — create a separate cleanup PR for those: #689

@github-actions github-actions Bot added doc-not-needed Your PR changes do not impact docs and removed doc-not-needed Your PR changes do not impact docs labels May 19, 2026
Copy link
Copy Markdown
Collaborator

@wenjin272 wenjin272 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wenjin272 wenjin272 merged commit 989a92e into apache:main May 19, 2026
23 of 24 checks passed
@weiqingy weiqingy deleted the hotfix-embedding-test-open branch May 19, 2026 23:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants