Fix missing packages, broken entry points, and add packaging CI tests #309

kovtcharov · 2026-02-05T17:29:03Z

Summary

Fixes package registration issues that caused ModuleNotFoundError on non-editable installs, plus CI improvements and bug fixes.

Version bump: v0.15.3.1 → v0.15.3.2

Package Registration Fixes

Add 5 missing packages to setup.py (gaia.vlm, gaia.apps.docker, gaia.apps.jira, gaia.apps.summarize.templates, gaia.eval.fix_code_testbench)
Create missing __init__.py for gaia.mcp, gaia.eval, gaia.talk, gaia.apps.summarize
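One way to catch this class of problem early is to scan the source tree for directories that contain Python files but no `__init__.py`. The sketch below is illustrative only and is not the check shipped in test_packaging.py; the helper name `find_dirs_missing_init` is hypothetical, and it only considers directories that hold `.py` files (a data-only directory such as a templates folder would need a separate check).

```python
from pathlib import Path


def find_dirs_missing_init(package_root: Path) -> list[str]:
    """Return dotted names of subdirectories that hold .py files but no __init__.py."""
    missing = []
    for directory in sorted(p for p in package_root.rglob("*") if p.is_dir()):
        if directory.name == "__pycache__":
            continue
        has_python = any(directory.glob("*.py"))
        if has_python and not (directory / "__init__.py").exists():
            # Build the dotted name relative to the parent of the top-level package.
            relative = directory.relative_to(package_root.parent)
            missing.append(".".join(relative.parts))
    return missing
```

Called on the top-level package directory, an empty result means every subdirectory with Python code is importable as a regular package.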

Entry Point Fixes

Add main() to mcp_bridge.py so gaia-mcp entry point works
Remove broken gaia-mcp-atlassian entry point (file doesn't exist)
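Both entry point problems come down to a `module:function` string that does not resolve at import time. A minimal sketch of how such strings could be validated follows; the helper names are hypothetical, and the exact specs registered in setup.py are not shown in this PR description, so the usage example sticks to standard-library targets.

```python
import importlib


def resolve_entry_point(spec: str):
    """Import 'package.module:callable' and return the callable, or raise."""
    module_name, _, attr = spec.partition(":")
    module = importlib.import_module(module_name)
    target = getattr(module, attr)
    if not callable(target):
        raise TypeError(f"{spec} does not point at a callable")
    return target


def broken_entry_points(specs: dict[str, str]) -> dict[str, str]:
    """Map each console-script name that fails to resolve to its error text."""
    failures = {}
    for name, spec in specs.items():
        try:
            resolve_entry_point(spec)
        except (ImportError, AttributeError, TypeError) as exc:
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

A missing module (the gaia-mcp-atlassian case) surfaces as an `ImportError`, and a module without the named function (the missing `main()` case) surfaces as an `AttributeError`.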

Bug Fixes

Fix gaia init blocking on Lemonade install: Previously, init stopped at step 1/4 if the user declined installation. It now continues to step 2 (health check), which verifies connectivity for remote servers, manual installs, and LEMONADE_BASE_URL setups.

Code Cleanup

Remove dead ChatApp import from agents/__init__.py
Fix unused variable warnings in batch_experiment.py
Fix trailing whitespace and encoding in eval.py
Re-enable 13 pylint checks after fixing all violations

CI Improvements

Fix release notes bug: pypi.yml was overwriting release notes with an empty string because of a race condition with publish_installer.yml. It now checks whether the release exists before creating it.
Add packaging integrity tests: New test_packaging.py with 6 tests to catch missing packages, __init__.py files, and broken entry points
Update lint config: Add pre-existing warning codes to DISABLED_CHECKS (exposed when new __init__.py files made pylint scan gaia.eval and gaia.mcp)
Fix flaky chat SDK memory test: The 'say I don't know' system prompt instruction confused the small 0.6B model, which would answer 'I don't know' even when the answer was in the conversation history. The instruction was changed to 'answer based on conversation history', which works reliably.

Release Notes

Added docs/releases/v0.15.3.2.mdx with full release notes
Updated docs/docs.json navbar to show v0.15.3.2 · Lemonade 9.3.0

Test plan

All 316 unit tests pass locally
New test_packaging.py (6 tests) validates package/entry point integrity
util/lint.py --all passes (5 passed, 2 non-blocking warnings)
CI runs packaging tests before other unit tests
Chat SDK integration tests pass

- Add 5 missing packages to setup.py (gaia.vlm, gaia.apps.docker, gaia.apps.jira, gaia.apps.summarize.templates, gaia.eval.fix_code_testbench)
- Create missing __init__.py for gaia.mcp, gaia.eval, gaia.talk, gaia.apps.summarize
- Add main() to mcp_bridge.py for gaia-mcp entry point
- Remove broken gaia-mcp-atlassian entry point (file doesn't exist)
- Remove dead ChatApp import from agents/__init__.py
- Add test_packaging.py with 6 CI tests to catch packaging issues
- Remove unused sd_agent_example.py

The github-release job in pypi.yml used gh release create with --notes "", which blanked out release notes if publish_installer.yml had already created the release. Now it checks if the release exists first and only creates as a fallback with --generate-notes.
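The check-then-create logic described above can be sketched in Python around the GitHub CLI. This is an illustration, not the workflow's actual script: the PR does not say how the existence check is performed, so the use of `gh release view` (which exits non-zero when the release is not found) is an assumption, and `ensure_release` and `run_command` are hypothetical names.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(args: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(args), check=False).returncode


def ensure_release(tag: str, run: Runner = run_command) -> str:
    """Create the release only if it does not exist yet; never touch existing notes."""
    # Assumption: a zero exit code from `gh release view` means the release exists.
    if run(["gh", "release", "view", tag]) == 0:
        return "exists"
    # Fallback path: let GitHub generate notes instead of passing --notes "".
    created = run(["gh", "release", "create", tag, "--generate-notes"])
    if created != 0:
        raise RuntimeError(f"gh release create failed for {tag}")
    return "created"
```

The runner is injectable so the logic can be exercised without network access. A small window remains between the check and the create call, so two jobs racing at the same moment could still both attempt creation; in that case the second create fails rather than blanking the notes.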

- Add pre-existing warning codes to DISABLED_CHECKS in util/lint.py
- Fix unused variable warnings in batch_experiment.py (prefix with _)
- Fix f-strings without interpolation in batch_experiment.py
- Fix trailing whitespace and encoding in eval.py

Add system prompt to make LLM recall facts more reliably. Also made prompts more explicit about expected answers to reduce non-determinism.

LLM responses are non-deterministic, so the memory recall test may occasionally fail. Added retry logic (up to 3 attempts) to make the test more robust while still validating conversation memory works.
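The retry approach from this commit amounts to running the same check a bounded number of times and passing on the first success. A minimal sketch, assuming a hypothetical helper name and a zero-argument `check` callable standing in for the real chat call and assertion:

```python
from typing import Callable


def passes_within_attempts(check: Callable[[], bool], max_attempts: int = 3) -> bool:
    """Return True as soon as one attempt succeeds, False if all attempts fail."""
    for _ in range(max_attempts):
        if check():
            return True
    return False
```

Retries hide non-determinism rather than remove it, which is consistent with the later commit in this PR replacing the retry with a more direct system prompt.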

Tell LLM to answer directly and never ask questions back. If it doesn't know, it should say 'I don't know' instead of asking clarifying questions.

Previously, if user answered 'n' to installing Lemonade, init would stop at step 1/4. Now it continues to step 2 (server health check) which will verify connectivity regardless of how Lemonade is set up:
- Remote server via LEMONADE_BASE_URL
- Manual local installation
- Pre-existing installation

The health check is the proper gate, not the installation prompt.
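The control-flow change can be sketched as follows. All names here are hypothetical stand-ins, not the actual functions behind gaia init; the point is only that declining the install no longer returns early, and the health check alone decides whether setup proceeds.

```python
from typing import Callable


def run_init(
    lemonade_installed: bool,
    user_accepts_install: Callable[[], bool],
    install_lemonade: Callable[[], None],
    server_is_healthy: Callable[[], bool],
) -> bool:
    """Return True when setup may continue past the health check."""
    # Step 1: offer installation, but treat "no" as a choice rather than a failure.
    if not lemonade_installed and user_accepts_install():
        install_lemonade()
    # Step 2: the health check is the gate. It passes for a remote server,
    # a manual install, or a pre-existing install, and fails otherwise.
    return server_is_healthy()
```

With this shape, a user who points LEMONADE_BASE_URL at a remote server and answers 'n' still reaches the health check, and an unreachable server is reported there instead of at the install prompt.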

The 'If you don't know the answer, say I don't know' instruction was causing the small 0.6B model to respond with 'I don't know' even when the answer was in the conversation history. Small models can't distinguish between inherent knowledge gaps and context-provided information. The prompt was simplified to 'answer based on conversation history', which works with both small and large models.

- Fix pylint issues in eval.py, groundtruth.py, and all MCP files (unused vars, f-strings, missing encoding, duplicate exceptions, etc.)
- Add inline pylint disable for exec-used in blender_mcp_server.py
- Remove unused import from test_chat_sdk.py (flake8 F401)
- Re-enable 13 pylint checks in lint.py now that violations are fixed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>



kovtcharov requested a review from kovtcharov-amd as a code owner February 5, 2026 17:29

github-actions bot added the dependencies, devops, agents, talk, mcp, eval, tests, and performance labels Feb 5, 2026

Restore examples/sd_agent_example.py

04d8c31

kovtcharov enabled auto-merge February 5, 2026 17:36

kovtcharov added 2 commits February 5, 2026 09:37

Apply Black formatting to __init__.py and test_packaging.py

0181feb

kovtcharov-amd approved these changes Feb 5, 2026


kovtcharov added 5 commits February 5, 2026 14:20

Remove unused pytest import from test_packaging.py

6a03191

Fix flaky test_convenience_functions_integration test

f15930e

Add system prompt to make LLM recall facts more reliably. Also made prompts more explicit about expected answers to reduce non-determinism.

Add retry logic for flaky memory recall test

dd10661

LLM responses are non-deterministic, so the memory recall test may occasionally fail. Added retry logic (up to 3 attempts) to make the test more robust while still validating conversation memory works.

Simplify flaky test fix: use direct system prompt instead of retry

81f8506

Tell LLM to answer directly and never ask questions back. If it doesn't know, it should say 'I don't know' instead of asking clarifying questions.

kovtcharov mentioned this pull request Feb 5, 2026

GAIA Init installation fails on Ubuntu 24.04.3 LTS #311

Open

This was referenced Feb 5, 2026

GAIA init forces lemonade install #310

Closed

Missing module for SD #313

Closed

version bump and lint

e0c75a0

kovtcharov added this pull request to the merge queue Feb 5, 2026

kovtcharov removed this pull request from the merge queue due to a manual request Feb 5, 2026

kovtcharov and others added 2 commits February 5, 2026 15:30

lint

564ca68

kovtcharov enabled auto-merge February 5, 2026 23:47

Add v0.15.3.2 release notes and bump version

7213a58

github-actions bot added the documentation label Feb 5, 2026

kovtcharov added this pull request to the merge queue Feb 6, 2026

Fix test_conversation_memory_integration: add system prompt for small…

2af9a8c

… models

kovtcharov removed this pull request from the merge queue due to a manual request Feb 6, 2026

kovtcharov-amd merged commit 1d75142 into main Feb 6, 2026
44 checks passed

kovtcharov-amd deleted the kalin/vlm-fix branch February 6, 2026 00:12

itomek linked an issue Feb 6, 2026 that may be closed by this pull request

GAIA init forces lemonade install #310

Closed
