Conversation

@kovtcharov
Collaborator

@kovtcharov kovtcharov commented Feb 5, 2026

Summary

Fixes package registration issues that caused ModuleNotFoundError on non-editable installs, plus CI improvements and bug fixes.

Version bump: v0.15.3.1 → v0.15.3.2

Package Registration Fixes

  • Add 5 missing packages to setup.py (gaia.vlm, gaia.apps.docker, gaia.apps.jira, gaia.apps.summarize.templates, gaia.eval.fix_code_testbench)
  • Create missing __init__.py for gaia.mcp, gaia.eval, gaia.talk, gaia.apps.summarize
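The failure mode behind these fixes can be checked mechanically. The following is a minimal sketch, not the code in the actual test_packaging.py: the helper name `find_unregistered_packages`, the source-root argument, and the `declared` list are all hypothetical, and it assumes setup.py lists its packages explicitly.

```python
import os


def find_unregistered_packages(src_root, declared):
    """Return (missing_init, undeclared) for Python source dirs under src_root.

    missing_init: dotted names of dirs that hold .py files but no __init__.py
    undeclared:   dotted names of dirs with an __init__.py that are absent
                  from `declared` (the hypothetical setup.py packages list)
    """
    missing_init, undeclared = [], []
    for dirpath, dirnames, filenames in os.walk(src_root):
        # Skip bytecode caches so they are never reported as packages.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        if dirpath == src_root:
            continue
        if not any(name.endswith(".py") for name in filenames):
            continue
        dotted = os.path.relpath(dirpath, src_root).replace(os.sep, ".")
        if "__init__.py" not in filenames:
            missing_init.append(dotted)
        elif dotted not in declared:
            undeclared.append(dotted)
    return sorted(missing_init), sorted(undeclared)
```

An editable install imports straight from the source tree, so neither problem shows up there; a check like this only matters for wheels and sdists, which is where the ModuleNotFoundError was reported.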

Entry Point Fixes

  • Add main() to mcp_bridge.py so gaia-mcp entry point works
  • Remove broken gaia-mcp-atlassian entry point (file doesn't exist)
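Both entry point problems (a missing main() and a missing file) surface as import-time errors, so they can be caught before release. This is a sketch under assumptions: `resolve_entry_point` is a hypothetical helper, and the `module:function` string format is the standard console-script form rather than anything quoted from this repository's setup.py.

```python
import importlib


def resolve_entry_point(spec):
    """Import a 'package.module:function' spec and return the callable."""
    module_name, _, attr = spec.partition(":")
    # Raises ModuleNotFoundError if the target file does not exist.
    module = importlib.import_module(module_name)
    # Raises AttributeError if the module has no such function (e.g. no main()).
    target = getattr(module, attr)
    if not callable(target):
        raise TypeError(f"{spec} does not point to a callable")
    return target
```

For example, `resolve_entry_point("json.tool:main")` succeeds because the standard library module defines `main`, while a spec naming a module that was never shipped fails with ModuleNotFoundError.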

Bug Fixes

  • Fix gaia init blocking on Lemonade install: Previously, init stopped at step 1/4 if the user declined installation. It now continues to step 2 (health check), which verifies connectivity for remote servers, manual installs, or LEMONADE_BASE_URL setups.

Code Cleanup

  • Remove dead ChatApp import from agents/__init__.py
  • Fix unused variable warnings in batch_experiment.py
  • Fix trailing whitespace and encoding in eval.py
  • Re-enable 13 pylint checks after fixing all violations

CI Improvements

  • Fix release notes bug: pypi.yml was overwriting release notes with an empty string because of a race condition with publish_installer.yml. It now checks whether the release exists before creating one.
  • Add packaging integrity tests: New test_packaging.py with 6 tests to catch missing packages, __init__.py files, and broken entry points
  • Update lint config: Add pre-existing warning codes to DISABLED_CHECKS (exposed when new __init__.py files made pylint scan gaia.eval and gaia.mcp)
  • Fix flaky chat SDK memory test: The 'say I don't know' system prompt instruction was confusing the small 0.6B model - it would say 'I don't know' even when the answer was in conversation history. Changed to 'answer based on conversation history' which works reliably.

Release Notes

  • Added docs/releases/v0.15.3.2.mdx with full release notes
  • Updated docs/docs.json navbar to show v0.15.3.2 · Lemonade 9.3.0

Test plan

  • All 316 unit tests pass locally
  • New test_packaging.py (6 tests) validates package/entry point integrity
  • util/lint.py --all passes (5 passed, 2 non-blocking warnings)
  • CI runs packaging tests before other unit tests
  • Chat SDK integration tests pass

- Add 5 missing packages to setup.py (gaia.vlm, gaia.apps.docker,
  gaia.apps.jira, gaia.apps.summarize.templates, gaia.eval.fix_code_testbench)
- Create missing __init__.py for gaia.mcp, gaia.eval, gaia.talk,
  gaia.apps.summarize
- Add main() to mcp_bridge.py for gaia-mcp entry point
- Remove broken gaia-mcp-atlassian entry point (file doesn't exist)
- Remove dead ChatApp import from agents/__init__.py
- Add test_packaging.py with 6 CI tests to catch packaging issues
- Remove unused sd_agent_example.py
@github-actions github-actions bot added dependencies Dependency updates devops DevOps/infrastructure changes agents Agent system changes talk Talk agent changes mcp MCP integration changes eval Evaluation framework changes tests Test changes performance Performance-critical changes labels Feb 5, 2026
@kovtcharov kovtcharov enabled auto-merge February 5, 2026 17:36
The github-release job in pypi.yml used gh release create with --notes "",
which blanked out the release notes if publish_installer.yml had already created
the release. It now checks whether the release exists first and creates one only
as a fallback, with --generate-notes.
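The check-before-create logic described above can be sketched as follows. This is a Python rendering for illustration only; the real workflow step most likely runs the GitHub CLI directly from shell, and the function name `ensure_release` and the injectable `run` parameter are hypothetical. It relies on `gh release view` exiting non-zero when the release is not found.

```python
import subprocess


def ensure_release(tag, run=subprocess.run):
    """Create the GitHub release for `tag` only if it does not already exist.

    Returns True if a release was created, False if one was already present.
    """
    # Probe for an existing release; a non-zero exit code means "not found".
    probe = run(["gh", "release", "view", tag], capture_output=True)
    if probe.returncode == 0:
        return False  # leave the existing release and its notes untouched
    # Fallback: create the release with auto-generated notes, never empty ones.
    run(["gh", "release", "create", tag, "--generate-notes"], check=True)
    return True
```

A check followed by a create narrows the race window between the two workflows but does not close it entirely; whichever workflow loses the race simply no longer blanks the notes.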
- Add pre-existing warning codes to DISABLED_CHECKS in util/lint.py
- Fix unused variable warnings in batch_experiment.py (prefix with _)
- Fix f-strings without interpolation in batch_experiment.py
- Fix trailing whitespace and encoding in eval.py
Add system prompt to make LLM recall facts more reliably. Also made
prompts more explicit about expected answers to reduce non-determinism.
LLM responses are non-deterministic, so the memory recall test may
occasionally fail. Added retry logic (up to 3 attempts) to make the
test more robust while still validating conversation memory works.
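A retry wrapper of the kind described could look like this. It is a minimal sketch, not the code in test_chat_sdk.py: the name `retry` and its signature are hypothetical, and it assumes the test body signals failure by raising AssertionError.

```python
def retry(check, attempts=3):
    """Call check() up to `attempts` times and return its first successful result.

    If every attempt raises AssertionError, the last one is re-raised so the
    test still fails when conversation memory is genuinely broken.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return check()
        except AssertionError as exc:
            last_error = exc
    raise last_error
```

Only AssertionError is retried, so unrelated errors such as a connection failure still surface on the first attempt.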
Tell LLM to answer directly and never ask questions back. If it doesn't
know, it should say 'I don't know' instead of asking clarifying questions.
Previously, if the user answered 'n' to installing Lemonade, init would
stop at step 1/4. Now it continues to step 2 (server health check),
which will verify connectivity regardless of how Lemonade is set up:
- Remote server via LEMONADE_BASE_URL
- Manual local installation
- Pre-existing installation

The health check is the proper gate, not the installation prompt.
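The control flow described above can be sketched as follows. All four callables and the name `run_init` are hypothetical stand-ins for illustration; the actual gaia init implementation is not shown in this PR conversation.

```python
def run_init(lemonade_installed, confirm_install, install, health_check):
    """Sketch of the first two init steps; returns True if the server is reachable."""
    # Step 1: offer installation, but do not treat a refusal as fatal.
    if not lemonade_installed() and confirm_install():
        install()
    # Step 2: the health check decides, however Lemonade was set up
    # (remote server, manual install, or pre-existing installation).
    return health_check()
```

With this shape, a user who declines the install but points at a reachable remote server passes step 2, while a user with no reachable server still fails there.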
@kovtcharov kovtcharov added this pull request to the merge queue Feb 5, 2026
The 'If you don't know the answer, say I don't know' instruction was
causing the small 0.6B model to respond with 'I don't know' even when
the answer was in conversation history. The small model did not appear to
distinguish gaps in its own knowledge from information provided in context.

Simplified to 'answer based on conversation history' which works with
both small and large models.
@kovtcharov kovtcharov removed this pull request from the merge queue due to a manual request Feb 5, 2026
kovtcharov and others added 2 commits February 5, 2026 15:30
- Fix pylint issues in eval.py, groundtruth.py, and all MCP files
  (unused vars, f-strings, missing encoding, duplicate exceptions, etc.)
- Add inline pylint disable for exec-used in blender_mcp_server.py
- Remove unused import from test_chat_sdk.py (flake8 F401)
- Re-enable 13 pylint checks in lint.py now that violations are fixed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kovtcharov kovtcharov enabled auto-merge February 5, 2026 23:47
@github-actions github-actions bot added the documentation Documentation changes label Feb 5, 2026
@kovtcharov kovtcharov added this pull request to the merge queue Feb 6, 2026
github-merge-queue bot pushed a commit that referenced this pull request Feb 6, 2026
…#309)


Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@kovtcharov kovtcharov removed this pull request from the merge queue due to a manual request Feb 6, 2026
@kovtcharov-amd kovtcharov-amd merged commit 1d75142 into main Feb 6, 2026
44 checks passed
@kovtcharov-amd kovtcharov-amd deleted the kalin/vlm-fix branch February 6, 2026 00:12
@itomek itomek linked an issue Feb 6, 2026 that may be closed by this pull request

Labels

agents Agent system changes dependencies Dependency updates devops DevOps/infrastructure changes documentation Documentation changes eval Evaluation framework changes mcp MCP integration changes performance Performance-critical changes talk Talk agent changes tests Test changes

Development

Successfully merging this pull request may close these issues.

GAIA init forces lemonade install

2 participants