Feature v10.1 by Manthya · Pull Request #5 · Manthya/ProdBot

Manthya · 2026-03-10T16:03:33Z

No description provided.

Copilot

Pull request overview

This PR introduces the Phase 10.1 feature set: dynamic plugin/model configuration, personal assistant (HITL) integrations, multi-agent orchestration upgrades, and expanded evaluation/red-team test tooling.

Changes:

Added plugin + personal assistant API routes and frontend UI (Plugins tab, Draft Cards + HITL send).
Improved orchestration reliability/perf (agentic engine guardrails, Redis checkpointing/tests, pooled Ollama streaming, media pipeline offloading).
Added extensive eval, red-team, and integration test assets (csv case matrices, runners, Playwright specs).

Reviewed changes

Copilot reviewed 70 out of 75 changed files in this pull request and generated 4 comments.

File	Description
tests/verify_plugins_api.py	Adds a script-style check for plugin endpoints (currently pytest-collectable).
tests/test_web_search.py	Adds a standalone DuckDuckGo tool test script (ignored by pytest config).
tests/test_tools_integration.py	Updates integration test to mock new orchestration/repo dependencies and validate JSON response.
tests/test_reflection.py	Adds unit tests for new `ReflectionHandler` behavior.
tests/test_mcp_server_management.py	Adds a multi-phase script for MCP server CRUD via plugin API (currently pytest-collectable).
tests/test_mcp_integration.py	Updates MCP integration script to refresh and query remote tools.
tests/test_mcp_config.py	Updates MCP config tests for async `get_mcp_servers()` and settings-manager mocking.
tests/test_checkpointing.py	Adds tests for Redis-backed checkpoint save/load/clear.
tests/test_agent_handoff.py	Adds tests for agent config selection and phase-based prompt substitution.
tests/redteam/tools/test_mcp_payloads.py	Adds red-team payload tests for MCP tools.
tests/redteam/frontend/test_frontend_concurrency.spec.ts	Adds Playwright red-team tests for frontend concurrency resilience.
tests/redteam/conftest.py	Adds shared red-team fixtures (mock provider, repos, app overrides).
tests/redteam/backend/test_thread_isolation.py	Adds red-team tests for thread isolation, concurrency interleaving, and overflow payloads.
tests/redteam/agentic/test_agentic_cycles.py	Adds red-team tests for cycle detection, circuit breaker, and tool hallucination handling.
tests/evals/test.csv	Adds an eval case matrix for benchmark runner.
tests/evals/run_phase10_orchestrator_live.py	Adds a live websocket eval runner with reporting.
tests/evals/run_benchmarks.py	Adds a mock/live benchmark runner that records tool trajectories.
tests/evals/phase10_report.md	Adds a generated mock eval report artifact.
tests/evals/phase10_live_report.md	Adds a generated live eval report artifact.
tests/evals/phase10_cases.csv	Adds generated Phase 10 case matrix for eval runner.
tests/evals/golden_db.csv	Adds fixtures backing the eval suite.
tests/evals/generate_phase10_cases.py	Adds generator script for Phase 10 eval cases.
tests/evals/check_phase10_report.py	Adds a CI gate for eval pass rate threshold.
tests/conftest.py	Configures pytest to ignore certain script-style tests.
src/chatbot_ai_system/tools/registry.py	Registers `GetCurrentTimeTool` in the default tool registry.
src/chatbot_ai_system/tools/__init__.py	Makes default tool registration tolerant to duplicates (via direct `_tools` access).
src/chatbot_ai_system/services/tool_reliability.py	Adds Redis-backed tool reliability tracking and ranking.
src/chatbot_ai_system/services/reflection.py	Adds reflection-based retry parser/LLM prompt flow.
src/chatbot_ai_system/services/media_pipeline.py	Offloads CPU-bound media work + adds singleton Whisper model caching.
src/chatbot_ai_system/services/agents.py	Adds AgentConfig registry and phase-aware tool-executor agent selection.
src/chatbot_ai_system/services/agentic_engine.py	Adds retries/timeouts, cycle detection, tool whitelist, HITL gating, and fail-closed behavior.
src/chatbot_ai_system/server/routes.py	Uses dynamic model/provider settings, atomic sequence numbers, request_id correlation, and rollback on errors.
src/chatbot_ai_system/server/plugin_routes.py	Adds plugin routes for model activation and MCP server management.
src/chatbot_ai_system/server/personal_routes.py	Adds personal integration config/connect endpoints and HITL `/send`.
src/chatbot_ai_system/server/multimodal_routes.py	Uses settings_manager to source the active model for voice flow.
src/chatbot_ai_system/server/main.py	Migrates startup/shutdown to lifespan and mounts new routers.
src/chatbot_ai_system/repositories/conversation.py	Adds DB-derived next sequence number + pgvector feature detection guards.
src/chatbot_ai_system/providers/ollama.py	Adds pooled httpx client reuse for streaming and dynamic model fallback.
src/chatbot_ai_system/prompts.py	Centralizes system/router/synthesis/verification/reflection prompts.
src/chatbot_ai_system/personal/constants.py	Adds HITL tool name list and platform config schemas.
src/chatbot_ai_system/models/schemas.py	Switches timestamps to timezone-aware UTC.
src/chatbot_ai_system/database/models.py	Replaces deprecated utcnow usage and adds `SystemSetting` model.
src/chatbot_ai_system/config/settings_manager.py	Adds DB-backed dynamic settings with validation hooks.
src/chatbot_ai_system/config/settings.py	Updates Settings config to pydantic v2 `SettingsConfigDict` and aliases.
src/chatbot_ai_system/config/mcp_server_config.py	Makes MCP server loading async; adds personal + dynamic server configs.
scripts/live_audit_matrix.py	Adds a live audit runner script to validate end-to-end behavior.
pyproject.toml	Adds media/tooling dependencies (Pillow, pydub, faster-whisper, opencv-headless).
frontend/test-results/.last-run.json	Adds a Playwright results artifact (likely unintended commit).
frontend/package.json	Adds Playwright as a dev dependency.
frontend/components/Sidebar.tsx	Adds tab switching UI (Chats vs Plugins).
frontend/components/ChatArea.tsx	Adds DraftCard UI for HITL tools and send callback wiring.
frontend/app/page.tsx	Adds PluginsDashboard routing and HITL send flow, plus request_id correlation.
docs/phase_7.1.md	Adds documentation for hardening and concurrency fixes.
docs/phase_7.0.md	Adds documentation for model integration and plugin dashboard.
docs/phase_10.1.md	Adds documentation for orchestrator graph + multi-agent + checkpointing.
docs/phase9.0.md	Adds documentation for personal assistant integrations and HITL flow.
docs/phase10.0_testing.md	Adds documentation for Phase 10 test plan and eval approach.
docs/phase10.0.md	Adds documentation for routing/tool reliability upgrades.
docs/personal_platform_integration.md	Adds specification doc for personal platform integrations.
alembic/versions/cdc18a2dc7b3_add_system_settings_table.py	Adds migration for system_settings table.
.pgvector_build	Adds a subproject pointer for pgvector build.
.github/workflows/phase10_eval.yml	Adds a GitHub Actions job to run Phase 10 evals and enforce pass rate.

Files not reviewed (1)

frontend/package-lock.json: Language not supported

Comments suppressed due to low confidence (15)

tests/verify_plugins_api.py:1

This is pytest-collectable (function name starts with test_) but it performs a real HTTP call to localhost and has no assertions; it will be flaky/fail in CI. Move it to scripts/ (or rename to avoid test_ prefix) and/or add it to tests/conftest.py::collect_ignore, or convert it into a proper pytest test using FastAPI TestClient with deterministic assertions.
tests/test_mcp_server_management.py:1 (5 identical comments)
This is a script-style integration runner that hits a live server, but it lives under tests/ and matches pytest discovery (test_*.py). It should either be moved to scripts/ or added to tests/conftest.py::collect_ignore to prevent unintended execution during unit test runs.
tests/conftest.py:1
Given the addition of other script-style files under tests/ (e.g., verify_plugins_api.py, test_mcp_server_management.py, possibly others), this ignore list likely needs to be expanded to keep pytest runs hermetic. Consider adding the new script-style runners here or relocating them under scripts/.
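A minimal sketch of the expanded ignore list, assuming the two script-style files named in the comment sit directly under tests/. Entries already present in the PR's conftest.py are not shown here and would need to be kept.

```python
# tests/conftest.py (sketch; existing entries in the real file are not shown)
# pytest skips any path listed in collect_ignore; paths are relative to
# the directory that contains this conftest.py.
collect_ignore = [
    "verify_plugins_api.py",
    "test_mcp_server_management.py",
]
```

Relocating the runners under scripts/ would make the list unnecessary, at the cost of moving files that other docs may already reference.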
src/chatbot_ai_system/repositories/conversation.py:1
Binding conversation_id as str(conversation_id) can cause Postgres type mismatch (uuid = text) with a UUID column, especially in raw text() queries where the bind param isn't typed. Pass the UUID object directly (or cast :cid::uuid / use SQLAlchemy select(func.max(...)) against the model column) to ensure correct typing.
src/chatbot_ai_system/services/media_pipeline.py:1 (3 identical comments)
Inside async def, prefer asyncio.get_running_loop() over get_event_loop() (which is deprecated/behavior-changed in newer Python versions). Update these call sites to avoid runtime warnings/errors under Python 3.12+.
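The suggested change can be sketched as follows. The executor, the `cpu_bound_work` helper, and the `process_media` coroutine are hypothetical stand-ins for illustration, not the PR's actual names.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool for this sketch; the PR keeps its own executor.
_executor = ThreadPoolExecutor(max_workers=2)


def cpu_bound_work(data: bytes) -> int:
    """Stand-in for CPU-bound media work; returns the payload size."""
    return len(data)


async def process_media(data: bytes) -> int:
    # get_running_loop() returns the loop that is executing this coroutine
    # and raises RuntimeError if called with no running loop, rather than
    # implicitly creating one.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, cpu_bound_work, data)
```

Calling `asyncio.run(process_media(b"abc"))` drives the coroutine from synchronous code; inside the server the framework already provides the running loop.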
src/chatbot_ai_system/services/media_pipeline.py:1
The class-level ThreadPoolExecutor is never shut down. In long-running processes (and especially during local dev reloads/tests), this can leak threads/resources. Consider adding a shutdown hook (e.g., a close()/shutdown() classmethod) and calling it from FastAPI lifespan shutdown.
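A minimal sketch of such a shutdown hook, assuming a class that owns the executor; `MediaPipeline`, `executor()`, and `shutdown()` are hypothetical names, and the pool size is arbitrary. The `lifespan` function only mirrors the async-context-manager shape that FastAPI accepts.

```python
from concurrent.futures import ThreadPoolExecutor
from contextlib import asynccontextmanager
from typing import Optional


class MediaPipeline:
    """Hypothetical stand-in for the class that owns the shared executor."""

    _executor: Optional[ThreadPoolExecutor] = None

    @classmethod
    def executor(cls) -> ThreadPoolExecutor:
        # Create the pool lazily so it can be rebuilt after a shutdown.
        if cls._executor is None:
            cls._executor = ThreadPoolExecutor(max_workers=4)
        return cls._executor

    @classmethod
    def shutdown(cls) -> None:
        # Release worker threads; safe to call when no pool exists.
        if cls._executor is not None:
            cls._executor.shutdown(wait=True)
            cls._executor = None


@asynccontextmanager
async def lifespan(app):
    # Startup work would go before the yield; cleanup runs after it.
    try:
        yield
    finally:
        MediaPipeline.shutdown()
```

Resetting `_executor` to `None` lets tests and dev reloads obtain a fresh pool instead of submitting to one that has already been shut down.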
src/chatbot_ai_system/services/reflection.py:1
This fallback regex only matches JSON objects without nested braces, so it will never capture a typical tool call like {"name": "x", "arguments": { ... }} when it's not in a code block. Consider replacing this with a brace-balancing extraction or a more robust JSON-snippet finder so nested arguments objects can be parsed.
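One possible JSON-snippet finder is sketched below. Rather than counting braces by hand, it hands each candidate `{` to the standard library's `json.JSONDecoder.raw_decode`, which parses a single value and ignores trailing text. The function name `extract_first_json_object` is an assumption for illustration, not the PR's parser.

```python
import json
from typing import Any, Dict, Optional


def extract_first_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first parseable JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start`, so
            # nested objects and braces inside strings are handled by the
            # JSON parser itself.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not valid JSON from this brace; try the next candidate.
        start = text.find("{", start + 1)
    return None
```

Because every `{` is retried on failure, the worst case is quadratic in the length of the text, which is usually acceptable for single LLM replies.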
src/chatbot_ai_system/tools/__init__.py:1
This relies on the private attribute registry._tools. Prefer a public API (e.g., registry.has_tool(name)), or make ToolRegistry.register() idempotent (no-op on duplicates) to avoid reaching into internal state.
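Both options from the comment can be sketched together. The `ToolRegistry` below is a hypothetical reduction for illustration; the PR's real class likely carries more state, and the boolean return value of `register()` is an assumption of this sketch.

```python
from typing import Dict, Protocol


class Tool(Protocol):
    """Anything with a name can be registered in this sketch."""

    name: str


class ToolRegistry:
    """Hypothetical registry; the PR's real class may differ."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def has_tool(self, name: str) -> bool:
        # Public membership check, so callers never touch _tools directly.
        return name in self._tools

    def register(self, tool: Tool) -> bool:
        # No-op on duplicates; returns True only when the tool was added.
        if self.has_tool(tool.name):
            return False
        self._tools[tool.name] = tool
        return True
```

With an idempotent `register()`, default registration can run more than once without reaching into internal state or raising on duplicates.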
src/chatbot_ai_system/server/main.py:1
This commented-out shutdown code is now misleading since Redis shutdown is handled in the new lifespan() context manager. Consider removing the commented line to avoid confusion.

Manthya added 12 commits February 22, 2026 17:56

added version7.0

4cfadae

removed multiple error version7.0

c3f2cd3

tetsing and eval added

d4d34f9

added personalisation

d6e5098

further testing added

5d88996

testing added

ba31f12

added orchestrator updates

cd5edf6

added orchestrator updates

a46d21f

phase10.1

e9279a0

readme updated

e83e442

readme updated

5e1d640

readme updated

c199c40

Manthya requested a review from Copilot March 10, 2026 16:03

Copilot AI reviewed Mar 10, 2026

Comment thread src/chatbot_ai_system/config/settings_manager.py

Comment thread frontend/components/Sidebar.tsx

Comment thread frontend/components/Sidebar.tsx

Comment thread frontend/test-results/.last-run.json

fix: resolve DATABASE_URL MissingError in phase10_eval workflow

5a37008

Manthya merged commit a3c5c26 into main Mar 10, 2026
1 check passed

Manthya merged 13 commits into main from feature_v10.1

2 participants