Merged
34 changes: 30 additions & 4 deletions .env.example
Original file line number Diff line number Diff line change
@@ -1,10 +1,36 @@
# OpenAI Configuration
# LiteLLM Configuration (Default LLM backend)
# Works out of the box with GitHub Copilot — no API keys needed.
# Supports 100+ providers: GitHub Copilot, Ollama, Anthropic, etc.
# Docs: https://docs.litellm.ai/docs/providers

# Primary model (default: github_copilot/gpt-4o)
# LITELLM_MODEL=github_copilot/gpt-4o

# Fallback chain (comma-separated, tried in order if primary fails)
# LITELLM_FALLBACK_MODELS=github_copilot/claude-sonnet-4,github_copilot/gpt-4o,github_copilot/gpt-4o-mini

# OpenAI Configuration (Optional — overrides LiteLLM when set)
# Get your API key from: https://platform.openai.com/api-keys
# If set, uses OpenAI SDK directly with beta.chat.completions.parse()

OPENAI_API_KEY=your-openai-api-key-here
OPENAI_MODEL=gpt-4o-mini
# Optional override
# OPENAI_API_KEY=your-openai-api-key-here
# OPENAI_MODEL=gpt-4o-mini
# OPENAI_BASE_URL=https://api.openai.com/v1
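The precedence described above — the OpenAI SDK path overrides LiteLLM whenever `OPENAI_API_KEY` is set — can be sketched as a small resolver. This is an illustrative sketch of the selection rule only; the function name is a stand-in, not code from this repo:

```python
import os

def resolve_backend() -> tuple[str, str]:
    """Pick the LLM backend from the environment.

    The OpenAI SDK path wins when OPENAI_API_KEY is present;
    otherwise the LiteLLM default applies.
    """
    if os.getenv("OPENAI_API_KEY"):
        return ("openai", os.getenv("OPENAI_MODEL", "gpt-4o-mini"))
    return ("litellm", os.getenv("LITELLM_MODEL", "github_copilot/gpt-4o"))
```

With no variables set this yields the LiteLLM default `github_copilot/gpt-4o`; setting `OPENAI_API_KEY` flips it to the OpenAI path.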

# Knowledge Base Publishing Configuration (for KBA Drafter)
# FileSystem Adapter (MVP - writes markdown files)
KB_FILE_BASE_PATH=./kb_published
KB_FILE_CREATE_CATEGORIES=true

# SharePoint Adapter (future - not yet implemented)
# KB_SHAREPOINT_SITE_URL=https://company.sharepoint.com/sites/KB
# KB_SHAREPOINT_CLIENT_ID=your-client-id
# KB_SHAREPOINT_CLIENT_SECRET=your-client-secret

# ITSM/ServiceNow Adapter (future - not yet implemented)
# KB_ITSM_INSTANCE_URL=https://company.service-now.com
# KB_ITSM_USERNAME=your-username
# KB_ITSM_PASSWORD=your-password

# Optional: Frontend build path override
# FRONTEND_DIST=/path/to/custom/frontend/dist
207 changes: 207 additions & 0 deletions CSV/data.csv

Large diffs are not rendered by default.

112 changes: 28 additions & 84 deletions README.md
@@ -16,9 +16,9 @@
## Tech stack at a glance
- Backend: Quart, Pydantic 2, MCP JSON-RPC, Async SSE (`backend/app.py`)
- Business logic: `TaskService` + models in `backend/tasks.py`
- LLM Integration: Ollama with local models (`backend/ollama_service.py`)
- LLM Integration: OpenAI (`backend/llm_service.py`)
- Frontend: React 18, Vite, FluentUI components, feature-first structure under `frontend/src/features`
- Tests: Playwright E2E (`tests/e2e/app.spec.js`, `tests/e2e/ollama.spec.js`)
- Tests: Playwright E2E (`tests/e2e/app.spec.js`)

## Documentation

@@ -33,18 +33,27 @@ All deep-dive guides now live under `docs/` for easier discovery:
- [Troubleshooting](docs/TROUBLESHOOTING.md) – common issues and fixes for setup, dev, and tests
- [CSV AI Guidance](docs/CSV_AI_GUIDANCE.md) – how AI agents should query and reason over CSV ticket data

### KBA Drafter Documentation

> **NEW:** LLM-powered Knowledge Base Article generator with OpenAI integration

- **[Feature Overview](docs/KBA_DRAFTER_OVERVIEW.md)** – Architecture, components, API endpoints, testing
- **[Quick Start](docs/KBA_DRAFTER_QUICKSTART.md)** – Fastest path to generating your first KBA
- **[Technical Guide](docs/KBA_DRAFTER.md)** – Complete implementation details
- **[Publishing Guide](docs/KBA_PUBLISHING.md)** – How to publish KBAs to different KB systems


## 5-minute quick start (TL;DR)
1. Clone the repo: `git clone <your-fork-url> && cd python-quart-vite-react`
2. Run the automated bootstrap: `./setup.sh` (creates the repo-level `.venv`, installs frontend deps, installs Playwright, checks for Ollama)
3. (Optional) Install Ollama for LLM features: `curl -fsSL https://ollama.com/install.sh | sh && ollama pull llama3.2:1b`
2. Run the automated bootstrap: `./setup.sh` (creates the repo-level `.venv`, installs frontend deps, installs Playwright)
3. Configure an OpenAI API key in `.env` for LLM features (see the KBA Drafter documentation)
4. Start all servers: `./start-dev.sh` *(or)* use the VS Code "Full Stack: Backend + Frontend" launch config
5. Open `http://localhost:3001/usecase_demo_1` and start documenting your usecase demo idea
6. (Optional) Test Ollama integration: `curl -X POST http://localhost:5001/api/ollama/chat -H "Content-Type: application/json" -d '{"messages":[{"role":"user","content":"Say hello"}]}'`
6. Test KBA health endpoint: `curl http://localhost:5001/api/kba/health`
7. (Optional) Run the Playwright suite from the repo root: `npm run test:e2e`

## Detailed setup (first-time users)
@@ -66,34 +75,28 @@ npx playwright install chromium
```
> Debian/Ubuntu users may also need `npx playwright install-deps` for browser libs.

### 4. Ollama (optional - for LLM features)
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
### 4. OpenAI API Key (for KBA Drafter)

# Pull the lightweight model
ollama pull llama3.2:1b
Add your OpenAI API key to `.env`:

# Verify installation
ollama list
```bash
OPENAI_API_KEY=sk-proj-your-key-here
OPENAI_MODEL=gpt-4o-mini
```

The app works without Ollama, but LLM endpoints (`/api/ollama/*`) will return 503 errors. For production use, consider:
- **llama3.2:1b** (~1.3GB) — Fast, good for testing and simple tasks
- **llama3.2:3b** (~2GB) — Better quality, still fast
- **qwen2.5:3b** (~2GB) — Alternative with strong performance
Get your API key from [platform.openai.com/api-keys](https://platform.openai.com/api-keys).

> The `setup.sh` script checks for Ollama and provides installation instructions if not found.
> The KBA Drafter requires an OpenAI API key configured in `.env` to function.

## Run & verify

### Option A — Manual terminals
1. **Backend:** `source .venv/bin/activate && cd backend && python app.py` → serves REST + MCP on `http://localhost:5001`
2. **Frontend:** `cd frontend && npm run dev` → launches Vite dev server on `http://localhost:3001`
3. **Ollama (optional):** `ollama serve` → runs LLM server on `http://localhost:11434`
3. **OpenAI (for KBA Drafter):** Configure `.env` with `OPENAI_API_KEY` → enables LLM-powered KBA generation

### Option B — Helper script
`./start-dev.sh` (verifies dependencies, starts backend + frontend + Ollama if available, stops all on Ctrl+C)
`./start-dev.sh` (verifies dependencies, starts backend + frontend, stops all on Ctrl+C)

### Option C — VS Code
Use the “Full Stack: Backend + Frontend” launch config to start backend + frontend with attached debuggers.
@@ -122,10 +125,7 @@ docker run --rm -p 5001:5001 quart-react-demo
- **Usecase Demo tab (`/usecase_demo_1`):** Main demo page for documenting usecase demo ideas with editable prompts and background agent runs.
- **Fields tab (`/fields`):** Lists mapped CSV schema fields available to UI/MCP/agent flows.
- **Agent tab (`/agent`):** Chat-style agent interface for CSV ticket analysis.
- **Ollama API (backend only):**
- `POST /api/ollama/chat` — Chat with local LLM (supports conversation history)
- `GET /api/ollama/models` — List available models
- Also exposed via MCP tools: `ollama_chat`, `list_ollama_models`
- **KBA Drafter tab (`/kba-drafter`):** Generate Knowledge Base Articles from tickets using OpenAI

## Architecture cheat sheet
- Shows how to keep REST and MCP JSON-RPC in a single Quart process
@@ -162,57 +162,6 @@ TaskService + Pydantic models (backend/tasks.py)
| `npm run test:e2e` | Run all Playwright E2E tests |
| `npm run test:e2e:ui` | Run tests in interactive UI mode |
| `npm run test:e2e:report` | View test results report |
| `npm run ollama:pull` | Download llama3.2:1b model |
| `npm run ollama:start` | Start Ollama server manually |
| `npm run ollama:status` | Check if Ollama is running |

## Example Ollama API calls

```bash
# List available models
curl http://localhost:5001/api/ollama/models

# Simple chat
curl -X POST http://localhost:5001/api/ollama/chat \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "What is Python?"}
],
"model": "llama3.2:1b",
"temperature": 0.7
}'

# Conversation with history
curl -X POST http://localhost:5001/api/ollama/chat \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "My name is Alice"},
{"role": "assistant", "content": "Nice to meet you, Alice!"},
{"role": "user", "content": "What is my name?"}
],
"model": "llama3.2:1b"
}'

# Via MCP JSON-RPC
curl -X POST http://localhost:5001/mcp \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "ollama_chat",
"arguments": {
"messages": [{"role": "user", "content": "Hello!"}]
}
},
"id": 1
}'
```

- Node.js 18+
- `cd frontend && npm install`

## Testing

@@ -234,13 +183,11 @@ npm run test:e2e:report

**Test suites:**
- `tests/e2e/app.spec.js` — Dashboard, tasks, SSE streaming
- `tests/e2e/ollama.spec.js` — LLM chat, model listing, validation (requires Ollama)

Tests rely on:
- Sample tasks being present
- Stable `data-testid` attributes in the React components
- SSE payload shape `{ time, date, timestamp }`
- Ollama running on `localhost:11434` with `llama3.2:1b` model (for Ollama tests)

1. **Backend:** `source .venv/bin/activate && cd backend && python app.py` → serves REST + MCP on `http://localhost:5001`
2. **Frontend:** `cd frontend && npm run dev` → launches Vite dev server on `http://localhost:3001`
@@ -253,9 +200,7 @@ Tests rely on:
| `source .venv/bin/activate` fails | Recreate the env: `rm -rf .venv && python3 -m venv .venv && pip install -r backend/requirements.txt` |
| `npm install` errors | `npm cache clean --force && rm -rf node_modules package-lock.json && npm install` |
| Playwright browser install fails | `sudo npx playwright install-deps && npx playwright install` |
| Ollama not found | Install: `curl -fsSL https://ollama.com/install.sh \| sh` then `ollama pull llama3.2:1b` |
| Ollama connection error | Start server: `ollama serve` or check if running: `curl http://localhost:11434/api/tags` |
| LLM responses are slow | Try a smaller model (`llama3.2:1b` is fastest) or ensure GPU acceleration is enabled |
| OpenAI API errors | Check that `.env` has a valid `OPENAI_API_KEY`; verify with `curl http://localhost:5001/api/kba/health` |

See [docs/TROUBLESHOOTING.md](docs/TROUBLESHOOTING.md) for more detailed solutions.

@@ -264,9 +209,8 @@ See [docs/TROUBLESHOOTING.md](docs/TROUBLESHOOTING.md) for more detailed solutions
2. Extend the SSE stream to broadcast task stats (remember to update `connectToTimeStream` consumers)
3. Persist data with SQLite or Postgres instead of `_tasks_db`
4. Add more Playwright specs (filters, SSE error handling, MCP flows)
5. **Build a chat UI:** Create `frontend/src/features/ollama/OllamaChat.jsx` with FluentUI components and connect to `/api/ollama/chat`
6. **Smart task descriptions:** Use Ollama to auto-generate task descriptions from titles
7. **Task summarization:** Summarize completed tasks using LLM
8. **Multi-model comparison:** Let users select different Ollama models and compare responses
5. **Smart task descriptions:** Use OpenAI to auto-generate task descriptions from titles
6. **Task summarization:** Summarize completed tasks using LLM
7. **KBA enhancements:** Add multi-language support, SharePoint integration

Happy coding! 🎉
9 changes: 9 additions & 0 deletions backend/=3.10.4
@@ -0,0 +1,9 @@
Collecting APScheduler
Downloading apscheduler-3.11.2-py3-none-any.whl.metadata (6.4 kB)
Collecting tzlocal>=3.0 (from APScheduler)
Downloading tzlocal-5.3.1-py3-none-any.whl.metadata (7.6 kB)
Downloading apscheduler-3.11.2-py3-none-any.whl (64 kB)
Downloading tzlocal-5.3.1-py3-none-any.whl (18 kB)
Installing collected packages: tzlocal, APScheduler

Successfully installed APScheduler-3.11.2 tzlocal-5.3.1
27 changes: 15 additions & 12 deletions backend/agent_workbench/service.py
@@ -75,13 +75,21 @@ def on_tool_error(self, error: BaseException, *, run_id: Any, **kwargs: Any) ->
# ============================================================================

def _build_llm(model: str, api_key: str, base_url: str = "") -> Any:
    from langchain_openai import ChatOpenAI
    return ChatOpenAI(
        model=model,
        api_key=api_key,
        base_url=base_url or None,
        temperature=0.0,
    )
    if api_key:
        from langchain_openai import ChatOpenAI
        return ChatOpenAI(
            model=model,
            api_key=api_key,
            base_url=base_url or None,
            temperature=0.0,
        )
    else:
        from langchain_litellm import ChatLiteLLM
        litellm_model = os.getenv("LITELLM_MODEL", "github_copilot/gpt-4o")
        return ChatLiteLLM(
            model=litellm_model,
            temperature=0.0,
        )
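The dispatch this diff introduces — build a `ChatOpenAI` client when an API key is supplied, fall back to `ChatLiteLLM` otherwise — can be exercised without LangChain installed. This standalone sketch mirrors only the branching; the return strings are stand-ins for the real client objects:

```python
import os

def pick_chat_client(api_key: str, model: str = "gpt-4o-mini") -> str:
    """Mirror _build_llm's branching: report which client would be built."""
    if api_key:
        return f"ChatOpenAI:{model}"
    # No OpenAI key: fall back to the LiteLLM default model.
    litellm_model = os.getenv("LITELLM_MODEL", "github_copilot/gpt-4o")
    return f"ChatLiteLLM:{litellm_model}"
```

Note the deferred imports in the real `_build_llm`: each branch imports only the client library it needs, so a deployment using LiteLLM never has to install `langchain_openai`, and vice versa.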


def _build_react_agent(llm: Any, tools: list[Any], system_prompt: str) -> Any:
@@ -147,11 +155,6 @@ def __init__(
    @property
    def llm(self) -> Any:
        if self._llm is None:
            if not self._api_key:
                raise ValueError(
                    "OPENAI_API_KEY is required to run agents. "
                    "Set it via environment variable or pass openai_api_key."
                )
            self._llm = _build_llm(self._model, self._api_key, self._base_url)
        return self._llm
