Implement chatbot core modules by danrixd · Pull Request #6 · danrixd/smartbaseai

danrixd · 2025-08-01T06:07:56Z

Summary

- add conversation manager for maintaining chat sessions
- add intent recognizer with simple keyword matching
- add response generator leveraging ModelManager

Testing

- python -m pytest -q
- python -m py_compile $(git ls-files '*.py')

https://chatgpt.com/codex/tasks/task_e_688c58f66f94832a9cbe23fd2d534d7d

Big coordinated batch covering Tier 2 (functional gaps), Tier 3 (polish), and Tier 4 (production readiness) from the earlier what's-missing audit. Only #20 (KMS/Vault secrets manager) is intentionally skipped as overkill for a single-operator dev box.

# Backend

T2 #5: generalize exact_lookup (db/query_engine.py)

Now dispatches by tenant and table. financebench queries with a ticker hint (sniffed from the user message) and a YYYY-MM-DD date land on daily_bars(ticker, date, open, high, low, close, volume) in financebench.db and return the exact row. Verified live: "AAPL closing price on 2024-06-14" -> {close: 212.49, volume: 70.1M}. Legacy market_data path still works for company/organization vaults.

ResponseGenerator._lookup_db now tries the intraday pattern first, then falls back to date-only; passes message= through so the ticker detector can see it. Shared _format_row helper handles both row shapes.

T2 #6: FinanceBench eval harness (scripts/eval_financebench.py)

Runs each of the 150 open-source ground-truth questions through /chat/message and scores the reply with three strategies:
- numeric match (2% relative tolerance, handles $/bn/M/%)
- substring match on the canonical answer
- optional Claude-as-judge via --judge flag

Writes a markdown report at docs/financebench_eval.md with headline accuracy, per-question-type breakdown, latency stats, sample failures.

Usage:

    python scripts/eval_financebench.py --model-provider anthropic --model-name claude-opus-4-6 --limit 50 --judge

T2 #7: chunked re-ingest on vault edit (api/routes_files.py)

_ingest_file_into_tenant_store now uses the shared ai/chunking module, deletes every existing {tenant}:{filename}#* vector under the file's prefix before inserting fresh chunks, and honors section-aware chunking. The vault PUT endpoint delegates to the same helper so edits match the loader's layout exactly.
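The delete-then-insert step described for the chunked re-ingest can be sketched roughly as follows. This is only an illustration, not the code in api/routes_files.py: a plain dict stands in for the tenant's vector store, and the names `chunk_id` and `reingest` are hypothetical. Only the `{tenant}:{filename}#*` id layout comes from the description above.

```python
from typing import Dict, List


def chunk_id(tenant: str, filename: str, index: int) -> str:
    """Build an id of the form {tenant}:{filename}#{index}."""
    return f"{tenant}:{filename}#{index}"


def reingest(store: Dict[str, str], tenant: str, filename: str,
             chunks: List[str]) -> int:
    """Replace every stored chunk for one file.

    `store` is a hypothetical stand-in for the vector store (id -> text).
    Returns the number of stale ids that were removed.
    """
    prefix = f"{tenant}:{filename}#"
    # Collect first, then delete, so the dict is not mutated while iterating.
    stale = [key for key in store if key.startswith(prefix)]
    for key in stale:
        del store[key]
    # Insert the fresh chunks under the same prefix.
    for i, text in enumerate(chunks):
        store[chunk_id(tenant, filename, i)] = text
    return len(stale)
```

Deleting by prefix before inserting means a shorter edited file cannot leave orphaned chunks from the longer previous version behind. Including the trailing `#` in the prefix keeps a file such as `notes.md.bak` from being swept up when `notes.md` is re-ingested.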
T2 #8: streaming responses (api/routes_chat.py)

New POST /chat/message/stream endpoint returning SSE events:

    event: delta
    data: {"text": "..."}    (60-char chunks)

    event: done
    data: {"latency_ms": N}

    event: error
    data: {"detail": "..."}

Records usage + writes audit log + updates conversation history just like the non-streaming variant.

T2 #9: cleared all datetime.utcnow() deprecation warnings

Updated 8 callsites across api/routes_auth, db/{audit_log, conversation, file, rag_trace, settings, user}_repository, and ingestion/metadata_generator to use datetime.now(timezone.utc). Warning count dropped 83 -> 12 on pytest runs.

T3 #14: cross-tenant search (api/routes_admin.py + frontend page)

GET /admin/search?q=...&tenants=... runs the hybrid retriever across every (or a subset of) tenant's Chroma store and returns a ranked flat list with tenant tags. Super-admin only. Verified live: q=quokka across all 7 tenants returns 29 hits primarily from organization + company vaults.

T3 #15: audit log viewer (db/audit_log_repository + /admin/audit-log + frontend)

audit_log_repository gains list_logs(limit, offset, username, action) and count_logs(). GET /admin/audit-log returns paginated events. New pages/AuditLog.jsx renders a filterable table.

T3 #16: prompt-cache verification (ai/models/anthropic_model.py)

AnthropicModel.generate now logs

    anthropic usage: input=N cache_create=N cache_read=N output=N

on every response via a dedicated smartbaseai.anthropic logger, tagged with the request id from the middleware. Lets operators verify cache_control is actually being hit across turns.

T3 #21: metadata-aware chunking (new ai/chunking.py)

Two-tier chunker:
1. Split markdown on ## / ### headings (preserves 10-K sections like Risk Factors, MD&A, Balance Sheet in their own chunks)
2. Within each section, pack paragraphs with a MAX_CHUNK_CHARS=2000 hard cap; oversize paragraphs get whitespace-aligned hard-split.
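The two tiers listed above might look something like the sketch below. It is an assumption-laden illustration rather than the contents of ai/chunking.py: the function names (`split_sections`, `hard_split`, `pack_paragraphs`, `chunk_markdown`) and the greedy packing strategy are hypothetical, and only the heading levels and the MAX_CHUNK_CHARS=2000 cap are taken from the description.

```python
import re
from typing import List, Tuple

MAX_CHUNK_CHARS = 2000  # cap named in the description above

# Matches "## Title" or "### Title" only (not "#" or "####").
HEADING_RE = re.compile(r"^#{2,3}\s+(.*)$")


def split_sections(markdown: str) -> List[Tuple[str, str]]:
    """Tier 1: split on ## / ### headings into (title, body) pairs."""
    sections: List[Tuple[str, str]] = []
    title, lines = "", []
    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            if title or any(ln.strip() for ln in lines):
                sections.append((title, "\n".join(lines).strip()))
            title, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    if title or any(ln.strip() for ln in lines):
        sections.append((title, "\n".join(lines).strip()))
    return sections


def hard_split(text: str, limit: int) -> List[str]:
    """Split an oversize paragraph, cutting at whitespace where possible."""
    pieces: List[str] = []
    while len(text) > limit:
        # Last whitespace character at or before the limit.
        cut = max(text.rfind(ch, 0, limit + 1) for ch in " \n\t")
        if cut <= 0:
            cut = limit  # no whitespace available: cut mid-word
        pieces.append(text[:cut].rstrip())
        text = text[cut:].lstrip()
    if text:
        pieces.append(text)
    return pieces


def pack_paragraphs(body: str, limit: int = MAX_CHUNK_CHARS) -> List[str]:
    """Tier 2: greedily pack paragraphs into chunks of at most `limit` chars."""
    chunks: List[str] = []
    current = ""
    for para in re.split(r"\n\s*\n", body):
        para = para.strip()
        if not para:
            continue
        for piece in hard_split(para, limit):
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


def chunk_markdown(markdown: str,
                   limit: int = MAX_CHUNK_CHARS) -> List[Tuple[str, str]]:
    """Return (section_title, chunk_text) pairs; title is "" before any heading."""
    return [(title, chunk)
            for title, body in split_sections(markdown)
            for chunk in pack_paragraphs(body, limit)]
```

Splitting on headings first means a chunk never straddles two sections, so a retrieval hit can always be attributed to a single section title.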
Each chunk is prefixed with **Section Title** so semantic similarity can match on the section name. chunk_with_sections() returns rich dicts with section metadata for callers that want it.

T4 #17: structured logging + request IDs (new api/logging_config.py)

New RequestIdMiddleware assigns uuid4 per request (or honors an incoming X-Request-ID header), propagates via contextvars, and returns the id in the response header. Log format:

    HH:MM:SS INFO rid=abc123456789 smartbaseai.http: POST /chat/trace -> 200 (142.1ms)

Installed via configure_logging() in api/app.py.

T4 #18: rate limiting + cost tracking (new db/usage_repository.py)

llm_usage table records (tenant_id, username, provider, model_name, input_tokens, output_tokens, cache_read_tokens, cache_create_tokens, latency_ms, created_at) per request. chat_message + chat_message_stream both record on every call with a 4-chars-per-token heuristic for providers that don't expose usage.

Per-tenant daily cap: set "daily_token_cap" on the tenant config to enforce. Chat endpoint rejects with 429 when hit:

    "Daily token cap reached for tenant 'X' (N/CAP)"

GET /admin/usage returns a per-day x per-tenant x per-provider rollup with estimated cost using blended prices (Anthropic/OpenAI public rates). Ollama is free/local. New pages/UsageDashboard.jsx renders the rollup with totals across requests / tokens / estimated cost.

T4 #19: session timeout interceptor (frontend/src/api/api.js)

axios response interceptor catches 401 / "Token expired" / "Invalid token" and:
- clears localStorage (access_token, role, tenant_id, active_tenant, username)
- stashes the current path in sessionStorage.post_login_redirect
- window.location.assign('/login')

Login page reads post_login_redirect on success and bounces back.

# Frontend

T3 #10: markdown preview in Vault editor (pages/Vault.jsx)

New edit / split / preview toggle above the textarea. split mode shows raw markdown on the left and rendered output on the right.
Uses react-markdown (new dep).

T3 #11: export trace (pages/RagVisualizer.jsx)

Three new buttons above the pipeline diagram:
- 📋 Copy JSON: writes full trace to clipboard
- ⬇ .md: downloads a formatted markdown report (query, DB lookup, hybrid retrieval ranking, fusion block, LLM reply)
- ⬇ .json: downloads the raw trace JSON

Live alongside the existing "💾 Save trace" button.

T3 #12: bulk upload in Vault (pages/Vault.jsx)

File input becomes <input multiple>. upload() iterates every selected file, catches per-file errors, reports "Uploaded N/M files · K ingested." when multi.

T3 #13: error boundary (components/ErrorBoundary.jsx + App.jsx)

React class component wraps <AppRouter/>. Catches render errors with a recoverable fallback card (try again / reload app) instead of blanking the page.

# Login / app plumbing

Login.jsx now also stashes username in localStorage and honors the post_login_redirect bounce. components/Layout.jsx gets new sidebar links for super_admin (Cross-tenant Search / Usage / Audit Log / Settings).

# Tests

24/24 pytest green. Warning count dropped from 83 to 12 due to the datetime.utcnow() cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add basic chatbot components

e0a8596

danrixd added the codex label Aug 1, 2025 — with ChatGPT Codex Connector

danrixd merged commit 843fe92 into main Aug 1, 2025

danrixd deleted the codex/implement-conversation-manager,-intent-recognition,-and-resp branch August 1, 2025 06:08

danrixd merged 1 commit into main from codex/implement-conversation-manager,-intent-recognition,-and-resp
