VirtuDirector IA is a local-first FastAPI application for SME AI consulting and prototyping. It combines:
- a guided operator workspace for SME diagnosis,
- tenant-scoped document ingestion and retrieval,
- deterministic opportunity diagnosis,
- executive proposal generation,
- implementation bundle generation,
- curated intelligence ingestion and ranked retrieval,
- local/LAN model routing through LM Studio,
- optional premium escalation through customer-owned APIs or CLIs,
- and an FDE Labs subsystem that evaluates changes before they are promoted.
This repository is not a generic chatbot template. It is an operational product and engineering lab for deciding where AI should be introduced, packaging pilots, and governing runtime choices inside small and medium-sized businesses.
License: MIT. See LICENSE.
- Describe an SME, department, or process.
- Run AI opportunity diagnosis.
- Generate an executive proposal for management.
- Generate an implementation bundle for the selected pilot.
- Create a tracked pilot with tasks, status, risks, and success metrics.
- Route work through local or optional premium models according to tenant policy.
- Validate system improvements in FDE Labs before promotion.
For a complete text-only demo script, see docs/DEMO_WALKTHROUGH.md.
- AI consultants delivering practical SME implementations.
- Forward-deployed engineers building local-first AI systems.
- SME digital transformation teams that need a structured AI adoption workflow.
- Local AI implementers working with sensitive documents.
- Legal, healthcare, real estate, accounting, public-sector, and office automation prototypes.
The repository solves six practical problems:
- Ingest client documents into a tenant-scoped knowledge layer.
- Answer AI implementation questions using deterministic scoring plus curated knowledge.
- Turn ranked opportunities into executive proposals and implementation bundles.
- Route work across local and optional premium providers under tenant runtime policy.
- Run measured lab experiments on RAG, routing, workflows, GRC, ROI, and market intelligence.
- Require human approval before promoting a measured improvement.
- backend/app/main.py: FastAPI entrypoint.
- backend/app/api/: HTTP routes.
- backend/app/core/: orchestration, opportunity scoring, executive proposals, process scanner, routing, runtime policy.
- backend/app/rag/: embeddings, ingestion, retriever, local store.
- backend/app/knowledge/: curated knowledge ingestion, compaction, retrieval, ranking.
- backend/app/security/: PII redaction, sensitivity classification, audit helpers.
- backend/app/tools/: web search, LM Studio, and premium CLI integration.
- backend/app/static/: operator UI and Labs admin UI.
- backend/app/labs/: lab definitions, schemas, service, registry, change promotion.
- backend/app/api/labs.py: HTTP surface for labs.
- scripts/smoke_labs.py: local smoke test.
- scripts/labs_quality_gate.py: deterministic validation gate for labs.
- backend/app/data/solutions_catalog.json: base catalog used by the solutions engine.
- backend/app/data/curated_intel/: versioned curated knowledge loaded into the local knowledge store.
- tests/: pytest suite.
- scripts/import_curated_intel.py: imports curated markdown into the local knowledge DB.
- scripts/recompact_knowledge_briefs.py: recomputes compact knowledge briefs with current logic.
- scripts/smoke_tests.py: HTTP smoke tests against a running backend.
The product can run in three modes:
- demo_mode=true: deterministic/demo behavior where external services are optional.
- Cloud-assisted mode: external web search and hosted model APIs enabled.
- Local/LAN mode: LM Studio provides OpenAI-compatible inference from the local machine or other machines on the LAN.
At the UI level, /app supports three product modes:
- SME: guided diagnosis and executive proposal generation.
- Consultant: diagnosis, implementation bundle generation, and intelligence exploration.
- Technical: runtime controls, scanner surfaces, and operator diagnostics.
There are two authentication paths:
For most application routes, requests are resolved to a tenant principal.
- In production: use Authorization: Bearer <jwt>.
- In development: X-Tenant-Id and X-User-Id headers are accepted.
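The two header styles for tenant-scoped routes can be sketched as a small client-side helper. This is only an illustration: the function name `tenant_headers` is hypothetical and is not part of the repository; only the header names come from the description above.

```python
from typing import Dict, Optional


def tenant_headers(
    tenant_id: str,
    user_id: str,
    jwt: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers for tenant-scoped routes (hypothetical helper).

    With a JWT, send the production-style bearer header; without one,
    fall back to the development headers.
    """
    if jwt:
        return {"Authorization": f"Bearer {jwt}"}
    return {"X-Tenant-Id": tenant_id, "X-User-Id": user_id}


# Development-style headers for a local request.
dev_headers = tenant_headers("demo-tenant", "tester")
```

The resulting dictionary can be passed to any HTTP client as request headers.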
Labs routes and the Labs admin UI use HTTP Basic auth.
Development defaults:
- username: admin
- password: change-me-admin
Change these values in .env before exposing the service outside local development.
Minimum requirements for local development:
- Python 3.11 or newer
- venv
- pip
Optional but recommended:
- PostgreSQL, if you want to match the configured DATABASE_URL
- Redis, if you plan to extend the runtime to use it
- Tesseract OCR, for scanned PDFs
- LM Studio, for local/LAN inference
The current labs persistence uses SQLite through LABS_SQLITE_PATH when set, or data/virtudirector_labs.sqlite3 by default.
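The path resolution described above could look roughly like the following sketch. The helper name `resolve_labs_db_path` is hypothetical, and treating an empty value as "not set" is an assumption; the actual backend code may behave differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default taken from the description above.
DEFAULT_LABS_DB = "data/virtudirector_labs.sqlite3"


def resolve_labs_db_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return LABS_SQLITE_PATH when set, otherwise the default path.

    Assumption: an empty string counts as unset.
    """
    env = os.environ if env is None else env
    return Path(env.get("LABS_SQLITE_PATH") or DEFAULT_LABS_DB)
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to exercise in tests.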
From the repository root:
make venv
make install
cp backend/.env.example backend/.env
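make run
Then open the application in a browser. make run serves it on 127.0.0.1:8000, and the operator workspace is at /app.
To run each step manually, create the virtual environment:
python3 -m venv backend/.venv
or:
make venv
Install dependencies:
backend/.venv/bin/pip install -r backend/requirements.txt
or:
make install
Create the environment file:
cp backend/.env.example backend/.env
Start the backend:
make run
For LAN access:
make run-lan
or:
./scripts/run_backend_lan.sh
The repository exposes these make targets: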
- make help: print available targets.
- make venv: create backend/.venv.
- make install: install backend dependencies into backend/.venv.
- make run: start FastAPI on 127.0.0.1:8000.
- make run-lan: start FastAPI on 0.0.0.0:8000.
- make smoke: run scripts/smoke_labs.py.
- make smoke-http: run scripts/smoke_tests.py against a running backend.
- make compile: run Python bytecode compilation over backend/app and scripts.
- make test: run pytest.
- make labs-quality: run deterministic lab validation.
- make recompact-intel: recompute all knowledge briefs with current compaction logic.
Environment variables are loaded from backend/.env.
Important variables:
- DEMO_MODE, ENVIRONMENT, DATA_REGION
- SEARCH_PROVIDER, BRAVE_SEARCH_API_KEY, TAVILY_API_KEY, PERPLEXITY_API_KEY, WEB_SEARCH_TIMEOUT_SECONDS, WEB_SEARCH_CACHE_TTL_SECONDS, WEB_SEARCH_DEFAULT_COUNTRY, WEB_SEARCH_DEFAULT_LANGUAGE
- DEEPINFRA_API_KEY, TOGETHER_API_KEY, FIREWORKS_API_KEY, GROQ_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, MODEL_ROUTER_CHEAP, MODEL_ROUTER_MEDIUM, MODEL_ROUTER_PREMIUM, MODEL_EMBEDDINGS
- LOCAL_LLM_ENABLED, LOCAL_LLM_PROVIDER, PREMIUM_PROVIDER, ESCALATION_ENABLED, ESCALATION_ALLOWED_INTENTS, ESCALATION_ALLOW_SENSITIVE, LOCAL_CONTEXT_LIMIT, PREMIUM_SANDBOX_DIR, CLAUDE_CLI_COMMAND, CODEX_CLI_COMMAND, PREMIUM_CLI_TIMEOUT_SECONDS, LM_STUDIO_BASE_URL, LM_STUDIO_API_KEY, LM_STUDIO_TIMEOUT_SECONDS, LM_STUDIO_CHAT_MODEL, LM_STUDIO_MODEL_CHEAP, LM_STUDIO_MODEL_MEDIUM, LM_STUDIO_MODEL_PREMIUM, LM_STUDIO_EMBEDDING_MODEL, LM_STUDIO_REMOTE_BASE_URLS, LOCAL_EMBEDDING_FALLBACK
- DATABASE_URL, REDIS_URL, LABS_SQLITE_PATH, JWT_SECRET, ADMIN_BASIC_USERNAME, ADMIN_BASIC_PASSWORD
See backend/.env.example for defaults and comments.
- GET /, GET /healthz, GET /app, GET /docs
- POST /chat, POST /opportunities/diagnose, POST /opportunities/executive-proposal, POST /opportunities/implementation-bundle, POST /process-scanner/analyze
- POST /documents, GET /documents/status
- GET /knowledge/updates/status, POST /knowledge/updates, GET /knowledge/updates, GET /knowledge/briefs, GET /knowledge/blocks, GET /knowledge/use-cases, POST /knowledge/solutions
- GET /tools/web-search/status, GET /tools/web-search/test, GET /tools/lm-studio/status, GET /tools/lm-studio/test, GET /tools/premium/status, GET /tools/runtime-policy, POST /tools/runtime-policy, POST /tools/sensitivity/analyze
- GET /labs/catalog, GET /labs/schedule/preview, POST /labs/experiments/run, GET /labs/runs, GET /labs/reports, GET /labs/reports/{report_id}, POST /labs/reports/{report_id}/decision, GET /labs/changes, GET /labs/changes/{change_id}, POST /labs/changes/{change_id}/apply, GET /labs/feature-flags
The /app interface is the main workspace for day-to-day usage.
Typical usage flow:
- Set Tenant and Company.
- Upload client documents under Documentos cliente.
- Upload curated intelligence under Intel IA diaria.
- Select the product mode (SME, Consultant, or Technical).
- Use the guided intake or write a custom question.
- Use Opportunity workbench to run a structured diagnosis.
- Generate:
  - an executive proposal,
  - an implementation bundle,
  - or both.
- Use Explorador intel to inspect:
  - curated blocks,
  - direct searches,
  - detected query intent,
  - ranking reasons.
- Use Process scanner to submit structured process descriptions and review candidate automations.
The Labs admin panel is for measured evaluation and change promotion.
Typical usage flow:
- Log in with admin Basic auth.
- Review available labs.
- Run a full experiment batch or a single lab.
- Inspect reports with baseline vs candidate evidence.
- Approve or reject reports.
- Apply approved staged changes.
- Inspect resulting feature flags.
make smoke
This command:
- initializes the local labs database,
- runs every registered lab,
- stores run records,
- stores proposed reports,
- approves one report as part of the smoke flow.
Expected output shape:
Catalog: 6 labs
Runs: 6
Reports proposed: N
...
OK
make labs-quality
This command validates that:
- every lab defined in the catalog is registered,
- every lab is deterministic across two consecutive runs,
- scores and metrics are well-formed,
- report-producing labs generate valid report drafts.
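The determinism check in the list above can be illustrated with a minimal sketch. It assumes a lab run returns a JSON-serializable dictionary; the names `is_deterministic` and `stable_lab` are hypothetical and do not reflect the actual code in scripts/labs_quality_gate.py.

```python
import json
from typing import Any, Callable, Dict


def is_deterministic(run: Callable[[], Dict[str, Any]]) -> bool:
    """Run a lab twice and compare canonical JSON encodings of the results."""
    first = json.dumps(run(), sort_keys=True)
    second = json.dumps(run(), sort_keys=True)
    return first == second


def stable_lab() -> Dict[str, Any]:
    # Hypothetical lab: fixed inputs and no clock or randomness.
    return {"score": 0.8, "metrics": {"hits": 4, "misses": 1}}


print(is_deterministic(stable_lab))
```

Serializing with sorted keys makes the comparison independent of dictionary insertion order, so only real differences in scores or metrics fail the check.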
make test
The test suite covers:
- knowledge ranking behavior,
- explainable search metadata,
- executive proposal generation,
- premium escalation behavior,
- runtime policy behavior,
- sensitivity classification,
- lab quality gate behavior,
- lab persistence,
- report approval and change application flow.
make recompact-intel
Use this after changing knowledge compaction logic in backend/app/knowledge/updates.py.
To import curated markdown into the local knowledge DB:
backend/.venv/bin/python scripts/import_curated_intel.py --date 2026-05-21
Common options:
- --date <folder>: import only one dated folder under backend/app/data/curated_intel/
- --uploaded-by <name>
- --scope global|internal
- --source-type <value>
The knowledge subsystem stores two related records:
- knowledge_updates: raw ingested documents and metadata.
- knowledge_briefs: compacted summaries used for retrieval.
The workflow is:
- ingest markdown, text, PDF, or DOCX,
- parse and normalize,
- compact into summary, tags, key points, and relevance fields,
- store a retrieval-ready brief,
- rank briefs for future operator questions.
The ranking layer currently supports:
- accent-insensitive matching,
- token expansion,
- field weighting,
- phrase boosts,
- sector boosts,
- query-intent detection,
- explainable search output.
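Two of the ranking features listed above, accent-insensitive matching and field weighting, can be sketched as follows. The field names, weights, and substring matching are illustrative assumptions, not the repository's actual ranking logic.

```python
import unicodedata
from typing import Dict

# Hypothetical field weights: a hit in the title counts more than one in the summary.
FIELD_WEIGHTS: Dict[str, float] = {"title": 3.0, "tags": 2.0, "summary": 1.0}


def fold(text: str) -> str:
    """Lowercase the text and strip combining accents (e.g. 'gestión' -> 'gestion')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def score_brief(query: str, brief: Dict[str, str]) -> float:
    """Sum the field weight once per query token found in each field."""
    tokens = fold(query).split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        haystack = fold(brief.get(field, ""))
        # Simplification: plain substring match instead of real tokenization.
        score += weight * sum(1 for token in tokens if token in haystack)
    return score


brief = {"title": "Automatización de facturas", "summary": "Gestión documental"}
print(score_brief("gestion facturas", brief))  # 4.0: title hit (3.0) + summary hit (1.0)
```

Folding both the query and the brief fields means an operator can type "automatizacion" and still match Spanish-language content that contains "Automatización".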
The labs subsystem uses a fixed lifecycle:
- a lab definition exists in backend/app/labs/catalog.py,
- a concrete evaluator class is registered in backend/app/labs/registry.py,
- the lab runs a deterministic experiment,
- the service stores a lab_run,
- if the threshold is exceeded, the lab generates a CoreReportDraft,
- a human approves or rejects the report,
- approved reports create staged changes,
- staged changes can be applied,
- applied changes create or update feature flags.
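The first half of this lifecycle (registration, a deterministic run, a threshold check, and a human decision) can be sketched in a few lines. Everything here is a simplified stand-in: `LAB_REGISTRY`, `ReportDraft`, `run_lab`, and `decide` are hypothetical names, and the real registry, CoreReportDraft, and service classes are likely richer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical in-memory registry mapping lab ids to evaluator functions.
LAB_REGISTRY: Dict[str, Callable[[], float]] = {}


def register_lab(lab_id: str):
    """Decorator that records an evaluator under its lab id."""
    def decorator(fn: Callable[[], float]) -> Callable[[], float]:
        LAB_REGISTRY[lab_id] = fn
        return fn
    return decorator


@dataclass
class ReportDraft:
    lab_id: str
    score: float
    status: str = "proposed"  # later set to "approved" or "rejected" by a human


@register_lab("example_lab")
def example_lab() -> float:
    return 0.82  # fixed value keeps the run deterministic


def run_lab(lab_id: str, threshold: float) -> Optional[ReportDraft]:
    """Run a registered lab; produce a draft only when the threshold is exceeded."""
    score = LAB_REGISTRY[lab_id]()
    return ReportDraft(lab_id, score) if score > threshold else None


def decide(report: ReportDraft, approve: bool) -> ReportDraft:
    """Record the human decision; only approved reports would stage changes."""
    report.status = "approved" if approve else "rejected"
    return report
```

Keeping the human decision as an explicit step mirrors the rule that no measured improvement is promoted without approval.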
Current labs:
- rag_grounding
- model_routing_cost
- agent_workflow
- roi_solutions
- grc_eu_ai_act
- market_intelligence
Two product outputs sit on top of the diagnosis engine:
Use POST /opportunities/executive-proposal when you need a shareable decision artifact for management.
The proposal payload includes:
- selected opportunity,
- problem statement,
- recommended solution,
- annual benefit estimate,
- setup and monthly cost ranges,
- deployment mode,
- pilot window,
- quick wins,
- 90-day roadmap,
- primary risk,
- and a concise sales message.
When persistence is enabled, the backend writes:
- proposal.json
- proposal.html
under data/executive_proposals/<timestamp-tenant-opportunity>/.
Use POST /opportunities/implementation-bundle when you want a delivery-ready package for a pilot or prototype.
The bundle generator writes:
- swarm_input.md
- execution_request.json
- review_checklist.md
- command.txt
under data/implementation_bundles/<timestamp-tenant-opportunity>/.
To add a new lab:
- Add a catalog entry in backend/app/labs/catalog.py.
- Create an evaluator in backend/app/labs/evaluators/.
- Inherit from the base lab contract.
- Register the evaluator with @register_lab("your_lab_id").
- Ensure run() is deterministic.
- Ensure build_report() returns a valid reviewable draft.
- Run:
make labs-quality
make test
make smoke
To use LM Studio on the local machine:
- Start LM Studio.
- Enable its OpenAI-compatible server.
- Set:
LOCAL_LLM_ENABLED=true
LM_STUDIO_BASE_URL=http://127.0.0.1:1234/v1
- Configure chat and embedding model names in .env.
To add a remote LM Studio node on the LAN:
- Enable network serving in LM Studio on the remote machine.
- Add its URL to:
LM_STUDIO_REMOTE_BASE_URLS=http://192.168.x.y:1234/v1,http://192.168.x.z:1234/v1
- Restart the backend.
- Inspect:
  - GET /tools/lm-studio/status
  - the Runtime panel in /app
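Since LM_STUDIO_REMOTE_BASE_URLS holds a comma-separated list, a value like the one above could be split as in the sketch below. The helper name `parse_remote_base_urls` and the example addresses are hypothetical; how the backend actually parses the variable is not shown here.

```python
from typing import List


def parse_remote_base_urls(raw: str) -> List[str]:
    """Split a comma-separated URL list, dropping blanks and surrounding spaces."""
    return [part.strip() for part in raw.split(",") if part.strip()]


# Example addresses are placeholders for machines on the LAN.
urls = parse_remote_base_urls(
    "http://192.168.1.10:1234/v1, http://192.168.1.11:1234/v1"
)
print(urls)
```

Tolerating stray spaces and empty entries keeps a hand-edited .env value from producing an invalid endpoint.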
Runtime behavior is local-first.
Each tenant can override premium routing and escalation policy without changing global settings.
Relevant routes:
- GET /tools/runtime-policy
- POST /tools/runtime-policy
- GET /tools/premium/status
- POST /tools/sensitivity/analyze
The application is local-first by default.
If a self-hosting customer wants frontier escalation for difficult tasks, they can configure one of these modes in backend/.env:
# A) local only
LOCAL_LLM_ENABLED=true
PREMIUM_PROVIDER=lmstudio
# B) local + Claude CLI
LOCAL_LLM_ENABLED=true
PREMIUM_PROVIDER=claude_cli
ESCALATION_ENABLED=true
# C) local + Codex CLI
LOCAL_LLM_ENABLED=true
PREMIUM_PROVIDER=codex_cli
ESCALATION_ENABLED=true
# D) local + hosted API
LOCAL_LLM_ENABLED=true
PREMIUM_PROVIDER=anthropic_api
ESCALATION_ENABLED=true
Behavior:
- cheap and medium tiers can remain local,
- premium can be routed separately,
- escalation is disabled by default,
- confidential and regulated content is blocked from escalation by default,
- and /tools/premium/status reports whether the selected premium backend is available.
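The two defaults in this list (escalation off, sensitive content blocked) can be expressed as a small decision function. This is a sketch under stated assumptions: the `EscalationPolicy` class, the `may_escalate` function, and the sensitivity labels are hypothetical, the two flags only mirror ESCALATION_ENABLED and ESCALATION_ALLOW_SENSITIVE, and intent filtering (ESCALATION_ALLOWED_INTENTS) is left out.

```python
from dataclasses import dataclass

# Hypothetical sensitivity labels that are blocked unless explicitly allowed.
BLOCKED_BY_DEFAULT = {"confidential", "regulated"}


@dataclass(frozen=True)
class EscalationPolicy:
    escalation_enabled: bool = False  # escalation is disabled by default
    allow_sensitive: bool = False     # sensitive content is blocked by default


def may_escalate(policy: EscalationPolicy, sensitivity: str) -> bool:
    """Return True only when the policy permits sending this content to premium."""
    if not policy.escalation_enabled:
        return False
    if sensitivity in BLOCKED_BY_DEFAULT and not policy.allow_sensitive:
        return False
    return True


print(may_escalate(EscalationPolicy(), "public"))  # False: disabled by default
```

Because both flags default to False, a tenant has to opt in twice before confidential or regulated content can leave the local runtime.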
If the client does not want to upload files manually, use:
backend/.venv/bin/python scripts/ingest_agent.py \
--base-url http://127.0.0.1:8000 \
--tenant-id demo-tenant \
--user-id ingest-agent \
--client-name "Demo SL" \
  --source-dir /absolute/path/to/source-folder
The script:
- scans allowlisted local folders,
- sends document-like files to /documents,
- sends update-like files to /knowledge/updates,
- and is designed to be run from cron or another scheduler.
Start the backend:
make run
Upload a document:
curl -X POST http://127.0.0.1:8000/documents \
  -H 'X-Tenant-Id: demo-tenant' \
  -H 'X-User-Id: tester' \
  -H 'X-Client-Name: Demo SL' \
  -F 'file=@/absolute/path/to/file.txt'
Run a diagnosis:
curl -X POST http://127.0.0.1:8000/opportunities/diagnose \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-Id: demo-tenant' \
  -H 'X-User-Id: tester' \
  -H 'X-Client-Name: Demo SL' \
  -d '{"question":"Where should we implement AI first in a 500-employee SME?","employee_count":500}'
Generate an executive proposal. Use the diagnosis response from the previous step as the diagnosis field in the request body.
curl -X POST http://127.0.0.1:8000/opportunities/executive-proposal \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-Id: demo-tenant' \
  -H 'X-User-Id: tester' \
  -H 'X-Client-Name: Demo SL' \
  -d @/absolute/path/to/executive-proposal-request.json
Run a labs experiment batch:
curl -X POST http://127.0.0.1:8000/labs/experiments/run \
  -u admin:change-me-admin \
  -H 'Content-Type: application/json' \
  -d '{"triggered_by":"manual"}'
GitHub Actions currently runs:
- make compile
- make labs-quality VENV_PYTHON=python
- make test VENV_PYTHON=python
- make smoke VENV_PYTHON=python
CI definition: .github/workflows/ci.yml
Additional repository documentation:
- docs/ARCHITECTURE.md
- docs/OPERATIONS.md
- docs/DEVELOPMENT.md
- docs/ROADMAP.md
- docs/DEMO_WALKTHROUGH.md
- docs/EXAMPLE_SME_DIAGNOSIS.md
- docs/EXAMPLE_EXECUTIVE_PROPOSAL.md
- docs/SECURITY_HARDENING.md
- docs/IMPLEMENTATION_ENGINE_EXTENSION.md
Repository-facing documentation is in English.
Curated intelligence datasets under backend/app/data/curated_intel/ intentionally remain in Spanish because they are runtime product content for Spanish-language retrieval and operator workflows.
An optional execution scaffold is also available under: