
fix(rag): skip showcase knowledge phase gracefully on invalid embedding credentials #329

@w7-mgfcode

Problem

During #324 showcase_rich validation, the knowledge phase fails with a 401/502 cascade when the embedding provider has a placeholder or invalid OpenAI key, instead of skipping gracefully (as it already does when the provider is unreachable).

Root cause

  • The demo probe only checks key presence (bool(settings.openai_api_key), app/features/demo/pipeline.py), not validity → a placeholder key reports reachable=True.
  • The indexing call then hits OpenAI and gets a 401, which OpenAIEmbeddingProvider._embed_batch catches in a broad except Exception and re-raises as a generic EmbeddingError (app/features/rag/embeddings.py).
  • /rag/index/project-docs and /rag/retrieve map EmbeddingError to 502 (app/features/rag/routes.py), leaving the demo pipeline no way to tell an auth failure from a connection failure.
  • The demo knowledge steps (rag_index_subset, rag_retrieve_probe) only skip on ctx.embedding_unreachable, so an auth failure surfaces as a hard fail.
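The chain above can be reproduced in isolation. This is a minimal sketch, not the project's code: `Settings`, `probe_reachable`, `embed_batch`, `fail_auth`, `fail_connect` and `caught_type` are hypothetical stand-ins, and `PermissionError` only plays the role of whatever the OpenAI client raises on a 401. It shows that a presence-only probe accepts a placeholder key, and that a broad handler leaves auth and connection failures indistinguishable to the caller.

```python
from dataclasses import dataclass


class EmbeddingError(Exception):
    """Generic embedding failure (name taken from the issue)."""


@dataclass
class Settings:
    # Hypothetical stand-in for the app settings object.
    openai_api_key: str = ""


def probe_reachable(settings: Settings) -> bool:
    # Presence check only: any non-empty string passes, valid or not.
    return bool(settings.openai_api_key)


def embed_batch(call):
    # Broad handler: every failure collapses into one exception type.
    try:
        return call()
    except Exception as exc:
        raise EmbeddingError(str(exc)) from exc


def fail_auth():
    # Stand-in for the provider rejecting the key with a 401.
    raise PermissionError("401 Unauthorized")


def fail_connect():
    # Stand-in for the provider being unreachable.
    raise ConnectionError("connection refused")


def caught_type(call) -> str:
    # Report which exception type the caller actually sees.
    try:
        embed_batch(call)
    except EmbeddingError as exc:
        return type(exc).__name__
    return "ok"


if __name__ == "__main__":
    print(probe_reachable(Settings(openai_api_key="sk-placeholder")))
    print(caught_type(fail_auth), caught_type(fail_connect))
```

Both failure modes reach the caller as the same type, which is why a skip condition keyed only on ctx.embedding_unreachable cannot cover the auth case.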

Fix (PR1, PRP-42)

  • Classify embedding auth failures distinctly (new EmbeddingAuthError); keep the public /rag 502 status stable but add a machine-readable problem marker.
  • Make the showcase_rich knowledge steps skip gracefully on an auth-classified failure.
  • Tighten tests/test_e2e_demo.py::test_run_demo_showcase_rich_full_epic so RAG knowledge steps may skip but must not fail.
  • Update RUNBOOKS for the new behavior.
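One possible shape for the first two items, sketched under assumptions: only the names EmbeddingError and EmbeddingAuthError come from this issue. The helper names, the `code` field and its values, the problem-body layout, and treating 403 like 401 are all hypothetical choices for illustration, not the PR's actual design.

```python
from typing import Optional


class EmbeddingError(Exception):
    """Generic embedding failure."""


class EmbeddingAuthError(EmbeddingError):
    """Provider rejected the credentials. Subclassing keeps existing handlers working."""


def classify_failure(status_code: Optional[int], message: str) -> EmbeddingError:
    # Assumption: 401 (and 403) from the provider mean bad credentials.
    if status_code in (401, 403):
        return EmbeddingAuthError(message)
    return EmbeddingError(message)


def problem_body(exc: EmbeddingError) -> dict:
    # The HTTP status stays 502 either way; only the marker differs.
    # The "code" key and its values are hypothetical.
    is_auth = isinstance(exc, EmbeddingAuthError)
    return {
        "status": 502,
        "detail": str(exc),
        "code": "embedding_auth_failed" if is_auth else "embedding_failed",
    }


def knowledge_step_outcome(problem: Optional[dict], embedding_unreachable: bool) -> str:
    # Skip when the provider is unreachable or the failure is auth-classified;
    # any other problem remains a hard fail.
    if embedding_unreachable:
        return "skip"
    if problem is None:
        return "ok"
    if problem.get("code") == "embedding_auth_failed":
        return "skip"
    return "fail"


if __name__ == "__main__":
    auth = problem_body(classify_failure(401, "invalid api key"))
    other = problem_body(classify_failure(500, "upstream error"))
    print(knowledge_step_outcome(auth, False), knowledge_step_outcome(other, False))
```

Making the auth error a subclass means existing `except EmbeddingError` handlers and the 502 mapping keep working unchanged, while the demo steps can branch on the marker instead of parsing error text.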

Out of scope

Manual dogfood + screenshots (PR2), hygiene/docs-drift (PR3), Playwright e2e.

Labels: bug
