Chat services return success with empty response when LLM output is unparseable or empty #484

@elias-ba

Description

What's broken

Both workflow_chat and global_chat can return HTTP 200 with response: "" (empty string) when the LLM produces output that the service can't use. There is no legitimate "empty response" path, so this silently degrades the contract: callers see a success and have no way to distinguish "the model said nothing useful" from "the model's normal answer".

In services/workflow_chat/workflow_chat.py, split_format_yaml initialises output_text = "" and only fills it from response_data.get("text", "") after JSON parsing succeeds. If the LLM returns text that isn't valid JSON (or JSON without a text field), the except branch logs an error and output_text stays empty. The wrapper still returns {"response": "", ...} with status 200.
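A condensed sketch of that path and the proposed change (the real body of split_format_yaml differs in detail, and the ApolloError stub here is illustrative):

```python
import json


class ApolloError(Exception):
    """Illustrative stand-in for the service's real ApolloError."""


def split_format_yaml(llm_output: str) -> str:
    """Condensed version of the parsing path described above.

    Today the except branch only logs, so output_text can leave this
    function empty; raising instead closes the 200-with-"" hole.
    """
    output_text = ""
    try:
        response_data = json.loads(llm_output)
        output_text = response_data.get("text", "")
    except json.JSONDecodeError:
        # Current behaviour: log and fall through with output_text == "".
        # Proposed behaviour: surface the failure to the caller.
        raise ApolloError("LLM output was not valid JSON")
    if not output_text:
        raise ApolloError("LLM output had no 'text' field")
    return output_text
```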

In services/global_chat/planner.py, final_text is initialised to "" and only filled by _extract_text(response) on end_turn, on the max_tool_calls exit, or on an unexpected stop_reason. If the model produces a response with no text blocks (only tool_use or thinking) and the loop exits without end_turn, _extract_text returns an empty string. The service logs Loop exited without end_turn but still returns 200 with response: "".
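To illustrate why a tool-only response yields an empty string, here is a minimal sketch of the extraction step (Block is an illustrative stand-in for an Anthropic-style content block, not the service's actual type):

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Illustrative stand-in for an Anthropic-style content block."""
    type: str
    text: str = ""


def extract_text(content: list) -> str:
    # Only "text" blocks contribute; tool_use and thinking blocks carry
    # no user-visible text, so a response with only those yields "".
    return "".join(b.text for b in content if b.type == "text")
```

When the loop then exits without end_turn, that empty string is exactly what gets wrapped in the 200 response.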

How it surfaced

In Lightning, the AI Assistant background worker treats Apollo's 200 as success and tries to persist the assistant turn. The ChatMessage changeset requires content (1 to 10,000 characters), so the insert fails with content: "can't be blank". The user-side message stays stuck in :processing and the user sees no response and no error indication.

Surfaced as Sentry alert LIGHTNING-1MP and tracked downstream at OpenFn/lightning#4710.

What to fix

When the LLM output can't be parsed (workflow_chat) or produces no text content (global_chat), the service should raise ApolloError (or include an explicit error signal in the response body), not return 200 with empty text.

Both services already use ApolloError for other failure modes (auth, rate limit, connection), and Lightning's handle_error_response already routes those into the error tuple cleanly, so the fix should fit the existing shape without new contracts.
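Both return paths could share a small guard along these lines (a sketch only; the real ApolloError signature and status code are assumptions, and the actual fix may sit inside each service rather than in a shared helper):

```python
class ApolloError(Exception):
    """Stand-in for the project's error type; the real class presumably
    carries a status that Lightning's handle_error_response maps."""

    def __init__(self, status: int, message: str):
        self.status = status
        self.message = message
        super().__init__(message)


def require_text(text: str, source: str) -> str:
    """Raise instead of letting an empty LLM answer ride out as a 200."""
    if not text or not text.strip():
        raise ApolloError(502, f"{source}: LLM produced no usable text")
    return text
```

Calling require_text(output_text, "workflow_chat") / require_text(final_text, "global_chat") just before building the success payload would turn today's silent empty responses into errors that flow through the existing error-handling path.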
