Task 158 follow-up: enrich LLM diagnostic suggestions from provider_message text #354

## Context

Task 158 §40 added `provider_message: str | None` to every `LLMCallError`, exposing the raw upstream LiteLLM/provider exception text in `_diagnostic_context["provider_message"]`. Agents reading the diagnostic JSON now see what the provider said (the WHY) alongside pflow's wrapped framing (the WHAT).

However, pflow's own remediation suggestions (`Diagnostic.suggestions`) are static per `(error_class, kind/reason)` pair: they do not yet use `provider_message` to surface provider-specific context that is already in hand.

## Problem

Generic suggestions miss provider-specific actionable detail that's already captured in `provider_message`. Concrete examples:

**`UnknownModelError(reason="unknown_name")`** for a deprecated model:
- `provider_message`: `"Model 'claude-2.1' was retired on 2025-07-21 — use claude-3-5-sonnet instead"`
- Current suggestions: "Check the model name against the provider's current model catalogue."
- Could surface: "Provider says: model retired. Try the replacement they suggested."

**`MissingSdkError`** when `provider_message` already contains the install command:
- `provider_message`: `"Google Cloud SDK not found. Install it with: pip install 'litellm[google]'"`
- Current suggestions (already good): the package name is parsed and shown in the install hint.
- Could be more faithful to the provider: surface the upstream "install with" text verbatim when present, not only the parsed package name.

**`InvalidRequestError`** for context-window overflow:
- `provider_message`: `"Request exceeds maximum context length of 200000 tokens (got 215431)"`
- Current suggestions: "Check the request shape against the provider's documentation."
- Could surface: "Provider says request was 215431 tokens; the model's max is 200000. Reduce prompt size or use a model with a larger context."

**`InvalidRequestError`** for content policy violations:
- `provider_message`: `"Content policy violation: prompt may be unsafe"`
- Current suggestions: generic.
- Could surface: "Provider blocked the prompt as policy-violating. Adjust the prompt to avoid the trigger."

## Scope

Add light-touch enrichment in each subclass's `to_diagnostics()` override that recognizes a small, stable set of patterns in `self.provider_message` and appends targeted suggestions when matched. Important constraints:

- **Pattern detection must stay narrow**: substring-match only on stable provider phrases (e.g. `"retired"`, `"context length"`, `"content policy"`, `"quota"`). Don't try to parse free-form provider text.
- **Always fall back to generic suggestions**: enrichment is additive, not a replacement. If no pattern matches, suggestions stay as they are today.
- **Document the recognized patterns** in the subclass docstring so future maintainers know which substrings the diagnostic is looking for and can update if a provider rewords.
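
The constraints above can be sketched roughly as follows. This is a minimal illustration, not pflow's actual code: `enrich_suggestions` and `INVALID_REQUEST_PATTERNS` are hypothetical names, the suggestion wording is illustrative, and lowercasing the message before matching is an assumption the issue does not settle.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical pattern table: (stable provider phrase, extra suggestion).
# The phrases are taken from the examples in this issue.
INVALID_REQUEST_PATTERNS: Sequence[Tuple[str, str]] = (
    (
        "context length",
        "Provider reports a context-length overflow. "
        "Reduce prompt size or use a model with a larger context.",
    ),
    (
        "content policy",
        "Provider blocked the prompt as policy-violating. "
        "Adjust the prompt to avoid the trigger.",
    ),
)


def enrich_suggestions(
    generic: Sequence[str],
    provider_message: Optional[str],
    patterns: Sequence[Tuple[str, str]],
) -> List[str]:
    """Return the generic suggestions plus any pattern-matched extras."""
    suggestions = list(generic)  # generic suggestions always survive
    if not provider_message:
        return suggestions
    lowered = provider_message.lower()  # assumption: case-insensitive match
    for phrase, extra in patterns:
        if phrase in lowered:  # plain substring match, no parsing
            suggestions.append(extra)
    return suggestions


generic = ["Check the request shape against the provider's documentation."]
message = "Request exceeds maximum context length of 200000 tokens (got 215431)"
for line in enrich_suggestions(generic, message, INVALID_REQUEST_PATTERNS):
    print(line)
```

Keeping the phrases in one table per subclass would also give the docstring a single list to mirror, so a reworded provider message means editing one entry.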

## Tradeoff vs typed sub-discriminators

There's a related issue proposing typed sub-discriminators on `MissingApiKeyError` (`kind="quota_exceeded"`, etc.). The two approaches differ:

| Approach | Where detection lives | Stability |
|---|---|---|
| **Typed sub-discriminator** (the other issue) | `_classify_litellm_error` at the seam | Detection is centralized; consumers branch on a stable enum; one place to update if provider rewords. |
| **Suggestion enrichment** (this issue) | Each subclass's `to_diagnostics()` | Detection is co-located with the suggestion text; cheaper; doesn't preclude typed discriminators later. |

These are complementary, not alternative:
- Typed discriminators are the right shape for sub-cases that need *different remediation pathways* (quota → billing UI; suspended → contact support). Worth the architectural investment.
- Suggestion enrichment is right for sub-cases where the same remediation pathway applies but with provider-specific *detail* (deprecated-model name, exceeded-token-count). Lighter touch.

If both ship: the seam-side discriminators set `kind`, and `to_diagnostics()` may further enrich suggestions from `provider_message` text within a `kind` bucket.
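
If both approaches ship, the layering might look roughly like the sketch below. Everything in it is hypothetical: `classify_kind` stands in for the detection that would live in `_classify_litellm_error`, and the `kind` values other than `quota_exceeded`, the `"monthly"` phrase, and all suggestion texts are placeholders rather than pflow's real enum or wording.

```python
from typing import Dict, List, Optional, Tuple


def classify_kind(provider_message: Optional[str]) -> str:
    """Seam-side step (hypothetical): map provider text to a stable kind."""
    text = (provider_message or "").lower()
    if "quota" in text:
        return "quota_exceeded"
    return "missing"  # placeholder default kind


# Base suggestions chosen by kind: different kinds, different pathways.
BASE_SUGGESTIONS: Dict[str, List[str]] = {
    "quota_exceeded": ["Check usage and billing in the provider's dashboard."],
    "missing": ["Set the API key for this provider."],
}

# Detail enrichment within a kind bucket: same pathway, more specific text.
DETAIL_PATTERNS: Dict[str, List[Tuple[str, str]]] = {
    "quota_exceeded": [
        ("monthly", "Provider mentions a monthly limit; it may reset next cycle."),
    ],
}


def suggestions_for(kind: str, provider_message: Optional[str]) -> List[str]:
    """Diagnostic-side step: start from the kind's base list, then enrich."""
    suggestions = list(BASE_SUGGESTIONS[kind])
    lowered = (provider_message or "").lower()
    for phrase, extra in DETAIL_PATTERNS.get(kind, []):
        if phrase in lowered:
            suggestions.append(extra)
    return suggestions


message = "Monthly quota exceeded for this API key"
kind = classify_kind(message)
print(kind, suggestions_for(kind, message))
```

The split keeps the consumer-facing branch (`kind`) stable, while the substring tables stay local and can change without touching the enum.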

## Why this is worth doing even with the typed-discriminator option

- **Lower architectural cost** — small per-subclass changes, no enum extensions, no seam-side detection logic.
- **Wins on patterns that don't deserve a discriminator** — e.g. surfacing "retired on" model dates is useful but doesn't justify a new `UnknownModelError(reason)` value.
- **Can ship incrementally** as patterns surface; doesn't require a coordinated change.

## Why deferred from Task 158

Task 158's structural pass already added `provider_message` to the diagnostic context. Suggestion enrichment is an extension that's nice-to-have but not load-bearing — agents reading the JSON output already get the raw text and can render it themselves. Filing as follow-up so the surface is captured for prioritization.

## Scope of implementation

- 3-5 stable patterns per subclass to start.
- Tests in each subclass's test file (`test_llm_client.py::TestLLMDiagnostics`) verifying that recognized patterns produce the enriched suggestion AND that unrecognized text falls back cleanly to the generic suggestion.
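
A pair of tests along these lines could cover both requirements. This is a sketch under stated assumptions: `FakeUnknownModelError` is a hypothetical stand-in so the example runs on its own, and its `suggestions()` method is not pflow's real `to_diagnostics()` API. The real tests would construct the actual subclass and inspect `Diagnostic.suggestions`.

```python
from typing import List, Optional

GENERIC = "Check the model name against the provider's current model catalogue."
ENRICHED = "Provider says: model retired. Try the replacement they suggested."


class FakeUnknownModelError(Exception):
    """Hypothetical stand-in for UnknownModelError; not the real class."""

    def __init__(self, provider_message: Optional[str] = None) -> None:
        super().__init__("unknown model")
        self.provider_message = provider_message

    def suggestions(self) -> List[str]:
        out = [GENERIC]
        # Recognized pattern (assumed): the substring "retired".
        if self.provider_message and "retired" in self.provider_message.lower():
            out.append(ENRICHED)
        return out


def test_recognized_pattern_adds_enriched_suggestion() -> None:
    err = FakeUnknownModelError("Model 'claude-2.1' was retired on 2025-07-21")
    assert err.suggestions() == [GENERIC, ENRICHED]


def test_unrecognized_text_falls_back_to_generic() -> None:
    err = FakeUnknownModelError("Unexpected upstream failure")
    assert err.suggestions() == [GENERIC]


def test_missing_provider_message_falls_back_to_generic() -> None:
    assert FakeUnknownModelError(None).suggestions() == [GENERIC]
```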

## References

- `provider_message` introduction: `.taskmaster/tasks/task_158/implementation/progress-log.md` §40.
- Existing static suggestions: `src/pflow/core/exceptions.py` per-subclass `to_diagnostics()` overrides.
- Related (typed sub-discriminators on `MissingApiKeyError`): the other Task 158 follow-up issue.
