[FEATURE] `crewai[dspy]` — Algorithmic prompt optimization via the existing LLM hooks

### Feature Area

Core functionality

### Is your feature request related to a an existing bug? Please link it here.

NA

### Describe the solution you'd like

## Problem

Developers building production CrewAI applications spend significant manual effort tuning `role`, `goal`, `backstory`, and task `description` fields by hand — iterating on prompts without a systematic method to measure improvement. This is the classic "prompt engineering treadmill": changes are based on intuition, results are hard to reproduce, and there is no objective signal for when a crew is actually better.

The community has been solving this with workarounds for over a year:
- **Paid course**: [Building Smart AI Agents: CrewAI & DSPy Prompt Optimization](https://www.udemy.com/course/crewai-dspy-optimization/) — community paying to learn how to monkey-patch the framework
- **Instructional repo**: [Ronoh4/dspy_crewai_course](https://github.com/Ronoh4/dspy_crewai_course) — 60+ stars, monkey-patching `crewai`'s LLM call chain (tested against CrewAI 0.152.0 + DSPy 2.6.27)
- **Cross-framework optimizer targeting CrewAI directly**: [Optimas/SuperOptiX](https://dev.to/shashikant86/optimas-superoptix-global-reward-optimization-for-dspy-crewai-autogen-and-openai-agents-sdk-ehb)
- Related closed issues: [#3280](https://github.com/crewAIInc/crewAI/issues/3280), [#3015](https://github.com/crewAIInc/crewAI/issues/3015)

Every one of these workarounds monkey-patches the internal LLM call chain because there is no stable, documented seam to hook into — meaning they break on every CrewAI release.

---

## Proposed Solution

**Ship a `crewai[dspy]` optional extra** that provides a `DSPyOptimizer` class — a thin adapter between CrewAI's existing infrastructure and DSPy's optimization algorithms (MIPROv2, BootstrapFewShot, GEPA, etc.).

### Key insight: the infrastructure already exists

The LLM hooks system introduced in [#1875](https://github.com/crewAIInc/crewAI/issues/1875) provides almost everything needed:

```python
# crewai/hooks/llm_hooks.py — already shipped
from crewai.hooks.llm_hooks import (
    register_before_llm_call_hook,
    register_after_llm_call_hook,
    LLMCallHookContext,
)

# LLMCallHookContext already exposes:
# context.messages    — the full composed prompt (mutable in-place)
# context.agent       — the agent (role, goal, backstory, system_template)
# context.task        — the task (description, expected_output)
# context.crew        — the crew instance
# context.response    — the LLM's response (in after hooks)
```

A `crewai[dspy]` adapter would use these hooks to:
1. **During optimization runs**: capture `(messages, response)` pairs and score them against a developer-provided metric function
2. **After convergence**: write optimized instructions back to `agent.role`, `agent.goal`, `agent.backstory`, or `agent.system_template`
3. **At inference time**: inject optimized few-shot examples into `context.messages` via a before hook

This is structurally identical to how `crewai[mem0]` plugs into the memory system — a framework-level optional extra, not a runtime tool.

### Developer experience: before vs. after

**Before (current workaround — breaks on CrewAI updates):**
```python
import dspy
from crewai import Crew, Agent, Task

# Monkey-patch the internal LLM method (version-coupled, fragile)
original_call = crew.agents[0].llm._call
def patched_call(prompt, **kwargs):
    response = original_call(prompt, **kwargs)
    dspy_module.update(prompt, response)
    return response
crew.agents[0].llm._call = patched_call

# Run optimizer separately with no awareness of crew structure
optimizer = dspy.MIPROv2(metric=my_metric)
# ... glue code to connect DSPy signatures to CrewAI agents ...
```

**After (proposed `crewai[dspy]`):**
```python
from crewai import Crew, Agent, Task
from crewai.optimizers.dspy import DSPyOptimizer  # crewai[dspy]

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
)

def quality_metric(example, prediction) -> float:
    """Score output quality — any callable returning 0.0–1.0."""
    judge = dspy.ChainOfThought("output -> score: float")
    return float(judge(output=prediction.final_output).score)

optimizer = DSPyOptimizer(
    crew=crew,
    metric=quality_metric,
    algorithm="MIPROv2",           # or "BootstrapFewShot", "GEPA"
)

# Run optimization against labeled examples
result = optimizer.compile(trainset=my_examples, num_trials=20)

# Optimized crew — same interface, better prompts
optimized_crew = result.crew
optimized_crew.kickoff(inputs={"topic": "climate change"})

# Inspect what changed
print(result.score_delta)           # +0.18
print(result.optimized_instructions) # dict of agent_role -> new instructions
```

---

## Implementation Scope

This proposal is scoped to **three independently-reviewable PRs** to stay within the contributing guide's `size/XL` threshold:

### PR 1 — Stable read/write access to agent instructions (in core)

Confirm or add a documented, public path to read and write the effective instructions of an agent after construction. Today `agent.role`, `agent.goal`, `agent.backstory`, `agent.system_template`, and `agent.prompt_template` are all writable Pydantic fields, but this is undocumented as a public API.

**Ask**: Add a doc comment confirming these fields are stable and intended for programmatic rewrite, and add a helper `agent.get_effective_system_prompt() -> str` if one doesn't already exist.

**Files**: `lib/crewai/src/crewai/agent/core.py`
**Size**: < 50 lines

### PR 2 — `DSPyOptimizer` as `crewai[dspy]` optional extra (in core)

Add `lib/crewai/src/crewai/optimizers/dspy_optimizer.py` and declare the optional extra:

```toml
# lib/crewai/pyproject.toml
[project.optional-dependencies]
dspy = ["dspy>=2.5,<3"]
```

The `DSPyOptimizer` class:
- Registers before/after LLM call hooks during the optimization loop
- Uses `before_kickoff_callbacks` / `after_kickoff_callbacks` on `Crew` to demarcate runs
- Delegates to `crew.train()` mechanics for the outer training loop
- Writes optimized instructions back via the documented agent fields from PR 1
- Returns an `OptimizationResult` dataclass with `crew`, `score_delta`, `optimized_instructions`, `version_id`

**Files**: `lib/crewai/src/crewai/optimizers/__init__.py`, `lib/crewai/src/crewai/optimizers/dspy_optimizer.py`, `lib/crewai/pyproject.toml`
**Size**: ~300 lines

### PR 3 — Example in `crewai-examples`

End-to-end notebook: email-drafting crew optimized with MIPROv2 against an LLM-judge metric. Adapts the working monkey-patch tutorial at [Ronoh4/dspy_crewai_course](https://github.com/Ronoh4/dspy_crewai_course) into the clean API.

---

## What This Is NOT

To be explicit about scope, given the history of related closed issues:

- **Not an "auto-improve" feature**: `DSPyOptimizer` does not run automatically or connect to any hosted service. It is an offline, developer-invoked, local optimization loop — no different in kind from `crew.train()`.
- **Not a replacement for manual prompt crafting**: It is a tool for developers who want to measure and improve their crews against a metric they define.
- **Not a hosted prompt-management product**: It stores optimized configs locally. It does not touch CrewAI Enterprise, AMP, or any cloud observability feature.
- **Not a new dependency in the default install**: `dspy` is only installed when a developer explicitly runs `pip install crewai[dspy]`.

---

## Acceptance Criteria

- [ ] `pip install crewai[dspy]` succeeds without errors
- [ ] `DSPyOptimizer(crew, metric).compile(trainset)` runs an optimization loop and returns an `OptimizationResult`
- [ ] The optimized crew returned by `result.crew` produces measurably better outputs on the training metric than the baseline
- [ ] No change in behavior when `crewai[dspy]` is not installed (no import at module level)
- [ ] The before/after LLM hook registration is cleaned up after `compile()` completes (no global state leak)
- [ ] Works with any LLM provider supported by CrewAI (tested with at least OpenAI and Anthropic)
- [ ] Example notebook runs end-to-end in `crewai-examples`

---

## Additional Context

**Prior art in CrewAI:**
- [#1875](https://github.com/crewAIInc/crewAI/issues/1875) — DSPy-style callbacks accepted and shipped → the contribution shape this proposal follows
- `crew.train()` / `TaskEvaluator` — the existing training loop this optimizer extends
- `crewai[mem0]` — the optional-extra packaging pattern this follows

**Prior art elsewhere:**
- [Haystack + DSPy integration](https://haystack.deepset.ai/cookbook/prompt_optimization_with_dspy) — framework-level integration, not a runtime tool
- [MLflow DSPy autologging](https://mlflow.org/docs/latest/genai/flavors/dspy/optimizer/) — pattern for logging optimizer runs that `DSPyOptimizer` could emit events compatible with

**Willingness to contribute:** Yes — happy to submit PR 1 and PR 2 if the maintainer team signals interest in the approach. Would appreciate a comment confirming the proposed file locations and optional-extra name before starting.

---

*Filed against: `crewAIInc/crewAI` main branch*
*Related: #1875, #3280, #3015*
*Label suggestion: `feature-request`, `integration`, `Core functionality`*

### Describe alternatives you've considered

## Alternatives Considered

**1. Leave it to the community (status quo)**
The workaround ecosystem (courses, monkey-patch tutorials, third-party optimizers) handles it. *Rejected*: the workarounds break on every CrewAI release because they patch internals. A stable hook surface in core prevents this fragility — even if CrewAI never ships `DSPyOptimizer` itself, the hooks protect the community from churn.

**2. Standalone `crewai-dspy` package (not in core)**
Ship the adapter entirely outside the monorepo. *Considered*: this is viable and has precedent (LangChain's approach). *Not preferred*: the `crewai[mem0]` pattern (optional extra in core) gives tighter CI coupling — changes to hooks or agent internals are caught in the same test suite that tests the optimizer, rather than silently breaking a downstream package. The path for `crewai-tools` (external → absorbed into monorepo) suggests the maintainers prefer eventual consolidation.

**3. Only add the hook documentation (PR 1 only)**
Document the existing LLM hooks as the official stable seam without shipping an adapter. *Also acceptable* as a first step if the maintainer team prefers to let the community build the adapter first.

### Additional context

_No response_

### Willingness to Contribute

Yes, I'd be happy to submit a pull request

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE] `crewai[dspy]` — Algorithmic prompt optimization via the existing LLM hooks #5818

Feature Area

Is your feature request related to a an existing bug? Please link it here.

Describe the solution you'd like

Problem

Proposed Solution

Key insight: the infrastructure already exists

Developer experience: before vs. after

Implementation Scope

PR 1 — Stable read/write access to agent instructions (in core)

PR 2 — `DSPyOptimizer` as `crewai[dspy]` optional extra (in core)

PR 3 — Example in `crewai-examples`

What This Is NOT

Acceptance Criteria

Additional Context

Describe alternatives you've considered

Alternatives Considered

Additional context

Willingness to Contribute

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[FEATURE] crewai[dspy] — Algorithmic prompt optimization via the existing LLM hooks #5818

Description

Feature Area

Is your feature request related to a an existing bug? Please link it here.

Describe the solution you'd like

Problem

Proposed Solution

Key insight: the infrastructure already exists

Developer experience: before vs. after

Implementation Scope

PR 1 — Stable read/write access to agent instructions (in core)

PR 2 — DSPyOptimizer as crewai[dspy] optional extra (in core)

PR 3 — Example in crewai-examples

What This Is NOT

Acceptance Criteria

Additional Context

Describe alternatives you've considered

Alternatives Considered

Additional context

Willingness to Contribute

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

[FEATURE] `crewai[dspy]` — Algorithmic prompt optimization via the existing LLM hooks #5818

PR 2 — `DSPyOptimizer` as `crewai[dspy]` optional extra (in core)

PR 3 — Example in `crewai-examples`