feat(crew): add consensual process with pluggable consensus engine #5691
Closed
gkotsia wants to merge 1 commit into
Contributor
Hey, two things:
Author
@greysonlalonde Ok, I've opened the related issue here: #5708
Title:
feat(crew): add consensual process with pluggable consensus engine

Labels:
`llm-generated` (required: this PR was authored with AI assistance)

Why merge this
`Process.consensual` has been a `TODO` in `process.py` since the original three-process design. This PR ships it, with three properties that matter to maintainers and users:

- The `ConsensusEngine` Protocol plus entry-point discovery lets third-party libraries plug in via `pip install`. The reference engine, Snowveil (a probabilistic Borda-CHB protocol), is published on PyPI today; the same pattern accommodates future plugins (Ranked Pairs, weighted voting, capability-based scoring, etc.). CrewAI itself ships only `MajorityVoteConsensus`: no new runtime imports, no version constraints to maintain.
- `Process.hierarchical` requires configuring a separate `manager_llm` (typically a stronger, costlier model) to dispatch each unowned task. `Process.consensual` polls the existing agents in parallel instead, with no extra model to configure, audit, or pay separately for. Trade-off: more total LLM calls per task (N agent rankings vs 1 manager call), but on the agents' existing model configs and in parallel; net spend depends on which side has the more expensive model.
- Backward-compatible. Existing crews are untouched. The new process is opt-in (`process=Process.consensual`), the new field is opt-in (`consensus=...` defaults to `None`), and unmodified Protocol clients continue to work.

What the user sees
Snowveil is the reference third-party engine, a probabilistic ranked-preference protocol from arxiv:2512.18444 (Kotsialou). It is already published on PyPI (`pip install snowveil`) and works against this PR today.

Summary of changes
- `Process.consensual`: implements the third process mode. Tasks without an explicit `agent` are dispatched by polling every other agent for a ranked preference and aggregating ballots.
- `ConsensusEngine` Protocol + `MajorityVoteConsensus` default: `@runtime_checkable`, typed, pluggable. CrewAI ships only the trivial baseline; richer engines live in third-party packages.
- `discover_engines()`: supports both Python entry points (`crewai.consensus_engines` group) and a small built-in fallback registry. `Crew(consensus="snowveil")` resolves automatically when Snowveil is installed; broken plugins log a warning and are skipped rather than crashing an unrelated crew.
- Task descriptions are wrapped in `<task>` tags, length-capped at 2000 chars, and explicitly marked as untrusted input. This is centralised in `build_handler_ranking_prompt()` so all consensus engines share the same hardening.

What this PR is not about
Cross-host or cross-organisational crews. CrewAI today is a single-process framework: every `Crew` runs in one Python address space. `Process.consensual` operates within one crew's agents and uses Snowveil's in-process mode (`InMemoryTransport`); it does not touch the network. A future integration could use Snowveil's distributed `WebSocketTransport` to enable federated decision-making across organisations or hosts, but that's a separate design conversation (likely a `FederatedCrew` primitive, not a `Process.consensual` configuration) and is intentionally out of scope here.

Files
- `lib/crewai/src/crewai/consensus.py`: `help(crewai.consensus)` surfaces the integration path.
- `lib/crewai/src/crewai/crew.py`: `consensus` field + validator (instance or string name), `_run_consensual_process`, `_collect_handler_rankings` (parallel via `ThreadPoolExecutor`), `_agent_by_role`, `_require_unique_agent_roles`.
- `lib/crewai/src/crewai/process.py`: `Process.consensual` enum value.
- `lib/crewai/tests/test_consensus.py`: `Process.consensual` end-to-end.

Total: ~1,184 insertions, 2 deletions across 4 files (`uv.lock` excluded).

Design notes
- `consensus: Any` field type. Pydantic can't generate a schema for a `Protocol`, so the field is annotated `Any` and validated structurally at runtime via `isinstance` against the `@runtime_checkable` Protocol. Strings are resolved by name first.
- `discover_engines()` iterates `importlib.metadata.entry_points(group="crewai.consensus_engines")` first (the future path for any plugin), then merges in `_KNOWN_ENGINE_IMPORT_PATHS` (a small dict, currently just `snowveil`, which was published before adopting the entry-point convention). Entry points always win. Failed loads log a `WARNING` and are skipped, so a broken third-party engine never crashes an unrelated crew. Results are cached via `functools.cache`.
- `_MIN_RANKING_RATIO = 0.5`. If fewer than half of the agents return a parseable ballot, `_collect_handler_rankings` raises rather than picking a handler from a tiny minority.
- Rankings are collected in parallel via `ThreadPoolExecutor`, since `agent.execute_task` is synchronous.
- `parse_role_ranking` is algorithmically equivalent to a parser in Snowveil; CrewAI cannot depend on Snowveil, so the duplication is intentional.

Test plan
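To make the `parse_role_ranking` cases in this test plan concrete, here is a minimal sketch of a parser with the same shape: try a JSON array first, fall back to first-appearance order, and raise when the ranking is incomplete. It is not the PR's implementation; the name `parse_role_ranking_sketch`, the regex, and the error message are assumptions made for illustration.

```python
import json
import re


def parse_role_ranking_sketch(text: str, roles: list[str]) -> list[str]:
    """Return `roles` ordered by the preference expressed in `text`.

    Hypothetical stand-in for the PR's parser; assumes `roles` are unique.
    """
    expected = set(roles)

    # 1) Look for a JSON array anywhere in the reply (covers strict JSON
    #    as well as JSON embedded in surrounding prose).
    for match in re.finditer(r"\[[^\[\]]*\]", text):
        try:
            candidate = json.loads(match.group(0))
        except json.JSONDecodeError:
            continue
        if (
            isinstance(candidate, list)
            and all(isinstance(item, str) for item in candidate)
            and len(candidate) == len(roles)
            and set(candidate) == expected
        ):
            return candidate
        # A partial or malformed array falls through to the text fallback.

    # 2) Fallback: order roles by where each one first appears in the text.
    #    Plain substring search, so overlapping role names would need care.
    positions = {role: text.find(role) for role in roles}
    if all(pos >= 0 for pos in positions.values()):
        return sorted(roles, key=lambda role: positions[role])

    # 3) Some role is missing entirely: refuse to guess.
    raise ValueError("could not parse a complete ranking from the reply")
```

Requiring every role to be present before accepting a ballot keeps a truncated reply from silently turning into a partial ranking.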
45 tests in `lib/crewai/tests/test_consensus.py`, all passing locally (~1s, no network):

- `MajorityVoteConsensus`: single voter, majority winner, candidate-order tie-break (and reversed), empty rankings, empty ballot, unknown candidate, `runtime_checkable` Protocol matching.
- `_validate_ballots`: accepts complete ballot, rejects empty rankings / per-voter ballot / unknown candidate.
- `parse_role_ranking`: strict JSON, JSON in surrounding text, first-appearance fallback, partial JSON falls through, unparseable raises, partial text match raises.
- `build_handler_ranking_prompt`: task and roles included, marked UNTRUSTED, length-capped, empty description handled.
- `consensus` field validator: defaults to `None`; accepts engine instance; accepts string name; rejects non-engine, unknown name (with installed-engines list), and empty string (dedicated error); instance path does not call `discover_engines`.
- `discover_engines()` happy paths: built-in `majority` always present; entry points discovered; fallback registry resolves when module importable; cache returns same dict until cleared; two named plugins coexist; entry points override fallback for the same name.
- `discover_engines()` defensive paths: fallback skipped silently when not installed; fallback raising non-`ImportError` logs warning; fallback missing attribute logs warning; entry-point load raising logs warning; entry point returning a non-class is rejected with a warning; duplicate entry-point names log a collision warning.
- `Process.consensual`: unanimous winner assigned, explicit `task.agent` not overridden, duplicate roles raise, low quorum raises, custom `ConsensusEngine` honoured over default.

Checks:

- `uv run ruff check lib/`: clean.
- `uv run ruff format --check lib/`: clean.
- `uv run mypy lib/crewai/`: not verified locally; `uv sync` fails on macOS x86_64 because `lancedb` (pinned `>=0.29.2,<0.30.1`) ships no x86_64 macOS wheel. Direct `mypy` on `consensus.py` reports no issues; relying on CI mypy across 3.10–3.13 for the rest.
- `pytest lib/crewai/tests/test_consensus.py`: 45 passed locally.

Open questions / follow-ups
- Should `consensus` become a typed field once Pydantic gains better Protocol support, or stay `Any` with the runtime validator?
- Should the default engine be `MajorityVoteConsensus`, or should `Process.consensual` require an explicit engine to avoid surprising users with naive plurality?
- Updates to `docs/en/concepts/processes.mdx` (and translations in `ar`, `ko`, `pt-BR`) are not in this PR; they are tracked separately.
- Examples go to `crewAIInc/crewAI-examples`, matching upstream's separation of core and samples.
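
For context on the default-engine question, the following is a minimal sketch of what a naive plurality engine behind a `@runtime_checkable` Protocol might look like. The class names `ConsensusEngineSketch` and `PluralityEngineSketch`, the `decide` method, and its signature are all hypothetical; the PR's actual `ConsensusEngine` interface may differ.

```python
from collections import Counter
from typing import Mapping, Protocol, Sequence, runtime_checkable


@runtime_checkable
class ConsensusEngineSketch(Protocol):
    """Hypothetical engine interface; the PR's real method names may differ."""

    def decide(
        self, candidates: Sequence[str], rankings: Mapping[str, Sequence[str]]
    ) -> str: ...


class PluralityEngineSketch:
    """Pick the candidate with the most first-place votes."""

    def decide(
        self, candidates: Sequence[str], rankings: Mapping[str, Sequence[str]]
    ) -> str:
        if not rankings:
            raise ValueError("no rankings supplied")
        first_choices: Counter = Counter()
        for voter, ballot in rankings.items():
            if not ballot:
                raise ValueError(f"empty ballot from {voter!r}")
            if ballot[0] not in candidates:
                raise ValueError(f"unknown candidate {ballot[0]!r}")
            first_choices[ballot[0]] += 1
        # max() returns the first maximal item, so ties go to whichever
        # candidate is listed first in `candidates`.
        return max(candidates, key=lambda name: first_choices[name])


# Four voters: "analyst" wins on first-place votes alone, even though
# half of the voters rank it last.
rankings = {
    "v1": ["analyst", "writer", "reviewer"],
    "v2": ["analyst", "writer", "reviewer"],
    "v3": ["writer", "reviewer", "analyst"],
    "v4": ["reviewer", "writer", "analyst"],
}
winner = PluralityEngineSketch().decide(["analyst", "writer", "reviewer"], rankings)
```

In this example only first choices count, so the lower positions on each ballot are ignored; that is the kind of surprise the question about requiring an explicit engine refers to.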