# Generalized Cognitive Refinement Iteration

A single GCRI (Generalized Cognitive Refinement Iteration) unit is a hierarchical multi-agent system in which central coordination and field execution are separated. Rather than simply generating code, the strategy formulation, execution, verification, and evaluation stages are performed by different specialized agents, and the whole process runs in isolated sandbox environments.

This is not just an LLM wrapper: it is an agent-centric architecture in which multiple teams compete, critique, and converge to produce verified solutions.
## The GCRI Unit

A single GCRI loop functions as one unified thinking unit that can replace a traditional LLM call. Unlike simple prompt-response patterns, each GCRI unit:

- **Controls resources**: manages its own workspace, file system, and execution environment
- **Self-verifies**: an internal red team challenges every solution before results are returned
- **Learns from failures**: builds constraints that prevent repeated mistakes
- **Returns verified output**: only outputs that survive internal criticism are released
Think of it as a "super-LLM" where a single function call triggers an entire competitive ecosystem of agents working toward the same goal.
## Composability

Because a GCRI unit is a complete graph with clear input/output contracts, it can be composed into larger systems. The meta-planner (`gcri plan`) decomposes complex goals into sequential tasks and delegates each to a fresh GCRI unit. Each unit:
- Receives context from previous units
- Executes its specialized task with full agent competition
- Returns verified results to the planner
- Passes accumulated knowledge to the next unit
This enables modular reasoning where each step is internally verified before proceeding.
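The chaining described above can be sketched in a few lines. This is a hypothetical illustration, not GCRI's actual planner API: `run_unit` stands in for a real GCRI unit call, and all names are ours.

```python
# Hypothetical sketch of sequential composition of GCRI units.
# `run_unit` is a placeholder, not the real GCRI API.

def run_unit(task, context):
    """Stand-in for one GCRI unit: returns a 'verified' result string."""
    return f"result({task}) given {len(context)} prior results"

def plan_and_execute(subtasks):
    context = []  # accumulated knowledge passed from unit to unit
    for task in subtasks:
        result = run_unit(task, context)  # each unit sees prior verified output
        context.append(result)
    return context

outputs = plan_and_execute(["parse", "transform", "load"])
```

Each unit starts fresh but inherits the verified results of every unit before it, which is what makes the composition modular.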
## Management Layer

The management layer sets system direction, audits results, and makes final decisions.

| Agent | Role | Key Responsibilities | Input/Output |
|---|---|---|---|
| Strategy Generator (Strategy Planner) | Tactician | • **Multi-angle Approach**: analyzes user requirements to establish N different solution strategies, not a single solution. • **Diversity Assurance**: sets different initial directions so that the branches do not all write identical code. | **Input**: Task, Memory (Constraints). **Output**: Strategies (list of strings) |
| Decision Maker (Final Authority) | Judge | • **Gatekeeping**: coldly evaluates the validity of results from each execution branch. • **Winner Selection**: if the task is accomplished, identifies the winning branch with the best code. • **Deployment Approval**: signals the system to merge (commit) sandbox results into the original project. | **Input**: Aggregated Results, File Contexts. **Output**: Decision (bool), `best_branch_index` |
| Memory Manager (Memory Keeper) | Analyst | • **Failure Analysis**: analyzes errors and logical flaws from failed loops. • **Constraint Generation**: converts "what should never be done next time" into `ActiveConstraints` to continuously update agent intelligence. | **Input**: Global Feedback. **Output**: `ActiveConstraints` (rules) |
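The Memory Manager's constraint generation can be illustrated with a small sketch. The class and field names here are assumptions for illustration, not GCRI's actual `ActiveConstraints` schema.

```python
# Illustrative sketch of constraint accumulation from failures.
# Field names are assumptions, not GCRI's actual schema.
from dataclasses import dataclass, field

@dataclass
class ActiveConstraints:
    rules: list = field(default_factory=list)

    def add_from_failure(self, error_summary: str) -> None:
        rule = f"Never repeat: {error_summary}"
        if rule not in self.rules:  # de-duplicate repeated lessons
            self.rules.append(rule)

memory = ActiveConstraints()
memory.add_from_failure("writing files outside the sandbox")
memory.add_from_failure("writing files outside the sandbox")  # duplicate, ignored
```

The key idea is that each failed loop leaves behind a durable rule, so later iterations never pay for the same mistake twice.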
## Field Layer

The practitioner layer performs actual coding and verification within isolated sandboxes created per branch.

| Agent | Role | Key Responsibilities | Input/Output |
|---|---|---|---|
| Hypothesis Generator (Code Generator) | Coder | • **Execution**: implements the assigned strategy as actual working code. • **File Manipulation**: directly accesses the sandbox filesystem to create (`write_file`) or modify files. | **Input**: Task, Strategy. **Output**: Hypothesis (code artifacts) |
| Reasoning Agent (Refiner) | Reviewer | • **Self-Critique**: does not immediately execute the Coder's hypothesis, but first reviews it for logical leaps or missing requirements. • **Refinement**: reinforces the logic and concretizes the hypothesis. | **Input**: Hypothesis. **Output**: Reasoning (refined logic) |
| Verification Agent (Verifier) | Red Team | • **Vulnerability Search**: finds logical flaws or execution errors in the written code. • **Counter-Example Generation**: presents specific counter-examples that could break the code, testing solution robustness. • **Survival Judgment**: if the code cannot withstand a counter-example, that branch fails. | **Input**: Refined Hypothesis. **Output**: Verification (counter-example) |
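The Verifier's survival judgment reduces to a simple predicate: the hypothesis must withstand every counter-example. A toy sketch, with a made-up candidate function and inputs:

```python
# Toy sketch of the Verifier's survival judgment. The candidate function
# and the counter-examples are illustrative stand-ins.

def candidate_abs(x):  # stands in for code produced by the Coder
    return x if x >= 0 else -x

# adversarial (input, expected) pairs the red team might generate
counter_examples = [(-3, 3), (0, 0), (7, 7)]

survived = all(candidate_abs(x) == want for x, want in counter_examples)
```

If `survived` is false for any counter-example, the branch is eliminated before its output can reach the Decision Maker.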
## Example Loop

1. **Command**: The Strategy Generator analyzes the problem and issues three infiltration routes (strategies): A, B, and C.
2. **Isolation**: The system builds three mutually invisible sandboxes (workspaces) for teams A, B, and C (Smart Copy & Link).
3. **Execution**: Each team's Hypothesis Generator writes code, its Reasoning Agent refines it, and its Verification Agent attacks it and attempts to break it.
4. **Report**: Each team's survival status and results are reported to the Decision Maker.
5. **Verdict & Merge**: If the Decision Maker judges Team B's results best, the system merges only Team B's sandbox contents back into the original project.
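The branch lifecycle above can be condensed into a few lines. This is a toy sketch: the branch names, survival flags, and scores are illustrative, not real GCRI data.

```python
# Toy sketch of the branch lifecycle: branches run in isolation,
# non-survivors are discarded, and only the best survivor is merged.

branches = [
    {"name": "A", "survived": False, "score": 0.0},  # broken by a counter-example
    {"name": "B", "survived": True,  "score": 0.9},
    {"name": "C", "survived": True,  "score": 0.6},
]

survivors = [b for b in branches if b["survived"]]          # red team filter
winner = max(survivors, key=lambda b: b["score"])           # Decision Maker verdict
```

Only the winner's sandbox contents would then be merged back; the losing sandboxes are simply discarded.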
## Design Principles

- **Role separation**: Planners (Strategy), executors (Hypothesis), verifiers (Verification), and evaluators (Decision) are separated, minimizing bias and hallucination.
- **Competition**: Multiple agent teams compete in parallel to find the optimal solution.
- **Isolation**: All execution occurs in environments isolated from the main system; only verified results are exported.
- **Learning**: Failed attempts are converted into constraints, making the system smarter with each iteration.
## Prerequisites

- Python 3.12+
- Docker (for sandbox isolation)
- Environment variables in a `.env` file (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`)
## Installation

```bash
# Install from PyPI
pip install gcri

# Or clone the repository for development
git clone https://github.com/Dirac-Robot/GCRI.git
cd GCRI
pip install -e .
```

## Usage

**Single Task Mode** executes one task with multiple competing strategies:

```bash
gcri
```

Enter your task at the prompt. GCRI will spawn multiple agent teams that compete to solve it.
**Planner Mode** breaks down complex goals into sequential tasks:

```bash
gcri plan
```

The meta-planner will decompose your goal into subtasks and execute them systematically.
### Python API

```python
from gcri.config import scope
from gcri.graphs.gcri_unit import GCRI

config = scope()  # returns the config object
unit = GCRI(config)
result = unit('Write a Python script to analyze CSV files')
print(result['final_output'])
```

## Presets

GCRI provides pre-configured presets in the `presets/` directory for convenience. These are ready-to-use configurations for different use cases and model providers.
Use-case presets:

- **Balanced**: general-purpose configuration with a good speed/quality tradeoff
- **Coding Specialist**: optimized for code generation tasks
- **Deep Research**: maximum thoroughness for complex problems
- **Lightweight**: fast execution with minimal resource usage

Provider presets:

- OpenAI (GPT-4o, GPT-4o-mini)
- Anthropic (Claude 4 Sonnet, Claude 4.5 Haiku)
- Google (Gemini 2.5 Pro, Gemini 2.5 Flash)
- Mixed providers (`mixed_*.json`)
- Local models (`local_*.json`)
Use the `custom_config_path` parameter when running GCRI:

```bash
# Single task mode with preset
gcri custom_config_path=presets/gpt_5_balanced.json

# Planner mode with preset
gcri plan custom_config_path=presets/claude_deep_research.json

# Use your own custom configuration
gcri custom_config_path=/path/to/your/custom_config.json
```

## Project Structure

```
GCRI/
├── assets/                  # Project assets (logos, images)
├── gcri/
│   ├── config.py            # Configuration management
│   ├── entry.py             # CLI entry point
│   ├── graphs/
│   │   ├── gcri_unit.py     # Core GCRI workflow
│   │   ├── planner.py       # Meta-planner for multi-task
│   │   ├── schemas.py       # Pydantic data models
│   │   └── states.py        # Workflow state definitions
│   ├── templates/           # Prompt templates
│   │   ├── strategy_generator.txt
│   │   ├── hypothesis.txt
│   │   ├── reasoning.txt
│   │   ├── verification.txt
│   │   ├── decision.txt
│   │   └── memory.txt
│   └── tools/
│       └── cli.py           # Local execution tools
├── presets/                 # Pre-configured model setups
└── README.md
```
## Sandboxed Execution

Each branch executes in its own isolated workspace directory:

- Pattern: `logs/{timestamp}/workspaces/iter_{N}_branch_{M}/`
- Files created by agents are scoped to their workspace
- The Decision agent can inspect and verify outputs before deployment
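The workspace path pattern can be expressed directly. The helper function below is ours, written for illustration; it is not part of GCRI's API.

```python
# Sketch of the per-branch workspace path pattern described above.
from pathlib import Path

def workspace_dir(log_root: str, timestamp: str, iteration: int, branch: int) -> Path:
    # logs/{timestamp}/workspaces/iter_{N}_branch_{M}/
    return Path(log_root) / timestamp / "workspaces" / f"iter_{iteration}_branch_{branch}"

path = workspace_dir("logs", "2025-01-01_12-00-00", 2, 1)
```

Because every iteration and branch gets a distinct directory, no two competing teams can ever see or clobber each other's files.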
The Decision agent performs mandatory audits:

- Checks that claimed files actually exist
- Executes code to verify it runs without errors
- Deploys only verified results to the project root
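The first two audit checks reduce to "the file exists" and "it exits cleanly". A minimal stdlib sketch of that logic (the `audit` helper and the demo file are ours, not GCRI internals):

```python
# Minimal sketch of an audit: a claimed file must exist and must run
# cleanly in a subprocess. The demo script is created just for this example.
import subprocess
import sys
import tempfile
from pathlib import Path

def audit(script_path: Path) -> bool:
    if not script_path.exists():          # claimed artifact must actually exist
        return False
    proc = subprocess.run([sys.executable, str(script_path)], capture_output=True)
    return proc.returncode == 0           # must execute without errors

with tempfile.TemporaryDirectory() as workdir:
    script = Path(workdir) / "solution.py"
    script.write_text("print('ok')\n")
    passed = audit(script)
```

A branch whose audit returns `False` is never deployed, regardless of how confident its agents were.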
## Memory

- **Active Constraints**: rules extracted from failures
- **Iteration History**: complete log of all attempts
- **Feedback Loop**: failed strategies inform future iterations
## API Reference

### GCRI

Main workflow executor.

```python
GCRI(config)(
    task: str,
    initial_memory: StructuredMemory = None,
    commit_mode: str = 'manual'  # 'manual' or 'auto-accept'
) -> dict
```

Returns a dict with:

- `decision`: boolean indicating whether the task was completed
- `final_output`: solution text (if `decision=True`)
- `memory`: updated memory state
- `results`: detailed branch results
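A typical way to consume that return value, sketched with a mocked dict rather than a real GCRI run:

```python
# Sketch of handling the documented return dict. The `result` value here
# is mocked for illustration, not produced by an actual GCRI call.
result = {
    "decision": True,
    "final_output": "analysis complete",
    "memory": {},
    "results": [],
}

if result["decision"]:
    output = result["final_output"]  # verified solution text
else:
    output = None                    # inspect result["results"] for failure details
```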
### GCRIMetaPlanner

Multi-task planner.

```python
GCRIMetaPlanner(config)(goal: str) -> dict
```

### build_model

LLM agent builder with optional tool access.

```python
build_model(
    model_id: str,
    gcri_options: dict = None,
    work_dir: str = None,
    **parameters
) -> CodeAgentBuilder
```

### Local Tools

- `execute_shell_command(command: str)`: execute shell commands in the workspace
- `read_file(filepath: str)`: read files from the workspace
- `write_file(filepath: str, content: str)`: write files to the workspace
- `local_python_interpreter(code: str)`: execute Python code
All tools operate within isolated workspace contexts and include interactive safety guards.
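A safety guard of this kind is essentially a confirmation wrapper around each tool call. The sketch below is illustrative only; GCRI's actual `InteractiveToolGuard` may work differently.

```python
# Illustrative sketch of a confirmation guard around a workspace tool.
# Not GCRI's actual InteractiveToolGuard implementation.

def guarded(tool, confirm):
    def wrapper(*args, **kwargs):
        if not confirm(tool.__name__, args):   # ask before any side effect
            raise PermissionError(f"{tool.__name__} rejected by user")
        return tool(*args, **kwargs)
    return wrapper

def write_file(filepath, content):             # stand-in for the real tool
    return f"wrote {len(content)} chars to {filepath}"

auto_approve = lambda name, args: True         # an auto-mode policy approves all calls
safe_write = guarded(write_file, auto_approve)
message = safe_write("notes.txt", "hello")
```

In interactive mode the `confirm` callback would prompt the user instead of auto-approving, which is why unattended runs need auto mode or tools disabled.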
## Development

Run tests:

```bash
pip install pytest
pytest -q
```

Lint:

```bash
pip install pylint
pylint gcri
```

Format:

```bash
pip install black isort
black gcri/
isort gcri/
```

## Troubleshooting

**Authentication errors**
- Check that the required API keys exist in `.env`
- Verify the model ID and parameters in `gcri/config.py` are correct

**Template not found**
- Check the `config.templates` path
- Relative paths depend on the working directory

**Tool calls blocked**
- Local tools require user confirmation via `InteractiveToolGuard`
- Enable auto mode or set `gcri_options.use_code_tools=False`

**Logging failures**
- Check write permissions for `config.log_dir`
- Verify the path exists and is writable
## Contributing

We welcome contributions! Please follow these guidelines:

- **Fork & Branch**: create a feature branch from `main`
- **Code Style**: use `black` and `isort` for formatting
- **Commit Messages**: clearly state the purpose of changes
- **Tests**: add tests for new features
- **Documentation**: update relevant docs and templates

To add a new preset:

- Create a JSON file in the `presets/` directory
- Follow the existing preset structure
- Document model requirements and use cases
## License

This project is licensed under the MIT License. See LICENSE for details.

## Support

- Issues: GitHub Issues
- Discussions: GitHub Discussions

Built with:

- LangGraph - workflow orchestration
- LangChain - LLM integration
- Pydantic - data validation
- Rich - terminal formatting
*GCRI: Where Multiple Minds Converge to Code*