A framework for optimizing document visibility and influence in generative engines
Overview | Why MAGEO | Method | Code Map | Quick Start | Results
MAGEO is a Generative Engine Optimization (GEO) framework designed for the post-SEO setting, where content is no longer judged only by list ranking, but by how it is selected, cited, exposed, and integrated into final answers generated by LLM-based engines.
Unlike conventional SEO pipelines, MAGEO treats optimization as a closed-loop decision process over document revision, engine preference modeling, answer-level evaluation, memory retrieval, and safety-aware candidate selection. This repository focuses on the core optimization workflow, rather than legacy demos or general-purpose agent infrastructure.
In our implementation, the main loop is centered on:
- Preference Agent for engine preference profiling
- Planner Agent for revision planning
- Editor Agent for candidate generation
- Evaluator Agent for DSV-CF scoring
- Hierarchical Memory for cross-step and cross-instance reuse
- Fidelity Gate for attribution and faithfulness preservation
The primary orchestration entry is pipeline/geo_optimizer.py.
The paper starts from a simple observation: in generative engines, content creators no longer optimize only for retrieval rank. They optimize for whether their content ultimately shapes the answer.
This shift introduces four structural challenges:
- Opaque presentation: exposure is mediated by generated answers rather than a transparent ranked list.
- Undefined objectives: the optimization target is multi-dimensional, not a single ranking score.
- Unclear optimization path: document edits affect answer synthesis through a complex latent pipeline.
- Ambiguous engine preference: different engines may reward different forms of evidence, structure, and style.
MAGEO addresses these challenges with a memory-augmented multi-agent loop that optimizes not only for visibility, but also for influence and reliability.
This repository is intentionally scoped as a research-grade implementation rather than a broad agent platform.
- It keeps the method components that matter for the main optimization workflow.
- It removes historical modules that would blur the main contribution.
- It makes the evaluation objective explicit through the DSV-CF metric family.
- It keeps safety constraints first-class via a fidelity-aware candidate gate.
If you want to understand the actual system path from idea to code, this repository is meant to be that bridge.
Figure 1. The shift from ranking-oriented SEO to answer-mediated GEO raises new questions about presentation opacity, measurement, optimization path, and cross-engine preference ambiguity.
Figure 2. MAGEO combines preference modeling, planning, multi-candidate editing, answer-level evaluation, fidelity-aware selection, and hierarchical memory within a closed optimization loop.
The current implementation follows this main loop:
- Build an initial baseline under a Twin-Branch evaluation protocol.
- Normalize engine-specific heuristics into a reusable Preference Profile.
- Retrieve reusable patterns from dual-layer memory.
- Ask the Planner Agent to generate a high-level revision strategy.
- Ask the Editor Agent to produce multiple candidate revisions.
- Ask the Evaluator Agent to score candidates with DSV-CF metrics.
- Select only candidates that pass the Fidelity Gate.
- Write back successful edits to Step-level and Creator-level memory.
- Stop early when the DSV-CF objective reaches a plateau.
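The loop above can be sketched in a few lines of Python. This is a toy, self-contained illustration of the control flow only: the scoring function, fidelity check, and candidate generation here are stand-ins, not the repository's agents or API.

```python
# Minimal, self-contained sketch of MAGEO's closed loop. All function names
# and logic here are illustrative stand-ins for the real agents.

def score_dsv_cf(doc):
    """Stand-in for the Evaluator Agent: a toy 'visibility' proxy."""
    return len(set(doc.split()))

def passes_fidelity_gate(doc):
    """Stand-in for the AA/FA safety check on candidates."""
    return "FABRICATED" not in doc

def optimize(document, max_rounds=10, patience=3):
    best, best_score, stall = document, score_dsv_cf(document), 0
    for round_ in range(max_rounds):
        # Planner + Editor: produce multiple candidate revisions (toy edits).
        candidates = [best + f" detail{round_}{i}" for i in range(3)]
        # Fidelity gate: only safe candidates are eligible for selection.
        safe = [c for c in candidates if passes_fidelity_gate(c)]
        winner = max(safe, key=score_dsv_cf, default=None)
        if winner and score_dsv_cf(winner) > best_score:
            best, best_score, stall = winner, score_dsv_cf(winner), 0
        else:
            stall += 1
        if stall >= patience:  # DSV-CF plateau: stop early
            break
    return best, best_score
```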
| Agent / Module | Responsibility | Main Entry |
|---|---|---|
| Preference Agent | Convert raw engine rules into a structured preference profile | agent/preference_agent.py |
| Planner Agent | Generate revision plans conditioned on preference and memory | agent/planner_agent.py |
| Editor Agent | Produce diverse candidate rewrites from the revision plan | agent/editor_agent.py |
| Evaluator Agent | Predict candidate quality under the DSV-CF objective | agent/evaluation_agent.py |
| GEO Optimizer | Coordinate the full closed-loop process | pipeline/geo_optimizer.py |
MAGEO uses a two-level memory design:
- Step-level memory stores local successful edit traces within an optimization trajectory.
- Creator-level memory stores reusable editing patterns across queries, documents, and engines.
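The two layers can be sketched as a single class. The class and method names below are illustrative assumptions for exposition, not the actual interface of memory/memory_bank.py.

```python
# Hypothetical sketch of the dual-layer memory design; names are
# illustrative, not the repository's memory/memory_bank.py interface.
from collections import defaultdict

class DualLayerMemory:
    def __init__(self):
        self.step = []                    # local edit traces, one trajectory
        self.creator = defaultdict(list)  # reusable cross-instance patterns

    def record_step(self, edit_trace):
        """Store a successful edit within the current optimization trajectory."""
        self.step.append(edit_trace)

    def promote(self, pattern_key):
        """Lift this trajectory's edits into reusable creator-level patterns."""
        self.creator[pattern_key].extend(self.step)
        self.step = []  # trajectory ends; reset the local layer

    def retrieve(self, pattern_key):
        """Creator-level patterns first, then any in-trajectory traces."""
        return self.creator[pattern_key] + self.step
```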
Relevant files: memory/memory_bank.py and memory/schema.py.
The repository implements a two-axis evaluation family:
- SSV: WLV, DPA, CP, SI
- ISI: AA, FA, KC, AD
The unified objective is:
S_DSV-CF = λ * SSV + (1 - λ) * ISI - γ * (10 - AA)
Default settings in code:
- `lambda = 0.5`
- `gamma = 0.5`
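Under the defaults, the objective is easy to work through by hand. The function below is a direct transcription of the formula above for illustration, not the code in the repository:

```python
# Direct transcription of the DSV-CF objective above, with default weights.
def dsv_cf(ssv, isi, aa, lam=0.5, gamma=0.5):
    """S_DSV-CF = lam * SSV + (1 - lam) * ISI - gamma * (10 - AA)."""
    return lam * ssv + (1 - lam) * isi - gamma * (10 - aa)

# Example: a candidate with SSV=8, ISI=6, AA=9 scores
# 0.5*8 + 0.5*6 - 0.5*(10 - 9) = 6.5, so a one-point attribution
# deficit costs half a point of combined visibility/influence.
```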
Implementation: agent/evaluation_agent.py and evaluation/metrics.py.
Candidate selection is not purely reward-maximizing. MAGEO explicitly rejects candidates that degrade attribution or faithfulness beyond the configured tolerance.
In the current implementation:
- `AA` and `FA` are treated as safety-critical dimensions.
- Candidates must satisfy the fidelity threshold before they are eligible.
- Optimization halts early when the DSV-CF objective stops improving for `k` rounds.
This logic is implemented in evaluation/candidate_selector.py.
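A minimal sketch of this selection rule is below. The threshold values and candidate fields are illustrative assumptions, not the actual interface of evaluation/candidate_selector.py.

```python
# Hypothetical sketch of fidelity-aware selection; thresholds and field
# names are illustrative, not the candidate_selector.py interface.
def select_candidate(candidates, aa_min=7.0, fa_min=7.0):
    """Reward-maximizing pick, restricted to fidelity-safe candidates.

    Each candidate is a dict with 'score' (DSV-CF), 'aa', and 'fa'.
    Returns None when no candidate clears the gate, so the caller can
    count a stalled round toward early stopping.
    """
    safe = [c for c in candidates
            if c["aa"] >= aa_min and c["fa"] >= fa_min]
    return max(safe, key=lambda c: c["score"], default=None)
```

Note the deliberate asymmetry: a high-scoring candidate that fails the gate is rejected outright rather than penalized, which is what makes selection "not purely reward-maximizing."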
The repository is organized so that the core method maps cleanly to the main components:
| Paper Concept | Repository Implementation |
|---|---|
| Twin-Branch Evaluation Protocol | evaluation/simulated_evaluator.py |
| Preference Profiling | agent/preference_agent.py |
| Planning | agent/planner_agent.py |
| Multi-Candidate Editing | agent/editor_agent.py |
| DSV-CF Evaluation | agent/evaluation_agent.py, evaluation/metrics.py |
| Fidelity-Aware Selection | evaluation/candidate_selector.py |
| Dual-Layer Memory | memory/memory_bank.py, memory/schema.py |
| Closed Optimization Loop | pipeline/geo_optimizer.py |
Additional design notes:
To keep the repository focused on the main method, earlier auxiliary modules have been removed from the main path, including:
- FusionAgent
- ReactAgent
- ToolAgent
- SummaryMemory
- generic tool registries and helper scaffolding
- DAG-style pipeline base classes unrelated to the final method
- earlier online demo scripts not aligned with the final paper loop
```
MAGEO/
├── README.md
├── MAGEO_开发文档.md
├── docs/
│   ├── paper_alignment.md
│   └── test_phase.md
├── agent/
├── evaluation/
├── memory/
├── model/
├── pipeline/
├── prompt/
├── config/
├── scripts/
├── test/
└── tool/
```
High-level module responsibilities:
- `agent/`: core agent roles
- `evaluation/`: DSV-CF metrics, safety gate, candidate selection
- `memory/`: step-level and creator-level memory
- `pipeline/`: the optimization orchestration layer
- `tool/`: external search interface
- `scripts/`: interactive and batch experiment entrypoints
- `test/`: focused unit tests for the core pipeline
- Python 3.12+
- `uv` recommended, though `pip` also works
```bash
git clone https://github.com/Wu-Beining/MAGEO.git
cd MAGEO
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
```

For Windows:

```bash
.venv\Scripts\activate
```

This repository separates environment variables from model configuration:
- Copy `.example_env` to `.env` if you want to use a custom config path.
- Copy `config/config.yaml.example` to your actual config file.
- Fill in your provider-specific model settings in that config.
- Export `WEB_SEARCH_API_KEY` if you want to run the interactive search pipeline.
Example:
```bash
cp .example_env .env
cp config/config.yaml.example config/config.yaml
```

Notes:
- `tool/web_search.py` depends on the optional `zai-sdk`.
- The interactive entrypoint checks `WEB_SEARCH_API_KEY`.
- Core modules can still be imported even if the web-search dependency is absent.
To optimize a single document-selection scenario end-to-end:
```bash
python -m pipeline.interactive_optimize --query "Best AI coding agents" --auto
```

Useful flags:

- `--query` / `-q`: target user query
- `--auto` / `-a`: automatically choose the first retrieved result
- `--yes` / `-y`: skip confirmation and continue until early stopping
This path runs:
- query rewriting
- web search
- document selection
- answer generation
- DSV-CF evaluation
- closed-loop revision and final answer regeneration
For benchmark-style batch execution:
```bash
python scripts/batch_optimize_v2.py --json path/to/test_queries.json
```

Expected JSON format:
```json
[
  {
    "index": 1,
    "query": "Best AI coding agents",
    "is_optimized": false,
    "log": ""
  }
]
```

Alternative batch entrypoints:
```bash
python scripts/batch_optimize.py --json path/to/test_queries.json
python scripts/batch_optimize_sequential.py --json path/to/test_queries.json
```
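Before launching a long batch run, the input file can be sanity-checked against the expected format above. The validator below is an illustrative sketch (the field names come from the example; the check itself is not part of the repository):

```python
# Minimal validator for the expected batch-query JSON format; the required
# fields come from the example above, the function itself is illustrative.
import json

REQUIRED = {"index": int, "query": str, "is_optimized": bool, "log": str}

def validate_batch(text):
    """Return the parsed entries, raising ValueError on a malformed file."""
    entries = json.loads(text)
    for i, entry in enumerate(entries):
        for field, typ in REQUIRED.items():
            if not isinstance(entry.get(field), typ):
                raise ValueError(
                    f"entry {i}: field {field!r} must be {typ.__name__}")
    return entries
```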
By default, optimization artifacts are written under log/, including:
- optimization trajectories
- source metadata
- final answers
- web search traces
- memory snapshots
Figure 3. Distribution of query intent, topical coverage, and source composition for the benchmark setting used in our GEO analysis.
Figure 4. A practical Pareto frontier illustrating the trade-off between token budget and word-level visibility, with MAGEO targeting better visibility-efficiency balance.
Compared with aggressive heuristic GEO or naive single-model rewriting, MAGEO is designed to improve answer-level visibility and influence without giving up attribution quality and faithfulness constraints. The key intuition is not just "edit more," but "edit with structured preference, memory, and safety-aware evaluation."
If you find this repository useful in your research, please cite the corresponding paper once the bibliographic entry is finalized.
```bibtex
@misc{mageo2026,
  title  = {From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning},
  author = {Beining Wu and Fuyou Mao and Jiong Lin and Cheng Yang and others},
  year   = {2026},
  note   = {ACL 2026}
}
```


