Pipeline Phases

The paper pipeline runs 12 sequential phases, numbered 0 through 10 with an additional Phase 1.5. Each phase must complete before the next begins. Verification and review phases can loop back for revision.

Phase 0: Voice Calibration

Input: 2-3 paragraphs of the author's published writing
Output: Voice style profile JSON

The Research Director analyzes:

Sentence length distribution
Vocabulary level and word choice
Paragraph structure and transitions
Punctuation habits
Common constructions

This profile is loaded by every Writer agent. Writers match the author's voice at the sentence level.
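
To make the idea of a style profile concrete, here is a minimal sketch that measures a few of the surface features listed above and emits them as JSON. The function name `voice_profile` and every field name are assumptions made for illustration; the actual profile schema used by the Research Director is not specified on this page.

```python
import json
import re
import statistics


def voice_profile(sample: str) -> dict:
    """Summarize a few surface features of a writing sample (illustrative only)."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", sample.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    words = re.findall(r"[A-Za-z']+", sample)
    return {
        "sentence_count": len(sentences),
        "mean_sentence_length": statistics.mean(lengths) if lengths else 0.0,
        "stdev_sentence_length": statistics.pstdev(lengths) if lengths else 0.0,
        "mean_word_length": statistics.mean(len(w) for w in words) if words else 0.0,
        # One crude proxy for "punctuation habits".
        "semicolons_per_sentence": sample.count(";") / max(len(sentences), 1),
    }


sample = "We test one claim. The claim is narrow; the data are not. Does it hold?"
profile = voice_profile(sample)
print(json.dumps(profile, indent=2))
```

A real profile would presumably also cover transitions and common constructions, which need more than counting.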

Phase 1: Literature Review

Input: Research topic query
Output: Merged, deduplicated literature corpus

5 parallel scouts hit different sources:

Scout	Source	Max Results
1	arXiv OAI-PMH	200
2	Semantic Scholar	100
3	CrossRef	50
4	OpenAlex	50
5	Field-specific (NCBI, etc.)	Varies

Results are deduplicated by title similarity, comparing the first 80 characters of each title. Each paper record includes title, authors, abstract, citation count, key claims, methodology, and limitations.
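
One way to read the 80-character rule is as a normalized prefix key. The sketch below assumes that titles are lowercased and stripped of punctuation before comparison, and that the first record seen wins. Both are assumptions, as are the helper names `title_key` and `dedupe`; the sample titles are invented placeholders.

```python
import re


def title_key(title: str, width: int = 80) -> str:
    """Lowercase, collapse non-alphanumerics to spaces, keep the first `width` chars."""
    normalized = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return normalized[:width]


def dedupe(papers: list[dict]) -> list[dict]:
    """Keep the first paper seen for each title key (assumed tie-break rule)."""
    seen: dict[str, dict] = {}
    for paper in papers:
        seen.setdefault(title_key(paper["title"]), paper)
    return list(seen.values())


# Placeholder records, not real search results.
papers = [
    {"title": "A Survey of Widget Calibration", "source": "arXiv"},
    {"title": "A survey of widget calibration.", "source": "CrossRef"},
    {"title": "Gadget Drift in Long Deployments", "source": "OpenAlex"},
]
unique = dedupe(papers)
print([p["source"] for p in unique])  # ['arXiv', 'OpenAlex']
```

A prefix key is cheap but will merge distinct papers that share a long common title prefix, so a stricter check (for example on DOI) may be worth adding where identifiers are available.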

Phase 1.5: Novelty Generation

Input: Literature corpus
Output: 50+ hypotheses scored by novelty × tractability

All 6 novelty engines run in parallel:

Engine	Angle	Output
Contrarian	Invert field claims	10 counter-hypotheses
Cross-Pollinator	Import from distant fields	Top 5 analogies
Assumption Excavator	Find unstated assumptions	5 testable assumptions
Counterfactual Generator	Rewrite field history	5 counterfactual histories
Paradox Sifter	Cross-reference limitations	Paradoxes + elephants
Heretic	50 wild guesses from title alone	50 hypotheses + haunting idea

The orchestrator scores all hypotheses by:

Novelty (1-10): Has this been explored?
Tractability (1-10): Can we test this?
Evidence (1-10): How much partial evidence exists?

Top 3 become the paper's contribution.
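
The page does not say how the three ratings combine into a single ranking. The sketch below assumes the product novelty × tractability is the primary score and that the evidence rating breaks ties; the `Hypothesis` class, `top_k` helper, and sample ratings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    claim: str
    novelty: int        # 1-10
    tractability: int   # 1-10
    evidence: int       # 1-10

    @property
    def score(self) -> int:
        # Assumed combination rule: novelty x tractability.
        return self.novelty * self.tractability


def top_k(hypotheses: list[Hypothesis], k: int = 3) -> list[Hypothesis]:
    """Rank by score, using evidence as an assumed tie-breaker."""
    ranked = sorted(hypotheses, key=lambda h: (h.score, h.evidence), reverse=True)
    return ranked[:k]


candidates = [
    Hypothesis("A", novelty=9, tractability=4, evidence=5),  # score 36
    Hypothesis("B", novelty=6, tractability=8, evidence=3),  # score 48
    Hypothesis("C", novelty=7, tractability=7, evidence=6),  # score 49
    Hypothesis("D", novelty=6, tractability=8, evidence=7),  # score 48
]
print([h.claim for h in top_k(candidates)])  # ['C', 'D', 'B']
```

A product penalizes hypotheses that are strong on one axis and weak on the other, which seems to match the intent of preferring ideas that are both new and testable.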

Phase 2: Hypothesis Selection

Top hypotheses are selected (user input or automatic). Each includes:

Primary claim with evidence base
Gap in literature it fills
Proposed experimental approach

Phase 3: Methodology Design

The Methodology Designer agent:

Recommends correct statistical tests
Performs power analysis
Designs experimental protocol
Flags confounds and controls
Outputs reproducible analysis plan
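
As an illustration of the power-analysis step, the sketch below estimates the per-group sample size for a two-sided comparison of two means using the standard normal approximation, n = 2((z_{1-α/2} + z_{power}) / d)^2. This is a textbook formula chosen for the example, not necessarily what the Methodology Designer uses; the function name `n_per_group` and the defaults are assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group n for a two-sample comparison of means (normal approximation).

    effect_size is Cohen's d. A t-based calculation gives slightly larger values.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # two-sided critical value
    z_power = z(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(n_per_group(0.5))  # 63
print(n_per_group(0.8))  # 25
```

Smaller expected effects drive the required sample size up quickly, which is why the analysis plan needs an effect-size estimate before data collection.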

Phase 4: Data Engineering

The Data Engineer agent:

Writes Python analysis code
Generates publication-ready figures (SciencePlots, 300 DPI, colorblind-safe)
Computes all statistics
Outputs: analysis.py, figures/*.pdf, statistical_report.json

Phase 5: Parallel Writing

5 Writer subagents write simultaneously:

Writer	Section
1	Abstract
2	Introduction
3	Methods
4	Results
5	Related Work + Discussion

Hard constraints (all 41 Humanizer patterns):

No significance inflation ("pivotal", "transformative")
No AI vocabulary ("showcasing", "underscores")
No em dashes (ZERO tolerated)
No synonym cycling
No passive voice without actor
No filler phrases
No "further research is needed"
No "state-of-the-art"

Phase 6: Verification

3 parallel verification modules:

Citation Verifier: Every citation checked against Semantic Scholar AND CrossRef. If a paper doesn't exist or doesn't contain the claimed result, it's flagged as hallucinated.

Statistical Auditor: Every p-value, test statistic, and error bar validated. Checks for p-hacking, multiple comparisons, power issues.

AI-Pattern Detector: Every sentence scanned for all 41 patterns. Density must be < 2/1000 words. Any em dash = reject.
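
The density rule and the em dash rule can be sketched as a single check. The pattern list below is a small hypothetical stand-in built from the examples in Phase 5; the full set of 41 patterns is not reproduced on this page, and the function name `audit` is an assumption.

```python
import re

# Hypothetical stand-ins for a few of the 41 patterns.
PATTERNS = [
    r"\bpivotal\b",
    r"\btransformative\b",
    r"\bshowcasing\b",
    r"\bunderscores\b",
    r"\bstate-of-the-art\b",
    r"further research is needed",
]
EM_DASH = "\u2014"


def audit(text: str, max_per_1000: float = 2.0) -> dict:
    """Count pattern hits, compute hits per 1000 words, and reject any em dash."""
    words = len(text.split())
    hits = sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in PATTERNS)
    density = 1000 * hits / words if words else 0.0
    em_dashes = text.count(EM_DASH)
    return {
        "hits": hits,
        "density_per_1000": density,
        "em_dashes": em_dashes,
        "passed": em_dashes == 0 and density < max_per_1000,
    }


clean = audit("We measure latency on three machines.")
flagged = audit("This pivotal result underscores the trend.")
dashed = audit("The effect is small \u2014 but real.")
print(clean["passed"], flagged["passed"], dashed["passed"])  # True False False
```

Regex matching only catches lexical patterns; rules such as "no synonym cycling" or "no passive voice without actor" would need something more than a word list.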

Phase 7: Adversarial Review

10 reviewer personas read the complete paper independently:

Persona	Focus
Theorist	Formal proofs, mathematical rigor
Empiricist	Experimental design, baselines
Pragmatist	Practical applicability
Skeptic	"Your results are wrong"
Historian	Prior art, citation accuracy
Methodologist	Statistical correctness
Ethicist	Societal implications
Competitor	Novelty relative to existing work
Student	Clarity and accessibility
Dreamer	"What if you went further?"

Each reviewer returns a score (1-10), strengths, weaknesses, and a recommendation. All 10 must pass.

Phase 8: Revision

Writers receive annotated critiques and revise. Loop continues until all 10 reviewers accept or the orchestrator determines critiques are adequately addressed.
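
The loop described above can be sketched as follows. The callables `review`, `revise`, and `orchestrator_accepts` are hypothetical stand-ins for the reviewer personas, the Writers, and the orchestrator's judgment, and the `max_rounds` cap is an assumption added so the sketch always terminates.

```python
from typing import Callable

Review = dict  # assumed shape: {"persona": str, "accept": bool, "critique": str}


def revision_loop(
    draft: str,
    review: Callable[[str], list[Review]],
    revise: Callable[[str, list[Review]], str],
    orchestrator_accepts: Callable[[str, list[Review]], bool],
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Revise until every reviewer accepts or the orchestrator overrides.

    Returns the final draft and the number of revisions performed.
    """
    for round_no in range(max_rounds):
        reviews = review(draft)
        open_items = [r for r in reviews if not r["accept"]]
        if not open_items or orchestrator_accepts(draft, open_items):
            return draft, round_no
        draft = revise(draft, open_items)
    return draft, max_rounds


# Toy stand-ins: one reviewer who wants a baseline comparison.
def toy_review(draft: str) -> list[Review]:
    ok = "baseline" in draft
    return [{"persona": "Empiricist", "accept": ok, "critique": "" if ok else "No baseline."}]


def toy_revise(draft: str, open_items: list[Review]) -> str:
    return draft + " We compare against a baseline."


final, rounds = revision_loop("We report accuracy.", toy_review, toy_revise, lambda d, r: False)
print(rounds)  # 1
```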

Phase 9: Style Audit

The Style Auditor runs the complete paper through:

All 41 Humanizer patterns scan
Em dash count (must be ZERO)
Pattern density (< 1 per 2000 words)
Voice consistency with author profile

PASS → Formatting. FAIL → Back to Writer with line-level annotations.
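
A line-level annotation for the em dash rule might look like the sketch below. The annotation format (a list of dicts with `line`, `column`, and `issue`) and the helper names are assumptions; the page does not define what the Style Auditor actually returns to the Writer.

```python
EM_DASH = "\u2014"


def annotate_em_dashes(paper: str) -> list[dict]:
    """Return one annotation per em dash, with 1-based line and column."""
    notes = []
    for line_no, line in enumerate(paper.splitlines(), start=1):
        col = line.find(EM_DASH)
        while col != -1:
            notes.append({"line": line_no, "column": col + 1, "issue": "em dash"})
            col = line.find(EM_DASH, col + 1)
    return notes


def verdict(paper: str) -> str:
    """PASS only when no annotations were produced (em dash rule only)."""
    return "FAIL" if annotate_em_dashes(paper) else "PASS"


paper = "First line.\nSecond \u2014 line \u2014 here.\nThird."
for note in annotate_em_dashes(paper):
    print(note)
print(verdict(paper))  # FAIL
```

The same structure extends to the other checks: each rule appends its own annotations, and the gate passes only when the combined list is empty and the density threshold holds.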

Phase 10: Formatting → Submission

The Formatter:

Loads venue-specific LaTeX template
Generates BibTeX from verified citations
Embeds figures at 300 DPI
Compiles to PDF
Outputs: paper.tex, references.bib, paper.pdf
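
Generating a BibTeX entry from a verified citation record can be sketched as plain string formatting. The record layout (`key`, `title`, `authors`, `year`, optional `doi`) and the function name `to_bibtex` are assumptions, the sample record is a placeholder rather than a real reference, and the sketch does not escape LaTeX special characters.

```python
def to_bibtex(entry: dict) -> str:
    """Render one citation record as an @article entry (no LaTeX escaping)."""
    fields = {
        "title": entry["title"],
        "author": " and ".join(entry["authors"]),  # BibTeX separates authors with "and"
        "year": str(entry["year"]),
    }
    if entry.get("doi"):
        fields["doi"] = entry["doi"]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@article{{{entry['key']},\n{body}\n}}"


# Placeholder record for illustration only.
record = {
    "key": "doe2020example",
    "title": "An Example Title",
    "authors": ["Doe, Jane", "Roe, Richard"],
    "year": 2020,
}
bib = to_bibtex(record)
print(bib)
```

Building entries only from records that passed Phase 6 keeps unverified citations out of references.bib.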

Supported venues: NeurIPS, ICML, ICLR, Nature, arXiv (templates are stubs; contributions welcome!)

Sisyphus Academica — MIT License
