Dialectica

Dialectica is a design for an agent that transforms natural language into structured logical blocks before downstream prompting, retrieval, and embedding.

The goal is not to force language into brittle formal logic. The goal is to produce a stable intermediate representation that is:

expressive enough to capture reasoning structure
soft enough to preserve ambiguity and uncertainty
machine-friendly enough for retrieval, planning, and synthesis

By doing so, the project explores how token-independent structure can work alongside strong language models without pretending that every task should be reduced to strict symbolic logic.

Current Status

The repo includes a runnable local toolchain with:

a local parser CLI
a small HTTP API
a benchmark and evaluation harness
an end-to-end answer evaluation harness
a synthetic complexity-sweep benchmark generator
schema, prompt, and sample output files

The parser is heuristic so the project can run locally with no external model dependency. That makes it useful for fast iteration on the intermediate representation before swapping in an LLM-backed parser later.

Quick Start

From the repo root, use uv as the default workflow:

uv sync

Then run commands with uv run:

uv run dialectica parse "If the battery is dead, the car will not start unless we jump it." --pretty

Parse Text From The CLI

uv run dialectica parse "If the battery is dead, the car will not start unless we jump it." --pretty

Print The Canonical Serialization

uv run dialectica parse "The experiment failed because the reagent was contaminated." --canonical

Build A Structural Fingerprint

uv run dialectica fingerprint "If the battery is dead, the car will not start unless we jump it." --summary --pretty

Compare Two Structural Fingerprints

uv run dialectica compare-fingerprints "The policy requires encryption at rest." "The policy does not require encryption at rest." --pretty

Parse A Multi-Sentence Document

uv run dialectica parse --document "Employees may access the lab only if they completed safety training. Alice completed safety training." --pretty

Run The Local API

uv run dialectica serve --host 127.0.0.1 --port 8000

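Then send a request. The ^ line continuations below are Windows cmd syntax; in a POSIX shell, use \ instead and wrap the JSON body in single quotes: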

curl -X POST http://127.0.0.1:8000/parse ^
  -H "Content-Type: application/json" ^
  -d "{\"text\":\"We should allow remote work if it improves productivity.\"}"
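For scripted use, the same request can be sent from Python with only the standard library. This is a minimal sketch, assuming the server started above is listening on 127.0.0.1:8000. The helper names build_parse_request and parse_text are illustrative and not part of Dialectica, and the response is assumed to be a JSON object, which this README does not spell out.

```python
import json
import urllib.request

# Host, port, and path are taken from the serve and curl examples above.
API_URL = "http://127.0.0.1:8000/parse"


def build_parse_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_text(text: str) -> dict:
    """Send the request and decode the reply, assuming it is a JSON object."""
    with urllib.request.urlopen(build_parse_request(text), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, calling parse_text("We should allow remote work if it improves productivity.") should return the decoded response body. Using json.dumps for the payload also avoids the shell-specific quote escaping that the curl form needs.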

Run The Benchmark Harness

uv run dialectica evaluate --dataset benchmarks/core.json --pretty

Run The Answer Evaluation Harness

For a smoke test that exercises the full pipeline without an API call:

uv run dialectica evaluate-answers --dataset benchmarks/objective_qa.sample.json --provider oracle --pretty

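For a real model comparison using the OpenAI Responses API (the variable assignment below is PowerShell syntax; in a POSIX shell, use export OPENAI_API_KEY=your_key_here):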

$env:OPENAI_API_KEY="your_key_here"
uv run dialectica evaluate-answers --dataset benchmarks/objective_qa.sample.json --provider openai --model gpt-5-mini --output reports/openai_metrics.json --performance-output reports/openai_metrics.performance.json --pretty

For a no-cost local run with Ollama:

ollama pull qwen2.5:3b
uv run dialectica evaluate-answers --dataset benchmarks/objective_qa.sample.json --provider ollama --model qwen2.5:3b --output reports/ollama_metrics.json --performance-output reports/ollama_metrics.performance.json --pretty

The Ollama provider is localhost-only by default. It refuses non-local Ollama endpoints unless you explicitly set DIALECTICA_ALLOW_REMOTE_OLLAMA=1.
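The localhost-only rule can be pictured as a small guard that runs before any request is sent. The following is a sketch under stated assumptions, not the provider's actual code: the function name is_endpoint_allowed and the set of hostnames treated as local are hypothetical, and only the environment variable name comes from this README.

```python
import os
from urllib.parse import urlparse

# Hostnames treated as local in this sketch (an assumption, not the real list).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_endpoint_allowed(endpoint: str, env=None) -> bool:
    """Return True if the endpoint is local, or remote use was opted into."""
    env = os.environ if env is None else env
    host = urlparse(endpoint).hostname or ""
    if host in LOCAL_HOSTS:
        return True
    # The variable name is from the README; the exact comparison is assumed.
    return env.get("DIALECTICA_ALLOW_REMOTE_OLLAMA") == "1"
```

Passing env explicitly keeps the check easy to test. A default-deny guard of this shape means a misconfigured endpoint fails fast instead of silently sending benchmark prompts to a remote machine.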

Generate A Complexity-Sweep Benchmark

uv run dialectica generate-complexity-benchmark --levels 6 --scenarios-per-level 2 --followups 2 --output benchmarks/complexity_sweep.sample.json

Evaluate A Complexity Sweep

uv run dialectica evaluate-complexity-sweep --dataset benchmarks/complexity_sweep.sample.json --provider ollama --model qwen2.5:3b --output reports/ollama_complexity.json --performance-output reports/ollama_complexity.performance.json --summary-output reports/ollama_complexity.sweep.json --markdown-output reports/ollama_complexity.report.md --charts-dir reports/ollama_complexity.charts --pretty

The complexity summary separates two crossover questions:

when the compact Dialectica encoding becomes smaller than the raw context
when the full compact prompt becomes smaller than the direct prompt
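The two crossover questions above reduce to the same comparison run on two different pairs of series. The sketch below illustrates that idea with made-up token counts per complexity level. The function first_crossover and all numbers are hypothetical and do not reflect the summary's actual implementation or any measured results.

```python
from typing import Optional, Sequence


def first_crossover(baseline: Sequence[int], compact: Sequence[int]) -> Optional[int]:
    """Return the first 0-based level where compact is strictly smaller."""
    for level, (base, comp) in enumerate(zip(baseline, compact)):
        if comp < base:
            return level
    return None  # the compact form never became smaller


# Made-up token counts per complexity level, for illustration only.
raw_context = [40, 80, 160, 320]
compact_encoding = [60, 90, 120, 150]
direct_prompt = [70, 110, 190, 350]
compact_prompt = [130, 170, 210, 250]

encoding_crossover = first_crossover(raw_context, compact_encoding)
prompt_crossover = first_crossover(direct_prompt, compact_prompt)
print(encoding_crossover, prompt_crossover)  # 2 3
```

In this toy data the encoding crosses over one level earlier than the full prompt, because the full compact prompt also carries fixed instruction overhead. That gap is why the two questions are reported separately.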

The report bundle can also emit:

a markdown summary report
SVG bar charts for accuracy, tokens, latency, and iterations
an SVG line plot of raw context size versus average total tokens

Run The Reasoning Pipeline

uv run dialectica reason "If Alice completed amber orientation, then Alice holds an amber badge. If Alice holds an amber badge, then Alice may enter the amber workshop. Alice completed amber orientation. May Alice enter the amber workshop?" --pretty

If you want a step-by-step explanation of what happens during a run, see docs/run-walkthrough.md.
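The amber workshop example chains two if-then rules from one known fact, which is the kind of inference a Horn-style engine handles deterministically. The sketch below shows plain forward chaining over that example. It is not the code in src/dialectica/engine.py: the predicate strings and the forward_chain function are illustrative assumptions.

```python
def forward_chain(facts, rules):
    """Apply Horn rules (premises -> conclusion) until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known


# Hand-written encoding of the example sentences (hypothetical predicate names).
rules = [
    (("completed_amber_orientation(alice)",), "holds_amber_badge(alice)"),
    (("holds_amber_badge(alice)",), "may_enter_amber_workshop(alice)"),
]
facts = {"completed_amber_orientation(alice)"}

derived = forward_chain(facts, rules)
answer = "yes" if "may_enter_amber_workshop(alice)" in derived else "unknown"
print(answer)  # yes
```

The sketch answers "unknown" rather than "no" when the goal cannot be derived, since the absence of a derivation does not by itself establish the negation.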

Repo Layout

src/dialectica/ir.py: hybrid reasoning AST
src/dialectica/translator.py: controlled natural-language translator
src/dialectica/compiler.py: export dispatcher
src/dialectica/engine.py: deterministic Horn and broader FOL-style reasoning engines
src/dialectica/explainer.py: natural-language explanation layer
src/dialectica/parser.py: heuristic parser
src/dialectica/canonical.py: canonicalization and serialization
src/dialectica/complexity.py: synthetic complexity benchmark generation and summaries
src/dialectica/filtering.py: question-conditioned logical-form pruning
src/dialectica/fingerprint.py: Goedel-inspired structural fingerprint backend
src/dialectica/validation.py: lightweight validation
src/dialectica/api.py: stdlib HTTP API
src/dialectica/cli.py: command-line entry points
src/dialectica/reporting.py: markdown and SVG report generation
benchmarks/core.json: initial benchmark set
benchmarks/objective_qa.sample.json: sample objective answer benchmark format
benchmarks/complexity_sweep.sample.json: generated scaling benchmark sample
benchmarks/reasoning.messy.json: canonical noisy reasoning benchmark with yes / no / unknown answers
schema/logical-blocks.schema.json: JSON schema for the target structure
schema/logical-encoding.schema.json: TPTP-inspired encoding layer
prompts/parser.md: parser prompt for an eventual LLM-backed version
docs/concepts/README.md: concept-level architecture and method guides
docs/tptp-inspired-encoding.md: notes on the richer formal encoding
docs/structural-fingerprint.md: notes on the Goedel-inspired auxiliary backend
docs/objective-evaluation.md: how to test whether Dialectica improves correctness
docs/run-walkthrough.md: step-by-step explanation of what happens during a run
docs/reasoning-pipeline.md: reasoning pipeline design and CLI usage
docs/encoding-backends.md: practical comparison of export targets
docs/reasoning-benchmark-suite.md: benchmark philosophy, categories, metrics, and usage

Concept Guides

Dialectica has two main tracks:

a retrieval-oriented logical-form lane for prompting, indexing, and hybrid retrieval
a deterministic reasoning lane for the safe symbolic fragment

The longer architectural notes live in docs/ so this README can stay focused on setup and navigation.

Start here: docs/concepts/README.md

Shortest mental model:

parse prose into blocks and relations
canonicalize that structure into a stable form
use the structure either as better prompt material or compile it into a local reasoning program
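The first two steps of the mental model above can be sketched as a pair of record types plus an order-insensitive serializer. This is a toy illustration, assuming hypothetical Block and Relation fields and a hypothetical canonicalize function. The real structures are defined by schema/logical-blocks.schema.json and src/dialectica/canonical.py, and they may differ substantially.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    id: str
    kind: str  # e.g. "condition" or "claim" (hypothetical labels)
    text: str


@dataclass(frozen=True)
class Relation:
    kind: str  # e.g. "implies" (hypothetical label)
    source: str
    target: str


def canonicalize(blocks, relations) -> str:
    """Serialize so that input ordering and casing do not change the output."""
    payload = {
        "blocks": sorted(
            ({"id": b.id, "kind": b.kind, "text": b.text.strip().lower()} for b in blocks),
            key=lambda d: d["id"],
        ),
        "relations": sorted(
            ({"kind": r.kind, "source": r.source, "target": r.target} for r in relations),
            key=lambda d: (d["kind"], d["source"], d["target"]),
        ),
    }
    # Sorted keys and fixed separators give a byte-stable string.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


b1 = Block("b1", "condition", "The battery is dead")
b2 = Block("b2", "claim", "the car will not start")
rel = Relation("implies", "b1", "b2")
print(canonicalize([b1, b2], [rel]) == canonicalize([b2, b1], [rel]))  # True
```

A stable string like this is what makes the structure usable as an index key or embedding input: two parses of the same content compare equal regardless of the order in which blocks were emitted.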

Next Implementation Steps

Replace or augment the heuristic parser with an LLM-backed parser
Broaden contradiction, negation, and arithmetic coverage in the translator
Add embedding generation for raw text and canonical graph serializations
Add hybrid retrieval experiments and scoring
Measure whether logical retrieval beats text-only retrieval for selected tasks
