Overview
- Use the Flash Workbench first for rapid triage: identify the theory’s function class, core claim, explanation level, falsifiers, gaps, and quick verdict.
- Then use the Extended Workbench for the full audit: map the theory into the complete schema, require evidence for every claim, score reconstruction/invariants/falsifiability/MVT criteria, and force an incompleteness declaration.
- Together, Flash gives the fast orientation and Extended gives the drift-resistant, uniform analysis that different LLMs can follow consistently.
Physics has always advanced by a strange mixture of audacity and restraint. We ask enormous questions — What is matter? What is space? What was the beginning? — but nature answers only when we discipline our imagination. The great danger in theoretical physics is not speculation. Speculation is necessary. The danger is confusion: mistaking a calculation for an explanation, an analogy for a theory, a compatibility check for a derivation.
The Workbench was born from that problem.
At first, it seemed natural to imagine it as a tool: perhaps a system, perhaps a knowledge graph, perhaps a way for machines to read papers and organize theories. But over time its deeper identity became clearer. The Workbench is not software. It is not an oracle. It is not a theory of everything. It is a grammar of intellectual honesty.
Its purpose is simple: make every theory say what it is, what it explains, what it merely permits, what it forbids, and what it cannot yet prove.
That sounds modest. It is not.
In modern physics, many debates persist not because the mathematics is absent, but because the participants are often answering different questions. One proposal may be an ontology, saying what the world is made of. Another may be a reconstruction method, showing how one description can be recovered from another. A third may be a consistency filter, ruling out impossible combinations of assumptions. To compare all three as if they were rival “theories of everything” is like asking whether a map is better than a compass or a border checkpoint.
The Workbench begins by stopping that category error.
Its fast form — the Flash Workbench — is the first pass. It asks, quickly but sharply: What kind of claim is this? What problem does it address? Does it derive, explain, accommodate, or merely postulate? What does it forbid? What would make it fail? Where are the gaps?
This is not the final judgment. It is triage. But good triage matters. A confusing claim, once placed in the right category, often becomes less mysterious and more useful.
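The Workbench is described here as a grammar rather than software, so the following is only a sketch of how the Flash questions might be written down as a record. The class name, field names, and the example entry are all hypothetical; only the questions themselves come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ExplanationLevel(Enum):
    # The four options named in the Flash questions.
    DERIVES = "derives"
    EXPLAINS = "explains"
    ACCOMMODATES = "accommodates"
    POSTULATES = "postulates"


@dataclass
class FlashRecord:
    """One first-pass triage entry; every field name is illustrative."""
    claim_kind: str                      # What kind of claim is this?
    problem: str                         # What problem does it address?
    level: ExplanationLevel              # Derive, explain, accommodate, or postulate?
    forbids: list[str] = field(default_factory=list)     # What does it forbid?
    falsifiers: list[str] = field(default_factory=list)  # What would make it fail?
    gaps: list[str] = field(default_factory=list)        # Where are the gaps?

    def open_questions(self) -> list[str]:
        """Return the Flash questions this record leaves unanswered."""
        missing = []
        if not self.forbids:
            missing.append("What does it forbid?")
        if not self.falsifiers:
            missing.append("What would make it fail?")
        if not self.gaps:
            missing.append("Where are the gaps?")
        return missing


# Hypothetical entry: a toy proposal with a falsifier but nothing else filled in.
record = FlashRecord(
    claim_kind="consistency filter",
    problem="which combinations of assumptions are allowed",
    level=ExplanationLevel.ACCOMMODATES,
    falsifiers=["an observed case the filter rules out"],
)
print(record.open_questions())
# ['What does it forbid?', 'Where are the gaps?']
```

Listing the unanswered questions, rather than rejecting the record, matches the spirit of triage: the first pass orients, it does not judge.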
The recent paper “String Theory from Maximal Supersymmetry” is an excellent example. Read casually, the title may suggest that all of string theory has been derived from supersymmetry. The Flash Workbench prevents that overreading. The paper’s result is powerful but narrower than the title suggests: under special assumptions (planar, non-gravitational, tree-level N=4 super Yang–Mills effective field theory, higher-point factorization, a parity condition, and positivity), the allowed four-point amplitude is driven toward the open-string Veneziano amplitude. That is a striking sectoral result. It is not a derivation of the whole string universe. The Workbench does not diminish the paper; it protects its real achievement from exaggeration.
The second form — the Extended Workbench — is slower and more demanding. It is meant for depth, uniformity, and resistance to drift. This matters especially now, when large language models can summarize papers fluently but may silently change the meaning of a claim. “Suggests” becomes “proves.” “Compatible with” becomes “derives.” “Works in one regime” becomes “solves the theory.” These are small linguistic errors with large scientific consequences.
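The kind of drift described above can be illustrated with a deliberately crude check. This is a sketch, not part of the Workbench: the phrase list, the numeric ranks, and the function names are assumptions made for illustration, and an exact-phrase lookup will miss inflected forms such as "suggest" or "proved" as well as any paraphrase.

```python
import re

# Hypothetical strength ranks for claim phrases; higher means a stronger claim.
CLAIM_STRENGTH = {
    "suggests": 1,
    "compatible with": 1,
    "works in one regime": 1,
    "proves": 3,
    "derives": 3,
    "solves": 3,
}


def claim_strength(sentence: str) -> int:
    """Return the rank of the strongest listed phrase found, or 0 if none."""
    text = sentence.lower()
    ranks = [
        rank
        for phrase, rank in CLAIM_STRENGTH.items()
        # Word boundaries keep "proves" from matching inside "improves".
        if re.search(rf"\b{re.escape(phrase)}\b", text)
    ]
    return max(ranks, default=0)


def has_drifted(source: str, summary: str) -> bool:
    """Flag a summary whose strongest claim phrase outranks the source's."""
    return claim_strength(summary) > claim_strength(source)


print(has_drifted(
    "The bound suggests a stringy completion.",
    "The paper proves a stringy completion.",
))
# True
```

A check like this can only flag candidates for a human to reread; deciding whether the meaning actually changed still requires comparing the summary against the source.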
The Extended Workbench gives the model no such freedom. It requires a schema. Declare the theory’s ingredients. Declare its dynamics. Declare how it recovers space, time, gravity, matter, amplitudes, or thermodynamics. Declare what is invariant. Declare what is only fitted. Declare inverse constraints: can observed data tell us anything about the supposed underlying structure? Declare falsifiers. Declare incompleteness.
This last requirement may be the most important.
Every serious theory has a shadow: the thing it cannot yet prove. The Workbench does not punish that. It punishes hiding it. A theory that says, “Here is what I can show, here is the regime where it works, and here is what remains open,” is stronger than a theory that speaks in totalizing language while quietly leaning on unproven assumptions.
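One way to picture the schema and its incompleteness requirement is as a record in which every slot must be declared, even if only to say it is empty, while the incompleteness slot may never be empty. The sketch below assumes that reading; the class, the slot names, and the `audit` helper are hypothetical and simply mirror the list of declarations above.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ExtendedRecord:
    """Illustrative schema slots. None means 'not declared'; [] means 'declared, none'."""
    ingredients: Optional[list[str]] = None
    dynamics: Optional[list[str]] = None
    recoveries: Optional[list[str]] = None           # space, time, gravity, matter, ...
    invariants: Optional[list[str]] = None
    fitted: Optional[list[str]] = None               # what is only fitted
    inverse_constraints: Optional[list[str]] = None
    falsifiers: Optional[list[str]] = None
    incompleteness: Optional[list[str]] = None       # what it cannot yet prove


def audit(record: ExtendedRecord) -> list[str]:
    """List undeclared slots, plus an incompleteness declaration left empty."""
    problems = [
        f"undeclared: {f.name}"
        for f in fields(record)
        if getattr(record, f.name) is None
    ]
    if record.incompleteness == []:
        problems.append("incompleteness declared empty: state what remains unproven")
    return problems


# Hypothetical entry that declares every slot but claims nothing remains open.
totalizing = ExtendedRecord(
    ingredients=["toy ingredient"],
    dynamics=["toy dynamics"],
    recoveries=[],
    invariants=[],
    fitted=[],
    inverse_constraints=[],
    falsifiers=["toy falsifier"],
    incompleteness=[],
)
print(audit(totalizing))
# ['incompleteness declared empty: state what remains unproven']
```

Distinguishing "not declared" from "declared as empty" lets a theory honestly state that it fits nothing, while still refusing the one claim the text singles out: that nothing remains unproven.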
This is why the Workbench is, in its final form, “Gödel-humble.” By loose analogy with Gödel’s incompleteness theorems, it accepts that no framework rich enough to judge all theories can certify its own completeness from within. Its categories are useful, not sacred. Its scorecards are provisional, not divine. It is a discipline, not a tribunal.
Its ultimate significance is therefore not that it will decide which theory of physics is true. It will not. Nature keeps that privilege.
Its significance is that it can make our questions cleaner.
For non-technical readers, the Workbench offers protection against grand scientific headlines. It teaches us to ask: What exactly was shown? In what regime? Under which assumptions?
For physics enthusiasts, it offers a way to admire bold ideas without being seduced by vague claims of finality.
For physicists, it offers a shared diagnostic language: function class, explanation ladder, inverse constraint, reconstruction map, falsifiability output, incompleteness declaration. These are not buzzwords. They are guardrails against self-deception.
The best theories in physics have always done more than fit the facts. They reveal why the facts could not have been otherwise. The Workbench asks every proposal to show whether it has reached that level — or whether it is still accommodating, organizing, or gesturing toward it.
That is not a small contribution. In an age of abundant papers, accelerating AI, and increasingly sophisticated speculation, clarity itself becomes a scientific instrument.
The Workbench is that instrument: first as a flash of orientation, then as a full discipline of judgment.
It does not replace imagination. It gives imagination a conscience.
MIT