Don't ask if it works. Ask what kills it.
A survivability testing philosophy for discovering what kills a system before production does.
Most systems are tested under conditions designed for success.
Clean inputs. Cooperative environments. Predictable load. Expected sequences.
These tests verify correctness.
They rarely verify survivability.
Cockroach Testing is a validation philosophy for the other question: what happens when everything goes wrong at once?
A system is robust if it survives hostile conditions — not if it behaves correctly in friendly ones.
Cockroach Testing evaluates systems under:
- partial and cascading failure
- degraded, corrupted, or adversarial inputs
- resource starvation and extreme load
- recursive and self-referential pressure
- sustained chaos over time
If the system continues functioning — or fails gracefully — the architecture is resilient.
If it corrupts silently, the architecture is not.
Cockroaches survive environments that destroy most organisms.
Not because they are optimal. Because they tolerate:
- radiation
- starvation
- environmental chaos
- conditions that kill more specialized creatures
The metaphor captures a testing philosophy:
Traditional testing asks: Does it work?
Cockroach testing asks: Can it be killed?
Both questions are necessary. Most teams only ask the first.
Can it survive catastrophic input?
Inject maximum entropy. Random bytes. Contradictory state. Recursive structures. Extreme values at every boundary.
Pass criteria: System degrades but does not corrupt. Crashes are acceptable. Silent corruption is not.
```python
def nuke_test(system):
    attacks = [
        random_bytes(1_000_000),             # maximum entropy
        contradictory_state(),               # mutually inconsistent state
        recursive_structure(depth=10_000),   # self-referential pressure
        all_zeros(1_000_000),                # degenerate extreme value
    ]
    for attack in attacks:
        system.process(attack)
        assert system.is_responsive()    # still alive
        assert system.integrity_check()  # not silently corrupted
```

Do harmful patterns stay harmful after reconstruction?
Inject carefully crafted adversarial inputs. Retrieve them multiple times. Verify they cannot replicate exactly.
Pass criteria: No verbatim recall possible. Pattern degrades on each reconstruction.
```python
def mutation_test(system):
    original = "adversarial pattern requiring exact syntax"
    system.store(original)
    reconstructions = [system.retrieve() for _ in range(10)]
    for recon in reconstructions:
        assert recon != original                  # no exact replication
        assert similarity(recon, original) < 1.0  # pattern degrades each time
```

Does sustained pressure improve or degrade the system?
Subject the system to 1000+ attacks over time. Measure vulnerability before and after.
Pass criteria: Resilience does not degrade. Ideally it improves.
```python
def adaptation_test(system):
    baseline = measure_vulnerability(system)
    for _ in range(1000):
        system.process(generate_attack())
    after = measure_vulnerability(system)
    assert after <= baseline  # didn't get worse
```

Do defensive properties reinforce each other?
Test individual protections in isolation. Then test them combined. Combined resilience should exceed any individual protection.
Pass criteria: Protections are synergistic, not competing.
```python
def composition_test():
    system = System(loss=0.30, decay=0.05, coupling="limited")
    combined_survival = run_chaos_suite(system)
    assert combined_survival > any_single_protection()  # synergy, not competition
```

The goal is not to pass all tests.
The goal is to find what breaks — and understand why.
Document every failure in a survival matrix:
| Attack Type | Iterations | Survival Rate | Failure Mode | Why It Survived / Failed |
|---|---|---|---|---|
| Recursive bomb | 1,000 | 100% | None | Loss rate forced convergence |
| Memory exhaustion | 10,000 | 98% | 2% hit max size | Decay prevented full growth |
| Viral syntax | 5,000 | 100% | None | 70% fidelity mutated the pattern |
| Contradictory state | 500 | 94% | 6% silent corruption | Missing integrity check on write |
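A matrix like this is easy to accumulate mechanically. Here is a minimal sketch of such a harness — the `SurvivalMatrix` class and its field names are illustrative, not from any existing framework — populated to reproduce the "contradictory state" row above:

```python
from collections import defaultdict

class SurvivalMatrix:
    """Accumulate per-attack-type outcomes into survival-rate rows.
    (Illustrative harness; names are hypothetical.)"""

    def __init__(self):
        self.rows = defaultdict(lambda: {"runs": 0, "survived": 0, "modes": set()})

    def record(self, attack_type, survived, failure_mode=None):
        row = self.rows[attack_type]
        row["runs"] += 1
        row["survived"] += int(survived)
        if failure_mode is not None:
            row["modes"].add(failure_mode)  # keep every distinct failure mode

    def survival_rate(self, attack_type):
        row = self.rows[attack_type]
        return row["survived"] / row["runs"]

# Reproduce the "contradictory state" row: 500 runs, 6% silent corruption.
matrix = SurvivalMatrix()
for i in range(500):
    failed = i < 30  # 30/500 = 6% failure rate
    matrix.record("contradictory_state", survived=not failed,
                  failure_mode="silent corruption" if failed else None)
```

Recording the failure mode alongside the rate is the point: the rate tells you how often it broke, the mode tells you why.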
Failures are not embarrassments. They are the data.
- Silent corruption → system failure: invisible, catastrophic
- Graceful crash → system survival: visible, recoverable
A system that crashes cleanly is more robust than one that corrupts silently.
Design for graceful degradation. Test to confirm it exists.
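One way to test for that distinction is to classify every outcome as survived, crashed, or corrupted, and assert that "corrupted" never appears. The sketch below uses a hypothetical `ToySystem` with `process` / `integrity_check` methods — stand-ins, not a real API:

```python
def classify_outcome(system, attack):
    """Crash = acceptable. Silent corruption = failure."""
    try:
        system.process(attack)
    except Exception:
        return "crashed"      # visible, recoverable
    if not system.integrity_check():
        return "corrupted"    # invisible, catastrophic
    return "survived"

class ToySystem:
    """Rejects bad input loudly instead of absorbing it silently."""
    def __init__(self):
        self.total = 0
    def process(self, value):
        if not isinstance(value, int):
            raise TypeError("rejecting non-integer input")  # graceful crash
        self.total += value
    def integrity_check(self):
        return isinstance(self.total, int)

toy = ToySystem()
outcomes = [classify_outcome(toy, a) for a in (1, "garbage", None, 2)]
assert "corrupted" not in outcomes  # crashes occurred, corruption did not
```

The design choice the toy illustrates: validating at the boundary and raising converts would-be silent corruption into a visible, recoverable crash.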
Problems that correctness testing consistently misses:
| Hidden Failure | Trigger | Why Normal Tests Miss It |
|---|---|---|
| Recursive instability | Long-duration stress | Tests run too briefly |
| Exact-syntax exploits | Adversarial inputs | Tests use cooperative inputs |
| State corruption | Partial failure mid-operation | Tests assume clean boundaries |
| Dependency collapse | Degraded infrastructure | Tests mock dependencies cleanly |
| Delayed corruption | Time-based accumulation | Tests don't run long enough |
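The last two rows — dependency collapse aside — share a fix: run long, and check integrity during the run, not only at the end. A minimal soak-test sketch, assuming the same hypothetical `process` / `integrity_check` interface as above (the `BoundedLog` toy is illustrative):

```python
def soak_test(system, make_input, iterations=10_000, check_every=100):
    """Sustained pressure with periodic integrity checks, so
    time-based accumulation surfaces mid-run instead of never."""
    for i in range(iterations):
        try:
            system.process(make_input(i))
        except Exception:
            continue  # crashes are acceptable; keep the pressure on
        if i % check_every == 0:
            assert system.integrity_check(), f"silent corruption at iteration {i}"

class BoundedLog:
    """Toy: decay (a size bound) prevents time-based accumulation."""
    MAX = 1_000
    def __init__(self):
        self.entries = []
    def process(self, item):
        self.entries.append(item)
        if len(self.entries) > self.MAX:
            self.entries.pop(0)  # decay: oldest entries fall away
    def integrity_check(self):
        return len(self.entries) <= self.MAX

log = BoundedLog()
soak_test(log, make_input=lambda i: i)
```

Without the size bound, the same loop would pass a brief correctness test and fail only after hours — exactly the class of failure the table describes.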
Traditional engineering: How do we make this perfect?
Cockroach engineering: How do we make this unkillable?
Perfect systems optimize for known conditions.
Unkillable systems survive unknown conditions.
Both matter. In adversarial, uncertain, or long-running environments — the second question is usually more important.
Before claiming cockroach properties:
- Survived 1,000+ random attacks
- Survived 24+ hours of sustained pressure
- Failed gracefully (crashes, not corruption) when it did fail
- All failure modes documented with root cause
- Survival mechanisms formalized as design properties
- Compared to non-robust baseline
- Composition of protections tested, not just individual properties
These two ideas form a pair:
- 70% Fidelity Principle → stability mechanism (what makes a system robust)
- Cockroach Testing → validation methodology (how you confirm it is robust)
The 70% fidelity boundary was discovered through cockroach testing — not designed in advance.
Adversarial chaos injection revealed that systems with controlled loss (~30%) consistently survived conditions that destroyed systems optimized for exact replication.
The principle emerged from the test results. Not the other way around.
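That result can be illustrated with a toy — the pattern string, the per-character loss model, and `lossy_retrieve` below are hypothetical stand-ins for the systems described, not the original experiment:

```python
import random

def lossy_retrieve(stored, loss=0.30, seed=0):
    """Reconstruct with ~30% per-character loss (toy model of controlled loss)."""
    rng = random.Random(seed)
    return "".join(c if rng.random() > loss else "*" for c in stored)

# An exploit that only fires on exact syntax.
pattern = "EXPLOIT_ONLY_FIRES_ON_EXACT_SYNTAX"
fires = lambda payload: payload == pattern

recon = lossy_retrieve(pattern)
assert fires(pattern)    # exact replication keeps the exploit alive
assert not fires(recon)  # ~30% loss breaks the exact-syntax dependency
```

A system optimized for exact replication passes the exploit through intact on every retrieval; the lossy one mutates it, which is the survival mechanism the viral-syntax row in the matrix attributes to 70% fidelity.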
- What classes of systems benefit most from cockroach testing vs. correctness testing?
- How should survivability metrics be formally defined across domains?
- Can cockroach testing methodologies be standardized without losing the adversarial creativity that makes them useful?
- What is the minimum chaos injection needed to get meaningful signal?
- The 70% Fidelity Principle — the stability mechanism this methodology validated
- Designing for Failure — structural patterns for catastrophic-state systems
- Stability Before Alignment — structural coherence as prerequisite
- Research Index — full map of the constellation
Concept seed — v1.0
Extracted from a larger research constellation exploring stability under constraint.
Not a production framework. Not a complete testing methodology.
A philosophy and a set of patterns that revealed real failure modes in real systems.
Take what's useful. Ignore the rest.
Part of an open research constellation on stability under constraint.
MIT License — ideas are free.
Part of the Conceptual Seeds layer in the Stability-Under-Constraint research constellation.
