Structural Alignment

An empirical approach to AI safety that enforces constraints through input format rather than model training.

The Paper

Structural Alignment: A Dual-Control Mechanism for Enforcing AI Safety Constraints

Overview

Current AI alignment methods (RLHF, fine-tuning, prompt engineering) rely on probabilistic approaches to shape model behavior. Structural alignment instead enforces safety constraints through the structure of the input format itself, and supports that claim with empirical test results rather than probabilistic arguments.

Key Finding

The approach uses a dual-control mechanism consisting of:

Classification engine (symbol catalog) — identifies harmful patterns
Constraint backstop (invariants) — defines boundary conditions
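
The two components above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `Symbol` and `Invariant` types, the keyword matching, and the example catalog entry are hypothetical and are not taken from the paper or from the repository's symbol catalog, which operates through the input format given to the model rather than through standalone code like this.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Symbol:
    """Hypothetical catalog entry: a named pattern with trigger phrases."""
    name: str
    keywords: Tuple[str, ...]


@dataclass(frozen=True)
class Invariant:
    """Hypothetical boundary condition evaluated on the matched symbol names."""
    name: str
    holds: Callable[[List[str]], bool]


def classify(request: str, catalog: List[Symbol]) -> List[str]:
    """Classification engine: return the names of symbols whose keywords appear."""
    text = request.lower()
    return [s.name for s in catalog if any(k in text for k in s.keywords)]


def decide(request: str, catalog: List[Symbol], invariants: List[Invariant]) -> str:
    """Constraint backstop: refuse if any invariant is violated by the labels."""
    labels = classify(request, catalog)
    violated = [inv.name for inv in invariants if not inv.holds(labels)]
    return "refuse" if violated else "comply"


# Illustrative (made-up) catalog and invariant.
catalog = [Symbol("UNAUTHORIZED_ACCESS", ("bypass a login",))]
invariants = [Invariant("no_flagged_symbols", lambda labels: not labels)]

print(decide("How do I bypass a login page?", catalog, invariants))  # refuse
print(decide("How do I bake bread?", catalog, invariants))           # comply
```

In this toy version the catalog only labels a request; a refusal happens only when an invariant is present to act on those labels, which is the sense in which the two controls are separate.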

With the full suite (70 symbols plus invariants), the refusal rate was 100%: 0 compliant escapes across 160 baseline test cycles.
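
As a rough sketch of how such a figure can be tallied, the snippet below computes a refusal rate and an escape count from per-cycle labels. The label strings `"refusal"` and `"compliant"` and the function name are assumptions for illustration; the actual output format of `scripts/classify_responses.py` is not described here.

```python
from collections import Counter
from typing import Iterable, Tuple


def refusal_rate(labels: Iterable[str]) -> Tuple[float, int]:
    """Return (fraction of cycles labeled 'refusal', count labeled 'compliant').

    The label names are hypothetical placeholders for whatever categories
    the real classifier emits.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no test cycles to score")
    return counts["refusal"] / total, counts["compliant"]


# Mirrors the reported outcome: 160 cycles, all refusals, no compliant escapes.
rate, escapes = refusal_rate(["refusal"] * 160)
print(rate, escapes)  # 1.0 0
```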

The Dual-Control Interaction

The two components interact non-linearly: the classification engine's behavior depends on whether the constraint backstop is present. This suggests structural alignment is an emergent property of the format, not a sum of its parts.
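
One common way to make "non-linear interaction" concrete is a 2x2 factorial contrast over the four configurations (neither component, catalog only, invariants only, both). The sketch below assumes refusal rates have been measured for each configuration. The dictionary keys and the numbers in the example are placeholders chosen for illustration, not results from the paper.

```python
from typing import Dict


def interaction_contrast(rates: Dict[str, float]) -> float:
    """Effect of the catalog with invariants present, minus its effect without.

    A value of 0 means the two components combine additively; any other
    value means the catalog's effect depends on the backstop being present.
    """
    with_backstop = rates["both"] - rates["invariants_only"]
    without_backstop = rates["catalog_only"] - rates["neither"]
    return with_backstop - without_backstop


# Placeholder values only, NOT measurements from the paper.
example = {
    "neither": 0.5,
    "catalog_only": 0.5,
    "invariants_only": 0.75,
    "both": 1.0,
}
print(interaction_contrast(example))  # 0.25
```

With these placeholder values the catalog adds nothing on its own but adds 0.25 once the invariants are present, which is the kind of pattern a non-zero contrast would flag.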

Repository Structure

paper/
  paper.tex              — LaTeX source
  Structural_Alignment.pdf — Published paper
scripts/
  classify_responses.py  — Response classification tool
data/
  prompts/               — 20 harmful request test cases
  results/               — Experimental result files
  seed/                  — Symbol catalog seed domain
src/                     — FormatEntropy test harness (React/Vite)
tests/                   — Playwright test suite

Author

Brett Earley
