# FluxEM Demo

Two demonstrations of the same principle:
1. **Arithmetic embeddings** with homomorphism guarantees
2. **SCAN oracle baseline** (rule execution vs discovery)

The thesis: **Separate rule execution from rule discovery.** When the rule system is known (or cheaply recoverable), bake it into the representation.

## Part 1: Arithmetic Embeddings

FluxEM provides algebraic embeddings in which arithmetic on numbers becomes vector arithmetic on their embeddings:

In [None]:
from fluxem import create_unified_model

model = create_unified_model()

# The homomorphism property: embed(a) + embed(b) = embed(a+b)
# This is algebraic, not learned.
print("FluxEM Arithmetic: Homomorphic Embeddings")
print("=" * 50)
print(f"1847 * 392 = {model.compute('1847*392')}")
print(f"123456 + 789012 = {model.compute('123456+789012')}")
print(f"56088 / 123 = {model.compute('56088/123')}")
print(f"99999 - 12345 = {model.compute('99999-12345')}")

### Why It Works

- **Addition/Subtraction**: Linear embeddings where `embed(a) + embed(b) = embed(a+b)`
- **Multiplication/Division**: Log-space embeddings where `log_embed(a) + log_embed(b) = log_embed(a*b)`

The structure is the computation: there are no learned weights, so there is nothing that can fail to generalize out of distribution (OOD).
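
As a rough illustration of the two bullet points above, the sketch below builds toy embeddings in plain Python. It is not FluxEM's implementation: `DIRECTION`, `linear_embed`, `log_embed`, and the decode helpers are hypothetical names chosen for this example, and the log-space version assumes strictly positive inputs.

```python
import math

# Arbitrary fixed direction vector (an assumption made for this sketch).
DIRECTION = [0.5, -0.25, 1.0, 2.0]
NORM_SQ = sum(d * d for d in DIRECTION)


def vec_add(u, v):
    """Element-wise vector addition."""
    return [a + b for a, b in zip(u, v)]


def linear_embed(x):
    """Embed x as x * DIRECTION, so embed(a) + embed(b) == embed(a + b)."""
    return [x * d for d in DIRECTION]


def linear_decode(v):
    """Recover the scalar by projecting onto DIRECTION."""
    return sum(vi * d for vi, d in zip(v, DIRECTION)) / NORM_SQ


def log_embed(x):
    """Embed log(x), so log_embed(a) + log_embed(b) == log_embed(a * b)."""
    if x <= 0:
        raise ValueError("this toy log embedding only handles x > 0")
    return linear_embed(math.log(x))


def log_decode(v):
    """Invert log_embed by decoding the logarithm and exponentiating."""
    return math.exp(linear_decode(v))


# Addition is vector addition in the linear space.
sum_result = linear_decode(vec_add(linear_embed(123456), linear_embed(789012)))

# Multiplication is vector addition in the log space.
product_result = log_decode(vec_add(log_embed(1847), log_embed(392)))

print(f"123456 + 789012 ~= {sum_result}")
print(f"1847 * 392 ~= {product_result}")
```

Both results are recovered only up to floating-point rounding, which is why the precision guarantees in `docs/ERROR_MODEL.md` matter.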

In [None]:
# Extended operations
import math

from fluxem import create_extended_ops

ops = create_extended_ops()
print("\nExtended Operations")
print("=" * 50)
print(f"sqrt(144) = {ops.sqrt(144)}")
print(f"2^10 = {ops.power(2, 10)}")
print(f"exp(1) = {ops.exp(1):.6f}")
print(f"ln(e) = {ops.ln(math.e):.6f}")  # pass e itself rather than a truncated literal

## Part 2: SCAN Oracle Baseline

The SCAN solver is an **oracle baseline**, not a learning result. It demonstrates that once compositional rules are known, execution is trivial.

Neural networks fail SCAN not because applying compositional rules is hard, but because **rule discovery from limited examples** is hard.
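
To make "execution is trivial" concrete, here is a minimal sketch of a SCAN interpreter in plain Python. It is an illustration written for this notebook, not the `AlgebraicSCANSolver` implementation; the function names and the list-of-tokens output format are assumptions. It follows the published SCAN grammar (primitives, `left`/`right`, `opposite`, `around`, `twice`/`thrice`, `and`/`after`).

```python
PRIMITIVES = {"walk": "I_WALK", "look": "I_LOOK", "run": "I_RUN", "jump": "I_JUMP"}
TURNS = {"left": "I_TURN_LEFT", "right": "I_TURN_RIGHT"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret_phrase(words):
    """Interpret one phrase that contains no 'and' / 'after'."""
    # A trailing 'twice' or 'thrice' repeats the whole phrase.
    repeat = 1
    if words[-1] in REPEATS:
        repeat = REPEATS[words[-1]]
        words = words[:-1]

    verb, modifiers = words[0], words[1:]
    if not modifiers:
        # Bare primitive, e.g. "jump" (a bare "turn" is not valid SCAN).
        out = [PRIMITIVES[verb]]
    else:
        # "turn" contributes no action of its own, only the turn tokens.
        action = [] if verb == "turn" else [PRIMITIVES[verb]]
        if len(modifiers) == 1:
            # "jump left": turn once, then act.
            out = [TURNS[modifiers[0]]] + action
        elif modifiers[0] == "opposite":
            # "jump opposite left": turn twice, then act.
            out = [TURNS[modifiers[1]]] * 2 + action
        elif modifiers[0] == "around":
            # "jump around left": (turn, act) four times.
            out = ([TURNS[modifiers[1]]] + action) * 4
        else:
            raise ValueError(f"unrecognized phrase: {' '.join(words)}")
    return out * repeat


def interpret(command):
    """Map a SCAN command string to its list of action tokens."""
    words = command.split()
    if "and" in words:
        i = words.index("and")
        return interpret_phrase(words[:i]) + interpret_phrase(words[i + 1:])
    if "after" in words:
        # "x after y" executes y first, then x.
        i = words.index("after")
        return interpret_phrase(words[i + 1:]) + interpret_phrase(words[:i])
    return interpret_phrase(words)


for cmd in ["jump twice", "walk around left", "jump opposite right thrice"]:
    print(f"{cmd!r} -> {' '.join(interpret(cmd))}")
```

The whole interpreter is a few dictionary lookups and list operations, which is the point of the oracle framing: nothing here has to be learned once the rules are written down.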

In [None]:
from fluxem import AlgebraicSCANSolver

solver = AlgebraicSCANSolver()

print("SCAN Oracle: Rule Execution vs Rule Discovery")
print("=" * 50)
print("This is an ORACLE BASELINE, not a learning result.")
print("It demonstrates: once rules are known, composition is trivial.\n")

examples = [
    "jump twice",
    "walk around left",
    "run right and look left",
    "jump opposite right thrice",
]

for cmd in examples:
    print(f"'{cmd}'")
    print(f"  -> {solver.solve(cmd)}\n")

### The Diagnostic

SCAN benchmarks conflate two capabilities:
1. **Rule discovery**: Inferring composition rules from examples
2. **Rule execution**: Applying known rules to new inputs

| Split | Oracle accuracy | Seq2seq accuracy | Interpretation |
|-------|-----------------|------------------|----------------|
| addprim_jump | 100% | ~1% | Seq2seq failed to discover the rule |
| addprim_turn_left | 100% | ~1% | Same: rule discovery failure |
| length | 100% | ~14% | Length extrapolation is trivial once rules are known |
| simple | 100% | ~99% | When examples cover the space, memorization works |

## The Research Direction

The interesting question is: **Can we learn the rules from limited examples?**

Milestones:
1. Few-shot lexicon induction (given parse tree, infer operator meanings)
2. Grammar + lexicon induction (infer both)
3. Noise/ambiguity robustness

This oracle baseline is step 0: verify that rule execution is trivial.

## Summary

FluxEM demonstrates: **when the rule system is known, bake it into the representation.**

- **Arithmetic**: known algebraic structure -> encode as geometry
- **SCAN**: known compositional rules -> encode as algebraic transformations

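Neither requires learning, and both generalize by construction: the SCAN oracle scores 100% on every split in the table above, and the arithmetic is exact up to the numerical precision described in `docs/ERROR_MODEL.md`.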

See:
- `docs/FORMAL_DEFINITION.md` for mathematical specification
- `docs/ERROR_MODEL.md` for precision guarantees
- `docs/SCAN_BASELINE.md` for full oracle framing