-
Notifications
You must be signed in to change notification settings - Fork 0
Engine and API
A developer's tour of src/. This page explains the term representation, the bounded reduction-graph analyzer analyze() (including the SN / WN / ¬WN decision rules and the one genuinely tricky subtlety we had to fix), the de Bruijn census counter/sampler/classifier, and the seven BGK lab experiments. There are short, copy-pasteable examples of importing and calling the key functions.
Everything is pure Python 3.9+ standard library — no dependencies, no pip install. If you want the project at a glance first, start at Home; for the mathematics see The Research; for the numbers see Results and Findings; to run it yourself see Reproduce; for the data layout see Dataset; to add your own experiment see Contributing and Experiments.
Honesty note up front: this engine decides SN / WN on a bounded reduction graph. Inside the bound the answers are exact (proven by construction below). Outside the bound it answers
None("undecided"). It does not, and cannot, settle the open Barendregt–Geuvers–Klop conjecture — it gathers evidence about the mechanism. Where a claim is empirical we say so.
| File | Role |
|---|---|
bgk_lab.py |
The core λ-calculus engine: term ops, β-reduction, the analyze() reduction-graph decider, plus the seven experiments (E1–E7). |
debruijn_census.py |
The headline open-data contribution: a natural-size de Bruijn counter, exact enumerator, uniform sampler, and classifier that reuses analyze(). |
build_census_dataset.py |
Assembles the publishable dataset (per-size densities, Wilson CIs, smallest-separator witness) into JSON + CSV. See Dataset. |
normalize_check.py |
A standalone, narrated validation harness (the C1–C4 checks) that demonstrates the conservation story by executable reduction, not assertion. |
bgk_lab.py is the engine; debruijn_census.py imports from it. normalize_check.py is an earlier, self-contained sibling of the same machinery used for the human-readable walkthrough — note its analyze() carries the older size-cap behaviour (see the subtlety below), so treat bgk_lab.analyze as canonical.
A λ-term is a small tagged tuple. Four constructors:
('v', name) # variable, name is a string
('l', x, body) # abstraction λx.body
('a', f, x) # application (f x)
('p', T, U) # Klop "memory pair" [T,U] — only used by the ι-translation / λI[,] calculusbgk_lab.py exposes named constructors so call sites read like math:
from bgk_lab import V, L, A, show, free_vars, size
I = L('x', V('x')) # λx.x
Delta = L('x', A(V('x'), V('x'))) # λx.(x x)
Omega = A(Delta, Delta) # (λx.xx)(λx.xx) — the canonical non-terminating term
print(show(Omega)) # λx.x x (λx.x x)
print(free_vars(Omega), size(Omega)) # set() 7Notes:
-
free_vars,subst,db_key, etc. all handle the'p'pair case so the extended λI[,] calculus shares one code path. -
size(t)here is the structural size (variables, λ, app each contribute, by1 + …). This is not the de Bruijn natural size used by the census — that lives indebruijn_census.dbsize(|index k| = k+1). Keep the two straight:sizeis the engine's internal node measure; natural size is the census's classification axis. - Capture-avoiding substitution (
subst) renames a binder viafresh()only when the substituend's free variables would be captured, so α-equivalence is respected.
To compare terms up to α-equivalence, the engine maps each term to a de Bruijn key:
('b', i) # bound variable, i = de Bruijn index into the binder env
('f', name) # free variable
('l', body_key)
('a', f_key, x_key)
('p', T_key, U_key)Two α-equivalent terms produce the same db_key. This key is what the reduction graph is keyed on — so the graph is an α-quotient: distinct-but-α-equal reducts collapse to one node.
from bgk_lab import is_redex, contract, one_step
is_redex(('a', ('l', 'x', body), arg)) # True: a top-level (λx.body) arg
contract(redex) # β-contract the top redex: body[x := arg]
one_step(t) # list of ALL terms reachable by one β-step (any redex)one_step does full (non-deterministic) β-reduction: it returns every one-step reduct, contracting the head redex if present and recursing into every sub-position. This is what lets analyze build the entire reachable graph rather than following a single strategy.
For strategy-specific walks there are also leftmost-outermost (lo_path / reduce_at / normal_order_len) and the perpetual strategy (perpetual_step), discussed under E4 below.
This is the heart of the engine.
from bgk_lab import analyze
a = analyze(term, node_cap=4000, size_cap=220)
# a = {'sn': True|False|None, 'wn': True|False|None,
# 'nodes': int, 'trunc': bool, 'nodes_map': {...}, 'succ': {...}, 'nf': [...]}graph(t, node_cap, size_cap) performs a breadth-first exploration from t's α-key, following one_step. It records a successor map succ (α-key → list of α-keys) and a node map (α-key → representative term). Two caps keep it finite:
-
size_cap— drop any reduct whose structuralsizeexceeds the cap (and flagtrunc). -
node_cap— stop once the graph exceeds this many nodes (and flagtrunc).
The graph is α-quotiented (keyed on db_key), so cycles in reduction become real cycles in the graph.
SN ⟺ the reachable reduction graph is finite and acyclic
¬SN ⟺ the graph contains a β-cycle (a cycle is a witness of an infinite reduction)
WN ⟺ a TRUE normal form is reachable
¬WN ⟺ the graph is fully explored (not truncated) and contains no normal form
Concretely:
-
SN.
has_cycle(succ)runs an iterative (stack-based, recursion-safe) DFS. If it finds a back-edge to a node on the current DFS stack, there is a cycle →sn = False. If there is no cycle and the graph was fully explored (not trunc), the reachable graph is a finite DAG, so every reduction sequence terminates →sn = True. If there's no cycle but we truncated, we cannot rule out divergence beyond the cap →sn = None. -
WN. If any explored node is a true normal form,
wn = True(we have witnessed a normalizing path). If there is no normal form and we fully explored, then no normal form exists →wn = False. If there is no normal form but we truncated →wn = None.
A None in either field means "the caps were hit before a decision could be forced." The census reports such terms as UND (undecided).
This is the one place where a naive implementation is silently wrong, so it deserves emphasis for contributors.
A graph node with no recorded successors looks like a normal form. But it might only look terminal because all of its reducts were dropped by size_cap. Such a node is not a normal form — it still has a redex; we just refused to follow it because the result was too big. Counting it as a normal form would falsely set wn = True, manufacturing phantom weakly-normalizing terms.
The fix in bgk_lab.analyze re-checks each apparently-terminal node:
true_nf = []
for k in succ:
if not succ[k]:
if one_step(nodes[k]): # it HAD reducts — they were pruned by size_cap
trunc = True # so this node is not terminal; mark truncated
else:
true_nf.append(k) # genuinely no redex → a real normal form
wn = True if true_nf else (None if trunc else False)So WN is decided only from true normal forms (one_step returns []), and a node whose reducts were merely too large flips the result to truncated/None rather than to a false True. The earlier analyze in normalize_check.py predates this fix and treats any successor-less node as an NF — fine for its curated, hand-checked examples, but not safe for blind enumeration. The census uses the corrected bgk_lab.analyze.
from bgk_lab import V, L, A, analyze
Delta = L('x', A(V('x'), V('x'))); Omega = A(Delta, Delta)
analyze(Omega)['sn'], analyze(Omega)['wn'] # (False, False) — Ω→Ω is a 1-cycle: no NF, not SN
KIO = A(L('x', V('y')), Omega) # (λx.y)Ω — erasing fn applied to Ω
r = analyze(KIO)
r['wn'], r['sn'] # (True, False) — the canonical SEPARATOR(λx.y)Ω is weakly normalizing (leftmost-outermost erases the diverging argument and reaches y) but not strongly normalizing (you can also reduce inside Ω forever). That WN ∧ ¬SN combination is what we call a separator, and erasure (a bound variable not occurring in the body) is exactly what manufactures one. See The Research for why this is the crux of the conjecture.
The census classifies closed untyped λ-terms by natural size in the de Bruijn model:
|index k| = k + 1 (0-based de Bruijn index)
|λ M| = 1 + |M|
|M N| = 1 + |M| + |N|
Internal de Bruijn term repr (distinct from the engine's named repr): ('i', k) | ('l', M) | ('a', M, N).
L(n, m) is the number of de Bruijn terms of natural size n valid in a context of m binders (indices < m), via the obvious recurrence (memoized with functools.lru_cache). Closed terms of size n are L(n, 0):
from debruijn_census import closed_count
[closed_count(n) for n in range(1, 14)]
# [0, 1, 1, 3, 6, 17, 41, 116, 313, 895, 2550, 7450, 21881]These counts are exactly OEIS A275057 (Lescanne 2016), validated term-for-term by the selftest. (Note: closed_count(1) = 0 — there is no closed term of natural size 1, since the smallest index 0 already has size 1 but is not closed.)
from debruijn_census import enum
terms13 = enum(13, 0) # every closed de Bruijn term of natural size 13
len(terms13) # 21881 (== closed_count(13))len(enum(n, 0)) == closed_count(n) is asserted in the selftest for n ≤ 12.
For sizes too large to enumerate, sample draws a term uniformly at random among all size-n terms in context m, top-down by the same counts (L) used for counting — a standard rank/unrank construction. sample_idx is the deterministic enumeration-rank companion kept consistent with sample.
import random
from debruijn_census import sample, dbsize
rng = random.Random(12345)
t = sample(40, 0, rng)
dbsize(t) # 40 — every sample has exactly the requested natural sizeBridges to the engine: to_named converts the de Bruijn term to the engine's named repr (using a binder stack so de Bruijn 0 is the innermost binder), then calls bgk_lab.analyze, then maps (sn, wn) to one of four labels:
from debruijn_census import classify, enum
classify(enum(9, 0)[0]) # e.g. 'SN'| Label | Meaning | Decided from |
|---|---|---|
SN |
strongly normalizing | sn is True |
SEP |
separator = WN ∧ ¬SN
|
sn is False and wn is True |
NWN |
no normal form (not weakly normalizing) | sn is False and wn is False |
UND |
undecided within engine caps | otherwise (a None) |
python3 src/debruijn_census.py selftest # counts vs A275057, sampler size checks, n≤12 classification preview
python3 src/debruijn_census.py census # full run → JSON on stdout (uses all cores)run_census(ex_max, mc_plan, K, seed) enumerates exhaustively for n up to ex_max (default 16) and Monte-Carlo samples (default K = 50000) for the sizes in mc_plan (default up to n = 55); it uses a multiprocessing.Pool across all cores. The plan stops at 55 because the bounded analysis becomes intractable around n = 70 — the decidability horizon. The run also records the smallest separator witness. See Results and Findings and Reproduce.
Implementation detail for contributors:
debruijn_census.pyimportsbgk_labviasys.path.insert(0, '/tmp/bgk')/'/tmp'. If you run from a checkout, make surebgk_lab.pyis importable (run fromsrc/, setPYTHONPATH=src, or adjust the path insert). See Reproduce.
python3 src/bgk_lab.py [seconds_per_idea] runs all seven, each wall-clock-budgeted (default 540s per experiment), and emits one JSON document {meta, E1..E7} on stdout (also incrementally persisted to /tmp/bgk_results.json). Most experiments fan out over a multiprocessing.Pool. The headline run was on a 16-core box.
| # | Function | What it checks | Headline outcome |
|---|---|---|---|
| E1 | E1_conservation |
Exhaustively enumerate small λI terms (no erasing abstractions) and classify each. Conservation predicts no separators in λI. | 0 separators across ~3.13M λI terms. |
| E2 | E2_separators |
Exhaustive λK enumeration by size; record separator onset and density. | Separator onset at size 12 (in this experiment's own ad-hoc named enumeration — distinct from the census's closed-de-Bruijn natural size 13). |
| E3 | E3_geometry |
For SN terms, measure reduction geometry: longest vs. shortest path to NF (de Vrijer-style L(·)), max term blow-up, graph size. | Scatter + per-size stats for the report. |
| E4 | E4_perpetual |
Test the law "perpetual strategy diverges ⟺ ¬SN" over enumerated terms. | 0 violations / ~762k checks. |
| E5 | E5_transfer |
Apply Klop's ι-translation and test SN(M) ⟺ SN(ι(M)). | 0 violations / ~711k checks. |
| E6 | E6_speedup |
Parallel-speedup benchmark of the analyze workload across core counts. |
Speedup/efficiency table. |
| E7 | E7_zoo |
Build small reduction graphs for a curated zoo (I, K, S, Ω, the separator (λx.y)Ω, S K K, …) for the interactive report. |
Node/edge graphs per term. |
The exact counts above (3.13M, 762k, 711k; onset sizes) are empirical results of a particular run with the given caps and budget — see Results and Findings. Reproductions will match the qualitative findings (zero violations, the onset sizes); the totals depend on how far each budget gets.
-
Perpetual strategy.
perpetual_stepimplements a maximal/perpetual one-step strategy: at an outermost erasing redex(λx.A)Bwithx ∉ FV(A)andBnot yet normal, it steps into the doomed argumentBinstead of discarding it — exposing the infinite path that normal-order reduction hides.perpetual_diverges(t, bound)reports whether that strategy fails to normalize. E4 checks it agrees withanalyze's SN verdict on every term. -
ι-translation.
iota(t)transports an erasing λK term into the non-erasing λI[,] calculus: an erasingλx.bodybecomesλx.[ι(body), x]— the memory pair('p', …)"parks" the otherwise-discardedx, making the abstraction a legal λI term.is_lambda_I(t)checks the λI property;one_step_ext/analyze_extadd the π-rule[T,U] T' → [T T', U]for reducing the extended calculus. E5 checks SN is preserved across the translation. -
λI enumeration.
gen_by_size(n, alphabet, lambda_I)(built ongen_terms) enumerates named terms of structural sizen, optionally restricted to λI (every binder's variable must occur in its body). E1/E2 drive this. -
Graph metrics.
longest_path(DP over the DAG; only meaningful for SN terms),shortest_to_nf(BFS), and thenodes_map/succreturned byanalyzefeed E3 and E7.
from multiprocessing import Pool, cpu_count
import bgk_lab
with Pool(cpu_count()) as pool:
res = bgk_lab.E1_conservation(budget=60.0, pool=pool, alpha=['y'])
print(res['total_terms'], res['total_separators']) # expect (..., 0)A standalone script (python3 src/normalize_check.py) that prints the conservation story step by step from executable reductions: C1/C2 show (λx.y)Ω is the canonical WN-but-not-SN separator and Ω is a 1-cycle; C3 shows the perpetual strategy walking into the erased argument; C4 shows λI is separator-free (exhaustive, consistent with Conservation) while a curated λK family produces separators. It is a good first read for contributors who want the intuition before diving into analyze. (Its in-file analyze is the pre-fix version — see the subtlety above; use bgk_lab.analyze for any quantitative work.)
- Want to run all of this and reproduce the numbers? → Reproduce
- Want the actual classification tables and densities? → Results and Findings
- Want to add an experiment or a decider improvement? → Contributing and Experiments
- Want the column-by-column data spec? → Dataset
- New here? → Home · curious about the math? → The Research · common questions → FAQ
Code is MIT; data and docs are CC BY 4.0. Contact: cabotage@pm.me. Newsletter & cross-project info: crabcc.app/about. Support: github.com/sponsors/peterlodri-sec.