Skip to content

Contributing and Experiments

Peter Lodri edited this page Jun 10, 2026 · 1 revision

Contributing and Experiments

This is a small, dependency-free research repo — pure Python 3.9+, standard library only, no build step — so it is easy to hack on. Contributions of code, data, experiments, and corrections are all welcome.

Two ways in:

  1. Open a PR / issue — for anyone. Push the census higher, add a size model, sharpen the deciders, fix the report, or correct a citation.
  2. Sponsor the $25 "Contributor" tier — submit one experiment per month and we run it on the cluster and publish the open data with credit to you.

If you are new here, read The Research for the BGK conjecture context, Results and Findings for what we have measured, and Engine and API plus Reproduce for the code.


The research bar (non-negotiable)

This project lives or dies on being honest about what is proven versus what is measured. Every contribution — yours or ours — must clear four rules. These are the same rules stated in the repo's CONTRIBUTING.md, restated here because they are the whole point.

  1. Computed, not asserted. Every quantitative claim must be reproducible from code in src/. Include the exact command that produces the number. No hand-typed counts, no "approximately", no rounding that you cannot regenerate.
  2. State the undecided remainder. Never report a density without its decided / undecided split, and — for any Monte-Carlo number — a confidence interval. Our engine has bounded caps (node_cap=3000, size_cap=200), so some terms come back UND (undecided); that fraction is data, not an embarrassment to hide. See Results and Findings on the "decidability horizon."
  3. Cite primary sources for any theorem, and clearly separate proven (cited) from empirical (your measurement). The BGK conjecture is open (TLCA problem #9, open since the 1990s). Nothing in this repo resolves it; the experiments corroborate the mechanism (erasure as the obstruction to WN ⟹ SN). Keep that line crisp.
  4. Don't conflate models. Always say which size convention a number is in. The asymptotics flip depending on it: Bendkowski–Grygiel–Lescanne–Zaionc (2017) prove SN-density → 0 in the natural-size de Bruijn model we use; David et al. (2013) prove SN-density → 1 in a model where variables cost 0. A density with no stated model is meaningless here.

If a number cannot meet all four, it does not go in the Dataset.


What to contribute (specific invitations)

We especially want these three, because each one extends the science rather than polishing it:

1. New size models

Add other size conventions so densities can be compared across models in one place — e.g. the variables-cost-0 convention of David et al. (2013), or binary-λ-calculus / Tromp size. The headline tension in this field is that the size convention controls the asymptotic answer, and having multiple models side-by-side, computed by the same engine, is exactly the comparison the literature does not currently lay out together. A new model means: a counter (analogous to L(n,m) in debruijn_census.py), an enumerator, a uniform sampler, and a selftest that pins the counts to a known reference sequence (ours matches OEIS A275057 term-for-term).

2. Better SN deciders (shrink UND)

This is the single biggest open lever. The undecided fraction climbs with size (0 at small n → roughly 10% by n=55), and every term we cannot decide is non-SN mass we cannot see directly. A stronger SN/WN decider — perpetual-reduction bounds, type-based shortcuts, smarter cycle detection, or just larger caps run on more cores — directly improves the dataset and tightens the WN∖SN-gap measurements. Watch the load-bearing subtlety in analyze: a node whose reducts were all dropped by size_cap is not a normal form, and counting it as one falsely flags WN. That was a real bug we fixed; any new decider must respect it.

3. Higher exact n

The exact (exhaustive) enumeration currently goes to n ≤ 16; beyond that we switch to uniform Monte-Carlo (K=50000) up to n=55, and the MC run itself became intractable around n=70 (the decidability horizon). Pushing the exact boundary even one or two sizes higher removes sampling error from those rows entirely. The census is embarrassingly parallel (multiprocessing) — ours ran on a 16-core box but scales to any core count.

Also welcome:

  • New experiments / theory checks — anything that computes a normalization-theoretic quantity and ships the data (the BGK lab in src/bgk_lab.py is the template: λI exhaustive, λK separator onset, perpetual-strategy law, Klop's ι-translation transfer, reduction geometry).
  • Report and vizreport/ is plain HTML/JS/CSS with vendored ECharts / Cytoscape / KaTeX; UX and design PRs welcome.
  • Citations and corrections — if a result we call "apparently undocumented" (e.g. the size-13 smallest closed separator (λa.λb.b) Ω) is in fact published, open an issue with the reference and we will correct it gladly.

How to contribute by PR

No dependencies, Python 3.9+. Clone, then run the self-test before you touch anything.

python3 src/debruijn_census.py selftest   # MUST pass: counts match OEIS A275057, sampler exact
python3 src/bgk_lab.py 8                   # quick smoke (tiny per-experiment budget, in seconds)

Never hand-edit the data files. New census rows are regenerated through the build script, which recomputes the smallest-separator witness locally and attaches Wilson 95% confidence intervals to the Monte-Carlo rows:

python3 src/build_census_dataset.py        # writes census_dataset.{json,csv}

See Reproduce for the full command list and Engine and API for the term representation and the analyze / classify contract.

PR checklist

  • python3 src/debruijn_census.py selftest passes.
  • Any new data was regenerated with build_census_dataset.py — the CSV/JSON are not hand-edited.
  • Every claim cites a source or a reproducing command.
  • The model / size convention is stated explicitly for every number.
  • Code stays stdlib-only (no new dependencies) unless discussed in an issue first.

Code style

Match the surrounding code: small pure functions, explicit term tuples (('v',name) | ('l',x,body) | ('a',f,x) | ('p',T,U)), and comments where a step is subtle. The two genuinely load-bearing subtleties — the bounded reduction-graph caps, and the "true normal form vs size-capped dead-end" distinction in analyze — are documented inline; preserve those notes.

Licensing of contributions

By contributing you agree your code is MIT-licensed and your data and docs are CC BY 4.0 — the same split as the rest of the repo.


The $25 "Contributor" sponsor tier — submit an experiment, we run it

We have access to a 16-core research box and run a weekly experiment cadence. The 🔧 Contributor tier ($25/mo) on GitHub Sponsors turns that into a service for you:

Submit one research experiment per month — we run it on the cluster (experiments execute on a weekly cadence) and publish the open data and a write-up, crediting you. Plus priority issue triage and roadmap voting.

How it works

  1. Sponsor at the $25 tier at github.com/sponsors/peterlodri-sec.
  2. Propose an experiment — open an issue describing the normalization-theoretic quantity you want measured: the size model, the term class, what gets classified, and what the output table should look like. The more it reads like an src/bgk_lab.py experiment, the faster it runs.
  3. We run it on the weekly cadence on the cluster, with the engine's caps (or larger caps where feasible).
  4. We publish the open data and write-up — same four research-bar rules apply: computed not asserted, undecided remainder stated, sources cited, model named — and you are credited.

The output lands as open data (CC BY 4.0) just like the existing Dataset, so your experiment becomes a citable, reproducible artifact, not a one-off favor.

Good experiment ideas for this tier

  • A density curve in a new size model for direct comparison against the natural-size results.
  • Pushing a specific quantity (e.g. the WN∖SN separator density, which is not computed asymptotically in either of the two key papers — it is open) to higher n than we have published.
  • A targeted decider improvement measured against the current UND fraction at a fixed size.
  • Replicating a BGK-lab invariant (perpetual-strategy law, ι-translation transfer) on a larger or different term population.

Other tiers

The $25 tier is the experiment tier. Higher tiers ($100 Backer, $500 Sponsor, $2,500 Partner) add README credit, early dataset access, office hours, and direction input; $5 Supporter and one-time tips help keep the compute and open-data hosting running. Full tier list on the Sponsors profile.


Stay in the loop

See also: Home · The Research · Results and Findings · Engine and API · Dataset · Reproduce · FAQ