feat(scripts): AffineScript port of check-ts-allowlist (TS->AS campaign STEP 2 seed) #284

Merged: hyperpolymath merged 2 commits into main from feat/ts-to-affinescript-check-allowlist on May 30, 2026
@hyperpolymath
Owner

Summary

Adds scripts/check-ts-allowlist.affine as the AffineScript port of the existing scripts/check-ts-allowlist.ts, filed under the estate TS->AffineScript migration campaign (#239 umbrella, STEP 2 — tail-batch-1 per standards#241).

Self-referential: the script that enforces the no-new-TypeScript policy is itself one of the TS files that policy applies to. Landing the AffineScript version is a symbolic milestone for the meta-tool.

Pattern: phronesis#19 seed style. Add .affine alongside the live .ts. No workflow change in this PR; CI keeps invoking the .ts via deno run --allow-read .standards-checkout/scripts/check-ts-allowlist.ts. Workflow cutover (compile .affine to .deno.js, wire wrapper, retire .ts) is a follow-up issue.

Behaviour equivalence

The regression suite at scripts/tests/check-ts-allowlist-test.sh exercises 13 cases (builtin allowlist classes + Layer-2 CLAUDE.md exemptions + Layer-2.5 governance-allowlist + dotted-dir skip + multi-heading-table parsing).

  • All 13 PASS against the AffineScript-emitted .deno.js (verified locally with a sibling harness using the same fixtures)
  • All 13 PASS against the original .ts (no regression)

Both implementations are behaviour-equivalent on the substring assertions the suite makes. Cosmetic difference: the .affine port uses ASCII [FAIL] / [OK] sentinels instead of the original emoji (see "Seam findings" below).

Stdlib surface used

All externs already shipped in stdlib/Deno.affine via affinescript#445:

  • walkRecursive (recursive file enumeration)
  • regexMatch (JS RegExp.test wrapper)
  • readTextFile (synchronous file read; throws on missing — wrapped in try/catch)
  • args, exit, consoleError

Plus AffineScript builtins: string_get, string_sub, string_find, char_to_int, int_to_string, len.

AffineScript seam findings surfaced by this port

(Each would be a separate affinescript-repo PR — out of scope for this per-standards-repo PR per the campaign's ownership gate.)

  • String less-than lex-compare: not built-in. Implemented inline via byte-wise str_lt (calls char_to_int(string_get(...))). Bare a < b on String produces TypeMismatch (String, Int).
  • break/continue in while: BREAK/CONTINUE tokens are reserved in lib/parser.mly but no production rule uses them yet. Refactored two natural occurrences (s_trim inner loops + strip_leading_dot_slash) to combined-guard / sentinel-boolean forms.
  • Non-ASCII string literals: lower to \NNN escape sequences (e.g. "\\226\\157\\140", the decimal UTF-8 bytes of ❌) in --deno-esm output. JavaScript parses these as octal escapes, which strict-mode ESM rejects with SyntaxError: Octal escape sequences are not allowed in strict mode. Worked around with ASCII [FAIL] / [OK] sentinels.
  • Stale installed binary: the affinescript at ~/.local/bin/ predated PR #445 and silently emitted bare walkRecursive(".") calls (no __as_walkRecursive shim in the prelude). A trunk rebuild surfaces the new shims correctly.
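
The str_lt workaround from the first finding can be sketched in TypeScript as follows. This is a hypothetical analogue, not the port's code: the names `strLt` and `sortPaths` are invented here, and it assumes the comparison is by character code per index (the role `char_to_int(string_get(...))` plays in the port). The inner loop uses a combined guard rather than `break`, in the spirit of the second finding.

```typescript
// Hypothetical TypeScript analogue of an inline str_lt helper.
function strLt(a: string, b: string): boolean {
  const n = Math.min(a.length, b.length);
  let i = 0;
  // Combined guard instead of `break`: stop at the first differing position.
  while (i < n && a.charCodeAt(i) === b.charCodeAt(i)) {
    i++;
  }
  if (i < n) {
    return a.charCodeAt(i) < b.charCodeAt(i);
  }
  // One string is a prefix of the other: the shorter one sorts first.
  return a.length < b.length;
}

// Insertion sort driven only by strLt, standing in for the lex-sort of the
// violation list (the sort algorithm itself is an assumption).
function sortPaths(paths: string[]): string[] {
  const out = paths.slice();
  for (let i = 1; i < out.length; i++) {
    const item = out[i];
    let j = i - 1;
    while (j >= 0 && strLt(item, out[j])) {
      out[j + 1] = out[j];
      j--;
    }
    out[j + 1] = item;
  }
  return out;
}
```

Under these assumptions `strLt` agrees with JavaScript's native `<` on strings, which also compares code unit by code unit.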

Sequencing follow-ups (NOT part of this PR)

  • Compile-time wiring: build .affine -> .deno.js in CI; commit .deno.js as a generated artefact OR add a precompile step.
  • Workflow cutover: update .github/workflows/governance-reusable.yml to invoke the .deno.js (with the existing --allow-read scope).
  • Retire .ts: delete scripts/check-ts-allowlist.ts and update docs/EXEMPTION-MECHANISMS.adoc references.
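  • Upstream affinescript fixes for the three compiler seam findings above (String comparison, break/continue, non-ASCII literals); file as separate affinescript-repo issues. The stale-binary finding is a local install problem and needs no upstream change.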

Test plan

  • affinescript check type-checks the .affine
  • affinescript compile --deno-esm emits .deno.js with no octal-escape errors
  • 13/13 regression tests pass against the AffineScript-emitted .deno.js
  • 13/13 regression tests pass against the original .ts (no regression)
  • GPG-signed commits

Refs

🤖 Generated with Claude Code

…Refs #239)

Adds `scripts/check-ts-allowlist.affine` as the AffineScript port of
`scripts/check-ts-allowlist.ts`, filed under the estate TS->AffineScript
migration campaign (#239 umbrella, STEP 2 —
tail-batch-1 per standards#241). Self-referential — the script that
enforces the no-new-TypeScript policy is itself one of the TS files
that policy applies to. Symbolic landing for the meta-tool.

Pattern follows the phronesis#19 seed: add `.affine` alongside the live
`.ts`, no workflow change in this PR; CI keeps invoking the `.ts`. A
follow-up PR will compile the `.affine` to `.deno.js`, add a one-line
`.mjs` wrapper that calls `main()`, and retire the `.ts`.

Stdlib surface (all now in stdlib/Deno.affine after affinescript#445):
- walkRecursive, args, exit, consoleError, regexMatch
- readTextFile, endsWith, stripSuffix (pre-existing)

Behaviour mirrors the TS faithfully:
- Built-in directory + filename allowlist
- Layer 2 `.claude/CLAUDE.md` exemption-table parser (multi-table aware)
- Layer 2.5 `.governance-allowlist` plain-text glob list
- Glob->regex translation (`*` -> `.*`, `?` -> `.`, regex-escape chars)
- Exempt fallback: regex match, literal bare equality, trailing-slash prefix
- Lex-sort of the violation list (manual `str_lt` via `char_to_int`
  since AffineScript has no `<` on String)
- Same exit code (0 success / 1 violation) and same output text
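
A minimal sketch of the glob->regex translation and the exempt fallback listed above, written in TypeScript. The function names, the exact set of escaped characters, and the `^...$` anchoring are assumptions made for illustration; the source only states the `*` and `?` mappings and the three fallback checks.

```typescript
// Characters treated as regex metacharacters in this sketch (assumed set).
const REGEX_SPECIALS = "\\^$+.()|{}[]";

// Translate a glob into an anchored RegExp: `*` -> `.*`, `?` -> `.`,
// everything else literal (escaped when it is a regex metacharacter).
function globToRegex(glob: string): RegExp {
  let src = "";
  for (let i = 0; i < glob.length; i++) {
    const c = glob.charAt(i);
    if (c === "*") {
      src += ".*";
    } else if (c === "?") {
      src += ".";
    } else if (REGEX_SPECIALS.indexOf(c) >= 0) {
      src += "\\" + c;
    } else {
      src += c;
    }
  }
  return new RegExp("^" + src + "$");
}

// Exempt fallback in the order the list gives: regex match, literal
// equality, then trailing-slash directory prefix.
function isExempt(path: string, pattern: string): boolean {
  if (globToRegex(pattern).test(path)) return true;
  if (path === pattern) return true;
  return pattern.endsWith("/") && path.startsWith(pattern);
}
```

For example, `isExempt("vendor/lib/x.ts", "vendor/")` is true through the prefix rule, while `globToRegex("a.ts")` does not match `"axts"` because the dot is escaped.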

Drive-by: SPDX header normalised to MPL-2.0 (estate language-policy
2026-05-25; PMPL-1.0-or-later is the legacy form on the live `.ts`).

Oracle: `affinescript compile scripts/check-ts-allowlist.affine
-o /tmp/check-ts-allowlist.deno.js --deno-esm` -> exit 0, clean compile.

Co-Authored-By: Claude Opus 4.7 (parallel session) <noreply@anthropic.com>

Refs #239
Refs #241
…rral

Updates the seed port to call Deno.exit() directly inside main()
instead of returning an exit code. Rationale: the Deno-ESM backend
emits a top-level `await main();` (no `process.exit(main())` wiring),
so a returned Int would be discarded and the script would always
exit 0. Matching the TS original's `Deno.exit(1)` on violation
requires the host-side terminate.

The Int return type is kept so the type checker accepts both branches;
in practice the exit() calls never return (their signature in
Deno.affine documents this — the Int is for flow-compatibility with
if/else arms).
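
The reasoning above can be modelled in a few lines of TypeScript. This is only a model under stated assumptions: `runLikeEmittedModule` stands in for the emitted top-level `await main();`, and the `exit` callback is a hypothetical recorder used in place of a real `Deno.exit` so the sketch does not terminate the process.

```typescript
type ExitFn = (code: number) => void;

// Variant 1: main only returns the code. Nothing consumes it.
function mainReturning(violations: string[]): number {
  return violations.length > 0 ? 1 : 0;
}

// Variant 2: main calls the host-side exit itself; the Int return type is
// kept only so both branches type-check.
function mainExiting(violations: string[], exit: ExitFn): number {
  if (violations.length > 0) {
    exit(1);
  }
  return 0;
}

// Models the emitted entry point: the return value of main() is discarded,
// so the observed status is whatever exit() recorded, defaulting to 0.
function runLikeEmittedModule(main: (exit: ExitFn) => number): number {
  let status = 0;
  main((code) => {
    status = code;
  });
  return status;
}
```

With one violation, the returning variant is observed as status 0 and the exiting variant as status 1, which is the mismatch this commit fixes.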

Effect on follow-up sequencing: the `.mjs` wrapper from the original
plan is no longer needed; the follow-up PR can wire the workflow
directly to `node check-ts-allowlist.deno.js` (or invoke via deno).

Co-Authored-By: Claude Opus 4.7 (parallel session) <noreply@anthropic.com>

Refs #239
Refs #241
@github-actions

🔍 Hypatia Security Scan

Findings: 183 issues detected

| Severity | Count |
|---|---|
| 🔴 Critical | 65 |
| 🟠 High | 30 |
| 🟡 Medium | 88 |

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Action perpolymath/standards/.github/workflows/governance-reusable.yml@main\n needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in affinescript-verify.yml",
    "type": "unknown",
    "file": "affinescript-verify.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in boj-build.yml",
    "type": "unknown",
    "file": "boj-build.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in casket-pages.yml",
    "type": "unknown",
    "file": "casket-pages.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in casket-pages.yml",
    "type": "unknown",
    "file": "casket-pages.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in changelog-reusable.yml",
    "type": "unknown",
    "file": "changelog-reusable.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in codeql-reusable.yml",
    "type": "unknown",
    "file": "codeql-reusable.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in codeql.yml",
    "type": "unknown",
    "file": "codeql.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in deno-ci-reusable.yml",
    "type": "unknown",
    "file": "deno-ci-reusable.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in doc-format.yml",
    "type": "unknown",
    "file": "doc-format.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

@hyperpolymath hyperpolymath merged commit bfef9ad into main May 30, 2026
38 checks passed
@hyperpolymath hyperpolymath deleted the feat/ts-to-affinescript-check-allowlist branch May 30, 2026 13:19
hyperpolymath added a commit to hyperpolymath/affinescript that referenced this pull request May 30, 2026
 #460) (#463)

## Summary

Closes #460 — non-ASCII string literals in AffineScript source no longer
break strict-mode ESM in the Deno/Node JS backends.

## Root cause

OCaml's `String.escaped` emits non-ASCII bytes as `\NNN`
**decimal** sequences. JavaScript parses `\NNN` as **octal** escapes,
which strict-mode ESM rejects:

```
SyntaxError: Octal escape sequences are not allowed in strict mode.
```

(And even outside strict mode the bytes would decode to the wrong
characters — `\226` octal = 0x96, not the 0xE2 lead-byte of ❌.)
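
The mismatch can be reproduced directly in TypeScript. The sketch below is illustrative only: `decimalEscape` is a hypothetical stand-in that escapes every byte as three decimal digits (it does not reimplement `String.escaped`), and the two helpers use the `Function` constructor to parse the resulting literal in sloppy and strict mode.

```typescript
// Escape every byte as a three-digit decimal sequence, e.g. 226 -> \226.
function decimalEscape(bytes: number[]): string {
  return bytes.map((b) => "\\" + ("00" + b).slice(-3)).join("");
}

// Parse the escaped text as a sloppy-mode JS string literal and return its value.
function decodeSloppy(escaped: string): string {
  return new Function('return "' + escaped + '";')() as string;
}

// True when a strict-mode parse of the same literal is rejected.
function strictModeRejects(escaped: string): boolean {
  try {
    new Function('"use strict"; return "' + escaped + '";');
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

// UTF-8 bytes of U+274C (the cross mark): E2 9D 8C.
const escapedCross = decimalEscape([0xe2, 0x9d, 0x8c]); // \226\157\140
```

Here `parseInt("226", 8)` is 0x96, so `decodeSloppy(escapedCross)` yields U+0096 followed by `o` and a backtick rather than the emoji, and `strictModeRejects(escapedCross)` is true.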

## Fix

New helper `Js_codegen.js_string_lit` walks the UTF-8 byte sequence,
decodes code points, and emits:

| Character class | Output |
|---|---|
| Printable ASCII (0x20-0x7E except `\` `"`) | as-is |
| `\` `"` `\n` `\r` `\t` | conventional escape |
| Other ASCII (control bytes) | `\xHH` |
| Non-ASCII BMP (U+0080..U+FFFF) | `\uXXXX` |
| Non-BMP (U+10000+) | `\u{XXXXX}` |

Wired into both `js_codegen.ml` (Node target) and `codegen_deno.ml`
(Deno-ESM target) at the `LitString`/`LitChar` emit sites.
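
For readers who want to see the table as code, here is a rough TypeScript rendering of the same escaping rules. It is not the OCaml helper: `jsStringLit` and `hex` are names chosen for this sketch, it walks UTF-16 strings via `codePointAt` instead of decoding UTF-8 bytes, and the upper-case hex digits are an assumption based on the spot-check output quoted in the test plan.

```typescript
// Upper-case hex, left-padded with zeros to the given width.
function hex(n: number, width: number): string {
  let h = n.toString(16).toUpperCase();
  while (h.length < width) h = "0" + h;
  return h;
}

// Emit a double-quoted JS string literal that is valid in strict-mode ESM.
function jsStringLit(s: string): string {
  let out = '"';
  for (let i = 0; i < s.length; ) {
    const cp = s.codePointAt(i)!;
    i += cp > 0xffff ? 2 : 1; // non-BMP code points occupy two UTF-16 units
    if (cp === 0x5c) out += "\\\\";
    else if (cp === 0x22) out += '\\"';
    else if (cp === 0x0a) out += "\\n";
    else if (cp === 0x0d) out += "\\r";
    else if (cp === 0x09) out += "\\t";
    else if (cp >= 0x20 && cp <= 0x7e) out += String.fromCharCode(cp);
    else if (cp < 0x80) out += "\\x" + hex(cp, 2);
    else if (cp <= 0xffff) out += "\\u" + hex(cp, 4);
    else out += "\\u{" + hex(cp, 1) + "}";
  }
  return out + '"';
}
```

A quick sanity check is to evaluate the emitted literal and compare it with the input; any string should round-trip unchanged.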

## Test plan

New `tests/codegen-deno/non_ascii.affine` fixture + harness:

```affine
pub fn emoji_cross() -> String { return "❌"; }    // BMP U+274C
pub fn non_bmp_sob() -> String { return "😭"; }     // non-BMP U+1F62D
pub fn cjk_hello()   -> String { return "你好"; }
pub fn latin_accent() -> String { return "café résumé"; }
pub fn mixed()       -> String { return "[OK] café 你好 ❌"; }
pub fn ascii_only()  -> String { return "plain ASCII"; }
pub fn quotes_and_backslash() -> String { return "\"escaped\" and \\back"; }
```

The `import` itself is the strictest test: if the emitted `.deno.js`
contains octal escapes, the module fails to parse and the harness import
throws SyntaxError before any assertion runs.

- [x] Local `./tools/run_codegen_deno_tests.sh`: **13/13** harnesses
green (including the new fixture)
- [x] Local `dune test`: **352/352** unit tests green
- [x] Compiler output spot-check: `emoji_cross` emits
`return "\u274C";`, `non_bmp_sob` emits `return "\u{1F62D}";`, ASCII
passes through unchanged
- [x] Manual: emitted `.deno.js` parses + runs under Node 20 ESM
(which uses strict mode by default)

## Out of scope

- `rescript_codegen.ml` also uses `String.escaped` but emits
ReScript source (which the rescript compiler then transforms to JS).
Whether ReScript inherits the same bug is a separate question; not
addressed here.
- Other non-JS codegens (lua, c, rust, julia, gleam, nickel, why3) keep
`String.escaped` — they target languages with their own escape
conventions.

## Refs

- Closes #460 (the gap)
- Refs hyperpolymath/standards#284 (the seam-analyst PR that surfaced
this — worked around with ASCII `[FAIL]`/`[OK]` sentinels)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit to hyperpolymath/affinescript that referenced this pull request May 30, 2026
…464)

## Summary

Closes #458 — `String < String` (and `>` / `<=` / `>=`) now
type-check, lowering to JS's native lexicographic string comparison.
Pre-fix: `TypeMismatch (String, Int)`.

## Implementation

Single addition to the existing comparison dispatch in
`Typecheck.synth_expr` for `ExprBinary`:

```ocaml
match repr lhs_ty with
| TCon "Float" -> ...
| TCon "String" ->
    let* () = check ctx rhs ty_string in
    Ok ty_bool
| _ -> ...   (* legacy Int monomorphism *)
```

Pattern mirrors the existing Float dispatch a few lines up. No codegen
changes needed — JavaScript's `<` / `>` / `<=` / `>=` on strings
is lex compare natively, and the JS-family backends already emit those
operators verbatim.

## Test plan

New regression fixture `tests/codegen-deno/string_lex_cmp.affine` +
harness with **22 assertions**:

- All four ops via functional form (`lt(a, b)`, etc.) — covers each
operator's positive/negative direction
- All four ops via literal form (`first_lt()`, etc.)
- Equal-string corner cases — `x <= x` true, `x >= x` true,
`x < x` false
- Empty strings — `"" < "a"`, `"" <= ""`
- Prefix relations — `"abc" < "abcd"`

- [x] Local `./tools/run_codegen_deno_tests.sh`: **14/14** harnesses
green
- [x] Local `dune test`: **352/352** green
- [x] Smoke compile: `return a < b;` emits as `return (a < b);` (JS
native)

## Out of scope

- **Non-ASCII string comparison** in the fixture: this branch forked
from `main` before #463 (the companion Unicode-escape codegen fix for
#460) landed, so non-ASCII source literals would still emit OCaml-style
`\NNN` decimal escapes, which JavaScript parses as octal and strict-mode
ESM rejects. The relational typecheck change is orthogonal to literal
encoding, so non-ASCII lex compare works once both PRs merge, in either
order. A non-ASCII assertion can be added in a follow-up commit after
#463 merges, or here after a rebase.
- **Other backends** (rescript, wasm, lua, c, rust): out of scope; #458
specifically called out the JS-family ergonomic gap. If `String <`
lowering for other backends becomes load-bearing, file separately.

## Refs

- Closes #458
- Refs hyperpolymath/standards#284 (the seam-analyst PR with the
`str_lt` workaround)
- Companion: #463 (#460 Unicode-escape codegen, lands together to
unblock non-ASCII relational comparisons)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit to hyperpolymath/affinescript that referenced this pull request May 30, 2026
## Summary

Closes #459 — `break` and `continue` now parse, type-check (rejected
outside loop bodies with a clear error), and lower to JS
`break;`/`continue;` in the Deno-ESM and Node JS backends. Pre-fix:
`BREAK`/`CONTINUE` were lexer-reserved tokens with no parser production
consuming them; any use was a syntax error.

## Pipeline changes

| File | Change |
|---|---|
| `lib/ast.ml` | `ExprBreak of Span.t`, `ExprContinue of Span.t` |
| `lib/parser.mly` | `BREAK`/`CONTINUE` productions in `expr_assign`
(diverging prefix, next to `RETURN`/`RESUME`) |
| `lib/resolve.ml` | pass-through (`resolve_expr` + `lower_expr`) |
| `lib/typecheck.ml` | new `ctx.in_loop : mutable bool` flipped on
`StmtWhile`/`StmtFor` body entry; `synth` returns `ty_never`; new
`NotInLoop of string` error |
| `lib/borrow.ml` | pass-through (span lookup, visit-recurse, free-var
collection, main checker) |
| `lib/quantity.ml`, `lib/effect_sites.ml` | pass-through (no resources,
no call sites) |
| `lib/codegen_deno.ml`, `lib/js_codegen.ml` | statement-position
lowering to bare JS keywords |
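
The `in_loop` check in the `lib/typecheck.ml` row can be pictured with a toy checker. The TypeScript below is a sketch over a made-up miniature AST, not the compiler's types; in particular, restoring the previous flag value when a loop body ends is an assumption, since the table only says the flag is flipped on body entry.

```typescript
// Hypothetical miniature statement AST for illustration only.
type Stmt =
  | { kind: "break" }
  | { kind: "continue" }
  | { kind: "while"; body: Stmt[] }
  | { kind: "for"; body: Stmt[] }
  | { kind: "other" };

interface Ctx {
  inLoop: boolean; // mirrors the mutable in_loop flag
}

// Collect a NotInLoop-style error for every break/continue outside a loop body.
function checkStmts(stmts: Stmt[], ctx: Ctx, errors: string[]): void {
  for (const s of stmts) {
    switch (s.kind) {
      case "break":
      case "continue":
        if (!ctx.inLoop) errors.push("NotInLoop: " + s.kind);
        break;
      case "while":
      case "for": {
        const saved = ctx.inLoop;
        ctx.inLoop = true; // entering a loop body
        checkStmts(s.body, ctx, errors);
        ctx.inLoop = saved; // assumed: restore on exit so later statements are checked correctly
        break;
      }
      case "other":
        break;
    }
  }
}
```

Saving and restoring the flag keeps a `break` that follows a loop (rather than sitting inside it) from being accepted by accident.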

## Test fixture

`tests/codegen-deno/loop_break_continue.affine` + harness — 14
assertions across:

- `while` + `break` (threshold-driven early exit)
- `while` + `continue` (skip-evens accumulator)
- `for` + `break` (find-first-match)
- `for` + `continue` (count-positive filter)
- Edge cases: break on first iteration, no-break path, empty array
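
To make the fixture shapes concrete, here is a TypeScript sketch of a threshold-driven early exit written twice: once with `break`, and once in the combined-guard form that standards#284 had to use before this change. A skip-evens accumulator using `continue` is included as well. The function names and the exact loop bodies are illustrative assumptions, not the fixture's code.

```typescript
// Early exit with `break`: index of the first element >= threshold, or -1.
function firstAtLeastBreak(xs: number[], threshold: number): number {
  let i = 0;
  let found = -1;
  while (i < xs.length) {
    if (xs[i] >= threshold) {
      found = i;
      break;
    }
    i++;
  }
  return found;
}

// Same behaviour without `break`: the loop guard also tests the sentinel.
function firstAtLeastGuard(xs: number[], threshold: number): number {
  let i = 0;
  let found = -1;
  while (found < 0 && i < xs.length) {
    if (xs[i] >= threshold) {
      found = i;
    } else {
      i++;
    }
  }
  return found;
}

// Skip-evens accumulator with `continue`; the index advances before the skip.
function sumOdds(xs: number[]): number {
  let i = 0;
  let sum = 0;
  while (i < xs.length) {
    const x = xs[i];
    i++;
    if (x % 2 === 0) continue;
    sum += x;
  }
  return sum;
}
```

Both early-exit variants should agree on the edge cases the fixture names: a match on the first iteration, no match at all, and an empty array.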

## Out of scope

- **Non-JS backends** (wasm/GC/lua/c/rust/etc.): fall through existing
wildcards. Full backend support files separately if needed.
- **JS-codegen expression-position IIFE wrapper** (legacy MVP path)
emits `(() => { break; })()`, which JavaScript rejects with
`SyntaxError: Illegal break statement` when the module is parsed, not
when the loop runs. Legal AffineScript places break/continue inside loop
bodies, so the statement path fires. Deno backend uses the
correct statement-position emit.
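
The difference between the two emit shapes can be checked without running any loop, because the IIFE form fails at parse time. The TypeScript sketch below uses the `Function` constructor as a parser probe; the two source strings are hand-written stand-ins for emitted code, not actual compiler output.

```typescript
// True when the given function body parses; false on a SyntaxError.
function parses(body: string): boolean {
  try {
    new Function(body); // parses the body without executing it
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// `break` wrapped in an arrow-function IIFE: the loop is outside the arrow
// function's body, so the `break` has no enclosing loop and is rejected.
const iifeForm = "while (true) { (() => { break; })(); }";

// `break` emitted in statement position inside the loop body.
const statementForm = "let i = 0; while (true) { if (i >= 3) { break; } i++; } return i;";
```

`parses(iifeForm)` is false and `parses(statementForm)` is true; evaluating the statement form returns 3.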

## Test plan

- [x] `./tools/run_codegen_deno_tests.sh`: 15/15 harnesses green
- [x] `dune test`: 352/352 unit tests green
- [x] Misuse check: `pub fn bad() -> () { break; }` emits the new
`NotInLoop` error with the expected message

## Refs

- Closes #459
- Refs hyperpolymath/standards#284 (workarounds documented in the "Seam
findings" section that surfaced this gap)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>