
Rewrite to Typescript #1

Merged
rzcoder merged 8 commits into master from claude/modernize-agents-typescript-1GgqN
May 17, 2026


rzcoder commented on May 17, 2026

Summary

Full rewrite of data-struct in TypeScript with a new codec-token API,
two-pass encoder, dual ESM/CJS build, vitest + benchmarks + CI, plus a
follow-up security pass and a perf pass on the field-iteration hot path.
Wire format is byte-identical with 0.0.x for the corresponding new
codecs; legacy DataTypes / DataReader / DataWriter exports are
removed (clean break, ships as 0.1.0).

Changes

API

  • Typed codec tokens: t.bool, t.i8/u8/...i32/u32, t.f32/f64,
    t.i64/u64, t.string, t.shortBytes, t.bytes, with t.le.*
    little-endian variants.
  • struct(schema) factory that compiles a schema once and reuses it
    across encode / decode / sizeOf.
  • Functional encode(value, schema) and decode(buf, schema).
  • Infer<S> derives the TypeScript shape of decoded values from a
    schema (including top-level array forms — pinned by tests).
  • DataStructError with codes (VALUE_OUT_OF_RANGE, STRING_TOO_LONG,
    BYTES_TOO_LONG, ARRAY_TOO_LONG, BUFFER_UNDERFLOW,
    SCHEMA_MISMATCH, INVALID_SCHEMA), field path (e.g.
    $.skills[1].description), and decode offset.
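
To make the `Infer<S>` bullet concrete, here is a minimal sketch of how a schema-to-type mapping of this kind can be written with conditional types. It is not the data-struct source: `Codec<T>`, its phantom `__type` field, the three sample codecs, and the `Equal` helper are all hypothetical names introduced for illustration.

```typescript
// Hypothetical mini-model, not the library's real types.
// A codec carries a phantom type parameter naming its decoded value type.
interface Codec<T> {
  readonly kind: string;
  readonly __type?: T; // phantom: exists only for the type checker, never set at runtime
}

const u32: Codec<number> = { kind: "u32" };
const str: Codec<string> = { kind: "string" };
const bool: Codec<boolean> = { kind: "bool" };

// Map a schema to the shape of its decoded value:
//   codec          -> its value type
//   [element]      -> array of the element's value type (tuple or plain array form)
//   { key: ... }   -> object with the same keys, each mapped recursively
type Infer<S> =
  S extends Codec<infer T> ? T
  : S extends readonly [infer E] ? Infer<E>[]
  : S extends readonly (infer E)[] ? Infer<E>[]
  : S extends object ? { -readonly [K in keyof S]: Infer<S[K]> }
  : never;

const heroSchema = {
  id: u32,
  name: str,
  alive: bool,
  skills: [{ title: str, level: u32 }],
  tags: [str],
} as const;

type Hero = Infer<typeof heroSchema>;

// This assignment only compiles if Hero has the expected shape.
const hero: Hero = {
  id: 1,
  name: "Ann",
  alive: true,
  skills: [{ title: "jump", level: 3 }],
  tags: ["melee"],
};

// Compile-time equality check for the top-level array forms.
type Equal<A, B> =
  (<X>() => X extends A ? 1 : 2) extends (<X>() => X extends B ? 1 : 2) ? true : false;

const topLevelArrayOk: Equal<Infer<readonly [Codec<number>]>, number[]> = true;
const nestedArrayOk: Equal<Infer<readonly [readonly [Codec<number>]]>, number[][]> = true;
```

The two `Equal` constants fail to compile if the mapping regresses for `[codec]` or `[[codec]]`, which is one way such shapes can be pinned in a test file.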

Build & packaging

  • Dual ESM + CJS via tsup with separate .d.ts / .d.cts. Minified
    output: dist/index.mjs is ~7.5 KB (down from ~15 KB).
  • "type": "module", engines.node >= 20.9, sideEffects: false.

Performance

  • Two-pass encoder allocates the output buffer exactly once (no
    Buffer.concat).
  • Lazy error-path construction via try/catch + rethrowWithPrefix
    instead of allocating ${path}.${key} strings per field. Measured on
    Node 22: list of list encode -59% / decode -45%, nested encode
    -33%, hero encode/decode -17% each.
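
The lazy error-path idea can be sketched as follows. This is an illustration under assumptions, not the PR's code: only the name `rethrowWithPrefix` comes from the description above, while `PathError`, the toy `Schema` type, `check`, and `firstErrorPath` are hypothetical. The point is that no path string is built while walking valid data; segments are prepended only while an error unwinds.

```typescript
// Hypothetical error type: the path starts empty and grows as the error unwinds.
class PathError extends Error {
  code: string;
  path: string;
  constructor(code: string, detail: string) {
    super(detail);
    this.name = "PathError";
    this.code = code;
    this.path = "";
  }
}

// Prepend one path segment and rethrow; foreign errors pass through untouched.
function rethrowWithPrefix(err: unknown, segment: string): never {
  if (err instanceof PathError) err.path = segment + err.path;
  throw err;
}

// Toy schema: a u8 leaf, a one-element array (list of that element), or an object.
type Schema = "u8" | [Schema] | { [key: string]: Schema };

function check(value: unknown, schema: Schema): void {
  if (schema === "u8") {
    if (typeof value !== "number" || !Number.isInteger(value) || value < 0 || value > 255) {
      throw new PathError("VALUE_OUT_OF_RANGE", `expected u8, got ${String(value)}`);
    }
    return;
  }
  if (Array.isArray(schema)) {
    if (!Array.isArray(value)) throw new PathError("SCHEMA_MISMATCH", "expected an array");
    for (let i = 0; i < value.length; i++) {
      try {
        check(value[i], schema[0]);
      } catch (err) {
        rethrowWithPrefix(err, `[${i}]`); // segment string is built only on failure
      }
    }
    return;
  }
  if (typeof value !== "object" || value === null) {
    throw new PathError("SCHEMA_MISMATCH", "expected an object");
  }
  const record = value as Record<string, unknown>;
  for (const [key, sub] of Object.entries(schema)) {
    try {
      check(record[key], sub);
    } catch (err) {
      rethrowWithPrefix(err, `.${key}`); // segment string is built only on failure
    }
  }
}

// Returns the full path of the first failing field, or null if the value is valid.
function firstErrorPath(value: unknown, schema: Schema): string | null {
  try {
    check(value, schema);
    return null;
  } catch (err) {
    if (err instanceof PathError) return "$" + err.path;
    throw err;
  }
}

const schema: Schema = { skills: [{ level: "u8" }] };
const good = { skills: [{ level: 1 }, { level: 2 }] };
const bad = { skills: [{ level: 1 }, { level: 999 }] };

console.log(firstErrorPath(good, schema)); // null
console.log(firstErrorPath(bad, schema)); // $.skills[1].level
```

The trade-off is that the failing path costs a few extra catch/rethrow hops, which is acceptable when errors are rare and the valid path dominates.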

Security hardening

  • Decoded structs use Object.create(null) — neutralises prototype
    pollution from a __proto__ key in an untrusted schema.
  • String decoder is TextDecoder({ fatal: true }) — invalid UTF-8 now
    raises SCHEMA_MISMATCH instead of silently substituting U+FFFD.
  • struct.encode asserts measure() bytes == write() bytes, so a
    buggy codec can't leak uninitialised memory from Buffer.allocUnsafe.
  • CI runs npm audit --audit-level=high --omit=dev.
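
The measure/write assertion in the list above fits naturally into a two-pass encoder. The sketch below is a hedged illustration, not the library's implementation: the `Codec` interface, `field`, `encodeAll`, and the one-byte length prefix used by `prefixedBytes` are assumptions made for the example. It also allocates a zero-filled `Uint8Array` instead of `Buffer.allocUnsafe`, so it shows the shape of the check rather than the leak itself.

```typescript
// Hypothetical codec interface: measure() reports the byte size, write() emits the bytes.
interface Codec<T> {
  measure(value: T): number;
  write(view: DataView, offset: number, value: T): number; // returns the new offset
}

// Big-endian unsigned 16-bit integer.
const u16: Codec<number> = {
  measure: () => 2,
  write: (view, offset, value) => {
    view.setUint16(offset, value);
    return offset + 2;
  },
};

// Assumed layout for illustration: one length byte followed by the raw bytes.
const prefixedBytes: Codec<Uint8Array> = {
  measure: (value) => {
    if (value.length > 255) throw new RangeError("too many bytes for a 1-byte length prefix");
    return 1 + value.length;
  },
  write: (view, offset, value) => {
    view.setUint8(offset, value.length);
    value.forEach((byte, i) => view.setUint8(offset + 1 + i, byte));
    return offset + 1 + value.length;
  },
};

// Bind a codec to a value so heterogeneous fields can share one list.
interface Field {
  measure(): number;
  write(view: DataView, offset: number): number;
}

function field<T>(codec: Codec<T>, value: T): Field {
  return {
    measure: () => codec.measure(value),
    write: (view, offset) => codec.write(view, offset, value),
  };
}

function encodeAll(fields: Field[]): Uint8Array {
  // Pass 1: compute the exact output size.
  let size = 0;
  for (const f of fields) size += f.measure();

  // Single allocation, no concatenation of intermediate chunks.
  const out = new Uint8Array(size);
  const view = new DataView(out.buffer);

  // Pass 2: write every field at its running offset.
  let offset = 0;
  for (const f of fields) offset = f.write(view, offset);

  // If a codec measured more than it wrote, the tail of the buffer was never
  // written; with an uninitialised buffer that tail could expose old memory.
  if (offset !== size) {
    throw new Error(`measure/write mismatch: measured ${size}, wrote ${offset}`);
  }
  return out;
}

const encoded = encodeAll([
  field(u16, 0x0102),
  field(prefixedBytes, new Uint8Array([9, 8])),
]);
console.log(Array.from(encoded)); // [ 1, 2, 2, 9, 8 ]

// A deliberately buggy codec: claims 4 bytes but writes only 2.
const buggy: Codec<number> = { measure: () => 4, write: u16.write };
```

Calling `encodeAll([field(buggy, 1)])` throws the mismatch error instead of returning a buffer with two unwritten trailing bytes.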

Tooling & CI

  • vitest suite: wire-format goldens, roundtrip, error paths, v8
    coverage.
  • tinybench benchmark suite + on-demand benchmark.yml workflow.
  • biome for lint + format.
  • CI matrix: Node 20/22/24 on Linux + Node 22 on macOS/Windows.
  • Tag-triggered release workflow with npm provenance.
  • Dependabot, PR template, CODEOWNERS, .editorconfig, .nvmrc.

Examples & docs

  • examples/01..06-*.ts covering basics, the hero struct, functional
    form, errors, little-endian, and sizeOf.
  • README rewritten; CHANGELOG.md added.

Testing

  • npm test passes (vitest: wire-format goldens, roundtrip, errors)
  • npm run lint passes (biome)
  • npm run typecheck passes
  • npm run bench — perf numbers reported above

Notes

Breaking changes. This ships as 0.1.0. Legacy DataTypes,
DataReader, and DataWriter exports are gone; consumers move to
struct(...) / encode / decode / t.*. The README has a migration
table. Minimum Node is now >=20.9.

Wire format. Byte-identical with 0.0.x for the corresponding new
codecs — the previous test suite's golden buffers are preserved in
test/wire-format.test.ts to lock this in.

Removed dev deps. grunt, grunt-simple-mocha, jit-grunt,
grunt-contrib-jshint, chai, benchmark.

claude and others added 8 commits May 16, 2026 09:23
- typed codec tokens (`t.bool`, `t.u32`, `t.string`, `t.le.*`, `t.i64/u64`)
- `struct(schema)` factory + functional `encode` / `decode`
- `Infer<S>` derives the TS shape of decoded values from a schema
- two-pass writer: measure once, allocate once, no `Buffer.concat`
- `DataStructError` with codes, field path, and decode offset
- dual ESM + CJS build via tsup (`.mjs` / `.cjs` / `.d.ts` / `.d.cts`)
- vitest suite (wire-format goldens, roundtrip, error paths) with v8 coverage
- tinybench benchmark suite
- biome for lint + format
- GitHub Actions CI matrix (Node 20/22/24 on linux + macOS/Windows on 22),
  on-demand benchmark workflow, tag-triggered release with npm provenance
- dependabot, PR template, CODEOWNERS

Wire format is byte-identical to 0.0.x for the corresponding new codecs;
legacy DataTypes/DataReader/DataWriter exports are removed (clean break).

- struct.encode: assert measure() size == write() bytes emitted; protects
  against allocUnsafe leaking uninitialised memory if a codec measure
  diverges from its write
- struct decode: build output objects with Object.create(null) so a stray
  __proto__ key in an untrusted schema cannot pollute the prototype chain
- string codec: TextDecoder(fatal:true), surfaces invalid UTF-8 as
  SCHEMA_MISMATCH instead of silently replacing with U+FFFD
- ci: add audit job running `npm audit --audit-level=high --omit=dev` so
  high/critical advisories in production deps fail the build; dev-only
  moderate advisories (vitest dev-server chain) remain tolerated

Adds tests covering invalid UTF-8 and null-prototype output. Coverage
remains above threshold.
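
The two behaviours those tests cover can be reproduced with standard JavaScript APIs alone. The snippet below is a sketch and does not use data-struct; `buildRecord` is a hypothetical helper that mimics a decoder filling an output object from key/value pairs.

```typescript
// Hypothetical helper: fill an output object key by key, as a struct decoder might.
function buildRecord(entries: [string, unknown][], nullProto: boolean): Record<string, unknown> {
  const out: Record<string, unknown> = nullProto ? Object.create(null) : {};
  for (const [key, value] of entries) out[key] = value;
  return out;
}

const hostile: [string, unknown][] = [["__proto__", { admin: true }]];

// Plain object: assigning to "__proto__" goes through the inherited setter and
// swaps the object's prototype, so "admin" becomes visible through inheritance.
const plain = buildRecord(hostile, false);
console.log("admin" in plain, Object.keys(plain)); // true []

// Null-prototype object: there is no inherited setter, so "__proto__" is stored
// as an ordinary own property and the prototype chain stays empty.
const safe = buildRecord(hostile, true);
console.log("admin" in safe, Object.keys(safe)); // false [ '__proto__' ]

// 0xff can never occur in well-formed UTF-8.
const invalidUtf8 = new Uint8Array([0x61, 0xff, 0x62]);

// Default decoder: silently substitutes U+FFFD for the bad byte.
const lenient = new TextDecoder("utf-8").decode(invalidUtf8);
console.log(lenient === "a\uFFFDb"); // true

// Fatal decoder: throws a TypeError, which a codec can map to its own error code.
let strictError: unknown = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalidUtf8);
} catch (err) {
  strictError = err;
}
console.log(strictError instanceof TypeError); // true
```

One consequence of null-prototype output is that methods such as `hasOwnProperty` are absent on decoded objects, so callers would use `Object.keys` or `Object.hasOwn` instead.
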

Adds a regression guard that the existing Schema/Infer types correctly compute
the value shape for `[codec]`, `[codec] as const`, `[[codec]]` and
`[{...}]` forms, including the struct() factory accepting an array
schema directly.
rzcoder merged commit 025c699 into master on May 17, 2026
9 checks passed


2 participants