
test: implement full Go test suite from toml-test#153

Merged
DecimalTurn merged 27 commits into latest from dev-go-test2
Apr 15, 2026

@DecimalTurn DecimalTurn commented Apr 13, 2026

In this PR:

  1. Integrate the official Go toml-test runner into CI
  2. Preserve integer precision by default with new parsing modes
  3. Support raw UTF-8 byte input and enforce invalid-encoding rejection
  4. Add minimumDecimals formatting option for float serialization
  5. Additional fixes and updates

1. Integrate the official Go toml-test runner into CI

This PR adds a full end-to-end run of the upstream toml-test binary against toml-patch.

Before this PR, CI validated Jest/spec tests, but did not execute the Go test harness directly. That left room for drift between local assumptions and upstream parser/encoder expectations.

The fix

  • Add a dedicated runner script (inspired by smol-toml): run-toml-test.bash
  • Add Node adapters used by toml-test:
    • toml-test-decode.mjs
    • toml-test-encode.mjs
  • Add npm script:
    • test:go
  • Extend CI workflow to:
    • setup Go
    • install toml-test
    • run pnpm run test:go

CI now installs a pinned binary version for deterministic runs:

go install github.com/toml-lang/toml-test/v2/cmd/toml-test@v2.1.0

2. Preserve integer precision by default with new parsing modes

While running the suite, a problem surfaced: TOML integers are 64-bit and can exceed JavaScript's safe integer range of ±(2^53 − 1). Historically, parsing them into a JS number could silently lose precision for large values.

Example of problematic values:

int64-max = 9223372036854775807
int64-min = -9223372036854775808

The fix

Introduce ParseOptions.integersAsBigInt with 3 modes:

  • 'asNeeded' (default): safe integers -> number, unsafe integers -> bigint
  • true: all integers -> bigint
  • false: all integers -> number (legacy/possibly lossy behavior)

This behavior is implemented in the AST-to-JS conversion path (integerFromRaw) and covered by dedicated tests.
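
As a rough illustration of the three modes, the sketch below shows one way the mapping could work. It is not the library's integerFromRaw: the name materializeInteger and its signature are hypothetical, and it assumes the raw token has already been validated as a TOML integer.

```typescript
// Hypothetical sketch of the three integersAsBigInt modes.
// Assumes `raw` is a syntactically valid TOML integer token.
type IntegersAsBigInt = boolean | 'asNeeded';

function materializeInteger(
  raw: string,
  mode: IntegersAsBigInt = 'asNeeded'
): number | bigint {
  // TOML permits "_" digit separators and a leading "+" sign.
  const cleaned = raw.replace(/_/g, '').replace(/^\+/, '');
  const negative = cleaned.startsWith('-');
  const digits = negative ? cleaned.slice(1) : cleaned;

  // BigInt() parses decimal as well as 0x / 0o / 0b prefixed strings exactly.
  const magnitude = BigInt(digits);
  const value = negative ? -magnitude : magnitude;

  if (mode === true) return value; // always bigint
  if (mode === false) return Number(value); // always number, possibly lossy

  // 'asNeeded': keep a number only when it represents the value exactly.
  const asNumber = Number(value);
  return Number.isSafeInteger(asNumber) ? asNumber : value;
}
```

With the default mode, `materializeInteger('42')` yields a number, while the int64-max value above comes back as a bigint with every digit intact.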

3. Support raw UTF-8 byte input and enforce invalid-encoding rejection

toml-test includes invalid-encoding cases that must be checked at byte level. Reading everything as a JS string first can mask malformed UTF-8 sequences.

The fix

  • parse() now accepts string | Uint8Array
  • TomlDocument constructor now accepts string | Uint8Array
  • Byte input is decoded through fatal UTF-8 decoding so invalid sequences throw before parsing
  • Invalid encoding tests are read as raw bytes where required

This aligns behavior with TOML's UTF-8 requirement and lets the suite properly validate malformed inputs.
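
A minimal sketch of fatal decoding, assuming the standard TextDecoder API is available (browsers and current Node.js). The helper name decodeTomlInput is hypothetical and only illustrates the idea; it is not taken from the library.

```typescript
// Hypothetical helper: normalize parse() input to a string.
// With { fatal: true }, TextDecoder throws a TypeError on malformed UTF-8
// instead of substituting U+FFFD.
function decodeTomlInput(input: string | Uint8Array): string {
  if (typeof input === 'string') return input;
  return new TextDecoder('utf-8', { fatal: true }).decode(input);
}

// Helper for demonstration: report whether decoding throws.
function decodeThrows(bytes: Uint8Array): boolean {
  try {
    decodeTomlInput(bytes);
    return false;
  } catch {
    return true;
  }
}
```

For example, the bytes `0xC3 0x28` (a two-byte lead followed by a non-continuation byte) make `decodeThrows` return true, whereas a non-fatal decoder would quietly produce a replacement character.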

4. Add minimumDecimals formatting option for float serialization

Some use cases require stable decimal formatting (for readability or predictable diffs), including padding integer-valued JS numbers as TOML floats.

The fix

Add TomlFormat.minimumDecimals (default 0):

  • 0: default behavior
  • 1+: ensures at least that many decimal places for numeric float output

Examples:

// minimumDecimals = 0
stringify({ x: 1, y: 1.5 })
// x = 1
// y = 1.5

// minimumDecimals = 2
stringify({ x: 1, y: 1.5 })
// x = 1.00
// y = 1.50

bigint values remain TOML integers regardless of this setting.
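
The padding rule can be sketched as follows. This is an illustrative helper, not the library's generator: formatFloat is a hypothetical name, and the sketch deliberately leaves exponent notation and non-finite values untouched, on the assumption that they are handled elsewhere.

```typescript
// Hypothetical sketch: pad a finite float to at least `minimumDecimals`
// fractional digits without ever truncating existing digits.
function formatFloat(value: number, minimumDecimals = 0): string {
  const text = String(value);

  // Assumption: exponent forms (e.g. "1e+21") and NaN/Infinity are
  // rendered by other code paths, so they are returned unchanged here.
  if (minimumDecimals <= 0 || !Number.isFinite(value) || /e/i.test(text)) {
    return text;
  }

  const [intPart, fracPart = ''] = text.split('.');
  return `${intPart}.${fracPart.padEnd(minimumDecimals, '0')}`;
}
```

Because the fractional part is only padded, a value such as 1.125 keeps all three decimals even when minimumDecimals is 2.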

5. Additional fixes and updates

  • Fix date formatting to always zero-pad UTC year to 4 digits
  • Ensure DEL (U+007F) is escaped in generated TOML string contexts requiring escaping
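
Both fixes are small enough to sketch. The helpers below are hypothetical (padUtcYear and escapeBasicString are not names from the library) and assume years in the range 0 to 9999 and TOML basic (double-quoted) strings, where a literal tab is allowed but other control characters and DEL must be escaped.

```typescript
// Hypothetical sketch: zero-pad the UTC year to four digits.
// Assumes a year between 0 and 9999.
function padUtcYear(date: Date): string {
  return String(date.getUTCFullYear()).padStart(4, '0');
}

// Hypothetical sketch: escape a value for a TOML basic string.
// Covers U+0000-U+001F (except tab), U+007F (DEL), quote, and backslash.
function escapeBasicString(value: string): string {
  const short: Record<string, string> = {
    '\b': '\\b',
    '\n': '\\n',
    '\f': '\\f',
    '\r': '\\r',
    '"': '\\"',
    '\\': '\\\\',
  };
  return value.replace(
    /[\u0000-\u0008\u000a-\u001f\u007f"\\]/g,
    (ch) =>
      short[ch] ??
      '\\u' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}
```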


Copilot AI left a comment

Pull request overview

Adds missing TOML spec compliance features needed to run the upstream toml-test suite end-to-end, primarily around UTF-8 handling, integer precision, and float/string formatting.

Changes:

  • Add parse() support for raw UTF-8 bytes with fatal decoding and integrate this into spec tests.
  • Introduce configurable integer materialization (integersAsBigInt) with default “as-needed” BigInt promotion for unsafe integers.
  • Add formatting support for minimum float decimals and tighten TOML string/key escaping.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| toml-test-encode.mjs | Adds a toml-test encoder script; uses minimumDecimals for float conformance. |
| toml-test-decode.mjs | Adds a toml-test decoder script; tags parsed values for toml-test interop. |
| run-toml-test.bash | Convenience wrapper to run toml-test against the library’s encoder/decoder. |
| src/index.ts | Extends parse() to accept Uint8Array + parse options; exports date classes. |
| src/parse-options.ts | Introduces ParseOptions and IntegersAsBigInt type. |
| src/to-js.ts | Implements integer parsing from raw with integersAsBigInt behavior. |
| src/ast.ts | Allows integer node values to be `number |
| src/utils.ts | Adds isBigInt helper for stringify pipeline. |
| src/parse-js.ts | Updates JS→AST generation to preserve BigInts and apply minimumDecimals. |
| src/generate.ts | Adds quoteTomlString, BigInt integer generation, and minimum-decimal float rendering. |
| src/toml-format.ts | Adds minimumDecimals option with defaults, validation, and resolution. |
| src/toml-document.ts | Allows constructing documents from raw UTF-8 bytes; passes input for error reporting. |
| src/date-format.ts | Pads year to 4 digits for RFC3339/TOML datetime output. |
| src/__tests__/toml-document.test.ts | Adds coverage for byte-input TomlDocument construction. |
| src/__tests__/parse.test.ts | Adds coverage for default “as-needed” BigInt promotion behavior. |
| src/__tests__/parse-bigint-options.test.ts | Adds coverage for integersAsBigInt option modes. |
| specs/specs.test.ts | Reads certain encoding-invalid fixtures as raw bytes so fatal UTF-8 decoding can reject them. |
| .devcontainer/devcontainer.json | Adds a devcontainer definition for consistent local runs. |

Comment thread src/toml-format.ts Outdated
Comment thread toml-test-decode.mjs
Comment thread src/index.ts
…ncodings

Add a single-pass UTF-8 byte validator (src/validate-utf8.ts) that catches:
- Unexpected continuation bytes (0x80-0xBF at sequence start)
- Overlong sequences (0xC0-0xC1 lead bytes)
- Truncated multi-byte sequences
- Invalid continuation bytes within a sequence
- Surrogate code points (U+D800-U+DFFF encoded as 0xED 0xA0-0xBF...)
- Code points above U+10FFFF
- Invalid bytes 0xF5-0xFF

parse() now accepts string | Uint8Array. When raw bytes are provided,
the validator runs before TextDecoder converts them to a JS string
(a non-fatal TextDecoder silently replaces invalid bytes with U+FFFD,
making them undetectable after the fact).

Update specs/specs.test.ts to read the six byte-level encoding tests
as raw Buffer so the validator can inspect the sequences. The seventh
encoding test (ideographic-space, U+3000) was already rejected by the
tokenizer as an unexpected character outside ASCII whitespace.

Removes all 7 encoding/* tests from SKIPPED_TESTS.
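
The checks listed in the commit message above can be expressed in a single pass over the bytes. The sketch below is an independent illustration of that technique, not the contents of src/validate-utf8.ts; the function name isValidUtf8 is an assumption, and it returns a boolean where the real validator may throw or report positions.

```typescript
// Hypothetical single-pass UTF-8 validator covering the listed cases.
function isValidUtf8(bytes: Uint8Array): boolean {
  let i = 0;
  while (i < bytes.length) {
    const lead = bytes[i];
    if (lead <= 0x7f) {
      i += 1; // ASCII
      continue;
    }

    let extra: number; // number of continuation bytes expected
    let lo = 0x80; // allowed range for the FIRST continuation byte
    let hi = 0xbf;
    if (lead >= 0xc2 && lead <= 0xdf) {
      extra = 1;
    } else if (lead >= 0xe0 && lead <= 0xef) {
      extra = 2;
      if (lead === 0xe0) lo = 0xa0; // reject overlong 3-byte forms
      if (lead === 0xed) hi = 0x9f; // reject surrogates U+D800-U+DFFF
    } else if (lead >= 0xf0 && lead <= 0xf4) {
      extra = 3;
      if (lead === 0xf0) lo = 0x90; // reject overlong 4-byte forms
      if (lead === 0xf4) hi = 0x8f; // reject code points above U+10FFFF
    } else {
      // Stray continuation (0x80-0xBF), overlong lead (0xC0-0xC1),
      // or invalid byte (0xF5-0xFF).
      return false;
    }

    if (i + extra >= bytes.length) return false; // truncated sequence
    const second = bytes[i + 1];
    if (second < lo || second > hi) return false;
    for (let k = 2; k <= extra; k++) {
      const b = bytes[i + k];
      if (b < 0x80 || b > 0xbf) return false; // invalid continuation byte
    }
    i += extra + 1;
  }
  return true;
}
```

Restricting the range of the first continuation byte is what lets one loop reject overlong forms, surrogates, and out-of-range code points without decoding the code point itself.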
@DecimalTurn DecimalTurn marked this pull request as ready for review April 15, 2026 06:51

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

Comment thread package.json
Comment thread CHANGELOG.md Outdated
Comment thread README.md
Comment thread README.md
Comment thread .github/workflows/test-and-build.yml Outdated
@DecimalTurn DecimalTurn merged commit a0ce844 into latest Apr 15, 2026
3 checks passed