v0.9.0

@Hindurable Hindurable released this 24 May 20:35
· 476 commits to main since this release
196fa7e

[0.9.0] — 2026-05-24

Changed — refactor/nurlify branch

Picks up where refactor/pure-nurl left off and drives stdlib/runtime.c
the rest of the way down. Per the PURIFY.md tracker:

  • stdlib/runtime.c: 6 265 → 4 540 LOC (−1 725, −27.5 %). Combined
    with the prior branch the total reduction since v0.8.0 is 8 879 →
    4 540 LOC (−4 339, −48.9 %), so just under half of the C runtime
    is gone.
    Bootstrap fixed point held on every shipped phase; full test corpus
    green.
  • PURIFY §17 random. rand_u64 / rand_hex_str ported to pure
    NURL in stdlib/std/random.nu. Only nurl_rand_fill stays C —
    the getrandom / arc4random_buf / BCryptGenRandom platform
    branching is genuinely syscall-shaped FFI.
  • PURIFY §4 file ops batch. nurl_read_file_bytes /
    _write_file_bytes / _file_read_chunk / _read_n_bytes /
    _errno_kind moved to pure NURL. The g_last_bytes_len sideband is
    gone — fread / fwrite write directly into the Vec[u] data
    buffer and vec_set_len records the count. EACCES / EPERM /
    EEXIST added to nurl_native_constant; errno_kind now lives in
    stdlib/core/posix.nu.
  • PURIFY §22 gzip. nurl_gzip_compress / _decompress moved to
    pure-NURL FFI in stdlib/ext/compress.nu over deflateInit2_ /
    deflate / deflateEnd + inflateInit2_ / inflate /
    inflateEnd. Two tiny C accessors (nurl_z_setup /
    nurl_z_total_out) bridge the platform-varying z_stream field
    layout (LP64 vs LLP64 uLong width).
  • PURIFY §14 HTTP response accessors. The 7 accessor C functions
    (status / err_kind / body / body_len / header_count /
    header_name / header_value) deleted from runtime.c. Pure-NURL
    equivalents in stdlib/ext/http.nu read the NurlHttpResponse
    heap struct via nurl_peek(p, slot) over its 6-i64 slot layout.
    Static asserts in runtime.c pin the layout at compile time so a
    future field reorder breaks the native build instead of silently
    miscompiling NURL reads. nurl_http_response_free stays C because
    it walks headers[] deallocating each name / value pair plus the
    body.
  • PURIFY §14b HTTP libcurl backend + multi-stream orchestration
    driven from pure NURL. Sync nurl_http_perform_full_to and
    multi-stream _open_to / _next / _pump_headers plus the 5
    stream accessors live in stdlib/ext/http.nu; 22 monomorphic
    trampolines stay C (nurl_curl_* setopt / multi /
    stream-state) because libcurl's variadic curl_easy_setopt and the
    raw-fn-pointer callbacks (nurl__http_write_body /
    _write_header) can't cross the FFI directly. NurlHttpStream's
    three historical int fields widened to long long for a clean
    14×i64 slot layout; static_assert pins it. Live verified
    against httpbin.
  • PURIFY §2 SIMD CSV scanner ported to pure NURL
    (stdlib/ext/csv.nu, −508 C). The vectorised newline / delimiter
    scanner is now NURL @-fns over nurl_peek of a heap-side byte
    window.
  • runtime.c prose cleanup (commits d558844, f88d7bb):
    trimmed verbose multi-paragraph explanations, phase-by-phase
    migration history and prose that just restated what the code does
    — kept one-line function-purpose intros and the non-obvious "why"
    notes (TLS / SNI race discipline, fiber park-unlock ordering,
    wasm32 layout caveats, libz LP64 / LLP64 differences). Net −913
    comment-only LOC.
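
The §22 gzip port drives zlib's deflateInit2_ / deflate / deflateEnd (and the matching inflate calls) from NURL. As a rough, language-neutral illustration of that call sequence, the Python sketch below uses the standard zlib module, which wraps the same library. The names gzip_compress / gzip_decompress and the window setting 15 + 16 (zlib's way of selecting gzip framing) are assumptions for the example, not taken from stdlib/ext/compress.nu.

```python
import zlib

# 15-bit (32 KiB) window; adding 16 asks zlib for a gzip header and trailer.
# This value is an assumption of the sketch, not read from compress.nu.
GZIP_WBITS = 15 + 16


def gzip_compress(data: bytes, level: int = 6) -> bytes:
    # compressobj() plays the role of deflateInit2; flush() finishes the
    # stream, after which the object can be dropped (deflateEnd in C).
    comp = zlib.compressobj(level, zlib.DEFLATED, GZIP_WBITS)
    return comp.compress(data) + comp.flush()


def gzip_decompress(blob: bytes) -> bytes:
    # decompressobj() plays the role of inflateInit2.
    decomp = zlib.decompressobj(GZIP_WBITS)
    out = decomp.decompress(blob) + decomp.flush()
    if not decomp.eof:
        # The gzip trailer was never reached, so the input was cut short.
        raise ValueError("truncated gzip stream")
    return out
```

The explicit eof check matters because a streaming inflate does not fail on short input by itself; it simply stops producing output.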

Changed — JSON-to-production branch

  • json ext goes production-ready. stdlib/ext/json.nu:
    • Typed JsonError replaces the bare ParseErr — carries
      kind (BadFormat / Empty / TrailingGarbage / Overflow),
      pos (0-based byte offset), line (1-based) and col
      (1-based). Location is computed once per failure and travels
      with the error value — no global state, so nested
      json_parse calls and multi-threaded use are both safe.
      json_format_error renders the standard message; build your
      own from the fields if you need a custom shape.
    • RFC 8259 strict mode. Non-conforming numbers (leading zeros
      like 01, +5, lone .5, 1.) are now BadFormat instead of
      parsing to the prefix — json_stringify ∘ json_parse is
      guaranteed-valid JSON.
    • New constructors. json_int n / json_float x from
      primitives (no i8* roundtrip), json_arr_new / json_obj_new
      for empty containers.
    • Duplicate-key behavior documented. Parser preserves
      duplicate keys as-is; json_obj_get returns the first match in
      source order; json_obj_set replaces the first match in source
      order.
    • Call sites updated across stdlib/ext/anthropic.nu,
      stdlib/ext/mcp{,_client,_http,_stdio}.nu, nurlapi/main.nu,
      examples/serde_demo.nu, tools/nurl-lsp/jsonrpc.nu.
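
Two of the JSON changes above, the RFC 8259 number grammar and the pos / line / col location carried by JsonError, can be illustrated with a short Python sketch. This is not the json.nu implementation: the function names are hypothetical, and the column is counted in bytes, which is an assumption the notes do not confirm.

```python
import re

# RFC 8259 number: optional minus, integer part without leading zeros,
# optional fraction with at least one digit, optional exponent.
_JSON_NUMBER = re.compile(
    r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"
)


def is_strict_json_number(text: str) -> bool:
    """True only if the whole token is a conforming JSON number."""
    return _JSON_NUMBER.fullmatch(text) is not None


def locate(source: bytes, pos: int):
    """Map a 0-based byte offset to (line, col), both 1-based."""
    line = source.count(b"\n", 0, pos) + 1
    line_start = source.rfind(b"\n", 0, pos) + 1  # 0 when on the first line
    return line, pos - line_start + 1
```

Matching the whole token, rather than accepting the longest valid prefix, is what turns inputs such as 01 or 1. into errors instead of silently parsing part of them.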

Changed — refactor/pure-nurl branch

The refactor/pure-nurl branch (42 commits, 2026-05-23 → 2026-05-24)
took the bulk of stdlib/runtime.c out of C and into pure NURL —
either as pure-NURL @-fns or as direct & `c` / & `pthread` /
& `sqlite3` FFI declarations. Per the PURIFY.md tracker:

  • stdlib/runtime.c: 8 879 → 6 265 LOC (−2 614, −29.4 %). The
    bootstrap fixed point held on every shipped phase and the full
    test corpus stayed green.
  • Python removed from the bootstrap. compiler/nurlc.py and
    compiler/src/*.py are gone. Stage 0 now links the committed
    compiler/nurlc_lastgood.ll snapshot directly via clang. The
    only build-time dependency is clang/LLVM 14+. Refresh the
    snapshot with ./build.sh --refresh-bootstrap when a
    grammar/runtime-ABI change leaves the current snapshot unable
    to compile current nurlc.nu.
  • Box[T] / Cell[T] / Rc[T] / Arc[T] heap-stable
    allocator surface — stdlib/core/box.nu, stdlib/core/cell.nu,
    stdlib/std/rc.nu, stdlib/std/arc.nu. % Drop auto-fires;
    nurl_native_sizeof + nurl_atomic_i64_* runtime primitives
    added. This unblocked Phases 6 / 8 / 11 / 12 of the purification.
  • PURIFY phases shipped (per-phase detail in PURIFY.md Part VII):
    • Phase 1 §3 char classification (stdlib/core/char.nu, −11 C)
    • Phase 2 §15 logging level (stdlib/std/log.nu, −7 C)
    • Phase 3 §11 libm + integer helpers (& `m` / & `c` FFI, −17 C)
    • Phase 4 §17 crypto MD5/SHA-1/256/512 + HMAC
      (stdlib/std/hash_*.nu, −541 C)
    • Phase 5 §2 string ops over libc (strlen/strcmp/strncmp/strstr/
      memcmp/memmem/atoll/atof/memcpy/strdup via preamble, −682 C)
    • Phase 6 §19 threads / mutex / cond (pthread & `pthread` FFI in
      stdlib/std/thread.nu, −162 C; mingw-w64 winpthreads linked via
      -lpthread)
    • Phase 7 §4 + §13 file & dir syscalls — incremental over many
      batches (realpath / write_file_safe / file_size / mmap / fread
      fallback / dir_list POSIX, −158 C combined)
    • Phase 8 §16 + §16b process spawn (fork/exec/poll, || and
      && added as language tokens for the spawn-error sideband,
      −245 C)
    • Phase 9a §7 + §8 codegen counters + last-type sideband
      (pure-NURL @-fns in nurlc.nu, −71 C)
    • Phase 9b §6b symbol table (3 parallel grow-by-2× arrays,
      inner loops via direct *s / *i pointer arithmetic, −72 C;
      ~0.95× of C runtime — LTO inlines everything and the parallel
      layout is cache-friendlier than the C interleaved struct)
    • Phase 9c §5 HashMap deleted entirely (the canonical
      stdlib/std/hashmap.nu HashMap[s i] is the one-true map for
      every consumer; the migration also fixed hash_string from
      O(n²) → O(n) by switching from per-byte nurl_str_get to a
      direct *u byte walk, −101 C)
    • Phase 10 §6a Lexer (the big one, −592 C). Full state
      machine + 4-deep lookahead ported to pure-NURL @-fns over a
      280-byte heap handle. Uncovered + fixed a subtle escape-handling
      bug: only \n \t \r \\ are real escapes; any other \X
      (including \`) writes the lone \ and advances one byte.
    • Phase 11 §23 DoS protection (stdlib/std/dos.nu, −180 C)
    • Phase 12 §21 SQLite bridge — pure-NURL FFI over 18 libsqlite3
      symbols (stdlib/ext/sqlite.nu, −330 C)
    • §12 Time — clock_gettime + nanosleep FFI (stdlib/std/time.nu,
      −38 C; macOS uses CLOCK_MONOTONIC = 6 vs 1 elsewhere, read
      at runtime via nurl_native_constant)
    • §13 batch 2/3 — stdin + dir_list POSIX FFI (−80 C)
    • §11 strtod sideband eliminated with an endptr buffer (−20 C)
  • || and && operators added as language tokens — strict
    binary, bool-only short-circuit. Alternative to the chainable
    | / & for cases that are more readable as a || / &&
    chain. Grammar v2.0 documents them. Same LLVM IR as | / &
    on i1 left operands.
  • ./check.sh <file.nu> — per-file syntax/type check tool;
    runs nurlc against a single source file in ~0.2 s vs build.sh
    ~60 s. Use in iterate-fix loops before kicking the full build.
  • Test runner output split into success.txt + failures.txt
    so a failed test is greppable without scrolling through the
    green output.
  • Parenthesised-operator diagnostic. A ( begins a call, so
    ( . obj field ) / ( | a b ) / ( + x y ) etc. now produce
    a precise error: at the call site instead of a far-away
    LLVM-verifier complaint. (Detailed under Added in these notes,
    with the original feature work; reiterated here as it landed in
    this branch.)
  • Call-arity diagnostics. Every call's argument count is
    checked against the callee's declared parameter count; a
    mismatch points at the call site (same listing remark).
  • Prefix arity-cascade diagnostic. Short-an-argument prefix
    operator over-reads now name the offending token and point back
    at the line where the cascade started.
  • mcp_response_get_result: mcp_client's 1-arg result
    extractor renamed for consistency with the rest of the surface.
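
The escape rule fixed in Phase 10 (only \n \t \r \\ are escapes; any other backslash is written as-is and the scan advances one byte) is easy to get subtly wrong. The Python sketch below illustrates the rule as described; it is not a port of the NURL lexer, and the name unescape is made up for the example.

```python
# The four recognised escapes, keyed by the character after the backslash.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}


def unescape(src: str) -> str:
    out = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch == "\\" and i + 1 < len(src) and src[i + 1] in ESCAPES:
            # Real escape: emit the mapped character, consume both bytes.
            out.append(ESCAPES[src[i + 1]])
            i += 2
        else:
            # Ordinary character, or a backslash that starts no escape:
            # emit it unchanged and advance by exactly one.
            out.append(ch)
            i += 1
    return "".join(out)
```

Because an unrecognised backslash advances by one rather than two, the character after it is scanned again as ordinary input and is never dropped.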

Fixed — refactor/pure-nurl branch

  • WASM FFI width mismatches uncovered by uuidgen wasm build
    (2026-05-24). nurl_errno_get / nurl_errno_set /
    nurl_wait_is_exited / _exit_status / _is_signaled /
    _term_sig return values and parameters widened int → long long
    in stdlib/runtime.c. On x86_64 SysV the int return's upper
    32 bits were undefined and accidentally zero; wasm-ld validates
    signatures strictly and refused to link until the C side
    agreed with the NURL FFI's → i (i64) declaration. memmem
    added to api/app/main.py:LIBC_WASM32_ABI (the playground's
    wasm-build IR rewriter), since wasm32 size_t is i32 but
    nurlc.nu's preamble emits memmem(i8*, i64, i8*, i64).
  • macOS WIFEXITED lvalue requirement. The widened
    nurl_wait_* functions originally passed (int)status as an
    rvalue to the W*-macros; macOS's <sys/wait.h> expands them
    to *(int*)&(x) which needs an lvalue. Fixed by binding
    int s = (int)status; first inside each wrapper. Restores the
    zig macOS-arm64 / macOS-x64 cross-build.

Added

  • MsgPack serde. stdlib/ext/serde.nu gained from_msgpack_i /
    from_msgpack_f / from_msgpack_b / from_msgpack_string
    decoding MessagePack bytes straight to a built-in value. There is no
    % MsgpackSerialize trait: MessagePack and JSON share a data model,
    so a value is encoded by composing the existing to_json with
    msgpack_encode. The decoders return !T MsgpackErr (not
    ParseErr: MsgpackErr is the richer error type and represents
    every failure losslessly); MsgpackErr gained a
    MsgpackTypeMismatch variant for a value that decoded cleanly but
    is the wrong shape. Demo examples/msgpack_demo.nu; regression
    compiler/tests/msgpack_serde.nu. With this the serde story covers
    JSON, TOML and MessagePack — all reusing one JsonSerialize impl
    per type.

  • TOML serde. stdlib/ext/serde.nu gained its TOML side: a
    % TomlSerialize [T] { @ to_toml T x → TomlValue } trait with impls
    for i / b / s / String, and from_toml_i / from_toml_b /
    from_toml_string decoders returning !T ParseErr — the same shape
    and error type as the JSON helpers. There is no f impl: the
    TomlValue AST has no float variant. stdlib/ext/toml.nu gained
    toml_stringify, the inverse of toml_parse: a TomlValue is
    rendered as TOML text — top-level key = value lines, nested tables
    and arrays inline, strings escaped with the \\ \" \n \r \t set the
    parser accepts, so toml_parse ∘ toml_stringify round-trips.
    Regression compiler/tests/toml_serde.nu; verified leak-free under
    ASan/UBSan/LSan.

  • MessagePack codec. stdlib/ext/msgpack.nu is a faithful binary
    codec between the Json value and the MessagePack wire format:
    msgpack_encode Json → ! ( Vec u ) MsgpackErr and
    msgpack_decode ( Vec u ) → ! Json MsgpackErr. The encoder emits the smallest
    signed integer format, float64 for reals, and length-appropriate
    str / array / map headers; the decoder accepts every integer and
    float format plus all str / array / map sizes. bin / ext and
    non-string map keys are reported as MsgpackUnsupported; truncation
    and malformed input have their own MsgpackErr variants; a
    recursion cap guards both directions. Three runtime helpers —
    nurl_f64_bits, nurl_f64_from_bits, nurl_f32_from_bits — provide
    the IEEE-754 bit access needed for the float wire format. Regression
    compiler/tests/msgpack_basic.nu (37 assertions: round-trips,
    msgpack.org encode vectors, non-canonical-format decoding, malformed
    inputs); verified leak-free under ASan/UBSan/LSan. First of the
    three Serde-completion ships (codec, then TOML serde, then MsgPack
    serde).
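
To make "smallest signed integer format" and "accepts every integer format" concrete, here is a Python sketch of the MessagePack integer family. It reads the phrase literally and assumes the encoder uses only fixint and the signed int8 / int16 / int32 / int64 formats (many other encoders pick the unsigned formats for non-negative values); the function names are hypothetical and this is not the msgpack.nu code.

```python
import struct

# Multi-byte integer formats: tag byte -> big-endian struct format.
_INT_FORMATS = {
    0xCC: ">B", 0xCD: ">H", 0xCE: ">I", 0xCF: ">Q",  # uint 8/16/32/64
    0xD0: ">b", 0xD1: ">h", 0xD2: ">i", 0xD3: ">q",  # int 8/16/32/64
}


def encode_int(n: int) -> bytes:
    """Encode n in the smallest fixint / signed-int format that holds it."""
    if 0 <= n <= 0x7F:
        return bytes([n])                        # positive fixint
    if -32 <= n < 0:
        return struct.pack(">b", n)              # negative fixint 0xe0..0xff
    if -0x80 <= n <= 0x7F:
        return b"\xd0" + struct.pack(">b", n)
    if -0x8000 <= n <= 0x7FFF:
        return b"\xd1" + struct.pack(">h", n)
    if -0x80000000 <= n <= 0x7FFFFFFF:
        return b"\xd2" + struct.pack(">i", n)
    if -(2 ** 63) <= n <= 2 ** 63 - 1:
        return b"\xd3" + struct.pack(">q", n)
    raise OverflowError("value does not fit in 64 bits")


def decode_int(buf: bytes) -> int:
    """Decode any integer format, canonical or not."""
    if not buf:
        raise ValueError("truncated input")
    tag = buf[0]
    if tag <= 0x7F:
        return tag                               # positive fixint
    if tag >= 0xE0:
        return tag - 0x100                       # negative fixint
    fmt = _INT_FORMATS.get(tag)
    if fmt is None:
        raise ValueError("not an integer format")
    size = struct.calcsize(fmt)
    if len(buf) < 1 + size:
        raise ValueError("truncated input")
    return struct.unpack(fmt, buf[1:1 + size])[0]
```

A decoder written this way accepts non-canonical input such as 5 stored as uint8, which the regression's non-canonical-format cases presumably exercise.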

  • Runtime float-bits helpers: nurl_f64_bits,
    nurl_f64_from_bits, nurl_f32_from_bits in stdlib/runtime.c.
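
For readers unfamiliar with this kind of helper, the Python sketch below shows what "IEEE-754 bit access" means in practice. The functions are analogues named after the runtime helpers, not bindings to them; their exact NURL signatures are not given in these notes, and returning the bits as an unsigned Python integer is an assumption of the sketch.

```python
import struct


def f64_bits(x: float) -> int:
    """Reinterpret a double as its 64-bit pattern (unsigned here)."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def f64_from_bits(bits: int) -> float:
    """Inverse of f64_bits."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def f32_from_bits(bits: int) -> float:
    """Widen a 32-bit single-precision pattern to a Python float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def msgpack_float64(x: float) -> bytes:
    """MessagePack float 64: tag 0xcb followed by the big-endian bits."""
    return b"\xcb" + f64_bits(x).to_bytes(8, "big")
```

Only a 32-bit decoder is needed alongside the 64-bit pair because, as noted above, the encoder always emits float64 while the decoder must still accept float32 input.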

  • Parenthesised-operator diagnostic. A ( begins a function call,
    so the token after it must be a function name. An operator token
    there — ( . obj field ), ( | a b ), ( + x y ) — meant an
    operator expression was wrongly wrapped in parentheses. nurlc used
    to take the operator's lexeme as the callee, emit a call to a
    function literally named . / | / +, and let the build fail far
    from the source at link time with use of undefined value.
    gen_call now rejects a binary operator, member access ., the
    cast # or the caret ^ immediately after ( with a precise
    error: at the call site ("operator '.' cannot be a call target: '('
    begins a function call, but operator expressions are written without
    parentheses") and a caret on the operator. Regression
    compiler/tests/should_fail_paren_operator.nu.

  • Typed Path. stdlib/std/path.nu gained a Path { String inner }
    typed, owning wrapper over a path string, with a concise
    Rust-PathBuf-style verb API — path_new, path_str (borrow the
    inner buffer), path_len, path_is_empty, path_clone,
    path_free, path_eq, path_push (join one component),
    path_parent, path_name, path_is_abs — and the two operations
    the existing string-level layer lacked: path_canonical (realpath:
    absolute, symbolic links resolved) and path_relative_to (a purely
    lexical relative path between two paths). Both return ? Path, with
    None for a missing / inaccessible path or a not-comparable pair. The
    string-s-based path_* functions stay the raw layer underneath;
    the typed functions never consume their arguments. One runtime
    helper, nurl_realpath, is reached through the pure-NURL & `c` FFI
    model, so the compiler is unchanged and the bootstrap fixed point is
    byte-identical. Regression compiler/tests/path_typed.nu;
    verified leak-free under ASan/UBSan/LSan.
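
A "purely lexical" relative path is computed from the path strings alone, without touching the filesystem. The Python sketch below shows one way to do that for POSIX-style paths. It is an illustration only: the name relative_to, the (path, base) argument order, and the choice of which pairs count as not comparable (one absolute and one relative path, or a base that climbs above its own start) are assumptions, not the documented behaviour of path_relative_to.

```python
import posixpath


def _components(path: str):
    # normpath folds "." and inner ".." lexically; drop empty pieces.
    return [c for c in posixpath.normpath(path).split("/") if c not in ("", ".")]


def relative_to(path: str, base: str):
    """Return path expressed relative to base, or None if not comparable."""
    if posixpath.isabs(path) != posixpath.isabs(base):
        return None                      # mixed absolute / relative
    p = _components(path)
    b = _components(base)
    if ".." in b:
        return None                      # cannot climb out of an unknown dir
    common = 0
    while common < min(len(p), len(b)) and p[common] == b[common]:
        common += 1
    # Climb out of what is left of base, then descend into the rest of path.
    parts = [".."] * (len(b) - common) + p[common:]
    return "/".join(parts) if parts else "."
```

Because nothing is resolved on disk, symbolic links are not followed; that is the job of the canonicalising operation.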

  • Runtime helper: nurl_realpath (realpath on POSIX, _fullpath
    on Windows) in stdlib/runtime.c.

  • Extended hash family — SHA-512, MD5, HMAC-SHA-512.
    stdlib/std/hash.nu gained sha512_bytes / sha512_hex (FIPS 180-4
    SHA-512, 64-byte digest), md5_bytes / md5_hex (RFC 1321 MD5,
    16-byte digest) and hmac_sha512_bytes / hmac_sha512_hex (RFC 2104
    HMAC over SHA-512). All are binary-clean — they take ( Vec u ) and
    are length-aware, so NUL bytes are preserved — mirroring the existing
    sha1_bytes. The three algorithms are self-contained in
    runtime.c §17 (no libsodium / OpenSSL dependency) and are reached
    through the pure-NURL & `c` FFI model, so the compiler is unchanged
    and the bootstrap fixed point is byte-identical. MD5 and SHA-1 are
    documented as compatibility-only: both are collision-broken and must
    not authenticate data or hash secrets. Regression
    compiler/tests/hash_extended.nu checks every algorithm against
    published vectors (RFC 1321 §A.5, FIPS 180-4, RFC 4231 HMAC cases
    1/2/6); verified leak-free under ASan/UBSan/LSan.
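
For reference, the RFC 2104 construction behind the HMAC-SHA-512 functions fits in a few lines. The Python sketch below spells it out on top of hashlib's SHA-512; it is not the runtime.c implementation, and the function name is chosen for the example. It can be cross-checked against Python's own hmac module.

```python
import hashlib
import hmac

SHA512_BLOCK = 128  # SHA-512 block size in bytes


def hmac_sha512(key: bytes, msg: bytes) -> bytes:
    # Keys longer than one block are hashed first; shorter keys are
    # zero-padded to the block size (RFC 2104, section 2).
    if len(key) > SHA512_BLOCK:
        key = hashlib.sha512(key).digest()
    key = key.ljust(SHA512_BLOCK, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha512(ipad + msg).digest()
    return hashlib.sha512(opad + inner).digest()


def agrees_with_stdlib(key: bytes, msg: bytes) -> bool:
    """Cross-check the manual construction against hmac.new."""
    expected = hmac.new(key, msg, hashlib.sha512).digest()
    return hmac.compare_digest(hmac_sha512(key, msg), expected)
```

Everything operates on bytes with explicit lengths, which is the same binary-clean property the notes call out: embedded NUL bytes in the key or message are ordinary data.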

  • Runtime hash primitives: nurl_md5_bytes, nurl_sha512_bytes,
    nurl_hmac_sha512_bytes in stdlib/runtime.c.

  • Advanced filesystem operations. stdlib/std/fs.nu gained
    recursive directory operations and a streaming file reader:
    dir_create_all is mkdir -p — it creates every missing parent
    directory and treats an already-existing directory as success;
    dir_remove_all is recursive rm -rf — it walks the tree removing
    every entry before removing the directory itself, and unlinks a
    symlink rather than descending through it. file_open /
    file_read_chunk / file_eof / file_close read a file in
    fixed-size byte chunks over a new File handle, so a binary input
    far larger than RAM can be processed without ever being fully
    resident (line-oriented streaming stays with stdlib/std/bufio.nu).
    Three new runtime.c filesystem helpers back this — nurl_path_type
    (an lstat-based entry classifier), nurl_file_read_chunk and
    nurl_file_eof — all reached through the pure-NURL & `c` FFI model, so
    the compiler is unchanged and the bootstrap fixed point is
    byte-identical. Regression compiler/tests/fs_advanced.nu; the new
    paths are verified leak-free under ASan/UBSan/LSan.
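
The semantics described for these operations (mkdir -p that tolerates an existing directory, a recursive remove that unlinks symlinks instead of following them, and fixed-size chunked reads) can be sketched in Python as follows. The functions reuse the NURL names for readability, except read_chunks, which is a made-up stand-in for the file_open / file_read_chunk / file_eof / file_close loop; none of this is the fs.nu code.

```python
import os


def dir_create_all(path: str) -> None:
    # Create every missing parent; an existing directory is not an error.
    os.makedirs(path, exist_ok=True)


def dir_remove_all(path: str) -> None:
    if os.path.islink(path):
        os.unlink(path)                 # remove the link, not its target
        return
    with os.scandir(path) as it:
        entries = list(it)              # snapshot before deleting anything
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            dir_remove_all(entry.path)  # real subdirectory: recurse
        else:
            os.unlink(entry.path)       # file or symlink (even to a dir)
    os.rmdir(path)


def read_chunks(path: str, chunk_size: int = 65536):
    """Yield the file in fixed-size byte chunks; the last may be shorter."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:               # empty read signals end of file
                return
            yield chunk
```

Classifying entries without following links is what keeps the recursive remove from deleting files outside the tree through a symlinked directory.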

  • Runtime filesystem helpers: nurl_path_type,
    nurl_file_read_chunk, nurl_file_eof in stdlib/runtime.c.

  • Call-arity diagnostics. gen_call now checks every call against
    the callee's declared parameter count and rejects a mismatch with a
    precise error: at the call site, e.g. "call to 'add' has the wrong
    number of arguments: expected 2, got 1". Previously a
    wrong-arity call to a known function either miscompiled silently
    (too few arguments) or emitted a malformed call the LLVM verifier
    complained about far from the source (too many). scan_fn_sigs
    records each non-generic @-function's parameter count through a
    new pure-lexical type skipper (scan_skip_type — no parse_type
    call, which would desync the scan); a name carrying two definitions
    of differing arity is marked ambiguous and skipped rather than
    mis-blamed. Generic and variadic-FFI callees are out of scope for
    v1.

  • Prefix arity-cascade diagnostic. When a prefix operator runs out
    of operands and over-reads into the following statement — the
    classic NURL cascade, since operators have fixed arity and no
    closing bracket — the resulting "unexpected token" error now names
    the offending token and points back at the line where the
    short-an-argument statement began, instead of blaming the innocent
    next line.

  • Compiler regressions: compiler/tests/{call_arity_ok,
    should_fail_call_arity_few,should_fail_call_arity_many,
    should_fail_prefix_cascade}.nu.

  • MQTT client — topic-filter wildcard matching. mqtt_topic_matches
    in stdlib/ext/mqtt.nu implements the MQTT 5.0 §4.7 + / #
    wildcard rules — + matches one topic level, # matches the
    remainder (zero levels included, so sport/# also matches the parent
    sport) — including the §4.7.2 guard that a filter beginning with a
    wildcard never matches a $SYS/... topic. The intended use is
    client-side dispatch when one connection carries several
    subscriptions.
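
The wildcard rules summarised above are compact enough to sketch directly. The Python function below follows those rules as stated; it is not the mqtt.nu implementation, the (filter, topic) argument order is an assumption, and it presumes the filter is already well-formed (# only as the final level).

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    # A filter that starts with a wildcard never matches a $-topic
    # such as $SYS/...
    if topic.startswith("$") and topic_filter[:1] in ("+", "#"):
        return False
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, level in enumerate(f_levels):
        if level == "#":
            return True          # matches the remainder, zero levels included
        if i >= len(t_levels):
            return False         # filter is longer than the topic
        if level != "+" and level != t_levels[i]:
            return False         # literal level differs
    # No '#': every level matched, so the lengths must agree too.
    return len(f_levels) == len(t_levels)
```

For client-side dispatch, a handler table keyed by filter can be scanned with this predicate for each incoming topic.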

  • MQTT offline codec regression: compiler/tests/mqtt_codec.nu
    exercises the Variable Byte Integer round-trip, the unsigned byte
    reader, MQTT UTF-8 string framing, the CONNECT byte layout, CONNACK
    reason extraction, MQTT 5 user-property parsing, the typed MqttErr
    names, and topic matching — no network, CI-safe (closes the bulk of
    MQTT_PLAN.md Phase 5).