v0.9.4

@Hindurable Hindurable released this 02 Jun 04:51
· 313 commits to main since this release
8845c57

Added

  • Keyword arguments — default parameter values + named call arguments.
    A trailing parameter may carry a default: @ f s a s b = x i n = 3 → R
    (the default is a single source token — literal / const / atom). A call
    may then omit defaulted trailing arguments — ( f val ) — and/or pass
    arguments by name in any order, mixed with leading positional ones:
    ( f a: 1 b: 2 ), ( f val n: 5 ), ( greet greeting: Hi name: Bob ).
    Implemented as a call-site desugaring to an ordinary positional call:
    scan_fn_sigs records each function's parameter names + default sources;
    gen_call fills omitted trailing defaults inline, and routes a call that
    uses name: labels through gen_call_kwargs, which evaluates arguments
    in source order and assembles them in parameter order. Existing positional
    calls take the unchanged path (byte-identical IR — bootstrap fixed point
    holds). Regression: compiler/tests/kwargs.nu. Current limits (documented
    in the grammar): not on generic functions, FFI/variadic, or parameters
    with the inout/sink convention; **kwargs-style collection is not
    provided (pass a Json/struct).

  • BLAKE3 hash (pure NURL) — completes the hash family. New
    stdlib/std/hash_blake3.nu implements full BLAKE3 (the ChaCha-derived
    compression function, 1024-byte chunks split into 64-byte blocks with
    CHUNK_START/CHUNK_END flags, the binary Merkle tree of chaining values,
    and the ROOT-flagged final node), exposed via blake3_bytes /
    blake3_hex in stdlib/std/hash.nu (unkeyed, 32-byte output). All-NURL
    u32 wrapping arithmetic, little-endian, binary-clean over ( Vec u ), with
    no C at all (compiler and runtime untouched). Verified
    digest-for-digest against the official BLAKE3 reference across every
    structural path (empty, sub-block, the 1024-byte single chunk, the
    1025-byte two-chunk boundary, balanced multi-chunk trees up to 5000
    bytes); regression compiler/tests/blake3.nu; clean under ASan/UBSan/LSan.
    Closes the ROADMAP "Extended Hash Family" item — SHA-1/256/512, MD5,
    HMAC, and BLAKE3 are all shipped.

  • volatile_load / volatile_store compiler intrinsics for MMIO. Emit
    load volatile / store volatile as pure IR (no runtime call, so they
    work on a freestanding target). The optimizer can no longer hoist an MMIO
    read out of a polling loop (LICM), reorder accesses, or coalesce repeated
    reads/writes — the missing piece for spinning on a device status register
    at -O2. The access width comes from the typed pointer argument (*T),
    so one pair covers i8/i16/i32/i64. stdlib/hal/mmio.nu
    (mmio_read32/write32/set32/clear32) now uses them, so the ESP32
    UART/GPIO drivers no longer need the -O0 workaround. Regression:
    compiler/tests/volatile_mmio.nu; verified at -O2 the volatile load
    stays inside the loop body.

  • ESP32 bare-metal register HAL (stdlib/hal/esp32.nu). Pure-NURL GPIO
    and UART0 over the chip's memory-mapped registers (built on
    stdlib/hal/mmio.nu) — no ESP-IDF, no FFI. GPIO output enable / set /
    clear, and a blocking UART console (esp32_uart_putc / getc / puts
    with FIFO-count helpers), with register addresses taken from the ESP32 TRM
    and cross-checked against ESP-IDF's soc/*_reg.h. Demonstrated by the new
    fully-NURL UART echo example (examples/esp32/idf-uart).

  • C64 emulator example (examples/c64). A MOS 6510 / Commodore 64
    emulator in pure NURL — a single core.nu engine shared by a native CLI
    and a WebAssembly browser front-end. The CPU core passes Klaus Dormann's
    6502_functional_test (the canonical 6502 correctness oracle, validated
    headlessly), and with stock KERNAL/BASIC/CHARGEN ROMs the machine boots
    through the full power-on sequence — PLA banking, CIA1 jiffy IRQ — to the
    BASIC READY. prompt.
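
The keyword-argument entry at the top of this list describes a call-site desugaring: named and omitted arguments are rewritten into an ordinary positional call. The sketch below illustrates that idea in Python, not the compiler's actual code. `Param`, `desugar_call`, and `SIG_F` are hypothetical names, arguments are treated as opaque source tokens, and the signature is assumed to be `a`, `b` (default `x`), `n` (default `3`), as read from the example signature in that entry.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Param:
    name: str
    default: Optional[str] = None  # default's source token, or None if required


def desugar_call(params: List[Param], positional: List[str],
                 named: List[Tuple[str, str]]) -> List[str]:
    """Return the argument tokens in parameter order (a plain positional call)."""
    known = {p.name for p in params}
    if len(positional) > len(params):
        raise TypeError("too many positional arguments")
    # Leading positional arguments bind to the first parameters.
    slots = {p.name: arg for p, arg in zip(params, positional)}
    # Named arguments arrive in source order and may appear in any order.
    for name, arg in named:
        if name not in known:
            raise TypeError(f"unknown parameter {name!r}")
        if name in slots:
            raise TypeError(f"parameter {name!r} given twice")
        slots[name] = arg
    # Assemble in parameter order, filling omitted defaults inline.
    ordered = []
    for p in params:
        if p.name in slots:
            ordered.append(slots[p.name])
        elif p.default is not None:
            ordered.append(p.default)
        else:
            raise TypeError(f"missing argument {p.name!r}")
    return ordered


# Assumed signature: a (required), b = x, n = 3.
SIG_F = [Param("a"), Param("b", "x"), Param("n", "3")]
```

With this model, `desugar_call(SIG_F, ["val"], [("n", "5")])` yields the same token list a fully positional call would have supplied, which is why existing positional calls need no change.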

Fixed

  • nurlfmt split hex/binary/octal integer literals. The tokenizer's
    numeric scanner stopped at the first non-decimal digit, so 0x3FF44008
    became two tokens (0 + identifier x3FF44008) and the reformatted
    source miscompiled — silently, because --check is idempotent on its own
    broken output. tools/nurlfmt/tokenize.nu now scans a 0x/0b/0o
    prefix and its body as one token. Verified by the
    nurlfmt_idempotent.sh gate (450 files, IR-transparent) and by restoring
    the hex literals in the examples/esp32/* register maps that had been
    worked around with decimal constants.

  • SQLite production hardening (Tier 1 + Tier 2). stdlib/ext/sqlite.nu
    is now binary-safe and resource-safe:

    • NUL-safe text I/O. sqlite_column_text reads the column's exact
      byte length via sqlite3_column_bytes (was strlen, which truncated
      at the first embedded NUL), and sqlite_bind_text now takes a String
      and passes an explicit byte length to sqlite3_bind_text instead of
      -1 — strings with embedded NULs round-trip intact.
    • BLOB support. New sqlite_bind_blob (Vec u → sqlite3_bind_blob +
      SQLITE_TRANSIENT) and sqlite_column_blob (sqlite3_column_blob +
      _bytes → owned Vec u): the binary-safe write/read path.
    • sqlite_open_v2 with open flags. SQLITE_OPEN_READONLY /
      READWRITE / CREATE / URI / NOMUTEX / FULLMUTEX / NOFOLLOW
      constants exposed; sqlite_open is now READWRITE|CREATE over
      open_v2. A read-only connection refuses writes (new SqliteReadOnly
      error variant) instead of silently creating a file.
    • sqlite_busy_timeout wraps sqlite3_busy_timeout so SQLITE_BUSY
      blocks-and-retries under concurrent access rather than failing
      immediately.
    • % Drop auto-close. Database and Statement implement the Drop
      trait; a scope-local handle — including one unwrapped from a
      ! Database E / ! Statement E result in a match arm — closes itself
      on every path (Ok, Err, early return) with no manual
      sqlite_close/sqlite_finalize. Teardown zeroes the handle slot after
      closing, so a stale internal re-entry is a no-op. Verified leak-free
      and double-free-free under ASan + UBSan (compiler/tests/sqlite_hardening.nu).
    • Tier 3 — datatypes & transactions. sqlite_bind_double /
      sqlite_column_double (REAL columns), sqlite_column_is_null,
      sqlite_begin / commit / rollback, and a closure-based
      with_transaction that COMMITs on Ok and ROLLBACKs on Err
      (propagating the original error).
    • Tier 4 — hardening for untrusted SQL/DB. Extended result codes are
      enabled on every open, so constraint failures now map to distinct
      variants (SqliteConstraintUnique / …ForeignKey / …NotNull /
      …PrimaryKey / …Check). Added sqlite_last_insert_rowid;
      sqlite_set_defensive / sqlite_enable_load_extension /
      sqlite_harden (DEFENSIVE on + extension-loading off — blocks
      corruption/RCE from a hostile DB); sqlite_limit (bound query
      complexity); a closure-based sqlite_set_authorizer /
      sqlite_clear_authorizer that installs a sandbox callback with the
      exact C ABI libsqlite expects (the closure's compiled function +
      captured env are passed as xAuth + pUserData, the same mechanism
      thread_spawn uses for pthread_create — no C bridge); and PRAGMA
      helpers sqlite_journal_wal / sqlite_foreign_keys /
      sqlite_synchronous. Verified under ASan + UBSan
      (compiler/tests/sqlite_tier34.nu).
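
The Tier 3 entry describes a closure-based with_transaction that commits on Ok and rolls back on Err while propagating the original error. As a rough illustration of that control flow only, here is a sketch using Python's standard `sqlite3` module. It is not the NURL API: a Python exception stands in for the Err case, and the `with_transaction` and `demo` functions below are hypothetical.

```python
import sqlite3


def with_transaction(conn, body):
    """Run body(conn) inside BEGIN ... COMMIT; ROLLBACK and re-raise on failure."""
    conn.execute("BEGIN")
    try:
        result = body(conn)
    except Exception:
        conn.execute("ROLLBACK")
        raise  # propagate the original error unchanged
    conn.execute("COMMIT")
    return result


def demo():
    # isolation_level=None disables the module's implicit transactions,
    # so the explicit BEGIN/COMMIT/ROLLBACK above are the only ones issued.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

    with_transaction(conn, lambda c: c.execute("INSERT INTO t(name) VALUES ('a')"))

    def failing(c):
        c.execute("INSERT INTO t(name) VALUES ('b')")
        c.execute("INSERT INTO t(name) VALUES ('a')")  # violates UNIQUE

    try:
        with_transaction(conn, failing)
    except sqlite3.IntegrityError:
        pass  # the caller still sees the constraint error

    names = [row[0] for row in conn.execute("SELECT name FROM t ORDER BY name")]
    conn.close()
    return names
```

After `demo()` runs, only the row from the successful transaction remains: the `'b'` insert is undone together with the failing statement.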

Changed

  • Match-arm payload bindings now participate in auto-drop. A % Drop
    type bound as a ?? match-arm payload (e.g. ?? r { T db → … }) — or a
    : let inside a match arm — is now dropped at arm scope exit, on the same
    void-arm-only rule used for owned strings/structs. Previously such
    bindings were never dropped (a latent leak); this is what lets the SQLite
    handles above close automatically in the idiomatic result-unwrap flow.

Documentation

  • ROADMAP brought up to date. The Status header now reads Grammar v2.1
    (was v2.0) and points at spec/grammar.ebnf. Items that were marked pending
    but are in fact shipped are now [x]: the async runtime (stackful M:N
    fibers — the Coroutines-vs-async/await decision is settled), HTTP server
    Phase 8 (production hardening) and Phase 9 server-side (TLS+SNI+ALPN+
    mTLS+reload, HTTP/2, WebSocket; client-side remains), the optional
    -lcurl sentinel-gated linking, and the nurlc_lastgood.nu refresh
    lifecycle (documented via --refresh-bootstrap). Added an explicit
    "What's actually left" summary to the Status section (HTTP/2+WebSocket
    client-side; mobile/no_std targets; SQLite BLOB/double; reverse-proxy
    binary bodies; blake3; MCP SSE/sessions/auth; the runtime.c file-split;
    a compiler-embedded LLM; bench peers). Stale build-size figures left only
    in dated historical "shipped" entries (records, not current claims).

  • Removed hard-coded build-artifact sizes from the reference docs. The
    ~480 KB nurlc.wasm (docs/PLAYGROUND.md) and ~1.6 MB
    nurlc_lastgood.ll (docs/BUILDING.md) figures drift every build and
    mislead when the real artifact differs. Build sizes belong in the
    changelog/release notes (tied to a specific version), not in
    instructional docs.

  • Cleaned stale GOTCHAS.md item N / §N references out of code comments.
    After docs/GOTCHAS.md lost its numbered list, ~44 source comments (in
    compiler/nurlc.nu, the nurlc_lastgood.nu snapshot mirror, nine
    compiler/tests/*.nu, and stdlib/ext/{http_middleware}.nu) still pointed
    at item/section numbers that no longer exist. Each now points at the real
    home (escape/lifetime → docs/MEMORY.md §2.3, grammar → docs/LIMITATIONS.md)
    or simply describes the behaviour inline. The nurlc_lastgood.nu edits are
    comment-only — verified to produce byte-identical IR, so the committed
    bootstrap nurlc_lastgood.ll is unchanged; the build still reaches its
    fixed point and the full test suite passes.

  • docs/GOTCHAS.md reduced to "Currently no known gotchas." Every
    source-level trap is now a compiler diagnostic (error:/warning: with a
    caret + cure), so the page no longer lists a museum of resolved issues.
    The real content that lived there was relocated to its proper home: the
    fiber-runtime operational caveats (non-blocking handle flipping,
    runtime_run blocking, stack-borrow capture, plus runtime-maintainer
    notes on TLS-under-LTO and the reactor park/unpark ordering) moved to
    docs/ASYNC.md → Operational caveats, and the
    : ~-capture lifetime rule now points at docs/MEMORY.md
    §2.3. Updated every back-reference (docs/spec.md, docs/LIMITATIONS.md,
    ROADMAP.md's "5 active quirks" status line, the VS Code extension
    README, and stale GOTCHAS.md item N comments in stdlib/ext/toml.nu,
    mcp_http.nu, http_multipart.nu). All internal links verified.

  • docs/LIMITATIONS.md scoped to actual language/compiler limitations.
    Removed the standard-library capability tables (PostgreSQL, SQLite,
    panic/recover) that were never language limitations — that information
    lives with each module (stdlib headers, ROADMAP.md, TODO.md). Moved
    the HTTPS/TLS table to docs/NETWORKING.md where it
    belongs. Removed two entries that were stale (the behaviour already
    works, verified empirically): "no tail-call optimisation" (self-recursive
    tail calls emit tail call → LLVM sibcall-opt; 50M-deep tail recursion
    runs without overflow) and "enum forward references unsupported"
    (scan_type_names registers type names before codegen, so a struct
    payload can be declared after its enum). The page now lists only
    language/compiler constraints (Type system, Functions/calls, Enums,
    Imports, Grammar).

  • Playground now renders linked docs instead of 404ing. Clicking a
    relative link inside a rendered doc (e.g. docs/LIMITATIONS.md from the
    README, or ../spec/grammar.ebnf from a docs/ page) used to hit "not
    found". nurlapi now serves the repo doc tree by its natural path —
    /docs/*, /spec/*, /bench/*, and the capitalised top-level
    /README.md · /ROADMAP.md · /CHANGELOG.md · /CONTRIBUTING.md,
    rendering .md to HTML (__serve_repo_doc, path-traversal-guarded) and
    serving other files as text; examples/*.md renders too (.nu stays
    JSON for the editor). Because the route hierarchy mirrors the repo, the
    browser's own relative-link resolution chains correctly between docs. The
    container image now copies the whole docs/ tree (was only
    GOTCHAS.md) plus CHANGELOG.md, CONTRIBUTING.md, and bench/.

  • README refactored into a slim overview + topic docs. The 991-line
    kitchen-sink README is now a ~230-line overview (why/principles,
    architecture, quick start, syntax-at-a-glance, a documentation index, and
    project layout) that links out to focused pages under docs/. New:
    docs/BUILDING.md, docs/TOOLING.md, docs/PLATFORMS.md,
    docs/PLAYGROUND.md (HTTP API + playground + MCP), docs/NETWORKING.md
    (sockets + MQTT), docs/LIMITATIONS.md. Syntax/type/memory sections now
    point to the existing authoritative homes (spec/grammar.ebnf,
    docs/spec.md, docs/MEMORY.md) instead of duplicating them.

  • Removed stale / frequently-changing content. The README no longer
    hard-codes a grammar version, benchmark tables (point to bench/), the
    example file list, the MCP tool count, or the .vsix version. The
    PostgreSQL "Known Limitations" (claimed no binary protocol / async /
    LISTEN-NOTIFY / COPY — all shipped) and the MQTT section (TLS-only +
    verify-on-by-default + exactly-once QoS 2 + subscribe_many) are now
    accurate. Dropped references to non-existent spec/types.md / ir.md /
    bootstrapping.md, fixed CONTRIBUTING.md's api/nurlapi/, a dead
    HTTP_SERVER_PLAN.md link in ROADMAP.md, the compiler's prefix-arity
    diagnostic (pointed at the moved README section → docs/LIMITATIONS.md),
    and docs/GOTCHAS.md's cross-reference. All internal doc links verified.
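
The Playground entry above mentions that the doc routes are path-traversal-guarded. One common way to implement such a guard is sketched below in Python. This is an assumption about the general technique, not a description of __serve_repo_doc itself: `resolve_repo_doc` and `render_mode` are hypothetical helpers, the allow-list is taken from the routes named in that entry, and the request path is assumed to be percent-decoded already.

```python
ALLOWED_PREFIXES = ("docs/", "spec/", "bench/")
TOP_LEVEL = {"README.md", "ROADMAP.md", "CHANGELOG.md", "CONTRIBUTING.md"}


def resolve_repo_doc(url_path):
    """Map a request path to a repo-relative path, or None if it must not be served."""
    if "\x00" in url_path or "\\" in url_path:
        return None
    rel = url_path.lstrip("/")
    # Reject rather than normalise: any empty, '.' or '..' segment is refused,
    # so the path can never climb out of the repo tree.
    if any(part in ("", ".", "..") for part in rel.split("/")):
        return None
    if rel in TOP_LEVEL or rel.startswith(ALLOWED_PREFIXES):
        return rel
    return None


def render_mode(rel):
    """Markdown is rendered to HTML; every other file is served as text."""
    return "html" if rel.endswith(".md") else "text"
```

Refusing suspicious segments outright, instead of normalising them, keeps the guard easy to audit; a request such as `/docs/../secrets.txt` is turned away before any file-system access.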

Added

  • MQTT: multi-topic SUBSCRIBE (mqtt_subscribe_many) sends one
    SUBSCRIBE for N filters at a shared max QoS and validates every
    per-filter SUBACK reason code (a new __mqtt_check_suback that parses
    the property block instead of assuming a single trailing byte —
    mqtt_subscribe_qos now uses it too).
  • PostgreSQL advanced protocol features — binary, async, LISTEN/NOTIFY,
    COPY (stdlib/ext/postgres.nu). Closes the last Tier-5 Postgres gap;
    all four are pure-NURL libpq FFI (no runtime.c bridge) and are
    exercised end to end by the new examples/pg_advanced.nu, live-verified
    against PostgreSQL 16.14.
    • Binary result protocol: pg_exec_params_binary requests
      resultFormat = 1; pg_get_i16_bin / _i32_bin / _i64_bin /
      _bool_bin / _f64_bin decode network-byte-order cells (float8
      reinterpreted from its IEEE-754 bit pattern, not a numeric cast), with
      pg_get_length / pg_field_format / pg_binary_tuples.
    • Asynchronous queries: pg_send / pg_send_params dispatch without
      blocking; pg_get_result (→ ?PgResult, None when finished) and the
      blocking convenience pg_await collect results; pg_consume_input /
      pg_is_busy / pg_socket / pg_flush / pg_set_nonblocking hook into
      an event loop.
    • LISTEN/NOTIFY: pg_listen, pg_notify_send, pg_notifies
      (→ ?PgNotify { relname, be_pid, extra }, read after
      pg_consume_input) and pg_notify_free.
    • COPY: pg_copy_start (accepts the PGRES_COPY_IN / COPY_OUT
      handshake that plain pg_exec rejects), pg_put_copy_data /
      pg_put_copy_str / pg_put_copy_end for COPY … FROM STDIN, and
      pg_get_copy_data (→ ?String) for COPY … TO STDOUT.
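
The binary result accessors above decode network-byte-order cells, and float8 is reinterpreted from its IEEE-754 bit pattern instead of being cast numerically. The Python sketch below shows what that decoding amounts to for the common fixed-width types. The `decode_*` helpers are hypothetical stand-ins for the pg_get_*_bin family, operating on a cell's raw bytes.

```python
import struct


def decode_int2(cell: bytes) -> int:
    return struct.unpack("!h", cell)[0]  # 2 bytes, big-endian, signed


def decode_int4(cell: bytes) -> int:
    return struct.unpack("!i", cell)[0]  # 4 bytes, big-endian, signed


def decode_int8(cell: bytes) -> int:
    return struct.unpack("!q", cell)[0]  # 8 bytes, big-endian, signed


def decode_bool(cell: bytes) -> bool:
    return cell[0] != 0  # a single byte, zero or non-zero


def decode_float8(cell: bytes) -> float:
    # Reinterpret the 8 bytes as an IEEE-754 double; converting the same
    # bytes as an integer and casting would give a completely different value.
    return struct.unpack("!d", cell)[0]
```

`struct.unpack` raises `struct.error` when the cell is not exactly the expected width, so this sketch fails loudly on a width mismatch instead of reading beyond the cell.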

Fixed

  • Compiler: ??-match on a result/option-typed parameter dropped its
    payload. @ f !S E r → S { ?? r { T x → ^ x } } emitted ret i64
    against the %S return type (an LLVM "value doesn't match function
    result type" error), and the same gap mishandled a ( Vec u ) handle
    payload and dropped the unsigned flag on a ?u parameter (sign-extending
    a byte ≥ 0x80). gen_fn_param now records the
    <param>__res_nurl_T / __res_t_llvm / __res_e_llvm / __opt_nurl_T
    metadata that gen_let_or_struct already records for let-bound result
    vars, so gen_match reconstructs struct / pointer / unsigned payloads
    for a parameter scrutinee exactly as it does for a let binding. Bootstrap
    fixed point held; regression compiler/tests/match_param_payload.nu.
  • Compiler: an empty block {} returned from a void function emitted
    invalid IR. An empty block is the unit/void value, but gen_block left
    the "last type" at whatever preceded it (i64 by default, i1 inside a
    conditional), so ^ {} in a void function produced ret i64 undef, which
    LLVM rejected. The block now types as void when it has no trailing
    statement.
  • Compiler: undefined identifier in value position no longer emits an
    undefined SSA value with exit status 0 (PR #25 / Fixes). gen_ident's
    bare %<name> fallback fired for any name lacking a __ptr / __global
    binding — so : i x ^ a b c emitted ret i64 %a that nurlc accepted and
    only clang rejected. The fallback now requires a by-value parameter and
    otherwise dies with "use of undefined identifier". This was critic.md §4's
    headline contradiction of "every trap is a compiler diagnostic".
    Regression compiler/tests/should_fail_undef_ident.nu.
  • Compiler: a within-statement prefix-arity cascade that swallowed the
    next ^ silently returned early (PR #25 / Fixes). : i x + 1 /
    ^ a parsed as + 1 (^ a) → ret %a plus a dead add, exiting 0. A new
    g_ret_forbidden flag (armed by a gen_operand wrapper around every
    value-operand parse, reset by gen_stmt and the ? / ?? arm bodies)
    makes gen_ret refuse to emit a ret in operand position. Regression
    compiler/tests/should_fail_cascade_caret.nu.
  • Compiler: ??-match on a direct-call scrutinee dropped pointer/handle
    payloads and option signedness (PR #25 / Fixes).
    : ( Vec u ) x ?? ( f … ) { … } / : s x ?? ( f … ) { … } left the
    binding undef (no T-arm reconstruction, no result phi) for
    handle/pointer payloads, and ?? ( vec_get [u] … ) { T b → # i b }
    sign-extended an unsigned byte. The
    callee's Ok/Err-payload LLVM types and option-inner token are now recorded
    per function and surfaced to gen_match's direct-call synthesis (with a
    bare-i8* inttoptr path for both arms). Regressions
    compiler/tests/match_bind_call_handle.nu, match_call_opt_unsigned.nu.
  • Compiler: option/result construction from a sized-int literal didn't
    truncate (PR #25 / Fixes). @ ?u { T 0x86 } emitted
    insertvalue { i1, i8 } …, i64 134, 1, which clang rejected.
    gen_agg_lit's opt/res
    payload coercion now truncs/sexts/zexts the literal to the payload width
    (option = T's real width, result = i64). Regression
    compiler/tests/opt_lit_payload_width.nu.
  • MQTT inbound QoS 2 is now exactly-once. A retransmitted (DUP)
    QoS 2 PUBLISH was acknowledged but re-delivered to the application. The
    client now tracks inbound packet ids across their PUBREC…PUBCOMP window
    (MqttClient.qos2_rx, bounded, oldest evicted past 256), acknowledges a
    duplicate but delivers it only once. __mqtt_parse_publish returns
    ?MqttMessage (None on a de-duplicated retransmit) and the dedup policy
    is unit-tested in compiler/tests/mqtt_qos2_dedup.nu. The doc-drift
    comment on __mqtt_do_publish (claimed a fixed packet id) was corrected.
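
The exactly-once fix above comes down to remembering inbound QoS 2 packet ids between the first PUBLISH and the closing PUBREL. A minimal Python sketch of that dedup policy follows, assuming a bounded insertion-ordered set with oldest-first eviction. `Qos2Inbox` and its methods are hypothetical; the real state lives in MqttClient.qos2_rx.

```python
from collections import OrderedDict


class Qos2Inbox:
    """Tracks inbound QoS 2 packet ids across their PUBREC...PUBCOMP window."""

    def __init__(self, capacity=256):
        self.capacity = capacity
        self._seen = OrderedDict()  # packet id -> True, in arrival order

    def on_publish(self, packet_id):
        """Return True if the message should be delivered to the application.

        The caller sends PUBREC in both cases; a retransmit is acknowledged
        but not delivered a second time.
        """
        if packet_id in self._seen:
            return False
        self._seen[packet_id] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest id
        return True

    def on_pubrel(self, packet_id):
        """PUBREL ends the window (the caller replies PUBCOMP); forget the id."""
        self._seen.pop(packet_id, None)
```

The bound trades strictness for memory: once an id has been evicted, a very late retransmit of it would be treated as new, which is the cost of capping the table.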

Security

  • MQTT TLS certificate verification is now configurable and on by
    default. mqtt_connect_cfg / mqtt_reconnect previously hard-coded
    verify = F, so every TLS connection was effectively --insecure
    (MITM-able). MqttConfig gained a tls_verify field, threaded through to
    tcp_connect_tls; mqtt_config defaults it to T (peer-cert chain + host
    name verified against the system trust store). Set it F only for a
    self-signed broker in a trusted environment.
  • pg_listen SQL injection (critical) — fixed. A channel name is a SQL
    identifier and cannot be a bound parameter, so it now goes through
    pg_escape_identifier (PQescapeIdentifier) before interpolation; raw
    concatenation previously let pg_listen c "x; DROP TABLE …; --" execute
    the injected statement. (pg_notify_send was already safe — it binds the
    channel as a value to pg_notify($1, $2).)
  • Out-of-bounds read in the binary accessors (medium) — fixed. A new
    PQgetlength-checked __pg_bin_ptr guards every pg_get_*_bin:
    reading an int4 cell with the 8-byte pg_get_i64_bin, or any accessor
    on a binary SQL NULL (0 bytes), now returns 0 instead of reading past
    the cell into adjacent libpq buffer memory.
  • TLS is not verified by default — documented. pg_connect and the
    file header now carry a prominent warning that libpq's default
    sslmode=prefer neither prevents a silent plaintext fallback nor
    verifies the server certificate (MITM-able), recommending
    sslmode=verify-full sslrootcert=… for non-local connections. Not
    force-defaulted, as that would break legitimate unix-socket / trusted-LAN
    connections. Minor: pg_get_bool gained a NULL-pointer guard, and the
    empty-on-NULL behaviour of pg_escape_literal / pg_escape_identifier
    is now documented.
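
The pg_listen fix above relies on identifier escaping because a channel name cannot be sent as a bound parameter. The Python sketch below shows the standard SQL identifier-quoting rule (wrap in double quotes, double any embedded double quote) to make clear why the injected text stops being executable. It is an illustration only: `quote_identifier` and `listen_sql` are hypothetical, and real code should call PQescapeIdentifier, which is also aware of the connection's encoding.

```python
def quote_identifier(name: str) -> str:
    """Quote a SQL identifier: wrap in double quotes, double embedded quotes."""
    if "\x00" in name:
        raise ValueError("identifier must not contain NUL")
    return '"' + name.replace('"', '""') + '"'


def listen_sql(channel: str) -> str:
    """Build a LISTEN statement with the channel treated purely as a name."""
    return "LISTEN " + quote_identifier(channel)
```

With quoting in place, a hostile channel name such as `x; DROP TABLE t; --` becomes one (odd) identifier inside the quotes instead of a second statement.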