v0.9.3
[0.9.3] — 2026-05-31
Summary
A full Game Boy (DMG) emulator written in NURL now plays commercial
games with sound. examples/gameboy/ passes Blargg cpu_instrs 11/11,
instr_timing and 02-interrupts, is 100 %/pixel-perfect on dmg-acid2,
and runs Tobu Tobu Girl end to end — full gameplay plus a complete
4-channel APU mixed to stereo — in the browser at /gameboydemo via the
WebAssembly target. Building it drove three new language/compiler
features (hex/binary integer literals, pointer/aggregate global
initialisers, hex literals in match) and turned one
silently-accepted bare-literal statement into a hard compile error.
Generics now range over option and pointer element types — Vec ?T,
vec_get [?T] → ??T, ??T parameters/returns and nested ?? matching
all compile (five front-end root-cause fixes). The PostgreSQL client is
production-grade (stdlib/ext/postgres.nu + examples/psql.nu),
including option-typed nullable params and getters
(pg_exec_params_opt, pg_get_opt), verified live against
PostgreSQL 16 under AddressSanitizer.
HTTP/2 + HPACK + WebSocket conformance suites remain green: h2spec 2.6.0
reports 146/146 cases against examples/h2c_server.nu; the
autobahn-testsuite fuzzing client reports 294 OK / 4 NON-STRICT /
3 INFORMATIONAL / 0 FAILED across all 301 RFC 6455 cases against
examples/ws_echo.nu. Both binaries run under ASan + UBSan
without findings.
Bootstrap fixed point at 1 772 342 B (stage1 ≡ stage2 byte-identical
IR).
nurlc.wasm is 501,204 bytes.
Added
-
Game Boy (DMG) emulator —
examples/gameboy/. A cycle-aware Sharp
LR35902 core (every opcode + CB-prefix, exact Z/N/H/C flags + DAA,
EI/DI IME enable-delay, HALT + HALT-bug, DIV/TIMA timer, interrupt
dispatch) passing Blargg cpu_instrs 11/11, instr_timing and
02-interrupts; a BG/window/sprite PPU that is 100 %/pixel-perfect
on dmg-acid2 (0/23040 diff — LYC raster + window internal line
counter); MBC1/3/5 mappers, joypad and OAM DMA; and a complete
4-channel APU (2 square w/ sweep, 4-bit wave RAM, 15-bit-LFSR
noise, 512 Hz frame sequencer, NR50/51 mix, DMG high-pass) mixed to
stereo. The engine is split into a shared core.nu with gb.nu (CLI)
and gb_wasm*.nu (wasm32-wasi → canvas) front-ends; the browser demo
at /gameboydemo auto-starts Tobu Tobu Girl and plays it with
sound through the playground audio shim. Two sub-instruction timing
fixes (TIMA increments on the DIV falling edge; the fetch M-cycle is
clocked before the instruction body) took it from a title-screen crash
to full gameplay. Build the wasm at -O2 (lower -O leaks the C
shadow-stack pointer on the interrupt-dispatch path).
-
Generics over option / pointer element types. Option (and pointer)
element types are now first-class generic type arguments:
vec_get [?String] → ??String, Vec ?T / vec_push / vec_set /
vec_free_with, ??T as a parameter and return type, and nested
?? o { T inner → ?? inner { … } } matching all compile — every one of
these previously failed at compile time. Five front-end root-cause
fixes, each verified by a full bootstrap + test-suite run and an
ASan-clean probe: (1) parse_type_optopt for the fused ??T token;
(2) capture_type_arg_src + nurl_src_to_llvm + an opt_
mangle/demangle round-trip so compound type args like [?String] are
one substitutable word; (3) ;-separated closure parameter types so an
aggregate type ({ i1, %String }) no longer truncates at its first
space; (4) slice-vs-pointer store discrimination; (5) int → aggregate
zeroinit. Test corpus on branch feature/generic-option-types (PR #21).
-
Hex / binary integer literals —
0xFF, 0b1010. Added to the
number lexer; the token carries the parsed value and keeps its spelling
for diagnostics. Two companion compiler fixes: pointer- and
aggregate-typed global initialisers (: s g 0 → global i8* null,
: String g 0 → zeroinitializer, inttoptr for a nonzero address),
and hex-literal normalisation in match (int-patterns ?? op { 0xCB → … }
and enum field-constraints Code 0xFF → … are rewritten to
decimal before the icmp, since LLVM reads 0x… as a hex float).
Regression test compiler/tests/hex_literals.nu.
-
Production-grade PostgreSQL client +
psql CLI.
stdlib/ext/postgres.nu reaches production grade: a PgParams builder
(pg_bind_text/str/int/bool/null) for typed + NULL parameter binds the
libpq/pgx way, pg_prepare / pg_exec_prepared, pg_run,
pg_begin/commit/rollback, typed getters (pg_get_int/f64/bool),
pg_reset / pg_err_msg / pg_server_version / pg_escape_literal /
pg_escape_identifier, and — now that generics range over option types
— option-typed nullable params/getters pg_exec_params_opt ( Vec ?String ),
pg_get_opt → ?String, pg_get_opt_int → ?i. New
examples/psql.nu (aligned-table renderer, command tags, multi-line
; accumulation, \dt \d \l \du \conninfo meta-commands, -c "SQL"
one-shot) and examples/pg_optional.nu. Verified live against
PostgreSQL 16 under ASan (PRs #20 / #22).
-
Audio output in the WASM playground. An
env.audio_out_push host
shim streams packed-stereo i64 samples to 48 kHz Web Audio, letting
WASM programs emit sound; demonstrated by examples/audio_tone.nu and
used by the Game Boy demo's APU output.
-
Trait bounds on generic functions —
[A: Trait]. A generic type
parameter may now carry one or more trait bounds:
@ my_max [A: Ord] A x A y → A { … }. Trait-method dispatch inside a generic body already
resolved to the concrete impl through monomorphisation (dispatch is
keyed on the first argument's LLVM type, which becomes concrete at
instantiation); the bound adds the up-front guarantee. scan_impl_decl
now registers each % Trait Type {} as Trait##<llvm> in
g_trait_syms; gen_generic_fn_store records per-tparam bounds; and
check_generic_bounds (called from gen_call at every generic call
site) verifies each bounded tparam's concrete type has the impl —
turning a missing impl from a cryptic unresolved-call link error into a
clear "type 'X' does not implement trait 'Y' required by bound A: Y"
diagnostic. Generic detection in gen_fn_decl is extended to recognise a
colon anywhere in the […] (a slice param's type never contains one).
This removes the need to pass Ord / Hash / eq closures into generic
helpers when an impl exists. Tests compiler/tests/trait_bounds.nu
(positive, i + String) and should_fail_trait_bound.nu (bound
violation → COMPILE FAIL). Bootstrap fixed point holds (stage1 ≡ stage2
byte-identical at 1 730 148 B).
-
?? match guards + or-patterns. Two additions to gen_match:
  - Guards —
    Pattern payloads ? <cond> → body. The guard is
    evaluated after payload binding (so it can read the bound
    payloads); a false guard falls through to the next arm. Implemented
    by recording the guard's source span during arm parse and replaying
    it via nurl_lex_set_pos at the arm body, branching to the body or
    the next arm. A guarded arm does NOT satisfy exhaustiveness for its
    variant — a catch-all (unguarded or _) is still required. Not
    allowed on a _ wildcard arm or combined with an or-pattern.
  - Or-patterns —
    A | B | C → body: several tag-only named
    variants share one body (emit_or_chain lowers the alternatives to
    a tag-compare chain). No payload binding or literal constraints; all
    listed variants count toward exhaustiveness.
Test
compiler/tests/match_guards_or.nu. Bootstrap fixed point holds
(stage1 ≡ stage2 byte-identical at 1 720 428 B).
-
Compile-time const folding for integer globals. A top-level
integer const (: i NAME …, or u / sized ints — not b) may now take
a prefix expression over integer literals instead of a single literal:
+ - * / << >> & | ^^ (not %, which collides with the trait/impl
decl sigil at scan time). const_eval_int in gen_const_decl folds it
to one value. Fixes the long-standing wart where e.g. the
two's-complement minimum needed a niladic helper — stdlib/std/int.nu
now exposes : i INT_MIN - -9223372036854775807 1 directly
(int_min_val retained, delegating to it). Transparent (computes a
value, hides no control flow); fits the parse-directed architecture.
Test compiler/tests/const_eval.nu. Bootstrap fixed point holds.
-
select over channels — ?? { … } — Go-style select. A ??
whose scrutinee is immediately { (no value to match) is a channel
select; each arm [T] ch → bind { body } receives from one channel
and the construct proceeds with the first ready arm. With no _
default it BLOCKS until some channel is ready (value sent or channel
closed); a _ → { … } default makes it non-blocking. bind is the
?T the receive yields (None ⇒ closed). Arms are heterogeneous (each
channel may carry a different element type) and tried in source order.
Implemented in gen_select (compiler/nurlc.nu) as a desugaring that
synthesises NURL source from the verbatim user channel-exprs + bodies
and compiles it through a sub-lexer — no raw IR, no new lexer token.
The blocking rendezvous (a shared SelectWaiter armed on every
channel, fired by senders/closers under the channel mutex) lives in
stdlib/std/channel.nu via the type-erased chan_raw_poll /
chan_raw_arm / chan_raw_disarm / select_waiter_* helpers — the
element type drops out of the orchestration, so one non-generic code
path serves channels of any type. Test
compiler/tests/select_basic.nu (deterministic default / value /
closed / priority cases always-on; concurrent blocking path gated on
NURL_NET_TESTS=1). Bootstrap fixed point holds (stage1 ≡ stage2
byte-identical at 1 691 603 B).
-
Stdlib numeric + text utility round-out — four pure-NURL
additions (no compiler changes, each with an offline test):
  - stdlib/std/int.nu: int_gcd, int_lcm, int_isqrt (Newton-method
    exact floor sqrt). Test compiler/tests/int_extra.nu.
  - stdlib/std/float.nu: float_trunc, float_cbrt, float_hypot,
    float_log2, float_log10 (direct libm FFI) + pure-NURL
    float_sign. Test compiler/tests/float_extra.nu.
  - stdlib/core/string.nu: string_join (complement of string_split)
    and string_count (non-overlapping occurrence count). Test
    compiler/tests/string_join_count.nu.
  - stdlib/core/char.nu: is_upper, is_lower, is_hexdigit,
    to_upper_ascii, to_lower_ascii, hex_val. Predicates use the
    same # i <bool-expr> shape as the existing is_alpha / is_digit
    family — now returning a canonical 1/0 thanks to the cast fix below.
    Test compiler/tests/char_extra.nu.
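To make the semantics of two of the helpers above concrete, here is a minimal Python sketch of an exact floor square root by integer Newton iteration and of a non-overlapping occurrence count. It is an illustration only, not the NURL source: the Python function names merely mirror int_isqrt and string_count, and the handling of negative input and of an empty needle are assumptions the changelog does not state.

```python
def int_isqrt(n: int) -> int:
    """Exact floor(sqrt(n)) for n >= 0 using integer Newton iteration."""
    if n < 0:
        # Assumption: negative input is rejected; the changelog does not say.
        raise ValueError("int_isqrt is undefined for negative input")
    if n < 2:
        return n
    # Start from a power of two that is >= sqrt(n), so the iterates only decrease.
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x  # the sequence stopped decreasing: x is the floor root
        x = y


def string_count(haystack: str, needle: str) -> int:
    """Count non-overlapping occurrences of needle, scanning left to right."""
    if not needle:
        return 0  # assumption: an empty needle counts as zero matches
    count, pos = 0, 0
    while True:
        hit = haystack.find(needle, pos)
        if hit < 0:
            return count
        count += 1
        pos = hit + len(needle)  # jump past the match so matches cannot overlap
```

Because the iteration uses only integer division, the result is exact for arbitrarily large inputs, which is the property "exact floor sqrt" refers to; a floating-point sqrt followed by truncation would not guarantee that.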
Fixed
- Pointer-vs-integer comparison emitted invalid IR.
gen_binary
produced icmp eq i8* %p, 0; comparison operators now ptrtoint any
pointer operand to i64 and compare in i64, so == raw 0
null-checks compile. Found bringing postgres.nu to production grade.
- ^ <void-call> (returning a → v call) emitted a value return.
Returning the result of a void function now lowers to ret void
instead of attempting to return a non-existent value. Found in the
postgres work.
- A bare numeric/string literal as a statement is now a hard compile
error. Previously & m 255 0x40 (single &) silently discarded the
trailing 0x40 — a bare-literal discard statement the compiler
accepted — which hid a real masking bug in the Game Boy PPU's
STAT-bit-6 handling. gen_block_stmts / gen_block_ret now reject a
bare literal whose value is unused. (The no-workarounds dividend from
debugging dmg-acid2.)
- # i <bool> now zero-extends (was -1 for true). Casting a boolean
(an i1 from a comparison / & / | / !) to a wider integer
emitted sext i1, so # i true was -1 instead of 1. Harmless for the
ubiquitous != 0 callers, but it silently broke every predicate
documented as "→ 1": is_alpha / is_digit / is_space /
is_alnum_us all returned -1 for true. NURL has no signed 1-bit type,
so a boolean true is canonically 1 — gen_cast now forces zext for
any i1 source (comparisons never set the __last_unsigned__
side-channel that the unsigned-widen path relies on, hence the explicit
guard). Latent fix across the whole stdlib; no existing test output
changed (nothing depended on the -1). Regression
compiler/tests/cast_bool_int.nu. Bootstrap fixed point holds
(stage1 ≡ stage2 byte-identical at 1 660 838 B).
- HttpOptions struct (HTTP client) — stdlib/ext/http.nu gained
HttpOptions { i timeout_ms, i connect_timeout_ms, i follow_redirects,
i max_redirects, i verify_tls, s user_agent } bundling the per-request
transport overrides that were previously hardcoded in the libcurl
orchestrator. New entry points: http_options_default → HttpOptions,
http_request_with_opts, and http_get_opts / http_post_opts
conveniences. The orchestrator body moved into
__libcurl_perform_full_opts (wires CURLOPT_FOLLOWLOCATION /
MAXREDIRS / SSL_VERIFYPEER / SSL_VERIFYHOST / USERAGENT from the
struct); the legacy timeout-only __libcurl_perform_full_to is now a
thin shim over it, so http_request / http_request_to are
behaviour-preserving. user_agent is a borrowed s, so HttpOptions owns
nothing (no free fn, safe by-value). WinHTTP / stub backends honour
only the two timeouts (redirect / TLS / UA ignored — documented).
Stdlib-only; compiler IR unperturbed. Tests: offline
compiler/tests/http_options.nu (always-on) + a live GET_OPTS case
in compiler/tests/http_basic.nu.
- examples/h2c_server.nu — minimal cleartext-HTTP/2 ("h2c,
prior-knowledge") echo server (~135 LOC). Async accept loop via
stdlib/std/async.nuso h2spec's probe + test connections can be
served concurrently; per-conn read timeout of 1 s keeps the
sequential accept queue draining when a test deliberately leaves
a connection half-open; response body sized ≥ 5 bytes so the
dataLen >= 5 gate in h2spec §6.9.2/2 runs the test instead of
skipping. Verified green under both ./nurl.sh and
NURL_SAN=1 ./nurl.sh.
- examples/ws_echo.nu — minimal WebSocket echo server (~110
LOC). Uses the stdlib ws_perform_handshake + ws_serve_messages
pair against the same TCP accept loop. Per-server WsLimits
raises fragment_max_count to 131 072 so autobahn §9.x's 4-MiB
message split into 65 536 frames assembles successfully;
per-frame and per-message byte caps stay at the stdlib defaults.
- NURL_SAN=1 support in nurl.sh — drops -flto, adds
-fsanitize=address,undefined -fsanitize-address-use-after-scope
-fno-omit-frame-pointer -fno-sanitize-recover=all at the link,
and builds a side-by-side stdlib/runtime_san.o (non-LTO,
matching flags) if stdlib/runtime.c is newer than the cached
artefact. Matches the toolchain ./build.sh --san already uses
for its own corpus.
- HPACK lowercase-header-name encoder (
stdlib/ext/http2_hpack.nu
__hpack_lower_name_dup) — RFC 9113 §8.2.2 mandates lowercase
header field names on the wire; hpack_encode_headers now
lowercases every name before encoding. Previously curl's HTTP/2
parser rejected our Content-Type response header.
- Inline WINDOW_UPDATE pump in
__h2_send_response — when the
stream OR connection send-window is exhausted mid-response, the
writer reads frames off the peer and applies WINDOW_UPDATE /
SETTINGS / PRIORITY semantics in place (RFC 9113 §5.2.1, §6.9.1).
HEADERS for a new stream during the pump is refused with
RST_STREAM(REFUSED_STREAM) per §5.1.2. Empty DATA(END_STREAM)
fallback (§6.9.1 permits zero-length DATA + END_STREAM regardless
of window state) closes the stream cleanly when the pump bails.
- HTTP/2 request HEADERS validation pass (
__h2_validate_request_headers in stdlib/ext/http2_conn.nu) — RFC 9113 §8.3 / §8.2.1 /
§8.2.2: lowercase names, pseudo-headers precede regular ones,
exactly one :method / :scheme / :path (non-empty), no
duplicate or response-only or unknown pseudo-headers, no
connection-specific headers (Connection, Proxy-Connection,
Keep-Alive, Transfer-Encoding), and TE — if present — holds
exactly "trailers". Runs immediately after HPACK decode succeeds
on both the HEADERS+END_HEADERS and HEADERS+CONTINUATION+
END_HEADERS paths.
- HTTP/2 frame-validation pass — SETTINGS / GOAWAY / RST_STREAM
/ PRIORITY / DATA stream-ID + length + ACK rules per §6.5 / §6.4
/ §6.3 / §6.1 / §6.8.
- HEADERS-on-existing-open-stream = trailers (§8.1) — accepting
a HEADERS frame on a stream already in open / half-closed-local
state as the trailers section. Trailers MUST carry
END_STREAM; decoded fields are discarded but end_stream_received
is marked and the handler is dispatched.
- PUSH_PROMISE rejection (§6.6) — client→server PUSH_PROMISE is
now PROTOCOL_ERROR (we advertise SETTINGS_ENABLE_PUSH=0).
- §5.3.1 self-dependency check for PRIORITY and
HEADERS-with-PRIORITY-flag — a stream MUST NOT depend on itself; rejected as
PROTOCOL_ERROR.
- §6.9.1 flow-control overflow detection — WINDOW_UPDATE that
carries a stream's or connection's send-window above 2^31-1 is
now FLOW_CONTROL_ERROR (stream-level → RST_STREAM, conn-level →
GOAWAY).
- §5.1 idle-stream WINDOW_UPDATE — WINDOW_UPDATE on a stream
with sid > last_peer_stream_id (never opened) is PROTOCOL_ERROR;
on a closed stream silently no-ops.
- HPACK §4.2 dynamic-table-size-update placement check — size
updates after any indexed/literal field in the block are
COMPRESSION_ERROR (new seen_field flag in hpack_decode_block),
and the new size is bounded by h2_default_header_table_size
(4 096) — our advertised SETTINGS_HEADER_TABLE_SIZE — rather than
the table's current max_size, which may have been lowered by a
previous update in the same connection.
- §8.1.1 content-length consistency check (
__h2_content_length_mismatch) — when a request carries content-length, the sum of
DATA-payload lengths MUST equal that value. Mismatched (or
unparseable, or duplicated and disagreeing) content-length becomes
PROTOCOL_ERROR before handler dispatch.
- RFC 6455 §5.5.1 / §7.4.2 WebSocket close-frame validation —
payload length 1 → WsInvalidCloseCode (close code 1002, not
1000); status code outside 1000–2999 OR 1004 / 1005 / 1006 /
1015 / 1016+ → close 1002; close-reason bytes validated as UTF-8.
Previously a close frame's payload was discarded outright and the
server replied with WsClosedByPeer → 1000 regardless of what the
peer sent.
- TCP_NODELAY on accepted sockets (
stdlib/runtime.c,
nurl_tcp_accept) — disables Nagle's algorithm on every accepted
TCP connection. Small framing-level ACKs (SETTINGS-ACK, PING-ACK,
WINDOW_UPDATE) were otherwise pinned behind the previous write
for up to 40 ms, which is exactly the window h2spec's per-test
short timeouts can't tolerate.
- h2_default_header_table_size constant in
stdlib/ext/http2_frame.nu — value 4 096, used as the upper
bound for HPACK dynamic-table-size updates and matches the
RFC 9113 §6.5.2 default for SETTINGS_HEADER_TABLE_SIZE.
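The §6.9.1 flow-control overflow rule listed above can be sketched in a few lines of Python. This is a hedged illustration rather than the code in stdlib/ext/http2_conn.nu: the function name, the tuple return shape and the action strings are all hypothetical, and only the 2^31-1 limit and the stream-level versus connection-level split come from the entry itself.

```python
MAX_WINDOW = 2**31 - 1  # largest flow-control window permitted by RFC 9113 §6.9.1


def apply_window_update(window: int, increment: int, stream_level: bool):
    """Return (new_window, action); action is None when the update is accepted.

    Hypothetical helper: the real implementation's names and types differ.
    """
    new_window = window + increment
    if new_window > MAX_WINDOW:
        # Overflow: keep the old window and pick the error frame by scope.
        if stream_level:
            return window, "RST_STREAM(FLOW_CONTROL_ERROR)"
        return window, "GOAWAY(FLOW_CONTROL_ERROR)"
    return new_window, None


# A stream window already at the maximum cannot grow by even one octet.
print(apply_window_update(MAX_WINDOW, 1, stream_level=True))
```

The check has to happen before the sum is stored: in a fixed-width integer the overflowing value could wrap and look like a small or negative window.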
Changed
- scan_fn_sigs is now brace-depth-tracked — only TT_AT / TT_AMP
/ TT_DOLLAR / TT_PERCENT openers at depth 0 trigger their
respective dispatch branches; everything inside { ... } advances
silently. Matches the pattern scan_type_names already used (see
the docstring there). Closes the family of param-walk-desync bugs.
- HTTP/2 GOAWAY-receive no longer triggers immediate shutdown —
per RFC 9113 §6.8 the receiver of GOAWAY MUST keep processing in-
flight frames (PING, RST_STREAM, in-progress streams) until the
peer closes the socket; only NEW stream creation is forbidden.
Previously we hard-exited the serve loop on the first GOAWAY,
which broke the h2spec GOAWAY-then-PING sequence.
- HTTP/2 invalid-preface error path sends GOAWAY only when the
preface was structurally invalid (H2FrameBadPreface); on a read
error (timeout / EOF / IO) we tear down silently. GOAWAY-on-
every-preface-error was being seen by h2spec's per-test probe
connections and counted as the test response.
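The brace-depth tracking described for scan_fn_sigs in this section can be illustrated with a small Python sketch. The token stream is a hypothetical list of strings, and the four single-character openers are stand-ins for TT_AT / TT_AMP / TT_DOLLAR / TT_PERCENT; the real scanner works on the compiler's own token types.

```python
OPENERS = {"@", "&", "$", "%"}  # stand-ins for TT_AT / TT_AMP / TT_DOLLAR / TT_PERCENT


def top_level_openers(tokens):
    """Return (index, token) for every opener seen at brace depth 0."""
    depth = 0
    found = []
    for i, tok in enumerate(tokens):
        if tok == "{":
            depth += 1
        elif tok == "}":
            depth = max(depth - 1, 0)  # tolerate a stray closing brace
        elif depth == 0 and tok in OPENERS:
            found.append((i, tok))
        # Anything else, including openers at depth > 0, is skipped silently.
    return found


# Hypothetical stream: the "@" inside the braces belongs to a closure-shaped
# field type and must not be mistaken for a function declaration.
tokens = ["@", "outer", "{", "(", "@", "HttpResponse", "HttpRequest", ")",
          "handler", "}", "@", "main", "{", "}"]
print(top_level_openers(tokens))  # [(0, '@'), (10, '@')]
```

Only the two depth-0 openers are reported; the nested one at index 4 is ignored, which is the behaviour that prevents the scanner from dispatching on tokens inside a body.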
Fixed
- scan_fn_sigs brace-depth desync — the @ inside a
closure-shaped struct field type (( @ HttpResponse HttpRequest ) handler)
was treated as the start of a function declaration; the param
walker then read HttpResponse as the phantom fname and the
NEXT type-name-shaped token (in stdlib/ext/http_server.nu,
DosLimits, declared 5 lines later) as that phantom function's
ret_ty, silently writing syms["HttpResponse"] = "%DosLimits".
gen_match's wide-payload reconstruction for a
: ! HttpResponse WsErr rr (...) binding then looked up
syms["HttpResponse"] to size the heap-box load and emitted
inttoptr i64 ... to %DosLimits* + load %DosLimits +
bitcast %DosLimits* ... to i8* against the real HttpResponse pointer.
Under -O1+ this manifested as a runtime nurl_peek of a misaligned
sub-page address (the HttpResponse i64 status field read as a Vec
ctl). Under -O0 the extra reload of the alloca round-tripped the
bits exactly so the struct's field accesses happened to land at
the right offsets, hiding the bug.
- __h2_stream_to_request double-freed req.query when the
request had no ? in the path — the field was freed
unconditionally but only reassigned inside the qi >= 0 branch.
request_free then freed the dangling pointer again. Clear ASan
use-after-free on the very first h2c request through h2spec.
- __h2_decode_stream_headers freed the old cur.dec_dyn before
assigning the new dd.dyn — but HpackDynTable.entries is
aliased through the by-value pass into hpack_decode_block, so
the two wrappers shared one Vec ctl. The free turned the new
assignment into a dangling pointer; subsequent reads on
connection close tripped nurl_peek.
- hpack_decode_block failure path freed cur — but cur was
initialised from the input dyn (struct copy, entries Vec
pointer-aliased), so freeing in the error path left the caller's
dec_dyn pointing at a freed Vec entries pointer. The next
h2_conn_free vec_free_with double-freed.
- __h2_frame_err_to_conn returned bare enum tags from ??-arm
bodies when the function return type wrapped them as a struct;
follow the established __net_err_of convention with explicit
# H2ConnErr Tag casts so the IR's ret %H2ConnErr matches the
function signature.
- nurl_str_slice_unsafe did pointer-load instead of pointer
arithmetic — . rp from lowers to "load the byte at rp+from",
not "compute address rp+from". The code intended an unsafe
substring view (rp + from interpreted as a string pointer); now
spelled # s + # i raw from (cast-add-cast).
- Two latent parenthesised-operator compile errors in
stdlib/ext/http2_conn.nu — ( % n 6 ) and ( . rp from ). The
diagnostic that rejects these landed 2026-05-22 but
http2_conn.nu was never on the build/test path, so they sat
silently until examples/h2c_server.nu pulled the file in.
- __pow2 defined in two translation units — once in
stdlib/ext/http_response.nu (used for hex-format expansion) and
once in stdlib/ext/http2_hpack.nu (used for HPACK integer
width). Linker rejected the redefinition the first time both
modules were used together; the HPACK helper is now
__hpack_pow2.
- __h2_apply_settings sign-extension — byte-shift-and-OR
assembly of the 24-bit length / 32-bit value fields used # i u
to widen each payload byte without masking, so any byte ≥ 0x80
propagated as a negative i64 into the next shift, corrupting
value. Fixed with explicit & 255 masks at every byte read; the
same fix applied to the WINDOW_UPDATE increment decode in the
main serve loop and in __h2_pump_one_frame.
- autobahn-testsuite §6.4.1-4 UTF-8 fail-fast — accepted as
NON-STRICT (the spec permits either streaming UTF-8 rejection or
whole-message validation; we do the latter). Documented for
follow-up work.
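The __h2_apply_settings sign-extension bug in this section is easy to reproduce in miniature. The Python sketch below models the signed widening with a hypothetical widen_signed helper (Python integers are unbounded, so the sign has to be simulated); the two assemble functions are illustrative names, not functions from the NURL stdlib.

```python
def widen_signed(byte: int) -> int:
    """Model widening an 8-bit value as signed: bytes >= 0x80 become negative."""
    return byte - 256 if byte >= 0x80 else byte


def assemble_unmasked(payload: bytes) -> int:
    """Buggy shape: shift-and-OR with sign-extended bytes."""
    value = 0
    for b in payload:
        value = (value << 8) | widen_signed(b)  # sign bits leak into upper bytes
    return value


def assemble_masked(payload: bytes) -> int:
    """Fixed shape: mask every byte to 0..255 before OR-ing it in."""
    value = 0
    for b in payload:
        value = (value << 8) | (widen_signed(b) & 255)  # mask restores the raw byte
    return value


payload = bytes([0x00, 0x00, 0x80, 0x00])  # big-endian 32-bit value 0x00008000
print(assemble_unmasked(payload))  # -32768
print(assemble_masked(payload))    # 32768
```

A payload with every byte below 0x80 gives the same result on both paths, which is why the bug stayed latent until a larger field value arrived.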
Tooling / dev experience
- ./build.sh --san ASan/UBSan corpus runs WebSocket close
validation end-to-end through autobahn-testsuite's first seven
case sections (92/92 OK; 0 sanitizer findings).
- docs/GOTCHAS.md remains empty — every gotcha surfaced
during the interop push (parenthesised-operator calls,
sign-extension, __pow2 collision, enum-tag-cast-on-return) is
diagnosed by the compiler at compile time.
- .github/workflows/bench.yml — reproducible CI bench runner.
Triggers on push-to-main (paths-filtered to bench/, compiler/,
stdlib/, bench.yml itself), workflow_dispatch, and a weekly
Monday 06:00 UTC cron. Installs clang + rustup stable + the FFI
libs the regular ci.yml uses, bootstraps nurlc, runs
bench/run.sh 5 on a fixed ubuntu-latest 2-vCPU runner, and
uploads the results as a workflow artifact. Manual / scheduled
runs additionally commit a refreshed bench/RESULTS_CI.md back
to main. The README's headline numbers (captured on a 12-core Intel
@ 3.5 GHz) stay in RESULTS.md as the hand-captured figures;
RESULTS_CI.md is the reproducible baseline.
Tokeniser-aware token-efficiency baseline
- bench/token_efficiency.py + bench/TOKEN_EFFICIENCY.md —
BPE-aware token counts using tiktoken's cl100k_base
(GPT-3.5 / GPT-4 / Claude legacy), o200k_base (GPT-4o /
o-series), and gpt2 (proxy for Llama-3, which is HF-gated)
against every cross-language benchmark in bench/. NURL/Python
BPE-aware token-count ratios on these three benchmarks are
0.82–0.95× (LCG, all encoders) / 1.88–2.06× (sieve) /
1.60–1.77× (json_parse).
- bench/{lcg,sieve,json_parse}.nu cleanup — dropped the
redundant $ "stdlib/core/io.nu" import (the puts and
nurl_str_int calls resolve through the compiler's libc/runtime
prelude) and switched the trailing print from
( puts ( nurl_str_int x ) ) to the one-call
( nurl_print_int x ). sieve.nu also drops the redundant FFI
decls for malloc and free (both pre-registered by
init_syms). Net result: −29 to −80 source bytes per file, NURL
token counts down ~6–10 % across every encoder, and @main LLVM
IR is byte-identical for both compute benchmarks.
Borrow checker
- --strict-borrowck (off by default) — opt-in mode that
extends two existing on-by-default checks:
  - Aliased mutation through
    . obj field arguments. The
    default N-readers-XOR-1-writer check fires only when both
    aliasing arguments at a call site are bare identifiers.
    Strict mode also recognises . obj field as an access of the
    root binding obj, so ( swap c . c n ) is now flagged when
    one of the arguments is inout. The iterator-invalidation
    check is widened in the same shape.
  - # *T <owned-binding> raw-pointer escape. When a *T
    cast's source binding sits on any of the auto-drop
    side-tables (__owned_strings__ / __owned_slices__ /
    __owned_struct_fields__) OR is a non-parameter heap binding
    (%Struct / enum / aggregate, mirroring the move-tracker's
    bck_let_alias heuristic), strict mode flags the cast: the
    binding's auto-drop at scope exit invalidates the pointer.
- Regression tests
compiler/tests/borrow_strict_field_alias.nu
and compiler/tests/borrow_strict_raw_ptr_escape.nu — both
compile cleanly under the default checker and error out under
--strict-borrowck. compiler/tests/run_tests.sh recognises any
borrow_strict_* filename and adds the flag automatically.
- Bootstrap fixed point unchanged: strict mode is purely a
diagnostic analysis pass.
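The difference between the default and strict aliased-mutation checks can be modelled with a short Python sketch. Everything here is hypothetical scaffolding: arguments are represented as source strings paired with an inout flag, and root_binding / aliased_mutation are illustrative names rather than functions in the NURL compiler. Only the rule itself (bare identifiers by default, . obj field resolved to its root in strict mode) is taken from the entry above.

```python
def root_binding(arg: str, strict: bool):
    """Return the binding an argument expression refers to, or None if unknown."""
    parts = arg.split()
    if len(parts) == 1:
        return parts[0]  # bare identifier: recognised in both modes
    if strict and len(parts) == 3 and parts[0] == ".":
        return parts[1]  # `. obj field` counts as an access of `obj`
    return None


def aliased_mutation(args, strict=False):
    """args: list of (expression, is_inout). True if an inout argument aliases another."""
    roots = [(root_binding(expr, strict), inout) for expr, inout in args]
    for i, (ri, inout_i) in enumerate(roots):
        for j, (rj, inout_j) in enumerate(roots):
            if i < j and ri is not None and ri == rj and (inout_i or inout_j):
                return True
    return False


# Models ( swap c . c n ) with the first argument passed inout.
call = [("c", True), (". c n", False)]
print(aliased_mutation(call))               # False: default mode sees no alias
print(aliased_mutation(call, strict=True))  # True: strict mode resolves the root
```

Two read-only arguments that share a root are never flagged, which matches the N-readers-XOR-1-writer wording: aliasing is only a problem once one side can write.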