Releases: Heniokhos-Systems/KernRift
v2.8.26
Full Changelog: v2.8.25...v2.8.26
v2.8.25
v2.8.24
v2.8.22
v2.8.21
Three IR codegen wins compounding to a 5.2% bootstrap shrink (1.24 MB → 1.18 MB) and 30% sort runtime drop (153 → 108 ms — krc now beats gcc -O2 on bubble-sort by 2.5×).
What changed
- 6th register colour (`rbp`). Graph-colouring regalloc gained one more callee-saved register, dropping spill rate compiler-wide. `rbp` had been left out historically; the lz4 / fat-archive paths surfaced an off-by-one in stack-arg overflow loads, fixed by replacing `ir_frame_size + 48` (hardcoded "5 pushes + ret addr") with `ir_frame_size + ir_callee_save_bytes + 8`.
- Per-function used-callee-save prologue. Functions push only the colours regalloc actually assigned. `fib`'s prologue dropped from 5 pushes to 3; leaf-ish helpers often drop to 0-1. Variable alignment math (`push_count` parity decides `frame_size` +8) keeps SP 16-aligned at every CALL.
- Cross-register spill-reload peephole. `store rax,V; load rcx,V` (different reg) now emits `mov rcx, rax` instead of a stack roundtrip. Catches matmul-style intermediate-vreg flows through different scratch regs.
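The frame arithmetic behind the first two items can be checked with a small model. This is a sketch in Python, not krc's code: it assumes x86_64 with 8-byte pushes, an 8-byte return address, and SP 16-aligned at each `call`. The function names are hypothetical.

```python
PUSH_BYTES = 8      # one callee-saved register push on x86_64
RET_ADDR_BYTES = 8  # pushed by the caller's `call` instruction

def stack_arg_offset(frame_size, push_count):
    """Offset from SP (after the prologue) to the first stack-passed argument."""
    return frame_size + push_count * PUSH_BYTES + RET_ADDR_BYTES

def aligned_frame_size(locals_bytes, push_count):
    """Round locals up to 16 bytes, then add 8 when the push count is even."""
    frame = (locals_bytes + 15) & ~15
    if push_count % 2 == 0:
        frame += 8
    return frame

def bytes_since_caller_alignment(locals_bytes, push_count):
    """Total bytes placed on the stack since the caller's 16-aligned SP."""
    return (RET_ADDR_BYTES
            + push_count * PUSH_BYTES
            + aligned_frame_size(locals_bytes, push_count))
```

Under this model, five pushes reproduce the old constant (`stack_arg_offset(0, 5)` is 48), while six pushes give 56, so a hardcoded `+ 48` reads one 8-byte slot too low once `rbp` joins the pushed set. For any push count, `bytes_since_caller_alignment` is a multiple of 16, which is the parity rule the second item describes.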
Runtime delta (Ryzen 9 7900X)
| bench | v2.8.20 | v2.8.21 | gcc -O2 | krc Δ |
|---|---|---|---|---|
| fib | 442 ms | 427 ms | 78 ms | -3% |
| sort | 153 ms | 108 ms | 270 ms | -30%, 2.5× ahead of gcc -O2 |
| sieve | 3 ms | 3 ms | 2 ms | tied |
| matmul | 34 ms | 33 ms | 4 ms | -3% |
Verified: bootstrap fixed point at 1,176,168 bytes; 439/439 tests pass.
Next on the optimization roadmap
- matmul still 8× behind gcc -O2 — needs loop strength reduction to remove per-iter address recomputation (~10 of ~16 inner-loop insns are `(i*N+j)*8` calculations the compiler should hoist as a running pointer).
- fib still 5× behind — gcc -O2 inlines fib 4-5 levels deep before materialising leaves as real `call`. Recursive inlining at the IR level.
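The strength-reduction item can be illustrated with a small Python sketch. It assumes a row-major `N×N` matmul whose inner loop reads `A[i][k]` and `B[k][j]` with 8-byte elements; the function names are hypothetical and this is not krc's IR. The first function recomputes each byte offset from the indices, the second keeps two running pointers and only adds a constant stride per iteration.

```python
ELEM = 8  # bytes per element, matching the `* 8` in the address expression

def inner_loop_offsets_recomputed(i, j, n):
    """Byte offsets of A[i][k] and B[k][j], recomputed from indices every iteration."""
    return [((i * n + k) * ELEM, (k * n + j) * ELEM) for k in range(n)]

def inner_loop_offsets_running(i, j, n):
    """Same offsets, produced by bumping two running pointers."""
    pa = i * n * ELEM   # start of row i in A
    pb = j * ELEM       # column j of row 0 in B
    out = []
    for _ in range(n):
        out.append((pa, pb))
        pa += ELEM       # next column of A: stride of one element
        pb += n * ELEM   # next row of B: stride of one full row
    return out
```

Both functions yield the same offset sequence; the second replaces the per-iteration multiplies with two additions, which is the transformation the roadmap item calls for.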
Full Changelog: v2.8.20...v2.8.21
v2.8.20
Fix: `return N` from `main` was silently ignored.
The auto-inserted exit syscall at the end of `main` was clobbering the return register (`rax` on x86_64, `x0` on aarch64) with a hardcoded 0 right before the syscall, so:
```kernrift
fn main() -> int32 { return 42 }
```
exited with status 0 instead of 42 on every backend (legacy and IR, both arches). The user-visible symptom was that `while 1 == 1 { if cond { return 42 } }` ignored the return, but it was the same root cause; the loop had nothing to do with it. Existing examples like `hello.kr` were fine because they call `exit(0)` explicitly and never reach the auto-glue.
Fix:
- Auto-exit syscall now forwards the current `rax`/`x0` as the exit code.
- `IR_RET_VOID` zeros `rax`/`x0` before branching to the epilogue, so `fn main() { ... }` (void) still exits with `0`.
- Legacy backends do the same zero at the start of `main`'s body.
- Function-local `return` (in non-`main` functions) is unaffected: callers don't read the return register on void returns.
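The before/after behaviour of the exit glue can be modelled in a few lines of Python. This is only an illustration of the register flow described above, with hypothetical function names; it is not the compiler's code. The `& 0xFF` reflects that a Linux process exit status keeps only the low 8 bits.

```python
def exit_status_old(main_return_value):
    """Old glue: the return register is overwritten with 0 before the exit syscall."""
    regs = {"rax": main_return_value}
    regs["rax"] = 0              # hardcoded zero clobbers the return value
    return regs["rax"] & 0xFF

def exit_status_new(main_return_value=None):
    """New glue: the exit syscall forwards whatever is in the return register."""
    regs = {"rax": 0xDEAD}       # stale contents left by earlier code
    if main_return_value is None:
        regs["rax"] = 0          # void return: register is zeroed before the epilogue
    else:
        regs["rax"] = main_return_value
    return regs["rax"] & 0xFF    # forwarded unchanged as the exit code
```

In this model `exit_status_old(42)` yields 0, while `exit_status_new(42)` yields 42 and a void `main` still yields 0.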
Verified:
- `fn main() -> int32 { return 42 }` → exit 42 ✓
- `while 1 == 1 { ...; return 7 }` → exit 7 ✓
- `while 1 == 1 { println(i); if i == 3 { return 42 } }` → prints `1 2 3`, exit 42 ✓
- `fn main() { ... }` (void) → exit 0 ✓
Bootstrap fixed point at 1,244,072 bytes; 439/439 tests pass.
Full Changelog: v2.8.19...v2.8.20
v2.8.19
Three stacked bugs that prevented `.krbo` fat binaries from running in Termux on Android 14+, plus four signed-arithmetic / mixed-float codegen-correctness fixes.
Termux runner
- `runner.kr` argv-shift detection now matches `argv[0] == argv[1]` (Termux's exec wrapper duplicates the binary path), so `kr --version` no longer parses `--version` as a `.krbo` path and crashes with SIGBUS on garbage.
- New `kr-runner` Make target that concatenates `runner.kr` + `bcj.kr`. Building the runner alone left `filter_aarch64_bcj`/`filter_x86_64_bcj` unresolved and silently corrupted every extracted slice (entry-point bytes clobbered → SIGBUS at startup).
- `packaging/kr.sh` shell wrapper that catches the runner's exit-120 and re-execs `./kr-exec` from the user's shell context. Raw `execve` from the runner's app SELinux domain hits `EACCES` because Termux wraps `execve` via libc `LD_PRELOAD` and our `svc 0` syscall bypasses it; the user's shell context has the wrapper engaged so the re-exec succeeds.
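The argv-shift check in the first item amounts to dropping a duplicated leading path before parsing arguments. A minimal Python sketch of that idea, with a hypothetical function name and example paths (the real logic lives in `runner.kr` and may differ in detail):

```python
def effective_args(argv):
    """Return the user-supplied arguments, skipping a duplicated binary path."""
    if len(argv) >= 2 and argv[0] == argv[1]:
        # The exec wrapper passed the binary path twice: shift by one extra slot.
        return argv[2:]
    return argv[1:]
```

With a duplicated path, `effective_args(["/usr/bin/kr", "/usr/bin/kr", "--version"])` returns `["--version"]`, so the flag is no longer mistaken for a `.krbo` path; a normal invocation such as `["/usr/bin/kr", "program.krbo"]` is unchanged.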
Compiler correctness
- Mixed f32/f64 in `BinOp+Assign`: widen the narrower side to f64 in float arithmetic. `a + 1.0` for `f32 a` no longer silently reads zero (low 32 bits of f64 `1.0`).
- Signed-aware compare: `int64 var < 0` now actually fires. Added a parallel `ir_vreg_signed_buf` byte array; `int8/16/32/64` declarations tag it, BinOp/Assign propagate, ordering compares pick `IR_SCMP_*` over unsigned `IR_CMP_*`. `uint*` stays unsigned.
- Signed `/`, `%`, `>>`: new `IR_SDIV` (132), `IR_SMOD` (133), `IR_SAR` (134) emit `cqo + idiv` / `asr` on x86 and aarch64. Mirrored in the legacy backend with AST-derived signedness (`legacy_node_is_signed`).
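Why these fixes matter can be seen by treating a register as a raw 64-bit pattern. The Python sketch below is a model with hypothetical helper names, not the IR ops themselves: it contrasts unsigned and signed interpretations of the same bits, and shows what the low 32 bits of an f64 `1.0` look like when read as an f32.

```python
import struct

MASK64 = (1 << 64) - 1

def to_signed(bits):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    bits &= MASK64
    return bits - (1 << 64) if bits >> 63 else bits

def ucmp_lt(a, b):
    """Unsigned ordering compare on raw bit patterns."""
    return (a & MASK64) < (b & MASK64)

def scmp_lt(a, b):
    """Signed ordering compare on the same bit patterns."""
    return to_signed(a) < to_signed(b)

def sdiv(a, b):
    """Signed division truncating toward zero, as x86 idiv does."""
    x, y = to_signed(a), to_signed(b)
    q = abs(x) // abs(y)
    return (-q if (x < 0) != (y < 0) else q) & MASK64

def smod(a, b):
    """Signed remainder; the sign follows the dividend."""
    x, y = to_signed(a), to_signed(b)
    return (x - to_signed(sdiv(a, b)) * y) & MASK64

def shr(a, n):
    """Logical right shift: zero-fills from the top."""
    return (a & MASK64) >> n

def sar(a, n):
    """Arithmetic right shift: replicates the sign bit."""
    return (to_signed(a) >> n) & MASK64

def f64_low32_as_f32(x):
    """Read the low 4 bytes of a little-endian f64 as if they were an f32."""
    return struct.unpack("<f", struct.pack("<d", x)[:4])[0]
```

For the pattern of `-1`, `ucmp_lt` against 0 is false while `scmp_lt` is true, which matches the `int64 var < 0` symptom. `f64_low32_as_f32(1.0)` is `0.0`, because the f64 encoding of `1.0` has all-zero low 32 bits.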
Verified on Z Fold 5 / Android 16 / Termux: `kr program.krbo` runs and exits 0. Bootstrap fixed point at 1,240,432 bytes; 439/439 tests pass.
Full Changelog: v2.8.18...v2.8.19