
uint8array#3324

Merged
2kai2kai2 merged 7 commits into canary from kai/bytes
Apr 3, 2026

Conversation

@2kai2kai2
Collaborator

@2kai2kai2 2kai2kai2 commented Apr 2, 2026

See https://beps.boundaryml.com/beps/25

A mutable, heap-allocated core BAML type for storing binary data. The API is inspired by JavaScript's Uint8Array. It is not serializable.

Summary by CodeRabbit

  • New Features
    • Added Uint8Array type for handling binary data with byte string literal syntax (b"...")
    • Introduced comprehensive byte array operations: length(), element access via at(), indexing, mutation methods (push(), pop(), sort()), and array manipulation (concat(), reverse(), slice())
    • Added encoding/decoding support: from_hex()/to_hex() and from_base64()/to_base64() conversions
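As a rough illustration of what a hex round trip over a byte buffer involves, here is a minimal Rust sketch. The function names echo the to_hex()/from_hex() methods listed above, but the bodies are hypothetical and are not taken from the PR; the error type (a plain String) is likewise an assumption.

```rust
// Hypothetical helpers: names mirror the PR's to_hex()/from_hex(),
// but this is an illustration, not the BAML VM implementation.

/// Encode bytes as lowercase hex, two digits per byte.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Map one ASCII hex digit to its value. Callers must pass a hex digit.
fn hex_val(d: u8) -> u8 {
    match d {
        b'0'..=b'9' => d - b'0',
        b'a'..=b'f' => d - b'a' + 10,
        _ => d - b'A' + 10,
    }
}

/// Decode a hex string, rejecting odd lengths and non-hex characters.
fn from_hex(s: &str) -> Result<Vec<u8>, String> {
    if s.len() % 2 != 0 {
        return Err(format!("odd-length hex string ({} bytes)", s.len()));
    }
    if let Some(pos) = s.bytes().position(|b| !b.is_ascii_hexdigit()) {
        return Err(format!("invalid hex digit at offset {pos}"));
    }
    // Every byte is an ASCII hex digit, so pairs can be combined directly.
    Ok(s.as_bytes()
        .chunks(2)
        .map(|pair| (hex_val(pair[0]) << 4) | hex_val(pair[1]))
        .collect())
}
```

Validating the whole input before decoding keeps the decode step infallible, which is one reasonable way to surface a single clear error for malformed input.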

@vercel

vercel bot commented Apr 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
beps | Ready | Preview, Comment | Apr 3, 2026 8:10pm
promptfiddle | Ready | Preview, Comment | Apr 3, 2026 8:10pm

@2kai2kai2 2kai2kai2 enabled auto-merge April 2, 2026 06:10
@coderabbitai
Contributor

coderabbitai bot commented Apr 2, 2026

📝 Walkthrough

This PR introduces comprehensive support for the uint8array primitive type across the BAML language ecosystem. Changes include a new Uint8Array class with byte-array operations, byte-string literal syntax (b"..."), type system extensions, compiler IR support, VM runtime implementation with methods like length, at, push, pop, concat, slice, and conversions (to_hex, from_base64), and external value bridging for host-engine communication.

Changes

Cohort / File(s) Summary
Builtin Definition
baml_language/crates/baml_builtins2/baml_std/baml/uint8array.baml, baml_language/crates/baml_builtins2/src/lib.rs
New Uint8Array class stub with 16 methods (read, mutation, conversion operations) and registration in builtin files list.
AST Type & Expression Definitions
baml_language/crates/baml_compiler2_ast/src/ast.rs, baml_language/crates/baml_compiler2_ast/src/lib.rs
Added TypeExpr::Uint8Array variant and Expr::ByteStringLiteral(Vec<u8>) variant; updated attribute handling and test span-stripping logic.
Parser & Syntax
baml_language/crates/baml_compiler_parser/src/parser.rs, baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs
Added parse_byte_string() method and BYTE_STRING_LITERAL syntax kind; integrated byte-string literal parsing into primary expressions with escape sequence support.
Byte String Lowering & Diagnostics
baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs, baml_language/crates/baml_compiler2_ast/src/lower_type_expr.rs, baml_language/crates/baml_compiler2_ast/src/lowering_diagnostic.rs
Lowering for ByteStringLiteral and Uint8Array type names; new InvalidByteStringEscape diagnostic with escape-sequence parsing logic supporting \n, \t, \r, \0, \\, \", \xHH.
Type System Core
baml_language/crates/baml_compiler2_tir/src/ty.rs, baml_language/crates/baml_codegen_types/src/ty.rs, baml_language/crates/baml_type/src/lib.rs, baml_language/crates/baml_type/src/typetag.rs
Added PrimitiveType::Uint8Array, baml_codegen_types::Ty::Uint8Array, and baml_type::Ty::Uint8Array with namespace, default-value, and validation logic; introduced runtime type tag UINT8ARRAY = 12.
TIR Lowering & Normalization
baml_language/crates/baml_compiler2_tir/src/builder.rs, baml_language/crates/baml_compiler2_tir/src/lower_type_expr.rs, baml_language/crates/baml_compiler2_tir/src/normalize.rs
Type inference for ByteStringLiteral, index-element type resolution for Uint8Array, builtin-member routing, and StructuralTy::Uint8Array normalization.
MIR & Lowering
baml_language/crates/baml_compiler2_mir/src/ir.rs, baml_language/crates/baml_compiler2_mir/src/lower.rs, baml_language/crates/baml_compiler2_mir/src/cleanup.rs, baml_language/crates/baml_compiler2_mir/src/pretty.rs
Added Rvalue::Uint8Array(Vec<u8>) variant; lowering from AstExpr::ByteStringLiteral and index handling for Uint8Array; cleanup match arms; pretty-print formatting as b"<N bytes>".
Codegen Type Mappings
baml_language/crates/baml_builtins2_codegen/src/types.rs, baml_language/crates/baml_builtins2_codegen/src/codegen_io.rs, baml_language/crates/baml_codegen_python/src/ty.rs, baml_language/crates/baml_project/src/client_codegen.rs
Added Uint8Array variant to codegen type enums; mapped to Vec<u8> for Rust, bytes for Python, and generic scalar union support.
Codegen Logic & Builtins
baml_language/crates/baml_builtins2_codegen/src/codegen.rs, baml_language/crates/baml_builtins2_codegen/src/extract.rs
Extended fallibility marking for Uint8Array constructors; receiver extraction, argument conversion, and result allocation for VM builtin methods; treated Uint8Array as dedicated "Object variant" class like Array and Map.
Emit & Pull Semantics
baml_language/crates/baml_compiler2_emit/src/emit.rs, baml_language/crates/baml_compiler2_emit/src/pull_semantics.rs, baml_language/crates/baml_compiler2_emit/src/stack_carry.rs, baml_language/crates/baml_compiler2_emit/src/analysis.rs
Added alloc_uint8array() method to PullSink trait; implemented stack simulation and rvalue analysis for Uint8Array allocation; updated local and projection traversal logic.
HIR & PPIR Support
baml_language/crates/baml_compiler2_hir/src/builder.rs, baml_language/crates/baml_compiler2_ppir/src/ty.rs
Type expression rendering and CannotBeStreamedOrigin::Uint8Array for proper round-tripping of Uint8Array in PPIR.
VM Type System
baml_language/crates/bex_vm_types/src/types.rs
Added Object::Uint8Array(Vec<u8>) and ObjectType::Uint8Array heap object variants with display formatting.
VM Runtime Implementation
baml_language/crates/bex_vm/src/vm.rs, baml_language/crates/bex_vm/src/package_baml/uint8array.rs, baml_language/crates/bex_vm/src/package_baml/root.rs, baml_language/crates/bex_vm/src/package_baml/unstable.rs
Implemented Uint8Array operations (length, at, push, pop, concat, includes, reverse, slice, zeroes, from_array/to_array, from_hex/to_hex, from_base64/to_base64, sort) with error handling; extended VM comparison ops, array-like indexing, and deep-copy/equality; added module registration.
VM Module & Cargo
baml_language/crates/bex_vm/src/package_baml/mod.rs, baml_language/crates/bex_vm/Cargo.toml
Added uint8array module declaration and base64 workspace dependency.
External Values & Conversion
baml_language/crates/bex_external_types/src/bex_external_value.rs, baml_language/crates/bex_engine/src/conversion.rs, baml_language/crates/bex_heap/src/accessor.rs
Added BexExternalValue::Uint8Array(Vec<u8>) variant; extended value-to-type matching and union discrimination; updated owning conversion for both external and heap-backed variants.
Heap Management
baml_language/crates/bex_heap/src/gc.rs, baml_language/crates/bex_heap/src/heap_debugger/real.rs
Treated Uint8Array as primitive with no internal references in GC traversal and fix-up; added invariant-check handling.
Bridging & Protocol
baml_language/crates/bridge_ctypes/src/handle_table.rs, baml_language/crates/bridge_ctypes/src/value_decode.rs, baml_language/crates/bridge_ctypes/src/value_encode.rs, baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_inbound.proto, baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.proto
Added uint8array_value fields (proto bytes type) to inbound/outbound protocol; extended conversion logic between BamlValueVariant::Uint8arrayValue and BexExternalValue::Uint8Array; added BamlFieldTypeUint8Array marker message.
Prompt Rendering
baml_language/crates/sys_llm/src/build_request/mod.rs, baml_language/crates/sys_llm/src/jinja/value_conversion.rs, baml_language/crates/sys_llm/src/types/output_format.rs
Handled Uint8Array as non-JSON-serializable and non-Jinja-passable; marked as unsupported type in output format rendering.
SAP Integration
baml_language/crates/bex_sap/src/sap_model/convert.rs
Marked Uint8Array as non-parsable and non-SAP-parseable; extended field-attribute generation.
Events & Serialization
baml_language/crates/bex_events/src/serialize.rs
Added serialization arm for Uint8Array as JSON object with length metadata.
Formatting & Display
baml_language/crates/baml_fmt/src/ast/expressions.rs, baml_language/crates/baml_fmt/src/ast/tokens.rs, baml_language/crates/baml_lsp2_actions/src/utils.rs, baml_language/crates/tools_onionskin/src/compiler.rs
Added Expression::ByteString variant and ByteString token type with CST parsing and printing; updated type display in LSP and onionskin tools.
Testing
baml_language/crates/baml_tests/projects/byte_string_literals/main.baml, baml_language/crates/baml_tests/tests/byte_strings.rs, baml_language/crates/baml_tests/src/compiler2_tir/mod.rs
Comprehensive test suite covering literal construction, escape sequences, methods (length, at, push, pop, etc.), indexing, equality, and integration scenarios; snapshot test rendering for expressions and types.
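
For readers unfamiliar with byte-string escapes, the escape set named in the "Byte String Lowering & Diagnostics" row (\n, \t, \r, \0, \\, \", \xHH) can be decoded along these lines. This is a hedged sketch: `decode_byte_string` is a hypothetical name, the String error stands in for the PR's InvalidByteStringEscape diagnostic, and rejecting non-ASCII characters is an assumption rather than confirmed behavior.

```rust
/// Decode the body of a byte-string literal (the text between the quotes)
/// into raw bytes. Hypothetical sketch, not the PR's lowering code.
fn decode_byte_string(body: &str) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            // Assumption: unescaped characters must be ASCII.
            if !c.is_ascii() {
                return Err(format!("non-ASCII character {c:?} in byte string"));
            }
            out.push(c as u8);
            continue;
        }
        match chars.next() {
            Some('n') => out.push(b'\n'),
            Some('t') => out.push(b'\t'),
            Some('r') => out.push(b'\r'),
            Some('0') => out.push(0),
            Some('\\') => out.push(b'\\'),
            Some('"') => out.push(b'"'),
            Some('x') => {
                // \xHH: exactly two hex digits.
                let hi = chars.next().and_then(|d| d.to_digit(16));
                let lo = chars.next().and_then(|d| d.to_digit(16));
                match (hi, lo) {
                    (Some(h), Some(l)) => out.push((h * 16 + l) as u8),
                    _ => return Err("\\x must be followed by two hex digits".to_string()),
                }
            }
            Some(other) => return Err(format!("unknown escape \\{other}")),
            None => return Err("trailing backslash".to_string()),
        }
    }
    Ok(out)
}
```

Because \xHH is capped at two hex digits, every escape yields exactly one byte, so the output can never be longer than the source text.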

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • imalsogreg
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Title check | ⚠️ Warning | The title "uint8array" is a single word that names the feature but does not summarize the primary change or indicate what was done with this type. | Use a more descriptive title that conveys the action, such as "Add uint8array type for binary data" or "Implement uint8array with byte string literals and methods".
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 17


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 757dc589-f1f6-4227-aca2-66519e0bf4c2

📥 Commits

Reviewing files that changed from the base of the PR and between 7bb4d74 and 72cb65a.

⛔ Files ignored due to path filters (24)
  • baml_language/Cargo.lock is excluded by !**/*.lock
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____03_hir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_5_mir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_tir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__01_lexer__main.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__02_parser__main.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__03_hir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__04_5_mir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__04_tir.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__05_diagnostics.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__10_formatter__main.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/client_option_types/baml_tests__client_option_types__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/comment_after_string_in_config/baml_tests__comment_after_string_in_config__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/comment_in_type/baml_tests__comment_in_type__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/config_dictionary/baml_tests__config_dictionary__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/config_model_string/baml_tests__config_model_string__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/control_flow/baml_tests__control_flow__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/format_checks/baml_tests__format_checks__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/header_in_llm_function/baml_tests__header_in_llm_function__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/o1_allowed_roles/baml_tests__o1_allowed_roles__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/snapshots/retry_policy/baml_tests__retry_policy__06_codegen.snap is excluded by !**/*.snap
  • baml_language/crates/baml_tests/src/compiler2_tir/snapshots/baml_tests__compiler2_tir__phase5__snapshot_baml_package_items.snap is excluded by !**/*.snap
📒 Files selected for processing (63)
  • baml_language/crates/baml_builtins2/baml_std/baml/uint8array.baml
  • baml_language/crates/baml_builtins2/src/lib.rs
  • baml_language/crates/baml_builtins2_codegen/src/codegen.rs
  • baml_language/crates/baml_builtins2_codegen/src/codegen_io.rs
  • baml_language/crates/baml_builtins2_codegen/src/extract.rs
  • baml_language/crates/baml_builtins2_codegen/src/types.rs
  • baml_language/crates/baml_codegen_python/src/ty.rs
  • baml_language/crates/baml_codegen_types/src/objects.rs
  • baml_language/crates/baml_codegen_types/src/ty.rs
  • baml_language/crates/baml_compiler2_ast/src/ast.rs
  • baml_language/crates/baml_compiler2_ast/src/lib.rs
  • baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs
  • baml_language/crates/baml_compiler2_ast/src/lower_type_expr.rs
  • baml_language/crates/baml_compiler2_ast/src/lowering_diagnostic.rs
  • baml_language/crates/baml_compiler2_emit/src/analysis.rs
  • baml_language/crates/baml_compiler2_emit/src/emit.rs
  • baml_language/crates/baml_compiler2_emit/src/pull_semantics.rs
  • baml_language/crates/baml_compiler2_emit/src/stack_carry.rs
  • baml_language/crates/baml_compiler2_hir/src/builder.rs
  • baml_language/crates/baml_compiler2_mir/src/cleanup.rs
  • baml_language/crates/baml_compiler2_mir/src/ir.rs
  • baml_language/crates/baml_compiler2_mir/src/lower.rs
  • baml_language/crates/baml_compiler2_mir/src/pretty.rs
  • baml_language/crates/baml_compiler2_ppir/src/ty.rs
  • baml_language/crates/baml_compiler2_tir/src/builder.rs
  • baml_language/crates/baml_compiler2_tir/src/lower_type_expr.rs
  • baml_language/crates/baml_compiler2_tir/src/normalize.rs
  • baml_language/crates/baml_compiler2_tir/src/ty.rs
  • baml_language/crates/baml_compiler_diagnostics/src/diagnostic.rs
  • baml_language/crates/baml_compiler_parser/src/parser.rs
  • baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs
  • baml_language/crates/baml_fmt/src/ast/expressions.rs
  • baml_language/crates/baml_fmt/src/ast/tokens.rs
  • baml_language/crates/baml_lsp2_actions/src/utils.rs
  • baml_language/crates/baml_tests/projects/byte_string_literals/main.baml
  • baml_language/crates/baml_tests/src/compiler2_tir/mod.rs
  • baml_language/crates/baml_tests/tests/byte_strings.rs
  • baml_language/crates/baml_type/src/lib.rs
  • baml_language/crates/baml_type/src/typetag.rs
  • baml_language/crates/bex_engine/src/conversion.rs
  • baml_language/crates/bex_events/src/serialize.rs
  • baml_language/crates/bex_external_types/src/bex_external_value.rs
  • baml_language/crates/bex_heap/src/accessor.rs
  • baml_language/crates/bex_heap/src/gc.rs
  • baml_language/crates/bex_heap/src/heap_debugger/real.rs
  • baml_language/crates/bex_sap/src/sap_model/convert.rs
  • baml_language/crates/bex_vm/Cargo.toml
  • baml_language/crates/bex_vm/src/errors.rs
  • baml_language/crates/bex_vm/src/package_baml/mod.rs
  • baml_language/crates/bex_vm/src/package_baml/root.rs
  • baml_language/crates/bex_vm/src/package_baml/uint8array.rs
  • baml_language/crates/bex_vm/src/package_baml/unstable.rs
  • baml_language/crates/bex_vm/src/vm.rs
  • baml_language/crates/bex_vm_types/src/types.rs
  • baml_language/crates/bridge_ctypes/src/handle_table.rs
  • baml_language/crates/bridge_ctypes/src/value_decode.rs
  • baml_language/crates/bridge_ctypes/src/value_encode.rs
  • baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_inbound.proto
  • baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.proto
  • baml_language/crates/sys_llm/src/build_request/mod.rs
  • baml_language/crates/sys_llm/src/jinja/value_conversion.rs
  • baml_language/crates/sys_llm/src/types/output_format.rs
  • baml_language/crates/tools_onionskin/src/compiler.rs

@codspeed-hq

codspeed-hq bot commented Apr 2, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 15 untouched benchmarks
⏩ 98 skipped benchmarks [1]


Comparing kai/bytes (a5b6aae) with canary (57fb046)

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions

github-actions bot commented Apr 2, 2026

Binary size checks passed

7 passed

Artifact | Platform | Gzip | Baseline | Delta | Status
bridge_cffi | Linux | 5.4 MB | 5.7 MB | -239.5 KB (-4.2%) | OK
bridge_cffi-stripped | Linux | 4.0 MB | 4.3 MB | -242.9 KB (-5.7%) | OK
bridge_cffi | macOS | 4.5 MB | 4.6 MB | -145.3 KB (-3.1%) | OK
bridge_cffi-stripped | macOS | 3.3 MB | 3.5 MB | -148.9 KB (-4.3%) | OK
bridge_cffi | Windows | 4.4 MB | 4.6 MB | -175.7 KB (-3.8%) | OK
bridge_cffi-stripped | Windows | 3.4 MB | 3.5 MB | -173.4 KB (-4.9%) | OK
bridge_wasm | WASM | 2.9 MB | 3.0 MB | -29.4 KB (-1.0%) | OK

Generated by cargo size-gate · workflow run

- Implemented as a new core type (similar to `string` but backed by a `Vec<u8>`)
- Can be indexed like an `int[]`, but will panic if a value outside the `u8` range is assigned
- TODO: more builtin methods for `uint8array`
- `uint8array` is not serializable
- TODO: byte string literals
- `b"hello"` creates a `uint8array`
- Added testing for `uint8array` and byte string literals
- Byte string literals are properly handled by the formatter
- Properly emit diagnostics for invalid escape sequences in byte strings
- Added `uint8array` to a few match statements where it was missing (and removed the trailing wildcard arm from those matches)
- Builtin methods are based mostly on parts of the JavaScript API
- Include BYTE_STRING_LITERAL in expression node whitelist
- Allow byte string literals in config blocks
- Include `uint8array` in the 'any scalar' union in FFI
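
The `u8` range rule for indexed assignment can be illustrated with a small Rust sketch. The commit note above describes a panic on out-of-range values; this sketch deliberately swaps that for a `Result` so the failure is visible to the caller. `store_byte` is a hypothetical helper, not a function from the PR.

```rust
use std::convert::TryFrom;

/// Store an integer into a byte buffer, rejecting values outside 0..=255
/// and indices past the end. Hypothetical helper for illustration only.
fn store_byte(buf: &mut [u8], index: usize, value: i64) -> Result<(), String> {
    // Narrow first: u8::try_from fails for negatives and for values > 255.
    let byte = u8::try_from(value)
        .map_err(|_| format!("value {value} does not fit in a u8 (0..=255)"))?;
    let len = buf.len();
    let slot = buf
        .get_mut(index)
        .ok_or_else(|| format!("index {index} out of bounds for length {len}"))?;
    *slot = byte;
    Ok(())
}
```

A failed call leaves the buffer untouched, since both checks run before the single write.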
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
baml_language/crates/baml_compiler2_mir/src/lower.rs (1)

3163-3167: ⚠️ Potential issue | 🟠 Major

uint8array index-kind update is incomplete for optional-index lvalues.

Line 3163 was updated, but Line 3279 still checks only Ty::List(..).
OptionalIndex assignment paths on uint8array? can still be emitted with IndexKind::Map, which is incorrect.

🔧 Proposed fix
-                let kind = if matches!(unwrapped_ty, Ty::List(..)) {
+                let kind = if matches!(unwrapped_ty, Ty::List(..) | Ty::Uint8Array { .. }) {
                     IndexKind::Array
                 } else {
                     IndexKind::Map
                 };
baml_language/crates/bex_vm/src/vm.rs (1)

2336-2369: ⚠️ Potential issue | 🟠 Major

Validate uint8array writes before updating watch state.

update_watched_node() runs before the Object::Uint8Array branch proves the new element is an in-range Value::Int. A rejected store still mutates the watch graph and last_assigned, so watched values can drift from the actual byte array after an error. Please normalize/validate the byte first, then update the watch topology with the value that will actually be stored.
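
The ordering being asked for (validate, then record, then write) can be sketched in isolation. Everything here is hypothetical: `WatchedBytes` and its `last_assigned` field are toy stand-ins for the VM's watch graph, not types from bex_vm.

```rust
use std::convert::TryFrom;

/// Toy model: a byte buffer plus a record of the last successful store.
/// `last_assigned` stands in for the real watch state (an assumption).
struct WatchedBytes {
    bytes: Vec<u8>,
    last_assigned: Option<(usize, u8)>,
}

impl WatchedBytes {
    fn store(&mut self, index: usize, value: i64) -> Result<(), String> {
        // 1. Run every check that can fail before touching any state.
        let byte = u8::try_from(value)
            .map_err(|_| format!("value {value} is outside the u8 range"))?;
        if index >= self.bytes.len() {
            return Err(format!("index {index} out of bounds"));
        }
        // 2. Update the watch record with the byte that is actually stored.
        self.last_assigned = Some((index, byte));
        // 3. Perform the write, which can no longer fail.
        self.bytes[index] = byte;
        Ok(())
    }
}
```

With this ordering, a rejected store returns early and neither the buffer nor the watch record changes.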

baml_language/crates/baml_builtins2_codegen/src/codegen.rs (1)

1142-1151: ⚠️ Potential issue | 🟡 Minor

Add Uint8Array case with as_deref() in Optional handling.

When call_arg_needs_ref() returns true for Uint8Array, the default branch emits {name}.as_ref(), which produces Option<&Vec<u8>>. The clean signature for Optional<Uint8Array> is Option<&[u8]>, and Rust does not implicitly coerce Option<&Vec<u8>> to Option<&[u8]>. The fix is to add an explicit Uint8Array case matching the existing pattern for String and List:

Minimal fix
         BamlType::Optional(inner) => match inner.as_ref() {
             BamlType::String => format!("{name}.as_deref()"),
             BamlType::List(_) => format!("{name}.as_deref()"),
+            BamlType::Uint8Array => format!("{name}.as_deref()"),
             _ => {
                 if call_arg_needs_ref(inner) {
                     format!("{name}.as_ref()")

Also applies to: 1164-1173
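
The coercion gap described here is ordinary Rust behavior and can be reproduced outside the codegen. In the sketch below, `byte_len` and `demo` are hypothetical names standing in for a generated "clean signature" function and its call site.

```rust
/// Stand-in for a generated function whose clean signature takes Option<&[u8]>.
fn byte_len(data: Option<&[u8]>) -> usize {
    data.map_or(0, |d| d.len())
}

/// Stand-in for the generated call site holding an owned Option<Vec<u8>>.
fn demo(arg: Option<Vec<u8>>) -> usize {
    // byte_len(arg.as_ref()) would not compile: as_ref() yields
    // Option<&Vec<u8>>, and there is no implicit coercion to Option<&[u8]>.
    // as_deref() goes through Deref, so Vec<u8> becomes [u8] inside the Option.
    byte_len(arg.as_deref())
}
```

The same pattern already applies to `Option<String>` (giving `Option<&str>`), which is why the fix mirrors the existing String and List arms.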

♻️ Duplicate comments (3)
baml_language/crates/baml_compiler_parser/src/parser.rs (1)

4552-4563: ⚠️ Potential issue | 🟠 Major

Byte strings are still broken in config arrays.

Line 4552 correctly classifies b"..." as an expression start, but Line 4626 in parse_config_array_element still consumes any Word as a bare identifier first. For [b"abc"], only b is consumed and the quote remains, so array config parsing still fails for byte literals.

🩹 Suggested fix
 fn parse_config_array_element(&mut self) {
     if self.at(TokenKind::LBrace) {
         // Parse as config block (config-style: no colons required)
         self.parse_config_block();
     } else if self.at(TokenKind::RBracket) {
         // Empty or trailing - don't consume
+    } else if self.looks_like_config_expression() {
+        // Expression-like entries (including b"...", env.*, booleans, numbers, strings)
+        self.parse_config_value();
     } else if self.at(TokenKind::Word) {
         // Simple identifier (e.g., client names in strategy arrays)
         self.with_node(SyntaxKind::CONFIG_VALUE, |p| {
             p.bump();
         });
     } else {
         // Parse as simple value (string, number, etc.)
         self.parse_config_value();
     }
 }
baml_language/crates/baml_compiler2_mir/src/pretty.rs (1)

322-322: ⚠️ Potential issue | 🟡 Minor

Length-only byte rendering obscures actual constants.

b"\x00" and b"\xff" both print as the same length-only string, which makes MIR debugging less reliable.

🔧 Suggested tweak
-        Rvalue::Uint8Array(bytes) => write!(f, "b\"<{} bytes>\"", bytes.len()),
+        Rvalue::Uint8Array(bytes) => {
+            write!(f, "b\"")?;
+            for byte in bytes.iter().take(16) {
+                write!(f, "\\x{byte:02x}")?;
+            }
+            if bytes.len() > 16 {
+                write!(f, "...")?;
+            }
+            write!(f, "\" ({} bytes)", bytes.len())
+        }
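
The same rendering can be tried as a free function that returns a String, which makes the effect of the tweak easy to check. `preview_bytes` is a hypothetical name, and the 16-byte cap is simply carried over from the suggestion above.

```rust
use std::fmt::Write;

/// Render a byte slice as an escaped preview plus its total length.
/// Hypothetical standalone version of the suggested Rvalue formatting.
fn preview_bytes(bytes: &[u8]) -> String {
    const MAX_SHOWN: usize = 16;
    let mut out = String::from("b\"");
    for byte in bytes.iter().take(MAX_SHOWN) {
        // Writing into a String cannot fail, so the result is ignored.
        let _ = write!(out, "\\x{byte:02x}");
    }
    if bytes.len() > MAX_SHOWN {
        out.push_str("...");
    }
    let _ = write!(out, "\" ({} bytes)", bytes.len());
    out
}
```

With this form, b"\x00" and b"\xff" render differently, while long constants stay bounded in the MIR dump.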
baml_language/crates/baml_builtins2_codegen/src/extract.rs (1)

274-275: ⚠️ Potential issue | 🟡 Minor

Add the missing regression assertion for Uint8Array class exclusion.
extract_class_fields now excludes Uint8Array, but test_extract_class_fields still doesn’t assert that behavior.

Suggested test patch
         assert!(
             !class_defs.iter().any(|c| c.name == "String"),
             "String should be excluded"
         );
+        assert!(
+            !class_defs.iter().any(|c| c.name == "Uint8Array"),
+            "Uint8Array should be excluded"
+        );
As per coding guidelines `**/*.rs`: Prefer writing Rust unit tests over integration tests where possible.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 74046f1d-2ac4-438c-b57a-82246517abb9

📥 Commits

Reviewing files that changed from the base of the PR and between 72cb65a and a5b6aae.

📒 Files selected for processing (63)
  • baml_language/crates/baml_builtins2/baml_std/baml/uint8array.baml
  • baml_language/crates/baml_builtins2/src/lib.rs
  • baml_language/crates/baml_builtins2_codegen/src/codegen.rs
  • baml_language/crates/baml_builtins2_codegen/src/codegen_io.rs
  • baml_language/crates/baml_builtins2_codegen/src/extract.rs
  • baml_language/crates/baml_builtins2_codegen/src/types.rs
  • baml_language/crates/baml_codegen_python/src/ty.rs
  • baml_language/crates/baml_codegen_types/src/objects.rs
  • baml_language/crates/baml_codegen_types/src/ty.rs
  • baml_language/crates/baml_compiler2_ast/src/ast.rs
  • baml_language/crates/baml_compiler2_ast/src/lib.rs
  • baml_language/crates/baml_compiler2_ast/src/lower_expr_body.rs
  • baml_language/crates/baml_compiler2_ast/src/lower_type_expr.rs
  • baml_language/crates/baml_compiler2_ast/src/lowering_diagnostic.rs
  • baml_language/crates/baml_compiler2_emit/src/analysis.rs
  • baml_language/crates/baml_compiler2_emit/src/emit.rs
  • baml_language/crates/baml_compiler2_emit/src/pull_semantics.rs
  • baml_language/crates/baml_compiler2_emit/src/stack_carry.rs
  • baml_language/crates/baml_compiler2_hir/src/builder.rs
  • baml_language/crates/baml_compiler2_mir/src/cleanup.rs
  • baml_language/crates/baml_compiler2_mir/src/ir.rs
  • baml_language/crates/baml_compiler2_mir/src/lower.rs
  • baml_language/crates/baml_compiler2_mir/src/pretty.rs
  • baml_language/crates/baml_compiler2_ppir/src/ty.rs
  • baml_language/crates/baml_compiler2_tir/src/builder.rs
  • baml_language/crates/baml_compiler2_tir/src/lower_type_expr.rs
  • baml_language/crates/baml_compiler2_tir/src/normalize.rs
  • baml_language/crates/baml_compiler2_tir/src/ty.rs
  • baml_language/crates/baml_compiler_diagnostics/src/diagnostic.rs
  • baml_language/crates/baml_compiler_parser/src/parser.rs
  • baml_language/crates/baml_compiler_syntax/src/syntax_kind.rs
  • baml_language/crates/baml_fmt/src/ast/expressions.rs
  • baml_language/crates/baml_fmt/src/ast/tokens.rs
  • baml_language/crates/baml_lsp2_actions/src/utils.rs
  • baml_language/crates/baml_project/src/client_codegen.rs
  • baml_language/crates/baml_tests/projects/byte_string_literals/main.baml
  • baml_language/crates/baml_tests/src/compiler2_tir/mod.rs
  • baml_language/crates/baml_tests/tests/byte_strings.rs
  • baml_language/crates/baml_type/src/lib.rs
  • baml_language/crates/baml_type/src/typetag.rs
  • baml_language/crates/bex_engine/src/conversion.rs
  • baml_language/crates/bex_events/src/serialize.rs
  • baml_language/crates/bex_external_types/src/bex_external_value.rs
  • baml_language/crates/bex_heap/src/accessor.rs
  • baml_language/crates/bex_heap/src/gc.rs
  • baml_language/crates/bex_heap/src/heap_debugger/real.rs
  • baml_language/crates/bex_sap/src/sap_model/convert.rs
  • baml_language/crates/bex_vm/Cargo.toml
  • baml_language/crates/bex_vm/src/package_baml/mod.rs
  • baml_language/crates/bex_vm/src/package_baml/root.rs
  • baml_language/crates/bex_vm/src/package_baml/uint8array.rs
  • baml_language/crates/bex_vm/src/package_baml/unstable.rs
  • baml_language/crates/bex_vm/src/vm.rs
  • baml_language/crates/bex_vm_types/src/types.rs
  • baml_language/crates/bridge_ctypes/src/handle_table.rs
  • baml_language/crates/bridge_ctypes/src/value_decode.rs
  • baml_language/crates/bridge_ctypes/src/value_encode.rs
  • baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_inbound.proto
  • baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.proto
  • baml_language/crates/sys_llm/src/build_request/mod.rs
  • baml_language/crates/sys_llm/src/jinja/value_conversion.rs
  • baml_language/crates/sys_llm/src/types/output_format.rs
  • baml_language/crates/tools_onionskin/src/compiler.rs

@2kai2kai2 2kai2kai2 added this pull request to the merge queue Apr 3, 2026
Merged via the queue into canary with commit a260a16 Apr 3, 2026
43 of 44 checks passed
@2kai2kai2 2kai2kai2 deleted the kai/bytes branch April 3, 2026 22:34