Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughThis PR introduces comprehensive support for the Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 17
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 757dc589-f1f6-4227-aca2-66519e0bf4c2
⛔ Files ignored due to path filters (24)
baml_language/Cargo.lockis excluded by!**/*.lockbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__02_parser__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__05_diagnostics.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__10_formatter__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/client_option_types/baml_tests__client_option_types__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/comment_after_string_in_config/baml_tests__comment_after_string_in_config__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/comment_in_type/baml_tests__comment_in_type__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/config_dictionary/baml_tests__config_dictionary__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/config_model_string/baml_tests__config_model_string__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/control_flow/baml_tests__control_flow__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/format_checks/baml_tests__format_checks__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/header_in_llm_function/baml_tests__header_in_llm_function__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/o1_allowed_roles/baml_tests__o1_allowed_roles__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/retry_policy/baml_tests__retry_policy__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/src/compiler2_tir/snapshots/baml_tests__compiler2_tir__phase5__snapshot_baml_package_items.snapis excluded by!**/*.snap
📒 Files selected for processing (63)
baml_language/crates/baml_builtins2/baml_std/baml/uint8array.bamlbaml_language/crates/baml_builtins2/src/lib.rsbaml_language/crates/baml_builtins2_codegen/src/codegen.rsbaml_language/crates/baml_builtins2_codegen/src/codegen_io.rsbaml_language/crates/baml_builtins2_codegen/src/extract.rsbaml_language/crates/baml_builtins2_codegen/src/types.rsbaml_language/crates/baml_codegen_python/src/ty.rsbaml_language/crates/baml_codegen_types/src/objects.rsbaml_language/crates/baml_codegen_types/src/ty.rsbaml_language/crates/baml_compiler2_ast/src/ast.rsbaml_language/crates/baml_compiler2_ast/src/lib.rsbaml_language/crates/baml_compiler2_ast/src/lower_expr_body.rsbaml_language/crates/baml_compiler2_ast/src/lower_type_expr.rsbaml_language/crates/baml_compiler2_ast/src/lowering_diagnostic.rsbaml_language/crates/baml_compiler2_emit/src/analysis.rsbaml_language/crates/baml_compiler2_emit/src/emit.rsbaml_language/crates/baml_compiler2_emit/src/pull_semantics.rsbaml_language/crates/baml_compiler2_emit/src/stack_carry.rsbaml_language/crates/baml_compiler2_hir/src/builder.rsbaml_language/crates/baml_compiler2_mir/src/cleanup.rsbaml_language/crates/baml_compiler2_mir/src/ir.rsbaml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_compiler2_mir/src/pretty.rsbaml_language/crates/baml_compiler2_ppir/src/ty.rsbaml_language/crates/baml_compiler2_tir/src/builder.rsbaml_language/crates/baml_compiler2_tir/src/lower_type_expr.rsbaml_language/crates/baml_compiler2_tir/src/normalize.rsbaml_language/crates/baml_compiler2_tir/src/ty.rsbaml_language/crates/baml_compiler_diagnostics/src/diagnostic.rsbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_compiler_syntax/src/syntax_kind.rsbaml_language/crates/baml_fmt/src/ast/expressions.rsbaml_language/crates/baml_fmt/src/ast/tokens.rsbaml_language/crates/baml_lsp2_actions/src/utils.rsbaml_language/crates/baml_tests/projects/byte_string_literals/main.bamlbaml_language/crates/baml_tests/src/compiler2_tir/mod.rsbaml_language/crates/baml_tests/tests/byte_strings.rsbaml_language/crates/baml_type/src/lib.rsbaml_language/crates/baml_type/src/typetag.rsbaml_language/crates/bex_engine/src/conversion.rsbaml_language/crates/bex_events/src/serialize.rsbaml_language/crates/bex_external_types/src/bex_external_value.rsbaml_language/crates/bex_heap/src/accessor.rsbaml_language/crates/bex_heap/src/gc.rsbaml_language/crates/bex_heap/src/heap_debugger/real.rsbaml_language/crates/bex_sap/src/sap_model/convert.rsbaml_language/crates/bex_vm/Cargo.tomlbaml_language/crates/bex_vm/src/errors.rsbaml_language/crates/bex_vm/src/package_baml/mod.rsbaml_language/crates/bex_vm/src/package_baml/root.rsbaml_language/crates/bex_vm/src/package_baml/uint8array.rsbaml_language/crates/bex_vm/src/package_baml/unstable.rsbaml_language/crates/bex_vm/src/vm.rsbaml_language/crates/bex_vm_types/src/types.rsbaml_language/crates/bridge_ctypes/src/handle_table.rsbaml_language/crates/bridge_ctypes/src/value_decode.rsbaml_language/crates/bridge_ctypes/src/value_encode.rsbaml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_inbound.protobaml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.protobaml_language/crates/sys_llm/src/build_request/mod.rsbaml_language/crates/sys_llm/src/jinja/value_conversion.rsbaml_language/crates/sys_llm/src/types/output_format.rsbaml_language/crates/tools_onionskin/src/compiler.rs
Merging this PR will not alter performance
|
Binary size checks passed✅ 7 passed
Generated by |
- Implemented as a new core type (similar to `string` but backed by a `Vec<u8>`) - Can be indexed like an `int[]` but will panic if setting a value outside `u8` range - TODO: more builtin methods for `uint8array` - `uint8array` is not serializable - TODO: byte string literals
- `b"hello"` creates a `uint8array` - Added testing for `uint8array` and byte string literals
It is properly handled by the formatter
- Properly emit diagnostics for invalid escape sequences in the byte string. - Added to a few match statements where it was missing (also made the matches no longer just have a wildcard at the end)
based on parts of the javascript api mostly
- Include BYTE_STRING_LITERAL in expression node whitelist - Allow byte string literals in config blocks - Include `uint8array` in the 'any scalar' union in ffi
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
baml_language/crates/baml_compiler2_mir/src/lower.rs (1)
3163-3167:⚠️ Potential issue | 🟠 Major
uint8arrayindex-kind update is incomplete for optional-index lvalues.Line 3163 was updated, but Line 3279 still checks only
Ty::List(..).
OptionalIndexassignment paths onuint8array?can still be emitted withIndexKind::Map, which is incorrect.🔧 Proposed fix
- let kind = if matches!(unwrapped_ty, Ty::List(..)) { + let kind = if matches!(unwrapped_ty, Ty::List(..) | Ty::Uint8Array { .. }) { IndexKind::Array } else { IndexKind::Map };baml_language/crates/bex_vm/src/vm.rs (1)
2336-2369:⚠️ Potential issue | 🟠 MajorValidate
uint8arraywrites before updating watch state.
update_watched_node()runs before theObject::Uint8Arraybranch proves the new element is an in-rangeValue::Int. A rejected store still mutates the watch graph andlast_assigned, so watched values can drift from the actual byte array after an error. Please normalize/validate the byte first, then update the watch topology with the value that will actually be stored.baml_language/crates/baml_builtins2_codegen/src/codegen.rs (1)
1142-1151:⚠️ Potential issue | 🟡 MinorAdd
Uint8Arraycase withas_deref()in Optional handling.When
call_arg_needs_ref()returns true forUint8Array, the default branch emits{name}.as_ref(), which producesOption<&Vec<u8>>. The clean signature forOptional<Uint8Array>isOption<&[u8]>, and Rust does not implicitly coerceOption<&Vec<u8>>toOption<&[u8]>. The fix is to add an explicitUint8Arraycase matching the existing pattern forStringandList:Minimal fix
BamlType::Optional(inner) => match inner.as_ref() { BamlType::String => format!("{name}.as_deref()"), BamlType::List(_) => format!("{name}.as_deref()"), + BamlType::Uint8Array => format!("{name}.as_deref()"), _ => { if call_arg_needs_ref(inner) { format!("{name}.as_ref()")Also applies to: 1164-1173
♻️ Duplicate comments (3)
baml_language/crates/baml_compiler_parser/src/parser.rs (1)
4552-4563:⚠️ Potential issue | 🟠 MajorByte strings are still broken in config arrays.
Line 4552 correctly classifies
b"..."as an expression start, but Line 4626 inparse_config_array_elementstill consumes anyWordas a bare identifier first. For[b"abc"], onlybis consumed and the quote remains, so array config parsing still fails for byte literals.🩹 Suggested fix
fn parse_config_array_element(&mut self) { if self.at(TokenKind::LBrace) { // Parse as config block (config-style: no colons required) self.parse_config_block(); } else if self.at(TokenKind::RBracket) { // Empty or trailing - don't consume + } else if self.looks_like_config_expression() { + // Expression-like entries (including b"...", env.*, booleans, numbers, strings) + self.parse_config_value(); } else if self.at(TokenKind::Word) { // Simple identifier (e.g., client names in strategy arrays) self.with_node(SyntaxKind::CONFIG_VALUE, |p| { p.bump(); }); } else { // Parse as simple value (string, number, etc.) self.parse_config_value(); } }baml_language/crates/baml_compiler2_mir/src/pretty.rs (1)
322-322:⚠️ Potential issue | 🟡 MinorLength-only byte rendering obscures actual constants.
b"\x00"andb"\xff"both print as the same length-only string, which makes MIR debugging less reliable.🔧 Suggested tweak
- Rvalue::Uint8Array(bytes) => write!(f, "b\"<{} bytes>\"", bytes.len()), + Rvalue::Uint8Array(bytes) => { + write!(f, "b\"")?; + for byte in bytes.iter().take(16) { + write!(f, "\\x{byte:02x}")?; + } + if bytes.len() > 16 { + write!(f, "...")?; + } + write!(f, "\" ({} bytes)", bytes.len()) + }baml_language/crates/baml_builtins2_codegen/src/extract.rs (1)
274-275:⚠️ Potential issue | 🟡 MinorAdd the missing regression assertion for
Uint8Arrayclass exclusion.
extract_class_fieldsnow excludesUint8Array, buttest_extract_class_fieldsstill doesn’t assert that behavior.As per coding guidelines `**/*.rs`: Prefer writing Rust unit tests over integration tests where possible.Suggested test patch
assert!( !class_defs.iter().any(|c| c.name == "String"), "String should be excluded" ); + assert!( + !class_defs.iter().any(|c| c.name == "Uint8Array"), + "Uint8Array should be excluded" + );
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 74046f1d-2ac4-438c-b57a-82246517abb9
⛔ Files ignored due to path filters (24)
baml_language/Cargo.lockis excluded by!**/*.lockbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__01_lexer__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__02_parser__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__03_hir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__04_tir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__05_diagnostics.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/byte_string_literals/baml_tests__byte_string_literals__10_formatter__main.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/client_option_types/baml_tests__client_option_types__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/comment_after_string_in_config/baml_tests__comment_after_string_in_config__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/comment_in_type/baml_tests__comment_in_type__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/config_dictionary/baml_tests__config_dictionary__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/config_model_string/baml_tests__config_model_string__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/control_flow/baml_tests__control_flow__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/format_checks/baml_tests__format_checks__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/header_in_llm_function/baml_tests__header_in_llm_function__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/o1_allowed_roles/baml_tests__o1_allowed_roles__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/retry_policy/baml_tests__retry_policy__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/src/compiler2_tir/snapshots/baml_tests__compiler2_tir__phase5__snapshot_baml_package_items.snapis excluded by!**/*.snap
📒 Files selected for processing (63)
baml_language/crates/baml_builtins2/baml_std/baml/uint8array.bamlbaml_language/crates/baml_builtins2/src/lib.rsbaml_language/crates/baml_builtins2_codegen/src/codegen.rsbaml_language/crates/baml_builtins2_codegen/src/codegen_io.rsbaml_language/crates/baml_builtins2_codegen/src/extract.rsbaml_language/crates/baml_builtins2_codegen/src/types.rsbaml_language/crates/baml_codegen_python/src/ty.rsbaml_language/crates/baml_codegen_types/src/objects.rsbaml_language/crates/baml_codegen_types/src/ty.rsbaml_language/crates/baml_compiler2_ast/src/ast.rsbaml_language/crates/baml_compiler2_ast/src/lib.rsbaml_language/crates/baml_compiler2_ast/src/lower_expr_body.rsbaml_language/crates/baml_compiler2_ast/src/lower_type_expr.rsbaml_language/crates/baml_compiler2_ast/src/lowering_diagnostic.rsbaml_language/crates/baml_compiler2_emit/src/analysis.rsbaml_language/crates/baml_compiler2_emit/src/emit.rsbaml_language/crates/baml_compiler2_emit/src/pull_semantics.rsbaml_language/crates/baml_compiler2_emit/src/stack_carry.rsbaml_language/crates/baml_compiler2_hir/src/builder.rsbaml_language/crates/baml_compiler2_mir/src/cleanup.rsbaml_language/crates/baml_compiler2_mir/src/ir.rsbaml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_compiler2_mir/src/pretty.rsbaml_language/crates/baml_compiler2_ppir/src/ty.rsbaml_language/crates/baml_compiler2_tir/src/builder.rsbaml_language/crates/baml_compiler2_tir/src/lower_type_expr.rsbaml_language/crates/baml_compiler2_tir/src/normalize.rsbaml_language/crates/baml_compiler2_tir/src/ty.rsbaml_language/crates/baml_compiler_diagnostics/src/diagnostic.rsbaml_language/crates/baml_compiler_parser/src/parser.rsbaml_language/crates/baml_compiler_syntax/src/syntax_kind.rsbaml_language/crates/baml_fmt/src/ast/expressions.rsbaml_language/crates/baml_fmt/src/ast/tokens.rsbaml_language/crates/baml_lsp2_actions/src/utils.rsbaml_language/crates/baml_project/src/client_codegen.rsbaml_language/crates/baml_tests/projects/byte_string_literals/main.bamlbaml_language/crates/baml_tests/src/compiler2_tir/mod.rsbaml_language/crates/baml_tests/tests/byte_strings.rsbaml_language/crates/baml_type/src/lib.rsbaml_language/crates/baml_type/src/typetag.rsbaml_language/crates/bex_engine/src/conversion.rsbaml_language/crates/bex_events/src/serialize.rsbaml_language/crates/bex_external_types/src/bex_external_value.rsbaml_language/crates/bex_heap/src/accessor.rsbaml_language/crates/bex_heap/src/gc.rsbaml_language/crates/bex_heap/src/heap_debugger/real.rsbaml_language/crates/bex_sap/src/sap_model/convert.rsbaml_language/crates/bex_vm/Cargo.tomlbaml_language/crates/bex_vm/src/package_baml/mod.rsbaml_language/crates/bex_vm/src/package_baml/root.rsbaml_language/crates/bex_vm/src/package_baml/uint8array.rsbaml_language/crates/bex_vm/src/package_baml/unstable.rsbaml_language/crates/bex_vm/src/vm.rsbaml_language/crates/bex_vm_types/src/types.rsbaml_language/crates/bridge_ctypes/src/handle_table.rsbaml_language/crates/bridge_ctypes/src/value_decode.rsbaml_language/crates/bridge_ctypes/src/value_encode.rsbaml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_inbound.protobaml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.protobaml_language/crates/sys_llm/src/build_request/mod.rsbaml_language/crates/sys_llm/src/jinja/value_conversion.rsbaml_language/crates/sys_llm/src/types/output_format.rsbaml_language/crates/tools_onionskin/src/compiler.rs
See https://beps.boundaryml.com/beps/25
A mutable heap-allocated core BAML type for storing binary data. The API is inspired by javascript's
UInt8Array. It is not serializable.Summary by CodeRabbit
Uint8Arraytype for handling binary data with byte string literal syntax (b"...")length(), element access viaat(), indexing, mutation methods (push(),pop(),sort()), and array manipulation (concat(),reverse(),slice())from_hex()/to_hex()andfrom_base64()/to_base64()conversions