Conversation
| if (static_cast<unsigned char>(bytes[i]) != 0) | ||
| return std::nullopt; | ||
| } | ||
| } |
There was a problem hiding this comment.
Big-endian decode logic truncates wrong bytes for overflow
High Severity
The decodeBigEndianI64 function incorrectly handles byte arrays longer than 8 bytes. It checks if trailing bytes (indices 8+) are zero and then uses the leading 8 bytes. In big-endian, trailing bytes are the least significant, so this logic rejects valid values that fit in 64 bits (like 256 in a 10-byte array) while accepting values that overflow (like 2^72 truncated to its high bytes). The check for overflow needs to verify the leading excess bytes are zero (or proper sign extension), not the trailing ones.
src/model/model.cpp
Outdated
| strings_(std::move(strings)) | ||
| { | ||
| columns_.stringData_.reserve(detail::ColumnPageSize*4); | ||
| columns_.byteArrayData_.reserve(detail::ColumnPageSize*4); |
There was a problem hiding this comment.
Reuse stringData column instead.
src/model/model.cpp
Outdated
| clear_and_shrink(columns.strings_); | ||
| clear_and_shrink(columns.stringData_); | ||
| clear_and_shrink(columns.byteArrays_); | ||
| clear_and_shrink(columns.byteArrayData_); |
| auto idx = n.addr().index(); | ||
| if (auto err = checkBounds(impl_->columns_.byteArrays_)) | ||
| return tl::unexpected<Error>(*err); | ||
| auto& val = impl_->columns_.byteArrays_[idx]; |
There was a problem hiding this comment.
Access stringData column
src/model/model.cpp
Outdated
| ModelNode::Ptr ModelPool::newValue(simfil::ByteArray const& value) | ||
| { | ||
| impl_->columns_.byteArrays_.emplace_back(Impl::StringRange{ | ||
| (uint32_t)impl_->columns_.byteArrayData_.size(), |
There was a problem hiding this comment.
Also stringData column
| return l == r.toDisplayString(); | ||
| } | ||
|
|
||
| auto operator()(const ByteArray& l, int64_t r) const |
There was a problem hiding this comment.
I do not think this should be allowed:
- We do not allow comparing strings to numbers without casting
toDisplayString()changes format depending on content and should not be used for comparison at all
(Same for double.)
| return l < r.toDisplayString(); | ||
| } | ||
|
|
||
| auto operator()(const ByteArray& l, int64_t r) const |
There was a problem hiding this comment.
See previous comment about ByteArray to number conversion.
include/simfil/operator.h
Outdated
|
|
||
| auto operator()(const ByteArray& v) const -> std::string | ||
| { | ||
| return v.toDisplayString(); |
There was a problem hiding this comment.
Imo. this should always use toHex().
include/simfil/operator.h
Outdated
|
|
||
| auto operator()(const ByteArray&) const -> std::string_view | ||
| { | ||
| static auto n = "string"sv; |
There was a problem hiding this comment.
Why is the type "string"? Since ByteArray acts differently (hex format, ...)
it should get its own type name or be handled exactly like string.
|
Do we really need to make the |
The relevant issue: Could you take a look? Maybe you've got a great idea @johannes-wolf |
Can we use the native simfil int & string values instead? Livesource could do the conversion to either an |
|
TODO:
|
…>. Add a possibility to declare a byte array through the language. Fix consistency with cross-type operator logic. Add JSON conversion as tagged object (similar to _multimap). Remove toDisplayString().
…g. Fix ByteArray display format not round-trippable with literal syntax (parse b string literal as hex bytes).
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
PR comments are addressed :)
|
|





Note
Medium Risk
Touches core parsing/tokenization, value/operator dispatch, and model serialization/JSON encoding, so regressions could affect query evaluation and persistence. Changes are well-scoped and covered by new unit tests.
Overview
Adds first-class
bytessupport to simfil end-to-end: a newByteArrayscalar type,b"..."/b'...'literals (including\xNNescapes),typeof/#support,as bytescasting, and comparison semantics against other bytes and numeric values.Extends the model layer to store bytes in
ModelPool(new column + serialization) and introduces a tagged JSON representation ({"_bytes":true,"hex":...}) with round-trip parsing/printing; also refactors typed node resolution to an ADL-basedModel::resolve<T>API (replacing the oldresolveObject/resolveArrayhelpers). Documentation and tests are updated to cover the new bytes behavior.Written by Cursor Bugbot for commit eb25799. This will update automatically on new commits. Configure here.