forked from tarantool/tarantool
Commit
Implementation notes:

- The varbinary type is implemented as VLS (variable-length struct) cdata, so we can't use the existing `luaL_pushcdata` and `luaL_checkcdata` helpers for pushing an object of this type to the Lua stack. Instead, we copied the implementation from the LuaJIT internals.
- We already have the code handling `MP_BIN` fields in all built-in serializers. We just need to patch it to convert the data to/from a varbinary object instead of a plain string.
- We updated the tuple.tostring method to set the NOWRAP base64 encoder flag when dumping binary blobs. The flag was apparently omitted by mistake because we mask all other newline characters while converting a tuple to a string.
- The box/varbinary_type test was rewritten using the luatest framework, with all the FFI code needed to insert binary data replaced with the new varbinary object.
- We have to update quite a few SQL tests involving the varbinary type because binary blobs are now returned as varbinary objects, not as plain strings, as they used to be.

Closes tarantool#1629

@TarantoolBot document
Title: Document the varbinary type

The new module `varbinary` was introduced. The module implements the following functions:

- `varbinary.new` - constructs a varbinary object from a plain string or a cdata pointer and size (to be used with the `buffer` module).
- `varbinary.is` - returns true if the argument is a varbinary object.

```Lua
local bin = varbinary.new('data')
assert(varbinary.is(bin))
assert(not varbinary.is('data'))
```

Like a plain string, a varbinary object stores arbitrary data. Unlike a plain string, it's encoded as a binary blob by the built-in encoders that support the binary type (MsgPack, YAML). (Actually, encoding binary blobs with the proper type is the main goal of the new type.)

```
tarantool> '\xFF\xFE'
---
- "\xFF\xFE"
...

tarantool> varbinary.new('\xFF\xFE')
---
- !!binary //4=
...

tarantool> msgpack.encode('\xFF\xFE')
---
- "\xA2\xFF\xFE"
...

tarantool> msgpack.encode(varbinary.new('\xFF\xFE'))
---
- "\xC4\x02\xFF\xFE"
...
```

Note that the JSON format doesn't support the binary type, so a varbinary object is still encoded as a plain string:

```
tarantool> json.encode('\xFF\xFE')
---
- "\"\xFF\xFE\""
...

tarantool> json.encode(varbinary.new('\xFF\xFE'))
---
- "\"\xFF\xFE\""
...
```

The built-in decoders now decode binary data fields (fields with the 'binary' tag in YAML; the `MP_BIN` type in MsgPack) to a varbinary object by default:

```
tarantool> varbinary.is(msgpack.decode('\xC4\x02\xFF\xFE'))
---
- true
...

tarantool> varbinary.is(yaml.decode('!!binary //4='))
---
- true
...
```

This also implies that the data stored in the database under the 'varbinary' field type is now returned to Lua not as a plain string, but as a varbinary object. Because this change may break backward compatibility, it's possible to revert to the old behavior by toggling the new compat option `binary_data_decoding`:

```
tarantool> compat.binary_data_decoding = 'old'
---
...

tarantool> varbinary.is(msgpack.decode('\xC4\x02\xFF\xFE'))
---
- false
...

tarantool> varbinary.is(yaml.decode('!!binary //4='))
---
- false
...
```

Please create a documentation page for the new compat option: https://tarantool.io/compat/binary_data_decoding

A varbinary object implements the following meta-methods:

- `__len` - returns the length of the binary data, in bytes.
- `__tostring` - returns the data as a plain string.
- `__eq` - returns true if the varbinary object contains the same data as another varbinary object or a string.

```Lua
local bin = varbinary.new('foo')
assert(#bin == 3)
assert(tostring(bin) == 'foo')
assert(bin == 'foo')
assert(bin ~= 'bar')
assert(bin == varbinary.new('foo'))
assert(bin ~= varbinary.new('bar'))
```

There are no string manipulation methods, like `string.sub` or `string.match`. If you need to match a substring in a varbinary object, you have to convert it to a string first.

For more details, see the [design document][1].

[1]: https://www.notion.so/tarantool/varbinary-in-Lua-a0ce453dcf5a46e3bc421bf80d4cc276
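To make the difference between the two MsgPack encodings shown above concrete, here is a minimal plain-Lua sketch that builds the same bytes by hand. The helpers `encode_str` and `encode_bin` are hypothetical names used only for illustration and are not part of Tarantool; they cover only the short `fixstr` and `bin 8` forms from the MsgPack format, which is all the two-byte payload in the examples needs.

```lua
-- Hypothetical helpers, not Tarantool APIs: hand-rolled MsgPack headers
-- for short payloads, to show why a string and a binary blob differ on the wire.

-- fixstr: a single header byte 0xA0 + length (length must be <= 31).
local function encode_str(s)
    local n = #s
    assert(n <= 31, 'this sketch handles only fixstr (up to 31 bytes)')
    return string.char(0xA0 + n) .. s
end

-- bin 8: header byte 0xC4 followed by a one-byte length (length <= 255).
local function encode_bin(s)
    local n = #s
    assert(n <= 255, 'this sketch handles only bin 8 (up to 255 bytes)')
    return string.char(0xC4, n) .. s
end

-- The same two-byte payload as in the console session above.
local payload = string.char(0xFF, 0xFE)
local as_str = encode_str(payload)  -- bytes: A2 FF FE
local as_bin = encode_bin(payload)  -- bytes: C4 02 FF FE
```

The payload bytes are identical in both cases; only the type header changes, which is what lets a decoder tell a binary blob apart from a string.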
Showing 31 changed files with 638 additions and 413 deletions.
@@ -0,0 +1,9 @@
## feature/lua

* **[Breaking change]** Added the new `varbinary` type to Lua. An object of
  this type is similar to a plain string but encoded in MsgPack as `MP_BIN` so
  it can be used for storing binary blobs in the database. This also works the
  other way round: data fields stored as `MP_BIN` are now decoded in Lua as
  varbinary objects, not as plain strings, as they used to be. Since the latter
  may cause compatibility issues, the new compat option `binary_data_decoding`
  was introduced to revert the built-in decoder to the old behavior (gh-1629).
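Code that reads `MP_BIN` fields may see either a plain string (old behavior) or a varbinary object (new behavior), depending on the compat option. One possible way to stay agnostic is to normalize through `tostring`, which the commit message says returns the data as a plain string for varbinary objects. The sketch below is an assumption about how calling code might adapt, not something taken from this commit; `as_plain_string` and `contains` are hypothetical helper names.

```lua
-- Hypothetical helpers, not part of this commit.

-- Normalize a value that may be a plain string or an object with a
-- __tostring metamethod (such as a varbinary object) into a plain string.
local function as_plain_string(value)
    return tostring(value)
end

-- Plain (non-pattern) substring search that works for either representation.
-- Varbinary objects have no string methods, so convert before searching.
local function contains(value, needle)
    return as_plain_string(value):find(needle, 1, true) ~= nil
end

-- Stand-in for a varbinary object so the sketch runs outside Tarantool:
-- any object with __tostring behaves the same way for these helpers.
local mock_blob = setmetatable({}, {
    __tostring = function() return 'blob-data' end,
})
```

Inside Tarantool, `mock_blob` would be replaced by a real object such as `varbinary.new('blob-data')`; the helpers themselves would not need to change.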
@@ -0,0 +1,65 @@
local ffi = require('ffi')

ffi.cdef([[
    int memcmp(const char *s1, const char *s2, size_t n);
]])

local memcmp = ffi.C.memcmp

local const_char_ptr_t = ffi.typeof('const char *')
local varbinary_t = ffi.typeof('struct varbinary')

local function is_varbinary(obj)
    return ffi.istype(varbinary_t, obj)
end

local function new_varbinary(data, size)
    if data == nil then
        size = 0
    elseif type(data) == 'string' then
        size = #data
    elseif ffi.istype(varbinary_t, data) then
        size = ffi.sizeof(data)
    elseif not ffi.istype(const_char_ptr_t, data) or type(size) ~= 'number' then
        error('Usage: varbinary.new(str) or varbinary.new(ptr, size)', 2)
    end
    local bin = ffi.new(varbinary_t, size)
    ffi.copy(bin, data, size)
    return bin
end

local function varbinary_len(bin)
    assert(ffi.istype(varbinary_t, bin))
    return ffi.sizeof(bin)
end

local function varbinary_tostring(bin)
    assert(ffi.istype(varbinary_t, bin))
    return ffi.string(bin, ffi.sizeof(bin))
end

local function varbinary_eq(a, b)
    if not (type(a) == 'string' or ffi.istype(varbinary_t, a)) or
       not (type(b) == 'string' or ffi.istype(varbinary_t, b)) then
        return false
    end
    local size_a = #a
    local size_b = #b
    if size_a ~= size_b then
        return false
    end
    local data_a = ffi.cast(const_char_ptr_t, a)
    local data_b = ffi.cast(const_char_ptr_t, b)
    return memcmp(data_a, data_b, size_a) == 0
end

ffi.metatype(varbinary_t, {
    __len = varbinary_len,
    __tostring = varbinary_tostring,
    __eq = varbinary_eq,
})

return {
    is = is_varbinary,
    new = new_varbinary,
}