Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend msgpack Lua API #2755

Closed
kostja opened this issue Sep 13, 2017 · 2 comments
Closed

Extend msgpack Lua API #2755

kostja opened this issue Sep 13, 2017 · 2 comments
Assignees
Labels
feature A new functionality
Milestone

Comments

@kostja
Copy link
Contributor

kostja commented Sep 13, 2017

http://github.com/tarantool/dump currently can not perform piece-by-piece restore of a dump because there is no way to parse a partial msgpuck chunk with the built-in msgpuck.

Please extend the built-in msgpuck to not throw exceptions on error/partial parse. After that it will be bossible to improve http://github.com/tarantool/dump.

On the same token it would be nice to finish msgpackffi and make sure it has the same API as msgpuck.

Thanks.

@kostja kostja added the feature A new functionality label Sep 13, 2017
@kostja kostja added this to the 1.7.7 milestone Sep 13, 2017
@locker
Copy link
Member

locker commented Nov 23, 2017

Proposed API extension:

  • msgpack.encode(obj) -> str

    (This function already exists, no changes made to it)

    Encode a Lua object as msgpack and store the result in a Lua
    string. Return the new Lua string.

  • msgpack.ibuf_encode(ibuf, obj) -> number of bytes

    (New function)

    Encode a Lua object as msgpack and store the result in the
    provided buffer. Return the number of bytes stored in the
    buffer.

  • msgpack.decode(str, offset) -> obj, new_offset
    msgpack.decode(buf, size) -> obj, new_buf

    (1st variant already exists, 2nd is new)

    Decode a Lua object from msgpack.

    Both variants return the decoded Lua object on success.

    The first variant works with a Lua string and an offset in the
    string. It returns the new offset in the second argument.

    The second variant works with a C buffer and its size. It
    returns a pointer to the position following the last written
    byte in the buffer in the second argument.

    On failure, return {nil, error message}. Strictly speaking,
    this breaks backward compatibility, because currently these
    functions raise an error, but error paths should be rare so
    let's try it and see what users say.

  • msgpack.decode_unchecked(str, offset) -> obj, new_offset
    msgpack.decode_unchecked(buf) -> obj, new_buf

    (New function, supposed to replace msgpack.ibuf_decode)

    Same as msgpack.decode(), but don't check the buffer with
    mp_check() before decoding it. It is intended to be used in
    hot paths where the caller is sure that the decoded data are
    not corrupted.

  • fio.write(fh, str)
    fio.pwrite(fh, str, offset)

    fio.write(fh, buf, size)
    fio.pwrite(fh, buf, size, offset)

    (Added variants for working with a C buffer)

    Write a Lua string (1st variant) or a C buffer (2nd variant)
    to a file. Return true on success, {false, error} on failure.

  • fio.read(fh, size) -> str
    fio.pread(fh, size, offset) -> str

    fio.read(fh, buf, size) -> number of bytes read
    fio.pread(fh, buf, size, offset) -> number of bytes read

    (Added variants for working with a C buffer)

    Read a file to a Lua string (1st variant) or to a C buffer
    (2nd variant).

    The first variant returns a new string on success while the
    second variant returns the number of bytes stored in the
    buffer. On failure both functions return {nil, error}.

locker added a commit that referenced this issue Nov 24, 2017
 - Do not throw an exception from msgpack.decode() if the supplied
   data cannot be parsed (mp_check() fails). Return {nil, nil, error}
   instead:

     obj, offset, err = msgpack.decode(data)
     if obj == nil then
         print(err)
	 return
     end

 - Allow to pass a C buffer to msgpack.decode(). Syntax:

     buf = buffer.ibuf()
     ...
     obj, rpos, err = msgpack.decode(buf.rpos, buf:size())

 - Introduce a version of msgpack.decode() that doesn't check the
   supplied msgpack - msgpack.decode_unchecked(). It has the same
   signature as msgpack.decode() except if called on a C buffer it
   doesn't require the buffer size. It is supposed to supplant
   msgpack.ibuf_decode() over time.

 - Introduce msgpack.ibuf_encode(buf, obj). It encodes a Lua object to
   an input buffer and returns the number of bytes encoded.

 - Add tests.

See #2755
locker added a commit that referenced this issue Nov 24, 2017
In order to use fio in conjunction with ibuf, we need to extend read(),
pread(), write(), pwrite() so that they can take a C buffer instead of
a Lua string. The syntax is as follows:

  read(size) -> str
  read(buf, size) -> len

  pread(size, offset) -> str
  pread(buf, size, offset) -> len

  write(str)
  write(buf, size)

  pwrite(str, offset)
  pwrite(buf, size, offset)

See #2755
rtsisyk pushed a commit that referenced this issue Nov 27, 2017
In order to use fio in conjunction with ibuf, we need to extend read(),
pread(), write(), pwrite() so that they can take a C buffer instead of
a Lua string. The syntax is as follows:

  read(size) -> str
  read(buf, size) -> len

  pread(size, offset) -> str
  pread(buf, size, offset) -> len

  write(str)
  write(buf, size)

  pwrite(str, offset)
  pwrite(buf, size, offset)

See #2755
locker added a commit that referenced this issue Nov 27, 2017
 - Allow to pass a C buffer to msgpack.decode(). Syntax:

     buf = buffer.ibuf()
     ...
     obj, rpos = msgpack.decode(buf.rpos, buf:size())

 - Introduce a version of msgpack.decode() that doesn't check the
   supplied msgpack - msgpack.decode_unchecked(). It has the same
   signature as msgpack.decode() except if called on a C buffer it
   doesn't require the buffer size. It is supposed to supplant
   msgpack.ibuf_decode() over time.

 - Allow to store encoded objects in a user-supplied ibuf. Syntax:

     buf = buffer.ibuf()
     len = msgpack.encode(obj, buf)

   ('len' is the number of bytes stored in the buffer)

 - Add tests.

Closes #2755
locker added a commit that referenced this issue Dec 1, 2017
 - Allow to pass a C buffer to msgpack.decode(). Syntax:

     buf = buffer.ibuf()
     ...
     obj, rpos = msgpack.decode(buf.rpos, buf:size())

 - Introduce a version of msgpack.decode() that doesn't check the
   supplied msgpack - msgpack.decode_unchecked(). It has the same
   signature as msgpack.decode() except if called on a C buffer it
   doesn't require the buffer size. It is supposed to supplant
   msgpack.ibuf_decode() over time.

 - Allow to store encoded objects in a user-supplied ibuf. Syntax:

     buf = buffer.ibuf()
     len = msgpack.encode(obj, buf)

   ('len' is the number of bytes stored in the buffer)

 - Add tests.

Closes #2755
rtsisyk pushed a commit that referenced this issue Dec 4, 2017
 - Allow to pass a C buffer to msgpack.decode(). Syntax:

     buf = buffer.ibuf()
     ...
     obj, rpos = msgpack.decode(buf.rpos, buf:size())

 - Introduce a version of msgpack.decode() that doesn't check the
   supplied msgpack - msgpack.decode_unchecked(). It has the same
   signature as msgpack.decode() except if called on a C buffer it
   doesn't require the buffer size. It is supposed to supplant
   msgpack.ibuf_decode() over time.

 - Allow to store encoded objects in a user-supplied ibuf. Syntax:

     buf = buffer.ibuf()
     len = msgpack.encode(obj, buf)

   ('len' is the number of bytes stored in the buffer)

 - Add tests.

Closes #2755
@rtsisyk rtsisyk changed the title No-exceptions api of built-in msgpuck Extend msgpack Lua API Dec 4, 2017
@rtsisyk rtsisyk modified the milestones: 1.8.7, 1.7.7 Dec 4, 2017
@rtsisyk
Copy link
Contributor

rtsisyk commented Dec 4, 2017

fio => #2960
No-exceptions API has been discarded because it is a breaking change.
ibuf support for msgpack has been pushed.

@rtsisyk rtsisyk closed this as completed Dec 4, 2017
Totktonada added a commit that referenced this issue Nov 6, 2019
Rewritten the test case. There are several reasons to do so.

First, the previous test case implementation uses test:iscdata() which
is not sufficient to verify whether the const qualifier is applied to a
cdata type, because of using ffi.istype() under hood, which ignores
'const'.

Second, the previous test case was invalid, because of wrong order of
ffi.cast() arguments.

Third, we don't need a real ibuf object, ffi.cast(ctype, 'a string') is
enough. This allows to simplify test cases.

Fourth, the test case is rewritten in a declarative manner to reduce
code duplication. This also allows to expand it w/o many changes for
msgpack.decode() function in the following commit.

I left only msgpackffi.decode_unchecked() cases. A bit context is needed
to describe why.

msgpack.decode() accepts two arguments: a buffer and its size (or a
string and an offset, but it is out of scope here).
msgpack.decode_unchecked() accepts only a buffer (or, again, a string
and an offset) and does not verify whether we perform reads out of the
buffer bounds. See #2755 for the formal description of the API.

msgpackffi module contains only msgpackffi.decode_unchecked().
msgpackffi.decode() is just alias for decode_unchecked: it does not
verify a buffer bounds. AFAIU the alias was added to unify
testing code for msgpack and msgpackffi modules.

Our website has no documentation about msgpackffi. I don't sure whether
its API should be considered as public API. However if we'll make the
module public, we'll need to implement buffer bound check. And only then
state that the API is stable and can be used by users.

If we don't consider msgpackffi.decode() as public function for now, it
make sense to left it untested when a testing code is not unified
between msgpack and msgpackffi modules.

However I unified this testing code for both modules in the following
commit, so it will be tested anyway: but in the commit where it looks
reasonable.
MariaHajdic added a commit that referenced this issue Nov 6, 2019
Function decode of module msgpackffi was passing
value of type const unsigned char * to a C function
that accepts arguments of type const char *.

Closes #3926

FIXUP: msgpackffi.decode can now be assigned to buf.rpos

Added a note that ffi.istype() ignores the const qualifier and changed
the code to make it more clear.

Saved an argument cdata type to a local variable to make the logic
'return a pointer of the same type as passed one' more clear.

No behaviour changes.

FIXUP: msgpackffi.decode can now be assigned to buf.rpos

Rewritten the test case. There are several reasons to do so.

First, the previous test case implementation uses test:iscdata() which
is not sufficient to verify whether the const qualifier is applied to a
cdata type, because of using ffi.istype() under hood, which ignores
'const'.

Second, the previous test case was invalid, because of wrong order of
ffi.cast() arguments.

Third, we don't need a real ibuf object, ffi.cast(ctype, 'a string') is
enough. This allows to simplify test cases.

Fourth, the test case is rewritten in a declarative manner to reduce
code duplication. This also allows to expand it w/o many changes for
msgpack.decode() function in the following commit.

I left only msgpackffi.decode_unchecked() cases. A bit context is needed
to describe why.

msgpack.decode() accepts two arguments: a buffer and its size (or a
string and an offset, but it is out of scope here).
msgpack.decode_unchecked() accepts only a buffer (or, again, a string
and an offset) and does not verify whether we perform reads out of the
buffer bounds. See #2755 for the formal description of the API.

msgpackffi module contains only msgpackffi.decode_unchecked().
msgpackffi.decode() is just alias for decode_unchecked: it does not
verify a buffer bounds. AFAIU the alias was added to unify
testing code for msgpack and msgpackffi modules.

Our website has no documentation about msgpackffi. I don't sure whether
its API should be considered as public API. However if we'll make the
module public, we'll need to implement buffer bound check. And only then
state that the API is stable and can be used by users.

If we don't consider msgpackffi.decode() as public function for now, it
make sense to left it untested when a testing code is not unified
between msgpack and msgpackffi modules.

However I unified this testing code for both modules in the following
commit, so it will be tested anyway: but in the commit where it looks
reasonable.
MariaHajdic added a commit that referenced this issue Nov 6, 2019
Function decode of module msgpackffi was passing
value of type const unsigned char * to a C function
that accepts arguments of type const char *.

Closes #3926

FIXUPs from Totktonada/gh-3926-msgpack-decode-retval-ctype branch

FIXUP: msgpackffi.decode can now be assigned to buf.rpos

Added a note that ffi.istype() ignores the const qualifier and changed
the code to make it more clear.

Saved an argument cdata type to a local variable to make the logic
'return a pointer of the same type as passed one' more clear.

No behaviour changes.

FIXUP: msgpackffi.decode can now be assigned to buf.rpos

Rewritten the test case. There are several reasons to do so.

First, the previous test case implementation uses test:iscdata() which
is not sufficient to verify whether the const qualifier is applied to a
cdata type, because of using ffi.istype() under hood, which ignores
'const'.

Second, the previous test case was invalid, because of wrong order of
ffi.cast() arguments.

Third, we don't need a real ibuf object, ffi.cast(ctype, 'a string') is
enough. This allows to simplify test cases.

Fourth, the test case is rewritten in a declarative manner to reduce
code duplication. This also allows to expand it w/o many changes for
msgpack.decode() function in the following commit.

I left only msgpackffi.decode_unchecked() cases. A bit context is needed
to describe why.

msgpack.decode() accepts two arguments: a buffer and its size (or a
string and an offset, but it is out of scope here).
msgpack.decode_unchecked() accepts only a buffer (or, again, a string
and an offset) and does not verify whether we perform reads out of the
buffer bounds. See #2755 for the formal description of the API.

msgpackffi module contains only msgpackffi.decode_unchecked().
msgpackffi.decode() is just alias for decode_unchecked: it does not
verify a buffer bounds. AFAIU the alias was added to unify
testing code for msgpack and msgpackffi modules.

Our website has no documentation about msgpackffi. I don't sure whether
its API should be considered as public API. However if we'll make the
module public, we'll need to implement buffer bound check. And only then
state that the API is stable and can be used by users.

If we don't consider msgpackffi.decode() as public function for now, it
make sense to left it untested when a testing code is not unified
between msgpack and msgpackffi modules.

However I unified this testing code for both modules in the following
commit, so it will be tested anyway: but in the commit where it looks
reasonable.
MariaHajdic added a commit that referenced this issue Nov 6, 2019
Function decode of module msgpackffi was passing
value of type const unsigned char * to a C function
that accepts arguments of type const char *.

Closes #3926

FIXUPs from Totktonada/gh-3926-msgpack-decode-retval-ctype branch

FIXUP: msgpackffi.decode can now be assigned to buf.rpos

Added a note that ffi.istype() ignores the const qualifier and changed
the code to make it more clear.

Saved an argument cdata type to a local variable to make the logic
'return a pointer of the same type as passed one' more clear.

No behaviour changes.

FIXUP: msgpackffi.decode can now be assigned to buf.rpos

Rewritten the test case. There are several reasons to do so.

First, the previous test case implementation uses test:iscdata() which
is not sufficient to verify whether the const qualifier is applied to a
cdata type, because of using ffi.istype() under hood, which ignores
'const'.

Second, the previous test case was invalid, because of wrong order of
ffi.cast() arguments.

Third, we don't need a real ibuf object, ffi.cast(ctype, 'a string') is
enough. This allows to simplify test cases.

Fourth, the test case is rewritten in a declarative manner to reduce
code duplication. This also allows to expand it w/o many changes for
msgpack.decode() function in the following commit.

I left only msgpackffi.decode_unchecked() cases. A bit context is needed
to describe why.

msgpack.decode() accepts two arguments: a buffer and its size (or a
string and an offset, but it is out of scope here).
msgpack.decode_unchecked() accepts only a buffer (or, again, a string
and an offset) and does not verify whether we perform reads out of the
buffer bounds. See #2755 for the formal description of the API.

msgpackffi module contains only msgpackffi.decode_unchecked().
msgpackffi.decode() is just alias for decode_unchecked: it does not
verify a buffer bounds. AFAIU the alias was added to unify
testing code for msgpack and msgpackffi modules.

Our website has no documentation about msgpackffi. I don't sure whether
its API should be considered as public API. However if we'll make the
module public, we'll need to implement buffer bound check. And only then
state that the API is stable and can be used by users.

If we don't consider msgpackffi.decode() as public function for now, it
make sense to left it untested when a testing code is not unified
between msgpack and msgpackffi modules.

However I unified this testing code for both modules in the following
commit, so it will be tested anyway: but in the commit where it looks
reasonable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new functionality
Projects
None yet
Development

No branches or pull requests

3 participants