Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RDY] Remove recursion from various serializers/converters #4607

Merged
merged 12 commits into from
Jun 25, 2016

Conversation

ZyX-I
Copy link
Contributor

@ZyX-I ZyX-I commented Apr 19, 2016

Contains:

  • Refactoring of encode.c which also enhances documentation.
  • Recursion removal from vim_to_object; it was completely rewritten and code is now shared with encode.c.
  • Unit tests for vim_to_object.
  • (Unfinished) test library for testing typval_T and Object values (converts between ffi and lua values so it is easier to use).

Also fixes #4596.

Extracted from #4411.

@ZyX-I
Copy link
Contributor Author

ZyX-I commented Apr 19, 2016

There is also a question: now I can treat special dictionaries specially in vim_to_object, but currently chosen not to do so. Most interesting part of them is that you may pass strings with NUL bytes through API undamaged, assuming object_to_vim will support this enhancement.

If you think it is good idea to treat special dictionaries specially what do you think should be done? Current variant is:

  1. Strings with NUL bytes represented as special dictionaries are converted to String objects with NUL bytes inside (so str.size < strlen(str.data)). All strings are always followed by NUL byte, no matter whether they contain NUL or are special or not.
  2. Binary strings are treated exactly as regular strings.
  3. 64-bit unsigned integers are blindly converted to 64-bit signed (Integer), so it may overflow.
  4. EXT strings are converted to NIL.
  5. Dictionaries with non-string keys receive __INVALID_KEY__ keys in place of all non-string keys.
  6. Nobody cares about duplicates in dictionaries, Dictionary objects created from special dictionaries may always contain duplicates (not only due to 5).

@marvim marvim added the RFC label Apr 19, 2016
@ZyX-I ZyX-I force-pushed the luaviml'/lua'/encode_vim_to_object branch from f6a587f to 2460ea4 Compare April 23, 2016 23:40
@ZyX-I
Copy link
Contributor Author

ZyX-I commented Apr 24, 2016

Removed recursion from three functions just to find out there is recursion which crashes my ASAN build when doing clear_tv. I also remember garbage_collect being not fine with something like this.

@justinmk
Copy link
Member

There is also a question: now I can treat special dictionaries specially in vim_to_object, but currently chosen not to do so. Most interesting part of them is that you may pass strings with NUL bytes through API undamaged, assuming object_to_vim will support this enhancement.

What is the disadvantage?

@ZyX-I
Copy link
Contributor Author

ZyX-I commented Apr 25, 2016

Breaks backwards compatibility. Makes hack created because of flaws in the existing data types more official: if we e.g. will ever have normal objects (even internally: i.e. VAR_OBJECT which consists of void *data and struct { BinaryOp add; ... } *type which will be used for some types, but without a way to construct them provided by user) it will be harder to get rid of special dictionaries if they are that common.

Though this is minor. More major is that not all special dictionaries can be expressed with Object: specifically:

  • EXT types.
  • Also a question what to do with EXT dictionary which has EXT type from e.g. Buffer. And not necessary valid value.
  • MAPs with non-string keys.
  • Distinguishing STR/BIN.

This is either extending of the API types or admitting that some special dictionaries are even more special - and result in a NIL.

@ZyX-I ZyX-I force-pushed the luaviml'/lua'/encode_vim_to_object branch from f3ca24b to 20c5d5a Compare May 1, 2016 01:44
@ZyX-I ZyX-I force-pushed the luaviml'/lua'/encode_vim_to_object branch 4 times, most recently from 0ccb90e to 23f929f Compare May 16, 2016 14:31
@ZyX-I ZyX-I force-pushed the luaviml'/lua'/encode_vim_to_object branch 3 times, most recently from 336d89b to fed833b Compare May 30, 2016 20:19
@justinmk
Copy link
Member

justinmk commented Jun 8, 2016

Rebase should clear up the build.

#define API_INTEGER_MAX INT64_MAX

/// Minimum value of an Integer
#define API_INTEGER_MIN INT64_MIN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API_INTEGER_MIN was added because I added API_INTEGER_MAX (which is used), not because it is used.

@ZyX-I ZyX-I force-pushed the luaviml'/lua'/encode_vim_to_object branch from 7288131 to e6c2988 Compare June 10, 2016 18:45
do { \
if (tv->vval.v_list->lv_refcount > 1) { \
tv->vval.v_list->lv_refcount--; \
tv->vval.v_list = NULL; \
Copy link
Contributor

@oni-link oni-link Jun 10, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This macro is called before we push a new element on the stack in nothing_convert_one_value().
When constructing the pushed element, tv->vval.v_list->lv_first is read after tv->vval.v_list was set to NULL.

@ZyX-I ZyX-I force-pushed the luaviml'/lua'/encode_vim_to_object branch from e6c2988 to a865a4b Compare June 11, 2016 18:42
if ((r = did_set_string_option(opt_idx, varp, (int)true, oldval, NULL,
opt_flags)) == NULL)
did_set_option(opt_idx, opt_flags, TRUE);
char *r = NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initialize with function output did_set_string_option() instead of NULL.

This ought to prevent stack overflow, but I do not see this actually working:
*lua* code crashes with stack overflow when trying to deserialize msgpack from
Neovim, Neovim is fine even if nesting level is increased 100x (though test
becomes very slow); not sure how recursive function may survive this. So it
looks like there are currently only two positive effects:

1. NULL lists are returned as empty (neovim#4596).
2. Functional tests are slightly more fast. Very slightly. Checked for Release
   build for test/functional/eval tests because benchmarking of debug mode is
   not very useful.
This removes some stack overflows in new test regarding deeply nested variables.
Now in place of crashing vim_to_object/msgpack_rpc_from_object/etc it crashes
clear_tv with stack overflow.
It appears that used msgpack library is not able to parse back message created 
by msgpack_rpc_from_object() if nesting level is too high, so log_server_msg now 
cares about msgpack_unpack_next() return value. Also error message from 
server_notifications_spec.lua is not readable if something is wrong (though at 
least now it does not crash when parsing deeply nested structures).

log_server_msg() in the test reports

    [msgpack-rpc] nvim -> client(1) [error]        "parse error"
@ZyX-I ZyX-I force-pushed the luaviml'/lua'/encode_vim_to_object branch from d0de23b to 458a4d0 Compare June 24, 2016 14:01
This avoids gcc warnings about undefined behaviour.
@ZyX-I ZyX-I changed the title [RFC] Remove recursion from vim_to_object [RFC] Remove recursion from various serializers/converters Jun 24, 2016
Also adds one exception to linter rules:

    typedef struct {
      kvec_t(Object) stack;
    } EncodedData;

is completely valid (from the style guide point of view) code.
@ZyX-I
Copy link
Contributor Author

ZyX-I commented Jun 25, 2016

Tests pass, so this looks like RDY.

@ZyX-I
Copy link
Contributor Author

ZyX-I commented Jun 25, 2016

I will do some benchmarks regarding kv vs kvi now though.

@ZyX-I
Copy link
Contributor Author

ZyX-I commented Jun 25, 2016

Benchmarks still confirm that kvi is good when working with not too nested (did not actually confirm, but I expect that it should be better with up to 8 nesting levels) objects while kv variant is slightly better when working with deeply nested objects. Since I guess that slightly nested objects are more common, this should be left as-is.

bin/nvim -i NONE -u NONE -N --cmd 'let l=map(range(100), "map(range(100), ''map(range(100), \"{1:[[]]}\")'')")' --cmd 'echo g:l' --cmd 'unlet l' --cmd qa &>/dev/null: average 7.2s total kv, 7.0s kvi.

bin/nvim -i NONE -u NONE -N --cmd 'let l=[]|call map(range(100000), "extend(g:, {''l'': [g:l]})")' --cmd 'echo g:l' --cmd 'unlet l' --cmd qa &>/dev/null: average 2.3s total kv, 2.3s kvi (actually in this test kvi is rather slightly better, but previously I used to see kv be better, not sure why: 0.6s kvi vs 0.5s kv, but I lost command actually used that time).

@ZyX-I ZyX-I changed the title [RFC] Remove recursion from various serializers/converters [RDY] Remove recursion from various serializers/converters Jun 25, 2016
@ZyX-I
Copy link
Contributor Author

ZyX-I commented Jun 25, 2016

Marking this as RDY.

@ZyX-I
Copy link
Contributor Author

ZyX-I commented Jun 25, 2016

For kv tests apply

diff --git a/src/nvim/eval/typval_encode.h b/src/nvim/eval/typval_encode.h
index 98fa7b2..fa8bf5f 100644
--- a/src/nvim/eval/typval_encode.h
+++ b/src/nvim/eval/typval_encode.h
@@ -169,13 +169,13 @@ typedef struct {
 } MPConvStackVal;

 /// Stack used to convert VimL values to messagepack.
-typedef kvec_withinit_t(MPConvStackVal, 8) MPConvStack;
+typedef kvec_t(MPConvStackVal) MPConvStack;

 // Defines for MPConvStack
 #define _mp_size kv_size
-#define _mp_init kvi_init
-#define _mp_destroy kvi_destroy
-#define _mp_push kvi_push
+#define _mp_init kv_init
+#define _mp_destroy kv_destroy
+#define _mp_push kv_push
 #define _mp_pop kv_pop
 #define _mp_last kv_last

@marvim marvim added RDY and removed RFC labels Jun 25, 2016

#define TYPVAL_ENCODE_CONV_NUMBER(ignored) \
do { \
(void)ignored; \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the purpose of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinmk If you see me writing something like this it means I got compiler warning. I cannot recall details now, but I do not use such code without purpose. If I am not mistaking problem is code in the unreachable if(TYPVAL_ENCODE_ALLOW_SPECIALS && …): even though it is unreachable, compiler still checks this before optimizing out.

@justinmk justinmk merged commit 26d8ab5 into neovim:master Jun 25, 2016
@justinmk justinmk removed the RDY label Jun 25, 2016
@ZyX-I
Copy link
Contributor Author

ZyX-I commented Jun 25, 2016

Ups: 2968dc7. Forgot to rebase -i.

@ZyX-I ZyX-I deleted the luaviml'/lua'/encode_vim_to_object branch June 25, 2016 20:22
@justinmk
Copy link
Member

@ZyX-I Is this a false positive (coverity)?

*** CID 149745:  Null pointer dereferences  (FORWARD_NULL)
/src/nvim/api/private/helpers.c: 502 in vim_to_object()
496     /// @param obj The source object
497     /// @return The converted value
498     Object vim_to_object(typval_T *obj)
499     {
500       EncodedData edata = { .stack = KV_INITIAL_VALUE };
501       encode_vim_to_object(&edata, obj, "vim_to_object argument");
>>>     CID 149745:  Null pointer dereferences  (FORWARD_NULL)
>>>     Dereferencing null pointer "edata.stack.items".
502       Object ret = kv_A(edata.stack, 0);
503       assert(kv_size(edata.stack) == 1);
504       kv_destroy(edata.stack);
505       return ret;
506     }
507

@ZyX-I
Copy link
Contributor Author

ZyX-I commented Jun 25, 2016

@justinmk Cannot say now, maybe this is not a false positive if encode_vim_to_object may fail. AFAIR I wrote it so it cannot ever fail (which I did because vim_to_object declaration clearly shows that no errors are expected: it has no way to report error), so assert for fixing it is enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NULL lists are represented as NIL
4 participants