Skip to content
This repository has been archived by the owner on Jan 30, 2023. It is now read-only.

Improvements and fixes for luajit msgpack #11

Closed
wants to merge 3 commits into from

Conversation

piastry
Copy link

@piastry piastry commented May 19, 2015

The patchset includes several fixes and improvements that let us use this module in our environment. The first patch fixes segmentation faults error if we pass an uncompleted stream buffer for unpacking. The second patch allows to work with raw C pointers. The third patch fixes our scenario where floating point precision was truncated randomly.

@catwell
Copy link
Owner

catwell commented May 19, 2015

Thank you! I will look at it later this week.

What do you mean by "floating point precision was truncated randomly"? Do you have an example that reproduces the problem? I would like to add a test for things like this.

@piastry
Copy link
Author

piastry commented May 19, 2015

I mean it very hard to reproduce the problem with a test case. The had the following scenario:

an array of different numbers with only a few the highest and lowest bits as 1 (like 11100000000101) is given; after pack/unpack this array sometimes has several numbers that are equal.

So the code in tests/test.lua:

152 local float_val = 0xffff000000000                                                
153                                                                                  
154 printf(".")                                                                      
155 for _=0,1000 do                                                                  
156 for n=0,50 do                                                                    
157   nb_test(float_val + n, 9)                                                      
158 end                                                                              
159 end  

sometimes fails with the following error:

Integer tests ...........luajit: tests/test.lua:85: wrong value 4.50353e+15, expected 4.50353e+15
stack traceback:
        [C]: in function 'assert'
        tests/test.lua:85: in function 'nb_test'
        tests/test.lua:157: in main chunk
        [C]: at 0x00404820

We use the following luajit 64-bit build:
LuaJIT 2.0.2 -- Copyright (C) 2005-2013 Mike Pall. http://luajit.org/
JIT: ON CMOV SSE2 SSE3 SSE4.1 fold cse dce fwd dse narrow loop abc sink fuse

@catwell
Copy link
Owner

catwell commented May 19, 2015

I don't understand why you say that's a float. That's an integer that is close to the maximum number Lua can represent in the default (double) number type, but below it. That helps me understand why your patch touches the (u)int64 encoding and not the float encoding.

Thank you for the test case. I will try to check why this happens when I have free time, and make one that always fails if possible (if not it may actually be a LuaJIT bug...).

@piastry
Copy link
Author

piastry commented May 19, 2015

I mean float because lua numbers are float precision numbers. Sorry for misleading.

@catwell
Copy link
Owner

catwell commented May 20, 2015

Can you confirm 9119b8c fixes the issue with numbers? If so I prefer this way to fix (I will merge the other two commits).

@piastry
Copy link
Author

piastry commented May 21, 2015

Thank you for merging the first two patches!

I tested your patch 9119b8c in our environment and seems like it didn't solve the issue - the problem is still reproduced with the test above.

@catwell
Copy link
Owner

catwell commented May 21, 2015

OK, I really have to find a smaller test that reproduces the problem reliably then...

catwell added a commit that referenced this pull request Jun 7, 2015
As often with LuaJIT, changing any single innocuous line
in this file will hide the problem, hence the useless function
definition and local declarations...

Run it like this:

    while true; do
        luajit tests/pr-11.lua || exit 1
    done
@catwell
Copy link
Owner

catwell commented Jun 7, 2015

As LuaJIT issues often are, this is insane. I have written a smaller test file that reproduces the issue, but I have not managed to produce something small enough to send Mike yet. Of course, if I use -jv, the bug does not occur.

Even more crazy: if I change the command line to call luajit -O3 (which is supposed to be the default) instead of luajit, I do not reproduce the bug!

In hex, it decodes 0xffff000000014 instead of 0xffff000000028, so it drops a 0 at the end...

@Lupus
Copy link
Contributor

Lupus commented Jun 7, 2015

https://github.com/Lupus/luajit-bug-repro

Can you give it a try? I've reduced it to the following number of lines:

  97 luajit-msgpack-pure.lua
  30 luajit-bug.lua
   5 luajit_bug_run.sh
 132 total

@catwell
Copy link
Owner

catwell commented Jun 8, 2015

Wow, thank you, this is really good. This may be small enough for Mike.

If you replace the %gs by %xs it still works and you can see that:

luajit: ./luajit-bug.lua:17: wrong value ffff000000002, expected ffff000000005

If you agree I will try sending a link to this repository to Mike via the LuaJIT list.

@Lupus
Copy link
Contributor

Lupus commented Jun 8, 2015

Of course I agree :)

I did not place any copyright notices in the repository, but I hope you will not sue me for that.
As long as we have a resolution I will delete that repo for good.

Also I did not test it against git HEAD for 2.1 branch. That's what we have in prod currently:

LuaJIT 2.0.2 -- Copyright (C) 2005-2013 Mike Pall. http://luajit.org/
JIT: ON CMOV SSE2 SSE3 SSE4.1 fold cse dce fwd dse narrow loop abc sink fuse

@catwell
Copy link
Owner

catwell commented Jun 10, 2015

According to Mike Pall, the bug is fixed in LuaJIT git. The test case doesn't fail anymore.

Reference of the fix:

commit 7f454aed82ef364245ae73a16a04b21e2245e342
Author: Mike Pall <mike>
Date:   Wed Jun 10 16:14:41 2015 +0200

    Fix narrowing of TOBIT.

I will close this when you confirm that it works on your environment as well.

@catwell catwell closed this Nov 17, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants