
Fix conversion of numbers in lua args to redis args #13115

Merged · 4 commits into redis:unstable · Mar 10, 2024

Conversation

@mdouglass (Contributor) commented Mar 6, 2024

Since lua_Number is not explicitly an integer or a double, we need to make an effort to convert it to an integer when that's possible, since the string could later be used in a context that doesn't support scientific notation (e.g. 1e9 instead of 1000000000).

Since fpconv_dtoa converts numbers with the equivalent of %f or %e, whichever is shorter, this breaks if we try to pass a large integer to a command that takes an integer: we get an implicit conversion to string in Lua, and then the parsing in getLongLongFromObjectOrReply fails.

> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.

Switch to using ll2string if the number can be safely represented as a
long long.

The problem was introduced in #10587 (Redis 7.2).
closes #13113.

fpconv_dtoa converts 1000000000 (and similarly round numbers) to the form 1e9, which breaks the later conversion from string back to number in getLongLongFromObjectOrReply. Switch to using ll2string if the number can be safely represented as a long long.

tests/unit/scripting.tcl (outdated review comments, resolved)
Co-authored-by: debing.sun <debing.sun@redis.com>
@sundb (Collaborator) commented Mar 6, 2024

If it is a long long, what about creating the string robj from the long long directly, instead of converting it to a string and then converting that back to a long long in createStringObject()?

@mdouglass (Contributor, Author)

> If it is a long long, what about creating the string robj from the long long directly, instead of converting it to a string and then converting that back to a long long in createStringObject()?

Do you mean like this?

            if (double2ll((double)num, &lvalue)) {
                lua_argv[j] = createStringObjectFromLongLong(lvalue);
                continue;
            } else {

It's a little awkward because of needing to skip the rest of the loop, and I'm not clear on the consequences of skipping the code in the /* Try to use a cached object. */ block, since this is my first (well, now second) day with the Redis codebase. Happy to do whichever version the team recommends.

@mdouglass
Copy link
Contributor Author

Would it make sense to add a createStringObjectFromLuaNumber function that encapsulates both the long long and the double conversion?

@sundb (Collaborator) commented Mar 7, 2024

            if (double2ll((double)num, &lvalue)) {
                lua_argv[j] = createStringObjectFromLongLong(lvalue);
                continue;
            } else {

I think it works; please do a simple benchmark to see if it brings a benefit, thanks.

@mdouglass (Contributor, Author)

>             if (double2ll((double)num, &lvalue)) {
>                 lua_argv[j] = createStringObjectFromLongLong(lvalue);
>                 continue;
>             } else {
>
> I think it works; please do a simple benchmark to see if it brings a benefit, thanks.

I don't know how to do benchmarks within redis and I can't find anything that documents it. Can you point me in the right direction for that?

@sundb (Collaborator) commented Mar 7, 2024

> I don't know how to do benchmarks within redis and I can't find anything that documents it. Can you point me in the right direction for that?

Like this:

redis-benchmark -n 100000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end' 0

@mdouglass (Contributor, Author)

> > I don't know how to do benchmarks within redis and I can't find anything that documents it. Can you point me in the right direction for that?
>
> Like this:
>
>     redis-benchmark -n 100000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end' 0

Excellent, thank you!

@mdouglass (Contributor, Author) commented Mar 7, 2024

> redis-benchmark -n 100000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end' 0

Kind of surprised by this, but the version posted in this PR was faster than the version I changed to use createStringObjectFromLongLong. For reference, that version is available at https://github.com/mdouglass/redis/tree/fix/13113-hincrby-bug-alt. However, both fixes were faster than the original code (by about 5.5% and about 2.3% respectively).

original code

(9738ba9)  
~/projects/redis $ redis-benchmark -n 1000000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end' 0
====== eval redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end 0 ======                                                   
  1000000 requests completed in 55.66 seconds
  50 parallel clients
  120 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: no

Latency by percentile distribution:
0.000% <= 0.655 milliseconds (cumulative count 1)
50.000% <= 2.927 milliseconds (cumulative count 503275)
75.000% <= 3.231 milliseconds (cumulative count 753149)
87.500% <= 3.399 milliseconds (cumulative count 879396)
93.750% <= 3.519 milliseconds (cumulative count 940866)
96.875% <= 3.607 milliseconds (cumulative count 970504)
98.438% <= 3.671 milliseconds (cumulative count 984461)
99.219% <= 3.735 milliseconds (cumulative count 992676)
99.609% <= 3.799 milliseconds (cumulative count 996134)
99.805% <= 3.919 milliseconds (cumulative count 998116)
99.902% <= 4.119 milliseconds (cumulative count 999029)
99.951% <= 4.511 milliseconds (cumulative count 999512)
99.976% <= 4.975 milliseconds (cumulative count 999756)
99.988% <= 5.367 milliseconds (cumulative count 999878)
99.994% <= 5.871 milliseconds (cumulative count 999940)
99.997% <= 6.287 milliseconds (cumulative count 999970)
99.998% <= 6.503 milliseconds (cumulative count 999986)
99.999% <= 6.591 milliseconds (cumulative count 999993)
100.000% <= 6.671 milliseconds (cumulative count 999997)
100.000% <= 6.719 milliseconds (cumulative count 999999)
100.000% <= 6.903 milliseconds (cumulative count 1000000)
100.000% <= 6.903 milliseconds (cumulative count 1000000)

Cumulative distribution of latencies:
0.000% <= 0.103 milliseconds (cumulative count 0)
0.001% <= 0.703 milliseconds (cumulative count 5)
0.001% <= 0.807 milliseconds (cumulative count 11)
0.001% <= 0.903 milliseconds (cumulative count 12)
0.002% <= 1.103 milliseconds (cumulative count 22)
0.005% <= 1.207 milliseconds (cumulative count 54)
0.009% <= 1.303 milliseconds (cumulative count 91)
0.026% <= 1.407 milliseconds (cumulative count 265)
0.104% <= 1.503 milliseconds (cumulative count 1045)
0.502% <= 1.607 milliseconds (cumulative count 5016)
2.303% <= 1.703 milliseconds (cumulative count 23033)
9.625% <= 1.807 milliseconds (cumulative count 96247)
19.070% <= 1.903 milliseconds (cumulative count 190701)
29.133% <= 2.007 milliseconds (cumulative count 291326)
35.569% <= 2.103 milliseconds (cumulative count 355693)
64.047% <= 3.103 milliseconds (cumulative count 640472)
99.900% <= 4.103 milliseconds (cumulative count 999000)
99.980% <= 5.103 milliseconds (cumulative count 999796)
99.996% <= 6.103 milliseconds (cumulative count 999962)
100.000% <= 7.103 milliseconds (cumulative count 1000000)

Summary:
  throughput summary: 17966.55 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.690     0.648     2.927     3.543     3.711     6.903

this PR

(f8d408c) 
~/projects/redis $ redis-benchmark -n 1000000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end' 0
====== eval redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end 0 ======                                                   
  1000000 requests completed in 52.76 seconds
  50 parallel clients
  120 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: no

Latency by percentile distribution:
0.000% <= 0.695 milliseconds (cumulative count 1)
50.000% <= 2.839 milliseconds (cumulative count 504212)
75.000% <= 3.119 milliseconds (cumulative count 756626)
87.500% <= 3.255 milliseconds (cumulative count 877436)
93.750% <= 3.359 milliseconds (cumulative count 938567)
96.875% <= 3.447 milliseconds (cumulative count 969142)
98.438% <= 3.543 milliseconds (cumulative count 985125)
99.219% <= 3.655 milliseconds (cumulative count 992399)
99.609% <= 3.751 milliseconds (cumulative count 996195)
99.805% <= 3.847 milliseconds (cumulative count 998067)
99.902% <= 4.255 milliseconds (cumulative count 999027)
99.951% <= 5.183 milliseconds (cumulative count 999514)
99.976% <= 5.999 milliseconds (cumulative count 999759)
99.988% <= 6.287 milliseconds (cumulative count 999879)
99.994% <= 6.495 milliseconds (cumulative count 999942)
99.997% <= 6.591 milliseconds (cumulative count 999970)
99.998% <= 6.711 milliseconds (cumulative count 999985)
99.999% <= 6.879 milliseconds (cumulative count 999993)
100.000% <= 6.951 milliseconds (cumulative count 999997)
100.000% <= 6.983 milliseconds (cumulative count 999999)
100.000% <= 7.239 milliseconds (cumulative count 1000000)
100.000% <= 7.239 milliseconds (cumulative count 1000000)

Cumulative distribution of latencies:
0.000% <= 0.103 milliseconds (cumulative count 0)
0.000% <= 0.703 milliseconds (cumulative count 1)
0.001% <= 0.807 milliseconds (cumulative count 9)
0.001% <= 0.903 milliseconds (cumulative count 11)
0.002% <= 1.007 milliseconds (cumulative count 17)
0.002% <= 1.103 milliseconds (cumulative count 23)
0.004% <= 1.207 milliseconds (cumulative count 39)
0.013% <= 1.303 milliseconds (cumulative count 134)
0.073% <= 1.407 milliseconds (cumulative count 733)
0.294% <= 1.503 milliseconds (cumulative count 2937)
1.786% <= 1.607 milliseconds (cumulative count 17864)
8.018% <= 1.703 milliseconds (cumulative count 80180)
19.764% <= 1.807 milliseconds (cumulative count 197636)
28.945% <= 1.903 milliseconds (cumulative count 289448)
37.831% <= 2.007 milliseconds (cumulative count 378311)
40.142% <= 2.103 milliseconds (cumulative count 401425)
73.982% <= 3.103 milliseconds (cumulative count 739819)
99.887% <= 4.103 milliseconds (cumulative count 998870)
99.948% <= 5.103 milliseconds (cumulative count 999479)
99.980% <= 6.103 milliseconds (cumulative count 999802)
100.000% <= 7.103 milliseconds (cumulative count 999999)
100.000% <= 8.103 milliseconds (cumulative count 1000000)

Summary:
  throughput summary: 18953.39 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.561     0.688     2.839     3.391     3.615     7.239

alternate fix with createStringObjectFromLongLong

(7c4b3bb) 
~/projects/redis ✖ redis-benchmark -n 1000000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end' 0
====== eval redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 10000) end 0 ======                                                   
  1000000 requests completed in 54.38 seconds
  50 parallel clients
  120 bytes payload
  keep alive: 1
  host configuration "save": 3600 1 300 100 60 10000
  host configuration "appendonly": no
  multi-thread: no

Latency by percentile distribution:
0.000% <= 0.743 milliseconds (cumulative count 1)
50.000% <= 2.839 milliseconds (cumulative count 501328)
75.000% <= 3.183 milliseconds (cumulative count 755671)
87.500% <= 3.327 milliseconds (cumulative count 875085)
93.750% <= 3.431 milliseconds (cumulative count 937778)
96.875% <= 3.511 milliseconds (cumulative count 969895)
98.438% <= 3.575 milliseconds (cumulative count 985798)
99.219% <= 3.631 milliseconds (cumulative count 992500)
99.609% <= 3.711 milliseconds (cumulative count 996231)
99.805% <= 3.799 milliseconds (cumulative count 998153)
99.902% <= 3.935 milliseconds (cumulative count 999039)
99.951% <= 4.359 milliseconds (cumulative count 999516)
99.976% <= 4.767 milliseconds (cumulative count 999759)
99.988% <= 5.375 milliseconds (cumulative count 999878)
99.994% <= 5.911 milliseconds (cumulative count 999939)
99.997% <= 6.671 milliseconds (cumulative count 999970)
99.998% <= 8.479 milliseconds (cumulative count 999985)
99.999% <= 8.663 milliseconds (cumulative count 999993)
100.000% <= 8.775 milliseconds (cumulative count 999997)
100.000% <= 8.815 milliseconds (cumulative count 999999)
100.000% <= 9.247 milliseconds (cumulative count 1000000)
100.000% <= 9.247 milliseconds (cumulative count 1000000)

Cumulative distribution of latencies:
0.000% <= 0.103 milliseconds (cumulative count 0)
0.001% <= 0.807 milliseconds (cumulative count 5)
0.001% <= 0.903 milliseconds (cumulative count 8)
0.002% <= 1.007 milliseconds (cumulative count 20)
0.003% <= 1.103 milliseconds (cumulative count 33)
0.005% <= 1.207 milliseconds (cumulative count 49)
0.009% <= 1.303 milliseconds (cumulative count 85)
0.027% <= 1.407 milliseconds (cumulative count 270)
0.099% <= 1.503 milliseconds (cumulative count 990)
0.485% <= 1.607 milliseconds (cumulative count 4846)
2.707% <= 1.703 milliseconds (cumulative count 27066)
12.544% <= 1.807 milliseconds (cumulative count 125438)
22.747% <= 1.903 milliseconds (cumulative count 227465)
34.009% <= 2.007 milliseconds (cumulative count 340094)
36.985% <= 2.103 milliseconds (cumulative count 369853)
68.050% <= 3.103 milliseconds (cumulative count 680499)
99.933% <= 4.103 milliseconds (cumulative count 999335)
99.983% <= 5.103 milliseconds (cumulative count 999833)
99.996% <= 6.103 milliseconds (cumulative count 999961)
99.998% <= 7.103 milliseconds (cumulative count 999980)
99.998% <= 8.103 milliseconds (cumulative count 999981)
100.000% <= 9.103 milliseconds (cumulative count 999999)
100.000% <= 10.103 milliseconds (cumulative count 1000000)

Summary:
  throughput summary: 18388.44 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        2.628     0.736     2.839     3.463     3.607     9.247

@sundb (Collaborator) commented Mar 7, 2024

@mdouglass thanks, in that case we can keep this one.

@sundb (Collaborator) commented Mar 7, 2024

Tested locally:

1. When the number is between 0 and 10000, mdouglass@7c4b3bb is faster, because we cache numbers from 0 to 10000.

       ./src/redis-benchmark -n 500000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 500) end' 0

2. Starting from 10000, this PR is faster, but as the number gets larger, mdouglass@7c4b3bb catches up. It seems that when the number is small, conversion is faster than memory allocation.

       ./src/redis-benchmark -n 500000 eval 'redis.call("del", "key"); for i = 1, 100 do redis.call("hincrby", "key", "field", 9999999) end' 0

@oranagra (Member) left a comment
From what I remember from #10587 (@filipecosta90 correct me if I'm wrong), that change didn't conceptually change the output format. Even before that change, the code could return scientific notation; i.e. %g means either %f or %e, whichever is shorter.

Maybe I don't understand the root of the problem you're trying to solve.

src/script_lua.c (outdated review comments, resolved)
src/script_lua.c (outdated review comments, resolved)
@sundb (Collaborator) commented Mar 7, 2024

> From what I remember from #10587 (@filipecosta90 correct me if I'm wrong), that change didn't conceptually change the output format. Even before that change, the code could return scientific notation; i.e. %g means either %f or %e, whichever is shorter.

The behavior of snprintf and fpconv_dtoa is not consistent: when the double (holding an integer value) isn't too large, snprintf doesn't use scientific notation.

@oranagra (Member) commented Mar 7, 2024

Ohh, only now I notice that this case tries to feed the string into code that only supports integers. I.e. from Redis's perspective we're converting a double to a string, and then trying to convert that string to an integer. So I suppose the fix is right: since lua_Number is not explicitly a double or an integer, we need to make an effort to convert it to an integer when that's possible.

@mdouglass (Contributor, Author)

> ohh, only now i notice that this case tries to feed the string into code that only supports integers. i.e. from redis's perspective we're converting a double to string, and then try to convert that string to an integer. so i suppose the fix is right, i.e. since lua_Number is not explicitly a double or an integer, we need to make an effort to convert it to an integer if that's possible.

Yeah, I simplified the Lua for the repro. For the record, the original code was trying to do the equivalent of a saturating HINCRBY, so it did something closer to:

    local delta = tonumber(ARGV[1])
    local maxPoints = tonumber(ARGV[2])
    local newPoints = redis.call("HINCRBY", eventId, playerId, delta)
    if newPoints > maxPoints then
      redis.call("HSET", eventId, playerId, maxPoints)
    end

We had a test exercising the saturation logic that happened to set points = 100000000. In Redis 6.2/7.0 that worked fine, but when we upgraded to 7.2 this script started throwing ERR value is not an integer or out of range, which was traced back to 7.2 converting 100000000 to 1e8, a form that couldn't be converted back to an integer.

@zuiderkwast (Contributor) left a comment

Good fix.

Strictly speaking though, the arguments to a Redis command are always strings (RESP bulk). If you pass anything else to redis.call, you're relying on implicit conversion to string, so one could argue that it's the user's responsibility to convert the value to a string in the proper way (rounding, etc.) before calling redis.call.

When using string concatenation (..) you get another implicit string conversion in Lua, something a user just needs to be aware of:

127.0.0.1:6379> eval 'return 100000000000000 .. ""' 0
"1e+14"
127.0.0.1:6379> eval 'return 10000000000000 .. ""' 0
"10000000000000"

@oranagra oranagra merged commit 5fdaa53 into redis:unstable Mar 10, 2024
13 checks passed
@oranagra oranagra added the release-notes label (indication that this issue needs to be mentioned in the release notes) on Mar 10, 2024
@mdouglass (Contributor, Author)

Thank you all!

YaacovHazan pushed a commit to YaacovHazan/redis that referenced this pull request May 16, 2024
Since lua_Number is not explicitly an integer or a double, we need to make an effort to convert it to an integer when that's possible, since the string could later be used in a context that doesn't support scientific notation (e.g. 1e9 instead of 1000000000).

Since fpconv_dtoa converts numbers with the equivalent of `%f` or `%e`, whichever is shorter, this breaks if we try to pass a large integer to a command that takes an integer: we get an implicit conversion to string in Lua, and then the parsing in getLongLongFromObjectOrReply fails.

```
> eval "redis.call('hincrby', 'key', 'field', '1000000000')" 0
(nil)
> eval "redis.call('hincrby', 'key', 'field', tonumber('1000000000'))" 0
(error) ERR value is not an integer or out of range script: ac99c32e4daf7e300d593085b611de261954a946, on @user_script:1.
```

Switch to using ll2string if the number can be safely represented as a
long long.

The problem was introduced in redis#10587 (Redis 7.2).
closes redis#13113.

---------

Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Oran Agra <oran@redislabs.com>
(cherry picked from commit 5fdaa53)
YaacovHazan pushed a commit that referenced this pull request May 19, 2024
(cherry picked from commit 5fdaa53)