Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible invalid start and end values sent to Redis for ZRANGE #213

Closed
mt-clearhaus opened this issue Aug 28, 2023 · 10 comments
Closed

Possible invalid start and end values sent to Redis for ZRANGE #213

mt-clearhaus opened this issue Aug 28, 2023 · 10 comments
Labels

Comments

@mt-clearhaus
Copy link

mt-clearhaus commented Aug 28, 2023

Describe the bug
Both Redis 6.2.6 (via Docker image) and AWS ElastiCache 6.0.5 complain when the strings Infinity and -Infinity are used as start or stop value for ZRANGE. For Stoplight 4.1.0 this is the case in Stoplight::DataStore::Redis#query_failures:

transaction.zrange(failures_key(light), Float::INFINITY, window_start, rev: true, by_score: true)

If no custom window size is set for a light, the default Float::INFINITY will be used, and the start and stop values for ZRANGE will become the strings Infinity and -Infinity, respectively. This is because Time.now.to_i - Float::INFINITY equals -Infinity, and because RedisClient's command builder calls #to_s on Floats:

https://github.com/redis-rb/redis-client/blob/cfbe7feeebe696ee6f5c9a0a27a99bd792b67430/lib/redis_client/command_builder.rb#L34-L35

The error I see is this:

ERR value is not an integer or out of range

I'm using Ruby 3.1.2p20 on Debian Bookworm.

Redis MONITOR gives me this (the light's name is "primary"):

1693217647.398773 [0 10.0.35.172:45642] "multi"
1693217647.398791 [0 10.0.35.172:45642] "zrange" "stoplight:v4:failures:primary" "Infinity" "-Infinity" "BYSCORE" "REV"
1693217647.398800 [0 10.0.35.172:45642] "hget" "stoplight:v4:states" "primary"
1693217647.398807 [0 10.0.35.172:45642] "exec"

To Reproduce

irb(main):020:0> redis = Redis.new(...)
irb(main):021:0> redis.zrange('stoplight:v4:failures:primary', 'Infinity', '-Infinity')
/var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client/connection_mixin.rb:35:in `call': ERR value is not an integer or out of range (Redis::CommandError)
        from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:235:in `block (2 levels) in call_v'
        from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client/middlewares.rb:16:in `call'
        from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:234:in `block in call_v'
        from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:639:in `ensure_connected'
        from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:233:in `call_v'    
        from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis/client.rb:90:in `call_v'             
        from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis.rb:167:in `block in send_command'
        from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis.rb:166:in `synchronize'
        from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis.rb:166:in `send_command'
        from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis/commands/sorted_sets.rb:394:in `zrange'
        from (irb):21:in `<main>'     
	from /var/lib/gems/3.1.0/gems/padrino-core-0.15.3/lib/padrino-core/cli/base.rb:43:in `console'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor/command.rb:27:in `run'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor/invocation.rb:127:in `invoke_command'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor.rb:392:in `dispatch'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor/base.rb:485:in `start'
	... 3 levels...
/var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client/connection_mixin.rb:35:in `call': ERR value is not an integer or out of range (RedisClient::CommandError)
	from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:235:in `block (2 levels) in call_v'
	from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client/middlewares.rb:16:in `call'
	from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:234:in `block in call_v'
	from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:639:in `ensure_connected'
	from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:233:in `call_v'
	from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis/client.rb:90:in `call_v'
	from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis.rb:167:in `block in send_command'
	from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis.rb:166:in `synchronize'
	from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis.rb:166:in `send_command'
	from /var/lib/gems/3.1.0/gems/redis-5.0.7/lib/redis/commands/sorted_sets.rb:394:in `zrange'
	from (irb):21:in `<main>'
	from /var/lib/gems/3.1.0/gems/padrino-core-0.15.3/lib/padrino-core/cli/base.rb:43:in `console'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor/command.rb:27:in `run'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor/invocation.rb:127:in `invoke_command'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor.rb:392:in `dispatch'
	from /var/lib/gems/3.1.0/gems/thor-1.2.2/lib/thor/base.rb:485:in `start'
	... 3 levels...

Expected behavior
I hope Float::INFINITY can be replaced with something else, however, I not sure what the right solution is.

Screenshots
N/A

Environment (please complete the following information):

  • OS: Debian Bookworm (via Docker image debian:bookworm-slim)
  • Ruby 3.1.2p20 (installed via Debian package)
  • Version 4.1.0

Additional context
N/A

@mt-clearhaus
Copy link
Author

FWIW, I currently use this workaround (monkey patch):

# Monkey patch two private methods in Stoplight::DataStore::Redis such that
# Stoplight does not end up sending the strings "Infinity" and "-Infinity" as
# start/min and stop/max values in Redis ZREMRANGEBYSCORE and ZRANGE commands.
# See:
# * https://github.com/bolshakov/stoplight/issues/213
# * https://redis.io/commands/zremrangebyscore/
# * https://redis.io/commands/zrange/
module Stoplight
  module DataStore
    class Redis
      private

      def remove_outdated_failures(light, time, transaction: @redis)
        failures_key = failures_key(light)

        # Remove all errors happened before the window start
        window_start = time.to_i - light.window_size
        window_start = 0 if window_start < 0
        transaction.zremrangebyscore(failures_key, 0, window_start)
        # Keep at most +light.threshold+ number of errors
        transaction.zremrangebyrank(failures_key, 0, -light.threshold - 1)
      end

      def query_failures(light, transaction: @redis)
        window_start = Time.now.to_i - light.window_size
        window_start = 0 if window_start < 0

        transaction.zrange(failures_key(light), 10_000_000_000, window_start, rev: true, by_score: true)
      end
    end
  end
end

MONITOR before:

1693231301.688236 [0 192.168.48.3:57864] "MULTI"
1693231301.688257 [0 192.168.48.3:57864] "zrange" "stoplight:v4:failures:primary" "Infinity" "-Infinity" "BYSCORE" "REV"
1693231301.688262 [0 192.168.48.3:57864] "hget" "stoplight:v4:states" "primary"
1693231301.688264 [0 192.168.48.3:57864] "EXEC"
1693231305.713619 [0 192.168.48.3:57864] "MULTI"
1693231305.713665 [0 192.168.48.3:57864] "zadd" "stoplight:v4:failures:primary" "1693231305" "{\"error\":{\"class\":\"Net::TCPClient::ConnectionFailure\",\"message\":\"[...]\"},\"time\":\"2023-08-28T14:01:45.712843713+00:00\"}"
1693231305.713734 [0 192.168.48.3:57864] "zremrangebyscore" "stoplight:v4:failures:primary" "0" "-Infinity"
1693231305.713747 [0 192.168.48.3:57864] "zremrangebyrank" "stoplight:v4:failures:primary" "0" "-6"
1693231305.713758 [0 192.168.48.3:57864] "zcard" "stoplight:v4:failures:primary"
1693231305.713766 [0 192.168.48.3:57864] "EXEC"

MONITOR after:

1693231214.596847 [0 192.168.48.3:57226] "MULTI"
1693231214.596867 [0 192.168.48.3:57226] "zrange" "stoplight:v4:failures:primary" "10000000000" "0" "BYSCORE" "REV"
1693231214.596872 [0 192.168.48.3:57226] "hget" "stoplight:v4:states" "primary"
1693231214.596875 [0 192.168.48.3:57226] "EXEC"
1693231218.622114 [0 192.168.48.3:57226] "MULTI"
1693231218.622158 [0 192.168.48.3:57226] "zadd" "stoplight:v4:failures:primary" "1693231218" "{\"error\":{\"class\":\"Net::TCPClient::ConnectionFailure\",\"message\":\"[...]\"},\"time\":\"2023-08-28T14:00:18.621338758+00:00\"}"
1693231218.622223 [0 192.168.48.3:57226] "zremrangebyscore" "stoplight:v4:failures:primary" "0" "0"
1693231218.622237 [0 192.168.48.3:57226] "zremrangebyrank" "stoplight:v4:failures:primary" "0" "-6"
1693231218.622246 [0 192.168.48.3:57226] "zcard" "stoplight:v4:failures:primary"
1693231218.622254 [0 192.168.48.3:57226] "EXEC"

@mt-clearhaus
Copy link
Author

mt-clearhaus commented Aug 28, 2023

It appears the workaround isn't sufficient. I'm seeing ERR syntax error now. Just FYI.

Interestingly, I only see it towards AWS ElastiCache, not towards ordinary Redis.

Towards ElastiCache:

irb(main):033:0> redis.zrange('stoplight:v4:failures:primary', 10_000_000_000, 0, byscore: true, rev: true)
/var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client/connection_mixin.rb:35:in `call': ERR syntax error (Redis::CommandError)
	from /var/lib/gems/3.1.0/gems/redis-client-0.16.0/lib/redis_client.rb:235:in `block (2 levels) in call_v'
[...]

Towards ordinary Redis:

irb(main):012:0> pp redis.zrange('stoplight:v4:failures:primary', 10_000_000_000, 0, byscore: true, rev: true)
["{\"error\":{[...]},\"time\":\"2023-08-28T14:31:51.948117847+00:00\"}",
 "{\"error\":{[...]},\"time\":\"2023-08-28T14:01:45.712843713+00:00\"}",
 "{\"error\":{[...]},\"time\":\"2023-08-28T14:00:18.621338758+00:00\"}"]

@bolshakov
Copy link
Owner

bolshakov commented Aug 28, 2023

Hey @mt-clearhaus 👋 Thank you for reporting this!

irb(main):020:0> redis = Redis.new(...)
irb(main):021:0> redis.zrange('stoplight:v4:failures:primary', 'Infinity', '-Infinity')

The snippet you provided as reproduction steps is not equivalent to what Stoplight sends to Redis. Stoplight searches by score, while you're searching by rank. Sets' rank indeed should be an integer, while the score value is a float (see)

This is the correct ruby command:

redis.zrange('stoplight:v4:failures:primary', 'Infinity', '-Infinity', by_score: true, rev: true)

it's translated to Redis command:

zrange stoplight:v4:failures:test Infinity -Infinity BYSCORE REV

The command above works with Redis 6.2.13 for sure (we run tests against it) and Redis 7.2.0.

May I ask you to test if these two commands work on your version of Redis with and without using the ElastiCache?:

zrange stoplight:v4:failures:test Infinity -Infinity BYSCORE REV
zrange stoplight:v4:failures:test Inf -Inf BYSCORE REV

@bolshakov
Copy link
Owner

Actually, we can safely replace -Infinity with zero and +Infinity with large enough number because the score represent a timestamp in this case.

@bolshakov
Copy link
Owner

I tested it here using Redis 6.2.6 and the specs pass #214

@mt-clearhaus
Copy link
Author

Hi @bolshakov,

Thank you very much for creating Stoplight and for getting back to me.

It turns out my issues are only present for AWS ElastiCache. The engine version is only 6.0.5.

I have primed both my local Redis 6.2.6 and the AWS ElastiCache with two failures in the primary light.

Things work flawlessly with the local Redis 6.2.6:

irb(main):001:0> redis = Redis.new([...])
=> #<Redis client v5.0.7 for redis://redis:6379/0>
irb(main):002:0> pp redis.zrange('stoplight:v4:failures:primary', 'Infinity', '-Infinity', by_score: true, rev: true)
["{\"error\":{[...]},\"time\":\"2023-08-29T07:20:13.185982997+00:00\"}",
 "{\"error\":{[...]},\"time\":\"2023-08-29T07:16:51.963196041+00:00\"}"]
irb(main):003:0> pp redis.zrange('stoplight:v4:failures:primary', 'Inf', '-Inf', by_score: true, rev: true)
["{\"error\":{[...]},\"time\":\"2023-08-29T07:20:13.185982997+00:00\"}",
 "{\"error\":{[...]},\"time\":\"2023-08-29T07:16:51.963196041+00:00\"}"]
127.0.0.1:6379> zrange stoplight:v4:failures:primary Infinity -Infinity BYSCORE REV
1) "{\"error\":{[...]},\"time\":\"2023-08-29T07:20:13.185982997+00:00\"}"
2) "{\"error\":{[...]},\"time\":\"2023-08-29T07:16:51.963196041+00:00\"}"

127.0.0.1:6379> zrange stoplight:v4:failures:primary Inf -Inf BYSCORE REV
1) "{\"error\":{[...]},\"time\":\"2023-08-29T07:20:13.185982997+00:00\"}"
2) "{\"error\":{[...]},\"time\":\"2023-08-29T07:16:51.963196041+00:00\"}"

It's another story with AWS ElastiCache:

irb(main):001:0> redis = Redis.new([...])
=> #<Redis client v5.0.7 for redis://[...].cache.amazonaws.com:6379/0>
irb(main):002:0> pp redis.zrange('stoplight:v4:failures:primary', 0, 100)
["{\"error\":{[...]},\"time\":\"2023-08-29T07:37:00.265572267+00:00\"}",
 "{\"error\":{[...]},\"time\":\"2023-08-29T07:38:00.749749018+00:00\"}"]

irb(main):003:0> redis.zrange('stoplight:v4:failures:primary', 'Infinity', '-Infinity', by_score: true, rev: true)
[...]/redis-client-0.16.0/lib/redis_client/connection_mixin.rb:35:in `call': ERR value is not an integer or out of range (Redis::CommandError)
irb(main):004:0> redis.zrange('stoplight:v4:failures:primary', 'Inf', '-Inf', by_score: true, rev: true)
[...]/redis-client-0.16.0/lib/redis_client/connection_mixin.rb:35:in `call': ERR value is not an integer or out of range (Redis::CommandError)
[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary 0 100
1) "{\"error\":{[...]},\"time\":\"2023-08-29T07:37:00.265572267+00:00\"}"
2) "{\"error\":{[...]},\"time\":\"2023-08-29T07:38:00.749749018+00:00\"}"

[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary Infinity -Infinity BYSCORE REV
(error) ERR value is not an integer or out of range
[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary Inf -Inf BYSCORE REV
(error) ERR value is not an integer or out of range
[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary 10000000000 0 BYSCORE REV
(error) ERR syntax error

I will try to upgrade the engine version in AWS ElastiCache.

@mt-clearhaus
Copy link
Author

I think it is safe to close this issue as it seems Stoplight is not to blame 🙂

@mt-clearhaus
Copy link
Author

AWS ElastiCache with engine version 6.2.6 has no issues with the queries 👍 🙌

[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary 0 100
1) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:28.659990873+00:00\"}"
2) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:35.481946177+00:00\"}"
[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary Infinity -Infinity BYSCORE REV
1) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:35.481946177+00:00\"}"
2) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:28.659990873+00:00\"}"
[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary Inf -Inf BYSCORE REV
1) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:35.481946177+00:00\"}"
2) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:28.659990873+00:00\"}"
[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary 10000000000 0 BYSCORE REV
1) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:35.481946177+00:00\"}"
2) "{\"error\":{[...]},\"time\":\"2023-08-29T08:35:28.659990873+00:00\"}"

@bolshakov
Copy link
Owner

Glad you managed to pinpoint the issue!

[...].cache.amazonaws.com:6379> zrange stoplight:v4:failures:primary 10000000000 0 BYSCORE REV
(error) ERR syntax error

Right! In Redis 6.2.0, the ZRANGEBYSCORE was deprecated since they introduced the BYSCORE argument for ZRANGE command. It seems this version of AWS ElastiCache does not support the BYSCORE argument yet.

Please, keep me updated if you managed to run it with ElastiCache

@mt-clearhaus
Copy link
Author

mt-clearhaus commented Aug 29, 2023

The syntax error quoted above was with ElastiCache engine version 6.0.5, not 6.2.6. I believe the issue is that the integer 10000000000 is too large. As shown in #213 (comment) it seems 6.2.6 supports BYSCORE with ZRANGE.

As 6.0.5 doesn't support integers (or scores in a sorted set? 🤔) of that size, it wouldn't be possible to workaround (-)Inf(inity) using a large integer in the case with Stoplight where the score is seconds since Epoch. So, AFAICT, I couldn't avoid upgrading my ElastiCache engine version 🙂

Edit: After some thought: You might be right that the syntax error was caused by the BYSCORE word and not the integer size. In any case it seems the word is accepted by ElastiCache with engine version 6.2.6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants