Remove 512 MB max value limit #757

antirez · 2012-11-07T09:28:15Z

Currently the Redis string type and the protocol itself are limited to max string length of 512 MB.

Actually the internal limit of an sds.c string, that is the dynamic string abstraction used inside Redis, is 2GB (as the string header represents length and remaining space as a signed 32 bit integers), but at some point we decided to reduce it to 512 MB since it's a large enough value anyway, and a 32 bit integer can address every bit inside a 512 MB value, that was handy for bit operations.

However while the limit so far was never a practical problem, there is a new interesting issue about the MIGRATE, DUMP and RESTORE commands that are used in order to serialize / unserialize or atomically transfer Redis keys. It is perfectly valid for a Redis key to have a serialization blob that exceeds the 512 MB limit, while it is not common as the serialization format is very compact (for instance a 1 million elements list of small strings is serialized into a 7 MB blob).

This means that with the current limit we can't MIGRATE very very large keys, and is surely a problem for certain applications (actually Redis cluster is not the best bet if you have many million elements keys as migration can be slow).

It is certainly possible to easily remove the 512 MB limit by using 64 bit integers in sds.c string headers and removing the limits inside the bulk parsing code. The problems with this approach are:

There is no longer a protection against client errors that will output a nonsensical bulk length and will fill the server with data. But after all, how to avoid this in general terms, clients can do a lot of silly things against a Redis server? So it's not a big issue.
Every string value will use additional 8 bytes.
We need to adapt the api of bit operations to accept 64 bit offsets, this is trivial since we already handle 64 bit values very well internally.

So long story short the real tradeoff is the additional memory usage. The plan is:

Experiment with large keys to see how large a key is supposed to be to serialize in > 512 MB.
Experiment with credible datasets to see how the 8 bit overhead affects normal applications in terms of total memory usage hint in percentage.
Decide if it's wise to take the limit or to remove it.

But in general, limits suck.

For instance there was an AOF issue, and there is currently an unsolved redis-cli issue that result form the sds.c 2 GB limit. My vote for now is to remove the limit and pay the 8 bytes overhead, but more testing will provide more data points to better evaluate.

The text was updated successfully, but these errors were encountered:

charsyam · 2012-11-07T09:54:35Z

I have a question. if the limitation is removed, maybe it can affect lower performance? if someone try get or set values more than 512MB?

moreaki · 2012-11-07T09:54:46Z

Sabbinirica

I believe in the Unix philosophy and giving the user enough rope to hang himself; so please, let's remove this limit, however make it visible enough for potential users and people upgrading to this feature. Otherwise you could reap all sorts of new funny bug reports from users hanging themselves. Make it backwards compatible for smooth upgrade paths of people with important large-scale implementations, so you'll have more early adopters.

I'm in a project where we're working on some sort of redis filesystem using FUSE and given some design considerations in the future, lifting this 512MB limit could prove to be a valuable asset. So, holler as soon as you have a first implementation ready to test (which probably means by next weekend :)).

antirez · 2012-11-07T10:04:58Z

There are definitely no performance concerns if we just remove the limit the vanilla way.

However:

We could use a double-header with a flag, so that small strings will not pay the additional memory cost. This adds complexity and CPU time, it's not worth it probably... but is an option.
We can switch to uint32_t and live with a 4GB limit that is 8 times larger than the current one, but my feeling is that 4GB will be still an issue for some corner case huge key.

Well we have a couple of options fortunately and there is no need to pick the best one right now, some time will help as usually :-)

josiahcarlson · 2012-11-09T16:20:18Z

I brought the up with Pieter at RedisConf (among other topics):

What if strings were pre-sharded if they were beyond a specific size? Instead of storing it all as a single allocation, break it up into blocks, and access the blocks via a hash table.
Benefits:

No need to clear the entire space of the string initialization, only un-initialized portions of a block being written to
No real need to worry about switching to full 32 bit/64 bit in SDS (because your block size can be arbitrarily sized from 4k to 256k, 1M, etc., without serious issue)
Blocks that haven't been written to take up no memory
Overhead can be tuned based on block size

Drawbacks:

Additional complexity
Small slowdown with bitop/bitcount and getrange (adjacent blocks are no longer adjacent in memory)
Small slowdown with read/write on sharded strings (it is an additional hash-table lookup)

neomantra · 2015-09-09T22:12:38Z

Given the merging of #2509, what remains to be done in order to remove this limit? Do you think it will be part of 3.2?

I can try to assist with this, given some guidance. So far, I've just been searching for "512". The code that is particularly biting me is here:

https://github.com/antirez/redis/blob/32f80e2f1bf42d0508f1114a9dddd91c4719eb8e/src/t_string.c#L37

I also see this line, does this need to be changed or do you want the networking/protocol aspects to be unchanged?

https://github.com/antirez/redis/blob/55cb64bbfb7d4a32fd1289137631f0f0de4cd2d1/src/networking.c#L1057

My specific use case is APPENDing to a key which may grow bigger than 1GB; however, my config does not use any persistence.

antirez · 2015-09-15T15:14:02Z

Hello @neomantra, indeed as a side effect of changes operated by @oranagra to sds.c this should be a solved problem AFAIK... However there is to check if perhaps there is some assertion to remove in the code that assumed the old limits. Then there is to understand if we want also to remove the string limits in the API as well, but maybe it is a sane safeguard to have them? Thanks.

antirez · 2015-09-15T15:15:03Z

Oh sorry I said something stupid: MIGRATE uses the API anyway, so we need to remove the limit from the API as well... Or to whitelist just the MIGRATE command at least.

antirez · 2015-09-15T15:15:44Z

P.s. btw yes this is going to be part of Redis 3.2 since it affects Cluster operations in a serious way.

oranagra · 2015-09-17T06:29:49Z

@antirez i just realized there's one more change that is needed for all of that to work.
rdbSaveLen should be fixed so that it can save strings larger than 4gb.
see REDIS_RDB_32BITLEN.

neomantra · 2015-09-29T13:28:03Z

I'm going to experiment with this a bit, with some of the simple changes I noted before and ignoring safety.

With respect to safeguards, I could imagine two different configurations, allowing operators to balance security/sanity/utility:

maximum size of a String value
maximum size of a message in the Redis Protocol

In my use case, I have very large string values, but they are attained with many small APPENDs, so I might keep a low max message size, but crank up the max String value size. But ultimately, I am in a highly trusted environment (e.g. localhost with auth), so I might not fiddle with it at all.

Or, with the whitelist approach, I'd like to be able to whitelist APPEND.

neomantra · 2015-09-29T14:09:36Z

I also note that some documentation would need to be changed, e.g.:
http://redis.io/commands/SETRANGE

Seems that one can really mess things up by invoking SETRANGE with a huge number. Perhaps that is a +1 for a whitelist approach?

miriaford · 2017-10-27T17:57:31Z

Just discovered this old post. Is increasing the limit still on the table? I'm currently using Redis as parameter server for machine learning. Neural networks, in particular, can have up to a few Gb of parameters. It feels unnatural to shard it manually, split to multiple keys, then retrieve each key at the client side and reconstruct the parameters (unless there's a better way to do it?)

neomantra · 2017-10-30T01:07:09Z

@miriaford I made these changes for myself but never felt like I could get to the level of testing and understanding of its implications to the Redis world to push forward with it.

I did however love making Redis modules and have done some sharding hacks and combo-ops — e.g append a string, set a hash, AND publish a value all in one atomic (with respect to other clients) command. A Redis module might be useful for what you are trying to do without needing to trouble the client much.

antirez · 2017-10-30T09:32:03Z

Hello @miriaford , @neomantra yes it's still on the table, especially now that SDS strings version 2 have no limits, but 64 bit as it should be. The original 512MB limit was in order to be able to address every bit of a 512 MB string with a 32 bit integer, so basically to convert things would just require, AFAIK, change the bitops operations internals to use long long or uint64_t explicitly, so that 32 bit systems can still work well with such commands and strings bigger than 512 MB. I'll need to look better what commands are affected by this, but in general there should not be bad side effects in removing the limit other than just the bit commands.

miriaford · 2017-10-30T15:36:52Z

@antirez that's great to know! Thanks for the update.
Do you think using Redis as a neural network parameter server is a good idea at all? My training is distributed over multiple nodes, and "Redis replication" seems to fit the parameter server model quite well. Unless you have other opinions, thanks!

compfy · 2018-01-04T19:33:54Z

Thank you very much for looking into increasing the limit. I have tried the changes in the post of @neomantra. Plus one more in server.h: "#define PROTO_MAX_QUERYBUF_LEN (4L*_1024*1024*1024)". I have Redis version 4.0.6 64 bit. With these changes, setting and getting the string works with string sizes up to 2GB. What else can be changed to make it work with strings larger than 2GB? We need just setting and getting strings and no other commands. Strings up to 4GB would be sufficient for us. Thanks.

oranagra · 2018-01-07T08:19:20Z

Please note that just increasing the default size of the query buffer is risky, it leaves you exposed to DoS attacks.
Here's a recent PR that makes these settings tunable, and also fixes some variable types to allow sizes bigger than 4GB.
#4568

neomantra · 2018-01-26T03:33:31Z

Also noting here that RM_StringTruncate limits newlen to 512MB.
https://github.com/antirez/redis/blob/unstable/src/module.c#L1586

Since sdsnew and sdsgrowzero can take larger values, shouldn't RM_StringTruncate be able to?

qcho · 2018-02-15T11:21:37Z

I was reading the data-types-intro and got into the line "Values can be strings (including binary data) of every kind, for instance you can store a jpeg image inside a value. A value can't be bigger than 512 MB."

That line brought me here, we are considering using a Redis cluster to store binary (temporal) information that could exceed this limit in a project with similar characteristics as @moreaki

Great to see all this proactive work here. Now that PR #4568 is merged, this change seems really close.

yossigo · 2021-04-27T19:39:38Z

Fixed by #4568

bwlewis mentioned this issue Apr 30, 2013

storing large objects bwlewis/rredis#6

Closed

oranagra mentioned this issue Apr 9, 2015

sds size classes - memory optimization #2509

Closed

antirez added the memory efficiency label Sep 15, 2015

sebix mentioned this issue Jul 19, 2016

Large reports don't fit into one redis message. certtools/intelmq#547

Closed

2 tasks

agis mentioned this issue May 14, 2019

parser does not accept messages larger than 65kilobytes secmask/go-redisproto#3

Closed

yossigo closed this as completed Apr 27, 2021

hayashier mentioned this issue Sep 20, 2021

Update the description about String 512MB limit redis/redis-doc#1653

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove 512 MB max value limit #757

Remove 512 MB max value limit #757

antirez commented Nov 7, 2012

charsyam commented Nov 7, 2012

moreaki commented Nov 7, 2012

antirez commented Nov 7, 2012

josiahcarlson commented Nov 9, 2012

neomantra commented Sep 9, 2015

antirez commented Sep 15, 2015

antirez commented Sep 15, 2015

antirez commented Sep 15, 2015

oranagra commented Sep 17, 2015

neomantra commented Sep 29, 2015

neomantra commented Sep 29, 2015

miriaford commented Oct 27, 2017

neomantra commented Oct 30, 2017

antirez commented Oct 30, 2017

miriaford commented Oct 30, 2017

compfy commented Jan 4, 2018

oranagra commented Jan 7, 2018 •

edited

Loading

neomantra commented Jan 26, 2018 •

edited

Loading

qcho commented Feb 15, 2018

yossigo commented Apr 27, 2021

Remove 512 MB max value limit #757

Remove 512 MB max value limit #757

Comments

antirez commented Nov 7, 2012

charsyam commented Nov 7, 2012

moreaki commented Nov 7, 2012

antirez commented Nov 7, 2012

josiahcarlson commented Nov 9, 2012

neomantra commented Sep 9, 2015

antirez commented Sep 15, 2015

antirez commented Sep 15, 2015

antirez commented Sep 15, 2015

oranagra commented Sep 17, 2015

neomantra commented Sep 29, 2015

neomantra commented Sep 29, 2015

miriaford commented Oct 27, 2017

neomantra commented Oct 30, 2017

antirez commented Oct 30, 2017

miriaford commented Oct 30, 2017

compfy commented Jan 4, 2018

oranagra commented Jan 7, 2018 • edited Loading

neomantra commented Jan 26, 2018 • edited Loading

qcho commented Feb 15, 2018

yossigo commented Apr 27, 2021

oranagra commented Jan 7, 2018 •

edited

Loading

neomantra commented Jan 26, 2018 •

edited

Loading