Main thread stuck when loading very large keys #562

Closed
yoav-steinberg opened this Issue Jun 20, 2012 · 7 comments

Comments

Projects
None yet
2 participants
@yoav-steinberg
Contributor

yoav-steinberg commented Jun 20, 2012

Similar to #539 the main thread is stuck and redis becomes unresponsive during replication. In this case this happens because rdbLoad processes client events once every 1000 loaded keys but for rdb's with a small number of very large keys this doesn't work well. For example an rdb file with a single list that contains 10 million 1k strings will block for a very long time while reading this list.

I suggest changing the way we periodically process client events by processing them every X bytes loaded from the rdb and not every N (1000) keys.

Here's a patch that seemed to fix the issue for me:
https://github.com/yoav-steinberg/redis_garantia/commit/4d7f4ef49fa29562ec1e4f55ecf0623a4c8715c4

@antirez

This comment has been minimized.

Show comment Hide comment
@antirez

antirez Sep 27, 2012

Owner

2.6 -> 2.8

Owner

antirez commented Sep 27, 2012

2.6 -> 2.8

@antirez

This comment has been minimized.

Show comment Hide comment
@antirez

antirez Jul 10, 2013

Owner

Hello, I understand you use the current patch in product, is it right? Everything stable so far? Thank you.

Owner

antirez commented Jul 10, 2013

Hello, I understand you use the current patch in product, is it right? Everything stable so far? Thank you.

@yoav-steinberg

This comment has been minimized.

Show comment Hide comment
@yoav-steinberg

yoav-steinberg Jul 10, 2013

Contributor

This patch was modified to accommodate the "rio" layer added in 2.6 (I think it was originally written for 2.4).
The latest version is used successfully in production. You can find it here: https://github.com/GarantiaData/redis/commit/3aab06bcc7460b46c96fbb8d3b704ab1227c6be4

Contributor

yoav-steinberg commented Jul 10, 2013

This patch was modified to accommodate the "rio" layer added in 2.6 (I think it was originally written for 2.4).
The latest version is used successfully in production. You can find it here: https://github.com/GarantiaData/redis/commit/3aab06bcc7460b46c96fbb8d3b704ab1227c6be4

@antirez

This comment has been minimized.

Show comment Hide comment
@antirez

antirez Jul 15, 2013

Owner

Hello @yoav-steinberg, the patch seems good to me, however it must be noted that it introduces an important difference compared to the past, that is, when re-enter the event loop in the current implementation (before applying your patch) we may have a non completely loaded dataset, that is, just a percentage of keys could be present, not all, but keys are well-formed and accessible.

With the new implementation we may have half-loaded keys, and maybe the representation of keys may not even be ok to access without triggering a crash.

Now given that currently the commands allowed to run in loading state should never touch the dataset, this is fine, but care must be used in the future with the loading flag of the command table.

I think I'll merge ASAP, just doing the last code review right now.

Owner

antirez commented Jul 15, 2013

Hello @yoav-steinberg, the patch seems good to me, however it must be noted that it introduces an important difference compared to the past, that is, when re-enter the event loop in the current implementation (before applying your patch) we may have a non completely loaded dataset, that is, just a percentage of keys could be present, not all, but keys are well-formed and accessible.

With the new implementation we may have half-loaded keys, and maybe the representation of keys may not even be ok to access without triggering a crash.

Now given that currently the commands allowed to run in loading state should never touch the dataset, this is fine, but care must be used in the future with the loading flag of the command table.

I think I'll merge ASAP, just doing the last code review right now.

@yoav-steinberg

This comment has been minimized.

Show comment Hide comment
@yoav-steinberg

yoav-steinberg Jul 15, 2013

Contributor

Isn't the key/value added "atomically" in the addKey(db,key,val) call? Assuming that's the case I don't see how a partial key can exist because addKey doesn't perform any disk access and therefore won't trigger the process events..
What am I missing?

Contributor

yoav-steinberg commented Jul 15, 2013

Isn't the key/value added "atomically" in the addKey(db,key,val) call? Assuming that's the case I don't see how a partial key can exist because addKey doesn't perform any disk access and therefore won't trigger the process events..
What am I missing?

@antirez

This comment has been minimized.

Show comment Hide comment
@antirez

antirez Jul 15, 2013

Owner

That's a good point indeed, we create the whole value, and then add it to the key space, so actually the same property should hold true after we apply your patch. Thanks!

Owner

antirez commented Jul 15, 2013

That's a good point indeed, we create the whole value, and then add it to the key space, so actually the same property should hold true after we apply your patch. Thanks!

@antirez

This comment has been minimized.

Show comment Hide comment
@antirez

antirez Jul 16, 2013

Owner

Merged, thanks! Closing.

Owner

antirez commented Jul 16, 2013

Merged, thanks! Closing.

@antirez antirez closed this Jul 16, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment