Increase readpartial() from 1KB to 16KB #880

Merged
merged 1 commit into redis:master on Nov 6, 2019
Conversation

@schanjr

schanjr commented Nov 6, 2019

I noticed the redis-rb gem reads 1KB per readpartial() call. This is very small in comparison to something like Mongrel, Unicorn, Puma, Passenger, and Net::HTTP, all of which use 16KB as their read length.

Wanted to get your feedback on this.
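
For reference, the pattern in question looks roughly like this (a sketch with illustrative names, not the actual redis-rb source); the change is only the length argument passed to readpartial:

CHUNK_SIZE = 16 * 1024 # previously 1 * 1024

def fill_buffer(socket, buffer)
  # readpartial blocks until at least one byte is available and
  # returns at most CHUNK_SIZE bytes per call
  buffer << socket.readpartial(CHUNK_SIZE)
end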

@byroot
Member

byroot commented Nov 6, 2019

in comparison to something like Mongrel, Unicorn, Puma, Passenger, and Net::HTTP,

All of which deal with HTTP requests / responses, so I'm not sure it's a fair comparison.

However, digging into hiredis shows that some experimentation was done and that 16k does indeed yield better performance: redis/hiredis@f7f022e

So 👍

@byroot byroot merged commit c7b69ba into redis:master Nov 6, 2019
1 check passed
continuous-integration/travis-ci/pr: The Travis CI build passed
@liaden

liaden commented Jul 1, 2020

Our production sidekiq instance saw memory usage grow drastically with this change:
[screenshot: sidekiq memory usage graph, 2020-07-01]

We downgraded on 5/20, and memory usage went back to the 8-10GB range, whereas it was at 11.5-14GB on 4.1.4. Would it be reasonable to expose this value as a configurable option with a default of 16KB? I can open a PR if it is.

@byroot
Member

byroot commented Jul 1, 2020

I'm not against making it configurable, but what concerns me more is how it could so significantly increase your memory usage.

This is just a buffer; it's supposed to be consumed quickly, so it shouldn't hold memory for that long.
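
For what it's worth, the fill/consume cycle looks roughly like this (a sketch of the pattern, not the actual Connection::Ruby code):

def read_line(socket, buffer)
  # fill: append up to 16KB per readpartial call until a full line is buffered
  until (idx = buffer.index("\r\n"))
    buffer << socket.readpartial(16 * 1024)
  end
  # consume: slice! removes the line from the buffer and returns it
  buffer.slice!(0, idx + 2)
end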

@liaden

liaden commented Jul 1, 2020

One of my thoughts is that pulling off 16.5kb of data over-allocates 15.5kb of memory (which is accumulating over time for some reason), hence the increase in memory. If you have any ideas on where I can investigate to see what is happening, that would help short-circuit things on my end a bit.

@casperisfine

casperisfine commented Jul 1, 2020

(answering from another computer)

which is accumulating over time for some reason

Well, if it's accumulating, the bug would be that it accumulates. I don't see how 1 chunk of 16K would be "accumulating" faster than 4 blocks of 4K.

I dumped the buffer on my machine to see:

setup:

Redis.new.set("foo", "bar" * 10_000)

After reading from the socket, @buffer is:

{"address":"0x7faf5603eb68", "type":"STRING", "class":"0x7faf520bb5b8", "bytesize":16384, "capacity":24575, "value":"$30000\r\nbarbarb........barbarbar", "encoding":"UTF-8", "memsize":24616, "flags":{"wb_protected":true}}

But just after it's been slice!'d:

{"address":"0x7faf5603eb68", "type":"STRING", "class":"0x7faf520bb5b8", "shared":true, "encoding":"UTF-8", "references":["0x7faf56037070"], "memsize":40, "flags":{"wb_protected":true}}

We see that Ruby is smart and simply points to a shared empty string. We can assume the allocated string buffer was freed (I'll double check, but I highly doubt such an obvious bug wouldn't have been reported and fixed yet).
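
If anyone wants to reproduce this kind of inspection, here is a minimal sketch using the objspace stdlib (the string below is just a stand-in for the internal @buffer):

require "objspace"

buffer = "$30000\r\n" + ("bar" * 10_000)   # stand-in for @buffer after a read
puts ObjectSpace.dump(buffer)              # JSON with bytesize, capacity, memsize

reply = buffer.slice!(0, buffer.bytesize)  # consume everything, as the parser does
puts ObjectSpace.dump(buffer)              # check whether the large allocation is still attached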

So here's a bunch of theories:

1 - Connection leaks

Maybe somehow you are leaking Connection::Ruby instances? I've seen reports and suspicions of such issues, but never described well enough to be able to reproduce and fix them.

A way to verify this would be to get a shell into a ruby process using rbtrace, and then check ObjectSpace.each_object(Redis::Connection::Ruby).size to see if it's too high. Or, if rbtrace is complicated, you can write a sidekiq middleware to log that number somewhere.
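
Something along these lines should do (a sketch, assuming the classic Sidekiq server middleware API; the class name is made up):

class RedisConnectionCounter
  def call(worker, job, queue)
    yield
  ensure
    # count live connection objects after each job; a steadily growing number points to a leak
    count = ObjectSpace.each_object(Redis::Connection::Ruby).count
    Sidekiq.logger.info("Redis::Connection::Ruby instances: #{count}")
  end
end

Sidekiq.configure_server do |config|
  config.server_middleware { |chain| chain.add RedisConnectionCounter }
end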

If you can prove this is happening, then I'd be extremely interested in debugging this forward with you.

2 - Memory fragmentation

If you are using the default memory allocator, untuned, and sidekiq with many threads, then you might very well experience a lot of memory fragmentation.

  • What version of Ruby?
  • What allocator and what settings?
  • Ever tried jemalloc / tcmalloc? They can help a lot with fragmentation. Alternatively, configuring MALLOC_ARENA_MAX=2 can help a lot as well. (A rough way to gauge fragmentation is sketched below.)
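
A rough way to gauge fragmentation (an assumption-heavy sketch: Linux only; it compares the memory Ruby knows about to the process RSS, where a large gap suggests fragmentation or allocator overhead):

require "objspace"

ruby_kb = ObjectSpace.memsize_of_all / 1024                          # KB of object memory Ruby tracks
rss_kb  = File.read("/proc/self/status")[/VmRSS:\s+(\d+)/, 1].to_i   # KB of resident memory
puts "ruby-tracked: #{ruby_kb} KB, RSS: #{rss_kb} KB"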

Other questions

  • These shark fins are quite concerning. What is it graphing exactly? I presume RSS?
  • If you deploy a sidekiq worker that only processes tons of empty jobs, does it leak the same way?
  • Same thing, but with a simple job that does a simple get/set in Redis? (Rough sketches of both workers below.)
  • Are you seeing similar growth with your web processes?
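
Rough sketches of the two test workers mentioned above (names are made up):

class NoopJob
  include Sidekiq::Worker
  def perform; end                 # does nothing: isolates sidekiq + redis-rb overhead
end

class RedisGetSetJob
  include Sidekiq::Worker
  def perform
    redis = Redis.new
    redis.set("probe", "x")        # one trivial set/get per job
    redis.get("probe")
  end
end

# e.g. enqueue a large batch of each and watch RSS:
# 100_000.times { NoopJob.perform_async }
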
@liaden

liaden commented Jul 1, 2020

Thanks for the thorough answer. I missed today's release so it will be tomorrow before I can try out rbtrace specifically in production.

Answers

  1. Ruby 2.7.1 with the standard allocator.
  2. Attempted to switch to jemalloc via LD_PRELOAD as well as using the fullstaq.io container, then ran a heavy Apache Bench load with 5 concurrent requests against 5 puma processes and saw no memory reduction, which is sad since I've previously used jemalloc to great effect.
  3. MALLOC_ARENA_MAX=2 was also tried by another engineer, with no improvement.
  4. Yes, RSS.
  5. Will try out a bit this afternoon.
  6. Ditto.
  7. Yes, and maybe the same proportional value:

[screenshot: web process memory usage graph, 2020-07-01]

@byroot
Member

byroot commented Jul 1, 2020

That last graph is much less concerning. The RSS seems to stabilize around 1GB, which makes much more sense and is quite typical. The average app has a bunch of code and data that isn't properly eager-loaded, so there's some memory growth over the first few requests. Plus it takes some time to reach a request that allocates quite a lot, and after that the memory isn't reclaimed.

If anything, this graph makes the previous one much more concerning. Your sidekiq processes are using several times more memory than your web processes, which is not normal (unless they use many more threads), and they don't seem to stabilize. You either have a nasty leak in your sidekiq processes or excessive fragmentation.
