Lobsters is experiencing a memory leak with YJIT #556
Comments
Also he said he was using Ruby version 3.3.1 |
If we could somehow repro this offline it would be really helpful. Potentially with the code of the lobsters benchmark for example. Otherwise any kind of memory profiling to give us a hint where the leak comes from would be helpful. @tenderworks can you tag the person who reported the problem on this issue? |
One thing we didn't check: it could be that there is no memory leak, just that YJIT keeps compiling stuff and never plateaus. Is there a good way to check whether or not code compilation is plateauing? Or maybe we need to set the maximum memory usage for YJIT on the lobsters deployment. @pushcx What (if any) configuration options have you set for YJIT? Also, how much memory does the machine have? How many puma processes are you running? And how large are the processes when they get OOM killed? Thanks! |
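A minimal sketch of one way to check for a plateau, assuming counters like :compiled_iseq_count and :code_region_size are exposed by RubyVM::YJIT.runtime_stats on non-stats builds (the exact key set varies by Ruby version): sample the stats twice, far apart, and compare.

```ruby
# Sample YJIT stats twice and report the delta; if compiled code keeps growing
# between samples taken hours apart, compilation has not plateaued.
before = RubyVM::YJIT.runtime_stats
sleep 60 * 60 # one hour between samples; adjust as needed
after = RubyVM::YJIT.runtime_stats

%i[compiled_iseq_count code_region_size].each do |key|
  puts "#{key}: #{before[key]} -> #{after[key]}"
end
```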
Howdy! The code and config are available and, yep, I recently updated to 3.3.1. I enabled YJIT with 20 puma processes, no config options. The box has 7.8GiB of RAM and no swap. I know with YJIT disabled the processes plateau at 160MB but don't know how large the processes were when they got oomkilled. This box only runs nginx, puma, and cron jobs (db is on the other vps) and htop confirms nothing else is using a significant amount of RAM, so I'd guess they were around 1/20th of RAM. I don't mind re-enabling YJIT and running it until it crashes again, but as a hobby project the hosting setup is pretty immature and has rounds-to-zero metrics/monitoring, so you'll have to tell me what's interesting to you to gather it. |
@pushcx as a way to determine whether or not this is a memory leak, would you mind enabling YJIT but putting a cap on the executable memory size? Basically, change your Ruby opt to something like this. That would limit the executable memory size to 8 megabytes. If memory plateaus, then we know there is no leak and the default executable memory size might be too big for your machine. Also, one thing I just thought of is that if you're booting the process with. After we've debugged this excess memory usage, we should investigate adding this initializer to Lobsters. |
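The exact snippet isn't preserved above, but as a rough sketch, assuming the documented --yjit-exec-mem-size=8 option was passed (e.g. via RUBYOPT="--yjit --yjit-exec-mem-size=8"), something like this could report how close the code region is to the 8 MiB cap. The :code_region_size key is an assumption about what runtime_stats exposes on release builds:

```ruby
# Report YJIT code region usage against an assumed 8 MiB cap.
if defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?
  cap_bytes = 8 * 1024 * 1024
  used = RubyVM::YJIT.runtime_stats.fetch(:code_region_size, 0).to_f
  puts format("YJIT code region: %.1f%% of the 8 MiB cap", 100 * used / cap_bytes)
end
```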
lobsters/lobsters-ansible@988c3ef Added and deployed. I'll let you know if we crash in the next day or two. (For eagle-eyed diff readers - before this commit |
I haven’t been able to get a clean result out of this yet. I’ve seen unexpected restarts but an attacker has been running a vulnerability scanner against the site so the restarts have rolled out of the log, which is overwhelmed with Rack backtraces about malformed requests. I’ve banned some IPs and we’ll see how it goes. |
@pushcx thanks for the feedback, and good luck. I look forward to results! (when you have them of course 😄 ) |
We're still getting oomkilled, caught one about 15 minutes ago. |
We caught another oomkill today (Shopify/ruby#556) and the service log had a lot of messages like "Terminating timed out worker (Worker 12 failed to boot within 63 seconds" that kept us down for a few minutes. That means this code is officially too clever; I'll take a reliable slower boot over a self-sabotaged one that doesn't complete.
@pushcx can you tell us how long it took approximately to get oomkilled, and how high the limit is set? Ideally we'd need some kind of offline repro to troubleshoot this issue. We do have an offline benchmark of lobsters. I'll try running it in a loop to see what happens, but it may not be up to date with the latest lobsters code. |
I tried running our lobsters benchmark in a loop for 24 hours with Ruby 3.3.3 while monitoring memory usage, and could not see any clear memory usage growth trend. That being said, this benchmark is not going to be representative of real web traffic. @pushcx we could really use your help in terms of diagnosing some kind of local reproduction. Also, are you able to track YJIT runtime stats counters over time? There is a dictionary of counters available through |
Looks like they're happening about 16 hours apart, same as before adding Does it matter which puma worker I access it through? I could add a |
It would be best if we could watch the same worker evolve over time, but if not, any data is better than no data. |
Also, thanks for your patience and help in diagnosing this issue 🙏 |
Can you think of a good way to pin it to a worker? Or get a worker to log it to a standalone file? I can imagine something heinous like including worker number in the endpoint and polling it until I get the right one, but it seems real likely there's a reasonable way to do this I'm ignorant of. I'm really happy to help debug and give back to the projects Lobsters is built on. I just hope this isn't some weird "clever" code I wrote prompting a snipe hunt. |
@casperisfine may have better advice here than I would. AFAIK the strategy we've used before is to enable periodic logging for only some fraction of the workers. For logging, I would periodically append the serialized dict/string to a file.
If you are getting OOM with exec mem size 8 then it does look like there's some problem, and if YJIT has some strange memory leak issue, we'd really like to fix it. Just sorry it's taking so much back and forth to track down the problem. Can you provide some details as to how strict your memory limits are for OOM kill? Like, do you know how much memory is used without YJIT vs how much memory usage triggers the OOM kill? |
I don't think there's a way to pin a worker with Puma. What you could do however is just have a single worker dump its stats on a regular basis to a file, e.g.:

```ruby
# puma.rb
on_worker_boot do |worker_index|
  if worker_index == 0
    Thread.new do
      loop do
        sleep 30
        # File.atomic_write comes from ActiveSupport
        File.atomic_write("/tmp/yjit-stats.txt") do |f|
          f.write(RubyVM::YJIT.runtime_stats.inspect)
        end
      end
    end
  end
end
```

Alternatively, you can dump it every X requests (e.g. in a middleware that you enable from that hook), and instead of always overwriting the stats file, you can just send it to the general logs, etc. Caution: my production experience with Puma is limited, so don't blindly copy this snippet and make sure it works :) |
Thanks, @casperisfine. I had a little hassle because ActiveSupport was available in dev but not in prod (!?) but I got something working. Here's the first couple lines, I'll post a longer log to gist tomorrow.
|
OK, ran to an oomkill. Note that the last two lines of this log are the replacement worker 0 after the kill. It seemed unlikely there'd be anything too interesting, but why not. |
Thanks for the stats. Here's what I see. Good things:
Weird / potentially bad things:
|
But I don't see |
Ah, it was disabled to support puma phased restarts: lobsters/lobsters@9c80ba7. I'd suggest trying with |
Do you think that enabling it will improve diagnosis of this issue? I'm asking because you clearly know more than I do on this, but I know I'm easily misled by "hey look, low-hanging fruit" tweaks that distract from the problem at hand. I've been reluctant to deploy code that might create red herrings, including adding

And I guess I'm a little reluctant to enable

(It's sort of ironic how Lobsters hits these inconveniences because it is small enough to run on a single VPS but efficiently serves enough traffic that a number of visitors notice the downtime.) |
I'll wait for the others to weigh in. What's weird is that insns are growing but not live iseqs, so my suspicion is that something is compiling an unbounded amount of code, but not holding onto it. So I suspect there is a leak in the VM but not in YJIT specifically, and I'm curious whether YJIT is actually the cause of the leak, or if it just makes the OOM happen much sooner because of the higher memory usage baseline. As for preload app, on paper it should impact what we are tracking, but it can save quite a bit of memory, so it would help, but not fix it. As for your issues with preload app, I'd be very interested to dive into them more in the future because I don't quite understand them, but I don't want to distract this thread and I'm on my way out for the weekend. |
@casperisfine It does look like YJIT execution stops pretty early on and YJIT stops compiling new code. Ideally we would need memory profiling of some sort, or maybe we could look at GC stats? One of the things that's different with YJIT enabled is that we do pin some GC'd object references in the machine code. We could be keeping more GC roots alive. Though it's not fully clear how that could result in unbounded memory growth over time after compilation is done? @XrXr |
IMO we should do this. If we could dump |
It does seem like heap live slots matches allocated pages, so presumably the growth is coming from Ruby objects 🤔 Also curious to know if we see the same thing without YJIT. Is it possible that YJIT processes just hit OOM sooner than non-YJIT processes would? |
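To track that over time, a small sketch (not from the thread itself) of logging a few GC.stat counters next to the YJIT stats; the key names are CRuby's and may differ between versions:

```ruby
require "time"

# Append a timestamped snapshot of a few GC counters; heap_live_slots and
# heap_allocated_pages are the two discussed above.
keys = %i[heap_live_slots heap_allocated_pages oldmalloc_increase_bytes total_allocated_objects]
snapshot = GC.stat.slice(*keys)
File.open("/tmp/gc-stats.log", "a") do |f|
  f.puts("#{Time.now.utc.iso8601} #{snapshot}")
end
```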
That's my current theory, yes. I'm not fully certain this is a leak per se; leaks are generally much faster than that. Here the growth is really slow, so it looks more like some infrequently executed code paths that lazily initialize global state. So over time the memory grows a bit, and probably stabilizes after a long time, or grows so slowly that without YJIT there's enough memory available that the OOM is never reached. So yeah, I suspect that since YJIT adds a noticeable extra memory baseline on each worker, there's less leeway and OOM is hit after ~18 hours. Now the next step to debug this I'm afraid is Unfortunately it's a bit tricky to do with Puma, because you want to do that outside of the request cycle, and Puma being threaded, there's not an ideal way. Another option is to just use I'll leave https://stevenharman.net/so-we-have-a-memory-leak here which is a pretty good post describing how to dig into memory leaks. If you need help to dig into these heap dumps, we can help, but keep in mind all strings in the heap are dumped, so if you think about sharing a memory dump, make sure to strip all string values (in case there might be keys or some personal data in the heap). |
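To make the heap-dump step concrete, a rough sketch of taking a dump with ObjectSpace.dump_all (assuming a writable /tmp and a quiet moment; getting this to run outside the request cycle in Puma is the tricky part discussed above):

```ruby
require "objspace"

GC.start # settle the heap so the dump contains fewer short-lived objects
File.open("/tmp/heap-#{Process.pid}.dump", "w") do |io|
  # Writes one JSON object per line; note that string contents end up in the dump.
  ObjectSpace.dump_all(output: io)
end
```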
At the beginning of this, @tenderlove pointed out an obvious test, which was: does it go away if we disable yjit? (Yes.) In that spirit, another test occurs to me: what if I just turn down the number of workers? If it's a leak, we'll just take a little longer to crash. But maybe there's too many workers, yjit asks for a bit more RAM per worker, they slowly initialize/grow, and then after long enough they exhaust the box. I'm going to try turning down the number of workers first. As for |
The issue with that is that the dump will happen in the middle of one or more requests, so the heap will be full of objects that aren't retained, and it will be hard to make sense of the dump. |
Ah, shoot. Thanks for explaining why that wouldn't work. I guess after a couple days of this experiment with fewer puma workers I'll try @maximecb's suggestion of disabling yjit and gathering the same stats to compare. (Experiments this week will take longer. Lobsters traffic is highly correlated to the US work day. There's a national holiday tomorrow and many people will be out on Friday as well.) |
I think there may even have been two oomkills in the last 20h, which is surprising. Unfortunately another vulnerability scanner has scrolled off the journalctl messages. |
What's the difference with these last stats? less workers? |
Nevermind, looking at the stats YJIT seems disabled. |
It's also interesting to note how in both cases there is a relatively big jump after some indefinite amount of time. I wonder if it's a particular endpoint that is rarely accessed that causes this. |
Uh, that's surprising, it shouldn't be disabled. It hasn't been disabled since before we opened this issue, and the most recent tweak left it enabled as well.

The jumps… we have a couple endpoints with fairly long cache timeouts. StatsController for /stats is the first thing that comes to mind for a long cache and querying a fair amount of data. Maybe /users, and it also has a one-day cache (though I thought it was shorter, and it probably should be). Otherwise I'd have to grep for use of

The other possibility for infrequent oddities is bad bots. Aside from the vulnerability scanners I have griped about a couple times here, the site is a popular choice for people writing their very first bots. (And we still have naive limit/offset pagination, so there are temptingly incrementable numbers in many URLs.) |
Nah, you are right, I read the logs too fast. It's indeed enabled.
Yeah, but that's something that's tricky to identify like this. The only reliable way is a heap dump, we gotta figure out a proper way to get a clean one. I'll try to ask people who run puma how they do this sort of stuff. |
Alright, I think we could try to use Puma's |
Hoping this will do: lobsters/lobsters#1292 |
I'm giving this a try now. Sorry for the delay, things have been a little hectic for me. |
No problem. Thanks for not giving up on it :) |
I haven't rigorously tracked, but I'm confident in the observation given that it's been a couple weeks. Turning down the number of workers seems to have significantly shortened the time between oomkills. We've gotten them more like every 6-8 hours instead of 11-13. I'm really surprised by this; it's the opposite of what I expected. I mention it because puma just rebooted after 6 hours without triggering the second dump ( |
That's very interesting indeed. Not unseen territory but rare. Could be explained by the smaller pool making individual workers process more requests, hence be more busy, hence leak / bloat faster. If the second dump never triggered, you can try lowering the threshold a bit. |
Shopify/ruby#556 (comment) I commented out the other log because it doesn't seem to be so useful, and changed file modes to 'w+' because otherwise it won't be possible to correlate the first dump with the second.
@byroot I had to turn it down a couple times but I got the two heap dumps. Next steps? |
Now you can use https://github.com/oxidize-rb/reap to generate a flame graph of each snapshot, which should help visualize what takes memory in your heap and, by comparing both, what is not present in a fresh worker. Other tools include https://github.com/Shopify/heap-profiler, which gives you a summary of the object types in the heap. And then if you find something weird, you can dig into it with https://github.com/jhawthorn/sheap. You should be able to share the generated flamegraphs, as they don't contain any secrets; I'll be happy to have a look. You can also share the actual heaps, but you'd need to first use |
@byroot generously spent some time pairing with me on this today. He spotted that Rubocop was getting loaded in production and PR'd a fix, and that the flame graph conspicuously had a big ~20M chunk for I18n. Lobsters doesn't ship translations but it uses helpers like

Another thing that @byroot brought up is that Lobsters doesn't use jemalloc. I only learned of it recently and didn't want to change something that may be related while this bug was open. There wasn't really anything conspicuous in the flamegraph or output from heap-profiler, so maybe what Lobsters has been seeing is just memory fragmentation rather than a leak. That might also explain the oddity that oomkills got more frequent when I halved the number of workers.

If this is it, I'm sorry to distract/cause concern that there might have been a leak in yjit. Maybe something in this issue points towards a way to detect and alert about this scenario? I dunno, I'm looking for a silver lining. I'm going to dump heaps again in the next few days with these changes to see if the picture's clearer. |
I deployed jemalloc 48 hours ago and, for the first time since I enabled yjit, we've gone more than 14 hours without an oomkill of puma. I still plan to dump heaps and look at them, but I think this is conclusive evidence that Lobsters had been seeing a memory fragmentation issue rather than a memory leak. I don't know enough about the allocator to know whether my hope from my previous comment (that this situation could be detected and alerted on) is realistic, and I don't know this project's workflow well enough to know if you'd want to keep this issue open or close it in favor of another. Either way, I think it can probably close with @byroot having diagnosed and fixed the core issue here. |
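A tiny sanity check (Linux-only, relies on /proc) that can confirm jemalloc really is mapped into the running Puma workers:

```ruby
# Print whether a jemalloc shared object shows up in this process's memory maps.
maps = "/proc/self/maps"
if File.readable?(maps)
  puts File.read(maps).include?("jemalloc") ? "jemalloc is loaded" : "jemalloc is NOT loaded"
else
  puts "#{maps} not available on this platform"
end
```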
Unfortunately I don't really see a way to do that. The allocator is a bit of a black box for Ruby. There was some suggestion to make

Anyway, happy your issues are solved and that it wasn't a YJIT issue. |
I met with the person that runs lobsters at RailsConf. He said they were experiencing a memory leak when using YJIT.
Basically the processes will slowly grow in size until about 12 hours in, at which point they get OOM killed. I had him disable YJIT so we could verify it's actually a YJIT problem. When he disables YJIT the process size plateaus and there is no problem. He said he'll do anything we need to try and debug the problem.
I'm not sure what information we should ask him to get.