Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server.rb spinning on 100% CPU trying to create tempfiles #278

Closed
mttkay opened this issue Feb 29, 2024 · 11 comments
Closed

server.rb spinning on 100% CPU trying to create tempfiles #278

mttkay opened this issue Feb 29, 2024 · 11 comments

Comments

@mttkay
Copy link

mttkay commented Feb 29, 2024

I am running into a bug that I can consistently reproduce, where the ruby-lsp-server component will start up and not terminate, consuming an entire CPU:

Screenshot from 2024-02-29 09-53-42

I took a CPU profile and found that it seems to spend most of that time trying to create a temp dir in an asdf managed directory:

Time since start: 29s. Press Ctrl+C to stop.
Summary of profiling data so far:
% self  % total  name
 54.12    59.97  rescue in block in mkdir_p - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:0
 23.32    23.32  directory? [c function] - (unknown)
  9.22    13.54  mkdir [c function] - (unknown)
  4.32     4.32  initialize [c function] - (unknown)
  4.18     4.18  dirname [c function] - (unknown)
  2.19     2.19  pid [c function] - (unknown)
  1.18   100.00  dump_data - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:110
  0.63     0.84  rescue in dump_data - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:0
  0.31    73.83  block in mkdir_p - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:126
  0.24     0.24  === [c function] - (unknown)
  0.17    93.42  mkdir_p - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:127
  0.07     0.07  raise [c function] - (unknown)
  0.03     0.03  rand [c function] - (unknown)
  0.00   100.00  transaction - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:57
  0.00   100.00  synchronize [c function] - (unknown)
  0.00   100.00  setup - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache.rb:46
  0.00   100.00  setup - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap.rb:76
  0.00   100.00  require_relative [c function] - (unknown)
  0.00   100.00  require [c function] - (unknown)
  0.00   100.00  require - <internal:/home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:165
Wrote raw data to /home/mk/.cache/rbspy/2024-02-29-TTLMXZcaex.raw.gz
Wrote formatted output to _gi/ruby-lsp.3.json

I have to TERM the process every single time it starts up.

Gemfile:

  gem 'ruby-lsp', "~> 0.14.2", require: false, feature_category: :tooling

  gem 'ruby-lsp-rails', "~> 0.3.0", feature_category: :tooling

  gem 'ruby-lsp-rspec', "~> 0.1.10", require: false, feature_category: :tooling

I use VSCode with the ruby-lsp v0.5.11 extension.

@mttkay
Copy link
Author

mttkay commented Feb 29, 2024

Is there some kind of mitigation for this? I cannot remove ruby-lsp-rails from Gemfile because it is under version control. I tried setting codeLens: false in the ruby-lsp extension settings, which seems to result in fewer attempts to run this server process, but it still happens every time ruby-lsp reindexes the workspace, for example.

@Earlopain
Copy link
Contributor

Earlopain commented Mar 1, 2024

Hi there, I have a hunch on what this might be. Do you have a large folder inside the workspace directory (like from activestorage)? I ran into something similar at Shopify/ruby-lsp#1665.

I would try to disable bootsnap with DISABLE_BOOTSNAP or add that folder to BOOTSNAP_IGNORE_DIRECTORIES if you can identify it. You might need to do something similar for ruby-lsp by using the .index.yml file (docs). Strange how this only happens when using the server runner.

Hope that helps and gives you some pointers for additional troubleshooting.

@vinistock
Copy link
Member

This may also be the bug fixed in v0.3.2. Could you please test the new version and let us know if your issue is fixed?

@mttkay
Copy link
Author

mttkay commented Mar 4, 2024

0.3.2 did not fix the issue, unfortunately:

Time since start: 7s. Press Ctrl+C to stop.
Summary of profiling data so far:
% self  % total  name
 48.48    55.72  rescue in block in mkdir_p - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:0
 28.36    28.36  directory? [c function] - (unknown)
 10.71    14.18  mkdir [c function] - (unknown)
  3.76     3.76  dirname [c function] - (unknown)
  3.47     3.47  initialize [c function] - (unknown)
  2.03     2.03  pid [c function] - (unknown)
  1.16   100.00  dump_data - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:110
  0.58    70.48  block in mkdir_p - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:126
  0.58     0.58  === [c function] - (unknown)
  0.43     0.87  rescue in dump_data - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:0
  0.29    93.20  mkdir_p - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:127
  0.14     0.14  raise [c function] - (unknown)
  0.00   100.00  transaction - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb:57
  0.00   100.00  synchronize [c function] - (unknown)
  0.00   100.00  setup - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache.rb:46
  0.00   100.00  setup - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap.rb:76
  0.00   100.00  require_relative [c function] - (unknown)
  0.00   100.00  require [c function] - (unknown)
  0.00   100.00  require - <internal:/home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:165
  0.00   100.00  reinitialize - /home/mk/.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/cache.rb:131

I will have a look at @Earlopain's suggestions today

@mttkay
Copy link
Author

mttkay commented Mar 4, 2024

I think I know what the issue is.

I work with dev containers. When I start the Rails web app in a dev container, it mounts the source code directory as a host volume and bootsnap writes to <rails_root>/tmp/cache from the container. In my case, this will be a symbolic link to a cache Docker volume:

ll tmp/
total 12K
4.0K lrwxrwxrwx. 1 mk   24 Mar  4 10:10 cache -> /data/cache/gitlab/cache

Here, /data is a directory in the Docker volume; it does not exist on the host FS.

If ruby-lsp-rails now tries to run the server process, then because this process runs on the host OS, bootsnap will try to create a directory under this path, which does not exist, so it raises an error. I think it may get stuck in this loop, though I haven't verified this yet:

          until File.directory?(path)
            stack.push path
            path = File.dirname(path)
          end

From /.asdf/installs/ruby/3.2.3/lib/ruby/gems/3.2.0/gems/bootsnap-1.18.3/lib/bootsnap/load_path_cache/store.rb.

I think this problem is just symptomatic for ruby-lsp-rails assuming that it can boot the Rails app cleanly on the host OS. Is that an OK assumption to make when using containers?

@mttkay
Copy link
Author

mttkay commented Mar 4, 2024

As a mitigation, can you think of a way to disable bootsnap only when it's ruby-lsp-rails trying to launch the Rails server process?

I would not want to disable bootsnap when running the app ordinarily in a container.

@Earlopain
Copy link
Contributor

You can try starting the runner with strace attached, that should give you a pretty good idea of what it is actually doing. For me this very quickly showed what was actually going on. In the devcontainer doing strace bin/rails runner GEM_PATH/gems/ruby-lsp-rails-0.3.2/server.rb, something like that. GEM_PATH is one of the paths in the output of gem environment gempath (I don't know an easier way to do this).

If that is too much a hassle, this command might get some extra info: find . -maxdepth 1 -type d | while read -r dir; do printf "%s:\t" "$dir"; find "$dir" -type f | wc -l; done (https://unix.stackexchange.com/a/4176). It shows the amount of subfolders all folders in the current directory have. Maybe there is something suspicious in that output when running it in the devcontainer in the rails app root. This not finishing in a reasonable amount of time will also be suspicious since I believe bootsnap is doing something similar to discover all files it needs to cache.


Disabling bootsnap conditionally should also be possible by examining $PROGRAM_NAME, I have this in my rubocop file for example to skip the todo file only when running ruby-lsp (I want all offenses locally but CLI runs or CI should not complain):

<% unless $PROGRAM_NAME.end_with?("ruby-lsp") %>
inherit_from: .rubocop_todo.yml
<% end %>

Something similar can probably be done for bootsnap as well, though you'd have to check what $PROGRAM_NAME actually is when using the rails runner and if it can be distinguished from the normal rails runner without ruby-lsp. It probably comes down to also inspecting ARGV for a file ending in server.rb. That does require code changes, I don't have an idea for doing this through env variables alone.

@mttkay
Copy link
Author

mttkay commented Mar 4, 2024

I mean, I think we know what the problem is already, it's that tmp/cache is a broken symlink, so it exists, but no file nodes underneath it point anywhere. But indeed, strace confirms this, and I also confirmed that this is not a problem specific to ruby-lsp-rails. If I run bin/rails runner outside of the IDE, it ends up in the same loop:

write(5, "=== Scanning /home/mk/Projects/w"..., 3391) = 3391
close(5)                                = 0
getpid()                                = 211223
openat(AT_FDCWD, "/tmp/ruby-lsp.log", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 5
ioctl(5, TCGETS, 0x7ffc745af370)        = -1 ENOTTY (Inappropriate ioctl for device)
newfstatat(AT_FDCWD, "/home/mk/Projects/work/gl-gck/gitlab-rails/tmp/cache/bootsnap", 0x7ffc745af240, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/mk/Projects/work/gl-gck/gitlab-rails/tmp/cache", 0x7ffc745af240, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/home/mk/Projects/work/gl-gck/gitlab-rails/tmp", {st_mode=S_IFDIR|0755, st_size=250, ...}, 0) = 0
mkdir("/home/mk/Projects/work/gl-gck/gitlab-rails/tmp/cache", 0777) = -1 EEXIST (File exists)
newfstatat(AT_FDCWD, "/home/mk/Projects/work/gl-gck/gitlab-rails/tmp/cache", 0x7ffc745aeee0, 0) = -1 ENOENT (No such file or directory)

The log messages are mine (I patched the gem to write a log file into the system tmp dir). The problem is ENOENT (No such file or directory), which breaks bootsnap.

Disabling bootsnap conditionally should also be possible by examining $PROGRAM_NAME,

Yeah. We would have to patch the application's bootsnap initializer to do this, right? Seems kinda dirty, but I'll propose this as an option, thanks.

I will close this issue here because I confirmed it is not specific to ruby-lsp-rails, but I do wonder if the assumption that bin/rails runner executions will work in any dev setup holds up well. What if that process tries to open a database connection? Is the assumption that I always run a database on the host OS? This seems to go against where the world of software development is generally headed, which is to utilize containers and run basically nothing on the host OS directly anymore. But that's a broader topic than this issue here.

Thanks for helping out, appreciate it!

@mttkay mttkay closed this as completed Mar 4, 2024
@Earlopain
Copy link
Contributor

Since bootsnap happens so early in the boot cycle, a initializer is already too late, especially if your application currently follows recommendations for the bootsnap rails setup. Check out the usage section in the docs, that explains what you need to do https://github.com/Shopify/bootsnap?tab=readme-ov-file#usage

This actually mentions that tmp cache MUST be writable, however I'd wager that CPU spinning is still a bug. I would expect it to just raise/crash somewhere. A bug report over there seems like a good idea.

@mttkay
Copy link
Author

mttkay commented Mar 4, 2024

Ah yes, my bad -- I meant to say boot.rb not initializer.

Filing an issue against bootsnap makes sense to me. I agree it should not get itself stuck like that. Will cross link here when done.

@Earlopain
Copy link
Contributor

Looking at the bootsnap code again, I find it very likely that the following code is responsible for this: https://github.com/Shopify/bootsnap/blob/8394834cd504548aae3b4651587abd823f0495d1/lib/bootsnap/load_path_cache/store.rb#L107-L108

Your strace output seems to back that up as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants