Explore caching index for gems #1009

Open
2 tasks
vinistock opened this issue Sep 14, 2023 · 5 comments
Labels
  • enhancement: New feature or request
  • pinned: This issue or pull request is pinned and won't be marked as stale
  • server: This pull request should be included in the server gem's release notes

Comments

@vinistock
Member

vinistock commented Sep 14, 2023

Caching the index for gems is easier than caching the entire index because we know the files will only change if the version of the gem has changed.

I think we can get a considerable performance improvement on indexing if we cache the entries for gems. Something like

.ruby-lsp/
  Gemfile
  Gemfile.lock
  cache/
    rails-7.1.0
    yarp-0.12.0

Each gem cache would basically be a Marshal dump of the index entries for that gem version. This may require defining a way to merge different indices.
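As a rough illustration of the idea above, here is a minimal sketch of a per-gem cache file named `<gem>-<version>` holding a Marshal dump, plus one possible way to merge indices. `ToyIndex`, `gem_cache_path`, `write_gem_cache`, and `read_gem_cache` are hypothetical names made up for this example and are not part of Ruby LSP.

```ruby
require "fileutils"

# Hypothetical stand-in for the real index: maps a constant name to a list of
# file paths where it is defined.
class ToyIndex
  attr_reader :entries

  def initialize
    @entries = {}
  end

  def add(name, file_path)
    (@entries[name] ||= []) << file_path
  end

  # One possible merge strategy: concatenate the entries of shared names.
  def merge!(other)
    other.entries.each do |name, paths|
      (@entries[name] ||= []).concat(paths)
    end
    self
  end
end

# The cache file is named after the gem and its version, e.g. cache/rails-7.1.0,
# so a version bump naturally produces a cache miss.
def gem_cache_path(cache_dir, gem_name, gem_version)
  File.join(cache_dir, "#{gem_name}-#{gem_version}")
end

def write_gem_cache(cache_dir, gem_name, gem_version, index)
  FileUtils.mkdir_p(cache_dir)
  File.binwrite(gem_cache_path(cache_dir, gem_name, gem_version), Marshal.dump(index))
end

# Returns nil on a cache miss. Marshal.load should only be used on trusted files.
def read_gem_cache(cache_dir, gem_name, gem_version)
  path = gem_cache_path(cache_dir, gem_name, gem_version)
  return nil unless File.exist?(path)

  Marshal.load(File.binread(path))
end
```

With something like this, the full index could be built by loading each gem's cached index and calling `merge!` on a single accumulator.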

For the long term, it would be amazing if rubygems could generate the index cache during packaging, so that all gems are exported with an index by default. By doing the work ahead of time, we would be guaranteed to always have cached indices for all gems, significantly speeding up indexing.

Tasks

  1. server
@vinistock vinistock added enhancement New feature or request pinned This issue or pull request is pinned and won't be marked as stale labels Sep 14, 2023
@andyw8
Contributor

andyw8 commented Sep 14, 2023

Could a Bundler plugin be an alternative? There's an after-install hook which runs after each gem is installed.

https://bundler.io/guides/bundler_plugins.html

@vinistock
Member Author

We would depend on every gem maintainer to use the bundler plugin, which wouldn't really scale.

@andyw8
Contributor

andyw8 commented Sep 14, 2023

It could be distributed as a gem which Ruby LSP installs in the .ruby-lsp bundle.

@vinistock
Member Author

Would bundler pick up the plugin automatically? Even from a different Gemfile?

@aryan-soni
Contributor

Overview of Progress

I'm wrapping up my work on this issue, so I'm leaving some context here for whoever decides to take it on.

There were 2 main components to this task:

  1. Serializing entries (see #1919, "Support serialization and deserialization for RubyIndexer::Entry objects")
  2. Implementing caching for gems based on this serialization.

The progress for both tasks is available on the serialize-entries branch. See below for more context.

Serializing Entries

We opted to serialize entries with custom JSON rather than something like Marshal, for performance reasons: comparing the two, we found JSON gave a 15x improvement in serialization and deserialization time.

entries.rb and location.rb contain the serialization logic for RubyIndexer::Entry and RubyIndexer::Location objects respectively. The test files cover serialization and deserialization for every kind of entry. Note that the tests rely on the == methods defined in entries.rb and location.rb to compare the original objects with the deserialized ones.
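To give a feel for the approach described above, here is a minimal sketch of custom JSON serialization with `==` defined for round-trip tests. `ToyLocation`, `ToyEntry`, `serialize_entries`, and `deserialize_entries` are simplified, hypothetical stand-ins; the actual code on the serialize-entries branch may look quite different.

```ruby
require "json"

# Hypothetical, simplified stand-in for RubyIndexer::Location.
class ToyLocation
  attr_reader :start_line, :end_line, :start_column, :end_column

  def initialize(start_line, end_line, start_column, end_column)
    @start_line = start_line
    @end_line = end_line
    @start_column = start_column
    @end_column = end_column
  end

  # A compact array form keeps the JSON payload small.
  def to_a
    [start_line, end_line, start_column, end_column]
  end

  def ==(other)
    other.is_a?(ToyLocation) && to_a == other.to_a
  end
end

# Hypothetical, simplified stand-in for RubyIndexer::Entry.
class ToyEntry
  attr_reader :name, :file_path, :location

  def initialize(name, file_path, location)
    @name = name
    @file_path = file_path
    @location = location
  end

  def to_h
    { "name" => name, "file_path" => file_path, "location" => location.to_a }
  end

  def self.from_h(hash)
    new(hash["name"], hash["file_path"], ToyLocation.new(*hash["location"]))
  end

  # Value equality, so tests can compare an entry with its deserialized copy.
  def ==(other)
    other.is_a?(ToyEntry) && to_h == other.to_h
  end
end

def serialize_entries(entries)
  JSON.generate(entries.map(&:to_h))
end

def deserialize_entries(json)
  JSON.parse(json).map { |hash| ToyEntry.from_h(hash) }
end
```

A round-trip test can then simply assert that `deserialize_entries(serialize_entries(entries)) == entries`.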

Next Steps

The serialization logic is ready to ship.

Implementing Caching

The relevant code here is in index.rb. The export_to_cache method serializes the appropriate entries. An important step to verify here is that we are extracting the correct set of entries (see the embedded TODO). The import_from_cache method deserializes the entries.
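For readers without the branch at hand, the sketch below shows roughly what such a pair of methods could look like. The method names match the ones mentioned above, but the signatures, the `CacheableIndex` class, and the rule used to pick entries (file path under the gem's root directory) are assumptions for illustration only, not the actual implementation.

```ruby
require "json"

# Hypothetical toy index; entries are plain hashes to keep the sketch short.
class CacheableIndex
  def initialize
    @entries = Hash.new { |hash, name| hash[name] = [] }
  end

  def add(name, file_path)
    @entries[name] << { "name" => name, "file_path" => file_path }
  end

  def [](name)
    @entries.fetch(name, [])
  end

  # Serializes only the entries whose file lives under gem_root (an assumed
  # selection rule) and returns how many entries were written.
  def export_to_cache(gem_root, cache_file)
    prefix = File.join(gem_root, "") # ensures a trailing separator
    selected = @entries.values.flatten.select do |entry|
      entry["file_path"].start_with?(prefix)
    end
    File.write(cache_file, JSON.generate(selected))
    selected.length
  end

  # Reads a cache file and adds its entries to this index.
  def import_from_cache(cache_file)
    JSON.parse(File.read(cache_file)).each do |entry|
      @entries[entry["name"]] << entry
    end
  end
end
```

Selecting by path prefix is only one option; gems sourced from remote repositories or local paths would likely need a different rule.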

In test_caching.rb, we compare the work of manually indexing non-default gems vs. importing them from the cache. This is not intended to be shipped, but merely to illustrate the benefit of caching. Here, we see a 43% reduction in indexing time.

Next Steps

  • Ensure the caching methods cover any edge cases (such as Gemfiles that point to remote repos, as specified in the TODO in index.rb).
  • Integrate the caching methods into our indexing process. When indexing a non-default gem, we should check the cache before indexing it. When the LSP shuts down, we should write to the cache.
  • Add tests for the caching.
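
The integration step in the list above could be sketched as follows: consult the cache before indexing a gem, remember what had to be indexed from scratch, and write it out on shutdown. `GemIndexingSketch`, its Hash-like `cache`, and the `indexer` callable are all hypothetical names for this example, not existing Ruby LSP APIs.

```ruby
# Hypothetical sketch of the cache-aware indexing flow.
class GemIndexingSketch
  attr_reader :index

  # cache:   Hash-like store keyed by "name-version" (assumed interface)
  # indexer: callable performing the expensive indexing for one gem
  def initialize(cache, indexer)
    @cache = cache
    @indexer = indexer
    @index = {}
    @pending = {}
  end

  # Uses the cached entries when present; otherwise indexes the gem and
  # remembers the result so it can be written to the cache later.
  def index_gem(name, version)
    key = "#{name}-#{version}"
    entries = @cache[key]
    if entries.nil?
      entries = @indexer.call(name, version)
      @pending[key] = entries
    end
    @index[key] = entries
  end

  # Called when the server shuts down: persist everything indexed this session.
  def shutdown
    @pending.each { |key, entries| @cache[key] = entries }
    @pending.clear
  end
end
```

Deferring the writes to shutdown keeps cache I/O out of the indexing path, at the cost of losing the new cache entries if the process is killed before `shutdown` runs.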

TL;DR

Progress has been made on serializing entries and implementing caching for gems, with the latter showing a 43% reduction in indexing time. The serialization logic is ready to ship, and the remaining tasks involve addressing potential edge cases when caching, integrating caching into the indexing workflow, and adding tests for the caching. Code is available on the serialize-entries branch.


3 participants