Add code path caching #6729

josevalim · 2023-01-24T21:13:57Z

When an application has several entries in its code path, loading modules in interactive mode becomes more expensive because of code path misses.

This commit introduces code path caching, where we can opt in to caching each directory individually.

This won’t be applied to all paths, but a project has several paths that are unlikely to change while a system is running in development or test:

- the paths from Erlang/OTP, which most projects won’t write to
- paths from build tools and other languages
- non-local dependencies, such as the ones from Hex/Git

Loading of the cache happens lazily, so as to avoid introducing rehashes or any upfront cost.

Benchmarks

This change brought the boot time of Livebook (the time to start all apps in interactive mode for dev+test) on my machine (Mac Studio M1 Max) from 1.095s to 0.940s, which is roughly a 15% win.

If you want to try it out, compile Erlang/OTP from this branch and measure the time to boot a project (in Elixir, for example, this is `mix run -e 1`). Run the command at least 5 times.

Then open up `code_server.erl`, change `-define(CACHE_DEFAULT, cache).` to `-define(CACHE_DEFAULT, nocache).`, run `erlc -o lib/kernel/ebin lib/kernel/src/code_server.erl` to recompile it without the cache, and run the command again.
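
The repeated measurement described above could be scripted roughly as follows. This is only a sketch: the module name `boot_bench` and the `run/2` helper are made up for illustration, and only `timer:tc/3` and `os:cmd/1` are standard library calls. It times the whole OS command, so the numbers include the startup of the child VM.

```erlang
-module(boot_bench).
-export([run/2]).

%% Hypothetical helper: runs the shell command Cmd N times and returns
%% the wall-clock time of each run in milliseconds, plus min and mean.
run(Cmd, N) when is_integer(N), N > 0 ->
    Times = [begin
                 %% timer:tc/3 returns {Microseconds, Result}.
                 {Micros, _Output} = timer:tc(os, cmd, [Cmd]),
                 Micros / 1000
             end || _ <- lists:seq(1, N)],
    #{runs => Times,
      min  => lists:min(Times),
      mean => lists:sum(Times) / N}.
```

For example, `boot_bench:run("mix run -e 1", 5).` could be run once with the cache enabled and once after recompiling `code_server.erl` with `nocache`, comparing the `min` values.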

Decisions to be taken

Assuming we want to move forward with this, we need to take some decisions. Currently this commit enables caching by default, but I assume we don't want this in practice. I propose we add the following features for low-level control:

- `add_path*` accepts `cache`/`nocache` as a second argument
- `-pac` and `-pzc` are added to the command line

However, Erlang also loads code paths from three other locations:

- the Erlang/OTP lib directory
- `ERL_LIBS`
- the `{path, ...}` instruction in the init script

We need to decide if we want one option for caching all three, such as -cache_init_paths, or one option for each, such as -cache_otp_lib, -cache_erl_libs, and -cache_init_path.

TODO

code:ensure_modules_loaded/1 does not benefit from this patch, as it uses a separate implementation for loading. We should unify those code paths (which will also help address bugs)
Add cache/nocache argument to add_path*
Add cache/nocache argument to set_path*
Add cache/nocache argument to replace_path*
Support -pac and -pzc
Add code:del_paths/1
Add code:clear_caches/1
Docs

Future work

A future optimization is to remove parts of the linear lookup. This patch simply changes the code path to be {Path, #{Beam := Path}}. Therefore, if we have two cached paths in a row inside the code_server, we could merge their maps. This would be useful for Erlang/OTP, for instance, as its directories are typically stored sequentially in the code path.

Also note that where_is_file/1 and which/1 are not optimized by this pull request, as they traverse the code path on the caller's side. With the cache in place, it may be worth moving them to the server, but I would consider those changes future work.
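
To make the lazy, per-directory cache and the linear lookup easier to picture, here is a small sketch. It is not the `code_server` implementation: the module name, the `nocache | cache | map` entry shape, and the use of a `File => true` map are simplifying assumptions made for illustration.

```erlang
-module(path_cache_sketch).
-export([lookup/2]).

%% Assumed shape of a code path entry: {Dir, Cache}, where Cache is
%%   nocache          - always ask the filesystem
%%   cache            - caching requested, listing not loaded yet (lazy)
%%   #{File => true}  - cached directory listing
-type cache() :: nocache | cache | #{file:filename() => true}.
-type entry() :: {file:filename(), cache()}.

%% Walks the path in order and returns the first hit together with the
%% (possibly updated) path, since caches are filled on first use.
-spec lookup(file:filename(), [entry()]) ->
          {{ok, file:filename()} | error, [entry()]}.
lookup(File, Path) ->
    lookup(File, Path, []).

lookup(_File, [], Acc) ->
    {error, lists:reverse(Acc)};
lookup(File, [{Dir, nocache} = Entry | Rest], Acc) ->
    %% Uncached directory: one filesystem check per lookup.
    Full = filename:join(Dir, File),
    case filelib:is_regular(Full) of
        true  -> {{ok, Full}, lists:reverse(Acc, [Entry | Rest])};
        false -> lookup(File, Rest, [Entry | Acc])
    end;
lookup(File, [{Dir, cache} | Rest], Acc) ->
    %% First touch of a cached directory: list it once, then retry.
    Cache = case file:list_dir(Dir) of
                {ok, Names} -> maps:from_keys(Names, true);
                {error, _}  -> #{}
            end,
    lookup(File, [{Dir, Cache} | Rest], Acc);
lookup(File, [{Dir, Cache} = Entry | Rest], Acc) when is_map(Cache) ->
    %% Cached directory: a miss costs a map lookup, not a syscall.
    case is_map_key(File, Cache) of
        true  -> {{ok, filename:join(Dir, File)},
                  lists:reverse(Acc, [Entry | Rest])};
        false -> lookup(File, Rest, [Entry | Acc])
    end.
```

In this sketch, two adjacent map entries could be merged into a single map from file name to directory, which is the kind of optimization described above. It also shows the staleness trade-off: a file added to a directory after its listing was cached is not found until the cache is dropped.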

github-actions · 2023-01-24T21:15:00Z

CT Test Results

      4 files   187 suites 1h 46m 34s ⏱️
2 883 tests 2 605 ✔️ 273 💤 5 ❌
3 491 runs 3 147 ✔️ 337 💤 7 ❌

For more details on these failures, see this check.

Results for commit a1bdb2c.

fhunleth · 2023-01-25T03:17:01Z

I'd like to add another benchmark. This is for a 1 GHz single core 64-bit RISC-V board running Livebook with Nerves. The timing is from boot until the system is usable. Usable means that it's possible to use the Elixir shell over a UART cable. For running Livebook on Nerves this ends up being close to the time to access Livebook in a web browser, but without network setup variations between runs.

OTP 26-dev/unpatched/embedded - 41 seconds
OTP 26-dev/unpatched/interactive - 71 seconds
OTP 26-dev/jv-cache-code/interactive - 40 seconds

This is a huge improvement for Nerves since on these slower processors, we'd avoid running releases in interactive mode due to the boot time penalty. As you can imagine, I'm super happy with this PR and I hope that there's a way to include it in OTP 26.

okeuday · 2023-01-25T04:16:33Z

You may want to test with a ram disk to avoid storage device latency. A ram disk could be a way of solving the problem without source code changes. Testing with a ram disk would also help you focus on the speedup related to the source code changes. Caching in the source code may help but it would make the interactive mode less interactive while adding the potential to get stale data.

josevalim · 2023-01-25T07:06:34Z

@okeuday I'd say a huge part of this PR is exactly to remove filesystem lookups, so I think showing numbers from environments where those lookups can be very expensive is important, especially since the gains are almost free. :) It is in a way an upper bound on the benefits we will get.

max-au · 2023-01-25T18:06:56Z

What happens when a new module is added to a directory that has previously been cached? Or rather, how does the cache get invalidated? It is a very common scenario in our environment to add extra modules and hot-code-load them.

My understanding is, this feature should only be enabled at startup, and when startup is complete, boot script should disable the cache.

josevalim · 2023-01-25T19:56:31Z

@max-au this is not implemented in this PR yet, but the goal is that add_path, replace_path and friends will accept a cache/nocache option. So just call them again with nocache for the path whose cache you want to remove, and the cache will be gone.

This will also delete all caches:

code:set_path(code:get_path(), nocache). %% nocache will be the default anyway

> My understanding is, this feature should only be enabled at startup, and when startup is complete, boot script should disable the cache.

For Elixir, I plan to leave Erlang, Elixir and all non-local dependencies (from Hex.pm/Git) permanently cached. YMMV. :)

max-au · 2023-01-25T22:51:02Z

> add_path, replace_path and friends will accept a cache/nocache option

What I mean is, the cache suggested by this PR requires explicit invalidation (add_path, replace_path, ...). This behaviour is not backwards-compatible. In all prior versions, I could put the *.beam file anywhere in the existing code path, and load the module, without the need to explicitly flush the cache. We often leveraged that (while doing hot code loads without proper release upgrade, for it was too complex to maintain). It may not be a big issue, but something to be really aware of.

josevalim · 2023-01-25T23:20:46Z

@max-au the cache will be opt-in. It is enabled by default only for now, to ease testing and benchmarking.

frazze-jobb · 2023-01-26T10:08:35Z

Hi, we want to move forward with this.

What I would prefer, though I have to check with my peers, is that caching during boot is the default (unless there are downsides to that) and that the cache is disabled once booted. At the very least, there should be a flag to control this, to give users the best of both worlds. I have a hard time coming up with a fitting name for it.
-cache_code_path boot (default, caches all paths during boot and keeps the cache only for user specified paths)
-cache_code_path config (caches only user specified paths during boot and after)
-cache_code_path all (caches all paths during boot and keeps the cache after) (not sure if useful)

If you have specified -pac or -pzc, those paths should remain cached afterwards. And specifying -cache_init_paths should be a shorthand for caching the OTP, init file, and ERL_LIBS paths after boot. I'm not sure we need to be more fine-grained than that.

josevalim · 2023-01-26T11:13:30Z

Thank you @frazze-jobb! I will work on bringing this to the finish line. However, note I don't think we can enable this only during boot.

During init, we don't use the code server at all, so this feature isn't used. This feature is used after the init boot, but then we are running user code, and Erlang then no longer knows what the application is doing. In dev/test, the application likely then proceeds with its own boot, using scripts or a build tool such as Rebar/Mix.

In other words, I think the -cache* flags are all about what happens after boot, and we need to decide how many knobs we want to have. For Elixir, I will certainly enable caching for ERL_LIBS, Erlang/OTP's lib, and the paths from the boot scripts, so a single config would suffice, but perhaps granularity is the best choice. Another option is:

-cache_code_path erl_libs otp init all

frazze-jobb · 2023-01-30T14:54:11Z

I see. Anyhow, I discussed this with @bjorng and he agreed that we should cache OTP by default. Reloading modules in OTP only affects those of us developing OTP itself, and we could turn it off in that case. For OTP 26 we can just have one flag for everything, and if more granularity is needed, we will deal with it then.
Apart from that, we think a code:rehash() function would be great.

frazze-jobb · 2023-02-08T06:06:18Z

@josevalim How is it going with this?

josevalim · 2023-02-08T08:08:18Z

I will focus on getting this over the finish line, but since there is a release candidate soon, I think we can break this in two.

One PR adds -code_cache_path false | true, which caches the internal paths, but does not change the code server API
Another PR will add code:del_paths, the cache argument to the code functions, code:clear_cache, and -pac and -pzc

This way we can get the version that changes the defaults sooner and test it. And the additional APIs, which imply no breaking changes, will come soon after.

josevalim · 2023-02-08T11:38:17Z

@frazze-jobb ok, V1 of this feature is now in this pull request. It caches the boot paths and that's it. I will work on follow up pull requests assuming this one has been merged, unless you would later prefer for me to squeeze it all in.

frazze-jobb · 2023-02-08T11:45:39Z

Great, I will have a look and wait for the tests

josevalim · 2023-02-08T14:26:17Z

@frazze-jobb second part is here: #6823

I gave up on adding -pac and -pzc for now. The changes are slightly more complex (they require changing both init and code_server) and they increase the size of the command line interface. I don't have a use case for them, so I thought I would skip them. But I will be glad to add them if you deem it necessary. They will also be easier to add once we remove code_path_choice.
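
For readers trying the second part, here is a rough sketch of how the extended API could be exercised, going by the title of #6823 ("Add del_paths/1, clear_cache/0, and cache arg"). It assumes an Erlang/OTP release that includes that PR; the module name `cache_api_sketch` and the `demo/1` helper are made up for illustration.

```erlang
-module(cache_api_sketch).
-export([demo/1]).

%% Hypothetical walk-through; Dir must be an existing directory.
%% Assumes code:add_pathz/2, code:clear_cache/0 and code:del_paths/1
%% from the follow-up PR are available in the running release.
demo(Dir) ->
    %% Append Dir to the code path and opt in to caching its listing.
    true = code:add_pathz(Dir, cache),
    %% Drop all cached listings, for example after writing new .beam
    %% files into a cached directory.
    ok = code:clear_cache(),
    %% Remove the directory from the code path again.
    ok = code:del_paths([Dir]),
    ok.
```

Calling `code:add_pathz(Dir, nocache)` instead keeps the pre-cache behaviour for that directory, where every lookup goes to the filesystem.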

frazze-jobb · 2023-02-10T02:46:20Z

Okay, I didn't like -pac and -pzc in the first place, so I am fine with that. We can revisit it if someone wants it later.
We don't want to cache the current working directory, so I modified the PR to handle that special case.

When an application has several entries in its code path, loading modules in interactive mode becomes more expensive because of code path misses. This commit introduces code path caching, where we can opt in to caching each directory individually. By default, all code paths known during boot (the OTP root, ERL_LIBS, and the ones from the boot script) are cached. This can be disabled by passing `-cache_boot_paths false`. An extended API in the `code` module for caching will be added in future commits, as well as support for `-pac` and `-pzc`.

mikpe · 2024-02-15T14:04:47Z

> > add_path, replace_path and friends will accept a cache/nocache option
>
> What I mean is, the cache suggested by this PR requires explicit invalidation (add_path, replace_path, ...). This behaviour is not backwards-compatible. In all prior versions, I could put the *.beam file anywhere in the existing code path, and load the module, without the need to explicitly flush the cache. We often leveraged that (while doing hot code loads without proper release upgrade, for it was too complex to maintain). It may not be a big issue, but something to be really aware of.

We (Klarna's KRED system) are affected by this as I'm preparing to enable OTP 26 for it. We do depend on hot-code loading of new or updated .beam files. I'm currently investigating our options. This is just one data point, but this was not a backwards-compatible change.

josevalim · 2024-02-15T14:36:27Z

@mikpe for clarity, the initial version of this pull request cached all paths by default. The version that was merged caches all paths from the .boot file by default but not dynamically added paths or the ones via -pa/-pz.

One quick workaround is to call code:set_path(code:get_path()) once your application starts or before an upgrade, which makes it so that none of the paths are cached. I could also send a pull request to add a new option to init, such as -cache_boot_paths, that disables it in your case.

mikpe · 2024-02-15T15:02:21Z

We use ERL_LIBS. Calling code:set_path(code:get_path()) at startup seems to work (at least for the test cases where I spotted the issue).
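
A minimal sketch of that workaround as a helper that could be called early in an application's start callback. The module and function names are hypothetical; the only real calls are `code:get_path/0` and `code:set_path/1`, and the sketch relies on `nocache` being the default for `code:set_path/1`, as described earlier in this thread.

```erlang
-module(path_cache_workaround).
-export([disable/0]).

%% Hypothetical helper: re-sets the current code path without caching,
%% so that .beam files added to existing directories are found again.
-spec disable() -> ok | {error, term()}.
disable() ->
    case code:set_path(code:get_path()) of
        true -> ok;
        %% For example, a directory on the path no longer exists.
        {error, _Reason} = Error -> Error
    end.
```

The trade-off is that module lookups go back to hitting the filesystem for every directory on the path, which is the cost the cache was introduced to avoid.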

bjorng added the team:VM Assigned to OTP team VM label Jan 25, 2023

josevalim mentioned this pull request Jan 25, 2023

Add support for caching load paths #5811

Closed

3 tasks

frazze-jobb self-assigned this Jan 25, 2023

josevalim force-pushed the jv-cache-code branch from 1950849 to 55ae5e9 Compare January 25, 2023 10:27

josevalim mentioned this pull request Jan 25, 2023

Paths added with -pa and -pz cannot be removed from the module lookup #6692

Closed

josevalim force-pushed the jv-cache-code branch from 55ae5e9 to 8cc544e Compare January 30, 2023 15:51

KennethL added this to the OTP-26.0-rc1 milestone Feb 2, 2023

josevalim force-pushed the jv-cache-code branch from 8cc544e to 8c5d179 Compare February 8, 2023 09:10

frazze-jobb added the testing currently being tested, tag is used by OTP internal CI label Feb 8, 2023

josevalim mentioned this pull request Feb 8, 2023

Add del_paths/1, clear_cache/0, and cache arg #6823

Merged

frazze-jobb force-pushed the jv-cache-code branch from 0ad10a3 to 1282d6e Compare February 10, 2023 03:24

frazze-jobb force-pushed the jv-cache-code branch from 1282d6e to 6efdf5b Compare February 10, 2023 03:27

Except "." from being cached

a1bdb2c

frazze-jobb force-pushed the jv-cache-code branch from 6efdf5b to a1bdb2c Compare February 10, 2023 07:55

frazze-jobb merged commit 070c485 into erlang:master Feb 10, 2023

yskelg mentioned this pull request Jul 14, 2023

Erlang RISC-V Architecture JIT Support #7498

Open

the-mikedavis mentioned this pull request Jan 31, 2024

Use code_server path cache in code:where_is_file/1 #8078

Open
