Read .beam files from cached paths in the client #7981

Merged: 1 commit merged into erlang:master on Jan 25, 2024

Conversation

@josevalim (Contributor) commented Dec 27, 2023

This reduces the amount of work on the code server
and erl_prim_loader, by loading data on the client
whenever possible. This speeds up compilation of
Livebook from 7.2s to 6.9s (roughly 4%) and should
keep fewer binary references around.

This is done by introducing erl_prim_loader:read_file/1,
which stays closer to Erlang's file reading semantics
and does not rely on erl_prim_loader code paths.
erl_prim_loader:get_file/1 and erl_prim_loader:get_path/0
should be considered deprecated.

Future work should fully remove the concept of paths
from erl_prim_loader and move them into init.
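
For illustration only, a minimal sketch of the client-side idea, assuming erl_prim_loader:read_file/1 returns {ok, Binary} | {error, Reason} in the style of file:read_file/1; the helper below is hypothetical, not code from this PR:

    %% Hypothetical helper: read a module's .beam from a known directory in
    %% the calling (client) process, rather than asking erl_prim_loader to
    %% resolve the module through its own code paths with get_file/1.
    load_from_dir(Mod, Dir) when is_atom(Mod) ->
        File = filename:join(Dir, atom_to_list(Mod) ++ ".beam"),
        case erl_prim_loader:read_file(File) of
            {ok, Bin} ->
                code:load_binary(Mod, File, Bin);
            {error, Reason} ->
                {error, Reason}
        end.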

github-actions bot commented Dec 27, 2023

CT Test Results

No tests were run for this PR. This is either because the build failed, or the PR is based on a branch without GH actions tests configured.

Results for commit de9942e

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.

// Erlang/OTP Github Action Bot

@josevalim changed the title from "Read .beam files for cache paths in the client" to "Read .beam files from cached paths in the client" on Dec 27, 2023
erlang:display({?MODULE,file_error}),
erlang:display(Report)
end,
error;
@josevalim (Contributor, author):

The changes to this function are just a refactoring to remove a case catch.
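
For context, a generic before/after of that kind of refactoring (do_read/1 is a placeholder, not the function touched by this diff):

    %% Old style: `case catch` turns any crash into an {'EXIT', Reason} term
    %% that then has to be matched alongside the normal results.
    read_old(File) ->
        case catch do_read(File) of
            {'EXIT', _Reason} ->
                erlang:display({?MODULE, file_error}),
                error;
            Res ->
                Res
        end.

    %% Refactored: handle failures in a try ... catch, so normal results flow
    %% through untouched and no {'EXIT', _} pattern is needed.
    read_new(File) ->
        try
            do_read(File)
        catch
            _:_Reason ->
                erlang:display({?MODULE, file_error}),
                error
        end.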

@jhogberg self-assigned this on Dec 28, 2023
@jhogberg added the team:VM (Assigned to OTP team VM) label on Dec 28, 2023

@jhogberg (Contributor):
Thanks for the PR! I'll add it to the nightly runs after New Year's Eve. :)

@josevalim (Contributor, author):

Thank you and happy holidays! 🍾

This is a note to myself on future work:

  1. After this is merged, move get_modules to fully work on the client (and spawn processes on the client)
  2. Once Erlang/OTP 27 is out, remove get_path and get_file (assuming "Remove erl_prim_loader fallback on module lookup" #7697 is merged)

@jhogberg (Contributor) commented Jan 8, 2024

code_SUITE:set_path/1 in the kernel tests seems to fail across the board with this change :(

@josevalim (Contributor, author) commented Jan 9, 2024

@jhogberg apologies, I have pushed a fix. It was actually a silly failure. This line was failing:

{error, bad_directory} = (catch code:set_path([{a}])),

As you can see, it is pushing a bad path to the code server. However, instead of discarding invalid code paths, the code_server was passing an invalid argument to erl_prim_loader. This worked because erl_prim_loader wraps all function calls in a ?SAFE macro:

-define(SAFE2(Expr, State),
        fun() ->
                case catch Expr of
                    {'EXIT',XXXReason} -> {{error,XXXReason}, State};
                    XXXRes -> XXXRes
                end
        end()).

When I moved the actions to the client, I did not add the ?SAFE macro for client actions (and, honestly, I don't think we should), so now we get a failure.

I understand why we want to keep these essential processes from crashing, but I think it is bad practice for the code_server to assume it can send garbage to erl_prim_loader and everything will be fine. The proper fix is to validate data at the boundary, and that's exactly what I did: I force-pushed a change that moves path normalization to code.erl.

In a future PR, I will push more guards to erl_prim_loader, so we can catch other places that are sending garbage to erl_prim_loader.

PS: If you want me to add a ?SAFE call around the client code, I can do it, but I would prefer to add guards and lock its API down. :)
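
As a minimal sketch of the "validate at the boundary" idea (hypothetical function names; the actual change moves path normalization into code.erl):

    %% Hypothetical: reject anything that is not a directory string before it
    %% reaches code_server or erl_prim_loader, and normalize what remains.
    normalize_paths(Paths) when is_list(Paths) ->
        try
            {ok, [normalize_path(P) || P <- Paths]}
        catch
            throw:bad_directory -> {error, bad_directory}
        end.

    normalize_path(P) when is_list(P), P =/= [] ->
        filename:join([P]);        % also collapses "foo//bar" into "foo/bar"
    normalize_path(_Other) ->      % e.g. the {a} tuple from the failing test
        throw(bad_directory).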

@jhogberg (Contributor) commented Jan 9, 2024

Thanks!

> In a future PR, I will push more guards to erl_prim_loader, so we can catch other places that are sending garbage to erl_prim_loader.

Sounds good.

end;
false ->
{{error, bad_directory}, Path}
Dir = filename:join([Dir0]), % Normalize
Contributor:

Isn't this already done in the client in normalize_paths?

@josevalim (Contributor, author):

Not really:

1> filename:join(["foo//bar"]).
"foo/bar"

We could move it into normalize_paths, though, and do more work on the client (it is on my to-do list), but I am being very conservative with these changes. For this PR, I tried to change code_server only as much as necessary to support the changes to erl_prim_loader.

I also want to revise all uses of filename:* in code_server, because I think there is some duplication too.

Contributor:

That definitely makes sense

@@ -1241,23 +1209,23 @@ loader_down(#state{loading = Loading0} = St, {Mod, Bin, FName}) ->
St
end.

mod_to_bin([{Dir, nocache}|Tail], ModFile, Acc) ->
Contributor:

Looking at this code, I think we should split the cache list from the path list.
In this function, every time we look up a module we update the entire code path, which can be quite a lot of work and extra allocation.
I think it would be quite beneficial for performance to split the cache out into a map separate from the code path; that way, once the cache is warmed up, lookups will do little to no allocation.
This can be done as a follow-up PR, though, since the design of updating the code-path data structure on every module lookup is already present today.

@josevalim (Contributor, author):

Interesting!

> I think it would be quite beneficial for performance to split the cache out into a map separate from the code path; that way, once the cache is warmed up, lookups will do little to no allocation.

Would you have the map keys be the code paths? Then lookup becomes a bit more expensive. But I guess we could also generate unique integers to associate with them?

@michalmuskala (Contributor) commented Jan 9, 2024

Yes, you're right, this could have some implications for lookup time.
One option could be to maintain two parallel lists and update the cache only when an actual update happens. That would have performance characteristics similar to today's and would avoid retaining the updated list when nothing changed (which should be most of the time), but it would still produce some garbage on every lookup.

With a map, one approach could be to "intern" the paths by converting them to atoms, though I'm not sure whether there are any code-path-length implications to that, and atoms sometimes have a poor hash distribution, which might hurt lookup performance. Another, as you suggested, could be some manual interning, using the integer position in the code path list as the key to the map.

It might also be worth exploring the process dictionary as the storage: traditionally it's the fastest option, though a bit ugly and questionable to use, but it might make sense here.

@josevalim (Contributor, author) commented Jan 9, 2024

> Another, as you suggested, could be some manual interning, using the integer position in the code path list as the key to the map.

I was thinking I could simply use an ever-incrementing integer, so there is no need to rehash as code paths are added or removed. :) So each code path entry becomes a tuple {Path, IntKey} and we keep a separate map.
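
A minimal sketch of that layout, with hypothetical names: path entries become {Dir, IntKey} tuples, per-directory caches live in a separate map keyed by that integer, and new entries take the next value of a counter, so nothing needs rehashing when paths are added or removed:

    %% Hypothetical shape: Path = [{Dir, IntKey}], Caches = #{IntKey => #{ModFile => true}}.
    %% A warm lookup only reads the map; it does not rebuild the path list.
    lookup(ModFile, [{Dir, Key} | Rest], Caches) ->
        case Caches of
            #{Key := Cache} when is_map_key(ModFile, Cache) ->
                {ok, filename:join(Dir, ModFile)};
            _ ->
                %% Miss for this directory; a real implementation would scan
                %% Dir, populate Caches, and retry before moving on.
                lookup(ModFile, Rest, Caches)
        end;
    lookup(_ModFile, [], _Caches) ->
        error.

    %% Adding a path takes the next integer; existing keys are untouched.
    add_path(Dir, NextKey, Path, Caches) ->
        {[{Dir, NextKey} | Path], Caches#{NextKey => #{}}, NextKey + 1}.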

Contributor:

That makes sense as well

@@ -1280,6 +1246,12 @@ with_cache(cache, Dir, ModFile) ->
with_cache(Cache, _Dir, ModFile) when is_map(Cache) ->
{is_map_key(ModFile, Cache), Cache}.

absname_when_relative(File) ->
    case filename:pathtype(File) of
        absolute -> File;
Contributor:

I think this makes the absolute clause in the absname/2 function dead

@josevalim (Contributor, author):

Yeah, I want to look at this later; this change is just to avoid duplication. I am taking notes, so if you have more suggestions, let me know. :)

@jhogberg (Contributor):

There are a few more failed test cases that appear related to this PR: much of release_handler_SUITE in sasl, and otp_SUITE:test_runtime_dependencies_versions/1. Sorry about taking so long to get back :(

@josevalim (Contributor, author):

I did fix one failure (same root cause, passing invalid values to erl_prim_loader).

However, @jhogberg, do you still have the full report of the failures? release_handler_SUITE fails on my machine even without this patch, so I am not sure whether there are more failures lurking. And test_runtime_dependencies_versions is skipped on my machine with the reason "this test case is designed to run in the Erlang/OTP teams test system for daily". If you have the full report, I can double-check, but it may be good now.

@bjorng (Contributor) commented Jan 22, 2024

> release_handler_SUITE fails on my machine even without this patch, so I am not sure whether there are more failures lurking.

Did you run it with an installed Erlang system (make release or make install)? If you run it from the git repository, there will be failing test cases because the directory structure is different from an installed system.

@josevalim (Contributor, author):

@bjorng ah, that may be it; I only did ./otp_build all -a && ./otp_build tests. But we should be good to go now: I have fixed the failures from the reports @jhogberg sent me!

@jhogberg jhogberg merged commit de9942e into erlang:master Jan 25, 2024
11 of 13 checks passed
Labels: team:VM (Assigned to OTP team VM)
4 participants