-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Read .beam files from cached paths in the client #7981
Conversation
CT Test ResultsNo tests were run for this PR. This is either because the build failed, or the PR is based on a branch without GH actions tests configured. Results for commit de9942e To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts
// Erlang/OTP Github Action Bot |
43c9579
to
7e9ad5e
Compare
erlang:display({?MODULE,file_error}), | ||
erlang:display(Report) | ||
end, | ||
error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to this function is just a refactoring to remove a case catch
.
7e9ad5e
to
bc9522d
Compare
Thanks for the PR, I'll add it to the nightly runs after new year's eve. :) |
bc9522d
to
ad0282b
Compare
Thank you and happy holidays! 🍾 This is a note to myself on future work:
|
|
ad0282b
to
a50b7e3
Compare
@jhogberg apologies, I have pushed a fix. It was actually a silly failure. This line was failing: otp/lib/kernel/test/code_SUITE.erl Line 167 in 20e9c2f
As you can see, it is pushing a bad path to the code server. However, instead of discarding invalid code paths, the otp/erts/preloaded/src/erl_prim_loader.erl Lines 89 to 95 in 20e9c2f
When I moved the actions to the client, I did not add the I understand why we want to avoid these essential processes from crashing but I think it is a bad practice for the In a future PR, I will push more guards to PS: If you want me to add |
Thanks!
Sounds good. |
end; | ||
false -> | ||
{{error, bad_directory}, Path} | ||
Dir = filename:join([Dir0]), % Normalize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this already done in the client in normalize_paths
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really:
1> filename:join(["foo//bar"]).
"foo/bar"
We could move it to normalize paths though and do more work on the client (it is on my to do list), but I am being very conservative with these changes. For this PR, I tried to change the code_server
the minimum necessary to support the changes to erl_prim_loader
.
I also want to revise all uses of filename:*
in code_server
, because I think there is some duplication too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That definitely makes sense
@@ -1241,23 +1209,23 @@ loader_down(#state{loading = Loading0} = St, {Mod, Bin, FName}) -> | |||
St | |||
end. | |||
|
|||
mod_to_bin([{Dir, nocache}|Tail], ModFile, Acc) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this code, I think we should split the cache list from the path list.
In this function, every time we lookup a module we update the entire codepath - this might be quite a lot of work and extra allocations.
I think it would be quite beneficial for performance to split out the Cache
as a map from the code path - this way once the cache is warmed up, this lookups will do no to little allocations.
This can be done as a follow-up PR, though, since this design of updating the codepath data-structure on every module lookup is present today already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting!
I think it would be quite beneficial for performance to split out the Cache as a map from the code path - this way once the cache is warmed up, this lookups will do no to little allocations.
Would you have the map keys be the code paths? Then lookup is a bit more expensive. But I guess we could also build unique integers to associate them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right this could have some implications of lookup time.
One option could be to maintain two parallel lists, and updating the cache only if an actual update happened - it would have similar performance characteristics as today, but would not retain the updated list if no updates actually happened (which should be most of the time), but would still produce some garbage on every lookup.
With a map, one approach could be to "intern" the paths by converting them to atoms, though I'm not sure if there are any code path length implications on that, and atoms sometimes have a poor hash distribution, which might impact lookup performance. Another, as you suggested could be some manual interning and using the integer position in the code path list as the key to the map.
It might also be worth it to explore using pdict as the storage - traditionally it's the fastest one, though a bit ugly & questionable to use, but might make sense here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another, as you suggested could be some manual interning and using the integer position in the code path list as the key to the map.
I was thinking I could simply have an always incrementing integer, this way there is no need to rehash as code paths are added or removed. :) So the code path is a tuple {Path, IntKey} and we have a separate map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense as well
@@ -1280,6 +1246,12 @@ with_cache(cache, Dir, ModFile) -> | |||
with_cache(Cache, _Dir, ModFile) when is_map(Cache) -> | |||
{is_map_key(ModFile, Cache), Cache}. | |||
|
|||
absname_when_relative(File) -> | |||
case filename:pathtype(File) of | |||
absolute -> File; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes the absolute
clause in the absname/2
function dead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I want to look at this later, this is just to avoid duplication. I am taking notes, if you have more suggestions, let me know. :)
There's a few more failed testcases that appear related to this PR: much of |
This reduces the amount of work on the code server and erl_prim_loader, by loading data on the client whenever possible. This is done by introducing erl_prim_loader:read_file/1, which stays closer to Erlang's file reading semantics. Future work should fully remove the concept of paths from erl_prim_loader and move them into init.
a50b7e3
to
de9942e
Compare
I did fix one failure (same root cause, passing invalid values to However, @jhogberg, do you still have the full report of the failures? |
Did you run it with an installed Erlang system? ( |
This reduces the amount of work on the code server
and erl_prim_loader, by loading data on the client
whenever possible. This speeds up compilation of
Livebook from 7.2s to 6.9s (roughly 4%) and should
keep fewer binary references around.
This is done by introducing
erl_prim_loader:read_file/1
,which stays closer to Erlang's file reading semantics
and does not rely on
erl_prim_loader
code paths.erl_prim_loader:get_file/1
anderl_prim_loader:get_path/0
should be considered deprecated.
Future work should fully remove the concept of paths
from
erl_prim_loader
and move them intoinit
.