Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default -code_path_choice to strict #6683

Merged
merged 1 commit into from Feb 2, 2023

Conversation

josevalim
Copy link
Contributor

@josevalim josevalim commented Jan 16, 2023

This PR may be a bit controversial but I would say it is worth it.

The init process has a feature where it tries to preemptively
fix code paths. To do so, it has to traverse all paths and perform
(most commonly) two additional file system lookups.

patch_path(Dirs, strict) ->
Dirs;
patch_path(Dirs, relaxed) ->
ArchiveExt = archive_extension(),
[patch_dir(Dir, ArchiveExt) || Dir <- Dirs].
patch_dir(Orig, ArchiveExt) ->
case funny_split(Orig, $/) of
["nibe", RevApp, RevArchive | RevTop] ->
App = reverse(RevApp),
case funny_splitwith(RevArchive, $.) of
{Ext, Base} when Ext =:= ArchiveExt, Base =:= App ->
%% Orig archive
Top = reverse([reverse(C) || C <- RevTop]),
Dir = join(Top ++ [App, "ebin"], "/"),
Archive = Orig;
_ ->
%% Orig directory
Top = reverse([reverse(C) || C <- [RevArchive | RevTop]]),
Archive = join(Top ++ [App ++ ArchiveExt, App, "ebin"], "/"),
Dir = Orig
end,
%% First try dir, second try archive and at last use orig if both fails
case erl_prim_loader:read_file_info(Dir) of
{ok, #file_info{type = directory}} ->
Dir;
_ ->
case erl_prim_loader:read_file_info(Archive) of
{ok, #file_info{type = directory}} ->
Archive;
_ ->
Orig
end
end;
_ ->
Orig
end.

While trying to optimize the boot process, I have noticed this
does show up when profiling. 1-2ms on my (fast) local machine
for erl and 3-4ms with Elixir (it uses -pa and -pz with increases
the cost for every loaded application).

When trying this on an actual application, such as Livebook which
has 50 dependencies, I got up to 10ms spent on patching paths
on a Linux-like machine typically used for deployment (emulated
with VirtualBox).

The times may be higher for embedded devices and those where
filesystem lookups are more expensive.

Luckily, this can be disabled by setting -code_path_choice strict,
which I argue should be the default. We could also try to optimize
this code but I would argue that, unless there are very good reasons,
it should not be the job of the init system to fix invalid paths.
The faster booter times can be seen as an additional positive side-effect
from removing this behaviour.

Therefore my suggestion is to change the default for this release and
completely remove the relaxed behaviour in future releases.

PS: This PR will most likely break the build. I am waiting for the desired
outcome to fix tests.

This option requires traversing the loadpaths
several times and doing multiple filesystem
lookups during boot.

This shaves 4ms from boot time on my local
machine but, in an application such as Livebook,
it shaves +10ms when booting on a typical Linux
box.
@bjorng
Copy link
Contributor

bjorng commented Jan 16, 2023

We are all for speeding up the boot time. Before we decide whether we can accept this pull request, the important questions are: What will break if we merge this pull request? Do tools that package BEAM files into archive files need to be updated? Do you users of those tools need to do something?

@josevalim
Copy link
Contributor Author

What will break if we merge this pull request? Do tools that package BEAM files into archive files need to be updated? Do you users of those tools need to do something?

While Mix uses archives, the archives are unpacked on installation, so this change does not affect Elixir. However, even before then, we would always point to the correct path. I will request feedback from the rebar3 team and also from the community on ErlangForums.

In any case, there is a chance this pull request will break existing users. If someone is building an archive and they are not passing the correct path inside the archive to the code path, then it will fail unless the appropriate path is given. In other words, the current mode allows you to pass:

-pa app.ez/app/ebin

and

-pa app.ez/ebin

And it will be able to load either app.ez/app/ebin or app.ez/ebin in both cases. This pull request will require the exact path to be given, so I would say it is more of a concern of how users are using archives rather than how they are being built.

Finally, it is important to note that tools that process archives can keep the existing behaviour by doing the traversal themselves once they detect an archive. So they can keep the behaviour if it is extremely important to do so, but outside of the init process.

@dgud
Copy link
Contributor

dgud commented Jan 16, 2023

@josevalim I don't think it is about passing the correct -pa, -pz paths.
It is about using ERL_LIBS environment variable or installing applications/packages to the erlang/lib/ directory.

Then you can argue if a application release should be just a zip of a built application, or a zip of everything below the application dir, i.e. should the zip contain the application name or not, in the zip file.
And I guess that the current behavior provides usage for both types.

So I do believe it is a question of how they are built (if anybody is using zip'ed applications).

@josevalim
Copy link
Contributor Author

josevalim commented Jan 16, 2023

Thanks @dgud for the additional context. The code responsible for loading ERL_LIBS is in the code_server module and the slow down I measured was in relation to init. Furthermore, the relaxed code path feature does not impact code_server as much because it is only invoked when the given path does not exist.

So one alternative suggestion is to disable the relaxed code path feature from -pa, -pz, and bootscripts, but still keep it enabled in the code server. That would effectively address the same concerns. (see below)

@josevalim
Copy link
Contributor Author

josevalim commented Jan 16, 2023

@dgud further updates from studying the code: the code responsible for loading ERL_LIBS and Erlang/OTP lib dir already attempts to load the correct ebin from inside archives, regardless if the code path choice is relaxed or strict.

The code starts here:

https://github.com/erlang/otp/blob/master/lib/kernel/src/code_server.erl#L79-L89

And then leads here:

https://github.com/erlang/otp/blob/master/lib/kernel/src/code_server.erl#L464-L483

So the behavior you describe will continue working regardless of the code_path_choice option. Therefore I believe the proposal is still solid, because it is exactly the responsibility of who is computing the code paths to compute the proper archive paths, instead of doing so after the fact.

@github-actions
Copy link
Contributor

CT Test Results

Tests are running... https://github.com/erlang/otp/actions/runs/4025152334

Results for commit 8ecdf2c

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

  • No CT logs found
  • No HTML docs found
  • No Windows Installer found

// Erlang/OTP Github Action Bot

@frazze-jobb frazze-jobb added the testing currently being tested, tag is used by OTP internal CI label Jan 30, 2023
@frazze-jobb
Copy link
Contributor

Added this to the nightly tests, will check internally if there is anyone protesting against this.

@frazze-jobb frazze-jobb merged commit 0f5d7d7 into erlang:master Feb 2, 2023
@KennethL KennethL added this to the OTP-26.0-rc1 milestone Feb 2, 2023
@josevalim
Copy link
Contributor Author

Thank you @frazze-jobb! Should I add a note to myself to send a PR to remove the option altogether on Erlang/OTP 27?

@frazze-jobb
Copy link
Contributor

Yes do that!

@bjorng
Copy link
Contributor

bjorng commented Feb 3, 2023

We have a page in the documentation for things that are to be removed:

https://www.erlang.org/doc/general_info/scheduled_for_removal.html

Here is an example how we added the text for things to be removed in OTP 26:

system/doc/general_info/scheduled_for_removal_26.inc

@josevalim
Copy link
Contributor Author

I have added the page and changed the docs in #6777.

jhogberg added a commit to jhogberg/otp that referenced this pull request Feb 9, 2023
…strict"

This reverts commit 0f5d7d7, reversing
changes made to a6c774c.
@jhogberg jhogberg mentioned this pull request Feb 9, 2023
josevalim added a commit to josevalim/otp that referenced this pull request Apr 13, 2023
List the changes discussed in erlang#6683 and erlang#6693
as upcoming incompatibilities.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement team:VM Assigned to OTP team VM testing currently being tested, tag is used by OTP internal CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants