Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the Erlang installation directory relocatable on Unix-like systems #2879

Conversation

JeromeDeBretagne
Copy link
Contributor

Hello,

Here is the second pull request to address ERL-1323 following #2863, this time making the erl and start scripts independent of their actual location. This allows the Erlang installation directory to be moved freely on Unix-like systems.

The idea is to find the absolute path of the script from within itself, as was mentioned in the initial ERL-1323 discussion, thus allowing to set ROOTDIR dynamically when needed. It supports the 2 copies of each script, the ones in erts-%VSN%/bin and the ones in bin.

At the moment, the usage and configuration of FINAL_ROOTDIR is kept as-is for full backwards compatibility with the existing installation scripts. It should be possible to remove FINAL_ROOTDIR totally in a second stage if this new dynamic approach proves to be robust enough on all supported systems, which I expect to be the case.

I've tested the updated erl script a lot and I've applied exactly the same change to the start script but I don't use start myself. So feel free to have a second look at the potential impacts / side effects for start users.

Let me know your thoughts and feedback about this proposition.

Thanks, Jérôme

@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Nov 23, 2020
@kjellwinblad
Copy link
Contributor

I have already merged the first commit in this pull-request to the master branch. Can you remove the first commit and rebase the second commit on master?

while [ -h "${prog}" ]; do
newProg=`ls -ld "${prog}"`
newProg=`expr "${newProg}" : ".* -> \(.*\)$"`
if expr "x${newProg}" : 'x/' >/dev/null; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this checks if $newProg is an absolute path but I don't fully understand how it works. Can you explain how expr "x${newProg}" : 'x/' >/dev/null; works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Kjell,

Indeed, this checks if newProg starts with a leading / character via a regular expression match:

$ expr "x/test" : 'x/' > /dev/null
$ echo $?
0

$ expr "xtest" : 'x/' > /dev/null
$ echo $?
1

=> the expression returns 0 if there is a match, 1 otherwise.

P.S. The returned output is the number of characters that matched, but this is not relevent hence the >/dev/null part

$ expr "x/test" : 'x/'
2

In the above case, it shows that the first 2 characters of "x/test" have matched with 'x/'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation! The x character confused me.

Is the following:

expr "x${newProg}" : 'x/' > /dev/null"

not equivalent to:

expr "${newProg}" : '/' > /dev/null"

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not equivalent in all cases depending on the value of newProg. You can see a similar change here:

Be more POSIXly correct in the use of expr.
https://android.googlesource.com/platform/dalvik/+/530e0a50c365bd8451209a014fef4cf293e573c1

that uses exactly the same convention. I won't pretend to be a script guru to explain it in details though.

You can see an example suggesting to use expr X$a = X= reliably here as another reference:
https://pubs.opengroup.org/onlinepubs/009695399/utilities/expr.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, x is used to make sure "${newProg}" is not interpreted as an operation. I think it would be good to add some comments about the non-obvious parts to make it more readable by people that are not so familiar with shell scripting.

Btw, I have got a question about if one could use a command like realpath instead of doing the link traversal in the script. I will investigate realpath but please let me know if you have some insight into that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation about realpath.

I think the comments are good. They make it much easier for readers to understand what is happening.

@rickard-green suggested that we could make our own version of realpath (based on the C function realpath) that we can put under the same directory as the erl script. The realpath C function is part of a POSIX standard so it is hopefully available on all relevant platforms. The main benefits of this approach as I see it is that it is potentially more robust and that we would need less complex code in the erl and start scripts. @JeromeDeBretagne What do you think about this idea?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this scripting has been used in the Android Open Source Project for many years, I believe this is a good sign that it is quite rubust. I usually prefer incremental changes, and the current proposal is a limited modification of existing scripts that have proven to support many cases (native, cross, embedeed, etc.) with only an added dependency on ls.

I'm not a fan at all of the idea to add another executable in the Erlang installation directory, needed to simply launch erl, I don't see that as adding robustness. If the Erlang/OTP team is considering bigger changes, I would suggest modifications that vastly simplify the Erlang runtime build and installation, for instance :

  • decrease the number of installed executables
  • get rid of some scripts in favor of native executables
  • remove duplicates between bin/ and erts-$VSN$/bin
  • remove the ./Install step
  • better support for incrementable builds

That's only my (very personal) view though and it could be discussed within one of the EEF Working Groups I imagine.

@kjellwinblad @rickard-green I still plan to spend more time investigating how to add relevant test cases as suggested, so please let me know if the team still considers this PR a good direction. Thank you.

Copy link
Contributor

@kjellwinblad kjellwinblad Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeromeDeBretagne We have discussed this pull-request a lot internally in the OTP team today. I think you can take a break from working with this until we have decided what approach to take.

There is already a C program call dyn_erl.c that gets installed in bin directory of a release and seems to do similar things as your script. This program was created a long time ago (I did not know about it until today). So other good solutions might be to call that program if $ROOTDIR is incorrect or replace the script with that program. We will let you know when we have made a decision. Thank you for your work so far and sorry for making you do something that might not be accepted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kjellwinblad for your feedback and no worries at all, I've learnt a lot while working on this contribution.

I didn't know about dyn_erl until today either. It doesn't seem covered much in the official documentation and it's not found in the main bin directory (but it's in the erts-$VSN%/bin). That's what I meant by "simplify" in my previous comment :-)

Good catch, dyn_erl looks very similar to what I was trying to achieve indeed and there is already a "realpath look alike" method commented out. I'm really keen to see which direction the OTP team will take on this topic!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for jumping in, but @hawk do you remember anything about this?
I guess (without looking) that this is your work a long long time ago,
you had the same ideas and did it for windows, why was it never done for unix?
Any lessons you learned and remember that we don't? :-)

@kjellwinblad
Copy link
Contributor

We use lower_case_with_underscores for variable names in most shell scripts. Can you change the variable names so they use this naming convention?

@kjellwinblad
Copy link
Contributor

kjellwinblad commented Nov 27, 2020

It would be great if we could add an automated test for the symlink traversal code. That way we could see if something breaks in any of the machines in Erlang/OTP's testing infrastructure.

A good place to add such a test case would be in this test suite:

$ERL_TOP/erts/test/erlexec_SUITE.erl

You can check if the test is running on a UNIX system similarly to how it is done on line 287 in erlexec_SUITE.erl.

A testcase could check that one can execute different types of symlinks (one or more jumps, relative paths and absolute paths) that points to the erl script.

This document describes how to run test cases:

https://github.com/erlang/otp/blob/master/HOWTO/TESTING.md

Please let me know if you need help with anything related to testing.

@JeromeDeBretagne JeromeDeBretagne force-pushed the jdb/erts/make-Erlang-installation-dir-relocatable/ERL-1323 branch from 9cfa77e to f675f57 Compare November 27, 2020 23:21
@JeromeDeBretagne
Copy link
Contributor Author

Hello Kjell,

I have already merged the first commit in this pull-request to the master branch. Can you remove the first commit and rebase the second commit on master?

Here is a second revision with the (already merged) first commit removed and the second one rebased on master.

We use lower_case_with_underscores for variable names in most shell scripts. Can you change the variable names so they use this naming convention?

I've switched to the lower_case_with_underscores convention for script variable names.

I think it would be appropriate if you add a comment with a reference to the dx code.

I've also added a comment making a reference to the dx script as suggested.

It would be great if we could add an automated test for the symlink traversal code. That way we could see if something breaks in any of the machines in Erlang/OTP's testing infrastructure.

I'm not familiar at all with the test suite, I'll need much more time to investigate it and see how a test could fit into erlexec_SUITE.erl.

A testcase could check that one can execute different types of symlinks (one or more jumps, relative paths and absolute paths) that points to the erl script.

Please let me know if you need help with anything related to testing.

Do you have a suggestion about how to create the symlinks on the fly as part of one or more tests, especially the ones with an absolute path? Could you share a small sample code maybe?

Thanks, Jérôme

@kjellwinblad
Copy link
Contributor

kjellwinblad commented Nov 30, 2020

Hi JeromeDeBretagne,

Do you have a suggestion about how to create the symlinks on the fly as part of
one or more tests, especially the ones with an absolute path? Could you share a small sample code maybe?

You can get the path of a folder where you can put the new symlinks by executing this:

filename:join(filename:dirname(code:which(?MODULE)), "erlexec_SUITE_data")

There is an Erlang function for creating symlinks.

The filename:absname functions are convenient for getting an absolute path from a relative path.

We use the common test framework for our test suits.

The os:cmd/1 function can be used to execute external commands.

It can be useful to know that you can build and execute a single test case in a test suite by running a single command. For example, to execute the args_file test case in erlexec_SUITE (make sure that ERL_TOP is set to the root of Erlang/OTP source tree):

(cd $ERL_TOP/erts/ &&  make ARGS="-suite erlexec_SUITE -case args_file" test)

Please let me know if you get stuck on something and I will try to help you.

@JeromeDeBretagne
Copy link
Contributor Author

Hi @kjellwinblad ,

Thanks for the inputs, I'll do my best to look into this in the coming days.

I'm currently spending some time back on #2863 as Lukas has detected that it could be the source of test case regression. Your second look would be appreciated on that one too.

Update the erl and start scripts to find and set ROOTDIR dynamically when
needed, removing the dependency on a preset hardcoded absolute ROOTDIR path.
This makes the 2 scripts independent of their actual location and allows the
Erlang installation directory to be moved freely on Unix-like systems.

Keep the usage and configuration of FINAL_ROOTDIR as-is at the moment for
backwards compatibility.
@JeromeDeBretagne JeromeDeBretagne force-pushed the jdb/erts/make-Erlang-installation-dir-relocatable/ERL-1323 branch from f675f57 to 1e0a6a9 Compare December 3, 2020 22:30
@JeromeDeBretagne
Copy link
Contributor Author

Hi @kjellwinblad , Hi @dgud,

We will let you know when we have made a decision. Thank you for your work so far and sorry for making you do something that might not be accepted.

As OTP 24 is getting closer, do you have a recommandation about the approach the team would prefer on this topic (#4461)?

Could this first PR be accepatble for OTP 24, with an improvement to investigate for OTP 24.X or 25 later on?

If the OTP team would prefer a bigger/better change in one go, could you drop some hints about a more acceptable direction, so that the community can start investigating it and try to propose another PR?

Thank you, Jérôme

@hawk
Copy link
Contributor

hawk commented Mar 14, 2021

I did this 5 years ago for escript. Regardless of the final solution, I think it is good if the same solution can be used for all ways of starting the system.

1d8dbc7

@kjellwinblad
Copy link
Contributor

Hi @JeromeDeBretagne,

Sorry for the long time to respond. I will discuss with the team if it is possible to bring this PR into 24. Unfortunately, I think the most likely answer is that it will be too late (which is our fault) as we have already released rc2.

If the OTP team would prefer a bigger/better change in one go, could you drop some hints about a more acceptable direction so that the community can start investigating it and try to propose another PR?

Yes, we have a ticket in our backlog about deciding what direction we want to go. We will update this PR when we have made a decision.

@hawk thanks. Yes, it would be good to reuse the same solution in all ways one can start the system.

@JeromeDeBretagne
Copy link
Contributor Author

Hi @kjellwinblad,

Thanks for the update, waiting for the team feedback then!

@kjellwinblad
Copy link
Contributor

@JeromeDeBretagne wrote:

Thanks for the update, waiting for the team feedback then!

Now, we have finally decided on which direction we want to go with this. Thanks for your patience.

We have decided and discussed the following:

  • Having the installation directory relocatable is something we want to have ready for the Erlang/OTP 25 release. Preferable as soon as possible so we can test it well before the release.
  • We want to have a single way to get the absolute path of the installation directory. We, therefore, want to replace the erl script with a C program. The dyn_erl.c program seems to do what we want but we have to make sure that nothing breaks when replacing the erl script with this program.
  • We need to investigate if the start script needs to be kept as a script (it might be documented that users can modify the start script to customize the start). If start needs to be kept as a script, we can get the absolute path of the installation by calling a C program (preferable the same as the one that will be the new erl program). The program that we call from the start script must always be at the same location relative to the start script.
  • The install script also adds absolute paths to the RELEASES file. The RELEASES file is used when doing upgrades. We have to replace the absolute paths in the RELEASES file with relative paths and ensure that this does not break anything. There are some test cases in the SASL test suite related to release handling that one can use to test this, but it is not certain that these tests cover everything.
  • We should also try to make sure that the changes we do to make an Erlang installation relocatable do not break other tools. @ferd and @tsloughter do you think the change described in this message will affect rebar3 in any way?

@JeromeDeBretagne, if you want to continue to work on this, we would be happy to assist you in whatever way we can.

@tsloughter
Copy link
Contributor

I don't think this will effect rebar3. Currently releases generated with rebar3 will create a RELEASES file that uses relative paths, like {sasl,"4.0.1","./lib/sasl-4.0.1"},

This will be great, I was recently dealing with adding support for installing binary bundles of OTP to erlup (https://github.com/tsloughter/erlup/) and had forgotten about this issue until realizing I had to run INSTALL from the bundle because of paths needing to be setup in multiple files.

@JeromeDeBretagne
Copy link
Contributor Author

Thank you for sharing the goals and steps around this evolution, that sounds like a good plan. I can't tell yet if I'll have the bandwidth to look at it soon, I'm working on another evolution atm.

It seems it will require slightly different skills compared to my original PR (which was simply reusing existing scripts) with some changes to the build process needed this time. That's not the easiest part of the Erlang project to modify, at least for me.

To raise awareness with the community, I would suggest to create a dedicated issue with the help-wanted flag (or a few more granular ones?) replacing #4461 and describing the target changes. Then let's see how to get the discussion started there, with one or more contributions either from the OTP team or the community. What do you think?

In fact, if you already have a list of goals for the OTP 25 release that you can share publicly, it would be great if you could describe them in a similar way through the issue tracker, to see if some of them could be worked on by the community. Just my 2 cents.

@kjellwinblad kjellwinblad added the stalled waiting for input by the Erlang/OTP team label Aug 12, 2021
@kjellwinblad
Copy link
Contributor

@JeromeDeBretagne Thanks for your suggestion in #2879 (comment) . I think your suggestions are good and have discussed them with the Erlang/OTP team at Ericsson. To formalize goals and planned features in a way that is accessible to the community requires some work so there is a trade-off that needs to be made. That being said, we are always trying to improve our way of working so it is very possible that we will try to become more public about goals so that we can get more help from external contributors.

FYI, I'm starting to work on implementing what is outlined in #2879 (comment) now.

@ferd
Copy link
Contributor

ferd commented Oct 27, 2021

I don't think this will effect rebar3. Currently releases generated with rebar3 will create a RELEASES file that uses relative paths, like {sasl,"4.0.1","./lib/sasl-4.0.1"},

This will be great, I was recently dealing with adding support for installing binary bundles of OTP to erlup (https://github.com/tsloughter/erlup/) and had forgotten about this issue until realizing I had to run INSTALL from the bundle because of paths needing to be setup in multiple files.

The one edge case I recall hitting with releases was specifically with relups. IIRC the issue is that Rebar3 uses the relative paths (a good thing for relocatable stuff) while the VM uses absolute paths, which had two unforeseen consequences before this PR:

  1. The release boots fine, and upgrades fine with a relup, but will not rollback to the first version because of the path difference. Further installs and unpacks worked fine
  2. If the code in the release changes directories from the default one at startup (using c:cd/1 or file:set_cwd/1), then the operations around SASL and the release handlers started failing.

Being able to have the release know about the original absolute release path and use the RELEASES file's relative paths together would likely improve things on both issues.

kjellwinblad added a commit to kjellwinblad/otp that referenced this pull request Nov 17, 2021
Previously the shell scripts (e.g., erl and start) and the RELEASES
file that are created when generating and installing an Erlang
release depended on hard coded absolute path to the release' root
directory. This made it cumbersome to move a release to a different
directory and can be problematic for platforms such as Android
(erlang#2879) where the installation
directory is unknown at compile time. This is fixed by:

* Changing the scripts so that they can dynamically find the
  ROOTDIR. The dynamically found ROOTDIR is selected if it differs
  from the hard-coded ROOTDIR and seems to point to a valid Erlang
  release. The dyn_erl program is changed so that it can return its
  absolute canonicalized path when given the --realpath argument
  (dyn_erl gets its absolute canonicalized path from the realpath
  POSIX function).

* Changing the release_handler module that reads and writes to the
  RELEASES file so that it prepends code:root_dir() whenever it
  encounters relative paths. This is necessary since the current
  working directory can be changed so it is something different than
  code:root_dir().
kjellwinblad added a commit to kjellwinblad/otp that referenced this pull request Nov 19, 2021
Previously shell scripts (e.g., `erl` and `start`) and the RELEASES file
for an Erlang installation depended on hard coded absolute path to the
installation's root directory. This made it cumbersome to move a
release to a different directory and can be problematic for platforms
such as Android (erlang#2879) where the
installation directory is unknown at compile time. This is fixed by:

* Changing the shell scripts so that they can dynamically find the
  `ROOTDIR`. The dynamically found `ROOTDIR` is selected if it differs
  from the hard-coded `ROOTDIR` and seems to point to a valid Erlang
  release. The `dyn_erl` program has been changed so that it can
  return its absolute canonicalized path when given the `--realpath`
  argument (dyn_erl gets its absolute canonicalized path from the
  realpath POSIX function). The `dyn_erl`'s `--realpath`
  functionality is used by the scripts to get the root dir dynamically.

* Changing the release_handler module that reads and writes to the
  RELEASES file so that it prepends `code:root_dir()` whenever it
  encounters relative paths. This is necessary since the current
  working directory can be changed so it is something different than
  `code:root_dir()`.
kjellwinblad added a commit to kjellwinblad/otp that referenced this pull request Nov 19, 2021
for an Erlang installation depended on hard coded absolute path to the
installation's root directory. This made it cumbersome to move a
release to a different directory and can be problematic for platforms
such as Android (erlang#2879) where the
installation directory is unknown at compile time. This is fixed by:

* Changing the shell scripts so that they can dynamically find the
  `ROOTDIR`. The dynamically found `ROOTDIR` is selected if it differs
  from the hard-coded `ROOTDIR` and seems to point to a valid Erlang
  installation. The `dyn_erl` program has been changed so that it can
  return its absolute canonicalized path when given the `--realpath`
  argument (dyn_erl gets its absolute canonicalized path from the
  `realpath` POSIX function). The `dyn_erl`'s `--realpath`
  functionality is used by the scripts to get the root dir dynamically.

* Changing the `release_handler` module that reads and writes to the
  `RELEASES` file so that it prepends `code:root_dir()` whenever it
  encounters relative paths. This is necessary since the current
  working directory can be changed so it is something different than
  `code:root_dir()`.
kjellwinblad added a commit to kjellwinblad/otp that referenced this pull request Nov 19, 2021
Previously shell scripts (e.g., `erl` and `start`) and the RELEASES file
for an Erlang installation depended on hard coded absolute path to the
installation's root directory. This made it cumbersome to move a
release to a different directory and can be problematic for platforms
such as Android (erlang#2879) where the
installation directory is unknown at compile time. This is fixed by:

* Changing the shell scripts so that they can dynamically find the
  `ROOTDIR`. The dynamically found `ROOTDIR` is selected if it differs
  from the hard-coded `ROOTDIR` and seems to point to a valid Erlang
  installation. The `dyn_erl` program has been changed so that it can
  return its absolute canonicalized path when given the `--realpath`
  argument (dyn_erl gets its absolute canonicalized path from the
  `realpath` POSIX function). The `dyn_erl`'s `--realpath`
  functionality is used by the scripts to get the root dir dynamically.

* Changing the `release_handler` module that reads and writes to the
  `RELEASES` file so that it prepends `code:root_dir()` whenever it
  encounters relative paths. This is necessary since the current
  working directory can be changed so it is something different than
  `code:root_dir()`.
@kjellwinblad
Copy link
Contributor

I have created another PR (#5427) implementing the suggestion outlined in #2879 (comment). We can close this PR if that PR gets merged.

@ferd Thank you for your feedback. I have added a test case that (I think) tests the two unforeseen consequences you described, and it seems to be working fine now. Btw, your book was of great help to me when trying to figure out how release handling in Erlang works.

garazdawi pushed a commit to kjellwinblad/otp that referenced this pull request Dec 13, 2021
Previously shell scripts (e.g., `erl` and `start`) and the RELEASES file
for an Erlang installation depended on hard coded absolute path to the
installation's root directory. This made it cumbersome to move a
release to a different directory and can be problematic for platforms
such as Android (erlang#2879) where the
installation directory is unknown at compile time. This is fixed by:

* Changing the shell scripts so that they can dynamically find the
  `ROOTDIR`. The dynamically found `ROOTDIR` is selected if it differs
  from the hard-coded `ROOTDIR` and seems to point to a valid Erlang
  installation. The `dyn_erl` program has been changed so that it can
  return its absolute canonicalized path when given the `--realpath`
  argument (dyn_erl gets its absolute canonicalized path from the
  `realpath` POSIX function). The `dyn_erl`'s `--realpath`
  functionality is used by the scripts to get the root dir dynamically.

* Changing the `release_handler` module that reads and writes to the
  `RELEASES` file so that it prepends `code:root_dir()` whenever it
  encounters relative paths. This is necessary since the current
  working directory can be changed so it is something different than
  `code:root_dir()`.
garazdawi pushed a commit to garazdawi/otp that referenced this pull request Dec 13, 2021
Previously shell scripts (e.g., `erl` and `start`) and the RELEASES file
for an Erlang installation depended on hard coded absolute path to the
installation's root directory. This made it cumbersome to move a
release to a different directory and can be problematic for platforms
such as Android (erlang#2879) where the
installation directory is unknown at compile time. This is fixed by:

* Changing the shell scripts so that they can dynamically find the
  `ROOTDIR`. The dynamically found `ROOTDIR` is selected if it differs
  from the hard-coded `ROOTDIR` and seems to point to a valid Erlang
  installation. The `dyn_erl` program has been changed so that it can
  return its absolute canonicalized path when given the `--realpath`
  argument (dyn_erl gets its absolute canonicalized path from the
  `realpath` POSIX function). The `dyn_erl`'s `--realpath`
  functionality is used by the scripts to get the root dir dynamically.

* Changing the `release_handler` module that reads and writes to the
  `RELEASES` file so that it prepends `code:root_dir()` whenever it
  encounters relative paths. This is necessary since the current
  working directory can be changed so it is something different than
  `code:root_dir()`.
@garazdawi
Copy link
Contributor

#5427 is now merged, so I'm closing this PR. Thank you @JeromeDeBretagne for your help and for pushing us to finally fix this problem.

@garazdawi garazdawi closed this Dec 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stalled waiting for input by the Erlang/OTP team team:VM Assigned to OTP team VM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants