New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opam deleted home directory #3316

Open
nerpderp83 opened this Issue Apr 26, 2018 · 19 comments

Comments

Projects
None yet
6 participants
@nerpderp83

nerpderp83 commented Apr 26, 2018

This issue was previously logged @ #3231 , I am re-submitting here to raise visibility incase it hits anyone else.

If your ocaml compiler version changes, opam will do an upgrade and I think if you don't have the opam vars set in the current shell, it will do an rm -rf from the root of your system.

See #3231 (comment)

@nerpderp83

This comment has been minimized.

nerpderp83 commented Apr 26, 2018

Given how much is controlled by env vars, here is the offending logic "patched".

camlp5/camlp5@d561693#diff-b67911656ef5d18c4ae36cb6741b7965

$(RM) -rf "$(DESTDIR)$(LIBDIR)/$(CAMLP5N)"

If the above env vars are not set, that command decays into rm -rf /.

Maybe opam shouldn't proceed with any action unless the proper env vars are set, including in subshells?

@dra27

This comment has been minimized.

Contributor

dra27 commented Apr 26, 2018

Sorry you've been hit by this :(

A few things to clarify, though:

  • opam does indeed recompile a system switch if your system compiler is upgraded
  • this issue only occurs if you have a faulty version of camlp5 installed
  • I think that running opam update to update your package repository first mitigates this (is that right, @AltGr, or does opam use the old definition before uninstalling?)

opam's starting to use sand-boxing where available to prevent this - there's not a lot opam can do with environment variables here.

@nerpderp83

This comment has been minimized.

nerpderp83 commented Apr 26, 2018

I was in a new shell that afaik didn't have the opam env vars set, I had just done an upgrade of ocaml, opam detected the change and offered to upgrade opam for me.

I think if the env vars are not set and/or they are not set in a fresh subshell (for what ever reason), then opam should exit with an informative error. There are only a couple cases where opam should still function with out proper environment variables.

@dra27

This comment has been minimized.

Contributor

dra27 commented Apr 26, 2018

What are you regarding as "proper" environment variables? The fault here is that camlp5's configure script fails, leaving LIBDIR and DESTDIR unset, as you note above - it's not opam which sets those.

@AltGr

This comment has been minimized.

Member

AltGr commented Apr 27, 2018

NOTE TO ANY USERS IN DOUBT

The safe things to do

  • don't run any opam commands
  • either rm -rf ~/.opam (it's then safe to init again)
  • or wait for more detailed instructions once we triple-check them, for a less destructive approach
  • or update to the pre-release of opam 2.0.0

I am very sorry about this. Due to the issues you mentionned, we have been running to integrate sandboxing as quickly as possible in opam 2.0, and this is the main reason for the rc2 delay (we want to make sure it will work properly for everyone). Support is now in opam's master branch, but not yet in the release.

I think that running opam update to update your package repository first mitigates this (is that right, @AltGr, or does opam use the old definition before uninstalling?)

Unfortunately, to ensure consistency, opam 2 will use the definition it has been using for installation when removing, so opam update will not help here.

I was in a new shell that afaik didn't have the opam env vars set

it's not related to opam (would actually happen if compiling camlp5 from the command-line too!), these variables are expected to be set in the package's own Makefile.configure through its ./configure script.

Again, sorry, I was hoping we could ship the sandboxing update quickly enough to avoid such disasters.

@thierry-martinez

This comment has been minimized.

thierry-martinez commented Apr 27, 2018

The faulty rm -rf / will only occur if opam tries to build camlp5 version <= 7.04 with OCaml 4.06.1:
./configure will fail and then, opam will still try to execute make uninstall whereas the configuration files have been left blank.

Therefore:

  • yes, opam update will help indeed, because the faulty removing only happens for new builds (that has even not been installed yet!), and opam update will make opam use camlp5 version >= 7.05 that has been fixed;
  • it would be certainly a good idea to fix opam behaviour so that opam does not try to remove a package that has not been installed (because of failure during the building phase). It will be even better if opam 1.2 and 1.3 could be patched, but I am not sure how to make the patch myself.
@pmetzger

This comment has been minimized.

Member

pmetzger commented Apr 29, 2018

What are you regarding as "proper" environment variables? The fault here is that camlp5's configure script fails, leaving LIBDIR and DESTDIR unset, as you note above - it's not opam which sets those.

True enough, but cold comfort to someone who has just had their system blown away. Perhaps if OPAM notices that an older version of camlp5 is about to be used in the offending manner, even though this is a special case, it should refuse to proceed.

@pmetzger

This comment has been minimized.

Member

pmetzger commented Apr 29, 2018

@AltGr, you say:

The safe thing to do is to REMOVE any opam switch based on OCaml 4.06+ as soon as possible, using opam switch remove, AND run opam update.

Can one then restore 4.06, or does one have to leave it disabled to be safe? If the former, how can one check if one is already safe or not?

@dbuenzli

This comment has been minimized.

Contributor

dbuenzli commented Apr 29, 2018

Perhaps if OPAM notices that an older version of camlp5 is about to be used in the offending manner, even though this is a special case, it should refuse to proceed.

AFAIU. This problem is horrible because basically we can't do anything, if you opam update you'll never run into the issue as the package has been fixed. What you propose here is to update opam which is totally equivalent. Basically we can only pray that none will try to use opam with an outdated opam-repository that has the faulty version of the offending package. This will decoy with time but for now it seems people still run into the issue which is truly gruesome but I'm afraid there's no real fix here for that particular instance of the problem.

@pmetzger

This comment has been minimized.

Member

pmetzger commented Apr 29, 2018

This will decoy with time but for now it seems people still run into the issue which is truly gruesome but I'm afraid there's no real fix here for that particular instance of the problem.

Perhaps an announcement on the caml mailing list and on discuss.ocaml.org with precise diagnosis and mitigation instructions would help then? I only found out about this by accident.

@dra27

This comment has been minimized.

Contributor

dra27 commented Apr 29, 2018

@pmetzger wrote:

True enough, but cold comfort to someone who has just had their system blown away.

Indeed, my first comment noted my sympathy to the situation. Context matters, please.

I was investigating this in detail with @AltGr on Friday. If you downgrade your system compiler back to 4.06.0, do an opam update and then upgrade the system compiler back to 4.06.1 you would be fine, yes. The only other way at the moment is manually editing things in ~/.opam very carefully (you have to remove camlp5 and any packages which depend on it from ~/.opam/system/installed and then run opam update).

So, we reckon that the only reliable way to fix this is going to be to push an update to opam 1.2.2 - possibly expressly for macOS (so via homebrew and, if it's also affected, MacPorts). The problem in opam 1.2.2 is that the question to upgrade the system compiler is asked before anything else can possibly process. The first fix will be to defer the "upgrade your system compiler" question when you run opam update and the second fix will be to add a special case which either detects the broken camlp5 package being present and requires an opam update to have been run or just detects any camlp5 and emits a strong warning.

My opinion is that the risk of announcing this even more widely before that patch is available is that there will be users who will fail to follow the instructions properly and answer "yes" to the "upgrade your system compiler" question so, especially at the weekend, I think it would be better not to post this to Discuss - what do you think? (and @AltGr, @avsm?)

I don't use macOS, but if I did, I'd really want GNU coreutils overriding rm:

dra@bionic:~$ sudo rm -rf /
rm: it is dangerous to operate recursively on '/'
rm: use --no-preserve-root to override this failsafe

(added in GNU coreutils 5.1.0 in November 2003)

@pmetzger

This comment has been minimized.

Member

pmetzger commented Apr 29, 2018

Suggestion: would a shell script to let one check if one is vulnerable, possibly followed by saving the list of installed packages and forcing the removal of the bad switch, and then reinstallation, be feasible?

@pmetzger

This comment has been minimized.

Member

pmetzger commented Apr 29, 2018

Indeed, my first comment noted my sympathy to the situation. Context matters, please.

Oh, that came through. It may not have been obvious in what I then wrote, written language doesn't convey tone that well. Mostly I'm just encouraging some sort of action to be taken, though I don't know what exactly would work.

@dra27

This comment has been minimized.

Contributor

dra27 commented Apr 29, 2018

@pmetzger - the script is a good idea (which I'm now working on!). The problem is that if you're affected (so you have the old metadata version of camlp5.7.03 installed in a system switch and have upgraded to 4.06.1) then any opam command will offer to upgrade the system switch. So any script must do it by inspecting the files manually.

@pmetzger

This comment has been minimized.

Member

pmetzger commented Apr 29, 2018

So any script must do it by inspecting the files manually.

Unfortunate. Then again, this whole situation is unfortunate. 🙁🙁🙁

@AltGr

This comment has been minimized.

Member

AltGr commented Apr 30, 2018

it would be certainly a good idea to fix opam behaviour so that opam does not try to remove a package that has not been installed (because of failure during the building phase). It will be even better if opam 1.2 and 1.3 could be patched, but I am not sure how to make the patch myself.

that's what opam 2 does, but opam 1.2 didn't enforce separation of build and install phases (opam 1.1 not having the distinction), so that hadn't been added yet.

I'll edit my post above to remove the mention of running opam switch remove, according to @dra27's findings. :(
At least these findings conclude that opam 2 is completely safe (from that specific issue, should I precise... not in general until sandboxing is built in, which I am actively working on)

@nerpderp83

This comment has been minimized.

nerpderp83 commented May 13, 2018

Original issue submitter here. I appreciate the amount of discussion all of you have put around this. One outcome is that I no longer run homebrew AND opam (if that is entirely possible) as my main user account, but most people still probably will.

Could opam coordinate with the homebrew folks and update or rework the upgrade that homebrew does?

As I commented on in 3231#issuecomment-384712118 I did the following steps.

  1. brew upgrade ocaml
  2. opam init

So brew had a window of opportunity to slip a fix in that might have prevented this.

@dra27

This comment has been minimized.

Contributor

dra27 commented May 15, 2018

@nerpderp83 - @avsm has suggested this, too. Although it would fix the problem, I'm not keen on the idea of a package manager invoking opam update, but it could be worth the homebrew ocaml package running that scanner script - assuming there's a clear opportunity for homebrew to display the results of running it. That said, even a message would be an improvement.

All that said, homebrew seems to update about 90 seconds after a release - this problem will be largely eliminated by OCaml 4.07.0's release to homebrew. How long do previous versions hang around in homebrew?

@nerpderp83

This comment has been minimized.

nerpderp83 commented May 15, 2018

@dra27

By

slip a fix

I mean that as anything that could prevent this from happening. Not necessarily run the updater itself.

this problem will be largely eliminated by OCaml 4.07.0's release to homebrew.

But won't this still impact folks using earlier version of OCaml and opam? OCaml and opam are separate packages in homebrew.

I currently have on a newly configured OSX + Homebrew instance (hours after home dir deletion)

  • OCaml 4.06.1
  • opam 1.2.2_4

see brew info

Is this combination still vulnerable to the issue?

If someone (like me) upgrades their OCaml and w/o upgrading opam, and they run opam won't it offer to "rebuild" opam and rm -rf / their machine?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment