Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure removing component. Ok, now what? --force does not help #1480

Open
gnzlbg opened this issue Aug 14, 2018 · 16 comments
Open

Failure removing component. Ok, now what? --force does not help #1480

gnzlbg opened this issue Aug 14, 2018 · 16 comments

Comments

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 14, 2018

error: failure removing component 'rustc-x86_64-apple-darwin', directory does not exist: '"share/man/man1/rustdoc.1"'

@thomashorrobin
Copy link

I'm also having this problem and it doesn't seem to be obvious what to do next or what's causing the issue

@thomashorrobin
Copy link

having done a bit more Googling I think this is the answer to this problem: #1167 (comment)

@kornelski
Copy link
Contributor

I've ran into this, too:

failure removing component 'rustc-x86_64-unknown-linux-gnu', directory does not exist: '"share/doc/rust/COPYRIGHT"'

I think it'd be sensible to treat absence of the directory to delete as a success.

@kornelski
Copy link
Contributor

Looking at the code:

  1. There's a race condition: https://github.com/rust-lang/rustup.rs/blob/a54051502ad49c7e10c029d869ddd98064a29e31/src/dist/component/transaction.rs#L267

  2. It'd be better to try to rename first, and handle not-found error. But the io::Error is not returned directly. https://github.com/rust-lang/rustup.rs/blob/a54051502ad49c7e10c029d869ddd98064a29e31/src/utils/utils.rs#L656 Is it normal to walk the chain in such cases? Or would it be better to modify the error type to include the io::ErrorKind?

  3. There's no way to do nothing. Would it be better to fake-delete a directory, or add NoOp change? Or change the function to return Option<ChangedItem>? https://github.com/rust-lang/rustup.rs/blob/a54051502ad49c7e10c029d869ddd98064a29e31/src/dist/component/transaction.rs#L185

@kinnison
Copy link
Contributor

kinnison commented Jun 3, 2019

@rbtcollins On the off chance you might have some of this in your head already -- What do you think? Could this resolve some more of our squiffy installs?

@rbtcollins
Copy link
Contributor

So, user solution here: I think the thing to do is to remove the toolchain and re-install: gets rid off all the detritus and gets a clean slate. Getting the selected component list and making that easy to re-select might be a convenience if there isn't a nice way already.

In terms of code, handling this specific point case. The code driving this doesn't know what to today with deleting a file that doesn't exist. In particular I don't think that we discriminate in the transaction system between delete-so-that-we-can-upgrade and delete-so-that-we-can-uninstall-a-component and deleteing-a-toolchain.

In the first case we want to leave the system working if interrupted; in the second the same but we never want that particular content back, and in the third there's whole trees we no longer want.

And the underlying problem here is a design problem: neither the transaction system nor the rustup update system have been built to be crash-expectant. Of course rust itself won't crash, but libc can, the OS itself can, and hardware can (e.g. power failures), not to mention of course virus scanners or people can willfully remove content. So for instance we uninstalling a toolchain could be about 5ms in duration: moving the entire tree to a place we don't want, forking a worker to delete it lazily and returning to the user immediately (and gcing that directory on future invocations too in case the worker is interrupted). Failing that we could parallelise the walking of subdirectories and eliminate syscall latencies and - well you see the point: the transaction system is beautiful but not well suited to all that we use it for.

Back to this specific case. (1) This isn't so much a race condition as a violated invariant: the contents of the disk don't match what rustup expects. 3) So how should we teach rustup to expect it. What will have the least side effects. I don't know offhand. remove_file appears to be only used in an uninstall case, and a removed file clearly cannot be rolled-back-from, but its also not a situation we created.

I think perhaps asking these questions may help: should we tell the user? Does the user need to know?

My hunch is that the user doesn't need to know when the overall operation is successful. We've converged on the desired state, and even though thats not the exact model we have today, I think its a very good model for installer software to have. If the overall operation doesn't work, given we have this rollback concept that is going to leave things not working, I think a message to the user would be valuable - but not one per file. Something like one per file to the log file (so verbose notifications - for diagnostics) and one single item at info level saying that there were missing files which mean that the system is incomplete after rollback. Or something like that.

And the remaining specific q's 2) with error_chain the io error is still preserved I believe, just needs to be unwrapped to get at it - whether its usual to walk to it or not I can't say - I'm still new here myself.

Hope this all helps,
Rob

@rbtcollins
Copy link
Contributor

Oh, I didn't answer (3) after all that - since all the uses of remove_file seem to be uninstall, and I can't see any reason that uninstall would abort when the file they are removing isn't there, I don't think there is any reason to change the signature, but yes by all means return a NothingChanged or some such.

@kornelski
Copy link
Contributor

kornelski commented Jun 4, 2019

Thanks for the answers.

With 1 I wasn't clear. I didn't mean the cause of this bug is a race condition. I mean that every use of path.exists() followed by another disk operation that depends on the (non)existence of the path is a race condition in general, because the path could be created or deleted after exists() but before the operation that follows it. For some software that may even be a security issue. Here that's unlikely, but still eliminating of exists() and checking errors from rename() will make it atomic and faster, since it avoids the extra disk access.

@rbtcollins
Copy link
Contributor

Oh sure; uhm yes - TOCTTOU bugs are rife :). I've been focusing on install performance, not uninstall (and thus not update) so far - the syscall traces from that were quite tragic :).

@carols10cents
Copy link
Member

carols10cents commented Dec 23, 2019

I think #1167 and #836 can be closed as duplicates of this issue (this one has the most detail)

I have admin rights so I could take care of this myself, but I'm not a regular maintainer here so I don't want to step on anyone's toes :) <3

@jawwadturabi
Copy link

any solution for this issue?

@kinnison
Copy link
Contributor

No fundamental solution no. In #1480 (comment) Robert said that the workaround is to remove and then reinstall the toolchain, because removal of a toolchain is a plain recursive delete rather than a component-aware transaction.

@jawwadturabi
Copy link

Removing toolchain method from #1480 (comment) works for me

@Luc108
Copy link

Luc108 commented Jun 21, 2021

Two easy, short term suggestions related to #2800 :

  • "file does not exist" instead of "directory does not exist" in the error message
  • When an update fails, we can give the user the suggestion to uninstall and reinstall. Giving the user details on how to do that, is very helpful.

@Luc108
Copy link

Luc108 commented Jun 21, 2021

And more details about #2800:

The file bin/cargo-clippy.exe was indeed missing. No idea how that's possible. Not done anything special or advanced. Only default install and maybe an update.

I uninstalled everything ("rustup self uninstall") and installed the same version again. This time the file was there.

@2giosangmitom
Copy link

I have same issue. I just deleted the .rustup directory, tried again and it worked :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants