fresh install fails on Windows: "error: could not rename component file from '... \rust-docs\share/doc/rust/html' ..." #1912

Closed
s-n-ushakov opened this issue Jun 25, 2019 · 145 comments

Comments

@s-n-ushakov

Problem
A fresh install of the stable build fails on Windows 8.1 with the message "error: could not rename component file from '... \rust-docs\share/doc/rust/html' ...".
The expected behavior is a successful install :)

Steps

  1. Check Windows version is 8.1 (not checked for other versions)
  2. Ensure %USERPROFILE%\.rustup and %USERPROFILE%\.cargo folders do not exist
  3. Download the latest stable rustup-init.exe file (1.35.0 of 2019-05-23) and run it as a normal (non-admin) user, without elevating user rights.
  4. Respond with "1" to the question regarding "Current installation options"
  5. Wait until the install fails with message "error: could not rename component file from '... \rust-docs\share/doc/rust/html' ..."
C:\Users\ushakov\Documents\-work\Downloads\www.rust-lang.org>rustup-init.exe

Welcome to Rust!

This will download and install the official compiler for the Rust programming
language, and its package manager, Cargo.

It will add the cargo, rustc, rustup and other commands to Cargo's bin
directory, located at:

  C:\Users\ushakov\.cargo\bin

This path will then be added to your PATH environment variable by modifying the
HKEY_CURRENT_USER/Environment/PATH registry key.

You can uninstall at any time with rustup self uninstall and these changes will
be reverted.

Current installation options:

   default host triple: x86_64-pc-windows-msvc
     default toolchain: stable
  modify PATH variable: yes

1) Proceed with installation (default)
2) Customize installation
3) Cancel installation
>1

info: syncing channel updates for 'stable-x86_64-pc-windows-msvc'
info: latest update on 2019-05-23, rust version 1.35.0 (3c235d560 2019-05-20)
info: downloading component 'rustc'
 60.0 MiB /  60.0 MiB (100 %)   8.0 MiB/s in  9s ETA:  0s
info: downloading component 'rust-std'
 53.1 MiB /  53.1 MiB (100 %)   8.3 MiB/s in  7s ETA:  0s
info: downloading component 'cargo'
info: downloading component 'rust-docs'
 10.3 MiB /  10.3 MiB (100 %)   9.3 MiB/s in  1s ETA:  0s
info: installing component 'rustc'
 60.0 MiB /  60.0 MiB (100 %)   7.7 MiB/s in  8s ETA:  0s
info: installing component 'rust-std'
 53.1 MiB /  53.1 MiB (100 %)   7.3 MiB/s in 15s ETA:  0s
info: installing component 'cargo'
  2.9 MiB /   2.9 MiB (100 %)   2.1 MiB/s in  6s ETA:  0s
info: installing component 'rust-docs'
 10.3 MiB /  10.3 MiB (100 %) 473.6 KiB/s in  1m 35s ETA:  0s
info: rolling back changes
error: could not rename component file from 'C:\Users\ushakov\.rustup\tmp\grl2pj2s61kwqrko_dir\rust-docs\share/doc/rust/html' to 'C:\Users\ushakov\.rustup\toolchains\stable-x86_64-pc-windows-msvc\share/doc/rust/html'
info: caused by: Access is denied. (os error 5)

Press the Enter key to continue.

Possible Solution(s)
A possible temporary workaround is to run the installer with admin privileges, as per https://stackoverflow.com/questions/52542965/rust-installation-fails-on-windows-subsystem-for-linux-could-not-rename-compone/55373522#55373522 ; that worked for me.

Notes

Output of rustup --version:
rustup 1.18.3 (435397f48 2019-05-22)

Output of rustup show:

Default host: x86_64-pc-windows-msvc

stable-x86_64-pc-windows-msvc (default)
rustc 1.35.0 (3c235d560 2019-05-20)
@rbtcollins
Contributor

Thank you for generating this report. It does look to me like a probable anti-virus problem.

You mentioned in 1723 that you have procmon traces. Have a look in those at the time of the error: are there any other processes with handles open anywhere in that subdirectory? I'm happy to look through if you can upload the PML (as a zip/tar) to s3 or seed it as a torrent or some such...

@s-n-ushakov
Author

@rbtcollins I have placed the zipped PML log (11M) here: http://www.usn.pp.ru/tmp/rustup-init--procmon.2019-06-24.zip . Unfortunately could not find anything suspicious so far... Please let me know whether the log is good when your time allows...

@rbtcollins
Contributor

The log doesn't have enough info in it (or it's not what I think it is, or Defender's implementation has changed radically from Windows 8 to 10).

So this is the event that fails:

5:03:57.3318549 PM	0.0137171	rustup-init.exe	12352	22804	SetRenameInformationFile	C:\Users\ushakov\.rustup\tmp\grl2pj2s61kwqrko_dir\rust-docs\share\doc\rust\html	ACCESS DENIED	ReplaceIfExists: True, FileName: C:\Users\ushakov\.rustup\toolchains\stable-x86_64-pc-windows-msvc\share\doc\rust\html

But there are no events in the trace from other processes than rustup-init.exe - I suspect you've got a filter on the process name.

Check that you aren't filtering for just rustup-init.exe in the filter dialog when you make the trace - something like "Path contains .rustup" would be a good filter - the default filters to exclude procmon.exe itself and so on are fine, but we do want to see activity from other processes within this directory structure.

After you've done the trace and saved the file, you can then edit the filter and add a filter to exclude events from rustup-init.exe - I'd expect that you'd then see Defender, or whatever is causing the issue, operating on those files.

I would find the event in the first place by adding a filter on result for 'ACCESS DENIED' and then using the microsecond timestamp to find the event in context when that filter is disabled (or clicking on the event, then removing the filter may keep you on it - the UI can be a bit flaky though so...).

@s-n-ushakov
Author

@rbtcollins I have finally managed to guess the time values to filter out all the events within a 2-second timeframe around the failure event :) :

image

All the excludes are the standard ones suggested by procmon, except the one of mine that restricts the time.

Please find the resulting zipped PML log (0.5M) here: http://www.usn.pp.ru/tmp/rustup-init--procmon.2019-06-24--two-seconds-around.zip

@rbtcollins
Contributor

Ok, so that's super interesting - definitely other processes in the trace, but none operating on .rustup files.

I suspect we need to get into kernel tracing at this point - ETW / WPR https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-8.1-and-8/hh448205(v=win.10) - basically looking for the cause of the access denied.

Before doing that, please try building and running master - there's some unpacking IO changes that might make this better - or worse - and would allow me to make the following statement:

We make that dir with the same code for both the binaries and docs, so either both should succeed or both should fail, which is why the third-party-process holding a handle open is the most likely scenario.

@s-n-ushakov
Author

@rbtcollins Ok, I'll see if I succeed with building the master on the weekend. Meanwhile could you describe your ideas on further troubleshooting in some more detail? What the next steps may look like, and how to proceed?

@rbtcollins
Contributor

@s-n-ushakov basically we're at the edge of my ready-to-use knowledge, so I'm going to be vague. But something like: enable kernel event tracing on all file operations, run the failing scenario, disable event tracing, look through the trace which hopefully will have much more detail than procmon, because it wires into the kernel itself rather than purely userspace.

@EugeneChung

I tried to install Rust via Chocolatey and got a similar error.

...\rust-docs\share\doc\rust\html' is denied.

After I turned off my antivirus program, it was successful.

@mpoullet

mpoullet commented Jul 7, 2019

I have the same issue here with the Avira anti-virus. Turning off the "Real-Time Protection" before running rustup-init.exe makes it work.

@ctaggart

This failed about 5 times for me and then I found this thread. It worked after I deselected the box in the McAfee Access Protection Properties to "Prevent McAfee services from being stopped" and then stopped the "McAfee McShield" service. It is also known as the "McAfee On-Access Scanner service".

This was the error:

info: installing component 'rust-docs'
 11.3 MiB /  11.3 MiB (100 %) 355.2 KiB/s in  1m 28s ETA:  0s
info: rolling back changes
error: could not rename component file from 'C:\Users\taggac\.rustup\tmp\l2_bwq4ke00qaw9i_dir\rust-docs\share/doc/rust/html' to 'C:\Users\taggac\.rustup\toolchains\beta-x86_64-pc-windows-gnu\share/doc/rust/html'
info: caused by: Access is denied. (os error 5)


@s-n-ushakov
Author

@rbtcollins I have to confess that I have not managed to do the ETW / WPR exercise so far, as my HDD with Windows died. Sorry... Now I am busy with a gradual migration to Ubuntu, so it is not very likely that I will be able to add more to this investigation, at least in the near future. But it is good to know that numerous colleagues here are reporting that the problem correlates with antivirus software...

@ChenotConsulting

Hi.
I confirm the same as the above. After turning off Avira antivirus it worked.

@kwaegel

kwaegel commented Aug 15, 2019

Same. Turning off the McAfee On-Access Scanner fixed the Access is denied. (os error 5) message.

@kinnison
Contributor

Badly behaved scan-on-write virus scanners are becoming a great pain to me. I wonder if there's any way we can detect and mitigate this.

@pitaj

pitaj commented Aug 21, 2019

If I could disable my anti-virus temporarily I'm sure it would fix this issue. However, at work I am unable to disable the anti-virus program. This makes it impossible for me to use rustup update.

Is it possible to make it so rustup won't give up after the very first attempt at moving? Or give us an option to, instead of moving the files, try copying to destination and deleting from source instead?
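A minimal sketch of the first idea above (retrying instead of giving up on the first failure), using only the standard library. The helper names `retry_on_denied` and `rename_with_retry`, the attempt count, and the delay are all assumptions made for illustration; this is not rustup's actual code.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Run `op`, retrying while it fails with `PermissionDenied`.
/// Any other error, or running out of attempts, returns the last result.
/// `max_attempts` and `delay` are illustrative values, not rustup's.
pub fn retry_on_denied<T>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut attempt = 1;
    loop {
        match op() {
            Err(e) if e.kind() == io::ErrorKind::PermissionDenied && attempt < max_attempts => {
                // Something else (for example a scanner) may still hold a handle:
                // wait a little and try again.
                attempt += 1;
                thread::sleep(delay);
            }
            other => return other,
        }
    }
}

/// Hypothetical wrapper: rename `src` to `dest`, retrying on "access denied".
pub fn rename_with_retry(src: &Path, dest: &Path) -> io::Result<()> {
    retry_on_denied(10, Duration::from_millis(50), || fs::rename(src, dest))
}
```

Keeping the retry logic generic over the operation means it can be exercised with a fake operation, without needing an antivirus product to reproduce the failure.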

@pitaj

pitaj commented Aug 21, 2019

Update: this was fixed for me by building from latest master (d62e504)

I think this issue is actually fixed by the same fix for #1870

@kinnison
Contributor

Handy to know, thanks @pitaj -- I hope to get a release done soonish that will therefore solve this a bit.

@refaelsh

I have the same problem. Corporate laptop with McAfee that cannot be turned off. I need this release in order to start learning Rust. Please hurry :-)

@refaelsh

@pitaj Can you please share the binary you've built from the latest master? I can't install Rust because of this issue, and I can't build the latest master because I don't have Rust.

@refaelsh

Found a way: downloaded the latest master directly from the CI/CD server: here. I can now start learning rust :-)

@esbgo97

esbgo97 commented Sep 2, 2019

Hello, I fixed it by deactivating the antivirus during installation.

@refaelsh

refaelsh commented Sep 2, 2019

@esbgo97, most corporate laptops do not allow turning off the Antivirus.

@RReverser

I keep having the same lately on corporate Windows 10. There is no 3rd-party antivirus though, just built-in Windows Defender.

@RReverser

Found a way: downloaded the latest master directly from the CI/CD server: here. I can now start learning rust :-)

Yay, I can confirm that nuking .rustup folder and re-installing using the latest build from CI worked.

@vnermolaev

vnermolaev commented Sep 9, 2019

Yay, I can confirm that nuking .rustup folder and re-installing using the latest build from CI worked.

Unfortunately it does not work for me, I am still getting the following

info: installing component 'rust-docs'
 11.3 MiB /  11.3 MiB (100 %) 252.8 KiB/s in  2m 21s ETA:  0s
info: retrying renaming 'C:\Users\---\.rustup\tmp\fry3x4mosy837ftc_dir\rust-docs\share/doc/rust/html' to 'C:\Users\---\.rustup\toolchains\stable-x86_64-pc-windows-msvc\share/doc/rust/html'

I have a corporate laptop with McAfee enabled, and it is preposterous that docs are blocked from being moved (hey, if it was a virus, it would already have infected my computer).

@kinnison
Contributor

kinnison commented Sep 9, 2019

It is a big pain, I agree. McAfee is among the worst offenders :(

@pitaj

pitaj commented Sep 9, 2019

If it says "retry renaming" instead of failing then the fix is working.

@norru

norru commented Sep 10, 2019

To me it says "retry renaming" for a while and then it fails on rustup 1.19.0 (2af131cf9 2019-09-08), Windows 10.

info: rolling back changes
error: could not rename component file from 'C:\Users\gborruni\.rustup\tmp\akm490eg7dpm6pbd_dir\rust-docs\share/doc/rust/html' to 'C:\Users\gborruni\.rustup\toolchains\stable-x86_64-pc-windows-msvc\share/doc/rust/html'
info: caused by: permission denied

Corporate McAfee enabled, cannot disable.

Issue is not restricted to fresh install. I did a rustup update before nuking .rustup and got the same error.

@rbtcollins
Contributor

@ChrisGreenaway success! cool; permission denied on the job object, interesting - I'd like to dig into that deeper at some point, at least I have something to go on. If you could poke around and see what permissions you have / need on the default job object in your win10 user session, that would be useful. You might find the effective-limits.rs test suite to be useful too.

@ChrisDenton Yes, most of the time you'd need to be unlucky to race on a file, but it's not safe-safe, though we could of course retry on that too. And yes, that behaviour is exactly the problem we're working around. I'm not sure of the distinction you're drawing here between yield and sleep: we're giving up the CPU and trying each time. But McAfee is multi-threaded; there is often no gap between files:
fileA opened
fileA read
fileB opened
fileA closed
fileB read

etc
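To make the interleaving above concrete, here is a small sketch that counts the "idle windows" in a trace of open/close events, meaning the points where no handle is open and a rename could slip in. The `Event` type and `idle_windows` function are hypothetical names invented for this illustration; the model ignores reads and real timing.

```rust
/// One event in a hypothetical scanner trace (reads are ignored here).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Event {
    Open,
    Close,
}

/// Count the points inside the trace where the number of open handles
/// drops to zero while more events are still to come.
pub fn idle_windows(trace: &[Event]) -> usize {
    let mut open_handles = 0usize;
    let mut windows = 0;
    for (i, event) in trace.iter().enumerate() {
        match event {
            Event::Open => open_handles += 1,
            Event::Close => {
                open_handles = open_handles.saturating_sub(1);
                // The final close ends the trace; it is not a gap *between* files.
                if open_handles == 0 && i + 1 < trace.len() {
                    windows += 1;
                }
            }
        }
    }
    windows
}
```

With overlapping handles, as in the fileA/fileB pattern above, the count stays at zero; a strictly sequential scanner leaves one window between each pair of files.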

@rbtcollins
Contributor

@ChrisGreenaway, oh, with 26 retries you should be able to install the entire toolchain at once, no need for profile minimal.

@ChrisDenton
Member

ChrisDenton commented May 28, 2020

@rbtcollins I mean there should literally be some time after a file handle is closed and before another is opened. Even if they have a tight loop, it takes time for the OS to check arguments, do security checks, etc. Even in multithreaded code you can get lucky. My theory is that it may be possible to slot in between an open and a close. A very rough example would be:

let mut result = Ok(());
// Number of iterations is probably way too large
for _ in 0..100_000 {
    result = fs::rename(src, dest);
    if let Err(e) = &result {
        if e.kind() == io::ErrorKind::PermissionDenied {
            // yield this thread's timeslice back to the OS
            // so it can run other threads on the same CPU core.
            std::thread::yield_now();
            continue;
        }
    }
    break;
}
result.chain_err(|| ErrorKind::RenamingFile {
    name,
    src: PathBuf::from(src),
    dest: PathBuf::from(dest),
})

@ChrisGreenaway

@rbtcollins - I got that permission denied on the job object error even when running as a machine administrator.

You asked me to "see what permissions [I] have / need on the default job object in [my] win10 user session". How would I do that?

@rbtcollins
Contributor

@ChrisDenton oh, yield_now, which is more-or-less sleep(0), except that there are some guarantees about not causing processor migrations on Windows. No, that is not of any help at all, because McAfee is holding these locks because it is running concurrently with Rust.

The least number of hardware threads in a laptop you can feasibly buy today is 4 I think; possibly 2. The traces we have show clearly that multiple McAfee threads are holding handles open across each other's opens and closes with no windows between them, as I sketched in my last comment. Allowing another 20ms of McAfee time (which is more or less what yield_now will give, as it is backed by SwitchToThread and desktop Windows scheduler granularity) is basically what we do now, except that by not backing off, you're just probing more often in a wait-loop.
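As an illustration of what "backing off" means here, compared with probing in a tight loop, the sketch below computes a capped exponential delay schedule. The function name `backoff_delays`, the doubling factor, and the sample values are assumptions; the thread does not show the schedule rustup actually uses.

```rust
use std::time::Duration;

/// Delays for a capped exponential backoff: base, 2*base, 4*base, ...
/// with each delay clamped to `cap`. Values are illustrative only.
pub fn backoff_delays(base: Duration, cap: Duration, attempts: u32) -> Vec<Duration> {
    (0..attempts)
        .map(|i| {
            // Saturate instead of overflowing for large attempt numbers.
            let factor = 2u32.saturating_pow(i);
            base.checked_mul(factor).map_or(cap, |d| d.min(cap))
        })
        .collect()
}
```

For example, a 10 ms base with a 100 ms cap over 6 attempts waits 350 ms in total across only 6 probes, whereas a yield-based loop would probe far more often over the same period.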

But by all means test it ! what we need is empirical data here :). Like @ben-spiller's extraction test, doing experiments is valuable and useful - but please do post the patch or code so we can make sure we all understand what was tested ;)

@rbtcollins
Contributor

@ChrisGreenaway process explorer perhaps? I'm not 100% sure, job objects are still a new API for me :).

Basically we're calling https://docs.microsoft.com/en-us/windows/win32/api/jobapi2/nf-jobapi2-queryinformationjobobject like so:
QueryInformationJobObject(
    NULL,
    JobObjectExtendedLimitInformation,
    ...
)

which should get the default job object, if any. The API says we have to have JOB_OBJECT_QUERY on the handle, but we're using the implied handle. https://docs.microsoft.com/en-us/windows/win32/procthread/job-object-security-and-access-rights documents those, but it's not clear how we wouldn't have access to read extendedlimitinformation on the associated job object... unless perhaps there's some corporate setup restriction that is ultra limited. Perhaps we can open the job manually with appropriate permissions. Anyhow, basically there is an investigation we need to do...

@ChrisDenton
Member

Ok, I went away and did some tests. I don't have a McAfee so I made my own. You can find my code here.

It seems to work but there could of course be other factors at play that I'm missing (or my AV simulator is broken in some way). I've made a rustup fork (rename_dir branch) for real world testing if anybody is willing. The only changes are to utils.rs. The idea is to catch a moment where all files have been closed but none are yet open. This may be a rare occurrence but, with the vagaries of both threads and file I/O, the stars should hopefully align eventually. It could go without the yield_now at all but I'd worry about the chance of starving a process that needs to make forward progress to release a handle. Though admittedly this is very unlikely with a multi-core system and a scheduler that can move threads between cores.

To be clear, I'm not saying this will definitely work or is the best idea. But I am interested in file handling in Windows and would at least like to rule out this possibility.

@ChrisGreenaway

@ChrisDenton - I ran component add rust-docs using your fork. It succeeded. However, it took 1 minute and 20 seconds, so I suspect that it just succeeds after McAfee has finished.

@ChrisGreenaway

@rbtcollins - is there a test you could add to effective-limits to check the job object? Then I could clone the project and run cargo test. I just tried that and all the tests pass at the moment.

@ChrisDenton
Member

@ChrisGreenaway Fair enough. Thanks for testing.

@kinnison
Contributor

We've merged an update to the retry count in readiness for 1.22 so hopefully that'll mitigate things as per the discussion above, but I won't close this as it's just papering over the cracks still, until we can resolve things properly.

@rbtcollins
Contributor

rbtcollins commented May 31, 2020

@ChrisDenton I think your simulator is faulty, sorry. You're dropping the opened files immediately in your simulated av action loops. So they aren't open for x number of ns, they are open for the time it takes to do the syscall. This means it only supports the idea of being able to zip in if and only if the activity was limited to open+close with no reads and no delays in time between them, neither of which are true.

For higher fidelity I think you need

loop {
    _file = open()
    sleep(65ns)
    drop(_file)
    sleep(random_inter_file_thread_gap)
}
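The pseudocode above could be sketched in Rust roughly as follows. This is an assumption-laden illustration, not the simulator from the thread: `simulate_scanner` is a hypothetical name, the gap is a fixed parameter rather than a random one (the standard library has no random number generator), and the sketch only models timing. Whether an open handle actually blocks a rename depends on the platform and the share mode used, which is not modelled here.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Simulate one scanner thread: open `path`, keep the handle open for `hold`,
/// close it, then pause for `gap` before the next pass.
/// Returns the number of passes completed.
pub fn simulate_scanner(
    path: &Path,
    hold: Duration,
    gap: Duration,
    passes: u32,
) -> io::Result<u32> {
    let mut done = 0;
    for _ in 0..passes {
        let file = File::open(path)?; // the handle is held from here...
        thread::sleep(hold);
        drop(file); // ...until here
        thread::sleep(gap); // fixed stand-in for the random inter-file gap
        done += 1;
    }
    Ok(done)
}
```

Running several of these on separate threads with different `gap` values would approximate the overlapping handles described earlier in the thread.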

@rckhpe

rckhpe commented Jun 8, 2020

I'm getting bit by this bug as well. Corporate laptop, strict policy with McAfee antivirus.

@vnnv01

vnnv01 commented Jun 25, 2020

I am also getting bit by this bug. Since it is a corporate laptop I cannot remove McAfee. Is there a way we can make the installation wait longer so that McAfee completes its check?

@kinnison
Contributor

We hope to release a new rustup soon which will help a little; but there's a bug we need to get to the bottom of first, which is causing the problems mentioned above around QueryInformationJobObject.

@ChrisGreenaway

The latest version of rustup (out yesterday) has this change in.

@rckoepke

rckoepke commented Jul 7, 2020

Just tried it again now. The installation seems to work great. Although it temporarily hangs, looping on
retrying renaming 'C:\Users\username\.rustup\tmp\utq1gp9blz83pn2s_dire\rust-docs\share/doc/rust/html' to 'C:\Users\username\.rustup\toolchains\stable-x86_64-pc-windows-msvc\share/doc/rust/html'

for several dozen retries, it does succeed in installing now.

Thank you so much, you guys are fantastic!

@kinnison
Contributor

kinnison commented Jul 7, 2020

Wonderful, thank you for checking for us. I'll close this now then.

@kinnison kinnison closed this as completed Jul 7, 2020
@ChrisGreenaway

I thought the change that had been merged was an interim fix. @kinnison said it was just papering over the cracks. Presumably, on different hardware, the timeout may need to be different, although we could just wait to see if anyone encounters the issue.

If the current fix is considered acceptable, then I wonder whether the repeated logging of retrying renaming that @rckoepke mentioned should be suppressed.

I had to run rustup update a couple of times to make everything work - I think because rustup only updated itself after trying to update the version of Rust. The first run failed to install the latest Rust but installed the latest rustup. The second run then succeeded on updating Rust. This isn't particularly surprising, I suppose, but worth mentioning in case others encounter this issue. Although I would have thought it would make more sense to update rustup before anything else.

@kinnison
Contributor

kinnison commented Jul 7, 2020

you're quite right, I was overzealous there.

@kinnison kinnison reopened this Jul 7, 2020
@rbtcollins
Contributor

Addressing #2417 may provide a permanent solution to this as well.

workingjubilee pushed a commit to workingjubilee/rustup that referenced this issue Aug 10, 2020
Per rust-lang#1912 McAfee users are still seeing contention on directory renames.
Give McAfee more settle time.
@workingjubilee
Member

@rustbot label: +O-windows

@rustbot rustbot added the O-windows Windows related label Apr 29, 2021
@rbtcollins
Contributor

I'm closing this as the symptom is well resolved. I'm going to file a new bug about possible future improvements.
