Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing/Broken Msiexec Circumventing CPU Stall Checks #579

Open
nixuno opened this issue Jan 23, 2024 · 1 comment
Open

Missing/Broken Msiexec Circumventing CPU Stall Checks #579

nixuno opened this issue Jan 23, 2024 · 1 comment

Comments

@nixuno
Copy link
Contributor

nixuno commented Jan 23, 2024

While attempting to use BCU to uninstall a "broken" Zoom msiexec install (the uninstall media was missing), I noted that msiexec sometimes will be using 10%+ CPU when it is doing nothing but sitting at this screen:

image

Currently the UninstallThread() method in BulkUninstallEntry.cs is waiting for 30 consecutive true returns from TestUninstallerForStalls() saying that the process is using less than 1% CPU and 10KB of IO, however in my test case, the process will infinitely run, as msiexec is running at 10%+ CPU but with 0 I/O. I have two solutions to this that I have tested successfully locally:

The easy fix

Simply make the CPU and IO checks an or instead of and:

Current:


Proposed:

if (c0 <= 1 || c1 <= 10240)

While I do not have solid data to back this up, my assumption would be that if an uninstaller does not perform any IO work or any CPU work in 30 seconds then it is likely stalled. I am open to suggestions/corrections here as this is simply an assumption.

The possibly more robust fix

Add a new property and constructor parameter to RunUninstallerOptions to allow for an optional "timeout":

Current:

internal sealed class RunUninstallerOptions
{
public RunUninstallerOptions(bool autoKillStuckQuiet, bool retryFailedQuiet, bool preferQuiet, bool simulate, BulkUninstallTask owner)
{
AutoKillStuckQuiet = autoKillStuckQuiet;
RetryFailedQuiet = retryFailedQuiet;
PreferQuiet = preferQuiet;
Simulate = simulate;
Owner = owner;
}
public bool AutoKillStuckQuiet { get; }
public bool PreferQuiet { get; }
public bool RetryFailedQuiet { get; }
public bool Simulate { get; }
public BulkUninstallTask Owner { get; }
}

Proposed:

internal sealed class RunUninstallerOptions 
 { 
     public RunUninstallerOptions(
bool autoKillStuckQuiet, bool retryFailedQuiet, bool preferQuiet, bool simulate, BulkUninstallTask owner,
/****New timeout property****/int timeout = -1) 
     { 
         AutoKillStuckQuiet = autoKillStuckQuiet; 
         RetryFailedQuiet = retryFailedQuiet; 
         PreferQuiet = preferQuiet; 
         Simulate = simulate; 
         Owner = owner;
         // ***New timeout property***
         Timeout = timeout;
     } 
  
     public bool AutoKillStuckQuiet { get; } 
  
     public bool PreferQuiet { get; } 
  
     public bool RetryFailedQuiet { get; } 
  
     public bool Simulate { get; } 
  
     public BulkUninstallTask Owner { get; } 

    // ***New timeout property***
    public int Timeout { get; }

 } 

Use this timeout functionality where required to kill the uninstall process even if it is not being marked as "stalled". For instance:

Current:

var idleCounter = 0;
while (true)

// Kill the uninstaller (and children) if they were idle/stalled for too long
if (idleCounter > 30)
{
KillProcesses(watchedProcesses);
throw new IOException(Localisation.UninstallError_UninstallerTimedOut);

Proposed:

var idleCounter = 0;
Stopwatch watch = new Stopwatch();
if(options.Timeout > 0) { watch.Start(); }
while (true)
 // Kill the uninstaller (and children) if they were idle/stalled for too long 
 if (idleCounter > 30 || watch.ElapsedMilliseconds >= options.Timeout) 
 { 
     watch.Stop();
     KillProcesses(watchedProcesses); 
     throw new IOException(Localisation.UninstallError_UninstallerTimedOut);
// ... after the while loop
watch.Stop();

This would allow for all functionality to remain the same unless something explicitly implements the timeout.

I would be using this in my current fork to implement junk cleanup in BCU-console. If you'd like a PR for this, I'll be happy to put one together.

@Klocman
Copy link
Owner

Klocman commented Feb 23, 2024

I would advise against a timeout since some systems may have very slow drives and larger applications could take tens of minutes to remove.

A better option I think would be something similar to what you've mentioned - to have a separate, longer timeout if there is no I/O activity.

An even better way may be to keep track of CPU and I/O usage, and if they stay on roughly the same values for a very long time to treat it as a stall. It would also deal with the uninstaller continuously checking for some file, generating I/O that does nothing. This might require a condition that the I/O is below some threshold however to avoid false positives with large software packages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants