New bootstrapper fails to run when performing an UpdateReplace #5238

Closed
squirmy opened this Issue Mar 11, 2016 · 6 comments

Projects

None yet

3 participants

@squirmy
squirmy commented Mar 11, 2016

WIX: 3.10.1.2213 (no customizations to burn)
Visual Studio 2015
Framework: .NET 4
OS: We have a mixed environment of Windows 8.1 and WIndows XP. The XP machines are about 12 years old and are much slower than our 8.1 machines. I have only noticed this happening on the XP machines. Unfortunately these machines are around to stay for a little while yet.

When performing an update by planning an UpdateReplace, I get an intermittent issue where the new bootstrapper fails to run and I get the following lines logged. The software is being installed on close to 4000 machines, so an intermittent issue creates a large problem for us.

[0C28:0C2C][2016-03-10T09:28:10]e000: Error 0x80070002: Failed to open handle to engine process path: C:\Documents and Settings\Administrator\Local Settings\Application Data\Package Cache\{2967b71d-245d-4f8d-aefb-6056f62a18f7}\RWWA.Terminal.exe
[0C28:0C2C][2016-03-10T09:28:10]e000: Error 0x80070002: Failed to initialize engine section.
[0C28:0C2C][2016-03-10T09:28:10]e000: Error 0x80070002: Failed to initialize engine state.

I have put the contents of the old bundle log file in this gist: https://gist.github.com/squirmy/aded8fa6361702321136

There is no log for the new bundle as it cannot be run.

@squirmy
squirmy commented Mar 14, 2016

Some more info, as I just watched last week's online meeting. Well, i skipped most of it and started listening in on the pull request for #5234.

It was mentioned that the bundle is moved to temp and marked for deletion on restart if it cannot be deleted. We had a file in temp, and when it was given a .exe extension and run, it was the bundle. So I assume this happened.

Also it was mentioned that the new bundle only has 5 seconds to start running otherwise there is a race condition. These machines are dog slow at times, it's possible that this could be where things are going wrong for us.

For now, we're thinking about trying to work around this in the ApplyComplete callback, where we would wait until the new bundle process has spawned before exiting. If it doesn't detect a process within say 30 seconds, then we would plan the UpdateReplace again, retrying a couple of times before bailing out so we don't get an infinite loop.

@rseanhall
Member

Yes, it appears that you are running into the race condition. The ApplyComplete callback is too late, you want the ExecutePackageComplete callback.

We'll probably have to pass file handles to the child processes, which means we probably can't fix this problem retroactively.

@squirmy
squirmy commented Mar 14, 2016

Ok thanks for the tip.

Out of interest though, why is it too late? I have had success in the past planning an UpdateReplace from ApplyComplete. This is the point in which the bootstrapper will exit, and i want to avoid that if the new process hasn't started yet. Just seemed like the logical place for it.

@rseanhall
Member

From your log:

[0AE4:0AE8][2016-03-10T09:27:59]i301: Applying execute package: {2967b71d-245d-4f8d-aefb-6056f62a18f7}, action: Install, path: C:\Documents and Settings\Administrator\Local Settings\Application Data\Package Cache\{2967b71d-245d-4f8d-aefb-6056f62a18f7}\your.exe, arguments: '"C:\Documents and Settings\Administrator\Local Settings\Application Data\Package Cache\{2967b71d-245d-4f8d-aefb-6056f62a18f7}\your.exe"  /passive LAUNCHAPP=1 "/cancel=<snip> /noupdate" -burn.related.update'
[0AE4:0AE8][2016-03-10T09:28:08]i000: ExecuteProgress is 100
[0AE4:0AE8][2016-03-10T09:28:08]i000: Set Progress to 100
[0AE4:0AE8][2016-03-10T09:28:08]i319: Applied execute package: {2967b71d-245d-4f8d-aefb-6056f62a18f7}, result: 0x0, restart: None
[0AE4:0AE8][2016-03-10T09:28:08]i351: Removing cached package: {2967b71d-245d-4f8d-aefb-6056f62a18f7}, from path: C:\Documents and Settings\Administrator\Local Settings\Application Data\Package Cache\{2967b71d-245d-4f8d-aefb-6056f62a18f7}\
[0AE4:0AE8][2016-03-10T09:28:09]i000: Automatically closing the window for non-interactive install
[0AE4:0AE8][2016-03-10T09:28:09]i000: Set State to Applied
[0AE4:0AE8][2016-03-10T09:28:09]i399: Apply complete, result: 0x0, restart: None, ba requested restart:  No

The deletion happens in the Removing cached package line, which is between the ExecutePackageComplete and ApplyComplete callback.

@squirmy
squirmy commented Mar 14, 2016

Oh, you're suggesting that I can delay the deletion of the file by waiting for the new process in the ExecutePackageComplete callback?

If so, that's likely to be a better solution.

@squirmy
squirmy commented Mar 16, 2016

Worked a treat. Here's the code which waits for the new process to start. It doesn't bother to elevate for UAC because well, we only have the problem on XP where UAC doesn't exist. So if we get a permission denied exception we just assume that it has started.

private bool EnsureUpdateBundleProcessStarted()
{
    try
    {
        var hasStarted = new ManualResetEventSlim();
        var currentProcess = Process.GetCurrentProcess();
        var ownPid = currentProcess.Id.ToString();
        var name = currentProcess.MainModule.ModuleName;

        using (var watcher = new ManagementEventWatcher(new WqlEventQuery($"SELECT * FROM Win32_ProcessStartTrace WHERE ProcessName='{name}' AND ProcessId != '{ownPid}'")))
        {
            watcher.EventArrived += (o, e) => hasStarted.Set();
            watcher.Start();

            using (var searcher = new ManagementObjectSearcher(@"\\.\root\CIMV2", $"SELECT * FROM Win32_Process WHERE Name='{name}' AND ProcessId != '{ownPid}'"))
            using (var processes = searcher.Get())
            {
                if (processes.Count > 0)
                    hasStarted.Set();
            }

            Model.Engine.Log(LogLevel.Standard, "Ensuring the update bundle process has started");

            var started = hasStarted.Wait(30000);

            Model.Engine.Log(LogLevel.Standard,
                             started
                                 ? "The update bundle process has started"
                                 : "Timed out waiting for the update bundle process to start");

            return started;
        }
    }
    catch (Exception e)
    {
        // The above requires admin priviledges.
        // We have this in our environment so for now we'll just do the easy thing and assume all is ok if it fails for any reason.
        Model.Engine.Log(LogLevel.Error,
                         $"An error occured while attempting to ensure the update bundle process started, assuming the process has started. Exception: {e.Message}");
        return true;
    }
}
@barnson barnson added bug burn labels Mar 22, 2016
@barnson barnson added this to the v3.11 milestone Mar 22, 2016
@rseanhall rseanhall was assigned by barnson Mar 22, 2016
@rseanhall rseanhall closed this Apr 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment