Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaning up MainDom and other resources preventing clean shutdown/startup #6757

Merged
merged 23 commits into from Dec 5, 2019

Conversation

Shazwazza
Copy link
Contributor

@Shazwazza Shazwazza commented Oct 18, 2019

Based on some discussion here #6546

The gist is that we can be more robust in handling MainDom and shutdown logic with background task runners. We are currently preventing the app from shutting down in a fast manner which this will help resolve. We are not properly releasing resources.

Changes

  • Renames AsyncLock to SystemLock
  • The EventWaitHandle was never closed or disposed
    • This probably meant leaking handles or other reasources. In fact, on Umbraco Cloud we have cases where there are millions of handles leaked that we are still investigating. These handle types in the mem dump are Event and WaitCompletionPacket and when investigating on the server we get more details that the event is Event Type Manual Reset & Event is Set... there is a possibility that these handles are these EventWaitHandles (since we have yet to find the cause)
    • Not closing these handles could potentially lead to odd things with synchronization
  • Ensure MainDom is registered with HostingEnvironment when constructored, not just when acquired
    • MainDom was only registered with HostingEnvironment when it acquired MainDom which means it wasn't able to dispose of it's resources (like the EventWaitHandle) on shutdown if it wasn't acquired.
  • MainDom is now IDisposable
    • MainDom will now be disposed on hosting environment shutdown and when the asp.net application shutdown event is executed since I've heard cases of one or the other not executing in some shutdown scenarios.
  • MainDom & BackgroundTaskRunner not using IRegisteredObject correctly
    • Both of these were unregistering themselves from the HostingEnvironment as soon as Stop(false) was called, which is incorrect. It means that neither of these things would ever be listening for Stop(true) even though there is code to deal with the immediate == true value which would never be hit. In some cases, depending on how an app is shutdown, this may have led to some inconsistencies when winding down the app.
    • Here's a note on if IRegisteredObject.Stop is called: Umbraco Fails to Boot Repeatedly on Azure during auto Machine switchover #6546 (comment) ... it's never guaranteed to call the first with immediate==false but it will always call with immediate==true
  • BackgroundTaskRunner throwing exceptions when terminating because it is trying to set a result on a Task that is already completed
    • I have seen this debugging multiple times, it didn't happen all of the time, but is also related to the above. In some cases SetResult(0) was being called on an already completed task causing an exception to occur. Since this was happening during shutdown, this could have had other consequences causing the app to not unwind correctly.
  • BackgroundTaskRunner may not terminate if the call the Shutdown fails for any reason
    • If a call to Shutdown failed for any reason, the background runner would not terminate properly, this could have had other consequences causing the app to not unwind correctly.
  • MainDom.Signal logic is entirely executed within a lock instead of only partially
  • MainDom.Register eagerly exits and logs a warning if MainDom wasn't actually acquired
  • IRegisteredObject.Stop calls are made by a single aspnet thread which means each one is called sequentially, not in parallel so it is easy to slow down the shutdown sequence so BG task runners are updated now to shutdown on a threadpool thread if required so they can all shut down in parallel. This should speed up shutdown a lot. Further to this we are no longer very lenient with BG task runners shutting down, we will try to shut down as fast as possible instead of waiting for the hosting env.
  • PublishedSnapshotService is updated to track both files (media + content) instead of treating the boolean flags as one.

…unner was not using HostingEnvironment.Stop correctly, ensures no exception is thrown when setting task result on a background task, ensure multiple terminations don't occur, ensure terminations are done if shutdown fails.
@John-Blair
Copy link
Contributor

I agree with your changes above that I understand. Especially
Ensure MainDom is registered with HostingEnvironment when constructored
MainDom not using IRegisteredObject correctly - I did wonder about that.
W.r.t. BackgroundTaskRunner - the changes mentioned above do look sensible. I took a quick look at that and am not surprised there are issues with it - there's a lot of complexity in there that is poorly documented.

@jmayntzhusen
Copy link
Contributor

Hey @John-Blair

If you type in git remote -v you will get a list of your remote repos, my guess is that origin is pointing to your fork.

You may then have "upstream" set to the real version here, you can then do git fetch upstream to get the upstream changes and then get rebase upstream/v8/bugfix/6546-MainDom-Cleanup to rebase your own branch with the upstream one.

If no upstream is set you can run git remote add upstream https://github.com/umbraco/Umbraco-CMS.git

Hope this helps 🙂

@John-Blair
Copy link
Contributor

John-Blair commented Oct 18, 2019

Thanks for the help.
I finally figured it out - deleted my forked repo, re-forked from umbraco-cms, set the default branch on github.com on the fork to "v8/bugfix/6546-MainDom-Cleanup" and then cloned that in Visual Studio.

I made the original mistake of rebasing to the "v8/bugfix/6546-MainDom-Cleanup" branch on my original fork - thinking that was switching to the branch...that's git noob status for you!

@Shazwazza Shazwazza mentioned this pull request Oct 22, 2019
1 task
…ny latched tasks are canceled even with Stop(immediate == false) is executed.
…block the shutdown of other IRegisteredObjects
…terminate to ensure that all cancelation tokens are canceled and cleared
# Conflicts:
#	src/Umbraco.Core/MainDom.cs
#	src/Umbraco.Web/PublishedCache/NuCache/ContentStore.cs
#	src/Umbraco.Web/PublishedCache/NuCache/PublishedSnapshotService.cs
@Shazwazza Shazwazza changed the base branch from v8/8.1 to v8/dev December 4, 2019 05:10
@Shazwazza Shazwazza marked this pull request as ready for review December 4, 2019 05:29
…xception where necessary, adds logging to tests, fixes a boolean op
@bergmania bergmania merged commit f7e5e81 into v8/dev Dec 5, 2019
@bergmania bergmania deleted the v8/bugfix/6546-MainDom-Cleanup branch December 5, 2019 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants