The process cannot access the file NuCache.Content.db because it is being used by another process #5035

Closed
robertjf opened this issue Mar 21, 2019 · 103 comments

Comments

@robertjf
Contributor

commented Mar 21, 2019

PR: #5924

Something is happening in Azure WebApps where the NuCache.Content.db file is locked causing the site to hang. I've attached the log file for reference.

The exception is as follows:

Umbraco.Core.Exceptions.BootFailedException: Boot failed: Umbraco cannot run. See Umbraco's log file for more details.

-> Umbraco.Core.Exceptions.BootFailedException: Boot failed.

-> System.IO.IOException: The process cannot access the file 'D:\home\site\wwwroot\App_Data\TEMP\NuCache\NuCache.Content.db' because it is being used by another process.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at CSharpTest.Net.IO.TransactedCompoundFile..ctor(Options options)
   at CSharpTest.Net.Storage.BTreeFileStoreV2..ctor(Options options)
   at CSharpTest.Net.Collections.BPlusTree`2.OptionsV2.CreateStorage()
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheBase..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2.NodeCacheNormal..ctor(BPlusTreeOptions`2 options)
   at CSharpTest.Net.Collections.BPlusTree`2..ctor(BPlusTreeOptions`2 ioptions)
   at Umbraco.Web.PublishedCache.NuCache.DataSource.BTree.GetTree(String filepath, Boolean exists)
   at Umbraco.Web.PublishedCache.NuCache.PublishedSnapshotService..ctor(Options options, IMainDom mainDom, IRuntimeState runtime, ServiceContext serviceContext, IPublishedContentTypeFactory publishedContentTypeFactory, IdkMap idkMap, IPublishedSnapshotAccessor publishedSnapshotAccessor, IVariationContextAccessor variationContextAccessor, IUmbracoContextAccessor umbracoContextAccessor, ILogger logger, IScopeProvider scopeProvider, IDocumentRepository documentRepository, IMediaRepository mediaRepository, IMemberRepository memberRepository, IDefaultCultureAccessor defaultCultureAccessor, IDataSource dataSource, IGlobalSettings globalSettings, ISiteDomainHelper siteDomainHelper, IEntityXmlSerializer entitySerializer, IPublishedModelFactory publishedModelFactory, UrlSegmentProviderCollection urlSegmentProviders)

...

This is on Umbraco 8.0.1, which was upgraded from Umbraco 8.0.0. (The attachment's extension will need to be changed from .txt to .json.)

UmbracoTraceLog.RD2818786B7D96.20190319.txt

@zpqrtbnk

Contributor

commented Mar 21, 2019

Is this on Umbraco Cloud, or on pure Azure? Where is D:\home\site\wwwroot\App_Data\TEMP - is it on a shared drive, or on a local drive? Is it possible that another instance of the site, running on another server, is trying to access these files? These files can only be accessed by one instance at a time - we have a mechanism in place to ensure this on one server, but not across servers.

@robertjf

Contributor Author

commented Mar 21, 2019

This is in standard Azure on a basic plan with single instance only, there’s no special configuration at all, pretty vanilla Umbraco 8 with hacked up starter kit

@Shazwazza

Member

commented Mar 26, 2019

The same config principles apply to v8 as v7 for running on Azure. We defo need to get the docs updated (PRs are welcome!) The current azure docs are here https://our.umbraco.com/Documentation/Getting-Started/Setup/Server-Setup/azure-web-apps

for v8 you will need these in appSettings:

  • <add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" />
  • <add key="Umbraco.Examine.LuceneDirectoryFactory" value="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" />
    • Alternatively depending on your requirements there is also Examine.LuceneEngine.Directories.TempEnvDirectoryFactory
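
For reference, a minimal sketch of where these two keys would sit in web.config. The surrounding elements are the standard ASP.NET layout and are shown only for orientation; the comments describe the intent as understood from this thread, not an authoritative description of either setting.

```xml
<configuration>
  <appSettings>
    <!-- existing Umbraco keys stay as they are -->

    <!-- keep NuCache and other temp files under the machine-local %temp% folder -->
    <add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" />

    <!-- run the Lucene indexes from the local temp folder as well -->
    <add key="Umbraco.Examine.LuceneDirectoryFactory"
         value="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" />
  </appSettings>
</configuration>
```
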
@robertjf

Contributor Author

commented Mar 26, 2019

Interestingly enough, there are no Examine config files in this website - I hadn't noticed that before... is that normal? Out of the box shouldn't this just work? Do I need to add the files?

@Shazwazza

Member

commented Mar 26, 2019

There are no Examine config files in v8. OOTB, just like v7, you need to adjust some config to make Umbraco work with Azure. So the above 2 config values are needed. These are equivalent to the v7 appSetting umbracoLocalTempStorage and the Examine config directoryFactory switch.

@Shazwazza

Member

commented Mar 26, 2019

The 2 config values are just in your web.config, these are just appSetting key values in v8

@robertjf

Contributor Author

commented Mar 26, 2019

ah, right - good to know :)

@robertjf

Contributor Author

commented Mar 26, 2019

so is it right to assume these are going to be automatically added in future, or are they only in certain situations?

@Shazwazza

Member

commented Mar 26, 2019

No, you cannot assume that. Just like deploying v7 to Azure, you need to set up specific config for that; it's no different in v8.

If however, you create an Umbraco website from the Azure portal, then yes these should be pre-configured for you but I don't think we have a v8 build on the Azure portal yet.

@JoseMarcenaro

commented Apr 4, 2019

Thanks for explaining the required config settings.

But in v8, even with the above settings, the exact same error happens when you enable Slot Swap in the Azure App Service: before the swap both slots are working OK, but after the swap the new "Production" slot throws the error every time. A full App Service restart is needed to get things working again.

If you work with a single slot, everything is OK.

@Shazwazza

Member

commented Apr 4, 2019

That's interesting, will assume the same problem exists in v7 too since it's the same paradigm.

I think the only way around such behavior is to have an option to not have a persisted cache file, or name the cache file based on the AppDomainAppId + MachineName (should be unique among processes).

@zpqrtbnk what do you think here?

@JoseMarcenaro

commented Apr 4, 2019

Note: We have a site on Umbraco 7.13.2 running on Azure with slot swapping (and the "LocalTempStorage" = "EnvironmentTemp" app setting) and we have never run into this problem with the XML cache file.
Maybe because it's not open all the time?

@zpqrtbnk

Contributor

commented Apr 4, 2019

NuCache stores its file in Path.Combine(_globalSettings.LocalTempPath, "NuCache") and therefore respects the <add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" /> setting. The temp path combines the application id and the site name, but not the machine name, as the local temp is supposed to be local, ie per-server.

I don't know much about "slot swaps" in Azure. But, if they are running on the same machine, same application id, same site name, same temp dir then... there might be a collision? That would require some troubleshooting to get it right. But then, as you mention, it's not only NuCache but other things too.

Now indeed, the difference is that v8 locks the NuCache files for as long as it's running where v7 was "using" the Xml file from time to time = the two sites may cohabit (but that was a bad idea).

So... to make it short, I'd like to hear more about Slot Swaps and, on all slots, get the value of HostingEnvironment.SiteName, HostingEnvironment.ApplicationID, and %temp%.
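
A small sketch of how those values could be collected on each slot, for example from a temporary diagnostics page or a log statement. The class and method names are hypothetical; only the `HostingEnvironment` properties and `Environment.ExpandEnvironmentVariables` are real framework members, and the code has to run inside the ASP.NET application for the `HostingEnvironment` values to be populated.

```csharp
using System;
using System.Web.Hosting;

// Hypothetical helper, not part of Umbraco.
public static class SlotDiagnostics
{
    // Builds a multi-line report of the values that feed the temp path.
    public static string Describe()
    {
        return string.Join(Environment.NewLine, new[]
        {
            "HostingEnvironment.SiteName = '" + HostingEnvironment.SiteName + "'",
            "HostingEnvironment.ApplicationID = '" + HostingEnvironment.ApplicationID + "'",
            // Expand %temp% the same way the process itself sees it
            "%temp% = '" + Environment.ExpandEnvironmentVariables("%temp%") + "'"
        });
    }
}
```

Calling `SlotDiagnostics.Describe()` on every slot, before and after a swap, gives values that can be compared side by side.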

@JoseMarcenaro

commented Apr 4, 2019

@zpqrtbnk thanks for looking into this. I will post today the requested information about the Slot Swaps and those values.

@JoseMarcenaro

commented Apr 4, 2019

This is what I got on each slot, before and after the swap:

<!-- STAGE slot, before swap

    HostingEnvironment.SiteName = 'terniumcomdev__e951'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1464357220/ROOT'
    %temp% = 'D:\local\Temp'
    -->
<!-- PRODUCTION slot, before swap
    HostingEnvironment.SiteName = 'terniumcomdev__dd91'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1742369212/ROOT'
    %temp% = 'D:\local\Temp'
    -->

<!-- STAGE slot, after swap
    HostingEnvironment.SiteName = 'terniumcomdev__dd91'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1742369212/ROOT'
    %temp% = 'D:\local\Temp'
    -->

<!-- PRODUCTION slot, after swap 
    HostingEnvironment.SiteName = 'terniumcomdev__e951'
    HostingEnvironment.ApplicationID = '/LM/W3SVC/1464357220/ROOT'
    %temp% = 'D:\local\Temp'
    -->

And this time the error was not thrown.
In a previous execution (with different slots) I got the following:

System.IO.IOException: The process cannot access the file 'D:\local\Temp\UmbracoData\854d45b396372eb551323cc722f33136\NuCache\NuCache.Content.db' because it is being used by another process.

I am positively sure that this setting was in effect when the error was thrown:

<add key="Umbraco.Core.LocalTempStorage" value="EnvironmentTemp" />

but this one was not (may this be the issue?)

<add key="Umbraco.Examine.LuceneDirectoryFactory" value="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" />

In my latest test (no error) both settings were in effect.

@zpqrtbnk

Contributor

commented Apr 5, 2019

The Lucene setting has no impact on NuCache. The fact that you are seeing an error such as The process cannot access the file 'D:\local\Temp\UmbracoData\854...136\NuCache\NuCache.Content.db' indicates that you are indeed using the proper local temp storage setting.

NuCache uses the built-in MainDom mechanism to ensure that only one app domain at a time can own the cache (and the associated files). MainDom uses a machine-wide named lock; the name is built by combining the application id and the application physical path.
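
To illustrate the idea of a named lock keyed on (app.id, app.path), here is a rough sketch using a named `System.Threading.Mutex` as a stand-in. This is not Umbraco's MainDom code, which may use a different primitive and a different naming scheme; the class name, the `SKETCH-MAINDOM-` prefix and the hashing are assumptions made for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Threading;

// Illustrative only: a named-mutex stand-in for a MainDom-style lock.
public static class MainDomLockSketch
{
    // Hypothetical naming scheme: hash app id + physical path so the
    // name contains no characters that are invalid in a mutex name.
    public static string BuildLockName(string appId, string appPath)
    {
        using (var sha1 = SHA1.Create())
        {
            var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(appId + ":" + appPath));
            return "SKETCH-MAINDOM-" + BitConverter.ToString(hash).Replace("-", "");
        }
    }

    public static void Main()
    {
        var name = BuildLockName("/LM/W3SVC/1464357220/ROOT", @"D:\home\site\wwwroot\");

        // A named mutex is shared by every process that opens the same name.
        using (var mutex = new Mutex(false, name))
        {
            bool acquired;
            try
            {
                // Give the current owner a few seconds to release.
                acquired = mutex.WaitOne(TimeSpan.FromSeconds(5));
            }
            catch (AbandonedMutexException)
            {
                // The previous owner exited without releasing; we own it now.
                acquired = true;
            }

            if (!acquired)
            {
                Console.WriteLine("Another app domain still owns the lock.");
                return;
            }

            try
            {
                // Only at this point would it be safe to open the cache files.
                Console.WriteLine("Acquired " + name);
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }
}
```

Two applications that produce the same (app.id, app.path) pair compete for one lock in this sketch, while two that differ in either value get separate locks and never wait on each other.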

(so obviously I should also have asked you the HostingEnvironment.ApplicationPhysicalPath values...)

A site cannot, in theory, even try to access the NuCache file until it has acquired the MainDom lock. Therefore, for the error to happen... the site must own the machine-wide lock on (app.id, app.path) and yet someone else must lock the files at (app.id, site.name), meaning

  • either there is something weird with app.path, or
  • "machine-wide" is the key here, the two slots don't run on the same physical server yet share the same diskspace

Any chance you can get the ApplicationPhysicalPath values?

And... I don't know enough about "slots" to figure out the physical server thing. Ideas?

@JoseMarcenaro

commented Apr 5, 2019

Just tested it. The ApplicationPhysicalPath value is always D:\home\site\wwwroot\ in both slots, before and after the swap.

Regarding the physical server... it is not fully documented, but many posts assume it is the same because all slots share the same App Service Plan = the same set of resources (e.g. 3.75 GB RAM).

@zpqrtbnk

Contributor

commented Apr 7, 2019

So... assuming that all sites run on the same physical server (for now), that leaves us with app.path.

D:\home\site\wwwroot\ is a kinda virtual path that Azure uses, so basically all apps are always under that path, and that can be annoying in some cases... but in our case, it should indeed make things even safer.

Did not realize you posted a log file with the original issue - now looking at that file.

@JoseMarcenaro

commented Apr 7, 2019

@zpqrtbnk - the log file is not mine, so it does not apply to the slot issue. Thanks.

@JoseMarcenaro

commented Apr 7, 2019

In any case, for some mysterious reason the "process cannot access the file NuCache.Content.db" error has not appeared lately - not in any of the slot swaps during the last three days.
I will post more information - including a log - if it does happen again. Thanks for your help!

@zpqrtbnk

Contributor

commented Apr 7, 2019

Thanks for the update.

Even though it's not your log... the significant lines are:

2019-03-19 03:16:49,695 [P9488/D3/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing e8bc0a7]
2019-03-19 03:16:50,759 [P9488/D3/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:17:39,322 [P9488/D3/T17] INFO   Umbraco.Core.MainDom - Released (environment)
2019-03-19 03:25:35,501 [P9488/D4/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing 761ae5c]
2019-03-19 03:25:36,074 [P9488/D4/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:27:45,813 [P9488/D4/T16] INFO   Umbraco.Core.MainDom - Released (environment)
2019-03-19 03:27:49,388 [P9488/D5/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing 9ce9254]
2019-03-19 03:27:50,970 [P9488/D5/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:28:24,123 [P9488/D5/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booted. (34755ms) [Timing 9ce9254]
2019-03-19 03:28:59,434 [P9472/D2/T1] INFO   Umbraco.Core.Runtime.CoreRuntime - Booting Umbraco 8.0.0 on RD2818786B7D96. [Timing 56405ae]
2019-03-19 03:29:06,148 [P9472/D2/T1] INFO   Umbraco.Core.MainDom - Acquired.
2019-03-19 03:29:31,320 [P9472/D2/T1] ERROR  Umbraco.Core.Runtime.CoreRuntime - Boot failed. (32637ms) [Timing 56405ae]
Umbraco.Core.Exceptions.BootFailedException: Boot failed. ---> System.IO.IOException: ...
2019-03-19 03:29:46,191 [P9488/D5/T13] INFO   Umbraco.Core.MainDom - Released (environment)

Where we see that a new process (9472) starts while the previous one (9488) is running, and acquires the MainDom lock before the previous process releases it, thus hitting an exception when trying to read the cache. That should not be possible...
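
The same check can be done mechanically on a longer log. Below is a sketch that scans log lines for MainDom "Acquired" entries logged while another process/app domain still holds the lock. The class and method names are hypothetical, and the regular expression assumes the `[P<pid>/D<domain>/T<thread>]` prefix format shown in the excerpt above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical log checker, not part of Umbraco.
public static class MainDomLogCheck
{
    // Matches e.g. "[P9488/D3/T1] INFO   Umbraco.Core.MainDom - Acquired."
    private static readonly Regex Entry = new Regex(
        @"\[P(?<pid>\d+)/D(?<dom>\d+)/T\d+\]\s+INFO\s+Umbraco\.Core\.MainDom - (?<evt>Acquired|Released)");

    // Returns one message per "Acquired" that happened while another
    // process/app domain had not yet logged its "Released".
    public static List<string> FindOverlaps(IEnumerable<string> logLines)
    {
        var holders = new HashSet<string>();
        var overlaps = new List<string>();

        foreach (var line in logLines)
        {
            var match = Entry.Match(line);
            if (!match.Success) continue;

            // Identify the owner by process id + app domain id (thread ids differ).
            var owner = "P" + match.Groups["pid"].Value + "/D" + match.Groups["dom"].Value;

            if (match.Groups["evt"].Value == "Acquired")
            {
                if (holders.Count > 0)
                    overlaps.Add(owner + " acquired while held by " + string.Join(", ", holders));
                holders.Add(owner);
            }
            else
            {
                holders.Remove(owner);
            }
        }

        return overlaps;
    }
}
```

It can be fed with `System.IO.File.ReadLines(pathToLog)`. Run against the excerpt above, it should flag a single overlap: P9472/D2 acquiring while P9488/D5 still holds the lock.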

If anything happens again, thanks for reporting. Meanwhile, I've made a few changes so that in 8.0.2 we log all the important infos (app.path, app.id, etc) when the site boots.

@JoseMarcenaro

commented Apr 9, 2019

Hi @zpqrtbnk
Some more info (the exception is being thrown again in my Azure environment).
A log is attached, the same behavior you described happens in the last lines of the log.

Note that in this case:

  • This is not some kind of "race condition": twelve minutes passed between the first lock acquire and the following attempt.
  • There is no lock release nor application shutdown between the first acquisition (at 18:45) and the following attempt (at 18:59)

UmbracoTraceLog.RD2818784FF824.20190409.zip

I'm running v8.0.1 installed via the nuget.org package.

Hope it helps.

@JoseMarcenaro

commented Apr 9, 2019

@zpqrtbnk - I could further isolate the problem.

I performed the following steps:

  • at 19:33 GMT stopped the "stage" slot and deleted the log file
  • then I started the app (on the stage slot) and got into Umbraco - that is the start of the attached log file, at 19:35
  • then I "went to play outside" and did nothing with the app for twenty minutes (I am the only user, for now). I checked the log, and nothing had happened during that time.
  • at 19:55 I performed a slot swap (line 16 of the log)
    ... and voilà! the app successfully acquired a new lock, with a different PID, without shutting down the previous instance.

So it looks like this is related to the Azure Slot swap - although it is the same machine, for some reason the process starts over without having shut down the previous one.

UmbracoTraceLog.RD2818784FF824.20190409.zip

@JoseMarcenaro

commented Apr 9, 2019

One last comment about the slot swap process:
the "swapped" Web app remains the same, in the same machine .. but the URLs that hit it change (i.e. from contoso-stage.azurewebsites.net to contoso.azurewebsites.net). I don't know if immediately after the switch the Azure infrastructure sends some kind of wakeup call.

@zpqrtbnk

Contributor

commented Apr 10, 2019

Hey - thanks for the details - just FYI I am away at the Barcelona meetup, with little time for this issue - will resume work on it at the end of the week (so don't feel bad if I don't reply).

@zpqrtbnk

Contributor

commented Apr 12, 2019

Have looked at your logfile and... here is the thing: the new process (P564) is starting for a different application (since you swapped). So P8108 was for the other application. It does not have to release its lock for P564 to acquire it... OTOH the lock has to be released... but that should happen on the other slot. And so... I would love to see the other log. Is this possible?

Note that it still does not explain what is going on with the cache. I am currently running some experiments with Azure AppServices to try to get a better understanding of it all.

Or maybe not - further experimenting.

@zpqrtbnk

Contributor

commented Apr 12, 2019

I have created an Azure App Service hosting a simple MVC web app, with two slots. Had to have one per-slot appSetting in order to force slot-swaps to trigger a restart. Then, have our MainDom system and some temp file lock to the app... and I can swap slots, and see the MainDom lock being properly released, and swaps happening without issues.

I have to accept that something is wrong, considering your log, but for now I am running out of ideas.

During the swap, on my test, the old process terminates and then a new process starts. In your log, the new process starts before the old process terminates, but that should not be a problem: this is precisely why we have the MainDom lock.

Is this happening every time you swap, or only from time to time? Is this on a production / live site? To what extent would it be possible for you to run some custom DLLs which would include more tracing / logging?

@JoseMarcenaro

commented Apr 14, 2019

@zpqrtbnk ,
The error is no longer happening.
At the time I reported this and sent you the log, it was happening on every swap.

I did not change anything in Umbraco since then - the same package 8.0.1 is used.

I'm almost sure the difference is that, while trying to work around the error, I completely disabled Application Insights on Azure - it was enabled by default when I first created the App Service, and I did not opt out of it as I usually do.

To verify if this was the cause I enabled again Application Insights in both slots and something strange happened:

  • On both slots, immediately after installing Application Insights, I went into the site and the same NuCache locking error was thrown - no swap involved, just installing Application Insights which restarts the app. I'm attaching this log, just in case. UmbracoTraceLog.RD2818784FF824.20190413.zip. Error is at the end of the log.
  • I temporarily fixed the error by stopping and starting the App Service, the site started normally.
  • Further swaps between slots worked fine with no errors

So looking in retrospect, now my thoughts are:

  • the error is not related to the swap itself, but to the restart after Application Insights is installed or removed.
  • the reason why this happened to me on every swap is that when creating a stage slot for an app service that has Application Insights installed, some App Settings are duplicated but the feature is not correctly installed in the new slot - so each swap means activating / deactivating Application Insights

Current situation:

  • I can run the site normally (no errors) with or without Application Insights. So it's not a priority for me that you spend more time on the issue.
  • On the other hand, if you want to get to the bone of what happens, I can gladly help by installing any DLL that collects additional information. This is not a production environment yet. And I found a way to reproduce the issue: Enable Application Insights on an App Service that doesn't have it.

Thanks for your help!

@zpqrtbnk

Contributor

commented Apr 15, 2019

Thanks for keeping up with the detective work ;-) I have tried to experiment with enabling/disabling Application Insights on my test app, but still cannot reproduce the out-of-order lock management that you see.

Happy that it works for you, but it annoys me.

Will lower the priority of this issue, but still, will try to send you a DLL to test, that would log way more details about what is going on. Stay tuned!

@zpqrtbnk

Contributor

commented Apr 15, 2019

This: Debug.zip contains a patched DLL that just logs infos when acquiring the main lock (process id, but also user name, app id and maindom hash) - if you have a moment to try it...

@Shazwazza

Member

commented Jul 16, 2019

The 'ultimate fix' ( 😂 ) is that a race condition is fixed, so that an appdomain that is started and then immediately shut down because another appdomain comes online moments later does not try to access the NuCache persisted files.

Will we be able to have a persisted cache file once the fix is in?

yes

If we can't, does this prevent the use of caching as detailed here:

This doesn't prevent you from doing anything. As mentioned before - the persisted cache files are purely there for startup performance. All cache is in memory regardless of whether it loads on startup from the DB or from the persisted cache files. The usage of Umbraco is identical.

You mentioned also the issue is present in v7 but not detected in the same manner. Are there fixes that will make their way into v7 or anything we need to do for v7 deployments?

This is not in the same context as this issue. Like I said, v7 and v8 are very different. The XML cache in v7 has its own issues, which is one reason why it doesn't exist in v8. These issues have always been there - for example, there is a reason why the right-click context menu on the very root 'content' node in the tree has a "refresh all" option, which doesn't exist in v8. It's because the v7 XML cache can in some cases be "corrupted" for various reasons, including the file not being exclusively locked, which allows multiple appdomains to read/write to it at the same time. These issues cannot be fixed in v7 - a big reason why v8 exists is so that we could fix these issues.

@Shazwazza

Member

commented Jul 16, 2019

The next tests that i would love for @JoseMarcenaro and @brreisner to run is to try a nightly of the 8.1.1 Umbraco.Core/Umbraco.Web DLLs. I will upload these here shortly. We would like to release an 8.1.1 asap (probably in a couple weeks) but I cannot close this issue until i know it works for you guys.

DLLs coming soon....

@JoseMarcenaro

commented Jul 16, 2019

@Shazwazza sure, just post them here and we will test them ASAP.

@brreisner

commented Jul 16, 2019

Will do, Thanks!

@Shazwazza

Member

commented Jul 17, 2019

Please see attached the 8.1.1 binaries zip file.

NOTE These are NOT the final release binaries of 8.1.1 and they don't have a pre-release alias on the build. This build was made from rev 3603090

To test:

  • Take a backup of Umbraco.Core.dll and Umbraco.Web.dll and then replace these 2 DLLs with the ones in the zip file
  • Remove the work around(s)
  • Change your version in your web.config to 8.1.1 instead of 8.1.0 - or you can let the upgrader run
  • Check if everything works
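
The version change in the third step is an appSettings edit. A sketch, assuming the version is stored in the `Umbraco.Core.ConfigurationStatus` key as in a stock v8 web.config; check your own file for the exact key before editing.

```xml
<appSettings>
  <!-- was value="8.1.0"; put 8.1.0 back when rolling back -->
  <add key="Umbraco.Core.ConfigurationStatus" value="8.1.1" />
</appSettings>
```
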

To rollback:

  • replace these 2 DLLs with your backup copies
  • change the version in your web.config back to 8.1.0
  • re-instate work around

UmbracoCms.AllBinaries.8.1.1.zip

Please let me know how you go, we won't be shipping this release without knowing it works for you.

@JoseMarcenaro

commented Jul 17, 2019

@Shazwazza - quick question
We should do this on top of an existing 8.1.0 installation, right?

The problem I have is that my production site is still 8.0.2, and I cannot upgrade it to 8.1.0 until I get issue #5886 solved (Variant Children of non-variant node are not listed) because it breaks the site in multiple places.

@Shazwazza

Member

commented Jul 19, 2019

@JoseMarcenaro yeah, let me make a build of 8.0.x with this fix for you to test and i'll also look into the children issue.

@Shazwazza

Member

commented Jul 19, 2019

@JoseMarcenaro here's an 8.0.2 build with the fix in these 2x DLLs. FYI these are Debug versions of these DLLs (not Release). I've just built this in VS based on the 8.0.2 tag and cherry picked the commit rev 3603090 ... i think that should work for you, please let me know

8.0.2_CoreAndWeb.zip

@JoseMarcenaro

commented Jul 19, 2019

@Shazwazza thanks for building this DLL set for me.

Good news: On an existing 8.0.2 Azure web app, built without the reflection code, I dropped these DLLs in both slots (stage - production) and swapped back and forth without issues (no NuCache blocking)

Then I completed the setup of a 8.1.0 separate environment, and dropped the other DLL zip you provided for 8.1.1 (in both slots) and swapped back and forth - no incidents! Yay!
Of course, on the first run it upgraded the DB to 8.1.1 as expected.

It looks like the fix in the initialization code solves the lock issues. Great work!
Thanks,
Jose

@Shazwazza

Member

commented Jul 22, 2019

@JoseMarcenaro many thanks for testing! I'll go ahead and close this issue. @brreisner will still be great if you can let me know the outcome of your testing too. Cheers!

@warrenbuckley

Member

commented Jul 23, 2019

Closing this issue as this got merged in from this PR
#5924

@ealse

commented Aug 5, 2019

I have updated Umbraco from version 8.1.0 -> 8.1.1

When I was using version 8.1.0 with reflection to disable persisted cache files inside my Compose method (As mentioned above) no errors were thrown.

After updating to version 8.1.1 and removing the workaround the following error came back when swapping my web app:
Umbraco.Core.Exceptions.BootFailedException: Boot failed.
-> System.IO.IOException: The process cannot access the file 'D:\home\site\wwwroot\App_Data\TEMP\NuCache\NuCache.Content.db' because it is being used by another process

UmbracoTraceLog.txt

@jsheard02

commented Aug 19, 2019

@ealse @Shazwazza

We have tried this on 8.1.0, 8.1.1 and 8.1.2, and we experience the same NuCache.Content.Db error as you point out. For some additional context, we add the umbracoLocalTempStorage into the Application Settings of the Web App on azure, and not directly into the web app itself. I will try and add this directly into the web.config file now, and let you know if the issue persists.

@Shazwazza

Member

commented Aug 20, 2019

@jsheard02 umbracoLocalTempStorage isn't the correct setting, that is for v7, see #5035 (comment)

@anthonydotnet

commented Aug 20, 2019

This doesn't seem to work in v8.1.2. We still need to restart the app after a slot swap :(

@JoseMarcenaro

commented Aug 20, 2019

We have three similar Azure environments with v8.1.1
In one of them we get the error every time - the other two are good. And I can't figure out what the difference is. Any suggestions for further testing?

@nul800sebastiaan

Member

commented Aug 21, 2019

This sounds like a configuration problem, at least for @JoseMarcenaro - it might be good to triple-check all differences between those 3 servers.

Other than that, if it can be reproduced with a clean install then we can have a look at additional fixes.

@kjac

Contributor

commented Aug 21, 2019

@nul800sebastiaan This also happened to me with 8.1.2 when using deployment slots on Azure. Here are my findings.

Symptoms

Whenever a deployment slot swap was executed, the target (production) slot would crash with the "The process cannot access the file" error. The source slot would still boot fine when requested after the swap.

This happened with manual swaps from the Azure portal as well as with automated ones from DevOps.

Workarounds

Setting Umbraco.Core.LocalTempStorage = EnvironmentTemp did not help. The only difference was a new file path in the "The process cannot access the file" error (as one would expect given the new LocalTempStorage setting value).

However! Applying @Shazwazza's fix did the trick. The code looks like this in 8.1.2:

using Umbraco.Core.Composing;
using Umbraco.Web.PublishedCache.NuCache;

namespace Your.Namespace.Here
{
    public class IgnoreLocalDbComposer : IUserComposer
    {
        public void Compose(Composition composition)
        {
            composition.Register(factory => new PublishedSnapshotServiceOptions
            {
                IgnoreLocalDb = true
            });
        }
    }
}

Since we can do warm-ups of deployment slots before swapping, any startup performance impact is of no concern.

@Shazwazza

Member

commented Sep 9, 2019

My 'fix' will be fine if that's what you choose to use but i recommend reading this whole thread to understand. Setting IgnoreLocalDb = true just means there is no persisted cache file on disk which can affect startup time, but with slot swapping you probably don't need to worry about that either.

What I would like to understand though is what the actual new problem is. It's unfortunate when people say things like "the problem still exists" or "I have the same problem" which isn't entirely helpful since the first problem was definitely fixed and was caused by a race condition. As it seems that there is a new/similar issue then please open up a new issue with steps to replicate and full details. Thanks!

rossvernal added a commit to rossvernal/UmbracoDocs that referenced this issue Oct 15, 2019
Updated, as per advice here umbraco/Umbraco-CMS#5035 (comment)