Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default Transaction blocks the whole application on each request #7688

Closed
ViRuSTriNiTy opened this issue May 9, 2017 · 9 comments
Closed

Comments

@ViRuSTriNiTy
Copy link
Contributor

ViRuSTriNiTy commented May 9, 2017

Hi there,

let me put a bit of context first: after upgrading from1.9.2 to 1.10.1 the application couldn't handle requests executed at the same time (we got timeouts as soon as 2 users startet a request at the same time). As far as i can remember i googled around and after a while i found a post by @pszmyd recommending setting the snapshot isolation level to a certain value (see here). After i did this the application worked again but after a while i recognized that when a long operation takes place (e.g. in a controller action) the whole application is blocked until the long operation finishes (even when using async actions).

Then i remembered a paragraph of the Orchard docs stating that a DB transaction is created for each request. Therefore i injected ITransactionManager in my controller and called ITransactionManager.Cancel() in the action first and suddenly Orchard is back on track and can handle multiple requests at the same time:

public class MyController : ...
{
  // public
    public MyController(Orchard.Data.ITransactionManager aTransactionManager)
    {
      mTransactionManager = aTransactionManager;
    }
    
    public async System.Threading.Tasks.Task<System.Web.Mvc.ActionResult> PrepareAsync()
    {
      // cancel the blocking transaction as this request might take a longer time,
      // otherwise the whole application will be blocked by the transaction until the operation ends
      //
      // remarks:
      //  we don't need the transaction here as
      //    b) we are purely reading data from DB
      //    a) we don't read the DB data via HQL or other NHibernate based stuff (instead using LINQ to SQL), so the transaction is of no use here
      mTransactionManager.Cancel();
      
      // operation with a long duration is executed here
    }
      
  // private
    private Orchard.Data.ITransactionManager mTransactionManager;
}

Therefore i assume its a DB transaction / configuration issue. My assumption: as long as a transaction is open no other can read from the DB.

I don't feel comfortable calling ITransactionManager.Cancel() all over the place. Also this approach won't work when basic Orchard functionality is concerned, e.g. on saving a content item -> handlers are called -> a handler might take a longer time to process the content item -> here i cannot call ITransactionManager.Cancel().

What could possibly cause the DB blocking and how can i fix this?

So lonG

@sebastienros
Copy link
Member

  1. Maybe you should consider the ReadUncommitted isolation level if you are not already using it. It's a setting in hostComponent.config but it should be set by default.
  2. Can you give some more detailed repro? Or can I assume that any action doing any db access (read OR write) will block the requests? Ideally give me the code for an action that repros the issue and we can investigate.

I would say that if you are doing an update on a table and the action takes a long time, then the database will lock the table during the whole transaction, right.

@ViRuSTriNiTy
Copy link
Contributor Author

Ok, i tracked it down. The module AlexZh.WindowsAuthentication includes a file CustomAuthenticationService.cs with the following code lines:

// Signing In Windows User
if (user != null)
{
  SetAuthenticatedUserForRequest(user);
  SignIn(user, false); // *1
  foreach (var userEventHandler in userEventHandlers) 
  {
    userEventHandler.LoggedIn(user); // *2
  }
}

The module implements windows authentication. After a user has been signed in (*1) the module calls the user event handlers (*2) which results in a call to LoggedIn of Orchard.Users.Events.LoginUserEventHandler. Here the last login timestamp is updated which seems to trigger somehow the content manager or something. When i exclude LoginUserEventHandler from the project, everthing works fine.

The following exception is displayed when the LoginUserEventHandler is included in the project:

Line 368:            contentItemCriteria.SetCacheable(true);
Line 369:
Line 370:            return contentItemVersionCriteria.List<ContentItemVersionRecord>();
Line 371:        }
Line 372:

Source File: src\Orchard\ContentManagement\DefaultContentManager.cs    Line: 370 

Stack Trace: 


[Win32Exception (0x80004005): The wait operation timed out]

[SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
Die Anweisung wurde beendet.]
   System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) +2442598
   System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) +5766516
   System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) +285
   System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) +4162
   System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption) +255
   System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) +2031
   System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) +911
   System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) +359
   System.Data.SqlClient.SqlCommand.ExecuteNonQuery() +286
   System.Data.SqlClient.SqlCommand.ExecuteBatchRPCCommand() +42
   System.Data.SqlClient.SqlCommandSet.ExecuteNonQuery() +289
   NHibernate.AdoNet.SqlClientSqlCommandSet.ExecuteNonQuery() +39
   NHibernate.AdoNet.SqlClientBatchingBatcher.DoExecuteBatch(IDbCommand ps) +222

[GenericADOException: could not execute batch command.[SQL: SQL not available]]
   NHibernate.AdoNet.SqlClientBatchingBatcher.DoExecuteBatch(IDbCommand ps) +273
   NHibernate.AdoNet.AbstractBatcher.ExecuteBatchWithTiming(IDbCommand ps) +122
   NHibernate.AdoNet.AbstractBatcher.ExecuteBatch() +62
   NHibernate.AdoNet.AbstractBatcher.OnPreparedCommand() +4
   NHibernate.AdoNet.AbstractBatcher.PrepareBatchCommand(CommandType type, SqlString sql, SqlType[] parameterTypes) +118
   NHibernate.Persister.Entity.AbstractEntityPersister.Update(Object id, Object[] fields, Object[] oldFields, Object rowId, Boolean[] includeProperty, Int32 j, Object oldVersion, Object obj, SqlCommandInfo sql, ISessionImplementor session) +427
   NHibernate.Persister.Entity.AbstractEntityPersister.UpdateOrInsert(Object id, Object[] fields, Object[] oldFields, Object rowId, Boolean[] includeProperty, Int32 j, Object oldVersion, Object obj, SqlCommandInfo sql, ISessionImplementor session) +194
   NHibernate.Persister.Entity.AbstractEntityPersister.Update(Object id, Object[] fields, Int32[] dirtyFields, Boolean hasDirtyCollection, Object[] oldFields, Object oldVersion, Object obj, Object rowId, ISessionImplementor session) +654
   NHibernate.Action.EntityUpdateAction.Execute() +326
   NHibernate.Engine.ActionQueue.Execute(IExecutable executable) +25
   NHibernate.Engine.ActionQueue.ExecuteActions(IList list) +59
   NHibernate.Engine.ActionQueue.ExecuteActions() +27
   NHibernate.Event.Default.AbstractFlushingEventListener.PerformExecutions(IEventSource session) +182
   NHibernate.Event.Default.DefaultAutoFlushEventListener.OnAutoFlush(AutoFlushEvent event) +140
   NHibernate.Impl.SessionImpl.AutoFlushIfRequired(ISet`1 querySpaces) +389
   NHibernate.Impl.SessionImpl.List(CriteriaImpl criteria, IList results) +535
   NHibernate.Impl.CriteriaImpl.List(IList results) +40
   NHibernate.Impl.CriteriaImpl.List() +58
   Orchard.ContentManagement.DefaultContentManager.GetManyImplementation(QueryHints hints, Action`2 predicate) in src\Orchard\ContentManagement\DefaultContentManager.cs:370
   Orchard.ContentManagement.DefaultContentManager.Get(Int32 id, VersionOptions options, QueryHints hints) in src\Orchard\ContentManagement\DefaultContentManager.cs:168
   Orchard.ContentManagement.DefaultContentManager.Get(Int32 id, VersionOptions options) in src\Orchard\ContentManagement\DefaultContentManager.cs:131
   Orchard.ContentManagement.ContentGetExtensions.Get(IContentManager manager, Int32 id, VersionOptions options) in src\Orchard\ContentManagement\ContentExtensions.cs:159
   Orchard.Core.Settings.Services.SiteService.GetSiteSettings() in src\Orchard.Web\Core\Settings\Services\SiteService.cs:32
   Orchard.Settings.CurrentSiteWorkContext.Get(String name) in src\Orchard\Settings\CurrentSiteWorkContext.cs:13
   Orchard.Environment.<>c__DisplayClass9_0`1.<FindResolverForState>b__0(Lazy`1 wcsp) in src\Orchard\Environment\WorkContextImplementation.cs:40
   System.Linq.WhereSelectArrayIterator`2.MoveNext() +80
   System.Linq.Enumerable.FirstOrDefault(IEnumerable`1 source, Func`2 predicate) +161
   Orchard.Environment.WorkContextImplementation.FindResolverForState(String name) in src\Orchard\Environment\WorkContextImplementation.cs:40
   System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory) +72
   Orchard.Environment.WorkContextImplementation.GetState(String name) in src\Orchard\Environment\WorkContextImplementation.cs:35
   Orchard.WorkContext.get_CurrentSite() in src\Orchard\WorkContext.cs:65
   Orchard.Roles.Services.RolesBasedAuthorizationService.TryCheckAccess(Permission permission, IUser user, IContent content) +403
   Orchard.Security.Authorizer.Authorize(Permission permission, IContent content, LocalizedString message) in src\Orchard\Security\Authorizer.cs:72

What is going wrong here? Any idea?

@sebastienros
Copy link
Member

sebastienros commented May 11, 2017

Maybe try to put a breakpoint in the transaction manager to see if two threads are trying to create a new one.
Or enable the logs on SQL to see what is sent.

@ViRuSTriNiTy
Copy link
Contributor Author

Yes, i will enable the logs, good idea, thx. But i dont understand your statement: "Maybe try to put a request in the transaction manager to see if two threads are trying to create a new one.". Can you elaborate?

@sebastienros
Copy link
Member

Updated my comment (breakpoint).

@ViRuSTriNiTy
Copy link
Contributor Author

ViRuSTriNiTy commented May 12, 2017

Setting a breakpoint in TransactionManager.EnsureSession() didn't reveal anything unusual, one transaction per request is created.

So i enabled NHibernate Logging and the following queries are logged right before the exception occurs:

2017-05-12 09:19:17,239 [8] NHibernate.SQL - (null) - DEBUG
SELECT ... FROM Orchard_Orchard_Framework_ContentItemVersionRecord this_
inner join Orchard_Orchard_Framework_ContentItemRecord contentite1_ on this_.ContentItemRecord_id=contentite1_.Id
inner join Orchard_Orchard_Users_UserPartRecord userpartre2_ on contentite1_.Id=userpartre2_.Id
left outer join Orchard_Orchard_Framework_ContentTypeRecord contenttyp6_ on contentite1_.ContentType_id=contenttyp6_.Id
WHERE userpartre2_.NormalizedUserName = @p0 and this_.Published = @p1;@p0 = 'dafr'
[Type: String (4000)], @p1 = True [Type: Boolean (0)]

2017-05-12 09:19:17,283 [8] NHibernate.SQL - (null) - DEBUG
select cast(count(*) as INT) as col_0_0_ from ...EmployeePartRecord employeepa0_ where employeepa0_.Id=@p0;@p0 = 6 [Type: Int32 (0)]

2017-05-12 09:19:17,321 [8] NHibernate.SQL - (null) - DEBUG
select cast(count(*) as INT) as col_0_0_ from ...StorePartRecord storepartr0_ where storepartr0_.Id=@p0;@p0 = 6 [Type: Int32 (0)]

2017-05-12 09:19:17,340 [8] NHibernate.SQL - (null) - DEBUG
select cast(count(*) as INT) as col_0_0_ from ...AccountingPartRecord accounting0_ where accounting0_.Id=@p0;@p0 = 6 [Type: Int32 (0)]

2017-05-12 09:19:17,358 [8] NHibernate.SQL - (null) - DEBUG
select cast(count(*) as INT) as col_0_0_ from ...AddressPartRecord addresspar0_ where addresspar0_.Id=@p0;@p0 = 6 [Type: Int32 (0)]

2017-05-12 09:19:17,376 [8] NHibernate.SQL - (null) - DEBUG
select cast(count(*) as INT) as col_0_0_ from ...UserAttributePartRecord userattrib0_ where userattrib0_.Id=@p0;@p0 = 6 [Type: Int32 (0)]

2017-05-12 09:19:17,403 [8] NHibernate.SQL - (null) - DEBUG Batch commands:
command 0:
UPDATE Orchard_Orchard_Framework_ContentItemRecord
SET Data = @p0, ContentType_id = @p1
WHERE Id = @p2;

@p0 = '<Data><UserPart UserName="dafr" ... LastLoginUtc="2017-05-12T07:19:17.4024893Z" ... />...</Data>' [Type: String (1073741823)],
@p1 = 2 [Type: Int32 (0)],
@p2 = 6 [Type: Int32 (0)]

2017-05-12 09:19:47,484 [8] NHibernate.Event.Default.AbstractFlushingEventListener - (null) - ERROR
Could not synchronize database state with session

As we can see, first the current user is queried, then the parts. Afterwards LoginUserEventHandler sets the last login timestamp and last query seems to be the result of this,.
But why would the UPDATE query hang?

@ViRuSTriNiTy
Copy link
Contributor Author

Found it! The issue is as follows:

Session A

  1. User dafr starts a request to execute a long operation
  2. .NET Authorization stuff starts (i see something with IAuthorizationFilter in call stack), -
  • IAuthenticationService.GetAuthenticatedUser() in AlexZh.WindowsAuthentication module is callled
    • user signin is executed via FormsAuthenticationService -> transaction A begins
    • user event handlers are called
    • LoginUserEventHandler updates last login timestamp -> generates a UPDATE query *1
  1. controller action that executes the long operation is called (async implementation in an Orchard module)
  2. operation starts and runs for a longer time

-> transaction A is still open with the UPDATE Orchard_Orchard_Framework_ContentItemRecord statement (*1) which targets the row associated with User dafr

Session B (e.g. in another browser):

  1. User dafr starts a request (e.g. to the default page of Orchard)
  2. Authorization stuff starts, same behavior as Session A, step 2, transaction B starts, LoginUserEventHandler updates last login timestamp -> also generates a UPDATE query *2

-> UPDATE Orchard_Orchard_Framework_ContentItemRecord statement (*2) is sent to SQL server (traced it via SQL Profiler) but the statement never gets executed successfully and returns after around 30 seconds with RPC:Completed error 2 (abort).

After reading Data Modifications under Read Committed Snapshot Isolation (RCSI turned on as suggested by @pszmyd here in my initial post) it's clear why this happens.

The UPDATE query in transaction A locks the row until the transaction is commited (or canceled as my initial approach did) whereas transaction B waits for the lock to be released but it is not released in time because the operation of Session A takes to long.

So this issue affects a specific user only.

Now, what would be the right approach to tackle this?
I would say commiting the transaction before starting a long operation to ensure everything is flushed to the DB, right?

@sebastienros
Copy link
Member

I assume the only thing to do is ensure the tables are not locked when a long operation is executed. So start by committing the current one, then do the long operation, and finally open a new one just for the necessary time.

@ViRuSTriNiTy
Copy link
Contributor Author

Will do, thx.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants