Deadlocks occur when sending to a channel with multiple asynchronous destinations #2961

Closed
rbeckman-nextgen opened this issue May 11, 2020 · 2 comments
@rbeckman-nextgen rbeckman-nextgen commented May 11, 2020

Forum thread: http://www.mirthcorp.com/community/forums/showthread.php?t=9537
{quote}We have numerous inbound channels that are MLLP listeners sending to multiple channel-reader type destinations. Everything is queued and none of it is synchronized.

Frequently, this configuration produces a deadlock while updating the message statistics for the channel and destinations:
[2013-11-03 18:47:36,366] ERROR (com.mirth.connect.connectors.vm.VmDispatcher:510): Error processing queued message 215-3 (SENT) for channel 8c89cb69-b26d-450e-a9fc-d09207d9b6eb (Destination 3). This error is expected if the message was manually removed from the queue.
com.mirth.connect.donkey.server.data.DonkeyDaoException: java.sql.SQLException: Transaction (Process ID 63) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
    at com.mirth.connect.donkey.server.data.jdbc.JdbcDao.addChannelStatistics(JdbcDao.java:369)
    at com.mirth.connect.donkey.server.data.jdbc.JdbcDao.commit(JdbcDao.java:1581)
    at com.mirth.connect.donkey.server.data.buffered.BufferedDao.executeTasks(BufferedDao.java:131)
    at com.mirth.connect.donkey.server.data.buffered.BufferedDao.commit(BufferedDao.java:74)
    at com.mirth.connect.donkey.server.data.buffered.BufferedDao.commit(BufferedDao.java:61)
    at com.mirth.connect.donkey.server.channel.DestinationConnector.run(DestinationConnector.java:498)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.sql.SQLException: Transaction (Process ID 63) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
    at net.sourceforge.jtds.jdbc.SQLDiagnostic.addDiagnostic(SQLDiagnostic.java:372)
    at net.sourceforge.jtds.jdbc.TdsCore.tdsErrorToken(TdsCore.java:2894)
    at net.sourceforge.jtds.jdbc.TdsCore.nextToken(TdsCore.java:2334)
    at net.sourceforge.jtds.jdbc.TdsCore.getMoreResults(TdsCore.java:643)
    at net.sourceforge.jtds.jdbc.JtdsStatement.processResults(JtdsStatement.java:614)
    at net.sourceforge.jtds.jdbc.JtdsStatement.executeSQL(JtdsStatement.java:573)
    at net.sourceforge.jtds.jdbc.JtdsPreparedStatement.executeUpdate(JtdsPreparedStatement.java:707)
    at com.mirth.connect.donkey.server.data.jdbc.JdbcDao.addChannelStatistics(JdbcDao.java:346)
    ... 6 more

Turning the synchronization (wait for previous channel...) on makes this issue go away, but makes our processing sequential -- which we would like to avoid.

I've attached the offending configuration, and I can get it to reliably trigger using a set of about 500 messages pumped at it.

I've started to look at the way the code for the channels queues up the events for processing based on the Deadlock Graph included in the ZIP file -- but that is as far as I have gotten thus far.

Just wanted to see if anyone else had seen this and if anybody was out ahead of me on the analysis.

Oh -- the messages do get delivered, it is just the stat updates that are failing.

Environment:

  • Windows Server 2008 R2
  • Mirth Connect Server 3.0.0.6931
  • Built on September 30, 2013
  • Server ID: 2f1516e3-8ed6-443e-914f-c670e8fde5cf
  • Java version: 1.7.0_45 (64-bit)
  • SQL Server 2008 R2 & SQL Server 2012

Cordially,

Jonathan Lent
{quote}

Imported Issue. Original Details:
Jira Issue Key: MIRTH-3042
Reporter: narupley
Created: 2013-11-04T08:20:03.000-0800

@rbeckman-nextgen rbeckman-nextgen added this to the 3.0.1 milestone May 11, 2020

@rbeckman-nextgen rbeckman-nextgen commented May 11, 2020

With certain channel configurations, SQL Server will deadlock unless the channel stats row is always updated, and is updated before the connector stats rows. We determined that this is because SQL Server creates a page lock when the statistics update statement references the existing row values in order to increment them (RECEIVED = RECEIVED + ?). Other databases such as Postgres use only row locks in this situation, so they were not deadlocking. The deadlock was only confirmed to happen on a channel with multiple asynchronous destinations and destination queues enabled.

The JdbcDao.addChannelStatistics() method has been rewritten so that the channel stats row is updated before the connector rows, and is always updated even if its numbers didn't change, thus preventing the SQL Server deadlock scenario.
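
As a rough illustration of the ordering rule described above (not the actual JdbcDao code), the sketch below computes the order in which stats rows would be updated: the channel row always comes first, even with nothing to add, followed by the connector rows that changed. The class name, the `CHANNEL_ROW` sentinel, the sorting of connector rows by ID, and the skipping of unchanged connector rows are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; names and data layout are illustrative, not Mirth's.
final class StatsUpdateOrder {
    // Assumed sentinel key for the channel-level stats row.
    static final int CHANNEL_ROW = -1;

    /**
     * Returns the row keys in the order they should be updated:
     * the channel row first (unconditionally), then every connector
     * row with a non-zero delta, in ascending ID order.
     */
    static List<Integer> updateOrder(Map<Integer, Long> receivedDeltas) {
        List<Integer> order = new ArrayList<>();
        // Always touch the channel row first, even if its delta is zero.
        order.add(CHANNEL_ROW);
        for (Map.Entry<Integer, Long> entry : new TreeMap<>(receivedDeltas).entrySet()) {
            boolean isChannelRow = entry.getKey() == CHANNEL_ROW;
            boolean unchanged = entry.getValue() == 0L;
            if (!isChannelRow && !unchanged) {
                order.add(entry.getKey());
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Map<Integer, Long> deltas = new HashMap<>();
        deltas.put(3, 1L);
        deltas.put(1, 0L); // unchanged connector row, skipped
        deltas.put(2, 5L);
        System.out.println(updateOrder(deltas)); // prints [-1, 2, 3]
    }
}
```

Each key in the returned list would then drive one increment statement of the RECEIVED = RECEIVED + ? form inside a single transaction.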

Imported Comment. Original Details:
Author: brentm
Created: 2013-11-08T11:21:22.000-0800


@rbeckman-nextgen rbeckman-nextgen commented May 11, 2020

Verified that the deadlock appears to be gone.

Just to clarify a little: it may or may not be a page lock that is taken when an update of the form received = received + 1 (etc.) is executed. It appears to simply lock all rows that were inserted before the row being updated.

For instance, if a table had 3 rows, a deadlock would occur in the following situation:

  1. Thread 1 updates row 2 and keeps the transaction open
  2. Thread 2 updates row 1 and keeps the transaction open (this will block)
  3. Thread 1 updates row 3 (deadlock)

Since the channel statistics row is always the first row inserted into the table, what we have essentially done is modify the steps above so that Thread 1 will always update row 1 first. Thread 2 will still block when it attempts to update row 1, but the deadlock will not occur when Thread 1 tries to update rows 2 and 3. Why is this only a problem for SQL Server? That's a question for Microsoft.
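
The two schedules can be replayed in a small single-threaded simulation. This is only a sketch under an assumed locking model, chosen because it reproduces the blocking and deadlock described in the steps above: each update scans the rows in insertion order, waits if it meets a row locked by another transaction, and keeps a lock on the row it updates until the transaction ends. The thread does not establish that SQL Server actually behaves this way, and `ScanLockModel` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the assumed locking behaviour; not a description of SQL Server internals.
final class ScanLockModel {
    private final int rowCount;
    // Row number -> transaction currently holding that row's lock.
    private final Map<Integer, String> owner = new HashMap<>();
    // Blocked transaction -> transaction it is waiting on.
    private final Map<String, String> waitingFor = new HashMap<>();

    ScanLockModel(int rowCount) {
        this.rowCount = rowCount;
    }

    /** Simulates one update; returns "OK", "BLOCKED", or "DEADLOCK". */
    String update(String tx, int targetRow) {
        for (int row = 1; row <= rowCount; row++) {
            String holder = owner.get(row);
            if (holder != null && !holder.equals(tx)) {
                // The scan reached a row locked by someone else: wait on them.
                waitingFor.put(tx, holder);
                return hasCycle(tx) ? "DEADLOCK" : "BLOCKED";
            }
            if (row == targetRow) {
                owner.put(row, tx); // held until the transaction ends
            }
        }
        return "OK";
    }

    // Follows the wait-for chain from start and reports whether it loops back.
    private boolean hasCycle(String start) {
        String current = waitingFor.get(start);
        int hops = 0;
        while (current != null && hops++ <= waitingFor.size()) {
            if (current.equals(start)) {
                return true;
            }
            current = waitingFor.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        ScanLockModel original = new ScanLockModel(3);
        System.out.println(original.update("T1", 2)); // OK
        System.out.println(original.update("T2", 1)); // BLOCKED
        System.out.println(original.update("T1", 3)); // DEADLOCK

        ScanLockModel reordered = new ScanLockModel(3);
        System.out.println(reordered.update("T1", 1)); // OK
        System.out.println(reordered.update("T2", 1)); // BLOCKED
        System.out.println(reordered.update("T1", 2)); // OK
        System.out.println(reordered.update("T1", 3)); // OK
    }
}
```

In the reordered schedule Thread 2 is blocked before it holds any lock, so Thread 1 can finish its remaining updates and no wait-for cycle forms.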

Imported Comment. Original Details:
Author: wayneh
Created: 2013-11-12T10:53:09.000-0800
