Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SFTP File Reader gets stuck in Polling state. Must be halted and restarted. #4142

Closed
rbeckman-nextgen opened this issue May 11, 2020 · 2 comments
Closed
Milestone

Comments

@rbeckman-nextgen
Copy link
Collaborator

@rbeckman-nextgen rbeckman-nextgen commented May 11, 2020

See forum post https://www.mirthcorp.com/community/forums/showthread.php?t=11371 for history of issue.

All of the seemingly unrelated errors have to do with the server terminating the connection. My thread dump showed that the Jsch session thread was not running when the channel was stuck polling.

After 8 occurances of these errors is when the channel gets stuck. org.apache.commons.pool2.impl.GenericObjectPoolConfig.DEFAULT_MAX_TOTAL is also 8.

The issue appears to be in the releaseConnection method of com.mirth.connect.connectors.file.FileConnector. If the FileSystemConnection is not connected, it neither returns it to the pool nor invalidates it.

I was able to verify this with the following code:
`
var donkey = com.mirth.connect.donkey.server.Donkey.getInstance();
var stuckChannel = '9113f3c3-2024-424e-8916-592ec983a822';
var channel = donkey.getDeployedChannels().get(stuckChannel);
var sourceConnector = channel.getSourceConnector();
var f = sourceConnector.getClass().getDeclaredField("fileConnector");
f.setAccessible(true);
var fileConnector = f.get(sourceConnector);
f = fileConnector.getClass().getDeclaredField("pools");
f.setAccessible(true);
var pools = f.get(fileConnector);
var pool = pools.values().iterator().next();

logger.info(JSON.stringify({
pool:{
getNumIdle: pool.getNumIdle(),
getNumActive: pool.getNumActive(),
getNumWaiters: pool.getNumWaiters()
}
}));
Which logged the following when my channel was stuck:INFO (deploy:?): {"pool":{"getNumIdle":0,"getNumActive":8,"getNumWaiters":1
I was able to reproduce this in a test environment by setting up a sftp server with artifical network delay. Once the file reader started reading a file I would kill the sftp server to sever the connection and then restart the sftp server. After 8 iterations of this the channel was stuck in a polling state.

Imported Issue. Original Details:
Jira Issue Key: MIRTH-4266
Reporter: agermano
Created: 2018-05-02T20:24:27.000-0700

@rbeckman-nextgen rbeckman-nextgen added this to the 3.6.0 milestone May 11, 2020
@rbeckman-nextgen
Copy link
Collaborator Author

@rbeckman-nextgen rbeckman-nextgen commented May 11, 2020

Thanks Tony! I would like to add, I also saw this issue in 3.5.0

Imported Comment. Original Details:
Author: kirbykn2
Created: 2018-05-03T05:56:45.000-0700

@rbeckman-nextgen
Copy link
Collaborator Author

@rbeckman-nextgen rbeckman-nextgen commented May 11, 2020

Surefire way to reproduce:

Setup an SFTP container:
docker pull atmoz/sftp

Create a File Writer to write to it:
!screenshot-1.png!

Set a breakpoint at FileDispatcher:166, right after the line
fileSystemConnection = fileConnector.getConnection(fileSystemOptions);
\
\
Now, do this 9 times:

  • Start up a container: docker run -p 2222:22 -d atmoz/sftp user:pass:::upload
  • Send a message, verify dispatcher is suspended at the breakpoint
  • Kill the container: docker kill $(docker ps -q)
  • Resume the dispatcher thread

On the 9th time, the breakpoint won't be reached, and instead your File Writer will be hung until you halt it. Verify with a threaddump:

"File Writer Process Thread on file writer (d9a94b15-59a0-4f9e-b474-c8d53c868705), Destination 1 (1)" #62 prio=5 os_prio=31 tid=0x00007feaa5743000 nid=0x8d07 waiting on condition [0x000070000ba42000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000007bdfbe728> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:583) at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:442) at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363) at com.mirth.connect.connectors.file.FileConnector.getConnection(FileConnector.java:174) at com.mirth.connect.connectors.file.FileDispatcher.send(FileDispatcher.java:165) at com.mirth.connect.donkey.server.channel.DestinationConnector.handleSend(DestinationConnector.java:822) at com.mirth.connect.donkey.server.channel.DestinationConnector.process(DestinationConnector.java:476) at com.mirth.connect.donkey.server.channel.DestinationChain.doCall(DestinationChain.java:121) at com.mirth.connect.donkey.server.channel.DestinationChain.call(DestinationChain.java:63) at com.mirth.connect.donkey.server.channel.Channel.process(Channel.java:1729) at com.mirth.connect.donkey.server.channel.Channel.dispatchRawMessage(Channel.java:1191) at com.mirth.connect.donkey.server.channel.SourceConnector.dispatchRawMessage(SourceConnector.java:192) at com.mirth.connect.server.controllers.DonkeyEngineController.dispatchRawMessage(DonkeyEngineController.java:1071) at com.mirth.connect.server.api.servlets.MessageServlet.processMessage(MessageServlet.java:94)

Imported Comment. Original Details:
Author: narupley
Created: 2018-05-09T08:53:59.000-0700

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant
You can’t perform that action at this time.