Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First sketch for mechanism to cancel DcmSCU negotiateAssociation before connectionTimeout elapses. #62

Closed
wants to merge 2 commits into from

Conversation

jogerh
Copy link

@jogerh jogerh commented Jun 26, 2022

If a server is unavailable, the DcmSCU::negotiateAssociation function call will block for the duration of the connectionTimeout. In many cases this timeout may be long, 30-60 seconds to ensure that we can robustly establish associations against slow servers. This is a long time for a user to wait for a device to shut down.

In some use cases, for example for portable imaging devices, the ability to quickly shut down is important to reduce the risk of the user pulling the power cable before the system is completely shut down. Such devices typically have a high risk that servers are temporarily unavailable, for example if they are serving multiple purposes and have to be moved from lab to lab.

The design idea is that the client software can create a socketpair that is used as a communication channel for signalling cancellation. While an association is being established, the client software can send a message on the socket pair from a different thread, which will wake the select in dulfsm.cc requestAssociationTCP function. The error handling will then follow the same path as if the connectionTimeout would be reached.

Another option would be to introduce a 'dcmTcpPollingInterval' and a 'cancelConnection' pointer parameter to the DUL_ASSOCIATESERVICEPARAMETERS structure. We could then set the dcmTcpPollingInterval to for example 5 seconds. In dulfsm.cc requestAssociationTCP, we could then call 'select()' in a loop with the dcmTcpPollingInterval as timeout until we reach the desired connectionTimeout. This would allow us to jump out of the select loop if cancelConnection was set to true in the meantime.

I implemented this on top of #61, which is the first commit.

100029962 added 2 commits June 26, 2022 18:06
This change makes it possible to configure the DcmSCU TCP connection timeout per instance, instead of using the global dcmConnectionTimeout parameter.

In a multi-service executable, that for example supports QR, Worklist, and Storage, storage may be executed in parallel with Worklist and QR. This is a typical pattern on some DICOM modalities, where the user can do QR or Worklist queries while images are being sent in the background.

It is then beneficial to be able to configure different TCP connect timeouts for each individual service. A typical example could be a multi-site institution where the storage server is far away and requires a long TCP connection timeout, whereas the Worklist server could be on site and require a short TCP connection timeout.

This becomes particularly important for mobile devices, where the device may have access to a storage server, temporarily not have access to the worklist server. We would then like to have a short worklist timeout to quickly fall back to an offline worklist.

Fix issue with ambiguity of the value -1 in TCP connection timeout.

Instead of using -1 to indicate fallback to the timeout value specified by dcmConnectionTimeout, we can just remove the fallback and instead call dcmConnectionTimeout.get() directly where we want to specify the timeout value.

This way we avoid confusion related to what the value -1 means. Now -1 means infinite timeout.

Add TCP connection timeout argument to ASC_createAssociationParameters in dcmpstat and dcmqrdb projects.

I had previously only compiled with a subset of the DCMTK modules in my solution.

Add a couple of tests that verifies the new behavior of DcmSCU::setConnectionTimeout.

These tests helps documenting the behavior and prevents regressions in the handling of connection timeouts.

Fix the type of DcmSCU::m_tcpConnectTimeout which should be a Sint32.

A value of -1 is used to indicate inifinite timeout, so we should use a signed type.
…re connectionTimeout elapses.

This helps quickly terminating attempts to establish an association for example during applicaiton shutdown.

The problem is that if a server is unavailable, the negotiateAssociation function call will block for the duration of the connectionTimeout. In many cases this timeout may be long, 30-60 seconds to ensure that we can robustly establish associations against slow servers. This is a long time for a user to wait for a device to shut down.

In some use cases, for example for portable imaging devices, the ability to quickly shut down is important to reduce the risk of the user pulling the power cable before the system is completely shut down. Such devices typically have a high risk that servers are temporarily unavailable, for example if they are serving multiple purposes and have to be moved from lab to lab.

The design idea is that the client software can create a socketpair that is used as a communication channel for signalling cancellation. While an association is being established, the client software can send a message on the socket pair from a different thread, which will wake the select in dulfsm.cc requestAssociationTCP function. The error handling will then follow the same path as if the connectionTimeout would be reached.
@michaelonken
Copy link
Member

Hey Joger,

I looked into this in detail now and after "completing" and playing around with the socket solution, I think it is not viable for DCMTK.

The reason is (and I found out quite late due to my missing experience with sockets) that if you use a socket of type AF_INET, then you must actually open a TCP port in order to listen for the incoming "cancel" request that should interrupt the select call. Of course, one could bind that socket only to localhost, however, such open ports have been used in the past by malicious software (e.g. downloaded javascript code in a browser).

Of course, AF_UNIX instead of AF_INET would be a good choice, but it is not supported well on Windows systems -- actually Microsoft added AF_UNIX to Winsock lately in Windows 10 (and 11), but it is still missing the socketpair() call which would be the weapon of choice to create two automatically connected sockets. One can use AF_UNIX based on temporary files also on Windows, but this leads to more complexity and in the end, trouble, than it is worth implementing.

Am I missing another obvious socket-based solution?

So the only possibility I see to make this work reliably on various OSes is the fallback solution you proposed (using "intermediate" timeout and cancel flag in DUL_ASSOCIATESERVICEPARAMETERS.

@jogerh
Copy link
Author

jogerh commented Apr 17, 2023

Replaced by #79

@jogerh jogerh closed this Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants