
Flushing a socket in a Live workflow #2760

Open
sylvain-apivideo opened this issue Jul 6, 2023 · 5 comments
Labels
[core] Area: Changes in SRT library core Type: Enhancement Indicates new feature requests
@sylvain-apivideo

Hello,
We've observed strange behavior at the end of an SRT connection: data has been received by the receiving socket, but never delivered to the application.
We traced it back to TSBPD mode, which simply stops doing anything as soon as the socket has been marked as closing.
I would have expected the SRTO_LINGER option to handle this, but that is not the case.

  • Is it intentional that data is not flushed when TSBPD mode is enabled?
  • How is an application supposed to handle this without blindly keeping the socket alive for an arbitrary amount of time?

Additionally, the documentation states that the SRTO_LINGER option is disabled when using the "live" transtype, because "In this type of workflow there is no point for wait for all the data to be delivered after a connection is closed.". I am not sure I fully understand the point. I find it could be useful to read all the data already received by the receiving socket (assuming the data is not late..) regardless of whether the sending peer has closed. No user would want the credits to be cut off.
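For reference, a minimal sketch of the socket-option interaction in question, assuming libsrt is installed; per the documentation, selecting the live transtype leaves linger off, unlike file mode where it defaults to on:

```c
#include <stdio.h>
#include <sys/socket.h>   /* struct linger */
#include <srt/srt.h>

int main(void)
{
    srt_startup();
    SRTSOCKET s = srt_create_socket();

    /* Select the live transtype; per the documentation this leaves
     * SRTO_LINGER disabled, so re-enabling it afterwards does not
     * give file-mode flushing semantics. */
    SRT_TRANSTYPE tt = SRTT_LIVE;
    srt_setsockflag(s, SRTO_TRANSTYPE, &tt, sizeof tt);

    struct linger lin;
    int len = sizeof lin;
    if (srt_getsockflag(s, SRTO_LINGER, &lin, &len) != SRT_ERROR)
        printf("live mode: linger onoff=%d time=%d\n",
               lin.l_onoff, (int)lin.l_linger);

    srt_close(s);
    srt_cleanup();
    return 0;
}
```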

@sylvain-apivideo sylvain-apivideo added the Type: Question Questions or things that require clarification label Jul 6, 2023
@ethouris
Collaborator

ethouris commented Jul 6, 2023

In live mode, SRT serves only as a pass-through: you receive the data for as long as it is "in the flow".

In order to force reading to stop on the receiver side, while still having all the data received up to that moment delivered to the application, you would require a feature like a "shutdown" in a particular direction. It would stop the socket from delivering any more data (incoming data packets are ignored), but it would only lock a particular sequence number, after which no more packets are accepted into the receiver buffer (any numbers that precede it might still undergo retransmission, if lost, for example). In the meantime, the application could read whatever remains in the receiver buffer.

This is, moreover, hard to implement because of the "congestion-blow problem": when the sender keeps sending data that is neither read nor ACK-ed, at some point the maximum sequence number for the current buffer state is exceeded and there is no other option than to break the connection (which will likely happen even before the app has a chance to read the remaining data). This is how the problem is solved now, and it would stand in the way of implementing this "shutdown" feature.

In general, live mode isn't data transmission. You transmit pictures and sound in the rhythm in which they happen live, not "data". In this case there are only two main states: either you can transmit and receive them at the moment, and so you do, or you can't, and the connection must be broken. There is one additional intermediate state: if you exceed the current transmission cap at the moment, some data can be dropped (at least with default settings). When the connection is broken, there is no more transmission at all; it's not just that the socket is closed.

File mode is completely different: you transmit the data as fast as you can, slowing down as much as necessary to get a sensible throughput and a low loss rate, and when the sending side closes the socket, you read the remaining data and the connection gets broken at the end. The application is free to wait as long as it wants before extracting the data from the socket, within what the "linger" settings allow.
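To contrast with live mode, a hedged sketch of the file-mode setup described here (assuming libsrt; the 180-second value follows the documented SRTO_LINGER default for file mode):

```c
#include <sys/socket.h>   /* struct linger */
#include <srt/srt.h>

/* Sketch: a file-mode socket where srt_close() honors linger, so the
 * peer can keep reading buffered data after the sender closes. */
SRTSOCKET make_file_mode_socket(void)
{
    SRTSOCKET s = srt_create_socket();

    SRT_TRANSTYPE tt = SRTT_FILE;
    srt_setsockflag(s, SRTO_TRANSTYPE, &tt, sizeof tt);

    struct linger lin = { .l_onoff = 1, .l_linger = 180 };
    srt_setsockflag(s, SRTO_LINGER, &lin, sizeof lin);

    return s;
}
```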

So these are the problems that would have to be overcome first in order to implement this. The only way to do it is through some kind of "shutdown" or "pause" applied to the socket, which makes incoming data ignored, or even fake-ACK-ed; you read what's left, and then you can close the socket, or even unblock it and read again. Implementing such a thing is a lot of work.

@sylvain-apivideo
Author

Thank you very much for your reply ethouris.

The only way to do this is through some kind of "shutdown" or "pause" applied to the socket

I thought a "shutdown" control command was already sent when the remote peer closes properly.
Wouldn't the receiving peer be able to perform its flushing upon receiving it, before the connection is effectively closed?
I am saying this without any deep knowledge of the protocol, so it is only a question.

In any case, every component in a Live workflow includes some kind of buffering, including SRT. Pictures and sound are transmitted at one moment and received a moment later, and if the protocol requires some kind of communication with the emitting peer in order to deliver them to the application, then it should also be its job to handle a shutdown phase. Otherwise, up to the "receive latency" worth of pictures and sound could be dropped by the SRT protocol upon closing. Right?

I would find it useful to address this issue, as it makes the Live version of the protocol non-deterministic, which is a pain for testing. Consider, for instance, a file being streamed over SRT in Live mode.

I would also suggest enhancing the documentation to state that linger is not compatible with TSBPD mode.

Regards

@ethouris
Collaborator

ethouris commented Jul 7, 2023

It's not "that" shutdown; I meant a "shutdown" like the one done on a TCP socket (see man 2 shutdown on POSIX systems).

The buffer you are talking about is indeed there: it is the receiver buffer. In live mode, part of this buffer is also used to keep packets whose play time has not yet come; they stay there until it does, as decided by the TSBPD thread. But normally in live mode you expect to keep a very low latency, and if your transmission is going to terminate, it usually doesn't matter whether it terminates now or in the next half a second. A completely different case is when the transmission should end because there is nothing more to send. In that case you should simply keep the connection open, even without sending any data, for long enough that the remaining pictures can be read. How to handle the end-of-transmission and display it correctly to the user is the application's problem.

This means that:

  1. If the connection was broken (due to a broken link on the network/UDP level), it's simply a broken transmission; no one cares how much data was supposed to be sent but was effectively lost. It's an error case that we never expect to happen.
  2. If the connection was closed on the sender side because the sender has no more data to send, but it was done WITHOUT this kind of "quiet period", then the application on the sender side is incorrectly written and should be fixed. The minimum time it should wait before closing the socket is the latency + 2 * average RTT.
  3. If the sender application has closed the socket because of some urgent necessity (still an erroneous situation), then it's more or less the same situation as a broken link.

So, I can understand that the application may want to read all the remaining data when the transmission has ended, so that all pictures and sound sent up to the very end are retrieved and played. But if the transmission is unexpectedly terminated, this isn't the case.

@sylvain-apivideo
Author

I was only concerned about the case where the connection is closed intentionally.

Sadly, in this situation I am the receiving peer, and I believe most of the tools that implement an SRT output won't perform this "quiet period", simply because they expect the protocol to handle it. They send the data, then call srt_close as soon as there is nothing left to send, as they would for TS over UDP. I've experienced this with both ffmpeg and tsduck.

Thank you for taking the time to answer the question.

@maxsharabayko
Collaborator

Feature Request

  1. The sender closes the connection once streaming ends (indicating the "no more data" reason for closing, see FR [FR] Reason for closing a connection #2638). Communication medium: the SHUTDOWN control packet.
    ⚠️ Closing a connection also means the end of packet loss recovery (retransmissions). When the sending application wants all packets to be delivered (or dropped as too late), it should wait for the sender's buffer to become empty (SRTO_SNDDATA) before closing the socket.

  2. The receiver gets the shutdown event, but may still have, say, SRTO_LATENCY worth of the bitrate in the RCV buffer. There is no need to close the socket straight away. The state of the connection should change, though, to indicate that the peer is offline and that no packets (keepalive or otherwise) should be sent. But reading must still be possible until the application decides to close the socket. See also srt_send() error SRT_ECONNLOST or SRT_EINVSOCK depending on peer disconnection time #2098.
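The sender-side precaution in point 1 could look roughly like this (a sketch assuming libsrt; `sock` is a connected live-mode socket, and the 10 ms poll interval and deadline are arbitrary choices):

```c
#include <srt/srt.h>
#include <unistd.h>

/* Sketch: before closing, poll SRTO_SNDDATA (packets still in the send
 * buffer) until it drains or a deadline passes, so loss recovery can
 * finish for everything already submitted. */
void close_after_drain(SRTSOCKET sock, int max_wait_ms)
{
    int waited_ms = 0;
    while (waited_ms < max_wait_ms) {
        int pending = 0;
        int len = sizeof pending;
        if (srt_getsockflag(sock, SRTO_SNDDATA, &pending, &len) == SRT_ERROR)
            break;              /* socket already broken: just close */
        if (pending == 0)
            break;              /* everything ACK-ed or dropped as too late */
        usleep(10 * 1000);      /* re-check every 10 ms */
        waited_ms += 10;
    }
    srt_close(sock);
}
```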

@maxsharabayko maxsharabayko added Type: Enhancement Indicates new feature requests [core] Area: Changes in SRT library core and removed Type: Question Questions or things that require clarification labels Aug 22, 2023
@maxsharabayko maxsharabayko added this to the Major milestone Aug 22, 2023