
--timeout option doesn't work as expected #164

Open
pavan54meena opened this issue Mar 12, 2021 · 1 comment
pavan54meena commented Mar 12, 2021

If I set --timeout=60 and the link between client and server breaks mid-transfer, rsync gets stuck for a seemingly random time (e.g. 15 or 20 minutes). This happens only when I run the rsync command on the client side and the files are being transferred from server to client. Sometimes it also hangs indefinitely.


jenkins-armedia commented Mar 10, 2024

I've encountered this issue as well, in an environment where network connectivity was somewhat restricted (and a bit spotty due to some temporary hardware issues), and I had to use --rsh to establish a connection.

As it turns out, if the connection died in such a way that the "sender" side of the connection (the --rsh process) didn't promptly detect the dead link, the sender rsync instance would continue to ping the far side and never time out, even though the far side wasn't responding.

Reading the code, it appears that this is the intended behavior:

// Taken from io.c
static void check_timeout(BOOL allow_keepalive, int keepalive_flags)
{
    time_t t, chk;

    /* On the receiving side, the generator is now the one that decides
     * when a timeout has occurred.  When it is sifting through a lot of
     * files looking for work, it will be sending keep-alive messages to
     * the sender, and even though the receiver won't be sending/receiving
     * anything (not even keep-alive messages), the successful writes to
     * the sender will keep things going.  If the receiver is actively
     * receiving data, it will ensure that the generator knows that it is
     * not idle by sending the generator keep-alive messages (since the
     * generator might be blocked trying to send checksums, it needs to
     * know that the receiver is active).  Thus, as long as one or the
     * other is successfully doing work, the generator will not timeout. */
    if (!io_timeout)
        return;
    /* ... rest of function elided ... */
}
IMHO, a correct implementation would require that for every "ping" there be a corresponding "pong" from the far side: if a side receives a ping, it should respond with a "pong" and clear its timeout counters. Thus, if no data is flowing (because none needs to flow), both sides are at least happy with the silence, because they know the link is still alive.
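To make the proposal concrete, here is a toy Python model of that symmetric ping/pong keep-alive. This is a sketch, not rsync code; the `Link` class and the `PING_INTERVAL`/`IO_TIMEOUT` names are illustrative assumptions. The point is only that the timeout deadline is reset by hearing from the peer, never by our own successful writes:

```python
import time

PING_INTERVAL = 10   # seconds between keep-alive pings on an idle link
IO_TIMEOUT = 60      # give up if the peer stays silent this long

class Link:
    """Toy model of a symmetric ping/pong keep-alive."""

    def __init__(self, now=None):
        now = time.monotonic() if now is None else now
        self.last_heard = now      # last time the peer proved it was alive
        self.last_ping_sent = now

    def tick(self, now, send, received):
        # Ping the peer if we have been quiet for a while.
        if now - self.last_ping_sent >= PING_INTERVAL:
            send("PING")
            self.last_ping_sent = now
        # A PING from the peer gets an immediate PONG; any PING, PONG, or
        # real data resets our idea of when we last heard from the peer.
        for msg in received:
            if msg == "PING":
                send("PONG")
            if msg in ("PING", "PONG", "DATA"):
                self.last_heard = now
        # The timeout fires only when the *peer* goes silent -- our own
        # successful writes never keep the link "alive" on their own.
        if now - self.last_heard > IO_TIMEOUT:
            raise TimeoutError("peer unresponsive")
```

Under this scheme, the hang described above can't happen: a sender whose pings go unanswered stops hearing from the peer and trips the timeout, regardless of whether its own writes still appear to succeed.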

This could be ignored when data is flowing normally, for obvious reasons. That said, any side receiving data should also periodically respond with a "ping" or "ack" or whatnot to indicate that the transmitted data did reach its destination. This would be sufficient for both sides to safely assume that the link remains valid.

The problem with the current approach is that the link may die in some hard-to-detect way, and one of the sides will not time out in a timely fashion. I was forced to work some Python magic and write a small wrapper to handle this case. However, this seems like something the --timeout flag should take care of.
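For anyone hitting the same hang, a wrapper along those lines might look like this. This is a hedged sketch, not the actual wrapper I wrote: `run_with_watchdog` is a hypothetical helper that kills the child process if it produces no output for `idle_timeout` seconds, which works as an external stand-in for the missing timeout (Unix only, since the `selectors` module does not support pipes on Windows):

```python
import selectors
import subprocess
import sys
import time

def run_with_watchdog(cmd, idle_timeout=60):
    """Run cmd, killing it if it prints nothing for idle_timeout seconds.

    Returns the process exit code, or None if the watchdog killed it.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ)
    last_activity = time.monotonic()
    try:
        while True:
            # Wait up to a second for output; any output resets the clock.
            if sel.select(timeout=1):
                chunk = proc.stdout.read1(4096)
                if chunk:
                    last_activity = time.monotonic()
                    sys.stdout.buffer.write(chunk)
                elif proc.poll() is not None:
                    return proc.returncode   # EOF and process exited
            if time.monotonic() - last_activity > idle_timeout:
                proc.kill()                  # silent too long: presume dead
                proc.wait()
                return None
    finally:
        sel.close()
```

Usage would be something like `run_with_watchdog(["rsync", "-av", "--progress", src, dst], idle_timeout=60)`; `--progress` (or `-v`) matters here, because the watchdog keys off output rather than network activity.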
