Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

send-error message not always sent / wrong order #77

Closed
lgrahl opened this issue Jul 9, 2018 · 3 comments
Closed

send-error message not always sent / wrong order #77

lgrahl opened this issue Jul 9, 2018 · 3 comments
Assignees
Labels

Comments

@lgrahl
Copy link
Member

lgrahl commented Jul 9, 2018

When a relay message is being enqueued (as a task from id 0x01) and the subsequent ping (to 0x02) times out, the resulting PingTimeoutError (on a task of 0x02) for some reason prevents the send-error routine from being called.

The relay send task on 0x02 is cancelled and the relay task waiting for the send task to complete on 0x01 doesn't get that information.

@lgrahl lgrahl added the bug label Jul 9, 2018
@lgrahl
Copy link
Member Author

lgrahl commented Jul 9, 2018

Related: threema-ch/threema-web#505

@lgrahl lgrahl self-assigned this Jul 9, 2018
@lgrahl
Copy link
Member Author

lgrahl commented Jul 9, 2018

Okay, so I pinned it down: The problem is that we cannot determine which message has been lost, or in fact whether a message has been lost at all in all circumstances. As an example:

  1. The initiator sends a relay message to the server.
  2. The connection from the server to the responder has already been lost but the server doesn't know yet.
  3. The server relays the message to the responder, i.e. it calls send on the socket. And that send call succeeds because, again, the server doesn't know yet that the connection has been lost.
  4. After a while, the server gets a TCP timeout (or a ping timeout) for the responder. But it has no idea whether messages have been lost or not.
  5. Since the original send call succeeded, it only sends a disconnected message to the initiator.

To summarise: We do not know where the stream has been cut off. Thus we cannot (and this is very important to grasp) reliably tell that no data has been lost unless the connection to the other peer has been closed gracefully. Because of this, we may need to re-evaluate the usefulness of send-error. A simple flag in the disconnected message telling the peer how the connection has been closed would suffice. send-error is quite complex in the server and prone to error.

/cc @dbrgn

@lgrahl
Copy link
Member Author

lgrahl commented Jul 9, 2018

This issue is still open since I still need to fix that a send-error can be sent after a disconnected message.

lgrahl added a commit that referenced this issue Jul 9, 2018
Do not cancel the task loop but wait for it and only kill it after 10 minutes of inactivity (useful for finding faulty tasks)
Explicitly log that a relayed message has been sent (well, at least on a socket level)
Fix always send `send-error` messages first before sending a `disconnected` message, see #77
Add a test for the case where a message is being relayed but the connection has been lost
@lgrahl lgrahl changed the title send-error message not always sent send-error message not always sent / wrong order Jul 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant