
Handle ordering bin packets #35

Merged
Totodore merged 12 commits into Totodore:main from biryukovmaxim:handle_ordering_bin_packets on Jul 20, 2023

Conversation

biryukovmaxim
Contributor

The current mechanism for retrying/resending packets in socketioxide can break the flow if a retry is launched while another binary payload is being sent.
Moreover, if several packets are sent and one of them includes a binary attachment, the flow can also break.

Suggested solution:

  1. Create a new private struct PacketSender containing the channel and a queue of failed messages.
  2. Socket owns the PacketSender, guarded by a mutex.
  3. When the user tries to send a packet, it locks the mutex and uses the packet sender.

Internal packet sender logic:

  1. If the main message is sent successfully but sending the attachments fails, the packet sender saves a queue of the failed messages internally. They can be sent on the next send call, before the new packet goes out, or via a send_binaries call (made privately from Socket).
  2. If, after a new send call, sending the failed queue of attachments fails again, it returns SendFailedBinPayloads containing the provided packet, to give the user an opportunity to send it again.
  3. If sending the main message fails, it returns RetryablePacket (a queue of the serialized main packet and binary payloads). This struct has only one method, retry, accepting a socket.

Public API changes:
socket.retry_failed(): tries to resend the failed queue of messages.
AckSender returns an error containing the socket and the SendError, making it possible to resend the ack.
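
A minimal sketch of this design in Rust (the names follow the description above; the real socketioxide types, signatures and error types differ, and Result<(), ()> stands in for the actual errors):

use std::collections::VecDeque;
use std::sync::Mutex;
use tokio::sync::mpsc::{self, error::TrySendError};

type BinPayload = Vec<u8>;

// Owns the outgoing channel plus the queue of binary payloads whose
// main packet already went out but which could not be flushed yet.
struct PacketSender {
    tx: mpsc::Sender<BinPayload>,
    failed: VecDeque<BinPayload>,
}

impl PacketSender {
    fn new(tx: mpsc::Sender<BinPayload>, failed: VecDeque<BinPayload>) -> Self {
        Self { tx, failed }
    }

    // Flush previously failed attachments before anything new goes out,
    // so binary frames keep their order on the wire.
    fn send_buffered_binaries(&mut self) -> Result<(), ()> {
        while let Some(bin) = self.failed.pop_front() {
            match self.tx.try_send(bin) {
                Ok(()) => {}
                Err(TrySendError::Full(bin)) | Err(TrySendError::Closed(bin)) => {
                    self.failed.push_front(bin); // keep it for the next attempt
                    return Err(());
                }
            }
        }
        Ok(())
    }
}

struct Socket {
    sender: Mutex<PacketSender>,
}

impl Socket {
    // The proposed public retry entry point.
    fn retry_failed(&self) -> Result<(), ()> {
        self.sender.lock().unwrap().send_buffered_binaries()
    }
}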

@Totodore
Owner

I don't know if it is worth it. If I understood this well, you are going to implement another channel/queue of messages in the Socket struct in order to keep all failed packets. But in most cases, if packets are not sent, it is because the sending queue is too small. So it seems a bit weird to add another queue to back up those that were not sent previously. It seems redundant.

We are moving away from the original problem which was "how can we give back owned data to the user in case of error".

@biryukovmaxim
Contributor Author

biryukovmaxim commented Jun 26, 2023

> I don't know if it is worth it. If I understood this well, you are going to implement another channel/queue of messages in the Socket struct in order to keep all failed packets. But in most cases, if packets are not sent, it is because the sending queue is too small. So it seems a bit weird to add another queue to back up those that were not sent previously. It seems redundant.
>
> We are moving away from the original problem which was "how can we give back owned data to the user in case of error".

Currently the user is able to break the flow: if the main packet is sent successfully but the attachments are not, what is the user supposed to do if they don't retry sending the binaries?
Note that the packet sender saves only the attachments, and only when the main message was sent successfully but the attachments were not.

@Totodore
Owner

Okay, never mind, I thought it was for all failed packets.

@biryukovmaxim
Contributor Author

biryukovmaxim commented Jun 26, 2023

> Okay, never mind, I thought it was for all failed packets.

If it's impossible to send the failed attachments again during a new send call, it returns an error together with the new, not-yet-serialized message.
If the old failed attachments are sent successfully but the new main message fails, it returns RetryableMessage (containing the attachments and the main message). It's possible to ignore this error and send a new message, or to resend it like retryable.retry(socket).

The main idea of this PR is preventing flow breaks.
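
To make the two failure shapes concrete, here is an illustrative sketch (the exact PR types differ; Packet and Socket are placeholders):

use std::collections::VecDeque;

struct Packet; // placeholder for the real packet type
struct Socket; // placeholder for the real socket type

// The old buffered attachments could not be flushed: the caller gets
// the brand-new, not-yet-serialized packet back, untouched.
struct SendFailedBinPayloads(Packet);

// The main message itself failed: the frames are kept together so the
// whole packet can be resent in order.
struct RetryablePacket {
    frames: VecDeque<Vec<u8>>,
}

impl RetryablePacket {
    // The single method described above: resend through a socket,
    // handing the packet back again if the channel is still full.
    fn retry(self, _socket: &Socket) -> Result<(), RetryablePacket> {
        // ...push `self.frames` through the socket's sender here...
        Err(self)
    }
}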

@biryukovmaxim
Contributor Author

biryukovmaxim commented Jun 26, 2023

Moreover, currently, if the user tries to emit two events containing attachments in parallel, the flow will also break, even without any retry logic.

@Totodore mentioned this pull request Jun 26, 2023
@biryukovmaxim
Contributor Author

@Totodore Any suggestions about the name of GoodNameError? Other suggestions?

@Totodore
Owner

Totodore commented Jun 29, 2023

I would say TransportError or PipeError or ChanError. Something like that

@biryukovmaxim
Contributor Author

done!

@Totodore (Owner) left a comment

It is a good solution; however, it is quite complex, so it requires some docs to understand the process. And I'm afraid it will be hard to add features on top of this (like, for instance, the Remote adapter feature and synchronisation between multiple instances).

Could you add an example in the example crate to show how to manage errors like that?

socketioxide/src/handler.rs (resolved)
socketioxide/src/handler.rs (outdated, resolved)
socketioxide/src/socket.rs (resolved)
@@ -60,6 +170,7 @@ impl<A: Adapter> Socket<A> {
    sid,
    extensions: Extensions::new(),
    config,
    sender: Mutex::new(PacketSender::new(tx, VecDeque::new())),
@Totodore (Owner)

The introduction of a Mutex makes it impossible to send things in a multithreaded way. I know it is necessary, though. But could you benchmark this with a high throughput for one socket, plus the ping/pong delay and memory consumption, to be sure it is not an issue?
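
For reference, a rough, self-contained way to measure that contention (this is not socketioxide's API; it only approximates one socket's Mutex-guarded sender under load):

use std::sync::Arc;
use std::time::Instant;
use tokio::sync::{mpsc, Mutex};

#[tokio::main]
async fn main() {
    const TASKS: usize = 8;
    const MSGS: usize = 100_000;

    let (tx, mut rx) = mpsc::channel::<Vec<u8>>(1024);
    let tx = Arc::new(Mutex::new(tx));

    // Drain the channel so senders are only limited by the lock.
    let drain = tokio::spawn(async move { while rx.recv().await.is_some() {} });

    let start = Instant::now();
    let mut handles = Vec::new();
    for _ in 0..TASKS {
        let tx = Arc::clone(&tx);
        handles.push(tokio::spawn(async move {
            for _ in 0..MSGS {
                // Every send serializes on the Mutex, like the guarded PacketSender.
                tx.lock().await.send(vec![0u8; 16]).await.unwrap();
            }
        }));
    }
    for h in handles {
        h.await.unwrap();
    }
    drop(tx); // last sender gone: the drain task can finish
    drain.await.unwrap();

    println!("{} msgs in {:?}", TASKS * MSGS, start.elapsed());
}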

socketioxide/src/socket.rs (resolved)
socketioxide/src/socket.rs (outdated, resolved)
socketioxide/src/socket.rs (outdated, resolved)
socketioxide/src/socket.rs (outdated, resolved)
socketioxide/src/socket.rs (outdated, resolved)
@@ -61,7 +61,8 @@
//! ```

pub mod adapter;
pub mod retryer;
pub mod errors;
@Totodore (Owner)

Only export the needed errors, like below, and not the whole mod.

@biryukovmaxim (Contributor, Author)

I didn't get you. I used this mod in the example to handle errors. Did you mean it needs to be split into public and private error modules?

@Totodore (Owner)

Yes, just like line 68, where it is already done (only AckError and Error are exported from the root level).
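
That is, something along these lines in the crate root (a sketch of the suggestion, not the final code):

// lib.rs
mod errors; // keep the module private

// Re-export only the error types users actually need at the root level.
pub use errors::{AckError, Error};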

@biryukovmaxim marked this pull request as draft June 30, 2023 11:28
@biryukovmaxim force-pushed the handle_ordering_bin_packets branch 6 times, most recently from 23e47bf to fb409f2, July 1, 2023 15:36
@biryukovmaxim marked this pull request as ready for review July 1, 2023 15:51
@Totodore (Owner) left a comment

As I said before, I know that it is necessary to have a correct retry mechanism. I think it is really important to abstract things for the user. As we can see in the example, it is really painful to do all that each time we want to emit something.

For instance, we could add automatic retry in case of failure, configurable from the SocketIoConfig. Or just try to re-emit failed packets in the background on the next .emit() call.
Another way that we didn't consider (and it's a bit too late maybe) is to have an .on_error which is called when there are failed packets...

@@ -0,0 +1,82 @@
//! This is an end to end test server used with this [test suite](https://github.com/socketio/socket.io-protocol)
@Totodore (Owner)

Maybe add a little description about the example

Comment on lines 22 to 24
.ping_interval(Duration::from_millis(300))
.ping_timeout(Duration::from_millis(200))
.max_payload(1e6 as u64)
@Totodore (Owner)

These weird configs are used only for the test suites and shouldn't be in the examples (that's why I moved the e2e testing binaries out of the examples).

@biryukovmaxim (Contributor, Author)

Not sure it should be in the examples at all; it's not mandatory to handle this error.

Comment on lines 35 to 66

socket.on("message", |socket, data: Value, bin, _| async move {
info!("Received event: {:?} {:?}", data, bin);
if let Err(BroadcastError::SendError(mut errors)) =
socket.bin(bin).emit("message-back", data)
{
let err = errors.pop().unwrap();
if let SendError::TransportError(TransportError::SendFailedBinPayloads(_)) = err
{
while let Err(TransportError::SendFailedBinPayloads(_)) =
socket.retry_failed()
{
sleep(Duration::from_millis(10)).await;
info!("retry");
}
}
}
});

socket.on("message-with-ack", |_, data: Value, bin, ack| async move {
info!("Received event: {:?} {:?}", data, bin);
if let Err(AckSenderError::SendError {
send_error: SendError::TransportError(TransportError::SendFailedBinPayloads(_)),
socket,
}) = ack.bin(bin).send(data)
{
while let Err(TransportError::SendFailedBinPayloads(_)) = socket.retry_failed()
{
sleep(Duration::from_millis(10)).await;
info!("retry");
}
}
});
@Totodore (Owner)

The control flow is hard and the retry mechanism complex... It requires some comments so the reader can easily understand what is happening.

@biryukovmaxim
Contributor Author

biryukovmaxim commented Jul 2, 2023

> As I said before, I know that it is necessary to have a correct retry mechanism. I think it is really important to abstract things for the user. As we can see in the example, it is really painful to do all that each time we want to emit something.

Actually, any call to emit triggers send_buffered_binaries to send the previously failed binary packets first. If the user ignores all errors it is not a real problem; the packet simply won't be delivered when the error contains the main packet or a retryable packet. In the example I showed that it's possible to call retry manually; this can be useful if the user isn't going to send new packets anymore but still cares about the delivery of the packet they tried to send before.

@Totodore
Owner

Totodore commented Jul 2, 2023

So I have a suggestion... Currently I'm not very satisfied with the end-user experience for someone who wants to manage the SendError when emitting a packet. I think you also agree that it is a bit complicated to handle this each time you want to be sure that a message will be sent:

if let Err(BroadcastError::SendError(mut errors)) =
    socket.bin(bin).emit("message-back", data)
{
    let err = errors.pop().unwrap();
    if let SendError::TransportError(TransportError::SendFailedBinPayloads(_)) = err {
        while let Err(TransportError::SendFailedBinPayloads(_)) = socket.retry_failed() {
            sleep(Duration::from_millis(10)).await;
            info!("retry");
        }
    }
}

Moreover, it is seen as bad practice to discard errors, so it will be weird for the user to almost always discard SendErrors... Also, the send-error-with-retry case will only happen when the channel is full (so it will be really exceptional, and generally the result of a misconfiguration).

What I propose is:

  • We keep your solution of failed-packet buffering and binary packet ordering.
  • We return nothing when emitting (apart from a possible AdapterError).
  • We add an on_error callback, exactly like the on_disconnect in PR feat(socketioxide): disconnect handler #41 and in the TS server implementation; this would allow forwarding all unhandled internal errors to this callback and notifying the user when there is a sending problem (an error mentioning packets that will probably be resent). See the sketch after this list.
  • Or we print warn! or error! logs when there is an issue.
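
A hypothetical sketch of what that callback could look like (this API does not exist yet; all names are illustrative):

use std::sync::Mutex;

#[derive(Debug)]
enum SocketError {
    SendFailedBinPayloads,
    ChannelFull,
}

type ErrorHandler = Box<dyn Fn(&SocketError) + Send + Sync>;

#[derive(Default)]
struct Socket {
    on_error: Mutex<Option<ErrorHandler>>,
}

impl Socket {
    // Register the handler, mirroring how on_disconnect works in PR #41.
    fn on_error<F>(&self, f: F)
    where
        F: Fn(&SocketError) + Send + Sync + 'static,
    {
        *self.on_error.lock().unwrap() = Some(Box::new(f));
    }

    // Internal send path: forward the error instead of returning it.
    fn notify_error(&self, err: SocketError) {
        if let Some(handler) = self.on_error.lock().unwrap().as_ref() {
            handler(&err);
        } else {
            eprintln!("socket error: {err:?}"); // fallback: just log
        }
    }
}

emit would then call notify_error internally instead of returning the SendError to the caller.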

I know that it is a bit too late and that you did a lot of work on returning SendError to the user, so I'm sorry I didn't realize this earlier.

What do you think, @biryukovmaxim?

@biryukovmaxim
Contributor Author

I like the idea of using a special handler more than handling every emit call; however, the handler has to receive an Error enum as input, so the user still has to match on this error. It will require almost the same code from the user as in the example, but with more error variants.

I think we can keep the internal logic the same as in the PR, so that public functions like emit call the on_error handler in case of error and return unit ().
Is that what you mean?

@biryukovmaxim
Contributor Author

I think it's possible to use warn and return unit right now, and make the errors private again. In a further PR we can implement the error-handler logic to handle errors in a proper way.
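
Roughly like this (the emit signature is illustrative, and eprintln! stands in for the crate's actual warn logging):

#[derive(Debug)]
struct SendError; // stand-in for the real error type

struct Socket;

impl Socket {
    fn try_emit(&self, _event: &str, _data: Vec<u8>) -> Result<(), SendError> {
        Ok(()) // real sending logic elided
    }

    // Public emit: warn on failure and return unit, keeping the error private.
    pub fn emit(&self, event: &str, data: Vec<u8>) {
        if let Err(err) = self.try_emit(event, data) {
            eprintln!("warn: packet not sent: {err:?}");
        }
    }
}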

@Totodore
Owner

Totodore commented Jul 2, 2023

> I think it's possible to use warn and return unit right now, and make the errors private again. In a further PR we can implement the error-handler logic to handle errors in a proper way.

Let's do this! :)

@Totodore added the bug (Something isn't working) and enhancement (New feature or request) labels Jul 3, 2023
@Totodore (Owner) left a comment

So now we can merge this, and we will add the on_error handler in another PR.
Do you agree, @biryukovmaxim?

@Totodore dismissed their stale review July 20, 2023 15:37

changes are OK

@Totodore merged commit bb27e92 into Totodore:main Jul 20, 2023
5 of 6 checks passed