Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use memory-aligned data structures in qmux protocol #22

Closed
awakecoding opened this issue Jun 11, 2021 · 6 comments
Closed

Use memory-aligned data structures in qmux protocol #22

awakecoding opened this issue Jun 11, 2021 · 6 comments

Comments

@awakecoding
Copy link

I am currently looking into implementing a TCP-in-TCP connection tunneling protocol very similar to qmux or OpenSSH in Devolutions Gateway + jetsocat, a project written in Rust. The cleaned up protocol specification used in qmux is exactly what I'm looking for, but I've noticed that the message headers haven't been modified to respect memory alignment rules commonly enforced in certain types of processors like ARM. To read a uint32 (4-bytes) then it should be at a memory address that is a multiple of 4 bytes, etc. This is the primary reason why a lot of protocols try hard to align their data structures, and why compilers automatically pad data structures that don't respect those rules. You can still read data structures that are not memory aligned, but it comes with a performance cost.

I know this would be a breaking change, but would you be interested in making the change anyway? I will likely implement the memory aligned variant on my side and keep everything else as-is. Here is what the current header structure looks like:

  byte      QMUX_MSG_CHANNEL_OPEN
  uint32    sender channel
  uint32    initial window size
  uint32    maximum packet size

And what a memory-aligned variant would look like:

  byte      QMUX_MSG_CHANNEL_OPEN
  byte     reserved[3]
  uint32    sender channel
  uint32    initial window size
  uint32    maximum packet size

This would free 1 byte and a uint16 for other use like flags, etc, if you have such needs. What do you think? Everything else that follows would be nicely memory-aligned with those 3 bytes.

@progrium
Copy link
Owner

Oh, fascinating. Definitely open to it. For reference/docs here is more info I found: https://developer.ibm.com/technologies/systems/articles/pa-dalign/

So this would be for all messages I'm assuming?

@awakecoding
Copy link
Author

Yes, this would be for all messages. If am I to fix the message alignment, I might as well define a common header structure to make it much easier to know the expected message size. I will make significant changes to the document structure, and then you can take a look and see if you want to adopt the changes. It would be great if we end up both using a common protocol, otherwise it's no big deal, I'll just use a different name. I also want to ensure that the protocol can evolve with some backward compatibility.

@awakecoding
Copy link
Author

@progrium okay, I've begun my work and I am clearly going to redesign the whole thing my way at this point, so you may not want to the final result when I'm done. We still have a common goal, so there are bits and pieces we can share. I've reviewed QMUX and the original SSH protocol, and there's one thing that eludes me so far: how would you support SOCKS5 on top of QMUX?

OpenSSH has this nice SOCKS5 tunneling feature that is an improvement over the regular TCP tunnel, and I intend to support SOCKS5 as well. However, I couldn't find the part of QMUX or the original SSH protocol that lets you specify the destination of the channel to open (IP address or hostname + port). Unlike a tunnel with a fixed destination, SOCKS5 requires that you specify where to connect. Is this something you wanted for QMUX as well?

@awakecoding
Copy link
Author

Here's my draft revised protocol draft, let me know if you're interested, otherwise you can just close this issue and I'll move on with my own variant: https://github.com/awakecoding/qmux/blob/protocol-update/SPEC.md

@progrium
Copy link
Owner

I like some of the improvements but you're right we might be diverging pretty quickly. Is there a benchmark or test that could be done to show the performance difference of memory-aligning vs not (assuming a target platform/architecture)?

@awakecoding
Copy link
Author

I don't have benchmarks, but to be honest, the performance will likely be very similar. It is just a good idea to pre-pad the data structures so the compilers won't, making it much easier to just transfer data structures as-is with minimal transformation between the wire format and the memory format (aside from little endian vs big endian).

I don't want to steer you in a different direction, I have already modified the protocol beyond recognition with the following:

  • structure memory alignment
  • common header structure
  • maximum fragment size (65535)
  • type-length-value (TLV) encoding
  • move variable-length field to body
  • assign 'variable' names to each field
  • channel destination URL encoding
  • reintroduce reason code and string

This new draft is what I'll use for my V1 of "JMUX" in jetsocat. It'll mostly be interesting to follow what happens on my side, but it's no longer relevant for QMUX unless you wish to significantly change it and take the same direction I'm taking.

I'll close this issue, thanks for your help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants