fix: ack frame #352

Closed
hax0r31337 wants to merge 1 commit into ValveSoftware:master from hax0r31337:master

@hax0r31337

There was a typo in the check on the size of latest_received_pkt_num.
Since the frame type mask is 0b10010000, nFrameType & 0x40 will always be zero.
I changed the mask to 0x10 to match the format documentation.
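
To make the "always zero" claim concrete, here is a minimal sketch that enumerates every possible lead byte and counts how many both pass the ack-frame check and have a given bit set. The helper names `IsAckFrameLeadByte` and `CountAckLeadBytesWithBit` are hypothetical and exist only for this illustration; the branch condition is the one quoted in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when the lead byte matches the 1001xxxx pattern,
// i.e. the condition ( nFrameType & 0xf0 ) == 0x90 quoted in this thread.
constexpr bool IsAckFrameLeadByte( uint8_t nFrameType )
{
    return ( nFrameType & 0xf0 ) == 0x90;
}

// Hypothetical helper: counts the lead bytes that pass the ack-frame check
// AND have at least one of the bits in 'mask' set.
constexpr int CountAckLeadBytesWithBit( uint8_t mask )
{
    int n = 0;
    for ( int b = 0; b < 256; ++b )
    {
        if ( IsAckFrameLeadByte( static_cast<uint8_t>( b ) ) && ( b & mask ) != 0 )
            ++n;
    }
    return n;
}

// Bit 0x40 lies inside the 0xf0 nibble that the branch pins to 0x9,
// so no lead byte that reaches the branch can have it set.
static_assert( CountAckLeadBytesWithBit( 0x40 ) == 0,
               "a 0x40 test inside the ack branch is dead code" );
```

Only the low four bits can vary once the branch is taken, so any flag that the decoder tests there has to live in `0x0f`.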

@hax0r31337
Author

I also have another question: why bother using a varint in the select lane frame if the maximum acceptable lane number is 255?

@zpostfacto
Contributor

Hey, this looks like really great stuff!

I'm swamped right now, but thanks so much for digging into the wire encoding. It looks like you've found a good bug, and are asking good questions. I'll try to get back to you soon.

@hussein-aitlahcen
Contributor

@zpostfacto gentle ping on this :)

zpostfacto added a commit that referenced this pull request Apr 15, 2026
This addresses issue #352.  The problem was more subtle than the
original bug report.  The decoder indeed had a bug, but everything
worked because the encoder *also* had a bug!  The encoder always
sends a lead byte of 0x98 (10011000), which incorrectly sets
the w bit to 1; according to the (previous) spec, the packet
number should then have been encoded using 32 bits.  The decoder had a
bug checking the wrong bit, 0x40, when w is actually in bit 0x08.
But this was dead code because we only get into that branch
`if ( ( nFrameType & 0xf0 ) == 0x90 )`, and in that case bit
0x40 cannot be set.

The change is subtle because we cannot just fix encoders/decoders
to match the spec without risking breaking interop with existing
code deployments.

My solution:
- Change the spec, inverting the meaning of the w flag, to match
  the current encoding behaviour.
- Fix decoders to be compliant with this new spec.
- Bump the protocol version so that, in the future, encoders
  could choose to use 32-bit packet numbers safely, knowing
  whether or not the peer was capable of decoding it properly.
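
A rough sketch of what a decoder following the inverted flag could look like. This is one reading of the description above, not the committed code: it assumes that w = 1 (bit 0x08) now means a 16-bit packet number and w = 0 means 32 bits, and it assumes a little-endian field. The names `k_nAckLeadByteWBit`, `AckPktNumField` and `ReadLatestReceivedPktNum` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical name for the w flag; the commit message places it in bit 0x08.
constexpr uint8_t k_nAckLeadByteWBit = 0x08;

// Hypothetical result type: decoded value, bytes consumed, and a success flag.
struct AckPktNumField
{
    uint32_t nValue;
    size_t cbSize;
    bool bOK;
};

// Reads the packet number field from 'p', choosing the width from the w bit.
// ASSUMPTION: inverted meaning (w set -> 2 bytes, w clear -> 4 bytes).
// ASSUMPTION: little-endian byte order on the wire.
inline AckPktNumField ReadLatestReceivedPktNum( uint8_t nFrameType, const uint8_t *p, size_t cbAvail )
{
    const size_t cb = ( nFrameType & k_nAckLeadByteWBit ) ? 2 : 4;
    if ( cbAvail < cb )
        return { 0, 0, false }; // truncated frame, nothing consumed

    uint32_t v = 0;
    for ( size_t i = 0; i < cb; ++i )
        v |= static_cast<uint32_t>( p[i] ) << ( 8 * i );
    return { v, cb, true };
}
```

Under these assumptions the lead byte 0x98 that existing encoders send keeps decoding as a 16-bit packet number, which is the interop property the solution is after.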

I did not change the encoder logic at this time.  The need to
encode 32 bits of the packet number is a theoretical possibility
that might become necessary if the bandwidth-delay product
gets very high.  (When the number of packets in flight is high,
the bottom 16 bits risk becoming ambiguous for identifying a packet.)
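
The ambiguity can be illustrated with a generic "expand the low bits around a reference" routine. This is a common technique for truncated sequence numbers and is only a sketch; it is not claimed to be how this library resolves packet numbers, and `ExpandPktNum16` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: picks the full packet number closest to 'nReference'
// whose low 16 bits equal 'nLow16'. Only candidates within roughly +/- 32768
// of the reference can be recovered correctly.
inline int64_t ExpandPktNum16( int64_t nReference, uint16_t nLow16 )
{
    // Difference of the low 16 bits, wrapped modulo 65536.
    int32_t nDelta = static_cast<uint16_t>( nLow16 - static_cast<uint16_t>( nReference ) );

    // Interpret the upper half of the range as a negative offset.
    if ( nDelta >= 0x8000 )
        nDelta -= 0x10000;

    return nReference + nDelta;
}
```

With a reference of 100000, a packet numbered 100010 is recovered exactly from its low 16 bits, but a packet 40000 ahead of the reference is mapped to a number behind it. That is the failure mode a 32-bit encoding would avoid when many packets are in flight.
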
@zpostfacto
Contributor

Thanks for finding this bug. I was not able to take your PR exactly as is, because the fix ended up being more subtle in order to maintain interop with broken peers that had the bug. Sorry for the extreme delay in responding. Thanks again!

@zpostfacto zpostfacto closed this Apr 15, 2026
zpostfacto added a commit that referenced this pull request Apr 16, 2026
(cherry picked from commit d707afb)