Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Packet received out-of-order" with UseCompression=true #146

Closed
bgrainger opened this issue Dec 7, 2016 · 6 comments
Closed

"Packet received out-of-order" with UseCompression=true #146

bgrainger opened this issue Dec 7, 2016 · 6 comments
Assignees
Labels
Milestone

Comments

@bgrainger
Copy link
Member

@bgrainger bgrainger commented Dec 7, 2016

In #31, @mguinness reported a specific scenario where compressed data is not deserialized correctly.

@bgrainger bgrainger added the bug label Dec 7, 2016
@bgrainger bgrainger added this to the 1.0 milestone Dec 7, 2016
@bgrainger bgrainger self-assigned this Dec 7, 2016
@bgrainger
Copy link
Member Author

@bgrainger bgrainger commented Dec 7, 2016

The final packet being sent from the server is 05 00 00 1B FE 00 00 22 00:

  • 05 00 00: packet length = 5
  • 1B: packet sequence number = 27
  • FE 00 00 22 00: EOF, no warnings, flags = AutoCommit | NoIndexUsed

This packet is getting split in two pieces: 05 is being sent as the last byte in a compressed payload (2K raw data compressed to 676 bytes, sequence number 1), then 8 bytes being sent uncompressed (sequence number 2).

It appears that in this case, the sequence number of the uncompressed packet is not reset to the sequence number of the containing compressed packet, possibly because it started being written in the compressed packet (sequence number 1).

Given that I've found so many edge cases with packet sequence numbering (when compression is on), that it was solely intended as a diagnostic for the deserialization code (to catch egregious coding errors), and that that code seems pretty stable now (and we have a good suite of integration tests), I'm inclined to simply disable the checks when compression is on, rather than try to match MySQL Server's logic for numbering packets.

@mguinness
Copy link
Contributor

@mguinness mguinness commented Dec 7, 2016

Thanks for reviewing and fixing. While I understand your logic regarding packet sequence numbering, what is the implication of disabling the checks? Could that mean the possibility of corrupted data, or are there also checksums in place? Just curious.

@bgrainger
Copy link
Member Author

@bgrainger bgrainger commented Dec 7, 2016

The data being received from the server is still going through packet sequence number checks at the compressed packet level (i.e., this driver will expect compressed packet 1, then 2, then 3, etc.). That should be a primary defense against bugs in this library dropping data entirely.

The check that was removed was checking the sequence numbers of packets inside the compressed blob. In this case, they would go 1, 2, 3, ... 27. In the next compressed packet, they were expected to be reset to 2, 3, 4, ... n. However, the first packet read out of compressed packet 2 actually had sequence number 27, presumably because it was split across the compressed packets

Resetting the uncompressed sequence number with every compressed packet (as MySQL Server does) actually prevents the client from checking that it's received all expected packets in the right order. Let's say there were 50 packets split into two compressed packets. The first group would be numbered 1-27 and the second group would be numbered 2-24 (not 28-50 as you might expect). There's actually no way to know whether you got packet 28 or not!

Checking the sequence numbers ensures that we're not skipping over data within the compressed blob. However, since the data is read linearly (there is no seeking), it seems extremely unlikely to me that disabling packet sequence number checks within compressed data could lead to silent data corruption.

@bgrainger
Copy link
Member Author

@bgrainger bgrainger commented Dec 7, 2016

And FWIW, I have not seen compression provide a net benefit (in any testing I've done). It burns MySQL Server CPU (which could be used to run queries instead) to compress the data. If it's being sent over a high-speed network link to a web server, the I/O savings are usually extremely minimal.

There may be some benefit if your DB server is separated from your client by a high-latency, low-throughput link, but that's an atypical use case (in my experience).

@bgrainger
Copy link
Member Author

@bgrainger bgrainger commented Dec 7, 2016

Fixed in 0.7.4.

@mguinness
Copy link
Contributor

@mguinness mguinness commented Dec 7, 2016

Thanks for the thorough explanation and your observations regarding compression effectiveness.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants