New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replaced parse_descriptor() function, fixed some overruns #1460
base: master
Are you sure you want to change the base?
Conversation
To my personal taste - I like this (explicit) way better. I've reviewed it, and the change looks good. |
The change is not just one of taste, but correctness too. The old way was assuming specific struct alignment, and that struct fields did not have padding in between.
Thanks for reviewing.
I did write pretty detailed commit messages... is it useful to repeat in the PR description? |
Oh, I didn't see the commit message(s). NOTE: libusb does not use squash-merge functionality github provides. |
Really?!?! Why? That really reduces the usefulness of |
A very simple and verified over time philosophy: a single PR should contain a single independent change and solve a single "problem". If a PR is more than one commit - those are either not independent or should be split into separate PRs. |
This PR does solve a single "problem". The problem of If the official libusb policy is indeed 1 commit per PR, then it really should be documented in the |
No, there is no rule saying a PR gets squashed. We will certainly want to squash fixup commits and other WIP artifacts, but if each commit is an independent change they go in one by one. Sometimes it is for instance good to separate preparatory commits from those commits that really make a change, it makes it easier to review up front, and to understand and verify afterwards. |
And if and when we squash commits, no commit messages gets lost. When you use "squash" in git rebase it will add all commit messages, then you edit it before committing. |
Alright, looks like my observatory assumption is not correct. Some habbit of mine to generalize things I see consistently. But in any case I find it way more convenient to see the summary of the PR upfront, rather than having to look for individual commit messages. Maybe it's just my habbit. |
After a quick look at these, I see no need to squash them. OK, they all improve parse_descriptor, but in different ways. Each of them would have made sense without the others. A bunch of easy-to-review commits is better than huge commits where you have to look back and forth to see what pieces play together. And the larger the commit the more sure there will be something wrong that slips in and is missed in a review. I think most regressions I have found are from mega-commits where too many things were changed at a time. |
To my experience - that's a good reason to have separate PRs, not just separate commits. |
If you click on the "..." after each commit summary on this web page (e.g. under "seanm added 6 commits"), there is a drop-down with the full commit message,. Most of the time it is the opposite problem, people write a lot in the PR description (which doesn't get into the repo) while their commit messages are poor or empty. |
Sometimes they depend on each other to avoid conflicts, and it is just a saner way to work both for coder and reviewer. They fit together logically so they are best treated in the same PR. There is a risk that discussion about one commit delays the others, but it is rarely a big problem. The committer can in such cases also choose to cherry-pick the easy ones, and leave the questionable commits to be fixed up (and rebased if needed). |
This commit, apparently introduces an issue:
At least on MSVC x64 compilers. |
7ae012b
to
83942ce
Compare
@@ -127,7 +126,7 @@ static int parse_endpoint(struct libusb_context *ctx, | |||
|
|||
/* Copy any unknown descriptors into a storage area for drivers */ | |||
/* to later parse */ | |||
len = (int)(buffer - begin); | |||
intptr_t len = (intptr_t)(buffer - begin); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now, when I looke a this - what's the point of using intptr_t
and not using int
as it was originally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Admittedly, not much.
The old cast is theoretically truncating (64 bit pointers, 32 bit ints), but of course they should never be so far apart.
This way is just slightly more correct. len
is now the correct type, and at least the check on the next line (if (len <= 0)
is therefore more correct too in the face of overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we casted to int
right away, in case of overflow there was a chance it would be cut off by the <0
check.
If we're to make it property right, then in addition to <0
check we should also check for (at least) INT_MAX
.
Unless it is already checked elsewhere for some othe constant (e.g. something usb-specific), in which case casting it to int
instead of intptr_t
would make more elegant implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parse_configuration() or its sole caller raw_desc_to_config() is called with an int size. raw_desc_to_config() is only called from libusb_get_active_config_descriptor() where the size is taken from wTotalLength which therefore cannot be larger than 65535 ever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the conclusion here, is the change pointless?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My opinion is still that the change is a small improvement.
The last commit here is marked WIP, I guess the others are ready to go? |
Let's figure out the WIP one before merging... Let me see if I remember what I was thinking those weeks ago.... |
Ah yes... The old code is this: // Second pass: Iterate through desc list, fill IAD structures
consumed = 0;
i = 0;
while (consumed < size) {
header.bLength = buffer[0];
header.bDescriptorType = buffer[1];
if (header.bDescriptorType == LIBUSB_DT_INTERFACE_ASSOCIATION) {
iad[i].bLength = buffer[0];
iad[i].bDescriptorType = buffer[1];
iad[i].bFirstInterface = buffer[2];
iad[i].bInterfaceCount = buffer[3];
iad[i].bFunctionClass = buffer[4];
iad[i].bFunctionSubClass = buffer[5];
iad[i].bFunctionProtocol = buffer[6];
iad[i].iFunction = buffer[7];
i++;
}
buffer += header.bLength;
consumed += header.bLength;
} We know the length of
|
beb1831
to
8bae9eb
Compare
8bae9eb
to
d324626
Compare
d324626
to
67ac1a7
Compare
BTW, I will merge #1428 first, and this one will need a small rework afterwards. |
d219558
to
9e52150
Compare
I've rebased and pushed. |
"Fixed potential buffer overread" still talks about WIP. Here as well as in your other PR it would be good to see in the commit summary what part of the code you are changing. We have established a prefix convention that everybody is using. When reviewing or perusing the git log it has great value e.g. to be able to identify changes in core code vs in examples. |
Yeah, would appreciate thoughts on that. I currently call
Any docs on that? I see nothing in |
Just look at the git log, i.e. |
9e52150
to
de91ffe
Compare
Sure, but if this is required of commit messages, shouldn't it be documented in HACKING, which already discusses commit messages: "Commit messages should be formatted to 72 chars width and have a Put detailed information in the commit message itself, which will end |
libusb/libusb.h
Outdated
@@ -335,6 +335,7 @@ enum libusb_descriptor_type { | |||
#define LIBUSB_DT_SS_ENDPOINT_COMPANION_SIZE 6 | |||
#define LIBUSB_DT_BOS_SIZE 5 | |||
#define LIBUSB_DT_DEVICE_CAPABILITY_SIZE 3 | |||
#define LIBUSB_DT_INTERFACE_ASSOCIATION_SIZE 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many tabs, "8" should be aligned with the rest (with tab stop width 8).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, to me they all look unaligned because the Xcode project is configured for 4 space tabs.
We really should proceed with #1443 so that human effort reviewing and fixing trivialities like this can be automated.
Anyway, fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'd recommend using "git diff" before or "git show" after committing, it exposes such whitespace errors easily and clearly and also lets you review other changes and commit messages from another angle. QED GUIs fail miserably on this.
de91ffe
to
15f3123
Compare
Another tip: To spellcheck your commit messages you can use for instance: |
libusb/descriptor.c
Outdated
{ | ||
return (uint32_t)((uint32_t)(p[3] << 24) | | ||
(uint32_t)(p[2] << 16) | | ||
(uint32_t)(p[1] << 8) | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the cast/promotion be done before the shifting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise I think we would only need the final cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cast before shifting is not needed because of implicit promotion: https://www.gnu.org/software/c-intro-and-ref/manual/html_node/Shift-Operations.html (as long as we don't care about 16-bit platforms!).
I wouldn't think casting before bitwise or'ing is needed either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I have restored the casting to be as it was in the macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we just write
return (p[3] << 24) | (p[2] << 16) | (p[1] << 8) | p[0];
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would implicitly promote to signed int (IIUC), and it seems safer to keep everything unsigned because shifting a 1 into the sign bit is technically UB. Besides, it was already like that, and presumably correct as it was.
This function had a few problems: - it takes two buffers as parameters but knows nothing about their length, making it easy to overrun them. - callers make unwarranted assumptions about the alignment of structures that are passed to it (it assumes there's no padding) - it has tricky pointer arithmetic and masking With this new formulation, it's easier to see what's being read/written, especially the destination. It's now very clear that the destination is not being overrun because we are simply assigning to struct fields. Also converted byte swapping macros to inline functions for more type safety.
This was checking that `size` is at least `LIBUSB_DT_CONFIG_SIZE` (9) bytes long, but then increments the pointer with `buf += header.bLength`. That could end up pointing past of the end of the buffer. There is a subsequest check that would prevent dereferencing it, but it's still UB to even create such a pointer. Added a check with a similar pattern as elsewhere in this file.
All the right hand side is `dev_cap`, changed one outlier to match. Also clarified the relationships between some magic numbers. No change in behaviour here.
The first iteration of this loop was safe because the beginning of the function checked that `size` is at least LIBUSB_DT_CONFIG_SIZE (9) bytes long. But for subsequent iterations, it could advance the pointer too far (which is undefined behaviour) depending on the content of the buffer itself.
15f3123
to
d12e64a
Compare
Spellings fixed. |
@@ -1043,7 +1045,14 @@ int API_EXPORTED libusb_get_ssplus_usb_device_capability_descriptor( | |||
|
|||
// We can only parse the non-variable size part of the SuperSpeedPlus descriptor. The attributes | |||
// have to be read "manually". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment doesn't make much sense now that we are not "parsing" any longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please give me the comment text you'd like, and I'll update it.
So, is it that you want me to add the prefix |
No description provided.