Replaced parse_descriptor() function, fixed some overruns #1460

seanm · 2024-02-05T01:34:21Z

No description provided.

Youw · 2024-02-06T00:31:12Z

To my personal taste - I like this (explicit) way better.
Although I don't have an opinion on whether the change is in line with the whole design of this part of the code.

I've reviewed it, and the change looks good.
I also found where the improvements are, but it would be great to describe those explicitly in the PR description section.

seanm · 2024-02-06T01:32:10Z

To my personal taste - I like this (explicit) way better.

The change is not just one of taste, but correctness too. The old way was assuming specific struct alignment, and that struct fields did not have padding in between.
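The padding concern can be illustrated with a minimal sketch. The struct and function names below (`example_desc`, `read_le16`, `parse_example_desc`) are hypothetical and not taken from libusb; the point is only that assigning each field from an explicit byte offset does not depend on how the compiler lays out the struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: 1 + 1 + 2 + 1 = 5 bytes on the wire.
 * In memory the compiler may add padding, so sizeof(struct example_desc)
 * can be larger than 5 and a raw copy of the wire bytes is not portable. */
struct example_desc {
	uint8_t  bLength;
	uint8_t  bDescriptorType;
	uint16_t wValue;	/* little-endian on the wire */
	uint8_t  bFlags;
};

#define EXAMPLE_DESC_WIRE_SIZE 5

/* Assemble a little-endian 16-bit value byte by byte (no alignment needed). */
static inline uint16_t read_le16(const uint8_t *p)
{
	return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

/* Returns 0 on success, -1 if the source buffer is too short. */
static int parse_example_desc(const uint8_t *buf, size_t size,
			      struct example_desc *out)
{
	if (size < EXAMPLE_DESC_WIRE_SIZE)
		return -1;

	/* Each destination is a named field, so it cannot be overrun. */
	out->bLength = buf[0];
	out->bDescriptorType = buf[1];
	out->wValue = read_le16(&buf[2]);
	out->bFlags = buf[4];
	return 0;
}
```

Because the source length is checked once and every write goes to a named field, neither buffer can be overrun regardless of struct layout.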

I've reviewed it, and the change looks good.

Thanks for reviewing.

I also found where the improvements are, but it would be great to describe those explicitly in the PR description section.

I did write pretty detailed commit messages... is it useful to repeat in the PR description?

Youw · 2024-02-06T09:57:52Z

Oh, I didn't see the commit message(s).
Yes, that would suffice, but I believe each PR is merged into master as a single commit, so those commit messages would have been lost, unless someone explicitly combines them into a single message for a single (final) commit.

NOTE: libusb does not use the squash-merge functionality GitHub provides.

seanm · 2024-02-06T14:14:11Z

I believe each PR is merged into master as a single commit

Really?!?! Why?

That really reduces the usefulness of git bisect for example.

Youw · 2024-02-06T21:07:53Z

Really?!?! Why?

A very simple and verified over time philosophy: a single PR should contain a single independent change and solve a single "problem".
This worked very well for many years, not only for libusb.

If a PR is more than one commit - those are either not independent or should be split into separate PRs.

seanm · 2024-02-06T21:56:30Z

A very simple and verified over time philosophy: a single PR should contain a single independent change and solve a single "problem".

This PR does solve a single "problem". The problem of parse_descriptor() being broken. This single problem is divided into smaller subproblems.

If the official libusb policy is indeed 1 commit per PR, then it really should be documented in the HACKING file (or somewhere) because it is very unusual.

tormodvolden · 2024-02-06T23:31:44Z

No, there is no rule saying a PR gets squashed. We will certainly want to squash fixup commits and other WIP artifacts, but if each commit is an independent change they go in one by one. Sometimes it is for instance good to separate preparatory commits from those commits that really make a change, it makes it easier to review up front, and to understand and verify afterwards.

tormodvolden · 2024-02-06T23:34:40Z

And if and when we squash commits, no commit messages get lost. When you use "squash" in git rebase it will add all commit messages, then you edit the result before committing.

Youw · 2024-02-06T23:44:34Z

Alright, it looks like my assumption, based on what I had observed, is not correct. It's a habit of mine to generalize from things I see consistently.
Maybe I haven't been around long enough to see PRs with more than one commit merged with the rebase-merge strategy.

But in any case I find it way more convenient to see the summary of the PR upfront, rather than having to look for individual commit messages. Maybe it's just my habit.

tormodvolden · 2024-02-06T23:46:50Z

After a quick look at these, I see no need to squash them. OK, they all improve parse_descriptor, but in different ways. Each of them would have made sense without the others.

A bunch of easy-to-review commits is better than huge commits where you have to look back and forth to see what pieces play together. And the larger the commit the more sure there will be something wrong that slips in and is missed in a review. I think most regressions I have found are from mega-commits where too many things were changed at a time.

Youw · 2024-02-06T23:52:22Z

A bunch of easy-to-review commits is better than huge commits

Each of them would have made sense without the others

To my experience - that's a good reason to have separate PRs, not just separate commits.
I've often seen cases where a change request or discussion on one of the commits delays the merge of the others, not to mention the potential inconvenience of interactive rebases, etc., to make changes to only some part of a multi-commit PR.

tormodvolden · 2024-02-06T23:55:20Z

But in any case I find it way more convenient to see the summary of the PR upfront, rather than having to look for individual commit messages. Maybe it's just my habit.

If you click on the "..." after each commit summary on this web page (e.g. under "seanm added 6 commits"), there is a drop-down with the full commit message.

Most of the time it is the opposite problem, people write a lot in the PR description (which doesn't get into the repo) while their commit messages are poor or empty.

tormodvolden · 2024-02-07T00:01:17Z

Each of them would have made sense without the others

To my experience - that's a good reason to have separate PRs, not just separate commits.

Sometimes they depend on each other to avoid conflicts, and it is just a saner way to work both for coder and reviewer. They fit together logically so they are best treated in the same PR. There is a risk that discussion about one commit delays the others, but it is rarely a big problem. The committer can in such cases also choose to cherry-pick the easy ones, and leave the questionable commits to be fixed up (and rebased if needed).

Youw · 2024-02-07T00:02:07Z

This commit apparently introduces an issue:

C:\projects\libusb\libusb\descriptor.c(276,24): error C2220: the following warning is treated as an error [C:\projects\libusb\msvc\libusb_static.vcxproj]
         			ifp->extra_length = len;
         			                    ^
         
     2>C:\projects\libusb\libusb\descriptor.c(276,24): warning C4244: '=': conversion from 'intptr_t' to 'int', possible loss of data [C:\projects\libusb\msvc\libusb_static.vcxproj]
         			ifp->extra_length = len;
         			                    ^
         
     2>C:\projects\libusb\libusb\descriptor.c(427,28): warning C4244: '+=': conversion from 'intptr_t' to 'int', possible loss of data [C:\projects\libusb\msvc\libusb_static.vcxproj]
         			config->extra_length += len;
         			                        ^

At least on MSVC x64 compilers.

Youw · 2024-02-07T17:45:34Z

libusb/descriptor.c

@@ -127,7 +126,7 @@ static int parse_endpoint(struct libusb_context *ctx,

 	/* Copy any unknown descriptors into a storage area for drivers */
 	/*  to later parse */
-	len = (int)(buffer - begin);
+	intptr_t len = (intptr_t)(buffer - begin);


Now that I look at this, what's the point of using intptr_t rather than int, as it was originally?

Admittedly, not much.

The old cast is theoretically truncating (64 bit pointers, 32 bit ints), but of course they should never be so far apart.

This way is just slightly more correct. len is now the correct type, and at least the check on the next line (if (len <= 0)) is therefore more correct too in the face of overflow.

When we cast to int right away, in case of overflow there was a chance it would be caught by the <0 check.
If we're to make it properly right, then in addition to the <0 check we should also check against (at least) INT_MAX.

Unless it is already checked elsewhere against some other constant (e.g. something USB-specific), in which case casting it to int instead of intptr_t would make for a more elegant implementation.

parse_configuration() or its sole caller raw_desc_to_config() is called with an int size. raw_desc_to_config() is only called from libusb_get_active_config_descriptor() where the size is taken from wTotalLength which therefore cannot be larger than 65535 ever.
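The "check the range, then cast" idea from this exchange can be sketched as follows. This is only an illustration: the helper name `span_length_as_int` is hypothetical, and it uses `ptrdiff_t` (the type the C standard gives to a pointer difference) where the PR uses `intptr_t`.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: length of [begin, end) as an int.
 * Returns 0 on success, -1 if the span is empty, reversed,
 * or too long to be represented as an int. */
static int span_length_as_int(const uint8_t *begin, const uint8_t *end,
			      int *out_len)
{
	ptrdiff_t len = end - begin;	/* full width, nothing truncated yet */

	if (len <= 0 || len > INT_MAX)
		return -1;

	*out_len = (int)len;		/* safe: the range was checked above */
	return 0;
}
```

With the truncating cast deferred until after the range check, a span that is too large is rejected instead of being silently wrapped into a small or negative int.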

What was the conclusion here, is the change pointless?

My opinion is still that the change is a small improvement.

tormodvolden · 2024-02-14T22:44:14Z

The last commit here is marked WIP, I guess the others are ready to go?

seanm · 2024-02-14T23:05:42Z

The last commit here is marked WIP, I guess the others are ready to go?

Let's figure out the WIP one before merging... Let me see if I remember what I was thinking those weeks ago....

seanm · 2024-02-14T23:10:16Z

Ah yes... The old code is this:

		// Second pass: Iterate through desc list, fill IAD structures
		consumed = 0;
		i = 0;
		while (consumed < size) {
			header.bLength = buffer[0];
			header.bDescriptorType = buffer[1];
			if (header.bDescriptorType == LIBUSB_DT_INTERFACE_ASSOCIATION) {
				iad[i].bLength = buffer[0];
				iad[i].bDescriptorType = buffer[1];
				iad[i].bFirstInterface = buffer[2];
				iad[i].bInterfaceCount = buffer[3];
				iad[i].bFunctionClass = buffer[4];
				iad[i].bFunctionSubClass = buffer[5];
				iad[i].bFunctionProtocol = buffer[6];
				iad[i].iFunction = buffer[7];
				i++;
			}
			
			buffer += header.bLength;
			consumed += header.bLength;
		}

We know the length of buffer is size because it's passed as a parameter to this function.

buffer is incremented here by header.bLength (aka buffer[0]). But if the contents of buffer[0] are unexpected/garbage, then it seems to me buffer could get incremented too far...
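A minimal sketch of a walk that validates bLength before advancing is shown below. It is not the code from this PR: the function name `count_descriptors` and the `EXAMPLE_` constants are made up for illustration, and it tracks an offset rather than moving the pointer so that no out-of-range pointer is ever formed.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_DT_HEADER_SIZE 2	/* bLength + bDescriptorType */
#define EXAMPLE_DT_INTERFACE_ASSOCIATION 0x0b

/* Count descriptors of the given type in buffer[0..size).
 * Returns the count, or -1 if a descriptor claims a length that is
 * shorter than its own header or longer than the remaining bytes. */
static int count_descriptors(const uint8_t *buffer, size_t size, uint8_t type)
{
	size_t consumed = 0;
	int count = 0;

	/* Only read a header when both of its bytes are inside the buffer. */
	while (size - consumed >= EXAMPLE_DT_HEADER_SIZE) {
		uint8_t bLength = buffer[consumed];
		uint8_t bDescriptorType = buffer[consumed + 1];

		/* Reject garbage before advancing: bLength == 0 would loop
		 * forever, and a too-large bLength would step past the end. */
		if (bLength < EXAMPLE_DT_HEADER_SIZE ||
		    bLength > size - consumed)
			return -1;

		if (bDescriptorType == type)
			count++;

		consumed += bLength;	/* consumed <= size still holds */
	}
	return count;
}
```

Whether a malformed descriptor should abort the whole parse or merely be logged is a policy choice; this sketch simply returns an error.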

tormodvolden · 2024-05-07T18:48:58Z

BTW, I will merge #1428 first, and this one will need a small rework afterwards.

seanm · 2024-05-08T16:29:22Z

BTW, I will merge #1428 first, and this one will need a small rework afterwards.

I've rebased and pushed.

tormodvolden · 2024-05-08T21:55:36Z

"Fixed potential buffer overread" still talks about WIP.

Here as well as in your other PR it would be good to see in the commit summary what part of the code you are changing. We have established a prefix convention that everybody is using. When reviewing or perusing the git log it has great value e.g. to be able to identify changes in core code vs in examples.

seanm · 2024-05-08T22:42:11Z

"Fixed potential buffer overread" still talks about WIP.

Yeah, would appreciate thoughts on that. I currently call usbi_warn and return, is that good?

Here as well as in your other PR it would be good to see in the commit summary what part of the code you are changing. We have established a prefix convention that everybody is using. When reviewing or perusing the git log it has great value e.g. to be able to identify changes in core code vs in examples.

Any docs on that? I see nothing in HACKING.

tormodvolden · 2024-05-09T10:17:21Z

Just look at the git log, i.e. git log --oneline

seanm · 2024-05-09T22:53:22Z

Just look at the git log, i.e. git log --oneline

Sure, but if this is required of commit messages, shouldn't it be documented in HACKING, which already discusses commit messages:

"Commit messages should be formatted to 72 chars width and have a
free-standing summary line. See for instance "Commit Guidelines" on
https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project
or https://cbea.ms/git-commit/ about how to make well-formed commit
messages.

Put detailed information in the commit message itself, which will end
up in the git history."

tormodvolden · 2024-05-10T07:30:21Z

libusb/libusb.h

@@ -335,6 +335,7 @@ enum libusb_descriptor_type {
 #define LIBUSB_DT_SS_ENDPOINT_COMPANION_SIZE	6
 #define LIBUSB_DT_BOS_SIZE			5
 #define LIBUSB_DT_DEVICE_CAPABILITY_SIZE	3
+#define LIBUSB_DT_INTERFACE_ASSOCIATION_SIZE			8


Too many tabs, "8" should be aligned with the rest (with tab stop width 8).

Ha, to me they all look unaligned because the Xcode project is configured for 4 space tabs.

We really should proceed with #1443 so that human effort reviewing and fixing trivialities like this can be automated.

Anyway, fixed it.

Thanks. I'd recommend using "git diff" before or "git show" after committing, it exposes such whitespace errors easily and clearly and also lets you review other changes and commit messages from another angle. QED GUIs fail miserably on this.

tormodvolden · 2024-05-11T10:07:54Z

Another tip: To spellcheck your commit messages you can use for instance:
git log origin/master.. | grep -v ^commit | aspell -a --dont-suggest | grep '^#'

tormodvolden · 2024-05-12T10:35:54Z

libusb/descriptor.c

+{
+	return (uint32_t)((uint32_t)(p[3] << 24) |
+					  (uint32_t)(p[2] << 16) |
+					  (uint32_t)(p[1] << 8) |


Shouldn't the cast/promotion be done before the shifting?

Otherwise I think we would only need the final cast.

Cast before shifting is not needed because of implicit promotion: https://www.gnu.org/software/c-intro-and-ref/manual/html_node/Shift-Operations.html (as long as we don't care about 16-bit platforms!).
I wouldn't think casting before bitwise or'ing is needed either.

Indeed. I have restored the casting to be as it was in the macro.

Why can't we just write
return (p[3] << 24) | (p[2] << 16) | (p[1] << 8) | p[0];
?

That would implicitly promote to signed int (IIUC), and it seems safer to keep everything unsigned because shifting a 1 into the sign bit is technically UB. Besides, it was already like that, and presumably correct as it was.
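For comparison, here is a sketch of the "cast before shifting" variant raised in this thread. It is not the code that was merged (the author restored the casting used by the original macro), and the name `read_le32` is hypothetical. Casting each byte to `uint32_t` first keeps every shift in unsigned arithmetic, so nothing is ever shifted into the sign bit of an int.

```c
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value from 4 bytes.
 * Each byte is widened to uint32_t before the shift, so the result does
 * not rely on promotion to (signed) int. */
static inline uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

On platforms with 32-bit int the uncast form usually produces the same bits, but `p[3] << 24` with `p[3] >= 0x80` shifts a 1 into the sign bit, which is the undefined behaviour mentioned above.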

This function had a few problems:
- it takes two buffers as parameters but knows nothing about their length, making it easy to overrun them.
- callers make unwarranted assumptions about the alignment of structures that are passed to it (it assumes there's no padding)
- it has tricky pointer arithmetic and masking

With this new formulation, it's easier to see what's being read/written, especially the destination. It's now very clear that the destination is not being overrun because we are simply assigning to struct fields.

Also converted byte swapping macros to inline functions for more type safety.

This was checking that `size` is at least `LIBUSB_DT_CONFIG_SIZE` (9) bytes long, but then increments the pointer with `buf += header.bLength`. That could end up pointing past the end of the buffer. There is a subsequent check that would prevent dereferencing it, but it's still UB to even create such a pointer. Added a check with a similar pattern as elsewhere in this file.

All the right hand side is `dev_cap`, changed one outlier to match. Also clarified the relationships between some magic numbers. No change in behaviour here.

The first iteration of this loop was safe because the beginning of the function checked that `size` is at least LIBUSB_DT_CONFIG_SIZE (9) bytes long. But for subsequent iterations, it could advance the pointer too far (which is undefined behaviour) depending on the content of the buffer itself.

seanm · 2024-05-13T17:26:43Z

Another tip: To spellcheck your commit messages you can use for instance: git log origin/master.. | grep -v ^commit | aspell -a --dont-suggest | grep '^#'

Spellings fixed.

tormodvolden · 2024-05-13T18:03:44Z

libusb/descriptor.c

@@ -1043,7 +1045,14 @@ int API_EXPORTED libusb_get_ssplus_usb_device_capability_descriptor(

 	// We can only parse the non-variable size part of the SuperSpeedPlus descriptor. The attributes
 	// have to be read "manually".


This comment doesn't make much sense now that we are not "parsing" any longer.

Please give me the comment text you'd like, and I'll update it.

seanm · 2024-05-14T13:41:09Z

Just look at the git log, i.e. git log --oneline

So, is it that you want me to add the prefix core: to all these commits?

seanm mentioned this pull request Feb 5, 2024

Add support for SuperSpeed+ Capability Descriptors #1428

Closed

mcuee added the core Related to common codes label Feb 5, 2024

Youw approved these changes Feb 6, 2024

seanm force-pushed the remove-parse_descriptor branch from 7ae012b to 83942ce Compare February 7, 2024 02:43

Youw reviewed Feb 7, 2024

hjelmn approved these changes Feb 12, 2024

tormodvolden reviewed Apr 4, 2024

seanm force-pushed the remove-parse_descriptor branch 3 times, most recently from beb1831 to 8bae9eb Compare April 7, 2024 00:59

seanm force-pushed the remove-parse_descriptor branch from 8bae9eb to d324626 Compare April 20, 2024 16:15

seanm force-pushed the remove-parse_descriptor branch from d324626 to 67ac1a7 Compare May 5, 2024 21:24

tormodvolden reviewed May 5, 2024

seanm force-pushed the remove-parse_descriptor branch 2 times, most recently from d219558 to 9e52150 Compare May 8, 2024 16:28

tormodvolden reviewed May 9, 2024

seanm force-pushed the remove-parse_descriptor branch from 9e52150 to de91ffe Compare May 9, 2024 22:45

tormodvolden reviewed May 10, 2024

seanm force-pushed the remove-parse_descriptor branch from de91ffe to 15f3123 Compare May 10, 2024 15:53

tormodvolden reviewed May 12, 2024

seanm added 6 commits May 13, 2024 13:20

Defer potentially truncating cast to last minute

f08de9c

Restored implicitly casted-away const

52fdc64

Small clarifications with no behaviour change

498e406

All the right hand side is `dev_cap`, changed one outlier to match. Also clarified the relationships between some magic numbers. No change in behaviour here.

seanm force-pushed the remove-parse_descriptor branch from 15f3123 to d12e64a Compare May 13, 2024 17:26

tormodvolden reviewed May 13, 2024
