Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid incrementing a pointer past the end #1489

Merged
merged 1 commit into from Feb 18, 2021

Conversation

DemiMarie
Copy link
Contributor

The ‘end’ parameter to ‘strtaglen’ might point past the end of an
allocation. Therefore, if ‘start’ becomes equal to ‘end’, return an
error without calling ‘memchr’ on that pointer.

@pmatilai
Copy link
Member

pmatilai commented Jan 14, 2021

Err, no. There might be an off-by-one in the code, but you don't need all this and an abort(), to deal with that.

Start with the basics. What is the scenario where 'end' is beyond the allocation? Where is that pointer accessed? Do you have a reproducer for this?

@DemiMarie
Copy link
Contributor Author

The only case where end can be beyond the allocation is for the last entry in the header. This can happen for v3 headers, or v4 headers that aren’t contiguous. I know that compilers are allowed to assume that the arguments to memcpy can be dereferenced, and the same may also be true of memchr.

lib/header.c Outdated Show resolved Hide resolved
@DemiMarie DemiMarie force-pushed the safe-memchr branch 2 times, most recently from 83be6b4 to 70c3c87 Compare February 16, 2021 19:01
@DemiMarie
Copy link
Contributor Author

base64-encoded test case below. This doesn’t trip any of the sanitizers, but I did check that it called memchr with a pointer one-past-the-end:

AAAACAAAAIgAAAA+AAAABwAAAAAAAAAQAJgAAAAAAAcAAAAAAAAAEACYAAAAAAAAAAAAAADCAAAA
AAAAAAAACAQAAAAEAAAAAAAACAAAAAQEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMIAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAiAAAAD4AAAAHgAAAAAAAABAAAAAAAAAAAAAAAAAAwgAAAAAA
AP8AAAAAAAAAAAAAAADCAAAAAAAIBAAAAAAA/wAAhwAAAAAAALTwKwEgYQAAAAAAAAAAAAAAAAAA
AAAAAAAAAAACAAAAAAAAAAAAAAAAAAAALwgAAAAAAAAAAAAAAAAAAAAAAAA=

While this change (which reduces the amount of code) is probably a good idea anyway, we might also want to wrap memcmp, memchr, and other such functions with static inline wrappers that avoid this problem.

@pmatilai
Copy link
Member

Actually in the latest form, the code is obvious enough that it doesn't need a comment at all. Just move the whole explanation to the commit message and we're good here.

That's a generic preference BTW: comments should be used sparingly, and in particular multiline comments are like neon signs which should be reserved for very special danger zones. If the neon signs are everywhere, they do nothing but blind you from the code itself. In the git era, commit messages are a far better place to document the details.

The ‘end’ parameter to ‘strtaglen’ might point past the end of an
allocation.  Therefore, if ‘start’ becomes equal to ‘end’, exit the loop
without calling ‘memchr’ on it.
@DemiMarie
Copy link
Contributor Author

That's a generic preference BTW: comments should be used sparingly, and in particular multiline comments are like neon signs which should be reserved for very special danger zones. If the neon signs are everywhere, they do nothing but blind you from the code itself. In the git era, commit messages are a far better place to document the details.

Good to know, thanks! I take it that doxygen doc comments on public APIs are an exception?

@pmatilai
Copy link
Member

Public API docs are documentation, not comments, despite masquerading as such.

Multiline comments are by no means forbidden, just that they should be used sparingly, in particular in middle of code. In middle of code, multiline comments are very distracting so they should only be used there when it's in fact the purpose to distract, typically to deliver a warning of some sort.
https://www.kernel.org/doc/html/v5.11/process/coding-style.html#commenting is excellent advice (most of the kernel coding style is, despite some details such as indentation differing a bit)

The general preference is do the explaining in the code itself, using descriptive names and that in the best cases make the code read almost like a story. Clearly not all of rpm is that way 🤣 and it's not always achievable now matter what, but hey, it's a goal.

@pmatilai pmatilai merged commit 165330b into rpm-software-management:master Feb 18, 2021
@pmatilai
Copy link
Member

Oh and thanks for spotting + fixing this.

I rather like this as it ends up actually simplifying the code. When a fix removes code, that's usually a good sign 🙂

@pmatilai
Copy link
Member

Except that ... this introduced a new issue:

header.c: In function ‘dataLength’:
header.c:427:30: warning: ‘s’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  427 |     return (c > 0) ? -1 : (s - str + 1);
      |                            ~~^~~~~

@pmatilai
Copy link
Member

Created #1545 to fix that.

It's a bit strange as CI is supposed to catch newly introduced warnings, but perhaps gcc on Fedora 32 doesn't see that (my laptop is on 33)

@DemiMarie DemiMarie deleted the safe-memchr branch February 18, 2021 20:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants