Conversation

jaredlwong
Contributor

Fixed potential pointer overflow leading to an infinite loop in
db/storage/record.cpp

In order to iterate over the data in the record there was a previous loop that
compared pointers:

const char * addr = _data;
const char * end = _data + _netLength();
for ( ; addr <= end ; addr += 2048 ) {
    __record_touch_dummy += addr[0];
}

This will not work as expected because the addr pointer may overflow.
Consider the size of _data, which is defined in mongo::ps::Slice as:

Entry _data[SliceSize]

Here _netLength is defined as:

int _netLength() const { return _lengthWithHeaders - HeaderSize; }

in mongo/db/pdfile.h, where HeaderSize = 16 and _lengthWithHeaders is set in a
number of places through the function:

int& lengthWithHeaders() { _accessing(); return _lengthWithHeaders; }

in mongo/db/pdfile.h.

Assuming we can't guarantee anything about the exact value of _netLength() (for
example, that it is always a multiple of 2048), the addr pointer can be
advanced beyond the bounds of _data.

In C and C++, it is undefined behavior even to form a pointer (in this case
addr) more than one past the end of the array (in this case _data) to which it
points. If the compiler so chooses, it may wrap the pointer around once it is
advanced past _data + SliceSize. That would cause this loop to never exit,
because addr would always compare less than end.
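One way to avoid the problem, along the lines of the proposed fix, is to iterate with a byte offset instead of a moving pointer, so no pointer beyond one-past-the-end is ever formed. This is a simplified sketch, not the actual patch; `touchPages` and its parameters stand in for the record's `_data` and `_netLength()`:

```cpp
// Sketch of the idea behind the fix (not the actual MongoDB patch):
// iterating by a byte offset means we index only in-bounds positions and
// never materialize a pointer past one-past-the-end of the buffer.
volatile char __record_touch_dummy = 1;  // mirrors the dummy in record.cpp

void touchPages(const char* data, int netLength) {
    for (int offset = 0; offset < netLength; offset += 2048) {
        __record_touch_dummy += data[offset];
    }
}
```

With the length hoisted into a parameter it is also computed only once, which addresses the per-iteration `_netLength()` call flagged below.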

@kangas
Contributor

kangas commented Dec 9, 2013

Jared, can you be more precise about why you believe this is potentially undefined behavior? What specifically may overflow? In what scenarios?

Also, you're calling _netLength() on every pass through the loop. Awful from a performance perspective.

Finally, please read CONTRIBUTING.rst. We'll need a SERVER ticket and for you to sign the contributor's agreement.

@jaredlwong
Contributor Author

I'm absolutely sorry. I should have checked for the contributing file before doing this. Hopefully the updated message above is more precise. If it is not, please let me know.

Here is my account: https://jira.mongodb.org/browse/SERVER-12006?jql=Participants%20%3D%20jaredlwong

I'm not sure how to show you that I signed the contributor agreement. Maybe you're able to see it on Jira?

@jaredlwong
Contributor Author

I also updated the commit such that _netLength is not called on every iteration of the for loop.

Contributor

Do you mean 2048 here?

Contributor Author

Sorry, I'll fix that. I did mean 2048.

@kangas
Contributor

kangas commented Dec 9, 2013

Jared, I confirmed that you signed the agreement. Thanks for the ticket! I've assigned it to Andy for review.

@acmorrow
Contributor

Hi Jared, Matt -

Matt, I believe the undefined behavior referred to here is from C++11 5.7/5 [expr.add], which states:

If both the pointer operand and the result point to elements of the same array object,
or one past the last element of the array object, the evaluation shall not produce an
overflow; otherwise, the behavior is undefined.

The situation is complicated by the 'break' statement in the loop. Right now, the increment of 'addr' never actually occurs, since we always break out of the loop after one iteration. However, if we were to remove the 'break' statement, then this could elicit undefined behavior when incrementing 'addr' past the end of the array after the loop body. The fact that this invalid pointer will not be dereferenced is irrelevant, just synthesizing it is UB.

So I think Jared's proposed change makes sense, in that with his change in place, if the 'break' statement is ever removed, then the code will not suddenly elicit UB.
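The rule quoted above can be made concrete with a small standalone example (identifiers here are illustrative, not from the MongoDB source):

```cpp
#include <cstddef>

// C++11 [expr.add]: a pointer may point to any element of an array, or to
// one past its last element; forming any pointer beyond that is undefined
// behavior, even if the pointer is never dereferenced.
std::ptrdiff_t onePastEndDemo() {
    char data[4096];
    char* end = data + 4096;     // one past the last element: well defined
    // char* bad = data + 4097;  // UB: beyond one-past-the-end, even unused
    return end - data;           // pointer subtraction within bounds: 4096
}
```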

@kangas
Contributor

kangas commented Dec 13, 2013

@acmorrow This discussion reminded me of a blog post by Chris Westin, where he describes how he picked up his habit of "incrementing the pointer" when writing for loops

https://www.bookofbrilliantthings.com/blog/revisiting-some-old-for-loop-lore

It used to generate much faster code; now only slightly faster.
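The habit Westin describes replaces an indexed loop with one that advances a pointer directly. A rough sketch of the two styles (my own example, not Westin's code):

```cpp
// Indexed form: the compiler must (conceptually) compute a + i each pass.
int sumIndexed(const int* a, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += a[i];
    return total;
}

// "Incrementing the pointer" form: the induction variable is the pointer
// itself. Modern compilers typically emit near-identical code for both.
int sumPointer(const int* a, int n) {
    int total = 0;
    for (const int* p = a, * const end = a + n; p != end; ++p)
        total += *p;
    return total;
}
```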

@amschwerin
Contributor

Did you read his addendum? I think when he ran his test in GCC 4, the
compiler did all the work at compile time due to better cross-function
analysis.


@kangas
Contributor

kangas commented Dec 13, 2013

@andy10gen Yes. Also, he's only single-stepping a pointer, rather than advancing by 2048.

@kangas
Contributor

kangas commented Dec 13, 2013

Merged in 9ebb366

Thank you for contributing to MongoDB, Jared!

@kangas kangas closed this Dec 13, 2013
@benety benety changed the title Fixed potential pointer overflow leading to an infinite loop in db/storage/record.cpp SERVER-12006 Fixed potential pointer overflow leading to an infinite loop in db/storage/record.cpp May 31, 2014
jiongle1 pushed a commit to scantist-ossops-m2/mongo that referenced this pull request Mar 30, 2024
In summary: we don't copy hot-backup files; when we load the hot backup
metadata file during WiredTiger startup, if there are entries in the
metadata file without a valid checkpoint, we assume they're bulk-loaded
files and re-create them with no contents.

Implement Alex's idea: don't walk the metadata file in the hot-backup
code looking for 'file:' objects, instead use __wt_meta_btree_apply()
plus a callback instead.

Clarify __wt_conn_btree_apply() and __wt_meta_btree_apply(): we talk
about EBUSY errors, but all we can handle are bulk-loaded files.
Rename __wt_conn_btree_apply_single() to __wt_conn_btree_apply_bulk(),
and change it to only operate on bulk-loaded files.

Increase the size of the metadata backup line we can handle from 512B
to 5KB; this should be a dynamic value, but I'm not going to do it right
now.

Reference: mongodb#570, mongodb#653.