Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: fix stack overflow in HandleNode() (CVE-2017-5950) #489

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

anarcat
Copy link

@anarcat anarcat commented Apr 26, 2017

simply set a hardcoded recursion limit to 2000 (inspired by Python's)
to avoid infinitely recursing into arbitrary data structures

assert() the depth. unsure if this is the right approach, but given
that HandleNode() is "void", I am not sure how else to return an
error. the problem with this approach of course is that it will still
crash the caller, unless they have proper exception handling in place.

Closes: #459

simply set a hardcoded recursion limit to 2000 (inspired by Python's)
to avoid infinitely recursing into arbitrary data structures

assert() the depth. unsure if this is the right approach, but given
that HandleNode() is "void", I am not sure how else to return an
error. the problem with this approach of course is that it will still
crash the caller, unless they have proper exception handling in place.

Closes: jbeder#459
@inetknght
Copy link

assert() is not the right thing for this: assert translates to no-operation in release builds (specifically, most build systems will set NDEBUG definition because that's generally the standard).

Generally, that means that this doesn't solve issues for production uses. In debug builds, the library will terminate the application: importantly, assert() does not throw so "unless they have proper exception handling in place" isn't applicable. Most (many?) end-user developers aren't going to want to spend time or effort debugging problem; some might assume that the recursion depth doesn't represent production data.

An explicit throw would be better. I only recently started using yaml-cpp so I'm not sure what custom exception would be useful; but I would start off with something inheriting from std::runtime_error

@anarcat
Copy link
Author

anarcat commented Apr 26, 2017 via email

assert() may be compiled out in production and is clunkier to catch.

some ParserException are already thrown elsewhere in the code and it
seems to make sense to reuse the primitive, although it may still
crash improperly configured library consumers, those who do not handle
exceptions explicitly.

we use the BAD_FILE error message because at this point we do not
exactly know which specific data structure led to the recursion.
the original implementation couldn't parse a document with more than
depth_limit entries. now we explicitly increase *and* decrease the
depth on specific handlers like maps, sequences and so on - any
handler that may in turn callback into HandleNode().

this is a little clunky - I would have prefered to increment and
decrement the counter in only one place, but there are many different
return points and this is not Golang so I can't think of a better way
to to this.
@anarcat
Copy link
Author

anarcat commented Apr 26, 2017

there - that should be a little better.

@keith-bennett-gbg
Copy link

That's what I figured as well... Had to start somewhere. :) Note that
assert() is used in a bunch of other places in the code already, I
just copied a nearby example.

Scary.

But I'm still hesitant in using throw or assert. Throw will crash
callers unless they have specific exception handling in place (so it
breaks backwards compatibility). Assert will either crash without
handling or not solve the issue at all. So neither seems like a good
fit.

It would (could?) be crashing anyway because of stack overflow. Throw allows an end-user developer to at least try to intercept the crash and handle that gracefully.

How about we just return here instead? Would you, as a library
consumer, be satisfied by that approach? Or do you prefer something to
be raised?

As a library consumer, I would not want to be unable to differentiate between success and yaml-cpp did not parse the entire file.

@@ -46,6 +46,9 @@ void SingleDocParser::HandleDocument(EventHandler& eventHandler) {
}

void SingleDocParser::HandleNode(EventHandler& eventHandler) {
if (depth > depth_limit) {
throw ParserException(m_scanner.mark(), ErrorMsg::BAD_FILE);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that throwing will eliminate the possibility of re-using this object because it does not reset depth.

I suggest using a depth guard type of class and instantiate that at the entrypoint of the function. A quick example:

// suggested type, adapt as necessary
class deep_recursion : public std::runtime_error {
  int at_depth_;
public:
  using std::runtime_error::runtime_error;
  deep_recursion(int at_depth) : std::runtime_error{std::string{"recursion reached depth "} + std::to_string(at_depth)}, at_depth_(at_depth) {}
  int at_depth() const { return at_depth_; }
  // might need virtual dtor if change inheritance tree
}

// also a suggested type, adapt as necessary
template <int max_depth = 2000>
class depth_guard {
  int & depth_;
public:
  using deep_recursion_exception = deep_recursion; // or whatever type
  depth_guard(int & depth) : depth_(depth) {
    ++depth_;
    if ( max_depth <= depth_ ) {
        throw deep_recursion_exception{depth}; // or whatever constructor
    }
  }
  ~depth_guard() {
    --depth_;
  }
  int current_depth() const { return depth_; }
}

// example use:
void SingleDocParser::HandleNode(EventHandler& eventHandler) {
  using DepthGuard = depth_guard<2000>;
  DepthGuard guard{depth};
  ...
  // code here is guaranteed to not be too 'deep'

That also means that you don't need to explicitly increment and decrement depth everywhere you're calling the function. That will keep maintenance cost low and reduce human error opportunities. It can also be reused elsewhere ;)

Honestly though I think a better solution is to not use recursion: stack depth is not consistent across platforms. I've personally encountered stack size problems when porting stuff from Linux to OS X.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that throwing will eliminate the possibility of re-using this object because it does not reset depth.

Is that a serious issue? At this point - the file should be considered garbage anyways and we just need to abort gracefully.

I suggest using a depth guard type of class and instantiate that at the entrypoint of the function. A quick example: [...]

That's pretty neat - didn't think of using destructors to take care of this... But then that's a huge change - at that point, I would rather switch to a non-recursive design, as you said, but that's also a large change.

I'm looking at making a rather non-invasive change here, that can be backported to earlier versions as well. I'm also worried about stack size problems here - 2000 may just be too high as well - having an arbitrary number there is just looking for trouble. But I'm not sure how else to fix this, short of rewriting the whole parsing logic to be non-recursive.

Copy link

@keith-bennett-gbg keith-bennett-gbg Apr 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a serious issue? At this point - the file should be considered garbage anyways and we just need to abort gracefully.

Maybe for most use cases. Arguably for most realistic use cases. But I thought it worth mentioning; if you have deeply nested types, then YAML is probably not the right data structure to use.

Clarification edit: what I meant to say is that maybe it's not a serious issue for most use cases and arguably it's not a serious issue for most realistic use cases.

That's pretty neat - didn't think of using destructors to take care of this... But then that's a huge change - at that point, I would rather switch to a non-recursive design, as you said, but that's also a large change.

RAII (and, lifetime guarantees) is wonderful :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I thought it worth mentioning; if you have deeply nested types, then YAML is probably not the right data structure to use.

That's besides the point. This is not about the regular, well designed use-case. This is about user-supplied data, specifically crafted to crash the target... So I think it's acceptable to throw in this case.

RAII (and, lifetime guarantees) is wonderful :)

Yeah, I was wondering how to do this in C++ again. I got used to the defer statement in golang. :)

Honestly, I find the types you suggested to be a little on the heavy side. Isn't there something in the standard library or boost that could cover for this without having to reinvent the wheel?

I like the original implementation I provided because it's just a single integer to keep track of, without that data structure around..

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I find the types you suggested to be a little on the heavy side. Isn't there something in the standard library or boost that could cover for this without having to reinvent the wheel?

That's a great question. I'm not familiar with a type with this exact purpose from either the standard library or the boost libraries. Is it really reinventing the wheel then? If you feel strongly about it, the boost mailing list should be able to point you in the direction of people with more experience :) Having been to cppcon, I can tell you that a lot of different people are really responsive to useful stuff.

Assuming it's not already a thing, I would guess that it's for the same reason that there's no standard process library (actually... a week ago, boost::process was released): the standard focuses on portability and stacks aren't actually a fully portable concept.

I like the original implementation I provided because it's just a single integer to keep track of, without that data structure around..

The data structure has no virtual table and stores a simple reference to plain old data. In debug builds, the internal reference may end up being a pointer allocation (on the stack). In any sort of reasonable optimization, its code should get inlined. The optimizer should also identify that the reference is redundant to this->depth and may or may not elect to remove the pointer (since depth is not marked volatile it becomes whether or not the optimizer chooses to use a single indirection (at the cost of a pointer) or double indirection (and save the bytes); on x86 it'll typically choose the former because the second indirection is constant (offset of depth) and can be directly instrumented in assembly). and the class basically gets completely removed from the stack. It then becomes just a source code concept to tell the compiler how to do things with zero cost to runtime. This is why I don't think it's heavy.

I think you're confusing verbose with heavy, which is arguable: when you consider the limited scope of this one specific function, sure, it's rather verbose.

Wow that's a lot of words. Let me just tl;dr that for you:
Optimization level 3 assembly with depth_guard class
Optimization level 3 assembly with equivalent manual control

IN particular, the depth_guard assembly is both shorter and generally simpler. Importantly, the function prolog (setting up the stack frame, et al) is identical which means that stack_guard definition gets elided and optimized away.

Although I'm not quite sure what's up with the repeated sub eax, 1 in .L8.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess I just got schooled in compiler arithmetics. ;) What I meant by "heavy" was more "verbose", indeed. As someone who hasn't done C++ in years, I had trouble parsing your class, but then it's probably a symptom of my age more than an issue with your design.

I don't really care. I need something that works to backport to Debian LTS. ;) I was just passing by really, so I'll let the maintainers here decide.

I encourage you to open a separate pull request to fix this one (or just push here if you can) if you believe it is incorrect (which it probably is anyways). :) I'm not sure it's right to just be copy-pasting your code in my PR without fully understanding the implications.

besides, the real issue is not how we decrement that counters, but what the limit should be. we don't know how portable that "2000" is - i just picked that out of the blue, based on Python's default recursion limit (!) on my platform (!!). It's a wild guess. We need to figure out a better solution here, either by tweaking that number dynamically (?) or by ditching recursion here...

Or some other idea no one thought of yet. :)

Thanks for the review!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe @keith-bennett-gbg 's solution is the safest and simplest; would you please use that? I'd keep the ParserException, and feel free to make a new error message that says something like "Recursion limit of NN reached".

As for that limit, I think some power of 2 around there is fine; let's try 2K. The example file is 4360, so we'll be well under that. If someone has a report that this is too high, we can tailor it per-platform.

@anarcat
Copy link
Author

anarcat commented Apr 29, 2017

I unfortunately won't have time to followup on this for the next month, but I'm happy to delegate this to whoever wishes to push this further!

thanks for your time.

@anarcat
Copy link
Author

anarcat commented Jun 27, 2017

What's the word here? Are yaml-cpp maintainers looking into a fix for this or should I reroll this according to the pattern suggested?

Copy link
Owner

@jbeder jbeder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR. Sorry for the delay in reviewing.

@@ -46,6 +46,9 @@ void SingleDocParser::HandleDocument(EventHandler& eventHandler) {
}

void SingleDocParser::HandleNode(EventHandler& eventHandler) {
if (depth > depth_limit) {
throw ParserException(m_scanner.mark(), ErrorMsg::BAD_FILE);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe @keith-bennett-gbg 's solution is the safest and simplest; would you please use that? I'd keep the ParserException, and feel free to make a new error message that says something like "Recursion limit of NN reached".

As for that limit, I think some power of 2 around there is fine; let's try 2K. The example file is 4360, so we'll be well under that. If someone has a report that this is too high, we can tailor it per-platform.

@Justinzobel
Copy link

What's the current status of this merge/fix? As it's a pubic CVE so would love for it to be addressed sooner rather than later.

@anarcat
Copy link
Author

anarcat commented Mar 25, 2018

@Justinzobel the current state is that @keith-bennett-gbg has a better solution to the problem and he (or someone else) needs to submit that as a new pull request.

… error-prone manual increment/check/decrement
@keith-bennett-gbg
Copy link

Hi all,

I wasn't aware that reviewing the code would mean that I'd be responsible for implementing the changes. Nevertheless, they're trivial enough.

I branched off of @anarcat's work and created a pull request to his fork with the changes I had considered. If @anarcat accepts the changes, they should show up in this pull request too.

Thanks,
Keith

use RAII type class to guard against stack depth recursion…
@anarcat
Copy link
Author

anarcat commented Mar 29, 2018

i've never worked that way - i was expecting you would make a different PR, since really, it's a completely different patch. :) but whatever: i'm fine either way... i've merged your PR, but I have only done a cursory review and no tests...

@AlanGriffiths AlanGriffiths mentioned this pull request Jan 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants