Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing improper input syntax and failing bounds check #4955

Merged
merged 1 commit into from
Sep 23, 2014
Merged

Fixing improper input syntax and failing bounds check #4955

merged 1 commit into from
Sep 23, 2014

Conversation

ENikS
Copy link
Contributor

@ENikS ENikS commented Sep 22, 2014

This code does not make much sense from modern compiler point of view.

I need to investigate further but it looks like under certain circumstances it might allow reading beyond vch.end() and could be similar to heart-bleed bug of OpenSSL http://en.wikipedia.org/wiki/Heartbleed

It might take me a bit to understand all of the cases this method is involved with, meanwhile here is the fix.

@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4955_87314c1c5e69035fefa638de8e237e4ce09788e9/ for binaries and test log.
This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/
Contact BlueMatt on freenode if something looks broken.

@laanwj
Copy link
Member

laanwj commented Sep 22, 2014

There's quite some undefined behaviour with regard to vectors in the bitcoin core code (indexing into an empty vector, or dereferencing begin() from an empty vector). I sincerely doubt that this creates an exploitable bug, but it's not correct C++. For this reason I added begin_ptr and end_ptr to serialize.h, the intent is to replace naive &vchIn.begin()[0]-like expressions with those.

However - in this case they doesn't seem to be needed. Which is interesting. I wasn't aware that a std::vector<unsigned char>::iterator could be passed to the std::vector<char> constructor.

@laanwj
Copy link
Member

laanwj commented Sep 22, 2014

I wasn't aware that a std::vector::iterator could be passed to the std::vector constructor

I've looked it up and this is indeed allowed. You can pass any kind of iterator that produces objects that can be cast to the type of the vector. It basically does a loop with a static_cast (not reinterpret_cast).

@ENikS
Copy link
Contributor Author

ENikS commented Sep 22, 2014

Paraphrasing Murphy’s Law: if something could be exploited it will be exploited.
Judging from the fact that this code was there from the day one, passes all the build checks, unit tests and is significantly different from the rest of API, makes me think it may be left there as a deliberate attempt at leaving a backdoor into the system.

914: CDataStream(const vector_type& vchIn, int nTypeIn, int nVersionIn) : vch(vchIn.begin(), vchIn.end())
919: CDataStream(const std::vector<char>& vchIn, int nTypeIn, int nVersionIn) : vch(vchIn.begin(), vchIn.end())
924: CDataStream(const std::vector<unsigned char>& vchIn, int nTypeIn, int nVersionIn) : vch((char*)&vchIn.begin()[0], (char*)&vchIn.end()[0])

I am not familiar with assembly emitted by GCC, but it certainly looks like something that could be used for malicious purposes. It defensively needs to be fixed and perhaps even investigated if any security was compromised because of this and other cases of vector boundary logic violations.

@theuni
Copy link
Member

theuni commented Sep 22, 2014

@ENikS In this case, one-past-the-end is.... the end-marker. While dereferencing that is nasty and should indeed be fixed, there's no reason to sound the alarms. It's "significantly different" from the rest of the api because its arguments are different. You'll notice that the (int,int) ctor is even more significantly different...

Either way, if you can come up with a unit test that induces failure here, it would be added for sure.

@ENikS
Copy link
Contributor Author

ENikS commented Sep 23, 2014

@theuni Debug build of Microsoft implementation of STL has these checks and boundaries built in. If you run your tests linked with it, all these errors will be apparent. This is what I do in https://github.com/ENikS/bitcoin-dev-msvc

For more information about this feature of MS STL watch this short video: http://channel9.msdn.com/Series/C9-Lectures-Stephan-T-Lavavej-Advanced-STL/C9-Lectures-Stephan-T-Lavavej-Advanced-STL-3-of-n

As for raising the alarm I disagree with you. I see this as huge potential for abuse and would like more attention to these vulnerabilities. I see this as much higher priority compared to any new features. The whole project could be in jeopardy if it could be proven that bitcoin core allows unauthorized access to private data. This is not just a technical issue in my view.

I offered help to set it up as part of automated build and test and I repeating the offer. I have to work at my job so I could not chase you on IRC or any other arbitrary place. If you are serious and would like to discuss this offer I could give you email or phone number so we could communicate.

@sipa
Copy link
Member

sipa commented Sep 23, 2014

It's a good thing that Microsoft's STL checks this, as it is outside of the spec, so this code needs to be fixed for that reason - no argument about that.

However, I'm pretty sure that any reasonable implementation will - in term of actual semantics - have the exact same behaviour using x.begin() + x.size() vs &(x[x.size()]); both are implemented as a pointer one past the end. It's just that for iterators this is defined behavior and for the access operator it is not.

@ENikS
Copy link
Contributor Author

ENikS commented Sep 23, 2014

It's just that for iterators this is defined behavior and for the access operator it is not.

@sipa You are right about access operators and I would add one more offender to the list: x.end()[0] - does not have any meaning in any implementation. As laanwj mentioned earlier it is not a proper C++.

@sipa
Copy link
Member

sipa commented Sep 23, 2014

It does. It's a reference to the byte one past the end of the vector (or a NULL reference in case of no allocated vector storage). Accessing the reference is clearly invalid, but taking a pointer to it or using it as reference is not.

You are absolutely correct that this is not valid C++, and should be fixed. But in any sane implementation (i.e. if it doesn't want unnecessary performance impact) I expect it to be perfectly safe.

@laanwj
Copy link
Member

laanwj commented Sep 23, 2014

There is no way that this code allowed actual out-of-buffer access, it is well known that MSVC in debug mode does a pedantic bounds check at vch.size()-1 - and thus forbids &vch[vch.size()] and such. This would be valid with an array, as the resulting pointer is only used as end guard and not actually dereferenced. Treating a vector as array is incorrect C++ but not dangerous, as a vector is implemented as array (not guaranteed, but in practice it is for the code that is out there).

Also we all have to work at our job. This is starting to get old, I'm not sure why you keep repeating it as if it makes you exceptional.

@theuni
Copy link
Member

theuni commented Sep 23, 2014

I've reproduced these errors with libstdc++ using debug mode. It looks like a few are legitimate bugs while most are harmless in practice.

I'll PR a change to enable them for travis builds so that they can be discussed and fixed in one place.

@laanwj
Copy link
Member

laanwj commented Sep 23, 2014

As discussed with @theuni I'm a bit divided about the undefined casting behavior from unsigned char to signed char here (for values >127). We shouldn't be using signed chars for a binary buffer in the first place.

But anyhow, this is an improvement to what there is now, and I don't think this brings any risks in practice, so ACK.

@sipa
Copy link
Member

sipa commented Sep 23, 2014

ut ACK

@theuni
Copy link
Member

theuni commented Sep 23, 2014

Agreed. I created a few quick test progs trying to create brokenness from this undefined behavior, but everything seemed to work as intended. ACK.

+1 on using unsigned chars instead, but I doubt that will change any time soon.

@sipa sipa merged commit 87314c1 into bitcoin:master Sep 23, 2014
sipa added a commit that referenced this pull request Sep 23, 2014
87314c1 Fixing improper input syntax and failing bounds check (ENikS)
@ENikS ENikS deleted the invalid_vector_iterator branch September 23, 2014 18:50
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants