Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

COMMON: Avoid Quicktime parsing crash on early EOF #5368

Merged
merged 2 commits into from Nov 18, 2023

Conversation

mduggan
Copy link
Contributor

@mduggan mduggan commented Oct 16, 2023

In parseStream, atom.size is initailized to uint32 max. If the loop in readDefault then never executes because the stream has hit EOF, that size is passed as-is to seek. This can cause a crash eg in SeekableSubReadStream because of an int overflow.

If nothing has been read just skip seeking at the end of readDefault. parseStream will correctly return false in this case as no MOOV will be found.

@mduggan
Copy link
Contributor Author

mduggan commented Oct 16, 2023

Doing this as a PR as I'm not 100% sure of my fix - @elasota I think you've touched this code recently(ish), not sure if you have any thoughts?

@elasota
Copy link
Contributor

elasota commented Oct 16, 2023

Couldn't you just check the err/eos functions on _fd to figure out if one of the reads failed? Should probably check if the seek failed too.

@mduggan
Copy link
Contributor Author

mduggan commented Oct 16, 2023

Couldn't you just check the err/eos functions on _fd to figure out if one of the reads failed? Should probably check if the seek failed too.

I think the loop on line 184 can fall through the first time even if there is no particular error, that's why I was trying to check something about the output - but then I wasn't fully sure if total_size was the right thing to check.. would it be better to instead check atom.size != 0xffffffff?

@elasota
Copy link
Contributor

elasota commented Oct 23, 2023

Would it make more sense to initialize the atom size to stream->size() in QuickTimeParser::parseStream then? I'm not sure why that is initialized to 0xffffffff in the first place...

@sev-
Copy link
Member

sev- commented Nov 5, 2023

@mduggan any updates?

@mduggan
Copy link
Contributor Author

mduggan commented Nov 6, 2023

Thanks for the ping

I think @elasota's suggestion right about changing the initialization of atom.size, but I'm also a little afraid to make that change without some more testing as this whole quicktime parsing code is a little complex and I don't want to break anyone else's stuff. I think I will write some unit tests first and then I can reduce the chance of breaking something, but it might take me a little while.

The original reason for the change went away when I fixed my corrupted file by the way - so this is not a critical bug.

Copy link
Contributor

coderabbitai bot commented Nov 14, 2023

Walkthrough

Walkthrough

The changes involve enhancing the QuickTimeParser class to handle large streams more safely by preventing potential overflow issues and improving error handling. Additionally, a comprehensive test suite for the QuickTime parser has been introduced, which includes various test cases to ensure the parser's robustness in different scenarios, such as end-of-stream conditions and parsing specific QuickTime structures. The test infrastructure has also been updated to include new test files.

Changes

File Path Change Summary
common/.../quicktime.cpp Modified to calculate maximum atom size based on remaining stream size to prevent overflow; added error checks before reading from file descriptor and after parsing functions.
test/common/.../quicktime.h Introduced a test suite for QuickTime parser with test cases for parsing data, end-of-stream conditions, and validating QuickTime structures.
test/module.mk Updated TESTS variable to include new test files from the formats subdirectory.

Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

Comment on lines 101 to 107
_foundMOOV = false;
_disposeFileHandle = disposeFileHandle;

Atom atom = { 0, 0, 0xffffffff };
int64 strmleft = stream->size() - stream->pos();
// Atom size is uint32, but use int32 max as it can wrap in later expressions
uint32 maxsize = (uint32)MIN(strmleft, (int64)(1 << 31));
Atom atom = { 0, 0, maxsize };

if (readDefault(atom) < 0 || !_foundMOOV) {
close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to calculate maxsize based on the remaining stream size is a good preventive measure against integer overflow and invalid seek operations. However, it's important to ensure that the MIN macro is well-defined and behaves as expected, especially in edge cases where strmleft might be negative due to an error in stream->pos() or stream->size(). Additionally, the code assumes that stream->size() returns the total size of the stream, which may not be the case for all stream types. It would be prudent to verify that stream->size() and stream->pos() are returning expected values and that strmleft is non-negative before proceeding with the calculation.

int64 strmleft = stream->size() - stream->pos();
if (strmleft < 0) {
    warning("Stream position is beyond the stream size.");
    close();
    return false;
}

@mduggan
Copy link
Contributor Author

mduggan commented Nov 14, 2023

Ok, I switched to using stream size, and added some simple tests which previously crashed and now pass.

@bluegr
Copy link
Member

bluegr commented Nov 14, 2023

@sluicebox has worked on the QT code, so perhaps he could also share some feedback on this?

@sluicebox
Copy link
Member

sluicebox commented Nov 14, 2023

I would rather see this fixed in readDefault; that function doesn't seem to be fulfilling its obligations if it's crashing instead of returning error when given a truncated stream. This change is an non-obvious workaround for readDefault not doing that.

If it's valid for the readDefault loop to do no iterations under some circumstances, then can we add an initial _fd->err() / _fd->eos() check and return -1? Is it a problem that err isn't checked at all?

If we don't know these answers, I think it's better to do nothing for now than complicate the caller code. readDefault is in need of basic documentation if nothing else. Name is unclear, how does the struct parameter affect things, what state does it alter, what are the return values... when I see that the two key Atom objects are named atom and a I reach for my No Confidence vote.

Although this issue was uncovered by a corrupt file that's been replaced, it is still worth improving the error handling. Parsers that crash (possibly with an assert?) instead of returning an error is a wide problem in the codebase; see #2486

In parseStream, atom.size is initailized to uint32 max.  If the loop in
readDefault then never executes because the stream has hit EOF, that size is
passed as-is to seek.  This can cause a crash eg in SeekableSubReadStream
because of an int overflow.

Change readDefault to return an error in a few places if we have hit eof before
a valid atom has been read.  It's ok not to do the seek in this case.
Comment on lines 181 to 189

a.offset = atom.offset;

if (_fd->eos() || _fd->err() || (_fd->pos() == _fd->size()))
return -1;

while(((total_size + 8) < atom.size) && !_fd->eos() && _fd->pos() < _fd->size() && !err) {
a.size = atom.size;
a.type = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code checks for end-of-stream (EOS) and errors before entering the loop, which is good for preventing reads past the end of the file. However, the check for (_fd->pos() == _fd->size()) is redundant after checking _fd->eos(), as they both indicate the end of the stream.

- if (_fd->eos() || _fd->err() || (_fd->pos() == _fd->size()))
+ if (_fd->eos() || _fd->err())

Commitable suggestion

[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
a.offset = atom.offset;
if (_fd->eos() || _fd->err() || (_fd->pos() == _fd->size()))
return -1;
while(((total_size + 8) < atom.size) && !_fd->eos() && _fd->pos() < _fd->size() && !err) {
a.size = atom.size;
a.type = 0;
a.offset = atom.offset;
if (_fd->eos() || _fd->err())
return -1;
while(((total_size + 8) < atom.size) && !_fd->eos() && _fd->pos() < _fd->size() && !err) {
a.size = atom.size;
a.type = 0;

Comment on lines 187 to 189
while(((total_size + 8) < atom.size) && !_fd->eos() && _fd->pos() < _fd->size() && !err) {
a.size = atom.size;
a.type = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assignment a.size = atom.size; inside the loop is incorrect. It should be set to the size read from the stream, not the size of the parent atom. This could lead to incorrect parsing behavior.

- a.size = atom.size;
+ // a.size should be set to the size read from the stream, not the parent atom's size.

Commitable suggestion

[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
while(((total_size + 8) < atom.size) && !_fd->eos() && _fd->pos() < _fd->size() && !err) {
a.size = atom.size;
a.type = 0;
while(((total_size + 8) < atom.size) && !_fd->eos() && _fd->pos() < _fd->size() && !err) {
// a.size should be set to the size read from the stream, not the parent atom's size.
a.type = 0;

Comment on lines 235 to 243
uint32 start_pos = _fd->pos();
err = (this->*_parseTable[i].func)(a);

if (!err && (_fd->eos() || _fd->err()))
err = -1;

uint32 left = a.size - _fd->pos() + start_pos;

if (left > 0) // skip garbage at atom end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error check after calling the parsing function is correct, but the calculation of left is potentially incorrect. It assumes that _fd->pos() will always be greater than start_pos, which might not be the case if the stream seeking behavior is incorrect. It would be safer to check if _fd->pos() is not less than start_pos before subtracting to avoid underflow.

- uint32 left = a.size - _fd->pos() + start_pos;
+ if (_fd->pos() < start_pos) return -1; // Add this check to prevent underflow
+ uint32 left = a.size - (_fd->pos() - start_pos);

Commitable suggestion

[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
uint32 start_pos = _fd->pos();
err = (this->*_parseTable[i].func)(a);
if (!err && (_fd->eos() || _fd->err()))
err = -1;
uint32 left = a.size - _fd->pos() + start_pos;
if (left > 0) // skip garbage at atom end
uint32 start_pos = _fd->pos();
err = (this->*_parseTable[i].func)(a);
if (!err && (_fd->eos() || _fd->err()))
err = -1;
if (_fd->pos() < start_pos) return -1; // Add this check to prevent underflow
uint32 left = a.size - (_fd->pos() - start_pos);
if (left > 0) // skip garbage at atom end

@mduggan
Copy link
Contributor Author

mduggan commented Nov 15, 2023

Thanks for the suggestion @sluicebox. After some thought I've realized there should always be at least 8 bytes to read in readDefault, otherwise there is no atom there to read. I added checks for the eos and err condition in a few more places and reverted the change in readStream. I think I prefer it now.

@sev-
Copy link
Member

sev- commented Nov 18, 2023

Excellent work and research. Thank you! Merging.

@sev- sev- merged commit ce76f5b into scummvm:master Nov 18, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants