Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for PEAR bug #21243 #10

Merged
merged 7 commits into from Nov 13, 2018
Merged

Fix for PEAR bug #21243 #10

merged 7 commits into from Nov 13, 2018

Conversation

hfig
Copy link
Contributor

@hfig hfig commented Jul 16, 2018

This is a fix for PEAR bug #21243 along the lines of what the bug filer tacituseu suggested.

Background

OLE Compound documents have two different data streams, big block stream (bbat) and small block stream (sbat). Support for reading small data streams in PEAR::OLE is broken. It reads big blocks worth of data from the small block index. Sometimes this works, if the small blocks are contiguous within one big block's worth of data.

In fact, the small block stream should be read from the big block object pointed to by the Compound document's root element start block pointer.

Fix

Open the big block object containing the small block stream when the file is opened. Use that to read the small block stream in OLE_ChainedBlockStream::stream_open().

Effectively this causes the entire big block object to be read to read any small block object. This object's data is cached in the data member of the stream object, so performance will be acceptable. However it may be problematic for extremely large Compound documents with extremely large small block streams. Though compare with the PhpSpreadsheet project which parses the entire Compound document into memory.

hfig added 2 commits July 16, 2018 16:15
end() operates on a reference requires a variable be passed not a function result.
Small blocks are not contiguous but a read from within big blocks. Fix in the nature of that outlined by the reporter.
@sanmai
Copy link
Member

sanmai commented Oct 23, 2018

Do you think you can get a test for this?

@sanmai
Copy link
Member

sanmai commented Oct 23, 2018

Test script in a nutshell:

$obj = new OLE();
$obj->read('ole.xls');
foreach ($obj->_list as $idx => $pps) if ($obj->isFile($idx)) {
  $dlen = $olz->getDataLength($idx);
  $data = $olz->getData($idx, 0, $dlen);
  var_dump($data);
}

Only problem now is getting that ole.xls.

@hfig
Copy link
Contributor Author

hfig commented Oct 26, 2018

Only problem now is getting that ole.xls.

The MS-XLS file format doesn't use the small block stream :-)

The MS-OXMSG ("Outlook" .msg files) format does use the small block stream. Here's one:

Test message.zip

Example using the file artefact:

$file = 'Test message.msg';
$knownSubject = '540065007300740020006d00650073007300610067006500'; // UTF16-LE string "Test Message"

$ole = new OLE();
$ole->read($file);

$el = $ole->_list[8];
assert($el->Name == '__substg1.0_0037001F'); // MAPI attribute PidTagSubject (ie message subject)
$dlen = $ole->getDataLength($el->No);
$data = $ole->getData($el->No, 0, $dlen);
assert($data == hex2bin($knownSubject));

sanmai added a commit that referenced this pull request Oct 29, 2018
@sanmai
Copy link
Member

sanmai commented Oct 29, 2018

I've added that test, thank you very much! Yet the build is failing with some vague Undefined offset: 4294967294.

Are you able to reproduce this error?

Then I tell PHPUnit to ignore that notice like so:

-<phpunit bootstrap="vendor/autoload.php" verbose="true" colors="true">
+<phpunit bootstrap="vendor/autoload.php" verbose="true" colors="true"
+         convertNoticesToExceptions="false"
+>

The test won't finish.

PHP Notice:  Undefined offset: 4294967294 in OLE/OLE/ChainedBlockStream.php on line 125
PHP Stack trace:
PHP   1. {main}() OLE/test.php:0
PHP   2. OLE->read() OLE/test.php:9
PHP   3. OLE->getStream() OLE/OLE.php:198
PHP   4. fopen() OLE/OLE.php:254
PHP   5. OLE_ChainedBlockStream->stream_open() OLE/OLE.php:254
PHP Notice:  Undefined index:  in OLE/OLE/ChainedBlockStream.php on line 125
PHP Stack trace:
PHP   1. {main}() OLE/test.php:0
PHP   2. OLE->read() OLE/test.php:9
PHP   3. OLE->getStream() OLE/OLE.php:198
PHP   4. fopen() OLE/OLE.php:254
PHP   5. OLE_ChainedBlockStream->stream_open() OLE/OLE.php:254

sanmai added a commit to sanmai/OLE that referenced this pull request Oct 30, 2018
sanmai added a commit that referenced this pull request Oct 30, 2018
Tag the failing with a notice test from #10 as incomplete for time being
@khorsky
Copy link

khorsky commented Nov 7, 2018

The problem with the error "Undefined offset: 4294967294" is not related to this PR.

I believe that the problem lies here. The problem is AFAIK in the difference of the PHP_INT_SIZE between different machines.

e.g. On my Windows 7 (where you do not get this error), the PHP_INT_SIZE is 4. On server with Centos 7, it is 8.

I guess that there is some overflow error or so...

When I tried if my assumption is correct, this really ugly hack make the test green:

if (PHP_INT_SIZE === 8) {
   switch ($tmp) {
      case 4294967295:
         $tmp = -1;
         break;

      case 4294967294:
         $tmp = -2;
         break;

      case 4294967293:
         $tmp = -3;
         break;
   }
}

BTW? there are no issues, so sorry for not creating a new one for this topic.

@sanmai
Copy link
Member

sanmai commented Nov 7, 2018

Oh, thanks for chiming in! That’s great, because then we can fix that! I’ll get to this in a day or two.

@sanmai
Copy link
Member

sanmai commented Nov 13, 2018

4294967295 as uint32 matches -1 as int16

@sanmai
Copy link
Member

sanmai commented Nov 13, 2018

OK, things seem to be all right now, except for a small issue that I'll fix in another PR.

@hfig, @khorsky thank you!

@sanmai sanmai merged commit dd51ded into pear:master Nov 13, 2018
@sanmai
Copy link
Member

sanmai commented Nov 19, 2018

This fix went into v1.0.0RC5, again thank you very much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants