Memory problems running finfo::buffer with PHP_CLI on AWS, large files #522

Open
ascaura opened this issue Aug 11, 2017 · 6 comments

ascaura commented Aug 11, 2017

While moving large files from Amazon AWS to S3 using a CakePHP shell, the burzum/cakephp-file-storage plugin, and knplabs/Gaufrette, we ran into memory problems. The problems appear to be specific to AWS and occur when calling finfo::buffer on large files with the command-line PHP interpreter.

We get the following messages:

2017-08-10 12:11:57 Warning: Warning (2): finfo::buffer(): Failed identify data 12:cannot allocate 2057250392 bytes (Cannot allocate memory)video/mp4 in [/data/repos/Platform/vendor/knplabs/gaufrette/src/Gaufrette/Adapter/AwsS3.php, line 359]
Trace:
Cake\Error\BaseErrorHandler::handleError() - CORE/src/Error/BaseErrorHandler.php, line 153
finfo::buffer() - [internal], line ??
Gaufrette\Adapter\AwsS3::guessContentType() - ROOT/vendor/knplabs/gaufrette/src/Gaufrette/Adapter/AwsS3.php, line 359
Gaufrette\Adapter\AwsS3::write() - ROOT/vendor/knplabs/gaufrette/src/Gaufrette/Adapter/AwsS3.php, line 166
Gaufrette\Filesystem::write() - ROOT/vendor/knplabs/gaufrette/src/Gaufrette/Filesystem.php, line 103
Burzum\FileStorage\Storage\Listener\AbstractListener::_storeFile() - ROOT/vendor/burzum/cakephp-file-storage/src/Storage/Listener/AbstractListener.php, line 313
Burzum\FileStorage\Storage\Listener\BaseListener::afterSave() - ROOT/vendor/burzum/cakephp-file-storage/src/Storage/Listener/BaseListener.php, line 101
Cake\Event\EventManager::_callListener() - CORE/src/Event/EventManager.php, line 414
Cake\Event\EventManager::dispatch() - CORE/src/Event/EventManager.php, line 391
Cake\ORM\Table::dispatchEvent() - CORE/src/Event/EventDispatcherTrait.php, line 78
Burzum\FileStorage\Model\Table\FileStorageTable::dispatchEvent() - ROOT/vendor/burzum/cakephp-file-storage/src/Model/Table/FileStorageTable.php, line 243
Burzum\FileStorage\Model\Table\FileStorageTable::afterSave() - ROOT/vendor/burzum/cakephp-file-storage/src/Model/Table/FileStorageTable.php, line 159
Cake\Event\EventManager::_callListener() - CORE/src/Event/EventManager.php, line 414
Cake\Event\EventManager::dispatch() - CORE/src/Event/EventManager.php, line 391
Cake\ORM\Table::dispatchEvent() - CORE/src/Event/EventDispatcherTrait.php, line 78
Burzum\FileStorage\Model\Table\FileStorageTable::dispatchEvent() - ROOT/vendor/burzum/cakephp-file-storage/src/Model/Table/FileStorageTable.php, line 243
Cake\ORM\Table::_onSaveSuccess() - CORE/src/ORM/Table.php, line 1850
Cake\ORM\Table::_processSave() - CORE/src/ORM/Table.php, line 1816
Cake\ORM\Table::Cake\ORM\{closure}() - CORE/src/ORM/Table.php, line 1723
Cake\ORM\Table::Cake\ORM\{closure}() - CORE/src/ORM/Table.php, line 1446
Cake\Database\Connection::transactional() - CORE/src/Database/Connection.php, line 680
Cake\ORM\Table::_executeTransaction() - CORE/src/ORM/Table.php, line 1447
Cake\ORM\Table::save() - CORE/src/ORM/Table.php, line 1724
Organizations\Model\Entity\Media::storeFile() - ROOT/plugins/Organizations/src/Model/Entity/Media.php, line 174
App\Shell\MigrationFileStorageShell::main() - APP/Shell/MigrationFileStorageShell.php, line 126
Cake\Console\Shell::runCommand() - CORE/src/Console/Shell.php, line 472
Cake\Console\ShellDispatcher::_dispatch() - CORE/src/Console/ShellDispatcher.php, line 230
Cake\Console\ShellDispatcher::dispatch() - CORE/src/Console/ShellDispatcher.php, line 182
Cake\Console\ShellDispatcher::run() - CORE/src/Console/ShellDispatcher.php, line 128
[main] - ROOT/bin/cake.php, line 20

2017-08-10 12:11:57 Error: [RuntimeException] Could not write the "filename.m4v" key content. in /data/repos/Platform/vendor/knplabs/gaufrette/src/Gaufrette/Filesystem.php on line 106
Stack Trace:
#0 /data/repos/Platform/vendor/burzum/cakephp-file-storage/src/Storage/Listener/AbstractListener.php(313): Gaufrette\Filesystem->write('filename...', '\x00\x00\x00 ftypM4V \x00\x00\x00...', true)
#1 /data/repos/Platform/vendor/burzum/cakephp-file-storage/src/Storage/Listener/BaseListener.php(101): Burzum\FileStorage\Storage\Listener\AbstractListener->_storeFile(Object(Cake\Event\Event))
#2 /data/repos/Platform/vendor/cakephp/cakephp/src/Event/EventManager.php(414): Burzum\FileStorage\Storage\Listener\BaseListener->afterSave(Object(Cake\Event\Event), Object(App\Model\Entity\FileStorage), true, Object(Gaufrette\Filesystem), Object(App\Model\Table\FileStorageTable))
#3 /data/repos/Platform/vendor/cakephp/cakephp/src/Event/EventManager.php(391): Cake\Event\EventManager->_callListener(Array, Object(Cake\Event\Event))
#4 /data/repos/Platform/vendor/cakephp/cakephp/src/Event/EventDispatcherTrait.php(78): Cake\Event\EventManager->dispatch(Object(Cake\Event\Event))
#5 /data/repos/Platform/vendor/burzum/cakephp-file-storage/src/Model/Table/FileStorageTable.php(243): Cake\ORM\Table->dispatchEvent('FileStorage.aft...', Array, NULL)
#6 /data/repos/Platform/vendor/burzum/cakephp-file-storage/src/Model/Table/FileStorageTable.php(159): Burzum\FileStorage\Model\Table\FileStorageTable->dispatchEvent('FileStorage.aft...', Array)
#7 /data/repos/Platform/vendor/cakephp/cakephp/src/Event/EventManager.php(414): Burzum\FileStorage\Model\Table\FileStorageTable->afterSave(Object(Cake\Event\Event), Object(App\Model\Entity\FileStorage), Object(ArrayObject), Object(App\Model\Table\FileStorageTable))
#8 /data/repos/Platform/vendor/cakephp/cakephp/src/Event/EventManager.php(391): Cake\Event\EventManager->_callListener(Array, Object(Cake\Event\Event))
#9 /data/repos/Platform/vendor/cakephp/cakephp/src/Event/EventDispatcherTrait.php(78): Cake\Event\EventManager->dispatch(Object(Cake\Event\Event))
#10 /data/repos/Platform/vendor/burzum/cakephp-file-storage/src/Model/Table/FileStorageTable.php(243): Cake\ORM\Table->dispatchEvent('Model.afterSave', Array, NULL)
#11 /data/repos/Platform/vendor/cakephp/cakephp/src/ORM/Table.php(1850): Burzum\FileStorage\Model\Table\FileStorageTable->dispatchEvent('Model.afterSave', Array)
#12 /data/repos/Platform/vendor/cakephp/cakephp/src/ORM/Table.php(1816): Cake\ORM\Table->_onSaveSuccess(Object(App\Model\Entity\FileStorage), Object(ArrayObject))
#13 /data/repos/Platform/vendor/cakephp/cakephp/src/ORM/Table.php(1723): Cake\ORM\Table->_processSave(Object(App\Model\Entity\FileStorage), Object(ArrayObject))
#14 /data/repos/Platform/vendor/cakephp/cakephp/src/ORM/Table.php(1446): Cake\ORM\Table->Cake\ORM\{closure}()
#15 /data/repos/Platform/vendor/cakephp/cakephp/src/Database/Connection.php(680): Cake\ORM\Table->Cake\ORM\{closure}(Object(Cake\Database\Connection))
#16 /data/repos/Platform/vendor/cakephp/cakephp/src/ORM/Table.php(1447): Cake\Database\Connection->transactional(Object(Closure))
#17 /data/repos/Platform/vendor/cakephp/cakephp/src/ORM/Table.php(1724): Cake\ORM\Table->_executeTransaction(Object(Closure), true)
#18 /data/repos/Platform/plugins/Organizations/src/Model/Entity/Media.php(174): Cake\ORM\Table->save(Object(App\Model\Entity\FileStorage))
#19 /data/repos/Platform/src/Shell/MigrationFileStorageShell.php(126): Organizations\Model\Entity\Media->storeFile(Array, Array)
#20 /data/repos/Platform/vendor/cakephp/cakephp/src/Console/Shell.php(472): App\Shell\MigrationFileStorageShell->main()
#21 /data/repos/Platform/vendor/cakephp/cakephp/src/Console/ShellDispatcher.php(230): Cake\Console\Shell->runCommand(Array, true, Array)
#22 /data/repos/Platform/vendor/cakephp/cakephp/src/Console/ShellDispatcher.php(182): Cake\Console\ShellDispatcher->_dispatch(Array)
#23 /data/repos/Platform/vendor/cakephp/cakephp/src/Console/ShellDispatcher.php(128): Cake\Console\ShellDispatcher->dispatch(Array)
#24 /data/repos/Platform/bin/cake.php(20): Cake\Console\ShellDispatcher::run(Array)
#25 {main}

We were able to reproduce the first warning, which we think is at the core of this issue, with the following PHP script:

<?php

// filename.m4v is a valid video file of around 311MB
$content = file_get_contents('filename.m4v', true);

$fileInfo = new \finfo(FILEINFO_MIME_TYPE);

var_dump($fileInfo->buffer($content));

?>

Serving this script through Apache/PHP-FPM doesn't cause any problems, and neither does running it with PHP-CLI on other systems. Running it with PHP-CLI on AWS, however, yields the same warning. The filename.m4v file is 311 MB. We (temporarily) set the PHP-CLI memory_limit to 3072M. On another (non-AWS) server with a memory_limit of 1024M, we do not see this issue. We suspect it has something to do with the way the AWS filesystem or memory management is set up. Note that, according to the above warning, PHP tried to allocate about 2 GB (2057250392 bytes) to parse a 311 MB file and failed, even with the limit raised to 3 GB.

We managed to work around this issue by truncating the input string to its first 1024 bytes. It appears that finfo::buffer continues to work for most/all files even when they're truncated this way?

<?php

// filename.m4v is a valid video file of around 311MB
$content = file_get_contents('filename.m4v', true); 

$fileInfo = new \finfo(FILEINFO_MIME_TYPE);

$content = substr($content, 0, 1024);

var_dump($fileInfo->buffer($content));

?>
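
The truncation workaround above still loads the whole file into memory before cutting it down. A minimal sketch of the same idea that reads only the leading bytes from disk is given here; the helper name `detectMimeFromPrefix` and the default of 1024 bytes are illustrative assumptions, not part of Gaufrette or the file-storage plugin.

```php
<?php

// Hypothetical helper (not part of Gaufrette or cakephp-file-storage):
// detect the MIME type from only the first $prefixLength bytes of a local file.
function detectMimeFromPrefix(string $path, int $prefixLength = 1024)
{
    // Offset 0 plus a length limit: only the prefix is read into memory,
    // never the full file.
    $prefix = file_get_contents($path, false, null, 0, $prefixLength);
    if ($prefix === false) {
        return false;
    }

    $fileInfo = new \finfo(FILEINFO_MIME_TYPE);

    // Returns the detected MIME type string, or false on failure.
    return $fileInfo->buffer($prefix);
}
```

Calling `detectMimeFromPrefix('filename.m4v')` would then inspect at most 1024 bytes of the file, regardless of its size on disk. This only applies when the content is available as a local file; it does not change what Gaufrette's adapter does internally.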

ascaura commented Sep 25, 2017

It turns out that truncating at 1024 bytes may cause pptx files to be detected as zip. We found at least one case. Experimentally, truncating at 10000 bytes caused the file to be correctly detected as "application/vnd.openxmlformats-officedocument.presentationml.presentation" again.
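
One way to check how the truncation length affects detection for a given file is to run finfo::buffer on several prefix lengths and compare the results. The following is only a sketch for experimentation; the helper name `mimeByPrefixLength` and the file name in the usage note are hypothetical.

```php
<?php

// Hypothetical helper for experimenting with truncation lengths: returns a
// map of prefix length => MIME type detected from that many leading bytes.
function mimeByPrefixLength(string $path, array $lengths): array
{
    $fileInfo = new \finfo(FILEINFO_MIME_TYPE);
    $results = [];

    foreach ($lengths as $length) {
        // Read only the first $length bytes of the file.
        $prefix = file_get_contents($path, false, null, 0, $length);
        $results[$length] = ($prefix === false)
            ? false
            : $fileInfo->buffer($prefix);
    }

    return $results;
}
```

For example, `print_r(mimeByPrefixLength('slides.pptx', [1024, 10000]))` would show whether the two lengths mentioned above give different answers for a particular file. The length needed for a correct result likely varies per file and per libmagic version, so any fixed cutoff should be treated as a heuristic.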

anonim1133 commented Mar 13, 2018

Have you dealt with the problem of large files in Gaufrette?

boraneksen commented Dec 10, 2019

It turns out this is indeed an AWS-specific issue; increasing the PHP memory limit increases the file size you can handle. However, raising the memory limit is something I usually avoid, so we found a workaround. If you open a stream instead of loading the content, then calling $fileInfo->file(stream_get_meta_data($content)['uri']) works and yields the same results. All the Gaufrette adapters that we used can handle the stream and will proxy it to $fileInfo->file(stream_get_meta_data($content)['uri']) instead of $fileInfo->buffer($content), which fixes this issue entirely for our use case. I am still not a fan of code breaking because of server quirks, so I will follow this up with AWS support.

The AWS support agent acknowledged the problem and redirected the issue to people who are more capable of resolving it. I will post an update as soon as I get a response.
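
The stream-based workaround described above could be sketched roughly as follows. The helper name `detectMimeFromStream` is hypothetical, and the sketch assumes the stream is backed by a local file so that its `uri` metadata entry is a usable path.

```php
<?php

// Hypothetical helper: detect the MIME type of an open stream through its
// backing file path, so the content is never loaded into a PHP string.
function detectMimeFromStream($stream)
{
    $meta = stream_get_meta_data($stream);

    // 'uri' is only useful here when the stream was opened from a real file;
    // in-memory streams such as php://memory are rejected.
    if (empty($meta['uri']) || !is_file($meta['uri'])) {
        return false;
    }

    $fileInfo = new \finfo(FILEINFO_MIME_TYPE);

    // finfo::file() reads from disk itself instead of taking a full buffer.
    return $fileInfo->file($meta['uri']);
}
```

With `$stream = fopen('filename.m4v', 'rb')`, calling `detectMimeFromStream($stream)` would take the finfo::file path rather than finfo::buffer. Streams without a file behind them return false in this sketch, so a caller would still need a fallback for those.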

mkveksas commented Apr 6, 2020

@boraneksen have you received any further info from AWS support regarding this issue?

Stadly commented Jun 11, 2020

I also ran into this issue today. Have you heard anything from AWS, @boraneksen?

Stadly commented Jun 12, 2020

On further investigation, the issue does not seem to be restricted to CLI or AWS: thephpleague/flysystem#1172
