Memory allocation error when scanning large music folder #212

Closed
pienk opened this Issue Apr 12, 2014 · 20 comments

Projects

None yet

8 participants

@pienk
pienk commented Apr 12, 2014

When scanning my music collection of 7416 songs the music app stops at 1336 songs imported with the following error in the apache log:

Allowed memory size of 536870912 bytes exhausted (tried to allocate 3691560553 bytes) in /var/www/owncloud/apps/music/3rdparty/getID3/getid3/getid3.php on line 1641

This is an allocation of 3.44 GB and I have an allowed memory size of 512 MB.
There is no music shown in the music app.
After the error there are 1336 rows in my oc_music_tracks table.
If I go to my files app and back to music the app tries to import some (22) more songs but the same error occurs.
My music folder is mounted via the external storage app as local storage.

If I symlink the music folder in my owncloud directory the music does appear in the music app but it stops at around the same point (1350 songs). I am able to play the songs that are imported.

@MorrisJobke
Member

@pienk Hi. Did you use the latest version? Because I included an incremental scanner (50 steps at once) to avoid such issues.

Thanks for your feedback and sorry for the anwsering delay.

@stefanriemens

I can confirm this bug report, it happens for me as well. It was scanning happily until song 5253/6766. Even increasing the memory limit to 1536M did not solve it, still a memory exhaustion error.
My music folder is also on a samba external mount (using external storage, no linux mounting). My server is running the latest owncloud version 6 on an up-to-date Scientific Linux 6, on php 5.4.27.

cat apps/music/appinfo/version
0.1.9.1-rc

@naringas

This is happening to me too on 0.1.9.1-rc

There are many messages like this one in the log:

Allowed memory size of 134217728 bytes exhausted (tried to allocate 1167823392 bytes) at /path/to/owncloud/apps/music/3rdparty/getID3/getid3/getid3.php#1641

I have many FLAC files in my library, it got stuck on 138/1520 and they're on an local external share/mount

@MorrisJobke
Member

I changed the size of the scan interval from 50 to 20 files on each scan cycle and hope this is enough to fix this issue. (it is included into v0.2 if you download it from the app store or download it from the github release packe - i patched the tar.gz there)

@MorrisJobke MorrisJobke added this to the 0.3 milestone Apr 30, 2014
@MorrisJobke MorrisJobke added the Bug label Apr 30, 2014
@naringas

Still not working for me... on 0.2
I added this $this->api->log(...); into here (which is the same place which that patch modified)

        foreach ($music as $file) {
            if(!$batch && $count >= 20) {
                // break scan - 20 files are already scanned
                break;
            }
            if(!$batch && in_array($file['fileid'], $fileIds)) {
                // skip this file as it's already scanned
                continue;
            }
            $this->api->log('resscan: count '.$count ,'debug');
            $this->update($file['path'], $userId);
            $count++;
        }

and i got this in owncloud.log

{"app":"music","message":"Rescan triggered","level":1,"time":"2014-04-30T18:22:16+00:00"}
{"app":"music","message":"resscan: count 0","level":0,"time":"2014-04-30T18:22:16+00:00"}
{"app":"music","message":"update - file.flac","level":0,"time":"2014-04-30T18:22:16+00:00"}
{"app":"music","message":"update - mimetype audio\/flac","level":0,"time":"2014-04-30T18:22:16+00:00"}
{"app":"PHP","message":"Allowed memory size of 134217728 bytes exhausted (tried to allocate 1167823392 bytes) at \/path_to_owncloud\/apps\/music\/3rdparty\/getID3\/getid3\/getid3.php#1641","level":3,"time":"2014-04-30T18:22:16+00:00"}

So maybe it's a problem with the file I'm scanning? I do not know, but it always fails in the same file.
This file is 37 MBytes (38615810)

@MorrisJobke
Member

@naringas I will look for a bigger flac file and test against. Your allowed memory is 128 MB, right?

@MorrisJobke MorrisJobke reopened this May 1, 2014
@naringas
naringas commented May 2, 2014

@MorrisJobke indeed

memory_limit = 128M
@MorrisJobke
Member

I have a file here, which causes a memory limit exception:

{"app":"music","message":"update - \/jkl\u00f6+\/14 - With You.flac","level":0,"time":"2014-05-02T14:19:37+00:00"}
{"app":"music","message":"update - mimetype audio\/flac","level":0,"time":"2014-05-02T14:19:37+00:00"}
{"app":"PHP","message":"Allowed memory size of 536870912 bytes exhausted (tried to allocate 72 bytes) at \/srv\/http\/music\/3rdparty\/getID3\/getid3\/module.audio.ogg.php#585","level":3,"time":"2014-05-02T14:19:58+00:00"}

It fails while reading the file through the oc://path/to/file/ in https://github.com/owncloud/music/blob/9a2ca55f3144733190f6212a0fb247d3045d4a4d/utility/extractorgetid3.php#L48 which is called from here: https://github.com/owncloud/music/blob/9a2ca55f3144733190f6212a0fb247d3045d4a4d/utility/scanner.php#L100

I use the latest master of getID3 in music app and tested against a plain getID3 (there is a demo browser and I located the flac file into the tree and it opens up correctly parsed)

Steps to reproduce

  • use ownCloud master/stable6
  • install music app (git clone https://github.com/owncloud/music to core/apps and enable music app)
  • upload the flac file (just ask @morrisjobke for that file - 13 MB flac) - this will trigger the scan for metadata and result in huge memory consumption untill it hits the limit.

The weird thing of this is, that it works with so many other files (flac, mp3 ogg) which are even bigger than this one.


Comment by @Glandos:

I can add that the FLAC (which I ripped) was extracted from the same CD, using the same software. Other tracks are parsed correctly, but not this particular one.

@MorrisJobke
Member

I debugged a bit and found that there is a nearly endless for loop because the count is mis-calculated.

https://github.com/owncloud/music/blob/master/3rdparty/getID3/getid3/module.audio.ogg.php#L506 The variable $CommentsCount has the value 1666318336 which results in a memory outage in my case at a loop count of around 600k.

I changed the loop condition to

for ($i = 0; $i < $CommentsCount && $i < 10000; $i++) {

and I get following error message from getID3: Corrupt FLAC file: uncompressed_audio_bytes == zero.

This is maybe of interest for @JamesHeinrich ;)

@MorrisJobke MorrisJobke added a commit that referenced this issue May 14, 2014
@MorrisJobke MorrisJobke fix for memory leak in getID3
ref #212
2951292
@JamesHeinrich JamesHeinrich added a commit to JamesHeinrich/getID3 that referenced this issue May 14, 2014
@JamesHeinrich JamesHeinrich bugfix: possible memory leak in OggFLAC a9443d5
@JamesHeinrich

I have included a similar fix in JamesHeinrich/getID3@a9443d5
If @MorrisJobke would like to send me the broken FLAC file in question for testing I would appreciate it, please email it or a link to info@getid3.org

@MorrisJobke
Member

@JamesHeinrich Thanks. I included your fix into 1ec2bce

@JamesHeinrich

@MorrisJobke thanks for the sample file. But I see nothing wrong with it as far as getID3 is concerned. The loop-breaking code is nowhere near to being invoked: $CommentsCount for that file is 13.

@MorrisJobke
Member

@JamesHeinrich Could this be caused maybe by a misalignment of the retrieved bytes through the stream? Are stream wrappers supported by your library (registered through http://www.php.net/manual/en/function.stream-wrapper-register.php )

Following are registered and used within ownCloud:

$  ack stream_wrapper_register
lib/base.php
502:        stream_wrapper_register('fakedir', 'OC\Files\Stream\Dir');
503:        stream_wrapper_register('static', 'OC\Files\Stream\StaticStream');
504:        stream_wrapper_register('close', 'OC\Files\Stream\Close');
505:        stream_wrapper_register('quota', 'OC\Files\Stream\Quota');
506:        stream_wrapper_register('oc', 'OC\Files\Stream\OC');

apps/files_external/3rdparty/aws-sdk-php/Aws/S3/StreamWrapper.php
135:        stream_wrapper_register('s3', __CLASS__, STREAM_IS_URL);

apps/files_external/3rdparty/smb4php/smb.php
515:stream_wrapper_register('smb', 'smb_stream_wrapper')

apps/files_external/3rdparty/phpseclib/phpseclib/Net/SFTP/Stream.php
773:if (function_exists('stream_wrapper_register')) {
774:    stream_wrapper_register('sftp', 'Net_SFTP_Stream');

apps/files_encryption/appinfo/app.php
32:     stream_wrapper_register('crypt', 'OCA\Encryption\Stream');

@icewind1991 As you know more about the file accessing. The problem is following: There is a file in your ownCloud and as it got accessed through the oc:// stream wrapper it results in wrong data, but this is only the case for some specific files. Could this be caused by code from core? Do you know where this interference could come from?

@icewind1991
Member

@MorrisJobke can you send me the file again

@RandolfCarter

Forgive me if that should be a completely unfitting comment, but in my understanding, a sequential scanning of files should not exhaust memory, unless there are some memory leaks during the process.

So I would see this incremental scanning rather as a workaround than as a fix (although as one which seems required anyway for being able to output the progress of the scan). But I would still be worried about the memory leaks...

@MorrisJobke
Member

@RandolfCarter You're right. this needs to be debugged.

@butonic
Member
butonic commented Sep 21, 2014

@JamesHeinrich @MorrisJobke getid3 uses ftell to get the current position of the file pointer. Despite all sanity, that is considered a bug: https://bugs.php.net/bug.php?id=30157

If you need to determine the current position, you simply need to seek with a zero offset from the current position.

As a result the oc:// stream wrappers stream_ftell() never is executed, causing the offsets to be completely wrong.

Hm ... strange, somehow ftell returns 8775 bytes after reading the fLaC syncword at the beginning of the stream. Could there be a problem with buffering? I still don't get the debugger to stop in stream_ftell... damn ... this is a tough one.

Ping me if you need the flac file.

@butonic
Member
butonic commented Sep 21, 2014

@icewind1991 With plain getID3 (no stream wrappers) the same file parses just fine.

@butonic butonic referenced this issue in owncloud/core Sep 21, 2014
Merged

Fix oc stream seek #11204

@MorrisJobke
Member

Awesome. It seems that @butonic found a fix for this: owncloud/core#11204 ... I will test this tomorrow.

@MorrisJobke
Member

A fix will be available in ownCloud 7.0.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment