
TsExtractor bug #278

Closed
crossle opened this issue Feb 4, 2015 · 19 comments

crossle commented Feb 4, 2015

Device: Android 4.4. After about 3~4 seconds of testing, logcat prints many lines like the ones below. GC runs frequently and the UI gets stuck as a result (you can see it if you add a progress bar on top of the SurfaceView).

 D/dalvikvm( 5562): GC_FOR_ALLOC freed 4962K, 19% free 81591K/100428K, paused 74ms, total 74ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 4971K, 19% free 81592K/100428K, paused 73ms, total 73ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 4979K, 19% free 81593K/100428K, paused 72ms, total 72ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 4988K, 19% free 81594K/100428K, paused 58ms, total 58ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 4997K, 19% free 81596K/100428K, paused 56ms, total 56ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5006K, 19% free 81597K/100428K, paused 58ms, total 58ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5015K, 19% free 81598K/100428K, paused 61ms, total 61ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5023K, 19% free 81599K/100428K, paused 67ms, total 67ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5032K, 19% free 81601K/100428K, paused 62ms, total 62ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5041K, 19% free 81602K/100428K, paused 61ms, total 61ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5051K, 19% free 81604K/100428K, paused 62ms, total 62ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5059K, 19% free 81605K/100428K, paused 59ms, total 59ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5068K, 19% free 81606K/100428K, paused 61ms, total 61ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5076K, 19% free 81607K/100428K, paused 63ms, total 63ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5085K, 19% free 81608K/100428K, paused 59ms, total 59ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5094K, 19% free 81609K/100428K, paused 57ms, total 57ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5103K, 19% free 81611K/100428K, paused 58ms, total 58ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5111K, 19% free 81612K/100428K, paused 62ms, total 62ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5120K, 19% free 81613K/100428K, paused 59ms, total 59ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5129K, 19% free 81615K/100428K, paused 59ms, total 59ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5139K, 19% free 81616K/100428K, paused 71ms, total 71ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5147K, 19% free 81617K/100428K, paused 61ms, total 61ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5155K, 19% free 81619K/100428K, paused 58ms, total 58ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5164K, 19% free 81620K/100428K, paused 58ms, total 58ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5173K, 19% free 81621K/100428K, paused 57ms, total 57ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5182K, 19% free 81622K/100428K, paused 57ms, total 57ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5191K, 19% free 81624K/100428K, paused 55ms, total 55ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5200K, 19% free 81625K/100428K, paused 57ms, total 57ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5208K, 19% free 81626K/100428K, paused 59ms, total 59ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5217K, 19% free 81627K/100428K, paused 58ms, total 58ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5226K, 19% free 81629K/100428K, paused 58ms, total 58ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5235K, 19% free 81630K/100428K, paused 56ms, total 56ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5244K, 19% free 81631K/100428K, paused 56ms, total 56ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5252K, 19% free 81632K/100428K, paused 62ms, total 62ms
D/dalvikvm( 5562): GC_FOR_ALLOC freed 5261K, 19% free 81634K/100428K, paused 72ms, total 72ms

If the link isn't valid for you, you can download the stream from Dropbox and test it with a local HTTP server.

@mine260309 (Contributor)

It is probably caused by TsExtractor reading the HTTP data very quickly and holding the samples in memory.
You can control this behavior by modifying the SampleSource that uses TsExtractor so that it keeps the number of available samples under a threshold, which limits the memory usage of TsExtractor.
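A minimal sketch of that idea, assuming a hypothetical wrapper class and threshold (neither is part of the ExoPlayer API): stop pulling TS data once the number of parsed-but-unread samples crosses a cap, and free samples as the renderer consumes them.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical throttling sketch: bound how far the extractor is allowed to
// read ahead of the renderer so parsed samples cannot pile up in memory.
final class ThrottledSampleBuffer {
  private static final int MAX_BUFFERED_SAMPLES = 256; // assumed threshold

  private final Deque<byte[]> bufferedSamples = new ArrayDeque<>();

  /** Returns true if it is safe to read and parse more TS data. */
  synchronized boolean canReadMore() {
    return bufferedSamples.size() < MAX_BUFFERED_SAMPLES;
  }

  synchronized void onSampleParsed(byte[] sampleData) {
    bufferedSamples.addLast(sampleData);
  }

  /** Called by the renderer; removing the sample makes it eligible for GC. */
  synchronized byte[] pollSample() {
    return bufferedSamples.pollFirst();
  }
}
```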


crossle commented Feb 4, 2015

@mine260309 It reads the HTTP data very fast? Then why do other TS streams play normally?

@mine260309 (Contributor)

Depending on the TS stream (and codec info), TsExtractor may discard samples or hold them until they are rendered.
In my experience, if an HLS chunk is large (e.g. 100 MiB), TsExtractor will read samples as fast as HTTP downloads them, and thus consume too much memory. But in my case there were no such GC logs.

Your case could be different, but you can analyze the heap usage to find which objects consume too much memory.
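For example, one way to capture a heap dump programmatically while playback is stuttering, for inspection in MAT or Android Studio (the output path is only an example):

```java
import android.os.Debug;
import java.io.IOException;

// Dump an HPROF snapshot during playback, then open it in MAT / Android Studio
// to see which objects (e.g. Sample byte arrays) dominate the heap.
final class HeapDumper {
  static void dumpHeap() {
    try {
      Debug.dumpHprofData("/sdcard/exoplayer-heap.hprof"); // example path
    } catch (IOException e) {
      // Best-effort diagnostics only; ignore failures.
    }
  }
}
```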


nahojkap commented Feb 4, 2015

I would agree with @mine260309 that it is most likely caused by TsExtractor reading too many samples "ahead" in the TS segment. This occurs especially at high bitrates.

There is also an issue where there are too many Samples being pooled (in the SamplePool) - this is marked as "We need to fix this" in the sources.


crossle commented Feb 5, 2015

@JohanLindquist Agreed. I traced the GC and found that allocations in Sample.expand are what keep triggering GC.


Guro commented Feb 6, 2015

Is there any easy fix for this?
I have the same problem.


nahojkap commented Feb 9, 2015

Also, TsExtractor$Sample uses a lot of byte arrays which, if pooled correctly, could reduce GC.

This is especially noticeable when the bitrate (and thus the samples) are larger - the samples are all expanded at bitrate-switch time.

Not sure what the ideal solution here is, as pooling too many byte arrays would have a big impact on memory footprint. But considering they are being used at quite a fast rate, maybe that is OK?
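A minimal sketch of the kind of byte[] pooling being suggested here (the class name and sizes are illustrative, not from the ExoPlayer sources):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical fixed-size byte[] pool: reusing equally sized buffers avoids the
// allocate/expand/discard churn that shows up as GC_FOR_ALLOC in logcat.
final class ByteArrayPool {
  private static final int BUFFER_SIZE = 64 * 1024; // assumed buffer size
  private static final int MAX_POOLED = 32;         // assumed pool cap

  private final Deque<byte[]> pool = new ArrayDeque<>();

  synchronized byte[] obtain() {
    byte[] buffer = pool.pollFirst();
    return buffer != null ? buffer : new byte[BUFFER_SIZE];
  }

  synchronized void recycle(byte[] buffer) {
    if (buffer.length == BUFFER_SIZE && pool.size() < MAX_POOLED) {
      pool.addLast(buffer); // keep for reuse instead of letting it be GC'd
    }
  }
}
```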


ojw28 commented Feb 9, 2015

Sample objects are already pooled. The ideal solution would probably be to keep data in TsPacket sized buffers right up until the samples are actually read from the extractor. Rather than copying sample data out of TsPacket buffers into intermediate Sample objects, we'd instead only parse out the information we need, such as where the sample data boundaries are within these buffers. We'd then copy directly from TsPacket buffers into SampleHolders on demand.

This is a pretty big change, and may be pretty complicated to pull off. But it would solve the issue.
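A rough sketch of that idea (all names below are illustrative, not the eventual implementation): retain the raw TS packet buffers, record only where each sample's bytes live inside them, and copy once, on demand, into the decoder-facing buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a sample is described by (buffer, offset, length) pointers into
// retained TS packet buffers instead of being copied into a Sample object.
final class SampleRegion {
  final byte[] tsPacketBuffer; // retained packet data, not copied
  final int offset;
  final int length;

  SampleRegion(byte[] tsPacketBuffer, int offset, int length) {
    this.tsPacketBuffer = tsPacketBuffer;
    this.offset = offset;
    this.length = length;
  }
}

final class PendingSample {
  final List<SampleRegion> regions = new ArrayList<>();

  /** Copies the sample into the decoder-facing buffer only when it is read. */
  int copyTo(byte[] target, int targetOffset) {
    int written = 0;
    for (SampleRegion region : regions) {
      System.arraycopy(region.tsPacketBuffer, region.offset,
          target, targetOffset + written, region.length);
      written += region.length;
    }
    return written;
  }
}
```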


Guro commented Feb 9, 2015

@ojw28
Because of this bug, ExoPlayer for HLS is currently almost unusable, especially when the bitrate is high.

Do you have any approximate timeline for when this bug will be fixed, or should we try fixing it ourselves?


ojw28 commented Feb 9, 2015

FWIW it's only "almost unusable" on lower-powered devices running pre-L versions of Android, combined with high-bitrate streams. L does much better because ART is the default VM. I plan to start looking at it this week. How soon the fix arrives will depend on how difficult it is to fix.


Guro commented Feb 9, 2015

Yep, we have 4.4 devices, and for testing purposes we have now switched from Dalvik to ART; the result is much better.
But as you know there are lots of devices still on Dalvik, so please give us an update once you have looked at the issue and know an approximate time for the fix.

Thanks.


crossle commented Feb 10, 2015

Yep, I tested on a Nexus 5 running Android 5.0 and it works well. Waiting for it to work well on Dalvik too.

@ojw28 ojw28 self-assigned this Feb 10, 2015

ojw28 commented Feb 10, 2015

I started looking at this today. I've made good progress, and hope to have something ready by the end of the week.


Guro commented Feb 10, 2015

Happy to hear that :) thanks


ojw28 commented Feb 10, 2015

I'm not quite sure how good performance will end up being. It should be noted that HLS, specifically its use of MPEG-TS as a container format, is horrendously inefficient, both for bandwidth and in terms of computational cost for the client. If you have a 1MB keyframe in your stream, it ends up being fragmented into over 5000 pieces in MPEG-TS, which the client then has to merge back together again before feeding the frame to the decoder.

You should really consider switching to DASH or SmoothStreaming using the FMP4 container format if you have any control at all over what format you're using, where a 1MB keyframe will be in exactly one piece, and can simply be copied directly into the decoder.
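For rough context on the "over 5000 pieces" figure, assuming standard 188-byte TS packets with a 4-byte header (at most 184 payload bytes per packet, fewer whenever an adaptation field is present):

```java
public final class TsFragmentationEstimate {
  public static void main(String[] args) {
    int keyframeBytes = 1024 * 1024;        // 1 MiB keyframe
    int maxPayloadPerPacket = 188 - 4;      // 184 payload bytes per TS packet
    int packets = (keyframeBytes + maxPayloadPerPacket - 1) / maxPayloadPerPacket;
    System.out.println(packets);            // 5699 -> well over 5000 packets
  }
}
```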


Guro commented Feb 11, 2015

@ojw28 Thanks for the advice. We will check whether it's possible to switch to DASH; the problem is that iOS only supports HLS.

We'll wait for your fix and check the performance.

ojw28 added a commit that referenced this issue Feb 13, 2015
This is the start of a sequence of changes to fix the ref'd
github issue. Currently TsExtractor involves multiple memory
copy steps:

DataSource->Ts_BitArray->Pes_BitArray->Sample->SampleHolder

This is inefficient, but more importantly, the copy into
Sample is problematic, because Samples are of dynamically
varying size. The way we end up expanding Sample objects to
be large enough to hold the data being written means that we
end up gradually expanding all Sample objects in the pool
(which wastes memory), and that we generate a lot of GC churn,
particularly when switching to a higher quality which can
trigger all Sample objects to expand.

The fix will be to reduce the copy steps to:

DataSource->TsPacket->SampleHolder

We will track Pes and Sample data with lists of pointers into
TsPackets, rather than actually copying the data. We will
recycle these pointers.

The following steps are approximately how the refactor will
progress:

1. Start reducing use of BitArray. It's going to be way too
complicated to track bit-granularity offsets into multiple packets,
and allow reading across packet boundaries. In practice reads
from Ts packets are all byte aligned except for small sections,
so we'll move over to using ParsableByteArray instead, so we
only need to track byte offsets.

2. Move TsExtractor to use ParsableByteArray except for small
sections where we really need bit-granularity offsets.

3. Do the actual optimization.

Issue: #278
ojw28 added a commit that referenced this issue Feb 13, 2015
- TsExtractor is now based on ParsableByteArray rather than BitArray.
  This makes it much clearer that, for the most part, data is byte
  aligned. It will allow us to optimize TsExtractor without worrying
  about arbitrary bit offsets.
- BitArray is renamed ParsableBitArray for consistency, and is now
  exclusively for bit-stream level reading.
- There are some temporary methods in ParsableByteArray that should be
  cleared up once the optimizations are in place.

Issue: #278
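To illustrate the distinction with a self-contained toy (not the actual ExoPlayer classes): a byte-aligned reader needs only a single integer position, whereas a bit-granularity reader must also carry a bit offset and handle reads that straddle byte boundaries.

```java
// Toy illustration of byte-aligned vs bit-granularity parsing state.
final class ByteReader {
  private final byte[] data;
  private int position; // a single byte offset is all we need to track

  ByteReader(byte[] data) { this.data = data; }

  int readUnsignedByte() {
    return data[position++] & 0xFF;
  }
}

final class BitReader {
  private final byte[] data;
  private int byteOffset;
  private int bitOffset; // extra state; reads may straddle byte boundaries

  BitReader(byte[] data) { this.data = data; }

  int readBits(int numBits) {
    int result = 0;
    for (int i = 0; i < numBits; i++) {
      int bit = (data[byteOffset] >> (7 - bitOffset)) & 1;
      result = (result << 1) | bit;
      if (++bitOffset == 8) {
        bitOffset = 0;
        byteOffset++;
      }
    }
    return result;
  }
}
```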
ojw28 added a commit that referenced this issue Feb 13, 2015
1. AdtsReader would previously copy all data through an intermediate
adtsBuffer. This change eliminates the additional copy step, and
instead copies directly into Sample objects.

2. PesReader would previously accumulate a whole packet by copying
multiple TS packets into an intermediate buffer. This change
eliminates this copy step. After the change, TS packet buffers
are propagated directly to PesPayloadReaders, which are required
to handle partial payload data correctly. The copy steps in the
extractor are simplified from:

DataSource->Ts_BitArray->Pes_BitArray->Sample->SampleHolder

To:

DataSource->Ts_BitArray->Sample->SampleHolder

Issue: #278
ojw28 added a commit that referenced this issue Feb 13, 2015
- Remove TsExtractor's knowledge of Sample.
- Push handling of Sample objects into SampleQueue as much
  as possible. This is a precursor to replacing Sample objects
  with a different type of backing memory. Ideally, the
  individual readers shouldn't know how the sample data is
  stored. This is true after this CL, with the exception of the
  TODO in H264Reader.
- Avoid double-scanning every H264 sample for NAL units, by
  moving the scan for SEI units from SeiReader into H264Reader.

Issue: #278
ojw28 added a commit that referenced this issue Feb 13, 2015
Use of Sample objects was inefficient for several reasons:

- Lots of objects (1 per sample, obviously).
- When switching up bitrates, there was a tendency for all Sample
  instances to need to expand, which effectively led to our whole
  media buffer being GC'd as each Sample discarded its byte[] to
  obtain a larger one.
- When a keyframe was encountered, the Sample would typically need
  to expand to accommodate it. Over time, this would lead to a
  gradual increase in the population of Samples that were sized to
  accommodate keyframes. These Sample instances were then typically
  underutilized whenever recycled to hold a non-keyframe, leading
  to inefficient memory usage.

This CL introduces RollingBuffer, which tightly packs pending sample
data into byte[]s obtained from an underlying BufferPool, which
fixes all of the above. There is still an issue where the total
memory allocation may grow when switching up bitrate, but we can
easily fix that from this point, if we choose to restrict the buffer
based on allocation size rather than time.

Issue: #278
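Conceptually, the rolling-buffer approach appends sample bytes end-to-end into fixed-size blocks obtained from a pool; a much-simplified sketch (block size and names are illustrative, not the actual implementation):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified rolling buffer: sample data is packed tightly into fixed-size
// byte[] blocks from a pool, and blocks are recycled once fully consumed.
final class RollingBufferSketch {
  private static final int BLOCK_SIZE = 64 * 1024; // illustrative block size

  private final Deque<byte[]> blocks = new ArrayDeque<>();
  private final Deque<byte[]> pool = new ArrayDeque<>();
  private int writeOffsetInLastBlock = BLOCK_SIZE;  // forces first allocation

  void append(byte[] data, int offset, int length) {
    while (length > 0) {
      if (writeOffsetInLastBlock == BLOCK_SIZE) {
        byte[] block = pool.pollFirst();
        blocks.addLast(block != null ? block : new byte[BLOCK_SIZE]);
        writeOffsetInLastBlock = 0;
      }
      int toCopy = Math.min(length, BLOCK_SIZE - writeOffsetInLastBlock);
      System.arraycopy(data, offset, blocks.peekLast(), writeOffsetInLastBlock, toCopy);
      writeOffsetInLastBlock += toCopy;
      offset += toCopy;
      length -= toCopy;
    }
  }

  /** Recycles the oldest block once the reader has fully consumed it. */
  void releaseOldestBlock() {
    boolean lastBlockIsFull = writeOffsetInLastBlock == BLOCK_SIZE;
    if (blocks.size() > 1 || (blocks.size() == 1 && lastBlockIsFull)) {
      pool.addLast(blocks.pollFirst()); // reused later, so no GC churn
    }
  }
}
```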

ojw28 commented Feb 13, 2015

Hi. Could you test this again using the latest dev branch? Thanks!


ojw28 commented Feb 13, 2015

Note: If you were previously using the dev-hls branch, you should now be using dev (dev-hls has been merged to dev).


Guro commented Feb 13, 2015

Tested it for a little bit, and it seems the issue is fixed!
Big thanks!

ojw28 added a commit that referenced this issue Feb 13, 2015
I think this is the limit of how far we should be pushing complexity
vs. efficiency. It's a little complicated to understand, but probably
worth it since the H264 bitstream is the majority of the data.

Issue: #278
ojw28 added a commit that referenced this issue Feb 13, 2015
This prevents excessive memory consumption when switching to
very high bitrate streams.

Issue: #278
@ojw28 ojw28 closed this as completed Feb 13, 2015
@google google locked and limited conversation to collaborators Jun 28, 2017