Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Growing memory usage with DASH live streams #6610

Open
MichaelSweden opened this issue May 14, 2024 · 12 comments
Open

Growing memory usage with DASH live streams #6610

MichaelSweden opened this issue May 14, 2024 · 12 comments
Labels
type: bug Something isn't working correctly
Milestone

Comments

@MichaelSweden
Copy link
Contributor

MichaelSweden commented May 14, 2024

Have you read the FAQ and checked for duplicate open issues?
Yes

If the problem is related to FairPlay, have you read the tutorial?
No

What version of Shaka Player are you using?
v4.8.3

Can you reproduce the issue with our latest release version?
Yes

Can you reproduce the issue with the latest code from main?
Yes

Are you using the demo app or your own custom app?
Own

If custom app, can you reproduce the issue using our demo app?
Not done, but tested with a minimal html page and default shaka config.

What browser and OS are you using?
Cobalt browser, Linux

For embedded devices (smart TVs, etc.), what model and firmware version are you using?
N/A

What are the manifest and license server URIs?
Sorry, have not found any free public streams showing this problem.

What configuration are you using? What is the output of player.getConfiguration()?
Not relevant, since problem also occurs with default config.

What did you do?
Upgrade shaka from v4.5.0 to v4.8.3 and playing a dash live stream.

What did you expect to happen?
Be able to play for long time.

What actually happened?
After a few hours memory usage have grown very much, several hundred megabytes. Later in our platform oom-killer kicks in and terminate the browser process.

Are you planning send a PR to fix it?
Yes

@MichaelSweden MichaelSweden added the type: bug Something isn't working correctly label May 14, 2024
@MichaelSweden
Copy link
Contributor Author

Trouble shooting
Playing in Firefox and using web dev tool show shortly after start, strings 40 MiB (count 28902). After 70 minutes, 185 MiB (count 39141). More detail:
157.261.256 74% objects Array, g_Player parser_ streamMap_ id_PT1713303398S,audio1300_0 segmentIndex references
Below this many 262.720, with objectElements[xxxx].
The shaka.media.SegmentIndex class have an array of shaka.media.SegmentReference named "references".

Since the older version doesn't show this behavior I started to test different releases to track down when the problem started to occur. Seems to be between v4.6.0 and v4.6.1, more precisely:
Issue: #5898 SegmentTemplate@media not updated after change in DASH manifest
PR: #5899 fix(DASH): SegmentTemplate@media not updated after change in manifest
Commit: 30de177 fix(DASH): SegmentTemplate@media not updated after change in manifest (#5899)
Row: this.templateInfo_.mediaTemplate = info.mediaTemplate;

If this row is removed in v4.8.3, the huge growing memory usage disappear. This row is not needed for the stream that cause this problem.

@MichaelSweden
Copy link
Contributor Author

MichaelSweden commented May 14, 2024

Thoughts and attempts to fix
My guess is that the assigment of the string will create a reference between objects that cause the garbage collector not to free up memory that is associated with these objects. However shouldn't it be only the string that is hold in memory, not other parts. This part of the code is quite complex and handle different scenarios. A dash manifest (mpd) support many functionalities and many parameters. This problem don't show up with all streams. I haven't found any free public test stream on internet that gives this problem. So maybe some special in this manifest that cause some part to behave faulty.

Well I try to get rid of this growing memory, by changing this row. It turned out to be problematic. Here is a summary of testing (by playing in Cobalt browser):
These doesn't help:

#1
this.templateInfo_.mediaTemplate = String(info.mediaTemplate);
#2
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).toString();
#3
{
  // eslint-disable-next-line no-new-wrappers
  const newString = new String(info.mediaTemplate);
  this.templateInfo_.mediaTemplate = newString.toString();
}
#4
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).trim();
Documentation of trim says: "If neither the beginning or end of str has any whitespace, a new string is still returned (essentially a copy of str)."
#5:
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).replace('\0', '\0');

These helps:

#6 ok
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).replace('_'. '_');
Could be done like this:
  static createNewString(str) {
    if (!str || str.length < 1) {
      return '';
    }
    const replaceStr = str.substring(0, 1);
    return str.replace(replaceStr, replaceStr);
  }
#7 ok
{
  /** @const {!ArrayBuffer} */
  const buf = shaka.util.StringUtils.toUTF8(String(info.mediaTemplate));
  const newString = shaka.util.StringUtils.fromUTF8(buf);
  this.templateInfo_.mediaTemplate = newString;
}
#8 ok
{ // Create a new string, which do not have any reference to the source
  const newString = '_' + info.mediaTemplate;
  this.templateInfo_.mediaTemplate = newString.substring(1);
}

I prefer the last, put in a new function.

I would appreciate getting links to live streams that trigger calls to the appendTemplateInfo function.

@shaka-bot shaka-bot added this to the v4.9 milestone May 14, 2024
@MichaelSweden
Copy link
Contributor Author

More tests
In issue #6239 Memory leak on multi-period streams
I found this link: https://d24rwxnt7vw9qb.cloudfront.net/v1/dash/e6d234965645b411ad572802b6c9d5a10799c9c1/All_Reference_Streams/4577dca5f8a44756875ab5cc913cd1f1/index.mpd
This also show growing memory problem.
I have now tested this stream with the latest main (12 May, last commit 5f8628a).
pure main: 1h 207MB (v4.8.3-main-20-g5f8628a4a)
main with fix: 1h 206MB (v4.8.3-main-20-g5f8628a4a-dirty)
Problem with this stream will not be solved by my fix.

Same test condition, but our stream:
main with fix: 1h 61MB (v4.8.3-main-20-g5f8628a4a-dirty)
pure main: 1.25h 182MB (v4.8.3-main-20-g5f8628a4a)

Test in Chromium (124.0.6367.155) with default shaka config (MB from heap snapshot with developer tools), stream from cloudfront:
pure main: 1h 80MB
main with fix: 1.25h 119MB

Test in Chromium with our stream:
pure main: 1h 7.8MB Oops, no huge growing memory!

Please don't compare MB (MegaByte) values between Cobalt and Chromium, they show different memory.

The fix:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -714,7 +714,11 @@ shaka.dash.TimelineSegmentIndex = class extends shaka.media.SegmentIndex {
     } else {
       const currentTimeline = this.templateInfo_.timeline;
 
-      this.templateInfo_.mediaTemplate = info.mediaTemplate;
+      {
+        // Create a new string, which has no reference to the source
+        const newString = '_' + info.mediaTemplate;
+        this.templateInfo_.mediaTemplate = newString.substring(1);
+      }
 
       // Append timeline
       const lastCurrentEntry = currentTimeline[currentTimeline.length - 1];

The pull request will also include an if-statement (to avoid unnecessary assignment/copy of string) and a function.

@MichaelSweden
Copy link
Contributor Author

Conclusion
I have a fix. However since the fault doesn't appear in Chromium, can it be a javascript engine problem?

So right now I'm little confused. Shall I put up a pull request, or not. Any ideas are welcome. I believe I need to think about this for a while, but guess the fix would be good. Maybe I will trouble shoot again, but with a different approach. Maybe it would be possible to solve the issue at another place in the code.

@joeyparrish
Copy link
Member

Chrome and Firefox have different JS engines (V8 and Spidermonkey, respectively), so it's very possible that GC issues could differ by browser.

However, the workaround above (build new string and take a substring) looks very fishy. A string value should not hold any references to a parent object that contains it. I have a hard time imagining a scenario where it's not an independent memory allocation.

Obviously, solving this issue would be good, but I'm hesitant to put in the fix you posted above, because of all of the following:

  1. We don't have any automation to avoid regressing on this point
  2. We don't understand the mechanism causing this in the JS engine, so...
  3. We don't actually know why this "fixes" it, so...
  4. We can't have any confidence in the fix working in the future (say, after a browser update)
  5. We don't know whether we have other instances of this problem
  6. We don't know how to avoid introducing other instances of this problem in the future

I would like to see this solved, but I feel strongly that we need to understand and document the root cause or the mechanism behind the workaround.

@MichaelSweden
Copy link
Contributor Author

I agree with you.

Cobalt using V8, but old 8.8 version (Chromium have 12.4).

@MichaelSweden
Copy link
Contributor Author

I can add, as info, that for our problematic stream the template string is always the same.

I have tested more browsers, here is a summary:

Cobalt 23               : V8  8.8.278.8  : Problem
Cobalt main (Ubuntu)    : V8  8.8.278.17 : Problem, 1.5h+180MB (fix 1h+0MB)
Chromium 124.0.6367.155 : V8 12.4.254.15 : ok
Chromium 112.0.5615.49  : V8 11.2.214.9  : ok
Chromium 90.0.4430.72   : V8  9.0.257.17 : ok
Chromium 49.0.2623.108  : V8  4.9.385.33 : ok
Firefox  125.0.3                         : Problem
Firefox  88.0                            : Problem
Edge 42.17134.1098.0                     : ok, 1h 29->33MB 168->755MB(Process memory usage)
Chromium 124.0.6367.207 (V8 12.4.254.15) : With and without fix for 3 hours VmRSS: +0MB vs +200MB
ok = No growing JS heap size in dev tool

The above "ok" is a check if the JS heap size reported by development tools in the browser doesn't show a big increase. However while testing Cobalt I don't have the possibility to check JS heap size. Instead I have looked at the memory used by the cobalt process. To do similar with Chromium (Ubuntu) I started two instances of Chromium. One running with the fix and one without the fix. Monitor the VmRSS value like this:
while true; do echo "$(cat /proc//status | grep VmRSS) : $(cat /proc//status | grep VmRSS)"; sleep 60s; done
Got this result after about 3 hours:
20240519chromium
The red is without the fix. Oops, it increase. But earlier test show that JS heap doesn't increase. Directly after this I started to play without the fix and open dev tool, to see JS heap size. After 2 hours no significant increase.

So now I'm starting to think that chromium also suffers from this problem.

// To be continued (at low speed)

@avelad
Copy link
Collaborator

avelad commented May 28, 2024

@MichaelSweden do you have any update? Thanks!

@cristian-atehortua
Copy link
Contributor

cristian-atehortua commented May 28, 2024

This is curious because mediaTemplate is create using shaka.util.StringUtils.htmlUnescape which intentionally avoids to call replace:

   static htmlUnescape(input) {
    // Used to map HTML entities to characters.
    const htmlUnescapes = {
      '&amp;': '&',
      '&lt;': '<',
      '&gt;': '>',
      '&quot;': '"',
      '&#39;': '\'',
      '&apos;': '\'',
      '&nbsp;': '\u{a0}',
      '&lrm;': '\u{200e}',
      '&rlm;': '\u{200f}',
    };

    // Used to match HTML entities and HTML characters.
    const reEscapedHtml = /&(?:amp|lt|gt|quot|apos|#(0+)?39|nbsp|lrm|rlm);/g;
    const reHasEscapedHtml = RegExp(reEscapedHtml.source);
    // This check is an optimization, since replace always makes a copy
    if (input && reHasEscapedHtml.test(input)) {
      return input.replace(reEscapedHtml, (entity) => {
        // The only thing that might not match the dictionary above is the
        // single quote, which can be matched by many strings in the regex, but
        // only has a single entry in the dictionary.
        return htmlUnescapes[entity] || '\'';
      });
    }
    return input || '';
  }

So, this should not be an issue if the media attribute has an escaped HTML sequence (?)

@MichaelSweden
Copy link
Contributor Author

I have seen that part. Our problematic stream don't have any that will be unescaped. Did this test:

--- a/lib/util/string_utils.js
+++ b/lib/util/string_utils.js
@@ -300,9 +300,9 @@ shaka.util.StringUtils = class {
 
     // Used to match HTML entities and HTML characters.
     const reEscapedHtml = /&(?:amp|lt|gt|quot|apos|#(0+)?39|nbsp|lrm|rlm);/g;
-    const reHasEscapedHtml = RegExp(reEscapedHtml.source);
+    // const reHasEscapedHtml = RegExp(reEscapedHtml.source);
     // This check is an optimization, since replace always makes a copy
-    if (input && reHasEscapedHtml.test(input)) {
+    if (input/* && reHasEscapedHtml.test(input)*/) {
       return input.replace(reEscapedHtml, (entity) => {
         // The only thing that might not match the dictionary above is the
         // single quote, which can be matched by many strings in the regex, but

Memory still grow. I guess that "since replace always makes a copy" might not be true (as for some other javascript functions that is specified like that).

I have traced the string in question and have these notes:

- This function:
shaka.dash.SegmentTemplate.appendTemplateInfo(info, periodStart, periodEnd, shouldFit, initSegmentReference)
    this.templateInfo_.mediaTemplate = info.mediaTemplate;
- is called from:
createStreamInfo(context, requestSegment, streamMap, isUpdate, segmentLimit, periodDurationMap, aesKey, lastSegmentNumber)
    @param {shaka.dash.DashParser.Context} context
    @type {shaka.dash.SegmentTemplate.SegmentTemplateInfo}
    const info = SegmentTemplate.parseSegmentTemplateInfo_(context);
    tsi.appendTemplateInfo(info, periodStart, periodEnd, shouldFit, initSegmentReference);
- which calling:
parseSegmentTemplateInfo_(context)
    const media = MpdUtils.inheritAttribute(context, SegmentTemplate.fromInheritance_, 'media');
    mediaTemplate: media && StringUtils.htmlUnescape(media)

Let do some more testing. We will take help by this new function:

--- a/lib/util/string_utils.js
+++ b/lib/util/string_utils.js
@@ -312,6 +312,22 @@ shaka.util.StringUtils = class {
     }
     return input || '';
   }
+
+  /**
+   * Creates a copy of a string.
+   * The new returned string do not have any reference to the source string.
+   *
+   * @param {?string} str
+   * @return {string}
+   * @export
+   */
+  static createNewString(str) {
+    if (!str || str.length < 1) {
+      return '';
+    }
+    const newString = '_' + str;
+    return newString.substring(1);
+  }
 };
 
 

Let us test with this:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -201,8 +201,8 @@ shaka.dash.SegmentTemplate = class {
     const segmentInfo =
         MpdUtils.parseSegmentInfo(context, SegmentTemplate.fromInheritance_);
 
-    const media = MpdUtils.inheritAttribute(
-        context, SegmentTemplate.fromInheritance_, 'media');
+    const media = StringUtils.createNewString(MpdUtils.inheritAttribute(
+        context, SegmentTemplate.fromInheritance_, 'media'));
     const index = MpdUtils.inheritAttribute(
         context, SegmentTemplate.fromInheritance_, 'index');
 

No huge memory grow. I will do more testing. I have had very much to do last week and also this week. But I will be back with more information.

@MichaelSweden
Copy link
Contributor Author

I also tested this:

 --- a/lib/dash/dash_parser.js
+++ b/lib/dash/dash_parser.js
@@ -1891,6 +1891,12 @@ shaka.dash.DashParser = class {
 
     const segmentBase = TXml.findChild(elem, 'SegmentBase');
     const segmentTemplate = TXml.findChild(elem, 'SegmentTemplate');
+    if (segmentTemplate && segmentTemplate.attributes) {
+      if (segmentTemplate.attributes['media']) {
+        segmentTemplate.attributes['media'] = shaka.util.StringUtils
+            .createNewString(segmentTemplate.attributes['media']);
+      }
+    }
 
     // The availabilityTimeOffset is the sum of all @availabilityTimeOffset
     // values that apply to the adaptation set, via BaseURL, SegmentBase,

and memory usage will not grow huge.

I have some ideas about what going on, but need to do more tests, investigate and think it through.

@avelad avelad modified the milestones: v4.9, v4.10 May 30, 2024
@MichaelSweden
Copy link
Contributor Author

This will not fix growing memory:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -54,6 +54,10 @@ shaka.dash.SegmentTemplate = class {
 
     /** @type {shaka.dash.SegmentTemplate.SegmentTemplateInfo} */
     const info = SegmentTemplate.parseSegmentTemplateInfo_(context);
+    if (info.mediaTemplate) {
+      info.mediaTemplate =
+          shaka.util.ObjectUtils.cloneObject(info.mediaTemplate);
+    }
 
     SegmentTemplate.checkSegmentTemplateInfo_(context, info);
 

But this...

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -54,6 +54,10 @@ shaka.dash.SegmentTemplate = class {
 
     /** @type {shaka.dash.SegmentTemplate.SegmentTemplateInfo} */
     const info = SegmentTemplate.parseSegmentTemplateInfo_(context);
+    if (info.mediaTemplate) {
+      info.mediaTemplate =
+          shaka.util.StringUtils.createNewString(info.mediaTemplate);
+    }
 
     SegmentTemplate.checkSegmentTemplateInfo_(context, info);
 

will not grow memory (i.e. cloneObject vs createNewString).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something isn't working correctly
Projects
None yet
Development

No branches or pull requests

5 participants