-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace the XML SAX parser with a DOM parser #242
Comments
@dsilhavy any opinion? FYI, in dash.js the |
The sax parser is almost a 3rd of the minified-gzipped size of this project (checked via bundlephobia.com). In another project, I've used https://github.com/xmldom/xmldom for doing the parsing in node and then in the bundle for browsers, it rewrites the import to be return While I do think that worrying about a few kb here and there isn't a big deal when you're going to be streaming large video files, it's still helpful to reduce filesize when and where we can, particularly for code that ships to the client side. This seems like a good candidate to help reduce the bundled size of the project. |
From the documentation for xmldom:
I suspect that changing sax.js for xmldom would likely increase the parsing duration and the minified size, though I haven't checked. Maybe it needs a "spike" experiment. Another XML parser that could be worth considering is fast-xml-parser, though again, I haven't tried it. It appears to support all the required features, but again, I haven't looked at the size. If there is a library that uses the native parser when available, and can be compiled into both a "native" and a "not native" variant where the native one has a lower size and hopefully better performance, then I'm all in favour. |
And in the rollup config, for the "globals" build, we have
Unless we wanted to do dynamic code loading, there's probably no way to handle this without multiple builds. |
I am fine with multiple builds. |
Another XML parser to consider that claims to be faster than fast-xml-parser and with very smal bundle size: tXml. This tXml parser is included in next dash.js v5 for MPD parsing. |
After a few attempts at refactoring NOTE: This is built on top of #258 |
I chatted with @dsilvahy and others about this in the context of dash.js last week. I think it would really make sense to profile imscJS in usage to identify any performance issues - I'm not sure if that has been done? My reasoning is that I think there's a high chance that it's the callbacks that are taking time to complete, rather than than the sax parsing itself. As an example, we recently discovered an issue in TVs where switching from one string handling method to another, both native JS, made a huge performance difference - I can't exactly remember, but I think one option was to replace the beginning of a long string with an empty string, the other option was to split the long string and use the second part. It was surprising how much the performance was impacted; I wonder if similar small changes could make a big difference in overall parsing performance in imscJS. |
I agree that the performance of the parsing should be looked into, but maybe a new issue should be created to track that. For us, the main benifit for removing the parser is the reduction in the number of dependencies in players that use |
Thanks for clarifying your motivation @littlespex . |
Currently, imscJS uses sax.js to parse IMSC documents. This allows the same code to be used in both node and in the browser. This is also doubles the size of the packaged browser JS module.
An alternative approach is to move to a DOM parser, so that the native parser can be used in browsers, while reverting to a external DOM parser in node.
@bbert would this help dash.js?
@gkatsev any further thoughts on this topic?
@nigelmegitt any strong feelings?
The text was updated successfully, but these errors were encountered: