
Support non fragmented mp4 #216

Closed
netaelk opened this issue Jul 4, 2018 · 20 comments


@netaelk

netaelk commented Jul 4, 2018

I'm able to play the frag_bunny.mp4 example, but when I use my own mp4 videos I get the error:

Failed to execute 'endOfStream' on 'MediaSource': The MediaSource's readyState is not 'open'.

Using this SO answer I realized that the problem is that my videos are not fragmented as needed.

Is it possible to add support for non-fragmented mp4? Or is this a blocker inherent to the way MSE works?
My videos work just fine with a simple <video> tag.

Thanks!

@netaelk netaelk changed the title from "support non fragmented mp4" to "Support non fragmented mp4" on Jul 4, 2018
@jpiesing

Perhaps someone who understands the details of this more than I do can comment on whether it's practical for some JavaScript to reformat a non-fragmented MP4 file into fragmented MP4 format as part of the process of reading the file in and passing it to source buffers?

@jyavenard
Member

I'd say MSE should never support plain mp4. If you want to use those, then you have no need for MSE, may as well use the plain src attribute instead.

If you do want to use MSE, have a look at the mp4box.js project
https://github.com/gpac/mp4box.js/

It does exactly what you need: it converts plain MP4 on the fly and loads it into an MSE-enabled player.

@jpiesing

> I'd say MSE should never support plain mp4. If you want to use those, then you have no need for MSE, may as well use the plain src attribute instead.

Sometimes there are requirements to guarantee that all of a piece of media can be played from beginning to end without any risk of network activity, e.g. adverts. MSE appears to meet that requirement. I'm happy to be corrected, but I can't see how this can be done with the src attribute: calling MediaElement.load() does not guarantee that all of the media is loaded (even if it's just 30s in duration).

@jyavenard
Member

That's not a reason for using MSE...

Download the video into a blob, then video.src = URL.createObjectURL(blob)

done.
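That suggestion can be sketched as follows. This is an illustrative browser-side sketch, not code from this thread; the element id, URL, and function names are made up for the example:

```javascript
// Fully download a clip, then play it from memory via a Blob URL.
// No MSE involved: once the Blob exists, playback cannot trigger
// further network activity for the media itself.

async function urlToBlob(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} fetching ${url}`);
  return res.blob();
}

function blobToObjectURL(blob) {
  // The object URL keeps the Blob alive until revoked.
  return URL.createObjectURL(blob);
}

// Usage (browser only; "player" is a hypothetical <video> element):
// const video = document.getElementById("player");
// const blob = await urlToBlob("ad-30s.mp4");
// video.src = blobToObjectURL(blob);
// video.addEventListener("ended", () => URL.revokeObjectURL(video.src), { once: true });
```

Note that the whole file is held in memory, which is the limitation discussed later in this thread for large files.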

@jpiesing

> That's not a reason for using MSE...
>
> Download the video into blob, then video.src = URL.createObjectURL(blob)
>
> done.

Does that work in the real world?

@jyavenard
Member

Of course.
Plenty of people use that, including some pretty popular sites, though mostly for playing audio.

A typical use I've seen is downloading an encrypted file, decrypting it, and playing it... before the days of EME.

@netaelk
Author

netaelk commented Jul 18, 2018

thanks @jyavenard !

@netaelk netaelk closed this as completed Jul 18, 2018
@lastmjs

lastmjs commented Jul 27, 2019

> I'd say MSE should never support plain mp4. If you want to use those, then you have no need for MSE, may as well use the plain src attribute instead.
>
> Download the video into blob, then video.src = URL.createObjectURL(blob)

@jyavenard Those suggestions do not work in practice. What if you have a large file? I literally started doing what you described above, but for my use case it did not work: large files break object URLs, and I need offline playback.

Creating an object URL doesn't work well in the real world, for me at least. Large files, audio about an hour or over, crash my mobile browser on Chrome. It also takes a while to load initially, and it makes background, screen-off, track-to-track playback difficult. MSE seems like a perfect solution, as you only need to keep a small portion of audio in memory and can seamlessly move from buffer to buffer, even across tracks. This was working well for my application (a full podcast PWA with episode downloads, https://podcrypt.app), but now I'm running into this issue with mp4 files. Very disappointing, because MSE seemed like an excellent solution, but this mp4 issue is really messing it up.

@lastmjs

lastmjs commented Jul 27, 2019

Also, I want to avoid ever having to hold the entire mp4 audio file in memory (I am talking only about mp4 audio, by the way; I'm not dealing with video). I'm building a mostly client-side podcast progressive web app, where users stream or download audio from RSS feeds. I do not control the feeds or the audio files they point to. If MSE does not support unfragmented mp4 audio files, then I'll need to convert the entire mp4 audio file somewhere, either on a server or in memory on the user's device. Neither of these scenarios is ideal. I do not see why there should be a difference between mp3 audio, mp4 audio, or any other type of audio format for use with MSE. The HTML audio element can somehow play non-fragmented mp4 audio files on a chunk-by-chunk basis; I'm not sure why MSE can't.

@oncode

oncode commented Apr 10, 2020

Hi @lastmjs, were you ever able to solve the problem in some other way? I would like to split up an mp4, send it via socket to users, begin playing with the first chunk, and append the other chunks afterwards. But I think that's only possible with fragmented MP4 files.

@ntrrgc

ntrrgc commented Apr 10, 2020

Non-fragmented MP4 has a design incompatible with MediaSource Extensions.

MediaSource extensions formats have two kinds of segments (chunks of data):

  • Initialization segment (defining what tracks would be used, SPS/PPS, optional movie duration...)
  • Media segment (containing audio, video or text samples)

The main MSE spec doesn't specify the format of these segments, but they need to satisfy some basic important properties:

  • Media segments can be fed in any order, and some may never be fed (think seeks... or loading a video at 5:00, it would not make sense to have to download and feed all previous samples).
  • Initialization segments can be fed several times (think quality changes, where a new SPS/PPS needs to be fed).
  • A valid MSE stream is formed by an initialization segment, followed by any number of media segments or other initialization segments (with some restrictions, e.g. you can't change the number of tracks in the middle of the stream).

The MSE stream formats are spec'ed separately, with a registry of all MSE stream formats available in https://www.w3.org/TR/mse-byte-stream-format-registry/

These sound familiar: WebM, ISO BMFF (~MP4), MPEG-TS and MPEG Audio (AAC). But not just any WebM or ISO BMFF file qualifies!

Each MSE stream format spec defines what an initialization segment and a media segment are for that particular media format, often introducing restrictions on the features of the original format in order to honor the properties mentioned before.

In the ISO BMFF/MP4 case, an initialization segment is a moov box, and a media segment is a moof box followed by an mdat box (in other words, a fragment). There are many important restrictions, such as sample tables using default-base-is-moof so that there are no absolute file pointers, since these would break the feed-in-any-order property, making the sample tables useless and the file unplayable except in the most trivial (and useless) cases.

So, why does the MSE ISO BMFF stream format spec forbid non-fragmented files?

Non-fragmented MP4 files use a different sample table that only supports addressing with absolute file pointers. This instantly breaks the feed in any order property for any possible definition of media segments you could devise (and devising a definition of media segments for non-fragmented MP4 files would be tricky as well).

The design of the non-fragmented MP4 is essentially incompatible with MSE, since it always uses absolute stream offsets, but MSE streams cannot rely on absolute stream offsets.

That's not to say you couldn't in theory transmux a non-fragmented MP4 file into a fragmented one and use it with MSE, since your JS transmuxer -- unlike SourceBuffer.appendBuffer() -- would be aware of the absolute stream offsets in the original file and would be able to resolve them. MP4 is not a trivial format to work with, but it would be possible in some capacity. Doing it offline on your server with tools like MP4Box is much easier.
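To make the structural difference concrete, here is a minimal top-level box walker. This is an illustrative sketch, not production MP4 parsing (it assumes well-formed boxes): a fragmented file shows moof boxes after the moov, while a plain file has a single moov whose sample tables hold absolute file offsets.

```javascript
// List the top-level ISO BMFF boxes in a buffer and check whether the
// file looks fragmented (i.e. contains `moof` boxes, as MSE requires).
function listTopLevelBoxes(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const boxes = [];
  let offset = 0;
  while (offset + 8 <= view.byteLength) {
    let size = view.getUint32(offset); // 32-bit box size
    const type = String.fromCharCode(
      view.getUint8(offset + 4), view.getUint8(offset + 5),
      view.getUint8(offset + 6), view.getUint8(offset + 7));
    let headerSize = 8;
    if (size === 1) { // 64-bit "largesize" follows the type field
      size = Number(view.getBigUint64(offset + 8));
      headerSize = 16;
    } else if (size === 0) { // box extends to the end of the file
      size = view.byteLength - offset;
    }
    if (size < headerSize) break; // malformed box; stop scanning
    boxes.push({ type, size, offset });
    offset += size;
  }
  return boxes;
}

function looksFragmented(arrayBuffer) {
  return listTopLevelBoxes(arrayBuffer).some((b) => b.type === "moof");
}
```

Running this over a plain MP4 typically yields something like ftyp, moov, mdat; over an fMP4, ftyp, moov, then repeating moof/mdat pairs.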

@falk-stefan

falk-stefan commented Oct 19, 2020

I'm banging my head trying to send audio chunk-wise to a web client, but every time I take a step forward and think it works, something new breaks. I am now at the point where I am converting all audio files to fMP4 in order to "stream" them to clients. However, I am not sure how to do that using SourceBuffer.appendBuffer().

Does anybody know how I have to chunk an fMP4 in order to make it playable from any point in time?

See also "How to append fMP4 chunks to SourceBuffer?" (stackoverflow).
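One chunking scheme that fits the segment model described above, sketched under the assumption of a simple fMP4 layout (ftyp and moov up front, then repeating moof + mdat pairs, all with 32-bit box sizes): everything before the first moof is the initialization segment, and each moof with its trailing mdat is one media segment. The function name is made up for the example:

```javascript
// Split a fragmented MP4 buffer into an init segment (everything before
// the first `moof`) and media segments (each `moof` + following boxes up
// to the next `moof`). Assumes 32-bit box sizes only.
function splitFragmentedMp4(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const segments = [];
  let offset = 0;
  let initEnd = null;   // byte offset where the first moof starts
  let segStart = null;  // start of the media segment being collected
  while (offset + 8 <= view.byteLength) {
    const size = view.getUint32(offset);
    const type = String.fromCharCode(
      view.getUint8(offset + 4), view.getUint8(offset + 5),
      view.getUint8(offset + 6), view.getUint8(offset + 7));
    if (size < 8) break; // malformed (or 64-bit size, unsupported here)
    if (type === "moof") {
      if (initEnd === null) initEnd = offset;
      if (segStart !== null) segments.push(arrayBuffer.slice(segStart, offset));
      segStart = offset;
    }
    offset += size;
  }
  if (segStart !== null) segments.push(arrayBuffer.slice(segStart, offset));
  return { init: arrayBuffer.slice(0, initEnd ?? view.byteLength), segments };
}
```

Append init first via SourceBuffer.appendBuffer(), then the media segments; to start playback mid-stream, append init followed by the segment covering the target time (with default-base-is-moof set, segments can be appended out of order).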

@YordinsonPolar

Can someone help me with MP4Box? I'm trying to use the Media Source API with this tool but I don't understand the docs.

@buptsb

buptsb commented Jun 19, 2021

> Non-fragmented MP4 has a design incompatible with MediaSource Extensions. […]

Thanks @ntrrgc! That resolves all my questions about MSE.

@buptsb

buptsb commented Jun 19, 2021

There is one use case: playing large (several gigabytes) videos online, where the video content may come from the BitTorrent network (e.g. WebTorrent). Most videos on that network are not fragmented by default, so to put them in the src of a <video> tag we would need a backend server to host these videos, instead of fetching and playing video content purely in the browser.

@RavikumarTulugu

@zckevin I am working on exactly the scenario you described :-) Any tricks you were able to come up with for "unfragmented" mp4? Thanks

@buptsb

buptsb commented Jul 31, 2022

> @zckevin i am working exactly the same scenario which you quoted :-) any tricks you were able to come up for "unfragmented" mp4 ?? Thanks

I use ffmpeg-wasm to transform non-fragmented mp4 -> fragmented mp4 on the fly and feed it to MSE.

@dzek69

dzek69 commented Dec 7, 2022

@zckevin how exactly do you transform non-fragmented mp4 to fragmented ones?

I tried a variety of movflags, but I always end up with a non-fragmented file.

@samueleastdev

To fragment files you can use Bento4 or Shaka Packager.

https://www.bento4.com/documentation/mp4fragment/

The video in this post explains ISO-BMFF boxes for MSE media streaming.
https://www.acronym.ninja/acronym/631765b899fac4cc51289131/moof/movie-fragment-box

@buptsb

buptsb commented Dec 8, 2022

    const cmd = [
      "ffmpeg",
      // overwrite the output file without asking
      "-y",
      // disable interaction on standard input
      "-nostdin",
      "-loglevel info",
      // loop the input stream indefinitely
      // "-stream_loop -1",
      "-i input.file",
      // copy the video codec, transcode audio to AAC-LC
      "-c:v copy -c:a aac",
      // set the AAC channel layout; without it MSE may throw an error
      "-channel_layout stereo",
      // generate fragmented MP4 (fMP4)
      "-movflags frag_keyframe+empty_moov+default_base_moof",
      // maximum fragment size: ~1 MB
      "-frag_size 1000000",
      // minimum fragment duration: 500000 microseconds = 0.5 s
      "-min_frag_duration 500000",
      // fragment duration, in microseconds
      // "-frag_duration 1000",
      // output container format MP4/MOV
      "-f mov",
      "output.mp4",
    ].join(" ");
