
Creation of Seekable Files #119

Closed

SingingTree opened this issue Feb 27, 2017 · 11 comments

@SingingTree (Contributor) commented Feb 27, 2017

At the moment, implementations of the MediaStream Recording API don't write seekable WebM files. The WebM format doesn't make this particularly easy, since writing the cues in a useful fashion requires mutating the start of the file, either to write the cues there or to manipulate the seek head. However, with the MediaRecorder API, if requestData() has been run, that data is no longer available for rewriting.

Would it work to expand the spec to encompass some way to handle this?

For example:

  • expose functionality to finalise recorded blobs, allowing implementations to modify the finalised blob(s) as needed for a given container
  • expose the ability to signal that data will not be requested until the end of recording, allowing the recorder to buffer data and finalise it without concern for potential future data
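The second bullet can be sketched in plain JS. The class name and shape below are purely illustrative (no such API exists in the spec); the point is only that holding all chunks back until the end keeps the whole file available for a muxer-specific rewrite of its start.

```javascript
// Hypothetical sketch of the "buffer until stop, then finalise" proposal.
// BufferingRecorder is an invented name, not a proposed API; in a real
// implementation the chunks would arrive via MediaRecorder's ondatavailable.
class BufferingRecorder {
  constructor(mimeType) {
    this.mimeType = mimeType;
    this.chunks = []; // encoded data held back until recording ends
  }

  // Nothing is handed out early, so the start of the file can still be
  // mutated (e.g. WebM Cues / SeekHead rewritten) once recording stops.
  push(chunk) {
    this.chunks.push(chunk);
  }

  // Finalise: concatenate everything into one Blob for a container-specific
  // post-processing pass.
  finalise() {
    return new Blob(this.chunks, { type: this.mimeType });
  }
}

// Usage with fake byte chunks standing in for encoded media:
const rec = new BufferingRecorder("video/webm");
rec.push(new Uint8Array([1, 2, 3]));
rec.push(new Uint8Array([4, 5]));
const finalBlob = rec.finalise();
console.log(finalBlob.size); // 5
```

This requires an environment with the `Blob` global (browsers, or Node 18+).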
@yellowdoge (Member) commented Mar 2, 2017

@SingingTree thanks for moving this discussion to an issue. Like I said before, we've given this a bit of thought internally with the libwebm folks, and we favoured a different approach because:

  • there are tools for doing this now (albeit in C/C++) for libwebm,
  • MediaRecorder is a live recorder; modifying it to also, sometimes, be non-live would dilute the API,

but most importantly, cues reconstruction would be muxer-specific: I don't see a straightforward way to generalize this to any muxer and hence to add it to the Spec. @jan-ivar, @Pehrsons wdyt?

The alternative solution in the libwebm case is a simple polyfill (in the spirit of polyfill-first) calling CopyAndMoveCuesBeforeClusters(); this polyfill would create a new encoded Blob with the Cues and duration reconstructed. The JS could be produced using e.g. Emscripten. I wanted to get this done but haven't really found time, TBH.
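Conceptually, CopyAndMoveCuesBeforeClusters() relocates the Cues element (written at the end of a live recording) to before the Clusters. The toy function below illustrates only that byte-level move, assuming the offsets are already known; it does no real EBML parsing and no SeekHead fix-up, which the actual libwebm routine handles.

```javascript
// Toy illustration of the byte-level operation behind libwebm's
// CopyAndMoveCuesBeforeClusters(): given known offsets, emit a new buffer
// with the trailing Cues moved in front of the Clusters. A real polyfill
// would instead wrap the Emscripten-compiled libwebm routine.
function moveCuesBeforeClusters(bytes, clustersStart, cuesStart) {
  const header = bytes.slice(0, clustersStart);     // EBML header, SegmentInfo, etc.
  const clusters = bytes.slice(clustersStart, cuesStart);
  const cues = bytes.slice(cuesStart);              // Cues written at end of a live file
  const out = new Uint8Array(bytes.length);
  out.set(header, 0);
  out.set(cues, header.length);
  out.set(clusters, header.length + cues.length);
  return out;
}

// [header | clusters | cues]  ->  [header | cues | clusters]
const input = Uint8Array.from([0xaa, 0xaa, 1, 2, 3, 4, 0xcc]);
const fixed = moveCuesBeforeClusters(input, 2, 6);
console.log(Array.from(fixed)); // [170, 170, 204, 1, 2, 3, 4]
```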

@SingingTree (Contributor, Author) commented Mar 2, 2017

Regarding the muxer-specific nature: is your concern that a finalise()-style function, or indicating that data will not be read back until completion, is not enough to allow all muxers to handle this case?

In the case of the polyfill, would this have no official relation to this spec, but could be included by pages using MediaRecorder to rewrite the results of their recording to contain cues? Would there be a need for the file to already have cues written, in the sense that it's a strict move operation, or would it handle writing cues in files that didn't have any?

@Pehrsons (Collaborator) commented Mar 2, 2017

I think [1] is a good idea that would fix a number of issues. Including this one, making issue #4 support changes to tracks, supporting resolution changes in containers that don't natively support it, etc.

[1] #67 (comment)

@yellowdoge (Member) commented Mar 3, 2017

@SingingTree

Regarding the muxer specific nature: is your concern that a finalise() style function, or indicating that a data will not be read back until completion, is not enough to allow for all muxers to handle this case?

Aside from the concerns already mentioned, adding a finalise()-like function would face some operational issues spec-wise. Two cases:

  1. the user doesn't mind the UA holding on to the data for as long as needed, and indicates that by calling start() with no timeslice; at first sight, this situation would allow the implementation to rewrite the cues/length appropriately since it holds on to all the data, right? The problem here is that requestData() can be called at any time, flushing any internal memory, and dumping us into case 2.
  2. to add a finalise() method we would need to specify what data is passed into it, e.g. should this method be passed as a parameter the whole bag of Blobs received in ondatavailable? Or just some Blobs marked in some particular way? Different container formats might need to rewrite different chunks of the output, so if the answer is 'the whole bag' then please read on...

In the case of the polyfill, would this have no official relation this spec, but could be included by pages using media recorder to rewrite the results of their recording to contain cues? Would there be a need for the file to already have cues written, in the sense that it's a strict move operation, or would it handle writing cues in files that didn't have any?

Yeah, in this case the polyfill would be a node.js package that would be informatively linked from this very spec, and would consist of a single function call that gets the whole set of recorded Blobs and passes it through the mentioned function (CopyAndMoveCuesBeforeClusters), that tries to "clean up" the webm/mkv, so that it has correct Duration, Cues and a bunch of other things. IIRC, it can create the Cues from scratch. IIUC, it's very much the equivalent of mkclean for webm files.

A similar informative-thingy would be to use WebAudio to mix several audio tracks before passing them to Media Recorder: it's not strictly part of this Spec, but it's good to have an informative example detailing this... (either in the Spec, in MDN or in both).

@yellowdoge (Member) commented Mar 8, 2017

To give more context, I cloned and compiled https://github.com/webmproject/libwebm: among the generated executables there is a utility, mkvmuxer_sample, that is an example of what I suggested above. I have used it with a MediaRecorder-Chrome-produced webm (not seekable and with ∞ duration) to generate another webm file that is seekable and has the duration correctly recalculated.

Running another utility from the same folder, webm_info, I see the changes:
before:

Segment:
  SegmentInfo:
    TimecodeScale : 1000000 
    Duration(secs): -1e-09
    MuxingApp     : Chrome
    WritingApp    : Chrome

after:

Segment:
  SegmentInfo:
    TimecodeScale : 1000000 
    Duration(secs): 12.7203
    MuxingApp     : libwebm-0.2.1.0
    WritingApp    : mkvmuxer_sample
@jnoring commented Mar 8, 2017

You can also do the same with ffmpeg -i input.webm -c copy output.webm to get a remuxed file.

Honestly, if getting a seekable file only requires remuxing the file, my vote would be for a JavaScript library. My opinion is MediaRecorder should be more about doing the thing that is prohibitively hard to do in JavaScript (encoding audio and video) and less about muxing into containers.

@yellowdoge (Member) commented Mar 8, 2017

@jnoring I agree, and I didn't know that ffmpeg also reconstructs the missing parts of the file; good to know. I forked libwebm here and added an Emscripten-compiled repair_webm.cc (see also emcompile.sh) as a demo, and perhaps this can be the seed of a polyfill to cover the remuxing. It's still TBD, mostly because I need to somehow teach the C routines to read from a Blob instead of from a file, and I'm no Emscripten pro.

@SingingTree (Contributor, Author) commented Mar 9, 2017

Sounds like a reasonable solution. I'm mindful of keeping the barrier to entry low, so sounds good to me that we can both advertise the JS once it's ready and keep the interface simple.

@yellowdoge (Member) commented Apr 18, 2017

@legokichi was kind enough to provide a solution based on ts-ebml, see legokichi/ts-ebml#2 (comment) .
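For reference, the ts-ebml approach can be outlined as follows. The Decoder/Reader/tools names come from that library's README and may change between versions, so treat this as a sketch rather than a maintained integration; spliceSeekable() is the final, library-independent step of dropping the original (unseekable) metadata and prepending the rebuilt, seekable metadata.

```javascript
// Pure splice step: replace the original metadata bytes with rebuilt ones.
function spliceSeekable(refinedMetadata, originalBytes, originalMetadataSize) {
  const body = originalBytes.slice(originalMetadataSize); // keep the Clusters
  const out = new Uint8Array(refinedMetadata.length + body.length);
  out.set(refinedMetadata, 0);
  out.set(body, refinedMetadata.length);
  return out;
}

// Outline of the ts-ebml flow (assumes the ts-ebml package is installed;
// API names per its README at the time of writing).
async function makeSeekable(recordedBlob) {
  const { Decoder, Reader, tools } = require("ts-ebml");
  const decoder = new Decoder();
  const reader = new Reader();
  const buf = new Uint8Array(await recordedBlob.arrayBuffer());
  decoder.decode(buf.buffer).forEach((elm) => reader.read(elm));
  reader.stop();
  const refined = new Uint8Array(
    tools.makeMetadataSeekable(reader.metadatas, reader.duration, reader.cues)
  );
  return new Blob([spliceSeekable(refined, buf, reader.metadataSize)], {
    type: recordedBlob.type,
  });
}
```

makeSeekable() is only runnable where ts-ebml is available; the splice step itself needs nothing beyond typed arrays.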

yellowdoge closed this Apr 18, 2017

@SingingTree (Contributor, Author) commented May 4, 2017

What is the appropriate place to discuss usage of libs such as the one above, as well as stewardship of that code (if this is the means by which MediaRecorder can have seekable files, who is involved in making sure it remains so, and how)? This issue? Another one?

@fthiery commented May 11, 2017

FYI remuxing from ffmpeg is not viable (anymore?) with chromium as it will currently produce 1000 fps files, see https://trac.ffmpeg.org/ticket/6386 for details.

guest271314 referenced this issue in guest271314/MediaFragmentRecorder, Sep 2, 2019:
"HTML: Add test for <video> dispatching resize event and displaying variable video track width and height" (web-platform-tests/wpt#17821)

All files produced using the code at https://github.com/guest271314/MediaFragmentRecorder/blob/webrtc-replacetrack/MediaFragmentRecorder.html

The codecs used are

- "video/webm;codecs=vp8,opus"
- "video/webm;codecs=vp9,opus"
- "video/webm;codecs=h264"
- "video/x-matroska;codecs=h264" (see https://bugs.chromium.org/p/chromium/issues/detail?id=999580; https://plnkr.co/edit/WUVbjz?p=info)

Width and height (WidthxHeight) of the encoded frames in the files, in order:

768x576
480x240 
640x360
400x300
1280x720