Provide a means to select only part of a screen to capture #105

Closed
guest271314 opened this issue Apr 28, 2019 · 29 comments

@guest271314 guest271314 commented Apr 28, 2019

Firefox provides a "Take a Screenshot" feature which allows the user to select only a portion of a screen. That option should be provided for getDisplayMedia() in the form of constraints where specific coordinates can be passed, e.g., using .getBoundingClientRect() in the form of {topLeft:<pixelCoordinate>, topRight:<pixelCoordinate>, bottomLeft:<pixelCoordinate>, bottomRight:<pixelCoordinate>}, or at a selection UI, similar to how Firefox implements the "Take a Screenshot" feature. Use case: the user only wants to share a specific element, e.g. a <video>.
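The proposed constraint shape could, hypothetically, be derived from an element's bounding rectangle. A minimal sketch; note that the corner-constraint names and this mapping are part of the proposal, not any existing API:

```javascript
// Hypothetical helper: map a DOMRect-like object to the proposed
// corner-coordinate constraints. The constraint names (topLeft, topRight,
// bottomLeft, bottomRight) exist only in this proposal.
function rectToProposedConstraints({ left, top, width, height }) {
  return {
    topLeft: { x: left, y: top },
    topRight: { x: left + width, y: top },
    bottomLeft: { x: left, y: top + height },
    bottomRight: { x: left + width, y: top + height }
  };
}

// Example: a 300x150 element positioned at (10, 20)
const c = rectToProposedConstraints({ left: 10, top: 20, width: 300, height: 150 });
// c.bottomRight → { x: 310, y: 170 }
```

In a page, the input would come from element.getBoundingClientRect().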

@martinthomson martinthomson commented Apr 29, 2019

I don't see how this would be compatible with user selection of content.

@guest271314 guest271314 commented Apr 29, 2019

@martinthomson The feature request would be compatible and consistent with user selection of content. The "Take a Screenshot" feature of Firefox Developer Tools provides a basic template for how to implement the feature: selection of the content that should be captured, which could be translated into the appropriate corresponding MediaStreamTrack constraints.

resizeMode: "crop-and-scale" roughly provides such functionality now, if a user takes the time to test and determine the resulting output.

One example use case is getDisplayMedia() being executed for a window opened with window.open(), where it is not possible to hide the scrollbars, location bar, and title bar, though the expected output is a webm video without scrollbars, location bar, and title bar. After a day of testing, was finally able to use CSS to achieve the requirement (at Chromium 73) just moments ago at https://github.com/guest271314/MediaFragmentRecorder/tree/getdisplaymedia-webaudio-windowopen

<!DOCTYPE html>
<html>
<head>
  <title>Record media fragments to single webm video using getDisplayMedia(), AudioContext(), window.open(), MediaRecorder()</title>
</head>
<body>
  <h1 id="click">open window</h1>
  <script>
    const click = document.getElementById("click");
    const go = ({
        width = 320, height = 240
      } = {}) => (async() => {
        const html = `<!DOCTYPE html>
                        <html>
                          <head>
                            <style>
                              * {padding:0; margin:0;overflow:hidden;} 
                              #video {cursor:none; object-fit:cover;object-position: 50% 50%;} 
                              video::-webkit-media-controls,audio::-webkit-media-controls {display:none !important;}
                            </style>
                          </head>
                          <body>
                            <!-- add 30 for title and location bars -->
                            <video id="video" width="${width}" height="${height}"></video>
                          </body>
                        </html>`;

        const blob_url = URL.createObjectURL(new Blob([html], {
          type: "text/html"
        }));

        let done;
        const promise = new Promise(resolve => done = resolve);

        const mediaWindow = window.open(blob_url, "getDisplayMedia", `width=${width},height=${height + 30},alwaysOnTop`);

        mediaWindow.addEventListener("load", async e => {
          console.log(e);
          const mediaDocument = mediaWindow.document;
          const video = mediaDocument.getElementById("video");

          const displayStream = await navigator.mediaDevices.getDisplayMedia({
            video: {
              cursor: "never", // this has little/no effect https://github.com/web-platform-tests/wpt/issues/16206
              displaySurface: "browser"
            }
          });

          console.log(displayStream, displayStream.getTracks());

          let urls = await Promise.all([{
            src: "https://upload.wikimedia.org/wikipedia/commons/a/a4/Xacti-AC8EX-Sample_video-001.ogv",
            from: 0,
            to: 4
          }, {
            src: "https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm#t=10,20"
          }, {
            from: 55,
            to: 60,
            src: "https://nickdesaulniers.github.io/netfix/demo/frag_bunny.mp4"
          }, {
            from: 0,
            to: 5,
            src: "https://raw.githubusercontent.com/w3c/web-platform-tests/master/media-source/mp4/test.mp4"
          }, {
            from: 0,
            to: 5,
            src: "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4"
          }, {
            from: 0,
            to: 5,
            src: "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerJoyrides.mp4"
          }, {
            src: "https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerMeltdowns.mp4#t=0,6"
          }].map(async({...props
          }) => {
            const {
              src
            } = props;
            const blob = (await (await fetch(src)).blob());
            return {
              blob,
              ...props
            }
          }));
          click.textContent = "click popup window to start recording";

          const canvas = document.createElement("canvas");
          canvas.width = width;
          canvas.height = height;
          const ctx = canvas.getContext("2d");
          ctx.font = "20px Monospace";
          ctx.fillText("click to start recording", 0, height / 2);
          video.poster = canvas.toDataURL();
          mediaWindow.focus();

          mediaWindow.addEventListener("click", async e => {
            video.poster = "";
            const context = new AudioContext();
            const mediaStream = context.createMediaStreamDestination();
            const [audioTrack] = mediaStream.stream.getAudioTracks();
            const [videoTrack] = displayStream.getVideoTracks();

            videoTrack.applyConstraints({
              cursor: "never",
              width: 320,
              height: 240,
              aspectRatio: 1.33,
              resizeMode: "crop-and-scale"
            });

            mediaStream.stream.addTrack(videoTrack);
            console.log(videoTrack.getSettings());
            const source = context.createMediaElementSource(video);
            source.connect(context.destination);
            source.connect(mediaStream);

            [videoTrack, audioTrack].forEach(track => {
              track.onended = e => console.log(e);
            });

            const recorder = new MediaRecorder(mediaStream.stream, {
              mimeType: "video/webm;codecs=vp8,opus",
              audioBitsPerSecond: 128000,
              videoBitsPerSecond: 2500000
            });
            recorder.addEventListener("error", e => {
              console.error(e)
            });
            recorder.addEventListener("dataavailable", e => {
              console.log(e.data);
              done(URL.createObjectURL(e.data));
            });
            recorder.addEventListener("stop", e => {
              console.log(e);
              [videoTrack, audioTrack].forEach(track => track.stop());
            });
            video.addEventListener("loadedmetadata", async e => {
              console.log(e);
              try {
                await video.play();
              } catch (e) {
                console.error(e);
              }
            });
            try {
              for (let {
                  from,
                  to,
                  src,
                  blob
                }
                of urls) {
                await new Promise(resolve => {
                  const url = new URL(src);
                  if (url.hash.length) {
                    [from, to] = url.hash.match(/\d+|\d+\.\d+/g).map(Number);
                  }
                  const blobURL = URL.createObjectURL(blob);
                  video.addEventListener("play", e => {
                    if (recorder.state === "inactive") {
                      recorder.start()
                    } else {
                      if (recorder.state === "paused") {
                        recorder.resume();
                      }
                    }
                  }, {
                    once: true
                  });

                  video.addEventListener("pause", e => {
                    if (recorder.state === "recording") {
                      recorder.pause();
                    }
                    resolve();
                  }, {
                    once: true
                  });
                  video.src = `${blobURL}#t=${from},${to}`;
                })
              }
              recorder.stop();
            } catch (e) {
              throw e;
            }
          }, {
            once: true
          });
        });
        return await promise;
      })()
      .then(blobURL => {
        console.log(blobURL);
        const video = document.createElement("video");
        video.controls = true;
        document.body.appendChild(video);
        video.src = blobURL;
      }, console.error);

    click.addEventListener("click", e => {
      go();
    }, {
      once: true
    });
  </script>
</body>
</html>

The Media Capture Screen Share API should provide a means to achieve such a requirement, either by fine-tuning the constraints for such selection, or by allowing the user to select the content to be captured in similar fashion to the Firefox "Take a Screenshot" Developer Tool.

What is the reason such functionality should not be provided by this API?

@guest271314 guest271314 commented Apr 29, 2019

@martinthomson BTW cursor: "never" currently does not produce the expected result. Again, CSS needs to be used to not display the cursor.

@guest271314 guest271314 commented Apr 29, 2019

@martinthomson The above workarounds would not be necessary if an HTML <video> could be considered a "device" or window, where MediaRecorder would not stop when the underlying media resource changes as the src attribute of the HTMLVideoElement is changed (new MediaStreamTracks added to the captured stream); that is, the MediaStreamTrack of kind "video" could be configured with the option to be read as a single MediaStreamTrack even when src changes (similar to RTCPeerConnection <transceiverInstance>.receiver.track, and the single video track created by getDisplayMedia()).

@guest271314 guest271314 commented Apr 29, 2019

@martinthomson Tested the code in the Mozilla browser at the console; getDisplayMedia() does not execute at plnkr. The workaround to not record the title and location bars at Chromium 73

// add 30 to height for Chromium 73 to not record title and location bar
// Firefox 68 records location bar using the same code
// TODO: adjust to not record location bar at Firefox
const mediaWindow = window.open(blob_url, "getDisplayMedia", `width=${width},height=${height + 30},alwaysOnTop`);

which sets the window height 30px greater than the height MediaStreamTrack constraint, still records the location bar at Nightly.
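The +30 workaround above can be factored into a small helper. A sketch, assuming the 30px browser-chrome height observed at Chromium 73; this is an empirical offset, not a specified constant, and differs per browser (Firefox 68 behaves differently, per the comment above):

```javascript
// Build a window.open() features string whose outer height is padded by an
// assumed browser-chrome height, so the captured inner area matches the
// target width/height. The default of 30px is only an observation at
// Chromium 73; it is not defined by any specification.
function windowFeatures(width, height, chromeHeight = 30) {
  return `width=${width},height=${height + chromeHeight},alwaysOnTop`;
}

// windowFeatures(320, 240) → "width=320,height=270,alwaysOnTop"
```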

@youennf youennf commented Apr 29, 2019

It seems there are multiple requirements here.

One is for a browser to allow a user to select part of a screen. I do not see benefits in adding a constraint for that, so this seems like an implementation decision, not a spec one.
Implementing this partial screen selection probably needs some thought. For instance, how to present to users which part of the screen is being captured.

The second requirement is to capture a browser tab without title bar, location bar, slider... This seems somehow more closely related to cursor. Are you only interested in the latter?

@guest271314 guest271314 commented Apr 29, 2019

@youennf Am only interested in concatenating multiple media files into a single media file. While attempting to achieve that requirement, have tried several approaches; summaries of some of them can be found at each branch of the above-linked repository.

One is for a browser to allow a user selecting a part of a screen. I do not see benefits in adding a constraint for that so this seems like an implementation decision, not a spec one.
Implementing this partial screen selection probably need some thoughts. For instance, how to present to users which part of the screen is being captured.

The live window, tab, or application is already presented to the user at Chromium 73, minimized in a grid display. Firefox provides a dropdown next to the location bar. The selected screen in the grid at Chromium could be scaled to 2/3, or the entire screen selected for capture, if the user, for example, selects a radio button to toggle on exact selection, in similar manner to how Firefox Developer Tools provides a means to select a region of a screen. Notice that the specification already has an exact constraint modifier. This proposal is not asking for anything that should not already be provided: given that a specifier named exact exists, this is the normal course of updating an API with more exacting features; specifications and standards are, in general, not static.
The requirement is to capture the exact dimensions of a given display, monitor, screen, application, window, <video>, etc.

[Screenshot: Firefox Developer Tools "Take a Screenshot" region selection]

The selected screen can be translated to sx, sy, sWidth, sHeight constraints. Alternatively, the user can programmatically set the input constraints directly videoTrack.applyConstraints({screenX:100, screenY:100, screenWidth, screenHeight}).

[Screenshot: sx, sy, sWidth, sHeight selection illustration]

The benefits for the end-user should be self-evident: exact capture of a given screen or application, without unnecessary portions of the screen or application. Such a specification extension of sx, sy, sWidth, sHeight is consistent with the current width, height, resizeMode, etc. constraints. For the getDisplayMedia() API (and potentially the ImageCapture API), since the entire screen is being captured, it is reasonable to have constraints which select only part of a screen, similar to the CanvasRenderingContext2D.drawImage() https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/drawImage implementation

sx Optional
The x-axis coordinate of the top left corner of the sub-rectangle of the source image to draw into the destination context.

sy Optional
The y-axis coordinate of the top left corner of the sub-rectangle of the source image to draw into the destination context.

sWidth Optional
The width of the sub-rectangle of the source image to draw into the destination context. If not specified, the entire rectangle from the coordinates specified by sx and sy to the bottom-right corner of the image is used.

sHeight Optional
The height of the sub-rectangle of the source image to draw into the destination context.
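Applying the drawImage() defaults quoted above to the proposed constraints: an omitted sWidth or sHeight would extend the sub-rectangle to the bottom-right corner of the source. A sketch of that normalization; the sx/sy/sWidth/sHeight constraint names are this proposal's, not an existing specification:

```javascript
// Normalize a proposed crop against a source surface, defaulting sWidth and
// sHeight to the remainder of the source from (sx, sy) to the bottom-right
// corner, mirroring the drawImage() semantics quoted above. Hypothetical:
// these are proposed constraints, not part of mediacapture-screen-share.
function normalizeCrop(sourceWidth, sourceHeight, { sx = 0, sy = 0, sWidth, sHeight } = {}) {
  return {
    sx,
    sy,
    sWidth: sWidth !== undefined ? sWidth : sourceWidth - sx,
    sHeight: sHeight !== undefined ? sHeight : sourceHeight - sy
  };
}

// normalizeCrop(1920, 1080, { sx: 100, sy: 50 })
// → { sx: 100, sy: 50, sWidth: 1820, sHeight: 1030 }
```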

The second requirement is to capture a browser tab w/o title, location bar, slider... This seems somehow more closely related to cursor. Are you only interested in the latter?

The constraint cursor:"never" does not work. Filed an issue at wpt to address the fact that there are no wpt tests coded to independently verify that fact. In the meantime, created workarounds for the non-working constraint using CSS.

Created a workaround for not recording the location and title bars at Chromium 73 after a day of trying different approaches. The original branch using getDisplayMedia() to record multiple media resources used requestFullScreen() to avoid having to address not capturing portions of the default browser GUI. Tried kiosk mode at Chromium, though functionality in that mode is minimal. Decided to revisit excluding portions of the screen that do not need to be captured. The above code, modified since posting the comment, does not record the title and location bars at Chromium 73, mainly using CSS. Will dive into Firefox next. These workarounds could be omitted if existing technologies were incorporated at least into this API, which by its very name is relevant to capturing an entire screen; thus it is reasonable to conclude that there might be portions of an entire screen which would need to be excluded, or, conversely, included to the exclusion of certain parts of the screen. Am only requesting rectangular selection, not triangles, circles, or parallelograms, though those shapes, too, should ultimately be possible.

@youennf youennf commented Apr 29, 2019

What I mean is that crop constraints at getDisplayMedia() call time do not make a lot of sense.

Allowing a web page to efficiently manipulate video tracks (be they screen, camera or peer connection originating) with operations like cropping makes sense.
This relates somehow to requirements expressed in https://www.w3.org/TR/webrtc-nv-use-cases/#funnyhats

@guest271314 guest271314 commented Apr 29, 2019

@youennf Crop constraints at getDisplayMedia() call time make sense from the perspective here. Yes, similar to the linked document, though, again, the requirement for this user is to capture video from an HTMLVideoElement (or, preferably, without having to use a <video> element at all; e.g., new OfflineVideoContext(data).startRendering(), similar to OfflineAudioContext().startRendering(): decode/encode/read/write media in a Worklet or Worker context, potentially faster than "real-time", without necessarily having to play the media using the browsers' Web Media Player implementation; that is, without the image a <video> presents on the browser surface), where multiple media files (audio and/or video) could be concatenated into a single matroska or webm file. Motivation: https://creativecommons.org/about/videos/a-shared-culture; https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm (create such a video in the ostensibly FOSS browser without using any third-party code, using only the code shipped with the FOSS browser; or determine that such a requirement is not possible); this is essentially the output of the browsers' respective Web Media Player implementations. Initial attempt: https://stackoverflow.com/questions/45217962/how-to-use-blob-url-mediasource-or-other-methods-to-play-concatenated-blobs-of, described in more detail at whatwg/html#2824, whatwg/html#3593, w3c/media-source#209, w3c/mediacapture-record#166 and w3c/mediacapture-main#575. HTMLMediaElement.captureStream() and MediaRecorder currently do not provide a means to capture multiple video media sources when the <video> src attribute is changed. replaceTrack() and getDisplayMedia() are the closest have come to meeting the requirement at both Chromium and Firefox.
Chromium crashes the tab when MediaSource is captured using captureStream() (w3c/media-source#190); canvas.captureStream() with AudioContext.createMediaStreamDestination() has a noticeable lack of quality ("pixelation") at closing image frames; replaceTrack() mutes a minimal though noticeable portion of the last second of audio; getDisplayMedia() has the issue of requiring two user gestures and CSS to remove portions of the screen other than the <video>; the ended event is dispatched, or not, differently at Chromium and Firefox (w3c/mediacapture-fromelement#78); and other bugs or interoperability concerns were found while trying to compose code for both Chromium and Firefox. Still not sure why MediaRecorder is specified to stop recording if a MediaStreamTrack is added to or removed from the MediaStream.

If the <video> element could be listed as a "device", e.g., "videooutput" at enumerateDevices(); or <video> could be selected as an application or window for getDisplayMedia() (the Web Media Player which drives the HTML <video> element is essentially an application); or MediaRecorder could be modified to not stop recording when the underlying media resource of a <video> is changed via the src attribute; or the functionality of replaceTrack() could be added to the MediaRecorder (MediaRecorder.stream.replaceTrack(withTrack)) and/or MediaStream specification (MediaStream.replaceTrack(withTrack)), which would not dispatch stop at MediaRecorder (the same as transceiver.receiver.track being the same track with potentially different media sources on a single "timeline"), then that might lead to resolution of the requirement.

Edit

Taking it a step further: an API which exposes the respective browsers' MediaDecoder/MediaEncoder and webm writer code, and/or a human-readable form of EBML for the ability to write audio and video as XML (https://github.com/vi/mkvparse; https://github.com/vi/mkvparse/blob/master/mkvcat.py) or JSON, then compress, if necessary, into a webm container (achieved something similar, uncompressed, using <canvas>, the Web Animations API (images as keyframes), and the Web Audio API, with the ability to adjust playback rate and reverse the media; synchronization of audio with images was an issue; lost the tests).

@guest271314 guest271314 commented May 4, 2019

@youennf Firefox includes viewportOffsetX and viewportOffsetY in the object returned by navigator.mediaDevices.getSupportedConstraints(), though the constraints are not set for the MediaStreamTrack of kind "video". Is there a specification which clearly indicates which constraints can be applied to a MediaStreamTrack, depending on which API is used to get the MediaStream and MediaStreamTrack?

@guest271314 guest271314 commented May 4, 2019

@youennf FWIW, after testing various approaches, which included Firefox freezing the operating system on several occasions requiring a hard reboot, setting dom.disable_window_open_feature.location at "about:preferences" at Firefox results in the title bar not being recorded. Not sure how such functionality could be incorporated into the Media Capture Screen Share API, though it would be helpful for the use case of not sharing specified regions of the captured window, screen, or application.

@alvestrand alvestrand commented May 9, 2019

So what happens when you capture part of the screen, and the user moves the window?

@guest271314 guest271314 commented May 9, 2019

@alvestrand What do you mean by "moves the window"?

@jan-ivar jan-ivar commented May 9, 2019

Firefox includes viewportOffsetX and viewportOffsetY

These are non-spec and should be removed. They originate from a browser-tab sharing experiment Firefox had behind a pref years ago, and worked solely with applyConstraints to move which area of a web page (specifically) to capture, independent of end-user scrolling. The idea was to let a viewer, using a data channel, scroll independently from the presenter, with the obvious privacy implications that follow. We have no current plans to revive this effort.

In short, they weren't general-purpose pixel croppers, which, I agree with @youennf, belong elsewhere.

Is there a specification which clearly indicates which constraints are capable of being applied to a MediaStreamTrack

Track constraints are source-specific. If the specification of a source does not mention a constraint, then it is not supported for tracks from that source. w3c/mediacapture-main#578 is hoping to clarify this.

@guest271314 guest271314 commented May 9, 2019

@jan-ivar Am not certain what the issue, hesitancy, and/or reluctance is with adding a constraint which provides a means to capture only part of a screen that will be shared; that is, if am gathering the hesitancy to add this constraint correctly from the responses so far in this issue.

This feature request is consistent with screen capture (still image and live stream) programs, and in fact consistent with the concept of setting width, height and resizeMode. This proposal simply requests a means to further refine the screen region to be captured.

It is reasonable to have the ability to select only part of a screen both by a selection tool to physically outline the portions of the screen to be captured and by setting sx, sy, sWidth, sHeight constraints.

Why should such functionality not be specified?

@guest271314 guest271314 commented May 9, 2019

@jan-ivar FWIW, dove into getDisplayMedia() and removing specific portions of the screen while trying to record multiple media resources to a single webm file using MediaRecorder (various attempts, nuances, and caveats can be read at the branches at https://github.com/guest271314/MediaFragmentRecorder); that is how this issue came about, and why, moments ago, filed https://github.com/w3c/mediacapture-main/issues/586 requesting that replaceTrack(), a method which, browsing its history, you championed, be defined as a method of MediaStream.

@guest271314 guest271314 commented Jun 2, 2019

A rudimentary POC https://plnkr.co/edit/UmrSwN?p=preview to select only part of a screen for a screen shot (the movement and resizing of the selection element can be fine-tuned to have similar functionality as, for example, https://codepen.io/zz85/pen/gbOoVP)

<!DOCTYPE html>
<html>
<head>
</head>
<body>
  <script>
    (async() => {
      function screenShotSelection() {
        const div = document.createElement("div");
        const styles = {
          background: "transparent",
          display: "block",
          position: "absolute",
          outline: "thin solid gold",
          width: "100px",
          height: "100px",
          left: "calc(50vw)",
          top: "calc(50vh)",
          overflow: "auto",
          resize: "both",
          cursor: "move",
          zIndex: 100
        };
        div.id = "screenShotSelector";
        div.ondragstart = e => {
          console.log(e);
        };
        div.ondragend = e => {
          const {
            clientX, clientY
          } = e;
          e.target.style.left = clientX + "px";
          e.target.style.top = clientY + "px";
        }
        Object.assign(div.style, styles);
        div.draggable = true;
        document.body.appendChild(div);
      }
      const video = document.createElement("video");
      video.controls = true;
      video.style.objectFit = "cover";
      video.style.lineHeight = 0;
      video.style.fontSize = 0;
      video.style.margin = 0;
      video.style.border = 0;
      video.style.padding = 0;
      video.loop = true;

      video.src = "https://upload.wikimedia.org/wikipedia/commons/d/d9/120-cell.ogv";

      document.body.insertAdjacentElement("afterbegin", video);

      video.addEventListener("play", async e => {
        screenShotSelection();
        const bounding = document.getElementById("screenShotSelector").getBoundingClientRect();
        const stream = await navigator.mediaDevices.getDisplayMedia({
          video: {cursor:"never"} // has no effect at Chromium
        });

        const [videoTrack] = stream.getVideoTracks();
        const imageCapture = new ImageCapture(videoTrack);
        const osc = new OffscreenCanvas(100, 100); // dynamic
        const osctx = osc.getContext("2d");
        screenShotSelector.addEventListener("dblclick", async e => {
          console.log(window.getComputedStyle(e.target).left, window.getComputedStyle(e.target).top);
          osctx.drawImage(await imageCapture.grabFrame(), parseInt(window.getComputedStyle(e.target).left), parseInt(window.getComputedStyle(e.target).top), 100, 100, 0, 0, 100, 100); // dynamic
          console.log(bounding, URL.createObjectURL(await osc.convertToBlob()));
          videoTrack.stop();
          video.pause();
        }, {
          once: true
        })
      }, {
        once: true
      });

    })();
  </script>
</body>
</html>

[Screenshot: selection element POC, region selected]

[Screenshot: selection element POC, captured result]

Ideally, specific to a screen shot (i.e., #107):

  1. once permission is granted to capture a screen, application, or tab, a draggable and resizable element having a transparent background is appended to a fullscreen selection UI;
  2. once a region has been selected, for example using a dblclick event on the draggable and resizable element, a MediaStreamTrack is created having the dimensions of the selection element, existing only for the time needed to capture a single image, without the entire screen being captured and then drawn again to resize to the element dimensions; the MediaStreamTrack is immediately stopped and ended once the single image is captured.

Where the permission is not for a screen shot, but for a continuous media stream to be captured:

  1. only the selected region is captured (for example, the bounding client rectangle of the <video> element in the example code), not the entire screen;
  2. all other MediaStream and MediaStreamTrack functionality remains the same.
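In either flow, a user-drawn selection rectangle would need to be clamped to the bounds of the captured surface before any frame or track uses it (the POC above draws whatever rectangle the element reports). A hypothetical sketch of that clamping step:

```javascript
// Clamp a user-drawn selection rectangle (CSS-pixel left/top/width/height,
// as reported by getBoundingClientRect()) to the bounds of the captured
// surface. Hypothetical helper illustrating the flow described above; no
// such step is defined by mediacapture-screen-share.
function clampSelection({ left, top, width, height }, screenWidth, screenHeight) {
  const x = Math.max(0, Math.min(left, screenWidth));
  const y = Math.max(0, Math.min(top, screenHeight));
  return {
    x,
    y,
    width: Math.min(width, screenWidth - x),
    height: Math.min(height, screenHeight - y)
  };
}

// A selection dragged partly off-screen is trimmed to the visible region:
// clampSelection({ left: -10, top: 20, width: 200, height: 2000 }, 1920, 1080)
// → { x: 0, y: 20, width: 200, height: 1060 }
```

The resulting rectangle could then feed the drawImage() crop arguments used in the POC, or the proposed sx/sy/sWidth/sHeight constraints.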
@aboba aboba added the question label Jun 20, 2019
@hoolahoop hoolahoop commented Aug 12, 2019

I would use this in my implementation of WebRTC. Getting exact dimensions is useful, and it is currently not allowed (e.g. https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getDisplayMedia, the section about TypeError: no exact or min constraints allowed).

Is this question still open for discussion?

@guest271314 guest271314 commented Aug 12, 2019

@hoolahoop How does disallowing min or exact constraints affect selecting only a part of the screen?

The issue has not been closed, yet.

@alvestrand alvestrand commented Aug 15, 2019

This seems to be possible to do with current APIs, so no compelling reason is seen for more browser surface.

@alvestrand alvestrand closed this Aug 15, 2019
@henbos henbos commented Aug 15, 2019

I think the spec allows browsers to have a way to crop if they wanted to, but I don't think the application should have control over which part of the screen the user can pick.

@guest271314 guest271314 commented Aug 15, 2019

@alvestrand @henbos Have you tried "Take a Screenshot" at Firefox Developer Tools? That is what am asking to be specified as part of mediacapture-screen-share. Is there a compelling reason such functionality should not be specified?

@guest271314 guest271314 commented Aug 15, 2019

(screenshot attachment: take_a_screenshot_firefox)


@youennf youennf commented Aug 15, 2019

Is there a compelling reason such functionality should not be specified?

This functionality is allowed by the spec.
The spec stays away from any specific way to constrain the user selection UI, so I do not see what the spec could say about the proposed functionality.

"Take a Screenshot" at Firefox Devloper Tools does not seem to require any specific spec.

This is really up to browser implementors to decide whether to add support for that functionality or not.


@henbos henbos commented Aug 16, 2019

Have you tried "Take a Screenshot" in Firefox Developer Tools? That is what I am asking to be specified as part of mediacapture-screen-share. Is there a compelling reason such functionality should not be specified?

I still don't see mediacapture-screen-share as a screenshot API, which makes me reluctant to add any language advising that a browser must or even should have this type of picker. I'm not saying there aren't any use cases for capturing part of a screen, but I'm not convinced we should attempt to mandate this, and as-is browsers are allowed to implement such a picker.

If this use case were more compelling for getDisplayMedia, then I do think we should advise implementations, so that you don't end up having to do cropping in the application depending on whether or not the browser supports cropping. But again, this is not the intent of getDisplayMedia, and I think what UI to support is the browser's decision.

For what you're asking I would like to see a different API than getDisplayMedia.


@guest271314 guest271314 commented Aug 16, 2019

@henbos

but again this is not the intent of getDisplayMedia

See the current language in the specification, emphasis added

Abstract

This document defines how a user's display, or parts thereof, can be used as the source of a media stream using getDisplayMedia, an extension to the Media Capture API [GETUSERMEDIA].

We could ignore that language and write a specification from scratch which included the same language to achieve double-redundancy.

If that is what you believe to be necessary, where should the specification be posted? WICG discourse? I am not a "member" of W3C, and not really interested in becoming beholden to an organization, particularly one which cannot write the words "patent and copyright" when that is what they are asking about. The specification should be very simple. In fact, since Mozilla has already written and implemented the code, the only question would be: will Chrome implement the specification? Using getUserMedia() is a logical choice for the solution. Yes, specifications should give guidance on UI functionality, to avoid multiple actual implementations which could vary widely. One example of variance is the WebM files output by MediaRecorder: without a specified track order, the tracks can be in an arbitrary order, adding complexity to merging the WebM files output by Chrome and Firefox, both within the same browser and between the two browsers. That does not even get to the H.264 codec in WebM that Chrome supports, though WebM was proffered as supporting only certain codecs.

In any event, how do you suggest to proceed?


@guest271314 guest271314 commented Aug 16, 2019

@henbos There is an existing "Screenshot API" topic at WICG discourse https://discourse.wicg.io/t/proposal-screenshot-api/2412. There are comments citing privacy and security concerns. Well, getDisplayMedia() can currently leak information relating to tabs, windows, and applications for which permission to capture was not granted (#108 (comment)). That does not appear to be particularly alarming to some maintainers of the specification.

but again this is not the intent of getDisplayMedia

From the perspective here, getDisplayMedia() is well-suited to selecting only part of the screen for capture, as the language in the specification currently states.

Can you explain the reasoning for your statement concerning the intent of getDisplayMedia()?

How does reading the specification support a claim about what was not its intent, when the actual language includes the term

or parts thereof?


@guest271314 guest271314 commented Aug 16, 2019

@henbos The intent of this issue was to select a portion of the screen to record, e.g., a <video> element. Since HTML <video> is not an "Application", in general, it would be necessary to use ImageCapture.grabFrame() and getDisplayMedia(). Since PictureInPictureWindow is considered an "Application", a partial workaround exists at https://github.com/guest271314/MediaFragmentRecorder/tree/getdisplaymendia-pictureinpicture, which has limitations on width and height. The similarity to a screenshot proposal or issue is incidental. The user should still be able to select only part of the screen using this API, without having to use other APIs to exclude portions of the screen not intended to be captured. That is what led to the awareness that getDisplayMedia() at Chrome can leak tabs, applications and windows not granted permission to capture.
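The ImageCapture.grabFrame() route mentioned above could be sketched as follows; this is a sketch under stated assumptions, not a definitive implementation, and the `clampRegion` helper is hypothetical. It grabs a single frame from the display-capture track, then crops it to the element's rectangle with `createImageBitmap()`, clamping the crop to the frame bounds.

```javascript
// Keep a requested crop region inside the captured frame's bounds,
// so createImageBitmap() is never asked for out-of-range pixels.
function clampRegion(region, frameWidth, frameHeight) {
  const x = Math.max(0, Math.min(region.x, frameWidth));
  const y = Math.max(0, Math.min(region.y, frameHeight));
  return {
    x,
    y,
    width: Math.min(region.width, frameWidth - x),
    height: Math.min(region.height, frameHeight - y),
  };
}

// Browser usage (ImageCapture was Chrome-only at the time of this thread):
// const display = await navigator.mediaDevices.getDisplayMedia({ video: true });
// const [track] = display.getVideoTracks();
// const frame = await new ImageCapture(track).grabFrame(); // ImageBitmap of the capture
// const r = clampRegion(videoEl.getBoundingClientRect(), frame.width, frame.height);
// const cropped = await createImageBitmap(frame, r.x, r.y, r.width, r.height);
```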
