getViewportScreenshot (younger sibling of getViewportMedia) #2

Open
eladalon1983 opened this issue Sep 21, 2021 · 3 comments

@eladalon1983
Member

Using getViewportMedia to obtain a single screenshot is a valuable use case that has already garnered some interest from web developers. However, doing so has some issues, which motivate specifying a variant of getViewportMedia that is subject to the same security gating. Those issues are:

  • When getViewportMedia is called, the user agent does not know ahead of time how many frames will be consumed by the application before the track is stopped. The user agent will therefore employ language that informs the user that the application is seeking permission to capture a video of the current tab.
    • Savvy users might come to suspect an application that purports in-app to be seeking a static image, but requests permission to capture a video.
    • Users may be confused by this prompt.
    • Users might come to learn the "un-lesson" that giving more permissions than intended is OK.
  • User agents might end up showing UX elements that warn users that self-capture is in session, only to take them away when the track is stopped. Flashing UX elements on the screen for a short period of time is undesirable. Moreover, it could be especially disruptive to users of screen readers and other assistive technology.
  • When a video is captured, some user agents may employ messaging which changes the size of the viewport. For example, when capturing a tab, Chrome currently presents an infobar right below the URL bar. Attempts to capture a screenshot using getViewportMedia on such user agents will consistently miss a portion of the viewport for no good reason. (This may even push some applications to attempt using getDisplayMedia and full-window capture - a perverse incentive.)
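For context, the stream-based workaround that these points describe can be sketched roughly as follows. This is only an illustration: getViewportMedia is still a proposal, so the mediaDevices object is passed in rather than assumed to exist, and screenshotViaStream and grabFrame are hypothetical names introduced here. The application opens a video track, takes one frame, and stops the track straight away, which is exactly the short-lived capture session that causes the prompt and indicator problems listed above.

```javascript
// Sketch only: getViewportMedia is a proposed API, so the caller supplies the
// mediaDevices object and a frame grabber instead of this code assuming them.
async function screenshotViaStream(mediaDevices, grabFrame) {
  // The user agent can only describe this request as video capture.
  const stream = await mediaDevices.getViewportMedia({ video: true });
  const [track] = stream.getVideoTracks();
  if (!track) {
    throw new Error('The captured stream has no video track');
  }
  try {
    // Take exactly one frame from the track.
    return await grabFrame(track);
  } finally {
    // Stop immediately; any capture indicator appears and disappears at once.
    track.stop();
  }
}

// Possible browser usage where ImageCapture is available:
// screenshotViaStream(navigator.mediaDevices,
//                     (track) => new ImageCapture(track).grabFrame());
```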

For those reasons, I propose that after we finalize getViewportMedia, we add to the same document a specification of getViewportScreenshot.

partial interface MediaDevices {
  Promise<ImageBitmap> getViewportScreenshot();
};

Fine details:

  1. All the same gating as getViewportMedia.
  2. User agent to modify its messaging to let users know they are consenting to a single frame, not a video.
  3. Returns a single screenshot rather than a MediaStream, of course.
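A minimal sketch of how an application might call the proposed method, assuming it lands with the shape shown in the IDL above. Nothing here is a shipped API: the method is detected at runtime, and takeViewportScreenshot is a hypothetical helper name introduced for illustration.

```javascript
// Sketch only: getViewportScreenshot is the method proposed in this issue and
// is not implemented anywhere, so it is feature-detected before use.
async function takeViewportScreenshot(mediaDevices) {
  if (!mediaDevices || typeof mediaDevices.getViewportScreenshot !== 'function') {
    throw new Error('getViewportScreenshot is not available');
  }
  // Per the proposal, this resolves with a single ImageBitmap, not a MediaStream.
  return mediaDevices.getViewportScreenshot();
}

// Possible browser usage from a button handler:
// button.onclick = async () => {
//   const bitmap = await takeViewportScreenshot(navigator.mediaDevices);
// };
```

Because the result is a single image, there is no track to stop and no ongoing capture session for the user agent to indicate.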
@eladalon1983 eladalon1983 changed the title getViewportScreenshot (lesser brother of getViewportMedia) getViewportScreenshot (younger brother of getViewportMedia) Sep 21, 2021
@eladalon1983 eladalon1983 changed the title getViewportScreenshot (younger brother of getViewportMedia) getViewportScreenshot (younger sibling of getViewportMedia) Sep 21, 2021
@jan-ivar jan-ivar transferred this issue from w3c/mediacapture-screen-share Oct 12, 2021
@jan-ivar
Member

How about

<input type="file" accept="image/*" capture="viewport">

? That'd be consistent with what we have for camera (i.e. there's no getUserSnapshot()).

We could keep the same security requirements.

@eladalon1983
Member Author

It seems to me like a mostly stylistic change from the API I proposed, so I'd be generally fine with it. But it looks a bit less ergonomic to me. Consider an application whose UI consists mainly of buttons. It would want to add one more button and set its handler to call newApiForScreenshotsBySomeName(). With your suggestion, the application would instead have to instantiate a new HTMLInputElement and call click() on it. That is reasonable enough that I'd not block, but could you please explain why you prefer this? Am I overlooking some benefit? Perhaps uniformity with existing, adjacent APIs? Do you happen to know what the rationale was for shaping those other APIs this way?
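To make the comparison concrete, here is a rough sketch of what a button handler might have to do under the input-element variant. The capture="viewport" value is only the suggestion made in this thread and is not a defined attribute value, and requestScreenshotViaInput is a hypothetical helper name. The document object is passed in as a parameter.

```javascript
// Sketch only: capture="viewport" is the value suggested in this thread; it is
// not part of any specification today.
function requestScreenshotViaInput(doc) {
  return new Promise((resolve) => {
    // The input is never attached to the page; it exists only to be clicked.
    const input = doc.createElement('input');
    input.type = 'file';
    input.accept = 'image/*';
    input.setAttribute('capture', 'viewport');
    // Resolve with the selected File, or null if nothing was provided.
    input.addEventListener('change', () => {
      resolve(input.files && input.files.length > 0 ? input.files[0] : null);
    }, { once: true });
    // Browsers that fire "cancel" on file inputs let us report a dismissal.
    input.addEventListener('cancel', () => resolve(null), { once: true });
    // In a browser this must run inside a user gesture, e.g. a click handler.
    input.click();
  });
}
```

The result would arrive as a File rather than an ImageBitmap, so an application that wants pixels would still need a decoding step such as createImageBitmap(file).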

@github-nilsson

For my purposes, I would love to give getViewportScreenshot a DOM node to take a screenshot of, to further minimize what is actually captured, similar to how requestFullscreen displays a specific part of the document. What I really want to get rid of from getDisplayMedia are all the options that let the user make mistakes and capture the wrong thing. I want to present them with one thing and have them say yes or no. I'm not against giving the user the option to trim or black out stuff if they want to, as long as it doesn't add any extra steps for users who don't want to. In my main use case the result is saved to the user's computer, but I realize the security prompt can't say "Save", since the browser doesn't have a simple way to enforce it.

Using the input tag feels like a throwback to Opera Mini, where type="file" would always just open the camera and then upload the result. The less active HTML is, the better, in my opinion. Have it deal with layout and content, and move everything event-based to JavaScript.
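Until something like the element-scoped capture requested above exists, an application could approximate it by cropping a full-viewport screenshot to the element's on-screen rectangle. The sketch below only computes the crop rectangle. It assumes the screenshot covers exactly the layout viewport and is scaled by a known factor such as devicePixelRatio, which the proposal does not guarantee. cropRectForElement is a hypothetical helper name.

```javascript
// Sketch only: maps an element's viewport-relative rect (in CSS pixels) to a
// crop rectangle in screenshot pixels, assuming a uniform scale factor.
function cropRectForElement(rect, viewport, scale = 1) {
  // Clamp the element rect to the visible viewport.
  const left = Math.max(0, rect.left);
  const top = Math.max(0, rect.top);
  const right = Math.min(viewport.width, rect.right);
  const bottom = Math.min(viewport.height, rect.bottom);
  // Nothing of the element is visible, so there is nothing to crop.
  if (right <= left || bottom <= top) {
    return null;
  }
  return {
    sx: Math.round(left * scale),
    sy: Math.round(top * scale),
    sw: Math.round((right - left) * scale),
    sh: Math.round((bottom - top) * scale),
  };
}

// Possible browser usage with an already captured ImageBitmap named bitmap:
// const crop = cropRectForElement(
//   node.getBoundingClientRect(),
//   { width: window.innerWidth, height: window.innerHeight },
//   window.devicePixelRatio);
// if (crop) {
//   const cropped = await createImageBitmap(bitmap, crop.sx, crop.sy, crop.sw, crop.sh);
// }
```

Cropping in the page does not reduce what the user agent hands to the application, so it addresses the "capture the wrong thing" concern only for the saved result, not for the permission being granted.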
