This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

Head tracking support #262

Closed
ghost opened this issue Dec 27, 2020 · 21 comments
Labels
ARCHIVED (CLOSED at time of archiving), enhancement (New feature or request)

Comments

@ghost

ghost commented Dec 27, 2020

Hi! I've added head tracking support and have started adding some sensitivity controls. I also intend to make it work with the Gizmo Mode, so that you can manually drag the headset around to "set its origin" and then use head tracking to look around and move from that point. Additionally, I plan to persist these settings to local storage so that users can "load into" their previous positions without needing any further setup (especially useful when working with Hot Module Reload).
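The persistence idea could look roughly like the sketch below. It is only an illustration: the key name `headset-origin-pose`, the pose shape, and the helpers `savePose` and `loadPose` are hypothetical and are not taken from the extension or from Handsfree.js. The storage object is passed in, so anything with `getItem` and `setItem` (such as `window.localStorage`) would work.

```javascript
// Hypothetical storage key; the real extension may organize its settings differently
const POSE_KEY = 'headset-origin-pose'

// `storage` is any object with getItem/setItem, for example window.localStorage
function savePose(storage, pose) {
  storage.setItem(POSE_KEY, JSON.stringify(pose))
}

// Returns the saved pose, or `fallback` when nothing usable is stored
function loadPose(storage, fallback) {
  const raw = storage.getItem(POSE_KEY)
  if (raw === null || raw === undefined) return fallback
  try {
    return JSON.parse(raw)
  } catch (err) {
    // Corrupt data should not break startup, so use the fallback pose instead
    return fallback
  }
}
```

On page load (including after a Hot Module Reload), calling `loadPose(window.localStorage, defaultPose)` would restore the last saved origin, assuming `savePose` was called whenever the user moved the headset.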

Here's a video of what I have so far:

ezgif.com-gif-maker.mp4

Are you accepting PRs for this? And if so, what are the guidelines? I'd like to hyperfocus on just the head tracking for now, but I've built a library called Handsfree.js to make adding and combining models very "easy" (it's been tricky inside a Chrome Extension 😅 but it'll be figured out soon).

Thanks!

@takahirox
Contributor

takahirox commented Dec 28, 2020

Hi, thanks for asking.

I saw your tweet. It looks very interesting. Controlling the emulated device's transform with the mouse is a bit annoying, so having an easier way to do that in the emulator would be nice.

Do you have a live demo? I would like to test the experience and think whether we should have head tracking support.

Some concerns and thoughts.

  • If you rotate or move your head, won't it be harder to look at the screen?
  • Is the performance good enough on both Firefox and Chrome, even on low-end desktop devices?
  • I'm thinking of supporting hand controllers with computer vision (Support Hand controllers input #254). Can this head tracking support work well with the hand controller support?
  • Handsfree.js sounds good, but I wouldn't like the emulator extension to have many dependencies because they can slow down the web store reviews. I may want to write minimal code rather than importing it.

(Sorry if my responses are slow; I'm on vacation now.)

@ghost
Author

ghost commented Dec 28, 2020

Hi thanks for getting back to me! And no worries at all on the response times, actually this will give me some time to get a really polished prototype ready.

Here is a live demo of just head tracking within A-Frame; this was the prototype I made before I was shown your extension. In practice, you can adjust the sensitivity such that a 20deg turn IRL does 180deg in game: https://handsfree.js.org/integration/aframe/look-around-handsfree.html

On the home page, you can toggle a few different models on/off together to see the performance, but to answer your question: yes, both head and hand tracking work together fairly well: https://handsfree.js.org/#models
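As a rough illustration of the sensitivity idea (a 20deg real-world turn producing 180deg in the scene), the mapping can be a simple multiply and clamp. The function name `scaleHeadAngle` and its `sensitivity` and `maxDegrees` parameters are hypothetical; this is not how Handsfree.js necessarily implements it.

```javascript
// Hypothetical helper: scales a measured head angle (in degrees) into an in-scene angle.
// `sensitivity` and `maxDegrees` are illustrative parameters, not taken from Handsfree.js.
function scaleHeadAngle(measuredDegrees, sensitivity = 9, maxDegrees = 180) {
  const scaled = measuredDegrees * sensitivity
  // Clamp so an exaggerated turn never rotates the camera past the limit
  return Math.max(-maxDegrees, Math.min(maxDegrees, scaled))
}

// A 20 degree real-world turn with 9x sensitivity maps to 180 degrees in the scene
console.log(scaleHeadAngle(20)) // 180
```

With a mapping like this the user only needs small head movements, which also helps with the concern about still being able to see the screen.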

And I totally understand the hesitation to using dependencies so no worries there! For straight up emulation and nothing more, going minimal is 100% the way to go. Handsfree.js uses the same models but with extra sugar and spice on top that isn't necessary for this application (the main advantage is the plugin architecture, which allows for rapid development with Hot Module Reloading which isn't used in extensions).

I don't have benchmarks but thanks for mentioning that, I'll do those soon!
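One rough way to collect such numbers is to measure how many tracking results arrive per second on each browser. The sketch below is only an assumption about how that could be done; `createFpsMeter` is a hypothetical helper, and the timestamps are passed in so it can be driven by `performance.now()` inside whatever callback delivers tracking results.

```javascript
// Hypothetical helper: reports a rolling frames-per-second estimate.
// `windowSize` is the number of recent timestamps kept for the average.
function createFpsMeter(windowSize = 30) {
  const stamps = []
  return function tick(nowMs) {
    stamps.push(nowMs)
    if (stamps.length > windowSize) stamps.shift()
    if (stamps.length < 2) return 0
    const elapsed = stamps[stamps.length - 1] - stamps[0]
    // (frames between first and last stamp) divided by elapsed seconds
    return elapsed > 0 ? ((stamps.length - 1) * 1000) / elapsed : 0
  }
}

// Example: call tick(performance.now()) each time the model reports a result
const tick = createFpsMeter()
tick(0)
console.log(tick(100)) // 10
```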

Thanks again for the response and I apologize for the length of this one 😅

@takahirox
Contributor

takahirox commented Dec 28, 2020

Thanks for sharing the demo. The experience is much better than I thought. We may still need to test the performance, but I feel positive about the head tracking support now.

Regarding the contribution guide, unfortunately we don't have a clear code style or tests (yet). Please just send a PR. I can clean up the code on my side if needed.

Some thoughts about supporting it

  • I don't want to make the extension size big, so a smaller addition is better for me.
  • I want to be able to switch between the head tracking mode and the mouse control mode because the head tracking may be slow or non-functional on low-end platforms. (I can do this type of UI work if you want me to.)
  • I'm thinking of supporting hand controllers, maybe with MediaPipe. So head tracking support must not block hand controller support (functionality, performance, code simplicity, and so on).

@ghost
Author

ghost commented Dec 28, 2020

Oh thanks for trying it out and for the feedback!

What I'll do is create a very minimal PR with no dependencies at all other than the models themselves. I'll also use a MediaPipe head tracker instead of what I'm using now, so that all the models are by the same vendor. I'll keep the mouse mode on and make handsfree mode opt-in...you'll be able to enable head tracking, hand tracking, the mouse, or any combination of them.
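The opt-in combination of inputs could be represented as a small settings object. The sketch below is hypothetical: the names `DEFAULT_INPUT_MODES`, `resolveInputModes`, and `activeInputModes` are made up for illustration, and the rule that the mouse is re-enabled when everything is turned off is an assumption, not something stated in this thread.

```javascript
// Hypothetical defaults: mouse control stays on, tracking modes are opt-in
const DEFAULT_INPUT_MODES = { mouse: true, head: false, hand: false }

// Merges the user's choices over the defaults
function resolveInputModes(userChoices = {}) {
  const modes = { ...DEFAULT_INPUT_MODES, ...userChoices }
  // Assumption: if the user turns everything off, fall back to the mouse
  if (!modes.mouse && !modes.head && !modes.hand) {
    modes.mouse = true
  }
  return modes
}

// Names of the modes that are currently enabled
function activeInputModes(modes) {
  return Object.keys(modes).filter((name) => modes[name])
}

// Enabling head tracking keeps the mouse available as well
console.log(activeInputModes(resolveInputModes({ head: true })))
```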

I'll use the same code style to keep everything consistent, and the head/hand tracking isolated so that you can easily pick which one to keep etc. I started a fork because I want to experiment with some other ideas I have, but the PR I submit will be minimal. I'll start small and build up. I definitely intend to go very far with this even if it's with the fork so if the PR doesn't make the cut it's no big deal because the code will still be used!

I should have a good PR ready with benchmarks in 1-3 weeks and I'll try not to make it overwhelming 😅
Please feel free to close this issue, have a great vacation and thanks for your time!

@takahirox changed the title from "Are you accepting PRs for headtracking and if so what are the guidelines?" to "Head tracking support" on Dec 28, 2020
@ghost
Author

ghost commented Dec 29, 2020

Hi! So I got both head and hand tracking (just recognition, doesn't move the controllers yet) working together now and wanted to share some notes in case it helps you with your implementation. Because of the way the models fetch their weights and because of how you need to grab the webcam stream, it was not trivial to understand which context things need to be loaded in.

My new workflow is:

Untitled Diagram

  • Load all models in the background script (it did not seem to work in a panel, since the models fetch the weights from the same context that the initial script was loaded in)
  • Create an options page to ask the user for the initial webcam permission. This permission is then bound to the Chrome Extension itself, and not the individual websites (it did not seem to work in a popup page at all). It's only required once
  • Add a "Start/Stop" webcam button to the popup page, which tells the background page to start/stop. This could also be done inside the Panel, which is probably where you'll want it
  • Use the background script to send pose information to the panel script
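The last step of this workflow (background script sending pose information to the panel) could be sketched as below. This is an illustration only: the `handsfreePose` action name, the pose shape, and the `createPoseHub` helper are hypothetical. The sketch assumes the panel opens a long-lived connection with `chrome.runtime.connect()`, which gives the background a port with `postMessage` and `onDisconnect`.

```javascript
// Hypothetical message shape for one pose update
function createPoseMessage(pose) {
  return { action: 'handsfreePose', position: pose.position, rotation: pose.rotation }
}

// Keeps track of connected panel ports and broadcasts poses to all of them
function createPoseHub() {
  const ports = new Set()
  return {
    add(port) {
      ports.add(port)
      // Extension ports expose onDisconnect; forget the port once the panel closes
      if (port.onDisconnect) {
        port.onDisconnect.addListener(() => ports.delete(port))
      }
    },
    broadcast(pose) {
      const message = createPoseMessage(pose)
      ports.forEach((port) => port.postMessage(message))
      // Number of panels that received the message
      return ports.size
    }
  }
}

// Hypothetical wiring in the background script:
// const hub = createPoseHub()
// chrome.runtime.onConnect.addListener((port) => hub.add(port))
// ...then call hub.broadcast(pose) whenever the model produces a new pose
```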

Some warnings:

  • Loading the models in the content script is way easier, but then the user needs to approve webcam permission for individual websites. This could have unintended privacy consequences as the permission is now active for that domain even if you uninstall the extension
  • There should be some indicator that the webcam is on even if the devtools is not open. For example, a little red badge on the extension button that says "On" or something
  • Keeping the webcam and models in the background seems safest as it's totally isolated from every other context including the DevTools
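The badge indicator suggested above could be done with the toolbar button's badge API. This is a minimal sketch under some assumptions: `setWebcamBadge` is a hypothetical helper, the color is arbitrary, and the badge API object is passed in (`chrome.browserAction` in a Manifest V2 extension, `chrome.action` in Manifest V3).

```javascript
// Hypothetical helper: shows "On" on the extension button while the webcam is active.
// `badgeApi` is expected to provide setBadgeText and setBadgeBackgroundColor.
function setWebcamBadge(badgeApi, isOn) {
  // An empty string clears the badge
  badgeApi.setBadgeText({ text: isOn ? 'On' : '' })
  if (isOn) {
    badgeApi.setBadgeBackgroundColor({ color: '#dd0000' })
  }
}

// In a Manifest V2 background page this could be called as:
// setWebcamBadge(chrome.browserAction, true)   // when the stream starts
// setWebcamBadge(chrome.browserAction, false)  // when the stream stops
```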

Edit: Oh the start/stop button doesn't need to be a popup, it could work in the panel too. That's just how I first tried it

@takahirox
Contributor

Thanks for sharing the thoughts. Let me try to make an easy prototype here to understand the limitations well...

@takahirox
Contributor

takahirox commented Dec 31, 2020

So... I started to make a prototype and encountered the permission problem for capturing webcam in the extension. navigator.mediaDevices.getUserMedia({video: true}) failed in Devtools panel, Popup, or Background. But it works in content-script.

@midiblocks

Would you please elaborate on the following for me? Are you thinking of grabbing the webcam stream in Background (by getting permission in an options page)?

Create an options page to ask the user for the initial webcam permission. This permission is then bound to the Chrome Extension itself, and not the individual websites (it did not seem to work in a popup page at all). It's only required once

@ghost
Author

ghost commented Jan 1, 2021

Yes, grabbing the webcam in the options page is how I'm currently doing it. You only need to do this once so that the browser requests permission, but then you can do it from the background page once permission is approved. You can also do it in a content script, but there are some risks. Sorry that this is a long response but I hope it helps:

Options page

My options page just has some text with a button (you don't need a button tho, you can just getUserMedia automatically). This is the code in my options page:

/**
 * Start the media stream
 */
document.querySelector('#handsfree-approve').addEventListener('click', function () {
  navigator.mediaDevices.getUserMedia({
    audio: false,
    video: true
  })
  .then(() => {
    // Here I set a flag to let us know if we've captured the stream or not...
    chrome.storage.local.set({hasCapturedStream: true})
    window.close()
  })
  .catch((err) => {
    // ...but on webcam error, or if user denies, then we set to false
    chrome.storage.local.set({hasCapturedStream: false})
    alert(`🚨 ${err}
    
Please fix the error above and try again.`)
  })
})

Panel

From the panel page, you could have a button that when pressed checks the hasCapturedStream. If it is false, then open the options page, otherwise send a message to background page to start the webcam.

This is the code I use in my popup page, but this should work in Panel page too. If not, just use the correct API to communicate to background:

/**
 * Start the webcam
 * - If the user hasn't approved permissions yet, then visit the options page first
 */
document.querySelector('#start-button').addEventListener('click', () => {
  chrome.storage.local.get(['hasCapturedStream'], (data) => {
    // If we captured the webcam, tell background to start webcam
    if (data.hasCapturedStream) {
      // tell background page to start webcam
      chrome.runtime.sendMessage({action: 'handsfreeStart'})
      // this is just a method that adds classes to panel page to hide the start button and show a stop button
      setHandsfreeState(true)
    } else {
      // Open options page if the webcam stream hasn't been approved yet
      chrome.runtime.openOptionsPage()
    }
  })
})

Background

I use Handsfree.js for all this, but this is where you would put your model logic:

chrome.runtime.onMessage.addListener(function(message, sender, respond) {
  switch (message.action) {
    // Start
    case 'handsfreeStart':
      // ... start your model here
      break
  }
})

Hope this helps! Once you get hand tracking working I'll make a PR for the head tracking, that way I can match my head tracking code to your hand tracking.

Also, if you need to debug the webcam or model or want to show it to the user then you can use the Picture in Picture API to "pop it out" of the background page. This should help you with testing too

@takahirox
Contributor

takahirox commented Jan 1, 2021

Thanks for sharing the notes. But navigator.mediaDevices.getUserMedia() failed in an options page (due to the permission error) here. Am I missing something?

manifest.json

...
"options_ui": {
  "page": "src/extension/options.html"
},
...

src/extension/options.html

<script src="options.js"></script>

src/extension/options.js

navigator.mediaDevices.getUserMedia({video: true}).then(stream => {
  console.log(stream);
}).catch(error => {
  console.error(error);  
});

Result in an options page

error: object DOMException
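When a bare DOMException like this shows up, logging its `name` and `message` usually narrows down the cause. The helper below is a hypothetical sketch (`describeMediaError` is not part of any library); the three error names it maps are standard ones that `getUserMedia()` can reject with, and the hint texts are only paraphrases.

```javascript
// Hypothetical helper: turns a getUserMedia rejection into a readable line
function describeMediaError(error) {
  const hints = {
    NotAllowedError: 'permission was denied or is blocked in this context',
    NotFoundError: 'no camera matching the constraints was found',
    NotReadableError: 'the camera is in use or could not be started'
  }
  const hint = hints[error.name] || 'unexpected error'
  return `${error.name}: ${error.message} (${hint})`
}

// Possible use in options.js:
// navigator.mediaDevices.getUserMedia({video: true})
//   .catch((error) => console.error(describeMediaError(error)))
```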

@ghost
Author

ghost commented Jan 1, 2021

Hi this is really embarrassing, but I just realized that I have been working in Chrome recently 😅 I started out in Firefox but when I switched over to my other computer I must have continued in Chrome without realizing.

Can you try adding open_in_tab to your manifest:

{
  "options_ui": {
    "page": "src/extension/options.html",
    "open_in_tab": true
  }
}

That should get rid of the DOMException. If not, I'll take a look first thing in the morning. Sorry about that 😅

Edit: It looks like I might need to debug this workflow a bit more on Firefox, I'll report back as soon as I find something!

@takahirox
Contributor

It works, thanks! Let me continue the trial...

@takahirox added the "enhancement" (New feature or request) label on Jan 1, 2021
@takahirox
Contributor

takahirox commented Jan 2, 2021

@midiblocks

Would you please share your repo/fork of the extension with me if possible? Loading the MediaPipe files in the background fails here, so I want to refer to your configuration.

@ghost
Author

ghost commented Jan 2, 2021

Hi yes, I have everything documented here: https://github.com/MIDIBlocks/handsfree-browser#development-guide

I changed the folder structure around a bit so that I could keep the webxr and my stuff separate, and I don't think I modified any of the existing files at all other than to move them

You may need to tell MediaPipe to use local files (it tries to load from a CDN by default, which may cause it to fail). It's a little bit complex to do this manually, which is one of the issues Handsfree.js solves, but you'll need to specify the locateFile property when instantiating the Hands model:

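const PATH_TO_MEDIA_PIPE_ASSETS = '/assets/...'
// Hands is a class, so it needs `new`; locateFile redirects asset requests to local files
const hands = new Hands({locateFile: file => {
  return `${PATH_TO_MEDIA_PIPE_ASSETS}/${file}`
}})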

You may also need to polyfill requestAnimationFrame, here's what I'm doing:

/**
 * Override requestAnimationFrame, which doesn't work in background script
 */
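// store a backup
const _requestAnimationFrame = window.requestAnimationFrame
// override it to use setTimeout (roughly 30 frames per second)
window.requestAnimationFrame = function(cb) {
  setTimeout(function() {
    // pass a timestamp to the callback, as the native API does
    cb(performance.now())
  }, 1000 / 30)
}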

Oh! And I still haven't checked this on Firefox, I apologize for goofing up on that I'll get this working there soon. I hope that doesn't make things too confusing 😆

@takahirox
Contributor

takahirox commented Jan 14, 2021

Thanks @midiblocks for the help.

The prototype has finally started to work.

https://twitter.com/superhoge/status/1349560469837672452

I realized the workflow you suggested seems to be the best approach. I'll test on multiple platforms and then clean up and merge the prototype as soon as possible. Once I merge it, you can start to work on the head tracking support for the extension.

@ghost
Author

ghost commented Jan 14, 2021

This is so great, I'll start on the head tracking this weekend!

@takahirox
Contributor

Still WIP but I made a branch for XR Hand input support https://github.com/MozillaReality/WebXR-emulator-extension/tree/Hand. I will remove it if I merge.

@takahirox
Contributor

@midiblocks

If the user clicks a "Start WebCam" button in the panel, the panel sends a message to the background via a port, and the background gets the user media and invokes video.requestPictureInPicture() for debugging purposes.

But video.requestPictureInPicture() fails with the error "Must be handling a user gesture if there isn't already an element in Picture-in-Picture." It seems the call needs to be initiated by a user gesture, and the user gesture (clicking the button) in the panel doesn't seem to be propagated to the background(?).

Did you see the same issue? And do you know a solution? I don't think we can put any UI in the background that the user can touch.

(A weird thing is video.requestPictureInPicture() sometimes succeeds.)

@ghost
Author

ghost commented Jan 28, 2021

Hi @takahirox I did run into this issue, but I don't remember how I fixed it. I believe the issue might be related to the click gesture being lost due to a Promise or Async when you use port.postMessage. Have you tried sending the message to the background with chrome.runtime.sendMessage (or browser.runtime.sendMessage) instead?

Here is some pseudocode that might help:

Inside dev tools script:

// $el.start is the button you use to start the webcam
$el.start.addEventListener('click', () => {
  chrome.runtime.sendMessage({action: 'startWebcamWithPIP'})
})

Inside background script:

// Let's add an empty canvas into the background page to contain our webcam and debug info
const $pipCanvas = document.createElement('CANVAS')
document.body.appendChild($pipCanvas)
const pipContext = $pipCanvas.getContext('2d')

// Let's also create an empty video tag which we will PiP
const $videoPip = document.createElement('VIDEO')
document.body.appendChild($videoPip)

chrome.runtime.onMessage.addListener(function(message, sender, respond) {
  switch (message.action) {
    case 'startWebcamWithPIP':
      // Here we make the video source the canvas
      $videoPip.srcObject = $pipCanvas.captureStream()

      // When the video source is received play the webcam
      $videoPip.onloadedmetadata = () => {
        $videoPip.play()
      }

      // When the video starts, immediately do PiP
      $videoPip.onplay = () => {
        $videoPip.requestPictureInPicture()
      }

      break
  }
})

Here $videoPip is an empty video element, NOT the webcam. What I do is draw the webcam into a canvas, draw the debug wireframes on top of it, and then use that canvas as the source for the empty video. So really, what you have in your background page is 2 video tags: the one containing the webcam, and a separate one you do PiP with.
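The "draw the webcam, then the debug info on top" step could be sketched as below. This is an assumption about how the drawing might look, not the actual Handsfree.js code: `drawDebugFrame` is a hypothetical helper, it draws dots rather than full wireframes, and `landmarks` is assumed to be an array of `{x, y}` points normalized to the 0..1 range.

```javascript
// Draws one debug frame: the webcam image first, then one dot per landmark.
// `ctx` is a 2D canvas context and `video` is the element holding the webcam stream.
// `landmarks` is assumed to be an array of {x, y} values normalized to 0..1.
function drawDebugFrame(ctx, video, landmarks, width, height) {
  ctx.drawImage(video, 0, 0, width, height)
  ctx.fillStyle = '#00ff00'
  for (const point of landmarks) {
    ctx.beginPath()
    // Convert normalized coordinates to canvas pixels
    ctx.arc(point.x * width, point.y * height, 4, 0, Math.PI * 2)
    ctx.fill()
  }
  return landmarks.length
}

// Hypothetical wiring in the background page, where $webcamVideo holds the camera stream:
// drawDebugFrame(pipContext, $webcamVideo, latestLandmarks, $pipCanvas.width, $pipCanvas.height)
```

Because the PiP video's source is the canvas stream, anything drawn this way shows up in the Picture-in-Picture window.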

I think this workflow will preserve the gesture. Let me know if it works and I'll take a look, I still haven't tested this in Firefox but it's working in Chrome for me

@takahirox
Contributor

Thanks for the advice.

I figured out that too much time seems to pass between the button press in the panel and the requestPictureInPicture() call in the background, because I tried to set up the WebCam video and then the PIP video in a single onMessage event listener callback.

So I added a "Start PIP" button that is separate from the "Start WebCam" button in the panel. It works without the exception now.
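Splitting the work across two buttons could be sketched as two separate message actions, so that the Picture-in-Picture handler does as little as possible before calling requestPictureInPicture(). The action names `startWebcam` and `startPip` and the `createDispatcher` helper are hypothetical, and this sketch uses `chrome.runtime.onMessage` rather than the port-based messaging mentioned earlier in the thread.

```javascript
// Hypothetical dispatcher: maps message actions to handler functions
function createDispatcher(handlers) {
  return function dispatch(message) {
    const known = Object.prototype.hasOwnProperty.call(handlers, message.action)
    if (!known) return false
    handlers[message.action](message)
    return true
  }
}

// Hypothetical wiring in the background page. The listener deliberately returns
// nothing, because returning true from an onMessage listener tells the browser
// that a response will be sent asynchronously.
// const dispatch = createDispatcher({
//   startWebcam: () => { /* getUserMedia and model setup (slow) */ },
//   startPip: () => { /* only call $videoPip.requestPictureInPicture() (fast) */ }
// })
// chrome.runtime.onMessage.addListener((message) => { dispatch(message) })
```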

@takahirox
Contributor

BTW I renamed the Hand branch to HandWIP

https://github.com/MozillaReality/WebXR-emulator-extension/tree/HandWIP

@takahirox
Contributor

@midiblocks

I started to check the extension functionality on Firefox. Does the options page hack work on your Firefox? It doesn't work on mine. Even though I get the webcam permission in the options page, the permission doesn't seem to be propagated to the background on Firefox.

@cknowles-admin added the "ARCHIVED" (CLOSED at time of archiving) label on Jul 22, 2024