Posenet estimation requires explicit setting of the `width` and `height` properties on the input object. #322

Open
FenrirWillow opened this Issue May 22, 2018 · 11 comments

6 participants

FenrirWillow commented May 22, 2018

TensorFlow.js version

"@tensorflow-models/posenet": "^0.0.1"
"@tensorflow/tfjs": "0.10.3"

Browser version

Google Chrome | 66.0.3359.181 (Official Build) (64-bit)
-- | --
Revision | a10b9cedb40738cb152f8148ddab4891df876959-refs/branch-heads/3359@{#828}
OS | Mac OS X
JavaScript | V8 6.6.346.32

Describe the problem or feature request

When trying to implement the Posenet model, I have encountered an interesting problem. Passing an HTMLVideoElement into estimateSinglePose without it explicitly having a width and height property set results in this error:

Uncaught (in promise) Error: Requested texture size [0x0] is invalid.
    at Object.validateTextureSize (es6.promise.js:287)
    at createAndConfigureTexture (es6.promise.js:287)
    at Object.createMatrixTexture (es6.promise.js:287)
    at GPGPUContext.createMatrixTexture (es6.promise.js:287)
    at TextureManager.acquireTexture (es6.promise.js:287)
    at MathBackendWebGL.uploadToGPU (util.js:264)
    at MathBackendWebGL.getTexture (promise.js:8)
    at MathBackendWebGL.fromPixels (es7.promise.finally.js:21)
    at Engine.fromPixels (_perform.js:8)
    at ArrayOps.fromPixels (_uid.js:6)

I suspect this API uses tf.fromPixels somewhere underneath (the signature looks eerily similar) and that API expects to read the width and height from the object itself.

My question is: is this intended? Should this be documented somewhere (unless it already is and I have missed it)? Should the type interface force those properties to be set somehow?

Code to reproduce the bug / link to feature request

<html>
<body>
  <script src="./index.js"></script>
  <div>
    <video id="video-stream" style="width: 600px; height: 600px;"></video>
  </div>
</body>
</html>

  const net = await Posenet.load(1.01);

  const mediaStream = await navigator.mediaDevices.getUserMedia({
    audio: false,
    video: {
      height: 600,
      width: 600,
      facingMode: 'user',
    }
  });

  const videoElement = document.getElementById('video-stream');

  // If the below two lines are missed out, the error appears!
  // videoElement.width = 600;
  // videoElement.height = 600;

  videoElement.srcObject = mediaStream;
  // Wait for the metadata to load. `videoElement` is declared with `const`,
  // so it must not be reassigned here.
  await new Promise((resolve) => {
    videoElement.onloadedmetadata = () => resolve();
  });
  videoElement.play();

  // The next line throws if the object does not have `width` and `height` explicitly set!
  const pose = await net.estimateSinglePose(videoElement);

Member

ericdnielsen commented May 23, 2018


oveddan commented May 23, 2018

@FenrirWillow thanks for the detailed explanation. You need to give the element an explicit width and height via attributes. Otherwise TensorFlow.js and PoseNet have no way of knowing the width and height of the element without doing something like getBoundingClientRect. PoseNet needs these values to properly scale the image before feeding it through the network.

This should be documented, or we should throw an error if the width and height of the element are not known.
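
As a rough illustration of the advice above, the sketch below copies a video's intrinsic size into its `width` and `height` attributes once the metadata has loaded. `syncVideoSize` is a hypothetical helper name, not part of PoseNet or TensorFlow.js, and it assumes the element exposes the standard `videoWidth` and `videoHeight` properties.

```javascript
// Hypothetical helper (not a PoseNet or TensorFlow.js API): copies the
// intrinsic video size into the element's width/height attributes.
function syncVideoSize(video) {
  // videoWidth/videoHeight stay at 0 until the video's metadata has loaded.
  if (!video.videoWidth || !video.videoHeight) {
    throw new Error('Video metadata not loaded yet; wait for "loadedmetadata".');
  }
  video.width = video.videoWidth;
  video.height = video.videoHeight;
  return video;
}
```

It would be called after the `loadedmetadata` event fires and before passing the element to `estimateSinglePose`.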


oveddan commented May 23, 2018

@nsthorat should we add assertions that the input image or video has a width and height?

Collaborator

nsthorat commented May 23, 2018

Yeah, I think that makes sense! It wouldn't make sense for us to actually add a width and height; I think that would be unexpected behavior.


FenrirWillow commented May 25, 2018

Thanks for getting back so quickly! Do you need help adding this? I am unsure of where to start, but if you point me in the right direction I am sure I can knock out an assert or two.


oveddan commented May 29, 2018

@FenrirWillow sure, if you want to add the assertions that would be great. I would add them next to the existing assertions in estimateSinglePose and estimateMultiplePoses.
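
For illustration only, an assertion of the kind discussed here might look like the sketch below. The function name `assertInputHasSize` and the error message are assumptions made for this example; they are not taken from the PoseNet source or from the eventual PR.

```javascript
// Hypothetical assertion; the name and message are illustrative and are not
// taken from the PoseNet source.
function assertInputHasSize(input) {
  const { width, height } = input;
  // Rejects 0, undefined and any other non-positive value.
  if (!(width > 0) || !(height > 0)) {
    throw new Error(
      `Input must have explicit width and height attributes; got ${width}x${height}.`);
  }
}
```

Failing early with a message like this would replace the less obvious "Requested texture size [0x0] is invalid" error reported above.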


FenrirWillow commented May 29, 2018

I have opened a new PR ready for review in the models repository: tensorflow/tfjs-models#23


RadEdje commented May 30, 2018

Thank you for this thread. This error was driving me crazy. I knew I had already set my dimensions in the CSS file but it was still throwing the error.

I've noticed something similar with anything that uses the webcam.

I used the class Webcam from the emoji scavenger hunt github repo with this code:

// for the webcam to produce tensor
class Webcam {

    /**
     * @param {HTMLVideoElement} webcamElement A HTMLVideoElement representing the webcam feed.
     */
    constructor(webcamElement) {
        this.webcamElement = webcamElement;
    }

    /**
     * Captures a frame from the webcam and normalizes it between -1 and 1.
     * Returns a batched image (1-element batch) of shape [1, w, h, c].
     */
    capture() {
        return tf.tidy(() => {
            // Reads the image as a Tensor from the webcam <video> element.
            const webcamImage = tf.fromPixels(this.webcamElement);

            // Crop the image so we're using the center square of the rectangular
            // webcam.
            const croppedImage = this.cropImage(webcamImage);

            // Expand the outermost dimension so we have a batch size of 1.
            const batchedImage = croppedImage.expandDims(0);

            // Normalize the image between -1 and 1. The image comes in between 0-255,
            // so we divide by 127 and subtract 1.
            return batchedImage.toFloat().div(tf.scalar(127)).sub(tf.scalar(1));
        });
    }

    /**
     * Crops an image tensor so we get a square image with no white space.
     * @param {Tensor3D} img An input image Tensor to crop.
     */
    cropImage(img) {
        const size = Math.min(img.shape[0], img.shape[1]);
        const centerHeight = img.shape[0] / 2;
        const beginHeight = centerHeight - (size / 2);
        const centerWidth = img.shape[1] / 2;
        const beginWidth = centerWidth - (size / 2);
        return img.slice([beginHeight, beginWidth, 0], [size, size, 3]);
    }

    /**
     * Adjusts the video size so we can make a centered square crop without
     * including whitespace.
     * @param {number} width The real width of the video element.
     * @param {number} height The real height of the video element.
     */
    adjustVideoSize(width, height) {
        const aspectRatio = width / height;
        if (width >= height) {
            this.webcamElement.width = aspectRatio * this.webcamElement.height;
        } else if (width < height) {
            this.webcamElement.height = this.webcamElement.width / aspectRatio;
        }
    }

    async setup() {
        return new Promise((resolve, reject) => {
            const navigatorAny = navigator;
            navigator.getUserMedia = navigator.getUserMedia ||
                navigatorAny.webkitGetUserMedia || navigatorAny.mozGetUserMedia ||
                navigatorAny.msGetUserMedia;
            if (navigator.getUserMedia) {
                navigator.getUserMedia({
                        video: true
                    },
                    stream => {
                        this.webcamElement.srcObject = stream;
                        this.webcamElement.addEventListener('loadeddata', async () => {
                            this.adjustVideoSize(
                                this.webcamElement.videoWidth,
                                this.webcamElement.videoHeight);
                            resolve();
                        }, false);
                    },
                    error => {
                        document.querySelector('#no-webcam').style.display = 'block';
                        // Reject so callers awaiting setup() are not left waiting forever.
                        reject(error);
                    });
            } else {
                reject();
            }
        });
    }
}

It kept giving me the same error of

Uncaught (in promise) Error: Requested texture size [0x0] is invalid.

even when I already set the values for the video element dimensions to 224px for both height and width in the style.css file.

It was only when I wrote the height and width values inline (in the actual HTML) that the error went away.
Could it be that the JavaScript async function reads the HTML file before the CSS file has had time to load, so TensorFlow sees no declared dimensions when it predicts?

This might be a problem, since most front-end developers declare element sizes in the CSS file and not inline. Just thought I'd share my experience. It was driving me crazy for a while since I knew I had already set the dimensions via CSS. Luckily I found this thread and also declared the dimensions inline. It works now. Thanks!
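
One likely explanation for the behavior described above is that the `width` and `height` properties of a video element reflect its HTML attributes, so a size set only in CSS leaves them at 0. The sketch below, under that assumption, falls back to the rendered size (`clientWidth` and `clientHeight`) when the attributes are unset. `useLayoutSizeIfUnset` is a hypothetical helper name, not a TensorFlow.js API.

```javascript
// Hypothetical helper: if no width/height attribute is set, fall back to the
// element's rendered (CSS layout) size.
function useLayoutSizeIfUnset(el) {
  if (!el.width) el.width = el.clientWidth;
  if (!el.height) el.height = el.clientHeight;
  return el;
}
```

This keeps the sizing in the stylesheet while still giving the library explicit attributes to read. It only helps once the element has been laid out, since `clientWidth` and `clientHeight` are 0 before that.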


nukadelic commented Oct 21, 2018

I was facing a similar issue. Adding a simple delay solved it for me, since waiting a bit lets the video element pick up its proper dimensions, like so:

    this.videoElement = document.querySelector( querySeletor ) as HTMLVideoElement;

    const resolutionVga: MediaTrackConstraints = { width: { exact: 640 }, height: { exact: 480 } };
    var self = this;
    var msc: MediaStreamConstraints = { video: resolutionVga, audio: false };

    if (navigator.mediaDevices.getUserMedia) {
      navigator.mediaDevices
        .getUserMedia(msc)
        .then(function (stream: MediaStream) { 
          const vid = self.videoElement;
          // /!\ capture video once it has a non-zero resolution /!\
          function inputDetection()
          {
            if( vid.videoWidth === 0 || vid.videoHeight < 1 )
            {
              setTimeout(inputDetection, 100 );
              return;
            }
            self.width = vid.width = vid.videoWidth;
            self.height = vid.height = vid.videoHeight;
            // self.adjustVideoSize( 416,416 );
            self.isReady = true;
            if (onReady) onReady();
          }
          
          self.videoStream = stream;
          // vid.videoTracks;
          vid.srcObject = stream;
          vid.onloadedmetadata = () => {
            inputDetection();
          };
        })
        .catch(function (err0r) {
          console.log(err0r);
          alert("No camera found");
        });
    }
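
The polling approach above can also be wrapped in a promise with a timeout, so the caller can simply `await` it. This is a minimal sketch, not the commenter's code: `waitForVideoSize` and its `intervalMs` and `timeoutMs` options are hypothetical names chosen for this example.

```javascript
// Hypothetical helper: resolves once the video reports a non-zero intrinsic
// size, copying it into the width/height attributes; rejects on timeout.
function waitForVideoSize(video, { intervalMs = 100, timeoutMs = 5000 } = {}) {
  return new Promise((resolve, reject) => {
    const start = Date.now();
    function check() {
      if (video.videoWidth > 0 && video.videoHeight > 0) {
        video.width = video.videoWidth;
        video.height = video.videoHeight;
        resolve(video);
      } else if (Date.now() - start >= timeoutMs) {
        reject(new Error('Timed out waiting for video dimensions.'));
      } else {
        // Not ready yet: poll again after a short delay.
        setTimeout(check, intervalMs);
      }
    }
    check();
  });
}
```

Compared with an open-ended `setTimeout` loop, the timeout means a camera that never delivers frames surfaces as an error instead of polling indefinitely.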

Collaborator

nsthorat commented Oct 26, 2018

@oveddan do you think you could throw a nice error when the width / height attributes of the video tag are not set?


oveddan commented Oct 26, 2018

Sure, will do.
