This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

What's basetransf matrix used for? #6

Closed
Qingcsai opened this issue May 5, 2022 · 5 comments

Comments

@Qingcsai

Qingcsai commented May 5, 2022

Hi, I have a small question about applying this code to my own data.
What is the self.basetransf matrix in multiviewvideo.py used for?
I see this 3x4 matrix is applied to all camera poses and all the frametransf, but what is its purpose? :)

mvp/data/multiviewvideo.py, lines 410 to 415 in d758f53:

```python
if "camera" in self.keyfilter:
    result["campos"] = np.dot(self.basetransf[:3, :3].T, self.campos[cam] - self.basetransf[:3, 3])
    result["camrot"] = np.dot(self.basetransf[:3, :3].T, self.camrot[cam].T).T
    result["focal"] = self.focal[cam]
    result["princpt"] = self.princpt[cam]
    result["camindex"] = self.allcameras.index(cam)
```

mvp/data/multiviewvideo.py, lines 385 to 387 in d758f53:

```python
result[k] = to4x4(np.dot(
    np.linalg.inv(to4x4(frametransf)),
    to4x4(self.basetransf))[:3, :4])
```
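If I'm reading this right, it composes the inverse of the per-frame motion with basetransf. A toy check I ran (the matrices and the to4x4 reimplementation here are mine, not real data):

```python
import numpy as np

def to4x4(m):
    # Pad a 3x4 matrix to 4x4 (my reimplementation of the repo helper).
    return np.concatenate([m, np.array([[0.0, 0.0, 0.0, 1.0]])], axis=0)

# Toy matrices (mine, not real data): frametransf is the per-frame motion,
# basetransf recenters the object.
frametransf = np.concatenate([np.eye(3), np.array([[1.0], [0.0], [0.0]])], axis=1)
basetransf = np.concatenate([np.eye(3), np.array([[1.0], [2.0], [3.0]])], axis=1)

# Same composition as the snippet above: undo the frame motion, apply basetransf.
result = to4x4(np.dot(np.linalg.inv(to4x4(frametransf)),
                      to4x4(basetransf))[:3, :4])
print(result[:3, 3])  # translation becomes (0, 2, 3): basetransf shifted back by the frame offset
```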

Besides, I find it necessary to apply this basetransf: when I change it to an identity matrix, training doesn't converge. So how do I get my own basetransf?

Your answer will help me a lot!
Thank you!

@stephenlombardi

Hi,

The purpose of basetransf is to center the object of interest. We need it because the coordinate frame (i.e., the extrinsics) of the cameras may not place the object at the origin of the world. For example, in our camera calibration process one of the cameras is selected as the world origin (0,0,0) after calibration (in other words, one of the cameras is the origin of the coordinate frame). However, the raymarching code shoots rays from the camera and intersects them with the axis-aligned box [-1,1]^3. For that reason, we need basetransf, which transforms the camera locations/orientations into a new coordinate frame where the object the cameras are pointed at sits at the world origin (0,0,0).

There's also a parameter called "volradius", which is the radius of this axis-aligned bounding box [-1,1]^3 in world space. In other words, it scales the coordinate frame down so that the object fits in [-1,1]^3.
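Here's a small sketch of what that recentering does to the camera positions (the numbers and variable names are illustrative, not from the repo):

```python
import numpy as np

# Illustrative setup: two cameras looking at an object centered
# at (10, 0, 5) in the calibration coordinate frame.
object_center = np.array([10.0, 0.0, 5.0])
campos = np.array([[12.0, 0.0, 5.0],
                   [10.0, 2.0, 5.0]])

# basetransf is 3x4: rotation in [:3, :3], translation in [:3, 3].
basetransf = np.concatenate([np.eye(3), object_center[:, None]], axis=1)

# The same transform as in multiviewvideo.py, vectorized over cameras:
# campos' = R^T (campos - t), moving cameras into the object-centered frame.
R, t = basetransf[:3, :3], basetransf[:3, 3]
campos_centered = (campos - t) @ R
print(campos_centered)  # object now sits at the origin; cameras are offsets from it
```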

To determine basetransf for your own data, there are a few things you can try. The simplest would be to average the camera positions and put the average into the last column of basetransf, with the identity matrix along the diagonal, i.e.,
[1., 0., 0., cam_average[0]],
[0., 1., 0., cam_average[1]],
[0., 0., 1., cam_average[2]].
Note that this will only work well if your cameras are located on a sphere and all pointed inward to the same point. In general, you want the last column of basetransf to be the object position in the camera coordinate frame. Let me know if this helps.
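That recipe can be sketched like this (the helper name and the toy camera rig are made up for illustration; it only makes sense under the inward-pointing-sphere assumption above):

```python
import numpy as np

def basetransf_from_campos(campos):
    """Identity rotation, translation = average camera position.
    (basetransf_from_campos is my name, not a function in the repo.)"""
    cam_average = campos.mean(axis=0)
    return np.concatenate([np.eye(3), cam_average[:, None]], axis=1)

# Toy rig: six cameras on a sphere of radius 3 around an object at (1, 2, 3),
# all pointed inward.
center = np.array([1.0, 2.0, 3.0])
offsets = 3.0 * np.array([[1, 0, 0], [-1, 0, 0],
                          [0, 1, 0], [0, -1, 0],
                          [0, 0, 1], [0, 0, -1]], dtype=np.float64)
campos = center + offsets

basetransf = basetransf_from_campos(campos)
print(basetransf[:, 3])  # -> [1. 2. 3.], the object position
```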

@Qingcsai
Author

Qingcsai commented May 6, 2022

Thanks! That helps a lot. My cameras are located on a sphere with about 100+ views, and after translating the scene/object to the world origin (0,0,0), training still doesn't converge, for some other reason; I'm looking into it.

@Qingcsai
Author

Qingcsai commented May 11, 2022

Hi @stephenlombardi,
I'm still having some trouble applying your code to the InterHand dataset; could you help me?
The cameras are located on a sphere with about 100+ views. I set basetransf with the identity matrix along the diagonal, as you said, and the translation to one of the cameras, so that the cameras are pointed at the world origin (0,0,0).

And in the simplest case I use the Neural Volumes setting rather than the MVP setting.

However, I find it converges very slowly during training.

Here is the rendering result after training for 5000 epochs, with batchsize=16 and volradius=512:

render_ROM0_None.mp4

As you can see, the volume is distributed everywhere rather than concentrated in a tight area, which I think is abnormal.

The images are 512*334 in size; am I setting volradius=512 correctly? I ask because the dryice1 data is 667*1024 in size and uses volradius=256.
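For reference, here is how I'm sanity-checking the scale, based on your description of volradius (a rough sketch with made-up numbers, not from the repo): after dividing by volradius, the cameras should land somewhat outside the [-1,1]^3 box, with the object inside it.

```python
import numpy as np

# Made-up numbers: three cameras ~800 world units from the recentered object.
campos_centered = np.array([[800.0, 0.0, 0.0],
                            [0.0, 800.0, 0.0],
                            [-800.0, 0.0, 0.0]])
volradius = 512.0

# Dividing by volradius puts coordinates in the raymarcher's unit-box frame;
# cameras should sit a bit outside [-1, 1]^3, the object inside it.
normalized = campos_centered / volradius
print(np.linalg.norm(normalized, axis=1))  # ~1.56 for each camera
```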

@Qingcsai
Author

Qingcsai commented May 17, 2022

Oh, it's training normally with the NV setting now, after setting the right camera parameters. Closing as resolved.

@TinBacon

TinBacon commented Nov 10, 2022

> Oh, it's training normally with the NV setting now, after setting the right camera parameters. Closing as resolved.

Hello, I have the same problem as you. Could you tell me which camera parameters you set that made everything work?

Thanks very much!
