
Code related clarifications #3

Closed
poornimajd opened this issue Jun 23, 2020 · 7 comments

Comments

@poornimajd

Hello @hurjunhwa and team!
I am using this code to train on my own dataset. In this regard, I have a few questions.
First, how did you get the scaling factor of 0.54 in the following code?

depth = k_value.unsqueeze(1).unsqueeze(1).unsqueeze(1) * 0.54 / (disp + (1.0 - mask))

Also, in the line below, dividing disp by 256 is specific to the KITTI dataset, right?
disp_np = io.imread(disp_file).astype(np.uint16) / 256.0

Regarding the flow, I am not sure what the numbers in the following line mean. Do they need to be changed when using a different dataset?
flow_u = np.clip((u * 64 + 2 ** 15), 0.0, 65535.0).astype(np.uint16)

Also, I was able to visualize the scene flow, but I am not sure how to validate it, because I have neither ground-truth flow nor disparity. Can you please help me out in this case?
It was earlier mentioned that this line of code gives the scene flow:
out_sceneflow = interpolate2d_as(output_dict['flow_f_pp'][0], input_l1, mode="bilinear")

But I am unsure what each value in this output scene flow represents. Is it the normalized x, y, and z coordinate of the motion vector?
Can you please help me understand this line of code and the output it produces?
Any help is greatly appreciated!
Thank you

@hurjunhwa
Collaborator

hurjunhwa commented Jun 23, 2020

Hi,

  1. This is the baseline distance between the two cameras of the stereo rig: 0.54 m.

  2. and 3.
    Yes, these are specific to the KITTI dataset.
    KITTI stores the disparity and the flow as uint16.
    After loading and decoding, both the disparity and the flow are in pixel units.

  4. The output scene flow is defined in meters (m), not normalized.
    It is not easy to validate if you don't have any ground truth or pseudo ground truth for your dataset.
    As a sanity check, you may verify whether the source image and the warped target image (warped using the estimated disparity and scene flow) look similar.
    If you know the camera intrinsics of your dataset, you may adjust the scale of the estimation accordingly by referring to KITTI's intrinsics.

Best,
Jun
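
To make points 1–3 above concrete, here is a minimal sketch of the KITTI devkit conventions being discussed. The file path, the focal length `721.5377` (a typical KITTI calibration value), and the helper names are placeholders for illustration, not values or APIs from this repository:

```python
import numpy as np
import skimage.io as io

# KITTI stores disparity maps as uint16 PNGs holding disparity * 256;
# dividing by 256 recovers disparity in pixels. A stored value of 0
# marks an invalid pixel.
disp_png = io.imread("disp_occ_0/000000_10.png").astype(np.float32)
disp = disp_png / 256.0
valid = disp_png > 0

# Standard stereo relation: depth = fx * baseline / disparity.
# For KITTI the baseline is ~0.54 m; fx comes from the calibration files.
fx = 721.5377
baseline = 0.54
depth = fx * baseline / np.maximum(disp, 1e-6)
depth[~valid] = 0.0

# KITTI flow PNGs encode each component as uint16 via u * 64 + 2**15
# (which is the line quoted in the question), with the third channel
# acting as a validity mask. Decoding inverts the same transform.
def encode_kitti_flow(u, v, valid_mask):
    flow_png = np.zeros((*u.shape, 3), dtype=np.uint16)
    flow_png[..., 0] = np.clip(u * 64.0 + 2**15, 0.0, 65535.0).astype(np.uint16)
    flow_png[..., 1] = np.clip(v * 64.0 + 2**15, 0.0, 65535.0).astype(np.uint16)
    flow_png[..., 2] = valid_mask.astype(np.uint16)
    return flow_png

def decode_kitti_flow(flow_png):
    u = (flow_png[..., 0].astype(np.float32) - 2**15) / 64.0
    v = (flow_png[..., 1].astype(np.float32) - 2**15) / 64.0
    valid_mask = flow_png[..., 2] > 0
    return u, v, valid_mask
```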

@poornimajd
Copy link
Author

poornimajd commented Jun 23, 2020

Thanks for the quick and detailed reply!

4. The output scene flow is defined in meters (m), not normalized.

Also a small clarification on this
For example if the output of out_sceneflow is the following

`tensor([[[[-0.3044, -0.3008, -0.2973,  ...,  0.4740,  0.4754,  0.4768],
          [-0.3038, -0.3004, -0.2970,  ...,  0.4728,  0.4741,  0.4753],
          [-0.3032, -0.3000, -0.2967,  ...,  0.4716,  0.4727,  0.4738],
          ...,
          [-0.1817, -0.1817, -0.1817,  ...,  0.1915,  0.1919,  0.1922],
          [-0.1817, -0.1817, -0.1818,  ...,  0.1915,  0.1919,  0.1923],
          [-0.1816, -0.1817, -0.1818,  ...,  0.1916,  0.1920,  0.1924]],

         [[-0.0584, -0.0579, -0.0574,  ..., -0.0122, -0.0131, -0.0140],
          [-0.0585, -0.0579, -0.0574,  ..., -0.0113, -0.0123, -0.0132],
          [-0.0585, -0.0579, -0.0574,  ..., -0.0104, -0.0114, -0.0124],
          ...,
          [ 0.0720,  0.0713,  0.0706,  ...,  0.0674,  0.0678,  0.0682],
          [ 0.0723,  0.0716,  0.0708,  ...,  0.0677,  0.0681,  0.0685],
          [ 0.0725,  0.0718,  0.0711,  ...,  0.0679,  0.0683,  0.0687]],

         [[-1.0730, -1.0736, -1.0742,  ..., -0.8998, -0.8987, -0.8977],
          [-1.0746, -1.0751, -1.0757,  ..., -0.9013, -0.9002, -0.8992],
          [-1.0761, -1.0766, -1.0771,  ..., -0.9027, -0.9017, -0.9007],
          ...,
          [-1.2378, -1.2389, -1.2400,  ..., -1.1398, -1.1389, -1.1380],
          [-1.2371, -1.2383, -1.2395,  ..., -1.1393, -1.1384, -1.1375],
          [-1.2364, -1.2376, -1.2389,  ..., -1.1388, -1.1379, -1.1369]]]],
       device='cuda:0')`

and with size- ([1, 3, 370, 1226])
Then the x coordinate of the motion (in meters) is -0.3044, and similarly y (in meters) is -0.0584 and z (in meters) is -1.0730, right? This means the object has moved by these values along each direction compared to the previous frame, correct?
Sorry for the overly in-depth analysis.
Any suggestion is appreciated!

@hurjunhwa
Collaborator

Hi,

Yes, that's right.
You can find the definition of the coordinate frame and the calibration information in their paper or in the devkit on the dataset web page:
https://www.mrt.kit.edu/z/publ/download/2013/GeigerAl2013IJRR.pdf
(Fig. 1, the red-colored coordinate frame)
No worries!
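
As an illustration of that layout, here is a hypothetical helper for reading out a single pixel's motion vector. The function name and the coordinate reading are my interpretation of the discussion above, not code from this repository:

```python
import torch

# out_sceneflow has shape [B, 3, H, W]: channel 0 is the x component,
# channel 1 the y component, and channel 2 the z component of the 3D
# motion in meters, in the KITTI camera frame (x right, y down, z forward).
def motion_at(out_sceneflow: torch.Tensor, y: int, x: int):
    dx, dy, dz = out_sceneflow[0, :, y, x].tolist()
    return dx, dy, dz

# For the tensor printed above, motion_at(out_sceneflow, 0, 0) gives
# roughly (-0.3044, -0.0584, -1.0730): the point at that pixel moved about
# 0.30 m left, 0.06 m up, and 1.07 m toward the camera between the frames.
```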

@rohaldb

rohaldb commented Jul 7, 2021

I am running this on a custom dataset, and I'm confused about this line:

depth = k_value.unsqueeze(1).unsqueeze(1).unsqueeze(1) * 0.54 / (disp + (1.0 - mask))

If during evaluation we supply the model with only monocular images, can we leave this value at 0.54? It doesn't really make sense to set it to the distance between two cameras when there is only one.

Thanks!

@hurjunhwa
Collaborator

Hi,
Yes, you can leave it as it is when testing on a custom dataset.
Then, of course, the scale of the output depth and scene flow is unknown.
The value is only used for the KITTI dataset to recover the scale of the depth and scene flow, given the camera intrinsics and the stereo baseline distance.
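
If it helps, here is a hypothetical sketch of the kind of rescaling one could do with a single metric reference in the scene. Every name and value below is assumed for illustration and is not part of this repository:

```python
# depth_pred and sceneflow_pred are assumed to be the model's outputs for
# a monocular sequence, so their absolute scale is unknown. Given one
# point at pixel (y_ref, x_ref) whose true distance has been measured,
# a single global scale factor can be estimated and applied to both.
def rescale_with_reference(depth_pred, sceneflow_pred, y_ref, x_ref,
                           known_distance_m):
    scale = known_distance_m / depth_pred[y_ref, x_ref]
    return depth_pred * scale, sceneflow_pred * scale
```

Note that this only fixes the global scale; as the following comments discuss, a monocular prediction can also be ambiguous up to a shift, which a single reference point cannot resolve.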

@rohaldb

rohaldb commented Jul 7, 2021

Thanks so much for the speedy reply!

Just to clarify, it would be unknown up to scale and shift, not just scale, correct? Apologies if this is a trivial question; my graphics/vision background is not that strong!

@hurjunhwa
Collaborator

Yes, you are right :) Both scale and shift.

By the way, as you may know, there is a very nice paper at CVPR this year that recovers the scale and shift, and even the focal length: Learning to Recover 3D Scene Shape from a Single Image.
It would also be interesting to read!
