Regarding custom data generation #325

Open
ArghyaChatterjee opened this issue Oct 23, 2023 · 4 comments

ArghyaChatterjee commented Oct 23, 2023

Hello,

I am trying to generate a dataset for CenterPose using your (DOPE) pipeline, and I am facing four problems.

  1. I captured images at different exposures and merged them into HDR images that represent my lab's background. My HDR images are 1920 x 1080 in resolution, while the images I generate with your pipeline are 1280 x 720. When I generate a training dataset with the main object and the distractors, the images and the corresponding annotations are produced, but the background is zoomed in, which is not representative of the original background. I tried changing the camera position and the FOV, but that distorts the image itself. How can I fix this?

Normal (nothing changed in your script; the background is automatically zoomed in, which is the problem):
00000

Camera Eye changed (from 'eye':visii.vec3(0,0,0) to 'eye':visii.vec3(0,0,-2), looks distorted):

random_camera_movement = {
    'at':visii.vec3(1,0,0),
    'up':visii.vec3(0,0,1),
    'eye':visii.vec3(0,0,-2)
}

Screenshot from 2023-10-23 15-48-04

Camera fov changed to 2 (from default 0.78 to 2, looks distorted):

camera = visii.entity.create(
    name = "camera",
    transform = visii.transform.create("camera"),
    camera = visii.camera.create_perspective_from_fov(
        name = "camera",
        field_of_view = 1.5,
        aspect = float(opt.width)/float(opt.height)
    )
)

[00000](https://github.com/NVlabs/Deep_Object_Pose/assets/28845357/0aaef259-8281-4379-8d5b-d2fce87e9eb9)

  2. The original Objectron dataset that CenterPose is trained on contains keypoints_3d and the scale of each object in the annotation JSON file. Since DOPE doesn't need that information, you haven't included it in the nvisii interface. Can you tell me how to generate this information for a CenterPose dataset?
    Here is what the JSON file looks like for DOPE:
{
    "camera_data": {
        "camera_look_at": {
            "at": [
                1.0,
                0.0,
                0.0
            ],
            "eye": [
                0.0,
                0.0,
                0.0
            ],
            "up": [
                0.0,
                0.0,
                1.0
            ]
        },
        "camera_view_matrix": [
            [
                0.0,
                0.0,
                1.0,
                0.0
            ],
            [
                -1.0,
                0.0,
                0.0,
                0.0
            ],
            [
                0.0,
                -1.0,
                0.0,
                0.0
            ],
            [
                0.0,
                0.0,
                0.0,
                1.0
            ]
        ],
        "height": 1920,
        "intrinsics": {
            "cx": 640.0,
            "cy": 960.0,
            "fx": 2317.6455078125,
            "fy": 2317.6455078125
        },
        "location_worldframe": [
            -0.0,
            0.0,
            -0.0
        ],
        "quaternion_xyzw_worldframe": [
            -0.5,
            0.5,
            -0.5,
            0.5
        ],
        "width": 1280
    },
    "objects": [
        {
            "bounding_box_minx_maxx_miny_maxy": [
                764,
                1029,
                493,
                725
            ],
            "class": "Sony_Acid_Music_Studio",
            "local_cuboid": null,
            "local_to_world_matrix": [
                [
                    0.3346782624721527,
                    -0.293707937002182,
                    -0.8953915238380432,
                    -0.0
                ],
                [
                    0.9418398141860962,
                    0.13497743010520935,
                    0.30776405334472656,
                    -0.0
                ],
                [
                    0.030464906245470047,
                    -0.9463173747062683,
                    0.3217999041080475,
                    -0.0
                ],
                [
                    1.9207875728607178,
                    -0.12306323647499084,
                    0.2600710093975067,
                    1.0
                ]
            ],
            "location": [
                0.12306323647499084,
                -0.2600710093975067,
                1.9207875728607178
            ],
            "location_worldframe": [
                1.9207875728607178,
                -0.12306323647499084,
                0.2600710093975067
            ],
            "name": "google_Sony_Acid_Music_Studio_0",
            "projected_cuboid": [
                [
                    1038.0235290527344,
                    655.3387069702148
                ],
                [
                    996.9047546386719,
                    491.82838439941406
                ],
                [
                    984.8480224609375,
                    486.7687225341797
                ],
                [
                    1025.5551147460938,
                    647.6087951660156
                ],
                [
                    815.9352874755859,
                    730.3356170654297
                ],
                [
                    769.0859222412109,
                    568.1963539123535
                ],
                [
                    760.9901428222656,
                    561.7803955078125
                ],
                [
                    807.2328186035156,
                    721.2976455688477
                ],
                [
                    900.2195739746094,
                    608.800220489502
                ]
            ],
            "provenance": "nvisii",
            "px_count_all": 0,
            "px_count_visib": 0,
            "quaternion_xyzw": [
                0.6266990900039673,
                0.30334094166755676,
                0.5110090374946594,
                0.5040854811668396
            ],
            "quaternion_xyzw_worldframe": [
                0.46848180890083313,
                0.34586817026138306,
                -0.4615582525730133,
                0.669226348400116
            ],
            "segmentation_id": 1,
            "visibility": 1
        },
        {
            "bounding_box_minx_maxx_miny_maxy": [
                257,
                461,
                837,
                1052
            ],
            "class": "Epson_DURABrite_Ultra_786_Black_Ink_Cartridge_T786120S",
            "local_cuboid": null,
            "local_to_world_matrix": [
                [
                    -0.36425772309303284,
                    0.1863223910331726,
                    -0.9124693870544434,
                    0.0
                ],
                [
                    -0.6184156537055969,
                    -0.7809761166572571,
                    0.08739950507879257,
                    0.0
                ],
                [
                    -0.6963321566581726,
                    0.5961212515830994,
                    0.39970117807388306,
                    -0.0
                ],
                [
                    1.6742818355560303,
                    0.14743097126483917,
                    -0.02730831876397133,
                    1.0
                ]
            ],
            "location": [
                -0.14743097126483917,
                0.02730831876397133,
                1.6742818355560303
            ],
            "location_worldframe": [
                1.6742818355560303,
                0.14743097126483917,
                -0.02730831876397133
            ],
            "name": "google_Epson_DURABrite_Ultra_786_Black_Ink_Cartridge_T786120S_2",
            "projected_cuboid": [
                [
                    222.03956604003906,
                    956.2545776367188
                ],
                [
                    255.54000854492188,
                    835.0365829467773
                ],
                [
                    291.5988540649414,
                    828.6186218261719
                ],
                [
                    258.3934783935547,
                    951.4541816711426
                ],
                [
                    404.1435241699219,
                    1057.3942565917969
                ],
                [
                    431.6315460205078,
                    943.2322311401367
                ],
                [
                    467.2838592529297,
                    938.7166213989258
                ],
                [
                    440.1404571533203,
                    1054.3155670166016
                ],
                [
                    349.8906707763672,
                    947.1115493774414
                ]
            ],
            "provenance": "nvisii",
            "px_count_all": 0,
            "px_count_visib": 0,
            "quaternion_xyzw": [
                -0.6319432854652405,
                -0.6699357032775879,
                0.37993085384368896,
                0.08652451634407043
            ],
            "quaternion_xyzw_worldframe": [
                -0.5042363405227661,
                0.21423150599002838,
                0.7976426482200623,
                0.2522238790988922
            ],
            "segmentation_id": 3,
            "visibility": 1
        },
        {
            "bounding_box_minx_maxx_miny_maxy": [
                754,
                1237,
                1787,
                1920
            ],
            "class": "STACKING_BEAR_V04KKgGBn2A",
            "local_cuboid": null,
            "local_to_world_matrix": [
                [
                    -0.1307300329208374,
                    0.9679588079452515,
                    0.21439549326896667,
                    -0.0
                ],
                [
                    0.49649521708488464,
                    -0.12326063215732574,
                    0.8592435121536255,
                    -0.0
                ],
                [
                    0.8581387996673584,
                    0.21877521276474,
                    -0.46447306871414185,
                    -0.0
                ],
                [
                    1.031076431274414,
                    -0.173092320561409,
                    -0.47199130058288574,
                    1.0
                ]
            ],
            "location": [
                0.173092320561409,
                0.47199130058288574,
                1.031076431274414
            ],
            "location_worldframe": [
                1.031076431274414,
                -0.173092320561409,
                -0.47199130058288574
            ],
            "name": "google_STACKING_BEAR_V04KKgGBn2A_3",
            "projected_cuboid": [
                [
                    694.1983795166016,
                    2238.947296142578
                ],
                [
                    1155.0393676757812,
                    2306.8888092041016
                ],
                [
                    1160.1544189453125,
                    1852.720069885254
                ],
                [
                    739.4944000244141,
                    1779.2024230957031
                ],
                [
                    747.8303527832031,
                    2240.9852600097656
                ],
                [
                    1244.3098449707031,
                    2314.351272583008
                ],
                [
                    1241.6778564453125,
                    1826.1781311035156
                ],
                [
                    791.5280151367188,
                    1746.397590637207
                ],
                [
                    974.1064453125,
                    2027.9118347167969
                ]
            ],
            "provenance": "nvisii",
            "px_count_all": 0,
            "px_count_visib": 0,
            "quaternion_xyzw": [
                -0.09103001654148102,
                0.25028812885284424,
                0.9598621129989624,
                -0.0879439264535904
            ],
            "quaternion_xyzw_worldframe": [
                0.603532075881958,
                0.6066181659698486,
                0.444273978471756,
                0.26530003547668457
            ],
            "segmentation_id": 4,
            "visibility": 1
        },
        {
            "bounding_box_minx_maxx_miny_maxy": [
                889,
                1106,
                132,
                444
            ],
            "class": "Nestle_Carnation_Cinnamon_Coffeecake_Kit_1913OZ",
            "local_cuboid": null,
            "local_to_world_matrix": [
                [
                    -0.30763235688209534,
                    -0.11167889088392258,
                    -0.9449289441108704,
                    0.0
                ],
                [
                    -0.3483346402645111,
                    0.9373666048049927,
                    0.0026190669741481543,
                    0.0
                ],
                [
                    0.8854524493217468,
                    0.32995718717575073,
                    -0.3272658884525299,
                    -0.0
                ],
                [
                    1.718860387802124,
                    -0.30652472376823425,
                    0.5503286123275757,
                    1.0
                ]
            ],
            "location": [
                0.30652472376823425,
                -0.5503286123275757,
                1.718860387802124
            ],
            "location_worldframe": [
                1.718860387802124,
                -0.30652472376823425,
                0.5503286123275757
            ],
            "name": "google_Nestle_Carnation_Cinnamon_Coffeecake_Kit_1913OZ_4",
            "projected_cuboid": [
                [
                    989.7109985351562,
                    447.8106880187988
                ],
                [
                    961.2370300292969,
                    289.87009048461914
                ],
                [
                    884.0269470214844,
                    280.8959770202637
                ],
                [
                    910.965576171875,
                    440.72656631469727
                ],
                [
                    1112.410888671875,
                    309.3926811218262
                ],
                [
                    1077.9110717773438,
                    139.5348358154297
                ],
                [
                    994.6639251708984,
                    127.50640869140625
                ],
                [
                    1027.3947143554688,
                    299.54017639160156
                ],
                [
                    992.0835876464844,
                    294.37883377075195
                ]
            ],
            "provenance": "nvisii",
            "px_count_all": 0,
            "px_count_visib": 0,
            "quaternion_xyzw": [
                -0.23918910324573517,
                -0.007904015481472015,
                0.6664068698883057,
                0.7061359882354736
            ],
            "quaternion_xyzw_worldframe": [
                -0.14341111481189728,
                0.8019140362739563,
                0.1036820113658905,
                0.5706289410591125
            ],
            "segmentation_id": 5,
            "visibility": 1
        },
        {
            "bounding_box_minx_maxx_miny_maxy": [
                804,
                1047,
                605,
                845
            ],
            "class": "mug",
            "local_cuboid": null,
            "local_to_world_matrix": [
                [
                    0.6584553718566895,
                    -0.3450233042240143,
                    -0.6688762903213501,
                    -0.0
                ],
                [
                    0.7392677068710327,
                    0.12983770668506622,
                    0.6607765555381775,
                    0.0
                ],
                [
                    -0.14113791286945343,
                    -0.9295704364776611,
                    0.34055688977241516,
                    -0.0
                ],
                [
                    1.2288596630096436,
                    -0.1444949060678482,
                    0.12551634013652802,
                    1.0
                ]
            ],
            "location": [
                0.1444949060678482,
                -0.12551634013652802,
                1.2288596630096436
            ],
            "location_worldframe": [
                1.2288596630096436,
                -0.1444949060678482,
                0.12551634013652802
            ],
            "name": "mug_0",
            "projected_cuboid": [
                [
                    1093.7495422363281,
                    788.3962440490723
                ],
                [
                    1054.3683624267578,
                    657.6246643066406
                ],
                [
                    1005.3102111816406,
                    551.9614219665527
                ],
                [
                    1044.7257995605469,
                    680.1902961730957
                ],
                [
                    867.2451019287109,
                    871.4556884765625
                ],
                [
                    816.9114685058594,
                    746.8995094299316
                ],
                [
                    782.5321960449219,
                    637.4864959716797
                ],
                [
                    831.6124725341797,
                    760.0553512573242
                ],
                [
                    936.1599731445312,
                    711.9807815551758
                ]
            ],
            "provenance": "nvisii",
            "px_count_all": 0,
            "px_count_visib": 0,
            "quaternion_xyzw": [
                0.7326215505599976,
                0.18394167721271515,
                0.5418983697891235,
                0.3684796094894409
            ],
            "quaternion_xyzw_worldframe": [
                0.5449910163879395,
                0.18084901571273804,
                -0.3715722858905792,
                0.7295289635658264
            ],
            "segmentation_id": 10,
            "visibility": 1
        }
    ]
}

Here is what the JSON file looks like for CenterPose:

{
    "AR_data": {
        "plane_center": [
            0.026276886463165283,
            0.03733876347541809,
            -0.42468586564064026
        ],
        "plane_normal": [
            -0.7663699388504028,
            0.09618763625621796,
            0.6351574063301086
        ]
    },
    "camera_data": {
        "camera_projection_matrix": [
            [
                1.6554118394851685,
                0.0,
                0.019000232219696045,
                0.0
            ],
            [
                0.0,
                2.2072157859802246,
                -0.004737734794616699,
                0.0
            ],
            [
                0.0,
                0.0,
                -0.9999997615814209,
                -0.0009999998146668077
            ],
            [
                0.0,
                0.0,
                -1.0,
                0.0
            ]
        ],
        "camera_view_matrix": [
            [
                -0.26714298129081726,
                -0.7525513172149658,
                -0.6019145250320435,
                -0.055233731865882874
            ],
            [
                -0.9542973041534424,
                0.11975152790546417,
                0.27381736040115356,
                0.18261873722076416
            ],
            [
                -0.13398146629333496,
                0.6475539207458496,
                -0.7501484751701355,
                -0.00018225116946268827
            ],
            [
                0.0,
                0.0,
                0.0,
                1.0
            ]
        ],
        "height": 800,
        "intrinsics": {
            "cx": 298.370361328125,
            "cy": 392.1915690104167,
            "fx": 662.1647135416667,
            "fy": 662.1647135416667
        },
        "location_world": [
            0.15949289500713348,
            -0.06331707537174225,
            -0.08338691294193268
        ],
        "quaternion_world_xyzw": [
            -0.583792361067781,
            0.7309314302392926,
            0.3151360368846342,
            0.1600468734586658
        ],
        "width": 600
    },
    "objects": [
        {
            "class": "cup",
            "keypoints_3d": [
                [
                    -0.027369018644094467,
                    0.04407183825969696,
                    -0.38022491335868835
                ],
                [
                    -0.0025680139660835266,
                    -0.032148003578186035,
                    -0.44896677136421204
                ],
                [
                    0.05524785816669464,
                    -0.021686285734176636,
                    -0.38079142570495605
                ],
                [
                    -0.10985984653234482,
                    -0.01868179440498352,
                    -0.3600447177886963
                ],
                [
                    -0.05204399302601814,
                    -0.008220136165618896,
                    -0.2918694317340851
                ],
                [
                    -0.002694040536880493,
                    0.09636379778385162,
                    -0.4685803949832916
                ],
                [
                    0.05512183904647827,
                    0.10682550072669983,
                    -0.40040507912635803
                ],
                [
                    -0.10998588055372238,
                    0.10982996970415115,
                    -0.37965837121009827
                ],
                [
                    -0.05217001214623451,
                    0.12029168009757996,
                    -0.3114830255508423
                ]
            ],
            "location": [
                -0.02736901868021091,
                0.044071842568774056,
                -0.38022490531119946
            ],
            "mug": true,
            "mug_handle_visible": true,
            "name": "cup_0",
            "projected_cuboid": [
                [
                    378,
                    344
                ],
                [
                    253,
                    388
                ],
                [
                    263,
                    488
                ],
                [
                    266,
                    189
                ],
                [
                    282,
                    273
                ],
                [
                    437,
                    388
                ],
                [
                    478,
                    483
                ],
                [
                    493,
                    200
                ],
                [
                    557,
                    281
                ]
            ],
            "provenance": "objectron",
            "quaternion_xyzw": [
                0.19061728062246633,
                0.29139854153287625,
                0.6446484091470697,
                0.6805735602451516
            ],
            "scale": [
                0.12999999523162842,
                0.14000000059604645,
                0.09000000357627869
            ]
        }
    ]
}

  3. When the pipeline generates the annotated dataset, why can't I see anything in the seg.exr images (they are blank white)? How can I get a usable segmentation image? I can only see a depth.exr, which is a 32-bit binary image that looks more like a segmentation image than a depth image. Here is what the pipeline generates, with depth.exr as a binary image and seg.exr as a blank white image.

Screenshot from 2023-10-23 16-19-17

The depth image (.exr) looks like this:

Screenshot from 2023-10-23 16-21-59

The segmentation image (.exr) looks like this:

Screenshot from 2023-10-23 16-22-14

  4. Is there a way to change the parameters inside nvisii interactively, so that I can adjust visually where to put the objects, the distractors, and the camera itself?

ArghyaChatterjee (Author) commented:

@TontonTremblay, could you please respond?

TontonTremblay (Collaborator) commented:

I am on vacation this week; I will answer next week.

TontonTremblay (Collaborator) commented:

Wow, this is uncharted territory. I am impressed by what you are trying to do, and I have always wanted to try something along these lines. Is this the tutorial you followed: https://blog.polyhaven.com/how-to-create-high-quality-hdri/
nvisii uses this kind of HDR map. I am afraid I cannot help much more than that, to be honest. But please share your results here.
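
On the distortion reported in question 1 when the field of view is raised, a quick check of the pinhole relation may help. The sketch below assumes that `field_of_view` is the vertical angle in radians; that is an assumption, although it is consistent with the DOPE JSON posted above, where a height of 1920 and a field of view of about 0.785 (pi/4) give a focal length close to the listed `fy`. The function name `focal_length_px` is hypothetical and is not part of nvisii.

```python
import math


def focal_length_px(fov_rad: float, image_height: int) -> float:
    """Pinhole focal length in pixels, assuming fov_rad is the vertical field of view."""
    return (image_height / 2.0) / math.tan(fov_rad / 2.0)


# Field of view of pi/4 on a 1920-pixel-high image, as in the posted DOPE JSON.
print(focal_length_px(math.pi / 4, 1920))

# Larger field-of-view values give a shorter focal length,
# which exaggerates perspective in the rendered objects.
print(focal_length_px(1.5, 1920))
print(focal_length_px(2.0, 1920))
```

Under this assumption, going from 0.785 to 1.5 or 2 cuts the focal length by more than half, which would explain why the rendered objects look stretched.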

TontonTremblay (Collaborator) commented:

I see there are questions in there :P

  1. See above. I do not think you created your HDR map correctly.
  2. Yes, you are correct, you need the cuboid sizes. This is the information you need: https://github.com/NVlabs/Deep_Object_Pose/blob/master/scripts/nvisii_data_gen/utils.py#L950-L960 You then need to normalize it by one axis.
  3. EXR is a format whose values are not bounded, so you can store whatever you want in it, and most viewers won't know how to display the segmentation. Try something like this: https://chat.openai.com/share/dbfb8ef1-6504-48e6-bc2b-9972f18283f0
  4. Try to make the script work with this example, and then you could control your parameters live: https://github.com/owl-project/NVISII/blob/master/examples/17.materials_visii_interactive.py
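
A minimal sketch of point 2, assuming the eight cuboid corners are already available in the object's local frame (for example from the code linked in that point). It takes the extents of the corners along each axis and then divides by one axis. Which axis to normalize by, and whether CenterPose expects absolute or relative values in `scale`, are assumptions to check against the CenterPose data loader. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def cuboid_dimensions(corners: Sequence[Point]) -> Tuple[float, float, float]:
    """Extent of the corners along x, y and z of the object's local frame."""
    if len(corners) != 8:
        raise ValueError("expected 8 cuboid corners")
    xs, ys, zs = zip(*corners)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


def relative_scale(dims: Sequence[float], axis: int = 1) -> List[float]:
    """Divide every dimension by the one along `axis` (assumed reference axis)."""
    ref = dims[axis]
    if ref == 0:
        raise ValueError("reference axis has zero extent")
    return [d / ref for d in dims]


# Illustrative corners of a box with the same size as the Objectron
# "scale" entry posted above (0.13 x 0.14 x 0.09), centered on the origin.
corners = [
    (sx * 0.065, sy * 0.07, sz * 0.045)
    for sx in (-1, 1)
    for sy in (-1, 1)
    for sz in (-1, 1)
]
dims = cuboid_dimensions(corners)
print(dims)
print(relative_scale(dims, axis=1))
```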

Good luck
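
For point 3, one way to make the segmentation viewable is to remap the stored IDs to a displayable 8-bit range. The sketch below assumes the pixel values have already been loaded from the EXR file as a 2D grid of floats with an EXR-capable reader; loading is not shown. The function name `ids_to_gray` is hypothetical.

```python
from typing import List, Sequence


def ids_to_gray(seg: Sequence[Sequence[float]]) -> List[List[int]]:
    """Map each distinct value to an evenly spaced gray level in 0..255."""
    unique = sorted({v for row in seg for v in row})
    if len(unique) <= 1:
        # Nothing to separate: render everything as black.
        lookup = {v: 0 for v in unique}
    else:
        step = 255 / (len(unique) - 1)
        lookup = {v: round(i * step) for i, v in enumerate(unique)}
    return [[lookup[v] for v in row] for row in seg]


# Tiny illustrative grid using segmentation IDs like those in the DOPE JSON.
seg = [
    [0.0, 1.0, 1.0],
    [0.0, 3.0, 10.0],
]
print(ids_to_gray(seg))
```

The mapping is based on rank rather than magnitude. If the background is stored as a very large sentinel value (an assumption, not confirmed here), it still ends up as one gray level among the others instead of washing out the image.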
