
About evaluation for mesh. #2

Closed
guangkaixu opened this issue Feb 26, 2022 · 6 comments

Comments

@guangkaixu

Hi, thanks for your work on reconstruction. I'm interested in mesh evaluation, but when I run retrieval-fuse/util/mesh_metrics.py (https://github.com/nihalsid/retrieval-fuse/blob/fce90fa6adf349a3c7bb5eb4b57d387d4f6ff46c/util/mesh_metrics.py) on a GT mesh against itself, i.e. the same mesh as both prediction and target, I get a chamfer-L1 of 0.017 and a normals_correctness of 0.801, which to my understanding should theoretically be 0 and 1. What should I do to get correct 3D mesh evaluation results?

@nihalsid
Owner

Hi guangkaixu, can you share the mesh for debugging? I suspect that the mesh you're using is not scaled to the 64^3 grid which these scripts expect.

@guangkaixu
Author

> Hi guangkaixu, can you share the mesh for debugging? I suspect that the mesh you're using is not scaled to the 64^3 grid which these scripts expect.

Oh yes, I ran the evaluation code on the ScanNet and NYUDepthv2 datasets. The voxel size is set to 2 cm, and the grid size varies between scenes instead of being 64^3. Is there any suggestion for evaluating on different grid sizes without rescaling? Evaluating on a 64^3 grid seems too coarse to me. Thanks.

@nihalsid
Owner

I see. Then maybe you can use the original scripts from convocc (https://github.com/autonomousvision/convolutional_occupancy_networks/blob/master/src/eval.py) that I had modified.

Since these metrics are based on randomly sampled points, you might not get exactly 0 and 1, but the metrics should tend to 0 and 1 as the number of points gets large. The default number of points is already quite large, so you should get numbers very close to 0 for CD and 1 for normal consistency. In case you still have issues, I can try debugging it at my end if you provide me with the mesh.
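To see this convergence concretely, here is a minimal sketch (not the repo's code; a unit sphere stands in for the mesh surface, and the brute-force chamfer is only for illustration): the chamfer-L1 between two independent random samplings of the *same* surface shrinks toward 0 as the sample count grows.

```python
import numpy as np

def sample_sphere(n, rng):
    # uniform random points on the unit sphere (stand-in for mesh surface samples)
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def chamfer_l1(a, b):
    # symmetric chamfer-L1: mean nearest-neighbor distance in both directions
    # (brute force; real evaluation scripts use a KD-tree for large N)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

rng = np.random.default_rng(0)
# two independent samplings of the same surface: CD is nonzero, but drops with N
cd_small = chamfer_l1(sample_sphere(150, rng), sample_sphere(150, rng))
cd_big = chamfer_l1(sample_sphere(1500, rng), sample_sphere(1500, rng))
```

With 10x more samples, `cd_big` comes out several times smaller than `cd_small`, which is exactly the behavior described above: the metric only tends to its theoretical value in the limit of many samples.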

@guangkaixu
Author

Thanks for offering the original evaluation code for 3D meshes. I tried to modify the original one but got stuck on getting all points from the mesh. If I follow your evaluation code and sample 100K points from the mesh (about 500K points in total), the sampled points differ between runs, which introduces variance into the metrics. Is there any way to get all points from the mesh other than pointcloud = trimesh.sample(sample_num)? The evaluation code and a demo mesh are released in my repo (https://github.com/guangkaixu/eval_3d_mesh). Thanks again for your support.
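One way around the run-to-run variance, sketched below with an illustrative helper (this is not trimesh's internal code, just the standard area-weighted scheme it implements), is to seed the surface sampling so repeated evaluations draw identical points:

```python
import numpy as np

def sample_surface(vertices, faces, n, seed=0):
    # area-weighted, seeded surface sampling: same seed -> same points,
    # so metrics computed from the samples are reproducible across runs
    rng = np.random.default_rng(seed)
    tri = vertices[faces]                                      # (F, 3, 3)
    area = 0.5 * np.linalg.norm(
        np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
    idx = rng.choice(len(faces), size=n, p=area / area.sum())  # pick faces by area
    # uniform barycentric coordinates inside each chosen triangle
    u, v = rng.random(n), rng.random(n)
    flip = u + v > 1
    u[flip], v[flip] = 1 - u[flip], 1 - v[flip]
    t = tri[idx]
    return t[:, 0] + u[:, None] * (t[:, 1] - t[:, 0]) + v[:, None] * (t[:, 2] - t[:, 0])

# demo: a single triangle in the z=0 plane
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
faces = np.array([[0, 1, 2]])
p1 = sample_surface(verts, faces, 100, seed=42)
p2 = sample_surface(verts, faces, 100, seed=42)  # identical to p1
```

Using all mesh vertices directly is another option, but vertices are not uniformly distributed over the surface, so a fixed-seed uniform sampling is usually the fairer deterministic choice.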

@nihalsid
Owner

Hi @guangkaixu, sorry for the late response, got busy with some deadlines. I had a chance to look at your mesh. With regards to your script, I guess you already discovered the reason why you get a higher chamfer-L1 and a low NC. You have quite a big mesh with 255K vertices, and since both CD and NC are based on random samples, the quality of the metric depends on how many points you sample. In the extreme case, if you sampled just one point on both the GT and predicted meshes, you'd get a really bad normal consistency and a high chamfer distance, because that one point can land at one place on the GT and at another on the prediction. The metric becomes more reliable the more samples you use. For your example, I get

{'normals completeness': 0.9103680471111619, 'chamfer-L1': 0.005938444974643876} for original number of points
{'normals completeness': 0.9770087988483396, 'chamfer-L1': 0.0013580619865905396} for N=1e7 points

As you see it gets better with more samples. However, increasing the number of samples comes at the cost of more memory and time.

Also, you have to be careful with the IoU metric. In your code you're using pitch=1.1875 even though the scale of your mesh is around 5 units. This pitch was for a resolution of 64, so you should scale the pitch to your mesh's extent (the best check is to visualize the resulting voxels and see whether they look reasonable).
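A minimal sketch of that pitch scaling (the helper name is illustrative; the 64^3 resolution and the ~5-unit extent are the values from this thread):

```python
import numpy as np

def pitch_for_resolution(extents, res=64):
    # choose a voxel pitch so the mesh's longest extent maps onto a res^3 grid,
    # instead of reusing a pitch tuned for a differently scaled mesh
    return max(extents) / res

# e.g. a mesh spanning roughly 5 x 4.2 x 3.1 units
pitch = pitch_for_resolution(np.array([5.0, 4.2, 3.1]), res=64)
# with trimesh, one would then voxelize via mesh.voxelized(pitch)
```

Visualizing `mesh.voxelized(pitch)` as suggested above is the quickest sanity check that the chosen pitch covers the mesh at the intended resolution.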

@guangkaixu
Author

Thank you for your patient explanation. I found that evaluating point clouds directly, instead of sampling points from meshes, is more reliable since it avoids the sampling variance, but your suggestion is also useful to me!
