
Support ptex mesh - step 6: dump sorted faces from ReplicaSDK, and load them to simulator #221

Merged
merged 19 commits into master from replica_faces on Oct 1, 2019

Conversation

bigbike
Contributor

@bigbike bigbike commented Sep 18, 2019

Motivation and Context

The original mesh is cut into pieces (sub-meshes), each of which contains a number of faces. A key step of the algorithm is to sort them by a "code". ReplicaSDK uses std::sort for this job, which does not preserve the original relative order of elements that compare equal (ideally, std::stable_sort should be applied here).

The consequence is that we cannot reproduce this face order in our simulator using std::sort (clang and gcc may have different implementations of std::sort).

This PR loads the sorted faces dumped from ReplicaSDK, so that we get the same face sequence as ReplicaSDK produces.

How Has This Been Tested

Disabled the buffer texture in the shader and tested it on Mac, which gives the following rendering result:
[screenshot of the rendering result]

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have completed my CLA (see CONTRIBUTING)
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Sep 18, 2019
@bigbike
Contributor Author

bigbike commented Sep 18, 2019

This PR is ready to be reviewed. Thanks.

Collaborator

@msavva msavva left a comment

Thanks @bigbike . In order to allow testing it would be good to have download links to the sorted_faces.bin files. For now, perhaps just on S3, but eventually we should loop in @jstraub as well to help update the public Replica dataset release files with included sorted face files.

@jstraub

jstraub commented Sep 18, 2019

Yeah we already have a habitat folder in the Replica dataset. We could add them into that folder.

@bigbike
Contributor Author

bigbike commented Sep 18, 2019

@jstraub : Yes, I was thinking the same thing this morning. Will do.

I am now pre-processing all of the replica models. Will keep all of you posted.

@bigbike
Contributor Author

bigbike commented Sep 20, 2019

Warning:

The program will segfault on Mac at:

currentMesh->adjFacesBuffer.setData(adjFaces[iMesh],
                                        Magnum::GL::BufferUsage::StaticDraw);

with the latest Magnum in master.

This is a known issue discussed at the end of #132 . Do not be surprised.

I am actually going to write another PR that applies conditional compilation (#ifdef etc.) for Mac.

But I can also do it in this PR if you guys are not against it.

@bigbike
Contributor Author

bigbike commented Sep 20, 2019

I pre-processed all the replica models. The data files are compressed into a .zip file, which can be downloaded from here:
http://dl.fbaipublicfiles.com/habitat/sorted_faces.zip

I wrote a script (included in the .zip file) to copy each file to its corresponding folder. You only need to specify the path to the folder that contains all of the replica models (e.g., on my machine, it is ~/models/replica/).

Feel free to test this PR with the data on Linux ONLY. (On Mac it would segfault. This is a known issue; see the previous comment.)

@bigbike
Contributor Author

bigbike commented Sep 23, 2019

Dear reviewers,

Please let me know your thoughts and concerns regarding this PR. Thanks.

@mosra
Collaborator

mosra commented Sep 24, 2019

Apologies if I'm missing something obvious, but why go through all the pain instead of fixing it upstream? It's a non-trivial amount of data that only adds more to the already huge size of the uncompressed dataset, and if we'd want to run this on the web we would need to spend extra time removing all of this to generate the data on-the-fly again (and then fixing the inevitable code rot as the original code path was not used and not tested anymore).

If the above is not possible, at the very least can we make this configurable at runtime, so it's either pulling the data from a file for a reproducible output, or generating them on-the-fly for people who don't want to load the extra data?

@bigbike
Contributor Author

bigbike commented Sep 24, 2019

Apologies if I'm missing something obvious, but why go through all the pain instead of fixing it upstream? It's a non-trivial amount of data that only adds more to the already huge size of the uncompressed dataset, and if we'd want to run this on the web we would need to spend extra time removing all of this to generate the data on-the-fly again (and then fixing the inevitable code rot as the original code path was not used and not tested anymore).

I believe @jstraub, @simesgreen and their colleagues are working on the upstream fix, but it may take a long time. They suggested dumping out the data for now to circumvent this problem. It is a temporary but reasonable solution at this point.

In terms of data volume, the extra data for the entire Replica dataset (20 models in total) is just 108MB, so I think it is trivial and not a problem at all. When developing this, I carefully considered the issue and selected the minimum amount of information to dump. No redundancy.

Our team has an urgent need for the PTex mesh for model training, and this issue should not be a blocker that slows us down. Hope you understand.

If the above is not possible, at the very least can we make this configurable at runtime, so it's either pulling the data from a file for a reproducible output, or generating them on-the-fly for people who don't want to load the extra data?

Pulling the data from a file is the only correct solution at this moment. It is impossible to generate the correct data on the fly, so making it configurable does not make sense here. Again, this is a temporary solution that we have to apply; we do not have any other correct options until FRL fixes this issue.

@bigbike
Contributor Author

bigbike commented Sep 27, 2019

Dear reviewers,

Would you please let me know any further concerns about this PR, so that I can address them?

Thanks!

CORRADE_ASSERT(file.good(), "Error: cannot open the file " << filename, {});

uint64_t numSubMeshes = 0;
file.read((char*)&numSubMeshes, sizeof(uint64_t));
Contributor

static_cast<char *>

Contributor Author

OK, will do.

Contributor Author

Oh, actually I suddenly remembered that:
static_cast from 'uint64_t *' (aka 'unsigned long long *') to 'char *' is not allowed.

Just tried and double confirmed it.

Contributor

Ah. reinterpret_cast<char *> then, to be consistent with the style guide:

https://google.github.io/styleguide/cppguide.html#Casting

auto& ibo = subMeshes[iMesh].ibo;
ibo.resize(numFaces * 4);
uint64_t idx = 0;
for (size_t jFace = 0; jFace < numFaces; ++jFace) {
Contributor

Couldn't this be done in the loop above? In other words, could we collapse these two loops?

Contributor Author
@bigbike bigbike Sep 27, 2019

Certainly you can. But it would reduce code readability, and thus make future maintenance harder, in my humble opinion. Unless the jobs are extremely simple and trivial, I prefer to do "one job" per for loop.

Here, as the comments suggest, the steps are quite clear by having individual loops:

  • compute the two lookup tables, localToGlobal, globalToLocal;
  • compute ibo based on globalToLocal;
  • compute vbo, nbo based on localToGlobal (here, you see, the jobs to compute vbo and nbo are so trivial that I put them in the same for loop);

Contributor
@msbaines msbaines Sep 28, 2019

Agree. Readability is important unless the code is performance critical. A single loop would be more cache efficient, and we are processing a lot of data. If this code path is taking noticeable time, you may want to consider a single loop. Maybe time this loop on a large mesh?

Contributor Author

Agree. Readability is important unless the code is performance critical. A single loop would be more cache efficient, and we are processing a lot of data. If this code path is taking noticeable time, you may want to consider a single loop. Maybe time this loop on a large mesh?

This function is called only once, during initialization, and its time complexity is O(n). The largest model contains roughly 9M triangles, so it is totally fine.

atlasFolder_, "../habitat/sorted_faces.bin");
submeshes_ = loadSubMeshes(originalMesh, subMeshesFilename);

// TODO:
Contributor

Is this comment still relevant?

Contributor Author

Yes, this TODO is still relevant. It reminds us that the code introduced by this PR is a temporary solution.
The ideal solution is to call splitMesh(originalMesh, splitSize_) and do the computation on the fly, but that relies on the FRL lab, who need to fix the Replica dataset first.
When that day comes, one can just revert this PR.

@bigbike bigbike merged commit 7ddc17d into master Oct 1, 2019
@bigbike bigbike deleted the replica_faces branch October 1, 2019 00:22
@bigbike bigbike added this to the Replica PTex Rendering Support milestone Oct 7, 2019
Ram81 pushed a commit to Ram81/habitat-web-sim that referenced this pull request Dec 10, 2020
…ad them to simulator (facebookresearch#221)

* Support ptex mesh - step 6: dump sub-meshes directly from ReplicaSDK, and load them in the simulator

* only save and load the minimal amount of data

* minor

* minor

* fix segfault

* fix a bug.

A vertex in the original mesh can be used in multiple sub-meshes.

* minor

* minor

* minor

* minor

* minor

* reinterpret_cast

* temporarily disable the ptex mesh rendering on mac. (fix it in coming PR)