some missing hdf5 files #7

Closed
hiroyasuakada opened this issue Dec 25, 2020 · 12 comments

Comments

@hiroyasuakada

Thank you for your great work!

I downloaded the dataset but found that there seem to be some missing hdf5 files besides the ones excluded manually beforehand.

For example, when I downloaded "https://docs-assets.developer.apple.com/ml-research/datasets/hypersim/v1/scenes/ai_006_001.zip" with dataset_download_images.py, I was able to get frame.0053.tonemap.jpg in the ai_006_001/images/scene_cam_00_final_preview directory.

However, the corresponding files (frame.0053.depth_meters.hdf5, frame.0053.semantic.hdf5, etc.) were not included in the ai_006_001/images/scene_cam_00_geometry_hdf5 directory.

Did I overlook anything?

Thank you in advance.

@mikeroberts3000
Collaborator

That's creepy. I'm looking into this issue now.

@mikeroberts3000
Collaborator

mikeroberts3000 commented Dec 25, 2020

This is indeed a bug. Thank you for pointing out this issue 😀

Using the --list feature in @99991's very handy alternative download script, I confirmed that ai_006_001.zip is missing the files you mentioned. I thought it was strange that the public zip file is missing all the HDF5, JPG, and PNG files corresponding to {cam_00, frame.0053, geometry pass}, as opposed to missing random files from different frames. What is special about this frame specifically?
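A check like the one described above can be approximated on an already-downloaded archive. The following is a minimal sketch using Python's standard `zipfile` module; it is not the alternative download script or its `--list` feature, and the helper name `members_for_frame` is hypothetical.

```python
import zipfile


def members_for_frame(zip_path, frame_id):
    """Return the sorted archive member names whose file name starts with
    'frame.<frame_id>.', e.g. frame.0053.tonemap.jpg for frame_id '0053'."""
    prefix = f"frame.{frame_id}."
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(
            name
            for name in zf.namelist()
            # Compare against the base name only, ignoring the directory part.
            if name.rsplit("/", 1)[-1].startswith(prefix)
        )


if __name__ == "__main__":
    # Assumes ai_006_001.zip has been downloaded to the working directory.
    for name in members_for_frame("ai_006_001.zip", "0053"):
        print(name)
```

If the printed paths include scene_cam_00_final_preview but nothing under scene_cam_00_geometry_hdf5, the geometry files for that frame are absent from the archive itself rather than lost during extraction.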

Then I realized that these missing HDF5 files aren't even on the portable hard drive where I originally ran our data generation pipeline. I also used this portable hard drive when I was uploading the public zip files, which explains why the HDF5 files are missing from the public zip file. Did these files fail to download from our cloud rendering system?

Then I realized that I have a private cloud backup of the missing files, at least for {ai_006_001, cam_00, frame.0053, geometry pass}. This means that these files downloaded successfully from our cloud rendering system, and were once on my portable hard drive.

In conclusion, I think I must have accidentally deleted these files off my portable hard drive while browsing through the data at some point 😅

Since you're set up to look into this issue anyway, can you please make an exhaustive list of all the files that you think are missing from the public zip files and post it here? I need to re-upload the public zip files anyway to include bounding boxes for all the semantic instances. So I'll make sure to include the missing image data when I re-upload.

(This plan assumes that I have the missing data backed up somewhere. If I don't have it backed up, then it probably failed to download from our cloud rendering system, and we'll have to live with a few fewer images in the dataset.)

@hiroyasuakada
Author

Thank you for your reply!

I’ll make the list of missing files in a few days!

@mikeroberts3000
Collaborator

mikeroberts3000 commented Dec 25, 2020

I wrote a script to find these occasional missing files. I'll upload them in an upcoming data release. In the meantime, I'll leave this issue open. Thank you 😀
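For readers who want to run a similar check on their own extracted copy, here is a minimal sketch. It is not the script mentioned above; it only assumes the directory layout quoted earlier in this thread (`<scene>/images/scene_cam_00_final_preview` and `<scene>/images/scene_cam_00_geometry_hdf5`), and the function names are hypothetical.

```python
import os
import re

# Matches names such as frame.0053.tonemap.jpg and captures the frame ID.
FRAME_RE = re.compile(r"^frame\.(\d+)\.")


def frame_ids(directory):
    """Return the set of frame IDs found in a directory (empty if it does not exist)."""
    if not os.path.isdir(directory):
        return set()
    ids = set()
    for name in os.listdir(directory):
        match = FRAME_RE.match(name)
        if match:
            ids.add(match.group(1))
    return ids


def find_missing_frames(scene_dir, camera="cam_00"):
    """Return frame IDs that have a final preview image but no geometry HDF5 files."""
    images = os.path.join(scene_dir, "images")
    preview = frame_ids(os.path.join(images, f"scene_{camera}_final_preview"))
    geometry = frame_ids(os.path.join(images, f"scene_{camera}_geometry_hdf5"))
    return sorted(preview - geometry)


if __name__ == "__main__":
    # Assumes the scene was extracted into the working directory.
    print(find_missing_frames("ai_006_001"))
```

This only compares frame IDs between two directories. It does not verify that every expected file type (depth_meters, semantic, and so on) is present for each frame.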

@mikeroberts3000
Collaborator

mikeroberts3000 commented Dec 29, 2020

This issue is now fixed. ai_003_010.zip and ai_006_001.zip were the only zip files that I found that were missing any files. Thank you again 😀

@hiroyasuakada
Author

Sorry for my late reply, but I found the same files you mentioned! Thank you!

@OrangeSodahub

@mikeroberts3000 Hi, I have a similar problem, though it may not exactly be an issue. I found that frame.0036 is missing from scene ai_001_001. Please take a look:
image
I'm curious whether this is expected and, if so, why.

@mikeroberts3000
Collaborator

@OrangeSodahub, this is the third time in a row that you have posted a question on a topic that is covered in our documentation. I have already mentioned the following suggestion in your other threads, but I will repeat it here. Please slow down, read our documentation carefully, then examine our data, then look at existing issues, then post on GitHub if you are still experiencing an issue. You are clearly proceeding in the opposite order (examining our data, then posting on GitHub, then skimming the documentation).

From our README:

Note also that we manually excluded images containing people and prominent logos from our public release, and therefore our public release contains 74,619 images, rather than 77,400 images. We list all the images we manually excluded in ml-hypersim/evermotion_dataset/analysis/metadata_images.csv.
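To check whether a specific frame was manually excluded, the CSV can be queried directly. The sketch below is a hedged example: the column names `scene_name`, `camera_name`, `frame_id`, and `included_in_public_release`, and the assumption that included rows carry the value `True`, are guesses about the file's layout and should be checked against its header row. The function name `excluded_frames` is hypothetical.

```python
import csv


def excluded_frames(csv_path, scene_name, camera_name,
                    scene_col="scene_name",
                    camera_col="camera_name",
                    frame_col="frame_id",
                    included_col="included_in_public_release"):
    """Return the sorted frame IDs of one camera in one scene that are not
    marked as included in the public release.

    The column names are assumptions; pass different ones if the CSV header differs.
    """
    excluded = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row[scene_col] != scene_name or row[camera_col] != camera_name:
                continue
            # Assumption: included rows hold the string "True" (case-insensitive).
            if row[included_col].strip().lower() != "true":
                excluded.append(int(row[frame_col]))
    return sorted(excluded)


if __name__ == "__main__":
    path = "ml-hypersim/evermotion_dataset/analysis/metadata_images.csv"
    print(excluded_frames(path, "ai_001_001", "cam_00"))
```

If a frame that is absent from a downloaded scene appears in this list, its absence is a deliberate exclusion rather than a packaging error.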

@OrangeSodahub

Sorry, I did browse everything I could find; however there is really too much information around the docs and a large amount of code, and each time I run into a problem, I may not immediately recall the corresponding answers and where they are located.

@mikeroberts3000
Collaborator

mikeroberts3000 commented Mar 6, 2024

however there is really too much information around the docs

@OrangeSodahub You're criticizing our dataset for being too thoroughly documented, and claiming that you can't possibly read through our documentation because there is so much of it. This logic is exactly backwards. We have invested in our documentation precisely because we want to help people as much as possible without needing to address the same questions over and over again. The fact that our dataset is so thoroughly documented should decrease, not increase, your need to post issues.

@OrangeSodahub

OrangeSodahub commented Mar 8, 2024

Sorry, that's not a criticism at all, and I definitely agree that you did better than most other open-source projects by providing so much documentation. And "you can't possibly read through our documentation because there is so much of it" is not what I meant. I mean that I did read all of it but cannot remember everything after such a short time.

@mikeroberts3000
Collaborator

Totally no worries, thank you for clarifying. Feel free to post any issues you find after reading through our docs and existing issues. If something doesn't make sense to you after doing all that, there is probably a bug in our code or there is something important missing from our docs, so we're happy to help. I hope that Hypersim can add value in your research 😄
