Update docs for nyu_depth_v2 dataset #5484
I think I need to create another PR on https://huggingface.co/datasets/huggingface/documentation-images/tree/main/datasets for hosting the images there?
Thanks a lot for working on this! Just a few nits.
The documentation is not available anymore as the PR was closed or merged.
Thanks for the update @awsaf49!
Thanks a lot for the updates!
Just some minor things remain and then we should be good to ship this 🚀
@sayakpaul I have updated the minor things. Please approve the workflows.
I think this PR is good to go.
This PR will fix the issue mentioned in #5461. Here is a brief overview.
Bug:
There is a discrepancy between the depth maps of the `nyu_depth_v2` dataset here and the actual depth maps. The depth values somehow got discretized/clipped, resulting in depth maps that differ from the actual ones. Here is a side-by-side comparison.
Fix:
When I first loaded the dataset from HF, I noticed it was 30GB, whereas the DenseDepth data is only 4GB with dtype=uint8. This means the data from fast-depth (before loading to HF) must have higher precision. Digging deeper, I loaded depth_map directly with `h5py` and found that it comes as `float32`. But when the data is processed in HF with `datasets.Image()`, it is converted directly from `float32` to `uint8`, hence the discretized depth map.
datasets/src/datasets/features/image.py
Lines 91 to 93 in c78559c
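To illustrate why a `float32` to `uint8` conversion discretizes a depth map, here is a minimal sketch. It uses a synthetic depth map (the 0 to 10 m range and the 480x640 shape are assumptions chosen for illustration, not values taken from the dataset), and a plain NumPy cast stands in for whatever conversion actually happens inside `datasets.Image()`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical depth map in metres, stored as float32 (synthetic data,
# not loaded from nyu_depth_v2).
depth = rng.uniform(0.0, 10.0, size=(480, 640)).astype(np.float32)

# Casting to uint8 drops the fractional part of every value.
depth_u8 = depth.astype(np.uint8)

n_float = np.unique(depth).size   # number of distinct float32 depth values
n_u8 = np.unique(depth_u8).size   # number of distinct values after the cast
max_err = float(np.abs(depth - depth_u8.astype(np.float32)).max())

print(f"distinct float32 values: {n_float}")
print(f"distinct uint8 values:   {n_u8}")
print(f"max absolute error (m):  {max_err:.3f}")
```

With depths in this range, only a handful of integer levels survive the cast, which would be consistent with the banded look of the discretized depth maps described above.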
cc: @sayakpaul @lhoestq