[Feature] pretty prints of objects #1439
Comments
I would like to take up this issue, and I have a few additional ideas for it, such as a schema view for the dataset. The idea is to display the dtype, htype, and sample_compression under each tensor's name in the schema, to give greater insight into the schema structure. Example of the above idea: if ds.schema(verbose=False) then
if ds.schema(verbose=True) then
|
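A minimal sketch of the schema view proposed above, not the code that was eventually merged. `TensorSpec` and `format_schema` are hypothetical names, and the tensors are plain dataclass stand-ins rather than real Hub tensors; only the `verbose` flag and the three displayed fields come from the comment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TensorSpec:
    """Hypothetical stand-in for the metadata a tensor exposes."""
    name: str
    htype: str
    dtype: str
    sample_compression: Optional[str] = None


def format_schema(tensors: List[TensorSpec], verbose: bool = False) -> str:
    """Return tensor names only, or names plus their metadata when verbose."""
    lines = []
    for t in tensors:
        lines.append(t.name)
        if verbose:
            # Indent the details so they read as belonging to the tensor name.
            lines.append(f"    htype: {t.htype}")
            lines.append(f"    dtype: {t.dtype}")
            lines.append(f"    sample_compression: {t.sample_compression}")
    return "\n".join(lines)


specs = [
    TensorSpec("images", "image", "uint8", "jpeg"),
    TensorSpec("labels", "class_label", "uint32"),
]
print(format_schema(specs, verbose=True))
```

With `verbose=False` the same call lists only the tensor names, one per line.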
@SaiNikhileshReddy looks great! Let's get started! |
Hello!! I see that this issue has already been assigned. Can I still try to work on it? I am new to open source, this issue seems like a nice place to start!! |
@neel2299 I don't have any problem with it. I'm partially done. Currently, I'm trying to integrate my solution into the hub. |
Great! I will start right away :) |
Which branch should I make the PR for? |
Hi @neel2299 ! Feel free to make the PR for the main branch. Thanks for your interest in Hub. |
@SaiNikhileshReddy @neel2299 Any updates on this issue? Do tell me if you need any help! |
I will be pushing the code for pretty prints in 2-3 days. I was working on dataset notebooks in the hub/examples repo. |
@FayazRahman I got an error running the hub source files (not modified). I executed this command: Should this be ignored? |
@FayazRahman Thank you for lending a helping hand! Much needed... Until now I was running only the tests in core/tests. Thanks to @SaiNikhileshReddy's post, I noticed that we have other tests. Can you please give my PR a look and tell me whether I am heading in the right direction? I will be changing the code a bit to handle some edge cases the tests point to. My main concern is whether the code in the str method is getting too cluttered. |
@SaiNikhileshReddy I googled and found that the convention is to add the comment "# type: ignore" when the check is not mandatory. It's also written in our testing scripts. I think it would be worth checking where the error happened and whether "# type: ignore" is mentioned there. Someone more experienced could answer best, though. |
Thanks @neel2299 for sharing this. @FayazRahman has mentioned that it is a known issue and can be ignored. |
@davidbuniat @FayazRahman @mikayelh Any feedback on the outputs below will help resolve the issue quickly. |
The formatting is really nice!! |
wow, good job @SaiNikhileshReddy , pretty exciting! |
@davidbuniat @mikayelh @FayazRahman |
Can we replace the word "verbose" with "detailed"? "verbose" is somewhat advanced vocabulary and may not be clear to speakers of English as a second language. :) |
@mikayelh "verbose" is mostly used for detail-level logs, so I believe it is fine here, even though that context is debug logs. @SaiNikhileshReddy Other than that it looks great. For dynamic shapes, I believe the fixed dimensions and the number of dimensions are important, so showing None for the varying dimensions would be better than showing "Dynamic" in verbose mode. Also, instead of having a separate API, it would be great to embed this into the ds object. |
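The suggestion to show None for varying dimensions can be sketched as follows. This is an illustrative helper under stated assumptions, not Hub's implementation: `summarize_shape` is a hypothetical name, and it assumes the per-dimension minimum and maximum sizes across samples are available as two sequences of equal length.

```python
from typing import Optional, Sequence, Tuple


def summarize_shape(
    num_samples: int,
    min_shape: Sequence[int],
    max_shape: Sequence[int],
) -> Tuple[Optional[int], ...]:
    """Keep dimensions that are fixed across samples; mark varying ones as None."""
    if len(min_shape) != len(max_shape):
        raise ValueError("min_shape and max_shape must have the same rank")
    # A dimension is fixed exactly when its minimum equals its maximum.
    dims = tuple(lo if lo == hi else None for lo, hi in zip(min_shape, max_shape))
    return (num_samples,) + dims


# 100 images of varying height and width, always 3 channels.
print(summarize_shape(100, [200, 300, 3], [512, 640, 3]))  # (100, None, None, 3)
```

Unlike a bare "Dynamic" label, this keeps both the number of dimensions and the fixed channel count visible.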
@davidbuniat Can you suggest how to modify the shape values when they differ across the images?

    def tensor_info(tensor, full_shape=False):
        # Collect htype, dtype, compression and shape for one tensor.
        htype = tensor.htype
        dtype = tensor.dtype
        shape = tensor.shape

        # Prefer sample compression; fall back to chunk compression.
        # The result stays None when neither is set.
        sample_compression = tensor.meta.sample_compression
        chunk_compression = tensor.meta.chunk_compression
        if sample_compression is not None:
            compression = sample_compression
        else:
            compression = chunk_compression

        if full_shape:
            # Number of samples plus the per-dimension min and max shapes.
            shape = (len(tensor), tensor.meta.min_shape, tensor.meta.max_shape)
        else:
            shape = 'Dynamic'

        return [htype, dtype, compression, str(shape)]
|
@davidbuniat. Table and Schema use Does hub support nested groups? |
Regarding shapes you have two options easy, just show Sounds great! btw much better to have discussion on a specific PR rather than issues here. Yes, seems hub supports nested groups. |
Sure @davidbuniat. I'll integrate the code in hub and raise these doubts in that PR. Thanks for clarifying and sharing feedback! |
Closed by #1543 |
🚨🚨 Feature Request
If your feature will improve
HUB
To explore the structure of a dataset, it is convenient to have nicer and more informative prints of dataset objects and samples.
Description of the possible solution
1) show ds
now
Something along these lines would work (taken from SQLite)
and in jupyter notebook shown as a table similar to pandas
2) show ds.tensor
now
at least provide full information about tensor
or to make consistent with 1)
3) show ds[0:5] sample
and in jupyter notebook visualize images (and other htypes)
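One way to get both a plain-text print and a pandas-like table in Jupyter is to pair `__str__` with `_repr_html_`, the method IPython looks for when rendering rich output. The sketch below is only an illustration of that pattern: `DatasetSummary` and its row layout are hypothetical and are not part of Hub.

```python
from html import escape
from typing import List, Tuple

Row = Tuple[str, str, str, str]  # (tensor name, htype, dtype, shape)


class DatasetSummary:
    """Hypothetical wrapper that renders one summary as text or as HTML."""

    HEADER: Row = ("tensor", "htype", "dtype", "shape")

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def __str__(self) -> str:
        # Pad each column to its widest cell so the columns line up.
        table = [self.HEADER] + self.rows
        widths = [max(len(r[i]) for r in table) for i in range(len(self.HEADER))]
        lines = [
            "  ".join(cell.ljust(w) for cell, w in zip(r, widths)).rstrip()
            for r in table
        ]
        lines.insert(1, "  ".join("-" * w for w in widths))
        return "\n".join(lines)

    def _repr_html_(self) -> str:
        # Jupyter calls this hook, when present, to show the object as HTML.
        head = "".join(f"<th>{escape(c)}</th>" for c in self.HEADER)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{head}</tr>{body}</table>"


summary = DatasetSummary([
    ("images", "image", "uint8", "(100, None, None, 3)"),
    ("labels", "class_label", "uint32", "(100, 1)"),
])
print(summary)
```

In a terminal, `print(summary)` gives the aligned text table; in a notebook, evaluating `summary` as the last expression of a cell renders the HTML table instead.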
Notes