WTF - bring back and extend information on metadata extractors etc #7309
Conversation
This PR breaks the test case for #6712, which ensured that calling wtf on a non-existent section yields an 'impossible' result instead of a KeyError. While the PR ensures this doesn't happen from the command line by validating the section argument, it somewhat re-introduces the bug for the Python API. The test case:
In the command line, a non-existent section would now result in a constraint error:
But there is no parameter validation in Python, and calling `wtf(sections=['murkie'])` in a Python session still yields:

```python
In [3]: wtf(sections=['murkie'])
# WTF
Out[3]:
[{'action': 'wtf',
  'path': '/home/adina/repos/datalad',
  'type': 'dataset',
  'status': 'ok',
  'decor': None,
  'infos': {},
  'flavor': 'full'}]
```
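Until the Python API gets the same validation, a caller could guard against this themselves. A minimal sketch (the `checked_wtf` helper and the `KNOWN_SECTIONS` set are hypothetical and only illustrative; the real section names live in the command's `SECTION_CALLABLES`):

```python
import datalad.api as dl

# illustrative set of section names, not an authoritative list
KNOWN_SECTIONS = {'configuration', 'datalad', 'dependencies', 'environment',
                  'extensions', 'git-annex', 'location', 'metadata',
                  'python', 'system'}

def checked_wtf(sections=None, **kwargs):
    """Call wtf(), but reject unknown section names like the CLI constraint does."""
    if sections is not None:
        unknown = [s for s in sections if s not in KNOWN_SECTIONS]
        if unknown:
            raise ValueError(f"unknown wtf section(s): {unknown}")
    return dl.wtf(sections=sections, **kwargs)

# checked_wtf(sections=['murkie'])  -> raises ValueError instead of returning 'ok'
```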
I will also rebase -- I believe that should address the linters.
force-pushed from d6c8edb to fd19fdf
appveyor - known #7320
It was originally removed in 616fb1b as part of the "remove all metadata stuff" step, but really this does not need any metadata module (so no dependencies) and IMHO is well worth being in the WTF. Demand for such information is stated in datalad/datalad-metalad#340
The output of WTF is already lengthy. Adding lines which give no information does not help with consuming the WTF information.
force-pushed from aa92b72 to 3ac402e
Code Climate has analyzed commit 3ac402e and detected 1 issue on this pull request. Here's the issue category breakdown:
I'm a bit torn on this one. Overall, I think I'd prefer to aim for complete removal of datalad.metadata, and therefore reintroducing reporting on datalad.metadata.XXX seems a bit off and should probably be patched in by metalad instead.
OTOH, it kinda works and doesn't really hurt.
In any case, what's not entirely clear to me is what you consider TODO here, @yarikoptic. What's marked as TODO is checked (although the decision part only has one vote ;-) ), but the description does mention cross-dependencies. What's the state of those?
I would not mind. Here is an updated table which can be produced for the docs using:

python -c "import tabulate, numpy as np, pyperclip, pandas, datalad.api as dl; pyperclip.copy(pandas.DataFrame.from_dict(dl.wtf(sections=['metadata'], result_renderer='disabled')[0]['infos']['metadata.extractors']).T.replace(np.nan,'',regex=True).to_markdown())"
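For readability, the same one-liner could be expanded into a short script like the following (a sketch; it assumes datalad, numpy, pandas, tabulate and pyperclip are importable and that the `metadata.extractors` section is populated in your installation):

```python
import numpy as np
import pandas as pd
import pyperclip
import datalad.api as dl

# same logic as the one-liner: take the 'metadata.extractors' info from wtf,
# build a transposed DataFrame, blank out NaNs, and copy a markdown table
infos = dl.wtf(sections=['metadata'], result_renderer='disabled')[0]['infos']
df = pd.DataFrame.from_dict(infos['metadata.extractors']).T.replace(np.nan, '', regex=True)
pyperclip.copy(df.to_markdown())  # .to_markdown() requires the 'tabulate' package
```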
yeah, it seems we never set requirements on either duration or % needed to consider it a quorum ;-)
what do you mean by that? That we need to add dependencies on extensions? (there should be none, we pick up only the ones which are installed, if any)
on dependencies -- after/if we merge this I would be happy to add such a table to https://github.com/datalad/datalad-extensions#readme because we do know about all extensions there. It would also be a useful run/test to ensure that all extensions are co-installable and that no weird version limits (like the one recently discovered within pybids on sqlalchemy) forbid that -- since ATM each extension there is installed only in an individual CI run.
I mean this part from the PR description:
What's the state of that (arriving at a conclusion there) and what does it mean for this one here?
ah, good point -- filed datalad/datalad-metalad#370 so we do not forget, but I think it has no relation to this PR.
eh, I thought we had this merged already when I was trying to use metalad. Here is how the table looks now for metalad:
Since this is not a critical code path, largely returns previously present functionality, and has not received objections, I will proceed with merging it in a couple of days unless objections are raised.
PR released in |
Ref datalad/datalad-metalad#340.

In general I feel that the implementation for that issue is a good fit for `datalad wtf`, so I looked inside and discovered a function (`_describe_metadata_elements`) which already did what we want but was not used anywhere! I found that the use (but not the definition/body) of it was removed in 616fb1b, so as a first step I brought back its use. Then:

- extended the reporting to also cover `datalad.metadata.indexers` (coming not only from `metadata` but also from `extensions`)
- removed populating `load_error` with `None` if there was no error. IMHO it is a pointless waste of screen real-estate. As our formatter is generic, we probably do not want it to always skip printing every field with `None` as a value, so I decided to remove it here instead (see the sketch right after this list)
- `version` of extractor
- `generation` of extractor, which is proposed in "Add __generation__ to metadata extractors to be able to tell one from another" (datalad-metalad#351) -- so this PR is contingent on arriving at a conclusion there
- the describing of extractors overlaps between `metalad` and `datalad_deprecated` and ideally should be centralized, most likely within `metalad`, making `deprecated` depend on it (attn @christian-monch)
- renamed `section_subsection` to `section.subsection` so it matches better, and added functionality to request an entire `section`, e.g. `-S metadata` returns all subsections `metadata.*`
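As referenced in the `load_error` item above, here is a minimal sketch of that behaviour (a hypothetical helper, not the PR's actual code): the key only appears when loading actually failed, so no `'load_error': None` noise is emitted on success.

```python
def describe_element(name, loader):
    """Describe one metadata entry point; only record load_error on failure."""
    info = {'name': name}
    try:
        loader()
    except Exception as exc:
        # only now does the key appear in the report
        info['load_error'] = str(exc)
    return info
```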
Example of the output which you would get after

```shell
yq '.extensions| keys_unsorted | .[] | @sh' < extensions.yaml | sed -e "s,[\"'],,g" | xargs pip install
```

i.e. installing all datalad extensions registered in datalad-extensions:

```
❯ datalad wtf -S metadata
# WTF
## metadata.extractors
...
## metadata.filters
...
## metadata.indexers
...
```

or using

```python
pyperclip.copy(pandas.DataFrame.from_dict(dl.wtf(sections=['metadata'], result_renderer='disabled')[0]['infos']['metadata.extractors']).T.replace(np.nan, '', regex=True).to_markdown())
```

we get a markdown table of the extractors.

TODOs:
- extend `SECTION_CALLABLES` (not yet sure how to make it dynamic within `EnsureChoice`) from extensions and support for `section.subsection`, and move the actual code into `datalad-metalad` (one possible approach to the dynamic `EnsureChoice` is sketched below).
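One possible direction for the dynamic `EnsureChoice` mentioned in the TODO (an assumption for illustration, not what this PR implements): assemble the allowed values at runtime from the statically known sections plus anything installed extensions advertise, e.g. via a hypothetical `WTF_SECTIONS` convention, and feed the result into `EnsureChoice`.

```python
from importlib import import_module
from importlib.metadata import entry_points  # entry_points(group=...) needs Python >= 3.10

from datalad.support.constraints import EnsureChoice

# illustrative stand-in for the statically known SECTION_CALLABLES keys
CORE_SECTIONS = ('datalad', 'python', 'system', 'environment',
                 'metadata.extractors', 'metadata.filters', 'metadata.indexers')

def discover_sections():
    sections = set(CORE_SECTIONS)
    for ep in entry_points(group='datalad.extensions'):
        # hypothetical convention: an extension module could expose extra wtf
        # sections via a module-level WTF_SECTIONS tuple
        mod = import_module(ep.module)
        sections.update(getattr(mod, 'WTF_SECTIONS', ()))
    return sorted(sections)

# None stays allowed so that -S remains optional
sections_constraint = EnsureChoice(None, *discover_sections())
```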
label to this pull request in order to have a snippet generated from its title;or use
scriv create
locally and include the generated file in the pull request, see scriv).Thanks for contributing!