-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvement / dataset metadata #1382
Improvement / dataset metadata #1382
Conversation
Comment: you might wonder why |
Codecov Report
@@ Coverage Diff @@
## master #1382 +/- ##
==========================================
+ Coverage 73.29% 73.38% +0.09%
==========================================
Files 79 79
Lines 9256 9282 +26
==========================================
+ Hits 6784 6812 +28
+ Misses 2472 2470 -2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! (hope this much testing is enough ;) )
""" | ||
Get all metadata associated with the specified run | ||
""" | ||
# TODO: promote snapshot to be present at creation time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this TODO mean upgrading the schema to have snapshot column in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly. I think it's just a blunder that it is not already there.
SELECT "{tag}" | ||
FROM runs | ||
WHERE run_id = ? | ||
AND "{tag}" IS NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was trying to find when this extra condition would give problems, but couldn't.
@@ -342,6 +346,10 @@ def run_timestamp_raw(self) -> float: | |||
def description(self) -> RunDescriber: | |||
return self._description | |||
|
|||
@property | |||
def metadata(self) -> Dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should metadata be added to the persistent traits list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the persistent traits do not exist yet, only on another feature branch.
@@ -526,3 +526,23 @@ def test_get_description(some_paramspecs): | |||
loaded_ds = DataSet(run_id=1) | |||
|
|||
assert loaded_ds.description == desc | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure that i understood your comment about snapshot. so you've made it both: non metadata AND not RUNS_TABLE_COLUMNS, right? and both should be fixed with upgrading the schema when snapshot is a necessary column of the runs table, right? if so, i agree. We should also do smth with metadata later so that arbitrary columns do not get created for metadata tags.
SELECT "{tag}" | ||
FROM runs | ||
WHERE run_id = ? | ||
AND "{tag}" IS NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will DataSet.add_metadata({'mytag': None})
produce a new column mytag
with a NULL
value in it? if yes, then either such an add_metadata
call should not be allowed, or "{tag}" IS NOT NULL
part shall be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right. Since a single run with metadata tag
creates the column and thereby implicitly NULL-populates it for all other runs, I think the sensible solution is to prohibit None
as a valid metadata value. And if we do so, that prohibition should be enforced loud and clear.
There is currently no way to query for the metadata associated with a particular dataset. One can only look up metadata by tag, which requires some auxiliary knowledge about the metadata.
Changes proposed in this pull request:
metadata
property to aDataSet
object to see all associated metadata.This little improvement is needed for #1352 in order to ensure correct copying of metadata.
@astafan8