Improvement / dataset metadata #1382

WilliamHPNielsen · 2018-11-13T16:27:56Z

There is currently no way to query for the metadata associated with a particular dataset. One can only look up metadata by tag, which requires some auxiliary knowledge about the metadata.

Changes proposed in this pull request:

Add a metadata property to a DataSet object to see all associated metadata.

This little improvement is needed for #1352 in order to ensure correct copying of metadata.

@astafan8

WilliamHPNielsen · 2018-11-13T16:30:22Z

Comment: you might wonder why snapshot is missing from RUNS_TABLE_COLUMNS. See the test. I think I made the correct choice, but perhaps we should upgrade the schema later.
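For readers following along, here is a minimal sketch of the idea: treat every column of the runs table that is neither a pristine column nor snapshot as a metadata tag. It is not the code in this PR. The shortened `RUNS_TABLE_COLUMNS` list, the toy table layout, and the name `get_metadata_sketch` are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical, shortened stand-in for the real list of pristine columns.
RUNS_TABLE_COLUMNS = ["run_id", "name"]


def get_metadata_sketch(conn, run_id):
    """Return {tag: value} for every non-pristine, non-NULL column of a run."""
    # snapshot is not created together with the table in this toy setup,
    # so it has to be excluded explicitly.
    non_metadata = set(RUNS_TABLE_COLUMNS) | {"snapshot"}
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute("PRAGMA table_info(runs)")]
    metadata = {}
    for tag in columns:
        if tag in non_metadata:
            continue
        row = conn.execute(
            f'SELECT "{tag}" FROM runs WHERE run_id = ? AND "{tag}" IS NOT NULL',
            (run_id,),
        ).fetchone()
        if row is not None:
            metadata[tag] = row[0]
    return metadata


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO runs (name) VALUES ('first'), ('second')")
conn.execute('ALTER TABLE runs ADD COLUMN "snapshot" TEXT')
conn.execute('ALTER TABLE runs ADD COLUMN "operator" TEXT')
conn.execute("UPDATE runs SET snapshot = '{}', operator = 'alice' WHERE run_id = 1")

print(get_metadata_sketch(conn, 1))  # {'operator': 'alice'}
print(get_metadata_sketch(conn, 2))  # {}
```

In this sketch snapshot never shows up as metadata even though it is absent from the pristine list, which is the behaviour the comment above describes.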

codecov · 2018-11-13T16:43:15Z

Codecov Report

Merging #1382 into master will increase coverage by 0.09%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1382      +/-   ##
==========================================
+ Coverage   73.29%   73.38%   +0.09%     
==========================================
  Files          79       79              
  Lines        9256     9282      +26     
==========================================
+ Hits         6784     6812      +28     
+ Misses       2472     2470       -2

astafan8

Cool! (hope this much testing is enough ;) )

astafan8 · 2018-11-13T16:35:00Z

qcodes/dataset/sqlite_base.py

+    """
+    Get all metadata associated with the specified run
+    """
+    # TODO: promote snapshot to be present at creation time


Does this TODO mean upgrading the schema so that it has a snapshot column?

WilliamHPNielsen · 2018-11-14

Yes, exactly. I think it's just a blunder that it is not already there.

astafan8 · 2018-11-13T16:39:45Z

qcodes/dataset/sqlite_base.py

+                SELECT "{tag}"
+                FROM runs
+                WHERE run_id = ?
+                AND "{tag}" IS NOT NULL


I was trying to find a case where this extra condition would cause problems, but couldn't.

astafan8 · 2018-11-13T16:44:01Z

qcodes/dataset/data_set.py

@@ -342,6 +346,10 @@ def run_timestamp_raw(self) -> float:
    def description(self) -> RunDescriber:
        return self._description

+    @property
+    def metadata(self) -> Dict:


Should metadata be added to the persistent traits list?

WilliamHPNielsen · 2018-11-14

Yes, but the persistent traits do not exist yet; they live only on another feature branch.

astafan8 · 2018-11-13T16:49:42Z

qcodes/tests/dataset/test_dataset_basic.py

@@ -526,3 +526,23 @@ def test_get_description(some_paramspecs):
    loaded_ds = DataSet(run_id=1)

    assert loaded_ds.description == desc
+


I'm not sure I understood your comment about snapshot. So you've made it both not metadata AND not part of RUNS_TABLE_COLUMNS, right? And both should be fixed by upgrading the schema so that snapshot becomes a required column of the runs table, right? If so, I agree. We should also do something about metadata later so that arbitrary columns do not get created for metadata tags.

astafan8 · 2018-11-13T16:52:33Z

qcodes/dataset/sqlite_base.py

+                SELECT "{tag}"
+                FROM runs
+                WHERE run_id = ?
+                AND "{tag}" IS NOT NULL


Will DataSet.add_metadata({'mytag': None}) produce a new column mytag with a NULL value in it? If yes, then either such an add_metadata call should not be allowed, or the "{tag}" IS NOT NULL part should be removed.

WilliamHPNielsen · 2018-11-14

You are absolutely right. Since a single run with a metadata tag creates the column and thereby implicitly NULL-populates it for all other runs, I think the sensible solution is to prohibit None as a valid metadata value. And if we do so, that prohibition should be enforced loud and clear.
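To make the NULL-population argument concrete, here is a small sketch using plain sqlite3. It is not the validation added in this PR. The function name `add_metadata_sketch`, the one-column toy runs table, and the choice of `ValueError` are all assumptions for illustration.

```python
import sqlite3


def add_metadata_sketch(conn, run_id, tag, value):
    """Store one metadata value for a run, rejecting None loudly."""
    if value is None:
        raise ValueError(f"Metadata value for tag {tag!r} must not be None.")
    columns = [row[1] for row in conn.execute("PRAGMA table_info(runs)")]
    if tag not in columns:
        # Creating the column implicitly gives every existing run a NULL here.
        conn.execute(f'ALTER TABLE runs ADD COLUMN "{tag}"')
    conn.execute(f'UPDATE runs SET "{tag}" = ? WHERE run_id = ?', (value, run_id))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO runs DEFAULT VALUES")
conn.execute("INSERT INTO runs DEFAULT VALUES")

add_metadata_sketch(conn, 1, "mytag", "hello")
rows = conn.execute('SELECT run_id, "mytag" FROM runs ORDER BY run_id').fetchall()
print(rows)  # [(1, 'hello'), (2, None)]

try:
    add_metadata_sketch(conn, 2, "mytag", None)
except ValueError as err:
    print("rejected:", err)
```

Run 2 was never tagged, yet it carries a NULL in the mytag column. If None were an allowed value, a stored None and "never tagged" would be indistinguishable, so the IS NOT NULL filter is only safe when None is rejected on the way in.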

Merge: 0204c8c 4658bfc
Author: William H.P. Nielsen <whpn@mailbox.org>

Merge pull request #1382 from WilliamHPNielsen/improvement/dataset_metadata

WilliamHPNielsen added 2 commits November 13, 2018 16:49

Add list of pristine table columns

0cdec35

Add metadata property to dataset

8c378c8

WilliamHPNielsen added the new dataset label Nov 13, 2018

astafan8 approved these changes Nov 13, 2018

Add validation of metadata values

4658bfc

WilliamHPNielsen merged commit a8baf7a into microsoft:master Nov 14, 2018

WilliamHPNielsen deleted the improvement/dataset_metadata branch November 14, 2018 11:53

giulioungaretti pushed a commit that referenced this pull request Nov 14, 2018

Generated gh-pages for commit a8baf7a

795e743
