Merge populate_metadata PRs (rebased onto develop) #5241

atarkowska · 2017-04-10T10:38:40Z

This is the same as gh-5232 but rebased onto develop and gh-5243

What this PR does

merge #5218 #5220 #5222 #5223 #5224 #5226 #5227 #5229 (no conflicts)

TODO:

last commit from Plugin fixes #4988
Delete and create bulk-map-annotations by namespace #5074
Print more info if a duplicate map-ann is detected in the database #5085 (was missing label)

atarkowska · 2017-04-10T10:38:41Z

--rebased-from #5232

Most methods for dataset loading and parsing were left unimplemented. Now a `Dataset:`-style object can be passed to populate_metadata.py and images will be looked up by name. Note: there's a small bug with name lookup that will be corrected separately.

The assumptions for well/imaging naming in a plate or screen differ from those for image naming in a dataset, since there's no unique way to reference an image in a dataset the way well "A1" references a well in a plate. This commit loosens some of those rules to allow image columns and image name columns to work together in the case of datasets. The assumption is that, at population time, the ID of an image in a dataset won't be known. Instead, image names will be used as the unique identifier. Currently only a warning is issued if a name is not unique.
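The name-based lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code in populate_metadata.py: `index_images_by_name` and its `(id, name)` input are hypothetical, and keeping the first ID on a duplicate is an assumption made only so the sketch has a defined result.

```python
import logging
from collections import defaultdict

log = logging.getLogger("populate_sketch")


def index_images_by_name(images):
    """Map image name -> image ID for an iterable of (id, name) pairs.

    Hypothetical helper: on a duplicate name it only logs a warning and
    keeps the first ID seen (an assumption for this sketch).
    """
    ids_by_name = defaultdict(list)
    for image_id, name in images:
        ids_by_name[name].append(image_id)

    resolved = {}
    for name, ids in ids_by_name.items():
        if len(ids) > 1:
            # Mirrors the "warning only" behaviour described in the commit.
            log.warning("Image name %r is not unique: IDs %s", name, ids)
        resolved[name] = ids[0]
    return resolved
```

A later commit in this PR ("disallow image name conflicts") suggests the warning was subsequently tightened; raising an exception in the duplicate branch would be the corresponding change in this sketch.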

In general, populate_metadata.py looks to be in line for a refactoring. The number of if-clauses as well as the unhandled cases (like no catch-all for unknown targets in delete) is making this ever harder to work with. All tests passing.

In order to allow Projects to smartly handle multiple images with the same name (though not in the same dataset), the internals of ValueResolver have been hidden within a ValueWrapper class. ValueResolver chooses once which ValueWrapper to use internally, after which the various if/then blocks based on target object are no longer necessary (needs further refactoring). There *are* still if/then blocks based on column type. These could use some cleaning but will likely remain necessary for multiple-dispatch style handling.
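The "choose a wrapper once" idea can be illustrated with a small sketch. The class names below (`Resolver`, `DatasetWrapper`, `ProjectWrapper`) and the "Dataset Name" column are hypothetical stand-ins, not the actual ValueResolver/ValueWrapper code; the point is only that the target type is inspected a single time, in the constructor.

```python
class DatasetWrapper(object):
    """Hypothetical wrapper: image names are unique within one dataset."""

    def __init__(self, images_by_name):
        self.images_by_name = images_by_name

    def resolve_image(self, row):
        return self.images_by_name[row["Image Name"]]


class ProjectWrapper(object):
    """Hypothetical wrapper: names may repeat across datasets, so the
    dataset name becomes part of the lookup key."""

    def __init__(self, images_by_dataset_and_name):
        self.images = images_by_dataset_and_name

    def resolve_image(self, row):
        return self.images[(row["Dataset Name"], row["Image Name"])]


class Resolver(object):
    """Picks a wrapper once; later calls need no per-target if/else."""

    WRAPPERS = {"Dataset": DatasetWrapper, "Project": ProjectWrapper}

    def __init__(self, target_type, lookup):
        try:
            wrapper_class = self.WRAPPERS[target_type]
        except KeyError:
            # Catch-all for unknown targets.
            raise ValueError("Unsupported target: %s" % target_type)
        self.wrapper = wrapper_class(lookup)

    def resolve_image(self, row):
        return self.wrapper.resolve_image(row)
```

With this shape, supporting another target means adding one wrapper class and one dictionary entry rather than another branch in every method.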

For extremely large screens (idr0016), both adding map annotations as well as deleting them lead to either PG errors or Ice.MessageSizeMax exceptions. Now both are done in batches of 1000.
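A minimal sketch of the batching approach, assuming the objects to save or delete are available as a sequence. `batches` and `save_in_batches` are hypothetical helpers, and `save_batch` stands in for whatever server call actually persists one chunk; only the batch size of 1000 comes from the description above.

```python
def batches(items, batch_size=1000):
    """Yield consecutive slices of at most batch_size items."""
    items = list(items)
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def save_in_batches(objects, save_batch, batch_size=1000):
    """Call save_batch once per chunk and return how many objects were sent.

    save_batch is a placeholder for the real save or delete call; keeping
    each call small is what avoids oversized messages in this sketch.
    """
    sent = 0
    for batch in batches(objects, batch_size):
        save_batch(batch)
        sent += len(batch)
    return sent
```

For example, 2500 objects would be sent as three calls of 1000, 1000 and 500 items.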

will-moore · 2017-04-17T08:59:51Z

components/tools/OmeroPy/test/integration/metadata/test_populate.py

+    def createCsv(self, *args, **kwargs):
+        csvFileName = super(GZIP, self).createCsv(*args, **kwargs)
+        gzipFileName = "%s.gz" % csvFileName
+        with open(csvFileName, 'rb') as f_in, \


This is causing a syntax error at https://ci.openmicroscopy.org/view/Failing/job/OMERO-DEV-merge-integration-python/549/testReport/(root)/(empty)/OmeroPy_test_integration_metadata_test_populate/

@aleksandra-tarkowska Maybe try nesting them?

with open(csvFileName, 'rb') as f_in:
    with gzip.open(gzipFileName, 'wb') as f_out:

will-moore · 2017-04-18T08:50:32Z

Quite a few test failures with yaml issues at https://ci.openmicroscopy.org/view/Failing/job/OMERO-DEV-merge-integration-Python27/523/testReport/

atarkowska · 2017-04-21T13:14:33Z

PASS https://ci.openmicroscopy.org/job/OMERO-DEV-merge-integration-Python27/528/testReport/

FAILED: https://ci.openmicroscopy.org/job/OMERO-DEV-merge-integration-python/554/testReport/

test/integration/metadata/test_populate.py:991: in <module>
    class TestPopulateMetadata(TestPopulateMetadataHelper):
test/integration/metadata/test_populate.py:1000: in TestPopulateMetadata
    GZIP(),
test/integration/metadata/test_populate.py:648: in __init__
    colNames="Image Name,Type,Concentration",
test/integration/metadata/test_populate.py:765: in createCsv
    with gzip.open(gzipFileName, 'wb') as f_out:
E   AttributeError: GzipFile instance has no attribute '__exit__'

that test should be excluded
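For context, the traceback is consistent with `gzip.GzipFile` only gaining context-manager support in Python 2.7. The PR chose to exclude the test instead, but a version that also runs on Python 2.6 could look like the sketch below, which wraps the gzip file in `contextlib.closing`. The function name `gzip_file` is hypothetical.

```python
import contextlib
import gzip
import shutil


def gzip_file(csv_file_name):
    """Compress csv_file_name to csv_file_name + '.gz' and return that path."""
    gzip_file_name = "%s.gz" % csv_file_name
    # Nested with-statements: a single `with a, b:` is a syntax error on 2.6.
    with open(csv_file_name, "rb") as f_in:
        # closing() calls f_out.close() on exit, so the object does not
        # need to define __enter__/__exit__ itself.
        with contextlib.closing(gzip.open(gzip_file_name, "wb")) as f_out:
            shutil.copyfileobj(f_in, f_out)
    return gzip_file_name
```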

atarkowska · 2017-04-21T13:20:13Z

components/tools/OmeroPy/test/integration/metadata/test_populate.py

+
+@mark.skipif(sys.version_info < (2, 7),
+             reason="requires python2.7")
+class TestPopulateMetadata(TestPopulateMetadataHelper):


Why is that not skipped?

Does it detect the error by static analysis without having to even run the test?

My reading of https://docs.pytest.org/en/latest/skipping.html#id1 is that the function (or class in this case) is initialized first, and the skip condition is only tested when the tests are being set up. Here, because there is a Python 2.7-specific bit in the static class variables, the class initialization fails before it has a chance to execute the skipping logic.

joshmoore · 2017-04-24T09:02:17Z

re: skipping tests, @aleksandra-tarkowska, I would assume what's happening is:

createCsv is used in __init__ of Fixture instances (link)
and Fixture subclasses are created statically (link)

i.e. your check doesn't have a chance to run.
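The ordering can be reproduced without pytest. In the sketch below, `skip_marker` is a hypothetical stand-in for a class decorator such as a skip marker, and `Fixture` simulates a fixture whose constructor fails. Python runs the class body before applying the decorator, so the failure happens first.

```python
events = []


def skip_marker(cls):
    # Stand-in for a class decorator; it is applied only after the class
    # body has been executed and the class object exists.
    events.append("decorator applied")
    return cls


class Fixture(object):
    def __init__(self):
        events.append("fixture created")
        # Simulates the Python 2.6 failure inside createCsv.
        raise AttributeError("simulated failure in createCsv")


try:
    @skip_marker
    class TestExample(object):
        # Evaluated while the class body executes, i.e. at import time.
        fixtures = [Fixture()]
except AttributeError:
    events.append("class definition failed")
```

After this runs, `events` contains "fixture created" and "class definition failed" but never "decorator applied". Deferring fixture creation until the test actually runs would let the skip condition take effect first.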

joshmoore · 2017-05-09T13:39:37Z

Thanks, @aleksandra-tarkowska! Great to have this making it back to the mainline. Merging for 5.3.2 after discussion with @jburel.

This was referenced Apr 10, 2017

Merge populate_metadata PRs #5232

Merged

Post merge #5243

Merged

jburel added the develop label Apr 10, 2017

emilroz and others added 26 commits April 12, 2017 09:21

Add columns flag to the parser

a6e536e

Add parse columns and expand supported types

71f6468

Flake8

ffebddc

Add column type support to HeaderResolver

c1798dd

Add column type support to ValueResolver

325f641

Add column type support to ParsingContext

2114ea8

Return long not int

dd7f54c

Set the StringColumn size

c631b75

Check that number of columns and column types equal

d2d3868

Check if rows[0] is column types, HeaderResolver

5182e8b

Check if rows[0] is column types, ParsingContext

da298c2

Raise when number of columns != number of types

cea2b3d

populate_metadata: Fix map and delete contexts

dfa1b15

In general, populate_metadata.py looks to be in line for a refactoring. The number of if-clauses as well as the unhandled cases (like no catch-all for unknown targets in delete) is making this ever harder to work with. All tests passing.

populate_metadata: adding passing screen test

8da0f5e

populate_metadata: refactor test in preparation for projects

e123e79

Tables: Add DatasetColumn (new API)

fb4fc17

populate_metadata: add support for ProjectColumn

4ed75f5

populate_metadata: disallow image name conflicts

f70c0ae

fix flake8

5001f38

populate_metadata.py: add batches to write_to_omero

d0f1770

For extremely large screens (idr0016), both adding map annotations as well as deleting them lead to either PG errors or Ice.MessageSizeMax exceptions. Now both are done in batches of 1000.

populate_metadata.py: make batch_size configurable

5d4d4bf

test_populate.py: delete annotations for second run

005d652

populate_metadata.py: fix flake8 errors

6431fbe

manics and others added 14 commits April 12, 2017 10:01

file id needs to be prefix with OriginalFile:

9d1647f

get_config requires a session to load a remote file

5cce5a9

Handle a null mimetype in OriginalFile

28e92d9

Pass session in metadata plugin when loading a config

2622bd1

Fix logging of deleted objects in DeleteMapAnnotationContext

fb05f6c

Only delete attached files if included in filtered namespaces

1da4a4a

Add simple test of attached config deletion

a4dcf19

post_merge: fix import (ITest moved)

afffb18

post_merge: rename doSubmit to do_submit

8137902

rename importPlates to import_plates

004991c

rename format to mimetype in make_file_annotation

4f91bb8

Print more info about duplicate map-anns in db

c69e622

fix renamed importMIF to import_fake_file and params

90f928f

populate metadata tests py27 only

9a90bd1

will-moore reviewed Apr 17, 2017

atarkowska added 2 commits April 19, 2017 19:14

fix renamed plate_rows and plate_cols

94e17b1

fix make_file_annotation (from metadata52 branch)

fcce33e

atarkowska mentioned this pull request Apr 20, 2017

test omero53 in travis ome/omero-mapr#19

Closed

py26 fix

77a02e0

atarkowska commented Apr 21, 2017

improve skipping marker

ac0d312

fix py26 failing test

07e9010

joshmoore merged commit dfa9f72 into ome:develop May 9, 2017

joshmoore deleted the rebased/develop/merge_populate branch May 9, 2017 13:39

jburel added this to the 5.3.2 milestone May 16, 2017

joshmoore mentioned this pull request Jul 7, 2017

test omero53 in travis ome/omero-mapr#23

Merged
