
Scottx611x/fix derived node attribute inheritance #2174

Merged
mccalluc merged 30 commits into release-1.6.0 from scottx611x/fix-derived-node-attributes on Sep 22, 2017

Conversation

@scottx611x (Member) commented Sep 20, 2017

  • Fix bug where Derived Data Nodes did not inherit their parent's Attributes
  • Constrain WorkflowFilesDL creation to the asterisked outputs of a Galaxy Workflow. (This is much easier than having an end user specify, by name, the outputs they want returned to Refinery: they simply check the asterisk next to the desired output files in their Workflow Editor. See the sketch after this list.)
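A rough sketch of the asterisked-outputs idea (the function and dict key names below are assumptions for illustration, not Refinery's or Galaxy's exact API): as the snippets later in this thread show, only history datasets whose creating-job output appears in a step's workflow_outputs are kept.

def get_exposed_datasets(history_datasets, workflow_steps):
    # Keep only datasets whose creating-job output was "asterisked" in the
    # Galaxy Workflow Editor, i.e. recorded under the step's workflow_outputs.
    exposed_datasets = []
    for dataset in history_datasets:
        step = workflow_steps[str(dataset["workflow_step"])]
        workflow_output_names = [
            output["output_name"] for output in step["workflow_outputs"]
        ]
        if dataset["creating_job_output_name"] in workflow_output_names:
            exposed_datasets.append(dataset)
    return exposed_datasets

workflow_steps = {"1": {"workflow_outputs": [{"output_name": "out_file1"}]}}
history_datasets = [
    {"workflow_step": 1, "creating_job_output_name": "out_file1"},  # asterisked
    {"workflow_step": 1, "creating_job_output_name": "out_file2"},  # not asterisked
]
assert get_exposed_datasets(history_datasets, workflow_steps) == history_datasets[:1]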

Tests to write:


@codecov-io commented Sep 20, 2017

Codecov Report

Merging #2174 into release-1.6.0 will increase coverage by 0.43%.
The diff coverage is 99.18%.


@@                Coverage Diff                @@
##           release-1.6.0    #2174      +/-   ##
=================================================
+ Coverage          46.35%   46.79%   +0.43%     
=================================================
  Files                412      412              
  Lines              27977    28200     +223     
  Branches            1312     1312              
=================================================
+ Hits               12970    13196     +226     
+ Misses             15007    15004       -3
Impacted Files Coverage Δ
refinery/data_set_manager/utils.py 63.45% <ø> (+0.64%) ⬆️
refinery/tool_manager/test_data/galaxy_mocks.py 100% <100%> (ø) ⬆️
refinery/data_set_manager/tests.py 99.6% <100%> (ø) ⬆️
refinery/core/tests.py 99.63% <100%> (ø) ⬆️
refinery/core/models.py 67.92% <100%> (+1.26%) ⬆️
refinery/tool_manager/tests.py 99.85% <100%> (ø) ⬆️
refinery/galaxy_connector/galaxy_workflow.py 13.09% <100%> (+0.29%) ⬆️
refinery/data_set_manager/models.py 76.62% <100%> (+0.78%) ⬆️
refinery/tool_manager/models.py 95.36% <97.14%> (-0.39%) ⬇️
... and 7 more

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 68e8d11...4ce2052.

@scottx611x scottx611x added this to the Release 1.6.0 milestone Sep 20, 2017
@scottx611x scottx611x added this to QC in Tool APIs Sep 20, 2017
@scottx611x scottx611x changed the title Scottx611x/fix derived node attributes Scottx611x/fix derived node attribute inheritance Sep 21, 2017
)
if graph[edge[0]][edge[1]]['output_id'] == output_id:
    output_node_id = edge[0]
@scottx611x (Member, Author) commented:

NOTE: This is where things were flip-flopped before.
This code wasn't broken, just difficult to debug, since the naming didn't reflect what was actually happening:

output_node_id was really the input_node_id, etc.

@@ -563,7 +563,6 @@ def _add_annotated_nodes(

if len(bulk_list) > 0:
    AnnotatedNode.objects.bulk_create(bulk_list)
    bulk_list = []
@scottx611x (Member, Author) commented:

This was dead code

@@ -1609,15 +1609,21 @@ def rename_results(self):
if item.get_filetype() == zipfile:
    new_file_name = ''.join([root, '.zip'])
    renamed_file_store_item_uuid = rename(
@scottx611x (Member, Author) commented:

rename and rename_datafile return None in odd ways. I'll make an issue for fixing this.

if (graph[edge[0]][edge[1]]['output_id'] ==
        str(input_connection.step) + '_' +
        input_connection.filename):
    input_id = "{}_{}".format(
@scottx611x (Member, Author) commented:

TODO: add an AnalysisNodeConnection method for this `{}_{}` concatenation pattern.

# before:
analysis=self, direction=INPUT_CONNECTION)[0].node.assay
# after:
analysis=self,
direction=INPUT_CONNECTION
).first().node.assay
# 1. read workflow into graph
graph = create_expanded_workflow_graph(
ast.literal_eval(self.workflow_copy)
@scottx611x (Member, Author) commented:

Do we need literal_eval ??? Yes we do
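A quick illustration of why ast.literal_eval is the right call here, assuming workflow_copy is persisted as the string representation of a Python dict (which is what the call above implies):

import ast

# The stored string uses single quotes, so json.loads would reject it, and
# eval() would execute arbitrary code. ast.literal_eval only accepts Python
# literals (dicts, lists, strings, numbers), so it parses the dict safely.
workflow_copy = "{'steps': {'0': {'type': 'data_input'}}}"
workflow_dict = ast.literal_eval(workflow_copy)
assert workflow_dict["steps"]["0"]["type"] == "data_input"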

# get graph edge that corresponds to this output node:
# a. attach output node to source data transformation node
# b. attach output node to target data transformation node
# (if exists)
if len(graph.edges([output_connection.step])) > 0:
    for edge in graph.edges_iter([output_connection.step]):
        workflow_step = output_connection.step
@scottx611x (Member, Author) commented:

workflow_step -> output_connection.step


# return a sorted list based on the AnalysisNodeConnections step
# attribute
return sorted(
@scottx611x (Member, Author) commented:

Not convinced that it needs to be sorted. Update: it didn't need to be.

for dataset in self._get_galaxy_history_dataset_list():
    creating_job = self._get_galaxy_dataset_job(dataset)
    if "upload" not in creating_job["tool_id"]:
        workflow_step = self._get_workflow_step(dataset)
@scottx611x (Member, Author) commented:

Naming: workflow_step_index or workflow_step_key? workflow_steps_dict?

Also, wrap workflow_step_key as str().
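For context on the str() wrapping seen in the next hunk (the step dict below is illustrative): the workflow steps mapping presumably comes from a JSON response, so its keys are strings, while order_index is an int.

workflow_steps = {"0": {"tool_id": "upload1"}, "1": {"tool_id": "Grep1"}}
workflow_step = 1  # int, e.g. a step's order_index

assert workflow_steps[str(workflow_step)]["tool_id"] == "Grep1"
# workflow_steps[workflow_step] would raise KeyError: 1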

workflow_steps[str(workflow_step)]["workflow_outputs"]
]
if creating_job_output_name in workflow_output_names:
visible_datasets.append(dataset)
@scottx611x (Member, Author) commented:

exposed_datasets?

if step["job_id"] == galaxy_dataset_dict[self.CREATING_JOB]:
    workflow_steps.append(step["order_index"])

if not workflow_steps:
@scottx611x (Member, Author) commented:

Add a comment explaining why this is necessary.


if not workflow_steps:
    workflow_steps.append(0)

assert len(workflow_steps) == 1, (
@scottx611x (Member, Author) commented:

Is this even necessary? Why not return in the loop?
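A sketch of the early-return variant being suggested (the steps structure and function wiring here are assumptions for illustration; field names follow the snippets above):

INPUT_STEP_NUMBER = 0  # mirrors the class constant used further down

def get_workflow_step(invocation_steps, creating_job_id):
    # Return the matching step's order_index as soon as it is found, instead
    # of collecting matches and asserting there is exactly one afterwards.
    for step in invocation_steps:
        if step["job_id"] == creating_job_id:
            return step["order_index"]
    # No match: the dataset came from an `upload`/`input` step
    return INPUT_STEP_NUMBER

steps = [{"job_id": "abc", "order_index": 1}, {"job_id": "def", "order_index": 2}]
assert get_workflow_step(steps, "def") == 2
assert get_workflow_step(steps, "xyz") == INPUT_STEP_NUMBER

The trade-off is losing the explicit exactly-one-match assertion, which may be why the collect-and-assert form was kept.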

"""
creating_job = self._get_galaxy_dataset_job(galaxy_dataset_dict)
creating_job_outputs = creating_job["outputs"]
logger.debug("Dataset: %s", galaxy_dataset_dict)
@scottx611x (Member, Author) commented:

remove loggers

    output_connection.step,
    output_connection.filename
)
output_id = output_connection.get_output_connection_id()
Reviewer (Member) commented:

hmm: I was imagining just one method, get_id, instead of separate get_output_connection_id and get_input_connection_id?

@scottx611x (Member, Author) replied:

Good call, that's less confusing.

else:
    output_connections_to_analysis_results.append(
        (output_connection, None)
    )
Reviewer (Member) commented:

So there's always an analysis_result from the database.... but maybe it's not really an analysis_result? The if-then could be moved up to where we set it.


def get_output_connection_id(self):
    return "{}_{}".format(self.step, self.name)

Reviewer (Member) commented:

oy: I hadn't noticed this discrepancy. Maybe a comment to explain why filename for one and name for the other?
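To make the suggestion concrete, a unified accessor might look roughly like this (a sketch only; the real AnalysisNodeConnection is a Django model in refinery/core/models.py, and the direction constant values below are placeholders):

INPUT_CONNECTION = "in"     # placeholder values for the example
OUTPUT_CONNECTION = "out"

class AnalysisNodeConnectionSketch:
    def __init__(self, step, name, filename, direction):
        self.step = step
        self.name = name
        self.filename = filename
        self.direction = direction

    def get_id(self):
        # Inputs are identified by their filename and outputs by their name
        # (the discrepancy flagged above), so document that in one place.
        field = self.filename if self.direction == INPUT_CONNECTION else self.name
        return "{}_{}".format(self.step, field)

assert AnalysisNodeConnectionSketch(1, "", "input.txt", INPUT_CONNECTION).get_id() == "1_input.txt"
assert AnalysisNodeConnectionSketch(2, "out_file", "", OUTPUT_CONNECTION).get_id() == "2_out_file"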

)
analysis_group_number = (
matching_refinery_to_galaxy_file_mappings[0][self.ANALYSIS_GROUP]
assert len(list(set(analysis_groups))) == 1, (
Reviewer (Member) commented:

this was necessary after all?

@@ -861,15 +867,54 @@ def get_galaxy_file_relationships(self):
def _get_galaxy_history_dataset_list(self):
"""
Retrieve a list of Galaxy Datasets from the Galaxy History of our
Galaxy Workflow invocation.
Galaxy Workflow invocation all tool outputs in the Galaxy Workflow
editor.
Reviewer (Member) commented:

Hmm: really not sure what this is trying to say.

# If we reach this point and have no workflow_steps, this means that
# the galaxy dataset in question corresponds to an `upload` or
# `input` step i.e. `0`
return self.INPUT_STEP_NUMBER
Reviewer (Member) commented:

not sure the constant makes this more clear, but it's fine.

@mccalluc mccalluc merged commit 6003a4a into release-1.6.0 Sep 22, 2017
@mccalluc mccalluc deleted the scottx611x/fix-derived-node-attributes branch September 22, 2017 16:35
@scottx611x scottx611x moved this from QC to Done in Tool APIs Sep 22, 2017
Labels: none yet
Projects: Tool APIs (Done)
Linked issues: none
3 participants