Upgrade Resolver for gears, analyses and lookup #1098

ehlertjd · 2018-03-12T14:40:37Z

Resolver now uses ContainerStorage for projections and tree knowledge. Resolver also uses DB filtering to find the next node, rather than retrieving all children then filtering.

I've added indexes for [parent, label] to projects, sessions and acquisitions. This caused intermittent breakages in a few integration tests that were relying on DB returning records in insert-order - I believe that these are all fixed. Should I include a migration closure to drop the obsoleted indexes?

This implements virtual nodes for gears, analyses, and files. By default we return all children of a node, but you can select just the files or analyses by adding files or analyses to the path. For example, resolving scitran/Neuroscience/session-01 will return all acquisitions, analyses and files as children, but resolving scitran/Neuroscience/session-01/files will return just the list of files. This resolves the filename ambiguity introduced in #1089.
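A minimal sketch of the child selection described above, assuming a node is a plain dict that holds its acquisitions, analyses and files. The dict shape and the name `list_children` are illustrative only and are not taken from resolver.py.

```python
# Virtual path elements that narrow the children to a single kind.
VIRTUAL_NODES = ('files', 'analyses')

# Order in which child kinds are returned when no selector is given.
CHILD_KINDS = ('acquisitions', 'analyses', 'files')


def list_children(node, selector=None):
    """Return the children of `node`, optionally limited by a virtual node.

    `node` is a hypothetical dict such as
    {'acquisitions': [...], 'analyses': [...], 'files': [...]}.
    """
    if selector in VIRTUAL_NODES:
        # e.g. session-01/files returns just the files
        return list(node.get(selector, []))

    # Default: every kind of child, concatenated
    children = []
    for kind in CHILD_KINDS:
        children.extend(node.get(kind, []))
    return children
```

With this sketch, `list_children(session)` returns everything under the session, while `list_children(session, 'files')` returns only its files.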

Gears can be retrieved by using a path of /gears/gear-name or /gears/<id:gear-id>.
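The two forms of path element could be told apart roughly as follows. The function name, the regular expression and the return shape are assumptions made for illustration; the real parsing lives in resolver.py and may differ.

```python
import re

# Assumed syntax for an id-based path element: <id:VALUE>
_ID_ELEMENT = re.compile(r'^<id:(?P<value>[^>]+)>$')


def parse_path_element(element):
    """Return (use_id, criterion) for one path element.

    use_id is True when the element has the <id:...> form, in which case
    criterion is the id; otherwise criterion is the element itself (a label
    or gear name).
    """
    match = _ID_ELEMENT.match(element)
    if match:
        return True, match.group('value')
    return False, element
```

For example, `parse_path_element('<id:abc123>')` yields `(True, 'abc123')`, and `parse_path_element('gear-name')` yields `(False, 'gear-name')`.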

In addition, this adds a new /api/lookup endpoint that takes the same input as /api/resolve, but redirects to the appropriate GET handler for the resolved node. For example, performing a lookup of scitran/Neuroscience/session-01 will redirect (transparently on the server side) to the GET handler for /api/sessions/<session-01.id>. The one exception is a lookup on a file node, which simply returns the file node (without an info object).

Finally, in order to support the resolver/lookup in SDK2, polymorphism was added to resolver.json and a node-type defined for each type. This means that we need to set a container_type attribute in any response returned from /api/resolve or /api/lookup. In the case of lookup, this is done by setting a request environment variable fw_container_type and invoking the new utility function: util.add_container_type.
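As a rough illustration of that last step, the helper might behave like the sketch below. The signature is an assumption: the description only names util.add_container_type and the fw_container_type variable, so the argument order and the handling of non-dict results here are hypothetical.

```python
def add_container_type(environ, result):
    """Hypothetical stand-in for util.add_container_type.

    If the request environment carries fw_container_type, copy it onto the
    response as container_type. Anything that is not a dict is returned
    unchanged.
    """
    container_type = environ.get('fw_container_type')
    if container_type is not None and isinstance(result, dict):
        result['container_type'] = container_type
    return result
```

A lookup that resolved to a session would set `environ['fw_container_type'] = 'session'` before dispatching to the GET handler, so the response comes back tagged for polymorphic clients.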

Breaking Changes

You can no longer resolve files by naming them directly. For example: /scitran/Neuroscience/ses-01/acq/scan.nii.gz MUST become /scitran/Neuroscience/ses-01/acq/files/scan.nii.gz.
Renamed node_type to container_type to be consistent with other places where we identify the container type (e.g. search)

Review Checklist

Tests were added to cover all code changes
Documentation was added / updated
Code and tests follow standards in CONTRIBUTING.md


codecov-io · 2018-03-12T14:45:53Z

Codecov Report

Merging #1098 into swagger-fix-undocumented-codegen will increase coverage by 0.13%.
The diff coverage is 98.85%.

@@                         Coverage Diff                          @@
##           swagger-fix-undocumented-codegen    #1098      +/-   ##
====================================================================
+ Coverage                             90.81%   90.94%   +0.13%     
====================================================================
  Files                                    50       50              
  Lines                                  7031     7146     +115     
====================================================================
+ Hits                                   6385     6499     +114     
- Misses                                  646      647       +1

kofalt · 2018-03-13T17:48:21Z

api/dao/basecontainerstorage.py

+            fill_defaults (bool): Whether or not to populate the default values for returned elements. Default is False.
+            **kwargs: Additional arguments to pass to the underlying find function
+
+        """


kofalt · 2018-03-13T17:49:50Z

api/dao/basecontainerstorage.py

+        """
+        Return a copy of the list projection to use with this container, or None.
+        It is safe to modify the returned copy.
+        """


I'm so-so on this function, but I can see why you did it.

kofalt · 2018-03-13T17:58:23Z

api/dao/containerstorage.py

+            list_projection={'info': 0, 'analyses': 0, 'subject.firstname': 0,
+                'subject.lastname': 0, 'subject.sex': 0, 'subject.age': 0,
+                'subject.race': 0, 'subject.ethnicity': 0, 'subject.info': 0,
+                'files.info': 0, 'tags': 0})


Okay, so I see how you've avoided putting an unsafe non-primitive type in an optional parameter by having it embedded in the superclass constructor. This makes sense, but maybe there's a better place to put it?

Maybe they could move onto the class (self._default_projection above the init) and the superconstructor looks there? Or it's passed in as the optional parameter on this line?

Alternatively, subclasses could just override the get_list_projection method? I agree that the above is awkward.
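A sketch of what the override approach could look like. The class names are hypothetical and the projection is shortened; the point is only that each call builds a fresh dict, so no mutable default is shared between callers.

```python
class BaseStorageSketch(object):
    """Hypothetical base class; not the real ContainerStorage."""

    def get_list_projection(self):
        """Return a projection the caller may modify, or None for no projection."""
        return None


class SessionStorageSketch(BaseStorageSketch):
    """Hypothetical subclass that supplies its own list projection."""

    def get_list_projection(self):
        # A new dict is created on every call, so a caller that mutates
        # the result cannot affect later calls.
        return {'info': 0, 'analyses': 0, 'files.info': 0, 'tags': 0}
```

A caller can then add or remove keys on the returned projection without touching state on the class or in a default argument.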

kofalt · 2018-03-13T17:59:06Z

api/dao/containerstorage.py

-        analyses = self.get_all_el({'parent.type': parent_type, 'parent.id': parent_id}, None, None)
+    def get_analyses(self, query, parent_type, parent_id, inflate_job_info=False, projection=None, **kwargs):
+        if query is None:
+            query = {}


Seeing as query is a required param, I would advocate for not tolerating None here.

ContainerStorage.get_all_el allows None for query, so I was following that model.

kofalt · 2018-03-13T17:59:26Z

api/handlers/containerhandler.py

-            'list_projection': {'info': 0, 'analyses': 0, 'subject.firstname': 0,
-                                'subject.lastname': 0, 'subject.sex': 0, 'subject.age': 0,
-                                'subject.race': 0, 'subject.ethnicity': 0, 'subject.info': 0,
-                                'files.info': 0, 'tags': 0},


👏 to this moving out of this file

kofalt · 2018-03-13T18:48:20Z

api/resolver.py

+
+        # Get the next node
+        if not path_in:
+            return None


path_in was not modified since the last copy of this condition? Maybe a mistake copypasta?

kofalt · 2018-03-13T18:51:28Z

api/resolver.py

+        use_id, criterion = parse_criterion(path_in)
+        parent = get_parent(path_out)
+        # Peek to see if we need files for the next path element
+        fetch_files = (path_peek(path_in) in ['files', None])


[HIGHER LEVEL DISCUSSION] This seems to maybe be a weakness of the approach: the ContainerNode has to know about things that seem more appropriate to the FilesNode. There's a lot of complexity here.

I think this goes away once we formalize files as their own collection. As long as files exist on containers I don't think it's inappropriate for container-centric code to deal with that fact. In this particular case the goal is to try to keep what we're retrieving from the database to what we actually need.
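For reference, the peek in the hunk above can be pictured like this, assuming the remaining path is a list of elements. Both functions are a sketch rather than the code under review, and `needs_files` is a made-up name for the inline expression.

```python
def path_peek(path):
    """Return the next path element without consuming it, or None if none remain."""
    return path[0] if path else None


def needs_files(path_in):
    """Decide whether file entries must be fetched for the current node.

    Files are needed when the next element is the 'files' virtual node, or
    when the path ends here and all children (files included) are returned.
    """
    return path_peek(path_in) in ('files', None)
```

When the next element is anything else, such as `analyses` or an acquisition label, the files can be left out of the projection and less data is read from the database.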

kofalt · 2018-03-13T19:03:44Z

api/resolver.py

+
+        # Check for analyses
+        if path_peek(path_in) == 'analyses':
+            if self.analyses:


I think these two conditionals should be combined and the raise removed. Right now, there's nothing stopping someone from naming a container analyses (ref #1089 (comment)), and I would expect a not-found error to occur if nothing is found.

kofalt · 2018-03-13T19:21:03Z

api/resolver.py

-    @staticmethod
-    def resolve(path):
-
+    def resolve(self, path):


This function is pretty slick relative to the previous one. I'm a little iffy on the callees modifying the passed variables, but I can see the advantage.

kofalt · 2018-03-13T19:22:10Z

api/resolver.py

+        return {
+            'path': resolved_path,
+            'children': resolved_children
+        }


Overall, this file roughly doubled in length. It definitely gained some features and documentation, but I think lost a lot of readability along the way. Is there anything we can do to improve that?

I think your review had a lot of good suggestions to help with that, but I'll keep an eye out for other improvements that can be made.


ehlertjd · 2018-03-14T15:35:56Z

@kofalt I believe I've addressed most of your comments. resolver.py did get a little simpler:

Before

github.com/AlDanial/cloc v 1.76  T=1.03 s (1.0 files/s, 387.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           1             86             85            226
-------------------------------------------------------------------------------

After

github.com/AlDanial/cloc v 1.76  T=1.05 s (1.0 files/s, 373.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           1             90            104            198
-------------------------------------------------------------------------------

ehlertjd and others added 30 commits March 2, 2018 12:17

Allow template parameters to be optional

0060ca3

Parameters can now be optional or required, which allows for more advanced template usage. Updated existing templates to set the required flag. Added documentation for root level analyses apis via existing templates.

Initial addition of reaper upload API doc

89363b4

Add jobs/logs, jobs/prepare-complete, jobs/accept-failed-output endpoints doc

1bc08eb

Document site rules

5478bf6

Add missing DELETE file

aeada8d

Reorder tags

8b2feef

Add schemas for gear APIs.

081bbdb

Add schema and endpoint for jobs

18c4b8d

Added input schema for adding a job, and the endpoint for GET /job/{JobId}/logs.

Add docs for modify info

275e0f5

Add documentation for PUT on file endpoints

89de560

Add missing parameter to collection/acquisitions

ea0d39b

Added "session" query parameter that limits the returned acquisitions.

Add missing gear invocation endpoint

58bc7f3

This endpoint returns the JSON schema for gear configuration.

Rename update_job endpoint

c11464f

Add get_analyses endpoint documentation

1631af8

This documents the endpoints for /<container>/<cid>/analyses as well as /<container>/<cid>/<subcontainer>/analyses.

Fix failing tests for analyses

1b250ce

Add schemas for batch job scheduling

866949f

Also refactored a few schemas to better match the SDK.

Add schema for adding job logs

b20a26a

Add API documentation for search endpoint

89cc372

Cleanup packfile endpoints

06e4c99

Removed packfile endpoints for sessions, collections, and acquisitions. Added missing parameters to packfile and packfile-end endpoints.

Add documentation for resolver endpoint

1135641

Closes #1088

Added step to simplify swagger for code generation. Fixed a few missing types in schemas, added schemas for login/logout responses

7496d0f

Added additional model simplifications

452f617

Alias primitive types in definitions

e3eb466

Added support for conversion from patternProperties to additionalProperties in schema transpiler

5bda420

Replace pure references in definitions with aliases

11c709c

Checkpoint commit of modifications for codegen.

786a2aa

Added x-sdk-include-empty for Gear model

510cc77

Rename file to file-entry (python codegen had a problem with that name)

9be94d1

API Doc changes for codegen

0a47869

Added date-time format to created/modified timestamps

3677395

ehlertjd added 18 commits March 2, 2018 12:26

Update schemas for analyses in SDK

cb7de2e

This update forces us to use the inflated job for single analysis end points.

Ignore '+' and '-' properties for download filters

e467742

Add default values for search parameters

08bdbd9

Update packfile output definition

6086aa0

Add model merge for swagger codegen

4d35b4d

This commit will allow merging input and output models based on the `x-sdk-model` property in the JSON definitions.

Add additional handling for polymorphic models

087a3b1

Generalize resolver approach

49f97eb

This commit updates resolver to make use of mongo indexes and retrieve fewer nodes from the database. Also allows resolving just a single container id without retrieving children.

Add lookup endpoint that routes to GET for path

8c70ad3

In order to support polymorphism in clients, this commit also adds a new optional request environment variable `fw_node_type` which, when set should be included as `node_type` in the result.

Move list_projection into ContainerStorage

9304596

Use ContainerStorage for resolver

0073041

Add swagger documentation for lookup endpoint

3049c32

Fix test failing due to indeterminate ordering

22712cb

Refactor resolver to support virtual nodes

277deed

Add gears to resolver

31f0a5b

This does not yet support gear versioning.

Add optional query to get_analyses

94d2fd6

Add analyses to resolver

234274a

Add resolver definitions for gears and analyses

4791196

Improve resolver test coverage

21e34fc

ehlertjd added SDK Breaking Change labels Mar 12, 2018

ehlertjd requested review from kofalt and nagem March 12, 2018 14:40

Rename resolver's node_type to container_type

7c73798

kofalt reviewed Mar 13, 2018

Resolve review comments

230a39e

Overall, increase documentation and attempt to reduce complexity in resolver.py. Also changed the paradigm for ContainerStorage list projections to use function overrides.

ehlertjd force-pushed the swagger-fix-undocumented-codegen branch from 087a3b1 to 1ee8ba1 Compare March 15, 2018 20:49

gsfr removed the request for review from nagem May 11, 2018 14:54

ehlertjd closed this Dec 13, 2018
