Skip to content

Conversation

ehlertjd
Copy link
Contributor

@ehlertjd ehlertjd commented Mar 12, 2018

Resolver now uses ContainerStorage for projections and tree knowledge. Resolver also uses DB filtering to find the next node, rather than retrieving all children then filtering.

I've added indexes for [parent, label] to projects, sessions and acquisitions. This caused intermittent breakages in a few integration tests that were relying on DB returning records in insert-order - I believe that these are all fixed. Should I include a migration closure to drop the obsoleted indexes?

This implements virtual nodes for gears, analyses, and files. By default we return all children of a node, but you can select just the files or analyses by adding files or analyses to the path. e.g. resolving scitran/Neuroscience/session-01 will return all acquisitions, analyses and files as children, but resolving scitran/Neuroscience/session-01/files will return just the list of files. This resolves the filename ambiguity that is introduced in #1089.

Gears can be retrieved by using a path of /gears/gear-name or /gears/<id:gear-id>.

In addition, this adds a new /api/lookup endpoint that takes the same input as /api/resolve, but redirects to the appropriate GET handler for the resolved node. For example, performing a lookup of scitran/Neuroscience/session-01 will redirect (transparently on the server side) to the GET handler for /api/sessions/<session-01.id>. The one exception is if you perform a lookup on a file node, it will simply return the file node (without an info object).

Finally, in order to support the resolver/lookup in SDK2, polymorphism was added to resolver.json and a node-type defined for each type. This means that we need to set a container_type attribute in any response returned from /api/resolve or /api/lookup. In the case of lookup, this is done by setting a request environment variable fw_container_type and invoking the new utility function: util.add_container_type.

Breaking Changes

  • You can no longer resolve files by naming them directly. For example: /scitran/Neuroscience/ses-01/acq/scan.nii.gz MUST become /scitran/Neuroscience/ses-01/acq/files/scan.nii.gz.
  • Renamed node_type to container_type to be consistent with other places where we identify the container type (e.g. search)

Review Checklist

  • Tests were added to cover all code changes
  • Documentation was added / updated
  • Code and tests follow standards in CONTRIBUTING.md

ehlertjd and others added 30 commits March 2, 2018 12:17
Parameters can now be optional or required, which allows for more advanced template usage.
Updated existing templates to set the required flag.
Added documentation for root level analyses apis via existing templates.
Added input schema for adding a job, and the endpoint for GET /job/{JobId}/logs.
Added "session" query parameter that limits the returned acquisitions.
This endpoint returns the JSON schema for gear configuration.
This documents the endpoints for /<container>/<cid>/analyses as well as
/<container>/<cid>/<subcontainer>/analyses.
Also refactored a few schemas to better match the SDK.
Removed packfile endpoints for sessions, collections, and acquisitions.
Added missing parameters to packfile and packfile-end endpoints.
…ng types in schemas, added schemas for login/logout responses
ehlertjd added 18 commits March 2, 2018 12:26
This update forces us to use the inflated job for single analysis end points.
This commit will allow merging input and output models based on
the `x-sdk-model` property in the JSON definitions.
This commit updates resolver to make use of mongo indexes and retrieve
fewer nodes from the database. Also allows resolving just a single
container id without retrieving children.
In order to support polymorphism in clients, this commit also adds
a new optional request environment variable `fw_node_type` which,
when set should be included as `node_type` in the result.
This does not yet support gear versioning.
@codecov-io
Copy link

codecov-io commented Mar 12, 2018

Codecov Report

Merging #1098 into swagger-fix-undocumented-codegen will increase coverage by 0.13%.
The diff coverage is 98.85%.

@@                         Coverage Diff                          @@
##           swagger-fix-undocumented-codegen    #1098      +/-   ##
====================================================================
+ Coverage                             90.81%   90.94%   +0.13%     
====================================================================
  Files                                    50       50              
  Lines                                  7031     7146     +115     
====================================================================
+ Hits                                   6385     6499     +114     
- Misses                                  646      647       +1

fill_defaults (bool): Whether or not to populate the default values for returned elements. Default is False.
**kwargs: Additional arguments to pass to the underlying find function
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

"""
Return a copy of the list projection to use with this container, or None.
It is safe to modify the returned copy.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm so-so on this function, but I can see why you did it.

list_projection={'info': 0, 'analyses': 0, 'subject.firstname': 0,
'subject.lastname': 0, 'subject.sex': 0, 'subject.age': 0,
'subject.race': 0, 'subject.ethnicity': 0, 'subject.info': 0,
'files.info': 0, 'tags': 0})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I see how you've avoided putting a unsafe non-primitive type in an optional parameter by having it embedded in the superclass constructor. This makes sense, but maybe there's a better place to put it?

Maybe they could move onto the class (self._default_projection above the init) and the superconstructor looks there? Or it's passed in as the optional parameter on this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, subclasses could just override the get_list_projection method? I agree that the above is awkward.

analyses = self.get_all_el({'parent.type': parent_type, 'parent.id': parent_id}, None, None)
def get_analyses(self, query, parent_type, parent_id, inflate_job_info=False, projection=None, **kwargs):
if query is None:
query = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as query is a required param, I would advocate for not tolerating None here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ContainerStorage.get_all_ell allows None for query, so I was following that model.

'list_projection': {'info': 0, 'analyses': 0, 'subject.firstname': 0,
'subject.lastname': 0, 'subject.sex': 0, 'subject.age': 0,
'subject.race': 0, 'subject.ethnicity': 0, 'subject.info': 0,
'files.info': 0, 'tags': 0},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 to this moving out of this file

api/resolver.py Outdated

# Get the next node
if not path_in:
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path_in was not modified since the last copy of this condition? Maybe a mistake copypasta?

api/resolver.py Outdated
use_id, criterion = parse_criterion(path_in)
parent = get_parent(path_out)
# Peek to see if we need files for the next path element
fetch_files = (path_peek(path_in) in ['files', None])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGHER LEVEL DISCUSSION] This seems to maybe be a weakness of the approach: the ContainerNode has to know about things that seem more appropriate to the FilesNode. There's a lot of complexity here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this goes away once we formalize files as their own collection. As long as files exist on containers I don't think it's inappropriate for container-centric code to deal with that fact. In this particular case the goal is to try to keep what we're retrieving from the database to what we actually need.

api/resolver.py Outdated

# Check for analyses
if path_peek(path_in) == 'analyses':
if self.analyses:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these two conditionals should be combined and the raise removed. Right now, there's nothing stopping someone calling a container analyses (ref #1089 (comment)) and I would expect a not-found error to occur if nothing is found.

@staticmethod
def resolve(path):

def resolve(self, path):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is pretty slick relative to the previous one. I'm a little iffy on the callees modifying the passed variables, but I can see the advantage.

return {
'path': resolved_path,
'children': resolved_children
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this file roughly doubled in length. It definitely gained some features and documentation, but I think lost a lot of readability along the way. Is there anything we can do to improve that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your review had a lot of good suggestions to help with that, but I'll keep an eye out for other improvements that can be made.

Overall, increase documentation and attempt to reduce complexity in
resolver.py.
Also changed the paradigm for ContainerStorage list projections to use
function overrides.
@ehlertjd
Copy link
Contributor Author

@kofalt I believe I've addressed most of your comments. resolver.py did get a little simpler:

Before

github.com/AlDanial/cloc v 1.76  T=1.03 s (1.0 files/s, 387.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           1             86             85            226
-------------------------------------------------------------------------------

After

github.com/AlDanial/cloc v 1.76  T=1.05 s (1.0 files/s, 373.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                           1             90            104            198
-------------------------------------------------------------------------------

@ehlertjd ehlertjd force-pushed the swagger-fix-undocumented-codegen branch from 087a3b1 to 1ee8ba1 Compare March 15, 2018 20:49
@gsfr gsfr removed the request for review from nagem May 11, 2018 14:54
@ehlertjd ehlertjd closed this Dec 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants