-
Notifications
You must be signed in to change notification settings - Fork 18
Upgrade Resolver for gears, analyses and lookup #1098
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Parameters can now be optional or required, which allows for more advanced template usage. Updated existing templates to set the required flag. Added documentation for root level analyses apis via existing templates.
Added input schema for adding a job, and the endpoint for GET /job/{JobId}/logs.
Added "session" query parameter that limits the returned acquisitions.
This endpoint returns the JSON schema for gear configuration.
This documents the endpoints for /<container>/<cid>/analyses as well as /<container>/<cid>/<subcontainer>/analyses.
Also refactored a few schemas to better match the SDK.
Removed packfile endpoints for sessions, collections, and acquisitions. Added missing parameters to packfile and packfile-end endpoints.
…ng types in schemas, added schemas for login/logout responses
…erties in schema transpiler
This update forces us to use the inflated job for single analysis end points.
This commit will allow merging input and output models based on the `x-sdk-model` property in the JSON definitions.
This commit updates resolver to make use of mongo indexes and retrieve fewer nodes from the database. Also allows resolving just a single container id without retrieving children.
In order to support polymorphism in clients, this commit also adds a new optional request environment variable `fw_node_type` which, when set should be included as `node_type` in the result.
This does not yet support gear versioning.
Codecov Report
@@ Coverage Diff @@
## swagger-fix-undocumented-codegen #1098 +/- ##
====================================================================
+ Coverage 90.81% 90.94% +0.13%
====================================================================
Files 50 50
Lines 7031 7146 +115
====================================================================
+ Hits 6385 6499 +114
- Misses 646 647 +1 |
fill_defaults (bool): Whether or not to populate the default values for returned elements. Default is False. | ||
**kwargs: Additional arguments to pass to the underlying find function | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏
""" | ||
Return a copy of the list projection to use with this container, or None. | ||
It is safe to modify the returned copy. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm so-so on this function, but I can see why you did it.
api/dao/containerstorage.py
Outdated
list_projection={'info': 0, 'analyses': 0, 'subject.firstname': 0, | ||
'subject.lastname': 0, 'subject.sex': 0, 'subject.age': 0, | ||
'subject.race': 0, 'subject.ethnicity': 0, 'subject.info': 0, | ||
'files.info': 0, 'tags': 0}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so I see how you've avoided putting a unsafe non-primitive type in an optional parameter by having it embedded in the superclass constructor. This makes sense, but maybe there's a better place to put it?
Maybe they could move onto the class (self._default_projection
above the init) and the superconstructor looks there? Or it's passed in as the optional parameter on this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, subclasses could just override the get_list_projection
method? I agree that the above is awkward.
analyses = self.get_all_el({'parent.type': parent_type, 'parent.id': parent_id}, None, None) | ||
def get_analyses(self, query, parent_type, parent_id, inflate_job_info=False, projection=None, **kwargs): | ||
if query is None: | ||
query = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing as query is a required param, I would advocate for not tolerating None
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ContainerStorage.get_all_ell
allows None for query, so I was following that model.
'list_projection': {'info': 0, 'analyses': 0, 'subject.firstname': 0, | ||
'subject.lastname': 0, 'subject.sex': 0, 'subject.age': 0, | ||
'subject.race': 0, 'subject.ethnicity': 0, 'subject.info': 0, | ||
'files.info': 0, 'tags': 0}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 to this moving out of this file
api/resolver.py
Outdated
|
||
# Get the next node | ||
if not path_in: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
path_in
was not modified since the last copy of this condition? Maybe a mistake copypasta?
api/resolver.py
Outdated
use_id, criterion = parse_criterion(path_in) | ||
parent = get_parent(path_out) | ||
# Peek to see if we need files for the next path element | ||
fetch_files = (path_peek(path_in) in ['files', None]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[HIGHER LEVEL DISCUSSION] This seems to maybe be a weakness of the approach: the ContainerNode has to know about things that seem more appropriate to the FilesNode. There's a lot of complexity here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this goes away once we formalize files as their own collection. As long as files exist on containers I don't think it's inappropriate for container-centric code to deal with that fact. In this particular case the goal is to try to keep what we're retrieving from the database to what we actually need.
api/resolver.py
Outdated
|
||
# Check for analyses | ||
if path_peek(path_in) == 'analyses': | ||
if self.analyses: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these two conditionals should be combined and the raise
removed. Right now, there's nothing stopping someone calling a container analyses
(ref #1089 (comment)) and I would expect a not-found error to occur if nothing is found.
@staticmethod | ||
def resolve(path): | ||
|
||
def resolve(self, path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is pretty slick relative to the previous one. I'm a little iffy on the callees modifying the passed variables, but I can see the advantage.
return { | ||
'path': resolved_path, | ||
'children': resolved_children | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this file roughly doubled in length. It definitely gained some features and documentation, but I think lost a lot of readability along the way. Is there anything we can do to improve that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your review had a lot of good suggestions to help with that, but I'll keep an eye out for other improvements that can be made.
Overall, increase documentation and attempt to reduce complexity in resolver.py. Also changed the paradigm for ContainerStorage list projections to use function overrides.
@kofalt I believe I've addressed most of your comments. resolver.py did get a little simpler: Before
After
|
087a3b1
to
1ee8ba1
Compare
Resolver now uses ContainerStorage for projections and tree knowledge. Resolver also uses DB filtering to find the next node, rather than retrieving all children then filtering.
I've added indexes for [
parent
,label
] to projects, sessions and acquisitions. This caused intermittent breakages in a few integration tests that were relying on DB returning records in insert-order - I believe that these are all fixed. Should I include a migration closure to drop the obsoleted indexes?This implements virtual nodes for
gears
,analyses
, andfiles
. By default we return all children of a node, but you can select just the files or analyses by addingfiles
oranalyses
to the path. e.g. resolvingscitran/Neuroscience/session-01
will return all acquisitions, analyses and files as children, but resolvingscitran/Neuroscience/session-01/files
will return just the list of files. This resolves the filename ambiguity that is introduced in #1089.Gears can be retrieved by using a path of
/gears/gear-name
or/gears/<id:gear-id>
.In addition, this adds a new
/api/lookup
endpoint that takes the same input as/api/resolve
, but redirects to the appropriate GET handler for the resolved node. For example, performing a lookup ofscitran/Neuroscience/session-01
will redirect (transparently on the server side) to the GET handler for/api/sessions/<session-01.id>
. The one exception is if you perform a lookup on a file node, it will simply return the file node (without an info object).Finally, in order to support the resolver/lookup in SDK2, polymorphism was added to
resolver.json
and a node-type defined for each type. This means that we need to set acontainer_type
attribute in any response returned from/api/resolve
or/api/lookup
. In the case of lookup, this is done by setting a request environment variablefw_container_type
and invoking the new utility function:util.add_container_type
.Breaking Changes
/scitran/Neuroscience/ses-01/acq/scan.nii.gz
MUST become/scitran/Neuroscience/ses-01/acq/files/scan.nii.gz
.node_type
tocontainer_type
to be consistent with other places where we identify the container type (e.g. search)Review Checklist