-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DNAnexus module within stor #2
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bunch of comments
NotFoundError = stor_exceptions.NotFoundError | ||
|
||
|
||
def _parse_dx_error(exc, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks unused now - should you remove it? It's also missing a second else clause when none of the cases match that would lead to a confusing error message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, outdated and removed.
wrapper.__doc__ = 'No-op for %r' % attr_name | ||
return wrapper | ||
|
||
abspath = _noop('abspath') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this going to lead to bugs around dx://Proj:/parent/subdir/../anotherfile.txt
- do we care about that behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
abspath is not used for DXPaths in code base, and seeing that OBS paths in general have abspath as a noop, it is upto the user to use abspath cautiously with DXPaths.
project=self.canonical_project | ||
)[0] | ||
else: | ||
raise DXError('DX Projects cannot have a temporary download url') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks pretty close to the right way to do this; however a couple points:
- since the if clause always raise an error, you can check it first, raise the exception and dedent here, i.e.:
if not self.resource:
raise DXError('Can only generate a download URL for individual files')
try:
return file_handler.get_download_url(...)
except: pass
- I think you want to check
resource
and not canonical resource before raising an error, no reason to go hit an API to canonicalize a project before raising this exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on both points, and changed.
return file_handler.get_download_url( | ||
duration=lifetime, | ||
preauthenticated=True, | ||
filename=filename, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you're already describing this, should filename just be what's returned from describe() if not present? what happens if filename is None here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If filename is None here, the get_download_url defaults to the name of the source file, which is the behavior we want.
else: | ||
raise DXError('DX Projects cannot have a temporary download url') | ||
except ValueError: | ||
raise DXError('DXPaths ending in folders cannot have a temporary download url') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would you hit this exception? wouldn't you know this because you can't generate a canonical resource for this method?
I'm thinking it'd be clearer to write something like:
if not self.resource: error
canonical = self.canonicalize() # causes error if it's a folder already??
in the second case I think it's fine to just have an error that's just about data objects, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed after discussion. Changed the method to reflect this.
if canonicalize: | ||
yield DXCanonicalPath('dx://{}:/{}'.format(obj['project'], obj['id'])) | ||
else: | ||
dx_p = DXVirtualPath(self.drive + proj_name + ':' + obj['describe']['folder']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this'd be clearer written as:
'{drive}{proj_name}:{folder}/{name}'.format(drive=drive, proj_name=proj_name, folder=obj['describe']['folder'], name=obj['describe']['name'])
``` which I believe does the same thing but avoids a level of indirection
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, changed here and other places where it would be more intuitive to write like this.
pass | ||
# otherwise we could be a directory, so try to grab first | ||
# file/subfolder | ||
if self._is_folder(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you help me understand what the _is_folder
check adds here? why not just try to list it directly and return False on the NotFoundError?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you ensure there's an integration test that shows the following is True?
# empty dirs should be seen as existing
folder_path = dx_p / 'myfolder'
folder_path.makedirs_p()
assert stor.exists(folder_path)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be a documented diff between swift vs. s3 vs. posix
Swift & S3 - "empty" folder is exists = False
Posix & DX - "empty" folder is exists = True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed _is_folder
There is a unit test TestExists.test_true_empty_dir
to check that empty dirs are seen as existing. Should I add an integration test as well?
""" | ||
if not self.resource: | ||
return None | ||
if self._is_folder(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when I look here, it feels like this should just raise a NotFoundError that specifies no data object could be found for the given path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed and changed.
'folder': self.resource.parent, | ||
'project': self.canonical_project | ||
}] | ||
results = dxpy.resolve_data_objects(objects=objects)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aside, should you pass in limit=2
here, just cuz we only care 0 or 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately resolve_data_objects doesn't have a limit argument, hence cannot pass it.
.format(self.project)) | ||
return proj_dict['id'] | ||
except DXError as e: | ||
raise DuplicateProjectError('Duplicate projects were found with given name ({})' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like a dangerous check, because it's possible you'd swallow up a different error (e.g., about authentication or something else). can you list projects with a limit of 2 and then handle 0/1/2 projects with appropriate exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed DXError to DXSearchError which is only raised when there are multiple projects.
Since find_one_project does not have a limit argument, the only other way is to pass zero_ok=False, in which case DXSearchError is raised for 0/2 projects and there's no way to differentiate the two scenarios.
|
||
Args: | ||
canonicalize (boolean): whether to return canonicalized paths | ||
only (str): "objects" for only objects, "folders" for only folder, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to keep because using it below
"""List the path as a dir, returning top-level directories and files. | ||
|
||
Args: | ||
canonicalize (boolean): whether to return canonicalized paths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
canonicalize (bool, default False): if True, return canonical paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed here and other places. Will include in next commit.
Yes, added a maximum limit of 15s, after which the test case fails |
Agreed. Made a 'setup_file' method that waits for the files to close before returning the handler. |
Implement apis for DNAnexus inside stor. Also implement unit and integration testing for DNAnexus paths inside stor. Setup DNAnexus credentials for testing.
Motivation
As part of the epic to support DNAnexus paths within stor.
Implementation