New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
daf_persistence Tickets/dm 7468 #51
Conversation
c00f085
to
92a7133
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK, small bunch of minor comments.
* see <http://www.lsstcorp.org/LegalNotices/>. | ||
*/ | ||
|
||
#ifndef LSST_MWI_PERSISTENCE_STORAGE_H | ||
#define LSST_MWI_PERSISTENCE_STORAGE_H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to LSST_MWI_PERSISTENCE_STORAGEFORMATTER_H
? BTW, why LSST_MWI_PERSISTENCE...
and not LSST_DAF_PERSISTENCE...
(this applies to many other headers too).
Storage& operator=(Storage const&); | ||
// Do not copy or assign a StorageFormatter instance. | ||
StorageFormatter(StorageFormatter const&); | ||
StorageFormatter& operator=(StorageFormatter const&); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use = delete
instead of making them private?
tests/testSwiftStorage.py
Outdated
user = os.getenv('SWIFT_USERNAME') | ||
if user is None: | ||
raise unittest.SkipTest( | ||
'Can not run test: OS_USERNAME not in environment') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWIFT_USERNAME
tests/testSwiftStorage.py
Outdated
key = os.getenv('SWIFT_PASSWORD') | ||
if key is None: | ||
raise unittest.SkipTest( | ||
'Can not run test: OS_PASSWORD not in environment') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SWIFT_PASSWORD
""" | ||
p = os.path.join(self.root, path) | ||
if os.path.exists(p): | ||
return p | ||
return open(p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: in general I prefer to open()
and catch an exception if it fails, as checking that file exist and opening it in two separate steps is non-atomic and my lead to some interesting errors. But I'm not sure you have to worry about this in your storage classes as they are probably the once managing lifetime of the files.
try: | ||
f = self._getLocalFile(location) | ||
self._log.info( | ||
"...getting file took {} seconds.".format(time.time() - start)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some people may not like too much nose at INFO log level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would debug be better, or what would you suggest?
For the time being I find it very useful for profiling Butler behavior, although I agree it will end up being noisy. I'd like to leave something that will print during unit test execution, but I don't need it to be 'info'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that this was at the trace level, not even debug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we log at INFO by default, so if you want to see this output from unit test at DEBUG/TRACE level you need to configure logging with non-default values.
""" | ||
# Now search for the path in the root or its parents | ||
# Strip off any cfitsio bracketed extension if present | ||
strippedPath, pathStripped = self.getFitsHeaderStrippedPath(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strippedPath, pathStripped
is not going to confuse anyone? 😸
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It confuses me! Actually it's pretty descriptive if you sit on your zen cushion for a minute. In any case, it's moved here from code before my time, and I don't want to mess with it right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll practice zen, it'll just take longer to review if I need to focus for a minute on that line 😸
except self.swift.ClientException as err: | ||
raise RuntimeError("Container \'{0}\' not found: {1}".format( | ||
self._containerName, err)) | ||
import fnmatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably go to the top of the file.
# Assuming the file does not exist, not some other kind of error | ||
# (how can we do more precise handling of this expcetion?) | ||
return None | ||
with open(localFile.name, mode='r') as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
localFile
is already a file object, can you read from it directly?
tests/testSwiftStorage.py
Outdated
import sys | ||
import uuid | ||
|
||
packageDir = os.path.join(getPackageDir('daf_persistence')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is os.path.join
doing anything here?
790f537
to
7ca1c3a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider the proposed modifications.
SWIFT_USERNAME : string | ||
The username to use when authorizing the connection. | ||
SWIFT_PASSWORD : string | ||
The password to use when authorizing the connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In would suggest to use the environment variables OS_USERNAME
, OS_PASSWORD
and OS_TENANT_NAME
for authentication information. These are the variables that the official Swift python command line interface uses for authentication, so I see value in keeping the same convention for the Butlet.
A URI to connect to a swift storage location. The form of the URI is | ||
`swift://[URL without 'http://']/[Object API Version]/[tenant name (account)]/[container]` | ||
For example: | ||
`swift://nebula.ncsa.illinois.edu:5000/v2.0/lsst/my_container` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An URL such as swift://nebula.ncsa.illinois.edu:5000/v2.0/lsst/my_container
does not tell what scheme the Swift authentication server uses. For instance, NCSA's instance of Swift uses http://
but CC-IN2P3's uses https://
for both the authentication service and the storage server. In other words, it is not robust to infer that a swift://
URI implies http://
.
An alternative way to create an object of the SwiftStorage
class would be to pass as an argument a Python dictionary with the required fields for authentication, i.e.:
- user name (or the value of
OS_USERNAME
if not specified) - password (or the value of
OS_PASSWORD
if not specified) - tenant name (or the value of
OS_TENANT_NAME
if not specified) - authentication URL (or the value of
OS_AUTH_URL
if not specified)
This mechanism would allow for accessing Butler repositories stored in several Swift instances (e.g. at NCSA and CC-IN2P3) for which the authentication information (auth url, user name, tenant name, password) is different.
self._version, \ | ||
self._tenantName, \ | ||
self._containerName = self._parseURI(uri) | ||
self._url = "http://" + os.path.join(self._url, self._version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment above. A Swift endpoint may speak http://
or https://
depending on how it is configured.
The object representing the connection. The object may or may not | ||
actually be connected. | ||
""" | ||
user = os.getenv('SWIFT_USERNAME') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above comment. I think OS_USERNAME
would be a better variable for compatibility purposes.
if user is None: | ||
raise RuntimeError( | ||
'SwiftStorage could not find SWIFT_USERNAME in environment') | ||
key = os.getenv('SWIFT_PASSWORD') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above comment. I think OS_PASSWORD
would be a better variable for compatibility purposes.
return storage.instanceSearch(path) if storage else None | ||
|
||
def copyFile(self, fromLocation, toLocation): | ||
"""Copy a file from one location to another on the local filesystem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the implementation of this method I would say that the purpose here is to copy one Swift object into another Swift object, therefore in the same container, not in the local filesystem. If that's the intention, this comment needs to be updated.
7ca1c3a
to
89875c5
Compare
When a dataset is downloaded from a remote resource, SwiftStorage puts the dataset into a NamedTemporaryFile, which will be deleted when the file object is deleted (or closed explicitly). This guarantees the file will exist as long as the object exists (in case the object refers internally to the file).
No description provided.