New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-20842: Change Formatter API #176
Conversation
The motivation for this is to simplify formatters so that they know they can store state in their instance without fear that a Datastore is going to reuse the Formatter for some other location. * This changes the read() and write() methods since they no longer need FileDescriptor parameter. * All the formatters need to be tweaked. * Datastore needs changing to ensure that FileDescriptor is created a bit earlier. * getFormatter gets an extra argument which can be None for formatters that will not be using read/write at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just two comments, otherwise this looks fine.
|
||
Parameters | ||
---------- | ||
fileDescriptor : `FileDescriptor` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, but I think you still have to give the full namespace in order for this to become a link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was under the impression that classes in the local package didn't need to be fully qualified.
tests/test_formatter.py
Outdated
@@ -45,15 +45,15 @@ def testRegistry(self): | |||
formatterTypeName = "lsst.daf.butler.formatters.fitsCatalogFormatter.FitsCatalogFormatter" | |||
storageClassName = "Image" | |||
self.factory.registerFormatter(storageClassName, formatterTypeName) | |||
f = self.factory.getFormatter(storageClassName) | |||
f = self.factory.getFormatter(storageClassName, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of these None
s really necessary here? Could you make wherever that is landing take a None
default arg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to force FileDescriptor to take an argument. This at least made it obvious where I needed to put an argument during testing. It's more explicit this way that you are saying you know that the formatter will never be used to read or write anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I'm looking forward to seeing how this improves the concrete raw formatters; I think they'll really benefit from this change.
def __init__(self, fileDescriptor): | ||
if fileDescriptor is not None and not isinstance(fileDescriptor, FileDescriptor): | ||
raise TypeError("File descriptor must be a FileDescriptor") | ||
self._fileDescriptor = fileDescriptor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this might as well be a public attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. It's a read-only property. There's a getter further down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meaning I want to guarantee that people can't change the file location once the object is constructed.
Instance retrieval is now layered on top of class retrieval.
This can be a class method and so be distinct from the instance methoer predictPath
Related pull requests in lsst/obs_subaru#217 and lsst/obs_base#165 I think I'm ready for a proper review. I rejigged things in the way suggested by @TallJimbo to allow for a |
location : `Location` | ||
The location to simulate writing to. | ||
Location of file for which path prediction is required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gather this is a root filename, and what's returned is the full thing with the extension (or some generalization thereof)? This is mostly preexisting, but it's not currently clear from the docs what the distinction between input and output is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't care about input or output. It just wants a Location
object which refers to a place in the datastore -- this method generally applies the right file suffix and then returns the filename from the Location.
formatter = formatter.name() | ||
else: | ||
raise ValueError(f"Supplied formatter '{formatter}' is not supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "not supported" mean in this context? That it's not actually a Formatter
?
def metadata(self): | ||
"""The metadata read from this file. It will be stripped as | ||
components are extracted from it. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring should include the return type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of docstring comments, otherwise looks good.
@@ -197,7 +234,57 @@ def getLookupKeys(self): | |||
""" | |||
return self._mappingFactory.getLookupKeys() | |||
|
|||
def getFormatterWithMatch(self, entity): | |||
def getFormatterClassWithMatch(self, entity): | |||
"""Get a the matching formatter class along with the matching registry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get a the?
|
||
Parameters | ||
---------- | ||
entity : `DatasetRef`, `DatasetType` or `StorageClass`, or `str` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either missing an or
, or has one to many. Same in getFormatterClass
below.
def metadata(self): | ||
"""The metadata read from this file. It will be stripped as | ||
components are extracted from it. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The motivation for this is to simplify formatters so that they know they can store state in their instance without fear that a Datastore is going to reuse the Formatter for some other location.
@TallJimbo , @parejkoj does this look okay to you?