DM-20842: Change Formatter API #176

Merged
merged 13 commits into from Aug 5, 2019

Member

@timj timj commented Jul 31, 2019

The motivation for this is to simplify formatters so that they know they can store state in their instance without fear that a Datastore is going to reuse the Formatter for some other location.

  • This changes the read() and write() methods, which no longer need a FileDescriptor parameter.
  • All the formatters need to be tweaked.
  • Datastore needs changing to ensure that FileDescriptor is created a bit earlier.
  • getFormatter gets an extra argument which can be None for formatters that will not be using read/write at all.
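
A minimal sketch of the shape these bullets describe, assuming simplified stand-ins rather than the real daf_butler classes: the `path` field, the `TextFormatter` class, and its plain-text I/O are all hypothetical. The point is only that the file location is bound at construction, so `read()` and `write()` take no `FileDescriptor` argument.

```python
import os
import tempfile


class FileDescriptor:
    """Hypothetical stand-in: only records where the file lives."""

    def __init__(self, path):
        self.path = path


class TextFormatter:
    """Hypothetical formatter bound to one file for its whole lifetime."""

    def __init__(self, fileDescriptor):
        # None is allowed for formatters that will never read or write.
        if fileDescriptor is not None and not isinstance(fileDescriptor, FileDescriptor):
            raise TypeError("File descriptor must be a FileDescriptor")
        self._fileDescriptor = fileDescriptor

    def write(self, inMemoryDataset):
        # No FileDescriptor parameter: the location comes from the instance.
        with open(self._fileDescriptor.path, "w") as fh:
            fh.write(inMemoryDataset)

    def read(self):
        with open(self._fileDescriptor.path) as fh:
            return fh.read()


with tempfile.TemporaryDirectory() as tmpdir:
    formatter = TextFormatter(FileDescriptor(os.path.join(tmpdir, "example.txt")))
    formatter.write("hello")
    roundTripped = formatter.read()
```

Because each instance is tied to one location, a formatter could cache per-file state on `self` without another caller reusing it for a different file.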

@TallJimbo , @parejkoj does this look okay to you?

timj added 3 commits July 31, 2019 14:54
@timj timj requested review from TallJimbo and parejkoj July 31, 2019 22:01
Contributor

@parejkoj parejkoj left a comment

Just two comments, otherwise this looks fine.


Parameters
----------
fileDescriptor : `FileDescriptor`
Contributor

I'm not sure, but I think you still have to give the full namespace in order for this to become a link.

Member Author

I was under the impression that classes in the local package didn't need to be fully qualified.

@@ -45,15 +45,15 @@ def testRegistry(self):
         formatterTypeName = "lsst.daf.butler.formatters.fitsCatalogFormatter.FitsCatalogFormatter"
         storageClassName = "Image"
         self.factory.registerFormatter(storageClassName, formatterTypeName)
-        f = self.factory.getFormatter(storageClassName)
+        f = self.factory.getFormatter(storageClassName, None)
Contributor

Are all of these Nones really necessary here? Could you make wherever that is landing take a None default arg?

Member Author

I decided to make the FileDescriptor argument mandatory. That at least made it obvious where I needed to supply one during testing, and it is more explicit this way: passing None says you know that the formatter will never be used to read or write anything.

Member

@TallJimbo TallJimbo left a comment

👍 I'm looking forward to seeing how this improves the concrete raw formatters; I think they'll really benefit from this change.

def __init__(self, fileDescriptor):
    if fileDescriptor is not None and not isinstance(fileDescriptor, FileDescriptor):
        raise TypeError("File descriptor must be a FileDescriptor")
    self._fileDescriptor = fileDescriptor
Member

Seems like this might as well be a public attribute.

Member Author

No. It's a read-only property. There's a getter further down.

Member Author

Meaning I want to guarantee that people can't change the file location once the object is constructed.
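
A small sketch of the read-only pattern being described, assuming a simplified `Formatter` and a hypothetical `FileDescriptor` stand-in with a `path` field (neither is the real daf_butler class). A property with a getter and no setter rejects reassignment:

```python
class FileDescriptor:
    """Hypothetical stand-in for the real class."""

    def __init__(self, path):
        self.path = path


class Formatter:
    """Simplified formatter exposing its descriptor read-only."""

    def __init__(self, fileDescriptor):
        self._fileDescriptor = fileDescriptor

    @property
    def fileDescriptor(self):
        """Descriptor this formatter was constructed with (read-only)."""
        return self._fileDescriptor


formatter = Formatter(FileDescriptor("a.fits"))
try:
    # No setter is defined, so assignment raises AttributeError.
    formatter.fileDescriptor = FileDescriptor("b.fits")
    reassigned = True
except AttributeError:
    reassigned = False
```

The underscore-prefixed attribute is still reachable, so the protection here is by convention rather than enforcement.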

Member Author

timj commented Aug 2, 2019

Related pull requests in lsst/obs_subaru#217 and lsst/obs_base#165

I think I'm ready for a proper review. I rejigged things in the way suggested by @TallJimbo to allow for a Formatter.getFormatterClass method which makes the logic in datastore.ingest more explicit.

 location : `Location`
-    The location to simulate writing to.
+    Location of file for which path prediction is required.
Member

I gather this is a root filename, and what's returned is the full thing with the extension (or some generalization thereof)? This is mostly preexisting, but it's not currently clear from the docs what the distinction between input and output is.

Member Author

This method doesn't care about input or output. It just wants a Location object which refers to a place in the datastore -- this method generally applies the right file suffix and then returns the filename from the Location.
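
As an illustration of that behaviour only, here is a sketch under stated assumptions: the `Location` stand-in, the `extension` class attribute, and the `predictPath` method name are hypothetical and not taken from the code under review. The method swaps in the formatter's suffix and returns the resulting filename without touching any file:

```python
import os


class Location:
    """Hypothetical stand-in: a path relative to a datastore root."""

    def __init__(self, path):
        self.path = path


class JsonFormatter:
    """Hypothetical formatter that knows its own file suffix."""

    extension = ".json"  # assumed per-formatter suffix

    @classmethod
    def predictPath(cls, location):
        # Replace whatever suffix is present (if any) with this formatter's.
        root, _ = os.path.splitext(location.path)
        return root + cls.extension


predicted = JsonFormatter.predictPath(Location("raw/visit42/data"))
```

Nothing in the sketch distinguishes input from output; it only needs a location inside the datastore.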

    formatter = formatter.name()
else:
    raise ValueError(f"Supplied formatter '{formatter}' is not supported")
Member

What does "not supported" mean in this context? That it's not actually a Formatter?

def metadata(self):
    """The metadata read from this file. It will be stripped as
    components are extracted from it.
    """
Member

Docstring should include the return type.

Contributor

@parejkoj parejkoj left a comment

Couple of docstring comments, otherwise looks good.

@@ -197,7 +234,57 @@ def getLookupKeys(self):
         """
         return self._mappingFactory.getLookupKeys()
 
-    def getFormatterWithMatch(self, entity):
+    def getFormatterClassWithMatch(self, entity):
+        """Get a the matching formatter class along with the matching registry
Contributor

get a the?


Parameters
----------
entity : `DatasetRef`, `DatasetType` or `StorageClass`, or `str`
Contributor

Either missing an or, or has one too many. Same in getFormatterClass below.

@timj timj merged commit 92010a9 into master Aug 5, 2019
@timj timj deleted the tickets/DM-20842 branch August 5, 2019 16:40