Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-20842: Change Formatter API #176

Merged
merged 13 commits into from
Aug 5, 2019
Merged
130 changes: 113 additions & 17 deletions python/lsst/daf/butler/core/formatter.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,13 +27,21 @@
from .configSupport import processLookupConfigs
from .mappingFactory import MappingFactory
from .utils import getFullTypeName
from .fileDescriptor import FileDescriptor

log = logging.getLogger(__name__)


class Formatter(metaclass=ABCMeta):
"""Interface for reading and writing Datasets with a particular
`StorageClass`.

Parameters
----------
fileDescriptor : `FileDescriptor`, optional
Identifies the file to read or write, and the associated storage
classes and parameter information. Its value can be `None` if the
caller will never call `Formatter.read` or `Formatter.write`.
"""

unsupportedParameters = frozenset()
Expand All @@ -42,21 +50,27 @@ class Formatter(metaclass=ABCMeta):
are supported.
"""

def __init__(self, fileDescriptor=None):
if fileDescriptor is not None and not isinstance(fileDescriptor, FileDescriptor):
raise TypeError("File descriptor must be a FileDescriptor")
self._fileDescriptor = fileDescriptor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this might as well be a public attribute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. It's a read-only property. There's a getter further down.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meaning I want to guarantee that people can't change the file location once the object is constructed.


@property
def fileDescriptor(self):
return self._fileDescriptor

@classmethod
def name(cls):
"""Returns the fully qualified name of the formatter.
"""
return getFullTypeName(cls)

@abstractmethod
def read(self, fileDescriptor, component=None):
def read(self, component=None):
"""Read a Dataset.

Parameters
----------
fileDescriptor : `FileDescriptor`
Identifies the file to read, type to read it into and parameters
to be used for reading.
component : `str`, optional
Component to read from the file. Only used if the `StorageClass`
for reading differed from the `StorageClass` used to write the
Expand All @@ -70,15 +84,13 @@ def read(self, fileDescriptor, component=None):
raise NotImplementedError("Type does not support reading")

@abstractmethod
def write(self, inMemoryDataset, fileDescriptor):
def write(self, inMemoryDataset):
"""Write a Dataset.

Parameters
----------
inMemoryDataset : `InMemoryDataset`
The Dataset to store.
fileDescriptor : `FileDescriptor`
Identifies the file to write.

Returns
-------
Expand All @@ -87,17 +99,38 @@ def write(self, inMemoryDataset, fileDescriptor):
"""
raise NotImplementedError("Type does not support writing")

@classmethod
@abstractmethod
def predictPath(self, location):
def predictPathFromLocation(cls, location):
"""Return the path that would be returned by write, without actually
writing.

Parameters
----------
location : `Location`
The location to simulate writing to.
Location of file for which path prediction is required.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gather this is a root filename, and what's returned is the full thing with the extension (or some generalization thereof)? This is mostly preexisting, but it's not currently clear from the docs what the distinction between input and output is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method doesn't care about input or output. It just wants a Location object which refers to a place in the datastore -- this method generally applies the right file suffix and then returns the filename from the Location.


Returns
-------
path : `str`
Path that would be returned by a call to `Formatter.write()`.
"""
raise NotImplementedError("Type does not support writing")

def segregateParameters(self, parameters):
def predictPath(self):
"""Return the path that would be returned by write, without actually
writing.

Uses the `FileDescriptor` associated with the instance.

Returns
-------
path : `str`
Path that would be returned by a call to `Formatter.write()`.
"""
return self.predictPathFromLocation(self.fileDescriptor.location)

def segregateParameters(self, parameters=None):
"""Segregate the supplied parameters into those understood by the
formatter and those not understood by the formatter.

Expand All @@ -106,9 +139,10 @@ def segregateParameters(self, parameters):

Parameters
----------
parameters : `dict`
parameters : `dict`, optional
Parameters with values that have been supplied by the caller
and which might be relevant for the formatter.
and which might be relevant for the formatter. If `None`
parameters will be read from the registered `FileDescriptor`.

Returns
-------
Expand All @@ -118,6 +152,9 @@ def segregateParameters(self, parameters):
Those parameters not supported by this formatter.
"""

if parameters is None:
parameters = self.fileDescriptor.parameters

if parameters is None:
return {}, {}

Expand Down Expand Up @@ -197,7 +234,57 @@ def getLookupKeys(self):
"""
return self._mappingFactory.getLookupKeys()

def getFormatterWithMatch(self, entity):
def getFormatterClassWithMatch(self, entity):
"""Get a the matching formatter class along with the matching registry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get a the?

key.

Parameters
----------
entity : `DatasetRef`, `DatasetType` or `StorageClass`, or `str`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either missing an or, or has one to many. Same in getFormatterClass below.

Entity to use to determine the formatter to return.
`StorageClass` will be used as a last resort if `DatasetRef`
or `DatasetType` instance is provided. Supports instrument
override if a `DatasetRef` is provided configured with an
``instrument`` value for the data ID.

Returns
-------
matchKey : `LookupKey`
The key that resulted in the successful match.
formatter : `type`
The class of the registered formatter.
"""
if isinstance(entity, str):
names = (entity,)
else:
names = entity._lookupNames()
matchKey, formatter = self._mappingFactory.getClassFromRegistryWithMatch(names)
log.debug("Retrieved formatter %s from key '%s' for entity '%s'", getFullTypeName(formatter),
matchKey, entity)

return matchKey, formatter

def getFormatterClass(self, entity):
"""Get the matching formatter class.

Parameters
----------
entity : `DatasetRef`, `DatasetType` or `StorageClass`, or `str`
Entity to use to determine the formatter to return.
`StorageClass` will be used as a last resort if `DatasetRef`
or `DatasetType` instance is provided. Supports instrument
override if a `DatasetRef` is provided configured with an
``instrument`` value for the data ID.

Returns
-------
formatter : `type`
The class of the registered formatter.
"""
_, formatter = self.getFormatterClassWithMatch(entity)
return formatter

def getFormatterWithMatch(self, entity, *args, **kwargs):
"""Get a new formatter instance along with the matching registry
key.

Expand All @@ -209,6 +296,10 @@ def getFormatterWithMatch(self, entity):
or `DatasetType` instance is provided. Supports instrument
override if a `DatasetRef` is provided configured with an
``instrument`` value for the data ID.
args : `tuple`
Positional arguments to use pass to the object constructor.
kwargs : `dict`
Keyword arguments to pass to object constructor.

Returns
-------
Expand All @@ -221,12 +312,13 @@ def getFormatterWithMatch(self, entity):
names = (entity,)
else:
names = entity._lookupNames()
matchKey, formatter = self._mappingFactory.getFromRegistryWithMatch(*names)
log.debug("Retrieved formatter from key '%s' for entity '%s'", matchKey, entity)
matchKey, formatter = self._mappingFactory.getFromRegistryWithMatch(names, *args, **kwargs)
log.debug("Retrieved formatter %s from key '%s' for entity '%s'", getFullTypeName(formatter),
matchKey, entity)

return matchKey, formatter

def getFormatter(self, entity):
def getFormatter(self, entity, *args, **kwargs):
"""Get a new formatter instance.

Parameters
Expand All @@ -237,13 +329,17 @@ def getFormatter(self, entity):
or `DatasetType` instance is provided. Supports instrument
override if a `DatasetRef` is provided configured with an
``instrument`` value for the data ID.
args : `tuple`
Positional arguments to use pass to the object constructor.
kwargs : `dict`
Keyword arguments to pass to object constructor.

Returns
-------
formatter : `Formatter`
An instance of the registered formatter.
"""
_, formatter = self.getFormatterWithMatch(entity)
_, formatter = self.getFormatterWithMatch(entity, *args, **kwargs)
return formatter

def registerFormatter(self, type_, formatter):
Expand Down
79 changes: 69 additions & 10 deletions python/lsst/daf/butler/core/mappingFactory.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@

__all__ = ("MappingFactory", )

from .utils import getInstanceOf
from .utils import getClassOf
from .configSupport import LookupKey


Expand Down Expand Up @@ -74,22 +74,22 @@ def getLookupKeys(self):
"""
return set(self._registry)

def getFromRegistryWithMatch(self, *targetClasses):
"""Get a new instance of the object stored in the registry along with
def getClassFromRegistryWithMatch(self, targetClasses):
"""Get the class stored in the registry along with
the matching key.

Parameters
----------
*targetClasses : `LookupKey`, `str` or objects with ``name`` attribute
targetClasses : `LookupKey`, `str` or objects with ``name`` attribute
Each item is tested in turn until a match is found in the registry.
Items with `None` value are skipped.

Returns
-------
matchKey : `LookupKey`
The key that resulted in the successful match.
instance : `object`
Instance of class stored in registry associated with the first
cls : `type`
Class stored in registry associated with the first
matching target class.

Raises
Expand All @@ -110,21 +110,80 @@ def getFromRegistryWithMatch(self, *targetClasses):
except KeyError:
pass
else:
return key, getInstanceOf(typeName)
return key, getClassOf(typeName)

# Convert list to a string for error reporting
msg = ", ".join(str(k) for k in attempts)
plural = "" if len(attempts) == 1 else "s"
raise KeyError(f"Unable to find item in registry with key{plural}: {msg}")

def getFromRegistry(self, *targetClasses):
def getClassFromRegistry(self, targetClasses, *args, **kwargs):
"""Get the matching class stored in the registry.

Parameters
----------
targetClasses : `LookupKey`, `str` or objects with ``name`` attribute
Each item is tested in turn until a match is found in the registry.
Items with `None` value are skipped.

Returns
-------
cls : `type`
Class stored in registry associated with the first
matching target class.

Raises
------
KeyError
Raised if none of the supplied target classes match an item in the
registry.
"""
_, cls = self.getClassFromRegistryWithMatch(targetClasses)
return cls

def getFromRegistryWithMatch(self, targetClasses, *args, **kwargs):
"""Get a new instance of the object stored in the registry along with
the matching key.

Parameters
----------
targetClasses : `LookupKey`, `str` or objects with ``name`` attribute
Each item is tested in turn until a match is found in the registry.
Items with `None` value are skipped.
args : `tuple`
Positional arguments to use pass to the object constructor.
kwargs : `dict`
Keyword arguments to pass to object constructor.

Returns
-------
matchKey : `LookupKey`
The key that resulted in the successful match.
instance : `object`
Instance of class stored in registry associated with the first
matching target class.

Raises
------
KeyError
Raised if none of the supplied target classes match an item in the
registry.
"""
key, cls = self.getClassFromRegistryWithMatch(targetClasses)
return key, cls(*args, **kwargs)

def getFromRegistry(self, targetClasses, *args, **kwargs):
"""Get a new instance of the object stored in the registry.

Parameters
----------
*targetClasses : `LookupKey`, `str` or objects with ``name`` attribute
targetClasses : `LookupKey`, `str` or objects with ``name`` attribute
Each item is tested in turn until a match is found in the registry.
Items with `None` value are skipped.
args : `tuple`
Positional arguments to use pass to the object constructor.
kwargs : `dict`
Keyword arguments to pass to object constructor.

Returns
-------
Expand All @@ -138,7 +197,7 @@ def getFromRegistry(self, *targetClasses):
Raised if none of the supplied target classes match an item in the
registry.
"""
_, instance = self.getFromRegistryWithMatch(*targetClasses)
_, instance = self.getFromRegistryWithMatch(targetClasses, *args, **kwargs)
return instance

def placeInRegistry(self, registryKey, typeName):
Expand Down