Merge pull request #222 from lsst/tickets/DM-23063
DM-23063: Allow checksum calculation to be disabled
timj committed Jan 17, 2020
2 parents a6c9b55 + 3e15f51 commit dbfd732
Showing 5 changed files with 44 additions and 5 deletions.
8 changes: 4 additions & 4 deletions doc/lsst.daf.butler/configuring.rst
@@ -17,9 +17,9 @@ There are additional search paths that can be included when a config object is c
To construct a Butler configuration object (`~lsst.daf.butler.ButlerConfig`) from a file the following happens:

* The supplied config is read in.
-* If any leaf nodes in the configuration end in ``configIncludes`` the values (either a scalar or list) will be treated as the names of other config files.
+* If any leaf nodes in the configuration end in ``includeConfigs`` the values (either a scalar or list) will be treated as the names of other config files.
These files will be located either as an absolute path or relative to the current working directory, or the directory in which the original configuration file was found.
-The contents of these files will then be inserted into the configuration at the same hierarchy as the ``configIncludes`` directive, with priority given to the values defined explicitly in the parent configuration (for lists of include files later files overwrite content from earlier ones).
+The contents of these files will then be inserted into the configuration at the same hierarchy as the ``includeConfigs`` directive, with priority given to the values defined explicitly in the parent configuration (for lists of include files later files overwrite content from earlier ones).
* Each sub configuration class is constructed by supplying the relevant subset of the global config to the component Config constructor.
* A search path is constructed by concatenating the supplied search path, the environment variable path (``$DAF_BUTLER_CONFIG_PATH``), and the daf_butler config directory (``$DAF_BUTLER_DIR/config``).
* Defaults are first read from the config class default file name (e.g., ``registry.yaml`` for `~lsst.daf.butler.Registry`, and ``datastore.yaml`` for `~lsst.daf.butler.Datastore`) and merged in priority order given in the search path.
@@ -34,7 +34,7 @@ The name of the specialist configuration file to search for can be found by look

We also have a YAML parser extension ``!include`` that can be used to pull in other YAML files before the butler specific config parsing happens.
This is very useful to allow reuse of YAML snippets but be aware that the path specified is relative to the file that contains the directive.
-In many cases ``configIncludes`` is a more robust approach to file inclusion as it handles overrides in a more predictable manner.
+In many cases ``includeConfigs`` is a more robust approach to file inclusion as it handles overrides in a more predictable manner.

There is a command available to allow you to see how all these overrides and includes behave.

@@ -51,5 +51,5 @@ In addition to the configuration options described above, there are some values
For `~lsst.daf.butler.RegistryConfig` and `~lsst.daf.butler.DatastoreConfig` the ``root`` key, which can be used to specify paths, can include values using the special tag ``<butlerRoot>``.
At run time, this tag will be replaced by a value derived from the location of the main butler configuration file, or else from the value of the ``root`` key found at the top of the butler configuration.

-Currently, if you create a butler configuration file that loads another butler configuration file, via ``configIncludes``, then any ``<butlerRoot>`` tags will be replaced with the location of the new file, not the original.
+Currently, if you create a butler configuration file that loads another butler configuration file, via ``includeConfigs``, then any ``<butlerRoot>`` tags will be replaced with the location of the new file, not the original.
It is therefore recommended that an explicit ``root`` be defined at the top level when defining butler overrides via a new top level butler configuration.
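
The precedence rules described in this documentation (later include files overwrite earlier ones, and values set explicitly in the parent configuration win over anything included) can be illustrated with a minimal sketch. This is not the daf_butler implementation: `deep_merge` and `resolve_includes` are hypothetical helpers, the include files are assumed to be already parsed into dictionaries, the directive is handled only at the top level, and the contents given for `posixDatastore.yaml` are invented for illustration.

```python
from typing import Any, Dict, List


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from ``override`` win over ``base``."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_includes(parent: Dict[str, Any], included: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge already-parsed include files, then apply the parent's explicit values.

    ``included`` stands in for the parsed contents of the files named by
    ``includeConfigs``; locating and reading them is omitted (assumption).
    """
    result: Dict[str, Any] = {}
    for content in included:  # later include files overwrite earlier ones
        result = deep_merge(result, content)
    explicit = {k: v for k, v in parent.items() if k != "includeConfigs"}
    return deep_merge(result, explicit)  # explicit parent values take priority


# Hypothetical file contents, for illustration only.
posix = {"datastore": {"root": "<butlerRoot>", "checksum": True}}
parent = {"includeConfigs": "posixDatastore.yaml", "datastore": {"checksum": False}}

resolved = resolve_includes(parent, [posix])
print(resolved)
```

In this sketch the parent overrides only `checksum`, while `root` is inherited from the included file.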
3 changes: 3 additions & 0 deletions python/lsst/daf/butler/datastores/fileLikeDatastore.py
@@ -218,6 +218,9 @@ def __init__(self, config, registry, butlerRoot=None):
         self._tableName = self.config["records", "table"]
         registry.registerOpaqueTable(self._tableName, self.makeTableSpec())

+        # Determine whether checksums should be used
+        self.useChecksum = self.config.get("checksum", True)
+
     def __str__(self):
         return self.root

5 changes: 4 additions & 1 deletion python/lsst/daf/butler/datastores/posixDatastore.py
@@ -263,7 +263,10 @@ def _extractIngestInfo(self, path: str, ref: DatasetRef, *, formatter: Type[Form
                 raise NotImplementedError("Transfer type '{}' not supported.".format(transfer))
             path = newPath
             fullPath = newFullPath
-        checksum = self.computeChecksum(fullPath)
+        if self.useChecksum:
+            checksum = self.computeChecksum(fullPath)
+        else:
+            checksum = None
         stat = os.stat(fullPath)
         size = stat.st_size
         return StoredFileInfo(formatter=formatter, path=path, storageClass=ref.datasetType.storageClass,
3 changes: 3 additions & 0 deletions tests/config/basic/posixDatastoreNoChecksums.yaml
@@ -0,0 +1,3 @@
+includeConfigs: posixDatastore.yaml
+datastore:
+  checksum: false
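
As a rough, self-contained illustration of the behaviour this commit introduces, the sketch below reads a `checksum` flag that defaults to true and skips hashing when the flag is false. It does not use daf_butler: `compute_checksum` and `extract_file_info` are hypothetical stand-ins, the result is a plain dictionary rather than a `StoredFileInfo`, and the hash algorithm (`blake2b`) is an assumption, since the body of `computeChecksum` is not shown in this diff.

```python
import hashlib
import os
import tempfile
from typing import Any, Dict, Optional


def compute_checksum(path: str, algorithm: str = "blake2b", block_size: int = 8192) -> str:
    """Hash a file block by block (the algorithm choice is an assumption)."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(block_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def extract_file_info(path: str, config: Dict[str, Any]) -> Dict[str, Any]:
    """Gather the file size and, unless disabled, a checksum for ``path``."""
    use_checksum = config.get("checksum", True)  # a missing key keeps checksums on
    checksum: Optional[str] = compute_checksum(path) if use_checksum else None
    return {"path": path, "file_size": os.stat(path).st_size, "checksum": checksum}


# Write a small temporary file so the example runs anywhere.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"example payload")
    tmp_path = handle.name

try:
    default_info = extract_file_info(tmp_path, {})
    disabled_info = extract_file_info(tmp_path, {"checksum": False})
finally:
    os.remove(tmp_path)

print(default_info["checksum"] is not None, disabled_info["checksum"])
```

Storing `None` rather than omitting the field keeps the record shape identical in both modes, which is also what the new test in this commit checks for.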
30 changes: 30 additions & 0 deletions tests/test_datastore.py
@@ -465,6 +465,36 @@ def setUp(self):
         super().setUp()


+class PosixDatastoreNoChecksumsTestCase(PosixDatastoreTestCase):
+    """Posix datastore tests but with checksums disabled."""
+    configFile = os.path.join(TESTDIR, "config/basic/posixDatastoreNoChecksums.yaml")
+
+    def testChecksum(self):
+        """Ensure that checksums have not been calculated."""
+
+        datastore = self.makeDatastore()
+        storageClass = self.storageClassFactory.getStorageClass("StructuredData")
+        dimensions = self.universe.extract(("visit", "physical_filter"))
+        metrics = makeExampleMetrics()
+
+        dataId = {"instrument": "dummy", "visit": 0, "physical_filter": "V"}
+        ref = self.makeDatasetRef("metric", dimensions, storageClass, dataId,
+                                  conform=False)
+
+        # Configuration should have disabled checksum calculation
+        datastore.put(metrics, ref)
+        info = datastore.getStoredItemInfo(ref)
+        self.assertIsNone(info.checksum)
+
+        # Remove put back but with checksums enabled explicitly
+        datastore.remove(ref)
+        datastore.useChecksum = True
+        datastore.put(metrics, ref)
+
+        info = datastore.getStoredItemInfo(ref)
+        self.assertIsNotNone(info.checksum)
+
+
 class CleanupPosixDatastoreTestCase(DatastoreTestsBase, unittest.TestCase):
     configFile = os.path.join(TESTDIR, "config/basic/butler.yaml")
