Some tweaks from review
timj committed Sep 9, 2020
1 parent 7283127 commit 03852a3
Showing 2 changed files with 31 additions and 12 deletions.
39 changes: 27 additions & 12 deletions doc/lsst.daf.butler/datastores.rst
@@ -9,10 +9,21 @@ Datastore Configuration
A Butler `Datastore` is configured in the ``datastore`` section of the top-level Butler YAML configuration.
The only mandatory entry in the datastore configuration is the ``cls`` key.
This specifies the fully qualified class name of the Python class implementing the datastore.
The default is to use the `~datastores.posixDatastore.PosixDatastore`.
The default Butler configuration uses the `~datastores.posixDatastore.PosixDatastore`.
All other keys depend on the specific datastore class that is selected.

The default configuration values can be inspected at ``$DAF_BUTLER_DIR/config`` and current values can be obtained by calling ``butler config-dump`` on a butler repository.

.. note::

The default configuration values can be inspected at ``$DAF_BUTLER_DIR/python/lsst/daf/butler/configs`` (they can be accessed directly as Python package resources) and current values can be obtained by calling ``butler config-dump`` on a Butler repository.

The supported datastores are:

* :ref:`daf_butler-datastores-file` (POSIX and S3)
* :ref:`daf_butler-datastores-memory`
* :ref:`daf_butler-datastores-chain`

.. _daf_butler-datastores-file:

File-Based Datastores
=====================
@@ -21,25 +32,25 @@ The file-based datastores (for example `~datastores.posixDatastore.PosixDatastor

The supported configurations are:

root
**root**
The location of the "root" of the datastore "file system".
Usually the default value of ``<butlerRoot>/datastore`` can be left unchanged.
Here ``<butlerRoot>`` is a magic value that is replaced either with the location of the Butler configuration file or the top-level ``root`` as set in that ``butler.yaml`` configuration file.
records
**records**
This section defines the name of the registry table that should be used to hold details about datasets stored in the datastore (such as the path within the datastore and the associated formatter).
This only needs to be set if multiple datastores are to be used simultaneously within one Butler repository since the table names should not clash.
create
**create**
A Boolean to define whether an attempt should be made to initialize the datastore by creating the directory. Defaults to `True`.
templates
**templates**
The template to use to construct "files" within the datastore.
The template uses data dimensions to do this.
Generally the default setting will be usable although it can be tuned per `DatasetType`, `StorageClass` or data ID.
formatters
**formatters**
Mapping of `DatasetType`, `StorageClass` or data ID to a specific formatter class that understands the associated Python type and will serialize it to a file artifact.
The formatters section also supports the definitions of write recipes (bulk configurations that can be selected for specific formatters) and write parameters (parameters that control how the dataset is serialized; note it is required that all serialized artifacts be readable by a formatter without knowing which write parameters were used).
constraints
**constraints**
Specify `DatasetType`, `StorageClass` or data ID that will be accepted or rejected by this datastore.
composites
**composites**
Controls whether composite datasets are disassembled by the datastore.
By default composites are not disassembled.
Disassembly can be controlled by `DatasetType`, `StorageClass` or data ID.
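
As a rough illustration of the ``root`` entry described above, the following Python sketch shows one way the ``<butlerRoot>`` placeholder could be expanded. The helper ``resolve_root``, the dictionary layout, the fully qualified class name, and the ``/repo/main`` path are all assumptions made for this example; this is not the Butler implementation.

```python
# Minimal sketch of "<butlerRoot>" expansion. All names are hypothetical.
BUTLER_ROOT_TAG = "<butlerRoot>"


def resolve_root(configured_root: str, butler_root: str) -> str:
    """Replace the placeholder with the location of the Butler repository."""
    return configured_root.replace(BUTLER_ROOT_TAG, butler_root.rstrip("/"))


# A plain dict standing in for the ``datastore`` section of the YAML file.
datastore_config = {
    # Assumed fully qualified name of the class referenced in the text.
    "cls": "lsst.daf.butler.datastores.posixDatastore.PosixDatastore",
    "root": "<butlerRoot>/datastore",  # the documented default value
    "create": True,
}

resolved = resolve_root(datastore_config["root"], "/repo/main")
print(resolved)
```

A root that does not contain the placeholder is returned unchanged, so an explicit absolute path would pass through this helper untouched.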
@@ -61,6 +72,8 @@ The order is:
For example ``instrument+physical_filter+visit`` would match any `DatasetType` that uses those three dimensions.
#. The final match is against the `StorageClass` name.
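
The "first match wins" behaviour of this ordering can be sketched in Python. This is a simplified illustration only: the functions ``dimension_key`` and ``first_match``, the example dataset type and storage class names, and the template strings are hypothetical, and the earlier steps of the order are represented just by whatever keys the caller places first in the candidate list.

```python
# Simplified sketch of ordered configuration lookup. All names are hypothetical.
from typing import Mapping, Optional, Sequence


def dimension_key(dimensions: Sequence[str]) -> str:
    """Build a key such as "instrument+physical_filter+visit"."""
    return "+".join(dimensions)


def first_match(config: Mapping[str, str], candidates: Sequence[str]) -> Optional[str]:
    """Return the value for the first candidate key found in the config."""
    for key in candidates:
        if key in config:
            return config[key]
    return None


# Example section keyed by a dimension set and by a storage class name.
templates = {
    "instrument+physical_filter+visit": "template-for-dimensions",
    "ExposureF": "template-for-storage-class",
}

# Candidates are tried in order; the storage class name is the final fallback.
candidates = [
    "someDatasetType",  # an earlier, more specific key with no entry here
    dimension_key(["instrument", "physical_filter", "visit"]),
    "ExposureF",
]
chosen = first_match(templates, candidates)
print(chosen)
```

Here the dimension-based entry is selected because it appears earlier in the candidate list than the storage class name.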

.. _daf_butler-datastores-memory:

In-Memory Datastore
===================

@@ -69,10 +82,12 @@ This allows the datastore to accept specific dataset types.

In the future more features will be added to allow some form of cache expiry.

ChainedDatastore
================
.. _daf_butler-datastores-chain:

Chained Datastores
==================

This datastore enables multiple other datastores to be combined into one.
The `~datastores.chainedDatastore.ChainedDatastore` datastore enables multiple other datastores to be combined into one.
The dataset will be sent to every datastore in the chain and success is reported if any of the datastores accepts the dataset.
When a dataset is retrieved each datastore is asked for the dataset in turn and the first match is sufficient.
This allows an in-memory datastore to be combined with a file-based datastore to enable simple in-memory retrieval for a dataset that has been persisted to disk.
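
The put and get semantics described above can be sketched with toy classes. This is an illustration of the chaining idea only: ``ToyDatastore`` and ``ToyChain`` are hypothetical stand-ins and do not reflect the real `~datastores.chainedDatastore.ChainedDatastore` API.

```python
# Toy model of a chained datastore. Hypothetical classes, not the Butler API.
class ToyDatastore:
    def __init__(self, name, accepted_types=None):
        self.name = name
        # None means "accept every dataset type" (no constraints configured).
        self.accepted_types = accepted_types
        self._store = {}

    def put(self, dataset_type, ref, obj):
        """Store the object and return True, or return False if rejected."""
        if self.accepted_types is not None and dataset_type not in self.accepted_types:
            return False
        self._store[ref] = obj
        return True

    def get(self, ref):
        """Return the stored object; raises KeyError if it is not held here."""
        return self._store[ref]


class ToyChain:
    def __init__(self, datastores):
        self.datastores = list(datastores)

    def put(self, dataset_type, ref, obj):
        """Offer the dataset to every child; succeed if at least one accepts."""
        accepted = [d.name for d in self.datastores if d.put(dataset_type, ref, obj)]
        if not accepted:
            raise RuntimeError(f"No datastore accepted dataset {ref!r}")
        return accepted

    def get(self, ref):
        """Ask each child in turn and return the first match."""
        for d in self.datastores:
            try:
                return d.get(ref)
            except KeyError:
                continue
        raise KeyError(ref)


memory = ToyDatastore("memory", accepted_types={"metric"})
disk = ToyDatastore("file")
chain = ToyChain([memory, disk])

image_targets = chain.put("image", "ref-1", "pixels")
metric_targets = chain.put("metric", "ref-2", 42)
print(image_targets, metric_targets, chain.get("ref-1"), chain.get("ref-2"))
```

Placing the in-memory store first means a dataset it accepted is served from memory on retrieval, while anything it rejected is still found in the file-based store.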
4 changes: 4 additions & 0 deletions doc/lsst.daf.butler/formatters.rst
@@ -16,6 +16,10 @@ Assemblers are used to disassemble and reassemble composite datasets and can als

Deciding which formatter or assembler to use is controlled by the storage class and corresponding dataset type.

.. note::

When discussing configuration below, the default configuration values can be inspected at ``$DAF_BUTLER_DIR/python/lsst/daf/butler/configs`` (they can be accessed directly as Python package resources) and current values can be obtained by calling ``butler config-dump`` on a Butler repository.

Storage Classes
===============
