Minor docstring fixes
timj authored and TallJimbo committed Mar 23, 2020
1 parent b64a406 commit 612da1e
Showing 7 changed files with 65 additions and 53 deletions.
36 changes: 20 additions & 16 deletions doc/lsst.daf.butler/concreteStorageClasses.rst
@@ -21,19 +21,23 @@ The loaded columns are the product of the values for all levels.
Levels not included in the dict are included in their entirety.

For example, the ``deepCoadd_obj`` dataset is typically defined as a hierarchical table with levels ``dataset``, ``filter``, and ``column``, which take values such as ``("meas", "HSC-R", "base_SdssShape_xx")``.
Retrieving this dataset via::

    butler.get(
        "deepCoadd_obj", ...,
        parameters={
            "columns": {"dataset": "meas",
                        "filter": ["HSC-R", "HSC-I"],
                        "column": ["base_SdssShape_xx", "base_SdssShape_yy"]}
        }
    )

is equivalent to (but potentially much more efficient than)::

    full = butler.get("deepCoadd_obj", ...)
    full.loc[:, ["meas", ["HSC-R", "HSC-I"],
                 ["base_SdssShape_xx", "base_SdssShape_yy"]]]
Retrieving this dataset via:

.. code-block:: python

    butler.get(
        "deepCoadd_obj", ...,
        parameters={
            "columns": {"dataset": "meas",
                        "filter": ["HSC-R", "HSC-I"],
                        "column": ["base_SdssShape_xx", "base_SdssShape_yy"]}
        }
    )

is equivalent to (but potentially much more efficient than):

.. code-block:: python

    full = butler.get("deepCoadd_obj", ...)
    full.loc[:, ["meas", ["HSC-R", "HSC-I"],
                 ["base_SdssShape_xx", "base_SdssShape_yy"]]]
2 changes: 1 addition & 1 deletion doc/lsst.daf.butler/configuring.rst
@@ -49,7 +49,7 @@ Overriding Root Paths
---------------------

In addition to the configuration options described above, there are some values that have a special meaning.
For `~lsst.daf.butler.RegistryConfig` and `~lsst.daf.butler.DatastoreConfig` the ``root`` key, which can be used to specify paths, can include values using the special tag ``<butlerRoot>``.
For `~lsst.daf.butler.registry.RegistryConfig` and `~lsst.daf.butler.DatastoreConfig` the ``root`` key, which can be used to specify paths, can include values using the special tag ``<butlerRoot>``.
At run time, this tag will be replaced by a value derived from the location of the main butler configuration file, or else from the value of the ``root`` key found at the top of the butler configuration.

Currently, if you create a butler configuration file that loads another butler configuration file, via ``includeConfigs``, then any ``<butlerRoot>`` tags will be replaced with the location of the new file, not the original.
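
As a concrete illustration of the substitution (a minimal sketch only; the helper below is hypothetical and not the butler's own implementation), a ``root`` value such as ``<butlerRoot>/datastore`` is rewritten relative to the location of the butler configuration file:

.. code-block:: python

    import os

    def expand_butler_root(path: str, butler_config_file: str) -> str:
        """Sketch of <butlerRoot> substitution (hypothetical helper)."""
        butler_root = os.path.dirname(os.path.abspath(butler_config_file))
        return path.replace("<butlerRoot>", butler_root)

    # "<butlerRoot>/datastore" becomes "/repo/main/datastore"
    print(expand_butler_root("<butlerRoot>/datastore", "/repo/main/butler.yaml"))
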
8 changes: 4 additions & 4 deletions doc/lsst.daf.butler/dimensions.rst
@@ -9,7 +9,7 @@ In the `Registry` database, most dimensions are associated with a table that con
Examples of dimensions include instruments, detectors, visits, and tracts.

Instances of the `Dimension` class represent one of these concepts, not values of the type of one of those concepts (e.g. "detector", not a particular detector).
In fact, a dimension "value" can mean different things in different contexts: it could mean the value of the primary key or other unique identifier for particular entity (the integer ID or string name for a particular detector), or it could represent a complete record in the table for that dimension.
In fact, a dimension "value" can mean different things in different contexts: it could mean the value of the primary key or other unique identifier for a particular entity (the integer ID or string name for a particular detector), or it could represent a complete record in the table for that dimension.

The dimensions schema also has some tables that do not map directly to `Dimension` instances.
Some of these provide extra metadata fields for combinations of dimensions, and are represented by the `DimensionElement` class in Python (this is also the base class of the `Dimension` class, and provides much of its functionality).
@@ -36,7 +36,7 @@ It also categorizes those dimensions into `~DimensionGraph.required` and `~Dimen
`DimensionGraph` also guarantees a deterministic and topological sort order for its elements.

Because `Dimension` instances have a `~Dimension.name` attribute, we typically
use `NamedValueSet` and `NamedKeyDict` as containers when immutability is needed or the guarantees of `DimensionGraph`.
use `~lsst.daf.butler.core.utils.NamedValueSet` and `~lsst.daf.butler.core.utils.NamedKeyDict` as containers when immutability is needed or the guarantees of `DimensionGraph`.
This allows the string names of dimensions to be used as well in most places where `Dimension` instances are expected.

The complete set of all compatible dimensions is held by a special subclass of `DimensionGraph`, `DimensionUniverse`.
@@ -57,7 +57,7 @@ Most `Butler` and `Registry` APIs that accept data IDs as input accept both dict

The data IDs returned by the `Butler` or `Registry` (and most of those used internally) are usually instances of the `DataCoordinate` class or its subclass, `ExpandedDataCoordinate`.
`DataCoordinate` itself is complete but minimal.
It contains only the keys that correspond to its `DimensionGraph`'s `~DimensionGraph.required` subset - that is, the minimal set of keys needed to fully identify all other dimensions in the graph.
It contains only the keys that correspond to its `DimensionGraph`'s `~DimensionGraph.required` subset --- that is, the minimal set of keys needed to fully identify all other dimensions in the graph.
Informal dictionary data IDs can be transformed into `DataCoordinate` instances by calling `DataCoordinate.standardize` (which is what most `Butler` and `Registry` APIs that accept data IDs do under the hood).
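
A hedged sketch of that standardization step (the repository path, instrument name, and detector ID below are purely illustrative):

.. code-block:: python

    from lsst.daf.butler import Butler, DataCoordinate

    butler = Butler("REPO_PATH")  # hypothetical repository path
    dataId = DataCoordinate.standardize(
        {"instrument": "HSC", "detector": 50},  # illustrative informal data ID
        universe=butler.registry.dimensions,
    )
    print(dataId.graph.required)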

`ExpandedDataCoordinate` is its maximal counterpart.
@@ -70,7 +70,7 @@ Spatial and Temporal Dimensions
-------------------------------

Dimensions can be *spatial* or *temporal* (or both, or neither), meaning that each record is associated with a region on the sky or a timespan (respectively).
The overlaps between regions and timespans define many-to-many relationships between dimensions that -- along with the one-to-many ID-based dependencies -- generally provide a way to fully relate any set of dimensions.
The overlaps between regions and timespans define many-to-many relationships between dimensions that --- along with the one-to-many ID-based dependencies --- generally provide a way to fully relate any set of dimensions.
This produces a natural, concise query system; dimension relationships can be used to construct the full ``JOIN`` clause of a SQL ``SELECT`` with no input from the user, allowing them to specify just the ``WHERE`` clause (see `Registry.queryDimensions` and `Registry.queryDatasets`).
It is also possible to associate a region or timespan with a combination of dimensions (such as the region for a visit and a detector), by defining a `DimensionElement` for that combination.
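
For example (a sketch only; the skymap name and dimension values are illustrative, and an existing ``registry`` is assumed), a query over visit and detector can be constrained by tract and patch purely through the spatial-join machinery:

.. code-block:: python

    # The registry supplies all of the JOINs; only the WHERE expression is given.
    for dataId in registry.queryDimensions(
        ["visit", "detector"],
        where="skymap = 'hsc_rings_v1' AND tract = 9615 AND patch = 21",
    ):
        print(dataId)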

12 changes: 6 additions & 6 deletions doc/lsst.daf.butler/organizing.rst
@@ -16,9 +16,9 @@ Most of the time, however, users identify a dataset using a combination of three
- a data ID;
- a collection.

Most collections are constrained to contain only on dataset with a particular dataset type and data ID, so this combination is usually enough to resolve a dataset (see :ref:`daf_butler_collections` for exceptions).
Most collections are constrained to contain only one dataset with a particular dataset type and data ID, so this combination is usually enough to resolve a dataset (see :ref:`daf_butler_collections` for exceptions).

A dataset's type and data ID are intrinsic to it - while there may be many datasets with a particular dataset type and/or data ID, the dataset type and data ID associated with a dataset are set and fixed when it is created.
A dataset's type and data ID are intrinsic to it --- while there may be many datasets with a particular dataset type and/or data ID, the dataset type and data ID associated with a dataset are set and fixed when it is created.
A `DatasetRef` always has both a dataset type attribute and a data ID, though the latter may be empty.
Dataset types are discussed below in :ref:`daf_butler_dataset_types`, while data IDs are one aspect of the larger :ref:`Dimensions <lsst.daf.butler-dimensions_overview>` system and are discussed in :ref:`lsst.daf.butler-dimensions_data_ids`.
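
A hedged sketch of resolving a dataset from those three pieces of information (the dataset type, data ID values, and collection name are all illustrative):

.. code-block:: python

    from lsst.daf.butler import Butler

    butler = Butler("REPO_PATH")  # hypothetical repository path
    ref = butler.registry.findDataset(
        "calexp",                                                # dataset type
        {"instrument": "HSC", "visit": 903334, "detector": 50},  # data ID
        collections="HSC/runs/example",                          # collection
    )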

@@ -31,14 +31,14 @@ Collections are discussed further below in :ref:`daf_butler_collections`.
Dataset types
-------------

The names "dataset" and "dataset type" (which `lsst.daf.butler` inherits from its `lsst.daf.persistence` predecessor) are intended to evoke the relationship between an instance and its class in object-oriented programming, but this is a metaphor, *not* a relationship that maps to any particular Python objects: we don't have any Python class that fully represents the *dataset* concept (`DatasetRef` is the closest), and the `DatasetType` class is a regular class, not a metaclass.
The names "dataset" and "dataset type" (which ``daf_butler`` inherits from its ``daf_persistence`` predecessor) are intended to evoke the relationship between an instance and its class in object-oriented programming, but this is a metaphor, *not* a relationship that maps to any particular Python objects: we don't have any Python class that fully represents the *dataset* concept (`DatasetRef` is the closest), and the `DatasetType` class is a regular class, not a metaclass.
So a *dataset type* is represented in Python as a `DatasetType` *instance*.

A dataset type defines both the dimensions used in a dataset's data ID (so all data IDs for a particular dataset type have the same keys, at least when put in standard form) and the storage class that corresponds to its in-memory Python type and maps to the file format (or generalization thereof) used by a `Datastore` to store it.
These are associated with an arbitrary string name.

Beyond that definition, what a dataset type *means* isn't really specififed by the butler itself, but we expect higher-level code that *uses* butler to make that clear, and one anticipates case is worth calling out here: a dataset type roughly corresponds to the role its datasets play in a processing pipeline.
In other words, a particular pipeline will typically accept particular dataset types as inputs and produce particular dataset types as outputs (and may produce and consumed other dataset types as intermediates).
Beyond that definition, what a dataset type *means* isn't really specified by the butler itself, but we expect higher-level code that *uses* butler to make that clear, and one anticipated case is worth calling out here: a dataset type roughly corresponds to the role its datasets play in a processing pipeline.
In other words, a particular pipeline will typically accept particular dataset types as inputs and produce particular dataset types as outputs (and may produce and consume other dataset types as intermediates).
And while the exact dataset types used may be configurable, changing a dataset type will generally involve substituting one dataset type for a very similar one (most of the time with the same dimensions and storage class).
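
As a sketch (the dataset type name, dimension names, and storage class below are assumptions that would need to exist in the repository's dimension universe and storage class configuration), a dataset type can be constructed directly:

.. code-block:: python

    from lsst.daf.butler import Butler, DatasetType

    butler = Butler("REPO_PATH")  # hypothetical repository path
    datasetType = DatasetType(
        "deepCoadd_calexp",                                           # arbitrary string name
        dimensions=["skymap", "tract", "patch", "abstract_filter"],   # data ID keys
        storageClass="ExposureF",                                     # in-memory type / file format mapping
        universe=butler.registry.dimensions,
    )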

.. _daf_butler_collections:
@@ -73,7 +73,7 @@ Tagged Collections
`CollectionType.TAGGED` collections are the most flexible type of collection; datasets can be `associated <Registry.associate>` with or `disassociated <Registry.disassociate>` from a ``TAGGED`` collection at any time, as long as the usual constraint on a collection having only one dataset with a particular dataset type and data ID is maintained.
Membership in a ``TAGGED`` collection is implemented in the `Registry` database as a single row in a many-to-many join table (a "tag") and is completely decoupled from the actual storage of the dataset.

Tags are thus both extremely lightweight relative to copies or re-ingests of files or other `Datastore` content, and *slightly** more expensive to store and possibly query than than the ``RUN`` or ``CHAINED`` collection representations (which have no per-dataset costs).
Tags are thus both extremely lightweight relative to copies or re-ingests of files or other `Datastore` content, and *slightly* more expensive to store and possibly query than the ``RUN`` or ``CHAINED`` collection representations (which have no per-dataset costs).
The latter is rarely important, but higher-level code should avoid automatically creating ``TAGGED`` collections that may not ever be used.
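
A minimal sketch of tagging (the collection name is illustrative, the ``TAGGED`` collection is assumed to already exist, and ``refs`` stands for resolved `DatasetRef` objects obtained elsewhere, e.g. from `Registry.queryDatasets`):

.. code-block:: python

    # Add the datasets to the tagged collection, then remove them again.
    registry.associate("u/someone/good-calexps", refs)
    registry.disassociate("u/someone/good-calexps", refs)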

Chained Collection
40 changes: 24 additions & 16 deletions doc/lsst.daf.butler/queries.rst
@@ -21,7 +21,7 @@ Arguments that specify one or more dataset types can generally take any of the f
- `str` values (corresponding to `DatasetType.name`);
- `re.Pattern` values (matched to `DatasetType.name` strings, via `~re.Pattern.fullmatch`);
- iterables of any of the above;
- the special value `...`, which matches all dataset types.
- the special value "``...``", which matches all dataset types.

Some of these are not allowed in certain contexts (as documented there).
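
For instance (a hedged sketch; the dataset type names are illustrative and an existing ``registry`` is assumed), each of the accepted forms can be passed to `Registry.queryDatasets`:

.. code-block:: python

    import re

    registry.queryDatasets("calexp", collections=...)                     # plain name
    registry.queryDatasets(re.compile(r"deepCoadd_.*"), collections=...)  # regular expression
    registry.queryDatasets(["calexp", "deepCoadd_obj"], collections=...)  # iterable of names
    registry.queryDatasets(..., collections=...)                          # all dataset types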

@@ -36,11 +36,11 @@ Arguments that specify one or more collections are similar to those for dataset
- `re.Pattern` values (matched to the collection name, via `~re.Pattern.fullmatch`);
- a `tuple` of (`str`, *dataset-type-restriction*) - see below;
- iterables of any of the above;
- the special value `...`, which matches all collections;
- the special value "``...``", which matches all collections;
- a mapping from `str` to *dataset-type-restriction*.

A *dataset-type-restriction* is a :ref:`DatasetType expression <daf_butler_dataset_type_expressions>` that limits a search for datasets in the associated collection to just the specified dataset types.
Unlike most other DatasetType expressions, it may not contain regular expressions (but it may be `...`, which is the implied value when no
Unlike most other DatasetType expressions, it may not contain regular expressions (but it may be "``...``", which is the implied value when no
restriction is given, as it means "no restriction").
In contexts where restrictions are meaningless (e.g. `~Registry.queryCollections` when the ``datasetType`` argument is `None`) they are allowed but ignored.
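
A hedged sketch of the same idea for collections (the collection names are illustrative and an existing ``registry`` is assumed):

.. code-block:: python

    import re

    registry.queryDatasets("calexp", collections="HSC/runs/a")                # single name
    registry.queryDatasets("calexp", collections=["HSC/runs/a", "HSC/runs/b"])  # iterable
    registry.queryDatasets("calexp", collections=re.compile(r"HSC/runs/.*"))    # regular expression
    # A (name, restriction) tuple limits the search in that collection to the
    # named dataset types; ... means "no restriction".
    registry.queryDatasets(..., collections=[("HSC/runs/a", "calexp"),
                                             ("HSC/runs/b", ...)])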

@@ -52,9 +52,9 @@ Ordered collection searches

An *ordered* collection expression is required in contexts where we want to search collections only until a dataset with a particular dataset type and data ID is found.
These include all direct `Butler` operations, the definitions of `~CollectionType.CHAINED` collections, `Registry.findDataset`, and the ``deduplicate=True`` mode of `Registry.queryDatasets`.
In these contexts, regular expressions and `...` are not allowed for collection names, because they make it impossible to unambiguously define the order in which to search.
In these contexts, regular expressions and "``...``" are not allowed for collection names, because they make it impossible to unambiguously define the order in which to search.
Dataset type restrictions are allowed in these contexts, and those
may be (and usually are) `...`.
may be (and usually are) "``...``".

Ordered collection searches are processed by the `~registry.wildcards.CollectionSearch` class.
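
For example (a sketch; the dataset type and collection names are illustrative, and ``dataId`` stands for a data ID constructed elsewhere), an ordered search must spell out its collections explicitly:

.. code-block:: python

    # Allowed: an explicit, ordered list of collection names.
    ref = registry.findDataset("calexp", dataId,
                               collections=["u/someone/runs/a", "HSC/runs/b"])

    # Not allowed in an ordered context: patterns or ..., because the search
    # order would be ambiguous.
    # registry.findDataset("calexp", dataId, collections=re.compile("HSC/.*"))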

@@ -94,7 +94,7 @@ Language operator precedence rules are the same as for the other languages
like C++ or Python. When in doubt use grouping operators (parentheses) for
sub-expressions.

General note - the parser itself does not evaluate any expressions even if
General note --- the parser itself does not evaluate any expressions even if
they consist of literals only; all evaluation happens in the SQL engine when
registry runs the resulting SQL query.

@@ -162,15 +162,15 @@ expressions which should evaluate to a numeric value.
Binary arithmetic operators
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Language supports five arithmetic operators - ``+`` (add), ``-`` (subtract),
Language supports five arithmetic operators: ``+`` (add), ``-`` (subtract),
``*`` (multiply), ``/`` (divide), and ``%`` (modulo). Usual precedence rules
apply to these operators. Operands for them can be anything that evaluates to
a numeric value.

Comparison operators
^^^^^^^^^^^^^^^^^^^^

Language supports set of regular comparison operators - ``=``, ``!=``, ``<``,
Language supports set of regular comparison operators: ``=``, ``!=``, ``<``,
``<=``, ``>``, ``>=``. These can be used on operands that evaluate to numeric
values; for the (in)equality operators, operands can also be boolean expressions.

@@ -182,7 +182,9 @@ IN operator
^^^^^^^^^^^

The ``IN`` operator (and ``NOT IN``) are an expanded version of a regular SQL
IN operator. Its general syntax looks like::
IN operator. Its general syntax looks like:

.. code-block:: sql

    <expression> IN ( <literal1>[, <literal2>, ... ])
    <expression> NOT IN ( <literal1>[, <literal2>, ... ])
@@ -194,15 +196,19 @@ literals as defined above. It can also be a mixture of integer literals and
range literals (language allows mixing of string literals and ranges but it
may not make sense when translated to SQL).

For an example of range usage, these two expressions are equivalent::
For an example of range usage, these two expressions are equivalent:

.. code-block:: sql

    visit IN (100, 110, 130..145:5)
    visit in (100, 110, 130, 135, 140, 145)
as are these::
as are these:

.. code-block:: sql

    visit NOT IN (100, 110, 130..145:5)
    visit Not In (100, 110, 130, 135, 140, 145)
Boolean operators
^^^^^^^^^^^^^^^^^
@@ -223,7 +229,9 @@ sub-expressions in the full expression.
Examples
^^^^^^^^

Few examples of valid expressions using some of the constructs::
Few examples of valid expressions using some of the constructs:

.. code-block:: sql

    visit > 100 AND visit < 200
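
These expression strings are what gets passed as the ``where`` argument of the query methods; a hedged sketch (the dataset type and collection names are illustrative, and an existing ``registry`` is assumed):

.. code-block:: python

    refs = registry.queryDatasets(
        "calexp",
        collections="HSC/runs/example",
        where="visit > 100 AND visit < 200",
    )
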
6 changes: 3 additions & 3 deletions python/lsst/daf/butler/_butler.py
@@ -928,7 +928,7 @@ def prune(self, refs: Iterable[DatasetRef], *,
Datasets to prune. These must be "resolved" references (not just
a `DatasetType` and data ID).
disassociate : `bool`, optional
Disassociate pruned datasets from ``self.collection`` (or the
Disassociate pruned datasets from ``self.collections`` (or the
collection given as the ``collection`` argument). Datasets that are
not in this collection are ignored, unless ``purge`` is `True`.
unstore : `bool`, optional
@@ -960,7 +960,7 @@ def prune(self, refs: Iterable[DatasetRef], *,
composite datasets. This will only prune components that are
actually attached to the given `DatasetRef` objects, which may
not reflect what is in the database (especially if they were
obtained from `Registry.queryDatasets`, which by does not include
obtained from `Registry.queryDatasets`, which does not include
components in its results).
Raises
@@ -1053,7 +1053,7 @@ def prune(self, refs: Iterable[DatasetRef], *,
# If we're disassociating but not purging, we can do that
# before we try to delete, and it will roll back if deletion
# fails. That will at least do the right thing if deletion
# fails because the files couldn't actually be delete (e.g.
# fails because the files couldn't actually be deleted (e.g.
# due to lack of permissions).
for tag in tags:
# recursive=False here because refs is already recursive
