Skip to content

Commit

Permalink
Prohibit caches of views of dimension records.
Browse files Browse the repository at this point in the history
This avoids a problem with dimension record cache consistency on was
introduced in the cache rework a few commits back, when we started
considering the cache more trustworthy.

Unfortunately there's no cost-free fix: that cache rework fixed other
bugs with cache consistency, so we can't just drop it.  The best
solution seems to be admitting that expandDataIds cannot be _relied_
upon to test for dimension record existence; this has always been the
case in general, but it may have been impossible to trigger via the
default dimension configuration before.

See code comments for additional rationale.
  • Loading branch information
TallJimbo committed Dec 2, 2022
1 parent 0bc2bd9 commit 76750ef
Show file tree
Hide file tree
Showing 3 changed files with 33 additions and 1 deletion.
8 changes: 8 additions & 0 deletions python/lsst/daf/butler/registry/_registry.py
Original file line number Diff line number Diff line change
Expand Up @@ -1115,6 +1115,14 @@ def expandDataId(
dimensions or values, or when a resulting data ID contains
contradictory key-value pairs, according to dimension
relationships.
Notes
-----
This method cannot be relied upon to reject invalid data ID values
for dimensions that do actually not have any record columns. For
efficiency reasons the records for these dimensions (which have only
dimension key values that are given by the caller) may be constructed
directly rather than obtained from the registry database.
"""
raise NotImplementedError()

Expand Down
22 changes: 22 additions & 0 deletions python/lsst/daf/butler/registry/dimensions/caching.py
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,28 @@ def initialize(
nested = NestedClass.initialize(
db, element, context=context, config=config, governors=governors, view_target=view_target
)
if view_target is not None:
# Caching records that are really a view into another element's
# records is problematic, because the caching code has no way of
# intercepting changes to its target's records. Instead of
# inventing a callback system to address that directly or dealing
# with an untrustworthy combination, we just ban this combination.
# But there's a problem: this is how we've configured the default
# dimension universe from the beginning, with the 'band' dimension
# being a cached view into physical_filter, and we don't want to
# break all those configurations.
if isinstance(view_target, CachingDimensionRecordStorage):
# Happily, there's a way out: if the view target's record
# storage is _also_ cached, then this outer caching is pretty
# thoroughly unnecessary as well as problematic, and it's
# reasonable to silently drop it, by returning the nested
# storage object instead of a new caching wrapper. And this
# too is the case with the default dimension configuration.
return nested
raise RuntimeError(
f"Invalid dimension storage configuration: cannot cache dimension element {element} "
f"that is itself a view of {view_target.element}."
)
return cls(nested)

@property
Expand Down
4 changes: 3 additions & 1 deletion python/lsst/daf/butler/registry/tests/_registry.py
Original file line number Diff line number Diff line change
Expand Up @@ -2507,7 +2507,9 @@ def testQueryResultSummaries(self):
# Second query should yield no results, which we should see when
# we attempt to expand the data ID.
query2 = registry.queryDataIds(["physical_filter"], band="h")
self.assertFalse(query2.any(execute=False, exact=False))
# There's no execute=False, exact=Fals test here because the behavior
# not something we want to guarantee in this case (and exact=False
# says either answer is legal).
self.assertFalse(query2.any(execute=True, exact=False))
self.assertFalse(query2.any(execute=True, exact=True))
self.assertEqual(query2.count(exact=False), 0)
Expand Down

0 comments on commit 76750ef

Please sign in to comment.