Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-39944: Replace Butler.registry with registry shim #864

Merged
merged 6 commits into from Jul 14, 2023

Conversation

andy-slac
Copy link
Contributor

Registry ABC is split into three classes:

  • Registry is now only contains the methods that are used by regular clients via Butler.registry attribute
  • ButlerRegistry ABC extends Registry with few methods used by Butler internally, SQL and remote implementations now inherit ButlerRegistry
  • ButlerFactory class has methods to create/instantiate registry from configuration. There is still Registry.createFromConfig, this is only for the tests that currently use this method, I do not want to update all those as I'm not sure if ButlerFactory survives all further updates (or this review).

Butler.registry now returns ButlerShim instance which is a subclass of Registry. Shim for now just forwards all calls to Butler._registry, as we do updates to Butler/Registry implementations the shim will probably do more complicated things, and eventually it should disappear after all clients migrate to new Butler API.

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes

@codecov
Copy link

codecov bot commented Jul 12, 2023

Codecov Report

Patch coverage: 90.94% and no project coverage change.

Comparison is base (50ff5f6) 87.90% compared to head (5b07c4d) 87.90%.

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #864    +/-   ##
========================================
  Coverage   87.90%   87.90%            
========================================
  Files         270      273     +3     
  Lines       35633    35764   +131     
  Branches     7462     7474    +12     
========================================
+ Hits        31324    31440   +116     
- Misses       3151     3166    +15     
  Partials     1158     1158            
Impacted Files Coverage Δ
python/lsst/daf/butler/registry/_registry.py 96.55% <0.00%> (+1.95%) ⬆️
...thon/lsst/daf/butler/registry/_registry_factory.py 84.00% <84.00%> (ø)
...ython/lsst/daf/butler/registry/_butler_registry.py 87.09% <87.09%> (ø)
python/lsst/daf/butler/_registry_shim.py 89.32% <89.32%> (ø)
python/lsst/daf/butler/_butler.py 78.83% <97.87%> (+0.14%) ⬆️
python/lsst/daf/butler/registries/remote.py 80.59% <100.00%> (-0.10%) ⬇️
python/lsst/daf/butler/registries/sql.py 85.04% <100.00%> (-0.03%) ⬇️
python/lsst/daf/butler/registry/__init__.py 100.00% <100.00%> (ø)
python/lsst/daf/butler/registry/tests/_registry.py 98.09% <100.00%> (ø)
python/lsst/daf/butler/tests/_datasetsHelper.py 88.33% <100.00%> (ø)
... and 9 more

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@timj
Copy link
Member

timj commented Jul 12, 2023

It looks like I missed a butler.datastore fix in the tests since we get 22 deprecation warnings. It's interesting that python 3.11 is now significantly faster than 3.10.

@timj
Copy link
Member

timj commented Jul 12, 2023

😄 the warnings about Butler.datastore are coming specifically from this branch. Nothing to do with main at all.

@andy-slac
Copy link
Contributor Author

andy-slac commented Jul 12, 2023

😄 the warnings about Butler.datastore are coming specifically from this branch. Nothing to do with main at all.

I likely messed it up when rebased after your changes, will check everything again.

This is a trivial change, internally Butler uses `_registry` but it is also
exposed as `registry` property. In the next commits `registry` property
will return a special shim object, this will allow us to transparently
update registry implementation without breaking client-visible interfaces.
`Registry` ABC now contains methods used by regular registry clients, this
interface is exposed via `Butler.registry` property. `ButlerRegistry` ABC
extends `Registry` with methods that are used by Butler, this includes
factory methods and a couple of internal methods. SQL and remote
implementations now inherit `ButlerRegistry`, they are not changed otherwise.

Also dropped `Registry.datasetIdFactory` attribute, we now have more transparent
method for generating dataset IDs and this attribute is not used.
Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! This is all consistent with how I see things going, at least at first.

I think we're in agreement that the roles of some of the new classes will change significantly before this is all over, and they may well disappear. So how about we add a leading underscore to ButlerRegistry and RegistryFactory to make it extra clear that these are not symbols anyone should be depending on.

Did you see any usage of createFromConfig in tests downstream of daf_butler, or was it just a lot of tests in daf_butler itself? That's currently at the top of my list of Registry methods we should actually deprecate (in favor of ButlerRegistry, for now), with transaction and resetConnectionPool close behind. There are of course a ton of other methods we want to move to Butler, maybe adjust, and ultimately remove from Registry in the distant future, but we should probably be more aggressive about removing those three and maybe a few others ASAP since we don't expect to even give them Butler-method replacements.

python/lsst/daf/butler/_registry_shim.py Outdated Show resolved Hide resolved
(defaulting to ``SqlRegistry``), import that class, and call one of the
above methods on the imported class.
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to give this class a constructor that does the work of force_registry_config and holds the config as an attribute, and then make the other methods regular instance methods (probably abstract ones)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a constructor looks reasonable. If factory methods are abstract we'd need them implemented somewhere? Do you think that this should be another base class for ButlerRegistry subclasses?

Copy link
Member

@TallJimbo TallJimbo Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I got confused about the current state of things while reviewing commit-by-commit - I thought this still had some methods with docs that said they needed to be reimplemented by derived classes, and that made me think there were already subclass implementations and I didn't step back and notice that there weren't any.

So just ignore the bit about abstract methods; this is just a regular concrete (probably final) class, and that's fine.

This new class implements `Registry` interface and forwards all methods
to other objects. Everything is forwarded to `Butler._registry` now, but
this will change as we start refactoring ButlerRegistry implementation.
Trying to further separate concerns into independent classes. Re-introduced
`Registry.createFromConfig` to avoid updating tests in many other packages.
@andy-slac
Copy link
Contributor Author

So how about we add a leading underscore to ButlerRegistry and RegistryFactory to make it extra clear that these are not symbols anyone should be depending on.

I renamed those two classes.

Did you see any usage of createFromConfig in tests downstream of daf_butler, or was it just a lot of tests in daf_butler itself?

There are bout five other packages that use createFromConfig in their tests. I actually started replacing that with RegistryFactory but then decided that I want to wait until things become more stable.

That's currently at the top of my list of Registry methods we should actually deprecate (in favor of ButlerRegistry, for now), with transaction and resetConnectionPool close behind.

I looked at resetConnectionPool and it's of course only needed by ctrl_mpexec. I'd like to remove it from registry interface too, but having it in Butller is not much better in my opinion. Need to think again how to handle it better.

@andy-slac andy-slac merged commit da8a07c into main Jul 14, 2023
15 of 16 checks passed
@andy-slac andy-slac deleted the tickets/DM-39944 branch July 14, 2023 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants