Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-15424: add limited Registry concept #85

Merged
merged 12 commits into from Sep 13, 2018
Merged

DM-15424: add limited Registry concept #85

merged 12 commits into from Sep 13, 2018

Conversation

TallJimbo
Copy link
Member

Also includes some cleanups to existing code and configuration; recommend reviewing commit-by-commit.

Copy link
Member

@timj timj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks okay to me. Very minor comments.

@@ -264,12 +264,11 @@ def fromConfig(cls, config):
Parameters
----------
config : `SchemaConfig`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a SchemaConfig

All created tables.
views : `dict`
All created views.
A mapping from table or view name to the associated SQLAlchemy object.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't really noticed before that Schema requires SQLAlchemy even if Registry is implemented in non-SQL manner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, very much so. In fact, while doing some work on DM-15210 it occurred to me that @pschella may have based some of the structure of schema.yaml on what SQLAlchemy wants - I've always found our description of foreign keys hard to parse, for instance, but it turns out they're quite convenient to pass to SQLAlchemy. Anyhow, this bothers me too, but I'm not sure what to do about it and this ticket definitely isn't the place for it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema is indeed quite close to SQLAlchemy, but it doesn't have to actually be required. Overall you need some way for schema.tables['name'] to give you back something that makes sense to use as a table representation. But it doesn't have to be a SQLAlchemy.Table object per-se.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I deliberately stayed pretty close to a one-to-one mapping between the YAML description and SQLAlchemy's object representation was that I expected us to have to use more complicated schema properties at some later stage. I thus assumed SQLAlchemy did what it did for a reason and it would make our lives easier in matching its expressiveness by having a one-to-one mapping as opposed to something that is perhaps a bit easier to read for the simple cases but requires ugly hacks for the complex ones. This expectation may since have been proven wrong of course ;)

python/lsst/daf/butler/core/registry.py Show resolved Hide resolved
@@ -168,7 +189,12 @@ def transaction(self):

@property
def pixelization(self):
"""Object that interprets SkyPix DataUnit values (`sphgeom.Pixelization`)."""
"""Object that interprets SkyPix DataUnit values (`sphgeom.Pixelization`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this has to be lsst.sphgeom.Pixelization.

"""Object that interprets SkyPix DataUnit values (`sphgeom.Pixelization`)."""
"""Object that interprets SkyPix DataUnit values (`sphgeom.Pixelization`).

``None`` for limited registries.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think single back ticks work for None (not that anyone will really be clicking on it to find out what it means).

Raises
------
NotImplementedError
If ``self.limited`` is ``True``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must be a way to make this link to the classes limited property in the documentation. @jonathansick ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd try

If `limited` is `True`.

If that doesn't work, I'd do

If `RegistryConfig.limited` is `True`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the recommended way of writing the sentence is:

Raised if `RegistryConfig.limited` is `True`.

https://developer.lsst.io/python/numpydoc.html#py-docstring-raises

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it works if I just say

    If `limited` is `True`.

@@ -44,6 +45,9 @@ class Schema:
config : `SchemaConfig` or `str`, optional
Load configuration. Defaults will be read if ``config`` is not
a `SchemaConfig`.
limited : `bool`
If True, ignore tables, views, and associated foreign keys whose
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes True is back ticked, here it is not. Be consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should always be back ticked in this context.

@@ -19,6 +19,7 @@
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we need this new line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accident; removed.

@@ -44,6 +45,9 @@ class Schema:
config : `SchemaConfig` or `str`, optional
Load configuration. Defaults will be read if ``config`` is not
a `SchemaConfig`.
limited : `bool`
If True, ignore tables, views, and associated foreign keys whose
config descriptions include a 'limited' key set to `False`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the quote character is wrong here.

----------
limited : `bool`
If True, ignore tables, views, and associated foreign keys whose
config descriptions include a 'limited' key set to `False`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backtick the True (False already is) and fix quote char for limited.

"Registry" here just means "in-memory representation", and probably
should have a new name itself.  In any case, it's a poor name for
the on-disk representation.
...which means other objects that need a DataUnitRegistry shouldn't
rely on it to provide one, since it doesn't use it itself.
If basically all of the classes that use Schema use an attribute,
it shouldn't start with an underscore.
The signatures and behavior of these will probably change when we
get around to implementing them, and they're just cruft right now.
And we can always find the old versions in git.
This *probably* doesn't matter, given that Registries are usually
constructed via Registry.fromConfig anyway, but this is a bit safer
and more consistent with the existing base class logic.
Return None (as we were doing before) messes up other methods
provided by UserDict inheritance, like get().
This will permit the same tests to be run on differently-constructed
Registry instances.
Registries configured with limited=True do not have any DataUnit
metadata or join tables (or the foreign keys from Dataset to those
tables).  That disables a few operations, but makes it unnecessary
to populate those tables just to do simple get and put Butler
operations.
@TallJimbo
Copy link
Member Author

One more commit: a global replace to do single- instead of double-backticks for True, False, and None.

@TallJimbo TallJimbo merged commit 93f650a into master Sep 13, 2018
@timj timj deleted the tickets/DM-15424 branch April 22, 2020 22:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants