-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-31725: stand up initial version of package #1
Conversation
8434c98
to
762b5b9
Compare
e8bc082
to
a97dbc1
Compare
Is this something being factored out of daf_butler, or ctrl_mpexec? (I happened to see some automated mail from GitHub about it.) |
It's new code that would otherwise probably go into daf_butler: it's a rewrite of much of the query expression handling. |
Codecov Report
@@ Coverage Diff @@
## main #1 +/- ##
=======================================
Coverage ? 88.57%
=======================================
Files ? 1
Lines ? 35
Branches ? 2
=======================================
Hits ? 31
Misses ? 3
Partials ? 1 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
108b966
to
e3bc06a
Compare
8f379e6
to
31a832b
Compare
a2531bd
to
fdbcf8e
Compare
310cf1a
to
e909b0a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not looked at every file in details but the package is well organized, it has some nice documentation, and follows all our conventions. I'm very pleased to see 90% test coverage.
doc/changes/README.rst
Outdated
* ``perf``: A performance enhancement. | ||
* ``doc``: A documentation improvement. | ||
* ``removal``: An API removal or deprecation. | ||
* ``other``: Other Changes and Additions of interest to general users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We dropped "other" at some point because no-one could remember the difference between "other" and "misc" but I see I haven't updated the documentation in all the packages to reflect that (specifically daf_butler still talks about other).
doc/lsst.daf.relation/sql.rst
Outdated
|
||
The SQL engine has a non-trivial `Engine.conform` method and overrides other methods to keep relation trees in a certain order when applying new operations. | ||
It uses a custom `Select` marker type to partition the tree into subqueries and mark a tree as conformed. | ||
This order attempts to avoid subqueries whenever possible by pulling `Projection`, `Deduplication`, `Sort`, and `Slice` operations downstream when commutation rules permit, both to keep the SQL output simple and to reduce the loss of row-ordering imposed by `Sort` operations whenever possible (i.e. a `Sort` can only be applied in the outermost query if it is to have any effect). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These classes don't turn into links in the sphinx docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, need to add that pesky leading .
to them. (By the way, since I only discovered this behavior while working on this ticket, I'm sure we've got a ton of preexisting bad links in daf_butler that need it as well.)
|
||
The `Engine` and `Payload` classes are generic over a parameter we refer to as the "logical column type". | ||
In the simplest case, the logical column type is just `sqlalchemy.sql.ColumnElement`, and this is what the default implementations of most `Engine` methods assume. | ||
Custom subclasses of the SQL `Engine` can use other types, such as wrappers that hold one or more `sqlalchemy.sql.ColumnElement` objects, as long as they override a few `Engine` methods to handle them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to include some sqlalchemy index in the sphinx configuration for these links to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, and I think I was meaning to ask @jonathansick about it. I don't see anything about intersphinx
in either the package-local conf.py
or the one in https://github.com/lsst/pipelines_lsst_io, so I assume this is something we import from documenteer
in those files.
I also haven't been able to find the right URL to associate with the sqlalchemy
package in the intersphinx
mapping, which may mean there isn't one - their docs sure look like they've been built with sphinx, but I don't think that means they necessarily publish the interface that intersphinx
needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add sqlalchemy to the documenteer configuration for pipelines packages. It does use Sphinx for the documentation sites of individual versions. DM-37194
Here's background on the Sphinx configuration for Rubin to clarify how conf.py
files work and this documentation page specifically on the Sphinx configuration settings (via a page in the Developer Guide).
e909b0a
to
24fe7d5
Compare
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #1 +/- ##
=======================================
Coverage ? 97.10%
=======================================
Files ? 15
Lines ? 1242
Branches ? 117
=======================================
Hits ? 1206
Misses ? 19
Partials ? 17 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
This includes a large number of related changes: - Markers (including Transfer and Materialization) are now Relation types, not operation types. - UnaryOperation.apply is now final, with a number of new hooks for customization by both derived operations and engines, with backtracking dedicated to Engines. - The SQL engine overrides these hooks to partition and reorder its trees into subqueries indicated by the new Select marker. This lets it preserve row-order from sorts much better. It does not attempt to support backtracking at all. - The iteration engine does support backtracking, and since it always preserves order, we can drop all ordering checks (and the preserves_order method itself) in the backtracking algorithm. - Relations now use immutable concrete containers (frozenset, tuple) instead of their collections.abc generalizations to ensure hashability (though this is just in type annotations - users still have to actually pass in hashable containers at runtime). And, finally, there's now a pretty complete battery of unit tests, with missing coverage limited to: - Simple exception-raising checks for logic bugs in calling code that just didn't seem worth that much attention. - Defensive code that is at least very difficult to trigger, if not impossible, without violating other rules (e.g. getting a not-already-conformed Relation tree in the SQL engine to call its conform method on).
8729132
to
069bf3d
Compare
always returns the operand relation passed to it. | ||
`Join` (`BinaryOperation`) / `Relation.join` | ||
Perform a natural join: combine two relations by matching rows with the same values in their common columns (and satisfying an optional column expression, via a `Predicate`), producing a new relation whose columns are the union of the columns of its operands. | ||
This is equivalent to [``INNER``] ``JOIN`` in SQL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we also need outer joins, I think I saw at least one case in daf_butler?
No description provided.