New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-37786: allow PipelineTasks to control default dataset-query-constraint behavior. #304
Conversation
Codecov ReportBase: 80.56% // Head: 80.61% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #304 +/- ##
==========================================
+ Coverage 80.56% 80.61% +0.04%
==========================================
Files 57 57
Lines 6304 6319 +15
Branches 1174 1280 +106
==========================================
+ Hits 5079 5094 +15
- Misses 981 982 +1
+ Partials 244 243 -1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
@TallJimbo would you mind adding python 3.11 to the build matrix whilst you are doing this PR? It should work fine, ctrl_mpexec builds okay. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I am slightly worried about the multiple ways and places dataset constraints are used that don't quite fully overlap being confusing to futures maintainers, I think this is a pragmatic approach without larger refactoring.
I had the same misgivings and sense that it was the least-bad option, and I'll add some code comments to that effect. |
I wouldn't call this a hack, but it's being added here now to work around long-standing limitations involving spatial overlaps in QG generation, and I still want to fix those limitations directly. That makes me fit a *tiny* bit better about having absolutely no idea how I'd start to write a test for this, as the circumstances under which it appears in the real world require a lot of data (multiple tracts) and a pretty complicated graph and task, and we have none of that in ci_* packages, let alone here. So I'll be relying on one-off at-scale tests for now.
Even though these are debug-level, all of the other debug-level logging in GraphBuilder is much sparser.
I've long wondered why users occasionally reported the previous "this should't be possible" error message, and finally figured it out: initInputs don't get included in the initial data ID query, so follow-ups on those can fail even if repo state hasn't changed under us.
f970cba
to
0107320
Compare
Checklist
doc/changes