DM-38498: rewrite QuantumGraph generation #370
Commits on Aug 20, 2023
Commits on Aug 24, 2023
-
Add new classes for QuantumGraph generation.
These mostly use the same algorithm as the one in graphBuilder.py, but they first split the pipeline into disconnected subgraphs, which will provide a nice performance boost for some pipelines. The other algorithmic difference is that pruning via adjustQuantum now happens directly in QG generation, which makes the pruning code in QuantumGraph construction unnecessary and allows adjustQuantum to raise NoWorkFound even during QG generation. Because the new classes build on PipelineGraph's more careful handling of storage class overrides, it should be much, much harder for storage class problems to creep back in. Finally, the split into an ABC and an implementation should make it much easier to handle special QG generation cases, like the ones for gathering resource usage and HiPS generation, as they can now delegate the stuff they don't want to customize to the base class. This may also be useful for generating simple QGs in Prompt Processing (though I'm not convinced it'd gain us much), or as a starting point for a more advanced general-purpose QG generation algorithm (though it's unlikely the base class could stay _exactly_ as is for that).
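The subgraph split described in this commit message can be illustrated with a small sketch. This is not the code from the PR: the edge list, the task and dataset type names, and the `connected_components` helper are all hypothetical, and a plain adjacency dict stands in for the real pipeline graph.

```python
from collections import defaultdict

# Hypothetical pipeline edges (dataset type -> task, task -> dataset type).
# The names are illustrative only.
PIPELINE_EDGES = [
    ("raw", "isr"), ("isr", "postISRCCD"),
    ("postISRCCD", "calibrate"), ("calibrate", "calexp"),
    ("hips_input", "makeHips"), ("makeHips", "hips_tile"),
]


def connected_components(edges):
    """Group nodes into components, ignoring edge direction."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


components = connected_components(PIPELINE_EDGES)
```

Each component shares no dataset types with the others, so under that assumption each one could be processed independently.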
-
Use per-instance loggers instead of module-level loggers in QG gen.
It's easier for a user who wants to adjust verbosity to only deal with one logger for all of QG generation.
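As a rough illustration of the per-instance logger pattern, assuming a hypothetical `GraphBuilderSketch` class and logger name (neither is taken from the PR), every helper logs through the one logger held by the instance, so a single `setLevel` call controls all of it:

```python
import logging


class GraphBuilderSketch:
    """Hypothetical builder whose helpers all share one logger."""

    def __init__(self, log_name="example.qg_builder"):
        # One logger per instance rather than one per module.
        self.log = logging.getLogger(log_name)

    def _find_quanta(self, task_label):
        self.log.debug("Querying for quanta of task %s.", task_label)
        return [f"{task_label}-0", f"{task_label}-1"]

    def build(self, task_labels):
        quanta = []
        for label in task_labels:
            quanta.extend(self._find_quanta(label))
        self.log.info("Generated %d quanta.", len(quanta))
        return quanta


builder = GraphBuilderSketch()
# One knob silences both the debug and info messages above.
builder.log.setLevel(logging.WARNING)
quanta = builder.build(["isr", "calibrate"])
```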
-
Avoid DataCoordinate as comparison keys in QG gen.
We're always comparing data IDs with the same dimensions, so we can do it much faster by just comparing value tuples.
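A minimal sketch of the idea, using a hypothetical `DataIdSketch` class as a stand-in for a rich data ID object (it is not the real DataCoordinate, and the dimension names and values are made up): when every key has the same dimensions, the tuple of values alone identifies the data ID, so it can serve directly as a dictionary key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataIdSketch:
    """Hypothetical stand-in for a data ID with dimensions and values."""

    dimensions: tuple
    values: tuple


DIMENSIONS = ("instrument", "visit", "detector")
data_ids = [DataIdSketch(DIMENSIONS, ("HSC", 903334, d)) for d in (16, 22, 23)]

# All keys share DIMENSIONS, so the value tuple is enough to tell them
# apart, and hashing or comparing a plain tuple is cheap.
quanta_by_key = {
    data_id.values: f"quantum-{i}" for i, data_id in enumerate(data_ids)
}

found = quanta_by_key[("HSC", 903334, 22)]
```

The shortcut only holds while the dimensions really are identical across all keys in the same mapping, which is the premise stated in the commit message.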
-
Optimize the serialization time for QuantumGraph
Make use of the butler serialization caching mechanisms so that objects are actually cached instead of being needlessly reconstructed. Also lower the LZMA compression level. This results in slightly larger graph sizes, but that is offset by a large runtime gain.
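The compression trade-off can be tried out with the standard-library `lzma` module. The payload and the preset values below are assumptions chosen for illustration; the commit message does not say which settings the PR actually uses.

```python
import json
import lzma

# Hypothetical payload standing in for a serialized graph.
payload = json.dumps(
    [{"task": "isr", "visit": v, "detector": d}
     for v in range(50) for d in range(20)]
).encode()

# A lower preset trades compression ratio for speed; a higher preset
# does the opposite. Both produce streams that decompress identically.
fast = lzma.compress(payload, preset=1)
small = lzma.compress(payload, preset=9)

roundtrip = lzma.decompress(fast)
```

Comparing `len(fast)`, `len(small)`, and the time each call takes on a realistic payload is the straightforward way to pick a preset.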
-
Remove pruning from QG and fix internal graph construction.
QuantumGraph doesn't need to be able to prune itself anymore, since that's now handled earlier, inside QuantumGraphBuilder (where we can do it more efficiently). Without that pruning code, it's easy to replace QuantumGraph's _datasetRefDict attribute with a temporary networkx graph. This sidesteps a problem in which DatasetRefs don't compare as equal if they have different storage classes, which was causing QGs built with QuantumGraphBuilder to lack some edges they should have had. I believe the problem didn't show up before because the old QG generation algorithm passed DatasetRefs with incorrect storage classes to QuantumGraph, and that also sidestepped the bug.
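The equality problem described here can be reproduced with a small sketch. Everything in it is hypothetical: `RefSketch` stands in for a dataset reference, the task and storage class names are illustrative, and plain dicts and sets take the place of the temporary networkx graph mentioned above. The point is only that matching producers to consumers on a key that omits the storage class recovers the edge that an equality comparison would miss.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefSketch:
    """Hypothetical dataset reference; equality includes storage_class."""

    dataset_type: str
    data_id: tuple
    storage_class: str


def dataset_key(ref):
    """Identify a dataset without regard to its storage class."""
    return (ref.dataset_type, ref.data_id)


def quantum_edges(outputs, inputs):
    """Connect each producing quantum to the quanta that consume its outputs."""
    producer_of = {
        dataset_key(ref): quantum
        for quantum, refs in outputs.items()
        for ref in refs
    }
    edges = set()
    for consumer, refs in inputs.items():
        for ref in refs:
            producer = producer_of.get(dataset_key(ref))
            if producer is not None:
                edges.add((producer, consumer))
    return edges


produced = RefSketch("calexp", ("HSC", 903334, 22), "ExposureF")
consumed = RefSketch("calexp", ("HSC", 903334, 22), "Exposure")

edges = quantum_edges({"calibrate": [produced]}, {"makeWarp": [consumed]})
```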
-
Simplify QuantumGraph's _DatasetTracker for task-graph-only use.
This class was previously used for both task <-> dataset type graphs and quantum <-> dataset graphs, and now it's just used for the former. So it doesn't need to be generic, and it doesn't need to support node removal anymore. Eventually it should be replaced entirely by PipelineGraph, but that's out of scope for now (it's part of DM-40442).
-
Clean up QuantumGraphSkeleton handling of implicit node addition.
With the switch to tuples instead of DataCoordinates as keys, adding nodes implicitly when edges are added gets trickier; we can only support it for input datasets, because something else (registry queries or upstream tasks) is responsible for making DatasetRefs for those.
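One way to picture the rule is a toy skeleton in which input edges may create their dataset node on the fly while output edges require the node to exist already. The `SkeletonSketch` class and its method names are assumptions made for this sketch, not the actual QuantumGraphSkeleton interface.

```python
class SkeletonSketch:
    """Hypothetical graph skeleton keyed by plain tuples."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()

    def add_quantum_node(self, quantum_key):
        self.nodes.add(quantum_key)

    def add_dataset_node(self, dataset_key):
        self.nodes.add(dataset_key)

    def add_input_edge(self, quantum_key, dataset_key):
        # Inputs may be added implicitly: something else (a query or an
        # upstream task) is assumed to supply the full reference later.
        self.nodes.add(dataset_key)
        self.edges.add((dataset_key, quantum_key))

    def add_output_edge(self, quantum_key, dataset_key):
        # Outputs must be added explicitly first in this sketch.
        if dataset_key not in self.nodes:
            raise KeyError(f"Dataset node {dataset_key!r} was never added.")
        self.edges.add((quantum_key, dataset_key))


skeleton = SkeletonSketch()
quantum = ("calibrate", ("HSC", 903334, 22))
skeleton.add_quantum_node(quantum)
skeleton.add_input_edge(quantum, ("postISRCCD", ("HSC", 903334, 22)))

try:
    skeleton.add_output_edge(quantum, ("calexp", ("HSC", 903334, 22)))
    output_rejected = False
except KeyError:
    output_rejected = True
```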
-
Take more care with missing skip-existing-in collections.
We want to allow users to include at least the output run here without it actually existing yet, and let that be a no-op, instead of complaining.
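The no-op behavior can be sketched as a simple filter. The function name and the collection names are hypothetical, and a set of known names stands in for whatever registry lookup the real code performs.

```python
def resolve_skip_existing_in(requested, existing):
    """Keep only the requested collections that already exist, in order."""
    return [name for name in requested if name in existing]


# Hypothetical state: the output run "u/someone/run2" has not been
# created yet, so it is silently dropped instead of raising an error.
existing = {"HSC/defaults", "u/someone/run1"}
resolved = resolve_skip_existing_in(
    ["u/someone/run2", "u/someone/run1"], existing
)
```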