DM-38498: rewrite QuantumGraph generation #370

Merged
merged 21 commits into from
Aug 24, 2023

Commits on Aug 20, 2023

  1. 88022fb
  2. 581557d
  3. 0069bb5

Commits on Aug 24, 2023

  1. Add new classes for QuantumGraph generation.

    These mostly use the same algorithm as the one in graphBuilder.py, but
    they split the pipeline up into disconnected subgraphs first, which
    will provide a nice performance boost for some pipelines.
    
    The other algorithmic difference is that pruning via adjustQuantum now
    happens directly in QG generation, making the stuff for pruning in
    QuantumGraph construction unnecessary, and allowing adjustQuantum to
    raise NoWorkFound even during QG generation.
    
    By building on PipelineGraph's more careful handling of storage class
    overrides, this should make it much, much harder for those problems to
    creep back in.
    
    And finally, the split here into an ABC and implementation should make
    it much easier to handle special QG generation cases, like the ones
    for gathering resource usage and HiPS generation, as they can now
    delegate the stuff they don't want to customize to the base class.
    This may be useful for generating simple QGs in Prompt Processing, too
    (but I'm not convinced it'd gain us much) or as a starting point for a
    more advanced general-purpose QG generation algorithm (though it's
    unlikely the base class could stay _exactly_ as is for that).
    TallJimbo committed Aug 24, 2023
    fb44da8
  2. a430115
  3. Use per-instance loggers instead of module-level loggers in QG gen.

    It's easier for a user who wants to adjust verbosity to only deal with
    one logger for all of QG generation.
    TallJimbo committed Aug 24, 2023
    3e8cf2c
  4. 10de19a
  5. Avoid DataCoordinate as comparison keys in QG gen.

    We're always comparing data IDs with the same dimensions, so we can
    do it much faster by just comparing value tuples.
    TallJimbo committed Aug 24, 2023
    66f2f1f
  6. Optimize the serialization time for QuantumGraph

    Make use of the butler serialization caching mechanisms to make
    sure objects are effectively cached instead of reconstructing
    objects needlessly. Also lower the compression ratio of LZMA.
    This results in slightly larger graph sizes, but is offset by
    a large runtime gain.
    natelust authored and TallJimbo committed Aug 24, 2023
    d835b81
  7. Add changelog entry.

    TallJimbo committed Aug 24, 2023
    3eab2cd
  8. aa40451
  9. Remove pruning from QG and fix internal graph construction.

    QuantumGraph doesn't need to be able to prune itself anymore, since
    that's now handled earlier, inside QuantumGraphBuilder (where we can
    do it more efficiently).
    
    And without that pruning code, it's easy to replace QuantumGraph's
    _datasetRefDict attribute with a temporary networkx graph, and this
    sidesteps a problem in which different DatasetRefs don't compare as
    equal if they have different storage classes, which was causing QGs to
    lack some edges they should have had with the QuantumGraphBuilder.  I
    believe this was because the old QG generation algorithm passed
    DatasetRefs with incorrect storage classes to QuantumGraph, and that
    also sidestepped the bug.
    TallJimbo committed Aug 24, 2023
    4025751
  10. Simplify QuantumGraph's _DatasetTracker for task-graph-only use.

    This class was previously used for both task <-> dataset type graphs
    and quantum <-> dataset graphs, and now it's just used for the former.
    So it doesn't need to be generic, and it doesn't need to support node
    removal anymore.
    
    Eventually it should be replaced entirely by PipelineGraph, but that's
    out of scope for now (it's part of DM-40442).
    TallJimbo committed Aug 24, 2023
    907e4f4
  11. c2e4441
  12. 8e76925
  13. 6e3a48a
  14. 0d8f41e
  15. Clean up QuantumGraphSkeleton handling of implicit node addition.

    With the switch to tuples instead of DataCoordinates as keys, adding
    nodes implicitly when edges are added gets trickier; we can only
    support it for input datasets, because something else (registry queries
    or upstream tasks) is responsible for making DatasetRefs for those.
    TallJimbo committed Aug 24, 2023
    1a085fb
  16. Take more care with missing skip-existing-in collections.

    We want to allow users to include at least the output run here without
    it actually existing yet, and let that be a no-op, instead of
    complaining.
    TallJimbo committed Aug 24, 2023
    636e097
  17. bc6752a
  18. 6457335
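
The "Add new classes for QuantumGraph generation" commit above says the new builder first splits the pipeline into disconnected subgraphs. The sketch below illustrates only that general idea, using the standard library; it is not the code from this pull request. The function name `split_into_subgraphs`, the input mapping, and the task and dataset type names are all hypothetical, and the real implementation works on PipelineGraph rather than plain dictionaries.

```python
from collections import defaultdict, deque


def split_into_subgraphs(task_edges):
    """Group tasks into connected components of a task <-> dataset type graph.

    ``task_edges`` (a hypothetical input format) maps a task label to the set
    of dataset type names that task reads or writes.  Tasks that share no
    dataset types, directly or transitively, land in different components.
    """
    # Build an undirected bipartite adjacency map.  Nodes are tagged tuples so
    # a task and a dataset type with the same name cannot collide.
    adjacency = defaultdict(set)
    for task, dataset_types in task_edges.items():
        task_node = ("task", task)
        adjacency[task_node]  # ensure tasks with no edges still get a node
        for name in dataset_types:
            dataset_node = ("dataset_type", name)
            adjacency[task_node].add(dataset_node)
            adjacency[dataset_node].add(task_node)

    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from ``start``.
        queue = deque([start])
        seen.add(start)
        tasks = set()
        while queue:
            node = queue.popleft()
            kind, label = node
            if kind == "task":
                tasks.add(label)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if tasks:
            components.append(tasks)
    return components


# Hypothetical pipeline: the first two tasks share "postISRCCD", while the
# third touches unrelated dataset types and forms its own subgraph.
pipeline = {
    "isr": {"raw", "postISRCCD"},
    "calibrate": {"postISRCCD", "calexp"},
    "makeHips": {"deepCoadd", "hips"},
}
components = split_into_subgraphs(pipeline)
```

Each returned set of tasks could then be processed independently, which is presumably where the performance gain mentioned in the commit message comes from for pipelines with unrelated branches.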