DM-24734: Rework QuantumGraph generation to avoid O(N^2) scaling. #128

Merged 4 commits on May 29, 2020

Commits on May 28, 2020

  1. Improve status logging for QuantumGraph generation.

    The original per-data ID messages are now at TRACE level, with coarser
    (per-Task) status messages at DEBUG.
    TallJimbo committed May 28, 2020
    aa5b6a0
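
The two-tier logging described in commit aa5b6a0 can be illustrated with a minimal sketch using Python's standard `logging` module. This is not the code from the commit: the standard library has no TRACE level, so the numeric value `5`, the logger name, and the `connect_data_ids` helper with its `tasks` mapping are all assumptions made for illustration.

```python
import logging

# Assumption: a custom TRACE level below DEBUG (10); the stdlib does not define one.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")

# Hypothetical logger name, chosen only for this sketch.
log = logging.getLogger("quantum_graph_sketch")


def connect_data_ids(tasks):
    """Log progress for a hypothetical mapping of task label -> list of data IDs."""
    for label, data_ids in tasks.items():
        # Coarse, per-task status message: one line per task at DEBUG.
        log.debug("Task '%s': processing %d data IDs.", label, len(data_ids))
        for data_id in data_ids:
            # Fine-grained, per-data ID message: only emitted when TRACE is enabled.
            log.log(TRACE, "Task '%s': data ID %s.", label, data_id)
```

With the logger set to `logging.DEBUG`, only the per-task line is emitted; lowering the level to `TRACE` also emits one line per data ID, which is why the finer messages are the ones worth demoting when the graph is large.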

Commits on May 29, 2020

  1. Fix type annotations and docs for adjustQuantum.

    These now reflect how adjustQuantum is actually being called.  I suspect
    the original types reflect a reasonable aspiration: PipelineTask
    subclasses ideally would be operating on a mapping that uses their internal
    connection names instead of dataset type names, but the Connections
    infrastructure doesn't provide a good way to do that translation (!),
    so changing that here is both out-of-scope _and_ a lot of work.
    TallJimbo committed May 29, 2020
    de188ef
  2. Rework QuantumGraph generation to avoid O(N^2) scaling.

    The previous implementation tested all DatasetRefs (of the right type)
    for compatibility with all quanta, which doesn't scale when the graph
    is large.
    
    This removes the fillQuanta step from _PipelineScaffolding, moving the
    association of datasets with quanta into fillDataIds (now
    connectDataIds) and prerequisite lookup into fillDatasetRefs (now
    resolveDatasetRefs).
    
    To do that, I added a new _QuantumScaffolding object to represent an
    under-construction Quantum, and removed _DatasetScaffolding as it
    didn't end up providing more than a simple dict would provide;
    _DatasetScaffoldingDict was accordingly renamed and adjusted to
    _DatasetDict.
    TallJimbo committed May 29, 2020
    d74fcd6
  3. 1c80eb4
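
The scaling change described in commit d74fcd6 can be sketched in a few lines of Python. The commit message does not spell out the mechanism, so this is only an illustration of the general idea under stated assumptions: `associate_all_pairs` stands in for the old "test every dataset against every quantum" approach, and `associate_from_rows` stands in for associating datasets with quanta while iterating over already-joined data ID rows. All function names, the `is_compatible` callback, and the row format are hypothetical and are not taken from `_PipelineScaffolding`.

```python
from collections import defaultdict
from itertools import product


def associate_all_pairs(quanta, datasets, is_compatible):
    """Test every dataset against every quantum.

    Cost is O(len(quanta) * len(datasets)), which grows quadratically
    when both collections grow with the size of the graph.
    """
    result = {q: [] for q in quanta}
    for q, d in product(quanta, datasets):
        if is_compatible(q, d):
            result[q].append(d)
    return result


def associate_from_rows(rows):
    """Record associations while iterating over (quantum, dataset) rows.

    Each row is assumed to already pair a quantum key with a related
    dataset key, so the cost is O(len(rows)) with dict lookups.
    """
    result = defaultdict(list)
    for quantum_key, dataset_key in rows:
        result[quantum_key].append(dataset_key)
    return dict(result)


# Hypothetical example: one quantum per visit, one dataset per (visit, detector).
quanta = [1, 2]
datasets = [(1, "a"), (1, "b"), (2, "a")]
rows = [(visit, (visit, detector)) for visit, detector in datasets]

slow = associate_all_pairs(quanta, datasets, lambda q, d: d[0] == q)
fast = associate_from_rows(rows)
```

Both functions produce the same mapping for this input; the difference is that the second never visits a (quantum, dataset) pair that is not already known to be related.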