Auto Parallel #8891

Merged
merged 61 commits into from
Sep 27, 2022

Commits on Jul 28, 2022

  1. add auto_parallel code

    add auto_parallel pass
    wyg1997 committed Jul 28, 2022
    85ac376
  2. Feat ap remove hierarchy cast (#7919)

    * feat(AutoParallel): support remove parallel_cast ops
    
    * feat(AutoParallel): export enable_auto_parallel_prune_parallel_cast_ops
    
    * format code
    wyg1997 committed Jul 28, 2022
    40e791f
  3. Fix add conv grad cost (#7972)

    * feat(Conv): add grad computation cost
    
    * fix ConvDataGrad computation cost
    
    * update conv grad cost
    
    * refine
    wyg1997 committed Jul 28, 2022
    1006247
  4. Auto parallel/fast collector (#7958)

    * Try to speed up sbp collector.
    However, throughput drops
    
    * Shrink the parallel candidates for the proxy node
    
    * Print out some information and then refine
    
    * Store the sbp set for each consumer
    
    * Update binary set intersection
    
    * Remove impossible parallel candidates from sbp proxy
    
    * Refine binary set
    
    * Add a Clear() in binary set
    
    * Filter out those proxy candidates containing two
    sbps from the same unique group
    
    * refine
    
    * Check spells
    
    * Clip useless edges
    Yipeng1994 authored and wyg1997 committed Jul 28, 2022
    b49b953
  5. AutoParallel mainstem algorithm add mutable_op_ctrl_edge (#8033)

    * feat(AutoParallel): mainstem algorithm add mutable_op_ctrl_edge
    
    * use if instead of std::max
    wyg1997 committed Jul 28, 2022
    d92cd6b
  6. 50f478c
  7. [WIP] Fix auto parallel dump uniform sbp bug (#8330)

    * fix(AutoParallel): fix auto parallel dump uniform sbp bug
    
    * refine source op judgement
    wyg1997 committed Jul 28, 2022
    c919dce
  8. f3fc750
  9. Refactor dump nd sbp for auto parallel (#8353)

    * fix(AutoParallel): fix auto parallel dump uniform sbp bug
    
    * feat(AutoParallel): add interface for op to dump nd_sbp to op_conf
    
    * refactor(AutoParallel): refactor DumpNdSbpSignatureForOpConfFn
    wyg1997 committed Jul 28, 2022
    5d76b85
  10. rename Global to Singleton

    wyg1997 committed Jul 28, 2022
    98353b7
  11. Refactor SbpEdge (#8684)

    * refactor(AP): refactor SbpEdge
    
    * Rename variables
    
    * Add const for some functions
    
    Co-authored-by: Yipeng Li <jamesonli1313@gmail.com>
    wyg1997 and Yipeng1994 committed Jul 28, 2022
    f26b68e
  12. Refactor auto parallel sbp node (#8712)

    * Rename
    
    * Code clean up
    
    * Code clean up
    
    * Code clean up and package up
    
    * Rename
    
    * Add const for some functions
    Yipeng1994 authored and wyg1997 committed Jul 28, 2022
    b5cc87b
  13. Refactor auto parallel sbp graph (#8722)

    * Code clean up
    
    * Package up
    
    * Code clean up and package up in SbpNode and SbpEdge
    
    * Rename
    
    * Rename
    
    * Rename mainstem to trunk
    
    * Typo, small bugs and rename
    
    * Rename and of format
    Yipeng1994 authored and wyg1997 committed Jul 28, 2022
    4e8aebb
  14. Refactor auto parallel rest (#8731)

    * Package up SbpCollector
    
    * Add const for SbpGraph
    
    * Add const for SbpNode
    
    * Add const for SbpEdge
    
    * Add const for SbpCollector
    
    * Add const, rename, and package up for BinarySet
    
    * Rename for BinarySet
    
    * Rename for SbpCollector
    
    * Rename for SbpCollector
    
    * Rename for algorithm utils
    
    * Fix a bug for an unused function AddEntries()
    
    * Rename for BinarySet
    
    * Rename for SbpConstructor
    
    * Rename for BoxingCollector
    
    * Add const for sbp utils
    Yipeng1994 authored and wyg1997 committed Jul 28, 2022
    56d70f8
  15. fix merge conflict

    wyg1997 committed Jul 28, 2022
    a183c7c

Commits on Aug 1, 2022

  1. Remove template for sbp signature (#8787)

    * Remove template for sbp signature
    
    * Remove _H_ from cpp files
    
    * Remove namespace specifier oneflow::
    
    * Remove namespace specifier oneflow::
    
    * Of format
    
    * Move the inline functions to cpp files
    
    * Can not add inline specifier?
    
    * Update oneflow/core/auto_parallel/sbp_graph.h
    
    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    
    * Of format
    
    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    Yipeng1994 and wyg1997 committed Aug 1, 2022
    f4093ff

Commits on Aug 4, 2022

  1. Refactor auto parallel class object stuff (#8835)

    * Delete copy/move constructor/operator
    
    * Move the destructor of SbpEdge to the cpp file
    
    * Equal by address for Sbp data structure
    
    * Replace sbp_sig_list_ with sbp_sig_obj_list_
    Yipeng1994 committed Aug 4, 2022
    7587e8c

Commits on Aug 5, 2022

  1. Fix auto parallel copy cost infer2 (#8788)

    * Check the output shape for operator in auto parallel
    
    * Return infinity for different sbps while is_mutable
    
    * Update oneflow/core/auto_parallel/sbp_constructor.cpp
    
    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    
    * Update oneflow/core/operator/operator.cpp
    
    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    
    * with output -> check output
    
    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    Yipeng1994 and wyg1997 committed Aug 5, 2022
    accb933
  2. Refactor prune identity as much as possible (#8849)

    * Prune a line of parallel cast ops
    
    * Avoid repeated pruning
    
    * Code clean up
    
    * Remove identity op
    
    * Update oneflow/core/job_rewriter/auto_parallel.cpp
    
    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    
    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    Yipeng1994 and wyg1997 committed Aug 5, 2022
    5ddc991

Commits on Aug 9, 2022

  1. Fix auto parallel low throughput (#8876)

    * Speed up after pruning identity
    
    * Slight changes
    Yipeng1994 authored and wyg1997 committed Aug 9, 2022
    a2db39d
  2. Refactor auto parallel final check (#8887)

    * Of format
    
    * Use const auto &
    
    * Of format and rename
    
    * Re-compute cost if steals sbp signatures
    Yipeng1994 authored and wyg1997 committed Aug 9, 2022
    6642e74
  3. 7a47afb

Commits on Aug 10, 2022

  1. 96237b3
  2. Docs auto parallel doc (#8896)

    * doc(AutoParallel): add auto parallel document framework
    
    * docs(AutoParallel): add document
    
    * fix typo
    
    * refine document
    
    * refine documentation
    wyg1997 committed Aug 10, 2022
    afd8a96

Commits on Aug 12, 2022

  1. a3e0886
  2. 9d1105f

Commits on Aug 15, 2022

  1. d0a834e

Commits on Aug 16, 2022

  1. Test alexnet for auto_parallel (#8917)

    * test(AutoParallel): test alexnet for auto_parallel
    
    * test(AutoParallel): test model add auto_parallel config
    wyg1997 committed Aug 16, 2022
    e0d1770

Commits on Aug 18, 2022

  1. Fix get sbp bug (#8939)

    * Fix the bug of missing sbp for uniform op
    
    * Speed up
    
    * Add the missing sbp for optional input UserSourceOpTickInput
    
    * Remove the repeated all-B sbp signature
    
    * Add sbp for undefined UserSourceOpTickInput
    Yipeng1994 committed Aug 18, 2022
    bf0da26

Commits on Aug 22, 2022

  1. 929d42d
  2. 010de9c

Commits on Aug 29, 2022

  1. 41a2835
  2. Address comments

    Yipeng1994 committed Aug 29, 2022
    4d91a0b
  3. f77441b
  4. fix merge conflict

    wyg1997 committed Aug 29, 2022
    a6ba01b

Commits on Sep 6, 2022

  1. Address comments

    Yipeng1994 committed Sep 6, 2022
    480afbb

Commits on Sep 14, 2022

  1. Disabled ZeRO when enabled AutoParallel (#9087)

    fix(AutoParallel): disabled ZeRO when enabled AutoParallel
    wyg1997 committed Sep 14, 2022
    512e17e

Commits on Sep 15, 2022

  1. f1d22ba

Commits on Sep 19, 2022

  1. Address comments

    Yipeng1994 committed Sep 19, 2022
    2c5b3f8

Commits on Sep 20, 2022

  1. Address comment.

    GetComputationCostFn -> GetComputationCost
    Yipeng1994 committed Sep 20, 2022
    99efb17

Commits on Sep 21, 2022

  1. c5872f3
  2. Update oneflow/core/job_rewriter/auto_parallel.cpp

    Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
    Yipeng1994 and wyg1997 committed Sep 21, 2022
    22f557f
  3. New interface for pr#9018

    Yipeng1994 committed Sep 21, 2022
    76dac2b
  4. Static analysis

    Yipeng1994 committed Sep 21, 2022
    2102158
  5. af49e8d
  6. 3942334
  7. Fix ones like sbp bug and fix test import error in CI (#9123)

    fix(AutoParallel): skip 1n1d sbp agreement check
    wyg1997 committed Sep 21, 2022
    7fc2f99
  8. 20d7199
  9. auto format by CI

    oneflow-ci-bot committed Sep 21, 2022
    145049e
  10. e332dfd

Commits on Sep 22, 2022

  1. 82c910e

Commits on Sep 26, 2022

  1. Address comments

    Yipeng1994 committed Sep 26, 2022
    0f0e25b

Commits on Sep 27, 2022

  1. 194c79f
  2. fix typo

    wyg1997 committed Sep 27, 2022
    c6e3f91
  3. Feat full auto parallel (#9140)

    * Use B for inplace op and remove the check for sbp
    while turning the auto parallelism on
    
    * Slight change
    
    * Not using B as the constraint
    
    * Address comments
    Yipeng1994 committed Sep 27, 2022
    6052f44
  4. 16d39c2
  5. Merge branch 'feat-auto_parallel' of github.com:oneflow-inc/oneflow into feat-auto_parallel

    wyg1997 committed Sep 27, 2022
    9eff2ca
  6. 083a623
  7. rename auto_parallel_prune_parallel_cast_ops to enable_auto_parallel_ignore_user_sbp_config

    wyg1997 committed Sep 27, 2022
    9144a5b
  8. 234e988
  9. 5e2014c