Revamp the rabit implementation. #10112

Merged 45 commits on May 20, 2024

Commits on May 8, 2024

  1. Rework RABIT.

    get host name.
    
    send port.
    
    all gather.
    
    assert.
    
    op.
    
    begin work on bootstrap.
    
    utilities.
    
    work on bootstrap.
    
    move listener.
    
    catch.
    
    comm.
    
    async send.
    
    block.
    
    block.
    
    Start work on async.
    
    batch poll.
    
    tests.
    
    move.
    
    Start working on tracker.
    
    better tests.
    
    work on tracker.
    
    bind.
    
    work on accepting workers.
    
    complete allgather.
    
    Move.
    
    Send with JSON.
    
    work on shutdown.
    
    msg.
    
    compare task.
    
    rename.
    
    Move.
    
    cleanup.
    
    Start work on broadcast.
    
    Work comm.
    
    Cleanup bootstrap.
    
    hide.
    
    Move to bootstrap.
    
    non blocking.
    
    cleanup.
    
    any op.
    
    shift.
    
    cleanup.
    
    Cleanup.
    
    per-thread.
    
    checks.
    
    start working on nccl.
    
    backend.
    
    Get the prototype compile.
    
    log.
    
    test print.
    
    timeout on connection.
    
    get nccl allreduce basic.
    
    look into federated.
    
    proto.
    
    scatter reduce.
    
    allreduce prototype.
    
    Work on tests.
    
    cleanup.
    
    Initialization.
    
    Init.
    
    Work on Python.
    
    get args.
    
    Start working on allgatherv.
    
    convert some allreduce.
    
    remove some old use.
    
    remove cpu impl.
    
    work on gpu.
    
    play with dlopen.
    
    convert.
    
    convert.
    
    convert.
    
    placeholder.
    
    backend.
    
    work on federated.
    
    remove.
    
    Move.
    
    Federated tracker.
    
    Move.
    
    move into comm.
    
    GPU variant.
    
    not just nccl.
    
    fix.
    
    fix.
    
    Convert.
    
    bitwise.
    
    stream.
    
    copying allgather.
    
    replace.
    
    Remove.
    
    remove.
    
    Remove device.
    
    Remove rabit.
    
    Remove rabit.
    
    cmake.
    
    tests.
    
    use gmock.
    
    Move.
    
    Split.
    
    init.
    
    Extract.
    
    compiler.
    
    test timeout.
    
    exc.
    
    comments.
    
    Tests for federated.
    
    Remove.
    
    remove.
    
    Split up.
    
    refactor tests.
    
    format.
    
    extract magic number.
    
    Extract more commands.
    
    refactor.
    
    Remove.
    
    Reduce dependency on c api.
    
    remove old code.
    
    throw.
    
    coll error.
    
    indirect.
    
    look into dask module.
    
    parameters.
    
    command.
    
    probing.
    
    listen for error.
    
    debug.
    
    host.
    
    cleanup.
    
    dask.
    
    loop.
    
    working basic.
    
    header.
    
    guard.
    
    test.
    
    type.
    
    socket.
    
    cleanup & notes.
    
    use a state machine.
    
    work on tests.
    
    header.
    
    test channel.
    
    cleanup.
    
    cleanup broadcast.
    
    unneeded changes.
    
    allgather string.
    
    Fixes.
    
    cleanup rebase.
    
    fixes after rebase.
    
    split up nccl comm.
    
    Move data copying.
    
    allgatherv test.
    
    Extract.
    
    tests.
    
    test allreduce.
    
    remove the use of ctx.
    
    tests.
    
    rebase.
    
    work on fed.
    
    work on allgatherv.
    
    name.
    
    lint.
    
    Split.
    
    split.
    
    remove gmock.
    
    move.
    
    CPU.
    
    CUDA.
    
    compile.
    
    Cleanup.
    
    header.
    
    Work on tests.
    
    checks.
    
    fixes.
    
    tests.
    
    work on CUDA test.
    
    comm.
    
    Share the implementation.
    
    tests.
    
    cleanup.
    
    cleanup.
    
    cleanup
    
    cleanup.
    
    set device.
    
    cleanup.
    
    cleanup.
    
    more.
    
    cleanup.
    
    Get it work.
    
    wait.
    
    revert dask changes.
    
    time.
    
    remove reference to encoder.
    
    extract.
    
    extract.
    
    split up the training function.
    
    Fix.
    
    deterministic.
    
    Fix.
    
    debug.
    
    Fixes.
    
    remove.
    
    cleanup.
    
    fix.
    
    Move worker env.
    
    cleanup.
    
    cleanup.
    
    wait.
    
    cleanup.
    
    extract error handling.
    
    get abort to work as well.
    
    Move.
    
    policy.
    
    cleanups.
    
    cleanup.
    
    Split up.
    
    doc.
    
    Cleanup ctor.
    
    tests.
    
    tests.
    
    tests.
    
    configuration.
    
    tests.
    
    task id.
    
    start working on metric tests.
    
    Remove.
    
    type.
    
    agg.
    
    fix seq.
    
    tests.
    
    start working on cuda test.
    
    type.
    
    fixes.
    
    tests.
    
    Use device ord.
    
    Remove auc.
    
    remove elementwise.
    
    remove multi-class
    
    cleanup aft.
    
    cleanup ranking.
    
    remove old tests.
    
    headers.
    
    Move.
    
    move.
    
    single gpu tests.
    
    Cleanup C API.
    
    unknown.
    
    C API.
    
    Small cleanup.
    
    cleanup.
    
    Fix.
    
    cleanup.
    
    work on async queue.
    
    work on sync.
    
    Use blocking op.
    
    result.
    
    Fuzzing.
    
    result.
    
    Remove coll error.
    
    Move.
    
    cleanup.
    
    cleanup.
    
    cleanup.
    
    cleanup.
    
    cleanup.
    
    cleanup.
    
    cleanup.
    
    Fix removal.
    
    test
    
    lint.
    
    remove.
    
    cleanup.
    
    cleanup.
    
    cleanup.
    
    invoke result.
    
    note.
    
    Fix rebase.
    
    Fix rebase.
    
    fix & cleanup.
    
    Fix.
    
    remove coll error for now.
    
    cleanup.
    
    replace.
    
    replace.
    
    replace.
    
    test basic.
    
    cleanup, fix.
    
    deduced size.
    
    cleanup.
    
    Convert to new routines.
    
    Fixes.
    
    Fix.
    
    Add test.
    
    use vector.
    
    cleanup.
    
    safe coll.
    
    Fix.
    
    Fix.
    
    Fix.
    
    Fix.
    
    cleanup.
    
    cleanup.
    
    Don't throw.
    
    Fixes.
    
    v6
    
    timeout.
    
    Cleanups.
    
    Remove error handling for now.
    
    Timeout.
    
    syc.
    
    mac.
    
    cli.
    
    remove.
    
    types.
    
    federated.
    
    build.
    
    build.
    
    sortby.
    
    build.
    
    windows, macos.
    
    macos.
    
    federated.
    
    macos.
    
    lint.
    
    skip finalize.
    
    Forbid empty data.
    
    Shutdown before dtor.
    
    small allreduce.
    
    rounddown.
    
    empty input.
    
    remove warning.
    
    remove get host IP.
    
    take down the jvm package for now.
    
    stop early.
    
    annotation.
    
    lint.
    
    debug check.
    
    windows.
    
    np.bool.
    
    Work on shutdown.
    
    test blocking.
    
    remove dask error.
    
    Detach.
    
    comments.
    
    blocking.
    
    display timeout.
    
    debug github error.
    
    checks.
    
    Switch the order.
    
    revert debug log.
    
    fix tests.
    
    delete.
    
    reverse.
    
    don't block.
    
    improved error.
    
    clear;
    
    shutdown the tracker.
    
    release lock early.
    
    comments.
    
    macos.
    
    windows.
    
    Move.
    
    const.
    
    Unify ctor.
    
    remove exceptions.
    
    lint, comment.
    
    freeze pyarrow.
    
    windows.
    
    r package.
    
    Fix CI.
    
    Start looking into jvm
    
    chpk
    
    jni.
    
    c test.
    
    remove extra argument.
    
    tracker.
    
    interrupt.
    
    cleanup.
    
    Fix spark profiling.
    
    Compile.
    
    Start convert the scala package.
    
    Log init.
    
    cleanup test.
    
    communicator.
    
    alive.
    
    tests
    
    log.
    
    Revert "log."
    
    This reverts commit 3bc6d82.
    
    Shutdown when exit.
    
    remove tracker return code.
    
    windows build.
    
    shutdown only if not closed.
    
    lint.
    
    protect the listener.
    
    concat.
    
    Debug log.
    
    detect EOF.
    
    Revert "Debug log."
    
    This reverts commit a3e0bd9.
    
    Cleanup.
    
    Fixes.
    
    lint.
    
    don't omit frame pointer.
    
    Refactor tests.
    
    Fix minimum build.
    
    Fix distributed tests on single GPU.
    
    cleanup & win build.
    
    MacOS compilation.
    
    typo.
    
    macos, jvm.
    
    Windows socket.
    
    Ignore POLLHUP
    
    Handle shutdown.
    
    unix socket
    
    fix
    
    MSVC
    
    Sock error
    
    Restore the shutdown signal.
    
    states
    
    windows sock
    
    enable win tests
    
    lint.
    
    update.
    
    Skip tests.
    
    skip only for gpu.
    
    cleanup.
    
    cleanup.
    
    remove error code for now.
    
    Documents.
    
    lower case.
    
    rename.
    
    Fix.
    
    Fix.
    trivialfis committed May 8, 2024
    6c446ca
  2. cleanup.

    trivialfis committed May 8, 2024
    abf2419
  3. cleanup jvm packages.

    trivialfis committed May 8, 2024
    9fa105d
  4. Fix demo chunks.

    trivialfis committed May 8, 2024
    3929daa
  5. consistent jvm.

    trivialfis committed May 8, 2024
    ae0f49d
  6. Revert "consistent jvm."

    This reverts commit 0525394.
    trivialfis committed May 8, 2024
    08ccefc
  7. non-jvm changes.

    trivialfis committed May 8, 2024
    22194a7
  8. bisect.

    test.
    trivialfis committed May 8, 2024
    bb8e40e
  9. revert test changes.

    trivialfis committed May 8, 2024
    e5c35ec
  10. revert gh changes.

    trivialfis committed May 8, 2024
    81d8692
  11. jvm changes.

    trivialfis committed May 8, 2024
    99d55b3
  12. unused import.

    trivialfis committed May 8, 2024
    baf99f3
  13. Revert "unused import."

    This reverts commit ee8e203.
    trivialfis committed May 8, 2024
    353f2de
  14. Revert "Revert "unused import.""

    This reverts commit 1baa02a.
    trivialfis committed May 8, 2024
    4690758
  15. update jackson.

    trivialfis committed May 8, 2024
    72ef55b
  16. try latest spark.

    trivialfis committed May 8, 2024
    594bbc9
  17. scope.

    trivialfis committed May 8, 2024
    2232a6e
  18. Revert "try latest spark."

    This reverts commit d7540d8.
    trivialfis committed May 8, 2024
    33876f6
  19. Fix.

    trivialfis committed May 8, 2024
    c3547ab
  20. GPU package.

    trivialfis committed May 8, 2024
    269090c
  21. more.

    trivialfis committed May 8, 2024
    eb7b88e
  22. Revert "more."

    This reverts commit eb7b88e.
    trivialfis committed May 8, 2024
    b4e97f9
  23. Revert jackson version

    trivialfis committed May 8, 2024
    b56ba97

Commits on May 9, 2024

  1. ae44c1a

Commits on May 11, 2024

  1. Fix secure definition.

    trivialfis committed May 11, 2024
    83ac716
  2. 56cf189

Commits on May 14, 2024

  1. a1bfe5f

Commits on May 15, 2024

  1. 04cb943
  2. Free.

    trivialfis committed May 15, 2024
    92a1bc2
  3. linter.

    trivialfis committed May 15, 2024
    d91cb40
  4. lint.

    trivialfis committed May 15, 2024
    ec4b35c
  5. 44a43a9
  6. rng.

    trivialfis committed May 15, 2024
    cc70c88
  7. 61983c4
  8. Fix rc check.

    trivialfis committed May 15, 2024
    e334141
  9. err

    trivialfis committed May 15, 2024
    53d072e

Commits on May 16, 2024

  1. Log.

    trivialfis committed May 16, 2024
    53d2a73
  2. Log time stamp.

    trivialfis committed May 16, 2024
    3391490
  3. cleanup.

    trivialfis committed May 16, 2024
    c9129ef

Commits on May 17, 2024

  1. Simplify the loop.

    trivialfis committed May 17, 2024
    4c90247
  2. Never throw.

    trivialfis committed May 17, 2024
    7bb4f22
  3. remove ref.

    trivialfis committed May 17, 2024
    c5331f4
  4. a53c87b
  5. Windows

    trivialfis committed May 17, 2024
    abc5f3b

Commits on May 18, 2024

  1. Windows.

    trivialfis committed May 18, 2024
    e4aa87b