feat: Update NCCL to v2.21.5 #1780

Merged (4 commits) on Jun 28, 2024

Conversation

bryantbiggs
Contributor

@bryantbiggs bryantbiggs commented Apr 8, 2024

@bryantbiggs
Contributor Author

@bryantbiggs bryantbiggs mentioned this pull request May 16, 2024
@nWEIdia
Collaborator

nWEIdia commented May 16, 2024

Please help integrate #1823's ARM related changes as well, thanks!

@bryantbiggs
Contributor Author

@nWEIdia done!

Collaborator

@nWEIdia nWEIdia left a comment

Looks great!

@Skylion007

Once this is merged, feel free to merge my PR on PyTorch

@bryantbiggs
Contributor Author

cc @atalman @malfet

@nWEIdia
Collaborator

nWEIdia commented Jun 10, 2024

Could you please do another rebase? @bryantbiggs

@Skylion007 we may need to run another round of pytorch/pytorch tests after Bryant rebases.

@bryantbiggs
Contributor Author

@nWEIdia done!

@bryantbiggs
Contributor Author

rebased to include CI fixes from #1885

@bryantbiggs
Contributor Author

Looks like the failed test is a flake. Could someone re-run the failed tests?

Dockerfile_aarch64:15
--------------------
  13 |     # the binary builds (torch, vision, audio, text, data)
  14 |     RUN yum -y install epel-release
  15 | >>> RUN yum -y update
  16 |     RUN yum install -y \
  17 |       autoconf \
--------------------
ERROR: failed to solve: process "/bin/sh -c yum -y update" did not complete successfully: exit code: 1

@atalman atalman merged commit 448908b into pytorch:main Jun 28, 2024
26 checks passed
@bryantbiggs bryantbiggs deleted the feat/nccl-1-21-5 branch June 28, 2024 17:42

5 participants