Add cu118 workflows #90826

Closed
wants to merge 6 commits into from

ptrblck
Collaborator

@ptrblck ptrblck commented Dec 14, 2022

@ptrblck ptrblck requested a review from a team as a code owner December 14, 2022 07:03
@pytorch-bot

pytorch-bot bot commented Dec 14, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90826

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 515c1ec:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla

linux-foundation-easycla bot commented Dec 14, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Dec 14, 2022
@zou3519 zou3519 added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Dec 15, 2022
@atalman atalman added ciflow/binaries Trigger all binary build and upload jobs on the PR ciflow/binaries_conda Trigger binary build and upload jobs for conda on the PR ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR ciflow/binaries_libtorch Trigger binary build and upload jobs for libtorch on the PR and removed ciflow/binaries Trigger all binary build and upload jobs on the PR ciflow/binaries_conda Trigger binary build and upload jobs for conda on the PR ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR ciflow/binaries_libtorch Trigger binary build and upload jobs for libtorch on the PR labels Dec 16, 2022
@ptrblck
Collaborator Author

ptrblck commented Dec 18, 2022

The macOS build failures seem to come from an invalid use of --jobs 0:

+ git submodule update --init --recursive --jobs 0
~/work/pytorch/pytorch/pytorch ~/work/pytorch/pytorch
BUG: run-command.c:1521: you must provide a non-zero number of processes!

while the Windows build issues seem to be real:

Path = C:\actions-runner\_work\pytorch\pytorch\builder\windows\internal\\..\temp_build\gpu_driver_dlls.zip
Type = zip
Physical Size = 5127591


Would you like to replace the existing file:
  Path:     C:\Windows\System32\nvcuda.dll
  Size:     2729776 bytes (2666 KiB)
  Modified: 2021-04-10 18:27:18
with the file from archive:
  Path:     nvcuda.dll
  Size:     17459864 bytes (17 MiB)
  Modified: 2019-11-14 22:47:12


Break signaled
? (Y)es / (N)o / (A)lways / (S)kip all / A(u)to rename all / (Q)uit? 
Archives with Errors: 1

Cleaning temp files

C:\actions-runner\_work\pytorch\pytorch\builder>goto set_cuda_env_vars 

C:\actions-runner\_work\pytorch\pytorch\builder>echo Setting up environment... 
Setting up environment...

C:\actions-runner\_work\pytorch\pytorch\builder>set "PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\libnvvp;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Program Files\Amazon\cfn-bootstrap;C:\ProgramData\chocolatey\bin;C:\Program Files\Amazon\AWSCLIV2;C:\Program Files\Git\cmd;C:\Program Files\Git\mingw64\bin;C:\Program Files\Git\usr\bin;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit;C:\Users\runneruser\AppData\Local\Microsoft\WindowsApps" 

C:\actions-runner\_work\pytorch\pytorch\builder>set "CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8" 

C:\actions-runner\_work\pytorch\pytorch\builder>set "CUDA_PATH_V11_8=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8" 

C:\actions-runner\_work\pytorch\pytorch\builder>set "NVTOOLSEXT_PATH=C:\Program Files\NVIDIA Corporation\NvToolsExt" 

C:\actions-runner\_work\pytorch\pytorch\builder>if errorlevel 1 exit /b 1 
Error: Process completed with exit code 1.
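A log like the one above can be narrowed down by pulling out only the lines that look like prompts or failures. The following is a minimal sketch of that idea. The marker list is copied from strings in this log and is an assumption about what is worth surfacing, and `find_failure_hints` is a hypothetical helper, not part of any PyTorch tooling.

```python
import re

# Markers copied from the log quoted above. Which lines count as "interesting"
# is an assumption for illustration, not an official list of CI error patterns.
FAILURE_MARKERS = [
    r"Would you like to replace the existing file",
    r"Break signaled",
    r"Archives with Errors: \d+",
    r"Error: Process completed with exit code \d+",
]


def find_failure_hints(log_text):
    """Return (line_number, stripped_line) pairs for lines matching any marker."""
    pattern = re.compile("|".join(FAILURE_MARKERS))
    hints = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if pattern.search(line):
            hints.append((number, line.strip()))
    return hints


if __name__ == "__main__":
    sample = "\n".join([
        "Type = zip",
        "Would you like to replace the existing file:",
        "Break signaled",
        "Archives with Errors: 1",
        "Cleaning temp files",
        "Error: Process completed with exit code 1.",
    ])
    for number, line in find_failure_hints(sample):
        print(f"{number}: {line}")
```

Run against this log, the first hit is the interactive overwrite prompt for nvcuda.dll, which appears before the "Archives with Errors" line and the final exit code. That ordering only suggests where to start looking; it does not establish the root cause.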

I'm currently unsure what's causing these issues and what the actual error is, so I would need to dig into it a bit.
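For the macOS failure quoted earlier in this comment, one defensive option is to pass --jobs only when the value is positive. The sketch below assumes the git version on the runner rejects a literal 0, as the BUG message suggests. `submodule_update_command` and `update_submodules` are hypothetical helpers for illustration, and this is not the fix that was applied in this PR.

```python
import subprocess


def submodule_update_command(jobs=None):
    """Build the argv for `git submodule update`, adding --jobs only if positive."""
    command = ["git", "submodule", "update", "--init", "--recursive"]
    # Assumption: a missing or non-positive value means "let git pick its default",
    # so the flag is left out instead of passing 0 through.
    if jobs is not None and jobs > 0:
        command += ["--jobs", str(jobs)]
    return command


def update_submodules(repo_dir, jobs=None):
    """Run the update inside repo_dir; raises CalledProcessError if git fails."""
    subprocess.run(submodule_update_command(jobs), cwd=repo_dir, check=True)


if __name__ == "__main__":
    print(submodule_update_command(0))
    print(submodule_update_command(4))
```

Keeping the command construction separate from the subprocess call makes the guard easy to check without a git checkout.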

@ptrblck ptrblck mentioned this pull request Dec 19, 2022
@atalman atalman added ciflow/binaries Trigger all binary build and upload jobs on the PR ciflow/binaries_conda Trigger binary build and upload jobs for conda on the PR ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR and removed ciflow/binaries_conda Trigger binary build and upload jobs for conda on the PR ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR ciflow/binaries_libtorch Trigger binary build and upload jobs for libtorch on the PR ciflow/binaries Trigger all binary build and upload jobs on the PR labels Dec 19, 2022
@atalman
Contributor

atalman commented Dec 19, 2022

The Windows issues occurred because the Windows 11.8 image was still not available in production. This should be resolved this morning. I am rerunning the tests now.

@ngimel
Collaborator

ngimel commented Dec 19, 2022

If the Windows issues persist, we should start 11.8 runs and nightlies for Linux only; it's been long enough.
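The Linux-only fallback suggested here amounts to filtering the build matrix so that 11.8 entries are kept only for Linux while other CUDA versions are left alone. Below is a rough sketch of that filter. The entry shape (`platform`, `cuda` keys) and the `select_builds` name are made up for illustration and do not reflect the format used by PyTorch's workflow generation scripts.

```python
def select_builds(matrix, cuda_version, allowed_platforms):
    """Keep every entry, except entries for cuda_version on non-allowed platforms."""
    return [
        entry
        for entry in matrix
        if entry["cuda"] != cuda_version or entry["platform"] in allowed_platforms
    ]


if __name__ == "__main__":
    # Hypothetical matrix entries, not taken from the PR.
    matrix = [
        {"platform": "linux", "cuda": "11.7"},
        {"platform": "windows", "cuda": "11.7"},
        {"platform": "linux", "cuda": "11.8"},
        {"platform": "windows", "cuda": "11.8"},
    ]
    for entry in select_builds(matrix, "11.8", {"linux"}):
        print(entry["platform"], entry["cuda"])
```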

@atalman atalman added ciflow/binaries Trigger all binary build and upload jobs on the PR ciflow/binaries_conda Trigger binary build and upload jobs for conda on the PR ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR ciflow/binaries_libtorch Trigger binary build and upload jobs for libtorch on the PR and removed ciflow/binaries Trigger all binary build and upload jobs on the PR ciflow/binaries_conda Trigger binary build and upload jobs for conda on the PR ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR labels Dec 19, 2022
@atalman
Contributor

atalman commented Dec 20, 2022

@pytorchbot merge

@pytorch-bot

pytorch-bot bot commented Dec 20, 2022

This PR needs to be approved by an authorized maintainer before merge.

Contributor

@atalman atalman left a comment

LGTM

@atalman
Contributor

atalman commented Dec 20, 2022

@pytorchbot merge -f "Signal is green"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Labels
ciflow/binaries_conda Trigger binary build and upload jobs for conda on the PR ciflow/binaries_libtorch Trigger binary build and upload jobs for libtorch on the PR ciflow/binaries_wheel Trigger binary build and upload jobs for wheel on the PR ciflow/binaries Trigger all binary build and upload jobs on the PR Merged open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

6 participants