Pytorch 1.12.1 - and Disable kineto due to cupti conflict #136

Merged 7 commits on Aug 10, 2022

@conda-forge-linter conda-forge-linter commented Jul 31, 2022

Hi! This is the friendly automated conda-forge-webservice.

I've rerendered the recipe as instructed in #135.

Here's a checklist to do before merging.

  • Bump the build number if needed.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

conda-forge-webservices[bot] and others added 3 commits July 31, 2022 22:02
@hmaarrfk changed the title from "MNT: rerender" to "Pytorch 1.12 - Disable cupti" on Jul 31, 2022
@hmaarrfk

hmaarrfk commented Aug 1, 2022

This isn't having the desired effect; the 10.2 build log shows:

2022-08-01T03:53:43.6127771Z    INFO (pytorch,lib/python3.10/site-packages/torch/lib/libtorch_cpu.so): Needed DSO lib/libcupti.so.10.2 found in conda-forge::cudatoolkit-10.2.89-h713d32c_10

@hmaarrfk changed the title from "Pytorch 1.12 - Disable cupti" to "Pytorch 1.12 - Disable kineto due to cupti conflict" on Aug 1, 2022

# CUPTI seems to cause trouble when users install a version of
# cudatoolkit different than the one specified at compile time.
# https://github.com/conda-forge/pytorch-cpu-feedstock/issues/135
export LIBKINETO_NOCUPTI=ON
export USE_KINETO=ON

Did you forget to set it to "OFF" here?

thanks
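
For reference, a minimal sketch of the two exports with the reviewer's suggestion applied. This assumes the fix was simply to flip `USE_KINETO` to `OFF`; the thread does not show the final committed lines, so treat the values below as an illustration rather than the merged build script.

```shell
# Assumption: same two variables as in the snippet under review,
# with USE_KINETO switched to OFF as the reviewer suggested.
# LIBKINETO_NOCUPTI is left unchanged from the reviewed snippet.
export LIBKINETO_NOCUPTI=ON
export USE_KINETO=OFF
```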

@ngam

ngam commented Aug 1, 2022

Suggestion: limit the builds to cuda112 and only use one cuda arch (e.g. 8.0) for testing here so that the ci finishes within six hours

@hmaarrfk

hmaarrfk commented Aug 1, 2022

> Suggestion: limit the builds to cuda112 and only use one cuda arch (e.g. 8.0) for testing here so that the ci finishes within six hours

Seems like that would complicate the process even more. I would rather do one thing at a time: address the cupti bugs instead of revamping the build system at the same time.

@ngam

ngam commented Aug 1, 2022

> > Suggestion: limit the builds to cuda112 and only use one cuda arch (e.g. 8.0) for testing here so that the ci finishes within six hours
>
> Seems like that would complicate the process even more. I would rather do one thing at a time: address the cupti bugs instead of revamping the build system at the same time.

Fine with me. My suggestion was merely to get this to finish within six hours for diagnostics only within this PR, not as a permanent change. Once we see whether disabling kineto actually does the trick, we can go back to building all the cuda arches as before. This would remove the need to build this locally (unless you got the info below from the CI?)

2022-08-01T03:53:43.6127771Z    INFO (pytorch,lib/python3.10/site-packages/torch/lib/libtorch_cpu.so): Needed DSO lib/libcupti.so.10.2 found in conda-forge::cudatoolkit-10.2.89-h713d32c_10

@hmaarrfk

hmaarrfk commented Aug 1, 2022

The 10.2 builds finish within the allotted time (sometimes), so you can check those. That is where the log line you quoted came from.

@ngam

ngam commented Aug 1, 2022

> 10.2 builds finish within the allotted time (sometimes)

Ah, didn't know that. Okay then, we can check that in the meantime!

@hmaarrfk

hmaarrfk commented Aug 1, 2022

Well, kineto is properly disabled, and cupti is nowhere to be found in the logs for now. I think we are good. I'm going to build the CUDA variants locally.
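
One way to repeat this check against a build log is to search for the linkage report line quoted earlier in the thread. A minimal sketch: `log_needs_cupti` and `build.log` are hypothetical names, and the pattern assumes the log uses the same `Needed DSO ... libcupti` wording as the quoted line.

```shell
# Hypothetical helper: exits 0 if the given build log reports that a
# packaged library needs a libcupti DSO, and non-zero otherwise.
log_needs_cupti() {
    grep -q 'Needed DSO .*libcupti' "$1"
}
```

Usage would look like `log_needs_cupti build.log && echo "cupti is still linked"`.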

@ngam

ngam commented Aug 5, 2022

Let's try to change this into 1.12.1?

@hmaarrfk

hmaarrfk commented Aug 5, 2022

I guess I had the builds ready, but I don't see the sense in having too much uploaded.

@hmaarrfk changed the title from "Pytorch 1.12 - Disable kineto due to cupti conflict" to "Pytorch 1.12.1 - and Disable kineto due to cupti conflict" on Aug 5, 2022
@ngam

ngam commented Aug 5, 2022

> I guess I had the builds ready, but I don't see the sense in having too much uploaded.

If you do, just upload them! Sorry, I only wrote that thinking you hadn't built them already!

Upload them and we can deal with the 1.12.1 stuff later.

@ngam

ngam commented Aug 5, 2022

Also, I am not sure if this is their final 1.12.1 tag... pytorch/pytorch@v1.12.0...v1.12.1

They didn't announce it under releases, but it has already appeared on PyPI, etc.

@hmaarrfk

hmaarrfk commented Aug 5, 2022

I'm pretty swamped right now with the day job. I think I can get this in by next week.

@hmaarrfk

hmaarrfk commented Aug 5, 2022

Also.

pytorch/pytorch#81680 (comment)

@ngam

ngam commented Aug 5, 2022

Take your time :)

@hmaarrfk

GPU build logs
log_files.zip

@hmaarrfk merged commit 2a3f37b into conda-forge:main on Aug 10, 2022
@hmaarrfk

GPU builds are being uploaded now.
