
Add support for including PTX code in PyTorch #2328

Closed
Wants to merge 8 commits

Conversation


@Flamefire (Contributor) commented Feb 2, 2021

This adds PTX code to PyTorch by default for any architecture newer than the last selected one.
This can be changed via the new EC option "ptx".

  • Discussion about the cuda cache needs resolving (see below)

QUESTION: What about cuda_cache_size? It might be better to make this an EasyBuild option (similar to --cuda-compute-capabilities) instead. For PyTorch the cache seems to get quite large: running the test test_cpp_extensions_aot_no_ninja alone fills up 1 GB.

Framework PR: easybuilders/easybuild-framework#3569. If that is merged, I can remove the option in this EC.
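
For illustration, a minimal sketch of what the "ptx" option does, assuming the easyblock assembles TORCH_CUDA_ARCH_LIST from the configured compute capabilities; the helper name and values are hypothetical, not the actual easyblock code:

```python
def build_torch_cuda_arch_list(cuda_cc, ptx='latest'):
    """Hypothetical helper: append '+PTX' to the last compute capability so
    PyTorch also embeds PTX code, which newer GPUs can JIT-compile at runtime."""
    arch_list = list(cuda_cc)
    if ptx == 'latest' and arch_list:
        arch_list[-1] += '+PTX'
    return ';'.join(arch_list)

print(build_torch_cuda_arch_list(['3.5', '7.0']))  # -> 3.5;7.0+PTX
```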

@boegel added this to the next release (4.3.3?) milestone Feb 2, 2021
@Flamefire (Contributor Author)

Had to add a bugfix, as TORCH_CUDA_ARCH_LIST needs to be set for the tests too, or the build (even the current one) will fail if the GPUs found during the build are newer than what the nvcc in use supports. See https://gist.github.com/3ce737772ff805683c226e500b525c67
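
A minimal sketch of the fix, assuming the same architecture list is exported for the test step (the value shown is illustrative):

```python
import os

# Export the same architecture list used for the build, so the test step
# does not auto-detect local GPUs that the available nvcc cannot target.
os.environ['TORCH_CUDA_ARCH_LIST'] = '3.5;7.0+PTX'  # illustrative value
```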

@Flamefire (Contributor Author)

Test report by @Flamefire

Overview of tested easyconfigs (in order)

Build succeeded for 0 out of 1 (1 easyconfigs in total)
taurusi8028 - Linux centos linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor, Python 2.7.5
See https://gist.github.com/d30c8f48f9c0948ea0c543797ddff458 for a full test report.

```diff
@@ -51,7 +51,9 @@ def extra_options():
     extra_vars.update({
         'excluded_tests': [{}, 'Mapping of architecture strings to list of tests to be excluded', CUSTOM],
         'custom_opts': [[], 'List of options for the build/install command. Can be used to change the defaults ' +
-                        'set by the PyTorch EasyBlock, for example ["USE_MKLDNN=0"].', CUSTOM]
+                        'set by the PyTorch EasyBlock, for example ["USE_MKLDNN=0"].', CUSTOM],
+        'ptx': ['latest', 'For which compute architectures PTX code should be generated. Can be '
```
Member

The CUDA arches are not guaranteed to be in order, so one of the following changes is required:

  1. "latest" to become "last"
  2. The code below changes to add +PTX to the latest CUDA arch
  3. We order the CUDA arches (see the sketch after this list)
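
A minimal sketch of option 3, assuming compute capabilities come in as "major.minor" strings (the helper name is hypothetical):

```python
def sort_cuda_cc(cuda_cc):
    """Order 'major.minor' compute capabilities numerically, so that the
    last entry is the newest and '+PTX' can safely be appended to it."""
    return sorted(cuda_cc, key=lambda cc: tuple(int(part) for part in cc.split('.')))

print(sort_cuda_cc(['7.0', '3.5', '6.1']))  # -> ['3.5', '6.1', '7.0']
```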

Contributor Author

Good catch! Using "last", which matches "first" and is easiest.

@Flamefire (Contributor Author)

I had to add an option for setting up the CUDA cache, as using PTX (on by default) will now possibly trigger JIT compilation, which writes to the HOME directory. See https://developer.nvidia.com/blog/cuda-pro-tip-understand-fat-binaries-jit-caching/ for details.
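
A minimal sketch of what such a cache setup could export, using the JIT-cache environment variables described in the linked NVIDIA post (the path and size shown are illustrative):

```python
import os

# Redirect the CUDA JIT cache away from $HOME and give it room to grow.
os.environ['CUDA_CACHE_PATH'] = '/tmp/cuda-cache'    # illustrative location
os.environ['CUDA_CACHE_MAXSIZE'] = str(2 * 1024**3)  # 2 GiB cap, in bytes
```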

There also seems to be a failure which I'm not sure about. Might be because I started the test too early. Will rerun.

@Flamefire (Contributor Author)

Test report by @Flamefire

Overview of tested easyconfigs (in order)

  • SUCCESS PyTorch-1.7.1-fosscuda-2019b-Python-3.7.4.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
taurusml3 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/a2fd3c5a7ab9956d83e1df29e62cb6f7 for a full test report.

@Flamefire (Contributor Author)

Test report by @Flamefire

Overview of tested easyconfigs (in order)

Build succeeded for 0 out of 1 (1 easyconfigs in total)
taurusi8033 - Linux centos linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor, Python 2.7.5
See https://gist.github.com/333e42009f7c32b178b429f547deae8b for a full test report.

@Flamefire (Contributor Author)

While this does work, the amount of JIT compiling that happens when running a PyTorch compiled this way makes it unfeasible. So I'd recommend against using this and am closing it.
