Fix some issues with the GPU support to enable simple Arkouda builds #20239

e-kayrakli · 2022-07-18T19:12:16Z

This branch makes a bunch of small-ish changes in an attempt to get Arkouda to
compile with the GPU locale model. I can separate this into multiple PRs, but I
want to test it some more as a whole. Fixes with this PR:

- Resolves "Clang error compiling included C++ headers with GPU locale model" (#19754)
  - Stops adding extern "C" around the C headers included at the command line.
    This was proposed by @milthorpe.
  - Fixes an interop test broken by this.
    - To be able to do that, makes the runtime's generated-code interface for
      complex numbers C-based even if we're compiling with C++. (A prior PR that
      made some improvements in that direction is "Adjust runtime headers to
      better handle C++ compilation" (#16847).)
    - This enables defining the c_string_to_complex* functions so that they are
      included in the generated executable, because their return types are no
      longer std::complex<>, which you can't link with C linkage.
- Adds a "no gpu codegen" pragma to thwart gpuization within some internal
  functions. Currently this only applies to:
  - chpl__initCopy_shapeHelp: Has a PRIM_ASSIGN that gets normalized after
    gpuization, but not resolved. We probably need this function, and lacking it
    might be preventing us from using loop expressions to initialize arrays on
    GPUs.
- Fixes the .name implementation of the GPULocale type.
  - Adds a relevant test.
- Adds support for the BitOps module.
  - Adds a relevant test.
- While there, cleans up gpuTransforms.cpp a bit.
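
As a rough illustration of the complex-number point in the list above, here is a minimal C++ sketch of a C-based interface: a plain struct crosses the C-linkage boundary, and conversion to std::complex<> happens only on the C++ side. The names `demo_complex128`, `demo_make_complex128`, `demo_to_std` and `demo_complex_usage` are hypothetical and are not the runtime's actual types or functions.

```cpp
#include <cassert>
#include <complex>

extern "C" {

// Hypothetical C-compatible complex type: two doubles, no C++ class involved.
// This is an illustration only, not the runtime's actual definition.
typedef struct {
  double re;
  double im;
} demo_complex128;

// Hypothetical function with C linkage. Its return type is a plain struct,
// so the same signature is valid in both C and C++.
demo_complex128 demo_make_complex128(double re, double im) {
  demo_complex128 c = {re, im};
  return c;
}

}  // extern "C"

// C++-only convenience: convert to std::complex outside the extern "C"
// block, so no C++ class appears in a C-linkage signature.
inline std::complex<double> demo_to_std(demo_complex128 c) {
  return std::complex<double>(c.re, c.im);
}

// Small usage check.
inline void demo_complex_usage() {
  demo_complex128 c = demo_make_complex128(1.5, -2.0);
  assert(c.re == 1.5 && c.im == -2.0);
  assert(demo_to_std(c) == std::complex<double>(1.5, -2.0));
}
```

The design choice sketched here is to keep the boundary type trivially representable in C and push anything C++-specific to the caller's side.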

Test:

- gpu/native
- gpu/interop
- make check + spot checks in types/complex with intel as the host+target compiler
- make check + spot checks in types/complex with cray as the host+target compiler

e-kayrakli · 2022-07-18T19:17:07Z

@stonea -- I did the refactoring we discussed online and some more. This commit has them (with a new function this PR adds). Let me know if you think I went too far and want me to split that commit into its own PR.

@mppf -- This commit has the core of the complex-related change in the runtime. I'll do some portability testing there. Do you have any specific concerns about configs where what I am doing could be problematic? In your #16847, it looks like you just did full local testing.

mppf · 2022-07-18T19:53:06Z

@e-kayrakli - I don't have any concerns or specific test configurations to bring up.

runtime/include/chpltypes.h

e-kayrakli · 2022-07-20T17:17:48Z

@milthorpe -- This still needs some work, but we'll merge it soon. I'm wondering if you can give it a try to see if you can get a local, no-module Arkouda server built with it?

compiler/optimizations/gpuTransforms.cpp

stonea · 2022-07-20T19:54:32Z

> @stonea -- I did the refactoring we discussed online and some more. This commit has them (with a new function this PR adds). Let me know if you think I went too far and want me to split that commit into its own PR.

I don't think we need to be sticklers for separating refactoring changes out into a separate PR. Having it as a separate commit is nice for software archeology and reviewing purposes so I appreciate that.

milthorpe · 2022-07-21T11:23:09Z

Today I re-tried compiling Arkouda and got this error, which I believe is a known problem:

```
In file included from <built-in>:3:
In file included from /tmp/chpl-milthorpe.deleteme-jUa4RL/command-line-includes.h:7:
In file included from /noback/milthorpe/chapel-1.27.0/modules/packages/ZMQHelper/zmq_helper.h:26:
In file included from /noback/milthorpe/chapel-1.27.0/runtime/include/qio/qio.h:26:
In file included from /noback/milthorpe/chapel-1.27.0/runtime/include/qio/qbuffer.h:39:
In file included from /noback/milthorpe/chapel-1.27.0/runtime/include/qio/deque.h:44:
In file included from /noback/milthorpe/chapel-1.27.0/runtime/include/chpl-mem.h:30:
In file included from /noback/milthorpe/chapel-1.27.0/runtime/include/arg.h:24:
In file included from /noback/milthorpe/chapel-1.27.0/runtime/include/chpltypes.h:39:
In file included from /noback/milthorpe/chapel-1.27.0/third-party/llvm/install/linux64-x86_64/lib/clang/14.0.0/include/cuda_wrappers/complex:33:
/auto/software/swtree/ubuntu20.04/x86_64/gcc/11.3.0/lib/gcc/x86_64-pc-linux-gnu/11.3.0/../../../../include/c++/11.3.0/type_traits:44:3: error: templates must have C++ linkage
  template<typename... _Elements>
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/chpl-milthorpe.deleteme-jUa4RL/command-line-includes.h:2:1: note: extern "C" language linkage specification begins here
extern "C" {
^
```
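
For readers unfamiliar with this diagnostic, here is a minimal C++ sketch of the situation in the log above: a C++ standard header pulled in while an extern "C" block is open fails, whereas a header that opens extern "C" only around its own declarations does not. The names `demo_add`, `demo_twice` and `demo_linkage_usage` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <type_traits>  // fine: included outside any extern "C" block

// What the failing build effectively did (kept as a comment because it does
// not compile):
//
//   extern "C" {
//   #include <type_traits>   // error: templates must have C++ linkage
//   }

// Conventional alternative: the header opens extern "C" itself, and only
// around its own C declarations.
#ifdef __cplusplus
extern "C" {
#endif

int demo_add(int a, int b);  // hypothetical C API

#ifdef __cplusplus
}
#endif

// Definition of the hypothetical C function, with matching C linkage.
extern "C" int demo_add(int a, int b) { return a + b; }

// Templates stay outside extern "C", so they keep C++ linkage.
template <typename T>
T demo_twice(T x) {
  static_assert(std::is_arithmetic<T>::value, "arithmetic types only");
  return x + x;
}

// Small usage check.
inline void demo_linkage_usage() {
  assert(demo_add(2, 3) == 5);
  assert(demo_twice(4) == 8);
}
```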

e-kayrakli · 2022-07-21T16:30:38Z

@milthorpe did you get that with this PR applied on main? It is not really a must that you do it for this PR to proceed, but if you are getting this error with this PR applied, that's surprising.

milthorpe · 2022-07-22T06:45:48Z

> @milthorpe did you get that with this PR applied on main? It is not really a must that you do it for this PR to proceed, but if you are getting this error with this PR applied, that's surprising.

Oops, I didn't correctly apply the PR. With the PR, I get:

```
$CHPL_HOME/modules/internal/DefaultRectangular.chpl:1494: error: GPU support does not currently allow nested kernel launches. Do you have
nested forall/foreach loops or looping over a multidimensional domain?
```

e-kayrakli · 2022-07-22T14:46:10Z

Ah, OK. That's good enough for this PR, I think/hope.

FWIW, the error you are getting is also a bit surprising, but probably not from the GPU support perspective. Do you have a module in your ServerModules.cfg that supports multidimensional arrays in Arkouda (which I know is something @bmcdonald3 has recently worked on)? In any case, this PR causes issues with a bunch of other Arkouda modules, too. But you should be able to compile a local, no-module Arkouda server.

bmcdonald3 · 2022-07-22T17:28:08Z

Had a chance to look into this a bit and I confirmed that the error you are running into is from the HDF5MultiDim module:

```
$CHPL_HOME/modules/internal/DefaultRectangular.chpl:1494: error: GPU support does not currently allow nested kernel launches. Do you have
nested forall/foreach loops or looping over a multidimensional domain?
```

So that can be removed by commenting out/deleting that module from the ServerModules.cfg file in the Arkouda repo.

I was able to get Arkouda to compile with GPU support using this patch with all the modules commented out from the build, but it does seem that some of the Arkouda modules still don't build with GPU support unfortunately. There is some documentation about the modular building of Arkouda located at https://github.com/Bears-R-Us/arkouda/blob/master/MODULAR.md, but I am happy to offer more assistance there if it would help.

e-kayrakli · 2022-07-22T17:42:08Z

Thanks for checking @bmcdonald3!

> but it does seem that some of the Arkouda modules still don't build with GPU support unfortunately

That's right. I looked at only a few of those modules. It was relatively easy to find the "offending" foralls. And I believe trying to get all the modules to compile is a good stress test for our GPU support down the road.

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

- Add a new function to check for the new "no gpu codegen" pragma
- Store the parent function of the loop in a private field
- Refactor the recursive function into a non-recursive wrapper and a recursive, static helper
- Remove `allowFnCalls` from the GpuizableLoop interface

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>
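
The "non-recursive wrapper plus recursive, static helper" shape mentioned in this commit message can be sketched roughly as follows. This is a generic C++ illustration, not the code in gpuTransforms.cpp: `DemoFn`, `isGpuEligible` and `calleesAreEligible` are hypothetical, and the assumption that the pragma is checked along the call graph is made only for the sake of the example. How the real compiler walks calls is not shown in this thread.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a compiler function symbol.
struct DemoFn {
  bool noGpuCodegen = false;           // models the "no gpu codegen" pragma
  std::vector<const DemoFn*> callees;  // functions called from this one
};

// Recursive helper with internal linkage. The visited set is threaded
// through so recursive call graphs terminate.
static bool calleesAreEligible(const DemoFn& fn,
                               std::set<const DemoFn*>& visited) {
  if (fn.noGpuCodegen) return false;
  if (!visited.insert(&fn).second) return true;  // already seen
  for (const DemoFn* callee : fn.callees) {
    assert(callee != nullptr);
    if (!calleesAreEligible(*callee, visited)) return false;
  }
  return true;
}

// Non-recursive wrapper: owns the traversal state, so callers never see it.
bool isGpuEligible(const DemoFn& fn) {
  std::set<const DemoFn*> visited;
  return calleesAreEligible(fn, visited);
}
```

Splitting the function this way keeps the public entry point free of bookkeeping parameters, while the helper stays private to the translation unit.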

mppf reviewed Jul 18, 2022

runtime/include/chpltypes.h

stonea approved these changes Jul 20, 2022

compiler/optimizations/gpuTransforms.cpp

e-kayrakli added 9 commits July 27, 2022 16:02

An attempt to remove C++ complex from runtime

e03d5d1

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Cleanup

fa23135

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Fix an interop test

f5cacea

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Support the BitOps module

6a2cf7a

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Add 'no gpu codegen' pragma and use it

68e9de9

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Minor cleanup

db7cd12

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Clean the compiler implementation a bit

be9621e

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Add test for the new pragma

4adb9ad

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

e-kayrakli force-pushed the gpu-zmq2 branch from b15af38 to 232a4f7 Compare July 27, 2022 23:31

e-kayrakli marked this pull request as ready for review July 27, 2022 23:31

e-kayrakli added 2 commits July 27, 2022 16:33

Add a comment for the new define

c3e2ee0

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

Address Andy's comments

3d74b37

Signed-off-by: Engin Kayraklioglu <e-kayrakli@users.noreply.github.com>

e-kayrakli merged commit 7e6992d into chapel-lang:main Jul 28, 2022

e-kayrakli deleted the gpu-zmq2 branch July 28, 2022 01:00

e-kayrakli mentioned this pull request Jul 28, 2022

Add the explicit include to runtime's bitops header #20321

Merged

e-kayrakli added a commit that referenced this pull request Jul 28, 2022

Merge pull request #20321 from e-kayrakli/fix-bitops

815c267

Add the explicit include to runtime's bitops header This was an obvious missing piece from #20239. [Trivial, not reviewed] Test: - [x] standard

DanilaFe mentioned this pull request Nov 7, 2023

GPU: Initializers with promoted expressions don't get GPUized. #23801

Closed
