
PyMemoryView error and multi CUDA flag errors #404

Closed

trontrytel opened this issue Jun 29, 2020 · 14 comments

@trontrytel
Collaborator

I was trying to compile the current master version of libcloudph++ in the singularity image. I'm running into two problems:

  1. The -DLIBCLOUDPHXX_FORCE_MULTI_CUDA=1 flag makes the compilation hang.

  2. Without that flag I get the following error:

/libcloudphxx/bindings/python/lgrngn.hpp:53:63: error: 'PyMemoryView_FromMemory' was not declared in this scope
         return bp::object(bp::handle<>(PyMemoryView_FromMemory(
                                        ~~~~~~~~~~~~~~~~~~~~~~~^
           reinterpret_cast<char *>(arg->outbuf()),
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             
           sizeof(real_t)
           ~~~~~~~~~~~~~~                                       
           * std::max(1, arg->opts_init->nx)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                    
           * std::max(1, arg->opts_init->ny)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                    
           * std::max(1, arg->opts_init->nz),
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                   
           PyBUF_READ
           ~~~~~~~~~~                                           
         ))); // TODO: this assumes Python 2 -> make it compatible with P3 or require P2 in CMake
         ~                                                      
libcloudphxx/bindings/python/lgrngn.hpp:53:63: note: suggested alternative: 'PyMemoryView_FromBuffer'
         return bp::object(bp::handle<>(PyMemoryView_FromMemory(
                                        ~~~~~~~~~~~~~~~~~~~~~~~^
                                        PyMemoryView_FromBuffer

@pdziekan
Contributor

I think that the issue is linked to the retirement of Python 2.
Python bindings now require Python 3.
I'm surprised that CMake didn't complain about the Python version.

To fix the issue, try building a singularity image that has Python 3.
An updated recipe is in the UWLCM repo.
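
For reference, PyMemoryView_FromMemory exists only in the Python 3 C API, which is why the build fails when CMake picks up Python 2.7 headers. A version guard along these lines would compile against either major version; this is a sketch only (the helper name wrap_outbuf and the len parameter are made up for illustration, and it is not the fix that was adopted, since the bindings now simply require Python 3):

  // Sketch only: wraps a raw output buffer of `len` bytes as a read-only
  // Python memoryview on either Python major version.
  #include <Python.h>
  #include <boost/python.hpp>

  namespace bp = boost::python;

  inline bp::object wrap_outbuf(char *buf, Py_ssize_t len)  // hypothetical helper
  {
  #if PY_MAJOR_VERSION >= 3
    // Python >= 3.3 provides PyMemoryView_FromMemory
    return bp::object(bp::handle<>(PyMemoryView_FromMemory(buf, len, PyBUF_READ)));
  #else
    // Python 2.7: fill a Py_buffer and use PyMemoryView_FromBuffer,
    // which is what the compiler note suggests
    Py_buffer pybuf;
    if (PyBuffer_FillInfo(&pybuf, NULL, buf, len, 1 /* readonly */, PyBUF_CONTIG_RO) != 0)
      bp::throw_error_already_set();
    return bp::object(bp::handle<>(PyMemoryView_FromBuffer(&pybuf)));
  #endif
  }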

@trontrytel
Collaborator Author

Hi @pdziekan. Thank you for taking a look. The singularity image mostly works for me. CMake still detects Python 2.7 by default, but invoking it with something like cmake -DPYTHON_EXECUTABLE:FILEPATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.6 .. pointed it in the right direction.

I'm actually trying to run the parcel model, but I'm running into issues with MPI and the Python bindings: igfuw/parcel#81. Could you take a look there as well?

@trontrytel
Collaborator Author

Actually, I'm getting the same error when I run the unit tests for libcloudphxx inside the singularity image: RuntimeError: The Python bindings of libcloudph++ Lagrangian microphysics can't be used in MPI runs.

@claresinger
Contributor

I just stumbled into this problem with the Python bindings while running the tests for libcloudph++ inside my singularity image as well.

@trontrytel
Collaborator Author

@claresinger I think the issue is more with our cluster setup, which runs MPI silently even if you don't ask for it. To get rid of that you should run unset PMI_RANK in the terminal.

That clears the environment variable that libcloudph++ uses to check whether the simulation is being run with MPI.
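
For context, that check is just an environment-variable lookup; roughly something like the following, shown as an illustration only (this is not libcloudph++'s actual detection code):

  // Illustration only -- not libcloudph++'s actual detection code.
  // MPI launchers export rank variables (e.g. PMI_RANK from MPICH/Slurm PMI,
  // OMPI_COMM_WORLD_RANK from Open MPI); if one is set, assume an MPI run.
  #include <cstdlib>

  bool launched_under_mpi()
  {
    return std::getenv("PMI_RANK") != nullptr
        || std::getenv("OMPI_COMM_WORLD_RANK") != nullptr;
  }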

@claresinger
Contributor

@trontrytel FYI, libcloudph++ is hanging again during compilation on central... I noticed that the singularity scripts got updated in UWLCM, so I will try to remake that image and see if that works.

@trontrytel
Collaborator Author

Sounds good! Let me know if it doesn't work.

@pdziekan
Contributor

pdziekan commented Sep 4, 2020

@claresinger does the compilation hang without any output?

@trontrytel could you check whether, using this branch of libcloudph++ (https://github.com/pdziekan/libcloudphxx/tree/mpi_detection), you still need to unset the PMI_RANK env var?

Also, I think that libcloudph++ is detecting Python 2.7 because of some environment variables set on your cluster.
Did you try running a singularity shell in a clean environment?
You can do this with:
env -i singularity shell --nv sng_ubuntu_18_04_cuda_10_0.sif

@claresinger
Contributor

claresinger commented Sep 4, 2020

@pdziekan yes, it starts compiling and then hangs at this step with no further output for 30+ minutes.

Singularity sng_ubuntu_18_04_cuda_10_0.sif:~/microphys/libcloudphxx/build> make -j64
Scanning dependencies of target git_revision.h
Scanning dependencies of target cloudphxx_lgrngn
[  0%] Built target git_revision.h
[  4%] Building CXX object CMakeFiles/cloudphxx_lgrngn.dir/src/lib.cpp.o
[ 19%] Building CUDA object CMakeFiles/cloudphxx_lgrngn.dir/src/lib_multicuda.cu.o
[ 19%] Building CXX object CMakeFiles/cloudphxx_lgrngn.dir/src/lib_omp.cpp.o
[ 19%] Building CUDA object CMakeFiles/cloudphxx_lgrngn.dir/src/lib_cuda.cu.o
[ 23%] Building CXX object CMakeFiles/cloudphxx_lgrngn.dir/src/lib_cpp.cpp.o

Here is the output from cmake. Should it say -- Detecting if the compiler is an MPI wrapper... - FALSE?

Singularity sng_ubuntu_18_04_cuda_10_0.sif:~/microphys> cd libcloudphxx/build/
Singularity sng_ubuntu_18_04_cuda_10_0.sif:~/microphys/libcloudphxx/build> rm -rf *
Singularity sng_ubuntu_18_04_cuda_10_0.sif:~/microphys/libcloudphxx/build> unset PMI_RANK
Singularity sng_ubuntu_18_04_cuda_10_0.sif:~/microphys/libcloudphxx/build> cmake -DPYTHON_EXECUTABLE:FILEPATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.6 -DLIBCLOUDPHXX_FORCE_MULTI_CUDA=1 ..
-- The CXX compiler identification is GNU 7.5.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- The CUDA compiler identification is NVIDIA 10.0.130
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc -- works
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Found OpenMP_CXX: -fopenmp (found version "4.5") 
-- Found OpenMP: TRUE (found version "4.5")  
--  OpenMP found
-- Detecting if the compiler is an MPI wrapper...
-- Detecting if the compiler is an MPI wrapper... - FALSE
-- Trying to obtain CUDA capability of local hardware...
-- Detected more than 1 GPU or LIBCLOUDPHXX_FORCE_MULTI_CUDA set, the multi_CUDA backend will be built.
-- CUDA capability: 60
-- Found Thrust: /usr/local/include (found version "1.9.910") 
-- Boost version: 1.65.1
-- Testing if Boost ODEINT version >= 1.58
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.6.9", minimum required is "3") 
-- Found PythonLibs: /usr/lib/x86_64-linux-gnu/libpython3.6m.so (found version "3.6.9") 
-- boost numpy as numpy3
-- Boost version: 1.65.1
-- Found the following Boost libraries:
--   numpy3
--   python3
-- Performing Test BLITZ_FOUND
-- Performing Test BLITZ_FOUND - Success
-- Configuring done
-- Generating done
-- Build files have been written to: /home/csinger/microphys/libcloudphxx/build

@trontrytel once the compilation works with PMI_RANK unset, I'm happy to check this new branch to see if it means we don't have to unset it at all!

@pdziekan
Contributor

pdziekan commented Sep 8, 2020

@claresinger compilation may hang because you are running out of RAM.
Try assigning more memory or running compilation with a single thread only (make -j1).

-- Detecting if the compiler is an MPI wrapper... - FALSE
is fine.
If you wanted to run a simulation on a distributed memory system, you would need to compile the library with an MPI compiler.

@claresinger
Contributor

Thanks @pdziekan, using make -j1 works!

@claresinger
Contributor

I tested this branch (https://github.com/pdziekan/libcloudphxx/tree/mpi_detection) and it works without having to unset PMI_RANK. Thanks @pdziekan! I see you already have PR #406 open to merge this fix.

@pdziekan
Contributor

@claresinger great! Can I close this issue, or is there something else to resolve?

@claresinger
Contributor

@pdziekan Yes, you can close this issue now.
