Null Thread id encountered on partitioned_vector #2201

Closed
dcbdan opened this issue Jun 6, 2016 · 0 comments

dcbdan commented Jun 6, 2016

The following code worked on one node but failed on two nodes. See the error message reproduced below.

I also have a similar example that fails with the same runtime error. In that example, instead of calling hpx::parallel::generate, an action is invoked once per partition via hpx::apply(action_name, hpx::colocated(id_of_correct_partition), local_iter_beg, local_iter_end).

#include <hpx/hpx.hpp>
#include <hpx/hpx_init.hpp>

#include <hpx/include/parallel_generate.hpp>
#include <hpx/include/partitioned_vector.hpp>

#include <boost/random.hpp>

#include <cstddef>    // std::size_t
#include <cstdlib>    // std::rand, RAND_MAX

///////////////////////////////////////////////////////////////////////////////
// Define the vector types to be used.
HPX_REGISTER_PARTITIONED_VECTOR(int);

///////////////////////////////////////////////////////////////////////////////
struct random_fill
{
    random_fill()
      : gen(std::rand()),
        dist(0, RAND_MAX)
    {}

    int operator()()
    {
        return dist(gen);
    }

    boost::random::mt19937 gen;
    boost::random::uniform_int_distribution<> dist;

    template <typename Archive>
    void serialize(Archive& ar, unsigned)
    {}
};

///////////////////////////////////////////////////////////////////////////////
int hpx_main(boost::program_options::variables_map& vm)
{
    if (hpx::get_locality_id() == 0)
    {
        // create as many partitions as we have localities
        std::size_t size = 10000;
        hpx::partitioned_vector<int> v(
            size, hpx::container_layout(hpx::find_all_localities()));

        // initialize data
        // segmented version of algorithm used
        using namespace hpx::parallel;
        generate(par, v.begin(), v.end(), random_fill());

        return hpx::finalize();
    }

    return 0;
}

int main()
{
    return hpx::init();
}
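
For comparison, here is a minimal single-process sketch of what the generate call above is expected to do within one partition, using only the standard library. It is not HPX code: `std_random_fill` and `make_filled` are hypothetical names, `std::mt19937` stands in for `boost::random::mt19937`, and the explicit seed and the 0..100 range are assumptions chosen so the result is easy to check.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Standard-library stand-in for the random_fill functor (hypothetical name).
struct std_random_fill
{
    explicit std_random_fill(unsigned seed)
      : gen(seed),
        dist(0, 100)    // assumed range, chosen for easy checking
    {}

    int operator()()
    {
        return dist(gen);
    }

    std::mt19937 gen;
    std::uniform_int_distribution<> dist;
};

// Fill a local vector, as generate() would fill the elements of one partition.
std::vector<int> make_filled(std::size_t size, unsigned seed)
{
    std::vector<int> v(size);
    std::generate(v.begin(), v.end(), std_random_fill(seed));
    return v;
}
```

Calling `make_filled(10000, 42)` returns 10000 values in the closed range [0, 100]. Two calls with the same seed in the same build return identical vectors.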

The error message:

[dbourge@node48 hpx_build]$ srun -p haswell -N 1 ./bin/dv09 
[dbourge@node48 hpx_build]$ srun -p haswell -N 2 ./bin/dv09 

{stack-trace}: 4 frames:
0x7fa50be4fba2  : hpx::detail::backtrace(unsigned long) + 0xa2 in /home/dbourge/hpx/hpx_build/lib/libhpx.so.0
0x7fa50be84e87  : boost::exception_ptr hpx::detail::get_exception<hpx::exception>(hpx::exception const&, std::string const&, std::string const&, long, std::string const&) + 0x97 in /home/dbourge/hpx/hpx_build/lib/libhpx.so.0
0x7fa50be85495  : void hpx::detail::throw_exception<hpx::exception>(hpx::exception const&, std::string const&, std::string const&, long) + 0x65 in /home/dbourge/hpx/hpx_build/lib/libhpx.so.0
0x7fa50be9afe0  : hpx::detail::throw_exception(hpx::error, std::string const&, std::string const&, std::string const&, long) + 0x50 in /home/dbourge/hpx/hpx_build/lib/libhpx.so.0
{env}: 100 entries:
  BASH_FUNC_module()=() {  eval `/usr/bin/modulecmd bash $*`
}
  BINUTILSROOT=/home/projects/x86-64-haswell/bintuils/2.25.0
  BINUTILS_ROOT=/home/projects/x86-64-haswell/bintuils/2.25.0
  BINUTILS_VERSION=2.25.0
  CPATH=/home/projects/x86-64-haswell/bintuils/2.25.0/include:/home/projects/x86-64-haswell/zlib/1.2.8/include:/home/projects/x86-64-haswell/gmp/5.1.3/include:/home/projects/x86-64-haswell/mpfr/3.1.2/include:/home/projects/x86-64-haswell/mpc/1.0.1/include
  CVS_RSH=ssh
  GCCPATH=/home/projects/x86-64-haswell/gnu/4.9.2
  GCC_PATH=/home/projects/x86-64-haswell/gnu/4.9.2
  GCC_VERSION=4.9.2
  GMPROOT=/home/projects/x86-64-haswell/gmp/5.1.3
  GMP_ROOT=/home/projects/x86-64-haswell/gmp/5.1.3
  G_BROKEN_FILENAMES=1
  HISTCONTROL=ignoredups
  HISTSIZE=1000
  HOME=/home/dbourge
  HOSTNAME=shepard-lsm1
  IBV_FORK_SAFE=1
  INCLUDE=/home/projects/x86-64-haswell/bintuils/2.25.0/include:/home/projects/x86-64-haswell/zlib/1.2.8/include:/home/projects/x86-64-haswell/gmp/5.1.3/include:/home/projects/x86-64-haswell/mpfr/3.1.2/include:/home/projects/x86-64-haswell/mpc/1.0.1/include
  IPATH_NO_CPUAFFINITY=1
  LANG=en_US.UTF-8
  LD_LIBRARY_PATH=/home/projects/x86-64-haswell/gnu/4.9.2/lib64:/home/projects/x86-64-haswell/gnu/4.9.2/lib:/home/projects/x86-64-haswell/bintuils/2.25.0/lib:/home/projects/x86-64-haswell/zlib/1.2.8/lib:/home/projects/x86-64-haswell/gmp/5.1.3/lib:/home/projects/x86-64-haswell/mpfr/3.1.2/lib:/home/projects/x86-64-haswell/mpc/1.0.1/lib:/usr/lib/gcc/x86_64-redhat-linux/4.4.7
  LESSOPEN=||/usr/bin/lesspipe.sh %s
  LIBRARY_PATH=/home/projects/x86-64-haswell/gnu/4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2:/home/projects/x86-64-haswell/bintuils/2.25.0/lib:/home/projects/x86-64-haswell/zlib/1.2.8/lib:/home/projects/x86-64-haswell/gmp/5.1.3/lib:/home/projects/x86-64-haswell/mpfr/3.1.2/lib:/home/projects/x86-64-haswell/mpc/1.0.1/lib:/usr/lib/gcc/x86_64-redhat-linux/4.4.7
  LOADEDMODULES=mpc/1.0.1:mpfr/3.1.2:gmp/5.1.3:zlib/1.2.8:binutils/2.25.0:gcc/4.9.2
  LOGNAME=dbourge
  LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
  MAIL=/var/spool/mail/dbourge
  MANPATH=/home/projects/x86-64-haswell/gnu/4.9.2/man:/home/projects/x86-64-haswell/bintuils/2.25.0/man:/usr/share/man
  MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles:/home/projects/x86-64-haswell/modulefiles
  MODULESHOME=/usr/share/Modules
  MPC_ROOT=/home/projects/x86-64-haswell/mpc/1.0.1
  MPFR_ROOT=/home/projects/x86-64-haswell/mpfr/3.1.2
  MPIRUN_CONNECT_ONCE=1
  MPIRUN_HOST=10.101.12.48
  MPIRUN_ID=-1265958909
  MPIRUN_MPD=0
  MPIRUN_NPROCS=2
  MPIRUN_PORT=38140
  MPIRUN_RANK=1
  NOT_USE_TOTALVIEW=1
  OLDPWD=/home/dbourge
  PATH=/home/projects/x86-64-haswell/gnu/4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/:/home/projects/x86-64-haswell/gnu/4.9.2/bin:/home/projects/x86-64-haswell/bintuils/2.25.0/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/ibutils/bin:/home/dbourge/bin
  PWD=/home/dbourge/hpx/hpx_build
  QTDIR=/usr/lib64/qt-3.3
  QTINC=/usr/lib64/qt-3.3/include
  QTLIB=/usr/lib64/qt-3.3/lib
  SHELL=/bin/bash
  SHLVL=3
  SLURMD_NODENAME=node49
  SLURM_CHECKPOINT_IMAGE_DIR=/home/dbourge/hpx/hpx_build
  SLURM_CPUS_ON_NODE=32
  SLURM_DISTRIBUTION=cyclic
  SLURM_GTIDS=1
  SLURM_JOBID=1029259
  SLURM_JOB_CPUS_PER_NODE=32(x2)
  SLURM_JOB_ID=1029259
  SLURM_JOB_NODELIST=node[48-49]
  SLURM_JOB_NUM_NODES=2
  SLURM_LAUNCH_NODE_IPADDR=10.101.12.48
  SLURM_LOCALID=0
  SLURM_NNODES=2
  SLURM_NODEID=1
  SLURM_NODELIST=node[48-49]
  SLURM_NPROCS=2
  SLURM_NTASKS=2
  SLURM_PRIO_PROCESS=0
  SLURM_PROCID=1
  SLURM_PTY_PORT=54961
  SLURM_PTY_WIN_COL=106
  SLURM_PTY_WIN_ROW=23
  SLURM_SRUN_COMM_HOST=10.101.12.48
  SLURM_SRUN_COMM_PORT=53230
  SLURM_STEPID=3
  SLURM_STEP_ID=3
  SLURM_STEP_LAUNCHER_PORT=53230
  SLURM_STEP_NODELIST=node[48-49]
  SLURM_STEP_NUM_NODES=2
  SLURM_STEP_NUM_TASKS=2
  SLURM_STEP_TASKS_PER_NODE=1(x2)
  SLURM_SUBMIT_DIR=/home/dbourge/hpx/hpx_build
  SLURM_TASKS_PER_NODE=1(x2)
  SLURM_TASK_PID=132550
  SLURM_TOPOLOGY_ADDR=node49
  SLURM_TOPOLOGY_ADDR_PATTERN=node
  SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
  SSH_CLIENT=134.253.194.87 52669 22
  SSH_CONNECTION=134.253.194.87 52669 132.175.51.34 22
  SSH_TTY=/dev/pts/22
  TERM=xterm
  TMPDIR=/tmp
  USER=dbourge
  VIADEV_DEFAULT_MIN_RNR_TIMER=25
  VIADEV_DEFAULT_TIME_OUT=22
  VIADEV_ENABLE_AFFINITY=0
  VIADEV_NUM_RDMA_BUFFER=4
  WISECONFIGDIR=/usr/share/wise2/
  ZLIBROOT=/home/projects/x86-64-haswell/zlib/1.2.8
  ZLIB_ROOT=/home/projects/x86-64-haswell/zlib/1.2.8
  _=/usr/bin/srun
  _LMFILES_=/home/projects/x86-64-haswell/modulefiles/mpc/1.0.1:/home/projects/x86-64-haswell/modulefiles/mpfr/3.1.2:/home/projects/x86-64-haswell/modulefiles/gmp/5.1.3:/home/projects/x86-64-haswell/modulefiles/zlib/1.2.8:/home/projects/x86-64-haswell/modulefiles/binutils/2.25.0:/home/projects/x86-64-haswell/modulefiles/gcc/4.9.2
{locality-id}: 1
{hostname}: [ (tcp:10.101.12.49:7911) ]
{process-id}: 132550
{function}: threads::get_self
{file}: /home/dbourge/hpx/src/runtime/threads/thread_data.cpp
{line}: 123
{os-thread}: parcel-thread-tcp#1
{thread-description}: <unknown>
{state}: state_running
{auxinfo}: 
{config}:
  HPX_HAVE_NATIVE_TLS=ON
  HPX_HAVE_STACKTRACES=ON
  HPX_HAVE_COMPRESSION_BZIP2=OFF
  HPX_HAVE_COMPRESSION_SNAPPY=OFF
  HPX_HAVE_COMPRESSION_ZLIB=OFF
  HPX_HAVE_PARCEL_COALESCING=ON
  HPX_HAVE_PARCELPORT_TCP=ON
  HPX_HAVE_PARCELPORT_MPI=OFF
  HPX_HAVE_PARCELPORT_IPC=OFF
  HPX_HAVE_PARCELPORT_IBVERBS=OFF
  HPX_HAVE_VERIFY_LOCKS=OFF
  HPX_HAVE_HWLOC=ON
  HPX_HAVE_ITTNOTIFY=OFF
  HPX_HAVE_RUN_MAIN_EVERYWHERE=OFF
  HPX_PARCEL_MAX_CONNECTIONS=512
  HPX_PARCEL_MAX_CONNECTIONS_PER_LOCALITY=4
  HPX_AGAS_LOCAL_CACHE_SIZE=4096
  HPX_HAVE_MALLOC=talloc
  HPX_PREFIX (configured)=/home/dbourge/inst_hpx
  HPX_PREFIX=/home/dbourge/hpx/hpx_build
{version}: V0.9.12-trunk (AGAS: V3.0), Git: 306f128810
{boost}: V1.61.0
{build-type}: release
{date}: Jun  1 2016 07:13:44
{platform}: linux
{compiler}: GNU C++ version 4.9.2
{stdlib}: GNU libstdc++ version 20141030
{what}: NULL thread id encountered (is this executed on a HPX-thread?): HPX(null_thread_id)

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<hpx::exception> >'
  what():  NULL thread id encountered (is this executed on a HPX-thread?): HPX(null_thread_id)
srun: error: node49: task 1: Aborted
srun: First task exited 30s ago
srun: task 0: running
srun: task 1: exited abnormally
srun: Terminating job step 1029259.3
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
slurmd[node48]: *** STEP 1029259.3 KILLED AT 2016-06-06T13:29:29 WITH SIGNAL 9 ***
slurmd[node48]: *** STEP 1029259.3 KILLED AT 2016-06-06T13:29:29 WITH SIGNAL 9 ***
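
The diagnostic above names {function}: threads::get_self and {os-thread}: parcel-thread-tcp#1, and the message asks whether the code ran on an HPX thread. That suggests something needing an HPX thread context was executed directly on the parcel port's OS thread on locality 1. The sketch below is a simplified model of this kind of check, written with the standard library only. It is not HPX's implementation, and every name in it (`task_context`, `current_task`, `get_self_model`, `task_scope`, `run_as_task`, `throws_outside_task`) is hypothetical.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the context of a lightweight (user-level) thread.
struct task_context
{
    int id;
};

// Null on a plain OS thread; set only while a task is running on this thread.
thread_local task_context* current_task = nullptr;

// Model of a get_self()-style accessor: it fails when no task is active.
task_context& get_self_model()
{
    if (current_task == nullptr)
        throw std::runtime_error("NULL thread id encountered");
    return *current_task;
}

// RAII helper that marks the calling OS thread as running a task.
struct task_scope
{
    explicit task_scope(task_context& ctx) { current_task = &ctx; }
    ~task_scope() { current_task = nullptr; }
};

// Model of a scheduler executing f as a task on the calling OS thread.
template <typename F>
void run_as_task(F f)
{
    task_context ctx{1};
    task_scope scope(ctx);
    f();
}

// Returns true if get_self_model() throws when called outside any task.
bool throws_outside_task()
{
    try
    {
        get_self_model();
    }
    catch (std::runtime_error const&)
    {
        return true;
    }
    return false;
}
```

In this model, `throws_outside_task()` returns true, while a call to `get_self_model()` made inside `run_as_task` succeeds. By analogy, the failure in the log would be consistent with work being run on the parcel thread itself rather than being scheduled as an HPX thread first, though the log alone does not confirm the cause.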