Jacobi2D AMR example failed to run #2980

Closed
ouankou opened this issue Jul 22, 2020 · 1 comment · Fixed by #2989

ouankou commented Jul 22, 2020

Hello! I built Charm++, the AMR library, and the Jacobi2D example, in that order, from this repo on Ubuntu 18.04, and everything compiled without errors.
The build command was ./build charm++ multicore-linux-x86_64.
However, running ./charmrun ./jacobi in the example folder produces the following error:

Running command: ./jacobi

Charm++> No provisioning arguments specified. Running with a single PE.
         Use +auto-provision to fully subscribe resources or +p1 to silence this message.
Charm++: standalone mode (not using charmrun)
Charm++> Running in Multicore mode: 1 threads (PEs)
Converse/Charm++ Commit ID: v6.11.0-devel-317-gfe43a4dcd
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 6 cores x 2 PUs = 12-way SMP)
Charm++> cpu topology info is gathered in 0.006 seconds.
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: Cannot insert array element twice!
[0] Stack Traceback:
  [0:0] jacobi 0x55dc0ee36aec CmiAbortHelper(char const*, char const*, char const*, int, int)
  [0:1] jacobi 0x55dc0ee36bc6 CmiGetNonLocal
  [0:2] jacobi 0x55dc0ed12435 CkLocMgr::addElementToRec(CkLocRec*, CkArray*, CkMigratable*, int, void*)
  [0:3] jacobi 0x55dc0ed12399 CkLocMgr::addElement(CkArrayID, CkArrayIndex const&, CkMigratable*, int, void*)
  [0:4] jacobi 0x55dc0ece6110 CkArray::insertElement(CkArrayMessage*, CkArrayIndex const&, int*)
  [0:5] jacobi 0x55dc0ece5f95 CkArray::insertElement(CkMarshalledMessage&&, CkArrayIndex const&, int*)
  [0:6] jacobi 0x55dc0ecead6f CkIndex_CkArray::_call_insertElement_marshall2(void*, void*)
  [0:7] jacobi 0x55dc0eccfb26 CkDeliverMessageFree
  [0:8] jacobi 0x55dc0eccfd11
  [0:9] jacobi 0x55dc0eccfd3c
  [0:10] jacobi 0x55dc0ecd0dce
  [0:11] jacobi 0x55dc0ecd245d CkSendMsgBranchInline
  [0:12] jacobi 0x55dc0ecd24bd CkSendMsgBranch
  [0:13] jacobi 0x55dc0ece8727 CProxyElement_CkArray::insertElement(CkMarshalledMessage const&, CkArrayIndex const&, int const*, CkEntryOptions const*)
  [0:14] jacobi 0x55dc0ece4b75 CProxy_ArrayBase::ckInsertIdx(CkArrayMessage*, int, int, CkArrayIndex const&)
  [0:15] jacobi 0x55dc0ece4c13 CProxyElement_ArrayBase::ckInsert(CkArrayMessage*, int, int)
  [0:16] jacobi 0x55dc0ecf5b9c CProxyElement_ArrayElement::ckInsert(CkArrayMessage*, int, int)
  [0:17] jacobi 0x55dc0eef1864 CProxyElement_Cell::ckInsert(CkArrayMessage*, int, int)
  [0:18] jacobi 0x55dc0eef201a CProxyElement_Cell2D::ckInsert(CkArrayMessage*, int, int)
  [0:19] jacobi 0x55dc0eeef604 CProxyElement_Cell2D::insert(_ArrInitMsg*, int)
  [0:20] jacobi 0x55dc0eee938d Cell2D::create_children(_ArrInitMsg**)
  [0:21] jacobi 0x55dc0eee5fd2 Cell::treeSetup(_ArrInitMsg*)
  [0:22] jacobi 0x55dc0eee901c Cell2D::Cell2D(_ArrInitMsg*)
  [0:23] jacobi 0x55dc0eeef7ee CkIndex_Cell2D::_call_Cell2D__ArrInitMsg(void*, void*)
  [0:24] jacobi 0x55dc0eccfb26 CkDeliverMessageFree
  [0:25] jacobi 0x55dc0ed10d71 CkLocRec::invokeEntry(CkMigratable*, void*, int, bool)
  [0:26] jacobi 0x55dc0ed124c8 CkLocMgr::addElementToRec(CkLocRec*, CkArray*, CkMigratable*, int, void*)
  [0:27] jacobi 0x55dc0ed12399 CkLocMgr::addElement(CkArrayID, CkArrayIndex const&, CkMigratable*, int, void*)
  [0:28] jacobi 0x55dc0ece6110 CkArray::insertElement(CkArrayMessage*, CkArrayIndex const&, int*)
  [0:29] jacobi 0x55dc0ece5f95 CkArray::insertElement(CkMarshalledMessage&&, CkArrayIndex const&, int*)
  [0:30] jacobi 0x55dc0ecead6f CkIndex_CkArray::_call_insertElement_marshall2(void*, void*)
  [0:31] jacobi 0x55dc0eccfb26 CkDeliverMessageFree
  [0:32] jacobi 0x55dc0eccfd11
  [0:33] jacobi 0x55dc0eccfd3c
  [0:34] jacobi 0x55dc0ecd0dce
  [0:35] jacobi 0x55dc0ecd245d CkSendMsgBranchInline
  [0:36] jacobi 0x55dc0ecd24bd CkSendMsgBranch
  [0:37] jacobi 0x55dc0ece8727 CProxyElement_CkArray::insertElement(CkMarshalledMessage const&, CkArrayIndex const&, int const*, CkEntryOptions const*)
  [0:38] jacobi 0x55dc0ece4b75 CProxy_ArrayBase::ckInsertIdx(CkArrayMessage*, int, int, CkArrayIndex const&)
  [0:39] jacobi 0x55dc0ece4c13 CProxyElement_ArrayBase::ckInsert(CkArrayMessage*, int, int)
  [0:40] jacobi 0x55dc0ecf5b9c CProxyElement_ArrayElement::ckInsert(CkArrayMessage*, int, int)
  [0:41] jacobi 0x55dc0eef1864 CProxyElement_Cell::ckInsert(CkArrayMessage*, int, int)
  [0:42] jacobi 0x55dc0eef201a CProxyElement_Cell2D::ckInsert(CkArrayMessage*, int, int)
  [0:43] jacobi 0x55dc0eeef604 CProxyElement_Cell2D::insert(_ArrInitMsg*, int)
  [0:44] jacobi 0x55dc0eee5211 AmrCoordinator::create_tree()
  [0:45] jacobi 0x55dc0eee48a9 AmrCoordinator::AmrCoordinator(_DMsg*)
  [0:46] jacobi 0x55dc0eeed126 CkIndex_AmrCoordinator::_call_AmrCoordinator__DMsg(void*, void*)
  [0:47] jacobi 0x55dc0eccfb26 CkDeliverMessageFree
  [0:48] jacobi 0x55dc0eccfd11
  [0:49] jacobi 0x55dc0eccfd3c
  [0:50] jacobi 0x55dc0ecd09cc
  [0:51] jacobi 0x55dc0ecd1723 _processHandler(void*, CkCoreState*)
  [0:52] jacobi 0x55dc0ee41ca7 CmiHandleMessage
  [0:53] jacobi 0x55dc0ee42042 CsdScheduleForever
  [0:54] jacobi 0x55dc0ee41f7e CsdScheduler
  [0:55] jacobi 0x55dc0ee36883
  [0:56] jacobi 0x55dc0ee36563 ConverseInit
  [0:57] jacobi 0x55dc0ed91d4d charm_main
  [0:58] jacobi 0x55dc0ecccbba main
  [0:59] libc.so.6 0x7f502ec71b97 __libc_start_main
  [0:60] jacobi 0x55dc0eccaa7a _start
CHARM++ FATAL ERROR: Cannot insert array element twice!
Segmentation fault (core dumped)

It seems that during initialization, CkLocMgr::addElement (in src/ck-core/cklocation.C) inserts an element and then immediately calls CkLocMgr::addElementToRec to update the location record. The latter finds that the element has already been added and aborts the program.
I tried commenting out the if statement in CkLocMgr::addElementToRec that triggers the abort. With that change Jacobi starts running but soon hangs, printing "Error in sychronisation step" messages.
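
To illustrate why disabling the check is unlikely to help, here is a minimal toy model. It is not Charm++ code: ToyRegistry and the helper functions are hypothetical names, and the real CkLocMgr is considerably more involved. The sketch only shows that once two elements map to the same key, removing the guard turns a loud abort into a silent overwrite, which is one plausible reason a run could misbehave later.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a location manager; NOT the Charm++ CkLocMgr API.
class ToyRegistry {
public:
    explicit ToyRegistry(bool guard) : guard_(guard) {}

    // Registers `name` under `key`. With the guard on, a second insert at the
    // same key throws (the toy analogue of the abort in the trace above).
    void insert(int key, const std::string& name) {
        if (guard_ && elements_.count(key) != 0) {
            throw std::runtime_error("Cannot insert array element twice!");
        }
        elements_[key] = name;  // without the guard, this silently overwrites
    }

    const std::string& lookup(int key) const { return elements_.at(key); }
    std::size_t size() const { return elements_.size(); }

private:
    bool guard_;
    std::map<int, std::string> elements_;
};

// Returns true if the second insert under the same key was rejected.
bool second_insert_rejected(bool guard) {
    ToyRegistry reg(guard);
    reg.insert(0, "parent");
    try {
        reg.insert(0, "child");
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

// With the guard off, reports which element survives under the shared key.
std::string survivor_without_guard() {
    ToyRegistry reg(false);
    reg.insert(0, "parent");
    reg.insert(0, "child");
    return reg.lookup(0);
}
```

In this toy, the unguarded registry ends up holding only the second element, so anything addressed to the first one would reach the wrong object.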

Any help would be highly appreciated!

Contributor

rbuch commented Jul 23, 2020

Thanks for reporting this problem. I'm able to reproduce the issue on my side as well. I think the problem is that the custom array index we use for AMR isn't being hashed properly, so different levels of the AMR tree collide if they have the same data in their index. This seems to stem from the fact that the AMR index is actually a tuple of (data, number of bits), and the number-of-bits part is essentially getting lost in the runtime code.
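
A small sketch of the kind of collision described here, under stated assumptions: ToyBitIndex, lossy_key, and full_key are hypothetical names, and the field layout (a data word plus a bit count that identifies the tree level) is assumed for illustration rather than taken from the AMR library or the Charm++ runtime. It shows how a key built from the data alone cannot distinguish a root from a child whose data bits are all zero, while a key that also includes the bit count can.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical AMR-style index: `nbits` says how many bits of `data` are
// meaningful, so it also identifies the level in the tree.
struct ToyBitIndex {
    std::uint32_t data;
    std::uint32_t nbits;
};

// Lossy key: only the data word is used, so the level is dropped.
std::uint64_t lossy_key(const ToyBitIndex& idx) {
    return idx.data;
}

// Full key: pack both fields so indices on different levels stay distinct.
std::uint64_t full_key(const ToyBitIndex& idx) {
    return (static_cast<std::uint64_t>(idx.nbits) << 32) | idx.data;
}

// Self-check: an assumed quadtree root (no bits) and its first child
// (two zero bits) share the same data word.
inline void demo() {
    const ToyBitIndex root{0, 0};
    const ToyBitIndex child{0, 2};
    assert(lossy_key(root) == lossy_key(child));  // collision
    assert(full_key(root) != full_key(child));    // distinct
}
```

Any fix along these lines would need both fields to take part in hashing and equality comparison; how that maps onto the actual runtime code is not something this sketch settles.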

@epmikida has most recently looked at the code, so assigning to him.
