Memory pool bug #1154

Closed
kyungjoo-kim opened this issue Oct 9, 2017 · 6 comments

@kyungjoo-kim
Contributor

I have a problem with the memory pool. The error message is:

Kokkos MemoryPool::deallocate given erroneous pointer
Program received signal SIGABRT, Aborted.

gdb shows the following backtrace:

#2  0x000000000044bc21 in Kokkos::Impl::host_abort (
    message=message@entry=0x4881c0 "Kokkos MemoryPool::deallocate given erroneous pointer")
    at /home/kyukim/Work/lib/trilinos/temp/packages/kokkos/core/src/impl/Kokkos_Error.cpp:64
#3  0x000000000043f528 in abort (message=0x4881c0 "Kokkos MemoryPool::deallocate given erroneous pointer")
    at /home/kyukim/Work/lib/trilinos/temp/packages/kokkos/core/src/impl/Kokkos_Error.hpp:79
#4  Kokkos::MemoryPool<Kokkos::OpenMP>::deallocate (this=this@entry=0x7fffcbdec840, p=p@entry=0x7ffc34636280)
    at /home/kyukim/Work/lib/trilinos/temp/packages/kokkos/core/src/Kokkos_MemoryPool.hpp:796
#5  0x0000000000449f3e in Tacho::Experimental::TaskFunctor_FactorizeChol<double, Kokkos::OpenMP>::factorize_internal (
    this=this@entry=0x7fffcbdec830, member=..., n=<optimized out>, final=<optimized out>)
    at /home/kyukim/Work/lib/trilinos/temp/packages/shylu/shylu_node/tacho/src/TachoExp_TaskFunctor_FactorizeChol.hpp:75
#6  0x000000000044a24f in Tacho::Experimental::TaskFunctor_FactorizeChol<double, Kokkos::OpenMP>::operator() (
    this=this@entry=0x7fffcbdec830, member=..., r_val=<optimized out>)
    at /home/kyukim/Work/lib/trilinos/temp/packages/shylu/shylu_node/tacho/src/TachoExp_TaskFunctor_FactorizeChol.hpp:91
@hcedwar
Contributor

hcedwar commented Oct 9, 2017

There is a print statement at
https://github.com/kokkos/kokkos/blob/master/core/src/Kokkos_MemoryPool.hpp#L793
that you can enable to get detailed information.

@kyungjoo-kim
Contributor Author

Neither of them really applies to my case.

log.txt

Kokkos MemoryPool allocate(0x7f61fb236280)
Kokkos MemoryPool deallocate(0x7f61fb236280) contains(1) block_aligned(1) dealloc_once(0)

In the log, the only allocate and deallocate I found for the address 0x7f61fb236280 are the single pair shown above.
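For readers unfamiliar with the three flags in the message above, here is a minimal sketch of one plausible reading of them: `contains` (the pointer lies inside the pool), `block_aligned` (it sits on a block boundary), and `dealloc_once` (the block was still marked as allocated). `ToyPool`, `Check`, and `check_and_release` are hypothetical names invented for illustration. This is not the Kokkos MemoryPool implementation, and the meaning of the flags is an assumption based only on their names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy pool: one contiguous address range split into equal blocks.
// This is NOT Kokkos code; it only illustrates the three checks by name.
struct ToyPool {
  std::uintptr_t base;        // first address owned by the pool
  std::size_t block_size;     // size of every block in bytes
  std::vector<bool> in_use;   // one "allocated" flag per block

  struct Check {
    bool contains;       // pointer lies inside the pool's range
    bool block_aligned;  // pointer sits exactly on a block boundary
    bool dealloc_once;   // block was marked allocated when released
  };

  // Run the three checks in order and clear the "allocated" flag.
  Check check_and_release(std::uintptr_t p) {
    Check c{false, false, false};
    const std::size_t total = block_size * in_use.size();
    c.contains = p >= base && p - base < total;
    if (!c.contains) return c;

    const std::size_t offset = p - base;
    c.block_aligned = offset % block_size == 0;
    if (!c.block_aligned) return c;

    const std::size_t i = offset / block_size;
    c.dealloc_once = in_use[i];  // false means the block was not marked allocated
    in_use[i] = false;
    return c;
  }
};
```

Under this reading, `contains(1) block_aligned(1) dealloc_once(0)` would describe a pointer that is inside the pool and on a block boundary, but whose block is not marked as allocated at the time of the call.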

@kyungjoo-kim
Contributor Author

log4.txt

According to this log output, the problematic pointer goes through several allocate/deallocate pairs before the crash.

@hcedwar
Contributor

hcedwar commented Oct 9, 2017

The problem size is requesting superblocks of 1lu << 31 or more. There are several 32-bit vs. 64-bit conversion bugs.
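To illustrate the general class of 32-bit vs. 64-bit conversion problem described here, the following is a minimal sketch. All helper names are hypothetical and none of this is taken from the Kokkos source. It only shows how a size or offset of 2^31 or more changes value when part of the arithmetic is done in a 32-bit type.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not Kokkos code.

// Superblock size computed entirely in 64-bit arithmetic.
constexpr std::uint64_t superblock_size_64(unsigned lg2) {
  return std::uint64_t(1) << lg2;
}

// The same size after an accidental round trip through a 32-bit unsigned.
constexpr std::uint64_t superblock_size_via_u32(unsigned lg2) {
  return static_cast<std::uint32_t>(std::uint64_t(1) << lg2);
}

// Byte offset of superblock `sb_id`, with the operand widened BEFORE the shift.
constexpr std::uint64_t offset_64(std::uint32_t sb_id, unsigned lg2) {
  return static_cast<std::uint64_t>(sb_id) << lg2;
}

// Same offset, but the shift happens in 32 bits and is widened afterwards,
// so the high bits are already lost (valid for lg2 < 32).
constexpr std::uint64_t offset_32(std::uint32_t sb_id, unsigned lg2) {
  return static_cast<std::uint32_t>(sb_id << lg2);
}

// With a 2^31-byte superblock, the third superblock starts at 3 * 2^31.
static_assert(offset_64(3, 31) == 0x180000000ULL, "64-bit offset keeps bit 32");
static_assert(offset_32(3, 31) == 0x80000000ULL, "32-bit offset drops bit 32");
static_assert(superblock_size_via_u32(32) == 0, "2^32 truncates to 0 in 32 bits");
```

Note also that the literal `1lu` has type `unsigned long`, which is 64 bits wide on typical 64-bit Linux systems but only 32 bits wide on 64-bit Windows.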

@hcedwar hcedwar added the Bug Broken / incorrect code; it could be Kokkos' responsibility, or others’ (e.g., Trilinos) label Oct 9, 2017
@kokkos kokkos deleted a comment from kyungjoo-kim Oct 9, 2017
@hcedwar
Contributor

hcedwar commented Oct 9, 2017

@kyungjoo-kim: Please email the MemoryPool source file with our bug fixes.

@kyungjoo-kim
Contributor Author

I still get the error with the Intel compiler:

++ pool(0x7f691d236100)   allocate(0x7f6a9e236280) from sb_id(3) alloc_size(2,091,270,888) blocksize_lg2(31) sb_state(0x0) count_lg2(0) result(0,1)
-- pool(0x7f691d236100) deallocate(0x7f6a9e236280) from sb_id(3) result(2) bit(1536) state(0x24000002) d(180000000) m_sb_size_lg2(31)
Kokkos MemoryPool deallocate(0x7f6a9e236280) contains(1) block_aligned(0) dealloc_once(0)
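The numbers in this log can be cross-checked with a little arithmetic. The sketch below assumes that `d(180000000)` is printed in hexadecimal and is a byte offset, which the log itself does not state. The function names are hypothetical and not part of Kokkos. Under those assumptions, the requested `alloc_size` rounds up to a 2^31-byte block, matching `blocksize_lg2(31)`, and `d` equals 3 << 31, matching `sb_id(3)` with `m_sb_size_lg2(31)`.

```cpp
#include <cstdint>

// Hypothetical helpers for checking the logged values; not Kokkos code.
// Requires C++14 or later because of the loop in a constexpr function.

// Smallest lg2 such that (1 << lg2) >= n, for 1 <= n <= 2^63.
constexpr unsigned ceil_lg2(std::uint64_t n) {
  unsigned lg2 = 0;
  while (lg2 < 63 && (std::uint64_t(1) << lg2) < n) ++lg2;
  return lg2;
}

// Which superblock a byte offset d falls into.
constexpr std::uint64_t superblock_index(std::uint64_t d, unsigned sb_size_lg2) {
  return d >> sb_size_lg2;
}

// Remaining offset of d inside that superblock.
constexpr std::uint64_t offset_in_superblock(std::uint64_t d, unsigned sb_size_lg2) {
  return d & ((std::uint64_t(1) << sb_size_lg2) - 1);
}

// alloc_size(2,091,270,888) needs a 2^31-byte block: blocksize_lg2(31).
static_assert(ceil_lg2(2091270888ULL) == 31, "alloc_size rounds up to 2^31");

// d read as hex 0x180000000 is exactly 3 superblocks of 2^31 bytes: sb_id(3).
static_assert(superblock_index(0x180000000ULL, 31) == 3, "d lands in superblock 3");
static_assert(offset_in_superblock(0x180000000ULL, 31) == 0, "d is on a boundary");
```

Note that this offset needs 33 bits, so it cannot be represented in a 32-bit integer.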

hcedwar added a commit that referenced this issue Oct 10, 2017
Tighten up MemoryPool block and superblock size constraint checking.
Issue #1154
@hcedwar hcedwar added the Blocks Promotion Overview issue for release-blocking bugs label Oct 10, 2017
@hcedwar hcedwar added this to the 2017 October milestone Oct 10, 2017
@hcedwar hcedwar self-assigned this Oct 10, 2017
@crtrott crtrott closed this as completed Oct 28, 2017