Multi-thread issue with stateful materials #1321

Closed
YaqiWang opened this issue Feb 14, 2014 · 5 comments

Comments

@YaqiWang
Contributor

This multithreading issue seems to be related to stateful materials. I have checked in a test case at
rattlesnake/examples/GODIVA/godiva_thread.i

If I run the problem on my Mac with:
gdb --args ../../rattlesnake-dbg -i godiva_thread.i --n-threads=12

I get the following error and backtrace:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00000000000005b0
to process 72566 thread 0x4403
0x000000010128b4b6 in MaterialPropertyStorage::initStatefulProps (this=0x104827a40, material_data=@0x104577730, mats=@0x106999318, n_qpoints=8, elem=@0x1045527f0, side=0) at MaterialPropertyStorage.C:102
102 if (props()[== NULL) props()elem_id[= material_data.props()prop_id->init(n_qpoints);
(gdb) bt
#0 0x000000010128b4b6 in MaterialPropertyStorage::initStatefulProps (this=0x104827a40, material_data=@0x104577730, mats=@0x106999318, n_qpoints=8, elem=@0x1045527f0, side=0) at MaterialPropertyStorage.C:102
#1 0x0000000100f62e93 in ComputeMaterialsObjectThread::onElement (this=0x1047d7158, elem=0x1045527f0) at ComputeMaterialsObjectThread.C:74
#2 0x0000000100eedea3 in ThreadedElementLoop<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator() (this=0x1047d7158, range=@0x11bd06ac0) at ThreadedElementLoop.h:122
#3 0x0000000101001ccb in tbb::interface6::internal::start_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, ComputeMaterialsObjectThread, tbb::auto_partitioner const>::run_body (this=0x1047d6940, r=@0x11bd06ac0) at parallel_reduce.h:152
#4 0x0000000101002219 in tbb::interface6::internal::partition_type_base<tbb::interface6::internal::auto_partition_type>::execute<tbb::interface6::internal::start_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, ComputeMaterialsObjectThread, tbb::auto_partitioner const>, libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> > (this=0x1047d6988, start=@0x1047d6940, range=@0x1047d6950) at partitioner.h:265
#5 0x0000000100fdb101 in tbb::interface6::internal::start_reduce<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*>, ComputeMaterialsObjectThread, tbb::auto_partitioner const>::execute (this=0x1047d6940) at parallel_reduce.h:164
#6 0x000000010364b555 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all ()
#7 0x0000000103647f6b in tbb::internal::arena::process ()
#8 0x00000001036456e0 in tbb::internal::market::process ()
#9 0x0000000103641b75 in tbb::internal::rml::private_worker::thread_routine ()
#10 0x00007fff8ae2f8bf in _pthread_start ()
#11 0x00007fff8ae32b75 in thread_start ()

If I run the problem with 6 threads, sometimes it runs fine and sometimes I get a segmentation fault. Any ideas?
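
A crash that comes and goes with the thread count is typical of a data race. The following is a minimal sketch, not the actual MOOSE code, of how a check-then-initialize step on a shared container (similar in shape to the statement reported at MaterialPropertyStorage.C:102) can be guarded with a mutex. The names `PropStorage`, `initProp`, and `runThreads` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-element stateful property store.
class PropStorage
{
public:
  // Create the entry for elem_id on first use. The lock makes the
  // "is it there? if not, create it" sequence atomic across threads.
  std::vector<double> & initProp(unsigned int elem_id, std::size_t n_qpoints)
  {
    std::lock_guard<std::mutex> lock(_mutex);
    std::unique_ptr<std::vector<double>> & slot = _props[elem_id];
    if (!slot)
      slot.reset(new std::vector<double>(n_qpoints, 0.0));
    return *slot;
  }

  // Number of elements that currently have an entry.
  std::size_t size() const
  {
    std::lock_guard<std::mutex> lock(_mutex);
    return _props.size();
  }

private:
  mutable std::mutex _mutex;
  std::map<unsigned int, std::unique_ptr<std::vector<double>>> _props;
};

// Each worker thread initializes a strided share of the elements and
// the function returns how many entries exist once all workers finish.
std::size_t runThreads(unsigned int n_threads, unsigned int n_elems)
{
  assert(n_threads > 0); // with no workers nothing would be initialized
  PropStorage storage;
  std::vector<std::thread> workers;
  for (unsigned int t = 0; t < n_threads; ++t)
    workers.emplace_back([&storage, t, n_threads, n_elems]() {
      for (unsigned int e = t; e < n_elems; e += n_threads)
        storage.initProp(e, 8);
    });
  for (std::thread & w : workers)
    w.join();
  return storage.size();
}
```

Calling `runThreads(12, 1000)` should leave 1000 entries in the store. Without the lock, concurrent insertion into the `std::map` is a data race with undefined behavior, which can show up as an occasional segmentation fault.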

@friedmud
Contributor

There are definitely some issues with stateful material properties. If I run stateful_prop_test through valgrind it shows me quite a few things that are leaked (see below). We should clean this stuff up first...

==66961== 32 bytes in 1 blocks are definitely lost in loss record 40 of 121
==66961==    at 0xC7F3: malloc (vg_replace_malloc.c:266)
==66961==    by 0x100168D: operator new(unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10016DA: operator new[](unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10004C141: MaterialProperty<double>::resize(int) (in ../../moose_test-opt)
==66961==    by 0x18B070: MaterialData::size(unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x18EDA2: MaterialPropertyStorage::initStatefulProps(MaterialData&, std::vector<Material*, std::allocator<Material*> >&, unsigned int, libMesh::Elem const&, unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x89B9F: ComputeMaterialsObjectThread::onElement(libMesh::Elem const*) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x10C32D: ThreadedElementLoop<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xA7086: FEProblem::initialSetup() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x149A5B: Transient::execute() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xEC69B: MooseApp::runInputFile() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xECC7E: MooseApp::parseCommandLine() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961== 
==66961== 32 bytes in 1 blocks are definitely lost in loss record 41 of 121
==66961==    at 0xC7F3: malloc (vg_replace_malloc.c:266)
==66961==    by 0x100168D: operator new(unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10016DA: operator new[](unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10004C141: MaterialProperty<double>::resize(int) (in ../../moose_test-opt)
==66961==    by 0x18B0A0: MaterialData::size(unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x18EDA2: MaterialPropertyStorage::initStatefulProps(MaterialData&, std::vector<Material*, std::allocator<Material*> >&, unsigned int, libMesh::Elem const&, unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x89B9F: ComputeMaterialsObjectThread::onElement(libMesh::Elem const*) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x10C32D: ThreadedElementLoop<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xA7086: FEProblem::initialSetup() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x149A5B: Transient::execute() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xEC69B: MooseApp::runInputFile() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xECC7E: MooseApp::parseCommandLine() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961== 
==66961== 64 bytes in 1 blocks are definitely lost in loss record 58 of 121
==66961==    at 0xC7F3: malloc (vg_replace_malloc.c:266)
==66961==    by 0x100168D: operator new(unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10016DA: operator new[](unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10004C141: MaterialProperty<double>::resize(int) (in ../../moose_test-opt)
==66961==    by 0x18B070: MaterialData::size(unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x18EDA2: MaterialPropertyStorage::initStatefulProps(MaterialData&, std::vector<Material*, std::allocator<Material*> >&, unsigned int, libMesh::Elem const&, unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x89AF6: ComputeMaterialsObjectThread::onElement(libMesh::Elem const*) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x10C32D: ThreadedElementLoop<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xA7086: FEProblem::initialSetup() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x149A5B: Transient::execute() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xEC69B: MooseApp::runInputFile() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xECC7E: MooseApp::parseCommandLine() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961== 
==66961== 64 bytes in 1 blocks are definitely lost in loss record 59 of 121
==66961==    at 0xC7F3: malloc (vg_replace_malloc.c:266)
==66961==    by 0x100168D: operator new(unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10016DA: operator new[](unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0x10004C141: MaterialProperty<double>::resize(int) (in ../../moose_test-opt)
==66961==    by 0x18B0A0: MaterialData::size(unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x18EDA2: MaterialPropertyStorage::initStatefulProps(MaterialData&, std::vector<Material*, std::allocator<Material*> >&, unsigned int, libMesh::Elem const&, unsigned int) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x89AF6: ComputeMaterialsObjectThread::onElement(libMesh::Elem const*) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x10C32D: ThreadedElementLoop<libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> >::operator()(libMesh::StoredRange<libMesh::MeshBase::const_element_iterator, libMesh::Elem const*> const&) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xA7086: FEProblem::initialSetup() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x149A5B: Transient::execute() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xEC69B: MooseApp::runInputFile() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xECC7E: MooseApp::parseCommandLine() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961== 
==66961== 309 (144 direct, 165 indirect) bytes in 1 blocks are definitely lost in loss record 71 of 121
==66961==    at 0xC7F3: malloc (vg_replace_malloc.c:266)
==66961==    by 0x100168D: operator new(unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0xAA830: FEProblem::FEProblem(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xD79DB: Problem* buildObject<FEProblem>(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x11136E: ProblemFactory::create(std::string const&, std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x111971: ProblemFactory::createFEProblem(MooseMesh*) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x14A958: Transient::Transient(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xE06BB: MooseObject* buildObject<Transient>(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xBB77E: Factory::create(std::string const&, std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x30903: CreateExecutionerAction::act() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x1DF97: ActionWarehouse::executeActionsWithAction(std::string const&) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x1E1F5: ActionWarehouse::executeAllActions() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961== 
==66961== 309 (144 direct, 165 indirect) bytes in 1 blocks are definitely lost in loss record 72 of 121
==66961==    at 0xC7F3: malloc (vg_replace_malloc.c:266)
==66961==    by 0x100168D: operator new(unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0xAA857: FEProblem::FEProblem(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xD79DB: Problem* buildObject<FEProblem>(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x11136E: ProblemFactory::create(std::string const&, std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x111971: ProblemFactory::createFEProblem(MooseMesh*) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x14A958: Transient::Transient(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xE06BB: MooseObject* buildObject<Transient>(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xBB77E: Factory::create(std::string const&, std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x30903: CreateExecutionerAction::act() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x1DF97: ActionWarehouse::executeActionsWithAction(std::string const&) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x1E1F5: ActionWarehouse::executeAllActions() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961== 
==66961== 309 (144 direct, 165 indirect) bytes in 1 blocks are definitely lost in loss record 73 of 121
==66961==    at 0xC7F3: malloc (vg_replace_malloc.c:266)
==66961==    by 0x100168D: operator new(unsigned long) (in /usr/lib/libstdc++.6.0.9.dylib)
==66961==    by 0xAA87E: FEProblem::FEProblem(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xD79DB: Problem* buildObject<FEProblem>(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x11136E: ProblemFactory::create(std::string const&, std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x111971: ProblemFactory::createFEProblem(MooseMesh*) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x14A958: Transient::Transient(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xE06BB: MooseObject* buildObject<Transient>(std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0xBB77E: Factory::create(std::string const&, std::string const&, InputParameters) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x30903: CreateExecutionerAction::act() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x1DF97: ActionWarehouse::executeActionsWithAction(std::string const&) (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)
==66961==    by 0x1E1F5: ActionWarehouse::executeAllActions() (in /Users/gastdr/projects/herd_trunk/devel/moose/libmoose-opt.dylib)

@friedmud
Contributor

Moving this down from Blocker... it's not truly blocking although it does need to be fixed.

@permcody
Member

This problem is most likely caused by stateful properties that you access later without ever initializing them. We are still working on
computing regular material properties during initialSetup, but this error is NOT a MOOSE problem.

We'll take a look at this again once we have material properties working during initialSetup.
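
To illustrate what "initialize before access" means for a stateful property, here is a small self-contained sketch. It does not use the MOOSE API. `StatefulProp` and its members are hypothetical names, and the explicit check is just one way to surface a missing initialization instead of reading invalid memory.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stateful property: stores the current value and the
// value from the previous step at every quadrature point.
class StatefulProp
{
public:
  // Size both states and give them a well-defined starting value.
  void init(std::size_t n_qpoints, double initial)
  {
    assert(n_qpoints > 0);
    _current.assign(n_qpoints, initial);
    _old.assign(n_qpoints, initial);
    _initialized = true;
  }

  // End of a step: the current values become the old values.
  void advance()
  {
    requireInit();
    _old = _current;
  }

  double & current(std::size_t qp)
  {
    requireInit();
    return _current.at(qp);
  }

  double old(std::size_t qp) const
  {
    requireInit();
    return _old.at(qp);
  }

private:
  // Fail loudly if the property is used before init() was called.
  void requireInit() const
  {
    if (!_initialized)
      throw std::logic_error("stateful property used before init()");
  }

  bool _initialized = false;
  std::vector<double> _current;
  std::vector<double> _old;
};
```

In this sketch, reading `old(qp)` before `init()` throws a `std::logic_error`. After `init(8, 1.5)`, the old value at every point is 1.5 until `advance()` copies the current values over.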

  • Demoting this ticket to major

@permcody
Member

In 49f78e1:

Fixing threading issues with stateful materials - closes #1321

@permcody
Member

In 14e81cd:

Turning threading back on in stateful material initialization - refs #1321


3 participants