CMSSW 7.4.x / 7.5.x crash on ZFS mount #9517

Closed
blinkseb opened this issue Jun 9, 2015 · 8 comments

@blinkseb
Contributor

blinkseb commented Jun 9, 2015

Dear experts,

I'm not sure this is the right place to report bugs with CMSSW, so feel free to redirect me to a more suitable place!

I experience a crash with CMSSW 7.4.x / 7.5.x when trying to run from our remote UI in Louvain.

I run

runTheMatrix.py --nThreads=1 --list=250200

Here's the stacktrace:

#5  0x00007f4fd985b5b8 in LocalFileSystem::findMount(char const*, statfs*, stat*, std::vector<std::string, std::allocator<std::string> >&) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libUtilitiesStorageFactory.so
#6  0x00007f4fd985d6e5 in LocalFileSystem::findCachePath(std::vector<std::string, std::allocator<std::string> > const&, double) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libUtilitiesStorageFactory.so
#7  0x00007f4fd986e0ec in StorageFactory::setTempDir(std::string const&, double) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libUtilitiesStorageFactory.so
#8  0x00007f4fd986e433 in StorageFactory::StorageFactory() () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libUtilitiesStorageFactory.so
#9  0x00007f4fd9856190 in _GLOBAL__sub_I_StorageFactory.cc () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libUtilitiesStorageFactory.so
#10 0x00007f4fe563c5ef in call_init (env=0x7fffe3e00850, argv=0x7fffe3e00838, argc=2, l=<optimized out>) at dl-init.c:85
#11 _dl_init (main_map=0x7f4fe14ce400, argc=2, argv=0x7fffe3e00838, env=0x7fffe3e00850) at dl-init.c:134
#12 0x00007f4fe5640d15 in dl_open_worker (a=<optimized out>) at dl-open.c:492
#13 0x00007f4fe563c206 in _dl_catch_error (objname=0x7fffe3dfe770, errstring=0x7fffe3dfe768, mallocedp=0x7fffe3dfe77f, operate=0x7f4fe5640920 <dl_open_worker>, args=0x7fffe3dfe720) at dl-error.c:178
#14 0x00007f4fe56404fa in _dl_open (file=0x7f4fe09bae58 "/cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/pluginUtilitiesStorageFactoryPlugins.so", mode=-2147483391, caller_dlopen=0x7f4fe5674193 <edmplugin::SharedLibrary::SharedLibrary(boost::filesystem::path const&)+35>, nsid=-2, argc=2, argv=<optimized out>, env=0x7fffe3e00850) at dl-open.c:583
#15 0x00007f4fe3bfaf66 in dlopen_doit () from /lib64/libdl.so.2
#16 0x00007f4fe563c206 in _dl_catch_error (objname=0x7f4fe174b5b0, errstring=0x7f4fe174b5b8, mallocedp=0x7f4fe174b5a8, operate=0x7f4fe3bfaf00 <dlopen_doit>, args=0x7fffe3dfe940) at dl-error.c:178
#17 0x00007f4fe3bfb29c in _dlerror_run () from /lib64/libdl.so.2
#18 0x00007f4fe3bfaee1 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#19 0x00007f4fe5674193 in edmplugin::SharedLibrary::SharedLibrary(boost::filesystem::path const&) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libFWCorePluginManager.so
#20 0x00007f4fe566c7d8 in edmplugin::PluginManager::load(std::string const&, std::string const&) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libFWCorePluginManager.so
#21 0x00007f4fe566461f in edmplugin::PluginFactoryBase::findPMaker(std::string const&) const () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libFWCorePluginManager.so
#22 0x00007f4fe5839f2c in edm::serviceregistry::ServicesManager::fillListOfMakers(std::vector<edm::ParameterSet, std::allocator<edm::ParameterSet> >&) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libFWCoreServiceRegistry.so
#23 0x00007f4fe583ba44 in edm::serviceregistry::ServicesManager::ServicesManager(edm::ServiceToken, edm::serviceregistry::ServiceLegacy, std::vector<edm::ParameterSet, std::allocator<edm::ParameterSet> >&, bool) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libFWCoreServiceRegistry.so
#24 0x00007f4fe5838c16 in edm::ServiceRegistry::createSet(std::vector<edm::ParameterSet, std::allocator<edm::ParameterSet> >&, edm::ServiceToken, edm::serviceregistry::ServiceLegacy, bool) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw/CMSSW_7_4_4/lib/slc6_amd64_gcc491/libFWCoreServiceRegistry.so
#25 0x00007f4fe55733db in edm::ScheduleItems::initServices(std::vector<edm::ParameterSet, std::allocator<edm::ParameterSet> >&, edm::ParameterSet&, edm::ServiceToken const&, edm::serviceregistry::ServiceLegacy, bool) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw-patch/CMSSW_7_4_4_patch4/lib/slc6_amd64_gcc491/libFWCoreFramework.so
#26 0x00007f4fe54dd0e0 in edm::EventProcessor::init(std::shared_ptr<edm::ProcessDesc>&, edm::ServiceToken const&, edm::serviceregistry::ServiceLegacy) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw-patch/CMSSW_7_4_4_patch4/lib/slc6_amd64_gcc491/libFWCoreFramework.so
#27 0x00007f4fe54df05a in edm::EventProcessor::EventProcessor(std::shared_ptr<edm::ProcessDesc>&, edm::ServiceToken const&, edm::serviceregistry::ServiceLegacy) () from /cvmfs/cms.cern.ch/slc6_amd64_gcc491/cms/cmssw-patch/CMSSW_7_4_4_patch4/lib/slc6_amd64_gcc491/libFWCoreFramework.so
#28 0x000000000040d001 in main::{lambda()#1}::operator()() const ()
#29 0x000000000040b4a6 in main ()

The filesystem is ZFS, and /proc/mounts gives:

storage/scratch /nfs/scratch zfs rw,noatime,xattr,noacl 0 0
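
For anyone who wants to check whether their own working directory sits on a similar mount, here is a minimal sketch that parses entries in the /proc/mounts format and picks the longest mount point that contains a given path. It assumes the usual whitespace-separated layout (device, mount point, filesystem type, options, two numeric fields). `parse_mount_line` and `find_mount` are hypothetical helper names used only for illustration; this is not the CMSSW `LocalFileSystem::findMount` implementation and says nothing about the cause of the crash.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class MountEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: Tuple[str, ...]


def parse_mount_line(line: str) -> MountEntry:
    """Parse one /proc/mounts-style line (hypothetical helper).

    Note: the kernel escapes spaces in paths as \\040; this sketch
    does not decode such escapes.
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"malformed mount entry: {line!r}")
    device, mount_point, fs_type, options = fields[:4]
    return MountEntry(device, mount_point, fs_type, tuple(options.split(",")))


def find_mount(path: str, entries: Iterable[MountEntry]) -> Optional[MountEntry]:
    """Return the entry whose mount point is the longest prefix of path."""
    best: Optional[MountEntry] = None
    best_len = -1
    for entry in entries:
        mount = entry.mount_point.rstrip("/") or "/"
        # Match on whole path components so /nfs/scratchy is not
        # treated as living under /nfs/scratch.
        inside = mount == "/" or path == mount or path.startswith(mount + "/")
        if inside and len(mount) > best_len:
            best, best_len = entry, len(mount)
    return best


def read_mounts(mounts_file: str = "/proc/mounts") -> list:
    """Read and parse every non-empty line of the mount table (Linux only)."""
    with open(mounts_file) as handle:
        return [parse_mount_line(line) for line in handle if line.strip()]
```

On a Linux host, `find_mount(os.getcwd(), read_mounts())` (after `import os`) would give the entry for the current directory, and its `fs_type` field shows whether it is `zfs`.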

Thanks

@blinkseb blinkseb changed the title CMSSW 7.4.x / 7.5.x crash on ZFS mount CMSSW 7.4.x crash on ZFS mount Jun 9, 2015
@blinkseb blinkseb changed the title CMSSW 7.4.x crash on ZFS mount CMSSW 7.4.x / 7.5.x crash on ZFS mount Jun 9, 2015
@davidlt
Contributor

davidlt commented Jun 9, 2015

Hi,

I have seen the same crash while compiling CMSSW_7_5_X + GCC 4.9.X + ASan. It happened in the edm checksum checker, but that's Python. I never recompiled Python with the ASan runtime, so it crashes without producing a report. I haven't investigated it further.

@davidlt
Contributor

davidlt commented Jun 9, 2015

I will raise this in the DORP meeting. How severe is the problem at your site?

@blinkseb
Contributor Author

blinkseb commented Jun 9, 2015

Thanks! Pretty severe, I guess, since we can no longer use CMSSW at all with the recent releases. I'll try to find out which release introduced the crash; that should help identify the commit that introduced the regression.

@davidlt
Contributor

davidlt commented Jun 9, 2015

I will try to trigger it again and recompile in a way that gives a good report on the crash.

@davidlt davidlt self-assigned this Jun 9, 2015
@davidlt
Contributor

davidlt commented Jun 10, 2015

Adding a reference to a fix #9541 in CMSSW_7_5_X.

@davidlt
Contributor

davidlt commented Jun 15, 2015

This is still open because a similar PR needs to be made for CMSSW_7_4_X (I will take care of it later today).

@davidlt
Contributor

davidlt commented Jun 15, 2015

PR #9622 for CMSSW_7_4_X was created. Let's keep the ticket alive until it's merged.

@davidlt
Contributor

davidlt commented Jun 16, 2015

Merged for CMSSW_7_4_X also. Closing.

@davidlt davidlt closed this as completed Jun 16, 2015