Rivet 2.6.0 compiles in 10X but crash at running time #3679
Comments
A new Issue was created by @xjanssen Janssen Xavier. @davidlange6, @Dr15Jones, @smuzaffar, @fabiocos can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
It's linked to the system dynamic loader. Rivet was probably compiled with static TLS, but the operating system keeps only a few slots in a vector for system-specific packages. CMSSW loads hundreds of shared libraries via dlopen(), and you cannot load more shared libraries with static TLS than there are slots in the vector. This has been resolved for CentOS 7.2 and above. See: cms-externals/glibc@2c052e0 One should investigate how TLS is being used in Rivet. Maybe using "-ftls-model=global-dynamic" would help. This is the default if Rivet was built for shared linking (i.e. -fPIC). |
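One way to check which library actually carries static TLS is to look for the `STATIC_TLS` flag in its ELF dynamic section, as printed by `readelf -d`. The following is only a minimal sketch: the helper names and the sample text are made up for illustration (the sample is hand-written in the style of `readelf -d` output, not taken from a real build), and it assumes `readelf` from binutils is available on the machine.

```python
import re
import subprocess


def has_static_tls_flag(readelf_dynamic_output: str) -> bool:
    """Return True if a `readelf -d` dump lists STATIC_TLS in its FLAGS entry."""
    for line in readelf_dynamic_output.splitlines():
        # Only the (FLAGS) entry matters here; (FLAGS_1) carries other bits.
        if "(FLAGS)" in line and re.search(r"\bSTATIC_TLS\b", line):
            return True
    return False


def library_uses_static_tls(path: str) -> bool:
    """Run `readelf -d` on a shared library (assumes binutils is installed)."""
    result = subprocess.run(
        ["readelf", "-d", path], capture_output=True, text=True, check=True
    )
    return has_static_tls_flag(result.stdout)


# Hand-written illustration of the expected text layout, NOT a real log.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000001e (FLAGS)              STATIC_TLS
"""

if __name__ == "__main__":
    print(has_static_tls_flag(SAMPLE))
```

Running `library_uses_static_tls()` over the plugin and the libraries it depends on would show which of them requests static TLS, which is the information needed before changing the TLS model.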
Ah, it's not Rivet, it's "pluginGeneratorInterfaceRivetInterface_plugins.so". I wonder why... |
@xjanssen, can you please share your spec files? |
Hi, I just updated the yoda and rivet versions wrt the ones in git.
Xavier
|
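Did you build and scram setup both yoda-toolfile and rivet-toolfile? I built these versions on cmsdev02 and cannot reproduce the error. On which cmsdev machine did you build, and can you point me to the build logs dir? |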
Hi,
Yes, both yoda and rivet have been rebuilt. I used one of the cmsdev machines but cannot remember which one (it was a week ago).
There was no error at build time, only later when we use the package in CMSSW.
The output of the build I did is here:
/afs/cern.ch/user/x/xjanssen/public/screenlog.0
/afs/cern.ch/user/x/xjanssen/public/yoda_build_log
/afs/cern.ch/user/x/xjanssen/public/rivet_build_log
What I did to test it is:
On a lxplus host:
1) Get a recent CMSSW release:
cmsrel CMSSW_10_0_0_pre3
cd CMSSW_10_0_0_pre3
cmsenv
2) Link to newly built rivet:
scram tool remove yoda
scram tool remove rivet
edit config/toolbox/slc6_amd64_gcc530/tools/available/yoda.xml
edit config/toolbox/slc6_amd64_gcc530/tools/available/rivet.xml
to put new version and a link to the build directory
scram setup yoda
scram setup rivet
3) Setup Rivet
cd CMSSW_8_1_0_pre12/src
cmsenv
git cms-addpkg GeneratorInterface/RivetInterface
wget -P Configuration/GenProduction/python/ https://raw.githubusercontent.com/cms-sw/genproductions/master/python/rivet_customize.py
cp /afs/cern.ch/user/x/xjanssen/public/rivet_CUEP8S1_CT6_Soft_cfg.py .
scram b
cmsRun rivet_CUEP8S1_CT6_Soft_cfg.py
On 22 Jan 2018, at 21:42, Malik Shahzad Muzaffar wrote:
did you build and scram setup both yoda-toolfile and rivet-toolfile?
I built these versions on cmsdev02 and can not reproduce the error. On which cmsdev machine did you build and can you point me to the build logs dir?
|
Ah, of course I used CMSSW_10_0_0_pre3 and gcc630 everywhere; I forgot to update the later lines when pasting from my notes on how to do it. |
@davidlt @davidlange6 It seems there was a similar problem with fireworks before, and glibc has been patched as follows: https://github.com/cms-externals/glibc/commits/cms/2.12-1.166.el6_7.3 Is it possible to increase |
Hi @intrepid42 - I suspect this would not be easily done (as this is the cause of problems moving towards centos7 smoothly), but you should bring this up at a core software meeting so people can better understand what's going on (yoda/rivet does not seem like the sort of software that should run into this problem, after all). Do things work on centos7? |
Hi, I just tried this:
And 4400 events get processed fine with the Should we still bother about slc6, or is it likely to be phased out soon? |
Just to confirm that same steps fail in slc6:
|
|
@intrepid42 in 10_4_X we should push for slc7 as new production version IMO |
Ok, I created a pull request! |
NB: this Rivet release is probably not super-important yet. But we definitely want to use Rivet 3.0 once it is available to get new features like processing of multiple weights |
So, unfortunately, the nanoAod workflow fails on slc6 (it relies on the Rivet-based ParticleLevelProducer) in #4427. Is it technically possible to use the "legacy" version 2.5.4 for slc6, using a statement like this in the spec file?
This would ensure the basic functionality on slc6 (= old plugins and ParticleLevelProducer). I think it would be acceptable to move Rivet plugin development and advanced usage (latest plugins and new features) to slc7. I fear that we would also need to have different RivetAnalyzer code for different Rivet versions once they diverge too much... Can that be done with some #ifdef or BuildFile.xml statements? |
@intrepid42, currently the version is fixed in the spec file and cannot be changed. I will see if cmsBuild can dynamically assign a version.
|
Thank you, that looks promising! Do you know if it works with Are there flags that we can use during compile time for the code in the RivetInterface package? We expect to integrate some new functionalities with the Rivet 3.0 upgrade (heavy-ion support, multi-weight handling), so we would need to make those parts invisible to the compiler on slc6 (that would know only the headers of Rivet 2.5.4). |
Closing this, we have disabled OPENMP for slc6 to avoid this crash. |
Hi,
I am testing the integration of Rivet 2.6.0 (and YODA 1.7.0, on which this release is based) in CMSSW 10X. I managed to build the two packages on a cmsdev machine with the usual cmsBuild command without problems. However, when I link these new versions into a CMSSW 10X release with the 'scram tool' command and try to run my test job, I get the following error:
---- Begin Fatal Exception 11-Jan-2018 15:17:01 CET-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
[0] Constructing the EventProcessor
Exception Message:
unable to load /afs/cern.ch/work/x/xjanssen/cms/Rivet/10X_gcc630_Rivet260/CMSSW_10_0_0_pre3/lib/slc6_amd64_gcc630/pluginGeneratorInterfaceRivetInterface_plugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------
As far as I understand, I am hitting some limitation of our runtime environment, but I am unsure how to fix or debug it as I am not an expert in this kind of problem. The main change wrt the previous release of Rivet is the addition of several Rivet plugins, which might be the underlying reason for hitting this limit. The standalone install of Rivet 2.6.0 (outside CMSSW) is working, however, so this seems to be a feature linked to the CMSSW environment.
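For debugging, it can help to try the dlopen() call by hand and to tell this particular failure apart from other load errors. Below is a minimal sketch using Python's `ctypes`; the helper names are hypothetical and the marker string is simply the text from the exception above. Note the assumption behind it: a fresh Python process has far fewer libraries loaded than cmsRun, so a successful load here does not prove the plugin will load inside CMSSW.

```python
import ctypes

# Text taken from the exception message quoted in this issue.
STATIC_TLS_MARKER = "cannot load any more object with static TLS"


def is_static_tls_error(message: str) -> bool:
    """True if a loader error message reports exhausted static TLS slots."""
    return STATIC_TLS_MARKER in message


def try_load(path: str):
    """Attempt to dlopen `path`; return (loaded, error_message)."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # ctypes raises OSError carrying the dynamic loader's error text.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    loaded, error = try_load("/path/to/some_plugin.so")
    if not loaded:
        kind = "static TLS limit" if is_static_tls_error(error) else "other load error"
        print(f"{kind}: {error}")
```

Loading the suspect libraries one by one in this way, after first loading a large set of other libraries, is one possible way to narrow down which object triggers the limit.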