Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fastsim/PF threading issues #3809

Merged
merged 5 commits into from May 15, 2014

Conversation

lgray
Copy link
Contributor

@lgray lgray commented May 11, 2014

@Dr15Jones

Fix singleton inside of fastim to be thread safe (memory barrier + mutex for possible loading).
Revert rechit producers back to standard EDProducers due to unsafe HcalChannelQuality ES product.
I'll try to make HcalChannelQuality safe, when I get a little more time.

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @lgray (Lindsey Gray) for CMSSW_7_1_X.

Fastsim/PF threading issues

It involves the following packages:

FastSimulation/Particle
RecoParticleFlow/PFClusterProducer

@nclopezo, @lveldere, @cmsbuild, @anton-a, @thspeer, @giamman, @slava77, @Degano can you please review it and eventually sign? Thanks.
@bachtis this is something you requested to watch as well.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.
@nclopezo, @ktf you are the release manager for this.
You can merge this pull request by typing 'merge' in the first line of your comment.

@lgray
Copy link
Contributor Author

lgray commented May 11, 2014

@Dr15Jones Also, looking at HcalChannelQuality and the root cause there... Seems to be simply a misuse of a mutable for silly reasons. I'll track it down and attach to this PR.... Probably reverting the PFRecHit producers back to stream modules.

@lgray
Copy link
Contributor Author

lgray commented May 12, 2014

Figured out a bug on the plane. I will fix when I'm back in Geneva.

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Pull request #3809 was updated. @anton-a, @thspeer, @giamman, @lveldere, @slava77 can you please check and sign again.

@cmsbuild
Copy link
Contributor

@Dr15Jones
Copy link
Contributor

See https://twiki.cern.ch/twiki/bin/view/CMSPublic/FWMultithreadedThreadSafeDataStructures#Lazy_caching_of_a_pointer for the much better way to cache a newly created object.

@Dr15Jones
Copy link
Contributor

Since ParticleTable can potentially violate our no communication between module rule, what I think should be done is

  1. make ParticleTable::myself a static thread_local.
  2. remove ParticleTable::instance( const HepPDT::ParticleDataTable*) and replace with
    a) private set( const HepPDT::ParticleDataTable*)
    b) helper class ParticleTable::Sentry which is declared as a friend. The constructor of Sentry takes a const HepPDT::ParticleDataTable*) and passes it to ParticleTable::set(...). The he destructor will call ParticleTable::set(nullptr).
    This design guarantees that the HepPDT::ParticleDataTable must be set by each module individually.

@lgray
Copy link
Contributor Author

lgray commented May 12, 2014

@Dr15Jones Ok, thanks. Today is pretty much done because of jetlag (at least right now, let's see at 3am). I'll submit new commits tomorrow morning CERN time.

@lgray
Copy link
Contributor Author

lgray commented May 13, 2014

@Dr15Jones In the case of static thread_local ParticleTable::myself isn't it ok to have the whatever the first module is in that thread construct the Sentry? Instead of rebuilding at each module call.

@cmsbuild
Copy link
Contributor

@lveldere
Copy link
Contributor

Hi Lindsay and Chris

I realise that things need to move forward,
so I'm not definitely going to obstruct this.

However, I would really appreciate it if you could explain me a few things
I apologise in advance for being slightly annoying and not very constructive.
I'm trying to learn something here,
and in general I'm trying to arrive at a more accessible FastSim code.

  • What is exactly is accomplished with the changes?
    From the conversation I understand
  • particle data becomes thread local
  • particle data gets updated at the beginning of each run
  • If this is correct: why did you want to accomplish this?
    Naively one would be happy with particle data shared between threads,
    and particle data constant over all runs.
    Does it speed up things somehow,
    is it useful for one or another analysis/physics purpose?
    I really don't understand the motivation right now.

What is the long term solution for this?
Can we expect a simpler solution from your side one day,
should something change about how FastSim is structured to avoid these things,
...

Many thanks

Lukas

@lgray
Copy link
Contributor Author

lgray commented May 15, 2014

@lveldere
The particle data is set for at the start of each module function that needs it. It actually slows things down a little bit.

The short answer is that there will be problems if the particle data table changes in one thread when another thread is reading it (reallocation across run boundary, etc.). Unless you think we should only ever run fastsim in single threaded mode we have to make it such that this cannot not happen. Moreover for CMS we have to ensure that the threads cannot communicate information to each other, since mixing information of lumisections/runs/events fundamentally a bad thing.

A nice way to fix this, I think, would be to make the fast sim particle, the FamosManager, and FSim*, constructors take directly the ESHandle to the pdt, instead of asking for it through singleton. I think that would solve 99% of the issues being addressed here, and be a long term replacement for the patch we are putting in here.

@cmsbuild
Copy link
Contributor

@lveldere
Copy link
Contributor

+1

I will communicate further with Lindsey to understand things in more detail

@Dr15Jones
Copy link
Contributor

Looks good to me. Thanks for doing this Lindsey, I owe you a Belgian beer.

@slava77
Copy link
Contributor

slava77 commented May 15, 2014

+1

for
#3809 ebf014c

no relevant changes in reco

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_7_1_X IBs unless changes (tests are also fine). @nclopezo, @ktf can you please take care of it?

davidlange6 added a commit that referenced this pull request May 15, 2014
@davidlange6 davidlange6 merged commit 8c99369 into cms-sw:CMSSW_7_1_X May 15, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants