Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

speed up conversions #4437

Merged
merged 1 commit into from Jul 2, 2014
Merged

speed up conversions #4437

merged 1 commit into from Jul 2, 2014

Conversation

VinInn
Copy link
Contributor

@VinInn VinInn commented Jun 28, 2014

ConversionProducer speeds up by 27% on run2012C jetHT.

very small regression.

https://twiki.cern.ch/twiki/bin/viewauth/CMS/HighGranularityCMSSWOptimization

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_7_2_X.

speed up conversions

It involves the following packages:

RecoEgamma/EgammaPhotonAlgos

@nclopezo, @cmsbuild, @Degano, @StoyanStoynev, @slava77 can you please review it and eventually sign? Thanks.
@lgray this is something you requested to watch as well.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

@@ -1,3 +1,5 @@
<flags CXXFLAGS="-Ofast -mrecip"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this option (Ofast) too aggressive to use? As I understand even debugging becomes problematic with it. This would be the first introduction of it in CMSSW, am I right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was discussed in a reco meeting in April and detailed in
https://twiki.cern.ch/twiki/bin/viewauth/CMS/HighGranularityCMSSWOptimization.

On 1 Jul, 2014, at 11:09 AM, StoyanStoynev notifications@github.com wrote:

In RecoEgamma/EgammaPhotonAlgos/BuildFile.xml:

@@ -1,3 +1,5 @@
+

Isn't this option (Ofast) too aggressive to use?
“too” with respect to what?
As I understand even debugging becomes problematic with it.
No. debugging is the same as O2 at the best of my knowedge.
This would be the first introduction of it in CMSSW, am I right?
No. we did many test in the past including a full set of “experimental” releases

v.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Jul 1, 2014, at 11:09 AM, StoyanStoynev notifications@github.com
wrote:

In RecoEgamma/EgammaPhotonAlgos/BuildFile.xml:

@@ -1,3 +1,5 @@
+

Isn't this option (Ofast) too aggressive to use? As I understand even debugging becomes problematic with it. This would be the first introduction of it in CMSSW, am I right?

a 27% speedup someplace can be worth the loss of debugging in a production environment? [I thought we had explicitly discussed this in a reco meeting - but perhaps was a number of months ago]


Reply to this email directly or view it on GitHub.

@StoyanStoynev
Copy link
Contributor

OK, please add the link to the description of the PR (for better visibility and for the record). I wasn't on that April meeting, and I am not a GCC expert, good it was discussed.

@VinInn
Copy link
Contributor Author

VinInn commented Jul 1, 2014

On 1 Jul, 2014, at 11:23 AM, StoyanStoynev notifications@github.com wrote:

OK, please add the link to the description of the PR (for better visibility and for the record). I wasn't on that April meeting, and I am not a GCC expert, good it was discussed.
to ALL 5 Pr?
v.


Reply to this email directly or view it on GitHub.

Il est bon de suivre sa pente, pourvu que ce soit en montant.
A.G.
http://www.flickr.com/photos/vin60/1320965757/

@StoyanStoynev
Copy link
Contributor

And for my record (one of the sources I looked at): http://wiki.gentoo.org/wiki/GCC_optimization

@davidlange6
Copy link
Contributor

looks like its in PR4414.

On Jul 1, 2014, at 11:25 AM, Vincenzo Innocente notifications@github.com
wrote:

On 1 Jul, 2014, at 11:23 AM, StoyanStoynev notifications@github.com wrote:

OK, please add the link to the description of the PR (for better visibility and for the record). I wasn't on that April meeting, and I am not a GCC expert, good it was discussed.
to ALL 5 Pr?
v.


Reply to this email directly or view it on GitHub.

Il est bon de suivre sa pente, pourvu que ce soit en montant.
A.G.
http://www.flickr.com/photos/vin60/1320965757/

Reply to this email directly or view it on GitHub.

@StoyanStoynev
Copy link
Contributor

Just this one (PR), I'll refer to it from the others.

@StoyanStoynev
Copy link
Contributor

#4414 was closed before I looked too much in it. The description of it would fit here though.

@StoyanStoynev
Copy link
Contributor

... working on this one too.

@bendavid
Copy link
Contributor

bendavid commented Jul 2, 2014

@VinnInn, do you have some idea whether the speedup mostly comes from our own code, or from the underlying SMatrix code itself?

@VinInn
Copy link
Contributor Author

VinInn commented Jul 2, 2014

is the underlying SMatrix code itself.
with gcc 4.8 it was not inlined/optimized anymore.

@StoyanStoynev
Copy link
Contributor

Using wf 202 (ttbar + PU) I see in DQM :
allConversions 63.2865 ms/ev -> 48.1996 ms/ev . The overall normalization is ~10-12% (consistent between different modules) so the actual difference seen in this wf (200 events) is more like 12 %. wf 16 (2k events) is consistent : allConversions 103.066 ms/ev -> 68.5179 ms/ev (with overall normalization changed by 18-20%); wf17 is also consistent with : allConversions 80.4577 ms/ev -> 52.6103 ms/ev and similar NORM changes. Similar conclusions I get from wf 38 (QCD). So I only see about 12 (+-)% effect in speeding up. Some small expected differences in event size is observed (due to conversions). No issues spotted yet in DQM/FWlite script-based tests, just overall regressions where expected, plots to follow later today with some discussion. See also #4438 - somewhat related code changes.

@slava77
Copy link
Contributor

slava77 commented Jul 2, 2014

a brief summary from looking at outputs in Stoyan's test area

nDiffs       test
7    all_extended_pull4437__vs__baseRel_QCDFlatPt15s3000wf38p0
27   all_extended_pull4437__vs__baseRel_SingleElectron35wf17p0
60   all_extended_pull4437__vs__baseRel_SingleElectronPt1000wf16p0
17   all_extended_pull4437__vs__baseRel_SingleGammaPt35wf19p0
7    all_extended_pull4437__vs__baseRel_SingleMuPt1000wf22p0
11   all_extended_pull4437__vs__baseRel_TTbarPUwf202p0
1    all_extended_pull4437__vs__baseRel_TTbarPUwf25202p0
small changes in conversions; propagate to pfblock and a few plots downstream
the biggest differences are in chi2 and they are "mild"

DQM diff3:
small diffs, consistent with above in ele-guns; no diffs in 202

@StoyanStoynev
Copy link
Contributor

+1
Based on above and some discussion below.
Tested af06808 on top of CMSSW_7_2_X_2014-06-27-0200 - extended matrix tests. Mild changes in conversion variables (mostly in electron/photon wf as Slava showed; wf 202 and 38 show literally just several distributions with difference of one entry in DQM plots - hard to see on a plot; these were angles, positions and chi2). In wf 16 there are twice as more warnings (42) about not positive-defined matrices:

%MSG-w BasicTrajectoryState: TrackProducerWithSCAssociation:ckfInOutTracksFromConversions 01-Jul-2014 12:55:15 CEST Run: 1 Event: 507
curv error not pos-def
[ -141769 48.9609 11999.7 0.00316398 -0.0126656
48.9609 -0.0169089 -4.14418-1.09132e-06 5.02377e-06
11999.7 -4.14418 -1015.68-0.000267741 0.00107206
0.00316398-1.09132e-06-0.000267741 9.70689e-07 1.98119e-08
-0.0126656 5.02377e-06 0.00107206 1.98119e-08 9.67573e-06 ]
pos/mom/mf (1.01128,-10.3577,10.4025) (0.282736,-5.15101,1.118) (1.94325e-05,-0.000199032,3.81115)

No differences in these messages seen in other wf.

Wf 202:
all_extended_pull4437__vs__baserel_ttbarpuwf202p0c_recoconversions_allconversions__reco_objat_size

Wf 16:
all_extended_pull4437__vs__baserel_singleelectronpt1000wf16p0c_recoconversions_allconversions__reco_objat_size

all_extended_pull4437__vs__baserel_singleelectronpt1000wf16p0c_recopfblocks_particleflowblock__reco_objat_size

chi2_wf16

conveta_wf16

met_wf16

phijet_wf16

The point is that effects are minor elsewhere - chi2 type of variables (as above) are affected probably most but it is more like random (and small) fluctuations than else.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 2, 2014

This pull request is fully signed and it will be integrated in one of the next CMSSW_7_2_X IBs unless changes (tests are also fine).

ktf added a commit that referenced this pull request Jul 2, 2014
@ktf ktf merged commit ba3ba4d into cms-sw:CMSSW_7_2_X Jul 2, 2014
@VinInn VinInn deleted the ConvSpeedup branch July 13, 2016 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants