Non-reproducibility in patMETs_slimmedMETsNoHF #36552

Open
makortel opened this issue Dec 20, 2021 · 9 comments

Comments

@makortel
Contributor

Tests in #36550 (comment) showed a difference in 2026D60 28234.0 all_mini_OldVSNew for patMETs_slimmedMETsNoHF
image

The PR should be purely technical, and I didn't find any earlier mention of a possible non-reproducibility there.

@makortel
Contributor Author

assign reconstruction, xpog

@cmsbuild
Contributor

New categories assigned: xpog,reconstruction

@slava77,@jpata,@mariadalfonso,@gouskos,@clacaputo,@fgolf you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Contributor

A new Issue was created by @makortel Matti Kortelainen.

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Contributor Author

Maybe not so rare, but recent: it occurred again in #36550 (comment) and also in #36507 (comment).

@makortel changed the title from "Rare non-reproducibility in patMETs_slimmedMETsNoHF?" to "Non-reproducibility in patMETs_slimmedMETsNoHF" on Dec 20, 2021
@jpata
Contributor

jpata commented Dec 20, 2021

Based on the PR tests in https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons I can first see this issue in CMSSW_12_3_X_2021-12-18-1100.
EDIT: actually, it's also there in CMSSW_12_3_X_2021-12-17-1100, investigating further...

@jpata
Contributor

jpata commented Dec 20, 2021

Slava suggested it might be AVX in DeepMET, checking further in the logs...
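
For context, one plausible mechanism (an assumption here, not something confirmed in this thread) is that a wider vector unit accumulates sums in more independent lanes, and floating-point addition is not associative. A minimal pure-Python sketch, where `lane_sum` is a hypothetical helper and not anything from TF or CMSSW:

```python
def lane_sum(values, lanes):
    """Sum `values` with `lanes` independent partial accumulators,
    then reduce the accumulators left to right (mimics a SIMD reduction)."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total

# Same inputs, different accumulation order.
values = [1e16, 1.0, -1e16, 1.0]
narrow = lane_sum(values, lanes=1)  # 1e16 + 1.0 rounds back to 1e16, so one 1.0 is lost
wide = lane_sum(values, lanes=2)    # the large terms cancel in one lane, the 1.0s add in the other
print(narrow, wide)  # prints: 1.0 2.0
```

The inputs are deliberately extreme; in a real network the differences would typically show up only in the last few bits, which is still enough to fail a bitwise comparison.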

@jpata
Contributor

jpata commented Jan 6, 2022

Based on recent PRs, it does indeed look like an issue of running TF on CPUs with different instruction-set capabilities.

PR:

2022-01-05 17:40:07.040781: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA

baseline:

2022-01-05 12:25:59.300506: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA

This is a known issue with the TF-based DeepMET.
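
To spot this kind of mismatch quickly, the two `cpu_feature_guard` lines can be diffed programmatically. A rough sketch, assuming the log format quoted above; `cpu_instructions` is a hypothetical helper, not part of TF or cms-bot:

```python
def cpu_instructions(log_line):
    """Return the set of instruction-set names listed at the end of a
    TensorFlow cpu_feature_guard log line (format assumed from the logs above)."""
    _, _, tail = log_line.rpartition("operations:")
    return set(tail.split())

pr_line = (
    "2022-01-05 17:40:07.040781: I tensorflow/core/platform/cpu_feature_guard.cc:142] "
    "This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) "
    "to use the following CPU instructions in performance-critical operations:  "
    "SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA"
)
baseline_line = (
    "2022-01-05 12:25:59.300506: I tensorflow/core/platform/cpu_feature_guard.cc:142] "
    "This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) "
    "to use the following CPU instructions in performance-critical operations:  "
    "SSE4.1 SSE4.2 AVX AVX2 FMA"
)

only_in_pr = cpu_instructions(pr_line) - cpu_instructions(baseline_line)
only_in_baseline = cpu_instructions(baseline_line) - cpu_instructions(pr_line)
print("only in PR:", sorted(only_in_pr))
print("only in baseline:", sorted(only_in_baseline))
```

For the two lines quoted here, the only difference is `AVX512F`, which is present in the PR job and absent in the baseline job.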

@jpata
Contributor

jpata commented May 16, 2022

type jetmet

@vlimant
Contributor

vlimant commented Nov 2, 2022

related to #32628
