introduce const variant of XGBooster::predict and use it in PhotonXGBoostEstimator #45232

Merged
cmsbuild merged 2 commits into cms-sw:master from the mm_fix_stream_PhotonXGBoostProducer branch
Jun 17, 2024

Conversation

mmusich
Contributor

@mmusich mmusich commented Jun 15, 2024

fixes #45235

PR description:

Addresses #45085 (comment) by using the suggestion at #45235 (comment).
This avoids calling XGBooster::predict() right after a call to XGBooster::reset(), which leads to NaNs in the input feature vector being evaluated.
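
The idea behind the fix can be sketched as follows. This is a minimal, self-contained illustration and not the actual CMSSW code: the class `ToyBooster`, its members, and the summing "score" are hypothetical stand-ins for `XGBooster`, and the assumption that `reset()` fills the feature buffer with NaNs is taken from the description above. The sketch contrasts a stateful `predict()` that reads a shared member buffer with a `const` overload that takes the feature vector as an argument, so a caller never depends on state that another caller may have reset.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical stand-in for XGBooster; names and scoring are illustrative only.
class ToyBooster {
public:
  explicit ToyBooster(std::size_t nFeatures)
      : features_(nFeatures, std::numeric_limits<float>::quiet_NaN()) {}

  // Stateful interface: callers fill a shared member buffer, then predict.
  void set(std::size_t i, float value) { features_.at(i) = value; }

  // Assumption: reset() marks every feature as unset by writing NaN.
  void reset() {
    std::fill(features_.begin(), features_.end(),
              std::numeric_limits<float>::quiet_NaN());
  }

  // Reads the member buffer, so the result depends on whoever touched it last.
  float predict() { return score(features_); }

  // Const variant: the caller owns the features, no shared state is read.
  float predict(const std::vector<float>& features) const {
    return score(features);
  }

private:
  // Toy score: a plain sum, chosen so that a NaN input is visible in the output.
  static float score(const std::vector<float>& f) {
    return std::accumulate(f.begin(), f.end(), 0.0f);
  }

  std::vector<float> features_;
};

// Replays the problematic ordering in a single thread for clarity.
inline void usageExample() {
  ToyBooster shared(2);
  shared.set(0, 1.0f);
  shared.set(1, 2.0f);
  shared.reset();  // stands in for a reset() issued by another stream
  assert(std::isnan(shared.predict()));  // the stateful call sees NaNs

  const std::vector<float> mine{1.0f, 2.0f};
  assert(shared.predict(mine) == 3.0f);  // the const call is unaffected
}
```

In the real producer the interfering `reset()` would come from a concurrent stream rather than from the same thread; the single-threaded replay only makes the ordering explicit.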

PR validation:

Run the following script:

#!/bin/bash -ex

jobTag=threads4
hltMenu=/dev/CMSSW_14_0_0/GRun/V141

check_log () {
  grep '0 HLT_DiphotonMVA14p25_Tight_Mass90_v' $1 | grep TrigReport
}

run(){
  echo $2
  cp $1 $2.py
  cat <<EOF >> $2.py

process.options.numberOfThreads = 4
process.options.numberOfStreams = 4

process.hltOutputMinimal.fileName = '${2}.root'
EOF
  cmsRun "${2}".py &> "${2}".log
  check_log "${2}".log
}

hltGetCmd="hltGetConfiguration ${hltMenu}"
hltGetCmd+=" --globaltag auto:run3_mc_GRun --mc --unprescale --output minimal --max-events -1"
hltGetCmd+=" --input /store/group/dpg_trigger/comm_trigger/TriggerStudiesGroup/STORM/debug/150724_xgboost/RelVal_Raw_GRun_MC.root"

configLabel=hlt_"${jobTag}"_onlyDiphotonMVA14p25_Tight_Mass90
${hltGetCmd} --paths HLT_DiphotonMVA14p25_Tight_Mass90_v1 > "${configLabel}".py
for job_i in {0..30}; do run "${configLabel}".py "${configLabel}"_"${job_i}"; done; unset job_i;

No crashes were observed with this change, whereas without it the job crashes around 10% of the time (3 times out of 30).
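
As a rough gauge of how much a crash-free series says, one can assume (this is an assumption of the sketch, not a statement from the PR) that runs are independent and that the crash rate without the fix really is the quoted 10%. The chance of seeing no crash purely by luck in n runs is then (1 - 0.1)^n. The helper name `probNoCrash` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Probability of observing zero crashes in nRuns independent runs,
// assuming each run crashes with probability crashRate.
inline double probNoCrash(double crashRate, int nRuns) {
  assert(crashRate >= 0.0 && crashRate <= 1.0 && nRuns >= 0);
  return std::pow(1.0 - crashRate, nRuns);
}
```

For a 10% rate and 30 runs, `probNoCrash(0.1, 30)` is roughly 0.04, so under these assumptions a clean series of that length would be unlikely if the crash were still present.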

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Not a backport, but if accepted will be backported.

@cmsbuild
Contributor

cmsbuild commented Jun 15, 2024

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45232/40607

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @mmusich for master.

It involves the following packages:

  • RecoEgamma/PhotonIdentification (reconstruction)

@mandrenguyen, @cmsbuild, @jfernan2 can you please review it and eventually sign? Thanks.
@lgray, @jainshilpi, @sobhatta, @missirol, @Sam-Harper, @valsdav, @a-kapoor, @afiqaize, @sameasy, @ram1123, @Prasant1993, @varuns23 this is something you requested to watch as well.
@rappoccio, @antoniovilela, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor Author

mmusich commented Jun 15, 2024

test parameters:

  • enable_tests = threading
  • workflows_threading = 141.042, 141.044, 141.046 ,12834.0

@mmusich
Contributor Author

mmusich commented Jun 15, 2024

@cmsbuild, please test

@mmusich mmusich changed the title from "make PhotonXGBoostProducer and edm::stream::EDProducer" to "make PhotonXGBoostProducer an edm::stream::EDProducer" on Jun 15, 2024
@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f6a577/39906/summary.html
COMMIT: b857a66
CMSSW: CMSSW_14_1_X_2024-06-15-1100/el8_amd64_gcc12
Additional Tests: THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/45232/39906/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3345018
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3344995
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@mmusich mmusich force-pushed the mm_fix_stream_PhotonXGBoostProducer branch from b857a66 to 0cb0985 on June 16, 2024 21:54
@mmusich mmusich changed the title from "make PhotonXGBoostProducer an edm::stream::EDProducer" to "introduce const variant of XGBooster::predict and use it in PhotonXGBoostEstimator" on Jun 16, 2024
@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-45232/40614

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

Pull request #45232 was updated. @jfernan2, @mandrenguyen, @valsdav, @cmsbuild, @wpmccormack can you please check and sign again.

@mmusich
Contributor Author

mmusich commented Jun 16, 2024

@cmsbuild, please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f6a577/39909/summary.html
COMMIT: 0cb0985
CMSSW: CMSSW_14_1_X_2024-06-16-0000/el8_amd64_gcc12
Additional Tests: THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45232/39909/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 1 lines from the logs
  • Reco comparison results: 12 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3345018
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3344992
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@mmusich mmusich marked this pull request as ready for review June 17, 2024 03:10
@mmusich
Contributor Author

mmusich commented Jun 17, 2024

urgent

  • we need a backport of this for data-taking

@mandrenguyen
Contributor

+1

@valsdav
Contributor

valsdav commented Jun 17, 2024

+ml

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela
Contributor

+1

@cmsbuild cmsbuild merged commit ddffbf2 into cms-sw:master Jun 17, 2024
13 checks passed
@mmusich mmusich deleted the mm_fix_stream_PhotonXGBoostProducer branch June 17, 2024 13:00
Development

Successfully merging this pull request may close these issues.

PhotonXGBoostProducer related crashes in TSG IB tests
6 participants