Use 13_0_9_HLT and 12_4_21_HLT releases in the 22/23 MC production tests #45514

Merged 1 commit into cms-sw:master on Jul 30, 2024

makortel
Contributor

@makortel makortel commented Jul 19, 2024

PR description:

This PR updates the 22/23 MC production tests added in #44578 to use the 13_0_9_HLT and 12_4_21_HLT releases for the HLT step. Those releases include the backports #44921 and #45000, respectively, that allow those releases to read a ROOT file produced with 14_0_X (or later).

Resolves cms-sw/framework-team#955

PR validation:

Unit tests succeed.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

To be backported to 14_0_X

@cmsbuild
Contributor

cmsbuild commented Jul 19, 2024

cms-bot internal usage

@cmsbuild
Contributor

A new Pull Request was created by @makortel for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (upgrade, pdmv)

@AdrianoDee, @cmsbuild, @kskovpen, @miquork, @srimanob, @subirsarkar, @sunilUIET can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @fabiocos, @missirol, @slomeo this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@makortel
Contributor Author

@cmsbuild, please test


 <!-- In CMSSW_12_4_20 the auto:phase1_2022_realistic (pre EE) is 124X_mcRun3_2022_realistic_v12 -->
-<test name="test_MC_22_setup" command="test_mc_setup/test_MC_setup.sh auto:phase1_2022_realistic Run3 2022v14 CMSSW_12_4_20 124X_mcRun3_2022_realistic_v12 Realistic25ns13p6TeVEarly2022Collision" />
+<test name="test_MC_22_setup" command="test_mc_setup/test_MC_setup.sh auto:phase1_2022_realistic Run3 2022v14 CMSSW_12_4_21_HLT 124X_mcRun3_2022_realistic_v12 Realistic25ns13p6TeVEarly2022Collision" />
Contributor Author

As a reminder, the 2022 test still fails with

----- Begin Fatal Exception 19-Jul-2024 23:12:30 CEST-----------------------
An exception of category 'ConditionsError' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'HLT_PPSMaxTracksPerRP4_v2'
   [2] Calling method for module L1TGlobalProducer/'hltGtStage2ObjectMap'
Exception Message:
 Error L1 menu loaded in via conditions does not match the L1 actually run 1517097079 vs 2016981387. This means that the mapping of the names to the bits may be incorrect. Please check the L1TUtmTriggerMenuRcd record supplied. Unless you know what you are doing, do not simply disable this check via the config as this a major error and the indication of something very wrong
----- End Fatal Exception -------------------------------------------------

and therefore I did not set CMSSW_MC_SETUP_TEST_CATCH_HLT.

Contributor

@srimanob srimanob Jul 22, 2024

I think the issue is a mismatch of the L1T record between L1Menu_Collisions2022_v1_2_0_xml (12_4) and L1Menu_Collisions2022_v1_4_0-d1_xml (auto:phase1_2022_realistic in master). I am not sure whether @cms-sw/alca-l2 is aware of this and would like to do something about it, e.g. make a new autocond modifier for 2022 or create a new GT specific to this case.

Diff between 124X_mcRun3_2022_realistic_v12 and 140X_mcRun3_2022_realistic_v3: here

@cmsbuild
Contributor

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-00e74c/40502/summary.html
COMMIT: c901b78
CMSSW: CMSSW_14_1_X_2024-07-19-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/45514/40502/install.sh to create a dev area with all the needed externals and cmssw changes.

  • DAS Queries: The DAS query tests failed, see the summary page for details.

@srimanob
Contributor

+Upgrade

Good to merge, just a comment on GT above.

@makortel
Contributor Author

The comparison differences correspond to #39803 and #45505.

@makortel
Contributor Author

@cms-sw/pdmv-l2 Could you please review and sign? Thanks!

@AdrianoDee
Contributor

+pdmv
(same note as for 14_0_X: this fixes the HLT step for the tests)

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @sextonkennedy, @mandrenguyen, @antoniovilela (and backports should be raised in the release meeting by the corresponding L2)

@antoniovilela
Contributor

+1

@cmsbuild cmsbuild merged commit 7937acf into cms-sw:master Jul 30, 2024
11 checks passed
@makortel makortel deleted the updateMCProductionTests branch July 30, 2024 22:07
@mandrenguyen
Contributor

@makortel
A new failing unit test seems to me to be related to this PR:
https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc12/CMSSW_14_1_X_2024-07-31-1100/unitTestLogs/Configuration/PyReleaseValidation#/

Not sure why that would only happen on slc7. Would you mind having a look?

@makortel
Contributor Author

Thanks @mandrenguyen. I see the test failed also in el8_aarch64_gcc12. Somehow the failing case ran the HLT step in

>>> Running HLT:2023v12 step in  /data/cmsbld/jenkins/workspace/ib-run-qa/CMSSW_14_1_X_2024-07-31-1100

whereas e.g. el8_amd64_gcc12 ran the HLT step in

/cvmfs/cms.cern.ch/el8_amd64_gcc11/cms/cmssw/CMSSW_13_0_19_HLT

which is the expected behavior.

@makortel
Contributor Author

I think I have an idea what is going wrong. The script has logic to search for the location of the HLT release; I think it is this part:

base_arch_no_gcc=$(echo $SCRAM_ARCH | cut -d "_" -f -2)
if [[ ! ("$CMSSW_VERSION" == $release_hlt) ]]; then
hlt_cmssw_path=$(scram list -c $release_hlt | grep -w $release_hlt | sed 's|.* ||')
echo $hlt_cmssw_path
fi

It is clear from the log that this lookup fails for the 13_0_9_HLT and 12_4_21_HLT releases, so the test ends up using the 14_1_X release, which fails because the requested HLT menu doesn't exist in that release.
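
To illustrate why the lookup can silently come back empty, here is a minimal sketch that runs the same `grep | sed` pipeline against a stand-in for `scram`. The stub function, its three-column output layout, the `/fake/cvmfs` path, and the `CMSSW_99_9_9_HLT` name are assumptions made for this example only; they are not taken from the real tool or from test_MC_setup.sh.

```shell
#!/usr/bin/env bash
# Stand-in for `scram list -c <release>`. The output layout
# (project, release, path) is an assumption for this sketch.
scram() {
  # Pretend only this one release is installed for the current architecture.
  if [[ "$3" == "CMSSW_12_4_21_HLT" ]]; then
    echo "CMSSW CMSSW_12_4_21_HLT /fake/cvmfs/CMSSW_12_4_21_HLT"
  fi
}

# Same pipeline as in the quoted snippet: keep the matching line,
# then strip everything up to the last space to leave only the path.
find_hlt_path() {
  local release_hlt="$1"
  scram list -c "$release_hlt" | grep -w "$release_hlt" | sed 's|.* ||'
}

found=$(find_hlt_path CMSSW_12_4_21_HLT)
missing=$(find_hlt_path CMSSW_99_9_9_HLT)   # hypothetical, not installed
echo "found='${found}' missing='${missing}'"
```

When no line matches, the pipeline prints nothing and still exits with the status of `sed`, so the caller gets an empty string and no error unless it checks for that explicitly.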

I think we should either build 13_0_9_HLT and 12_4_21_HLT for the non-production architectures as well, or disable the test for the non-production architectures. @smuzaffar Any thoughts?

I'm also wondering if it would be better for test_MC_setup.sh to fail early if hlt_cmssw_path comes out empty. (That would be for @cms-sw/pdmv-l2.)
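
A fail-early check along those lines could look roughly like the sketch below. The function name `require_hlt_path`, the error message, and the `/fake/cvmfs` path are hypothetical; this is not the code in test_MC_setup.sh.

```shell
#!/usr/bin/env bash
# Hypothetical guard: refuse to continue when the HLT release lookup
# produced an empty path, instead of falling back to the current release.
require_hlt_path() {
  local release_hlt="$1" hlt_cmssw_path="$2"
  if [[ -z "$hlt_cmssw_path" ]]; then
    echo "ERROR: no installation of ${release_hlt} found for ${SCRAM_ARCH:-unknown arch}" >&2
    return 1
  fi
  echo "Using HLT release at ${hlt_cmssw_path}"
}

# A non-empty path passes the guard.
if require_hlt_path CMSSW_12_4_21_HLT "/fake/cvmfs/CMSSW_12_4_21_HLT"; then
  ok_status=0
else
  ok_status=$?
fi

# An empty path is rejected with a non-zero status.
if require_hlt_path CMSSW_12_4_21_HLT "" 2>/dev/null; then
  empty_status=0
else
  empty_status=$?
fi
echo "ok_status=${ok_status} empty_status=${empty_status}"
```

In a real script the caller would most likely `exit` on the non-zero status, so that the failure points at the missing release rather than at a missing HLT menu later on.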

@antoniovilela
Contributor

> I think I have an idea what is going wrong. The script has logic to search for the location of the HLT release; I think it is this part:
>
> base_arch_no_gcc=$(echo $SCRAM_ARCH | cut -d "_" -f -2)
> if [[ ! ("$CMSSW_VERSION" == $release_hlt) ]]; then
> hlt_cmssw_path=$(scram list -c $release_hlt | grep -w $release_hlt | sed 's|.* ||')
> echo $hlt_cmssw_path
> fi
>
> It is clear from the log that this lookup fails for the 13_0_9_HLT and 12_4_21_HLT releases, so the test ends up using the 14_1_X release, which fails because the requested HLT menu doesn't exist in that release.
> I think we should either build 13_0_9_HLT and 12_4_21_HLT for the non-production architectures as well, or disable the test for the non-production architectures. @smuzaffar Any thoughts?
>
> I'm also wondering if it would be better for test_MC_setup.sh to fail early if hlt_cmssw_path comes out empty. (That would be for @cms-sw/pdmv-l2.)

Thanks for following this. Just a note that I had already merged the backport before Matthew noticed the error related to this PR. Please then also backport the fix.

@smuzaffar
Contributor

> I think we should either build 13_0_9_HLT and 12_4_21_HLT for the non-production architectures as well, or disable the test for the non-production architectures. @smuzaffar Any thoughts?

@makortel, we cannot build HLT releases for all architectures of the development release cycles. It is better to disable the test if the architecture part of SCRAM_ARCH does not match and, if it matches, to use singularity (cmssw-elX) to run the tests. I can provide a fix for this tomorrow.
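
The decision described here can be sketched as a small helper that compares the OS and CPU fields of two SCRAM_ARCH-style strings. The function names and the returned labels (`skip`, `container:<os>`, `native`) are hypothetical, and the sketch does not invoke any container command; it only shows the comparison.

```shell
#!/usr/bin/env bash
# Extract one underscore-separated field of an architecture string:
# field 1 is the OS (e.g. el8), field 2 the CPU architecture (e.g. amd64).
arch_field() {
  echo "$1" | cut -d "_" -f "$2"
}

# Hypothetical helper: decide how to handle the HLT step given the current
# architecture and the architecture the HLT release was built for.
hlt_test_action() {
  local current="$1" hlt="$2"
  if [[ "$(arch_field "$current" 2)" != "$(arch_field "$hlt" 2)" ]]; then
    echo "skip"                                # different CPU: disable the test
  elif [[ "$(arch_field "$current" 1)" != "$(arch_field "$hlt" 1)" ]]; then
    echo "container:$(arch_field "$hlt" 1)"    # same CPU, different OS
  else
    echo "native"                              # same OS and CPU
  fi
}

a=$(hlt_test_action el8_aarch64_gcc12 el8_amd64_gcc11)
b=$(hlt_test_action slc7_amd64_gcc12 el8_amd64_gcc11)
c=$(hlt_test_action el8_amd64_gcc12 el8_amd64_gcc11)
echo "$a $b $c"
```

The three calls use the architectures mentioned in this thread: the two failing ones (el8_aarch64_gcc12 and slc7_amd64_gcc12) and the one that ran the HLT step as expected.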

@smuzaffar
Contributor

I have opened #45614, which basically exits early if the HLT release is not found for the current <os>_<arch>.

Development

Successfully merging this pull request may close these issues.

Update test in 14_1_X