Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streamer input source prefetching (12_5_X) #39088

Merged
merged 12 commits into from Aug 22, 2022

Conversation

smorovic
Copy link
Contributor

@smorovic smorovic commented Aug 17, 2022

PR description:

Adds a preallocated buffer intended to avoid many small reads when reading streamer files. This is aimed at improving EOS reading performance (see #38997).
A pointer to the big buffer is passed if possible for reading header and even payloads (contiguous), otherwise data is copied to the intermediate header or event buffer.

New parameter enables and configures size of the large buffer in MBytes (0 to disable), for example:

process.source = cms.Source("NewEventStreamFileReader",
    prefetchMBytes = cms.untracked.uint32(100),

PR validation:

A new unit test in IOPool/Streamer specifically checks behavior with this parameter (1 MB with small generated events). A file larger than the buffer was created to test behavior at the buffer boundary.
Internally, source checks adler32 checksum of both the header and event payload (compared to a header field), so corruption of payloads would be caught by unit tests.

Streamer source in CalibCalorimetry/EcalLaserSorting uses this code and has a respective unit test in that location.
Similarly, DQM streamer input source is tested as part of DAQ I/O chain in EventFilter/Utilities unit test.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39088/31600

  • This PR adds an extra 20KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@smorovic
Copy link
Contributor Author

12_4_X backport branch (PR will be created after this is reviewed):
https://github.com/smorovic/cmssw/tree/12_4_X-streamer-prebuffering

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39088/31601

  • This PR adds an extra 20KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @smorovic (Srecko Morovic) for master.

It involves the following packages:

  • IOPool/Streamer (core)

@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks.
@makortel, @wddgit this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@smorovic
Copy link
Contributor Author

@cmsbuild please test

@smorovic
Copy link
Contributor Author

abort test

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39088/31604

  • This PR adds an extra 20KB to repository

@smorovic smorovic force-pushed the 12_5_X-streamer-prebuffering branch from 83c865e to 8ad3ecd Compare August 19, 2022 21:36
* remove unnecessary class member
* allow zero-copy event reading because EventHeader and EventMessage consist of byte and byte array variables and non-aligned casting is safe
@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39088/31669

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

Pull request #39088 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please check and sign again.

@smorovic
Copy link
Contributor Author

Apparently the Event Header
https://github.com/cms-sw/cmssw/blob/master/IOPool/Streamer/interface/EventMessage.h#L61
and also this header which is part of it:
https://github.com/cms-sw/cmssw/blob/master/IOPool/Streamer/interface/MsgHeader.h#L6
consist fully of of char and char arrays which are defined here:
https://github.com/cms-sw/cmssw/blob/master/IOPool/Streamer/interface/MsgTools.h#L15

EventMsgView uses char* pointer to traverse through the message:
https://github.com/cms-sw/cmssw/blob/master/IOPool/Streamer/src/EventMessage.cc#L4
and builds higher-order variables from individual bytes (like with the convert32 function).

Conclusion is that alignment is not an issue. So, in the latest commit I enabled 'zero-copy' mode parameter for reading events.
It works fine both in the unit test and reading a production streamer file larger than 100 MB.

@smorovic
Copy link
Contributor Author

@cmsbuild please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b93f17/26967/summary.html
COMMIT: d6ea8aa
CMSSW: CMSSW_12_5_X_2022-08-21-0000/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39088/26967/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3693040
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3693016
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 212 log files, 49 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@makortel
Copy link
Contributor

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@makortel
Copy link
Contributor

urgent

Just for @cms-sw/orp-l2 to note that it would be good to have this PR in pre5 so that Tier0 could potentially test the functionality soon (before it becomes part of a production release).

@perrotta
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 0c04ca4 into cms-sw:master Aug 22, 2022
@smorovic smorovic deleted the 12_5_X-streamer-prebuffering branch August 23, 2022 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants