Add a tool to convert data to the .raw format used as input by the HLT #38519

Merged
cmsbuild merged 1 commit into cms-sw:master from the hltConvertToRaw branch on Jun 28, 2022

@fwyzard fwyzard commented Jun 27, 2022

PR description:

Add convertToRaw, a script that converts RAW data stored in one or more EDM .root files into the .raw file format used as input by the HLT.

usage: convertToRaw [-h] [-o PATH] [-f EVENTS] [-l EVENTS] [--one-file-per-lumi] FILES [FILES ...]

Convert RAW data from .root format to .raw format.

positional arguments:
  FILES                 input files in .root format

optional arguments:
  -h, --help            show this help message and exit
  -o PATH, --output PATH
                        base path to store the output files; subdirectories based on the run number are automatically created (default: )
  -f EVENTS, --events_per_file EVENTS
                        split the output into files with at most EVENTS events (default: 50)
  -l EVENTS, --events_per_lumi EVENTS
                        process at most EVENTS events in each lumisection (default: 11650)
  --one-file-per-lumi   assume that lumisections are not split across files (and disable --events_per_lumi) (default: False)

The default behaviour is to process a single luminosity section at a time, in order to support luminosity sections split across multiple files and a limit on the number of events in each lumisection.

If neither of these features is needed (i.e. if lumisections are not split across files, and all events should be converted), the --one-file-per-lumi option can be used to process all data with a single job, speeding up the conversion considerably.
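For readers who want to see how the interface above maps onto code, here is a minimal sketch of an argparse declaration reconstructed from the usage text alone. The helper name build_parser and the input file name are hypothetical, and the script added by this PR may be organised differently.

```python
import argparse

def build_parser():
    # hypothetical helper: declares the options listed in the usage text
    parser = argparse.ArgumentParser(
        prog='convertToRaw',
        description='Convert RAW data from .root format to .raw format.',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('files', metavar='FILES', nargs='+',
                        help='input files in .root format')
    parser.add_argument('-o', '--output', metavar='PATH', default='',
                        help='base path to store the output files')
    parser.add_argument('-f', '--events_per_file', metavar='EVENTS',
                        type=int, default=50,
                        help='split the output into files with at most EVENTS events')
    parser.add_argument('-l', '--events_per_lumi', metavar='EVENTS',
                        type=int, default=11650,
                        help='process at most EVENTS events in each lumisection')
    parser.add_argument('--one-file-per-lumi', action='store_true',
                        help='assume that lumisections are not split across files')
    return parser

# parse an explicit argument list; 'run.root' is a made-up file name
args = build_parser().parse_args(['--one-file-per-lumi', 'run.root'])
print(args.files, args.events_per_file, args.one_file_per_lumi)
```

Using ArgumentDefaultsHelpFormatter is what would produce the "(default: ...)" suffixes seen in the help output, including the empty one for the output path.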

PR validation:

The files produced by convertToRaw can be used as input by an online-like HLT job.

@fwyzard
Contributor Author

fwyzard commented Jun 27, 2022

please test

To make the bot happy, even if the new tool does not affect any workflow.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38519/30750

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @fwyzard (Andrea Bocci) for master.

It involves the following packages:

  • HLTrigger/Tools (hlt)

@Martin-Grunewald, @missirol can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @missirol, @silviodonato this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@@ -0,0 +1,125 @@
# convert the .raw data will appear under
# store/raw/Run2022A/MinimumBias/RAW/v1/000/run353087

Contributor

What are these two lines about? This should not always be the case...

Contributor Author

Oops, sorry, that's a leftover from earlier development... removed, and replaced with a minimal description.

@fwyzard
Contributor Author

fwyzard commented Jun 27, 2022

please test

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38519/30760

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

Pull request #38519 was updated. @Martin-Grunewald, @missirol can you please check and sign again.

Contributor

@missirol missirol left a comment

I left a few questions and picky comments inline.

My main question is the following: is the name convertToRaw descriptive enough? Maybe something like convertEDMTo(FED)Raw would be clearer? And (for my education), is there already a tool that does the opposite (from .raw to EDM)?

Comment on lines +55 to +67
for name in 'filePrepend', 'maxEvents', 'outputFile', 'secondaryOutputFile', 'section', 'tag', 'storePrepend', 'totalSections':
    del options._register[name]
    del options._beenSet[name]
    del options._info[name]
    del options._types[name]
    if name in options._singletons:
        del options._singletons[name]
    if name in options._lists:
        del options._lists[name]
    if name in options._noCommaSplit:
        del options._noCommaSplit[name]
    if name in options._noDefaultClear:
        del options._noDefaultClear[name]

Contributor

I guess I'm missing the point. Is this meant to invalidate those cmd-line options? Couldn't they just be ignored altogether?

Contributor Author

Well, I don't like that they appear in cmsRun convertToRaw.py help (or better python3 HLTrigger/Tools/python/convertToRaw.py help).
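As an illustration of the loop under discussion, the same removal can be condensed into a small helper. This is only a sketch: it runs against a stand-in object that mimics the private dictionaries named in the snippet, not against the real VarParsing class, and the names remove_option and make_stand_in are hypothetical. It also assumes that each of those attributes behaves like a dict.

```python
from types import SimpleNamespace

# tables that hold an entry for every registered option
REQUIRED_TABLES = ('_register', '_beenSet', '_info', '_types')
# tables that hold an entry only for some options
OPTIONAL_TABLES = ('_singletons', '_lists', '_noCommaSplit', '_noDefaultClear')

def remove_option(options, name):
    """Forget an option so that it no longer shows up in the help text."""
    for table in REQUIRED_TABLES:
        del getattr(options, table)[name]
    for table in OPTIONAL_TABLES:
        # pop() with a default replaces the "if name in ...: del ..." pairs
        getattr(options, table).pop(name, None)

def make_stand_in(names):
    """Build a stand-in object with the same private tables (assumption)."""
    tables = {table: {name: None for name in names} for table in REQUIRED_TABLES}
    tables.update({table: {} for table in OPTIONAL_TABLES})
    return SimpleNamespace(**tables)

options = make_stand_in(['inputFiles', 'maxEvents', 'tag'])
options._singletons['maxEvents'] = -1
options._lists['inputFiles'] = []

for name in 'maxEvents', 'tag':
    remove_option(options, name)

print(sorted(options._register))
```

Because the helper touches private attributes, it would be tied to the internal layout of the options object, which is the trade-off raised in the question above.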



options.register('runNumber',
0,

Contributor

The default value is invalid, I guess because it is required that the user sets it explicitly. Right? (I guess varParsing doesn't have a "required" flag for its args like argparse?)

Contributor Author

Correct (at least, I didn't find any other way of making it a required parameter).

Comment on lines 103 to 117
if options.runNumber == 0:
    sys.stderr.write('Invalid run number\n')
    sys.exit(1)

if options.lumiNumber == 0:
    sys.stderr.write('Invalid luminosity section number\n')
    sys.exit(1)

if options.eventsPerLumi == 0:
    sys.stderr.write('Invalid number of events per luminosity section\n')
    sys.exit(1)

if options.eventsPerFile == 0:
    sys.stderr.write('Invalid number of events per output file\n')
    sys.exit(1)

Contributor

What about negative values for runNumber and the other args?

Out of curiosity: how do you decide between sys.stderr + exit and other types of exits? (naively, I would have used raise Exception and removed import sys)

Contributor Author

I didn't think about negative numbers, let me fix that.

Contributor Author

By the way, eventsPerLumi == -1 is valid, and means "no limit".
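One way the checks could treat negative values while keeping -1 as "no limit" is sketched below. This is not the fix that was actually committed, which is not shown in this thread. The function name check_options is hypothetical, and the sketch assumes that the run number, the lumisection number and the events-per-file value must all be strictly positive.

```python
def check_options(run_number, lumi_number, events_per_lumi, events_per_file):
    """Return the message for the first invalid value, or None if all are valid."""
    if run_number <= 0:
        return 'Invalid run number'
    if lumi_number <= 0:
        return 'Invalid luminosity section number'
    # -1 means "no limit"; zero and any other negative value are rejected
    if events_per_lumi != -1 and events_per_lumi <= 0:
        return 'Invalid number of events per luminosity section'
    if events_per_file <= 0:
        return 'Invalid number of events per output file'
    return None

# the run number is the one quoted earlier in the thread
print(check_options(353087, 1, -1, 50))
print(check_options(353087, 1, -2, 50))
```

Returning the message instead of exiting keeps the validation separate from the choice of how to report the error, which is the topic of the next comments.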

Contributor Author

As for the exceptions, I prefer the error message I get from sys.stderr:

$ python3 HLTrigger/Tools/python/convertToRaw.py inputFiles=file.root runNumber=0
Invalid run number

to the message I get from exceptions:

$ python3 HLTrigger/Tools/python/convertToRaw.py inputFiles=file.root runNumber=0
Traceback (most recent call last):
  File "/fff/fwyzard/CMSSW_12_3_5/src/HLTrigger/Tools/python/convertToRaw.py", line 104, in <module>
    raise RuntimeError('Invalid run number')
RuntimeError: Invalid run number

As a user, why should I care about the traceback and where in the code the exception was generated?

Contributor

In more complex cases, it might be useful to know where the code is breaking, but in this case it would indeed be obvious. One could do raise SystemExit("error message") (seems equivalent: same error message to stderr, exit code 1, less source code, no import sys), but it's a minor point of course.
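The claimed equivalence can be checked with a short experiment. The sketch below runs both variants in a fresh interpreter and compares exit code, standard output and standard error; the helper name run_child is hypothetical, and the error message is the one used in the examples above.

```python
import subprocess
import sys

# child script in the style used by the PR: explicit message plus sys.exit(1)
WITH_SYS_EXIT = """
import sys
sys.stderr.write('Invalid run number\\n')
sys.exit(1)
"""

# child script using the alternative suggested in the review
WITH_SYSTEM_EXIT = """
raise SystemExit('Invalid run number')
"""

def run_child(source):
    """Run the source in a fresh interpreter; return (exit code, stdout, stderr)."""
    result = subprocess.run([sys.executable, '-c', source],
                            capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr

outcome_sys_exit = run_child(WITH_SYS_EXIT)
outcome_system_exit = run_child(WITH_SYSTEM_EXIT)
print(outcome_sys_exit == outcome_system_exit)
```

When SystemExit carries a string instead of an integer, the interpreter writes the string to standard error and exits with status 1, so neither variant prints a traceback.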

Comment on lines 133 to 134
os.makedirs('%s/run%06d' % (options.outputPath, options.runNumber), exist_ok=True)
open('%s/run%06d/fu.lock' % (options.outputPath, options.runNumber), 'w').close()

Contributor

Suggested change
- os.makedirs('%s/run%06d' % (options.outputPath, options.runNumber), exist_ok=True)
- open('%s/run%06d/fu.lock' % (options.outputPath, options.runNumber), 'w').close()
+ outputDir = f'{options.outputPath}/run{options.runNumber:06d}'
+ os.makedirs(outputDir, exist_ok=True)
+ open(f'{outputDir}/fu.lock', 'w').close()

Contributor Author

@fwyzard fwyzard Jun 27, 2022

I'm still not sure I like the f'{var}' syntax over the old %-based syntax, but OK - in any case, I agree it reduces repetitions.
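For reference, the two formatting styles produce the same directory name. The sketch below compares them and then creates the run directory and the fu.lock file inside a temporary directory; the function names are hypothetical, and the run number is the one quoted earlier in this thread.

```python
import os
import tempfile

def run_directory_percent(output_path, run_number):
    # original %-based formatting
    return '%s/run%06d' % (output_path, run_number)

def run_directory_fstring(output_path, run_number):
    # suggested f-string formatting, zero-padded to six digits
    return f'{output_path}/run{run_number:06d}'

with tempfile.TemporaryDirectory() as base:
    output_dir = run_directory_fstring(base, 353087)
    os.makedirs(output_dir, exist_ok=True)
    # create an empty lock file, as in the suggested change
    open(f'{output_dir}/fu.lock', 'w').close()
    lock_exists = os.path.isfile(f'{output_dir}/fu.lock')

print(run_directory_percent('out', 42), run_directory_fstring('out', 42))
```

Computing the directory once and reusing it is what removes the repetition, independently of which formatting syntax is preferred.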

import argparse
import glob
import json
import os, os.path

Contributor

Why not just import os?

Contributor Author

Just habit, I guess.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38519/30761

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

Pull request #38519 was updated. @cmsbuild, @missirol, @Martin-Grunewald can you please check and sign again.

@fwyzard
Contributor Author

fwyzard commented Jun 27, 2022

please test

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38519/30762

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

Pull request #38519 was updated. @Martin-Grunewald, @missirol can you please check and sign again.

@fwyzard
Contributor Author

fwyzard commented Jun 27, 2022 via email

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-244bb4/25853/summary.html
COMMIT: 68a40ab
CMSSW: CMSSW_12_5_X_2022-06-27-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38519/25853/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3659995
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3659965
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 49 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@missirol
Contributor

+hlt

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

+1

@cmsbuild cmsbuild merged commit 65b254e into cms-sw:master Jun 28, 2022
@fwyzard fwyzard deleted the hltConvertToRaw branch July 31, 2022 13:41