Script to create a HipPy input dataset file #15886
Conversation
- filter the file list when possible to avoid unnecessarily opening and closing files
- add a function for the HipPy file list
(the root file has never been called that as far back as I can tell, and it's copied later anyway https://github.com/cms-sw/cmssw/blob/813e5a/Alignment/OfflineValidation/python/TkAlAllInOneTool/configTemplates.py#L104)
A new Pull Request was created by @hroskes (Heshy Roskes) for CMSSW_8_1_X. It involves the following packages: Alignment/HIPAlignmentAlgorithm. @ghellwig, @cerminar, @cmsbuild, @franzoni, @mmusich, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here #13028
please test
The tests are being triggered in jenkins.
@hroskes looks fine to me, except for the minor comments I made
In general, I think it would be good to migrate the Dataset class to Alignment/CommonAlignment/python/tools. But this could be done in the next round of updates. On the MillePede side there is a script that does a similar job, but with some code duplication. The goal of this script (mps_create_file_lists.py) is to create statistically independent datasets for alignment and validation. I think it would be good if we could unify this somehow, to get common validation datasets and create file lists for the remaining data in each of the formats required by the two alignment algorithms.
What do you think?
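The independent-split idea described above can be sketched in a few lines. This is a hedged illustration, not code from either tool: `split_file_list` and its arguments are hypothetical names.

```python
import random

def split_file_list(files, validation_fraction=0.25, seed=1):
    """Split a list of input files into two statistically independent
    subsets: one for validation, the rest for alignment.
    Hypothetical sketch; not the mps_create_file_lists.py implementation."""
    files = sorted(files)        # fixed order so the split is reproducible
    rng = random.Random(seed)    # seeded RNG, again for reproducibility
    rng.shuffle(files)
    n_validation = int(round(validation_fraction * len(files)))
    # first chunk goes to validation, remainder to alignment
    return files[n_validation:], files[:n_validation]
```

Each algorithm could then format its own share of the files (MillePede file lists, HipPy dataset files) from a common split.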
def getrunnumberfromfilename(filename):
    parts = filename.split("/")
    result = error = None
    if parts[0] != "" or parts[1] != "store":
@hroskes shouldn't it be `and` instead of `or`?
I don't think so... this way catches "something/store" and also "/something"; `and` would not.
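To see why `or` is the right connective here, consider the positive condition it negates. This is a standalone illustration, with `starts_with_store` a hypothetical helper rather than code from the PR:

```python
def starts_with_store(filename):
    # A valid path looks like "/store/...": after splitting on "/",
    # the first element must be empty (leading slash) and the second "store".
    parts = filename.split("/")
    return len(parts) > 2 and parts[0] == "" and parts[1] == "store"

# The rejection condition in the PR is the De Morgan dual of this "and":
# reject when parts[0] != "" OR parts[1] != "store", which catches both
# "something/store/..." (missing leading slash) and "/something/..." .
```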
you are right!
        error = "does not start with /store"
    elif parts[2] in ["mc", "relval"]:
        result = 1
    elif parts[-2] != "00000" or not parts[-1].endswith(".root"):
@hroskes is this nomenclature for file names defined or required somewhere?
Good question... I didn't know about it until @usarica told me. It's true for many datasets but not for this one. Actually Ulascan mentioned that in that case there are multiple runs in the same data file.
Basically at this point I was trying to be as strict as possible, and not remove the filename if it's not in the exact pattern that seems to be satisfied for most datasets.
If you have a better way of figuring out the run number from the filename that would be great. This way is more efficient than using a lumi filter which requires opening and closing all the files. It's particularly important for HipPy because we loop through the files multiple times, but since I was implementing it anyway I figured we might as well use it for validation too.
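For illustration, this kind of filename-based guessing can be sketched as below. `run_number_from_path` is a hypothetical helper, and the assumed layout (the run number split over three three-digit directories, followed by a 00000 directory) is exactly the convention discussed above, which holds for many prompt datasets but, as noted, not for all:

```python
def run_number_from_path(filename):
    """Guess the run number from a prompt-reco style path such as
    /store/data/.../000/283/408/00000/ABCD.root (run 283408 here).
    Returns None when the path does not match the assumed pattern.
    Sketch only; the directory layout is an assumption, not a guarantee."""
    parts = filename.split("/")
    if parts[:2] != ["", "store"] or len(parts) < 7:
        return None
    if parts[-2] != "00000" or not parts[-1].endswith(".root"):
        return None
    digits = parts[-5:-2]  # e.g. ["000", "283", "408"]
    if not all(len(d) == 3 and d.isdigit() for d in digits):
        return None
    return int("".join(digits))
```

The payoff is that no file has to be opened to decide whether it lies in the selected run range.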
I think your way works in the case of prompt data, but not for rereco (like the dataset that you linked). Since we usually run alignments on prompt reco, it is fine this way.
I was just curious whether you know of a place where this convention is defined.
In MillePede, we have a similar guessing mechanism, but I would not call it "a better way".
@@ -252,9 +252,6 @@ def createScript(self, path):
         resultingFile = os.path.expandvars( resultingFile )
         resultingFile = os.path.abspath( resultingFile )
         resultingFile = "root://eoscms//eos/cms" + resultingFile #needs to be AFTER abspath so that it doesn't eat the //
-        repMap["runComparisonScripts"] += \
-            ("xrdcp -f OUTPUT_comparison.root %s\n"
-             %resultingFile)
@hroskes can you elaborate what the exact effect of this fix is?
The effect is basically to remove a bash error because OUTPUT_comparison.root does not exist.
I think this line was supposed to copy the output of makeArrowPlots("comparison.root", "..."), but actually the first argument to makeArrowPlots is something else, so this file is never created and it just gives a warning.
The actual output file contains .oO[name]Oo., so it gets copied in this step.
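For context, the .oO[name]Oo. placeholders in the template files work roughly like this; `replace_placeholders` is a hypothetical minimal re-implementation for illustration, not the all-in-one tool's actual substitution code:

```python
import re

def replace_placeholders(template, repmap):
    """Substitute .oO[key]Oo. placeholders with values from repmap.
    Minimal sketch of the all-in-one tool's template convention."""
    def sub(match):
        # match.group(1) is the key between .oO[ and ]Oo.
        return str(repmap[match.group(1)])
    return re.sub(r"\.oO\[(\w+)\]Oo\.", sub, template)
```

A file whose name still contains .oO[name]Oo. in the template thus gets its real name filled in before the xrdcp step runs, which is why the copy of the actual output succeeds while the hard-coded OUTPUT_comparison.root never exists.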
ok, thanks for the explanation!
@ghellwig Unifying sounds great. Where is
@hroskes I just realized that the PR containing this script is not yet merged, but I linked the code in one of my comments.
                fileList.remove(filename)
        except AllInOneError as e:
            if forcerunselection: raise
            print e.message
@hroskes just one question: in the All-in-One tool, `forcerunselection` is set to `False`, right?
But in case one runs a validation using rereco data (for whatever reason), one gets quite a lot of stdout from this line, right?
Yes, you're right. I've fixed this and pushed now.
(I guess it's still a lot of output in the case where most but not all of the files fall into this category, but I would be surprised if that ever happens, and in that case you would want to know what's going on.)
@hroskes looks good now!
please test
The tests are being triggered in jenkins.
+1
This pull request is fully signed and it will be integrated in one of the next CMSSW_8_1_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @smuzaffar
+1
Can also add a similar function for MP if that's needed, but I'm not familiar with the syntax.
Also make the AIO tool dataset class a bit more general by splitting up the biggest function. No point in rewriting another module to do the same thing.
Also, remove some inefficiency in validation by skipping files whose filenames encode a run number outside the selected run range.
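The skipping described here can be sketched as follows. Everything in this snippet is hypothetical: `filter_by_run_range`, its arguments, and the use of ValueError standing in for the tool's AllInOneError.

```python
def filter_by_run_range(file_list, firstrun, lastrun, forcerunselection=False,
                        getrun=lambda f: None):
    """Drop files whose run number (guessed from the filename) lies outside
    [firstrun, lastrun]. If the run cannot be determined, either raise
    (forcerunselection=True) or keep the file and warn once.
    Hypothetical sketch of the behavior discussed in the review."""
    kept, warned = [], False
    for filename in file_list:
        run = getrun(filename)
        if run is None:
            if forcerunselection:
                raise ValueError("cannot determine run number for " + filename)
            if not warned:  # warn once, not once per file
                print("warning: could not determine run numbers for some files; keeping them")
                warned = True
            kept.append(filename)
        elif firstrun <= run <= lastrun:
            kept.append(filename)
    return kept
```

Filtering by filename before any file is opened is what saves the repeated open/close cycles of a lumi-based filter, which matters most for HipPy since it loops over the files multiple times.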
And a little fix in the geometry comparison.