added --sites options for runTheMatrix #31535

Merged
Merged 1 commit into cms-sw:master on Sep 25, 2020

smuzaffar
Contributor

As requested in #22278, added a --sites <site> option to the runTheMatrix.py script to select a specific site for recycle data. Setting it to an empty string allows searching all sites.

This resolves #22278
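
A minimal sketch of how such an option could be declared, assuming runTheMatrix.py uses `optparse` (the `--sites=DASSITES` help format quoted later in this thread suggests it). The `dest` name `dasSites` is inferred from that metavar, and `build_parser` and `parse_sites` are hypothetical helpers for illustration, not the script's real code:

```python
from optparse import OptionParser


def build_parser():
    """Hypothetical stand-in for the option setup in runTheMatrix.py."""
    parser = OptionParser()
    parser.add_option(
        "--sites",
        dest="dasSites",       # assumed name; optparse derives the DASSITES metavar from it
        default="T2_CH_CERN",  # default stated in the PR discussion
        help="Run DAS query to get data from a specific site "
             "(default is T2_CH_CERN). Set it to empty string to search all sites.",
    )
    return parser


def parse_sites(argv):
    """Return the site selection for a given argument list."""
    opts, _args = build_parser().parse_args(argv)
    return opts.dasSites
```

With this sketch, `parse_sites([])` yields the CERN default, while `parse_sites(["--sites="])` (what the shell passes for `--sites=''`) yields an empty string, meaning no site restriction.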

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31535/18518

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @smuzaffar (Malik Shahzad Muzaffar) for master.

It involves the following packages:

Configuration/PyReleaseValidation

@chayanit, @cmsbuild, @wajidalikhan, @kpedro88, @jordan-martins can you please review it and eventually sign? Thanks.
@makortel, @Martin-Grunewald, @fabiocos, @slomeo this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@smuzaffar
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented Sep 22, 2020

The tests are being triggered in jenkins.

@cmsbuild
Contributor

+1
Tested at: 684229d
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0d261f/9483/summary.html
CMSSW: CMSSW_11_2_X_2020-09-21-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0d261f/9483/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2540471
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2540442
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@kpedro88
Contributor

+upgrade

  if len(self.run) is not 0:
-     return ["file {0}={1} run={2} site=T2_CH_CERN".format(query_by, query_source, query_run) for query_run in self.run]
+     return ["file {0}={1} run={2}{3}".format(query_by, query_source, query_run, site) for query_run in self.run]
      #return ["file {0}={1} run={2} ".format(query_by, query_source, query_run) for query_run in self.run]

Do we still need to keep this line?

      #return ["file {0}={1} run={2} ".format(query_by, query_source, query_run) for query_run in self.run]
  else:
-     return ["file {0}={1} site=T2_CH_CERN".format(query_by, query_source)]
+     return ["file {0}={1}{2}".format(query_by, query_source, site)]
      #return ["file {0}={1} ".format(query_by, query_source)]

Do we still need to keep this line?
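
The change in the diff above can be sketched as a standalone function. This is an illustration only: `build_queries` and `site_clause` are hypothetical names, and the real method in MatrixUtil.py reads `self.run` and computes `site` elsewhere. The point is that the site restriction becomes an optional suffix rather than a hard-coded `site=T2_CH_CERN`:

```python
def build_queries(query_by, query_source, runs, site="T2_CH_CERN"):
    """Build DAS 'file' queries, optionally restricted to one site.

    An empty `site` drops the restriction so that all sites are searched.
    """
    # The leading space is part of the suffix, so an empty site adds nothing.
    site_clause = " site={0}".format(site) if site else ""
    if runs:
        return [
            "file {0}={1} run={2}{3}".format(query_by, query_source, run, site_clause)
            for run in runs
        ]
    return ["file {0}={1}{2}".format(query_by, query_source, site_clause)]
```

Using `if runs:` instead of `len(...) is not 0` avoids comparing an integer by identity, which newer Python versions flag with a SyntaxWarning.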

@silviodonato
Contributor

merge
we can clean MatrixUtil.py in a follow up PR

@cmsbuild merged commit a65b92a into cms-sw:master on Sep 25, 2020
@chayanit

chayanit commented Oct 8, 2020

Hello @smuzaffar, @silviodonato, this PR doesn't seem to remove the whitelist for the CERN site. We have a problem reading input for RelVals in 11_2_0_pre7. Can we revisit this?

@smuzaffar
Contributor Author

@chayanit, I did not want to change the default behavior of the script, so by default the CERN site is selected. As mentioned in the description of the PR, running with --site='' should allow searching all sites. Is this behavior not working?

@chayanit

chayanit commented Oct 8, 2020

@smuzaffar it seems not; apparently it always looks for samples at the CERN site and not at others

@smuzaffar
Contributor Author

smuzaffar commented Oct 8, 2020

It works for me. For example, running without the --site option, I see that runTheMatrix runs the following

>runTheMatrix.py -i all -l 1000 --maxSteps=1 --dryRun
dasgoclient --limit 0 --query 'file dataset=/MinimumBias/Run2011A-v1/RAW run=165121 site=T2_CH_CERN'

and with --site='' I get

>runTheMatrix.py -i all -l 1000 --maxSteps=1 --dryRun --site=''
 dasgoclient --limit 0 --query 'file dataset=/MinimumBias/Run2011A-v1/RAW run=165121'
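
The commands above spell the option `--site` although it was added as `--sites`. A likely explanation, assuming runTheMatrix.py parses its arguments with `optparse`, is that `optparse` accepts any unambiguous prefix of a long option. The parser below is a hypothetical stand-in with an assumed `dest` name, not the script's real setup:

```python
from optparse import OptionParser

# Hypothetical parser with only the option discussed in this thread.
parser = OptionParser()
parser.add_option("--sites", dest="dasSites", default="T2_CH_CERN")

# The shell turns --sites='' into the single argument "--sites=".
full, _ = parser.parse_args(["--sites="])
# "--site" is an unambiguous prefix of "--sites", so optparse resolves it.
short, _ = parser.parse_args(["--site="])
```

Both spellings produce an empty `dasSites` here. The abbreviation stops working if another long option starting with `--site` exists, so the full `--sites` spelling is the safer one to use.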

@chayanit

chayanit commented Oct 9, 2020

Hi @smuzaffar, it doesn't work for us, and we found this is caused by the default setting here

@smuzaffar
Contributor Author

How can I reproduce it?

@smuzaffar
Contributor Author

smuzaffar commented Oct 9, 2020

Are you calling runTheMatrix to generate the configuration, or importing MatrixUtil directly? If you are not using runTheMatrix, then I would suggest setting the environment variable CMSSW_DAS_QUERY_SITES=''
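
A rough sketch of how an environment variable like this can drive the site selection. Only the variable name `CMSSW_DAS_QUERY_SITES` and the `T2_CH_CERN` default come from this thread; the function `das_site_clause` is hypothetical, and how MatrixUtil.py actually reads the variable is an assumption:

```python
import os


def das_site_clause(environ=None, default_site="T2_CH_CERN"):
    """Return the ' site=...' suffix for a DAS query, or '' for all sites.

    Hypothetical helper: an unset variable falls back to the default site,
    while a variable set to the empty string removes the restriction.
    """
    env = os.environ if environ is None else environ
    site = env.get("CMSSW_DAS_QUERY_SITES", default_site)
    return " site={0}".format(site) if site else ""
```

The distinction between "unset" and "set to empty" matters here: `export CMSSW_DAS_QUERY_SITES=''` searches all sites, whereas leaving the variable undefined keeps the CERN default.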

@chayanit

chayanit commented Oct 9, 2020

Yes, we run runTheMatrix to generate the configuration

@smuzaffar
Contributor Author

Can you please share the runTheMatrix.py command you run?

@chayanit

chayanit commented Oct 9, 2020

Sure, this is the command line we usually run for RelVal production:
runTheMatrix.py --what upgrade -l 11834.0 -t 4 -m 11500 -b 'fullsim_PU_2021_14TeV' -i all --noCaf --wm force
To reproduce it, you also need to update the string here https://github.com/cms-sw/cmssw/blob/master/Configuration/PyReleaseValidation/python/relval_steps.py#L3211 to "CMSSW_11_2_0_pre7-112X_mcRun3_2021_realistic_v8-v"

@smuzaffar
Contributor Author

I do not see the --sites='' option here, so it will certainly use the default. As I mentioned earlier, runTheMatrix.py uses T2_CH_CERN by default. This has been the behavior for many years and I did not want to change it, which is why I added the extra command-line option. If you want to remove T2_CH_CERN from the whitelist, please call runTheMatrix.py with the --sites='' command-line option

@chayanit

chayanit commented Oct 9, 2020

Ah ok @smuzaffar, you mean we have to put --sites='' explicitly in the command line?

@smuzaffar
Contributor Author

yes

@chayanit

chayanit commented Oct 9, 2020

Can you show how explicitly?

@smuzaffar
Contributor Author

smuzaffar commented Oct 9, 2020

runTheMatrix.py --what upgrade -l 11834.0 -t 4 -m 11500 -b 'fullsim_PU_2021_14TeV' -i all --noCaf --wm force --sites=''

see the extra --sites='' at the end

@chayanit

chayanit commented Oct 9, 2020

This option is not shown in runTheMatrix.py -h, though. Should we include it?

@smuzaffar
Contributor Author

smuzaffar commented Oct 9, 2020

Are you sure that you are using a release where this option is available? If I run runTheMatrix.py --help in CMSSW_11_2_X, I see

  --sites=DASSITES      Run DAS query to get data from a specific site
                        (default is T2_CH_CERN). Set it to empty string to
                        search all sites.

@chayanit

chayanit commented Oct 9, 2020

Yeah, now I see that the option was added to the runTheMatrix.py script, but it doesn't show up when I run 'runTheMatrix.py --help'. I'm doing this on 11_2_0_pre7

@davidlange6
Contributor

davidlange6 commented Oct 9, 2020 via email

Successfully merging this pull request may close these issues.

runTheMatrix.py -i all whitelists CERN even if files are elsewhere