Allow numbers in data tier names #11930

germanfgv · 2024-03-14T16:32:08Z

Fixes #11931

Status

ready

Description

In the coming weeks, Tier-0 will introduce the L1SCOUT data tier. This requires allowing digits in data tier names.
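
As a rough illustration of the requirement, the sketch below compares a letters-only tier pattern with one that also accepts digits. Both patterns and the helper `is_valid_tier` are hypothetical stand-ins, not the actual expressions or functions in Lexicon.py.

```python
import re

# Hypothetical patterns for illustration only; the real ones live in Lexicon.py.
TIER_LETTERS_ONLY = re.compile(r'^[A-Z\-_]+$')
TIER_WITH_DIGITS = re.compile(r'^[A-Z0-9\-_]+$')


def is_valid_tier(name, pattern):
    """Return True if the whole tier name matches the given pattern."""
    return pattern.match(name) is not None


# Print the verdict of each pattern for a few sample tier names.
for tier in ("RAW", "AOD", "L1SCOUT"):
    print(tier, is_valid_tier(tier, TIER_LETTERS_ONLY), is_valid_tier(tier, TIER_WITH_DIGITS))
```

With these stand-in patterns, only L1SCOUT is treated differently: the letters-only pattern rejects it and the digit-aware one accepts it.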

Is it backward compatible (if not, which system it affects?)

YES

External dependencies / deployment changes

cmsdmwmbot · 2024-03-14T16:45:26Z

Jenkins results:

Python3 Unit tests: failed
- 2 new failures
- 1 changes in unstable tests
Python3 Pylint check: failed
- 4 warnings and errors that must be fixed
- 94 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded
- 84 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14972/artifact/artifacts/PullRequestReport.html

vkuznet · 2024-03-14T16:47:11Z

I hope everybody understands that patching Lexicon.py alone will not be sufficient, since it is not used by the Go-based DBS code (https://github.com/dmwm/dbs2go). Therefore, to make this data tier acceptable in DBS, someone needs to patch the Go-based code and update dbs2go in production. The current maintainer of the DBS Go code is @todor-ivanov.

vkuznet · 2024-03-14T16:57:41Z

For the record, in the dbs2go codebase I ported the Lexicon.py rules into independent JSON files. They are located at https://github.com/dmwm/dbs2go/tree/master/static and you can find the tier patterns at the following lines:

static/lexicon_writer.json
71:    "name": "data_tier_name",

static/lexicon_writer_negative.json
27:    "data_tier_name": [

static/lexicon_writer_positive.json
14:    "data_tier_name": [

static/lexicon_reader_positive.json
14:    "data_tier_name": [

static/lexicon_reader.json
81:    "name": "data_tier_name",

static/lexicon_reader_negative.json
27:    "data_tier_name": [

All of these patterns should be adjusted accordingly. For the record, WM has a long-standing issue (#10614) about porting the Lexicon rules from Python code to an independent format. Since it has not been addressed yet, we must be careful with Lexicon changes: the Python-based code uses Lexicon.py, while DBS cannot use it and relies instead on the independent JSON files listed above.
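
One way to guard against the two rule sets drifting apart would be a small check that runs the same sample names through both. The sketch below is only an assumption about how that could look: the JSON layout (a list of records with `name` and `patterns` keys), the inline pattern strings, and the helpers `json_patterns` and `disagreements` are made up for illustration and are not copied from dbs2go or Lexicon.py. Note also that Go's regexp package uses RE2 syntax, so evaluating JSON patterns with Python's `re` is only an approximation.

```python
import json
import re

# Hypothetical stand-in for the Python-side rule (not the real Lexicon.py pattern).
PY_TIER_PATTERN = r'^[A-Z0-9\-_]{1,99}$'

# Hypothetical stand-in for a JSON rule file; the schema is assumed.
JSON_RULES = json.loads(r'''
[
  {"name": "data_tier_name", "patterns": ["^[A-Z\\-_]{1,99}$"]}
]
''')


def json_patterns(rules, name):
    """Collect all patterns registered under the given rule name."""
    return [p for rule in rules if rule["name"] == name for p in rule["patterns"]]


def disagreements(samples, py_pattern, other_patterns):
    """Return the samples on which the two rule sets give different verdicts."""
    diffs = []
    for sample in samples:
        py_ok = re.match(py_pattern, sample) is not None
        other_ok = any(re.match(p, sample) for p in other_patterns)
        if py_ok != other_ok:
            diffs.append(sample)
    return diffs


patterns = json_patterns(JSON_RULES, "data_tier_name")
print(disagreements(["RAW", "L1SCOUT", "raw"], PY_TIER_PATTERN, patterns))
```

In this made-up setup the Python rule already accepts digits while the JSON rule does not, so the check reports `L1SCOUT` as the only name the two sides disagree on.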

todor-ivanov · 2024-03-14T17:24:14Z

Hi @vkuznet

I'll look into that ASAP

amaltaro · 2024-03-15T07:22:48Z

I think we should take this opportunity and make the dataset regex consistent with the datatier field as well, both in WMCore and in DBS.

For instance, DBS defines (I guess it's the same as in the Lexicon) the dataset regex with a datatier field of up to 50 chars:
https://github.com/dmwm/dbs2go/blob/master/static/lexicon_writer.json#L29

while the datatier field is defined with up to 99 chars:
https://github.com/dmwm/dbs2go/blob/master/static/lexicon_writer.json#L73

The same situation applies to WMCore, which defines the datatier field in the dataset regex to be smaller than 50 chars:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Lexicon.py#L205

@germanfgv I would suggest to update the line above to up to 99 chars as well.
@todor-ivanov can you please check the length of the dataset name in the DBS table? I guess it will be >= 301, but we better confirm it before updating this in WMCore.
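
To make the mismatch concrete, here is a small sketch with simplified, hypothetical patterns. Only the 50 and 99 character limits are taken from the links above; the character classes and the names `TIER_RE` and `DATASET_RE` as written here are assumptions, not the real expressions. A tier name longer than 50 characters passes the standalone tier check but is rejected once it is embedded in a dataset path.

```python
import re

# Simplified, hypothetical patterns; only the length limits mirror the discussion.
TIER_RE = re.compile(r'^[A-Z0-9\-]{1,99}$')
DATASET_RE = re.compile(r'^/[a-zA-Z0-9_\-]+/[a-zA-Z0-9_.\-]+/[A-Z0-9\-]{1,50}$')

long_tier = "A" * 60  # valid as a tier, too long for the dataset pattern
dataset = "/Primary/Processed-v1/" + long_tier

print(bool(TIER_RE.match(long_tier)))   # standalone tier check
print(bool(DATASET_RE.match(dataset)))  # same tier inside a dataset path
```

Raising the tier limit inside the dataset pattern to 99 would make the two checks agree for such names.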

germanfgv · 2024-03-15T17:30:28Z

> @germanfgv I would suggest to update the line above to up to 99 chars as well.
> @todor-ivanov can you please check the length of the dataset name in the DBS table? I guess it will be >= 301, but we better confirm it before updating this in WMCore.

I updated both SEARCHDATASET_RE and DATASET_RE

cmsdmwmbot · 2024-03-15T17:47:17Z

Jenkins results:

Python3 Unit tests: succeeded
- 2 changes in unstable tests
Python3 Pylint check: failed
- 4 warnings and errors that must be fixed
- 94 comments to review
Pylint py3k check: succeeded
Pycodestyle check: succeeded
- 84 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/14975/artifact/artifacts/PullRequestReport.html

amaltaro

Thank you, German. It looks good to me!

germanfgv requested review from amaltaro and todor-ivanov March 14, 2024 16:36

Allow digits in data tier names

a561268

germanfgv force-pushed the lexiconUpdate branch from 5e54e80 to a561268 Compare March 15, 2024 17:28

amaltaro approved these changes Mar 18, 2024

amaltaro merged commit c123bb5 into dmwm:master Mar 18, 2024
3 of 4 checks passed

amaltaro mentioned this pull request Mar 18, 2024

Enhancement: Allow versioned data tier for partial pileup data placement dmwm/CMSRucio#744

Closed

This was referenced Mar 27, 2024

Fix regex for Spec generation with alphanumerical data tiers #11951

Merged

Add data tier validation for Repack workload creation #11954

Open
