Das client fails with interface conversion #37052

Closed
smuzaffar opened this issue Feb 24, 2022 · 37 comments

Comments

@smuzaffar
Contributor

One unit test, https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc10/CMSSW_12_3_X_2022-02-23-2300/unitTestLogs/PhysicsTools/Utilities#/1576-1576 , is failing in all IBs. The problem is dasgoclient, which fails to run the following command:

> dasgoclient -query 'file dataset=/Cosmics/Run2011A-v1/RAW run=160960 lumi=277' -json
panic: interface conversion: interface {} is json.Number, not []interface {}

goroutine 1 [running]:
github.com/dmwm/das2go/services.LocalAPIs.File4DatasetRunLumi({}, {{0xc000205cc0, 0x3f}, {0x7ffcfca33218, 0x39}, {0xc00027f980, 0x20}, 0xc00031df50, {0xc0003078c0, 0x1, ...}, ...})
        /home/runner/go/pkg/mod/github.com/dmwm/das2go@v0.0.0-20220121173233-cba4d3ebdd14/services/dbs.go:271 +0x7a5
reflect.Value.call({0x7bd660, 0xabf7c8, 0xa8fd20}, {0x7c2ef7, 0x4}, {0xc0001a9138, 0x1, 0x40c34b})
        /opt/hostedtoolcache/go/1.17.6/x64/src/reflect/value.go:556 +0x845
reflect.Value.Call({0x7bd660, 0xabf7c8, 0xc000352d80}, {0xc0001a9138, 0x1, 0x1})
        /opt/hostedtoolcache/go/1.17.6/x64/src/reflect/value.go:339 +0xc5
main.processLocalApis({{0xc000205cc0, 0x3f}, {0x7ffcfca33218, 0x39}, {0xc00027f980, 0x20}, 0xc00031df50, {0xc0003078c0, 0x1, 0x1}, ...}, ...)
        /home/runner/work/dasgoclient/dasgoclient/main.go:964 +0x505
main.process({0x7ffcfca33218, 0x39}, 0x1, {0x83f3a0, 0x1}, 0x0, {0x0, 0x0}, {0x7ca552, 0x16}, ...)
        /home/runner/work/dasgoclient/dasgoclient/main.go:483 +0x12cf
main.main()
        /home/runner/work/dasgoclient/dasgoclient/main.go:143 +0x10e6

@vkuznet can you please look into this?

@cmsbuild
Contributor

A new Issue was created by @smuzaffar Malik Shahzad Muzaffar.

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@vkuznet
Contributor

vkuznet commented Feb 24, 2022

The fix is here cms-sw/cmsdist#7646

@vkuznet
Contributor

vkuznet commented Feb 24, 2022

The issue popped up because of a change to the DBS server that we made yesterday. The new DBS server does not provide aggregated records, and as a consequence lumi values changed their data type: results from DBS no longer contain a list of lumis but rather individual lumi numbers.

@smuzaffar
Contributor Author

smuzaffar commented Feb 25, 2022

@vkuznet , I am afraid this DBS server-side change is going to break all our IBs. The https://github.com/cms-sw/cmssw/blob/master/Configuration/PyReleaseValidation/scripts/das-selected-lumis.py script is used by all IBs and cmssw releases to select the lumi numbers. For example, previously the das query

dasgoclient --limit 0 --query 'lumi,file dataset=/HIHardProbes/HIRun2018A-v1/RAW run=326479' --format json

was returning https://raw.githubusercontent.com/cms-sw/cms-sw.github.io/master/das_queries/8b/8b5f64ce639191e4db57cc88035da15cbdc57a3faaa60ea1eab0ca4377221249.json which can be parsed by das-selected-lumis.py (in cmssw env) e.g.

> curl -s https://raw.githubusercontent.com/cms-sw/cms-sw.github.io/master/das_queries/8b/8b5f64ce639191e4db57cc88035da15cbdc57a3faaa60ea1eab0ca4377221249.json |  das-selected-lumis.py 1,23
/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/853DBE29-53BA-9A44-9FDD-58E4E9064EB1.root
/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/45001EBC-B4D4-9043-A276-8F3AF621C64A.root
/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/0E2CC5D5-9D87-7348-9219-B00CD718C847.root
/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/7B3F72ED-E183-3F4B-9FE4-DAE6D911403E.root

but now the new dasgoclient returns a different format, which breaks all IBs and existing releases:

> /cvmfs/cms-ib.cern.ch/latest/common/dasgoclient  --limit 0 --query 'lumi,file dataset=/HIHardProbes/HIRun2018A-v1/RAW run=326479' --format json | das-selected-lumis.py 1,23
Traceback (most recent call last):
  File "/cvmfs/cms-ib.cern.ch/week1/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_X_2022-02-24-2300/bin/slc7_amd64_gcc10/das-selected-lumis.py", line 50, in <module>
    process_lumi(lumi_data)
  File "/cvmfs/cms-ib.cern.ch/week1/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_3_X_2022-02-24-2300/bin/slc7_amd64_gcc10/das-selected-lumis.py", line 40, in process_lumi
    if not isinstance(lumi_nums[0], list): lumi_rang = [ [n,n] for n in lumi_nums ]
TypeError: 'int' object has no attribute '__getitem__'
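
The failure can be reproduced without DAS. The value the script indexes as `lumi_nums[0]` is presumably the "number" field of a lumi entry; with the new record layout that field is a single integer, so the indexing raises (on Python 3 the message reads "'int' object is not subscriptable"). A minimal sketch of a tolerant normalization follows. The helper name `as_lumi_ranges` is hypothetical and is not part of das-selected-lumis.py.

```python
def as_lumi_ranges(number):
    """Normalize a DAS "number" value into a list of [first, last] ranges.

    Accepts a single int (new layout), a flat list of ints, or a list of
    [first, last] pairs. Hypothetical helper, for illustration only.
    """
    if isinstance(number, int):
        number = [number]          # new layout: one lumi per record
    if not number:
        return []
    if not isinstance(number[0], list):
        return [[n, n] for n in number]   # flat list -> degenerate ranges
    return [list(r) for r in number]      # already a list of ranges
```

This only makes the parsing step tolerant. It does not merge the many per-lumi records back into one record per file.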

@smuzaffar
Contributor Author

smuzaffar commented Feb 25, 2022

@vkuznet , in the new format I see that for each file there are multiple records in which only the lumi number changes [a]. What was the motivation behind this change? Previously we had one entry per file with a list of lumi numbers.

[a]

    {
      "das": {
        "expire": 1645784815,
        "instance": "prod/global",
        "primary_key": "file.name",
        "record": 1,
        "services": [
          "dbs3:file_lumi4dataset"
        ]
      },
      "file": [
        {
          "name": "/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/853DBE29-53BA-9A44-9FDD-58E4E9064EB1.root"
        }
      ],
      "lumi": [
        {
          "number": 2
        }
      ],
      "qhash": "6e4d3c0eb37cb387fdc3f507656ef7fb"
    },
    {
      "das": {
        "expire": 1645784815,
        "instance": "prod/global",
        "primary_key": "file.name",
        "record": 1,
        "services": [
          "dbs3:file_lumi4dataset"
        ]
      },
      "file": [
        {
          "name": "/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/853DBE29-53BA-9A44-9FDD-58E4E9064EB1.root"
        }
      ],
      "lumi": [
        {
          "number": 3
        }
      ],
      "qhash": "6e4d3c0eb37cb387fdc3f507656ef7fb"
    },
    {
      "das": {
        "expire": 1645784815,
        "instance": "prod/global",
        "primary_key": "file.name",
        "record": 1,
        "services": [
          "dbs3:file_lumi4dataset"
        ]
      },
      "file": [
        {
          "name": "/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/853DBE29-53BA-9A44-9FDD-58E4E9064EB1.root"
        }
      ],
      "lumi": [
        {
          "number": 4
        }
      ],
      "qhash": "6e4d3c0eb37cb387fdc3f507656ef7fb"
    },
    {
      "das": {
        "expire": 1645784815,
        "instance": "prod/global",
        "primary_key": "file.name",
        "record": 1,
        "services": [
          "dbs3:file_lumi4dataset"
        ]
      },
      "file": [
        {
          "name": "/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/853DBE29-53BA-9A44-9FDD-58E4E9064EB1.root"
        }
      ],
      "lumi": [
        {
          "number": 6
        }
      ],
      "qhash": "6e4d3c0eb37cb387fdc3f507656ef7fb"
    },
    {
      "das": {
        "expire": 1645784815,
        "instance": "prod/global",
        "primary_key": "file.name",
        "record": 1,
        "services": [
          "dbs3:file_lumi4dataset"
        ]
      },
      "file": [
        {
          "name": "/store/hidata/HIRun2018A/HIHardProbes/RAW/v1/000/326/479/00000/853DBE29-53BA-9A44-9FDD-58E4E9064EB1.root"
        }
      ],
      "lumi": [
        {
          "number": 5
        }
      ],
      "qhash": "6e4d3c0eb37cb387fdc3f507656ef7fb"
    },
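
The one-entry-per-file layout can be rebuilt from records like those in [a]. A minimal sketch, assuming the records have already been loaded into a Python list of dicts. The function name `aggregate_file_lumis` is hypothetical, and this is not a description of how dasgoclient or DBSClient aggregate.

```python
def aggregate_file_lumis(records):
    """Merge per-lumi DAS records into one record per file name.

    The first record seen for a file supplies the other fields (das, qhash);
    lumi numbers are collected in arrival order under lumi[0]["number"].
    """
    merged = {}
    for rec in records:
        name = rec["file"][0]["name"]
        number = rec["lumi"][0]["number"]
        lumis = number if isinstance(number, list) else [number]
        if name not in merged:
            out = dict(rec)                  # shallow copy of the first record
            out["lumi"] = [{"number": []}]   # fresh list, input is not mutated
            merged[name] = out
        merged[name]["lumi"][0]["number"].extend(lumis)
    return list(merged.values())
```

Applied to the five records shown above, this would yield a single record for the file with "number" equal to [2, 3, 4, 6, 5].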

@smuzaffar
Contributor Author

smuzaffar commented Feb 25, 2022

@vkuznet , das queries like lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512 (which has only 3 files) now return 203 KB of data, compared to 4 KB with the old das client.

> q='lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512'
> /cvmfs/cms-ib.cern.ch/latest/common/dasgoclient --format json --limit 0 --query "$q" |wc
   624    1254  203731
> cat old.json | wc
   1     720    4315

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

@smuzaffar , we advertised all changes to the DBS server up front for two months. In short, the changes removed the aggregation layer, which contributed to the instability of DBS itself. Instead, the new DBS server just yields records from the DB without any aggregation. We adjusted the official DBSClient to handle aggregation on the client side. I presented all the details about the new DBS server at the C&O meeting in early January. You can find them here:
https://indico.cern.ch/event/1096200/contributions/4611461/attachments/2356881/4022206/DBS_migration.pdf
In particular, slide 10 covers aggregation.

Now, I was not aware of the das-selected-lumis.py script until yesterday, when AlCaDB people contacted me about it. I have already provided a fix, and you can find the new script here:
/afs/cern.ch/user/v/valya/public/das-selected-lumis.py

It basically aggregates lumis, but I only looked at the specific use case that the AlCaDB people pointed out. I suggest you try it out, and we can adjust this script to perform the aggregation you require.

Also, regarding the lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512 query: I don't know exactly what your use case is, i.e. whether you want to see aggregated lumis or not, or just files, but I think you can easily address this using standard unix tools.

For instance, if I need only unique files I can do

dasgoclient -query="lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512 | grep file.name" | sort | uniq
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/0797d739-0677-432e-91b9-7a6d8a0e5601.root
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/9e1384d9-83b8-4240-8486-950fa0e22a77.root

or if I need files for specific lumis I can do:

./dasgoclient -query="lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512" | egrep "615|620"
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/0797d739-0677-432e-91b9-7a6d8a0e5601.root 615
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/0797d739-0677-432e-91b9-7a6d8a0e5601.root 620

and then again use awk, grep, sort, uniq to get the final list.
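
One caveat with the egrep approach: "615|620" is a substring match, so it could also pick up a lumi such as 1615 or a file name that happens to contain those digits. A minimal sketch of an exact match on the lumi column, assuming the non-aggregated plain-text output has the two-column "LFN lumi" form shown above. The function name `files_for_lumis` is hypothetical.

```python
def files_for_lumis(lines, wanted):
    """Return unique LFNs whose lumi column is exactly one of `wanted`.

    `lines` are plain-text rows of the form "<lfn> <lumi>"; rows that do not
    have exactly two whitespace-separated columns are skipped.
    """
    wanted = set(wanted)
    files = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        lfn, lumi = parts
        if lumi.isdigit() and int(lumi) in wanted and lfn not in files:
            files.append(lfn)   # keep first-seen order, drop duplicates
    return files
```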

I'll be happy to discuss and help with your use cases once I know all the details. If you want, we can schedule a zoom meeting to discuss this.

@smuzaffar
Contributor Author

@vkuznet , the problem is that das-selected-lumis.py is part of cmssw releases, so we cannot deploy the new script in existing cmssw releases.

We run hundreds of unique das queries for each IB during the IB validation tests (see the .query files under https://github.com/cms-sw/cms-sw.github.io/tree/master/das_queries ; I have no idea about the exact uses of these queries). In order to avoid das glitches (which we hit a lot), we cache the dasgoclient results in https://github.com/cms-sw/cms-sw.github.io and refresh the results every week. This repository is already hitting github size limits, and with the latest das client change (which will increase the size 10-20 times) I am afraid we will not be able to cache das results.
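
The caching idea can be sketched as follows. This is an illustration only: the thread does not say how the cache keys are computed, so the SHA-256-of-the-query key below is an assumption, as are the function names and the `run_query` callable (a stand-in for whatever actually invokes dasgoclient). Only the two-character directory prefix and the .query/.json file pairs are taken from the paths quoted in this thread.

```python
import hashlib
import json
import os


def cache_paths(root, query):
    """Return (.query path, .json path) for a query under `root`.

    ASSUMPTION: the key is the SHA-256 hex digest of the query string.
    """
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    base = os.path.join(root, digest[:2], digest)
    return base + ".query", base + ".json"


def cached_das_query(root, query, run_query):
    """Serve a query from the cache, calling `run_query` only on a miss."""
    query_file, json_file = cache_paths(root, query)
    if os.path.exists(json_file):
        with open(json_file) as fh:
            return json.load(fh)
    result = run_query(query)   # hypothetical callable returning parsed JSON
    os.makedirs(os.path.dirname(json_file), exist_ok=True)
    with open(query_file, "w") as fh:
        fh.write(query)
    with open(json_file, "w") as fh:
        json.dump(result, fh)
    return result
```

With a layout like this, each cached result is stored in full, so a larger per-query payload translates directly into a larger repository.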

@smuzaffar
Contributor Author

smuzaffar commented Feb 25, 2022

By the way, none of the suggestions you mentioned in #37052 (comment) are going to fix existing cmssw releases.

@smuzaffar
Contributor Author

@vkuznet , I went through your slides https://indico.cern.ch/event/1096200/contributions/4611461/attachments/2356881/4022206/DBS_migration.pdf . As you mentioned, clients can perform the aggregation if it is required, so why not update dasgoclient to do the lumi aggregation, so that existing cmssw releases do not break and the dasgoclient result size stays under control?

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

@smuzaffar you touched on a few independent issues, and I don't know why we did not communicate properly on them. To move forward and fix them properly, I suggest we clarify each one separately:

  • das instabilities: could you please clarify, using concrete examples, queries, etc., what they are? Since DAS sits in the middle between different CMS services, the instabilities may come from different sources, such as a service itself, the amount of data queried concurrently, or DAS itself. To address the issue we need to understand where it comes from.
  • data aggregation: yes, we can add it to the client, but I need to know the exact queries the IBs need, etc. I need to understand how we should add this, e.g. whether it should be the default behavior or not, and we should discuss this before I do any implementation. Now, since we have already upgraded DBS, we need to evaluate different use cases within and outside CMSSW IB builds.

It is unfortunate that you did not raise these issues before we made the DBS upgrade, since I explicitly asked all groups to do their homework and test the changes. But I'll do my best and prioritize work on the DAS go client to resolve this asap.

@smuzaffar
Contributor Author

@vkuznet , das client instabilities are not new; they have been there since the start of the das client (maybe 8-10 years). In the old days (pre-dasgoclient era) it was due to das not returning results (for various reasons: network issues, service not reachable, etc.). Nowadays it is mostly due to failures of the various services that das tries to communicate with. In order to protect our IBs and PR tests against these instabilities, we use github caches for das results.

@smuzaffar
Contributor Author

The list of all queries for CMSSW is in https://github.com/cms-sw/cms-sw.github.io/tree/master/das_queries (search for .query files).

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

@smuzaffar there are 3760 query files, and I need to know which ones should be addressed first. My understanding is that the main failures come from file,lumi queries, where you need to aggregate results by lumi sections; is this correct? If so, I'll start with this query, provide aggregation within dasgoclient, and we can move forward from there. How does that sound?

@smuzaffar
Contributor Author

All of these queries are run by cmssw IBs (cmssw 5.3.X up to 12.3.X), and all of them are important; otherwise IBs and PR tests will fail. I do not know exactly how many types of queries there are, but I guess you can cat */*.query and see the different types of searches cmssw does.

We need to fix the results so that existing cmssw releases do not fail, and make sure that we do not dump 50 times more data into the github caches.

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

@smuzaffar I made a first change to do file-lumis aggregation. Could you please try out this executable: /afs/cern.ch/user/v/valya/public/dasgoclient/dasgoclient_amd64. So far it only contains file-lumis aggregation, and we should first decide on the default behavior of dasgoclient. So far I run it as follows:

/afs/cern.ch/user/v/valya/public/dasgoclient/dasgoclient_amd64 -query="lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512" -aggFileLumis
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/9e1384d9-83b8-4240-8486-950fa0e22a77.root [1,10,12,16,19,33,37,45,61,78,81,85,86,95,100,101,108,112,113,120,121,130,135,156,158,159,163,164,174,192,196,204,206,214,217,219,223,246,249,279,318,340,4,7,8,9,32,38,40,77,80,93,116,125,133,141,151,181,183,184,188,199,226,236,237,242,247,248,256,257,262,270,274,287,301,302,307,309,314,20,21,24,47,50,69,76,84,87,114,118,129,132,136,143,144,149,160,171,173,191,193,200,205,225,227,228,229,231,244,261,273,290,299,325,330,331,334,336,337,342,3,11,17,22,27,28,34,43,52,55,57,62,70,71,82,99,103,139,145,147,165,166,168,170,175,178,185,197,202,210,216,220,222,234,239,241,255,259,269,275,276,277,292,293,297,298,315,319,321,323,6,13,25,31,46,54,56,60,67,75,79,88,89,90,94,98,102,117,126,137,138,140,142,172,176,177,186,198,245,251,263,264,281,300,304,306,308,316,326,329,332,335,5,23,26,29,49,51,58,65,72,73,91,97,104,106,115,119,128,134,146,152,167,179,180,190,194,207,215,221,232,240,253,254,283,286,295,296,310,338,14,18,35,39,44,53,64,74,92,107,109,111,122,123,127,131,148,150,153,154,187,189,201,203,211,212,213,224,238,250,252,260,272,278,284,285,288,305,311,322,324,333,339,341,2,15,30,36,41,42,48,59,63,66,68,83,96,105,110,124,155,157,161,162,169,182,195,208,209,218,230,233,235,243,258,265,266,267,268,271,280,282,289,291,294,303,312,313,317,320,327,328]
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/0797d739-0677-432e-91b9-7a6d8a0e5601.root [357,363,368,375,379,381,382,383,387,392,399,402,404,405,429,432,441,444,448,452,463,465,466,467,471,473,487,500,510,516,525,539,556,572,574,577,585,586,589,593,606,345,351,386,393,398,409,423,425,442,446,457,459,462,464,469,476,481,486,491,492,496,509,541,547,559,570,582,588,594,602,605,607,611,613,617,618,619,622,343,352,365,366,395,411,414,416,422,430,483,502,503,507,514,519,522,523,524,528,531,533,543,549,567,569,575,598,599,604,610,621,346,348,358,384,385,403,406,417,433,434,443,468,478,489,497,499,508,526,534,540,552,555,557,578,590,592,609,344,361,371,376,390,391,396,397,408,420,424,426,431,450,460,472,479,485,493,495,498,513,517,521,527,529,532,537,544,546,571,583,597,614,350,359,360,362,372,380,407,436,437,451,455,458,484,488,494,501,511,512,515,538,545,548,550,551,554,561,581,587,608,616,353,354,355,356,364,369,370,374,377,378,394,410,412,418,419,421,427,428,435,439,449,454,461,470,474,480,482,490,506,518,520,535,558,560,562,563,564,566,573,579,584,591,595,596,600,601,603,612,347,349,367,373,388,389,400,401,413,415,438,440,445,447,453,456,475,477,504,505,530,536,542,553,565,568,576,580,615,620]

So, it only produces 2 LFNs, each with its list of lumis. This is what das_queries/41/4100c4315028e2807e9860d22f84218aa7e86b77871bef89198d0e0523ede0db has.

Therefore, we need to decide on the default behavior of dasgoclient: should we explicitly use -aggFileLumis or not? With DBSClient we decided not to use aggregation by default, to allow clients to move forward with the new data streaming. Now we should make the same decision for dasgoclient. My question is how difficult (if possible at all) it is to adjust CMSSW IB workflows to use an explicit option. If the answer is that it is impossible, it seems to me we should enable aggregation by default; otherwise we should consider different use cases.

There are only a few APIs (and therefore DAS queries) which require aggregation, and once I know the default behavior I'll provide fixes one by one.

@smuzaffar
Contributor Author

@vkuznet , I tried your change but it still fails. Can you please change the lumi format to make it consistent with the old one:

"lumi":[{"number": [] }]

In the old format, lumi is a list of one element.
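
A quick way to check a record against this shape, as a minimal sketch. The function name `has_legacy_lumi_shape` is hypothetical, and the check covers only what is stated here: lumi is a one-element list holding a dict whose "number" is a list.

```python
def has_legacy_lumi_shape(record):
    """True if record["lumi"] looks like [{"number": [...]}]."""
    lumi = record.get("lumi")
    return (
        isinstance(lumi, list)
        and len(lumi) == 1
        and isinstance(lumi[0], dict)
        and isinstance(lumi[0].get("number"), list)
    )
```

A record carrying "number": 2, as in the non-aggregated output, fails this check, while "number": [2, 3, 4] passes.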

@smuzaffar
Contributor Author

By the way, https://cmssdt.cern.ch/jenkins/job/das-query is the jenkins job which I run for caching das results, and https://cmssdt.cern.ch/jenkins/job/das-query/4142/console used your new dasgoclient and failed due to the mismatch in lumi format.

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

Please try the same executable again now; I just updated it with what you requested.

@smuzaffar
Contributor Author

@vkuznet , it failed again with the error

unexpected fault address 0x8c8b39
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x8c8b39 pc=0x4539fc]

See details at https://cmssdt.cern.ch/jenkins/job/das-query/4146/console . Can you please also fix the output for file, which is also a list, e.g. "file":[{"name": "" }]

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

@smuzaffar , it already has file as a list of dicts. And the bus error most likely comes from an expired proxy. Please run it manually on lxplus to confirm that. This is exactly what I do:

voms-proxy-init -voms cms -rfc
Contacting voms2.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=voms2.cern.ch] "cms"...
Remote VOMS server contacted succesfully.


Created proxy in /tmp/x509up_uXXXX.

Your proxy is valid until Sat Feb 26 06:52:12 CET 2022

# and then I run
~/public/dasgoclient/dasgoclient_amd64 -query="lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512" -aggFileLumis
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/0797d739-0677-432e-91b9-7a6d8a0e5601.root [357,363,368,375,379,381,382,383,387,392,399,402,404,405,429,432,441,444,448,452,463,465,466,467,471,473,487,500,510,516,525,539,556,572,574,577,585,586,589,593,606,345,351,386,393,398,409,423,425,442,446,457,459,462,464,469,476,481,486,491,492,496,509,541,547,559,570,582,588,594,602,605,607,611,613,617,618,619,622,343,352,365,366,395,411,414,416,422,430,483,502,503,507,514,519,522,523,524,528,531,533,543,549,567,569,575,598,599,604,610,621,346,348,358,384,385,403,406,417,433,434,443,468,478,489,497,499,508,526,534,540,552,555,557,578,590,592,609,344,361,371,376,390,391,396,397,408,420,424,426,431,450,460,472,479,485,493,495,498,513,517,521,527,529,532,537,544,546,571,583,597,614,350,359,360,362,372,380,407,436,437,451,455,458,484,488,494,501,511,512,515,538,545,548,550,551,554,561,581,587,608,616,353,354,355,356,364,369,370,374,377,378,394,410,412,418,419,421,427,428,435,439,449,454,461,470,474,480,482,490,506,518,520,535,558,560,562,563,564,566,573,579,584,591,595,596,600,601,603,612,347,349,367,373,388,389,400,401,413,415,438,440,445,447,453,456,475,477,504,505,530,536,542,553,565,568,576,580,615,620]
/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/9e1384d9-83b8-4240-8486-950fa0e22a77.root [1,10,12,16,19,33,37,45,61,78,81,85,86,95,100,101,108,112,113,120,121,130,135,156,158,159,163,164,174,192,196,204,206,214,217,219,223,246,249,279,318,340,4,7,8,9,32,38,40,77,80,93,116,125,133,141,151,181,183,184,188,199,226,236,237,242,247,248,256,257,262,270,274,287,301,302,307,309,314,20,21,24,47,50,69,76,84,87,114,118,129,132,136,143,144,149,160,171,173,191,193,200,205,225,227,228,229,231,244,261,273,290,299,325,330,331,334,336,337,342,3,11,17,22,27,28,34,43,52,55,57,62,70,71,82,99,103,139,145,147,165,166,168,170,175,178,185,197,202,210,216,220,222,234,239,241,255,259,269,275,276,277,292,293,297,298,315,319,321,323,6,13,25,31,46,54,56,60,67,75,79,88,89,90,94,98,102,117,126,137,138,140,142,172,176,177,186,198,245,251,263,264,281,300,304,306,308,316,326,329,332,335,5,23,26,29,49,51,58,65,72,73,91,97,104,106,115,119,128,134,146,152,167,179,180,190,194,207,215,221,232,240,253,254,283,286,295,296,310,338,14,18,35,39,44,53,64,74,92,107,109,111,122,123,127,131,148,150,153,154,187,189,201,203,211,212,213,224,238,250,252,260,272,278,284,285,288,305,311,322,324,333,339,341,2,15,30,36,41,42,48,59,63,66,68,83,96,105,110,124,155,157,161,162,169,182,195,208,209,218,230,233,235,243,258,265,266,267,268,271,280,282,289,291,294,303,312,313,317,320,327,328]

If I use the -json output I see:

~/public/dasgoclient/dasgoclient_amd64 -query="lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512" -aggFileLumis -json
[
{"das":{"expire":1645811598,"instance":"prod/global","primary_key":"file.name","record":1,"services":["dbs3:file_lumi4dataset"]},"file":[{"name":"/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/0797d739-0677-432e-91b9-7a6d8a0e5601.root"}],"lumi":[{"number":[357,363,368,375,379,381,382,383,387,392,399,402,404,405,429,432,441,444,448,452,463,465,466,467,471,473,487,500,510,516,525,539,556,572,574,577,585,586,589,593,606,345,351,386,393,398,409,423,425,442,446,457,459,462,464,469,476,481,486,491,492,496,509,541,547,559,570,582,588,594,602,605,607,611,613,617,618,619,622,343,352,365,366,395,411,414,416,422,430,483,502,503,507,514,519,522,523,524,528,531,533,543,549,567,569,575,598,599,604,610,621,346,348,358,384,385,403,406,417,433,434,443,468,478,489,497,499,508,526,534,540,552,555,557,578,590,592,609,344,361,371,376,390,391,396,397,408,420,424,426,431,450,460,472,479,485,493,495,498,513,517,521,527,529,532,537,544,546,571,583,597,614,350,359,360,362,372,380,407,436,437,451,455,458,484,488,494,501,511,512,515,538,545,548,550,551,554,561,581,587,608,616,353,354,355,356,364,369,370,374,377,378,394,410,412,418,419,421,427,428,435,439,449,454,461,470,474,480,482,490,506,518,520,535,558,560,562,563,564,566,573,579,584,591,595,596,600,601,603,612,347,349,367,373,388,389,400,401,413,415,438,440,445,447,453,456,475,477,504,505,530,536,542,553,565,568,576,580,615,620]}]} ,
{"das":{"expire":1645811598,"instance":"prod/global","primary_key":"file.name","record":1,"services":["dbs3:file_lumi4dataset"]},"file":[{"name":"/store/data/Commissioning2021/HLTPhysics/RAW/v1/000/346/512/00000/9e1384d9-83b8-4240-8486-950fa0e22a77.root"}],"lumi":[{"number":[1,10,12,16,19,33,37,45,61,78,81,85,86,95,100,101,108,112,113,120,121,130,135,156,158,159,163,164,174,192,196,204,206,214,217,219,223,246,249,279,318,340,4,7,8,9,32,38,40,77,80,93,116,125,133,141,151,181,183,184,188,199,226,236,237,242,247,248,256,257,262,270,274,287,301,302,307,309,314,20,21,24,47,50,69,76,84,87,114,118,129,132,136,143,144,149,160,171,173,191,193,200,205,225,227,228,229,231,244,261,273,290,299,325,330,331,334,336,337,342,3,11,17,22,27,28,34,43,52,55,57,62,70,71,82,99,103,139,145,147,165,166,168,170,175,178,185,197,202,210,216,220,222,234,239,241,255,259,269,275,276,277,292,293,297,298,315,319,321,323,6,13,25,31,46,54,56,60,67,75,79,88,89,90,94,98,102,117,126,137,138,140,142,172,176,177,186,198,245,251,263,264,281,300,304,306,308,316,326,329,332,335,5,23,26,29,49,51,58,65,72,73,91,97,104,106,115,119,128,134,146,152,167,179,180,190,194,207,215,221,232,240,253,254,283,286,295,296,310,338,14,18,35,39,44,53,64,74,92,107,109,111,122,123,127,131,148,150,153,154,187,189,201,203,211,212,213,224,238,250,252,260,272,278,284,285,288,305,311,322,324,333,339,341,2,15,30,36,41,42,48,59,63,66,68,83,96,105,110,124,155,157,161,162,169,182,195,208,209,218,230,233,235,243,258,265,266,267,268,271,280,282,289,291,294,303,312,313,317,320,327,328]}]}
]

@smuzaffar
Contributor Author

About -aggFileLumis: I think this should be the default so that old releases get the same results. I noticed that if lumi is not queried but -aggFileLumis is set, then dasgoclient fails:

/afs/cern.ch/user/v/valya/public/dasgoclient/dasgoclient_amd64 --format=json --limit=0 --query 'file dataset=/RelValZEEMM_13_HI/CMSSW_10_3_0_pre2-103X_upgrade2018_realistic_v2-v1/GEN-SIM | grep file.name | sort file.name | unique'  --threshold=900 -aggFileLumis
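
One way to express "aggregate only when lumi is requested" is to look at the select keys of the query before deciding. A minimal sketch, assuming the query has the form seen in this thread (comma-separated select keys, then key=value conditions, then optional | filters). The function name `selects_lumi` is hypothetical, and this is not how dasgoclient parses queries.

```python
def selects_lumi(query):
    """True if "lumi" is among the select keys of a DAS query string."""
    head = query.split("|", 1)[0]      # drop "| grep ..." style filters
    keys = []
    for token in head.split():
        if "=" in token:               # first condition ends the select keys
            break
        keys.extend(k for k in token.split(",") if k)
    return "lumi" in keys
```

Under this sketch, 'lumi,file dataset=... run=346512' would enable aggregation, while 'file dataset=... | grep file.name' would not.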

@smuzaffar
Contributor Author

@vkuznet , things look good now. I ran all the active queries and only two failed [a]. We need to make -aggFileLumis the default, but it should only apply if lumi is requested in the query (see my comment #37052 (comment)). Also, I noticed that auto_cross_section was a float but is now an integer:

-          "auto_cross_section": 0.0,
+          "auto_cross_section": 0,

is this a correct change?

[a]

########################################
file dataset=/ZeroBias/Run2017F-SiStripCalMinBias-PromptReco-v1/ALCARECO site=T2_CH_CERN | grep file.name
{"status":"ok", "ecode":"", "mongo_query":{"query":"file dataset=/ZeroBias/Run2017F-SiStripCalMinBias-PromptReco-v1/ALCARECO   detail=true | grep file.name | grep file.name | sort file.name | unique","hash":"4fc6568d011e916225e8620be394be87","spec":{"dataset":"/ZeroBias/Run2017F-SiStripCalMinBias-PromptReco-v1/ALCARECO"},"fields":["file"],"pipe":"grep file.name | grep file.name | sort file.name | unique","instance":"prod/global","detail":true,"system":"","filters":{"grep":["file.name","file.name"],"sort":["file.name"],"unique":["1"]},"aggregators":[],"error":"","tstamp":1645812803}, "nresults":1, "timestamp":1645812837, "ctime":33, "data":[
{"das":{"expire":1645812837,"instance":"prod/global","primary_key":"file.name","record":1,"services":["dbs3:files_via_dataset"]},"file":[{"das":{"expire":1645813437,"instance":"prod/global","primary_key":"file.name","record":0,"services":["das:NA"],"status":"ok","ts":1645812837},"qhash":"4fc6568d011e916225e8620be394be87","query":"file dataset=/ZeroBias/Run2017F-SiStripCalMinBias-PromptReco-v1/ALCARECO   detail=true | grep file.name | grep file.name | sort file.name | unique"}],"qhash":"4fc6568d011e916225e8620be394be87"}
]
}
########################################
file dataset=/ZeroBias/Run2017C-SiStripCalMinBias-09Aug2019_UL2017-v1/ALCARECO site=T2_CH_CERN | grep file.name
{"status":"ok", "ecode":"", "mongo_query":{"query":"file dataset=/ZeroBias/Run2017C-SiStripCalMinBias-09Aug2019_UL2017-v1/ALCARECO   detail=true | grep file.name | grep file.name | sort file.name | unique","hash":"f163befeb23e7bb1c139048150753d12","spec":{"dataset":"/ZeroBias/Run2017C-SiStripCalMinBias-09Aug2019_UL2017-v1/ALCARECO"},"fields":["file"],"pipe":"grep file.name | grep file.name | sort file.name | unique","instance":"prod/global","detail":true,"system":"","filters":{"grep":["file.name","file.name"],"sort":["file.name"],"unique":["1"]},"aggregators":[],"error":"","tstamp":1645812822}, "nresults":1, "timestamp":1645812837, "ctime":14, "data":[
{"das":{"expire":1645812837,"instance":"prod/global","primary_key":"file.name","record":1,"services":["dbs3:files_via_dataset"]},"file":[{"das":{"expire":1645813437,"instance":"prod/global","primary_key":"file.name","record":0,"services":["das:NA"],"status":"ok","ts":1645812837},"qhash":"f163befeb23e7bb1c139048150753d12","query":"file dataset=/ZeroBias/Run2017C-SiStripCalMinBias-09Aug2019_UL2017-v1/ALCARECO   detail=true | grep file.name | grep file.name | sort file.name | unique"}],"qhash":"f163befeb23e7bb1c139048150753d12"}
]
}
Total queries: 3761
Found in object store: 0
DAS Search: 2127
Total Queries Failed: 2

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

Please check the executable again; I have now added aggregation by default for file-lumis, so you can query it as usual:

~/public/dasgoclient/dasgoclient_amd64 -query="lumi,file dataset=/HLTPhysics/Commissioning2021-v1/RAW run=346512"

and the new executable correctly returns file query output; see the output of

~/public/dasgoclient/dasgoclient_amd64 -query="file dataset=/RelValZEEMM_13_HI/CMSSW_10_3_0_pre2-103X_upgrade2018_realistic_v2-v1/GEN-SIM"

@smuzaffar
Contributor Author

thanks, I am testing it now

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

Let's do one issue at a time. Once we have file,lumis in place, I can correct other deviations in data types.

@smuzaffar
Contributor Author

Looks good: both lumi,file and file queries worked, and das-selected-lumis.py works too.

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

Regarding the auto_cross_section value: the new DBS server does not perform any data manipulation (as the Python server did) and only streams data directly from the ORACLE DB. Even though auto_cross_section is declared in the ORACLE schema as float, ORACLE itself returns 0, e.g.

AUTO_CROSS_SECTION
------------------
LOGICAL_FILE_NAME
--------------------------------------------------------------------------------
                 0
/store/data/Run2017F/ZeroBias/ALCARECO/SiStripCalMinBias-PromptReco-v1/000/306/4
60/00000/30FAF89E-9AC8-E711-A6FF-02163E01A509.root

This is why you get an integer value instead of a float in the DAS output. I will not change anything in the DBS server to correct this, since 0 is a proper value for the float data type.
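
For a consumer written in Python, the practical effect of 0 versus 0.0 can be seen with the standard json module. A minimal sketch using made-up one-field records rather than real DAS output:

```python
import json

old = json.loads('{"auto_cross_section": 0.0}')   # value parsed as float
new = json.loads('{"auto_cross_section": 0}')     # value parsed as int

# Numeric comparison treats the two records as equal...
values_equal = old == new
# ...but the Python types differ, and so does the re-serialized text,
# which is what a textual diff of cached results would pick up.
types_equal = type(old["auto_cross_section"]) is type(new["auto_cross_section"])
text_equal = json.dumps(old) == json.dumps(new)

# Code that needs a float regardless of the server's choice can coerce:
cross_section = float(new["auto_cross_section"])
```

So value-based comparisons are unaffected, while anything that compares the JSON text or checks the type will see a difference.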

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

OK, I'm glad that the new executable works. Please confirm how many queries (if any) are failing. If all tests pass, I will push my changes and create a new dasgoclient release via PR.

@smuzaffar
Contributor Author

All active queries (2130 in total) look good; please go ahead and push the changes.

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

OK, I made a new PR here: cms-sw/cmsdist#7650

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

But I also noticed that something has changed in the build procedure, as I am no longer able to build dasgoclient using my version of the build script:

./build.sh dasgoclient
pkg(s): dasgoclient
upload:
Traceback (most recent call last):
  File "pkgtools/cmsBuild", line 3737, in <module>
    factory = PackageFactory(opts)
  File "pkgtools/cmsBuild", line 1198, in __init__
    self.__cmsosDumper = cmsosDumperClass (options.cmsdist)
  File "pkgtools/cmsBuild", line 1137, in __init__
    cmsosFile = open (cmsosFilename)
IOError: [Errno 2] No such file or directory: '/afs/cern.ch/work/v/valya/builds/cmsdist/cmsos.file'

Where should I get cmsos.file, and how should I adjust my build script for that? You can see my build script here: /afs/cern.ch/user/v/valya/public/build.sh

@smuzaffar
Contributor Author

cmsos.file was removed from cmsdist. Which cmsdist and pkgtools branches are you using?

@vkuznet
Contributor

vkuznet commented Feb 25, 2022

pkgtools is V00-30-XX, and for cmsdist I use IB/CMSSW_12_3_X/master as the starting point.

@smuzaffar
Contributor Author

you need pkgtools V00-34-XX

@vkuznet
Contributor

vkuznet commented Feb 26, 2022

Thanks, I confirm that with V00-34-XX I can build again. I think we can close this ticket.

@smuzaffar
Contributor Author

This has been fixed via cms-sw/cmsdist#7650. There is now also a unit test, #37072, which checks that das-selected-lumis.py can properly parse dasgoclient output.
