This repository has been archived by the owner on May 28, 2021. It is now read-only.
Duplicate entries in JSON output #4287
Robin,
there is nothing wrong with the output, since DAS is an "aggregation system". It
queries all available APIs from the different CMS data services and aggregates
their results into a single output. In your particular case, DAS queried the DBS
datasets and datasetlist APIs (you can see them under the services key in the
JSON output). If you place a file query, it will query the DBS and PhEDEx APIs;
for a run query, it will query DBS, RunRegistry, ConditionDB, etc. It is hard to
decide which API is the "main" one, because different APIs serve different use
cases, e.g. they may provide different details or different pieces of
information. This is by design in DAS.
The plain data format does not show duplicates, though, to make it convenient
for end users to cut and paste.
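For anyone who wants one entry per dataset from the JSON output as well, here is a minimal Python sketch of a client-side merge. It assumes the record layout shown in this thread (a `das.services` list and a `dataset` list whose items carry a `name`). `merge_das_records` is a hypothetical helper, not part of DAS or dasgoclient, and the dataset name in the sample is a shortened placeholder.

```python
def merge_das_records(records, key="dataset"):
    """Collapse records that describe the same entity into one entry per name.

    Assumption: each record looks like
    {"das": {"services": [...]}, "dataset": [{"name": ..., ...}]}.
    """
    merged = {}
    for rec in records:
        services = rec.get("das", {}).get("services", [])
        for item in rec.get(key, []):
            name = item.get("name")
            if name is None:
                continue  # skip items without a usable name
            entry = merged.setdefault(name, {"services": [], "fields": {}})
            # Remember every service that contributed to this entry
            for svc in services:
                if svc not in entry["services"]:
                    entry["services"].append(svc)
            # The first value seen for a field wins; later records only add new fields
            for field, value in item.items():
                entry["fields"].setdefault(field, value)
    return merged


# Placeholder data modelled on the two records in this thread
sample = [
    {"das": {"services": ["dbs3:datasetlist"]},
     "dataset": [{"name": "/Primary/Processed-v1/MINIAODSIM",
                  "dataset_id": 13294650}]},
    {"das": {"services": ["dbs3:datasets"]},
     "dataset": [{"name": "/Primary/Processed-v1/MINIAODSIM",
                  "dataset_id": 13294650, "status": "VALID"}]},
]
merged = merge_das_records(sample)
for name, entry in merged.items():
    print(name, entry["services"])
```

Keeping the list of contributing services in each merged entry preserves the information about which API supplied the data.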
Does it answer your concern?
On Fri, May 18, 2018 at 1:06 PM, Robin ***@***.***> wrote:
Dear developers,
I am noticing duplicate entries when performing a simple dataset query
with das_client, but only when asking for the JSON output format.
e.g. If I do:
>> dasgoclient -query="dataset=/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_*/MINIAODSIM"
/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM
but if I do
>> dasgoclient -json -query="dataset=/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_*/MINIAODSIM"
[
***@***.******@***.***","creation_date":1480970022,"creation_time":1480970022,"data_tier_name":"MINIAODSIM","dataset_access_type":"VALID","dataset_id":13294650,"datatype":"mc","last_modification_date":1481196236,"last_modified_by":"vlimant","modification_time":1481196236,"modified_by":"vlimant","name":"/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM","physics_group_name":"NoGroup","prep_id":"BTV-RunIISummer16MiniAODv2-00029","primary_dataset.name":"QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8","primary_ds_name":"QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8","primary_ds_type":"mc","processed_ds_name":"RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1","processing_version":1,"status":"VALID","xtcrosssection":null}],"qhash":"04bf17ae46d481b865e64beb664fafa9"} ,
***@***.***","creation_date":1480970022,"data_tier_name":"MINIAODSIM","dataset_access_type":"VALID","dataset_id":13294650,"last_modification_date":1481196236,"last_modified_by":"vlimant","name":"/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM","physics_group_name":"NoGroup","prep_id":"BTV-RunIISummer16MiniAODv2-00029","primary_ds_name":"QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8","primary_ds_type":"mc","processed_ds_name":"RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1","processing_version":1,"xtcrosssection":null}],"qhash":"04bf17ae46d481b865e64beb664fafa9"}
]
Both entries have the same dataset name, same dataset_id, same prep_id,
etc. Diffing between the two entries I get the following:
{
"das":{
- "expire":1526662228,
+ "expire":1526662227,
"instance":"prod/global",
"primary_key":"dataset.name",
"record":1,
"services":[
- "dbs3:datasetlist"
+ "dbs3:datasets"
]
},
"dataset":[
{
"acquisition_era_name":"RunIISummer16MiniAODv2",
***@***.***",
+ ***@***.***",
"creation_date":1480970022,
+ "creation_time":1480970022,
"data_tier_name":"MINIAODSIM",
"dataset_access_type":"VALID",
"dataset_id":13294650,
+ "datatype":"mc",
"last_modification_date":1481196236,
"last_modified_by":"vlimant",
+ "modification_time":1481196236,
+ "modified_by":"vlimant",
"name":"/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM",
"physics_group_name":"NoGroup",
"prep_id":"BTV-RunIISummer16MiniAODv2-00029",
+ "primary_dataset.name":"QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8",
"primary_ds_name":"QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8",
"primary_ds_type":"mc",
"processed_ds_name":"RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1",
"processing_version":1,
+ "status":"VALID",
"xtcrosssection":null
}
],
so it looks like some fields have had their names changed, but otherwise
it's the same dataset.
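A comparison like the one above can also be done programmatically. The following is a minimal Python sketch that reports which fields appear in only one of two dataset entries and which shared fields differ. `field_differences` is a hypothetical helper name, and the two dictionaries are a shortened subset of the fields shown in the diff.

```python
def field_differences(first, second):
    """Return (only_in_first, only_in_second, differing) as sorted lists of keys."""
    keys_first, keys_second = set(first), set(second)
    only_first = sorted(keys_first - keys_second)
    only_second = sorted(keys_second - keys_first)
    # Shared keys whose values disagree between the two entries
    differing = sorted(k for k in keys_first & keys_second
                       if first[k] != second[k])
    return only_first, only_second, differing


# Shortened subset of the fields from the two entries
from_datasets = {"dataset_id": 13294650, "creation_date": 1480970022,
                 "creation_time": 1480970022, "status": "VALID"}
from_datasetlist = {"dataset_id": 13294650, "creation_date": 1480970022}

only_first, only_second, differing = field_differences(from_datasets,
                                                       from_datasetlist)
print("only in first:", only_first)
print("only in second:", only_second)
print("different values:", differing)
```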
I also tried adding status=VALID to my query as that is one of the
differences, but it returned an error:
>> das_client -query="dataset dataset=/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_*/MINIAODSIM status=VALID" -json
[
{"das":{"expire":1526663071,"instance":"prod/global","primary_key":"dataset.name","record":1,"services":["dbs3:datasetlist"]},"dataset":[{"error":"DBS unable to unmarshal the data into DAS record, api=datasetlist, data={\"exception\": 400, \"message\": \"Invalid Input Key status...\", \"type\": \"HTTPError\"}, error=json: cannot unmarshal object into Go value of type []mongo.DASRecord","name":null}],"qhash":"51862fd0a82574188f7a74ef70c978de"} ,
***@***.******@***.***","creation_date":1480970022,"creation_time":1480970022,"data_tier_name":"MINIAODSIM","dataset_access_type":"VALID","dataset_id":13294650,"datatype":"mc","last_modification_date":1481196236,"last_modified_by":"vlimant","modification_time":1481196236,"modified_by":"vlimant","name":"/QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8/RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1/MINIAODSIM","physics_group_name":"NoGroup","prep_id":"BTV-RunIISummer16MiniAODv2-00029","primary_dataset.name":"QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8","primary_ds_name":"QCD_Pt-15to20_MuEnrichedPt5_TuneCUETP8M1_13TeV_pythia8","primary_ds_type":"mc","processed_ds_name":"RunIISummer16MiniAODv2-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v6-v1","processing_version":1,"status":"VALID","xtcrosssection":null}],"qhash":"51862fd0a82574188f7a74ef70c978de"}
]
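When one service rejects a query and another accepts it, as in the output above, the failed record can be separated from the usable ones on the client side. This is a minimal Python sketch that assumes failed records carry an `error` key inside their `dataset` items, as seen above. `split_errors` is a hypothetical helper name and the error message in the sample is shortened.

```python
def split_errors(records, key="dataset"):
    """Split records into (ok, failed) based on an "error" field in their items."""
    ok, failed = [], []
    for rec in records:
        items = rec.get(key, [])
        if any("error" in item for item in items):
            failed.append(rec)
        else:
            ok.append(rec)
    return ok, failed


# Shortened stand-ins for the two records returned above
records = [
    {"das": {"services": ["dbs3:datasetlist"]},
     "dataset": [{"error": "DBS unable to unmarshal the data into DAS record",
                  "name": None}]},
    {"das": {"services": ["dbs3:datasets"]},
     "dataset": [{"name": "/Primary/Processed-v1/MINIAODSIM",
                  "status": "VALID"}]},
]
ok, failed = split_errors(records)
print(len(ok), "usable record(s),", len(failed), "failed record(s)")
```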
Please let me know if there's any other info you need. For reference I'm
using:
>> dasgoclient -version
Build: git=v01.01.09 go=go1.9.2 date=2018-05-18 18:59:20.970684859 +0200 CEST m=+0.244268001
Thanks,
Robin
OK, I understand - it must be tricky to choose a "default" one given so many different use cases :)