Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow users to supply more metadata, and pass it to executors. #502

Merged
merged 6 commits into from
Apr 20, 2021

Conversation

lgray
Copy link
Collaborator

@lgray lgray commented Apr 13, 2021

Fixes #479

Still a bit of a WIP.
At the very least this implementation does pass the metadata to the processors correctly.

@lgray lgray marked this pull request as draft April 13, 2021 13:38
@lgray
Copy link
Collaborator Author

lgray commented Apr 13, 2021

@nsmith- take a look. Lemme know if there's a better place to put things.

@lgray lgray marked this pull request as ready for review April 13, 2021 13:57
@lgray
Copy link
Collaborator Author

lgray commented Apr 13, 2021

also @alexander-held this is what you were more or less asking for right?

@alexander-held
Copy link
Contributor

Yes, this looks exactly like what I had in mind. Thanks for the work on this!

@lgray
Copy link
Collaborator Author

lgray commented Apr 14, 2021

Cool, could you give this branch a try and see if you immediately run into some problem?
I'd like to implement this more correctly, but if it works as it is now and doesn't break things... Well why not.

@alexander-held
Copy link
Contributor

I gave this a try and it worked as expected. I find this quite useful for bookkeeping, as it allows me to gather metadata in one central place as opposed to having to pass it through the processor constructor.

@lgray
Copy link
Collaborator Author

lgray commented Apr 15, 2021

@nsmith- you ok with the hacky version for now?

@nsmith-
Copy link
Member

nsmith- commented Apr 15, 2021

Can we encapsulate the metadata instead of putting it at top level at least?
e.g.

fileset = {
    "mydataset": {
        "files": ["blah.root"],
        "metadata": {"thing1": 1},
    }
}

@lgray
Copy link
Collaborator Author

lgray commented Apr 15, 2021

Sure, had been on the fence about it. Makes some things cleaner too.

@alexander-held
Copy link
Contributor

I guess the access would still be events["metadata"]["thing1"] instead of events["metadata"]["metadata"]["thing1"]?

@nsmith-
Copy link
Member

nsmith- commented Apr 15, 2021

Right, in that sense it is a little confusing since dataset is set (presumably if "dataset" were specified in the metadata it would override the current value?)

@lgray
Copy link
Collaborator Author

lgray commented Apr 15, 2021

can just make dataset a protected name...

@lgray
Copy link
Collaborator Author

lgray commented Apr 19, 2021

@nsmith- this look ok-enough for you?
@alexander-held please give this a try real quick.

Now you do:

filelist = {
        "DummyBad": {"treename": "Events", "files": [osp.abspath("tests/samples/non_existent.root")]},
        "ZJets": {"treename": "Events", "files": [osp.abspath("tests/samples/nano_dy.root")],
                  "metadata": {"checkusermeta": True, "someusermeta": "hello"} },
        "Data" : {"treename": "Events", "files": [osp.abspath("tests/samples/nano_dimuon.root")],
                  "metadata": {"checkusermeta": True, "someusermeta2": "world"} }
    }

and it will complain at you if you try to use any reserved words.

@alexander-held
Copy link
Contributor

I gave this a try, it worked as expected. One minor suggestion: In the error message

ValueError: Reserved word "treename" in fileset dictionary, please rename this entry!

it might be useful to include "metadata"? Otherwise it may sound as if the top-level "treename": "Events" is to blame.

@lgray
Copy link
Collaborator Author

lgray commented Apr 20, 2021

@nsmith- good from your side?

@lgray lgray merged commit 56f6d51 into CoffeaTeam:master Apr 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Passing custom fileset metadata through run_uproot_job
3 participants