Possible to integrate CMS's Combine workflow? #344
Comments
After having some discussions at the US LUA meeting I think that we might want to talk with Josh Bendavid (@bendavid) about this. |
Since this is still an open issue, let me mention that on 25.6 there will be a meeting including Josh and others to talk about the implementations of binned/template pdfs in order to move the community in this niche closer together. |
Tagging @mattbellis and @benkrikler here, given the email conversation that Matt started (thanks Matt!) RE: what would need to be done to extend the HistFactory JSON |
I'll assign myself on this for now, since I have a small side project that is looking into Combine (part of my SUSY role in ATLAS) and want to explore some code for this. |
So it seems that CMS has added some rather complete tutorials that describe the Combine model (HT @kpedro88): |
Together with #1188, it should be much more straightforward to build a Combine-like model. |
page @alexander-held |
I was curious about the possibility of converting datacards into pyhf workspaces and wrote a small utility https://github.com/alexander-held/datacard-to-pyhf. I do not know much about CMS Combine and the datacard format, so the implementation likely has a range of issues. The most glaring one is that it only supports single-bin channels (and no shape systematics) at the moment.
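To illustrate what a single-bin conversion of this kind involves, here is a minimal sketch. It is not the code from datacard-to-pyhf: the function name `datacard_to_workspace`, the example datacard, the POI name `mu`, and the measurement name are all hypothetical. It also assumes that a symmetric `lnN` uncertainty with value kappa can be approximated by a pyhf `normsys` modifier with `hi = kappa` and `lo = 1/kappa`, which need not match Combine's treatment exactly.

```python
import json

# Hypothetical single-bin counting datacard, used only for illustration.
DATACARD = """\
imax 1  number of channels
jmax 1  number of backgrounds
kmax 2  number of nuisance parameters
---
bin         ch1
observation 10
---
bin      ch1   ch1
process  sig   bkg
process  0     1
rate     2.0   8.0
---
lumi      lnN  1.02  1.02
bkg_norm  lnN  -     1.10
"""


def datacard_to_workspace(text):
    """Convert a single-bin, lnN-only datacard into a HistFactory-JSON-like dict."""
    # Tokenize, skipping blank lines, comments, separators, and the counters.
    rows = [ln.split() for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith(("#", "-"))]
    rows = [r for r in rows if r[0] not in ("imax", "jmax", "kmax")]

    bin_rows = [r[1:] for r in rows if r[0] == "bin"]
    proc_rows = [r[1:] for r in rows if r[0] == "process"]
    observed = next(r[1:] for r in rows if r[0] == "observation")
    rates = next(r[1:] for r in rows if r[0] == "rate")
    systs = [r for r in rows
             if r[0] not in ("bin", "process", "observation", "rate")]

    obs_bins, proc_bins = bin_rows[0], bin_rows[1]
    names, indices = proc_rows[0], proc_rows[1]

    channels = {b: [] for b in obs_bins}
    for col, (b, name, idx, rate) in enumerate(
            zip(proc_bins, names, indices, rates)):
        modifiers = []
        if int(idx) <= 0:  # process index <= 0 marks a signal process
            modifiers.append({"name": "mu", "type": "normfactor", "data": None})
        for sname, stype, *vals in systs:
            if stype != "lnN":
                raise NotImplementedError(f"unsupported systematic type: {stype}")
            if vals[col] == "-":  # "-" means the process is unaffected
                continue
            kappa = float(vals[col])  # asymmetric "down/up" values not handled
            # Assumption: approximate lnN by normsys with hi=kappa, lo=1/kappa.
            modifiers.append({"name": sname, "type": "normsys",
                              "data": {"hi": kappa, "lo": 1.0 / kappa}})
        channels[b].append({"name": name, "data": [float(rate)],
                            "modifiers": modifiers})

    return {
        "channels": [{"name": b, "samples": s} for b, s in channels.items()],
        "observations": [{"name": b, "data": [float(n)]}
                         for b, n in zip(obs_bins, observed)],
        "measurements": [{"name": "meas",
                          "config": {"poi": "mu", "parameters": []}}],
        "version": "1.0.0",
    }


workspace = datacard_to_workspace(DATACARD)
print(json.dumps(workspace, indent=2))
```

If pyhf is installed, the resulting dictionary could then be passed to `pyhf.Workspace(workspace)` for schema validation and model building. Shape systematics and multi-bin channels are deliberately left out of this sketch.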
The paper reports |
Awesome, that's a great start. Taking on the simplest example and successively adding features was also pyhf's approach in general. Tagging also @clelange |
In case it's helpful, @andrzejnovak put together a conda recipe for combine in cms-analysis/HiggsAnalysis-CombinedLimit#648 |
Are you able to run the "standalone" version of combine (works on a CernVM) [1]? That might help to better compare the expected results from combine vs pyhf (since the tutorial cards, even the more advanced ones, are not identical to the real thing in the papers). [1] http://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/#standalone-version-of-combine |
Could a standalone Docker image also be possible? Having no CVMFS dependence at all would be useful to allow running validations anywhere. |
@alexander-held check the PR Nick linked |
Having the conda version available is great! I view a ready-to-use Docker image as complementary to that (I guess with conda there is compilation involved?). |
Sure, just wanted to point out that with the conda env you can build the image on the fly as well without having to access cvmfs when compiling stuff. |
Any way to run standalone is fine. I'm not sure how well synched the version with conda env is with the main branch (the 102x vs 112x), but for this I don't think it really matters too much. Just wasn't sure whether the comparison by @alexander-held was a direct comparison of a combine run or not. |
@nucleosynthesis Yes, for the comparison with While writing this comment I noticed one discrepancy: my Would you recommend the Is there a good place to ask technical questions about Combine model building details (as a non-CMS member)? |
CombineHarvester is probably overkill, though it should be up to date. A while ago we made a python dumping option in the Datacard parser. For discussions / Q's to the combine team, probably the easiest thing is to submit an issue here and add the label "question" for now. |
I had not noticed |
Just dumping here that @ajgilbert gave a session on Combine (:+1:) at the first hands-on workshop on publication of statistical models where the last 4 slides are relevant to pyhf and Combine interop and probability model preservation. |
There's a CMS Top workshop taking place this week where Combine will be discussed. It is at a time that I can't attend, but I'm going to try to reach out to the speaker(s) to see if there's any interest in understanding how / if we can create examples comparing and contrasting combine and pyhf. Building some hypersimple comparison examples is on a very long to-do list of mine. :) |
Hi! I don't know if this can be useful, but a while ago I started working on @alexander-held's repo with the intention of adding support for shape-based analyses datacards. You can find it here and the output can be tested e.g. with this datacard. A few huge disclaimers:
Question
CMS uses a tool called Combine which is built on top of RooStats/RooFit.
It seems very possible to provide a datacard2json tool that translates the datacard into something usable by pyhf, since CMS' workspace appears to be defined as a plaintext file called a datacard.