Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RF] RooCmdArg pythonization drops temporary RooArgSets too early #11421

Closed
1 task done
elusian opened this issue Sep 23, 2022 · 3 comments · Fixed by #11425
Closed
1 task done

[RF] RooCmdArg pythonization drops temporary RooArgSets too early #11421

elusian opened this issue Sep 23, 2022 · 3 comments · Fixed by #11425

Comments

@elusian
Copy link
Contributor

elusian commented Sep 23, 2022

  • Checked for duplicates

Describe the bug

When using the pythonization of some RooCmdArgs together with the RooArgSet pythonization, the created set is dropped too early.
It does not always cause a segfault (depends on the RooCmdArg) but it is visible in valgrind.
I've seen this happen with Minos, Slice, Project, Parameters and SelectVar, while I tested Conditional, Constrain, ExternalConstraints, Components and VisualizeError and they show no issue (not even in valgrind).
I think the difference is wether the RooArgSet is stored as a RooArgSet in the RooCmdArg (works fine) or as a TObject* (dropped early).
I've also noticed that this happens only when you combine the two pythonizations, e.g. both

model.fitTo(data, Minos = ROOT.RooArgSet(parameter))
model.fitTo(data, ROOT.RooFit.Minos({parameter}))

work fine, only

model.fitTo(data, Minos = {parameter})

is affected.

Expected behavior

No use after free

To Reproduce

import ROOT

x = ROOT.RooRealVar('x', '', 0, 1)
y = ROOT.RooRealVar('y', '', 0, 1)

mu = ROOT.RooRealVar("mu", "", 0.5, 0, 1)
gx = ROOT.RooGaussian("gx", "", x, mu, ROOT.RooFit.RooConst(0.2))
gy = ROOT.RooGaussian("gy", "", y, mu, ROOT.RooFit.RooConst(0.2))
g = ROOT.RooProdPdf("g", '', [gx, gy])

# works
condg1 = ROOT.RooProdPdf("g", '', {gx}, Conditional = (gy, {y}))
condg2 = ROOT.RooProdPdf("g", '', {gx}, Conditional = ({gy}, y))
condg3 = ROOT.RooProdPdf("g", '', {gx}, Conditional = ({gy}, {y}))

data = g.generate({x,y}, NumEvents = 10000)

# often segfaults, valgrind reports problems
g.fitTo(data, Minos = {mu}, PrintLevel = -1)

# both of these work
g.fitTo(data, ROOT.RooFit.Minos({mu}), PrintLevel = -1)
g.fitTo(data, Minos = ROOT.RooArgSet(mu), PrintLevel = -1)

# works
constr = ROOT.RooGaussian('c', '', mu, ROOT.RooFit.RooConst(0.5), ROOT.RooFit.RooConst(0.01))
g.fitTo(data, ExternalConstraints = {constr}, PrintLevel = -1)

# works
gWithConstr = ROOT.RooProdPdf('gWC', '', [g, constr])
gWithConstr.fitTo(data, Constrain = {mu}, PrintLevel = -1)

frame = x.frame()

# works
g.plotOn(frame, Components = {g})

# segfault/valgrind errors
g.plotOn(frame, Slice = {y})

# segfault/valgrind errors
g.plotOn(frame, Project = {y})

fitres = g.fitTo(data, PrintLevel = -1, Save = True)
g.plotOn(frame, VisualizeError = (fitres, {x}))

# valgrind reports problems
g.paramOn(frame, Parameters = {mu})

# segfault
datax = data.reduce(SelectVars = {x})

Setup

ROOT 6.26.07 from LCG dev4
ROOT master from LCG dev3

Additional context

@guitargeek
Copy link
Contributor

Hi, thanks for reporting this and trying out the new pythonizations! I guess you also know about to/from_numpy/pandas, which is quite useful too I think.

This RooArgSet issue is known, I just didn't fix it for all the command arguments yet. But I will do it know.

Since you seem to be a RooFit power use I wanted to ask you: do you have any suggestions for further Pythonizations or PyROOT-exclusive RooFit features?

guitargeek added a commit to guitargeek/root that referenced this issue Sep 23, 2022
As reported in issue root-project#11421, the fact that in the RooFit command
arguments, the `RooArgSet`s are often stored by pointer is still a
problem. I attempted before to circumvent the problem by detecting
potential lifetime issues by having a dedicated `RooArgSet&&` overloads
that resulted in RooCmdArgs that owned copied of the RooArgSets.

However, this still didn't work for the case where the Python collection
pythonizations are used in PyROOT, because there the `RooArgSet &&`
overloads are not hit.

I realized now that the `RooCmdArg` class already has the slots to store
RooArgSets by value, which was alredy used by a few RooFit command
arguments. When these `set` slots are used, there are no ownership
issues.

This commit suggests to always use these `set` slots for RooArgSets.
This entailed changing the functions that make use of the command
arguments, migrating from `defineObject` to `defineSet`.

All the hacks to get owned copies of the RooArgSets and `RooArgSet&&`
overloads can now be removed.

Now that the slot indices for the sets was changed in many RooFit
command arguments, there is only the caveat that command arguments are
not backwards compatible anymore. However, this should not be any
problem because `RooCmdArgs` are not supposed to be written to files
anyway. To explicitely disallow this and prevent any silet backwards
compatibility issue, the class version of the RooCmdArg was set to zero
to disable IO.

Closes root-project#11421.
guitargeek added a commit to guitargeek/root that referenced this issue Sep 23, 2022
As reported in issue root-project#11421, the fact that in the RooFit command
arguments, the `RooArgSet`s are often stored by pointer is still a
problem. I attempted before to circumvent the problem by detecting
potential lifetime issues by having a dedicated `RooArgSet&&` overloads
that resulted in RooCmdArgs that owned copied of the RooArgSets.

However, this still didn't work for the case where the Python collection
pythonizations are used in PyROOT, because there the `RooArgSet &&`
overloads are not hit.

I realized now that the `RooCmdArg` class already has the slots to store
RooArgSets by value, which was alredy used by a few RooFit command
arguments. When these `set` slots are used, there are no ownership
issues.

This commit suggests to always use these `set` slots for RooArgSets.
This entailed changing the functions that make use of the command
arguments, migrating from `defineObject` to `defineSet`.

All the hacks to get owned copies of the RooArgSets and `RooArgSet&&`
overloads can now be removed.

Now that the slot indices for the sets was changed in many RooFit
command arguments, there is only the caveat that command arguments are
not backwards compatible anymore. However, this should not be any
problem because `RooCmdArgs` are not supposed to be written to files
anyway. To explicitely disallow this and prevent any silet backwards
compatibility issue, the class version of the RooCmdArg was set to zero
to disable IO.

Closes root-project#11421.
@elusian
Copy link
Contributor Author

elusian commented Sep 24, 2022

Hi, I thought this had been fixed with the addition of RooCmdArg::take (which is used even in the failing arguments), so this seemed to be a new bug.

I know about the numpy/pandas conversion, but I have not tried them yet as my analysis is TTree based.
While I can be considered a RooFit power user, I don't have that much experience with its PyROOT bindings.
I started using RooFit from PyROOT only relatively recently, because especially in ROOT 6.26 it's much nicer.

From my experience now:

  • sometimes I would have found useful to be able to pass python number anywhere a RooAbsReal is required, although I suspect this may require a pythonization for each pdf.
  • RooSimultaneous map constructor does not accept a python dictionary yet
  • one thing that surprised me a couple of times at the beginning is that RooAbsArg does not keep its servers alive from the python GC so you actually need the same workarounds as in C++ (importing frequently to a workspace).
    However, I suspect that if they did keep servers alive, server redirection would likely lead to desync between the C++ and python views of the graph.

@guitargeek
Copy link
Contributor

Thanks a lot for your comment!

  • sometimes I would have found useful to be able to pass python number anywhere a RooAbsReal is required, although I suspect this may require a pythonization for each pdf

Yes, I would like this too, but it's technically not easy to implement without changing the source for all PDFs. Maybe I will have an idea at some point as I learn more about PyROOT, but for now I have none

  • RooSimultaneous map constructor does not accept a python dictionary yet

That's a very good idea!

  • one thing that surprised me a couple of times at the beginning is that RooAbsArg does not keep its servers alive from the python GC so you actually need the same workarounds as in C++ (importing frequently to a workspace).
    However, I suspect that if they did keep servers alive, server redirection would likely lead to desync between the C++ and python views of the graph.

That's a pretty good idea too. I guess one can simply create new Python references to each server that are set as an attribute of the server, such that they are always kept alive. But you're right, server redirection would break this, unless there are Pythonizations for that one too.... So I still need to think if this is worth it, also considering that users can also use the RooWorkspace factory interface to create PDFs, which doesn't have this problem.

guitargeek added a commit that referenced this issue Sep 26, 2022
As reported in issue #11421, the fact that in the RooFit command
arguments, the `RooArgSet`s are often stored by pointer is still a
problem. I attempted before to circumvent the problem by detecting
potential lifetime issues by having a dedicated `RooArgSet&&` overloads
that resulted in RooCmdArgs that owned copied of the RooArgSets.

However, this still didn't work for the case where the Python collection
pythonizations are used in PyROOT, because there the `RooArgSet &&`
overloads are not hit.

I realized now that the `RooCmdArg` class already has the slots to store
RooArgSets by value, which was alredy used by a few RooFit command
arguments. When these `set` slots are used, there are no ownership
issues.

This commit suggests to always use these `set` slots for RooArgSets.
This entailed changing the functions that make use of the command
arguments, migrating from `defineObject` to `defineSet`.

All the hacks to get owned copies of the RooArgSets and `RooArgSet&&`
overloads can now be removed.

Now that the slot indices for the sets was changed in many RooFit
command arguments, there is only the caveat that command arguments are
not backwards compatible anymore. However, this should not be any
problem because `RooCmdArgs` are not supposed to be written to files
anyway. To explicitely disallow this and prevent any silet backwards
compatibility issue, the class version of the RooCmdArg was set to zero
to disable IO.

Closes #11421.
vgvassilev pushed a commit to vgvassilev/root that referenced this issue Oct 1, 2022
As reported in issue root-project#11421, the fact that in the RooFit command
arguments, the `RooArgSet`s are often stored by pointer is still a
problem. I attempted before to circumvent the problem by detecting
potential lifetime issues by having a dedicated `RooArgSet&&` overloads
that resulted in RooCmdArgs that owned copied of the RooArgSets.

However, this still didn't work for the case where the Python collection
pythonizations are used in PyROOT, because there the `RooArgSet &&`
overloads are not hit.

I realized now that the `RooCmdArg` class already has the slots to store
RooArgSets by value, which was alredy used by a few RooFit command
arguments. When these `set` slots are used, there are no ownership
issues.

This commit suggests to always use these `set` slots for RooArgSets.
This entailed changing the functions that make use of the command
arguments, migrating from `defineObject` to `defineSet`.

All the hacks to get owned copies of the RooArgSets and `RooArgSet&&`
overloads can now be removed.

Now that the slot indices for the sets was changed in many RooFit
command arguments, there is only the caveat that command arguments are
not backwards compatible anymore. However, this should not be any
problem because `RooCmdArgs` are not supposed to be written to files
anyway. To explicitely disallow this and prevent any silet backwards
compatibility issue, the class version of the RooCmdArg was set to zero
to disable IO.

Closes root-project#11421.
@root-project root-project deleted a comment from github-actions bot Oct 3, 2022
@root-project root-project deleted a comment from github-actions bot Oct 3, 2022
@root-project root-project deleted a comment from github-actions bot Oct 3, 2022
@root-project root-project deleted a comment from github-actions bot Oct 3, 2022
@root-project root-project deleted a comment from github-actions bot Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants