RTMinuit in rootpy? #57

Open
cdeil opened this Issue Nov 9, 2012 · 13 comments

3 participants

@cdeil
the rootpy project member

We already have included root_numpy in rootpy, see issue #13 . @piti118 has made another piece of ROOT / python goodness, https://github.com/piti118/RTMinuit , a python-friendly ROOT.TMinuit wrapper.

@piti118 Are we allowed to include / modify it in rootpy?

There's also https://github.com/piti118/dist_fit , but I don't know if it is considered too specialized for rootpy. @ndawe and others: what do you think?

@cdeil cdeil was assigned Nov 9, 2012
@ndawe
the rootpy project member

Yes, I had a look at this RTMinuit recently... Wrapping TMinuit in rootpy has been on my TODO list for a while. This would be a great feature to have in rootpy.

My opinion is that dist_fit is too specialized. rootpy should continue to provide mainly just a layer on top of PyROOT while not replicating the abilities of other packages.

TMinuit should definitely be wrapped in rootpy though. RooStats would also be nice...

@piti118
the rootpy project member

Sure you can include/modify RTMinuit and dist_fit.

RTMinuit might inherit GPL from ROOT though.
But dist_fit is purely mine and I like it to be public domain-ish like MIT/BSD license.

RTMinuit:

The interface right now will get most jobs done but is not complete.
However, my pet project right now is to remove ROOT dependency from RTMinuit.

I don't remember how I wrote the setup.py but it might depend on cython.(check that when you include it in rootpy though)

I found python really lacking in good robust and fast Minimizer that gives error matrix. I know there is PyMinuit and PyMinuit2. I do not like the output printing of both since it doesn't seem to point out exactly what went wrong in minimization: variable hit the limit and which variable got so far off to nan. This is one area where ROOT get it absolutely right. The other one they get it right is the IO(which is why there is root_numpy). But plotting.... we all know how well that works.

But I'd say RTMinuit would not be very useful without Likelihood/Chi^2 functor from dist_fit though. It's given in tutorial how one would make a chi^2 one. Likelihood, simple in concept, but to implement it, one will need to get edges details, which is not so trivial and happens right at the minimum, right though. I learned this first hand.

dist_fit:

I'd say it's very very unpolished. API is very inconsistence. I wrote it to get my thesis done(which I should be doing right now) replacing ROOFIT, which is good software but.. I'd say it make difficult thing easy but easy thing more difficult that it should(due to the way C++ work) even with Python wrapper and ROOFIT workspace.

This is the area where Python is seriously lacking in general not just for high energy physics. I can see great need for it. Scipy fitting is subpar in my opinion. It doesn't seem to give error matrix which is crucial for high energy physics analysis or any serious statistical analysis. I think it only does something like chi^2 minimization and not likelihood minimization. I might be too stupid to make it work I don't know. The interface is sort of clunky for using custom distribution. I'm not even sure how to get linear+guassian done in scipy (and making sure that my distribution is properly normalized for given range). Plus, we physicists use a exotic function like crytalball, convoluted Breit–Wigner, ARGUS which is not included in scipy. Take distfunc.pyx, if you want those I wrote some of them that I use in my thesis.

Because of this need in general, I want to remove RTMinuit from ROOT dependency. Lots of other people will benefit from it. And ROOT is a big dependency. Most people in other field don't have it installed. But, again, feel free to take RTMinuit.

@cdeil
the rootpy project member

@piti118 Thanks for making RTMinuit and dist_fit freely available.

I (like probably most scientists / engineers) also want better / simpler fitting tools in Python, especially when it comes to estimating the error matrix or computing profile likelihood errors, but also for implementing models and cost functions and interactive debugging of fit convergence ...

Just for reference, here are some existing Python fitting packages:

As you say scipy is seriously lacking.
I use Sherpa a lot and it is very nice, but it is hard to install and some things are probably too astronomy-specific to become mainstream.

I don't understand the problem you describe with PyMinuit and PyMinuit2, can you give an example to RTMinuit where PyMinuit and PyMinuit2 have problems, but RTMinuit works?
Just a thought: would it be better to integrate Minuit in scipy or keep your new Python Minuit wrapper a standalone project?

I know this is getting way off track, but I have been thinking about this lately: do you know any parallel optimizer projects, e.g. based on ZeroMQ / ipython.parallel that would be able to use all cores in my Computer or a 100 CPU cluster?

Let's discuss more what we want for rootpy at our upcoming coding sprint ...

@piti118
the rootpy project member

One problem with PyMinuit is that it doesn't support callable object. I patched it https://github.com/piti118/pyminuit/commits/master. See example in tutorial where I show how to make a generic chi^2 that takes function and data as argument. This is the way it should be. Everyone shouldn't be writing their own chi^2/likelihood function.

But, again it's just my taste that I don't like how it print stuff out while minimizing. Remember how TMinuit print out every 100 call something like

 MIGRAD MINIMIZATION HAS CONVERGED.
 MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
 FCN=6.7306e-18 FROM MIGRAD    STATUS=CONVERGED      35 CALLS          36 TOTAL
                     EDM=1.34612e-17    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                   STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  x            2.00000e+00   1.00000e+00   4.88281e-04  -1.82002e-09
   2  y            3.00000e+00   1.00000e+00   4.88281e-04   2.50530e-10
   3  z            4.00000e+00   1.00000e+00   4.88281e-04  -4.85254e-09

And it tells you hey variable x is at the limit(not in this example). PyMinuit prints out something like:

 FCN Result | Parameter values
-------------+--------------------------------------------------------
     10107.1 |           10           10
     10107.1 |      10.0049           10
     10107.1 |      9.99509           10

To me, Root Minuit is much more informative. Printing stuff couple tightly into both package modifying it will need quite an understand on how the code works.

For parallelizing job:

I know this thing should be parallel in nature. I think I tried using cython builtin parallel capability(based on MPL I think) but... since minuit has to support python callable object and there seems to be no way to tell cython that "hey this python function is actually reentrant(doesn't change any field inside object). So quit complaining and just compile it." And there is that dread GIL(workaround-able with zeromq though).

I believe this is correct. With unmodified TMinuit, as implemented in ROOFIT, parallelizing jobs is only useful for unbinned fit. Where it splits the data in multiple chunks and give it to each worker. For binned fit, I don't know but I typically have only 50-100 bins, for 1D fit. Overhead from zeromq will definitely kill the usefulness in this case.

Any of the package you list do unbinned/binned likelihood fitting? I took a quick look but don't find any.

@cdeil
the rootpy project member

Sherpa has binned likelihood and chi2 statistics. I don't think they have unbinned fit statistics implemented, though. And you can implement any model in Python or C or Fortran and use it in their fitting framework. Everything is modular, you can use any optimizer with any fit statistic and model and you can extend it with your own optimisers / fit statistics / models. Also they have great documentation. Check this out:
http://cxc.harvard.edu/sherpa4.4/statistics/index.html
http://cxc.cfa.harvard.edu/sherpa/methods/index.html
http://cxc.cfa.harvard.edu/sherpa/threads/user_model/
http://python4astronomers.github.com/fitting/sherpa.html

The only "problem" (a.k.a. feature for X-ray astronomers) is that all their data and model classes are X-ray astronomy specific, but if you're willing to overlook this I think it's the best modelling / fitting package available in Python. Although I have to admit I haven't worked with RooFit yet, it looks pretty impressive but with a steep learning curve and possibly slow from Python?

I once started writing a Python optimizer tutorial where I solved a few simple fitting problems (using small datasets, simple analytical models and binned likelihood and chi2 fit statistic) with each tool. It was very hard to get consistent results and understand what each tool does in detail (e.g. different ways to integrate models over bins or to handle parameter bounds). If you're interested I can clean that up and put it on github.

Another project I have started to work on, http://gammalib.sourceforge.net , does have binned and unbinned likelihood using a Levenberg-Marquardt optimizer. It's C++ with swig Python bindings and GPL licensed. Although like Sherpa, the data and model classes are astronomy specific (gamma-ray astronomy in this case).

@piti118
the rootpy project member

ROOFIT is pure C++. It's very fast. But, yes very very very steep learning curve. But, I'd say it's very good designed given it's in C++.

Is sherpa written in C/C++? Can custom function be plugged right(and normalized) in like in dist_fit tutorial involving two gaussian function added together?

Piti

@cdeil
the rootpy project member

I don't think there's an official sherpa repo clone on github, but if you want to have a look, here's one:
https://github.com/cxcdev/sherpa
If you look at the github code analysis (top right on the main page) you see it's 59% python, 30% C++, 8% C and 3% Fortran. E.g. their optimizers are self-written (but already with a decade of testing by many users) C++, I think it's all GPL licensed.

Here's how to define any model in Python or C:
http://cxc.harvard.edu/sherpa/threads/user_model/

But for your double-Gaussian example you'd use the built-in Gauss1D or NormalizedGauss1D object and combine it in a BinaryOpModel. While it's all object-oriented underneath, most user's only call functions that instantiate objects and keep track of different data sets and models via a global session state. It's quite a bit of python magic underneath, but the advantage is that the end-user can learn to do complex models and fits in minutes. Your example would look something like this:

dataspace1d(-5, 15, numbins=100) # define dataspace
set_model(gauss1d.a + gauss1d.b) # define model
a.ampl, a.pos, a.fwhm = ... 
# note that they use the full-width at half maximum, not sigma
# http://cxc.harvard.edu/sherpa/ahelp/gauss1d.html
b.ampl, b.pos, b.fwhm = ...
fake() # Fake some poisson count data
set_stat('cash') # We want to do a poisson likelihood fit
fit() # minimize
covar() # compute HESSE-style errors
conf() # compute MINOS-style errors
plot_fit() # Makes a nice plot of data & model

The Sherpa architecture is described a bit here:
http://conference.scipy.org/proceedings/scipy2009/paper_8/full_text.pdf

If I find the time I'll redo your RTMinuit and dist_fit examples with Sherpa in the next days...

@cdeil
the rootpy project member

For reference: found two more Python fitting packages I haven't tried yet:

I think it's worth doing the survey of how existing Python modelling and fitting packages to see what their capabilities and concepts are:
https://github.com/cdeil/python_modelling_fitting

@piti118
the rootpy project member

It looks very similar to dist_fit. It was probably built with the same philosophy. User should be able to specify just data and PDF and have it fit. No user should be forced to use complicated wrapper to satisfy any preset interface. Any def f(x,y): blah should just work. This is possible in python since it knows the name and position of all the arguments. And things that should be fast should not be written in Python.

I'm biased but I like my interface more X-D. Since you have been using sherpa. May be you can comment on the interface of dist_fit which is a mixed of ROOFIT and PyMinuit interface. We can make a really good fitting tool combining all the goodness.

@piti118
the rootpy project member

A quick look at all of them. sherpa and dist_fit has lowest learning curve: they are very very similar. I probably wouldn't have written dist_fit if I found sherpa.

The Rest are either:

  • Need more than 5 lines to do a custom fit. Usually this means it needs some wrapper.
  • Doesn't do likelihood fit. Chi2 is good for most people though... I wouldn't blame them.
  • Doesn't give you error matrix.
  • Repeat yourself like crazy. I define PDF with def f(x,y) already why should I have to tell minimizer the name of argument again?
@cdeil
the rootpy project member

I'll have a closer look at dist_fit tomorrow, I only tried it very briefly today.
Can it do 2D models?
Can it do convolved models, i.e. e.g. take a Gaussian detector resolution into account?

@piti118
the rootpy project member

No 2D model. I don't use it. So I didn't implement it. Although one could use the RTMinuit interface to implement one.

Yes it can convolve. (written in cython didn't use fftw and such though just do a straight multiply shift add them up). No specialized gaussian convolution though just normal straight up convolution.

@cdeil
the rootpy project member

RTMinuit (wrapping the Minuit in ROOT) has evolved into iminuit (wrapping standalone Minuit).

I don't know if it would still make sense to include a more Pythonic Minuit (or also for the other optimizers) wrapper in rootpy.

Loosely related: #166

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment