Op/optimization TODO: log determinant #150
Comments
Related: there is an optimization in linalg for the log det of PSD matrices.
David, can you be more explicit about what is needed? A new Op that calls numpy.slogdet, and an optimization that replaces log(abs(det(X))) with it? We need to do some speed comparisons before enabling it by default.
This would be a stability optimization; the speed shouldn't prevent us from enabling it.
I would be interested to know how much slower it is. Having a stability optimization that is 10 times slower is not ideal. In that case, we should document it well and tell people who don't want it how to disable it.
We should be able to replace log(det(X)) with an slogdet Op as well, and just raise an error if the sign is negative (I guess this would be accomplished by an …).
@nouiz: Newer versions of NumPy actually call … We have an old version of NumPy at the lab which does not include … I actually just did a comparison against the old implementation:

In [28]: timeit linalg.slogdet(X)
100 loops, best of 3: 15.1 ms per loop
In [29]: timeit det(X)  # old implementation, copied from NumPy 1.4.1
100 loops, best of 3: 15.4 ms per loop
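A small demo (not from the thread; the matrix here is a hypothetical illustration) of why this is a stability optimization and not just a speed question: for a large, perfectly well-behaved matrix, det(X) itself overflows float64, so log(abs(det(X))) returns inf, while slogdet stays finite.

```python
import numpy as np

# Illustrative matrix: det(X) = 3**1000, far beyond float64 range.
X = 3.0 * np.eye(1000)

naive = np.log(np.abs(np.linalg.det(X)))   # det overflows -> inf, so log -> inf
sign, logdet = np.linalg.slogdet(X)        # stable: 1000 * log(3)

print(naive)           # inf
print(sign, logdet)    # 1.0 1098.6122886681098
```

The same overflow (or underflow to 0 for near-singular matrices) hits any log(det(...)) graph, which is what the proposed optimization would rewrite.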
@jaberg That raises the interesting issue of how to know that certain intermediate results are positive-definite matrices, and how to mark them as such. Seems like something that the user would need to specify in most cases, like … Any thoughts on the right way to do this?
So the speed seems good. James implemented a hint feature in the Env to allow the user to give hints to the compiler. He used it for the other optimization he talked about. This approach seems good to me, but it can take much time to implement all the cases correctly. At least that is what took time with the ShapeFeature.
Check out how it's done in sandbox.linalg; there is a function …
Just out of curiosity, why hasn't this been done yet? I haven't coded any Theano ops, but I could give it a shot... maybe just a log abs det, since that's probably the most common case (in MLE estimation etc)? FYI I'm hitting the stability issue pretty badly... |
I'm not sure why this has not been done yet, or maybe some people ended up implementing that in their own repositories. |
OK, I'll give it a try!
Found a couple of implementations on GitHub. This one does an SVD first and takes the log of the diagonal (not sure why it's the diagonal squared... need to check the math; it also has a funny expression for the gradient... it's correct but computationally expensive). This one uses NumPy's slogdet. Any preferences? SVD on the GPU should be a simple copy/paste job, I think, but I haven't found any CUDA/PyCUDA/gnumpy code for slogdet...
You could check inside the numpy code how slogdet is implemented. Maybe it does … There is SVD code on the GPU somewhere, so with that, it wouldn't be too hard.
It calls something in the LAPACK routines in numpy/numpy/linalg/umath_linalg.c.src, whose docstring says it "computes logdet from factored diagonal", so I guess it is SVD (or really an eigendecomposition, since it's for square matrices)...

The square in the SVD code was wrong, as I suspected, but after fixing that, the SVD and slogdet versions give slightly different results! A difference of about 1E-5 for 100x100 random normal matrices about 80% of the time, and zero difference 20% of the time... it happens for pure numpy/scipy versions too. Maybe there are some numerical stability issues in first doing the SVD, then taking the log, then the sum?? I'll post the code below if anyone wants to take a look. I'm actually taking log abs det, because why take the log of a negative number... So I'm guessing the slogdet version is numerically more accurate, which raises the question of how to do the SVD version accurately on a GPU... dammit!

PS. This will (hopefully) be my first ever PR, so I need some time to figure out how to write the tests, where to put them, etc.

Numpy slogdet:
The SVD version is the same, except now there's this inside the perform:
For float64 the difference is 1E-14:
For float32 the difference is 1E-5!
Anyone know the reason for this?

EDIT: craaaap, something is wrong with my numpy/scipy installation... can anyone test whether you get the same errors? Probably not... I need to reinstall/recompile numpy and scipy, I guess...
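For reference, the comparison being described can be reproduced in plain NumPy (the 100x100 standard-normal setup is assumed from the thread; exact magnitudes will vary by machine and BLAS/LAPACK build): log|det X| computed from the singular values versus numpy's slogdet, in both precisions.

```python
import numpy as np

def log_abs_det_svd(X):
    # |det X| = product of singular values, so
    # log|det X| = sum of their logs (NOT squared).
    s = np.linalg.svd(X, compute_uv=False)
    return np.sum(np.log(s))

rng = np.random.default_rng(0)
X64 = rng.standard_normal((100, 100))
X32 = X64.astype(np.float32)

diff64 = abs(log_abs_det_svd(X64) - np.linalg.slogdet(X64)[1])
diff32 = abs(float(log_abs_det_svd(X32)) - float(np.linalg.slogdet(X32)[1]))

# diff64 is down at float64 roundoff; diff32 is larger but still small
# relative to a logdet magnitude of order 1E+2.
print(diff64, diff32)
```

The two routes use different factorizations (SVD vs an LU/eigendecomposition path inside slogdet), so a small float32 discrepancy between them is expected rather than a bug.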
Given that the logdet values in the examples above are of order 1E+2, the relative error is around 1E-7...1E-8 for float32... maybe that's acceptable? I think I should then go with the direct SVD method for the 'perform' method, if the GPU version is also going to be SVD...
Yes, this kind of difference is really in the normal range for float32; these are even quite low.
OK, made a PR here: #3954. By the way, SVD on the GPU seems to be useful only when matrix dims go above 1000x1000, so maybe the GPU implementation is not that important yet? Or will there be some overhead from moving parameters back and forth between GPU memory and RAM? This image is from this paper (not very fresh, though...):
New PR due to me being a n00b: #3959
The transfer to/from the GPU/CPU is to be avoided as much as possible. So …
OK, got it.
Now that we have a determinant Op, we should probably look into this.

There are numerically stable ways to compute the natural logarithm of the determinant, a quantity that is often needed when evaluating the log probability of a multivariate Gaussian distribution. NumPy provides this as numpy.slogdet, for "sign log determinant" (it computes log(abs(det(X))) and also returns the sign of the determinant). The derivative of the log determinant also has a particularly simple form: inv(X).T == inv(X.T).
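That gradient identity is easy to check numerically (a quick finite-difference sketch; the shift by 5*I is just an assumption to keep the test matrix well-conditioned):

```python
import numpy as np

# Check d/dX log(det(X)) == inv(X).T by central finite differences.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5)) + 5.0 * np.eye(5)   # well-conditioned

def f(M):
    return np.linalg.slogdet(M)[1]   # log|det M|

eps = 1e-6
G = np.zeros_like(X)
for i in range(5):
    for j in range(5):
        E = np.zeros_like(X)
        E[i, j] = eps
        G[i, j] = (f(X + E) - f(X - E)) / (2.0 * eps)  # central difference

print(np.allclose(G, np.linalg.inv(X).T, atol=1e-5))   # True
```

This is the analogue of what a `verify_grad`-style test for the new Op would assert.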