WIP: Linear Algebra bikeshedding #2212

Closed
wants to merge 13 commits into master from vs/linalg2

@ViralBShah
Member

Basically includes:

  1. Deprecate lud, chold, cholpd, qrd, qrpd
  2. lu, chol, cholpivot, qr, qrpivot now return Factorization objects
  3. Documentation

[edit: The suitesparse stuff is now cherry-picked on the db/suitesparse branch.]

@StefanKarpinski
Member

This breaks a lot of code. I guess maybe we're ok with that and I can't see any other way, but just saying.

Member

If you used those names, the new names are exact replacements. The code that breaks is code using the matlab names, which now return factorization objects instead of exhibiting the matlab behaviour.

@StefanKarpinski
Member

Oh, never mind. It's in a pull request. Carry on.

@nolta
Member
nolta commented Feb 7, 2013

Here's a link to the previous chol vs chold argument: 826c42f#commitcomment-1816094

@carlobaldassi
Member

What is the rationale for removing the matlab-compatible versions? (Alternatively: was this discussed further?) I'm quite happy with the solution found at the end of the thread linked by @nolta, i.e. the current situation.

@ViralBShah
Member

What I do not like about the current situation is that we have all these great names that matlab uses (lu, qr, etc.) but they actually force you into computing in inefficient ways. I am proposing that we have defaults that are good and do the right thing in julia - i.e. all the linear algebra routines return Factorization objects, which allow for efficient solving of systems and such things, and use factors() to get the underlying factors in a way that is matlab compatible.

@ViralBShah
Member

With this pull request, what you get is this, which is often all you really want to do with these factorizations:

lufact = lu(A)
x = lufact \ b 

But one can always get the factors and do other things:

l,u,p = factors(lufact) # Matlab-style - just the extra `factors`
x = u \ l \ b[p]  # Matlab-style solve that is not efficient, but works

@nolta
Member
nolta commented Feb 7, 2013

So we're just going to have the same argument again?

@ViralBShah
Member

In general, we have recently, as a community, been taking decisions that favour sane behaviour over matlab compatibility. I think that this is a case where being matlab compatible actually hurts, especially since we do not have varargout that matlab does to make its APIs nicer.

@StefanKarpinski
Member

Having the Matlab-compatible versions and the *d and *d! versions is an awful lot of multiplicity here. Inflating the number of functions in the language by some non-negligible fraction is a lot of bending over backwards to maintain the Matlab-compatible and inferior interface.

@nolta
Member
nolta commented Feb 7, 2013

I guess inferior's in the eye of the beholder. I use chol all the time for monte carlo simulations, and factors(chol(A)) feels incredibly inferior.

@ViralBShah
Member

What do you do with the result of chol in your MC simulations?

@toivoh
Member
toivoh commented Feb 7, 2013

Since chol is a bit special (only one factor), perhaps it could deserve a shortcut like cholfactor to extract the factor?
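In terms of the API in this PR, that shortcut would be a one-liner (a sketch; cholfactor is the name proposed here, not an existing function):

```julia
# Hypothetical shortcut: extract the single Cholesky factor directly.
cholfactor(A) = factors(chol(A))
```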

@nolta
Member
nolta commented Feb 7, 2013

Sample from a multivariate Gaussian distribution.

@ViralBShah
Member

I like the idea of having cholfactor as a shortcut in the case of chol. @nolta would that work?

@ViralBShah
Member

We can also overload a few more methods for CholeskyDense objects to allow for the kind of things one might want to do.

@StefanKarpinski
Member

Needing cholfactor might be a sign of this change not being ideal. Perhaps we should give this particular change some more time? Can Gaussian sampling be done as easily and/or more efficiently with the factorization object?

@carlobaldassi
Member

For the record, what I use most often is sum(log(diag(chol(x)))), i.e. the log of the square root of the determinant. Passing through det of the factor is not ok, since a lot of precision is lost that way (plus it's less efficient). This is already a long chain; putting factors in there is not the end of the world, but not particularly nice either. The only reasonable addition I can think of to the methods for CholeskyDense in this context would be having a logdet function. But I guess I'll end up writing my own shortcut anyway.
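For illustration, such a logdet method could itself be a one-liner (a sketch; it borrows the C.LR field name from the LAPACK-based code later in this thread and uses current Julia syntax): since A = R'R, log(det(A)) = 2*sum(log(diag(R))).

```julia
# Hypothetical logdet for a dense Cholesky factorization: A = R'R, so
# log(det(A)) = 2*sum(log(diag(R))). Summing logs of the factor's diagonal
# avoids the precision and overflow problems of computing det(A) directly.
logdet(C::CholeskyDense) = 2 * sum(log, diag(C.LR))
```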

@andreasnoack
Member

There is already rand(MultivariateNormal(mean, var)) in Distributions which could be used for multivariate normal sampling but that is of course a long name. Two possibilities could be a randn(cov, dim) method and some *(Matrix, CholeskyDense) methods.

@carlobaldassi I think it would be a good idea to have a logdet.

@mlubin
Member
mlubin commented Feb 7, 2013

I support getting rid of the matlab compatibility. Factorization objects are a much more sane default.

@dmbates
Member
dmbates commented Feb 7, 2013

@nolta As @andreasnoackjensen pointed out, sampling from a multivariate normal can be done with rand(MultivariateNormal(...)) from the Distributions package.

julia> using Distributions

julia> mvn = MultivariateNormal([1.,2,3], diagm([4.,5,6]))
MultivariateNormal distribution
Mean:
[1.0, 2.0, 3.0]
Covchol: CholeskyDense{Float64}(3x3 Float64 Array:
 2.0  0.0      0.0    
 0.0  2.23607  0.0    
 0.0  0.0      2.44949,'U')
Variance:
3x3 Float64 Array:
 4.0  0.0  0.0
 0.0  5.0  0.0
 0.0  0.0  6.0

julia> srand(12321)

julia> rand(mvn)
3-element Float64 Array:
 2.34813
 4.22277
 1.4347 

julia> srand(12321)

julia> rand(mvn, 3, 20)
3x20 Float64 Array:
 2.34813   0.491271    3.73474  -1.58803  2.15756   6.94249   2.16482  -3.25563  …  1.34691  -1.97108  0.192781  2.69105  1.23021  3.86129  -1.64641  1.4843   -0.778525
 4.22277   1.74714     4.80336   4.99726  2.07695  -0.790902  1.7337   -1.32994     6.00423   2.7134   4.6624    2.69176  1.08644  4.08717   3.14195  3.62224  -3.74896
 1.4347   -0.0141233  -1.76968   3.15905  1.85379   2.03352   1.69304   0.195228    3.5728    2.00291  4.46716   4.52372  2.83042  3.09832   7.28602  3.83717   6.57325

@dmbates
Member
dmbates commented Feb 7, 2013

@carlobaldassi I think that adding a logdet method for CholeskyDense (and the soon-to-be CholeskySparse) type is the way to go. It would also be worthwhile adding diag methods for those types. The same could be done for the log-determinant and diagonal of LU factorizations and even the QR, operating on R.

The approach in R is to use the signature determinant(mat::Matrix{Number}, log::Bool) where log defaults to false. When log == true the value is the log of the absolute value of the determinant, together with a sign of +1 or -1. In the short term the important accessor is diag of the factorization.

@ViralBShah
Member

I really like the idea of providing methods such as diag, *, logdet for the Cholesky factorization and others where it makes sense.

In order to make this pull request complete, we need to do the same for svd, and perhaps eig as well. I am going to be on a very long flight back to India for the next couple of days, and may not be able to get around to this. If someone can implement a few of these things, it would be great - as I would love to have all this in 0.1 - assuming there is some kind of general consensus.

@ViralBShah
Member

@StefanKarpinski Cholesky factor is the only case of factorizations where you get one factor out, and it is often convenient to do stuff directly with that matrix. I do feel that by providing a few more methods for the factorization objects, we can have a unified view for matrix factorizations. In the case of Cholesky, it will just translate into doing operations on the matrix inside the factorization object.

@dmbates
Member
dmbates commented Feb 7, 2013

Speaking of the matrix embedded in the composite type CholeskyDense, is the idea of defining a macro @delegate still viable? I searched in the issues and didn't find an issue regarding that. I think Stefan mentioned the idea of a macro like

@delegate CholeskyDense LR size diag eltype 

to get definitions like

size(c::CholeskyDense) = size(c.LR)
size(c::CholeskyDense, d) = size(c.LR, d)
diag(c::CholeskyDense) = diag(c.LR)

etc.

[pao: indentation isn't a reliable way to get code blocks in GH Markdown, apparently]
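For concreteness, a minimal sketch of such a macro (illustrative only, written in current Julia syntax rather than the 2013 dialect used above, and not the implementation that was eventually adopted):

```julia
# Forward each listed function on the wrapper type to the same function
# applied to one of its fields. Extending functions from Base this way
# requires them to be imported first.
macro delegate(T, field, fns...)
    defs = [:($fn(x::$T, args...) = $fn(x.$field, args...)) for fn in fns]
    return esc(Expr(:block, defs...))
end

# Usage, per the example above:
# @delegate CholeskyDense LR size diag eltype
```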

@johnmyleswhite
Member

That was this thread: https://groups.google.com/forum/#!topic/julia-users/Wwn3KHmmm9I

Basically I didn't have time to get to it and I think I was on the hook for it because I was the first one to propose it. It's still as useful as ever, though.

@ViralBShah
Member

I like the idea, but not sure if @delegate is the right name for the macro.

@toivoh
Member
toivoh commented Feb 7, 2013

Any thoughts on the original @delegate formulation vs. something like what I called @template?

Also, is there a reason that MultivariateNormal couldn't be called just Normal? (perhaps Normal{Array}, if Normal{Real} would be a univariate normal)

@ViralBShah
Member

Giving a shout out to @alanedelman who is our resident matrix statistics expert.

@nolta
Member
nolta commented Feb 7, 2013

The existence of MultivariateNormal is kinda irrelevant. People are arguing that it's ok to redefine the matlab names, because the matlab approach is "[in]sane", "inefficient", and "inferior". I mentioned my use of chol simply to demonstrate that chol is perfectly sane, efficient, and ~~ferior~~ cromulent.

I'd like to hear more about adding *(Matrix, CholeskyDense) and diag. #1412 seems relevant.

@dmbates
Member
dmbates commented Feb 7, 2013

@toivoh In the Distributions package there are abstract types UnivariateDistribution and MultivariateDistribution. Many functions can have fallback methods defined on the abstract types. The type Normal inherits from UnivariateDistribution. See the discussion under JuliaStats/Distributions.jl#14

@StefanKarpinski
Member

The argument isn't that Matlab's chol is terrible – it isn't. The cases of chol and svd are the good ones because what Matlab returns is no less efficient than returning some kind of specialized factorization object. However, other cases like lu and qr are far less nice since it would often be more efficient to return a factorization object.

One way we might resolve this while keeping the appearance of Matlab compatibility would be to define start, next and done for factorization objects so that if you write L, U, P = lu(X) you get the factors out of the factorization object that's returned. Here's a toy example of the concept:

importall Base
type Foo end
start(::Foo) = 1
next(::Foo, i::Int) = ("factor $i", i+1)
done(::Foo, i::Int) = false

julia> a, b, c = Foo()
Foo()

julia> a, b, c
("factor 1","factor 2","factor 3")

That way you could either write LUP = lu(X) and use the factorization object directly, or write L, U, P = lu(X) and get the individual factor matrices.

In the particular case of chol this doesn't work since there's only a single returned value. However, shouldn't the CholeskyDense object that's returned be made to behave just like the matrix that it wraps with just a little additional metadata indicating whether it was an upper or lower factorization and some extra efficiency in cases where we know that a Cholesky matrix can be used more efficiently than an arbitrary matrix? In other words, shouldn't @nolta not even notice that he's getting a factorization object back rather than just a plain old matrix?

@dmbates
Member
dmbates commented Feb 7, 2013

@StefanKarpinski I think you get into trouble defining methods like

\{T<:BlasFloat}(C::CholeskyDense{T}, B::StridedVecOrMat{T}) =
    LAPACK.potrs!(C.UL, C.LR, copy(B))

This definition, which I claim is the one that makes sense, solves Ax=b where C=chol(A). You don't want to make that method solve a triangular system, you want it to solve the original positive definite system which involves solving two triangular systems.

What about if we define methods for triu and tril of the Cholesky object and have those return the upper (resp. lower) Cholesky factor? Then they would be available for solving the triangular system. It would still result in more work than should be needed to solve the triangular system (because the \ operator has to check if the result is triangular) but at least it would unambiguously produce the upper or the lower Cholesky factor.
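A sketch of what those accessors might look like (hypothetical; it assumes, following the \ method above, that C.LR stores the factor in the triangle indicated by C.UL):

```julia
# Hypothetical accessors returning the upper (resp. lower) Cholesky factor,
# transposing when the factorization was stored in the other triangle.
triu(C::CholeskyDense) = C.UL == 'U' ? triu(C.LR) : triu(C.LR')
tril(C::CholeskyDense) = C.UL == 'L' ? tril(C.LR) : tril(C.LR')
```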

@dmbates dmbates Got the basic CHOLMOD functions associated with a dense matrix working by defining a Julia type

Well, working on a 64-bit system and these are only the calls to
cholmod_check_dense and cholmod_print_dense, as in

julia> aa = randn(10,2)
10x2 Float64 Array:
 -1.1131     -0.276926
 -1.18138    -2.17424
  0.615017   -0.341257
  0.0714101  -0.0933698
 -0.493773   -0.520607
  0.989568   -0.0106482
  1.33313    -0.204813
 -0.199893   -0.983869
  0.934582   -0.0652311
  0.792507   -0.369899

julia> cda = CholmodDense(aa)
CholmodDense{Float64}(10,2,20,10,Ptr{Float64} @0x0000000003186070,Ptr{Void} @0x0000000000000000,1,0,10x2 Float64 Array:
 -1.1131     -0.276926
 -1.18138    -2.17424
  0.615017   -0.341257
  0.0714101  -0.0933698
 -0.493773   -0.520607
  0.989568   -0.0106482
  1.33313    -0.204813
 -0.199893   -0.983869
  0.934582   -0.0652311
  0.792507   -0.369899 ,false)

julia> SuiteSparse.chm_chk_dn(cda)
1

julia> SuiteSparse.chm_prt_dn(cda)

CHOLMOD dense:   :  10-by-2,
  leading dimension 10, nzmax 20, real, double
  col 0:
         0: -1.1131
         1: -1.1814
         2: 0.61502
         3: 0.07141
         4: -0.49377
         5: 0.98957
         6: 1.3331
         7: -0.19989
    ...
  col 1:
         0: -0.27693
         1: -2.1742
         2: -0.34126
         3: -0.09337
    ...
         6: -0.20481
         7: -0.98387
         8: -0.065231
         9: -0.3699
  OK

4-element Uint8 Array:
 0x03
 0x00
 0x00
 0x00

Not exactly earth-shattering but at least it is a proof-of-concept for
using a Julia type to pass a pointer to a struct to one of the CHOLMOD
functions.
5ddf16b
@StefanKarpinski
Member

Right. That's exactly the kind of thing that a CholeskyDense object should override. If the factorization behaved exactly like the inner matrix, there wouldn't be any benefit. What I meant was @carlobaldassi's example of taking diag(CholeskyDense) should work correctly for a factor object as though what had been returned was the full chol matrix. @nolta: how do you use the cholesky result to generate multivariate random values? Could it be made to work reasonably with a CholeskyDense object instead?

@StefanKarpinski
Member

In general, regarding the issue of Matlab compatibility here, I think we should break compatibility for one very simple reason: we have to. We can't keep Matlab compatibility because we don't support nargout like Matlab does. I really don't grok the full behavior of chol well enough to see clearly what we should do, but we can't be Matlab compatible because chol does so many different things.

@dmbates
Member
dmbates commented on 5ddf16b Feb 7, 2013

I have only tested this on a 64-bit machine but I think it should work on 32-bit machines too because the C struct should not require padding. The size_t and (void*) values all occur at the beginning of the struct. At least this is my theory; it will, naturally, be necessary to check. The conversion of a pointer to such a struct created by a CHOLMOD C function to a Julia type will be the more complicated one.

Would I be better off using pack and unpack from the strpack.jl file in extras or at https://github.com/pao/StrPack.jl? The last commit on @pao's repository was two months ago, so it seems the best hope is

require("strpack")

if I go that way for unpacking.

Member
pao replied Feb 7, 2013

Stick with extras for now. The module is still very broken (even with the local commits I haven't uploaded due to a combination of shame and lack of focus.)

Member

The Julia types will add the same padding as writing the equivalent struct in C, so you should be fine on 32-bit machines (assuming the type signature is otherwise correct).

You should be able to mimic unpack with unsafe_ref in some cases. That function copies the data pointed to by a Ptr{CholmodDense} to a new instance of CholmodDense. (The pointer can then be manually free'd.)

Member
pao replied Feb 8, 2013

^ And to the extent that I misunderstood what you were asking, use the native support if possible.

Member

@vtjnash The idea of using unsafe_ref to untangle the pointer to the structure returned from the C functions is attractive. I think, however, I will need to create "condensed" types that parallel CholmodDense and CholmodSparse but do not have the vector fields at the end of those types.

The initial fields in the CholmodDense and CholmodSparse types are scalars and pointers corresponding to the fields in the C structs. But, as we discussed in Boston, I need to keep the vectors from which the pointers are generated in the object to ensure that the memory area addressed by the pointer is not garbage collected while the object is within scope and is garbage collected when the object goes out of scope.

I think that trying to create an unsafe_ref to this extended type will be problematic because those trailing fields don't exist in the C struct.

Member

One possibility is to declare things like this, so that you have a Julia wrapper to take care of rooting the memory for the GC, but you don't have to duplicate code to use the "condensed" type.

type c_CholmodDense{T<:CHMVTypes}
    m::Int
    n::Int
    nnz::Int
    lda::Int
    xpt::Ptr{T}
    zpt::Ptr{Void}
    xtype::Int32
    dtype::Int32
end

type CholmodDense{T<:CHMVTypes}
    c::c_CholmodDense{T}
    aa::Array{T}
    chm::Bool                       # was storage allocated by Cholmod?
end

convert(::Type{c_CholmodDense}, d::CholmodDense) = d.c

and use it with:

cholmod_dense  = CholmodDense( ... )
ccall( ..., ( ..., Ptr{c_CholmodDense}, ...), ..., cholmod_dense, ...)
Member

Ah, yes. The old "define a convert method" trick. Looks good.

@dmbates
Member
dmbates commented Feb 7, 2013

I'm not sure how this is happening, but any commit to the vs/linalg2 branch and any comments on such a commit are showing up in this thread. You should not try to understand my comments on changes in the extras/suitesparse.jl file in the context of the rest of the discussion.

@pao
Member
pao commented Feb 7, 2013

The vs/linalg2 branch is what this pull request is for.

@dmbates dmbates Defined a CholmodSparse type
but my external constructor from a SparseMatrix{Tv,Ti} is not working
as hoped.  I'm pushing this commit in case someone can try
```julia
require("suitesparse")
using SuiteSparse
A = sprand(8,4,0.25);
ca = CholmodSparse(A)
```
and tell me why I get a no method error.
2ce6abc
@nolta
Member
nolta commented Feb 8, 2013

I'm prepared to compromise on matlab compatibility if we can fix the code smell coming off of factors.

@dmbates's suggestion of triu(chol(A)) seems like a step in the right direction, except it would mean triu(A) != triu(chol(A)), which feels wrong.

@StefanKarpinski
Member

I agree that the factors thing smells a bit off. Any thoughts on my proposal to use iterating a factorization object to simulate getting the parts of the factorization?

@carlobaldassi
Member

Defining diag for CholeskyDense also feels wrong to me, because then diag(A) != diag(chol(A)) (related to issue #1412 again), while the Factorization object is intended as a representation of A.
I think this is the culprit: there are two use cases here, one in which the factorization is performed as an intermediate step to compute something from A (e.g. performing divisions, computing the determinant, etc.), and a second one in which the factors are what you actually want to operate with. That's what I liked about having both chol and chold: you had both use cases at hand. I wouldn't mind giving prominence to the first use case, actually (I'm not particularly concerned about matlab compatibility); it just seems that factors() is a bit convoluted for something so simple. If nothing better comes up, I'd prefer cholf, luf etc., or similar, i.e. the reverse of what we have now.
Iterating a factorization object is a nice way to bypass factors, but only when there's more than one factor (unless you write things like (C,) = chol(A) and diag(chol(A)...) or (*)(chol(A)..., x) (argh)).

@dmbates
Member
dmbates commented Feb 8, 2013

Perhaps Stefan's idea of defining iterators for Factorization types is the way to go. I suggest also adding ref methods for Factorization types that can be used to extract a particular component of the factorization without calculating all of them. For example, it is often useful to get the R matrix from a QR factorization; it is used to evaluate the variance-covariance of the coefficient estimator in least squares. There could be situations where the m by n model matrix for which you have the decomposition has a very large m relative to n. If you only want R, it would be a waste to also evaluate Q only to turn around and discard it. So I would suggest creating ref methods to be able to use

QR = qr(X)
R = QR[:R] # QR["R"] or less-desirable QR[2]

We would need to be careful about the form in which a permutation is returned. It would probably be best to define a permutation matrix type that inherits from AbstractMatrix so that P * A and A * P do the right thing. Also inv(P) would be the permutation matrix corresponding to the inverse permutation.

I work with square sparse matrices of sizes in the millions, and I don't want to create even a sparse n by n matrix to represent a permutation.
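A minimal sketch of that permutation-matrix type (hypothetical names, current Julia syntax; a real version would want more methods):

```julia
# Hypothetical permutation matrix: stores only the permutation vector p,
# where row i has its single 1 in column p[i].
struct PermutationMatrix <: AbstractMatrix{Int}
    p::Vector{Int}
end

Base.size(P::PermutationMatrix) = (length(P.p), length(P.p))
Base.getindex(P::PermutationMatrix, i::Int, j::Int) = Int(P.p[i] == j)

# P*A permutes rows and A*P permutes columns without forming an n-by-n
# matrix; inv(P) is just the inverse permutation.
Base.:*(P::PermutationMatrix, A::AbstractMatrix) = A[P.p, :]
Base.:*(A::AbstractMatrix, P::PermutationMatrix) = A[:, invperm(P.p)]
Base.inv(P::PermutationMatrix) = PermutationMatrix(invperm(P.p))
```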

@ViralBShah
Member

@carlobaldassi You are right that there are two use cases. In my opinion, if we have matlab-like behaviour, we should probably use the matlab-compatible name. I do actually quite like the suggestion by @dmbates to use ref to pick out factors from Factorization objects. I am already beginning to like QR["R"] for performance, clarity of code, compactness, etc. Open to other suggestions too.

I would like to try out the iterators suggested by @StefanKarpinski too. The iterator method makes me wonder if we can almost get matlab style varargout.

@ViralBShah
Member

@nolta I do not understand what you meant when you said that triu(A) != triu(chol(A)) feels wrong. The upper triangular part of the matrix would surely be different than the upper triangular part of the Cholesky factor.

@carlobaldassi I have the same question about your comment about diag.

@nolta
Member
nolta commented Feb 8, 2013

@ViralBShah, since A\b == factorization(A)\b, the factorization of A is conceptually the same as A, just a different representation. So if f(A) is defined, and f(factorization(A)) is defined, then f(A) == f(factorization(A)). Also discussed in #1412.

I'm starting to think factorizations are still a work in progress.

@nolta
Member
nolta commented Feb 8, 2013

Hmm, does anyone remember the rationale for defining QR * x = Q * x?

https://github.com/JuliaLang/julia/blob/master/base/linalg_dense.jl#L588

@andreasnoack
Member

To have an easy way to multiply Q with a matrix without forming Q explicitly, I guess. The gain is bigger for qr than in the chol case but in general I think it is okay that the factorizations serve as clever shortcuts to special calculations. Hence my suggestion that *(Matrix,CholeskyDense)=Matrix*factors(chol(Matrix)). As long as these things are documented I don't see it as a problem.

@StefanKarpinski
Member

Perhaps Stefan's idea of defining iterators for Factorization types is the way to go. I suggest also adding ref methods for Factorization types that can be used to extract a particular component of the factorization without calculating all of them.

I hadn't actually looked into how @JeffBezanson implemented that and thought it was based solely on ref, so I was a little surprised that it's an iterable thing – destructuring uses start/next/done, not ref. That makes sense as the most general way to do it. It would be nice for factorizations, however, if one could write _, S, _ = svd(X) and get only the diagonal, but that would entail making _ more than just a conventional "throw this value away" symbol – the parser would have to understand that that's what _ means. At first blush, this would require changing destructuring assignment to use ref, which would make it stop working on all iterables (e.g. streams, which seems a bit weird anyway). However, another way would be to introduce skip as part of the iteration protocol, as an optional function for advancing the state while throwing away a value, e.g. state = skip(itr,state). It would have a fallback definition of skip(itr,state) = next(itr,state)[2].
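Sketched out (hypothetical protocol extension; the factorization override is invented for illustration):

```julia
# Fallback: advance the iteration state, throwing away the value that
# next() would have produced.
skip(itr, state) = next(itr, state)[2]

# A factorization type could override skip to step past a factor without
# materializing it, so that `_, S, _ = svd(X)` forms only the part asked for.
```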

All of this makes me wonder if the way to get nargout behavior in Julia is to return an object that implements an appropriate iteration protocol and produces values upon destructuring. It's slightly higher order, but maybe that's an ok way to do it. The only feature missing to allow full simulation of nargout is having a way of passing the total number of items that the destructuring expects. I'm not at all sure if this is a good idea or not.

@nolta
Member
nolta commented Feb 8, 2013

@andreasnoackjensen Whoa. If factorizations aren't going to behave as linear operators/transformations (per the discussion in #1412), and are just a collection of magical behaviors, i'm not sure they belong in base at all.

@StefanKarpinski
Member

@ViralBShah, since A\b == factorization(A)\b, the factorization of A is conceptually the same as A, just a different representation. So if f(A) is defined, and f(factorization(A)) is defined, then f(A) == f(factorization(A)). Also discussed in #1412.

Wholeheartedly agree with all of this. I think this is the heart of the matter as @carlobaldassi pointed out: there's one view of what chol returns that wants to be a different representation of the original matrix and a different view that wants it to be the upper/lower-triangular factor instead.

I'm starting to think factorizations are still a work in progress.

Yes, it seems that way. I would suggest that we not merge this or change the way these linear algebra functions work until after the 0.1 deadline. This all seems way too up in the air at this point. We can come up with a really coherent story for 0.2.

@StefanKarpinski
Member

@andreasnoackjensen Whoa. If factorizations aren't going to behave as linear operators/transformations (per the discussion in #1412), and are just a collection of magical behaviors, i'm not sure they belong in base at all.

I think they certainly belong in base, but they should absolutely behave like the original linear operators except with better special-case implementations. Anything else seems crazy to me. At that point, we might as well just be Matlab-compatible.

@nolta
Member
nolta commented Feb 8, 2013

Ok, but what do you mean by "better special-case implementations"? Rename * to qmul or something?

@timholy
Member
timholy commented Feb 8, 2013

Definitely, the original intention was that factorization objects should behave as does the original matrix. QR * x = Q * x seems more than a little dangerous to me. Adding additional functions like qmul seems fine, but * and \ should be reserved for their regular meaning.

@mlubin
Member
mlubin commented Feb 8, 2013

Why not QR["Q"] * x = Q * x? The point is that Q is not stored as a full matrix, so maybe QR["Q"] should return a specialized object, and if someone really wants the full matrix for some reason they can use full(QR["Q"]).
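One sketch of such a specialized Q object (hypothetical type and field names, current Julia syntax; LAPACK.ormqr! is the routine that applies the compact Householder representation):

```julia
using LinearAlgebra: LAPACK, BlasFloat

# Hypothetical wrapper keeping LAPACK's compact Householder form of Q:
# the reflectors in hh and their coefficients in tau.
struct PackedQ{T<:BlasFloat}
    hh::Matrix{T}
    tau::Vector{T}
end

# Apply Q to B via the reflectors, without ever forming Q explicitly.
Base.:*(Q::PackedQ{T}, B::StridedVecOrMat{T}) where {T<:BlasFloat} =
    LAPACK.ormqr!('L', 'N', Q.hh, Q.tau, copy(B))
```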

@StefanKarpinski
Member

Given how this conversation has panned out, I'm moving this from milestone 0.1 to 0.2.

@dmbates dmbates Incorporate suggestion from Jameson on keeping pointer contents alive
Defined types c_CholmodDense which mirrors the cholmod_dense C struct and a wrapper
type CholmodDense that retains the Julia matrix when constructed from one and retains
the pointer itself when constructed from a cholmod_dense* returned from C.

Still need to add a finalizer for the CholmodDense type and create
similar types for CholmodSparse, CholmodFactor and CholmodTriplet.
614530e
@ViralBShah
Member

This is certainly 0.2 now. For 0.1, I would prefer to leave things as they are, but rename chold to cholfact etc. for readability. Any objections to that?

@ViralBShah
Member

To me a Factorization is not a different representation of A, but a different representation of the matrix factorization of A.

@StefanKarpinski
Member

I object on the grounds that we may very well change it again later, so let's not cause undue intermediate churn.

@ViralBShah
Member

I am really not in favour of shipping Factorization in its current form. I would much rather separate it all out into a package than ship it as it is.

@StefanKarpinski
Member

I'm not sure I understand. Are you objecting to "shipping" the current state of master or something else?

@ViralBShah
Member

What I am saying is that the current state of Factorization in master is not satisfactory in my opinion to ship. For something as basic as this, it would be difficult to change it at a later date - and I prefer moving all Factorization stuff into a package where it can be baked till it is good enough for inclusion in Base.

I would want to hear what others have to say.

@ViralBShah
Member

Just a note - I have cherry-picked all the commits by @dmbates. The blas stuff is in master, whereas the suitesparse stuff is now on the db/suitesparse branch.

@mlubin
Member
mlubin commented Feb 9, 2013

I suppose it's ok for Factorization to be moved into a package for now. Does this include the matlab-style routines? The Julia documentation should clearly point to it though.

@ViralBShah
Member

What I am proposing is to leave the matlab style routines as they are for now as part of Base - so chol, cholp, lu, qr, qrp, etc. can stay as they are. The *d versions: chold, cholpd, lud, qrd, qrpd would move to say, a LinalgFactor package, where that interface can be experimented with. The package versioning gives a good method for backwards compatibility too.

@mlubin
Member
mlubin commented Feb 9, 2013

Doesn't that make it difficult to change lu, chol, etc. to return factorization objects in the future?

@StefanKarpinski
Member

Honestly, I think this is a terrible idea. If a change this drastic was going to happen, it should have been done days ago. We now have an API stability deadline in one day and there is no way a change this disruptive is going to happen without breaking things. If the factorization stuff wasn't ready, it shouldn't have been in master – we are already shipping it. Trying to pull this out of base now is making a massive, disruptive change with no deprecation path with a single day to get it right. Not a good idea.

@mlubin
Member
mlubin commented Feb 9, 2013

Is there really that much to fix in order to get Factorization to a reasonable state? Could we agree on a list of things to fix and leave non-compatibility-breaking enhancements for later?

I propose:

  • We codify @nolta's approach that in all cases where operators are overloaded for Factorization objects or Factorization objects are passed to Base functions, the behavior should be the same as if the original matrix was passed.
  • Access individual factors using ref instead of factors. For now these can return full matrices, but in the future they could return specialized types, e.g. for QR as I mentioned the Q could be a special type which implements multiplication and backslash. This wouldn't break syntax compatibility if all someone does is use the Q factor as a normal matrix, although it would break things if they try to access the elements for some reason.
  • Put off for 0.2 implementing the nice tuple/iterator syntax for picking out all the factors at once.
@StefanKarpinski
Member

It seems to me that "getting this right" is not super crucial now that we've arrived at the realization that we're not going to get it right before 0.1. Since we know we're going to break people's code with 0.2 when we change all of this, the main goal is not to break their code with 0.1 as well, for no benefit. So for tomorrow, I would suggest we tread lightly, with only minimally disruptive changes that are unambiguously a good thing.

@andreasnoack
Member

I also think it would be too drastic to remove the factorizations. Just for perspective: the discussion started from the annoyance of writing factors(chol(A)) instead of chol(A).

I think @mlubin's suggestions are reasonable. In particular @dmbates's idea of using ref instead of factors.

However, I think we should reconsider whether it is useful to restrict factorizations to be the same thing as the original matrix: the LinearOperator view. What is the gain? If this association is dropped you wouldn't be surprised by the behaviour of * in the QR factorization. Why would you ever use * on a factorization object if it represented the original matrix?

@mlubin
Member
mlubin commented Feb 9, 2013

(qr(A)[:Q])*x is much more understandable than (qr(A))*x to anyone with a background in linear algebra who has never seen Julia before.

@andreasnoack
Member

I agree that qr(A)[:Q]*x is more understandable, but that is different from qr(A)*x being dangerous. My point was partly that * is not useful if a factorization is understood as the same thing as the original matrix and partly that mathematical abstractions are not always useful in programming.

@timholy
Member
timholy commented Feb 9, 2013

But then * should not be defined at all. \ clearly is useful, because the factorization is the best way to solve a linear equation. And having * not be the inverse of \ is simply too horrible to contemplate.

@toivoh
Member
toivoh commented Feb 9, 2013

About _, not computing unneeded factors, and extracting several at once: how about

Q, R = qr(A)[(:Q, :R)]

Then the entire set of requested factors would be known immediately upon extraction, and without extending the iterator protocol. Also, it is quite clear what is happening even if the type of factorization object being operated upon cannot be seen from the code, unlike factors, which can happily give the wrong answer if the type of factorization is different than expected.
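That extraction could be built on single-component ref in the obvious way (a sketch, using the pre-getindex ref naming of the time; a real implementation would want to dispatch on the whole tuple, so that it knows the full requested set up front, which is the point made above):

```julia
# Extract several components at once, given ref(F, ::Symbol) for one:
ref(F::Factorization, names::Tuple) = map(n -> ref(F, n), names)

# Q, R = qr(A)[(:Q, :R)]
```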

Still, I'm unclear on the details of factorizations where you only want to compute some of the factors. This seems to presuppose some kind of lazy evaluation, which would require the factorization object to keep a copy of the source matrix inside, unless the set of needed factors is given in advance. The most efficient algorithm to compute a given subset of factors would probably also need to know this set in advance, not incrementally. On the other hand, a factorization object with some factors unavailable will be crippled, e.g. cannot implement \, so is probably not really a factorization object at all.

To compute a subset of factors, maybe it would be better not to (visibly) construct a factorization object, but to use something like

U, S = svd((:U, :S), A)

(and you could perhaps specify either :V or :Vt, which could provide some nice flexibility regarding #912.)
This would also allow to compute the Cholesky factor using

L = chol(:L, A)

@ViralBShah
Member

@toivoh This is not lazy evaluation. The idea is that you use the LAPACK routines for factorization, which often compute a compact form. If you want the actual matrices that represent the factors, you may have to do some more work, when forming the actual matrices. However, if you only want to use the factorization to say, solve a system, then you may not need the actual matrices.
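Concretely (a sketch using current LinearAlgebra.LAPACK names; the code in this thread predates that module layout): the solve consumes LAPACK's packed output directly, and explicit factor matrices are extra work done only on request.

```julia
using LinearAlgebra: LAPACK

A = rand(4, 4); b = rand(4)

# Compact LU: L and U packed into a single matrix, plus pivot indices.
packed, ipiv, info = LAPACK.getrf!(copy(A))

# Solving uses the packed form as-is; no explicit L, U, or P is formed.
x = LAPACK.getrs!('N', packed, ipiv, copy(b))
```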

@ViralBShah
Member

Given the need for minimally invasive changes, I am still pushing for renaming the *d routines to *fact routines. It is absolutely impossible to figure out what the d in qrd means, given that it returns a Factorization object. It may cause some intermediate churn, but at least we will have names that are more meaningful to newcomers, and we can then take our time discussing what the ideal interface should look like.

@mlubin
Member
mlubin commented Feb 10, 2013

Sounds reasonable to me.

@ViralBShah
Member

I am also not sure whether for 0.1, we should retain the current behaviour of qrd(a)*x which only multiplies by Q.

@mlubin
Member
mlubin commented Feb 10, 2013

If it's too much intermediate churn to remove that behavior given that we haven't decided how to expose the functionality, I would at least not present it as a good thing to do in the documentation.

@ViralBShah ViralBShah added a commit that referenced this pull request Feb 10, 2013
@ViralBShah ViralBShah Based on discussions in #2212.
Minimally invasive changes to the Factorization objects for 0.1.
All existing functionality is retained, with only a few renamings for clarity.

Replaces the *d routines with *fact routines:
         lud -> lufact
         chold -> cholfact
         cholpd -> cholpfact
         qrd -> qrfact
         qrpd -> qrpfact

Replaces * and Ac_mul_B on QRDense objects:
         qmulQR performs Q*A
         qTmulQR performs Q'*A

Updates to exports, documentation, tests, and deprecations.

This is a self-contained commit that can be reverted if necessary
close #2062
69e407b
@ViralBShah
Member

I am closing this pull request, and we can resume this discussion after 0.1. I am actually fine with the current state, if we can retain the commit above. If everyone hates it, it can be reverted - but it does not change any functionality and only renames a few things for clarity and consistency.

I am pushing it in since 0.1 will most likely be tagged when I am not awake.

@ViralBShah ViralBShah closed this Feb 10, 2013
@andreasnoack
Member

@timholy The horrible behaviour is already there for matrices. Inherited from Matlab:

julia> A=randn(5,2);b=randn(5);

julia> b-A*(A\b)
5-element Float64 Array:
 -0.418845
  1.15814 
 -0.986439
  0.719669
  0.531056

Horrible, but convenient.

@timholy
Member
timholy commented Feb 10, 2013

Fair enough, but that's only because inv(A) doesn't exist. When it does, it gives you the expected result.

@alanedelman
Contributor

May I propose we disconnect two issues:

  1. the possibility of a compressed form for Q and perhaps L and U

  2. the entanglement of Q&R, L&U, etc.

I believe that the issue is more about 2, and yet the discussion above is mostly about 1. There is nothing wrong with having a compressed format for Q analogous to "sparse" or "symtridiagonal". If we didn't entangle, it could still be just fine to write (L,U)\A for L\(U\A). With enough overloading, the user would hardly know the difference from matlab. So the real discussion we have to have is whether LAPACK's entangled formats should be disentangled, or rewritten so they end up disentangled, or whether both are bad ideas for some sort of performance reason. (Off the top of my head, I don't know the answer.)

@ViralBShah
Member

Does anyone use svdt? Is it useful to have? svd is matlab-compatible, and factors(svdfact(A)) now does what svdt does. I would prefer to deprecate svdt but wanted to check here first.

@ViralBShah ViralBShah deleted the vs/linalg2 branch Feb 11, 2013
@andreasnoack
Member

Now that we have an svd factorization type I think svdt should disappear.
