first, map, ... don't work correctly for 1 row matrices #36

Open
alexott opened this issue Sep 1, 2013 · 21 comments

@alexott

alexott commented Sep 1, 2013

If I have 2 matrices - a row matrix and a column matrix - then seq, first, map, and other functions behave the same for both of them, although this is incorrect (IMHO):

(def m2 (matrix [[1 2 3]]))
m2
A 1x3 matrix
-------------
1.00e+00 2.00e+00 3.00e+00

(def m3 (matrix [1 2 3]))
m3
A 3x1 matrix
-------------
1.00e+00
2.00e+00
3.00e+00

(first m2)
1.0
(first m3)
1.0
(matrix (seq m2))
A 3x1 matrix
-------------
1.00e+00
2.00e+00
3.00e+00
(matrix (seq m3))
A 3x1 matrix
-------------
1.00e+00
2.00e+00
3.00e+00

This breaks some functions in Incanter that process matrices on a per-row basis. I think this relates to issue #30.

@mikera
Collaborator

mikera commented Sep 1, 2013

In core.matrix / generic array functionality the expected behaviour would be:

  • first on a row matrix [[1 2 3]] returns a vector [1 2 3]
  • first on a column matrix [[1] [2] [3]] returns a length one vector [1]

That is, first is always consistent with taking the first slice of the first (row) dimension.
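
As a minimal sketch of these slice semantics (using clojure.core.matrix with its default persistent-vector implementation; the m alias is assumed):

(require '[clojure.core.matrix :as m])

;; the first slice of a 1x3 row matrix is its (only) row
(first (m/slices [[1 2 3]]))      ;=> [1 2 3]

;; the first slice of a 3x1 column matrix is a length-one vector
(first (m/slices [[1] [2] [3]]))  ;=> [1]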

@alexott
Author

alexott commented Sep 1, 2013

Yes - I was also thinking about this approach... I've tried to work around this in the to-list function, but that also requires changes on Incanter's side.

@mikera
Collaborator

mikera commented Sep 2, 2013

I'm mildly trying to discourage first / map / seq etc. as applied to matrices/arrays, BTW: they are convenient sometimes but won't work if/when we introduce new matrix types from the Java world that don't exactly fit Clojure's conventions. In particular, if a class implements Iterable in some way then that's the behaviour you will get, like it or not.

Better, I think, to move to the protocol-backed equivalents in core.matrix so we can guarantee behaviour that is both consistent and extensible.
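
A rough sketch of the protocol-backed equivalents from the clojure.core.matrix API (M stands for any matrix/array; the m alias is assumed):

(require '[clojure.core.matrix :as m])

(m/slices M)    ; instead of (seq M): slices along the first (row) dimension
(m/slice M 0)   ; instead of (first M): the first row slice
(m/rows M)      ; explicit row sequence
(m/eseq M)      ; sequence of all elements, independent of Iterable behaviour
(m/emap inc M)  ; instead of mapping over elements: element-wise map, preserving shape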

@alexott
Author

alexott commented Sep 2, 2013

We need to discuss this problem more deeply, as it will heavily affect Incanter's functions.

@mikera
Collaborator

mikera commented Sep 2, 2013

I know - it's a tricky one! The good news is that it doesn't have to be a big bang switchover, we can support both in parallel I think. Anyway, better to continue discussion on the relevant mailing lists.

@whilo

whilo commented Dec 9, 2014

I also have a related problem:
boltzmann.jblas> (seq (matrix [[1 2] [3 4]]))
( A 1x2 matrix
-------------
1.00e+00 2.00e+00
 A 1x2 matrix
-------------
3.00e+00 4.00e+00
)
boltzmann.jblas> (seq (matrix [[1 2]]))
(1.0 2.0) ; seq of doubles

To make the single-row case compatible with the other core.matrix/JBlas routines, I had to convert the rows back to the matrix type explicitly (because the ISliceWrapper over rows isn't countable and again isn't exchangeable with the matrix type), which is somewhat ugly and probably adds some overhead (since all the data has to pass through this copying code for each training epoch):

(map (comp mat/matrix vector)
     (rows v-probs-batch))

source
I can probably optimize that, but it also took some time to figure out, and I have had such problems with the special core.matrix Vector type at times in the past, since most JBlas routines expect matrices and the Vector type is not really compatible.

I understand the Iterable interface problem, which probably breaks the JBlas types, so maybe if the ISliceWrapper object behaved like a Matrix I would be fine. I am not sure whether core.matrix.Vector is a good idea with JBlas.

@mikera
Collaborator

mikera commented Jan 12, 2015

@ghubber I assume you mean the clatrix.core.Vector type? core.matrix doesn't have a Vector type specifically.

The clatrix.core.Vector type actually uses a JBlas matrix under the hood, so it should work fine for all the JBlas interop. Having said that, I think a lot of this code hasn't quite received all the testing it needs for all the interoperability cases. If anyone fancies doing some test.check generative testing that would be cool :-)
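
For illustration, one such property might look like this (a sketch only, assuming test.check, clatrix and clojure.core.matrix are on the classpath; the generator and property names are made up). It deliberately includes the single-row case from this issue:

(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop]
         '[clojure.core.matrix :as m]
         '[clatrix.core :as c])

(def finite-double (gen/fmap double gen/int))

;; property: slicing a clatrix matrix should agree with slicing the same
;; data held as nested Clojure vectors
(def slices-agree
  (prop/for-all [rows (gen/vector (gen/vector finite-double 3) 1 5)]
    (= (map m/to-nested-vectors (m/slices (c/matrix rows)))
       (map m/to-nested-vectors (m/slices rows)))))

(tc/quick-check 100 slices-agree)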

@whilo

whilo commented Feb 10, 2015

Ok, stupid question: why do we need this Vector type then? The problem, if I recall correctly, was that operations like matrix multiplication return 1-row/1-column matrices, so when I loop, the type changes.

@mikera
Collaborator

mikera commented Feb 11, 2015

Conceptually 1-D vectors are different from 2D matrices even if they have the same elements.

It's the same as the difference between [[1 2 3]] and [1 2 3].

Clatrix doesn't strictly need its own vector type. It could use Clojure vectors or something else in cases where it needs a 1D vector. The main requirement is that it is able to produce and consume 1D arrays where needed by the core.matrix API.
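
Concretely, in core.matrix terms (a quick sketch with the default persistent-vector implementation):

(require '[clojure.core.matrix :as m])

(m/shape [1 2 3])             ;=> [3]    a 1D vector with 3 elements
(m/shape [[1 2 3]])           ;=> [1 3]  a 2D matrix with one row
(m/dimensionality [1 2 3])    ;=> 1
(m/dimensionality [[1 2 3]])  ;=> 2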

@whilo

whilo commented Sep 6, 2015

You are right, of course; I got confused by the introduction of a separate Vector type. I am just working on a core.matrix implementation for nd4j, because I want to have fast GPU-based deep learning in Clojure. I am not sure yet how consistent the deeplearning4j code is (I know @mikera had some discussions about ndarrays with them), but I don't see the Clojure community reinventing all the work there at the moment.

In core.matrix / generic array functionality the expected behaviour would be:

  • first on a row matrix [[1 2 3]] returns a vector [1 2 3]
  • first on a column matrix [[1] [2] [3]] returns a length one vector [1]

That is, first is always consistent with taking the first slice of the first (row) dimension.

Yes, that would be consistent iteration over ndarrays. A peculiar thing is to expect unboxed numbers when iterating over vectors, since this turns an Iterator<NDArrayThing> into Iterator<Object> in Java terms. But I really don't understand why the clatrix classes return a scalar on (first (clatrix/matrix [[1 2]])), since Clatrix wraps JBlas' DoubleMatrix.

Also, couldn't you get Iterable added to upstream JBlas to avoid wrapping? (For open-source libs this is at least possible.) That would make seamless interaction with APIs written for JBlas possible, and I am trying to get this for nd4j (maybe it is stupid?).

https://github.com/deeplearning4j/nd4j/pull/374

DoubleMatrix does not implement Iterable yet and can also store vectors. Why do you need a separate vector type? In general, ISeq should be a protocol like in cljs; then we wouldn't have all this trouble, but that is a whole new story...

(I think an Iterator over elements for a tensor with dimensions is really weird and should be part of the API, not exposed by Iterable for any Java vector lib. Just my 2 pence.)

@mikera
Collaborator

mikera commented Sep 7, 2015

@whilo an nd4j core.matrix implementation would be great!

However, did you take a look at Vectorz / vectorz-clj? I think that implementing GPU support for Vectorz (via netlib-java) would actually be a pretty good way to get GPU matrix support in Clojure. And Vectorz has some advantages:

  • It's the most mature core.matrix implementation
  • It is very fast on the pure Java side for 1D vector maths, which don't really benefit much from GPUs but are very central to deep learning techniques
  • It has a lot of specialised array implementations (diagonal matrices, sparse vectors, specialised sub-array types etc.) that give big performance enhancements for many applications when used appropriately.

I actually started an implementation here and it seems to work:

https://github.com/mikera/vectorz-native

Anyway I'd be interested to compare performance of the nd4j based implementation with vectorz-clj, something along the lines of http://core.matrix.bench-t2.large.s3-website-eu-west-1.amazonaws.com/554c2ae93f357522cca7e383e7ad90fef451c139.html

@mikera
Collaborator

mikera commented Sep 7, 2015

@whilo have you got a public repo yet for your nd4j core.matrix implementation? I'm interested to take a look and try it out!

@whilo

whilo commented Sep 7, 2015

It was only a weekend hack, because I would like to have a competitive Clojure alternative to Theano: https://github.com/whilo/clj-nd4j

I stopped working on it once I figured out I had to wrap the ndarray class. But since this is out of the way now, I will try to get the compliance tests passing.

@whilo

whilo commented Sep 7, 2015

For deep learning, the absolute bottleneck in my experience is matrix multiplication between batches of training samples and weight matrices. What do you mean by

1D vector maths, which don't really benefit much from GPUs but are very central to deep learning techniques

?

@mikera
Collaborator

mikera commented Sep 8, 2015

I've done quite a bit of deep learning using only 1D vectors and sparse operations (which don't require full matrix x matrix multiplication). I agree that matrix multiplication will be the bottleneck if you are using big dense matrices, but that isn't always required.
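
For illustration only (a sketch with made-up names, not actual code from any of the projects discussed here): if each unit stores the indices of its inputs and a small weight vector, the forward pass becomes a sum of small 1D dot products rather than a dense matrix multiply.

(require '[clojure.core.matrix :as m])

;; each unit: {:in [input indices], :w [weights], :bias b}
(defn sparse-forward
  "Activations of a sparsely connected layer for a 1D input vector."
  [units input]
  (mapv (fn [{:keys [in w bias]}]
          (Math/tanh (+ bias (m/dot (mapv #(m/mget input %) in) w))))
        units))

;; e.g. two units, each connected to only two of the four inputs
(sparse-forward [{:in [0 2] :w [0.5 -1.0] :bias 0.1}
                 {:in [1 3] :w [2.0  0.3] :bias 0.0}]
                [1.0 0.5 2.0 -1.0])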

@mikera
Collaborator

mikera commented Sep 8, 2015

@whilo thanks for sharing! I'll take a look and see if I can get the compliance tests passing. You shouldn't need to wrap the ND4J INDArray; having taken a quick look, I think it has all the functionality required to work as an N-dimensional array.

@whilo

whilo commented Sep 8, 2015

Good, thank you! I will have a look into the issue you pointed out in the pull request. Most models I have seen so far have dense matrices. Which models have you trained? Backprop then only needs to update the non-zero weights, so the matrices stay sparse, right?

@mikera
Collaborator

mikera commented Sep 9, 2015

Correct. It is a very efficient way to have a lot of feature detectors, but without exploding the number of weights.

@whilo

whilo commented Sep 9, 2015

Interesting. I have used dropout a bit, which is really nice regularization. Do you set the sparsity at the beginning of training, or is it adaptive, pruning small weights and speculatively "forming synapses"? (I work on biologically inspired neuron models and port deep learning techniques, Boltzmann machines so far, to them for my master's thesis.) In biology, synapses form all the time.

@mikera
Collaborator

mikera commented Sep 9, 2015

I've played a bit with both... it seems that a fixed sparsity works fine, since the nodes just specialise towards whatever they have available as input. The pruning / forming of new synapses also seems to work and performs better in some cases, but I'm not sure if it is always worth the extra complexity and overhead.

Of course, if you are doing stuff like convolutional networks it really helps to have a sensible hypothesis about how nodes should be connected in advance.

@whilo

whilo commented Sep 11, 2015

Back to the original issue, do you think this is fixable in Clatrix? We probably should not discuss this here :).
