
Implementation bug in ColumnIterator? #49

Closed
wuciawe opened this issue Aug 23, 2016 · 15 comments

Comments

@wuciawe
Contributor

wuciawe commented Aug 23, 2016

In this line of the file, it looks like it should be `val cols = Array.fill[Int](matrix.rows)(index)` instead of `val cols = Array.fill[Int](matrix.cols)(index)`?
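For illustration, a minimal sketch of the intended index construction (object and method names here are assumptions, not Glint's internals): a column iterator pulls one entry per row of the matrix, so the fill count for the column-index array must be the row count.

```scala
// Hypothetical sketch: building the (rows, cols) index arrays for pulling
// column `index` of a matrix with `numRows` rows. One entry per row, so
// the fill count must be the number of rows, not the number of columns.
object ColumnIndices {
  def indices(numRows: Long, index: Int): (Array[Long], Array[Int]) = {
    val rows = (0L until numRows).toArray
    val cols = Array.fill[Int](numRows.toInt)(index)
    (rows, cols)
  }
}
```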

@rjagerman
Owner

I think you're right, that looks like a bug; it wouldn't make much sense to use matrix.cols there. Thanks for the report! The fix should be really easy, so feel free to submit a pull request, or I can do it later today.

It seems this was missed by the unit tests. We'll also have to write some unit tests for this case, so it doesn't regress in the future.

@wuciawe
Contributor Author

wuciawe commented Aug 23, 2016

I created pull request #50, which only aims to fix this bug, without adding tests. The method is protected and I have very limited experience writing tests, so I don't know how to properly test a non-public method. Tests are also needed for RowIterator etc.

@wuciawe
Contributor Author

wuciawe commented Aug 23, 2016

Oh, I think there is still a potential bug: in `val rows = (0L until matrix.rows).toArray` and `val cols = Array.fill[Int](matrix.rows.toInt)(index)`, `matrix.rows` may exceed the maximum length of an Array.

@rjagerman
Owner

Thanks, I'll look into that! 👍 I think it would be best to add a check for whether matrix.rows exceeds the maximum value of an Int, and return a failed future if it does.
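A minimal sketch of such a guard (the method name and shape are assumptions for illustration, not the actual fix):

```scala
import scala.concurrent.Future

// Hypothetical guard: JVM arrays are indexed by Int, so a matrix with more
// than Int.MaxValue rows cannot be materialized as index arrays. Fail the
// future early instead of silently overflowing on `matrix.rows.toInt`.
def checkedColumnIndices(numRows: Long, index: Int): Future[Array[Int]] = {
  if (numRows > Int.MaxValue) {
    Future.failed(new IllegalArgumentException(
      s"Matrix has $numRows rows, which exceeds the maximum array length"))
  } else {
    Future.successful(Array.fill[Int](numRows.toInt)(index))
  }
}
```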

For the tests, I'll see if I can write some up tomorrow.

@rjagerman
Owner

The check for matrix.rows size has been added in PR #51.

@wuciawe
Contributor Author

wuciawe commented Aug 23, 2016

And I managed to add tests for fetchNextFuture in ColumnIterator and RowBlockIterator in #52.

@rjagerman
Owner

Thanks! It looks great 👍

@wuciawe
Contributor Author

wuciawe commented Aug 23, 2016

Eh, after reading the code further, I found the real leak in the init implementation: in this block of code, it shortcuts the cols because the test matrix has a smaller number of rows. So the appropriate test should be something like the one in #53, and the tests in #52 are meaningless, since these methods should be tested via the public interface next.

By the way, in the near future I may have to implement logistic regression with a parameter server. Do you know whether somebody is already working on that based on Glint?

@rjagerman
Owner

Yeah, now that you mention it, it's probably better to test the public interface next; that also makes the tests a bit more readable.

I think @MLnick has been working on a basic logistic regression implementation. I plan to implement logistic regression, SVMs, etc. myself but haven't gotten around to it yet due to other obligations. One of the major difficulties I ran into is regularization. L2 regularization would require scalar multiplication on the entire distributed vector, which is not yet implemented and will probably require some code refactoring at certain parts. Unregularized linear methods should be relatively easy to implement though.
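To make the regularization point concrete, here is a local, non-distributed sketch of one L2-regularized gradient step (plain arrays, names mine): the `shrink` factor multiplies every weight, which is exactly the whole-vector scalar multiplication that a distributed vector would need to support.

```scala
// Plain-array illustration of one L2-regularized gradient descent step:
// every weight is first shrunk by the same scalar factor (the L2 term),
// then moved against the averaged gradient.
def l2Step(theta: Array[Double], grad: Array[Double],
           alpha: Double, lambda: Double, m: Int): Array[Double] = {
  val shrink = 1.0 - alpha * lambda / m
  theta.zip(grad).map { case (t, g) => t * shrink - (alpha / m) * g }
}
```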

@wuciawe
Contributor Author

wuciawe commented Aug 23, 2016

I thought about it roughly today, and I find it hard to enable arbitrary user-defined functions on the server side. If I have enough time, I will take a look at the code of parameterserver.org to see how they handle the problem.

@wuciawe
Contributor Author

wuciawe commented Aug 24, 2016

> L2 regularization would require scalar multiplication on the entire distributed vector

Do you mean you want something like

$\theta_j := \theta_j (1 - \alpha \lambda / m) - \frac{\alpha}{m} \sum_i (h(x^i) - y^i) x_j^i$

to update the parameter vector?

So the approach will be something like:

in each iteration:

  • the server issues tasks to the worker nodes to compute $(h(x^i) - y^i)x_j^i$
  • the server collects the results from the worker nodes
  • the server side does a scalar multiplication on the parameter vector and then subtracts the corresponding collected results (it looks like a temporary vector is needed to store the results collected from the worker nodes)

Did I guess right?

@rjagerman
Owner

Yes, the idea is correct, but I should note that the tasks are not issued by the parameter servers and instead job scheduling is intended to be handled by Spark.

Spark issues workers to operate on subsets of some data set. These workers then asynchronously pull parts of the model and push updates to the parameter server. So the point of view is definitely different from other parameter server implementations, because Glint functions more akin to a key-value store. It has no concept of tasks or workers and merely provides distributed matrices and vectors that can be read and updated efficiently. So it would look something like this:

  • At the beginning of each iteration, the Spark driver will need to push a scalar multiplication to the parameter servers ($\theta_j := \theta_j (1 - \alpha \lambda / m)$)
  • Spark issues workers to operate on subsets of the data (rdd.foreachPartition { ... })
  • Each worker pulls (part of) the model it needs
  • Each worker locally computes gradient $(h(x^i) - y^i)x_j^i$
  • Each worker pushes its gradient to the parameter servers
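The worker side of the steps above could be sketched as follows; the `pull`/`push` signatures and the stub class are assumptions standing in for Glint's distributed vector API, so the example is self-contained rather than Glint's actual interface:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Stub standing in for a distributed parameter vector; the pull/push
// signatures here are illustrative assumptions, not Glint's exact API.
class StubVector(val values: Array[Double]) {
  def pull(keys: Array[Long]): Future[Array[Double]] =
    Future.successful(keys.map(k => values(k.toInt)))
  def push(keys: Array[Long], deltas: Array[Double]): Future[Boolean] = {
    keys.zip(deltas).foreach { case (k, d) => values(k.toInt) += d }
    Future.successful(true)
  }
}

// Body of one worker inside rdd.foreachPartition { ... }: pull the needed
// weights, compute a local gradient on the partition, push -alpha * gradient.
def workerStep(model: StubVector, keys: Array[Long],
               localGradient: Array[Double] => Array[Double],
               alpha: Double): Unit = {
  val weights = Await.result(model.pull(keys), 1.second)
  val grad = localGradient(weights)
  Await.result(model.push(keys, grad.map(g => -alpha * g)), 1.second)
}
```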

Perhaps you will find this interesting as well: https://github.com/rjagerman/glintlda. It is an implementation of LightLDA using Glint. Although it is a completely different algorithm, it provides a good example of using Glint for a complex ML task. The code may be a bit hard to read though, due to the many optimizations it uses.

@wuciawe
Contributor Author

wuciawe commented Aug 25, 2016

I read the code of glintlda yesterday, and I'm currently reading parameterserver.org's code for optimization with L1/L2 regularization. The parameterserver.org code is C++ (I haven't programmed in C++ for years) and the project is quite large, so it may take several days to understand the implementation.

With your proposal for logistic regression with L2, it is strongly consistent (different from glintlda, which I think is bounded-staleness, as you use futures with locks to allow a little inconsistency; am I right? Or what is the usage of the semaphore?). If I'm right, it may also be possible to make it bounded-staleness with something like what glintlda does, and multiplying by a scalar is something like a mapper function.

But with L1, I think it needs to be strongly consistent, or we need to keep the old gradients of delayed workers to enable bounded staleness, which may increase the complexity of the system. And with L1, it will require something like a mapper function to update the weights.

I also find that in Yahoo's implementation, the parameter server provides map, reduce, and aggregate interfaces on the server side. That may make the system more powerful.
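As a rough illustration of what such a server-side primitive might look like (everything here is hypothetical, not an existing Glint or Yahoo API): the client ships a function and each server applies it to its local shard. This one primitive is enough to express both the L2 scalar multiply and an L1 soft-threshold.

```scala
// Hypothetical server-side shard with a "map" primitive applied locally,
// so the full vector never leaves the server.
class Shard(val values: Array[Double]) {
  def mapInPlace(f: Double => Double): Unit = {
    var i = 0
    while (i < values.length) { values(i) = f(values(i)); i += 1 }
  }
}

// L1 soft-threshold, expressible as a shipped map function.
def softThreshold(kappa: Double)(v: Double): Double =
  math.signum(v) * math.max(math.abs(v) - kappa, 0.0)
```

For example, `shard.mapInPlace(_ * 0.9)` would be the L2 shrinkage, and `shard.mapInPlace(softThreshold(0.1))` an L1 proximal step.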

@MLnick
Contributor

MLnick commented Sep 19, 2016

Hi there

I have been working on some basic linear models with Glint - mostly logistic regression. I've used an implementation based on the older Spark MLlib gradient descent primitives.

It's fairly easy to do LR with L2 regularization, and I get the same results as MLlib. However, I haven't really run larger-scale experiments yet, where the async impact on consistency could be greater. L1 is more challenging, though, and I think it will definitely require some form of computation on the parameter servers themselves. How to handle that in Glint is an interesting question: I think some form of UDF must be supported for these types of operations, whether by shipping a function to the servers or by plugging in a custom Actor of some sort.

@wuciawe
Contributor Author

wuciawe commented Sep 20, 2016

@rjagerman @MLnick
Hi,
In the large-scale case, even when each item is sparse, the items split across each executor may together require a large set of parameters, and thus a large amount of memory, which can cause problems. I think block coordinate descent can be used to avoid this kind of problem.
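A sketch of the key schedule this implies (the scheme and names are assumptions for illustration): each iteration touches only one block of coordinates, so an executor only ever pulls that block's parameters.

```scala
// Hypothetical block coordinate descent key schedule: iteration t updates
// only block (t mod numBlocks), bounding per-executor parameter memory
// to at most blockSize entries.
def blockKeys(dim: Long, blockSize: Int, iteration: Int): Array[Long] = {
  val numBlocks = ((dim + blockSize - 1) / blockSize).toInt
  val block = iteration % numBlocks
  val start = block.toLong * blockSize
  (start until math.min(start + blockSize, dim)).toArray
}
```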
