
Develop SearchsortedOp to wrap numpy's searchsorted function #4422

Merged
5 commits merged on Apr 29, 2016

Conversation

tsirif
Contributor

@tsirif tsirif commented Apr 22, 2016

Includes:

  • A SearchsortedOp, which includes perform and c_code
  • A searchsorted function with numpy-equivalent call
  • A method in tensor's basic methods which redirects to searchsorted with numpy-equivalent call to ndarray's method
  • Unit tests for SearchsortedOp's functionalities

Currently, the side string argument is given to SearchsortedOp as an __init__ argument. A param-based version is not implemented in this patch.
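A numpy-only sketch of the call semantics the new Op wraps (the Theano searchsorted function described above mirrors numpy.searchsorted argument-for-argument):

```python
import numpy as np

# Sorted array and query values; the Theano wrapper mirrors these
# numpy semantics, including the `side` keyword.
sorted_x = np.array([1, 2, 3, 4, 5])
values = np.array([0, 3, 6])

left = np.searchsorted(sorted_x, values, side='left')    # leftmost insert points
right = np.searchsorted(sorted_x, values, side='right')  # rightmost insert points
```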

@@ -147,6 +147,8 @@ def as_tensor_variable(x, name=None, ndim=None):
If an `Apply` with more than one output is fetched.
AsTensorError
If `x` cannot be converted to a TensorType Variable.
TypeError
If `x` cannot be made into a Variable with `ndim` dimensions.
Member

I get a value error in that case:

>>> import theano
>>> v = theano.tensor.matrix()
>>> theano.tensor.as_tensor_variable(v, ndim=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "theano/tensor/basic.py", line 182, in as_tensor_variable
    % ndim, x.type
ValueError: ('TensorType could not be cast to have 1 dimensions', TensorType(float64, matrix))

In which case do you get that error?

Contributor Author

In the params doc above, it says:
"""
ndim : None or integer
Return a Variable with this many dimensions. Raise TypeError if it's
not possible.
"""

I'm looking into it

@nouiz
Member

nouiz commented Apr 22, 2016

This PR looks pretty good. I just have a few questions/comments.

thanks

@tsirif
Copy link
Contributor Author

tsirif commented Apr 22, 2016

Thank you Fred. I am working on these comments right now. Two questions:

  1. Does the side argument need to be a dynamic op param or not?
  2. About ndim > 1 in the grad method: the documentation says that handling grads of non-vector inputs is not implemented yet.

@nouiz
Member

nouiz commented Apr 22, 2016

side doesn't need to be a dynamic param, but it can be. You can wrap the string in a Generic() variable.

The other option is to use params.

Either change would allow us to compile fewer versions of this code. Otherwise, we need to compile two versions.

This isn't a must-do to merge this PR, but if we could compile just one version, it would be great.

  2. I'm not sure what you are talking about. Here the grad implementation would be the same as for 1d: zeros(). Where did you see this?
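A numpy-only illustration of the point about the gradient (my own sketch, not code from the PR): searchsorted returns integer indices that are piecewise constant in their inputs, so wherever the gradient is defined it is zero, regardless of ndim.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([1.5, 3.5])
eps = 1e-9

# Infinitesimal perturbations of either input leave the indices unchanged,
# so the derivative is zero almost everywhere.
base = np.searchsorted(x, v)
assert np.array_equal(base, np.searchsorted(x + eps, v))
assert np.array_equal(base, np.searchsorted(x, v + eps))
```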


@tsirif
Contributor Author

tsirif commented Apr 22, 2016

  1. Don't merge yet; I will fix what you commented on, and I will also make side a dynamic param for practice.

  2. I cannot find it right now. Maybe it was a misconception of mine. I will fix this.

  3. Thanks for accepting me into GSoC! 😃

def searchsorted(x, v, side='left', sorter=None):
    """Find indices where elements should be inserted to maintain order.

    Wraping of numpy.searchsorted. Find the indices into a sorted array
Member

"Wrapping"

Contributor Author

oops.

@tsirif
Contributor Author

tsirif commented Apr 24, 2016

OK, I think I have completed the changes according to your comments, and I also added side as a dynamic param.

@lamblin
Member

lamblin commented Apr 25, 2016

General note: could you rebase your changes on top of the master branch, rather than merging with it? Merging brought unrelated changes into this PR.
Also, one test is failing; it looks like the numerical gradient estimation went across a point where the gradient is not defined, causing it to expect a huge gradient.
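A numpy-only illustration of that failure mode (my own sketch, not code from the PR): when sampled values sit closer together than the finite-difference step, a perturbation can push a query value past its neighbour, the integer output of searchsorted jumps, and the numerical gradient estimate blows up.

```python
import numpy as np

rng = np.random.RandomState(0)
eps = 1e-4                                  # a typical finite-difference step size
a = np.sort(rng.uniform(0, 1, size=1000))   # dense sampling: near-collisions likely
min_gap = np.diff(a).min()

# With 1000 samples in [0, 1], the smallest gap is almost surely far below
# eps, so finite differences straddle a discontinuity of the index output.
print(min_gap < eps)
```

Sampling fewer values, or spanning a larger range, raises the minimum gap above the perturbation step and avoids the spurious huge gradient.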

@tsirif
Contributor Author

tsirif commented Apr 25, 2016

@lamblin, what should be done on the failing test?

@tsirif
Contributor Author

tsirif commented Apr 25, 2016

Also, you mean that I should not merge with --no-ff into the branch from which I will issue a pull request, right? You do not want, e.g., Merge branch 'feature/searchsorted' into develop commits in the history tree. I commit frequently, though, with small partial changes that may not yet be complete code features ready to run through CI, and I do not want to spam Travis hooks with every push. Should I not care about this? Should I use [ci skip] in commit messages, or do you suggest something else?

@lamblin
Member

lamblin commented Apr 25, 2016

For the failing test, I would suggest sampling fewer values, or spanning a larger range, to avoid near-collisions in the values.
For the merge, that is right: you should avoid merging with central/master or upstream/master while developing. If there is a conflict or you need to update, it is better to use git rebase -i central/master instead. This will "lift" your commits on top of the latest master commits and apply them there.
You can have partial local commits and push once in a while. I've never used [ci skip], but it may be a good idea.
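The rebase workflow described above can be sketched end to end in a throwaway repo (branch and file names here are illustrative; in the PR's setting the actual command was git rebase -i central/master against the upstream remote):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email dev@example.com && git config user.name dev

echo base > f0 && git add f0 && git commit -q -m "base"          # shared history
main_branch=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b feature
echo feat > f1 && git add f1 && git commit -q -m "feature work"

git checkout -q "$main_branch"
echo up > f2 && git add f2 && git commit -q -m "upstream change"

# "Lift" the feature commits on top of the latest upstream commits
# (non-interactive here; add -i to reorder or squash, as in the advice above).
git checkout -q feature
git rebase -q "$main_branch"
git log --oneline   # feature work now sits on top of upstream change
```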

@tsirif
Contributor Author

tsirif commented Apr 26, 2016

@lamblin: So far I have had a feature/<something> branch in which I work, and when the next x commits in that branch complete a code feature, I merge it into a develop branch with --no-ff. The develop branch is the one from which I issue the pull request, so every time a new merge commit is pushed to the repo, a new Travis hook is triggered. If, by the time I want to push to develop, I keep rebasing this subtree on top of the most recent theano/master branch, would this be a desirable development scheme?

MarcCote and others added 4 commits April 27, 2016 16:41
- Make sure perform outputs a numpy.ndarray.
- Add C implementation of searchsorted
- Fix PEP8 format
- Fix props, make_node, perform of SearchsortedOp
- Add searchsorted method (wrapper) on tensor variable
- Fix c_code and grad in SearchsortedOp, fix tests
- Update documentation
- Fix tests and grad
- Add more documentation
- Fix doc Raises ValueError in `as_tensor_variable`
- Remove 'DebugMode' from tests
- Resolve :215 of test_extra_ops to int explicitly
Reason: numpy.uint64 + int -> numpy.float64 for some reason (numpy/numpy@3509704)
@tsirif
Contributor Author

tsirif commented Apr 27, 2016

I rebased on top of the current theano/master and squashed commits into a single commit for every merge I had done into my develop from the feature/searchsorted branch. The build is passing successfully.

@nouiz
Member

nouiz commented Apr 28, 2016

You can rebase on top of central/master when needed, but you don't need to always do it. Most of the time it isn't needed.

But when a rebase/merge is needed, do a rebase as you did here. I'll check this PR again.

    __props__ = ("side", )

    def __init__(self, side='left'):
        self.side = side
Member

Add an assert that side is "left" or "right".

Contributor Author

Maybe I should let numpy's exception be raised. I will change the c_code implementation to allow this.

Member

It is better to raise the error when we build the graph, when possible, than during execution. So I would add the check here.
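A minimal sketch of the check this discussion converges on (a hypothetical, simplified class, not the actual Theano Op): validate side in __init__, so the error surfaces at graph-construction time rather than during execution.

```python
class SearchsortedOp(object):
    """Simplified stand-in for the real Op, showing only the validation."""
    __props__ = ("side",)

    def __init__(self, side='left'):
        # Fail fast at graph-construction time, mirroring numpy's wording.
        if side not in ('left', 'right'):
            raise ValueError("%r is an invalid value for keyword 'side'" % side)
        self.side = side
```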

@@ -692,6 +692,9 @@ def cumsum(self, axis=None):
    def cumprod(self, axis=None):
        return theano.tensor.extra_ops.cumprod(self, axis)

    def searchsorted(self, v, side='left', sorter=None):
Member

This isn't used in your tests. Just change one test to call it this way. This makes sure we won't break/remove that interface by mistake.

Contributor Author

👍
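The interface parity being asked for can be shown with plain numpy (my illustration; the Theano method wrapper mirrors ndarray.searchsorted the same way):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
v = [0, 3, 6]

# Function form and method form must agree; exercising the method form in a
# test guards the x.searchsorted(...) interface against accidental removal.
func_result = np.searchsorted(x, v, side='right')
meth_result = x.searchsorted(v, side='right')
assert np.array_equal(func_result, meth_result)
```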

@nouiz
Member

nouiz commented Apr 28, 2016

I have a few very small questions/comments, so we should be able to merge this soon.

thanks.

- Add extra tests for searchsorted
@tsirif
Contributor Author

tsirif commented Apr 28, 2016

@nouiz I pushed fixes

@MarcCote
Contributor

MarcCote commented May 6, 2016

@tsirif well done. 👍
