
Develop SearchsortedOp to wrap numpy's searchsorted function #4422

Merged
5 commits merged on Apr 29, 2016

Conversation

tsirif
Contributor

@tsirif tsirif commented Apr 22, 2016

Includes:

  • A SearchsortedOp, which includes perform and c_code
  • A searchsorted function with numpy-equivalent call
  • A method in tensor's basic methods which redirects to searchsorted with numpy-equivalent call to ndarray's method
  • Unit tests for SearchsortedOp's functionalities

Currently, the side string argument is given to SearchsortedOp as an __init__ argument. A param-based version is not implemented in this patch.
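A numpy-only sketch of the call semantics the new Op wraps (the Theano searchsorted function described above mirrors numpy.searchsorted argument-for-argument):

```python
import numpy as np

# Sorted array and query values; the Theano wrapper mirrors these
# numpy semantics, including the `side` keyword.
sorted_x = np.array([1, 2, 3, 4, 5])
values = np.array([0, 3, 6])

left = np.searchsorted(sorted_x, values, side='left')    # leftmost insert points
right = np.searchsorted(sorted_x, values, side='right')  # rightmost insert points
```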

@@ -147,6 +147,8 @@ def as_tensor_variable(x, name=None, ndim=None):
If an `Apply` with more than one output is fetched.
AsTensorError
If `x` cannot be converted to a TensorType Variable.
TypeError
If `x` cannot be made into a Variable with `ndim` dimensions.
Member

I get a value error in that case:

>>> import theano
>>> v = theano.tensor.matrix()
>>> theano.tensor.as_tensor_variable(v, ndim=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "theano/tensor/basic.py", line 182, in as_tensor_variable
    % ndim, x.type
ValueError: ('TensorType could not be cast to have 1 dimensions', TensorType(float64, matrix))

In which case do you get that error?

Contributor Author

In the params doc above, it says:
"""
ndim : None or integer
Return a Variable with this many dimensions. Raise TypeError if it's
not possible.
"""

I'm looking into it

@nouiz
Member

nouiz commented Apr 22, 2016

This PR looks pretty good. I just have a few questions/comments.

thanks

@tsirif
Copy link
Contributor Author

tsirif commented Apr 22, 2016

Thank you Fred. I am working on these comments right now. Two questions:

  1. Does the side argument need to be a dynamic op param or not?
  2. About ndim > 1 in the grad method: the documentation says that handling grads of non-vector inputs is not implemented yet.

@nouiz
Member

nouiz commented Apr 22, 2016

side doesn't need to be a dynamic param, but it can be. You can wrap the string in a Generic() variable.

The other option is to use params.

Either change would allow us to compile fewer versions of this code. Otherwise, we need to compile two versions.

This isn't a must-do to merge this PR, but if we could compile just one version, it would be great.

  2. I'm not sure what you are talking about. Here the grad implementation would be the same as for 1d: zeros(). Where did you see this?
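A numpy-only illustration of the point about the gradient (my own sketch, not code from the PR): searchsorted returns integer indices that are piecewise constant in their inputs, so wherever the gradient is defined it is zero, regardless of ndim.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([1.5, 3.5])
eps = 1e-9

# Infinitesimal perturbations of either input leave the indices unchanged,
# so the derivative is zero almost everywhere.
base = np.searchsorted(x, v)
assert np.array_equal(base, np.searchsorted(x + eps, v))
assert np.array_equal(base, np.searchsorted(x, v + eps))
```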


@tsirif
Contributor Author

tsirif commented Apr 22, 2016

  1. Don't merge yet; I will fix what you commented on, and I will also make side a dynamic param for practice.

  2. I cannot find it right now. Maybe it was a misconception of mine. I will fix this.

  3. Thanks for accepting me into GSoC! 😃

def searchsorted(x, v, side='left', sorter=None):
    """Find indices where elements should be inserted to maintain order.

    Wraping of numpy.searchsorted. Find the indices into a sorted array
Member

"Wrapping"

Contributor Author

oops.

@tsirif
Contributor Author

tsirif commented Apr 24, 2016

OK, I think I have completed the changes according to your comments, and I also added side as a dynamic param.

@lamblin
Member

lamblin commented Apr 25, 2016

General note: could you rebase your changes on top of the master branch, rather than merging with it? Merging brought unrelated changes into this PR.
Also, one test is failing; it looks like the numerical gradient estimation went across a point where the gradient is not defined, causing it to expect a huge gradient.
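A numpy-only illustration of that failure mode (my own sketch, not code from the PR): when sampled values sit closer together than the finite-difference step, a perturbation can push a query value past its neighbour, the integer output of searchsorted jumps, and the numerical gradient estimate blows up.

```python
import numpy as np

rng = np.random.RandomState(0)
eps = 1e-4                                  # a typical finite-difference step size
a = np.sort(rng.uniform(0, 1, size=1000))   # dense sampling: near-collisions likely
min_gap = np.diff(a).min()

# With 1000 samples in [0, 1], the smallest gap is almost surely far below
# eps, so finite differences straddle a discontinuity of the index output.
print(min_gap < eps)
```

Sampling fewer values, or spanning a larger range, raises the minimum gap above the perturbation step and avoids the spurious huge gradient.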

@tsirif
Contributor Author

tsirif commented Apr 25, 2016

@lamblin, what should be done on the failing test?

@tsirif
Contributor Author

tsirif commented Apr 25, 2016

Also, you mean that I should not merge with --no-ff into the branch from which I will issue a pull request, right? You do not want, e.g., Merge branch 'feature/searchsorted' into develop commits in the history tree. I commit frequently, though, with small partial changes that may not yet be complete code features ready to run through CI, and I do not want to spam Travis hooks with every push. Should I not care about this? Should I use [ci skip] in commit messages, or do you suggest something else?

@lamblin
Member

lamblin commented Apr 25, 2016

For the failing test, I would suggest sampling fewer values, or spanning a larger range, to avoid near-collisions in the values.
For the merge, that is right: you should avoid merging with central/master or upstream/master while developing. If there is a conflict or you need to update, it is better to use git rebase -i central/master instead. This will "lift" your commits on top of the latest master commits and apply them there.
You can have partial local commits and push once in a while. I've never used [ci skip], but it may be a good idea.
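The rebase workflow described above can be sketched end to end in a throwaway repo (branch and file names here are illustrative; in the PR's setting the actual command was git rebase -i central/master against the upstream remote):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git config user.email dev@example.com && git config user.name dev

echo base > f0 && git add f0 && git commit -q -m "base"          # shared history
main_branch=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b feature
echo feat > f1 && git add f1 && git commit -q -m "feature work"

git checkout -q "$main_branch"
echo up > f2 && git add f2 && git commit -q -m "upstream change"

# "Lift" the feature commits on top of the latest upstream commits
# (non-interactive here; add -i to reorder or squash, as in the advice above).
git checkout -q feature
git rebase -q "$main_branch"
git log --oneline   # feature work now sits on top of upstream change
```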

@tsirif
Contributor Author

tsirif commented Apr 26, 2016

@lamblin: So far I have had a feature/<something> branch in which I work, and when the next x commits in that branch complete a code feature, I merge it into a develop branch with --no-ff. The develop branch is the one from which I issue the pull request, so every time a new merge commit is pushed to the repo, a new Travis hook is triggered. If, by the time I want to push to develop, I keep rebasing this subtree on top of the most recent theano/master branch, would this be a desirable development scheme?

MarcCote and others added 4 commits April 27, 2016 16:41
- Make sure perform outputs a numpy.ndarray.
- Add C implementation of searchsorted
- Fix PEP8 format
- Fix props, make_node, perform of SearchsortedOp
- Add searchsorted method (wrapper) on tensor variable
- Fix c_code and grad in SearchsortedOp, fix tests
- Update documentation
- Fix tests and grad
- Add more documentation
- Fix doc Raises ValueError in `as_tensor_variable`
- Remove 'DebugMode' from tests
- Resolve :215 of test_extra_ops to int explicitly
Reason: numpy.uint64 + int -> numpy.float64 for some reason (numpy/numpy@3509704)
@tsirif
Contributor Author

tsirif commented Apr 27, 2016

I rebased on top of the current theano/master and squashed commits into a single commit for every merge I had done into my develop from the feature/searchsorted branch. The build is passing successfully.

@nouiz
Member

nouiz commented Apr 28, 2016

You can rebase on top of central/master when needed, but you don't need to always do it. Most of the time it isn't needed.

But when a rebase/merge is needed, do a rebase as you did here. I'll check this PR again.

    __props__ = ("side", )

    def __init__(self, side='left'):
        self.side = side
Member

Add an assert that side is "left" or "right".

Contributor Author

Maybe I should let numpy's exception be raised. I will change the c_code implementation to allow this.

Member

It is better to raise the error when we build the graph, when possible, than during execution. So I would add the check here.
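A minimal sketch of the check this discussion converges on (a hypothetical, simplified class, not the actual Theano Op): validate side in __init__, so the error surfaces at graph-construction time rather than during execution.

```python
class SearchsortedOp(object):
    """Simplified stand-in for the real Op, showing only the validation."""
    __props__ = ("side",)

    def __init__(self, side='left'):
        # Fail fast at graph-construction time, mirroring numpy's wording.
        if side not in ('left', 'right'):
            raise ValueError("%r is an invalid value for keyword 'side'" % side)
        self.side = side
```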

@@ -692,6 +692,9 @@ def cumsum(self, axis=None):
    def cumprod(self, axis=None):
        return theano.tensor.extra_ops.cumprod(self, axis)

    def searchsorted(self, v, side='left', sorter=None):
Member

This isn't used in your tests. Just change one test to call it this way. This makes sure we won't break/remove that interface by mistake.

Contributor Author

👍
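The interface parity being asked for can be shown with plain numpy (my illustration; the Theano method wrapper mirrors ndarray.searchsorted the same way):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
v = [0, 3, 6]

# Function form and method form must agree; exercising the method form in a
# test guards the x.searchsorted(...) interface against accidental removal.
func_result = np.searchsorted(x, v, side='right')
meth_result = x.searchsorted(v, side='right')
assert np.array_equal(func_result, meth_result)
```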

@nouiz
Member

nouiz commented Apr 28, 2016

I have a few very small questions/comments, so we should be able to merge this soon.

thanks.

- Add extra tests for searchsorted
@tsirif
Contributor Author

tsirif commented Apr 28, 2016

@nouiz I pushed fixes

@MarcCote
Contributor

MarcCote commented May 6, 2016

@tsirif well done. 👍
