Generalize slicing and slice assignment ops (including gather and scatter) #206

Closed
girving opened this Issue Nov 13, 2015 · 83 comments

girving (Contributor) commented Nov 13, 2015

We should make our slicing and assignment ops more general to capture more of numpy's slicing functionality, and add __getitem__ sugar for all of it (a numpy sketch of the target semantics follows the list). Specifically,

  1. We should have a 2.5-dimensional set of ops, with dimensions (1) get vs. set, (2) slice type, and, for the assignment ops, (3) the update op. Currently we have slice, assign_update, assign_add, assign_sub, gather, scatter_update, scatter_add, scatter_sub. We should also have assign_slice_update, assign_slice_add, assign_slice_sub.
  2. Both slicing and slice assignment should support strides, with no performance cost if strides aren't used.
  3. Ideally, the slice ops should support negative indexing à la Python. Since the slice parameters are already on the CPU, this is implementable at near zero cost. The unfortunate bit is that since we picked the wrong format for specifying ranges (start + length instead of start : end), negative indexing might be awkward. Thus, it might be best left to a separate bug.
  4. Support numpy-like boolean indexing.
  5. Generalize gather and scatter_* to take an array of input index tensors, broadcast them efficiently, and do multidimensional indexing similar to numpy.
  6. Make __getitem__ provide sugar for all of the above. Ideally we'd also have something idiomatic for __setitem__, but this is problematic: it's important to get the resulting assignment op back, __setitem__ does not return a value, and the nice range sugar is available only inside indexing / assignment calls.
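For concreteness, here is a minimal numpy sketch of the semantics items 2-5 aim at (plain numpy; nothing here is existing TensorFlow API):

import numpy as np

x = np.arange(24).reshape(4, 6)
x[::2, 1:5:2]              # strided slicing (item 2)
x[-1, -3:]                 # negative indexing (item 3)
x[x > 10]                  # boolean indexing (item 4)
x[[0, 2], [1, 3]]          # paired index tensors, one entry per pair (item 5)
x[np.ix_([0, 2], [1, 3])]  # broadcast index arrays into a 2x2 block (item 5)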

@ebrevdo: I'm assigning this to you for now since you might get to it first, but feel free to grab only the piece of it that you need for now.

girving (Contributor) commented Nov 20, 2015

Lasse requests the equivalent of numpy mixed indexing:

x[:, tensor]

which combines slicing with indexing-by-tensor.

lespeholt (Contributor) commented Nov 20, 2015

...where tensor can be either a scalar (which selects all the values in that column) or a vector that selects an individual column in each row, so:

foo = tf.constant([[1,2,3], [4,5,6]])
foo[:, 1] # [2, 5]
indexes = tf.constant([1, 2])
foo[:, indexes] # [2, 6]
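For reference, plain numpy draws this distinction via advanced indexing; the per-row selection requested here corresponds to paired index arrays (a numpy-only sketch, not existing TensorFlow syntax):

import numpy as np

foo = np.array([[1, 2, 3], [4, 5, 6]])
foo[:, 1]                  # [2, 5]: a scalar index selects one column
foo[:, [1, 2]]             # [[2, 3], [5, 6]]: numpy keeps whole columns here
foo[np.arange(2), [1, 2]]  # [2, 6]: paired indices pick one entry per row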
avostryakov commented Nov 20, 2015

If I understand correctly, the following code is exactly what is needed for a cross-entropy loss function:

indexes = tf.constant([1, 2])
foo[:, indexes] # [2, 6]

If we have this kind of indexing, we can write:

cost = -tf.reduce_sum(tf.log(network_output[:, targets]))

where targets is a vector of class indexes, instead of

cost = -tf.reduce_sum(targets * tf.log(network_output))

where targets is a sparse matrix.

Am I correct?

yaroslavvb (Contributor) commented Dec 2, 2015

Numpy also has newaxis and Ellipsis objects, i.e., to prepend an axis in numpy notation:

a[np.newaxis, ...]

http://docs.scipy.org/doc/numpy-1.10.0/reference/arrays.indexing.html
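A quick shape-only illustration of both objects (plain numpy):

import numpy as np

a = np.zeros((2, 3, 4))
a[np.newaxis, ...].shape  # (1, 2, 3, 4): prepend an axis
a[..., np.newaxis].shape  # (2, 3, 4, 1): append an axis
a[0, ..., 1].shape        # (3,): Ellipsis stands in for the middle dimensions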

girving (Contributor) commented Dec 3, 2015

Yes, newaxis is essential.

girving (Contributor) commented Dec 8, 2015

As part of this, we should make #418 work.

olange-google commented Feb 17, 2016

It would also be helpful if gather supported an axis parameter:

current behavior:
gather(v, indices) --> output[i, ...] = params[indices[i], ...]

wanted behavior:
gather(v, indices, axis=1) --> output[:, i, :] = params[:, indices[i], :]

(Please excuse me if this is already covered by the list of requirements posted above.)

girving (Contributor) commented Feb 17, 2016

@olange-google: I think we're unlikely to implement an axis parameter since it's beyond numpy features, but what you want is covered by the combination of slice indexing and advanced indexing.

ebrevdo (Contributor) commented Feb 17, 2016

Are you sure? If in numpy I use:

array[:, :, [1, 2, 3], :]

then that's equivalent to gather(array, [1, 2, 3], axis=2), is it not?


girving (Contributor) commented Feb 17, 2016

We're saying the same thing. I'm not objecting to that functionality, just to exposing it via that sort of axis parameter rather than as a special case of the combination of slice indexing and advanced indexing.

mhejrati commented Feb 23, 2016

@ebrevdo What is the status of this issue?
I am interested in implementing it if no one is working on it.

ebrevdo (Contributor) commented Feb 28, 2016

I plan to work on this, but not until after next week. If you want to work on this, I suggest supporting both GPU and CPU in your kernels. If you can't implement that, you may want to wait for us to implement it.

MInner commented Apr 2, 2016

Hi :) Could you please give an update on the status of this feature?

The _SliceHelper docstring says that the "stop" of the slice must not be omitted; however, this case seems to be handled just fine in the function implementation:

if s.stop is None or s.stop == sys.maxsize:
    sizes.append(-1)

Or am I getting something wrong?
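In other words, an omitted stop already appears to go through; a minimal sketch of what seems to work, assuming the _SliceHelper code path quoted above:

import tensorflow as tf

x = tf.constant([1, 2, 3, 4])
y = x[1:]  # stop omitted; the helper maps it to size -1, i.e. "to the end"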
ebrevdo (Contributor) commented Apr 2, 2016

We recently added the gather_nd op, which performs a special subset of the required functionality: given a tensor of indices, gather the requested values.

Advanced slicing is on the radar.
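A minimal usage sketch of the new op (the shapes here are illustrative assumptions):

import tensorflow as tf

params = tf.constant([[1, 2], [3, 4]])
indices = tf.constant([[0, 0], [1, 1]])  # each row of indices is one coordinate tuple
out = tf.gather_nd(params, indices)      # selects params[0, 0] and params[1, 1] -> [1, 4]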

waleedka (Contributor) commented Apr 9, 2016

@ebrevdo I tried using the gather_nd op to get the last relevant output from a variable length LSTM network. I'm passing sequence_length to the RNN, which means that the last few outputs of most examples are zeros, so I'm trying to read the last non-zero output. I'm getting this error, though, in the training phase:

NotImplementedError: Gradient for gather_nd is not implemented.

  outputs, state = rnn.rnn(multi_rnn_cell, inputs, dtype=tf.float32, sequence_length=lengths_ph)

  indicies = tf.concat(1, [
      tf.expand_dims(lengths_ph - 1, 1),
      tf.expand_dims(tf.range(tf.shape(vectors_ph)[0]), 1),
      tf.expand_dims(tf.zeros_like(lengths_ph), 1),
      ])
  output_tensor = tf.pack(outputs)
  relevant_output = tf.gather_nd(output_tensor, indicies)
ebrevdo (Contributor) commented Apr 9, 2016

Yeah, we haven't written the gradient implementation for gather_nd yet. It's essentially a reshape followed by a call to sparse_to_dense; but sparse_to_dense doesn't have a GPU implementation (on my TODO), so I'm not using it yet.


hycis commented Apr 17, 2016

Hi @ebrevdo, do you have a timeline for implementing the gradient? We are currently using that function, and it's quite useful.

nova77 commented Apr 20, 2016

While we wait for gather_nd to support gradients, here is a temporary solution:

x = tf.constant([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
idx = tf.constant([1, 0, 2])
idx_flattened = tf.range(0, x.shape[0]) * x.shape[1] + idx
y = tf.gather(tf.reshape(x, [-1]),  # flatten input
              idx_flattened)  # use flattened indices

with tf.Session(''):
  print y.eval()  # [2 4 9]
ebrevdo (Contributor) commented Apr 20, 2016

I will implement a gradient in the next week. Keep in mind that it will be CPU-only for now.


danijar (Member) commented Apr 25, 2016

@waleedka I adapted @ebrevdo's example to work with an additional dimension for the output neurons of an RNN. This should yield the last relevant output activations while preserving the shape information.

def extract_last_relevant(outputs, length):
    """
    Args:
        outputs: [Tensor(batch_size, output_neurons)]: A list containing the output
            activations of each example in the batch, for each time step, as
            returned by tensorflow.models.rnn.rnn.
        length: Tensor(batch_size): The used sequence length of each example in the
            batch with all later time steps being zeros. Should be of type tf.int32.

    Returns:
        Tensor(batch_size, output_neurons): The last relevant output activation for
            each example in the batch.
    """
    output = tf.transpose(tf.pack(outputs), perm=[1, 0, 2])
    # Query shape.
    batch_size = tf.shape(output)[0]
    max_length = int(output.get_shape()[1])
    num_neurons = int(output.get_shape()[2])
    # Index into flattened array as a workaround.
    index = tf.range(0, batch_size) * max_length + (length - 1)
    flat = tf.reshape(output, [-1, num_neurons])
    relevant = tf.gather(flat, index)
    return relevant
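A hypothetical usage sketch, reusing the names from waleedka's snippet above (lengths_ph is assumed to be an int32 tensor of sequence lengths):

outputs, state = rnn.rnn(multi_rnn_cell, inputs, dtype=tf.float32, sequence_length=lengths_ph)
relevant = extract_last_relevant(outputs, lengths_ph)  # Tensor(batch_size, output_neurons)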
erickrf commented May 3, 2016

@danijar this is a working solution, but when I tried it, I got the following warning from tensorflow:

UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

Apparently, this is caused by the index slices returned by tf.gather. Does anyone know if there's a way to avoid this problem?

danijar (Member) commented May 3, 2016

@erickrf I would be glad to hear of a better solution as well. Of course, you can hide the warning like any other Python warning, though.
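For example, since the message is emitted through Python's warnings machinery, something along these lines should silence it (a sketch, assuming the warning text stays stable across versions):

import warnings

warnings.filterwarnings('ignore', message='Converting sparse IndexedSlices')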

guillaumeBellec commented Nov 8, 2016

A little inconvenience I came across: when indexing with a Tensor, I get a type error. I can cast 'a' to int32 in the following code, but that is a bit of a hack, I believe.

import tensorflow as tf

A = tf.constant([0, 2, 3, 1])
B = tf.constant([0, 1, 2, 3])

a = tf.argmax(A, 0)  # note: A is rank 1, so the axis should be 0; argmax returns int64
B[a]

TypeError: Input 'strides' of 'StridedSlice' Op has type int32 that does not match type int64 of argument 'begin'.
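The cast workaround mentioned above would look roughly like this (a sketch: argmax returns an int64 tensor, while the slice helper builds int32 strides):

a = tf.cast(tf.argmax(A, 0), tf.int32)  # cast the index down to int32
B[a]  # 'begin' and 'strides' now agree on int32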

ravigarg27 commented Nov 17, 2016

Hey all, I want to update a tensor at certain locations (along one dimension only). I understand I can do it via scatter_update; however, it appears it doesn't have a registered gradient. Is there a workaround for this?

MInner commented Nov 18, 2016

@ravigarg27 a dumb workaround (if your new values are not Variables you're going to compute gradients over) is to do something conceptually like A - bool(update != 0)*A + update, where update has the same shape as A and all non-updated entries are equal to 0; however, that is quite out of the scope of this thread :)
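Spelled out, that workaround might look like this (a sketch under the stated assumption that every non-updated entry of update is zero):

mask = tf.cast(tf.not_equal(update, 0), A.dtype)  # 1 where update carries a value
A_new = A * (1 - mask) + update                   # keep A elsewhere, overwrite at the mask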

ravigarg27 commented Nov 18, 2016

@MInner What I want is something along the lines of A[indices, :] = B, where A and B are Variable matrices.

MInner commented Nov 26, 2016

@danijar, I wonder if your extract_last_relevant() snippet for extracting the last non-zero outputs of tensorflow.models.rnn.dynamic_rnn posted above is still required (downside: the sparse-to-dense warning indicating that it might lead to a huge matrix allocation), or whether one could use the new (0.11rc) partial implementation of smart indexing in tensorflow to address this issue. (I'm still a little confused about which parts of smart indexing work now; there was an announcement in the 0.11rc changelog about improvements in indexing, but they do not seem to address this specific "indexing via another tensor" issue, do they?)

danijar (Member) commented Dec 3, 2016

@MInner I played around with it. Unfortunately, it doesn't seem like the new indexing can simplify that.

warmspringwinds commented Dec 28, 2016

Hey guys, sorry to keep bugging you. Any update on the gradient implementation for tf.gather_nd()?

@aselle aselle added type:feature and removed enhancement labels Feb 9, 2017

yongtang added a commit to yongtang/tensorflow that referenced this issue Apr 5, 2017

Allow the output of `tf.argmax` as index type
This fix tries to fix the issue raised in #8951, where
the following will raise a `TypeError`:
```
a = tf.constant([1, 2, 3], dtype=tf.float32)
b = tf.argmax(a)
tf.Session().run(a[b])

TypeError: Input 'strides' of 'StridedSlice' Op has type int32 that does not match type int64 of argument 'begin'.
```
The reason for the error is that `strides` is appended as `append(1)`
without a dtype, while `begin` is appended with a dtype.

The mismatch of `strides` and `begin` causes the error.

This fix fixes the issue by casting the stride to the same type
as `begin` when needed.

This issue was raised in #8951. It was also raised earlier in
tensorflow#206 (comment)

This fix fixes #8951.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

danijar (Member) commented Apr 6, 2017

Anyone who needs to tf.gather() along the second dimension:

def gather_along_second_axis(data, indices):
  flat_indices = tf.tile(indices[None, :], [tf.shape(data)[0], 1])
  batch_offset = tf.range(0, tf.shape(data)[0]) * tf.shape(data)[1]
  flat_indices = tf.reshape(flat_indices + batch_offset[:, None], [-1])
  flat_data = tf.reshape(data, tf.concat([[-1], tf.shape(data)[2:]], 0))
  result_shape = tf.concat([[tf.shape(data)[0], -1], tf.shape(data)[2:]], 0)
  result = tf.reshape(tf.gather(flat_data, flat_indices), result_shape)
  shape = data.shape[:1].concatenate(indices.shape[:1])
  result.set_shape(shape.concatenate(data.shape[2:]))
  return result
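A usage sketch with hypothetical shapes:

data = tf.random_normal([4, 5, 3])          # e.g. batch, time, features
cols = tf.constant([0, 2])
out = gather_along_second_axis(data, cols)  # shape (4, 2, 3)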
MInner commented Apr 6, 2017

@danijar I might have misunderstood what you are doing, but wouldn't tf.transpose(tf.gather(tf.transpose(x, ..), ..), ..) do the trick? Moreover, gather_nd() thankfully has gradients now.

aselle (Member) commented Apr 7, 2017

@warmspringwinds it is deployed.

danijar (Member) commented Apr 14, 2017

@MInner You're right, we can use transpose to gather along any dimension:

def gather_along_axis(data, indices, axis=0):
  if not axis:
    return tf.gather(data, indices)
  rank = data.shape.ndims
  perm = [axis] + list(range(1, axis)) + [0] + list(range(axis + 1, rank))
  return tf.transpose(tf.gather(tf.transpose(data, perm), indices), perm)

This is slower than my snippet above, though, as it needs to rotate the whole data array in memory instead of computing the correct indices into the original tensor.
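For example, with hypothetical shapes:

data = tf.zeros([4, 5, 6])
out = gather_along_axis(data, tf.constant([0, 2]), axis=2)  # shape (4, 5, 2)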

MInner commented Apr 15, 2017

@danijar good to know!


vrv added a commit that referenced this issue May 2, 2017

Allow the output of `tf.argmax` as index type (#9000)
ddmbr commented Oct 20, 2017

Hi,

I just saw @nova77's code above:

x = tf.constant([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])
idx = tf.constant([1, 0, 2])
idx_flattened = tf.range(0, x.shape[0]) * x.shape[1] + idx
y = tf.gather(tf.reshape(x, [-1]),  # flatten input
              idx_flattened)  # use flattened indices

with tf.Session(''):
  print y.eval()  # [2 4 9]

Do we still need this workaround? Or is there already a solution (or an issue for me to track)? Thanks!
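As of TF 1.x, gather_nd does have a gradient, so one equivalent formulation of the same selection would be (a sketch, not necessarily the canonical replacement):

coords = tf.stack([tf.range(3), idx], axis=1)  # [[0, 1], [1, 0], [2, 2]]
y = tf.gather_nd(x, coords)                    # [2, 4, 9]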

sifatron commented Dec 10, 2017

Can anyone help me with this? The embedding Gather keeps receiving a negative index. How do I solve this? I am using Keras with the TensorFlow backend.

InvalidArgumentError (see above for traceback): indices[0,6] = -1 is not in [0, 15664)
	 [[Node: embedding_1/Gather = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](embedding_1/embeddings/read, embedding_1/Cast)]]