scatter_add for non variable tensors #2358

Closed
altaetran opened this issue May 13, 2016 · 33 comments
Labels
stat:contribution welcome Status - Contributions welcome

Comments

@altaetran

Hi,

I am interested in using scatter_add when the tensor being updated is not a variable. Is this possible?

I am looking to do something like this:

X1_ph = tf.placeholder(tf.float32, shape=(None, 3))
ind_ph = tf.placeholder(tf.int32, shape=(None,))

N_feat = 3
#Z = tf.Variable(tf.zeros([10, 3]))
Z = tf.zeros([10, N_feat])

X1 = np.array([[1, 0.00, 1],
               [2, 0.00, 1],
               [3, 0.00, 1],
               [5, 0.00, 1.1],
               [6, 1.0, 1.8]])

ind = [0, 1, 1, 0, 0]

Z = tf.scatter_add(Z, ind_ph, X1)

If I declare Z as a tf.Variable, I can do this, but I need to call this operation hundreds of thousands of times, and do not want to store any copies of Z once I am done with them. If I were to declare Z as a Variable, would there be any way to destroy Z once I am done with it (maybe with a garbage collector or something similar)? Thank you so much for your help!

@girving
Contributor

girving commented May 14, 2016

Use tf.sparse_to_dense.
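
For context, a minimal sketch of tf.sparse_to_dense in TF 1.x (the shapes and values below are made up for illustration). Note that it places values at distinct indices and, unlike scatter_add, does not accumulate duplicate indices, so it covers construction of a dense tensor rather than repeated accumulation:

import tensorflow as tf

# Illustrative only: build a dense [4, 3] tensor with given values at given
# (row, col) positions; every other entry is default_value.
indices = tf.constant([[0, 0], [1, 2], [3, 1]])
values = tf.constant([1.0, 2.0, 3.0])
dense = tf.sparse_to_dense(indices, output_shape=[4, 3],
                           sparse_values=values, default_value=0.0)

with tf.Session() as sess:
    print(sess.run(dense))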

@girving
Copy link
Contributor

girving commented May 14, 2016

Actually, by "this operation" do you mean the whole thing, or do you want to allocate a Z and do a bunch of separate scatters into it before deallocating it?

@altaetran
Author

I'd like to be able to allocate Z, do a bunch of scatters, and then deallocate it while allocating another Z for a different version. Is there a function I can call to deallocate a variable?

@girving
Contributor

girving commented May 14, 2016

@altaetran: We could certainly make a deallocation op, but I don't think we have one at the moment? However, using it would be somewhat awkward.

@yuanbyu: Do you have any ideas?

@altaetran
Author

Is there any way to retrofit the scatter_add function to work directly on regular tensors produced by other tensorflow operations?

@girving
Contributor

girving commented May 16, 2016

Non-Variable tensors are immutable, so supporting scatter into them would break an important part of the model. The uninitialize op is much easier. Would you be interested in submitting a patch? :)

@mrry
Contributor

mrry commented May 16, 2016

Judging by #2367, it appears that @altaetran also requires gradients for this operation. It sounds to me like a functional op would be preferable for this purpose.

@rryan
Member

rryan commented Jul 14, 2016

/me whistles innocently ;)

var = gen_state_ops._temporary_variable(shape=shape, dtype=tensor_dtype)

@girving
Contributor

girving commented Jul 14, 2016

Excellent. I won't have time to work on this PR in the near term. @rryan Would you want to either take over or elaborate about how to use temporary_variable for this purpose?

@rryan
Member

rryan commented Jul 15, 2016

It won't help with the gradient bit (and this is a non-public op), but based on your example code you could do:

from tensorflow.python.ops import gen_state_ops

Z = gen_state_ops._temporary_variable(shape=..., dtype=...)
Z_name = Z.op.name
destroy_op = gen_state_ops._destroy_temporary_variable(Z, var_name=Z_name)
X1_ph = tf.placeholder(tf.float32, shape=(None, 3))
ind_ph = tf.placeholder(tf.int32, shape=(None,))
Z = tf.scatter_add(Z, ind_ph, X1_ph)


with tf.Session() as sess:
  for _ in xrange(steps):
    # run Z with placeholders filled in for X1_ph and ind_ph
    sess.run(Z, feed_dict={X1_ph: ..., ind_ph: ...})
  # clean up -- destroy_op returns the value of Z and destroys the temporary variable
  sess.run(destroy_op)

@girving
Contributor

girving commented Jul 15, 2016

@rryan Is it possible to define valid gradients for this using gradient_override_map if we use temporary_variable?
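
(For readers following along: gradient_override_map itself is used roughly as in the sketch below, with @tf.RegisterGradient registering a replacement gradient under a new name. Whether that machinery can give a valid gradient for scatter_add into a temporary variable is exactly the open question here; the op and gradient names below are made up for illustration.)

import tensorflow as tf

# Register a gradient function under a made-up name.
@tf.RegisterGradient("MyIdentityGrad")
def _my_identity_grad(op, grad):
    # Pass-through gradient, purely for illustration.
    return grad

g = tf.Graph()
with g.as_default():
    x = tf.placeholder(tf.float32, shape=[3])
    # Tell the graph to use the registered gradient instead of the default
    # one for ops of type "Identity" created inside this context.
    with g.gradient_override_map({"Identity": "MyIdentityGrad"}):
        y = tf.identity(x)
    dy_dx = tf.gradients(y, x)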

@aselle aselle removed the triaged label Jul 28, 2016
@girving
Contributor

girving commented Aug 9, 2016

Removing my assignment since I won't have time to work on this personally. @rryan Could you comment on my gradient question? If there's a reasonable path forward we can mark this contributions welcome, but I'm not sure how the temporary variables stuff works.

@girving girving removed their assignment Aug 9, 2016
@girving girving added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Aug 9, 2016
@yaroslavvb
Contributor

I thought there was a thread a while back (Feb) from @vanhoucke about how to do scatter_add without using variables. If you can do without variables, you can use the same op hundreds of thousands of times by using persistent Tensors, since memory is recycled as soon as the Python handle is unassigned.
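
For reference, the raw persistent-tensor mechanism looks roughly like this in TF 1.x (a minimal sketch following the tf.get_session_handle documentation, not specific to scatter_add):

import tensorflow as tf

with tf.Session() as sess:
    z = tf.zeros([10, 3])
    # Produce a handle to a persistent tensor that outlives this run call.
    handle = sess.run(tf.get_session_handle(z))

    # Feed the handle back in to keep computing on the stored value.
    holder, z_restored = tf.get_session_tensor(handle.handle, tf.float32)
    z_next = z_restored + 1.0
    print(sess.run(z_next, feed_dict={holder: handle.handle}))
    # The stored value can be released explicitly via tf.delete_session_tensor,
    # or implicitly when the Python handle object is no longer referenced.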


@altaetran
Author

Hey, this sounds really promising. Do you think you could provide a simple example of how one might do this? I'm not too familiar with persistent tensors. Thanks!

@yaroslavvb
Contributor

sure, I'll put an example together tomorrow


@girving girving added stat:contribution welcome Status - Contributions welcome and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Aug 10, 2016
@yaroslavvb
Contributor

@altaetran here's an example that uses a helper I wrote to simplify dealing with persistent tensors

  1. Download imperative.py and save it in a place where you can import it (i.e., same directory as your script)

import tensorflow as tf
import imperative
env = imperative.Env(tf)
tfi = env.tf

N_feat = 3
Z = tfi.zeros([10, N_feat])
X1 = tfi.constant([[1, 0.00, 1],
                   [2, 0.00, 1],
                   [3, 0.00, 1],
                   [5, 0.00, 1.1],
                   [6, 1.0, 1.8]])

ind = [0, 1, 1, 0, 0]
for right_pos, left_pos in enumerate(ind):
    new_row = Z[left_pos, :] + X1[right_pos, :]
    # turn vector into 1-by-x matrix so we can concat it
    new_row_mat = tfi.reshape(new_row, [1, -1])
    # make new Tensor with old row replaced by updated version
    Z = tfi.concat(0, [Z[:left_pos, :], new_row_mat, Z[left_pos+1:, :]])
print Z

That should give

ITensor([[ 12.           1.           3.89999986]
 [  5.           0.           2.        ]
 [  0.           0.           0.        ]
 [  0.           0.           0.        ]
 [  0.           0.           0.        ]
 [  0.           0.           0.        ]
 [  0.           0.           0.        ]
 [  0.           0.           0.        ]
 [  0.           0.           0.        ]
 [  0.           0.           0.        ]], dtype=float32)

I'm working on more docs for imperative.py, meanwhile there's an overview with some slides here
https://github.com/yaroslavvb/imperative/blob/master/imperative_slides.pdf

@AdityaGudimella

AdityaGudimella commented Jan 23, 2017

Would this work? Since this is just using TensorFlow ops under the hood, it propagates gradients too.

def scatter_add_tensor(ref, indices, updates, name=None):
    """
    Adds sparse updates to a tensor.

    This operation outputs ref after the update is done. This makes it easier to chain operations that need to use the
    updated value.

    Duplicate indices are handled correctly: if multiple indices reference the same location, their contributions add.

    Because this is built on tf.scatter_nd, indices must be (multi-)indices into ref, e.g. shape [num_updates, 1] when
    adding rows to a 2-D ref, and requires updates.shape = indices.shape[:-1] + ref.shape[indices.shape[-1]:].
    :param ref: A Tensor. Must be one of the following types: float32, float64, int64, int32, uint8, uint16,
        int16, int8, complex64, complex128, qint8, quint8, qint32, half.
    :param indices: A Tensor. Must be one of the following types: int32, int64. A tensor of indices into the first
        dimension of ref.
    :param updates: A Tensor. Must have the same dtype as ref. A tensor of updated values to add to ref.
    :param name: A name for the operation (optional).
    :return: Same as ref. Returned as a convenience for operations that want to use the updated values after the update
        is done.
    """
    with tf.name_scope(name, 'scatter_add_tensor', [ref, indices, updates]) as scope:
        ref = tf.convert_to_tensor(ref, name='ref')
        indices = tf.convert_to_tensor(indices, name='indices')
        updates = tf.convert_to_tensor(updates, name='updates')
        ref_shape = tf.shape(ref, out_type=indices.dtype, name='ref_shape')
        scattered_updates = tf.scatter_nd(indices, updates, ref_shape, name='scattered_updates')
        with tf.control_dependencies(
                [tf.assert_equal(ref_shape, tf.shape(scattered_updates, out_type=indices.dtype))]):
            output = tf.add(ref, scattered_updates, name=scope)
        return output
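
For example (a hypothetical usage sketch mirroring the numbers from the original question; the indices are shaped [N, 1] because of the scatter_nd contract noted above):

import tensorflow as tf

ref = tf.zeros([10, 3])
indices = tf.constant([[0], [1], [1], [0], [0]], dtype=tf.int32)
updates = tf.constant([[1, 0.00, 1],
                       [2, 0.00, 1],
                       [3, 0.00, 1],
                       [5, 0.00, 1.1],
                       [6, 1.0, 1.8]], dtype=tf.float32)

result = scatter_add_tensor(ref, indices, updates)
with tf.Session() as sess:
    # rows 0 and 1 accumulate their duplicate updates; other rows stay zero
    print(sess.run(result))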

@aliosmanulusoy

Thanks @AdityaGudimella ! I've tested the gradients and it seems to work. Can anybody else confirm?

@aliosmanulusoy

@AdityaGudimella you wrote "Duplicate indices are handled correctly: if multiple indices reference the same location, their contributions add." for your function. I've tested this and it seems correct. However, I don't understand why :) Can you please explain? tf.scatter_nd doesn't seem to provide any guarantees when multiple indices reference the same location.

@cjf00000

@aliosmanulusoy According to #8102, it seems that tf.scatter_nd currently adds up duplicate updates
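
A quick way to check the duplicate-index behavior referenced above (TF 1.x graph mode; the values are arbitrary):

import tensorflow as tf

indices = tf.constant([[0], [0], [2]])
updates = tf.constant([1.0, 2.0, 5.0])
# Duplicate index 0 receives 1.0 + 2.0
result = tf.scatter_nd(indices, updates, shape=[4])

with tf.Session() as sess:
    print(sess.run(result))  # [3. 0. 5. 0.]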

@itsmeolivia
Contributor

Automatically closing due to lack of recent activity. Since this issue is old at this point, please reopen the issue if it still occurs when tried with the latest version of Tensorflow. Thank you.

@xcyan

xcyan commented Jul 19, 2017

The latest version of TensorFlow (1.2) still only supports scatter_add on mutable tensors (Variables).

Also, I think there is a bug in the function above: "ref" itself is not updated.

@bodokaiser

Any updates here?

@xcyan

xcyan commented Sep 12, 2017

@bodokaiser I don't think so. They are not working on this thread actively. One solution is to implement your own scatter_add() layer.

@nikonikolov

nikonikolov commented Oct 25, 2017

It would really be very useful if this were implemented.

albertz referenced this issue in Spotlight0xff/returnn Mar 2, 2018
@albertz
Contributor

albertz commented Jan 28, 2019

@altaetran @yaroslavvb @girving @mrry @rryan @aselle @itsmeolivia (or anyone who has access): Can this be reopened? It is still missing, and it would still be a useful addition, as it could potentially speed up some code (where you currently need to use tf.sparse_to_dense, tf.where, or similar instead).

@alextp
Contributor

alextp commented Apr 9, 2019

tensor_scatter_nd_add and friends are the solution to this problem, I think, so there's no need to reopen.
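
For the original example, that would look roughly like the sketch below (assuming a recent TF version where tf.tensor_scatter_nd_add is available; no Variable is involved, and duplicate indices should add their contributions as with tf.scatter_nd):

import tensorflow as tf

Z = tf.zeros([10, 3])
ind = tf.constant([[0], [1], [1], [0], [0]])
X1 = tf.constant([[1, 0.00, 1],
                  [2, 0.00, 1],
                  [3, 0.00, 1],
                  [5, 0.00, 1.1],
                  [6, 1.0, 1.8]])

# Returns a new tensor; Z itself is not mutated.
Z_updated = tf.tensor_scatter_nd_add(Z, ind, X1)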

@albertz
Contributor

albertz commented Apr 11, 2019

Ah thanks, it seems I missed that. When exactly was this added? (The documentation does not mention it.)

@alextp
Contributor

alextp commented Apr 11, 2019

@albertz this was added fairly recently :-)

@ypxie

ypxie commented Apr 23, 2019

What if I don't need gradients? Can I do scatter_update on a tensor just like on a variable?

@alextp
Contributor

alextp commented Apr 23, 2019 via email

@ypxie

ypxie commented Apr 23, 2019

@alextp Hey, thanks! But that API is in TensorFlow 1.13. Is there any workaround for TF 1.12?

@alextp
Contributor

alextp commented Apr 23, 2019 via email
