
TensorFlow equivalent to numpy.repeat #8246

Closed
shoyer opened this issue Mar 9, 2017 · 23 comments
Labels: stat:contribution welcome (Status - Contributions welcome), type:feature (Feature requests)

@shoyer
Contributor

shoyer commented Mar 9, 2017

This is a popular question on StackOverflow:
http://stackoverflow.com/questions/35361467/tensorflow-numpy-repeat-alternative

But note that the answer so far only works for some use cases (the one presented in the question).

The best I could come up with for a general solution uses tf.while_loop, which is pretty verbose (and maybe slower than necessary). I'll add a link to the implementation I wrote for tf.contrib.training.resample_at_rate after the next internal/github sync.
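
In the meantime, a rough sketch of the while_loop approach for a 1-D tensor (illustrative only, not the actual resample_at_rate implementation) looks something like this:

import tensorflow as tf

def while_loop_repeat_1d(values, repeats):
    # Write each run of repeated values into a TensorArray, then concatenate the runs.
    n = tf.size(repeats)
    ta = tf.TensorArray(dtype=values.dtype, size=n, infer_shape=False)

    def body(i, ta):
        run = tf.fill(tf.expand_dims(repeats[i], 0), values[i])
        return i + 1, ta.write(i, run)

    _, ta = tf.while_loop(lambda i, _: i < n, body, [0, ta])
    return ta.concat()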

@prb12 prb12 added type:feature Feature requests stat:contribution welcome Status - Contributions welcome labels Mar 9, 2017
@shoyer
Contributor Author

shoyer commented Mar 15, 2017

See def _repeat_range(counts, name=None) in tf.contrib.training.resample_at_rate for my example using tf.while_loop.

@lattas

lattas commented Apr 3, 2017

Dear @shoyer, I am currently working on this with a new op in array_ops and a Python wrapper function. I will update you soon.

@shoyer
Contributor Author

shoyer commented Apr 3, 2017

@a-lattas Great! Please replace _repeat_range in resample_at_rate (linked above) with the new op as part of your PR.

@shoyer
Contributor Author

shoyer commented Apr 4, 2017

@waleedka
Contributor

What's the status of this update?

@qianyizhang

qianyizhang commented Oct 17, 2017

Since the thread is still open, I will paste my solution here:

import numpy as np
import tensorflow as tf


def np_repeat(tensor, repeats):
    """Repeats a numpy array along every axis.

    Args:
        tensor: A numpy array, 1-D or higher.
        repeats: A list with one repeat count per dimension; its length must
            match the number of dimensions in tensor.

    Returns:
        An array with the same dtype as tensor and shape tensor.shape * repeats.
    """
    assert len(repeats) == tensor.ndim, "dimensions must match"
    repeated = tensor
    for axis, repeat in enumerate(repeats):
        repeated = np.repeat(repeated, repeat, axis=axis)
    return repeated


def tf_repeat(tensor, repeats):
    """TensorFlow equivalent of np_repeat (one repeat count per axis).

    Args:
        tensor: A Tensor, 1-D or higher.
        repeats: A list with one repeat count per dimension; its length must
            match the number of dimensions in tensor.

    Returns:
        A Tensor with the same type as tensor and shape tensor.shape * repeats.
    """
    with tf.variable_scope("repeat"):
        # Add a trailing unit dimension, tile it, then reshape so that each
        # element ends up repeated along every axis.
        expanded_tensor = tf.expand_dims(tensor, -1)
        multiples = [1] + repeats
        tiled_tensor = tf.tile(expanded_tensor, multiples=multiples)
        repeated_tensor = tf.reshape(tiled_tensor, tf.shape(tensor) * repeats)
    return repeated_tensor


def repeat_test():
    shape = [1, 3, 3, 3, 2]
    repeat = [1, 2, 2, 3, 1]
    tensor = np.random.randn(*shape)
    np_repeated_tensor = np_repeat(tensor, repeat)
    tf_repeated = tf_repeat(tf.constant(tensor), repeat)
    with tf.Session() as sess:
        tf_repeated_tensor = sess.run(tf_repeated)
    if np.allclose(np_repeated_tensor, tf_repeated_tensor):
        print("tf_repeat is the same as np_repeat")
    else:
        print("something wrong")

@PeterMitrano

👍 for what seems like a basic feature

@Tutufa

Tutufa commented Nov 20, 2017

need one!

@yongtang
Member

yongtang commented Dec 8, 2017

Added a PR #15224 for the fix.

@liuhu-bigeye

@qianyizhang In numpy.repeat we can assign the number of repetitions for each element, e.g.:

x = np.array([[1, 2], [3, 4]])
np.repeat(x, [1, 2], axis=0)
# array([[1, 2], [3, 4], [3, 4]])

@itsmeolivia
Contributor

This looks like it was fixed by #15224. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

@shoyer
Contributor Author

shoyer commented Feb 8, 2018

#15224 was indeed a PR to add tf.repeat, but it was closed without merging. Reopening.

@RylanSchaeffer

@shoyer, any updates on this?

@shoyer
Contributor Author

shoyer commented Jun 25, 2018

I haven't been looking at this, but check the referenced PRs above.

@sachinruk

Unfortunately @qianyizhang's answer doesn't help with the following use of np.repeat:

x = np.array([0,1,2])
np.repeat(x,[3,4,5])
>>> array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2])

Also, has there been a merged PR?

@jhultman

jhultman commented Sep 8, 2018

@sachinruk have you found a good solution for that use case? I would love to hear about it. In the meantime, if your scenario permits tf.py_func you can try my simple solution below.

import tensorflow as tf
import numpy as np

def tf_repeat(arr, repeats):
    return tf.py_func(np.repeat, [arr, repeats], tf.float32)

arr = np.float32([4, 2, 3, 5])
repeats = np.int32([2, 3, 0, 2])

with tf.Session() as sess:
    out = sess.run(tf_repeat(arr, repeats))
    print(out)

>>> [4. 4. 2. 2. 2. 5. 5.]
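
If py_func is not an option, a graph-only sketch along the same lines (assuming 1-D inputs and non-negative integer repeats; the names here are just illustrative) is to tile each value up to the maximum repeat count and mask out the excess:

def tf_repeat_1d(values, repeats):
    # Tile every value max(repeats) times, then keep only the first repeats[i] copies of each.
    max_repeat = tf.reduce_max(repeats)
    tiled = tf.tile(tf.expand_dims(values, 1), [1, max_repeat])
    mask = tf.sequence_mask(repeats, maxlen=max_repeat)
    return tf.boolean_mask(tiled, mask)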

@mspinaci

mspinaci commented Oct 5, 2018

At least for 1D tensors, a combination of tf.cumsum and tf.sparse_to_dense should do the trick:

import tensorflow as tf


def tf_repeat_1D(x,repeats):
    x = tf.constant(x, dtype=tf.float64)
    repeats = tf.constant(repeats, dtype=tf.int32)

    shape = tf.reduce_sum(repeats)
    idx = tf.concat([tf.constant([0], dtype=tf.int32), tf.cumsum(repeats[:-1])], axis=0)
    y = tf.sparse_to_dense(
        sparse_indices = idx,
        output_shape=(shape,),
        sparse_values=x - tf.concat([tf.constant([0], dtype=tf.float64), x[:-1]], axis=0)
    )

    return tf.cumsum(y)

z1 = tf_repeat_1D([0,1,2], [3,4,5])
z2 = tf_repeat_1D([4,2,5], [1, 3, 2])

with tf.Session() as sess:
    print(z1.eval())
    print(z2.eval())

This prints:

[0. 0. 0. 1. 1. 1. 1. 2. 2. 2. 2. 2.]
[4. 2. 2. 2. 5. 5.]

The trick is to scatter the first differences, np.concatenate(([x[0]], x[1:] - x[:-1])), at the start index of each run, so that a final cumsum reconstructs the repeated values.
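
A small plain-numpy illustration of that reconstruction, using x = [4, 2, 5] and repeats = [1, 3, 2]:

import numpy as np

x = np.array([4., 2., 5.])
repeats = np.array([1, 3, 2])
diffs = np.concatenate(([x[0]], x[1:] - x[:-1]))          # [ 4., -2.,  3.]
starts = np.concatenate(([0], np.cumsum(repeats)[:-1]))   # run start indices: [0, 1, 4]
y = np.zeros(repeats.sum())
y[starts] = diffs                                          # [ 4., -2.,  0.,  0.,  3.,  0.]
print(np.cumsum(y))                                        # [4. 2. 2. 2. 5. 5.]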

Generalizing to higher dimensions is complicated by the fact that sparse_to_dense only accepts 1-D sparse_values input... so another function is probably a better fit there.

@jackd
Contributor

jackd commented Mar 6, 2019

For anyone else tired of waiting for a PR to get reviewed/accepted, I've ported this pull request to a separate repo with some minor tweaks. Only the CPU version is implemented, without gradients.

Just found this hidden implementation. It's not part of the exported API, so you'll have to be a little dirty to access it.

from tensorflow.python.ops.ragged.ragged_util import repeat

@yongtang
Member

yongtang commented Mar 9, 2019

cc @jackd Added a PR #26517 to expose repeat.

@jackd
Contributor

jackd commented Sep 2, 2019

@yongtang Thanks for your work on this. I've been playing with this implementation (and the hidden version) for a while now, but just recently benchmarked performance and... well, it leaves a lot to be desired. For modestly sized arrays it can be significantly improved in terms of both speed and memory usage by gathering a repeated range.

def gather_repeat(values, repeats, axis=0):
    indices = tf.repeat(tf.range(tf.shape(values)[axis]), repeats)
    return tf.gather(values, indices, axis=axis)

(Note: I haven't tested different axes or confirmed it covers all edge cases.)

For the very modestly sized arrays in the example below you can get a ~30% speedup and a 75% memory reduction. Larger tensors result in converging computation time, but the memory factor remains roughly constant. From what I understand, gathers aren't particularly economical, so I imagine there's plenty of performance optimization to go from there (though possibly not in Python). I'm happy to put the following together into a PR (or, if anyone else is more comfortable with the tf PR process, feel free to take it and run with it), but if this motivates the tf team to accept an optimized kernel solution I'd prefer to skip straight to that.

Using pip-installed tf 1.15.0-dev20190821, a GTX-1070, and Python 3.6:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf

max_repeats = 100
nrows = 1000
ndims = 100


def gather_repeat(values, repeats, axis=0):
    indices = tf.repeat(tf.range(tf.shape(values)[axis]), repeats)
    return tf.gather(values, indices, axis=axis)


def get_args():
    r = np.random.RandomState(123)  # pylint: disable=no-member
    repeats = (r.uniform(size=(nrows,),) * max_repeats).astype(np.int64)
    values = r.uniform(size=(nrows, ndims)).astype(np.float32)
    return tf.constant(values), tf.constant(repeats)


def benchmark_base():
    with tf.Graph().as_default():
        values, repeats = get_args()
        op = tf.repeat(values, repeats, axis=0)
        with tf.Session() as sess:
            print('***********************')
            print('base')
            tf.test.Benchmark().run_op_benchmark(sess, op)


def benchmark_gather():
    with tf.Graph().as_default():
        values, repeats = get_args()
        op = gather_repeat(values, repeats, axis=0)
        with tf.Session() as sess:
            print('***********************')
            print('gather')
            tf.test.Benchmark().run_op_benchmark(sess, op)


def ensure_same():
    with tf.Graph().as_default():
        values, repeats = get_args()
        base = tf.repeat(values, repeats, axis=0)
        gathered = gather_repeat(values, repeats, axis=0)
        max_err = tf.reduce_max(tf.abs(gathered - base))
        with tf.Session() as sess:
            print('Max error: {}'.format(sess.run(max_err)))


if __name__ == '__main__':
    benchmark_base()
    benchmark_gather()
    ensure_same()

Output:

***********************
base
entry {
  name: "TensorFlowBenchmark.run_op_benchmark"
  iters: 10
  wall_time: 0.002911686897277832
  extras {
    key: "allocator_maximum_num_bytes_GPU_0_bfc"
    value {
      double_value: 79595528.0
    }
  }
  extras {
    key: "allocator_maximum_num_bytes_gpu_host_bfc"
    value {
      double_value: 24.0
    }
  }
}

***********************
gather
entry {
  name: "TensorFlowBenchmark.run_op_benchmark"
  iters: 10
  wall_time: 0.0021082162857055664
  extras {
    key: "allocator_maximum_num_bytes_GPU_0_bfc"
    value {
      double_value: 20176400.0
    }
  }
  extras {
    key: "allocator_maximum_num_bytes_gpu_host_bfc"
    value {
      double_value: 197768.0
    }
  }
}
Max error: 0.0

@edloper
Contributor

edloper commented Oct 2, 2019

@jackd If you have the cycles to write up an optimized kernel solution for this (incl. a gradient function), I'll help make sure it gets accepted. (I'm the original author of the pure-python version, and a faster version would help speed up several RaggedTensor operations.)

@mohantym mohantym self-assigned this Dec 3, 2021
@mohantym
Contributor

mohantym commented Dec 3, 2021

Hi @shoyer! Have you checked the latest documentation on tf.repeat yet? Thanks!
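
For reference, tf.repeat now covers the per-element use case from the original question directly:

import tensorflow as tf

tf.repeat([0, 1, 2], repeats=[3, 4, 5])
# [0 0 0 1 1 1 1 2 2 2 2 2]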

@shoyer shoyer closed this as completed Dec 3, 2021
@mohantym
Contributor

Are you satisfied with the resolution of your issue? Yes / No
