# Feature: Sparse matrix multiplications for Tensors with rank > 2 #9210

Open
opened this issue Apr 14, 2017 · 9 comments

## System Information:

Windows 10, x64, TensorFlow 1.1.0rc1

## Description:

Multiplying a 3-D sparse tensor (placeholder) with a 3-D dense tensor appears to be broken; the operation fails.

```python
import numpy as np
import tensorflow as tf

x = tf.sparse_placeholder(tf.float32, shape=[None, 2, 2])
y = tf.constant(np.ones([3, 2, 1]), dtype=tf.float32)
z = tf.matmul(x, y, a_is_sparse=True)

indices = [[1, 1, 1], [2, 0, 0], [3, 0, 1]]
values = [1.0, 2.0, 3.0]
dense_shape = [3, 2, 2]
x_val = tf.SparseTensorValue(indices, values, dense_shape)

with tf.Session() as sess:
    res = sess.run(z, feed_dict={x: x_val})
    print(res)
```

Expected result (shape 3×2×1):

```
[[[ 0.]
  [ 1.]]

 [[ 1.]
  [ 0.]]

 [[ 1.]
  [ 0.]]]
```


but instead it fails with an error:

```
Traceback (most recent call last):
  File "D:/Learning/master_project/clinicalText/SourceCode/Python/DNN_CWS/seg_dnn.py", line 369, in <module>
    cws = SegDNN(constant.VOCAB_SIZE, embed_size, constant.DNN_SKIP_WINDOW)
  File "D:/Learning/master_project/clinicalText/SourceCode/Python/DNN_CWS/seg_dnn.py", line 76, in __init__
    self.loss = tf.reduce_sum(tf.matmul(self.slim_map_matrix,tf.expand_dims(tf.transpose(self.word_score),2),a_is_sparse=True))
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1755, in matmul
    a = ops.convert_to_tensor(a, name="a")
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\framework\ops.py", line 639, in convert_to_tensor
    as_ref=False)
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\framework\ops.py", line 704, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\framework\constant_op.py", line 113, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\framework\constant_op.py", line 102, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 444, in make_tensor_proto
    tensor_proto.string_val.extend([compat.as_bytes(x) for x in proto_values])
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\framework\tensor_util.py", line 444, in <listcomp>
    tensor_proto.string_val.extend([compat.as_bytes(x) for x in proto_values])
  File "E:\IntelPython35\envs\tensorflow-intel\lib\site-packages\tensorflow\python\util\compat.py", line 65, in as_bytes
    (bytes_or_text,))
TypeError: Expected binary or unicode string, got <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x00000195FA86B1D0>
```


Changing `z` to

```python
z = tf.sparse_tensor_dense_matmul(x, y)
```

also fails, because the sparse argument must be 2-D, but `x` and `y` are 3-D.


### asimshankar commented Apr 14, 2017

Note that tf.matmul requires dense tensors for both arguments. The a_is_sparse and b_is_sparse arguments only select a possibly more efficient algorithm when a (respectively b) has a lot of zeros; the error message could perhaps be improved, though. If you're using the SparseTensor class, then tf.sparse_tensor_dense_matmul is indeed what you want. Support for tensors with rank > 2 there is not something anyone is actively working on, so contributions are welcome! (Long story short: this isn't a bug, but arguably a missing feature for which contributions are welcome.)
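
For reference, the rank-2 case that is supported today looks like this (a minimal sketch; shapes, values, and variable names are illustrative):

```python
import numpy as np
import tensorflow as tf

# A 2-D SparseTensor times a 2-D dense matrix: the case that
# tf.sparse_tensor_dense_matmul supports today.
x = tf.sparse_placeholder(tf.float32, shape=[2, 2])
y = tf.constant(np.ones([2, 1]), dtype=tf.float32)
z = tf.sparse_tensor_dense_matmul(x, y)

x_val = tf.SparseTensorValue(indices=[[0, 1], [1, 0]],
                             values=[1.0, 2.0],
                             dense_shape=[2, 2])

with tf.Session() as sess:
    print(sess.run(z, feed_dict={x: x_val}))  # [[1.], [2.]]
```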

### asimshankar commented Apr 14, 2017

FWIW, the error message was improved in commit 3e239dc, but alas that is not part of the 1.1 release, so it will land in 1.2.

### asimshankar changed the title from "3-D sparse tensor (placeholder) multiply with 3-D dense tensor bug" to "Feature: Sparse matrix multiplications for Tensors with rank > 2" on Apr 14, 2017


### kojino commented Jan 18, 2018

Would a function like sparse_tensor_dense_tensordot resolve this issue? I was thinking of implementing such a function in Python, similarly to tensordot. I took a look at the linked feature request, which seems to do this in C++. But just as tensordot is implemented purely in Python, I think this is possible too. Thoughts? If someone thinks it's a good idea, I'll go about implementing it.

### sdrelton commented Feb 27, 2018

@kojino I just found this page because I wanted to do a sparse-dense tensordot operation. If you're willing to implement it (even as a custom op) I would certainly appreciate it! I'm sure we can't be the only two who would be interested in this! Cheers

### kojino commented May 3, 2018

I implemented a function based on tf.tensordot. It's long, but I think it works:

```python
import numpy as np
import tensorflow as tf


def sparse_tensor_dense_tensordot(sp_a, b, axes, name=None):
    r"""Tensor contraction of sp_a and b along specified axes.

    Tensordot (also known as tensor contraction) sums the product of elements
    from `sp_a` and `b` over the indices specified by `a_axes` and `b_axes`.
    The lists `a_axes` and `b_axes` specify those pairs of axes along which to
    contract the tensors. The axis `a_axes[i]` of `sp_a` must have the same
    dimension as axis `b_axes[i]` of `b` for all `i` in
    `range(0, len(a_axes))`. The lists `a_axes` and `b_axes` must have
    identical length and consist of unique integers that specify valid axes
    for each of the tensors.

    This operation corresponds to `numpy.tensordot(a, b, axes)`.

    Example 1: When `sp_a` and `b` are matrices (order 2), the case `axes = 1`
    is equivalent to matrix multiplication.

    Example 2: When `sp_a` and `b` are matrices (order 2), the case
    `axes = [[1], [0]]` is equivalent to matrix multiplication.

    Example 3: Suppose that \\(a_{ijk}\\) and \\(b_{lmn}\\) represent two
    tensors of order 3. Then, `contract(a, b, [[0], [2]])` is the order 4
    tensor \\(c_{jklm}\\) whose entry corresponding to the indices
    \\((j,k,l,m)\\) is given by: \\( c_{jklm} = \sum_i a_{ijk} b_{lmi} \\).

    In general, `order(c) = order(a) + order(b) - 2 * len(axes[0])`.

    Args:
        sp_a: `SparseTensor` of type `float32` or `float64`.
        b: `Tensor` with the same type as `sp_a`.
        axes: Either a scalar `N`, or a list or an `int32` `Tensor` of shape
            [2, k]. If axes is a scalar, sum over the last N axes of `sp_a`
            and the first N axes of `b` in order. If axes is a list or
            `Tensor`, the first and second row contain the set of unique
            integers specifying axes along which the contraction is computed,
            for `sp_a` and `b`, respectively. The number of axes for `sp_a`
            and `b` must be equal.
        name: A name for the operation (optional).

    Returns:
        A `Tensor` with the same type as `sp_a`.

    Raises:
        ValueError: If the shapes of `sp_a`, `b`, and `axes` are incompatible.
        IndexError: If the values in axes exceed the rank of the
            corresponding tensor.
    """

    def _tensordot_reshape(a, axes, flipped=False):
        """Helper method to perform transpose and reshape for contraction op.

        This method is helpful in reducing tensordot to matmul using
        `tf.transpose` and `tf.reshape`. It takes a tensor and performs the
        correct transpose and reshape operations for a given set of indices.
        It returns the reshaped tensor as well as a list of indices necessary
        to reshape the tensor again after matrix multiplication.

        Args:
            a: `Tensor`.
            axes: List or `int32` `Tensor` of unique indices specifying valid
                axes of `a`.
            flipped: An optional `bool`. Defaults to `False`. If `True`, the
                method assumes that `a` is the second argument in the
                contraction operation.

        Returns:
            A tuple `(reshaped_a, free_dims, free_dims_static)` where
            `reshaped_a` is the tensor `a` reshaped to allow contraction via
            `matmul`, `free_dims` is either a list of integers or an `int32`
            `Tensor`, depending on whether the shape of `a` is fully
            specified, and `free_dims_static` is either a list of integers
            and None values, or None, representing the inferred static shape
            of the free dimensions.
        """
        if a.get_shape().is_fully_defined() and isinstance(axes, (list, tuple)):
            shape_a = a.get_shape().as_list()
            axes = [i if i >= 0 else i + len(shape_a) for i in axes]
            free = [i for i in range(len(shape_a)) if i not in axes]
            free_dims = [shape_a[i] for i in free]
            prod_free = int(np.prod([shape_a[i] for i in free]))
            prod_axes = int(np.prod([shape_a[i] for i in axes]))
            perm = list(axes) + free if flipped else free + list(axes)
            new_shape = [prod_axes, prod_free] if flipped else [prod_free, prod_axes]
            reshaped_a = tf.reshape(tf.transpose(a, perm), new_shape)
            return reshaped_a, free_dims, free_dims
        else:
            if a.get_shape().ndims is not None and isinstance(axes, (list, tuple)):
                shape_a = a.get_shape().as_list()
                axes = [i if i >= 0 else i + len(shape_a) for i in axes]
                free = [i for i in range(len(shape_a)) if i not in axes]
                free_dims_static = [shape_a[i] for i in free]
            else:
                free_dims_static = None
            shape_a = tf.shape(a)
            rank_a = tf.rank(a)
            axes = tf.convert_to_tensor(axes, dtype=tf.int32, name="axes")
            # Map negative axes to their positive equivalents.
            axes = tf.cast(axes >= 0, tf.int32) * axes + tf.cast(
                axes < 0, tf.int32) * (axes + rank_a)
            free, _ = tf.setdiff1d(tf.range(rank_a), axes)
            free_dims = tf.gather(shape_a, free)
            axes_dims = tf.gather(shape_a, axes)
            prod_free_dims = tf.reduce_prod(free_dims)
            prod_axes_dims = tf.reduce_prod(axes_dims)
            if flipped:
                perm = tf.concat([axes, free], 0)
                new_shape = tf.stack([prod_axes_dims, prod_free_dims])
            else:
                perm = tf.concat([free, axes], 0)
                new_shape = tf.stack([prod_free_dims, prod_axes_dims])
            reshaped_a = tf.reshape(tf.transpose(a, perm), new_shape)
            return reshaped_a, free_dims, free_dims_static

    def _tensordot_axes(a, axes):
        """Generates two sets of contraction axes for the two tensor arguments."""
        a_shape = a.get_shape()
        if isinstance(axes, tf.compat.integral_types):
            if axes < 0:
                raise ValueError("'axes' must be at least 0.")
            if a_shape.ndims is not None:
                if axes > a_shape.ndims:
                    raise ValueError("'axes' must not be larger than the number "
                                     "of dimensions of tensor %s." % a)
                return (list(range(a_shape.ndims - axes, a_shape.ndims)),
                        list(range(axes)))
            else:
                rank = tf.rank(a)
                return (tf.range(rank - axes, rank, dtype=tf.int32),
                        tf.range(axes, dtype=tf.int32))
        elif isinstance(axes, (list, tuple)):
            if len(axes) != 2:
                raise ValueError("'axes' must be an integer or have length 2.")
            a_axes = axes[0]
            b_axes = axes[1]
            if isinstance(a_axes, tf.compat.integral_types) and \
                    isinstance(b_axes, tf.compat.integral_types):
                a_axes = [a_axes]
                b_axes = [b_axes]
            if len(a_axes) != len(b_axes):
                raise ValueError(
                    "Different number of contraction axes 'a' and 'b', %s != %s."
                    % (len(a_axes), len(b_axes)))
            return a_axes, b_axes
        else:
            axes = tf.convert_to_tensor(axes, name="axes", dtype=tf.int32)
            return axes[0], axes[1]

    def _sparse_tensordot_reshape(a, axes, flipped=False):
        """Sparse counterpart of `_tensordot_reshape`: identical logic, but
        uses `tf.sparse_transpose` and `tf.sparse_reshape` on the
        `SparseTensor` `a`."""
        if a.get_shape().is_fully_defined() and isinstance(axes, (list, tuple)):
            shape_a = a.get_shape().as_list()
            axes = [i if i >= 0 else i + len(shape_a) for i in axes]
            free = [i for i in range(len(shape_a)) if i not in axes]
            free_dims = [shape_a[i] for i in free]
            prod_free = int(np.prod([shape_a[i] for i in free]))
            prod_axes = int(np.prod([shape_a[i] for i in axes]))
            perm = list(axes) + free if flipped else free + list(axes)
            new_shape = [prod_axes, prod_free] if flipped else [prod_free, prod_axes]
            reshaped_a = tf.sparse_reshape(tf.sparse_transpose(a, perm), new_shape)
            return reshaped_a, free_dims, free_dims
        else:
            if a.get_shape().ndims is not None and isinstance(axes, (list, tuple)):
                shape_a = a.get_shape().as_list()
                axes = [i if i >= 0 else i + len(shape_a) for i in axes]
                free = [i for i in range(len(shape_a)) if i not in axes]
                free_dims_static = [shape_a[i] for i in free]
            else:
                free_dims_static = None
            shape_a = tf.shape(a)
            rank_a = tf.rank(a)
            axes = tf.convert_to_tensor(axes, dtype=tf.int32, name="axes")
            axes = tf.cast(axes >= 0, tf.int32) * axes + tf.cast(
                axes < 0, tf.int32) * (axes + rank_a)
            free, _ = tf.setdiff1d(tf.range(rank_a), axes)
            free_dims = tf.gather(shape_a, free)
            axes_dims = tf.gather(shape_a, axes)
            prod_free_dims = tf.reduce_prod(free_dims)
            prod_axes_dims = tf.reduce_prod(axes_dims)
            if flipped:
                perm = tf.concat([axes, free], 0)
                new_shape = tf.stack([prod_axes_dims, prod_free_dims])
            else:
                perm = tf.concat([free, axes], 0)
                new_shape = tf.stack([prod_free_dims, prod_axes_dims])
            reshaped_a = tf.sparse_reshape(tf.sparse_transpose(a, perm), new_shape)
            return reshaped_a, free_dims, free_dims_static

    def _sparse_tensordot_axes(a, axes):
        """Generates two sets of contraction axes for the two tensor arguments."""
        # The axis bookkeeping is identical for the sparse and dense cases.
        return _tensordot_axes(a, axes)

    with tf.name_scope(name, "SparseTensorDenseTensordot",
                       [sp_a, b, axes]) as name:
        b = tf.convert_to_tensor(b, name="b")
        sp_a_axes, b_axes = _sparse_tensordot_axes(sp_a, axes)
        sp_a_reshape, sp_a_free_dims, sp_a_free_dims_static = \
            _sparse_tensordot_reshape(sp_a, sp_a_axes)
        b_reshape, b_free_dims, b_free_dims_static = _tensordot_reshape(
            b, b_axes, True)
        ab_matmul = tf.sparse_tensor_dense_matmul(sp_a_reshape, b_reshape)
        if isinstance(sp_a_free_dims, list) and isinstance(b_free_dims, list):
            return tf.reshape(ab_matmul, sp_a_free_dims + b_free_dims,
                              name=name)
        else:
            sp_a_free_dims = tf.convert_to_tensor(sp_a_free_dims,
                                                  dtype=tf.int32)
            b_free_dims = tf.convert_to_tensor(b_free_dims, dtype=tf.int32)
            product = tf.reshape(
                ab_matmul, tf.concat([sp_a_free_dims, b_free_dims], 0),
                name=name)
            if sp_a_free_dims_static is not None and \
                    b_free_dims_static is not None:
                product.set_shape(sp_a_free_dims_static + b_free_dims_static)
            return product
```
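
For example, contracting the last axis of a 3-D SparseTensor with the first axis of a dense matrix would look roughly like this (a usage sketch; shapes and values are illustrative):

```python
sp = tf.SparseTensor(indices=[[0, 1, 1], [1, 0, 0], [2, 0, 1]],
                     values=[1.0, 2.0, 3.0],
                     dense_shape=[3, 2, 2])
dense = tf.ones([2, 4], dtype=tf.float32)

# axes=1 contracts the last axis of `sp` with the first axis of `dense`,
# giving a dense result of shape [3, 2, 4].
out = sparse_tensor_dense_tensordot(sp, dense, axes=1)

with tf.Session() as sess:
    print(sess.run(out).shape)  # (3, 2, 4)
```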

### sdrelton commented May 25, 2018

I ended up hacking something together using loops and sparse_slice, but I'll take a look at using this instead (sparse_slice doesn't have a gradient).
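
Roughly along these lines (a simplified sketch of that approach, not my exact code; it assumes the batch size and the slice dimensions m, k are static Python ints, `sp_a` has shape [batch, m, k], and `b` is dense with shape [batch, k, n]):

```python
import tensorflow as tf

def batched_sparse_dense_matmul_loop(sp_a, b, batch_size, m, k):
    """Multiply each [m, k] slice of a [batch, m, k] SparseTensor with
    the matching [k, n] slice of a dense tensor, one loop step at a time."""
    results = []
    for i in range(batch_size):  # plain Python loop, unrolled into the graph
        sp_slice = tf.sparse_reshape(
            tf.sparse_slice(sp_a, [i, 0, 0], [1, m, k]), [m, k])
        # Note: tf.sparse_slice has no gradient, so this works for
        # inference but not for training through sp_a.
        results.append(tf.sparse_tensor_dense_matmul(sp_slice, b[i]))
    return tf.stack(results)  # shape [batch, m, n]
```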

### bkutt commented May 31, 2019

To matrix-multiply two 3-D tensors, one sparse and one dense, where the first dimension is the batch size, you can do this very succinctly using tf.map_fn():

```python
def sparse_dense_matmult_batch(sp_a, b):

    def map_function(x):
        i, dense_slice = x[0], x[1]
        sparse_slice = tf.sparse.reshape(tf.sparse.slice(
            sp_a, [i, 0, 0], [1, sp_a.dense_shape[1], sp_a.dense_shape[2]]),
            [sp_a.dense_shape[1], sp_a.dense_shape[2]])
        mult_slice = tf.sparse.matmul(sparse_slice, dense_slice)
        return mult_slice

    elems = (tf.range(0, sp_a.dense_shape[0], delta=1, dtype=tf.int64), b)
    return tf.map_fn(map_function, elems, dtype=tf.float32, back_prop=True)
```

This has the benefit that tf.map_fn() allows iterations to run in parallel, as opposed to a sequential for loop. I was using tf.while_loop to do this before, but that resulted in a SparseTensor being added to a collection as a result of the while_context. This was a problem, since SparseTensor doesn't have a "name" field and can't be added to any collection whose to_proto() implementation expects one (I needed to save and load the meta graph).
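
Usage might look like this (a sketch with illustrative shapes: a [3, 2, 2] sparse batch times a [3, 2, 1] dense batch):

```python
import tensorflow as tf

sp_a = tf.SparseTensor(indices=[[0, 1, 1], [1, 0, 0], [2, 0, 1]],
                       values=[1.0, 2.0, 3.0],
                       dense_shape=[3, 2, 2])
b = tf.ones([3, 2, 1], dtype=tf.float32)

out = sparse_dense_matmult_batch(sp_a, b)  # shape [3, 2, 1]

with tf.Session() as sess:
    print(sess.run(out))
```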

### sdrelton commented Jun 1, 2019

@bkutt Brilliant, thank you very much, I'll give this a go. Parallelizing the multiplies will certainly be a big help!

### bkutt commented Jun 1, 2019

It also has support for backprop.