from_tensor_slices not compatible with sparse data #44565
Comments
Was able to reproduce the issue with TF v2.3 and TF-nightly. Please find the gist of it here. Thanks!
@amahendrakar No worries! I can potentially look into creating a fix for this in the next couple of days and submit a PR. Though do you know of any workarounds to deal with such a dataset in the meantime?
@amirhmk Each component of the argument passed to `from_tensor_slices` is sliced along its first dimension, so a rank-3 `SparseTensor` produces a dataset of rank-2 sparse elements:

```python
sparse = tf.SparseTensor(indices=[[0, 0, 0], [0, 1, 2], [1, 0, 0], [1, 1, 1]],
                         values=[1, 2, 3, 4], dense_shape=[2, 3, 4])
dataset = tf.data.Dataset.from_tensor_slices(sparse)
```
Alternatively, `from_generator` with an `output_signature` can handle sparse elements of different shapes:

```python
sparse_1 = tf.SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
sparse_2 = tf.SparseTensor(indices=[[0, 0], [1, 1]], values=[3, 4], dense_shape=[2, 2])

def gen():
    yield sparse_1
    yield sparse_2

dataset = tf.data.Dataset.from_generator(
    gen, output_signature=tf.SparseTensorSpec(dtype=tf.int32))
```
@aaudiber Thank you for your comment, that makes sense. I like the first approach; I just have to keep track of the indices that represent each data point after merging them into a single `SparseTensor`. As for the second approach, I think `from_generator` could work as well.
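For the first approach, the index bookkeeping amounts to prepending an example index to each sparse index. A minimal sketch (the helper name `merge_sparse_examples` is illustrative, not from the thread), which reconstructs the same rank-3 tensor shown in the earlier comment:

```python
def merge_sparse_examples(examples):
    """examples: list of (indices, values, dense_shape) triples for 2-D sparse
    matrices. Returns (indices, values, dense_shape) for a single rank-3
    tensor where axis 0 is the example index."""
    merged_indices, merged_values = [], []
    max_rows = max(shape[0] for _, _, shape in examples)
    max_cols = max(shape[1] for _, _, shape in examples)
    for i, (indices, values, _) in enumerate(examples):
        for (r, c), v in zip(indices, values):
            merged_indices.append([i, r, c])  # prepend the example index
            merged_values.append(v)
    return merged_indices, merged_values, [len(examples), max_rows, max_cols]

idx, vals, shape = merge_sparse_examples([
    ([[0, 0], [1, 2]], [1, 2], [3, 4]),  # sparse_1 from the comment above
    ([[0, 0], [1, 1]], [3, 4], [2, 2]),  # sparse_2
])
# idx/vals/shape now describe the same rank-3 tensor as in the first example
```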
@amirhmk It will be available in TF 2.4.0, which should be released in the next week or two.
Sounds good. I still think this feature would be nice to have, as in, each data point has some sort of sparse component alongside the dense features.
I was trying to ingest sparse data into a `Dataset` via `from_generator` and was getting an error.
Thinking that the culprit may be the on-prem implementation of sparse data, I tried the same with plain TensorFlow sparse tensors and got the same error.
After some debugging I saw that the error comes from the evaluation of the arguments of the generator function, which makes sense, since the docs clearly state that the args should be `tf.Tensor`s: https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_generator
However, after the introduction of `output_signature` I expected this to be supported. Using tensorflow==2.4.1.
I am doing something very similar to @ktsitsi, but instead, inside the function that I pass to `from_generator()`, I am converting the coo matrix to a `SparseTensor` and, like @ktsitsi, specifying the `SparseTensorSpec` in the `output_signature`. I still get an error.
@sat2000pts This colab demonstrates how to convert from `coo_matrix` to `SparseTensor`. The crux is:

```python
coo = coo_matrix((data, (row, col)), shape=shape)
tf_sparse = tf.sparse.SparseTensor(list(zip(coo.row, coo.col)), coo.data, coo.shape)
```
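One caveat worth noting: the `tf.SparseTensor` constructor accepts unordered indices, but many `tf.sparse` ops require canonical row-major ordering, while `coo_matrix` does not guarantee any ordering; `tf.sparse.reorder` canonicalizes the result. A pure-NumPy sketch of that reordering, conceptually mirroring what `tf.sparse.reorder` does:

```python
import numpy as np

# COO-style arrays, deliberately not in row-major order
row = np.array([1, 0])
col = np.array([2, 0])
data = np.array([20, 10])

indices = list(zip(row, col))  # [(1, 2), (0, 0)] -- unordered
# Row-major sort: primary key = row, secondary key = col
# (np.lexsort treats its LAST key as the primary one).
order = np.lexsort((col, row))
sorted_indices = [indices[i] for i in order]
sorted_data = data[order]
```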
@aaudiber thanks for the reply. I don't think this plays a role, but I also yield a simple integer from the generator.
@sat2000pts Can you open a new issue, and include code to reproduce the error you're seeing?
System information
Describe the current behavior
It's stated in the documentation that `Dataset` is able to handle `SparseTensor`s on top of `Ragged` tensors and the other standard data types. The issue is that `from_tensor_slices` is not able to handle a list of `SparseTensor`s, while `from_tensors`, which accepts a single data point, is able to instantiate one. Currently `Dataset.from_generator` is not able to handle the `Sparse` datatype either (in the current release at least, see #41981), so I'm not sure how one is supposed to handle datasets that are composed of both sparse and dense data.
Describe the expected behavior
`from_tensor_slices` should accept a list of `SparseTensor`s, given that there are exactly as many items in the list as in the other features passed into `from_tensor_slices`.
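For reference, a sketch of what the requested behavior looks like once supported (TF 2.4+, per the maintainers' comments above), mixing one dense feature with one sparse feature:

```python
import tensorflow as tf

# Two data points: a dense feature and a sparse feature, sliced together.
dense = tf.constant([[1.0, 2.0], [3.0, 4.0]])
sparse = tf.sparse.SparseTensor(
    indices=[[0, 0], [1, 1]], values=[10, 20], dense_shape=[2, 3])

# Each dataset element pairs one dense row with one sparse row.
dataset = tf.data.Dataset.from_tensor_slices((dense, sparse))
for d, s in dataset:
    print(d.numpy().tolist(), tf.sparse.to_dense(s).numpy().tolist())
```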
Standalone code to reproduce the issue
Colab Link
Other info / logs
I'm not sure if it's just me, but when I read this part of the documentation I was under the impression that this feature would be supported. Thus you may identify this as a `feature_request` rather than a bug, if I've misunderstood the documentation.