SIGABRT when creating tensor with strings #370

@pashashiz

System information

  • OS Platform and Distribution: macOS Catalina 10.15.6
  • TensorFlow installed from: maven library org.tensorflow:tensorflow-core-platform:0.3.2
  • Java version: 1.8.0_272
  • Java command line flags: none

Creating a simple string tensor crashes the JVM. The crash is intermittent, so you might need to run the call in a loop or do some other work until it happens:

TString.vectorOf("AB", "C")

The error dump is attached: hs_err_pid72721.log

The issue appears to be in the following code:

public abstract class AbstractTF_Tensor extends Pointer {
    public static TF_Tensor allocateTensor(int dtype, long[] dims, long length) {
        TF_Tensor t = TF_AllocateTensor(dtype, dims, dims.length, length);
        if (t != null) {
            if (TF_TensorType(t) == TF_STRING) {
                long n = TF_TensorElementCount(t);
                TF_TString data = new TF_TString(TF_TensorData(t)); // Ooops!!!!!
                for (int i = 0; i < n; i++) {
                    TF_TString_Init(data.position(i));
                }
            }
            t.deallocator(new DeleteDeallocator(t));
        }
        return t;
    }
}

Here the tensor's data is simply cast to TF_TString. That looks wrong: the tensor data is supposed to be an array of TF_TString (just an array of pointers), but right after allocation it actually contains random bytes. If we cast those random bytes, treat them as an array of pointers, and start clearing them with TF_TString_Init, which does a memset to 0 internally, we corrupt memory.

I suppose we need to initialize the string array first and then create a tensor that uses that array as its data. For example (Scala code, but it should be straightforward to understand):

val n = 5
val shape = Shape(n)
val strings = new TF_TString(n)
(0L until n).foreach { i =>
  val data = s"x-$i"
  val bytes = data.getBytes("UTF-8")
  TF_TString_Init(strings.getPointer(i))
  TF_TString_Copy(strings.getPointer(i), new BytePointer(bytes: _*), bytes.length)
}
val pointerSize = 8 // Loader.sizeof(classOf[TF_TString]) gives 24 for some reason...
val tensor = TF_NewTensor(
  TF_STRING,
  shape.toLongArray,
  shape.rank,
  strings,
  shape.power * pointerSize,
  noopDeallocator,
  null)
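
The byte length passed to TF_NewTensor above is the element count multiplied by a per-element size. Here is a minimal, library-free Java sketch of just that arithmetic. The class and method names are hypothetical, and the two sizes are only the values mentioned above (the 8 I assumed for a pointer and the 24 reported by Loader.sizeof); I have not verified either against the C headers. It shows how far apart the resulting buffer lengths are for the same shape, which would matter if the native side walks the buffer in steps of the larger size.

```java
final class StringTensorSizing {
    // Number of elements in a tensor with the given dimensions (product of all dims).
    // An empty dims array (a scalar) yields 1.
    static long elementCount(long[] dims) {
        long n = 1;
        for (long d : dims) {
            n = Math.multiplyExact(n, d); // throws ArithmeticException on overflow
        }
        return n;
    }

    // Byte length of a buffer holding one fixed-size slot per element.
    // bytesPerElement is an assumption supplied by the caller, not a library constant.
    static long byteLength(long[] dims, int bytesPerElement) {
        return Math.multiplyExact(elementCount(dims), (long) bytesPerElement);
    }

    public static void main(String[] args) {
        long[] dims = {5}; // same shape as the Scala example above
        System.out.println("8 bytes per element:  " + byteLength(dims, 8));
        System.out.println("24 bytes per element: " + byteLength(dims, 24));
    }
}
```

For the 5-element vector above this gives 40 bytes versus 120 bytes, so whichever size is correct, the allocation and the initialization loop have to agree on it.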
