By tagging the SafeTensorHandles created during the lifetime of my application, I have been able to determine that the SafeTensorHandle causing the crash is the handle contained in a SafeStringTensorHandle. Specifically, it is the handle 'safeTensorHandle' created in Tensor.StringTensor():
public class Tensor : DisposableObject, ITensorOrOperation, ITensorOrTensorArray, IPackable<Tensor>, ICanBeFlattened
{
    public SafeStringTensorHandle StringTensor(byte[][] buffer, Shape shape)
    {
        // The buffer of this handle will overflow.
        SafeTensorHandle safeTensorHandle = c_api.TF_AllocateTensor(TF_DataType.TF_STRING, shape.dims, shape.ndim, (ulong)(shape.size * 24));
        IntPtr intPtr = c_api.TF_TensorData(safeTensorHandle);
        for (int i = 0; i < buffer.Length; i++)
        {
            c_api.TF_StringInit(intPtr);
            c_api.TF_StringCopy(intPtr, buffer[i], buffer[i].Length);
            intPtr += 24;
        }
        return new SafeStringTensorHandle(safeTensorHandle, shape);
    }
}
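Reading the method above, the allocation is sized from shape.size while the write loop is bounded by buffer.Length, so one condition under which the loop could write past the allocation is buffer.Length being larger than shape.size. Whether that is what actually happens in my application is not established. The following is only a sketch of that arithmetic: the StringTensorSizeCheck class and its WouldOverflow method are hypothetical names of mine, not part of Tensorflow.NET, and the 24-byte element size is simply taken from the stride used in StringTensor().

```csharp
using System;

// Hypothetical helper, not part of Tensorflow.NET.
static class StringTensorSizeCheck
{
    // Assumption: each string element occupies 24 bytes,
    // matching the stride used in Tensor.StringTensor().
    const int ElementSize = 24;

    // Returns true when writing bufferLength elements would run past
    // an allocation sized for shapeSize elements.
    public static bool WouldOverflow(long shapeSize, int bufferLength)
    {
        long allocatedBytes = shapeSize * ElementSize;
        long writtenBytes = (long)bufferLength * ElementSize;
        return writtenBytes > allocatedBytes;
    }
}
```

For example, StringTensorSizeCheck.WouldOverflow(2, 3) is true, because three 24-byte elements (72 bytes) do not fit in an allocation of 48 bytes, while WouldOverflow(3, 3) is false.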
This SafeStringTensorHandle is created during model.predict() and contains the optimization options. It is created with the following call stack:
Tensorflow.Binding.dll!Tensorflow.Tensor.StringTensor(byte[][] buffer, Tensorflow.Shape shape) Line 2322 C#
Tensorflow.Binding.dll!Tensorflow.Tensor.StringTensor(string[] strings, Tensorflow.Shape shape) Line 2316 C#
Tensorflow.Binding.dll!Tensorflow.Tensor.InitTensor(System.Array array, Tensorflow.Shape shape) Line 570 C#
Tensorflow.Binding.dll!Tensorflow.Tensor.Tensor(System.Array array, Tensorflow.Shape shape) Line 449 C#
Tensorflow.Binding.dll!Tensorflow.Eager.EagerTensor.EagerTensor(System.Array array, Tensorflow.Shape shape) Line 135 C#
Tensorflow.Binding.dll!Tensorflow.constant_op.convert_to_eager_tensor(object value, Tensorflow.Contexts.Context ctx, Tensorflow.TF_DataType dtype) Line 148 C#
Tensorflow.Binding.dll!Tensorflow.constant_op.convert_to_eager_tensor(object value, Tensorflow.TF_DataType dtype, Tensorflow.Shape shape, string name, bool verify_shape, bool allow_broadcast) Line 163 C#
Tensorflow.Binding.dll!Tensorflow.constant_op.constant(object value, Tensorflow.TF_DataType dtype, Tensorflow.Shape shape, bool verify_shape, bool allow_broadcast, string name) Line 34 C#
Tensorflow.Binding.dll!Tensorflow.ops.convert_to_tensor(object value, Tensorflow.TF_DataType dtype, string name, bool as_ref, Tensorflow.TF_DataType preferred_dtype, Tensorflow.Contexts.Context ctx) Line 482 C#
Tensorflow.Binding.dll!Tensorflow.tensorflow.convert_to_tensor(object value, Tensorflow.TF_DataType dtype, string name, Tensorflow.TF_DataType preferred_dtype) Line 2910 C#
Tensorflow.Binding.dll!Tensorflow.OptimizeDataset.OptimizeDataset(Tensorflow.IDatasetV2 dataset, string[] optimizations_enabled, string[] optimizations_disabled, string[] optimizations_default, string[] optimization_configs) Line 31 C#
Tensorflow.Binding.dll!Tensorflow.DatasetV2.apply_options() Line 143 C#
Tensorflow.Binding.dll!Tensorflow.OwnedIterator._create_iterator(Tensorflow.IDatasetV2 dataset) Line 30 C#
Tensorflow.Binding.dll!Tensorflow.OwnedIterator.OwnedIterator(Tensorflow.IDatasetV2 dataset) Line 26 C#
Tensorflow.Keras.dll!Tensorflow.Keras.Engine.DataAdapters.DataHandler.enumerate_epochs() Line 118 C#
Tensorflow.Keras.dll!Tensorflow.Keras.Engine.Model.PredictInternal(Tensorflow.Keras.Engine.DataAdapters.DataHandler data_handler, int verbose) Line 808 C#
Tensorflow.Keras.dll!Tensorflow.Keras.Engine.Model.predict(Tensorflow.Tensors x, int batch_size, int verbose, int steps, int max_queue_size, int workers, bool use_multiprocessing) Line 793 C#
This bug is very serious because it precludes the deployment of my application to my customers.
Reproduction Steps
I have not been able to create a minimal application to reproduce the bug, primarily because it occurs randomly when the GC decides to delete the handles.
Known Workarounds
No workaround found.
Configuration and Other Information
Tensorflow.NET 0.110.4
Tensorflow.Keras 0.11.4
Windows 11
I've been able to create a small C# app which reproduces the problem. You will have to run it in Debug from Visual Studio with "Enable native code debugging" checked under Properties/Debug of the project.
The app closes automatically after 50 predictions, and the crash occurs at that point, in SafeEagerTensorHandle.ReleaseHandle(), with the message "Unhandled exception at 0x00007FF99875F61E (ucrtbase.dll) in TensorflowBufferOverflow.exe: Fatal program exit requested."
You will probably have to run the app several times before this exception is raised.
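Since the crash shows up when handles are released, it may help to force a collection and run pending finalizers after every prediction, so that handle release does not depend on when the GC happens to run. This is only a sketch and I cannot promise it makes the crash deterministic. GcStress is a hypothetical helper name; GC.Collect() and GC.WaitForPendingFinalizers() are the standard .NET calls.

```csharp
using System;

// Hypothetical helper for stressing finalizer-driven handle release.
static class GcStress
{
    // Runs the action the given number of times, forcing a full collection
    // and draining the finalizer queue after each iteration.
    public static void Run(Action action, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            action();
            GC.Collect();                  // collect objects that became unreachable
            GC.WaitForPendingFinalizers(); // let their finalizers run to completion
            GC.Collect();                  // reclaim objects freed by those finalizers
        }
    }
}
```

In the reproduction app this could be called as GcStress.Run(() => model.predict(x), 50), where model and x stand for whatever model and input the app already uses.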
Please note that in my real application this exception is raised during the lifetime of the application, not only when closing, so it is a lot more critical there. Furthermore, in my app the exception is raised in SafeTensorHandle.ReleaseHandle(), called from SafeStringTensorHandle.ReleaseHandle(), so it is not exactly the same error as here. I hope the two are similar enough that a fix can be applied to both classes.
Description
My application uses a model and regularly calls predict(). After a dozen calls to predict() (the number of calls is variable), my application crashes.
After a very long investigation, I have been able to determine that some data was written past the end of a buffer (i.e. a buffer overflow).
Windows debugger displays the following message:
and another debug message states that data have been written after the end of a memory buffer.
The call stack after the exception shows that this memory buffer is managed by a SafeTensorHandle: