Interaction between Tensorflow Java and JavaCPP Pointer deallocation #208

skirdey · 2021-02-05T00:35:18Z

I am trying to understand why when I use DJL + Tensorflow engine, there is a good amount of time spent in GC, while both DJL and Tensorflow Java seem to use pointerscope and do not relay on GC for object cleanup.

For example, I see JavaCPP Pointer.deallocator gets invoked https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/Pointer.java#L666-L667

which is a heavy synchronized call.

Any help is appreciated.

saudet · 2021-02-05T00:43:06Z

If you're confident you don't need GC, we can set the "org.bytedeco.javacpp.nopointergc" system property to "true" to reduce overhead.

skirdey · 2021-02-05T00:48:29Z

I am not sure wether I need GC or not. Looking at the EagerSession class seems like it is managing memory through the pointer scope. The NDArrays I allocate through DJL/TF also use the same.

skirdey · 2021-02-05T00:49:05Z

I will try the system property you have mentioned and run my benchmarks, thanks for such a quick response!

saudet · 2021-02-05T00:50:27Z

You'll need to keep in mind though that neither TF nor DJL are particularly designed to reduce GC. JavaCPP is only a small part of the overall design, and there is a lot garbage that gets generated elsewhere.

skirdey · 2021-02-17T17:09:43Z

Still investigating how GC optimizations work in the DJL + Tensorflow Environment.
There seems to be also a blocking call when using several eager sessions in parallel for multi-threading, and when JavaCPP needs to run GC, it blocks all of the threads with sessions.

saudet · 2021-02-18T00:42:39Z

Like I said, make sure to set the "org.bytedeco.javacpp.nopointergc" system property to "true" to prevent JavaCPP from calling System.gc(). It should also avoid blocking calls, but let me know if you still see anything of concern there.

skirdey · 2021-02-19T22:20:18Z

I've tried the org.bytedeco.javacpp.nopointergc and very quickly ran out of memory :-)

skirdey · 2021-02-19T22:20:49Z

If I keep it "false", then in JVM with G1GC, I spend 25% of the time in GC on a high throughput inference use-case, with occasional blocking.

skirdey · 2021-02-19T22:26:12Z

stacktrace related to allocation

Exception in thread "ForkJoinPool-1-worker-7" java.lang.OutOfMemoryError: Cannot allocate new PointerPointer(1000): totalBytes = 3640M, physicalBytes = 1226M
at org.bytedeco.javacpp.PointerPointer.(PointerPointer.java:149)
at org.tensorflow.EagerOperationBuilder.execute(EagerOperationBuilder.java:270)
at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:66)
at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:55)
at ai.djl.tensorflow.engine.TfNDArray.(TfNDArray.java:81)
at ai.djl.tensorflow.engine.TfNDManager.create(TfNDManager.java:154)

skirdey · 2021-02-19T22:58:20Z

So I've tried it, the blocking calls are removed, the thread with JavaCPP Deallocator is also disappears and no-more blocking GCs. Only issue is I quickly run out of memory :-)

skirdey · 2021-02-19T22:58:47Z

There is a team discussion on DJL around Tensorflow Java and JavaCPP performance, if you are interested - deepjavalibrary/djl#625

skirdey · 2021-02-19T23:46:18Z

The GC behaviour I see when

org.bytedeco.javacpp.nopointergc=false
Xmx2G
Xms2G

maxPhysicalBytes=1G

when running multi-threaded inference locally on my laptop. It uses Tensorflow Eager Session.

saudet · 2021-02-20T00:33:07Z

So I've tried it, the blocking calls are removed, the thread with JavaCPP Deallocator is also disappears and no-more blocking GCs. Only issue is I quickly run out of memory :-)

Obviously, when not using the GC, you'll need to make sure to deallocate native memory some other way!

skirdey · 2021-02-20T01:11:15Z

True that, I've just tested the theory that PointerScope takes care of my specific use case for inference. But you are right, there is quite a lot of garbage besides that.

…

On Fri, Feb 19, 2021 at 4:33 PM Samuel Audet ***@***.***> wrote: So I've tried it, the blocking calls are removed, the thread with JavaCPP Deallocator is also disappears and no-more blocking GCs. Only issue is I quickly run out of memory :-) Obviously, when not using the GC, you'll need to make sure to deallocate native memory some other way! — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#208 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABHIJQDEMC63YUQQAQFNHWTS737NBANCNFSM4XD2EC2Q> .

karllessard · 2021-02-21T17:57:50Z

Eager execution does allocate a lot of native resources, as each operation and each of their outputs will need to be freed.

I don't know much about details the GC feature of JavaCPP we are using now but I can tell that in 1.14, the way it was working is that the GC listener was running in a separate thread and was trying to free resources referenced by deleted objects, but just as a "best effort" since GC is not entirely reliable when it comes to native memory. @saudet, this thread-based implementation was just listening to a phantom reference queue and was therefore non-blocking, is JavaCPP doing something similar?

Ultimately, resources should be cleaned up by closing the eager session enclosing some piece of code and that part is still true:

try (EagerSession s = EagerSession.create()) {
    Ops tf = Ops.create(s);
    // your eager computations
}

This ensure to release all resources independently from the GC. So while it is not enforced by the API, it is recommended to scope your eager operations into multiple eager session instead of only relying on the default one (i.e. the one used when simply invoking Ops.create() without any parameter).

saudet · 2021-02-22T10:44:48Z

I don't know much about details the GC feature of JavaCPP we are using now but I can tell that in 1.14, the way it was working is that the GC listener was running in a separate thread and was trying to free resources referenced by deleted objects, but just as a "best effort" since GC is not entirely reliable when it comes to native memory. @saudet, this thread-based implementation was just listening to a phantom reference queue and was therefore non-blocking, is JavaCPP doing something similar?

It's doing that, but it also blocks by default when maxBytes or maxPhysicalBytes is reached, unless that is set to 0 as done in DJL here: https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-engine/src/main/java/ai/djl/tensorflow/engine/LibUtils.java#L52

Java users are typically used to having per-process memory thresholds like that, but it looks like users of DJL consider this a "bug", and prefer to have it behave more like C/C++/Python, where we can always allocate all memory available on the system with no restrictions.

skirdey · 2021-02-22T16:26:50Z

If I leave default settings for JavaCPP, use single interop and intraop thread in Tensorflow and use G1GC for GC - I get a weird behavior. Everything works fine until certain load threshold, after which heap gets to about 100% and full GC cycles can not clean anything in it. Seems like something is holding pointers forever. Need to test custom maxPhysicalBytes params.

…

On Monday, 22 February 2021, Samuel Audet ***@***.***> wrote: I don't know much about details the GC feature of JavaCPP we are using now but I can tell that in 1.14, the way it was working is that the GC listener was running in a separate thread and was trying to free resources referenced by deleted objects, but just as a "best effort" since GC is not entirely reliable when it comes to native memory. @saudet <https://github.com/saudet>, this thread-based implementation was just listening to a phantom reference queue and was therefore non-blocking, is JavaCPP doing something similar? It's doing that, but it also blocks by default when maxBytes or maxPhysicalBytes is reached, unless that is set to 0 as done in DJL here: https://github.com/awslabs/djl/blob/master/tensorflow/ tensorflow-engine/src/main/java/ai/djl/tensorflow/engine/LibUtils.java#L52 Java users are typically used to having per-process memory thresholds like that, but it looks like users of DJL consider this a "bug", and prefer to have it behave more like C/C++/Python, where we can always allocate all memory available on the system with no restrictions. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#208 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABHIJQGOEXLME6OWTJN3I53TAIYS7ANCNFSM4XD2EC2Q> .

saudet · 2021-02-22T23:02:54Z

If I leave default settings for JavaCPP, use single interop and intraop thread in Tensorflow and use G1GC for GC - I get a weird behavior. Everything works fine until certain load threshold, after which heap gets to about 100% and full GC cycles can not clean anything in it. Seems like something is holding pointers forever. Need to test custom maxPhysicalBytes params.

That sounds like an issue somewhere in TF Java, not JavaCPP. If you have a simple way to reproduce this, please let us know!

skirdey · 2021-02-22T23:42:57Z

I do, let me write it up.

…

On Mon, Feb 22, 2021 at 3:03 PM Samuel Audet ***@***.***> wrote: If I leave default settings for JavaCPP, use single interop and intraop thread in Tensorflow and use G1GC for GC - I get a weird behavior. Everything works fine until certain load threshold, after which heap gets to about 100% and full GC cycles can not clean anything in it. Seems like something is holding pointers forever. Need to test custom maxPhysicalBytes params. That sounds like an issue somewhere in TF Java, not JavaCPP. If you have a simple way to reproduce this, please let us know! — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#208 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABHIJQGU7A35W3KARSUXKGDTALPCZANCNFSM4XD2EC2Q> .

karllessard · 2021-02-23T14:56:48Z

@saudet , I just to verify with you one more thing concerning the GC listening thread in JavaCPP.

Before in TF Java (1.x), each resource that has native memory allocated was being referred by a PhantomReference retrieved from the GC queue once discarded by it: https://github.com/tensorflow/tensorflow/blob/3f113a31d250a5c772d95ffa8e8562cf174d222a/tensorflow/java/src/main/java/org/tensorflow/EagerSession.java#L387

So basically, that phantom reference contains all the information required to release any native resources associated to a particular object that was garbage collected.

In JavaCPP, how do you keep track of which object refers to which native resources? Does it exclusively rely on PointerScope, hence only on the lifetime of the EagerSession in the case?

saudet · 2021-02-24T08:35:44Z

No, there are PhantomReference getting queued, unless "org.bytedeco.javacpp.nopointergc" is set to "true", that is:
https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/Pointer.java#L459-L467
PhantomReference is referenced here, alongside the machinery for reference counting:
https://github.com/bytedeco/javacpp/blob/master/src/main/java/org/bytedeco/javacpp/Pointer.java#L274

karllessard · 2021-02-24T13:43:10Z

Ok so the PhantomReference is on the Pointer object itself, so that when this one is garbage collected, it's backing native resource is as well? If that's the case, sounds good.

Then another question, if the Pointer member of a given object have also been attached to an outer PointerScope, like it is done here for eager operations, will the garbage collection of the owning object prevail on the scope to release the resources? i.e. is the scope only keeping a weak reference to the Pointer or it will hold a strong reference to it, preventing the garbage collector to collect it when the EagerOperation is collected?

saudet · 2021-02-25T03:38:36Z

Ok so the PhantomReference is on the Pointer object itself, so that when this one is garbage collected, it's backing native resource is as well? If that's the case, sounds good.

Yes, that's idea. Everything is centered around Pointer to make it easier to reason about, and I've found that we can also use it to map anything of interest from the C/C++ world, well anything that has to do with memory living on the native heap anyway.

Then another question, if the Pointer member of a given object have also been attached to an outer PointerScope, like it is done here for eager operations, will the garbage collection of the owning object prevail on the scope to release the resources? i.e. is the scope only keeping a weak reference to the Pointer or it will hold a strong reference to it, preventing the garbage collector to collect it when the EagerOperation is collected?

It stores a strong reference, otherwise this kind of pattern may result in a crash:

try (PointerScope scope = new PointerScope()) {
    IntPointer something = new IntPointer(10);
    someNativeObject.keepReferenceTo(something);
    // The Java object "something" is no longer referenced from here, but someNativeObject expects the native memory to be
    // still allocated, so this call may crash, unless an object like PointerScope keeps a strong reference, which it does.
    someNativeObject.doSomething()
}

In the event that a scope is not closed, but becomes itself unreachable though, everything it contains becomes eligible for GC.

karllessard · 2021-02-25T12:33:48Z

It stores a strong reference, otherwise this kind of pattern may result in a crash:

Ok, this new paradigm introduced once we have switched to JavaCPP is probably the cause of the issue here. Before, the EagerSession scope was acting weakly and would not prevent the native memory of its garbage-collected resources to be released.

Some sessions (like the default one, which I think @skirdey uses) are never or rarely closed, therefore entirely rely on the GC or on explicit closing of the resources themselves (the tensors, for example). Since a strong reference is kept on the native pointer, the memory will never be released. That certainly happens with the EagerOperation resources.

It could be great if a user can decide to attach weakly or strongly a Pointer to a PointerScope. If that's not feasible, then we might want to prevent attaching these resources to the eager session if we know that session is the default one. But there is probably a cleaner way to fix this as well.

@skirdey , do you build your own version of TF Java when you are doing your tests or you use the prebuilt one coming with DJL? It would be interesting to see the behaviour of your code by adding this patch.

skirdey · 2021-02-25T16:32:51Z

@karllessard I would love to try a patch. I am using stock version of tensor-flow that comes with DJL. I am still not sure what would be the patch, so if you have one send it my way :-)

skirdey · 2021-02-25T16:42:53Z

btw, when using nopointergc=true the stock DJL benchmark goes OOM as well - deepjavalibrary/djl#690 so if you want to try it locally without my specific use case, you can also do that

…

On Thu, Feb 25, 2021 at 4:34 AM Karl Lessard ***@***.***> wrote: It stores a strong reference, otherwise this kind of pattern may result in a crash: Ok, this new paradigm introduced once we have switched to JavaCPP is probably the cause of the issue here. Before, the EagerSession scope was acting weakly and would not prevent the native memory of its garbage-collected resources to be released. Some sessions (like the default one, which I think @skirdey <https://github.com/skirdey> uses) are never or rarely closed, therefore entirely rely on the GC or on explicit closing of the resources themselves (the tensors, for example). Since a strong reference is kept on the native pointer, the memory will never be released. That certainly happens with the EagerOperation resources <https://github.com/tensorflow/java/blob/439373d4a76f4ba3665f55b929bb74876d4d96b7/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/EagerOperation.java#L56> . It could be great if a user can decide to attach weakly or strongly a Pointer to a PointerScope. If that's not feasible, then we might want to prevent attaching these resources to the eager session if we know that session is the default one. But there is probably a cleaner way to fix this as well. @skirdey <https://github.com/skirdey> , do you build your own version of TF Java when you are doing your tests or you use the prebuilt one coming with DJL? It would be interesting to see the behaviour of your code by adding this patch. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#208 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABHIJQG5LWHG4U26CGR3J4DTAY7TXANCNFSM4XD2EC2Q> .

stu1130 · 2021-02-25T22:11:01Z

@karllessard @saudet @skirdey Here is my thought.
There are two strategies of releasing native memory.

Java style: spawn another dedicated thread to clean up native resource object in the ReferenceQueue.
pros: I guess the performance would be slighter better than approach 2 as delete native memory takes time but needs experiments
cons: If the client API invoke the function in asynchronous way (which is usually the case in Java), the dedicated thread is not fast enough to clean up the native resource and cause OOM. Additionally, if engines like pytorch and apache mxnet don't support thread-safe delete, it will cause the JVM crash. As there might be a small chance that both PointerScope & dedicated thread release the native resource at the same cause and ran into double delete crash. Both PyTorch & Apache MXNet are of this type. So we have no choice but to go with approach 2. Do you know if tensorflow is thread-safe in terms of releasing the native resource like tensor?
C++ style: release the memory as soon as we mark the memory is not used.
pros: The memory consumption is small comparing to approach 1.
cons: The performance could be slightly slower than approach 1 but again it requires experiments.

Either way is trying to get rid of GC (System.gc()) which slows down the entire system. The right pattern of coding Deep Learning system is to make sure all native resources are tracked by a scope and release the memory without a leak and help of GC.

@saudet Does JavaCpp support approach 2 if users are working on memory-intensive application?

skirdey · 2021-02-26T13:06:08Z

It stores a strong reference, otherwise this kind of pattern may result in a crash:

Ok, this new paradigm introduced once we have switched to JavaCPP is probably the cause of the issue here. Before, the EagerSession scope was acting weakly and would not prevent the native memory of its garbage-collected resources to be released.

Some sessions (like the default one, which I think @skirdey uses) are never or rarely closed, therefore entirely rely on the GC or on explicit closing of the resources themselves (the tensors, for example). Since a strong reference is kept on the native pointer, the memory will never be released. That certainly happens with the EagerOperation resources.

It could be great if a user can decide to attach weakly or strongly a Pointer to a PointerScope. If that's not feasible, then we might want to prevent attaching these resources to the eager session if we know that session is the default one. But there is probably a cleaner way to fix this as well.

@skirdey , do you build your own version of TF Java when you are doing your tests or you use the prebuilt one coming with DJL? It would be interesting to see the behaviour of your code by adding this patch.

Looking at DJL code and TfNDManager it does create a single session in with async option https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-engine/src/main/java/ai/djl/tensorflow/engine/TfNDManager.java#L69

I can not see if it is a "default" session or not.

saudet · 2021-02-26T16:56:34Z

It stores a strong reference, otherwise this kind of pattern may result in a crash:

Ok, this new paradigm introduced once we have switched to JavaCPP is probably the cause of the issue here. Before, the EagerSession scope was acting weakly and would not prevent the native memory of its garbage-collected resources to be released.

Some sessions (like the default one, which I think @skirdey uses) are never or rarely closed, therefore entirely rely on the GC or on explicit closing of the resources themselves (the tensors, for example). Since a strong reference is kept on the native pointer, the memory will never be released. That certainly happens with the EagerOperation resources.

I see, that's what @rnett was referring in pull #188 (comment).

It could be great if a user can decide to attach weakly or strongly a Pointer to a PointerScope. If that's not feasible, then we might want to prevent attaching these resources to the eager session if we know that session is the default one. But there is probably a cleaner way to fix this as well.

It's never a good idea to rely on the GC. We cannot avoid situations like I mention above where the native side keeps a reference to an existing Tensor, but where the Java side doesn't. The only sane way to deal with this is by not relying on the GC at all. We can still have it as a sort of option for users that don't want to use something like TensorScope though, but in that case, since we cannot offer any guarantees anyway, it makes no difference whether eager session holds on to weak references, or no reference at all!

@skirdey , do you build your own version of TF Java when you are doing your tests or you use the prebuilt one coming with DJL? It would be interesting to see the behaviour of your code by adding this patch.

Which patch?

stu1130 · 2021-02-26T17:04:52Z

It stores a strong reference, otherwise this kind of pattern may result in a crash:

Ok, this new paradigm introduced once we have switched to JavaCPP is probably the cause of the issue here. Before, the EagerSession scope was acting weakly and would not prevent the native memory of its garbage-collected resources to be released.
Some sessions (like the default one, which I think @skirdey uses) are never or rarely closed, therefore entirely rely on the GC or on explicit closing of the resources themselves (the tensors, for example). Since a strong reference is kept on the native pointer, the memory will never be released. That certainly happens with the EagerOperation resources.
It could be great if a user can decide to attach weakly or strongly a Pointer to a PointerScope. If that's not feasible, then we might want to prevent attaching these resources to the eager session if we know that session is the default one. But there is probably a cleaner way to fix this as well.
@skirdey , do you build your own version of TF Java when you are doing your tests or you use the prebuilt one coming with DJL? It would be interesting to see the behaviour of your code by adding this patch.

Looking at DJL code and TfNDManager it does create a single session in with async option https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-engine/src/main/java/ai/djl/tensorflow/engine/TfNDManager.java#L69

I can not see if it is a "default" session or not.

I checked the source code. It should not be a default EagerSession. The defaultEagerSession is created only when you call initDefault(options) or getDefault()

saudet · 2021-02-26T17:16:43Z

@karllessard @saudet @skirdey Here is my thought.
There are two strategies of releasing native memory.

Java style: spawn another dedicated thread to clean up native resource object in the ReferenceQueue.
pros: I guess the performance would be slighter better than approach 2 as delete native memory takes time but needs experiments
cons: If the client API invoke the function in asynchronous way (which is usually the case in Java), the dedicated thread is not fast enough to clean up the native resource and cause OOM. Additionally, if engines like pytorch and apache mxnet don't support thread-safe delete, it will cause the JVM crash. As there might be a small chance that both PointerScope & dedicated thread release the native resource at the same cause and ran into double delete crash. Both PyTorch & Apache MXNet are of this type. So we have no choice but to go with approach 2. Do you know if tensorflow is thread-safe in terms of releasing the native resource like tensor?

C++ style: release the memory as soon as we mark the memory is not used.
pros: The memory consumption is small comparing to approach 1.
cons: The performance could be slightly slower than approach 1 but again it requires experiments.

Either way is trying to get rid of GC (System.gc()) which slows down the entire system. The right pattern of coding Deep Learning system is to make sure all native resources are tracked by a scope and release the memory without a leak and help of GC.

@saudet Does JavaCpp support approach 2 if users are working on memory-intensive application?

JavaCPP supports both the "Java style", that is using GC via PhantomReference and a ReferenceQueue, and "C++ style" via PointerScope. Currently, JavaCPP only uses 1 "deallocator thread", but it is possible to extend that to support multiple threads and increase throughput. However, there are too many limitations to that approach using GC that I don't think it's worth spending too much time on that. Like you mention, it's problematic when the native allocator isn't thread-safe, but even when it is thread-safe, allocating more memory than required leads to memory fragmentation and poor performance, typically poorer than managing it "C++ style" in my experience. So it is almost always a better idea to manage native memory like one would do it in C++ or Python. That's exactly the paradigm that PointerScope attempts to bring to Java, that we can use across libraries, like I showed here for the C++ APIs of TensorFlow and OpenCV: http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/

saudet · 2021-02-26T18:50:24Z

@karllessard I'm trying to find where in the old code it was using weak references, and I can't find. I don't remember removing anything like that myself either. From what I can tell, EagerSession has always been keeping strong references with this map: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/java/src/main/java/org/tensorflow/EagerSession.java#L499

Craigacp · 2021-02-26T21:33:14Z

NativeReference is a subclass of java.lang.ref.PhantomReference.

karllessard · 2021-02-26T22:12:05Z

I checked the source code. It should not be a default EagerSession. The defaultEagerSession is created only when you call initDefault(options) or getDefault()

@stu1130 , even if it is not the default session, if the created session remains open for a long time while creating a lot of ops and tensors, the issue can happen

@karllessard I'm trying to find where in the old code it was using weak references, and I can't find. I don't remember removing anything like that myself either. From what I can tell, EagerSession has always been keeping strong references with this map

@saudet: like @Craigacp pointed out, NativeReference extends PhantomReference and are extended by each type of native resource, like here. So basically how it works is that if a NativeReference was queued by the GC, its delete() method was invoked to release the native resources and it was removed from session the session scope. If it was never queued but the session was closed, all remaining NativeReference in the session scope were deleted explicitly. So what I was saying is that the eager session was holding references acting "weakly" and not necessarily stacking WeakReference objects.

Which patch?

There is none yet :) Ideally, like we discussed, it would be possible to refer weakly to a Pointer in a PointerScope from JavaCPP directly. Otherwise we have to find another way to restore the previous/correct behavior. I was curious to know if @skirdey was already setup to try a patch if we have one but I guess I can easily reproduce the problem on my side and validate this hypothesis.

saudet · 2021-02-27T01:29:19Z

NativeReference is a subclass of java.lang.ref.PhantomReference.

Ah, that's where I screwed up. I remember getting confused about what NativeReference was exactly and associating it with Pointer instead of more like PhantomReference<Pointer>. Well, let's see, that's a pattern currently supported by neither the Cleaner in JDK 9+ nor the ResourceScope as proposed by Panama, so it's probably not something I want to encourage users to start doing as part of JavaCPP either... See the point I make in pull #188 (comment).

If we still want to have something that behaves like the original implementation, we should probably just replace the PointerScope field in EagerSession with some sort of collection of WeakReference<Pointer>, as currently proposed by @rnett for TensorScope as per pull #188 (review).

karllessard · 2021-02-27T04:06:35Z

Ok, I understand that you want JavaCPP to follow closely the (future) behaviour of the JDK.

I'm personally totally comfortable with what @rnett proposed about weak tensor scopes, which is quite identical to the original behaviour of the eager sessions. So TF could have its own "weak" pointer scope. @rnett , were you planning to change your proposed TensorScope to make it more generic for any kind of native resources, like the eager ops?

If that's reshuffling too much of your work, we can simply keep our own weak references directly in the EagerSession like before, and since JavaCPP will still take care of the cleaner portion, it should really just be a small change to do.

rnett · 2021-02-27T21:22:52Z

I wasn't planning on doing it as part of TensorScope, but rather as part of Ops or Scope since we're passing it around anyways. But that's all up for discussion, I don't have anything particularly firm yet.

karllessard · 2021-02-27T22:29:02Z

Ok let's play it simple for now and I'll try to see how it works by simply replacing the PointerScope in EagerSession by a list of weak references, I'll let you all know of the results.

karllessard · 2021-03-01T03:57:44Z

Ok, so the issue was very easy to reproduce. Starting a JVM with only 256M of memory, this simple loop was hitting a OOM between 30K and 40K iterations:

  public static void main(String[] args) {
    try (EagerSession s = EagerSession.create()) {
      Ops tf = Ops.create(s);
      while (true) {
        tf.math.add(tf.constant(2), tf.constant(2));
      }
    }
  }

I pushed this PR #229 to keep only weak references on eager resources in the session, as proposed earlier, and now the garbage collection allows this loop to run forever. I'm pretty confident this should fix the issue observed earlier by @skirdey when using DJL.

skirdey · 2021-03-01T03:59:25Z

Awesome, I can give a try!

…

On Sunday, 28 February 2021, Karl Lessard ***@***.***> wrote: Ok, so the issue was very easy to reproduce. Starting a JVM with only 256M of memory, this simple loop was hitting a OOM between 30K and 40K iterations: public static void main(String[] args) { try (EagerSession s = EagerSession.create()) { Ops tf = Ops.create(s); while (true) { tf.math.add(tf.constant(2), tf.constant(2)); } } } I pushed this PR to keep only weak references on eager resources in the session, as proposed earlier, and now the garbage collection allows this loop to run forever. I'm pretty confident this should fix the issue observed earlier by @skirdey <https://github.com/skirdey> when using DJL. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#208 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABHIJQGJ5QAGOTI5ZNW3WW3TBMGENANCNFSM4XD2EC2Q> .

…nopointergc" (issue tensorflow/java#208)

saudet mentioned this issue Feb 26, 2021

Add TensorScope #188

Closed

karllessard mentioned this issue Mar 1, 2021

Keep weak references to eager resources in session #229

Merged

saudet added a commit to bytedeco/javacpp that referenced this issue Mar 2, 2021

* Ensure System.gc() never gets called with "org.bytedeco.javacpp.…

c1520a0

…nopointergc" (issue tensorflow/java#208)

nebulorum mentioned this issue May 3, 2021

Large latency on Tensor allocation #313

Closed

0x6675636b796f75676974687562 mentioned this issue Jul 10, 2024

JVM stuck forever at Pointer.physicalBytesInaccurate() bytedeco/javacpp#767

Open

Interaction between Tensorflow Java and JavaCPP Pointer deallocation #208

Interaction between Tensorflow Java and JavaCPP Pointer deallocation #208

Comments

skirdey commented Feb 5, 2021 • edited Loading

saudet commented Feb 5, 2021

skirdey commented Feb 5, 2021

skirdey commented Feb 5, 2021

saudet commented Feb 5, 2021

skirdey commented Feb 17, 2021

saudet commented Feb 18, 2021

skirdey commented Feb 19, 2021

skirdey commented Feb 19, 2021 • edited Loading

skirdey commented Feb 19, 2021

skirdey commented Feb 19, 2021

skirdey commented Feb 19, 2021

skirdey commented Feb 19, 2021

saudet commented Feb 20, 2021

skirdey commented Feb 20, 2021 via email

karllessard commented Feb 21, 2021

saudet commented Feb 22, 2021

skirdey commented Feb 22, 2021 via email • edited Loading

saudet commented Feb 22, 2021

skirdey commented Feb 22, 2021 via email

karllessard commented Feb 23, 2021

saudet commented Feb 24, 2021

karllessard commented Feb 24, 2021 • edited Loading

saudet commented Feb 25, 2021

karllessard commented Feb 25, 2021

skirdey commented Feb 25, 2021

skirdey commented Feb 25, 2021 via email

stu1130 commented Feb 25, 2021 • edited Loading

skirdey commented Feb 26, 2021

saudet commented Feb 26, 2021

stu1130 commented Feb 26, 2021

saudet commented Feb 26, 2021

saudet commented Feb 26, 2021

Craigacp commented Feb 26, 2021

karllessard commented Feb 26, 2021

saudet commented Feb 27, 2021

karllessard commented Feb 27, 2021 • edited Loading

rnett commented Feb 27, 2021

karllessard commented Feb 27, 2021

karllessard commented Mar 1, 2021 • edited Loading

skirdey commented Mar 1, 2021 via email

skirdey commented Feb 5, 2021 •

edited

Loading

skirdey commented Feb 19, 2021 •

edited

Loading

skirdey commented Feb 22, 2021 via email •

edited

Loading

karllessard commented Feb 24, 2021 •

edited

Loading

stu1130 commented Feb 25, 2021 •

edited

Loading

karllessard commented Feb 27, 2021 •

edited

Loading

karllessard commented Mar 1, 2021 •

edited

Loading