about Memory growth test of Tritonserver #1141

Closed
jackyh opened this issue Jan 28, 2022 · 110 comments

@jackyh
Contributor

jackyh commented Jan 28, 2022

We are trying to test memory growth by gathering memory usage statistics while running inference.
Each time we run an inference, we record how much memory was allocated. We found that:
("The max allocation of memory when doing a single inference" - "The average allocation of memory when doing a single inference") / ("The max allocation of memory when doing a single inference") = 0.46, which means the variation is too large. Why? It varies from about 700MB to 1500MB.

Attached are the Simple.java file and the test.sh script. To reproduce this, you will probably need to modify the directory in test.sh accordingly.

@saudet

L0_memory_growth.zip

@saudet
Member

saudet commented Jan 28, 2022

There's a couple of things that could be happening, but the first thing you should check is for dangling Pointer objects. Try to run a command like this:

mvn clean compile exec:java -Dorg.bytedeco.javacpp.logger.debug -DargLine=-Xmx1000m 2>&1 | grep Collecting | grep -v 'ownerAddress=0x0'

If you see any output from that, you should find where those Pointer objects are not getting deallocated and call close() on them, or you could use PointerScope where appropriate: http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/

@jackyh
Contributor Author

jackyh commented Jan 28, 2022

Looks like there are quite a few. Since there are lots of "new BytePointer" calls in Simple.java, which ones do I need to call close() on?
client_10.log

@saudet
Member

saudet commented Jan 28, 2022 via email

@jackyh
Contributor Author

jackyh commented Jan 28, 2022

Will it release/close them by itself?

@saudet
Member

saudet commented Jan 28, 2022

Kind of, it's like a scope in C++, see the example here: http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/

@jackyh
Contributor Author

jackyh commented Jan 29, 2022

Kind of, it's like a scope in C++, see the example here: http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/

I ran some tests today with "PointerScope", as attached. But memory usage still fluctuates a lot:
("The max allocation of memory when doing a single inference" - "The average allocation of memory when doing a single inference") / ("The max allocation of memory when doing a single inference") = 0.52, which means the variation is still too large. It varies from about 50MB to 500MB. Are there other ways to debug this?
20220129_PointerScope.zip

@saudet
Member

saudet commented Jan 29, 2022

Please check the debug log like I asked you to do above #1141 (comment)

@jackyh
Contributor Author

jackyh commented Jan 29, 2022

Please check the debug log like I asked you to do above #1141 (comment)

It looks like you mean that even with "PointerScope" added, pointers can still leak if the "PointerScope" does not cover all of them?

@jackyh
Contributor Author

jackyh commented Jan 29, 2022

I added "try (PointerScope scope = new PointerScope()) {" right at the beginning of the main function. Why are there still lots of "Debug: Collecting org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x0,deallocatorAddress=0x0]" lines in the output log file?

@saudet
Member

saudet commented Jan 30, 2022

Those are fine, their ownerAddress is 0. If you see any that have an address other than 0, then you should find what those are. If all that you see do not have an address, then you're probably dealing with GC issues of the Java heap. Try a different one:
https://developers.redhat.com/articles/2021/11/02/how-choose-best-java-garbage-collector
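
When trying different collectors, it can help to confirm from inside the process which one the JVM actually picked. The following is a minimal sketch using only the standard java.lang.management API; the class name ActiveGcSketch and its helper are hypothetical and not part of JavaCPP or the Triton sample.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class ActiveGcSketch {
    // Names of the collectors registered by the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 when the collector does not define them.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running it with different collector flags (for example -XX:+UseParallelGC or -XX:+UseG1GC) should print different collector names, which confirms the flag took effect.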

@saudet
Member

saudet commented Jan 30, 2022

BTW, how did you make sure this is happening only with Java, and not with C++? Maybe it's a problem with Triton...

@jackyh
Contributor Author

jackyh commented Jan 30, 2022

BTW, how did you make sure this is happening only with Java, and not with C++? Maybe it's a problem with Triton...

good point.

@jackyh
Contributor Author

jackyh commented Jan 30, 2022

I searched the log file:
Debug: Releasing org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x7f29bf518190,deallocatorAddress=0x7f29c7ec4090]
Debug: Collecting org.bytedeco.javacpp.Pointer$NativeDeallocator[ownerAddress=0x0,deallocatorAddress=0x0]
All of the "Collecting" log entries have ownerAddress=0x0.

@jackyh
Contributor Author

jackyh commented Feb 3, 2022

Samuel:
We designed our test case like this:

  1. We start a thread to monitor memory usage:
    a. The thread samples memory usage every two seconds.
    b. Each time, we record the sample like this:
    DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
    double memory = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
    stats.accept(memory);
    c. Each time, we calculate the delta like this:
    double memory_allocation_delta = stats.getMax() - stats.getAverage();
    double memory_allocation_delta_mb = memory_allocation_delta / 1E6;
    double memory_allocation_delta_percent = memory_allocation_delta / stats.getMax();
    If "memory_allocation_delta_percent" is larger than 10 percent, the test fails.

  2. In the main process, we do this:
    for (int i = 0; i < 1000000; i++) {
      RunInference(server, model_name, is_int, is_torch_model);
    }

We assume that every two seconds some calls to "RunInference" will have completed, each allocating and freeing some memory, so "memory_allocation_delta_percent" should not exceed 10%. What do you think? Is this the right way to test memory growth of a Java process?
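
The monitoring scheme described above can be sketched as follows. This is a minimal illustration, not the code from the attached Simple.java: the class name MemoryMonitorSketch, the helper names, and the shortened sampling loop are assumptions. The statistics object is created once and shared by all samples, because if a new instance were created for every sample, max and average would always be equal.

```java
import java.util.DoubleSummaryStatistics;

// Hypothetical sketch of the monitoring scheme; names are illustrative only.
class MemoryMonitorSketch {
    // (max - average) / max over all samples recorded so far.
    static double deltaPercent(DoubleSummaryStatistics stats) {
        if (stats.getCount() == 0 || stats.getMax() <= 0) {
            return 0.0; // nothing meaningful to report yet
        }
        return (stats.getMax() - stats.getAverage()) / stats.getMax();
    }

    // Java heap currently in use, computed as in the description above.
    static double usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        DoubleSummaryStatistics stats = new DoubleSummaryStatistics(); // created once
        // The described test samples every two seconds; shortened here for the demo.
        for (int i = 0; i < 5; i++) {
            stats.accept(usedHeapBytes());
            Thread.sleep(100);
        }
        double deltaMb = (stats.getMax() - stats.getAverage()) / 1E6;
        System.out.println("delta = " + deltaMb + " MB, ratio = " + deltaPercent(stats));
    }
}
```

Note that Runtime.totalMemory() and Runtime.freeMemory() only describe the Java heap, so memory allocated natively (for example through Pointer.malloc() or by Triton itself) is not visible to this metric.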

@saudet

@saudet
Member

saudet commented Feb 3, 2022

Well, that's a question about Triton more than anything else, I think. All buffers should be preallocated as much as possible, so variations like that don't occur.

@jackyh
Contributor Author

jackyh commented Feb 4, 2022

Well, that's a question about Triton more than anything else, I think. All buffers should be preallocated as much as possible, so variations like that don't occur.

Since we allocate lots of buffers in "RunInference" each time we run inference, do you mean these need to be replaced by static/preallocated memory? If we re-allocate these buffers on every inference, is this variation normal for a Java process?

@saudet
Member

saudet commented Feb 4, 2022

That has nothing to do with Java! You're allocating these buffers for Triton, not Java. This is something that needs to be fixed for Triton.

@jackyh
Contributor Author

jackyh commented Feb 4, 2022

buffers for Triton, not Java. This is something that needs to be fixed for Triton.

Yes, we allocate these buffers for Triton to run inference or to compare results. So, say I make these buffers static/preallocated and the variation issue goes away: does that mean the GC is not working well enough?

@saudet
Member

saudet commented Feb 4, 2022

Preallocating and reusing objects that use memory on the Java heap helps the GC, but it's possible to tune the GC to be able to cope better with larger amounts of garbage too, yes.
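
To illustrate the difference between allocating per request and reusing a preallocated buffer, here is a minimal sketch using only java.nio.ByteBuffer. The class ReusableBufferSketch and its fill method are hypothetical and merely stand in for whatever per-inference buffers the test allocates; this is not the Triton sample code.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: one buffer allocated up front and reused for every request.
class ReusableBufferSketch {
    private final ByteBuffer buffer;

    ReusableBufferSketch(int capacity) {
        this.buffer = ByteBuffer.allocateDirect(capacity); // allocated once
    }

    // Overwrites the shared buffer with new input instead of allocating a fresh one.
    ByteBuffer fill(byte[] input) {
        if (input.length > buffer.capacity()) {
            throw new IllegalArgumentException("input larger than preallocated buffer");
        }
        buffer.clear();
        buffer.put(input);
        buffer.flip(); // ready for reading: position 0, limit = input.length
        return buffer;
    }

    public static void main(String[] args) {
        ReusableBufferSketch sketch = new ReusableBufferSketch(1024);
        byte[] input = new byte[256]; // also preallocated, to avoid per-iteration garbage
        for (int i = 0; i < 1000; i++) {
            input[0] = (byte) i;
            ByteBuffer view = sketch.fill(input);
            // A real caller would hand `view` to the native library here.
        }
        System.out.println("reused one buffer of " + sketch.buffer.capacity()
                + " bytes for 1000 iterations");
    }
}
```

The trade-off is that the caller must be done with the previous contents before the next fill, since every call returns the same buffer object.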

@jackyh
Contributor Author

jackyh commented Feb 5, 2022

Preallocating and reusing objects that use memory on the Java heap helps the GC, but it's possible to tune the GC to be able to cope better with larger amounts of garbage too, yes.

So you mean tuning the GC for larger amounts of garbage in the ways listed here: https://developers.redhat.com/articles/2021/11/02/how-choose-best-java-garbage-collector#parallel_collector ?

@saudet
Member

saudet commented Feb 5, 2022

That kind of thing, yes, but if the requests that you get don't require allocating different kinds of buffers all the time, it's more efficient to just reuse those buffers. That's probably what your users are asking about.

@jackyh
Contributor Author

jackyh commented Feb 6, 2022

here's the default JVM parameters:

root@4a42d065cf6e:/workspace/javacpp_presets_upstream/javacpp-presets/tritonserver# java -XX:+PrintCommandLineFlags -version
-XX:G1ConcRefinementThreads=10 -XX:GCDrainStackTargetSize=64 -XX:InitialHeapSize=524877248 -XX:MaxHeapSize=8398035968 -XX:+PrintCommandLineFlags -XX:ReservedCodeCacheSize=251658240 -XX:+SegmentedCodeCache -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
openjdk version "11.0.13" 2021-10-19
OpenJDK Runtime Environment (build 11.0.13+8-Ubuntu-0ubuntu1.20.04)
OpenJDK 64-Bit Server VM (build 11.0.13+8-Ubuntu-0ubuntu1.20.04, mixed mode, sharing)

Which parameters do you think probably need tuning?

@jackyh
Contributor Author

jackyh commented Feb 6, 2022

That kind of thing, yes, but if the requests that you get don't require allocating different kinds of buffers all the time, it's more efficient to just reuse those buffers. That's probably what your users are asking about.

So, for this, since I want to make the largest memory allocations static, how can I find out which buffers/objects are the largest?

@saudet
Member

saudet commented Feb 6, 2022

Not just the largest one, all of them, if possible. I'm guessing that ideally your users want this to be "garbage free" to get the lowest latency possible, for real time applications, but I'm just guessing. You should try to find out what the needs of your users are, and then we can figure out how to meet those needs.

@jackyh
Contributor Author

jackyh commented Feb 6, 2022

Not just the largest one, all of them, if possible. I'm guessing that ideally your users want this to be "garbage free" to get the lowest latency possible, for real time applications, but I'm just guessing. You should try to find out what the needs of your users are, and then we can figure out how to meet those needs.

For now, this test is only internal. Will users probably have this sort of requirement? I'm not sure what Java users care about most.

@saudet
Member

saudet commented Feb 7, 2022

For now, this test is only internal. Will users probably have this sort of requirement? I'm not sure what Java users care about most.

Well, if what you care most about is money, HFT is where it's at for low-latency Java applications:
https://www.efinancialcareers.com/news/2020/11/low-latency-java-trading-systems
https://medium.com/@jadsarmo/why-we-chose-java-for-our-high-frequency-trading-application-600f7c04da94
https://www.azul.com/use-cases/trading-risk/
https://github.com/OpenHFT

But personally I prefer working on embedded systems such as the ones from aicas:
https://www.aicas.com/wp/use-cases/
@jjh-aicas Do you have use cases where machine learning and GPUs could be of help?

@jackyh
Contributor Author

jackyh commented Feb 7, 2022

Samuel:

Today I did more tests on GC and the heap:

  1. The command line arg is: -DargLine=-Xmx1000m
  2. While the test is running, allocated heap memory grows gradually from about 60M to 4000M. (Here, allocated memory is calculated as Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory().) Details are attached as client.log.
  3. GC info is collected with the command: jstat -gc 10524 500. It looks like "OU" grows fast! Details are attached as gc.log.

Why does "OU" grow so fast here? @saudet
client.log
gc.log

@saudet
Member

saudet commented Feb 7, 2022

That's the "old space" apparently: https://docs.oracle.com/javase/7/docs/technotes/tools/share/jstat.html
Here's some doc about that: https://docs.oracle.com/cd/E13150_01/jrockit_jvm/jrockit/geninfo/diagnos/garbage_collect.html
So it just looks like there are buffers that can't be freed because they are still referenced somewhere.
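
To watch the same "old space" figure from inside the process rather than through jstat, the standard java.lang.management API can be used. This is a minimal sketch, assuming the old generation pool's name contains "Old" or "Tenured", which holds for common HotSpot collectors but is not guaranteed by the API; the class name OldGenUsageSketch is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration only.
class OldGenUsageSketch {
    // All memory pools that belong to the Java heap.
    static List<MemoryPoolMXBean> heapPools() {
        List<MemoryPoolMXBean> pools = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                pools.add(pool);
            }
        }
        return pools;
    }

    // Assumption: old generation pools are named like "G1 Old Gen" or "Tenured Gen".
    static boolean looksLikeOldGen(String poolName) {
        return poolName.contains("Old") || poolName.contains("Tenured");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : heapPools()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            String tag = looksLikeOldGen(pool.getName()) ? " (old generation)" : "";
            System.out.println(pool.getName() + tag + ": " + usage.getUsed() / 1024 + " KB used");
        }
    }
}
```

If the old generation figure keeps climbing across full collections, that is consistent with objects staying reachable rather than with the GC falling behind.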

@jackyh
Contributor Author

jackyh commented Apr 7, 2022

Actually, you made me think about something. It is possible to store a reference to any Java object in a Pointer using JNI. I've added a couple of helper methods for that in commit bytedeco/javacpp@f11d67d. With that, instead of allocating and deallocating new native objects, something like this should work (not tested):

diff --git a/tritonserver/samples/Simple.java b/tritonserver/samples/Simple.java
index 8fa4d4f339..fd5271f8b7 100644
--- a/tritonserver/samples/Simple.java
+++ b/tritonserver/samples/Simple.java
@@ -197,7 +197,7 @@ public class Simple {
             // releasing the buffer.
             if (!allocated_ptr.isNull()) {
               buffer.put(0, allocated_ptr);
-              buffer_userp.put(0, new BytePointer(tensor_name));
+              buffer_userp.put(0, Loader.newGlobalRef(tensor_name));
               System.out.println("allocated " + byte_size + " bytes in "
                                + TRITONSERVER_MemoryTypeString(actual_memory_type.get())
                                + " for result tensor " + tensor_name);
@@ -213,16 +213,16 @@ public class Simple {
             TRITONSERVER_ResponseAllocator allocator, Pointer buffer, Pointer buffer_userp,
             long byte_size, int memory_type, long memory_type_id)
         {
-          BytePointer name = null;
+          String name = null;
           if (buffer_userp != null) {
-            name = new BytePointer(buffer_userp);
+            name = (String)Loader.accessGlobalRef(buffer_userp);
           } else {
-            name = new BytePointer("<unknown>");
+            name = "<unknown>";
           }
 
           System.out.println("Releasing buffer " + buffer + " of size " + byte_size
                            + " in " + TRITONSERVER_MemoryTypeString(memory_type)
-                           + " for result '" + name.getString() + "'");
+                           + " for result '" + name + "'");
           switch (memory_type) {
             case TRITONSERVER_MEMORY_CPU:
               Pointer.free(buffer);
@@ -254,7 +254,7 @@ public class Simple {
               break;
           }
 
-          name.deallocate();
+          Loader.deleteGlobalRef(buffer_userp);
 
           return null;  // Success
         }

I will try this now! Today I did something like:

if (!allocated_ptr.isNull()) {
buffer.put(0, allocated_ptr);
//jack added 0407
tensor_name_ptr = Pointer.malloc(tensor_name.length());
buffer_userp.put(0, tensor_name_ptr);
//System.out.println("string size of tensor_name" + ": " + tensor_name.length() + " is newed.");
//Pointer.free(allocated_ptr);
Pointer.free(tensor_name_ptr);

JMC says "no memory leak", but it can still only run 5600000 iterations before failing with "unknown exception"...

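Is there no 1.5.8-SNAPSHOT build that includes the latest JavaCPP? How do I make pom.xml pick up the latest one?

  <parent>
    <groupId>org.bytedeco</groupId>
    <artifactId>javacpp-presets</artifactId>
    <version>1.5.8-SNAPSHOT</version>
  </parent>

  <groupId>org.bytedeco</groupId>
  <artifactId>tritonserver</artifactId>
  <version>2.18-${project.parent.version}</version>
  <name>JavaCPP Presets for Triton Inference Server</name>

  <dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>javacpp</artifactId>
  </dependency>

The above doesn't work; it does not pick up the latest JavaCPP code.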

@saudet
Member

saudet commented Apr 7, 2022

You can add an entry like this to override the version from the dependencies:

  <dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>javacpp</artifactId>
    <version>1.5.8-SNAPSHOT</version>
  </dependency>

@jackyh
Contributor Author

jackyh commented Apr 7, 2022

You can add an entry like this to override the version from the dependencies:

  <dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>javacpp</artifactId>
    <version>1.5.8-SNAPSHOT</version>
  </dependency>

Thanks, doing it now.

You can add an entry like this to override the version from the dependencies:

  <dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>javacpp</artifactId>
    <version>1.5.8-SNAPSHOT</version>
  </dependency>

It still doesn't work:

[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /workspace/forMemoryLeakIssue/jackyh0407/java_cpp_presets_0407/javacpp-presets/tritonserver/samples/Simple.java:[201,41] cannot find symbol
symbol: method newGlobalRef(java.lang.String)
location: class org.bytedeco.javacpp.Loader
[ERROR] /workspace/forMemoryLeakIssue/jackyh0407/java_cpp_presets_0407/javacpp-presets/tritonserver/samples/Simple.java:[220,34] cannot find symbol
symbol: method accessGlobalRef(org.bytedeco.javacpp.Pointer)
location: class org.bytedeco.javacpp.Loader
[ERROR] /workspace/forMemoryLeakIssue/jackyh0407/java_cpp_presets_0407/javacpp-presets/tritonserver/samples/Simple.java:[258,17] cannot find symbol
symbol: method deleteGlobalRef(org.bytedeco.javacpp.Pointer)
location: class org.bytedeco.javacpp.Loader

Even after I delete this:
rm -rf /root/.m2/repository/org/bytedeco/javacpp/1.5.8-SNAPSHOT/

I modified javacpp-presets/tritonserver/pom.xml as you wrote above. It doesn't work.

@saudet
Member

saudet commented Apr 7, 2022

Maybe something went wrong with the latest builds. I've restarted the builds, please try again with mvn -U ...

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

Maybe something went wrong with the latest builds. I've restarted the builds, please try again with mvn -U ...

OK, will do just after breakfast. I need to try my best.

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project simple: Compilation failure: Compilation failure:
[ERROR] /workspace/forMemoryLeakIssue/jackyh0407/java_cpp_presets_0407/javacpp-presets/tritonserver/samples/Simple.java:[201,41] cannot find symbol
[ERROR] symbol: method newGlobalRef(java.lang.String)
[ERROR] location: class org.bytedeco.javacpp.Loader
[ERROR] /workspace/forMemoryLeakIssue/jackyh0407/java_cpp_presets_0407/javacpp-presets/tritonserver/samples/Simple.java:[220,34] cannot find symbol
[ERROR] symbol: method accessGlobalRef(org.bytedeco.javacpp.Pointer)
[ERROR] location: class org.bytedeco.javacpp.Loader
[ERROR] /workspace/forMemoryLeakIssue/jackyh0407/java_cpp_presets_0407/javacpp-presets/tritonserver/samples/Simple.java:[258,17] cannot find symbol
[ERROR] symbol: method deleteGlobalRef(org.bytedeco.javacpp.Pointer)
[ERROR] location: class org.bytedeco.javacpp.Loader
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

It doesn't work. I'll have to try a new clean container.

saudet added a commit to bytedeco/javacpp that referenced this issue Apr 8, 2022
@saudet
Member

saudet commented Apr 8, 2022

Ok, I finally spent a little time to debug this myself. I think I found the cause of the leak. I've pushed a fix in commit bytedeco/javacpp@0efc632. DeleteLocalRef() wasn't getting called on the temporary byte[] allocated to create a String for the callback. With this fix, I do not get any memory leaks.

If you're still not able to get the latest binaries from the snapshots for some reason, you can simply clone https://github.com/bytedeco/javacpp and mvn install that locally in your container.

You'll also need to rebuild the presets for Triton itself after that to make this effective:
https://github.com/bytedeco/javacpp-presets/tree/master/tritonserver#steps-to-run-this-sample-inside-an-ngc-container

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

Great, thanks! I'll try it this morning.

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

My god! I did everything above, but it still fails to find the symbols!!

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project simple: Compilation failure: Compilation failure:
[ERROR] /workspace/forMemoryLeakIssue/jackh0408/javacpp-presets/tritonserver/samples/Simple.java:[201,41] cannot find symbol
[ERROR] symbol: method newGlobalRef(java.lang.String)
[ERROR] location: class org.bytedeco.javacpp.Loader
[ERROR] /workspace/forMemoryLeakIssue/jackh0408/javacpp-presets/tritonserver/samples/Simple.java:[220,34] cannot find symbol
[ERROR] symbol: method accessGlobalRef(org.bytedeco.javacpp.Pointer)
[ERROR] location: class org.bytedeco.javacpp.Loader
[ERROR] /workspace/forMemoryLeakIssue/jackh0408/javacpp-presets/tritonserver/samples/Simple.java:[258,17] cannot find symbol
[ERROR] symbol: method deleteGlobalRef(org.bytedeco.javacpp.Pointer)
[ERROR] location: class org.bytedeco.javacpp.Loader
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

This is driving me mad...

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

Is there no "nm" for Java?

How can I check whether "newGlobalRef" is in /root/.m2/repository/org/bytedeco/javacpp/1.5.8-SNAPSHOT/javacpp-1.5.8-SNAPSHOT.jar?

@saudet
Member

saudet commented Apr 8, 2022

If you're using the "shaded" library of the presets for Triton, you'll need to update that one as well.

JAR files are just zip files, use unzip to extract the classes, and you can use the class file disassembler javap on those classes.
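
Besides unzip and javap, the same check can be done from Java with reflection. This is a minimal sketch using only JDK classes, assuming the JAR path, class name, and method name are passed on the command line; JarMethodCheck and declaresMethod are hypothetical names, not an existing tool.

```java
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical helper for illustration only.
class JarMethodCheck {
    // True if the class itself declares at least one method with the given name.
    static boolean declaresMethod(Class<?> cls, String methodName) {
        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    // Usage: java JarMethodCheck.java <path-to-jar> <class-name> <method-name>
    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: JarMethodCheck <jar> <class> <method>");
            return;
        }
        URL jarUrl = new File(args[0]).toURI().toURL();
        // Parent is null so the class can only come from this JAR (or the JDK itself).
        try (URLClassLoader loader = new URLClassLoader(new URL[] {jarUrl}, null)) {
            // initialize=false: inspect the class without running its static initializers.
            Class<?> cls = Class.forName(args[1], false, loader);
            System.out.println(args[2] + (declaresMethod(cls, args[2]) ? " found" : " not found")
                    + " in " + args[1]);
        }
    }
}
```

For the question above, the arguments would be the path to javacpp-1.5.8-SNAPSHOT.jar, org.bytedeco.javacpp.Loader, and newGlobalRef.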

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

newGlobalRef

It looks like newGlobalRef is there in Loader.class:

root@55d883c095bc:~/.m2/repository/org/bytedeco/javacpp/1.5.8-SNAPSHOT/tmp/org/bytedeco/javacpp# javap Loader.class |grep newGlobalRef
public static native org.bytedeco.javacpp.Pointer newGlobalRef(java.lang.Object);

What's "shaded"? I just use these commands to compile and install tritonserver:

$ mvn clean install --projects .,tritonserver
$ mvn clean install -f platform --projects ../tritonserver/platform -Djavacpp.platform=linux-x86_64

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

Simple.java includes this import:

import org.bytedeco.javacpp.*;

What's wrong there? It's driving me mad...

@saudet
Member

saudet commented Apr 8, 2022

For example, this file: https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/tritonserver-platform/2.18-1.5.8-SNAPSHOT/tritonserver-platform-2.18-1.5.8-20220318.033114-6-shaded.jar
If your build uses this file, it's going to use the Loader.class from there, not from the JAR of JavaCPP.
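
One way to confirm which JAR a class such as Loader actually comes from at run time is to ask its protection domain for the code source. This is a minimal sketch using only JDK APIs; ClassOriginSketch and originOf are hypothetical names chosen for illustration.

```java
import java.security.CodeSource;

// Hypothetical helper for illustration only.
class ClassOriginSketch {
    // Location (usually a JAR or directory URL) the class was loaded from.
    static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "<bootstrap or unknown>"; // typical for JDK core classes
        }
        return src.getLocation().toString();
    }

    // Usage: run with the same classpath as the sample and pass a class name.
    public static void main(String[] args) throws ClassNotFoundException {
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " loaded from " + originOf(Class.forName(name)));
    }
}
```

Run with the sample's classpath and org.bytedeco.javacpp.Loader as the argument, it should print either the shaded tritonserver-platform JAR or the plain javacpp JAR, whichever comes first on the classpath.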

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

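I found this in javacpp-presets/tritonserver/samples/pom.xml:

  <dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>tritonserver-platform</artifactId>
    <version>2.18-1.5.8-SNAPSHOT</version>
    <classifier>shaded</classifier>
  </dependency>

I just removed the "shaded" line:

  <dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>tritonserver-platform</artifactId>
    <version>2.18-1.5.8-SNAPSHOT</version>
  </dependency>

Then I did a clean install, but I still get the same error... Why?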

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

[INFO] Changes detected - recompiling the module!
[DEBUG] Classpath:
[DEBUG] /workspace/forMemoryLeakIssue/jackh0408/javacpp-presets/tritonserver/samples/target/classes
[DEBUG] /root/.m2/repository/org/bytedeco/cuda-platform/11.6-8.3-1.5.7/cuda-platform-11.6-8.3-1.5.7.jar
[DEBUG] /root/.m2/repository/org/bytedeco/javacpp-platform/1.5.7/javacpp-platform-1.5.7.jar
[DEBUG] /root/.m2/repository/org/bytedeco/javacpp/1.5.7/javacpp-1.5.7.jar
[DEBUG] /root/.m2/repository/org/bytedeco/javacpp/1.5.7/javacpp-1.5.7-linux-x86_64.jar

It still uses JavaCPP 1.5.7?? Why not 1.5.8?

@saudet
Member

saudet commented Apr 8, 2022

It's possible, yes, unless we force it to a particular version, as above #1141 (comment).

@saudet
Member

saudet commented Apr 8, 2022

Anyway, I've updated the shaded JAR file, so we can use it without Maven, for example, this way:

wget https://oss.sonatype.org/content/repositories/snapshots/org/bytedeco/tritonserver-platform/2.18-1.5.8-SNAPSHOT/tritonserver-platform-2.18-1.5.8-20220408.064151-7-shaded.jar
java -cp tritonserver-platform-2.18-1.5.8-20220408.064151-7-shaded.jar SimpleCPUOnly.java -r /workspace/models ...

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

Compiling is OK now.

I am now testing with:

java -Xms128m -Xmx384m -Dorg.bytedeco.javacpp.maxPhysicalBytes=800m -Dorg.bytedeco.javacpp.nopointergc=true -jar simple-1.5.8-SNAPSHOT.jar -r "/workspace/server-2.18.0/docs/examples/model_repository/models" -i 10000000

@jackyh
Contributor Author

jackyh commented Apr 8, 2022

10 million iterations passed! Bravo!

@jackyh
Contributor Author

jackyh commented Apr 11, 2022

Ok, I finally spent a little time to debug this myself. I think I found the cause of the leak. I've pushed a fix in commit bytedeco/javacpp@0efc632. DeleteLocalRef() wasn't getting called on the temporary byte[] allocated to create a String for the callback. With this fix, I do not get any memory leaks.

If you're still not able to get the latest binaries from the snapshots for some reason, you can simply clone https://github.com/bytedeco/javacpp and mvn install that locally in your container.

You'll also need to rebuild the presets for Triton itself after that to make this effective: https://github.com/bytedeco/javacpp-presets/tree/master/tritonserver#steps-to-run-this-sample-inside-an-ngc-container

Yes, I was able to figure out from "oldSampleObject" that "BytePointer(tensor_name)" is not collected from the heap. But since I'm not an expert in JavaCPP, I wasn't able to fix it...

@saudet
Member

saudet commented Apr 11, 2022

Yes, I was able to figure out from "oldSampleObject" that "BytePointer(tensor_name)" is not collected from the heap. But since I'm not an expert in JavaCPP, I wasn't able to fix it...

Don't worry about that one, just apply the patch from above #1141 (comment).

@saudet
Member

saudet commented Nov 3, 2022

The fix has now been released with JavaCPP 1.5.8! Thanks for reporting and for testing

@saudet saudet closed this as completed Nov 3, 2022