JNI Test Fix #3825

Merged
merged 36 commits into from May 24, 2021

Contributor

@tucek tucek commented May 4, 2021

Basics

These points need to be fulfilled for every PR:

  • Short descriptions of your changes are in the release notes
    (added as entry in doc/news/_preparation_next_release.md which
    contains _(my name)_)
    Please always add something to the release notes.
  • Details of what you changed are in commit messages
    (first line should have module: short statement syntax)
  • References to issues, e.g. close #X, are in the commit messages.
  • The buildservers are happy. If not, fix in this order:
    • add a line in doc/news/_preparation_next_release.md
    • reformat the code with scripts/dev/reformat-all
    • make all unit tests pass
    • fix all memleaks
  • The PR is rebased with current master.

If you have any troubles fulfilling these criteria, please write
about the trouble as comment in the PR. We will help you.
But we cannot accept PRs that do not fulfill the basics.

Checklist

Check relevant points but please do not remove entries.
For docu fixes, spell checking, and similar none of these points below
need to be checked.

  • I added unit tests for my code
  • I fully described what my PR does in the documentation
    (not in the PR description)
  • I fixed all affected documentation
  • I added code comments, logging, and assertions as appropriate (see Coding Guidelines)
  • I updated all meta data (e.g. README.md of plugins and METADATA.ini)
  • I mentioned every code not directly written by me in THIRD-PARTY-LICENSES

Review

Reviewers will usually check the following:

Labels

If you are already Elektra developer:

  • Add the "work in progress" label if you do not want the PR to be reviewed yet.
  • Add the "ready to merge" label if the basics are fulfilled and you also
    say that everything is ready to be merged.

@markus2330
Contributor

@tucek can you please update the README.md of jni with latest introduction of how you installed everything? Most parts there are very outdated. The Debian 9+8 instructions can be safely removed.

When I try to compile this branch I get

INFOJNI test activated.

Probably some wrongly-used cmake debugging message you forgot to remove?
It was confusing for me as we usually only output something if tests were deactivated.

On Debian 10 with:

  • openjdk-11-jre and openjdk-11-jdk
  • gradle-6.8.3 manually installed

I can also reproduce the segfault. Running with valgrind, the segfault does not happen: the test can be executed successfully.

@tucek can you please try different Java versions (e.g. the one of Oracle) on different OSs to see which systems are affected and for which it works (if any)?

If we have a problem in our code it may be related to how we execute JNI_CreateJavaVM¹ or in line 120 or 309 (NewObject), as valgrind is pointing there.
¹ https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/invocation.html they don't use the options on the stack.

@kodebach can you take a look at that, please? It would be a great help. Ideally even fixing the TODO in line 201.

Would be great if we can give the students a smooth experience when using the JNI plugin!

Note for myself, I started valgrind with: LD_LIBRARY_PATH=/home/markus/Projekte/Elektra/build/lib DB_TEST_BIN_DIR=/home/markus/Projekte/Elektra/build/bin valgrind --tool=memcheck --error-limit=no --track-origins=yes -v --leak-check=full --suppressions=/home/markus/Projekte/Elektra/current/tests/valgrind.suppression --show-reachable=yes /home/markus/Projekte/Elektra/build/bin/testmod_jni "/home/markus/Projekte/Elektra/current/src/plugins/jni"

@kodebach
Member

kodebach commented May 7, 2021

I do not have time to fully diagnose this problem, but here are my findings from a quick investigation:

  1. Just executing testmod_jni with a debugger shows that the double free happens in keyDel().
  2. GDB doesn't show a backtrace between
    PLUGIN_OPEN ("jni");
    and keyDel() here
    elektraFree (key);

    This is because keyDel() is called inside the PLUGIN_OPEN macro.
  3. Since we have a double free, there has to be an earlier call to keyDel(). This led me to believe that keyDel() is called once via JNI from Java and then again by the PLUGIN_OPEN macro.
  4. To confirm this, I created a "Remote JVM Debug" configuration in IntelliJ (default settings are fine). This gives you a set of arguments with which the Remote JVM needs to be started. In my case this was
    -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005
    
    To use these arguments you need to modify testmod_jni.
    diff --git a/src/plugins/jni/jni.c b/src/plugins/jni/jni.c
    index 1577242b3..1c20933aa 100644
    --- a/src/plugins/jni/jni.c
    +++ b/src/plugins/jni/jni.c
    @@ -206,11 +206,12 @@ int elektraJniOpen (Plugin * handle, Key * errorKey)
            */
    
            JavaVMInitArgs vmArgs;
    -       JavaVMOption options[2];
    +       JavaVMOption options[3];
            options[0].optionString = classpath;
            options[1].optionString = option;
    +       options[2].optionString = "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=*:5005";
            vmArgs.version = JNI_VERSION_1_8;
    -       vmArgs.nOptions = 2;
    +       vmArgs.nOptions = 3;
            vmArgs.options = options;
            vmArgs.ignoreUnrecognized = ign;
    When you start testmod_jni now, it will wait for a Java Debugger to attach during the call to JNI_CreateJavaVM.
  5. I started testmod_jni again with a debugger. In IntelliJ I added a breakpoint in Key.release() and attached the Java Debugger that testmod_jni was waiting for. Sure enough, the breakpoint triggered and the Java code called keyDel(). I continued and then I got the double free exception in GDB. [*]
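
The only moving parts of the JDWP option used in step 4 are the suspend flag and the port. As a minimal sketch (formatJdwpOption is a hypothetical helper, not something that exists in jni.c), the string assigned to options[2].optionString could be built as shown here. With suspend=y the JVM waits for a debugger to attach, while the suspend=n from the IntelliJ defaults lets it start right away.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of jni.c: formats the JDWP agent option.
 * suspend != 0 asks the JVM to wait until a Java debugger has attached.
 * Returns 0 on success, -1 if the buffer is too small. */
int formatJdwpOption (char * buffer, size_t size, int suspend, unsigned port)
{
  int written = snprintf (buffer, size, "-agentlib:jdwp=transport=dt_socket,server=y,suspend=%c,address=*:%u",
                          suspend ? 'y' : 'n', port);
  return (written < 0 || (size_t) written >= size) ? -1 : 0;
}
```

The buffer has to stay alive for as long as the JavaVMOption array that points to it is in use.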

Summary
The errorKey passed to elektraJniOpen is improperly freed by the Java code (via Key.release()). One solution might be to call keyIncRef()/keyDecRef() before/after passing the key to the JVM. This should probably happen somewhere in call2Arg. A similar issue will likely exist for KeySets.

@markus2330 For the future, we should probably reconsider #3631 (separate ref counters for KeySet -> Key and Other -> Key, references) and maybe think about even more changes to the reference counting. For example, maybe kdbOpen, kdbGet etc. should increment the ref count before passing to plugins to ensure the plugin doesn't accidentally delete the parent key.


[*] When I run testmod_jni with GDB I also get a segfault from somewhere within libjvm.so during the call to JNI_CreateJavaVM but before the JVM waits for the Java Debugger. There was no usable backtrace so I can't tell why it happens, but you can just continue and everything works.

@kodebach
Member

kodebach commented May 7, 2021

PS. @tucek If you need to debug this further and would prefer a graphical interface for the debugger over GDB, you can use this VSCode config (assumes your build folder is called build):

{
  "name": "testmod_jni",
  "type": "cppdbg",
  "request": "launch",
  "program": "${workspaceFolder}/build/bin/testmod_jni",
  "args": [
    "${workspaceFolder}/src/plugins/jni"
  ],
  "stopAtEntry": false,
  "cwd": "${workspaceFolder}/build/bin",
  "environment": [
    {
      "name": "LD_LIBRARY_PATH",
      "value": "${workspaceFolder}/build/lib"
    }
  ],
  "externalConsole": false,
  "MIMode": "gdb",
  "setupCommands": [
    {
      "description": "Enable pretty-printing for gdb",
      "text": "-enable-pretty-printing",
      "ignoreFailures": true
    }
  ]
}

@kodebach
Member

kodebach commented May 7, 2021

Another note: I just saw that the Return plugin used in testmod_jni returns 10, 20, 30 in the get, set and error functions. These values should be one of the ELEKTRA_PLUGIN_STATUS_* values (i.e. -1, 0, 1).
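
A small sketch of what such a check could look like. The enum names are local stand-ins, and the mapping of -1, 0 and 1 to error, no update and success is my assumption about the ELEKTRA_PLUGIN_STATUS_* macros, so please check kdbplugin.h before relying on it.

```c
/* Assumption: local stand-ins for ELEKTRA_PLUGIN_STATUS_ERROR, _NO_UPDATE and _SUCCESS.
 * Only the values -1, 0 and 1 are taken from the comment above. */
enum
{
  STATUS_ERROR = -1,
  STATUS_NO_UPDATE = 0,
  STATUS_SUCCESS = 1
};

/* Returns 1 if value is one of the three status codes above, 0 otherwise. */
int isValidPluginStatus (int value)
{
  return value == STATUS_ERROR || value == STATUS_NO_UPDATE || value == STATUS_SUCCESS;
}
```

With this check, the 10, 20 and 30 currently returned by the Return plugin would all be rejected.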

@tucek
Contributor Author

tucek commented May 9, 2021

> When I try to compile this branch I get
>
> INFOJNI test activated.
>
> Probably some wrongly-used cmake debugging message you forgot to remove?
> It was confusing for me as we usually only output something if tests were deactivated.

I intentionally had not yet removed this debug message. (Since this is still a draft PR...)

> On Debian 10 with:
>
>   • openjdk-11-jre and openjdk-11-jdk
>   • gradle-6.8.3 manually installed
>
> I can also reproduce the segfault. Running with valgrind, the segfault does not happen: the test can be executed successfully.

I've tested Ubuntu 20.04 and Debian 10 with openjdk-11-jdk and get identical results. The only difference I found was that I had to set JAVA_HOME on Debian 10 for the FindJNI cmake module to actually find the JNI header file. Are you sure plugin jni was successfully included by cmake and ctest -V -R testmod_jni did finish successfully?

> @tucek can you please try different Java versions (e.g. the one of Oracle) on different OSs to see which systems are affected and for which it works (if any)?

Since I did not get different results between the two configurations above (contrary to your report), I do not really see the indication for doing a (quite time-consuming) extensive test with different Java distribution / OS combinations. Nonetheless, I've tested the following additional combinations:

  • Debian 10 / AdoptOpenJDK 11.0.11.j9
  • Debian 10 / Microsoft 11.0.11.9.1
  • Debian 10 / Oracle GraalVM 21.1.0.r16 (jdk 16.0.1)

All yielded the same result.

@markus2330
Contributor

markus2330 commented May 10, 2021

> Are you sure plugin jni was successfully included by cmake and

Yes. I've set JAVA_HOME, I am not sure if it was needed, though.

Please update the README.md with such information if you know about it.

> ctest -V -R testmod_jni did finish successfully?

Only with valgrind, otherwise I got a double free error (with termination of the process).

> Since I did not get different results between the two configurations above

Which two configurations above?

> (contrary to your report)

You also get segfaults when running with valgrind?

> All yielded the same result.

Which is?

Btw. in previous segfaults of JNI it was usually related to the Java version. If we found any combination that works for CM, it would already help a lot. This is why I asked for the investigation. Seems like this is not the case this time, though.

@kodebach
Member

I am pretty sure this issue is entirely independent of Java Version and OS. AFAICT everything is working as intended (i.e. as written in code). We're just telling the JVM to call keyDel() when the garbage collector collects a Key object, but also calling keyDel() manually afterwards.

There is a very similar problem, if you write a plugin in C++ and do something like this:

int elektraMyPluginGet (ckdb::Plugin * handle, ckdb::KeySet * returned, ckdb::Key * parentKey)
{
  kdb::Key parent(parentKey);
  // ...
}

At the end of elektraMyPluginGet the C++ runtime will call the destructor for kdb::Key. This in turn will call keyDel(). Because parentKey is not part of any KeySet and nothing else increments its reference counter before the kdb::Key constructor, the reference count will be 0 at this point and keyDel() will free the memory.

Later on, at some point after elektraMyPluginGet has returned, the internals of libelektra-kdb will call keyDel() again and we get a double free.

The easiest solution for the C++ plugin is to not use kdb::Key, because you can just use the C API. You could also use keyIncRef()/keyDecRef() to prevent the kdb::Key destructor from freeing the memory. However, that is a bit more complicated, because you need to call keyDecRef() after the C++ runtime has destroyed the kdb::Key.

int elektraMyPluginGet (ckdb::Plugin * handle, ckdb::KeySet * returned, ckdb::Key * parentKey)
{
  keyIncRef (parentKey);
  {
    kdb::Key parent (parentKey);
    // here we can use the parent
    // ...
  }
  keyDecRef (parentKey);
}

Something similar probably needs to happen in call2Arg in the JNI plugin. And as suggested above, in general we should probably keyIncRef()/keyDecRef() before/after passing Key * to plugins. But that should probably be part of #3693.

@markus2330
Contributor

markus2330 commented May 11, 2021

> I am pretty sure this issue is entirely independent of Java Version and OS.

Yes, it looks so. I was only surprised as nearly nothing was changed in the JNI code (within one year basically only a small fix: f630de9) and the previous version was working on (at that time) recent Java versions (it did segfault on old Java versions, though).

> We're just telling the JVM to call keyDel() when the garbage collector collects a Key object, but also calling keyDel() manually afterwards.

This might explain it, maybe older JVMs never collected the objects.

> The easiest solution for the C++ plugin is to not use kdb::Key, because you can just use the C API.

Yes, it would be fine for me. I wonder where you see C++ code? src/plugins/jni/jni.c is pure C?

> And as suggested above, in general we should probably keyIncRef()/keyDecRef() before/after passing Key * to plugins. But that should probably be part of #3693.

This discussion seems to be OT here, let us discuss this with the other changes of #3693. (When we have time for this, now we have an urgent fix and a release to do.)

@tucek
Contributor Author

tucek commented May 11, 2021

> @tucek can you please update the README.md of jni with latest introduction of how you installed everything? Most parts there are very outdated. The Debian 9+8 instructions can be safely removed.

>> Are you sure plugin jni was successfully included by cmake and

> Yes. I've set JAVA_HOME, I am not sure if it was needed, though.

> Please update the README.md with such information if you know about it.

will do...

>> ctest -V -R testmod_jni did finish successfully?

> Only with valgrind, otherwise I got a double free error (with termination of the process).

>> Since I did not get different results between the two configurations above

> Which two configurations above?

>> (contrary to your report)

> You also get segfaults when running with valgrind?

>> All yielded the same result.

> Which is?

> Btw. in previous segfaults of JNI it was usually related to the Java version. If we found any combination that works for CM, it would already help a lot. This is why I asked for the investigation. Seems like this is not the case this time, though.

With all 5 combinations i've tested before and one new one:

  • Ubuntu 20.04 / openjdk-11-jdk
  • Debian 10 / openjdk-11-jdk
  • Debian 10 / AdoptOpenJDK 11.0.11.j9
  • Debian 10 / Microsoft 11.0.11.9.1
  • Debian 10 / Oracle GraalVM 21.1.0.r16 (jdk 16.0.1)
  • Ubuntu 20.04 / Microsoft 11.0.11.9.1

I've got the double free when running ctest -V -R testmod_jni. When running make run_memcheck, the test always failed as well. On Ubuntu, many more specific errors were found than on Debian (independent of the Java distribution): 20210511-0853.zip

I hope this clears it up a little bit.

@tucek
Contributor Author

tucek commented May 11, 2021

I've tried to get rid of the double free by incrementing the reference count of the keys in jni.c, but did not get anywhere...

@markus2330
It looks like it should be impossible to use an Elektra JNI Java plugin as long as this problem is not solved, which leads me to the following questions:

  • When did it last work?
  • What should participants of CM2021s, committed to writing a JNI plugin, hand in tomorrow?

@kodebach
Member

A temporary workaround would be to remove the finalize() methods from the JNA binding. This would create memory leaks (Keys created in Java would not be deleted), but it should avoid the double-free problem and the segfaults.

@markus2330
Contributor

@tucek

> When did it last work?

Last CM it was working. A git-bisect would be needed to find the last working commit. But maybe the code never worked with recent Javas.

> What should participants of CM2021s, committed to writing a JNI plugin, hand in tomorrow?

We will discuss this tomorrow.

@kodebach

> A temporary workaround would be to remove the finalize() methods from the JNA binding.

As you already debugged and found a plausible reason, please let us try this first. It is more important to have a tested&reliable solution than to have something very quick.

@kodebach
Member

> I wonder where you see C++ code? src/plugins/jni/jni.c is pure C?

There is no C++ code. It was just an example, because the problem is a bit more obvious with C++ (and easily found with valgrind because there is no JVM code in between).

> As you already debugged and found a plausible reason, please let us try this first. It is more important to have a tested&reliable solution than to have something very quick.

It seems @tucek already tried with keyIncRef/keyDecRef:

> I've tried to get rid of the double free by incrementing the reference count of the keys in jni.c, but did not get anywhere...

Maybe the attempt was not thorough enough, but I already suspected that a similar problem might occur with KeySet where no reference counting exists.

If keyIncRef/keyDecRef doesn't work, I'm not really sure how to solve this issue either. At least not without modifying internals of libelektra-kdb and maybe libelektra-core.

> Last CM it was working.

The tests have been disabled for at least 4 years now. The double-free may have existed but no code actually triggered a segfault. Technically, although unlikely, a double-free can happen without a segfault and Java Plugins would definitely not be run with valgrind by the build server, because valgrind will detect lots of false positives from the JVM.

> But maybe the code never worked with recent Javas.

If the issue is indeed what I described then it is entirely independent of Java Language or JVM Version. In fact the C++ example shows that it will happen in any language, where the binding automatically calls keyDel().

However, in Java the issue somewhat depends on the Garbage Collector implementation. The GC does change between some Java Versions, but AFAIK it was never entirely deterministic. So whether or not you actually get a segfault is probably somewhat random.


TL;DR if simply adding keyIncRef/keyDecRef in the JNI plugin doesn't work, I don't think there is a quick solution.

@kodebach
Member

I have done some more investigations:

  1. Adding keyIncRef()/keyDecRef() around the call2Arg() and call1Arg() calls did not solve the problem. Instead of a segfault in keyDel() you get an error in ksNext() because of a double-free on a KeySet.
  2. Commenting out the release() call in Key.finalize and KeySet.finalize did not do anything. Turns out the finalize methods aren't even called. [*]
  3. I then took a closer look at the source code and noticed this in call2Arg()
    (*data->env)->CallVoidMethod (data->env, jks, data->midKeySetRelease);
    checkException (data, method, errorKey);
    (*data->env)->CallVoidMethod (data->env, jkey, data->midKeyRelease);
    checkException (data, method, errorKey);

    and this in call1Arg()
    (*data->env)->CallVoidMethod (data->env, jerrorKey, data->midKeyRelease);
    checkException (data, method, errorKey);

    These are direct calls to Key.release and KeySet.release from the C code. I have no idea, how this code could have ever worked since calling keyDel()/ksDel() at this point is completely wrong.
    Commenting out these six lines fixes the problem. @markus2330 If you know of any reason why we would call the release methods here, please say so. Otherwise @tucek can just remove these six lines.

[*] It seems that 1) the JVM never guaranteed that finalize() methods are called, unless you use Runtime.runFinalizersOnExit(true); (in which case they are only guaranteed to run before the JVM exits) and 2) finalize() is deprecated since Java 9. AFAICT the closest replacement for finalize() is the Cleaner class. However, I have zero experience in that area. IMO the solution might be to let Key and KeySet implement Closeable, thereby telling Java Developers that they need to call close() similar to e.g. an InputStream. See also this StackOverflow post

@tucek
Contributor Author

tucek commented May 11, 2021

@kodebach Thank you for the thorough analysis! I was also wondering why the release method is being called directly. But if I understand correctly:

  1. The jni plugin instantiates the Key within the JVM via its constructor.
  2. The constructor explicitly increases the reference count from 0 to 1.

Before removing release in the jni plugin (with finalize never being called):

  3. The jni plugin calls Key::release (decreasing the reference count from 1 to 0 and calling keyDel, which deallocates the key).
  4. The Key is then freed a second time by

keyDel (errorKey); \

After removing release in the jni plugin (with finalize never being called):

  3. The reference counter is not decreased before

keyDel (errorKey); \

so the key is not freed at all,

while the KeySet is freed correctly by

ksDel (config);

Now, if we fix the Key finalization, we would get the same problem again.
One solution might be (as @kodebach already mentioned) to increment the Key reference counter before calling plugin open, but we would still need the same mechanism for KeySet IMHO.

Regarding key finalization:

  • Introducing Closeable for Key and KeySet would also mean having to check whether the underlying pointer has already been "closed" in each method of the Java Key / KeySet classes (e.g. KeyClosedException, KeySetClosedException).
  • A more transparent method would be declaring the internal pointer reference of Key / KeySet as volatile and using a Cleaner. This would still introduce the need to check the internal pointer for null in each class's methods and would raise the minimum required Java version to 9.

@kodebach
Member

Yes, I missed that removing the release() calls would leave the reference count at 1. But the solution here is simple. Instead of calling the Java Key.release method to decrease the reference count, we can just call keyDecRef (errorKey) directly (at exactly the same point in code). Then the subsequent keyDel() call would correctly free the Key's memory.

KeySet doesn't have a reference counter, so just removing the KeySet.release call should solve everything. Neither the Java Code nor the JNI plugin would call ksDel() and the KeySet * would survive and be freed correctly by the caller of the plugin (probably libelektra-kdb, but not actually the line you posted -- that one is just in an error case).
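
To make the three variants discussed above easier to compare, here is a toy model in plain C. It is only a sketch: ToyKey and all toy* functions are made-up stand-ins, not the Elektra API, and the model counts frees instead of releasing memory. It assumes, as described in this thread, that the delete function frees only when the reference count is 0 and that Key.release() decrements the count and then deletes.

```c
#include <stddef.h>

/* Toy stand-in for Key: counts how often it would have been freed. */
typedef struct
{
  size_t refs;
  int frees;
} ToyKey;

void toyIncRef (ToyKey * key)
{
  key->refs++;
}

void toyDecRef (ToyKey * key)
{
  if (key->refs > 0) key->refs--;
}

/* Models keyDel(): frees only if nobody holds a reference anymore. */
void toyDel (ToyKey * key)
{
  if (key->refs == 0) key->frees++;
}

/* Models the Java Key.release(): decrement, then delete. */
void toyRelease (ToyKey * key)
{
  toyDecRef (key);
  toyDel (key);
}

/* mode 0: plugin calls release (old code), mode 1: release call simply removed,
 * mode 2: plugin only decrements (proposal above). Returns the number of frees. */
int countFrees (int mode)
{
  ToyKey key = { 0, 0 };
  toyIncRef (&key); /* Java constructor: 0 -> 1 */
  if (mode == 0) toyRelease (&key);
  if (mode == 2) toyDecRef (&key);
  toyDel (&key); /* the caller of the plugin deletes the key afterwards */
  return key.frees;
}
```

In this model countFrees (0) yields two frees (the double free), countFrees (1) yields none (the leak), and countFrees (2) yields exactly one.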

@tucek
Contributor Author

tucek commented May 12, 2021

> Yes, I missed that removing the release() calls would leave the reference count at 1. But the solution here is simple. Instead of calling the Java Key.release method to decrease the reference count, we can just call keyDecRef (errorKey) directly (at exactly the same point in code). Then the subsequent keyDel() call would correctly free the Key's memory.
>
> KeySet doesn't have a reference counter, so just removing the KeySet.release call should solve everything. Neither the Java Code nor the JNI plugin would call ksDel() and the KeySet * would survive and be freed correctly by the caller of the plugin (probably libelektra-kdb, but not actually the line you posted -- that one is just in an error case).

Very good! The last piece missing would be to provide an alternative constructor signature for suppressing the finalization of the Key / KeySet. Then we could leave the finalization code as is for now and I would create a ticket to deal with it in the future.

@markus2330 If this solution is fine with you, i would update my branch accordingly.

@markus2330
Contributor

> If this solution is fine with you

First and foremost the people using JNI and the CI needs to be happy with the solution. So yes, please create a PR and let us see if the CI is happy. Obviously it would be better to not create further issues and have a solution that fixes the problem without leftovers. Maybe you or @kodebach find a clean&nice way? But if this is not possible now, we'll have to live with one more open issue.

@tucek
Contributor Author

tucek commented May 12, 2021

>> If this solution is fine with you

> First and foremost the people using JNI and the CI needs to be happy with the solution. So yes, please create a PR and let us see if the CI is happy. Obviously it would be better to not create further issues and have a solution that fixes the problem without leftovers. Maybe you or @kodebach find a clean&nice way? But if this is not possible now, we'll have to live with one more open issue.

I think the proposed solution fixed the problem in a relatively clean way. The additional issue i would want to create is not actually directly related to this issue, but to the fact that the finalize mechanism has been deprecated several Java versions ago and we might want to be prepared before it finally gets removed.

@markus2330
Contributor

> The additional issue i would want to create is not actually directly related to this issue, but to the fact that the finalize mechanism has been deprecated several Java versions ago and we might want to be prepared before it finally gets removed.

I agree this is an additional issue but still it would be nice to have this fixed within the release, too. But you are right: let us first fix the important&urgent problems like segfaults, and then let us reevaluate.

@tucek
Contributor Author

tucek commented May 12, 2021

>> The additional issue i would want to create is not actually directly related to this issue, but to the fact that the finalize mechanism has been deprecated several Java versions ago and we might want to be prepared before it finally gets removed.

> I agree this is an additional issue but still it would be nice to have this fixed within the release, too. But you are right: let us first fix the important&urgent problems like segfaults, and then let us reevaluate.

@markus2330 means we never really supported Java 8 and therefore we can require Java 9+ and update the clean up procedures right away with this PR.

@tucek
Contributor Author

tucek commented May 13, 2021

PR will also fix #3772

@tucek tucek linked an issue May 13, 2021 that may be closed by this pull request
tucek pushed a commit to tucek/libelektra that referenced this pull request May 13, 2021
* Introduced Optional return values for KeySet.lookup* and Key.getMeta
* JNA minimum Java  version increased from 8 to 9
* replaced KeySet and Key finalize() by using a Cleaner
* Improved Java doc
* updated jni.c fixing double free (also required updates to JNA binding)
* misc JNA clean-up and improvements
* TODO address empty key allocation issue
* TODO remove debug output
* TODO add contribution notes
* TODO update jni plugin README
This reverts commit 1960b03.

Revert "retrigger checks"

This reverts commit 560cd3c.

Revert "applied style"

This reverts commit edb1a41.

Revert "ElektraInitiative#3825 removed DeleteLocalRef in the hope of change for the better ;)"

This reverts commit dcb4c41.
@robaerd
Member

robaerd commented May 22, 2021

The testmod_jni error does also occur if the build directory is deleted after installing.
E.g.:

  • make install
  • kdb run_all -> all tests succeed
  • delete build/
  • kdb run_all -> testmod_jni fails

error-report.log

@tucek
Contributor Author

tucek commented May 22, 2021

> The testmod_jni error does also occur if the build directory is deleted after installing.
> E.g.:
>
>   • make install
>   • kdb run_all -> all tests succeed
>   • delete build/
>   • kdb run_all -> testmod_jni fails
>
> error-report.log

I'm trying to find the dependency to the build dir using strace...

@tucek
Contributor Author

tucek commented May 22, 2021

testmod_jni wants to access build/src/bindings/jna/libelektra/build/libs/libelektra-0.9.5-all.jar because it is injected into the code at build time in https://github.com/tucek/libelektra/blob/315a909579f5036c39544c65e6caa51cd5bd61fb/src/plugins/jni/testmod_jni.h.in#L12

We should find another way to set the classpath for the test, so it works with installed jna bindings and the build local version...

@kodebach
Member

> We should find another way to set the classpath for the test, so it works with installed jna bindings and the build local version...

I don't think there is currently an easy way to pass such a parameter to the test. There is a mechanism for plugins that need extra executables to work (e.g. gopts), but this is not directly usable here, since that would just create a second copy of the JAR file so that a relative path works.

I think it is totally fine to just exclude the test in the installed version. Either via some CMake setup, or just by checking in the main() of the test whether the file CLASSPATH exists (if (access(CLASSPATH, F_OK ) == 0)).
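
A minimal sketch of that existence check. jarIsAvailable is a hypothetical name, and the sketch assumes that CLASSPATH expands to a plain file path; if it is a complete JVM option string, the path part would have to be extracted first.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper for the test's main(): returns 1 if a file or
 * directory exists at path, 0 otherwise (including for a NULL path). */
int jarIsAvailable (const char * path)
{
  return path != NULL && access (path, F_OK) == 0;
}
```

The main() of the test could then print a short message and return early when jarIsAvailable (CLASSPATH) is 0, instead of trying to start the JVM without the JAR.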

@tucek
Contributor Author

tucek commented May 22, 2021

@kodebach what about first checking for the build version and then falling back to the install version and failing verbosely when neither has been found?

@kodebach
Member

> @kodebach what about first checking for the build version and then falling back to the install version and failing verbosely when neither has been found?

This is also just a work around and not a permanent solution. Take this scenario for example:

  • I download the source code, build and install.
  • I now change some code, rebuild and get a new JAR in my build folder
  • Crucially, I do not install again.
  • If I now run the installed version, testmod_jni will happily use the new JAR in my build folder. That could result in a test failure, because the installed plugin doesn't match the build JAR.

A similar scenario is possible, if we first check for the installed version.

The only permanent solution is, if the installed version always uses the installed JAR and the build folder version always uses the build folder JAR. If you really want to invest the time, look into how srcdir_file and bindir_file are implemented for tests:

/* return file name in srcdir.
 * No bound checking on file size, may overflow. */
char * srcdir_file (const char * fileName)
{
  strcpy (file, srcdir);
  strcat (file, "/");
  strcat (file, fileName);
  return file;
}

However, I have to stress again, I do not think it is worth your time right now and disabling the tests in installed versions (probably easiest by just not installing the testmod_jni file) is the better solution.
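
For anyone who does want to go down that route: the snippet above notes that it does no bounds checking. A bounded variant could look roughly like the sketch below. joinPath is a hypothetical helper, not an existing function of the test framework, and it writes into a caller-provided buffer instead of the global one used by srcdir_file.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded variant of the srcdir_file idea: writes "dir/fileName" into out.
 * Returns 0 on success, -1 if the result did not fit into size bytes. */
int joinPath (char * out, size_t size, const char * dir, const char * fileName)
{
  int written = snprintf (out, size, "%s/%s", dir, fileName);
  return (written < 0 || (size_t) written >= size) ? -1 : 0;
}
```

One such call with the install directory and one with the build directory would give each variant of the test its own JAR path.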

tucek pushed a commit to tucek/libelektra that referenced this pull request May 22, 2021
… added sesnible erromessage instead of segfault
…added sensible error message instead of segfault, when jar is not found
@tucek
Contributor Author

tucek commented May 22, 2021

The jni / package step problem is hereby resolved.

@tucek
Contributor Author

tucek commented May 22, 2021

The last thing I would like to try is enabling the Cleaner again, while adding locking so that KeySets and Keys are not being cleaned up in parallel.

Michael Tucek added 2 commits May 23, 2021 16:30
…t key sets from being released in parallel as well as prevent keys from being release in parallel to key sets
@tucek
Contributor Author

tucek commented May 23, 2021

CI fails again because of some links being temporarily unavailable... IMHO checks for broken links should only be executed manually or maybe automatically by the release pipeline...

@mpranj
Member

mpranj commented May 23, 2021

> CI fails again because of some links being temporarily unavailable... IMHO checks for broken links should only be executed manually or maybe automatically by the release pipeline...

But then we do not know when a PR introduces a broken link until the release day, which is suboptimal. The way it is now we immediately see broken links and can fix or temporarily whitelist them, even if it is not our own fault.

@tucek
Contributor Author

tucek commented May 23, 2021

>> CI fails again because of some links being temporarily unavailable... IMHO checks for broken links should only be executed manually or maybe automatically by the release pipeline...

> But then we do not know when a PR introduces a broken link until the release day, which is suboptimal. The way it is now we immediately see broken links and can fix or temporarily whitelist them, even if it is not our own fault.

That's correct, but IMHO the CI build should not depend on all referenced websites being up to succeed. But it's not my place to prioritize such issues for this project.

@kodebach
Member

> But then we do not know when a PR introduces a broken link until the release day, which is suboptimal.

A good compromise would probably be to set up the link checker such that it only checks links in files that have been modified by the PR. That way we can catch broken links before they are added, but don't annoy people with websites that are temporarily offline.

But such a setup is more complicated than the current one and since the Link Check is a separate build job, we can always just ignore a failure and merge anyway.

@mpranj
Member

mpranj commented May 23, 2021

> the CI build should not depend on all referenced websites being up to succeed.

Agreed.

> But it's not my place to prioritize

Good suggestions are always welcome. ❤️

> But such a setup is more complicated than the current one and since the Link Check is a separate build job, we can always just ignore a failure and merge anyway.

I also think the simplest is just to ignore it, but be aware of what is going on. Better approaches are definitely welcome but currently probably not worth our time.

@tucek
Contributor Author

tucek commented May 23, 2021

From my perspective, this PR is ready to be merged.

@tucek tucek requested a review from markus2330 May 24, 2021 12:08
Contributor

@markus2330 markus2330 left a comment

Amazing job! 💖 Thanks to everyone participating! 💞

Now let's do further testing, especially also with a more real-world plugin (H3) and kdb mount.

@markus2330 markus2330 merged commit d048ae8 into ElektraInitiative:master May 24, 2021
Member

@mpranj mpranj left a comment

Great work, thank you so much! 🚀

@tucek tucek deleted the jniTestFix_3758 branch May 26, 2021 06:51
@kodebach kodebach mentioned this pull request Jun 1, 2021
20 tasks
@tucek tucek added this to In progress in Java bindings overhaul via automation Aug 9, 2021
@tucek tucek moved this from In progress to Done in Java bindings overhaul Aug 9, 2021