Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sizeof broken after android-related merges #5

Closed
blueberry opened this issue Apr 20, 2016 · 15 comments
Closed

Sizeof broken after android-related merges #5

blueberry opened this issue Apr 20, 2016 · 15 comments

Comments

@blueberry
Copy link

I have just built and installed JOCL on Arch Linux and discovered regressions with the newest android-related changes in Sizeof, when used in linux.

JOCL builds and tests fine, but when I tried it with a well testet ClojureCL, the built failed, with a messages related to the newly introduced Sizeof native method:

java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I

This happens with the latest commit 18d9a04

I did a checkout of last week's commit 862f2ab: JOCL gets built, AND the existing projects work.

gpu added a commit that referenced this issue Apr 20, 2016
If one of the fields of the Sizeof class was used
before the CL class was loaded, this lead to an
UnsatisfiedLinkError, as reported in issue #5
@gpu
Copy link
Owner

gpu commented Apr 20, 2016

Thanks for this hint. This was a bit subtle: It only happened when the Sizeof class was used before the CL class was loaded, because only the CL class loaded the native library.

This should now be fixed, and covered with a new (fairly trivial...) test case. For me, the test threw the same error before, and now it passes, so hopefully, it resolved the issue for you as well.

@blueberry
Copy link
Author

Unfortunately, after cleaning everything and rebuilding, the error is still there (yes, I deleted JOCL and JOCL.build and did everything from the scratch):

Caused by: java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I
at org.jocl.Sizeof.computePointerSizeNative(Native Method)
at org.jocl.Sizeof.computePointerSize(Sizeof.java:362)
at org.jocl.Sizeof.(Sizeof.java:299)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at clojure.lang.RT.classForName(RT.java:2168)

@blueberry
Copy link
Author

blueberry commented Apr 20, 2016

To clarify: your new test in JOCL passes, but the tests in my library that uses JOCL still have the Sizeof load problem.

@gpu
Copy link
Owner

gpu commented Apr 20, 2016

OK, that's odd - and it's distressing that I can't reproduce this right now. Is there any test case that might expose the same behavior (on windows?).

I wonder whether this is related to some classloader magic in Clojure, but this would not explain why loading the library works for the main CL class, but not for the Sizeof class.

In fact, without the possibility to analyze this further, I don't see what I could to right now, except for undoing the change. If until tomorrow evening nobody (also @phlip9, referring to the discussion in #4 (comment) ) has another idea, I'd rather go back to use the sun.arch.data.model approach (without the native method), so that it at least works reliably on desktop systems.

@blueberry
Copy link
Author

blueberry commented Apr 20, 2016

Thanks for looking at this. Unfortenately, I can't provide a java test case, since I am using it from Clojure library that has many methods, so I can't identify why Sizeof fails.

What I discovered is not much: CL calls work, Sizeof importing (interactive, from repl) works, but if i try to use Sizeof directly, even AFTER CL has been loaded and called, I get that exception.

@phlip9
Copy link
Contributor

phlip9 commented Apr 20, 2016

@gpu I actually just ran into this issue while testing on an Android device. Without calling something in CL first, the native library doesn't load and Sizeof.calculatePointerSizeNative() fails with an UnsatisfiedLinkError. While I haven't tested the most recent commit, I don't immediately see why it's still failing. If anything, just calling CL.loadNativeLibrary() should force the class loader to load CL and run its static block. Strange...

@blueberry
Copy link
Author

It doesn't help (at least in my case) but maybe a solution would be to move the calculatePointerSizeNative to the CL class? May be the problem is that loadNativeLibrary is called twice?

@phlip9
Copy link
Contributor

phlip9 commented Apr 21, 2016

It seems to work with the latest changes... I added a few print statements to see the control flow:

I/System.out(30313): Begin Sizeof static block
I/System.out(30313): Begin CL static block
I/System.out(30313): Begin CL.loadNativeLibrary()
I/System.out(30313): nativeLibraryLoaded = false
I/System.out(30313): End CL.loadNativeLibrary()
I/System.out(30313): End CL static block
I/System.out(30313): Begin CL.loadNativeLibrary()
I/System.out(30313): nativeLibraryLoaded = true
I/System.out(30313): End CL.loadNativeLibrary()
I/System.out(30313): End Sizeof static block
I/System.out(30313): Sizeof.computePointerSize()
I/System.out(30313): Sizeof.POINTER = 4

As I suspected, calling CL.loadNativeLibrary() from Sizeof triggers the class loader to run CL's static block first, then finishes executing Sizeof's static block. The code still works, however.

This is also running through Scala on Android, which may or may not affect the result.

If we move Sizeof.calculatePointerSizeNative() I expect this issue to resolve, as we will then isolate loading the native library to a single class.

@blueberry
Copy link
Author

blueberry commented Apr 21, 2016

I tried various changes with how CL and Sizeof are called, as well as tried changing CL and Sizeof classes themselves and the problem is still there. It seems to me, as gpu suggested, that the problem is in exotic class loading.

@gpu @phlip9 This works for me:

private static int computePointerSize()
        {
            if (LibUtils.calculateOS() == LibUtils.OSType.ANDROID) {
                return computePointerSizeNative();
            } else {
                String bits = System.getProperty("sun.arch.data.model");
                if (bits.equals("32"))
                {
                    return 4;
                }
                else if (bits.equals("64"))
                {
                    return 8;
                }
                else
                {
                    System.err.println(
                        "Unknown value for sun.arch.data.model - assuming 32 bits");
                    return 4;
                }
            }

    }

@gpu
Copy link
Owner

gpu commented Apr 21, 2016

Yes, I also considered making the native method a package-private method in the CL class. But I usually try to avoid tackling a problem when I do not know what causes the problem.

On the one hand, it seemed clear (in hindsight) : When using Sizeof without having used CL, then the native library was not loaded. That definitely was a problem, and I could reproduce this, and fix it with the commit linked above. And, as Dragan said, this "works", in the sense that the unit test passes.

But it's entirely unclear to me why it does not work in the Clojure setup. Introducing some of the System.out logs, as Philip suggested, in the static initializer blocks (of both classes) and in the CL.loadNativeLibrary method might bring some insights. The stack trace until now shows

at org.jocl.Sizeof.(Sizeof.java:299)
at java.lang.Class.forName0(Native Method)

which means that during class loading, it seems to do the initialization of the POINTER size by calling the native method, and this fails. But according to http://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.4.2 , I think that the static initializers should have run before this initialization takes place.

So @phlip9 if the solution proposed by Dragan works on Android as well, then this may be a feasible workaround, and I'd add it later today to at least have a working state for both worlds.

But I'd really like to know what went wrong there under the hood. @blueberry It would be great if you could add some basic System.outs, as described above, just to get a hint about whether or not (and when) the static initializer is running.

@blueberry
Copy link
Author

I've already tried that, but did not find any hint:

Compiling uncomplicate.clojurecl.info
======= loadNativeLibrary ENTER
======= loadNativeLibrary !nativeLibraryLoaded ENTER
======= loadNativeLibrary !nativeLibraryLoaded LEAVE
======= loadNativeLibrary LEAVE
======= loadNativeLibrary ENTER
======= loadNativeLibrary LEAVE
======= computerPointerSize ENTER
java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I, compiling:(uncomplicate/clojurecl/info.clj:296:3)
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.jocl.Sizeof.computePointerSizeNative()I, compiling:(uncomplicate/clojurecl/info.clj:296:3)

gpu added a commit that referenced this issue Apr 21, 2016
The computation of the pointer size in a generic
way has several caveats, as seen in issue #5
This is an attempt to support Android-based
systems as well as desktop systems.
@gpu
Copy link
Owner

gpu commented Apr 21, 2016

OK, this trace looks like it indeed does load the library (without errors). I could only guess now that this may be related to the use of different ClassLoaders in Clojure, as each class loader has its own set of native libraries ( https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/invocation.html#library_version ) and thus, the library might in fact be loaded, but from a different loader than the one that loads the Sizeof class. (Still, this doesn't explain why it works for the CL class alone...). Further debugging could involve starting the JVM with -XX:+TraceClassLoading and hope for further insights, but ... it's difficult.

I have added the solution that Dragan suggested, and which hopefully works in both setups.

However, I'm still feeling uncomfortable about this, and will definitely leave this issue open until it is clear what's going on there....

@blueberry
Copy link
Author

Thank you for the quick response. Please, when you are ready to release a RC to the maven central, ping me to do the linux 64-bit build (and also 32 bit if it could be done on the same system from the cmake you supply).

@gpu
Copy link
Owner

gpu commented Apr 21, 2016

Thanks. As there seems to be no response from the original reporter: Did you encounter any issues during linking, related to the fPIC flag that was mentioned in #3 ?

@blueberry
Copy link
Author

No, JOCL works well with a fairly non-trivial Clojure code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants