Support for OS400 #9

Open
aaronbartell opened this issue Aug 14, 2014 · 79 comments

@aaronbartell

There currently isn't jffi support for IBM's OS400** operating system and I am attempting to build jffi from a git clone of the current master. I see @pierrickrouxel is also working through this issue in the JRuby project and also this issue. Should I be logging this issue here or in the JRuby issues?

**Now known as IBM i or just i. You might also hear it referred to as iSeries.

What I've Done
I've made some changes to the build process (see fork) to support OS400 but have come to a point where I need input because the build isn't completing. The config.log declares alloca support was found, but alloca does NOT exist on OS400.

from config.log

. . .
| #define HAVE_ALLOCA_H 1
| #define HAVE_ALLOCA 1
. . .

The stdout to my shell gives more information - specifically:

 [exec] /home/aaron/git/jffi/jni/jffi/LongDouble.c:82:5: error: implicit declaration of function 'alloca' [-Werror=implicit-function-declaration]
 [exec] /home/aaron/git/jffi/jni/jffi/LongDouble.c:82:11: error: incompatible implicit declaration of built-in function 'alloca' [-Werror]
 [exec] cc1: all warnings being treated as errors
 [exec]
 [exec] /home/aaron/git/jffi/jni/GNUmakefile:295: recipe for target '/home/aaron/git/jffi/build/jni/jffi/LongDouble.o' failed
 [exec] gmake: *** [/home/aaron/git/jffi/build/jni/jffi/LongDouble.o] Error 1
@aaronbartell
Author

I neglected to mention that malloc is supported on OS400.

I see malloc.h is optionally included in configure if _MSC_VER is specified. I also see alloca has the same parameter list/types as malloc. My understanding is that alloca releases memory upon a function going out of scope/returning and malloc requires a manual call to free(ptr) to accomplish the same (Microsoft malloc docs).

I tried altering configure to be as follows (note addition of || _OS400) without success and it produced the same error**.

# ifdef _MSC_VER || _OS400
#  include <malloc.h>
#  define alloca _alloca
# else
. . .

**[exec] /home/aaron/git/jffi/jni/jffi/LongDouble.c:82:11: error: incompatible implicit declaration of built-in function 'alloca' [-Werror]

@headius
Member

headius commented Aug 25, 2014

This is the right place to report bugs with jffi!

We can put your good, working changes into a PR any time.

alloca allocates memory in the stack frame of the function that calls it. When that function returns, the stack pointer moves back to the caller's position, so the memory is deallocated automatically. This is roughly equivalent to declaring "char x[256]" in your code.

Because of the stack effects and automatic deallocation, alloca can't directly be emulated with malloc. You're running into problems because the code calling alloca doesn't actually pay attention to the configure results.

I'm not sure how to proceed. We could modify all code that uses alloca, but it seems really unfortunate to cripple other platforms for OS/400. I'm poking around to see if there's a blessed alternative to alloca on OS/400. Maybe you have better developer docs?

@headius
Member

headius commented Aug 25, 2014

I think it may be possible to transplant the magic ifdef alloca block from jni/libffi/include/ffi_common.h to jni/jffi/jffi.h. Can you try that?

@penberg
Contributor

penberg commented Aug 25, 2014

What compiler does OS/400 use? Does it support C99? If so, you could just try to replace alloca() with variable-length arrays.

@penberg
Contributor

penberg commented Aug 25, 2014

If I read the linked compiler documentation correctly, it does support C99.

So you should be able to replace the alloca call with something like this:

diff --git a/jni/jffi/LongDouble.c b/jni/jffi/LongDouble.c
index cb8f243..b6031a7 100644
--- a/jni/jffi/LongDouble.c
+++ b/jni/jffi/LongDouble.c
@@ -32,9 +32,6 @@

 #include <stdio.h>
 #include <stdlib.h>
-#ifdef __sun
-# include <alloca.h>
-#endif
 #include <stdint.h>
 #include <stdbool.h>
 #include <jni.h>
@@ -75,11 +72,10 @@ Java_com_kenai_jffi_Foreign_longDoubleFromString(JNIEnv *env, jobject self, jstr
   jbyteArray array, jint arrayOffset, jint arrayLength)
 {
     long double ld;
-    char* tmp;
     jsize len;

     len = (*env)->GetStringUTFLength(env, str);
-    tmp = alloca(len + 1);
+    char tmp[len + 1];
     (*env)->GetStringUTFRegion(env, str, 0, len, tmp);
     ld = strtold(tmp, NULL);
     jffi_encodeLongDouble(env, ld, array, arrayOffset, arrayLength);

@aaronbartell
Author

I am using the gcc compiler**. I will work on testing the two suggestions and get back to you.

**

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/QOpenSys/opt/freeware/bin/../libexec/gcc/powerpc-ibm-aix6.1.0.0/4.6.2/lto-wrapper
Target: powerpc-ibm-aix6.1.0.0
Configured with: ../gcc-4.6.2/configure --with-as=/usr/bin/as --with-ld=/usr/bin/ld --enable-languages=c,c++,fortran --prefix=/opt/freeware --mandir=/opt/freeware/man --infodir=/opt/freeware/info --enable-threads --enable-version-specific-runtime-libs --disable-nls --enable-decimal-float=dpd --host=powerpc-ibm-aix6.1.0.0
Thread model: aix
gcc version 4.6.2 (GCC)

@penberg
Contributor

penberg commented Aug 25, 2014

@aaronbartell That's a pretty recent GCC version. It should support VLAs, so if the pragma magic @headius suggested doesn't work, please try the above. You probably need to convert more code, of course, but it should be straightforward.

@aaronbartell
Author

@penberg, it made it past LongDouble.c and is now choking on FastNumericInvoker.c, so it would seem this is viable. Now I will try the suggestion from @headius.

See log here

@aaronbartell
Author

I think it may be possible to transplant the magic ifdef alloca block from jni/libffi/include/ffi_common.h to jni/jffi/jffi.h. Can you try that?

Here is the log (failed). Below are the diffs so I can make sure I made the correct changes.

jffi.h diff

ffi_common.h diff

@headius
Member

headius commented Aug 26, 2014

@aaronbartell I meant to copy the pragmas into jffi.h. libffi and the jffi JNI binding get built separately, and will both need the pragmas. Since libffi built fine without the pragmas and failed (with alloca warnings and related errors) when they were removed, it would seem we're on the right track.

@aaronbartell
Author

I now have that ifdef code in both areas. I did an ant clean and then ant jar with the shell results found here. Here is the config.log in case that helps.

Let me know if you'd like me to run it again with either -bloadmap or -bnoquiet to obtain more information.

@headius
Member

headius commented Aug 26, 2014

That's looking much closer. Something wrong with the final linking of libjffi, but everything seems to have compiled. Perhaps there's a gcc flag needed here to indicate this is a shared library and not an executable?

@penberg
Contributor

penberg commented Aug 26, 2014

@headius Yes, -shared. It looks like it's missing from the log @aaronbartell posted.

@aaronbartell
Author

I believe we have a successful build! See log here. The issue is I had the operating system upper-cased on this line.

I think one thing needing to be cleaned up is the name of the resulting .jar file because it contains a slash (/) in it:

[zip] Building zip: /home/aaron/git/jffi/dist/jffi-ppc-OS/400.jar

I will work to find how to get rid of the slash in the name.
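
One way the slash could be kept out of the file name is to sanitize the `os.name` value before it is used to build the jar name. The following is only a sketch: the class and both helper methods are hypothetical and are not part of jffi, and the only assumption taken from the build output above is that the platform jar is named `jffi-<cpu>-<os>.jar`.

```java
import java.util.Locale;

final class PlatformNameSketch {
    // Hypothetical helper: reduce an os.name value to a token that is safe in a file name.
    // Takes the first word and drops every character that is not a letter, digit, '.', '_' or '-'.
    static String fileSafeOsName(String osName) {
        String firstWord = osName.trim().split(" ")[0];
        return firstWord.replaceAll("[^A-Za-z0-9._-]", "");
    }

    // Hypothetical helper: build a platform jar name in the jffi-<cpu>-<os>.jar shape.
    static String platformJarName(String cpu, String osName) {
        return "jffi-" + cpu.toLowerCase(Locale.ENGLISH) + "-" + fileSafeOsName(osName) + ".jar";
    }

    public static void main(String[] args) {
        // "OS/400" is the os.name value reported on this platform according to the thread.
        System.out.println(platformJarName("ppc", "OS/400"));
    }
}
```

With the input `OS/400` this yields `jffi-ppc-OS400.jar`, so no directory separator ends up in the path.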

@aaronbartell
Author

I am now running through the ant test motions. I first changed libtest/GNUmakefile to have a section for os400 by copying the aix equivalent:

diff --git a/libtest/GNUmakefile b/libtest/GNUmakefile
index 9e70664..65c3097 100644
--- a/libtest/GNUmakefile
+++ b/libtest/GNUmakefile
@@ -153,6 +153,13 @@ ifneq ($(findstring mingw, $(OS)),)
   LIBEXT = dll
   PICFLAGS=
 endif
+
+ifeq ($(OS), os400)
+  LIBEXT = a
+  SOFLAGS = -shared -static-libgcc
+  PICFLAGS += -pthread
+endif
+
 ifeq ($(CPU), sparcv9)
   MODEL = 64
 endif

That caused some pthread errors to show up during ant test - see log.

Looking through the libtest/GNUmakefile it doesn't appear PICFLAGS (where -pthread is added) is used on the compiles so I made the following change to see if it would further the test, and it did.

diff --git a/libtest/GNUmakefile b/libtest/GNUmakefile
index 9e70664..65c3097 100644
--- a/libtest/GNUmakefile
+++ b/libtest/GNUmakefile
 $(LIBTEST):  $(TEST_OBJS)
-       $(CC) -o $@ $(LDFLAGS) $(TEST_OBJS) -lm
+       $(CC) -pthread -o $@ $(LDFLAGS) $(TEST_OBJS) -lm

 clean::
        # nothing to do - ant will delete the build dir

Then I run ant test again (note I didn't do ant clean) and seemingly get further, though all of the tests fail (see log here)

Thoughts on what direction I should take these tests?

@headius
Member

headius commented Aug 27, 2014

Does anything in jffi/build/test/results show what the errors are? With all tests failing I'd guess it's not able to load libjffi.so.

@headius
Member

headius commented Aug 27, 2014

Great progress btw...I'm sure we'll iron this out soon :-)

@aaronbartell
Author

@headius, there was a file for each failure in jffi/build/test (sorry I missed those). Here is an example of one and they are all conveying basically the same error of java.lang.UnsatisfiedLinkError.

Concerning libjffi.so, here is what a find produces:

-bash-4.2$ pwd
/home/aaron/git/jffi
-bash-4.2$ find . -name libjffi*
./build/jni/libjffi-1.2.a

@aaronbartell
Author

Doh! If I had looked further into the aforementioned stack trace I would have seen this:

Caused by: java.lang.RuntimeException: cannot determine operating system

I will do some digging to see what I can find.

@aaronbartell
Author

I have made it a bit further and am now stuck with errors in this full stack trace.

Here's an excerpt:

java.lang.UnsatisfiedLinkError: /home/aaron/git/jffi/build/jni/jffi-1.2.srvpgm ( 0509-022 Cannot load module /home/aaron/git/jffi/build/jni/libjffi-1.2.srvpgm.so. 0509-026 System error: A file or directory in the path name does not exist.)

It is true the above link/file doesn't exist. What portion of the build should have created it? I ran ant jar and then ant test.

In case you need to know, here are my changes to StubLoader.java to account for OS400:

diff --git a/src/main/java/com/kenai/jffi/internal/StubLoader.java b/src/main/java/com/kenai/jffi/internal/StubLoader.java
index 9a1d842..1e2e14d 100644
--- a/src/main/java/com/kenai/jffi/internal/StubLoader.java
+++ b/src/main/java/com/kenai/jffi/internal/StubLoader.java
@@ -82,6 +82,8 @@ public class StubLoader {
         WINDOWS,
         /** IBM AIX */
         AIX,
+        /** IBM OS400 */
+        OS400,
         /** IBM zOS **/
         ZLINUX,

@@ -136,7 +138,9 @@ public class StubLoader {
         } else if (startsWithIgnoreCase(osName, "sunos") || startsWithIgnoreCase(osName, "solaris")) {
             return OS.SOLARIS;
         } else if (startsWithIgnoreCase(osName, "aix")) {
-            return OS.AIX;
+            return OS.AIX;
+        } else if (startsWithIgnoreCase(osName, "OS/400")) {
+            return OS.OS400;
         } else if (startsWithIgnoreCase(osName, "openbsd")) {
             return OS.OPENBSD;
         } else if (startsWithIgnoreCase(osName, "freebsd")) {
@@ -208,8 +212,10 @@ public class StubLoader {
         if (getOS().equals(OS.DARWIN)) {
             return "Darwin";
         }
+        if (getOS().equals(OS.OS400)) {
+            return "OS400";
+        }

-
         String osName = System.getProperty("os.name").split(" ")[0];
         return getCPU().name().toLowerCase(LOCALE) + "-" + osName;
     }

@headius
Member

headius commented Aug 28, 2014

The StubLoader changes look good.

The libjffi.so would be created by the native part of the build. It occurs to me now you may need to manually copy it to the archive directory (working from memory here since my machine is busted). Then I believe the jar task will incorporate it into the jffi jar file.

@aaronbartell
Author

Bummer about your machine.

Is this the file you are talking about? Does it matter it is an archive library vs. shared object?

-bash-4.2$ pwd
/home/aaron/git/jffi
-bash-4.2$ find . -name "*libjffi*"
./build/jni/libjffi-1.2.a

@headius
Member

headius commented Sep 1, 2014

No, the file is jffi-SOMETHING.jar, and the build will put it in dist:

-build-platform-jar:
     [echo] platform=Darwin
      [zip] Building zip: /Users/headius/projects/jffi/dist/jffi-Darwin.jar

Copy that to archive/ and rebuild, and I believe it will get included into the final jffi jar. The build does not copy to archive/ because the files there are known to git.

@headius
Member

headius commented Sep 1, 2014

BTW...the final, working OS/400 native jar, copied to archive/, is what you'd PR (ideally as a separate commit from the changes needed to build it).

@aaronbartell
Author

Here are the steps I've taken:

1 - rm archive/jffi-ppc-OS400.jar

2 - ant clean

3 - ant jar

4 - mv dist/jffi-ppc-OS400.jar archive/

5 - ant jar <---- This didn't update the archive/jffi-ppc-OS400.jar's timestamp so I figured to continue with another ant clean. I left archive/jffi-ppc-OS400.jar in place from the previous ant jar.

6 - ant clean

7 - ant jar

8 - ant test This produced the previous errors:

java.lang.UnsatisfiedLinkError: could not locate stub library in jar file. Tried [jni/OS400/jffi-1.2.srvpgm, /jni/OS400/jffi-1.2.srvpgm]

and

java.lang.UnsatisfiedLinkError: /home/aaron/git/jffi/build/jni/jffi-1.2.srvpgm ( 0509-022 Cannot load module /home/aaron/git/jffi/build/jni/libjffi-1.2.srvpgm.so.

9 - cp dist/jffi-ppc-OS400.jar archive/

10 - ant test <---- running this produced the same errors as in step 8.

I am guessing I did something wrong. What should I have done?

@headius
Member

headius commented Sep 2, 2014

Your steps have a lot of duplication, but the end result should have been correct if the library's getting built right. Once you've built the native archive and copied the jar into archive/ you can just do normal builds from then on. The archive/ jar will not be updated unless you copy a new one in place, but it is where the build gets artifacts to insert into the jffi jar.

Check that the jffi jar has either the OS400 native jar or the files that would be inside it (the native lib itself). Check that the path/filename jffi uses to load that library actually matches the file (I see .so, and you mentioned .a, so there's a lead).

It looks like jffi may not have the right heuristics for building a shared lib filename on OS/400, and so it can't locate the built library.
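
The first check above (whether the jar really contains the native library) could be done with a few lines of Java. This is a minimal sketch, not part of jffi: the class name and the `entriesContaining` helper are hypothetical, and the jar path is whatever file you want to inspect, passed as the first argument.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class JarContentsSketch {
    // Hypothetical helper: return the names of all entries whose path contains the given fragment.
    static List<String> entriesContaining(String jarPath, String fragment) throws IOException {
        List<String> matches = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.contains(fragment)) {
                    matches.add(name);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java JarContentsSketch.java <path-to-jar>");
            return;
        }
        // Print every entry that mentions "jffi", e.g. the native library inside a platform jar.
        for (String name : entriesContaining(args[0], "jffi")) {
            System.out.println(name);
        }
    }
}
```

Comparing the printed entry names with the path the loader reports in the UnsatisfiedLinkError shows quickly whether the directory or the extension is the part that differs.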

@aaronbartell
Author

Here's the archive/jffi-ppc-OS400.jar contents:

The call to System.mapLibraryName(stubLibraryName) produces jffi-1.2.srvpgm. I am not familiar with the .srvpgm extension and am wondering if this is an OS400 thing or if it is a Java thing. OS400 does have something called "service programs" but they have to do with a different/native runtime environment on the machine.

So then I tried manually changing the path to jni/ppc-OS400/libjffi-1.2.a and it still isn't finding it. Other ideas?

@aaronbartell
Author

A little more info... it seems System.load in StubLoader.java always appends .so to the end of the name. Now I am curious how the AIX folks are getting this accomplished as they also have an archive (i.e. libjffi-1.2.a).
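
To see what a given JVM actually does with library names, a small probe like the one below can be run on the target machine. It is a sketch only: `System.mapLibraryName` and `System.getProperty` are standard JDK calls, but the class and the `candidateNames` helper are hypothetical, and the fallback names are merely the two extensions that came up in this thread.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class LibraryNameProbe {
    // Hypothetical helper: file names a loader could try, in order of preference.
    static Set<String> candidateNames(String baseName) {
        Set<String> names = new LinkedHashSet<>();
        names.add(System.mapLibraryName(baseName)); // whatever this JVM considers the native name
        names.add("lib" + baseName + ".so");        // conventional shared object name
        names.add("lib" + baseName + ".a");         // AIX-style archive name produced by this build
        return names;
    }

    public static void main(String[] args) {
        System.out.println("os.name = " + System.getProperty("os.name"));
        for (String name : candidateNames("jffi-1.2")) {
            System.out.println(name);
        }
    }
}
```

The first printed name is platform dependent, which is exactly the point: it shows whether the JVM itself produces the `.srvpgm` suffix before any jffi code is involved.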

@headius
Member

headius commented Sep 4, 2014

I wonder if System.loadLibrary is a way to route around this. I've never spent a lot of time looking at how those paths work.

I've put out a call for help on Twitter. If System.load is actually appending an invalid extension (and System.mapLibraryName is adding a bogus or atypical extension) that would seem to be a bug in IBM's JVM. That seems unlikely to me.

Do you have an IBM iSeries or J9 support person you can talk to about this? I'm running out of ideas and don't have an iSeries machine here to play around on :-)

@aaronbartell
Author

@boskowski See this post for a unit test property change you need to make to get past the java.lang.ClassNotFoundException error.

@boskowski

@aaronbartell Thanks, I definitely have to catch up with the backlog. Now I'm stuck with the OS complaining the library (build/jni/libjffi-1.2.a) has a wrong signature (i.e. an invalid magic number).
Update: Here's the log. Error 0509-103 is in Italian, translated: "The module has an invalid magic number." The build log is here.

@aaronbartell
Author

@boskowski Please include a gist of stack traces so I can look further into errors. Note, you will need to hard code the OS to not have a slash in it (i.e. os400 rather than os/400) because the OS is used in the file names and obviously a slash will cause issues.

@boskowski

@aaronbartell I just updated the previous message with the log.

@aaronbartell
Author

@boskowski, so you've got all my changes including this one where I comment out the default JNIEXT to not be .a?

boskowski added a commit to boskowski/jffi that referenced this issue Dec 8, 2015
boskowski added a commit to boskowski/jffi that referenced this issue Dec 8, 2015
@boskowski

Trying not to overdo it, I switched the JVM from Java 8 64bit to IBM J9 VM (build 2.6, JRE 1.6.0 OS/400 ppc-32 jvmap3260_26sr8fp4-20150414_02 (JIT enabled, AOT enabled)). Building against this codebase plus the diffs below resulted in all tests running successfully except for NumberTest, which fails regardless of the length of the input.

--- src/main/java/com/kenai/jffi/Platform.java  (revision a803d182d8cdbad1eb71fd9a759d764113b9bb11)
+++ src/main/java/com/kenai/jffi/Platform.java  (revision )
@@ -415,7 +415,7 @@

         @Override
         public String mapLibraryName(String libName) {
-            return "lib" + libName + ".a";
+            return "lib" + libName + ".so";
         }

         @Override

--- src/test/java/com/kenai/jffi/UnitHelper.java    (revision a803d182d8cdbad1eb71fd9a759d764113b9bb11)
+++ src/test/java/com/kenai/jffi/UnitHelper.java    (revision )
@@ -63,6 +63,7 @@
             case WINDOWS:
                 return "msvcrt.dll";
             case AIX:
+            case OS400:
                 if (Platform.getPlatform().addressSize() == 32){
                     return "libc.a(shr.o)";
                 } else {

NumberTest yields

java.lang.NumberFormatException: Not a valid char constructor input: 1,2345678901234567
    at java.math.BigDecimal.bad(BigDecimal.java:1859)

this is the test output file (here I shortened the number as already tried by @aaronbartell, but the result is the same as with the original number).

What looks strange is the comma, which isn't contained in com/kenai/jffi/NumberTest.java:264. So I tried this:

$ jruby -v
jruby 1.7.23 (1.9.3p551) 2015-11-24 f496dd5 on IBM J9 VM jvmap3260_26sr8fp4-20150414_022.6 +jit [OS/400-PowerPC]
$ irb
io/console on JRuby shells out to stty for most operations
irb(main):001:0> p = java.math.BigDecimal.new("1.234567890123456789")
=> #<Java::JavaMath::BigDecimal:0x7b6666da>
irb(main):002:0> p = java.math.BigDecimal.new("1,234567890123456789")
Java::JavaLang::NumberFormatException: Not a valid char constructor input: 1,234567890123456789
        from java.math.BigDecimal.bad(java/math/BigDecimal.java:1859)
...

Does anyone have an idea what magic managed to conjure up the comma replacement?

@aaronbartell
Author

Does anyone have an idea what magic managed to conjure up the comma replacement?

What does DSPSYSVAL QCCSID convey?

% system "DSPSYSVAL QCCSID"
                                                   System Values                                                        Page     1
5770SS1 V7R2M0  140418                                                                            LITMIS1   12/08/15  16:36:45 UTC
                Current                         Shipped
 Name           value                           value                           Description
 QCCSID      >  37                              65535                           Coded character set identifier
     Note:  > means current value is different from the shipped value
                                         * * * * *   E N D  O F  L I S T I N G   * * * * *

@boskowski

System-wide CCSID is 65535, I changed it to 37 for my user profile, but I guess this is not enough... I'll check with the sysadmin if a system-wide change is feasible. Thanks, @aaronbartell.

Update 2015-12-09: Changing the system-wide CCSID of the target machine is a no go.

However, looking at the Java properties, it seems that my CCSID (037) is inherited by the job as can be seen in the properties ant dumps when executing the test, e.g.

os400.job.file.encoding=Cp037
os400.file.encoding.ccsid=00819
ibm.system.encoding=ISO8859-1

I also wonder how any encoding issue can cause a change in the contents of a string constant, turning a dot into a comma, as the error appears before the string is converted into a number.

Update: I didn't notice that the error is caused by ret_f128 as it quite obviously receives the BigDecimal in the wrong format.
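
One possible explanation for the comma, offered only as a guess that nothing in this thread confirms, is that some step formats the number with a locale whose decimal separator is a comma (the system error messages quoted above are in Italian). The sketch below does not touch jffi or ret_f128 at all; it only shows, with standard JDK calls, how locale-sensitive formatting produces a string that `BigDecimal` then rejects. The class and helper names are hypothetical.

```java
import java.math.BigDecimal;
import java.util.Locale;

final class DecimalSeparatorSketch {
    // Format a value with two decimals using the given locale's decimal separator.
    static String format(Locale locale, double value) {
        return String.format(locale, "%.2f", value);
    }

    // Report whether BigDecimal accepts the text; it only understands '.' as the separator.
    static boolean parses(String text) {
        try {
            new BigDecimal(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String italian = format(Locale.ITALY, 1.25); // comma as decimal separator
        String neutral = format(Locale.ROOT, 1.25);  // locale-neutral formatting
        System.out.println(italian + " parses: " + parses(italian));
        System.out.println(neutral + " parses: " + parses(neutral));
    }
}
```

If something similar happens on the native side, formatting with a fixed, locale-neutral setting before handing the string to `BigDecimal` would be the thing to try.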

@headius
Member

headius commented Mar 8, 2016

Hey folks, where do we stand today?

@aaronbartell
Author

@headius I need your input on this question

Then I would need to re-fork (to get latest) and re-implement changes (assuming the un-passing test is ok).

SIDE NOTE: I am creating an online service for having easy access to the IBM i operating system. It's called Litmis Spaces and currently only supports Node.js and Ruby development with Python, Java, and PHP coming soon.

My hope is to automate builds of JRuby-like things so IBM i folks become first class citizens in open source.

@headius
Member

headius commented Mar 8, 2016

@aaronbartell Thanks, sorry I missed the question. Yes, sounds like we should go forward with the binary, assume that is a bad test, and if possible mask out or patch the test to work appropriately on AIX.

WRT getting regular testing on i-series...that would obviously be great :-) Let me know if I can do anything to help make that happen.

@headius
Member

headius commented Mar 8, 2016

@aaronbartell I'll bump this to next release so you have a chance to re-pull and re-base your changes.

@headius headius added this to the 1.2.12 milestone Mar 8, 2016
@pierrickrouxel

@headius You released 1.2.12 but I don't see the OS400 support. Why?

@headius
Member

headius commented May 17, 2016

@pierrickrouxel Well, I don't see a PR for it anywhere and I'm a little confused at this point what to merge. It's an easy matter to spin a new release, so can someone sort out what we need to do?

@headius
Member

headius commented May 17, 2016

@aaronbartell Still out there? If you're happy with what you have now, can we get a (rebased) PR?

@headius headius modified the milestones: 1.2.13, 1.2.12 May 17, 2016
@aaronbartell
Author

@headius @pierrickrouxel

I can do a rebased PR but not until after I get done with two international travels (Toronto this week and Stockholm in June). Earliest I could accomplish this task is June 25th.

Side note: Bummed I can't make the RUM meetup tonight as I see JRuby is the topic (I live in Mankato, MN). Have fun!

@headius
Member

headius commented May 26, 2016

@aaronbartell Ok, whenever you can get to it is great, or if someone else here wants to try to do it that's great too.

Yeah, the RUM thing was a last-minute decision...but I'm hoping to get to RUM more regularly.

@headius
Member

headius commented Sep 26, 2016

All: This is one of the only outstanding issues in the JFFI tracker, and I'd like to resolve it. Do we have anything current that could go into 1.2.13 or a big bang 1.3 release?

@hancockm

Could you change
BigDecimal param = new BigDecimal("1.234567890123456789");
to:
BigDecimal param = new BigDecimal("1.234567890123456789", MathContext.DECIMAL128); ?
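
For reference, here is a small sketch of what the extra `MathContext` argument changes when constructing a `BigDecimal` from a string. It uses only standard JDK classes; the class and the `parse128` helper are hypothetical names for illustration. `MathContext.DECIMAL128` rounds to 34 significant digits, so the 19-digit test value passes through unchanged and only longer inputs are rounded.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class MathContextSketch {
    // Hypothetical helper: parse a decimal string, rounding to DECIMAL128 precision (34 digits).
    static BigDecimal parse128(String text) {
        return new BigDecimal(text, MathContext.DECIMAL128);
    }

    public static void main(String[] args) {
        BigDecimal plain = new BigDecimal("1.234567890123456789");
        BigDecimal rounded = parse128("1.234567890123456789");

        // Both have 19 significant digits, well under the 34-digit limit.
        System.out.println(plain.equals(rounded));
        System.out.println(MathContext.DECIMAL128.getPrecision());

        // A 40-digit input is rounded down to 34 significant digits.
        System.out.println(parse128("1.234567890123456789012345678901234567890").precision());
    }
}
```

Whether this affects the failing NumberTest on IBM i is not something the sketch can show; it only illustrates the constructor's rounding behaviour.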

@headius
Member

headius commented Jul 9, 2020

Pinging this one last time to see if someone can contribute a binary or provide access to an environment where we can build for IBM i-series.

@hancockm

hancockm commented Jul 9, 2020

I have access to IBMi 720, but we are in the process of upgrading to a Power 9 IBMi next week and would prefer to do it after we upgrade to the newer machine. It would also be built on the most current i hardware and software environment. Thanks.
