# PJRmi

## Overview

PJRmi is an API for performing Remote Method Invocation (RMI, aka RPC) in a Java process from a Python one, and vice versa. It works by creating shim objects in Python which the user can transparently treat as they would most Python objects, but which actually cause things to happen inside the Java process.

There are three main modes of operation:
  1. A Java child process spawned from a Python parent.
  1. A Java process which runs a PJRmi server, which Python clients may connect to.
  1. An in-process JVM, created inside the Python process.

In the common server use-case, in order to provide a connection, a user must instantiate a PJRmi thread in their Java process. This can then register a service with our internal systems which the Python client can connect to. It's also possible to create a server instance which listens on a secured socket.

## These examples...

We'll be using the parent/child model in the examples below but pretty much everything you can do in one mode you can do in the others.

When in the Python client, there are two ways in which one can get a handle on a Java object instance. You can either construct an object directly from a class, or you can get a handle on an existing object, provided by the PJRmi instance. The former case is just like creating a new Python object, and is something which we'll see a lot of below. The latter case is handled by the PJRmi instance in the Java process being an instance of a subclass of the general PJRmi class, but with an override for the `getObjectInstance()` method; this only really makes sense when you have an explicit PJRmi server in a Java process.

PJRmi does everything by reflection so, aside from instantiating the PJRmi object in the Java process, you don't need any other boilerplate code.

## Let's see it in action!

First we'll import the Python module and get a PJRmi instance.

In [None]:
# What we'll need for this
import numpy
import os
import pjrmi
import time

# Get a new instance. 
#
# Here the application_args are just to allow the server to have
# multiple threads. This is only really needed if you intend to
# use callbacks, which we will in our examples later below.
#
# We also pipe the stdout and stderr of the Java process to 
# /dev/null by means of setting the filehandles to None.
c = pjrmi.connect_to_child_jvm(stdout=None,
                               stderr=None,
                               application_args=('num_workers=2',))

Right, we now have our instance. The first thing we'll want to do is to create some objects which we can play about with. We'll also use this as an example of the way that "compatible" types can flow between the two languages.

In [None]:
ArrayList = c.class_for_name('java.util.ArrayList')
HashMap   = c.class_for_name('java.util.HashMap')

What did we get?

In [None]:
ArrayList

In [None]:
HashMap

These are Python classes which are going to represent the Java ones. When you create instances of them you will actually be creating handles on the actual Java instances.

We can create them empty, or we can give them something to copy. Here's where we get to see the two languages flowing into each other, at the type level.

In [None]:
list1 = ArrayList()
list2 = ArrayList([3,1,4,1,5,9,2,6])
list3 = list(list2)
print(str(list1) + " " + str(list2) + " " + str(list3))

Similarly for the Map:

In [None]:
map1 = HashMap()
map2 = HashMap({1:2, 3:4, 5:6})
map3 = dict(map2.entrySet())
print(str(map1) + " " + str(map2) + " " + str(map3))

You'll see the subtlety here that the `dict` needs to take the result of `entrySet()`, since a Map isn't really a `dict`. However, `entrySet()` yields an iterable of `Map.Entry` objects which are interpretable as key-value pairs. These are then used to instantiate a `dict` in the same way that any iterable of `2-tuple`s is.
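
As a plain-Python aside (no JVM involved), here's a minimal sketch of the `dict` behavior which this relies on: `dict()` accepts any iterable whose items each yield exactly two values. The `Entry` class is a hypothetical stand-in for a `Map.Entry` shim, not PJRmi's actual implementation.

```python
class Entry:
    """Hypothetical stand-in for a Map.Entry shim: iterating yields key, then value."""
    def __init__(self, key, value):
        self._key = key
        self._value = value

    def __iter__(self):
        yield self._key
        yield self._value

# Any iterable of two-item iterables can seed a dict
entries = [Entry(1, 2), Entry(3, 4), Entry(5, 6)]
as_dict = dict(entries)
print(as_dict)
```

Anything which unpacks into a key and a value will do, which is why the entry objects don't need to be real tuples.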

These classes have the sugar which you would expect to have in Python. Above you can see that `str()` works as expected on them (calling their `toString()` methods). Similarly, they have the general Python syntax.

In [None]:
(map2[1], list2[4])

## Types and how they differ between Java and Python

Python and Java have two different type systems. They are _fairly_ similar at a high level but there are some important things to be aware of:
  1. Python only has one type of native floating point value, and one native integer one.
  1. Java allows method overloading but Python, since there may only be one method per name, does not.
  1. Java has both object and primitive types for things like `int`s and `float`s. Python just has objects.
  1. Python has `numpy` which has a more specific type system. This can be used to avoid some typing ambiguities.
  
Before we go any further let's get our ducks in a row by "importing" the bits which we need. We can get the Java classes using `class_for_name` but there's also a sugar method for this, which you'll see later on (spoilers!).

In [None]:
Byte    = c.class_for_name('java.lang.Byte')
Integer = c.class_for_name('java.lang.Integer')
Double  = c.class_for_name('java.lang.Double')
String  = c.class_for_name('java.lang.String')
System  = c.class_for_name('java.lang.System')
Thread  = c.class_for_name('java.lang.Thread')
Arrays  = c.class_for_name('java.util.Arrays')

Now let's see the two type systems in action. To make integration a little smoother PJRmi will box certain Java objects to represent them as native Python ones.

In [None]:
ints           = tuple(range(10))
int64s         = numpy.arange(10, dtype='int64')
floats         = tuple(map(float, ints))
float64s       = numpy.arange(10, dtype='float64')
small_int      = 7
big_int        = 12345678
ints_list      = ArrayList(ints)
int64s_list    = ArrayList(int64s)
floats_list    = ArrayList(floats)
float64s_list  = ArrayList(float64s)
small_int_list = ArrayList((small_int,))
big_int_list   = ArrayList((big_int,))

Notice how PJRmi picked the appropriate ArrayList constructor here. That constructor is an overloaded method in Java. In this instance we ended up using the one which takes a `Collection`.

We'll see the type inference and boxing in action next. PJRmi has to make guesses about the types of things if it doesn't have much information. However, it will handle type conversion if it can.

In [None]:
(ints_list     [0], type(ints_list     [0]),
 int64s_list   [0], type(int64s_list   [0]),
 small_int_list[0], type(small_int_list[0]),
 big_int_list  [0], type(big_int_list  [0]))

In [None]:
Byte.valueOf(small_int), type(Byte.valueOf(small_int))

In [None]:
Integer.valueOf(small_int), type(Integer.valueOf(small_int))

But, like Java, we'll get an exception if we attempt to turn a value into something which would cause overflow.

In [None]:
try:
    Byte.valueOf(big_int)
except Exception as e:
    print(f'{type(e)}')
    print(f'{e}')
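
To make the overflow case concrete, here's a small pure-Python sketch of the kind of range check involved. The ranges are those of Java's signed integer primitives; the `check_fits` helper and its use of `OverflowError` are assumptions for illustration, and PJRmi's own check and exception type may well differ.

```python
# Value ranges of Java's signed integer primitives
JAVA_INT_RANGES = {
    'byte':  (-2**7,  2**7  - 1),
    'short': (-2**15, 2**15 - 1),
    'int':   (-2**31, 2**31 - 1),
    'long':  (-2**63, 2**63 - 1),
}

def check_fits(value, java_type):
    """Return value if it fits in java_type, else raise OverflowError.

    Hypothetical helper, not part of PJRmi.
    """
    lo, hi = JAVA_INT_RANGES[java_type]
    if not lo <= value <= hi:
        raise OverflowError(f'{value} does not fit in a Java {java_type}')
    return value

print(check_fits(7, 'byte'))
print(check_fits(12345678, 'int'))
try:
    check_fits(12345678, 'byte')
except OverflowError as e:
    print(e)
```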

We have similar boxing for Strings. These behave like Python strings (and are, in fact, a subclass of them).

In [None]:
s = String.valueOf("hello world")
(s, type(s), s.split())
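
One way to picture this kind of boxing is as a `str` subclass which also remembers which Java object it came from. This is only a sketch under that assumption: the `BoxedString` class and its `java_handle` attribute are made up for illustration and are not PJRmi's real classes.

```python
class BoxedString(str):
    """Hypothetical box: a real Python str which also remembers a Java handle."""
    def __new__(cls, value, handle):
        self = super().__new__(cls, value)
        self.java_handle = handle  # made-up identifier for the Java object
        return self

boxed = BoxedString("hello world", handle=42)

# It is a str, so all the usual str methods work on it
print(isinstance(boxed, str), boxed.split(), boxed.java_handle)
```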

All these types will be unboxed as their underlying Java objects when they are passed into Java methods. Here we use the "identity hashcode" to get at the underlying `Object.hashCode()` value, which is always the same for a given object instance. The `Integer.hashCode()` value, here obtained via `hash()`, is simply the integer's own value.


In [None]:
i1 = Integer.valueOf(big_int)
i2 = Integer.valueOf(big_int)
(i1,
 i2,
 hash(i1),
 hash(i1),
 hash(i2),
 hash(i2),
 System.identityHashCode(i1), 
 System.identityHashCode(i1), 
 System.identityHashCode(i2), 
 System.identityHashCode(i2))

PJRmi will also attempt to turn the various Python containers into the desired Java ones. We've seen how we created new `ArrayList` instances in the above, using Python container objects, via the `ArrayList(Collection)` constructor. We can also use those Python containers as arrays.

In [None]:
Arrays.asList?

In [None]:
arrays_list = Arrays.asList(int64s)
(arrays_list, str(arrays_list))

And, going in the other direction, we can use Java containers naturally in Python. This is similar to what we were doing with maps above.

In [None]:
tuple(arrays_list)

And, since we're here, let's see function overloading in action. The `ArrayList.add()` method takes two forms: one is just the object to add, the other also has an insertion point.

In [None]:
ArrayList.add?

In [None]:
str(int64s_list)

In [None]:
int64s_list.add('At the end')

In [None]:
int64s_list.add(0, 'At the start')

In [None]:
str(int64s_list)

Yipes! We can do that?! Well, yes.

The `ArrayList` loses its type parameterization when running in the JVM, owing to type erasure; generics are really just a compile-time construct. An `ArrayList` is just a container of `Object`s and, because of that, we can put anything in one!

In [None]:
tuple(map(type, int64s_list))

So, just like Python, you can have a Java container of mixed types. Just be careful what you then do with it since you'll wind up with runtime type exceptions if you aren't careful.

Related to this, exceptions work pretty much as you would expect.

In [None]:
try:
    print("ints_list[0] is %s" % ints_list[0])
    print("ints_list[1000] is %s" % ints_list[1000])
except Exception as e:
    print("Something went wrong: %s" % e)

## Moving on from the basics

### Lambdas

Okay, now that we have the basics down, we can get fancy. Let's use a Python lambda on the Java side!

In [None]:
map1.computeIfAbsent(10, lambda x: x + 1)

In [None]:
str(map1)

So, what's actually going on in the above? This is where the `num_workers` argument comes into play. When we tell the Java server to instantiate multiple workers it's a directive that means that we're going to be doing all our request handling in multiple threads on both the Java and Python sides. The `num_workers` value is the minimum size of the threadpool.

Both Java and Python will have a work queue to handle the messages coming in from the other side. When such a message comes in, the task is handed off to a worker thread to handle; additional worker threads are created if the pool is exhausted, but more on this in a bit. This model is required since, if Java calls back into Python, the Python thread which initiated the call into Java will be blocked waiting for a reply. As such, a different thread needs to handle the request from Java to do something (like invoke the lambda). Since it's possible for that lambda to do anything, like calling back into Java, the Java server has to be ready for more messages; the same goes for the Python side.
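
The threading model above can be sketched in pure Python, with no JVM involved. Here two thread pools stand in for the Java and Python worker pools; all the names (`java_workers`, `python_workers`, `call_into_java` and so on) are hypothetical and this is not how PJRmi is actually implemented.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One pool per "side"; two workers each, in the spirit of num_workers=2
java_workers = ThreadPoolExecutor(max_workers=2, thread_name_prefix='java')
python_workers = ThreadPoolExecutor(max_workers=2, thread_name_prefix='python')

callback_threads = []

def python_callback(x):
    # Runs on a "Python" worker, since the calling thread is blocked
    callback_threads.append(threading.current_thread().name)
    return x + 1

def java_compute(key):
    # "Java" calls back into "Python" and waits for the answer
    return python_workers.submit(python_callback, key).result()

def call_into_java(key):
    # The calling thread blocks here until "Java" replies
    return java_workers.submit(java_compute, key).result()

result = call_into_java(10)
print(result, callback_threads)

java_workers.shutdown()
python_workers.shutdown()
```

The callback runs on a worker thread rather than on the thread which made the original call; that caller is stuck waiting on its reply the whole time.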

Here's a crazy example:

In [None]:
# Heading into Java
map1.computeIfAbsent(
    100, 
    # Java calls back into Python
    lambda x: 
        # And Python calls back-back into Java
        map2.computeIfAbsent(
            x // 2, 
            # And Java calls back-back into Python, which calls back-back-back into Java! Lummy!
            lambda y: Integer.valueOf(str(y)) + 1
        )
)
print(str(map1))
print(str(map2))

Next on this topic, you can also implement Java interfaces in Python. This is done using subclasses of the `JavaProxyBase` class, which just have to be duck-typed to look like what the Java interfaces define.

In [None]:
class PythonRunnable(pjrmi.JavaProxyBase):
    """A class which looks like a Java Runnable"""
    def run(self):
        print("I ran!")
runnable = PythonRunnable()
runnable.run()

Let's use this to spawn a Java thread, which will itself call back into our Python client.

In [None]:
print(Thread.getClass())
thread = Thread(runnable)
thread.start()
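
The duck-typing here just means that the Python object needs a method with the right name. A rough sketch of dispatching a call by name might look like the following; the `invoke` helper and `RunnableLike` class are hypothetical and stand in for whatever PJRmi really does when Java calls back.

```python
class RunnableLike:
    """Looks like a Java Runnable purely by having a run() method."""
    def run(self):
        return "I ran!"

def invoke(proxy, method_name, *args):
    """Hypothetical dispatcher: look the method up by name and call it."""
    method = getattr(proxy, method_name, None)
    if not callable(method):
        raise AttributeError(
            f'{type(proxy).__name__} has no method {method_name}()'
        )
    return method(*args)

print(invoke(RunnableLike(), 'run'))
try:
    invoke(RunnableLike(), 'call')
except AttributeError as e:
    print(e)
```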

### Class and source injection

Since the JVM exposes methods for loading classes and compiling code on the fly we can inject new code into a running Java application. Here's a quick example which includes some things which we'll use later on too.

In [None]:
class_name = "Injected"
source     = """
import java.util.ArrayList;
import java.util.Collection;
import java.util.function.Function;

public class Injected {
    public static int addOne(int i) {
        return i+1;
   }

   public static double sum(final double[] array)
   {
       double result = 0;
       for (double value : array) {
           result += value;
       }
       return result;
   }

   public static float sum(final float[] array)
   {
       float result = 0;
       for (float value : array) {
           result += value;
       }
       return result;
   }

   public static <T,U> Collection<U> map(final Collection<T> c,
                                         final Function<T,U> f)
   {
       final Collection<U> result = new ArrayList<U>();
       for (T element : c) {
           result.add(f.apply(element));
       }
       return result;
   }
}
"""
Injected = c.inject_source(class_name, source)
Injected.addOne(1)

### Method "unbinding"

We can also "unbind" methods from Java objects or classes, as we would do in Java itself. This gives you back a Java version of the method, as opposed to its Python shim. If a method is overloaded then we need to tell PJRmi which version of the method we want, so that it grabs the correct one.

There are different forms of syntax for capturing a method, but we'll use the `[]` one in the example below.

In [None]:
jint = c.class_for_name('int')
method = Integer.toString[jint]

In [None]:
method(3141)

In [None]:
map1.computeIfAbsent(12345678, method)
map1.computeIfAbsent('hello', String.hashCode[None])
print(map1)

And now, we can pass in the Java method for use as a Java lambda, meaning that all the execution happens on the Java side.

In [None]:
StringValueOf = String.valueOf['java.lang.Object']
mapped = Injected.map(arrays_list, StringValueOf)
print(mapped)
print(tuple(map(type, mapped)))

Using unbinding can also make calls a little faster for heavily overloaded Java methods, since the Python side doesn't have to perform the disambiguation itself.
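
To illustrate why picking the overload up front can save work, here's a toy pure-Python model of an overloaded method. The `OverloadedMethod` class is entirely hypothetical: it matches on Python types for simplicity, whereas PJRmi resolves against Java signatures.

```python
class OverloadedMethod:
    """Hypothetical shim for an overloaded method."""
    def __init__(self, name, overloads):
        self._name = name
        self._overloads = overloads  # maps a tuple of types to a callable

    def __call__(self, *args):
        # Disambiguate on every call by matching the argument types
        for types, impl in self._overloads.items():
            if len(types) == len(args) and all(
                isinstance(a, t) for a, t in zip(args, types)
            ):
                return impl(*args)
        raise TypeError(f'no overload of {self._name} matches {args!r}')

    def __getitem__(self, types):
        # "Unbind": pick one overload up front and skip later matching
        if not isinstance(types, tuple):
            types = (types,)
        return self._overloads[types]

describe = OverloadedMethod('describe', {
    (int,):     lambda i: f'int {i}',
    (str,):     lambda s: f'str {s}',
    (int, str): lambda i, s: f'int {i} and str {s}',
})

print(describe(3141))         # resolved at call time
print(describe(1, 'one'))     # resolved at call time
describe_int = describe[int]  # resolved once, up front
print(describe_int(3141))
```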

## Fun With `/dev/shm`

### Argument passing
If we have a Python process driving a Java child on the same machine then we can use `/dev/shm` for passing certain "large" arguments to the Java process. This is useful when you have big Numpy arrays which would otherwise need to be marshalled as their binary representation by Python and unmarshalled into an array on the Java side. We'll need the JNI extension to be loaded for this to work since it uses C++ magic behind the scenes. You'll also notice that `Injected_.sum()` is an overloaded method and so we use the unbinding directive to pick the one which we want to use.

In [None]:
array = numpy.arange(1.0e6, dtype=numpy.float64)
iters = 100
print('*' * 40)
for use_shm_arg_passing in (True, False):
    with pjrmi.connect_to_child_jvm(stdout=None,
                                    stderr=None,
                                    use_shm_arg_passing=use_shm_arg_passing) as jvm:
        Injected_ = jvm.inject_source(class_name, source)
        sum_ = Injected_.sum['[D']
        start = time.time()
        for i in range(iters):
            sum_(array)
        end = time.time()
        took_ms = int((end - start) / iters * 1000)
        print(f'use_shm_arg_passing={use_shm_arg_passing} took={took_ms}ms/iter')
print('*' * 40)
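
The general idea of passing a filename instead of the data can be sketched with just the standard library. This is a loose analogy under stated assumptions, not PJRmi's mechanism: the helper names are made up, `pickle` stands in for marshalling, and the "receiver" is simply another function in the same process.

```python
import array
import os
import pickle
import tempfile

# Prefer /dev/shm when it is usable; otherwise fall back to a temp dir
SHM_DIR = '/dev/shm'
if not (os.path.isdir(SHM_DIR) and os.access(SHM_DIR, os.W_OK)):
    SHM_DIR = tempfile.gettempdir()

def send_by_value(values):
    """Marshal the whole array into the message itself."""
    return pickle.dumps(values)

def send_by_file(values):
    """Write the raw doubles to a file and send only its name."""
    fd, filename = tempfile.mkstemp(dir=SHM_DIR)
    with os.fdopen(fd, 'wb') as fh:
        values.tofile(fh)
    return filename

def receive_by_file(filename):
    """Read the doubles back in and remove the file."""
    values = array.array('d')
    count = os.path.getsize(filename) // values.itemsize
    with open(filename, 'rb') as fh:
        values.fromfile(fh, count)
    os.unlink(filename)
    return values

data = array.array('d', range(1000))
by_value = pickle.loads(send_by_value(data))
filename = send_by_file(data)
by_file = receive_by_file(filename)
print(by_value == data, by_file == data)
```

In the file-based version the message is just a short path, however large the array is.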

### Doing it all by hand

We can also share memory between a Python numpy array and a, say, Java `DoubleBuffer` using memory mapping.

There's no special PJRmi magic here; we're just writing plain Python code and Java code.

In [None]:
import numpy

ByteOrder          = c.class_for_name('java.nio.ByteOrder')
FileChannel        = c.class_for_name('java.nio.channels.FileChannel')
StandardOpenOption = c.class_for_name('java.nio.file.StandardOpenOption')
Path               = c.class_for_name('java.nio.file.Path')

FILENAME = '/dev/shm/mmap.srp'
SIZE     = 1024

Create the Python numpy array, backed by a file.

In [None]:
array = numpy.memmap(FILENAME, dtype=numpy.float64, mode='w+', shape=(SIZE,), order='C')

Create the Java version. This is a little more involved (and just gives us back a buffer).

In [None]:
path    = Path.of(FILENAME, [])
channel = FileChannel.open(path, [StandardOpenOption.READ, StandardOpenOption.WRITE])
buffer  = channel.map(FileChannel.MapMode.READ_WRITE, 0, SIZE * Double.BYTES)
buffer.order(ByteOrder.nativeOrder())
doubles = buffer.asDoubleBuffer()

So we now have this:

In [None]:
print(array[:15])
print([doubles.get(i) for i in range(15)])

And now just by changing the Python version we see the update reflected on both sides, as you would expect:

In [None]:
array[:10] = range(10)
print(array[:15])
print([doubles.get(i) for i in range(15)])

Or just changing the Java version also updates the Python version:

In [None]:
for i in range(10):
    doubles.put(i * 10.0)
print(array[:15])
print([doubles.get(i) for i in range(15)])

Tidy up:

In [None]:
os.unlink(FILENAME)
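
For comparison, here's the same sharing idea using only the Python standard library, with two separate mappings of one file standing in for the Python and Java sides. This is just a sketch: the variable names are made up, and it uses a regular temporary file rather than one in `/dev/shm`.

```python
import mmap
import os
import struct
import tempfile

COUNT = 16                              # number of doubles
NBYTES = COUNT * struct.calcsize('d')   # size of the mapping in bytes

# Create a zero-filled backing file of the right size
fd, filename = tempfile.mkstemp()
os.ftruncate(fd, NBYTES)

# Two independent shared mappings of the same file
view_a = mmap.mmap(fd, NBYTES)
view_b = mmap.mmap(fd, NBYTES)

# Write through one mapping...
struct.pack_into('3d', view_a, 0, 1.0, 2.0, 3.0)

# ...and read the values back through the other
seen = struct.unpack_from('3d', view_b, 0)
print(seen)

# Tidy up
view_a.close()
view_b.close()
os.close(fd)
os.unlink(filename)
```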

## Other examples of this and that

We can also do a copy-by-value operation on objects on the Java side. This only works if the objects map onto well-known Python types (like primitives and basic container objects).

In [None]:
map1_copy = c.value_of(map1)
(type(map1_copy), str(map1_copy))

And, of course, one can always create a standalone script. We'll have a simple (contrived) example here. In this we mix Java and Python syntax pretty freely. A lot of the time you can only tell what we're using because of the difference in naming conventions.

In [None]:
# We'll use a 'with' context so that the JVM shuts down when we're done
with pjrmi.connect_to_child_jvm(stdout=None, stderr=None, application_args=('num_workers=2',)) as jvm:
    # Grab the Java classes, using a bit of syntactic sugar
    Arrays  = jvm.javaclass.java.util.Arrays
    HashMap = jvm.javaclass.java.util.HashMap

    # Now turn a number into our count of apples
    def as_apples(count):
        if count <= 0:
            return "I have no apples"
        elif count == 1:
            return "I have an apple"
        else:
            return "I have %d apples" % count

    # Set up the variables
    numbers = Arrays.asList(range(0, 5))
    apples  = HashMap()
    for number in numbers:
        apples.computeIfAbsent(number, as_apples)
    sum_sqs = numbers.stream().mapToInt(lambda x : numpy.int32(x*x)).sum()
    
    # Print them out!
    print("Numbers: %s" % numbers)
    print("Apples:  %s" % apples)
    print("Sum x^2: %d" % sum_sqs)

## Wrapping up...

This was just a very quick introduction to some of the things which we can do in PJRmi. The general rule of thumb is: if you expect it to work, then it probably should.

In [None]:
# Be kind, rewind^Wclose
c.close()

T- T- That's all folks!