Provide java.nio.ByteBuffer-friendly send/recv API #1

Closed
jholloway7 opened this issue Apr 6, 2010 · 15 comments

@jholloway7

I had a thought while reviewing the source that the java.nio.ByteBuffer enhancements to Java 1.4 might be a better fit for the send/recv calls (or perhaps some variation on them).

Given that ByteBuffer lets you directly use native heap you can avoid the need to memcpy messages to/from the managed heap.

I know from the documentation that 0MQ is very focused on performance, so this enhancement could possibly boost the performance of the Java bindings (faster send/recv, use less managed heap, fewer GCs) and get them a bit closer to native performance.
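For concreteness, here is a minimal, self-contained sketch of the distinction I mean (plain JDK code, nothing jzmq-specific):

import java.nio.ByteBuffer;

class DirectBufferDemo
{
    public static void main (String [] args)
    {
        // A direct buffer lives outside the managed heap; native code
        // can be handed a stable address to its contents instead of
        // copying them.
        ByteBuffer direct = ByteBuffer.allocateDirect (8);
        direct.putLong (42L);

        // A heap buffer wraps an ordinary byte[]; moving it across JNI
        // generally means a copy (e.g. via GetByteArrayElements).
        ByteBuffer heap = ByteBuffer.wrap (new byte [8]);

        System.out.println (direct.isDirect ()); // true
        System.out.println (heap.isDirect ());   // false
    }
}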

@zeromq (Collaborator) commented Apr 6, 2010

I've spent some time trying to implement this kind of thing, but in the end my feeling was that it's not possible. Can you be more specific about how to do this?

@jholloway7 (Author)

Unfortunately, it's been a few years since I've worked with JNI, so my comment was more "in theory" than "in practice". Let me see if I can dust off my JNI skills and come up with something (or run into the same wall as you). I wonder if you'd have to expose something like the zmq_msg_* API to do it.

@zeromq (Collaborator) commented Apr 6, 2010

Definitely give it a try.

Just to hint at the problem I hit: the ownership of the message data has to be transferred from the C library to the JVM (the garbage collector) and vice versa. As far as I was able to find out, there is no way to hand a malloc-allocated chunk over to the Java garbage collector, or to detach a Java-allocated chunk from the garbage collector and pass it to a C function.
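At the Java level the asymmetry looks like this (a sketch of the limitation itself, not a fix):

import java.nio.ByteBuffer;

class OwnershipProblem
{
    public static void main (String [] args)
    {
        // Java -> C: this native memory is owned by the GC; there is
        // no public API to release it early or to transfer ownership
        // so that C code could free() it.
        ByteBuffer javaOwned = ByteBuffer.allocateDirect (1024);

        // C -> Java: JNI's NewDirectByteBuffer can expose a malloc'd
        // chunk as a ByteBuffer, but the GC never adopts it; if the
        // native side frees the chunk while the buffer is still
        // reachable, the buffer dangles.
        System.out.println (javaOwned.isDirect ()); // true
    }
}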

@jholloway7 (Author)

Yeah, I can see that being a difficult problem to solve given the current byte[] API.

I think the way to approach it would be to include a simple Message class that is mostly backed by the zmq_msg_* entry points. The consumer would construct a Message and then grab a ByteBuffer from it which allows them to manipulate the native buffer "directly" through the ByteBuffer interface.

class Message {
    Message (int size) { ... } // allocates an empty native msg
    Message (byte[] data) { ... } // memcpy into a native msg
    ByteBuffer buffer () { ... } // returns a ByteBuffer backed by the msg
}

Or I guess Message could even extend ByteBuffer...

Then you could extend the API in Socket to allow send/recv on Message objects instead of plain old byte[].

class Socket {
    boolean send (Message msg, long flags);
    Message recv (long flags);
}
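Hypothetically, usage would then look something like this (all names come from the sketches above and are not final):

// Send: fill a native-backed message and hand it to the socket.
Message msg = new Message (8);
msg.buffer ().putLong (42L);
socket.send (msg, 0);

// Receive: the returned Message wraps the buffer 0MQ filled in.
Message reply = socket.recv (0);
long value = reply.buffer ().getLong ();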

Let me prototype and see what it feels like.

@zeromq (Collaborator) commented Apr 8, 2010

I've had a look at the code, but I am not a Java developer, so I have no clear opinion. Would you mind posting the patch on the mailing list? Several hundred people are subscribed there, so you'll probably get real feedback.

@jholloway7 (Author)

Sure. I'm going to take a stab at some of the issues I noted in the pull request before I hit up the mailing list. I've come to the conclusion that the proposed API had better account for the lack of direct buffer support in a given JVM, and that subclassing ByteBuffer is the best way to make that seamless for the consumer.

@gonzus (Contributor) commented Apr 8, 2010

Only now have I realized this was an issue tracked here... Duh.

I believe the Message class idea will still require copying to/from the underlying ByteBuffer, right? Or maybe I am misunderstanding the proposal... Sample code would be best here.

@jholloway7 (Author)

For the prototype code, check out my forked repo here: http://github.com/jholloway7/jzmq

It is a functional proof of concept, but there are a couple of issues to resolve before it's ready for prime time.

It boils down to how the Message class is designed so you can back it with a native buffer. For example, here's what the sample publisher app would look like:

import java.nio.ByteBuffer;

class PublisherApp
{
    public static void main (String [] args) throws Exception
    {
        // Initialise 0MQ with a single application and I/O thread
        org.zmq.Context ctx = new org.zmq.Context (1, 1, 0);

        // Create a PUB socket for port 5555 on the lo interface
        org.zmq.Socket s = new org.zmq.Socket (ctx, org.zmq.Socket.PUB);
        s.bind ("tcp://lo:5555");

        for (long msg_id = 1; ; msg_id++) {
            // Create a new, empty, 8 byte message
            org.zmq.Message msg = new org.zmq.Message (8);

            // Fill it with the current message ID
            ByteBuffer bb = msg.buffer ();
            bb.putLong (msg_id);

            // Publish our message
            s.msend (msg, 0);

            if (msg_id % 10000 == 0) {
                System.out.println (Long.toString (msg_id));
                System.gc ();
            }
        }
    }
}

If you make the Message class extend from ByteBuffer, you can simplify this code:

// Fill it with the current message ID
org.zmq.Message msg = new org.zmq.Message(8);
ByteBuffer bb = msg.buffer();
bb.putLong (msg_id);

to this:

org.zmq.Message msg = new org.zmq.Message(8);
msg.putLong (msg_id);

Either way, if the Message object is backed by a direct ByteBuffer, then the putLong call manipulates the native heap directly and there is no need for a memcpy during the send operation. Similarly, a Message object constructed during recv can be backed by the native zmq_msg_t.
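The receive side would look roughly like this (mrecv is a hypothetical counterpart to msend; the prototype may spell it differently):

// The returned Message wraps the zmq_msg_t that 0MQ filled in,
// so reading from its buffer touches native memory directly.
org.zmq.Message msg = s.mrecv (0);
long msg_id = msg.buffer ().getLong ();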

The issues I alluded to are these:

  • Some JVMs do not support direct ByteBuffers, so my code would need to be enhanced a bit to fall back to managed buffers.
  • Currently my code frees the native zmq_msg when the Message object is garbage collected. This could cause problems if the consumer keeps a longer-lived reference to the ByteBuffer (since it would be left dangling). I think the solution is to use phantom references under the hood (see the sketch after this list).
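A minimal sketch of that phantom-reference idea (the class and method names here are illustrative, not from the jzmq prototype):

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

class NativeMessageCleaner
{
    private static final ReferenceQueue<Object> QUEUE =
        new ReferenceQueue<Object> ();

    // Phantom reference to a Message (or its ByteBuffer view) that
    // remembers the address of the native zmq_msg_t it guards. The
    // MsgRef objects themselves must be kept strongly reachable
    // (e.g. in a set) until drained.
    static class MsgRef extends PhantomReference<Object>
    {
        final long nativeHandle;

        MsgRef (Object referent, long nativeHandle)
        {
            super (referent, QUEUE);
            this.nativeHandle = nativeHandle;
        }
    }

    // Called from a background thread: free each native message only
    // after the GC proves nothing in Java can still reach it.
    static void drain ()
    {
        MsgRef ref;
        while ((ref = (MsgRef) QUEUE.poll ()) != null)
            freeNative (ref.nativeHandle); // e.g. zmq_msg_close + free
    }

    private static native void freeNative (long handle);
}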

(Apologies if I started this discussion in the wrong place; I always get confused about whether people prefer tickets or mailing-list discussions for this sort of thing.)

@gonzus (Contributor) commented Apr 8, 2010

I think it is more common to see this type of discussion in the mailing list.

I am a noob when it comes to direct ByteBuffers... When you call s.msend(msg, 0), it will end up calling a native method; how do you access the contents of the ByteBuffer inside that method so as to avoid the current call to GetByteArrayElements()? And how do you turn those contents into a native zmq_msg_t? (I guess by using zmq_msg_init_data().)

@jholloway7 (Author)

The Message class wraps a handle to a zmq_msg_t that has been allocated on the native heap. In other words, when you construct a Message object you are also creating a zmq_msg_t and retaining a handle to that object.

http://github.com/jholloway7/jzmq/blob/master/src/Message.cpp#L92

When the caller asks for a ByteBuffer from the Message object they are given a direct ByteBuffer that sits on top of the zmq_msg_t data.

http://github.com/jholloway7/jzmq/blob/master/src/Message.cpp#L138

(That code would be subject to change as I have now convinced myself that the better design is for Message to implement the ByteBuffer interface)

Manipulating the ByteBuffer (as in my earlier PublisherApp example) is directly manipulating the underlying zmq_msg_t data (hence a 'direct buffer'). Subsequently, when that Message is passed along to 'msend', there is no copy needed because the Message can simply pass its underlying zmq_msg_t handle to 0MQ.

http://github.com/jholloway7/jzmq/blob/master/src/Socket.cpp#L366

Here's what I propose. Let me incorporate the following two changes into my branch:

  • Have the Message class implement the ByteBuffer interface instead of exposing the current Message#buffer method. Doing it this way will make it easier to support JVMs that don't provide direct buffers.
  • Address the 'finalize' issue that can occur when a Message is garbage collected while the caller retains a ByteBuffer reference.

After I have implemented that, I will take the case to the 0MQ mailing list.

@jholloway7 (Author)

Perhaps it would be easier to conceptualize how this works if you take ByteBuffer completely out of the equation and consider that, the way I've implemented the Message class, it could expose its own API for fiddling with bytes in the underlying zmq_msg_t data structure.

class Message {
    void putInt(int v);
    void putByte(byte b);
    ...
}

Providing support for the ByteBuffer interface simply marries the API to something that the Java developer is likely already familiar with.

@gonzus (Contributor) commented Apr 8, 2010

I understand now what you propose and how you intend to go about it. I would suggest you work on it on your branch, come up with some tests that show the benefits of your approach, and submit a patch to the list for consideration. I don't expect I will be able to assist you much, but I will be around anyway. Good luck!

@zeromq (Collaborator) commented Apr 9, 2010

I would strongly recommend running some preliminary tests (you can use the tests in the perf directory) to show that using a byte buffer does provide a performance improvement. The possible problem I see is that filling the buffer via JNI functions may be slower than creating a native byte array and copying it afterwards. For a benchmark of the effect of copying data on latency, have a look here: http://www.zeromq.org/results:copying

@jholloway7 (Author)

I'm going to close this ticket so it's not looming over your project. I'll do some performance tests when I get some time and see if this provides value. To be clear, the proposed API lets the application developer choose between managed and unmanaged buffers based on their use case. But I do agree that the complexity of unmanaged buffers may not prove worth the trouble if they deliver only marginal performance gains, or gains only in extreme edge cases.

@zeromq (Collaborator) commented Apr 10, 2010

I would say simply run a latency test with, say, a 1/2 GB message and see if there's a statistically significant improvement. If so, the patch is worth it.
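As a rough shape for such a test (the API names follow the prototype discussed above and are not confirmed; a proper latency test would bounce the message off an echo peer, as the perf tests do):

class LatencyProbe
{
    public static void main (String [] args) throws Exception
    {
        // ~1/2 GB payload; the byte[] case needs a correspondingly
        // large managed heap.
        final int SIZE = 512 * 1024 * 1024;

        org.zmq.Context ctx = new org.zmq.Context (1, 1, 0);
        org.zmq.Socket s = new org.zmq.Socket (ctx, org.zmq.Socket.PUB);
        s.bind ("tcp://lo:5555");

        // Managed path: the byte[] is copied across JNI on each send.
        byte [] managed = new byte [SIZE];
        long t0 = System.nanoTime ();
        s.send (managed, 0);
        long managedNs = System.nanoTime () - t0;

        // Unmanaged path: the Message wraps a native zmq_msg_t, so
        // msend should hand the buffer to 0MQ without a copy.
        org.zmq.Message msg = new org.zmq.Message (SIZE);
        long t1 = System.nanoTime ();
        s.msend (msg, 0);
        long unmanagedNs = System.nanoTime () - t1;

        System.out.println ("byte[]  send: " + managedNs + " ns");
        System.out.println ("Message send: " + unmanagedNs + " ns");
    }
}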

This issue was closed.