
add Dbi<byte[]> support #3

Closed
phraktle opened this issue Jul 10, 2016 · 48 comments

@phraktle
Contributor

In some contexts (such as migrating from legacy lmdbjni/leveldbjni APIs) it would be nice to have a Dbi&lt;byte[]&gt; instead of wrapping arrays with ByteBuffers and copying out the data in the caller. At first glance the design seems to indicate this should be accomplished with a BufferProxy implementation – but it's not clear how... e.g. at the point of the allocate() call the size is unknown, etc.

@benalexau
Member

The main issue is that Txn has a final T for the key and value. As such, a ByteArrayBufferProxy class would need to allocate a large byte[] that is big enough to handle any future key/value fetched.

Given legacy migration is the main concern here, I've added some static helper methods that take care of byte[] to/from direct ByteBuffer and verified their use in a test and the tutorial. Hopefully this will be sufficient to help people migrate to something more production-optimised like DirectBuffer or a ByteBuffer.

@phraktle
Contributor Author

Hi @benalexau,

Thanks for looking into this. I was able to implement the wrap/unwrap methods myself easily (in fact, your implementation of array(ByteBuffer) has a potential issue, since it creates the array at size buffer.capacity(), not buffer.remaining()), so that wasn't really the issue. Convenience is not the only problem; it is also about performance (lmdbjni is faster in this regard).
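For illustration, a remaining()-based copy-out helper might look like the following (a sketch, not lmdbjava's actual code; the class and method names are made up):

```java
import java.nio.ByteBuffer;

public final class ByteArrayHelpers {

  private ByteArrayHelpers() {}

  // Copies the readable bytes (position..limit) into a new array. Using
  // remaining() rather than capacity() means stale bytes outside the
  // current window are never included; duplicate() leaves the caller's
  // position untouched.
  public static byte[] array(final ByteBuffer buffer) {
    final byte[] dest = new byte[buffer.remaining()];
    buffer.duplicate().get(dest);
    return dest;
  }

  // Copies an array into a fresh direct buffer, flipped ready for reading.
  public static ByteBuffer buffer(final byte[] bytes) {
    final ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
    buf.put(bytes).flip();
    return buf;
  }
}
```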

I'm also curious about how resizing works in general if the buffers are final. What are the safety considerations if you alter a ByteBuffer's address/size after it was created?

@krisskross
Contributor

I feel the convenience methods do not belong in the API. We can show users how to do it through examples instead.

Ben and I discussed the address/size altering offline and our conclusion is to never modify buffers that are provided by users. The key here is ownership of the buffer. We only provide buffers that are owned by lmdbjava; these buffers last for as long as a transaction/operation and users should never use them outside the scope of a transaction. Since we never let the Cleaner kick in, we can "safely" modify the address without interfering with GC.

@phraktle
Contributor Author

Agreed that it shouldn't be convenience methods. I would suggest changing the API design / implementation so it allows for byte[] and heap allocated buffers (and can work from those directly).

@krisskross
Contributor

Do you mean like a typed put(byte[] key, byte[] val) without generics?

@phraktle
Contributor Author

phraktle commented Jul 14, 2016

From an API usage standpoint, a Dbi<byte[]> would be the preferred way to go. The internals would have to change a bit to support this. Since the whole point of this type parameter was that you should be able to plug in your favorite buffer implementation, arguably byte arrays or heap-allocated buffers are reasonable to be able to plug in here.

@krisskross
Contributor

Yes, but this is problematic since a byte array cannot be re-positioned to any memory address. All interaction with LMDB goes through memory addresses and we designed the API for off-heap memory access.

Is this really a problem that lmdbjava needs to deal with? If your application is on the heap then you always need that extra copy anyway?

@phraktle
Contributor Author

phraktle commented Jul 15, 2016

Isn't it two extra copies here on a simple put? on-heap byte[] or ByteBuffer → off-heap DirectBuffer → lmdb memory space. Whereas with lmdbjni, it's heap → lmdb memory space.

@krisskross
Contributor

In the case of lmdbjni, I don't think the heap buffer will be copied directly into lmdb memory. You can get the address of a byte array, but I'm not sure if that's how it works.

If you want good performance you really should use direct memory, maybe by reusing large pre-allocated buffers or something similar.

@krisskross
Contributor

You can also use the Dbi.reserve method, which allocates memory directly in LMDB. This method returns a pre-allocated ByteBuffer where you can use ByteBuffer.put(byte[] buf).
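For readers, a minimal sketch of that reserve pattern (assuming an already-open Env&lt;ByteBuffer&gt; env and Dbi&lt;ByteBuffer&gt; db; not a complete program):

```java
import static java.nio.charset.StandardCharsets.UTF_8;
import java.nio.ByteBuffer;

final byte[] key = "some-key".getBytes(UTF_8);
final byte[] val = "some-value".getBytes(UTF_8);

// Keys must still travel via a direct buffer.
final ByteBuffer keyBuf = ByteBuffer.allocateDirect(env.getMaxKeySize());
keyBuf.put(key).flip();

try (Txn<ByteBuffer> txn = env.txnWrite()) {
  // reserve() hands back a buffer pointing at LMDB-owned memory, so the
  // value bytes are written once, with no intermediate off-heap copy.
  final ByteBuffer valBuf = db.reserve(txn, keyBuf, val.length);
  valBuf.put(val);
  txn.commit();
}
```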

@phraktle
Contributor Author

phraktle commented Jul 15, 2016

  1. Native code can access on-heap byte arrays directly. GetByteArrayElements may not copy and just pin the array in memory, and GetPrimitiveArrayCritical temporarily suspends GC from moving things around. So this is not a fundamental limitation of the JVM or JNI; in fact it's a very necessary feature for performant native calls.
  2. Dbi.reserve is not a cure-all, though it helps with put. However, even lookups incur an overhead, since the keys have to be copied to an off-heap buffer.
  3. There are very reasonable applications of on-heap arrays and buffers. Zero-copy reads with lmdb only work until you release the read transaction, so you can't keep that reference for longer (or risk unbounded database growth). Thus it typically needs to be copied for longer usage. Copying to an off-heap buffer will pose problems (they are only deallocated when finalizers are run, plus access performance may be worse from Java than for an on-heap byte[]). So it's best to use an on-heap array, even from a performance standpoint.

My point is that it's not just laziness that leads one to want to keep using on-heap arrays while expecting performant native calls at the same time :)

@krisskross
Contributor

I hear you and have sympathy with this use case, but I'm not sure how to move forward. What's your opinion @benalexau?

@benalexau
Member

benalexau commented Jul 16, 2016

First up, thanks to @phraktle for his interest in the project and taking the time to explain his use case and possible solutions.

I have removed the byte[] convenience methods and re-opened this ticket.

I have no philosophical objection against byte[] or on-heap ByteBuffer. The current code just reflects what was most pragmatic within our off-list discussed objectives of clear buffer ownership (ie user-owned vs LMDB-owned), avoiding allocations on the critical path, and simplifying long-term maintenance by avoiding custom C code or depending on unusual JNR APIs.

Having said that, the JNI calls do appear in JNR-FFI's JNINativeInterface. However, they are labelled as WARNING: Highly experimental!!! and do not cover all supported operating systems. So to answer @krisskross's question on how to explore this further, I'd suggest someone add a very simple test case that we can exercise via the existing Linux, OS X and Windows CI environments. It doesn't need to interact with LMDB; just interacting with any convenient C API that wants a byte[] address would be sufficient. If they work, we can figure out the most elegant way to support such use via LmdbJava. If they don't work, we'd at least have a starting test case for properly conversing with the JNR-FFI developers about it.

@benalexau benalexau reopened this Jul 16, 2016
benalexau added a commit that referenced this issue Jul 16, 2016
@phraktle
Contributor Author

There are two, somewhat orthogonal issues here:

  1. could the API be refactored to support Dbi<byte[]> (and on-heap ByteBuffer)
  2. could this support be made more performant than copying back/forth in client code

I'm not yet familiar enough with the current design to comment on how to go about this. (Also, running into #10 seems to indicate that even the current buffer handling could use some love).

@benalexau
Member

I have logged jnr/jnr-ffi#68 seeking suggestions on how we might be able to use a byte[] on the C side without copying.

@phraktle
Contributor Author

phraktle commented Jul 16, 2016

For reference, here's what I ended up doing in the client code for now. There's a reserved buffer for keys, reused for all operations (get, put, etc.). There's an assumption of low thread contention, hence the use of an AtomicReference is justified. Getting the buffer is just a CAS. Releasing it uses lazySet, so no memory barrier is applied. The put also uses Dbi#reserve so there's no extra allocation there either.

Overall, this yields a pretty good performance profile. These tricks could be encapsulated within a Dbi<byte[]>.

Note: a ThreadLocal buffer cache would work just as well (no contention, but a bit more memory).

    private final AtomicReference<ByteBuffer> bufferCache = new AtomicReference<>();

    private ByteBuffer acquireKeyBuffer(byte[] key) {
        if (key.length > env.getMaxKeySize()) {
            throw new IllegalArgumentException("Key too long " + key.length);
        }
        ByteBuffer buf = bufferCache.getAndSet(null);
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(env.getMaxKeySize());
        }
        buf.put(key);
        buf.flip();
        return buf;
    }

    private void releaseKeyBuffer(ByteBuffer bb) {
        bb.clear();
        bufferCache.lazySet(bb);
    }

    public void put(byte[] key, byte[] value) {
        ByteBuffer k = acquireKeyBuffer(key);
        try (Txn<ByteBuffer> tx = env.txnWrite()) {
            ByteBuffer v = db.reserve(tx, k, value.length);
            v.put(value);
            tx.commit();
        } finally {
            releaseKeyBuffer(k);
        }
    }

@krisskross
Contributor

That's pretty neat. I'm not against implementing this if it turns out JNR-FFI does not provide native support for byte arrays.

@benalexau
Member

One challenge with putting it in Dbi&lt;T&gt; is that if &lt;T&gt; is not a ByteBuffer there won't be a db.reserve(Txn, ByteBuffer, int) available.

I'm hopeful JNR-FFI offers something that would allow ByteBufferProxy.in(..) to support both heap and direct buffers without a copy. Even if that's constrained to specific operating systems, we can fall back to copying if needed or requested (ie via a system property).

That would still mean direct ByteBuffers are exposed by Txn.key() and Txn.val() for read-related cases, and parametrization would be Dbi<ByteBuffer> (rather than Dbi<byte[]>), but hopefully it would be an acceptable middle-ground given the benefits (no new types, no new methods, final modifiers on Txn fields remain intact, zero copy for both reads and writes, logic is encapsulated in a single, low-level, package-protected compilation unit etc).

Let's wait and see what the JNR-FFI ticket yields.

@benalexau
Member

There's still no reply to the JNR-FFI issue, so it seems there's no obvious way to pass a byte[] address into MDB_val and lock its location. For reads we'd always have a direct ByteBuffer anyway, as LMDB always returns its memory-mapped file segment. As such I'm wondering what we should do with this ticket at a practical level? Would a ByteArraySupport class constructed with Txn<ByteBuffer> and offering some Dbi.put(..)-like methods be sufficient, or are there other ideas?

@phraktle
Contributor Author

phraktle commented Aug 9, 2016

One idea is to refactor the internals of lmdbjava, so that a Dbi<byte[]> API is feasible. It would have no reserve calls (e.g. UnsupportedOperationException or maybe pushing reserve down to an interface). The Dbi<byte[]> implementation internally could / should use reserve as above for efficient put calls, and it would make copies of buffers on get, etc.

(Note that the "use reserve when doing a put" trick could also be used in Dbi&lt;ByteBuffer&gt; if an on-heap ByteBuffer is passed in to put.)

@phraktle phraktle changed the title add byte[] BufferProxy support add Dbi<byte[]> support Aug 9, 2016
@krisskross
Contributor

Is this only for storing data or do you need byte arrays also for reading?

Instead of having a typed Dbi&lt;byte[]&gt; we could implement separate methods that accept (only) byte arrays, and continue to disallow ByteBuffers backed by byte arrays?

@phraktle
Contributor Author

phraktle commented Aug 10, 2016

This entire request is mostly about convenience of migrating from other DB APIs, such as leveldbjni, lmdbjni, RocksJava, etc. all of which use on-heap byte[] by default (and we have an internal abstraction layer above these that makes it easy to switch the storage engine).

It is also a safety concern: the on-heap copies remain valid after leaving the scope of the Txn, which is what users of most such APIs are used to, as opposed to the off-heap buffers in LMDB (any access to those outside the Txn scope could crash the JVM).

Of course I was able to make things work with lmdbjava not supporting on-heap arrays (with some extra cycles spent on surprises, such as Dbi<ByteBuffer> not actually working with any ByteBuffer, just off-heap ones). But as a general API user, it would feel much more natural to be able to use Dbi<byte[]> and only start dealing with off-heap ByteBuffers and riskier lifecycles when you need the extra performance.

@krisskross
Contributor

Why not copy the ByteBuffer into a byte array in the context of the Txn while jnr/jnr-ffi#68 is being resolved? This is what we would do internally within lmdbjava if the JNR-FFI issue is not resolved in our favour.

@krisskross
Contributor

I'm not sure JNR-FFI or Project Panama will do any better than copying memory anyway, after watching the JCrete conference [1] with comments from Cliff Click saying that JNR-FFI is just a convenience API that will not improve performance by any means.

[1] https://www.youtube.com/watch?v=Pel8YZlTtc8

@phraktle
Contributor Author

phraktle commented Aug 11, 2016

As I said, I was able to use the off-heap-only Dbi<ByteBuffer> API, writing the wrapper that maps to on-heap arrays. To summarize my key points:

  1. several other libraries provide a byte[] API. using on-heap arrays is a reasonable approach, not something that should be frowned upon :)
  2. for someone migrating from these storage APIs or using on-heap data, it would be convenient to simply get a Dbi<byte[]> provided by lmdbjava
  3. Dbi<byte[]> can be made quite performant (recycle key buffers, use reserve on put) – but it's not quite straightforward, so it would be better if the library captured these practices.

@krisskross
Contributor

I don't think it's worth the effort to allow for Dbi&lt;byte[]&gt;, but how about creating parallel methods suffixed with ByteArray, like byte[] getByteArray(byte[] key) and void putByteArray(byte[] key, byte[] val)?

@phraktle
Contributor Author

How about extracting the methods of Dbi into a super-interface instead, so there could be an alternate Dbi&lt;byte[]&gt; implementation that wraps a Dbi&lt;ByteBuffer&gt; with the tricks outlined above and implements all methods (with the exception of reserve)?
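One possible shape of that extraction (the names here are hypothetical, not the actual lmdbjava API):

```java
// Hypothetical super-interface holding the operations common to all Dbi flavours.
public interface KeyValueDb<T> {
  T get(Txn<T> txn, T key);
  boolean put(Txn<T> txn, T key, T val);
  boolean delete(Txn<T> txn, T key);
}

// Dbi<ByteBuffer> would implement KeyValueDb<ByteBuffer> (keeping reserve as
// an extra method), while a byte[] adapter implements KeyValueDb<byte[]> by
// delegating to a wrapped Dbi<ByteBuffer>: copying keys into a recycled
// direct buffer on the way in, and copying LMDB-owned memory out into fresh
// arrays on the way out. reserve is simply absent from the shared interface.
```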

@krisskross
Contributor

Sounds doable. I haven't looked at the code for a while so I need to refresh my memory. Let me get back after taking a stab at it.

@krisskross
Contributor

@benalexau does this approach sound good to you?

@benalexau
Member

Sounds fine to me. I'd prefer a Dbi<byte[]> if we can get it, as it will appeal to those who prefer byte[] (even if that means some refactoring to Dbi and/or BufferProxy).

@krisskross
Contributor

Agreed. I did not envision the solution that @phraktle mentioned, but it feels cleaner than poking around with BufferProxy. I'll see if I can take a stab at it during the weekend.

@krisskross krisskross self-assigned this Aug 11, 2016
@krisskross
Contributor

I have been poking around with different approaches that share Dbi&lt;T&gt;, and they all boil down to extracting interfaces and having double implementations of Env, Dbi, Txn, Cursor and CursorIterator, where the byte array implementations delegate to the ByteBuffer implementations. 10 extra classes and interfaces. We would need to add tests as well.

Is this what we want? Makes me wonder if such an implementation would be more appropriate in a separate project to avoid the extra overhead in lmdbjava?

@benalexau
Member

10 types is a lot, but then again, byte arrays seem core enough to try to find a solution. Is your code at a point where you could commit it to a branch for us to take a look at (there might be another pattern we could collectively identify, like a template or delegate etc.)?

@krisskross
Contributor

Maybe we should allow ByteBuffer, DirectBuffer and ByteBuf wrapping a byte array instead? It's a better API than exposing raw byte arrays.

@krisskross
Contributor

No prototype ready yet. I was only messing around to see the effect in the code base.

@krisskross
Contributor

Sorry, but I cannot find a clean solution for putting byte arrays and buffers under the same roof. It seems like every solution causes either an explosion of classes or of code paths, including the tests to cover them.

But we could provide a ByteArrays helper class that constructs a direct ByteBuffer from a byte array. It would be somewhat more verbose for users dealing with byte arrays, but it feels like a good middle ground between internal complexity/verbosity and external convenience.

Opinions?

@benalexau
Member

Given you've been thinking about this: if, hypothetically, JNINativeInterface (or some C library that uses JNI to achieve the same ends) could be made to lock a byte[], obtain its address and submit it to LMDB functions, would this permit a natural-looking Dbi&lt;byte[]&gt; etc.? If so, maybe we should have a play around with JNINativeInterface?

@phraktle
Contributor Author

I'm not familiar enough with the codebase yet to have deep intuitions here. But I'm guessing it would be easier to address one concern at a time: 1) add support for on-heap ByteBuffer (copy on in/out); 2) add the same for byte[]; 3) make it more performant (recycling buffers, reserve on put).

Why are duplicate implementations of all the classes you listed required? I imagine, for example, CursorIterator should work with any Cursor.

@krisskross
Contributor

The problem is how to get data out of BufferProxy, since we cannot modify the byte array. But if we wrap buffers/arrays in a Holder&lt;T&gt; class (cached by Txn) we have a chance to copy the byte array out. This may enable us to get Dbi&lt;byte[]&gt; after all.

Does this sound doable?

@krisskross
Contributor

I just quickly hacked something together, and it seems that using a Holder class does work, with quite minimal changes isolated inside a BufferProxy&lt;byte[]&gt;.

A new direct ByteBuffer is created on the way in and the byte array contents are copied before setting the pointer address. Similarly, on the way out a byte array is created and the pointer's contents are copied.

There are probably other ways to do it -- but this might be one way forward. Let me know what you think.

public class ByteArrayProxy extends BufferProxy<byte[]> {

  @Override
  protected Holder<byte[]> allocate() {
    return new Holder<>(new byte[0]);
  }

  @Override
  protected void deallocate(Holder<byte[]> buff) {
    // no-op: byte arrays are reclaimed by the GC
  }

  @Override
  protected void in(Holder<byte[]> buffer, Pointer ptr, long ptrAddr) {
    ByteBuffer tmp = ByteBuffer.allocateDirect(buffer.get().length);
    tmp.put(buffer.get());
    tmp.flip();
    UNSAFE.putLong(ptrAddr + STRUCT_FIELD_OFFSET_SIZE, tmp.remaining());
    UNSAFE.putLong(ptrAddr + STRUCT_FIELD_OFFSET_DATA, address(tmp));
  }

  @Override
  protected void in(Holder<byte[]> buffer, int size, Pointer ptr, long ptrAddr) {
    // no-op: reserve() is not supported for byte arrays
  }

  @Override
  protected void out(Holder<byte[]> buffer, Pointer ptr, long ptrAddr) {
    final long addr = UNSAFE.getLong(ptrAddr + STRUCT_FIELD_OFFSET_DATA);
    final long size = UNSAFE.getLong(ptrAddr + STRUCT_FIELD_OFFSET_SIZE);
    byte[] bytes = new byte[(int) size];
    buffer.wrap(bytes);
    org.agrona.UnsafeAccess.UNSAFE.copyMemory(null, addr, bytes, BufferUtil.ARRAY_BASE_OFFSET, size);
  }

  protected final long address(final ByteBuffer buffer) {
    return ((sun.nio.ch.DirectBuffer) buffer).address() + buffer.position();
  }
}

@benalexau
Member

This looks encouraging. It requires a copy on the way in and the way out, but on the bright side it would mean an idiomatic API for those using byte[] and a convenient place to localise JNINativeInterface use for the in case.

It seems unlikely we'll ever be able to offer a copy-free byte[] for read cases, as this would require somehow remapping a byte[] to an arbitrary off-heap memory address. Therefore I'd expect users generally preferring byte[] will still have situations where they want to use a direct ByteBuffer for their critical paths.

We can only open an MDB_env once per process due to the LMDB C API restriction:

Do not have open an LMDB database twice in the same process at the same time. Not even from a plain open() call - close()ing it breaks flock() advisory locking.

Our current Env<T> forces all buffer types in a given JVM process to be identical, therefore preventing a mixture of ByteBuffer and byte[]. Perhaps we need to add a <T> Env.txnRead(BufferProxy<T>):Txn<T> (and equivalent txnWrite) so people can mix them together?
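Sketched as signatures (hypothetical; not in the current API):

```java
// Hypothetical Env additions letting the buffer type vary per transaction,
// so a single process could mix Dbi<ByteBuffer> and Dbi<byte[]> access.
public <T> Txn<T> txnRead(final BufferProxy<T> proxy);  // read-only txn typed by proxy
public <T> Txn<T> txnWrite(final BufferProxy<T> proxy); // read-write txn typed by proxy
```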

@krisskross
Contributor

krisskross commented Aug 17, 2016

Indeed, and if JNR-FFI gains better means of handling byte arrays we could easily adopt those. I'm guessing it's only right that ByteArrayProxy uses JNR-FFI for safety reasons?

Here's the refactored JNR-FFI version of it.

public class ByteArrayProxy extends BufferProxy<byte[]> {
  private static final MemoryManager MEM_MGR = RUNTIME.getMemoryManager();

  @Override
  protected Holder<byte[]> allocate() {
    return new Holder<>(new byte[0]);
  }

  @Override
  protected void deallocate(Holder<byte[]> buff) {
    // no-op: byte arrays are reclaimed by the GC
  }

  @Override
  protected void in(Holder<byte[]> buffer, Pointer ptr, long ptrAddr) {
    byte[] bytes = buffer.get();
    Pointer pointer = MEM_MGR.allocateDirect(bytes.length);
    pointer.put(0, bytes, 0, bytes.length);
    ptr.putLong(STRUCT_FIELD_OFFSET_SIZE, bytes.length);
    ptr.putLong(STRUCT_FIELD_OFFSET_DATA, pointer.address());
  }

  @Override
  protected void in(Holder<byte[]> buffer, int size, Pointer ptr, long ptrAddr) {
    // no-op: reserve() is not supported for byte arrays
  }

  @Override
  protected void out(Holder<byte[]> buffer, Pointer ptr, long ptrAddr) {
    final long addr = ptr.getLong(STRUCT_FIELD_OFFSET_DATA);
    final int size = (int) ptr.getLong(STRUCT_FIELD_OFFSET_SIZE);
    Pointer pointer = MEM_MGR.newPointer(addr, size);
    byte[] bytes = new byte[size];
    pointer.get(0, bytes, 0, size);
    buffer.wrap(bytes);
  }
}

Multiple buffer types seem a little artificial at the moment. I suggest we wait until we have a well-defined requirement?

@krisskross
Contributor

Note that I use MemoryManager for the memory allocation in in, since it may be smarter about it than ByteBuffer?

@krisskross
Contributor

If there aren't any objections I will try to make a PR during the weekend.

@benalexau
Member

A PR would be great.

Did you take a look at the various Pointer subclasses? There might be one that directly wraps a byte[] out of the box, in which case you could reasonably expect any future JNR-FFI optimisations (eg using JNINativeInterface) to be delivered via that abstraction. It seems preferable to client code performing the copy.

On the multiple buffer types per Env case, the issue is user projects will be forced to make a "big bang" switch to another buffer type when they find byte[] doesn't perform well enough for particular cases (due to the copy overhead). Nevertheless, I am happy to defer it until it's reported.

@krisskross
Contributor

Yes, I looked through the different Pointer types but could only find ArrayMemoryIO, which returns zero when calling pointer.address(). Same thing with a Pointer wrapping a ByteBuffer wrapping a byte array.

@benalexau
Member

@phraktle I merged @krisskross's #22 a week or so ago. Do you have any further thoughts on this ticket in light of that merge and the discussion in jnr/jnr-ffi#68?

@benalexau
Member

I'll close this given I assume folks are happy and there's not much more we can do without upstream JNR-FFI enhancements in any event. Please comment here if you'd like it re-opened.

3 participants