Problem description
Storage.write and SyncStorage.write take an InputStream as their data input.
Storage.read and SyncStorage.read expect the caller to pre-allocate a heap byte array into which data from LTS is copied.
This presents certain drawbacks:
- On the write side, copying from an InputStream may not be the most efficient way of writing to LTS. For example, there are optimized methods to copy from heap/direct memory into a FileChannel. For S3, there may be better ways than just providing an InputStream, and for HDFS, copying from an InputStream into an OutputStream is inefficient because it goes through an intermediate buffer.
- Since the source of the copy is a composition of memory buffers (Netty ByteBufs), we should investigate ways of transferring data directly from these buffers into the LTS API.
- On the read side, the situation is similar, though less severe: having to pre-allocate a byte array for every read adds allocation and copy overhead.
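To illustrate the write-side point, here is a minimal sketch of a gathering write from several buffers straight into a FileChannel, with no InputStream and no intermediate byte array. It uses only java.nio; the class and method names (GatheringWriteSketch, writeAll) are hypothetical and not part of the existing Storage code. Plain ByteBuffers stand in for the Netty ByteBufs, which can expose their contents as NIO buffers through ByteBuf.nioBuffers().

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not the actual file-system Storage implementation.
public class GatheringWriteSketch {

    /** Writes all buffers starting at the given file offset and returns the byte count. */
    static long writeAll(Path file, long offset, ByteBuffer... buffers) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            channel.position(offset);
            long total = 0;
            // A gathering write may be partial, so loop until every buffer is drained.
            while (hasRemaining(buffers)) {
                total += channel.write(buffers);
            }
            return total;
        }
    }

    private static boolean hasRemaining(ByteBuffer[] buffers) {
        for (ByteBuffer b : buffers) {
            if (b.hasRemaining()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lts-sketch", ".bin");
        // One heap buffer and one direct buffer, as a composite source might contain.
        ByteBuffer heap = ByteBuffer.wrap("hello ".getBytes(StandardCharsets.UTF_8));
        ByteBuffer direct = ByteBuffer.allocateDirect(5);
        direct.put("world".getBytes(StandardCharsets.UTF_8));
        direct.flip();

        long written = writeAll(file, 0, heap, direct);
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        System.out.println(written + " bytes written: " + content);
        Files.delete(file);
    }
}
```

Whether a similar direct path exists for S3 or HDFS depends on their client libraries and would need to be investigated separately.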
Problem location
Storage, SyncStorage and implementations.
Call sites of Storage.
Suggestions for an improvement
Add the following APIs and implement them:
SyncStorage.write(SegmentHandle handle, long offset, BufferView data) (similar in Storage)
BufferView SyncStorage.read(SegmentHandle handle, long offset, int length)
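The shape of the proposed calls can be sketched with a small in-memory stand-in. Everything below is hypothetical and only meant to show the intent: SimpleBufferView is a simplified placeholder for BufferView, InMemorySegment stands in for a SyncStorage implementation bound to a single SegmentHandle, and the append-only offset check is an assumption of this sketch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed API shape; not the real BufferView or SyncStorage.
public class BufferViewApiSketch {

    /** Simplified placeholder for BufferView: a read-only view over one or more buffers. */
    interface SimpleBufferView {
        int getLength();
        List<ByteBuffer> getContents();
    }

    /** Wraps the given buffers without copying their contents. */
    static SimpleBufferView wrap(ByteBuffer... parts) {
        final List<ByteBuffer> readOnly = new ArrayList<>();
        int length = 0;
        for (ByteBuffer p : parts) {
            readOnly.add(p.asReadOnlyBuffer());
            length += p.remaining();
        }
        final int total = length;
        return new SimpleBufferView() {
            public int getLength() {
                return total;
            }
            public List<ByteBuffer> getContents() {
                // Hand out duplicates so callers cannot disturb the view's positions.
                List<ByteBuffer> result = new ArrayList<>();
                for (ByteBuffer b : readOnly) {
                    result.add(b.duplicate());
                }
                return result;
            }
        };
    }

    /** Copies a view into a byte array; only needed by callers that require one. */
    static byte[] toByteArray(SimpleBufferView view) {
        ByteBuffer target = ByteBuffer.allocate(view.getLength());
        for (ByteBuffer part : view.getContents()) {
            target.put(part);
        }
        return target.array();
    }

    /** In-memory stand-in for a storage implementation bound to one segment. */
    static class InMemorySegment {
        private byte[] data = new byte[0];

        // Mirrors write(handle, offset, BufferView data); append-only is assumed here.
        void write(long offset, SimpleBufferView view) {
            if (offset != data.length) {
                throw new IllegalArgumentException("Bad offset: " + offset);
            }
            byte[] grown = Arrays.copyOf(data, data.length + view.getLength());
            ByteBuffer target = ByteBuffer.wrap(grown, data.length, view.getLength());
            for (ByteBuffer part : view.getContents()) {
                target.put(part);
            }
            data = grown;
        }

        // Mirrors BufferView read(handle, offset, length); the caller allocates nothing.
        SimpleBufferView read(long offset, int length) {
            if (offset < 0 || length < 0 || offset + length > data.length) {
                throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
            }
            return wrap(ByteBuffer.wrap(data, (int) offset, length));
        }
    }

    public static void main(String[] args) {
        InMemorySegment segment = new InMemorySegment();
        segment.write(0, wrap(ByteBuffer.wrap(new byte[]{1, 2}), ByteBuffer.wrap(new byte[]{3, 4})));
        SimpleBufferView view = segment.read(1, 2);
        System.out.println(Arrays.toString(toByteArray(view)));
    }
}
```

In this sketch, read returns a view over the storage's own array rather than filling a caller-supplied byte array, which is the allocation the read-side bullet is concerned about. A real implementation would still have to decide who owns and releases the returned buffer.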