Improving per-chunk latency #72

Open
chris-allan opened this issue Aug 11, 2020 · 3 comments

chris-allan commented Aug 11, 2020

Hope you are well @axtimwalde, @igorpisarev.

First, some background.

We have a few projects now where we are using N5. The most obvious open source example is https://github.com/glencoesoftware/bioformats2raw. These projects are:

  • Cross platform (predominantly Windows and Linux)
  • Predominantly multi-threaded single node
  • Often executing on recent, high clock rate CPUs with local NVMe SSD storage
  • Heavy on I/O, light on compute

Several of our users are utilizing bioformats2raw to convert 100s of TBs of whole slide imaging data either to an N5/Zarr intermediate or in a pipeline with pyramidal OME-TIFF (via https://github.com/glencoesoftware/raw2ometiff) as the end goal. Consequently, time to conversion as well as resource utilization matter greatly to them; milliseconds matter.

While evaluating throughput we came across a few design decisions in N5 that we'd like to validate:

  1. The use of file locking

```java
final OpenOption[] options = readOnly
        ? new OpenOption[]{StandardOpenOption.READ}
        : new OpenOption[]{StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE};
channel = FileChannel.open(path, options);
for (boolean waiting = true; waiting;) {
    waiting = false;
    try {
        channel.lock(0L, Long.MAX_VALUE, readOnly);
    } catch (final OverlappingFileLockException e) {
        waiting = true;
        try {
            Thread.sleep(100);
        } catch (final InterruptedException f) {
            waiting = false;
            Thread.currentThread().interrupt();
        }
    } catch (final IOException e) {}
}
```

Under load, particularly on Windows, we have seen several scenarios where file locking consumes the largest share of time during chunk reads and writes.

As N5 and Zarr are quite similar, we have been operating under the concurrency assumptions outlined here:

Is it N5's desire to offer concurrency guarantees beyond those currently offered by Zarr? Would you accept PRs that remove or make file locking optional?
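
For illustration, an opt-out could look roughly like the sketch below; the `lockFiles` flag is an assumption on our part, not an existing N5 option:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch only: a channel opener whose file locking can be disabled.
// The lockFiles flag is hypothetical and not part of the current N5 API.
class OptionalLockChannels {

    private final boolean lockFiles;

    OptionalLockChannels(final boolean lockFiles) {
        this.lockFiles = lockFiles;
    }

    FileChannel open(final Path path, final boolean readOnly) throws IOException {
        final OpenOption[] options = readOnly
                ? new OpenOption[]{StandardOpenOption.READ}
                : new OpenOption[]{StandardOpenOption.READ, StandardOpenOption.WRITE, StandardOpenOption.CREATE};
        final FileChannel channel = FileChannel.open(path, options);
        if (!lockFiles)
            return channel; // caller accepts Zarr-style last-writer-wins semantics

        // Same retry loop as the snippet above, entered only when locking is enabled.
        for (boolean waiting = true; waiting;) {
            waiting = false;
            try {
                channel.lock(0L, Long.MAX_VALUE, readOnly);
            } catch (final OverlappingFileLockException e) {
                waiting = true;
                try {
                    Thread.sleep(100);
                } catch (final InterruptedException f) {
                    waiting = false;
                    Thread.currentThread().interrupt();
                }
            }
        }
        return channel;
    }
}
```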

  2. Checks for the existence of files

```java
final Path path = Paths.get(basePath, getDataBlockPath(pathName, gridPosition).toString());
if (!Files.exists(path))
    return null;
```

Similar to [1], we have seen scenarios where checking for file existence takes longer than the actual read/write. Would you accept PRs that change the semantics surrounding missing chunks?
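
For example, a minimal sketch of an exception-based approach, assuming a missing chunk should simply yield null:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Sketch: fold the existence check into the read itself. Opening a missing
// file throws NoSuchFileException, so the separate Files.exists() round trip
// to the filesystem can be dropped.
static byte[] readChunkOrNull(final Path path) throws IOException {
    try {
        return Files.readAllBytes(path);
    } catch (final NoSuchFileException e) {
        return null; // same "missing chunk" semantics, one filesystem call fewer
    }
}
```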

  3. Just-in-time allocation

Again, similar to [1], we have seen several scenarios where memory allocation, array copying, and GC pressure dominate the runtime. Would you accept PRs that allow API consumers to perform their own memory management, perhaps using pre-allocated DataBlock instances, and allow DataBlock<byte> to be used for any data type to avoid copies to and from typed Java arrays?
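
As a sketch of what caller-managed memory might look like (the readInto method and surrounding loop are hypothetical, not current N5 API):

```java
import java.nio.ByteBuffer;
import java.util.List;

// Sketch: one direct buffer allocated up front and refilled per chunk,
// instead of a fresh typed array per DataBlock.
class PreallocatedReadLoop {

    // Hypothetical reader interface that fills a caller-owned buffer in place.
    interface ChunkReader {
        void readInto(long[] gridPosition, ByteBuffer target);
    }

    static void processAll(final ChunkReader reader, final List<long[]> positions, final int blockSizeInBytes) {
        final ByteBuffer chunk = ByteBuffer.allocateDirect(blockSizeInBytes);
        for (final long[] gridPosition : positions) {
            chunk.clear();                        // reuse, don't reallocate
            reader.readInto(gridPosition, chunk); // fill in place
            chunk.flip();
            // ... hand the filled buffer to the consumer without copying
        }
    }
}
```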

Thanks!

/cc @joshmoore, @kkoz, @melissalinkert

axtimwalde (Collaborator) commented Aug 11, 2020

> Is it N5's desire to offer concurrency guarantees beyond those currently offered by Zarr? Would you accept PRs that remove or make file locking optional?

Fine with me; we wanted to avoid partially overlapping reads and writes of blocks and metadata. A writer with file locking optionally turned off would be a welcome PR.

> Similar to [1], we have seen scenarios where checking for file existence takes longer than the actual read/write. Would you accept PRs that change the semantics surrounding missing chunks?

Depends. What would the semantics be? If you mean failing by catching the equivalent exceptions, and that being faster than the exists check, I am all ears. If it means a failure mode that does not distinguish between files that are locked and files that do not exist, I am against it. On what file systems are these issues relevant? I am hesitant to give up guarantees to speed up operation on some niche Windows file system that will never be used in HPC environments. I usually find that single-workstation use is well served by HDF5, for which we have an N5 driver.

> Again, similar to [1], we have seen several scenarios where memory allocation, array copying, and GC pressure dominate the runtime. Would you accept PRs that allow API consumers to perform their own memory management, perhaps using pre-allocated DataBlock instances, and allow DataBlock<byte> to be used for any data type to avoid copies to and from typed Java arrays?

Absolutely! This is a left-over from the very first day and a thorn in my side. For compressed data, however, the advantages can quickly fizzle out because the existing compressors often create byte arrays from byte arrays. I still think that converting the DataBlock API from <T> to ByteBuffer can offer some advantages. This includes Buffer-backed pixel access in ImgLib2.
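
For illustration, a ByteBuffer-backed block interface might look roughly like this (names are illustrative, not a committed design):

```java
import java.nio.ByteBuffer;
import java.nio.ShortBuffer;

// Sketch: a block that carries one untyped payload; typed views are created
// on demand rather than copied into typed Java arrays.
interface BufferBackedBlock {

    long[] getGridPosition();

    ByteBuffer getData(); // raw bytes, regardless of data type

    // Typed access without copying, e.g. for 16-bit data:
    default ShortBuffer asShortBuffer() {
        return getData().asShortBuffer();
    }
}
```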

chris-allan (Author) commented Aug 11, 2020

> Fine with me; we wanted to avoid partially overlapping reads and writes of blocks and metadata. A writer with file locking optionally turned off would be a welcome PR.

👍

We'll get on that.

> Depends. What would the semantics be? If you mean failing by catching the equivalent exceptions, and that being faster than the exists check, I am all ears.

Definitely this. I was concerned that the original intent may have been to prevent a read/write from happening at all for some particular reason.

> On what file systems are these issues relevant? I am hesitant to give up guarantees to speed up operation on some niche Windows file system that will never be used in HPC environments. I usually find that single-workstation use is well served by HDF5, for which we have an N5 driver.

NTFS on Windows 10. It is also a problem when working with network filesystems.

We definitely don't consider Windows niche. A sizeable portion of our user base works on fat Windows workstations. Being able to use the same data layout at the workstation, local HPC (parallel filesystem), and object storage levels is hugely beneficial.

With respect to the current structure of N5 backends, there is the separate issue of composability, which probably goes beyond what I wanted to raise in this issue. For example, we'd really like to be able to use n5-zarr with object storage. Have you thought about refactoring N5 along the lines of what has been discussed in zarr-developers/zarr-python#540?

> Absolutely! This is a left-over from the very first day and a thorn in my side. For compressed data, however, the advantages can quickly fizzle out because the existing compressors often create byte arrays from byte arrays. I still think that converting the DataBlock API from <T> to ByteBuffer can offer some advantages. This includes Buffer-backed pixel access in ImgLib2.

👍

Great. I'll give it some thought and try to put something together for review.

Edit: Forgot one comment.

axtimwalde (Collaborator) commented Aug 11, 2020

> NTFS on Windows 10. It is also a problem when working with network filesystems.
>
> We definitely don't consider Windows niche. A sizeable portion of our user base works on fat Windows workstations. Being able to use the same data layout at the workstation, local HPC (parallel filesystem), and object storage levels is hugely beneficial.

Absolutely, I should have skipped that comment. However, my main concern is that not every file system is well suited to the N5/Zarr approach: minimum block sizes, limits on the number of files per directory, ls speed, etc. can all get in your way. I therefore strongly believe that HDF5 files are an excellent solution for data that fits on big workstations, which often run Windows. The transfer into cloud land can then be performed via copy-conversion with tools such as n5-copy. The API of the consuming code remains N5, regardless of whether it uses HDF5 or another N5 backend. I find this superior to guaranteeing data compatibility by making everything look like a filesystem. That is why I am suggesting caution when aiming for performance on platforms that may not be a relevant target.

> With respect to the current structure of N5 backends, there is the separate issue of composability, which probably goes beyond what I wanted to raise in this issue. For example, we'd really like to be able to use n5-zarr with object storage. Have you thought about refactoring N5 along the lines of what has been discussed in zarr-developers/zarr-python#540?

Yes, we have had this discussion. File-format filters and storage primitives should be decoupled. We haven't done it yet.
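
For illustration only, such a decoupling could put a minimal key-value storage primitive underneath the format layer, so that N5 or Zarr semantics sit on top of a filesystem, object store, or anything else (a sketch, not the planned design):

```java
import java.io.IOException;

// Sketch: the format layer (N5, Zarr) speaks only to this primitive;
// implementations could be a local filesystem, S3, GCS, etc.
interface KeyValueStore {

    byte[] read(String key) throws IOException;

    void write(String key, byte[] value) throws IOException;

    boolean exists(String key) throws IOException;

    void remove(String key) throws IOException;
}
```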
