JSFive to support VBZ compression #26
I think it would be straightforward to add a very lightweight plugin mechanism for filters, where users can add a new function with a corresponding filter_id to a "user_filters" object, which would then get called in btree.js when running chunks through the data filter pipeline. Applications that use jsfive would then import both the jsfive and VBZ libraries, and register the VBZ decoder with jsfive before opening VBZ-compressed files.
See the separate_filters branch - I moved all the filters to a map that is exported, so that you can add a new filter to the list with hdf5.Filters.set(filter_id, my_filter), where my_filter: (buf: ArrayBuffer) => ArrayBuffer should act on the ArrayBuffer of each chunk.
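A minimal sketch of that registration mechanism, under stated assumptions: `Filters` here stands in for the map exported by the separate_filters branch (`hdf5.Filters`), 32020 is the HDF5-registered filter id for Nanopore's VBZ plugin, and the decoder body is a stub rather than a real VBZ implementation.

```typescript
// Sketch of the user-filter registration mechanism described above.
// "Filters" stands in for jsfive's exported hdf5.Filters map.
type FilterFn = (buf: ArrayBuffer, itemsize: number) => ArrayBuffer;

const Filters = new Map<number, FilterFn>();

// 32020 is the HDF5-registered filter id for Nanopore's VBZ plugin.
const VBZ_FILTER_ID = 32020;

function vbzDecode(buf: ArrayBuffer, itemsize: number): ArrayBuffer {
  // A real implementation would call a VBZ decoder (for example a
  // WASM build of vbz_compression); returning the input is a stub.
  return buf;
}

Filters.set(VBZ_FILTER_ID, vbzDecode);
```

In the real library the registration call would be `hdf5.Filters.set(VBZ_FILTER_ID, vbzDecode)` before any VBZ-compressed file is opened.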
Thanks for this - we're currently investigating how to integrate this into our code. We will let you know how we get on.
Hi @bmaranville, could you please confirm how the chunking works inside jsfive? Would an entire read be sent as one chunk, or only parts of a read?
Sure - in HDF5 the filter pipeline operates on individual chunks, which are defined when the dataset is created; the chunk size is encoded in the file for that dataset. In jsfive the filter functions are called with two arguments: the chunk ArrayBuffer and the byte size of an individual data element (the latter is needed for the HDF5 predefined shuffle filter). The filters are applied to a chunk sequentially, in the order they are listed in the filter pipeline, and later the data buffer is constructed by concatenating the filtered chunks. The size of the chunk's buffer is not passed to the filter functions, but is available through inspection (buf.byteLength). The output buffer is usually not the same length as the input buffer. Here is the TypeScript signature of a filter function:

```typescript
type FilterFn = {
  (buf: ArrayBuffer, itemsize: number): ArrayBuffer;
};
```
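To make the sequencing concrete, here is a sketch (an assumption about the control flow, not jsfive's actual source) of how a pipeline of such filters would run over one chunk: each filter transforms the chunk's ArrayBuffer in pipeline order, and a filter's output may differ in length from its input.

```typescript
// Sketch of per-chunk filter application, in pipeline order.
type FilterFn = (buf: ArrayBuffer, itemsize: number) => ArrayBuffer;

function applyPipeline(
  chunk: ArrayBuffer,
  pipeline: FilterFn[],
  itemsize: number,
): ArrayBuffer {
  let buf = chunk;
  for (const filter of pipeline) {
    // Each filter may return a buffer of a different length,
    // e.g. a decompressor inflates the chunk.
    buf = filter(buf, itemsize);
  }
  return buf;
}
```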
Hi, thanks for this - we've managed to get it working with our VBZ decompression method, but with hardcoded compression options. Is there a way to view the compression options on a particular read, so that we can feed them into our decompression method?
There is an array of "client_data" encoded in the filter settings, and I think it needs to be passed to the filter functions. None of the filters I had seen so far used it, but I think VBZ might. I added it to the end of the function call, so if you pull the latest version from the separate_filters branch, your filter will now be called on each chunk as follows (where client_data is an array of integers):

```js
if (Filters.has(filter_id)) {
  buf = Filters.get(filter_id)(buf, itemsize, client_data);
}
```
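A hedged sketch of a filter that consumes those per-dataset options: the client_data layout below (version first, element size second) is invented purely for illustration; VBZ's real client_data layout should be taken from the vbz_compression plugin documentation.

```typescript
// Hypothetical filter that reads compression options from client_data,
// the integer array stored in the HDF5 filter settings.
type FilterFn = (
  buf: ArrayBuffer,
  itemsize: number,
  client_data: number[],
) => ArrayBuffer;

const vbzWithOptions: FilterFn = (buf, itemsize, client_data) => {
  // Invented unpacking, for illustration only.
  const [version, integerSize] = client_data;
  if (integerSize !== itemsize) {
    throw new Error("client_data element size disagrees with dataset itemsize");
  }
  // ...hand version and the other options to the real decoder;
  // this stub returns the input unchanged.
  return buf;
};
```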
That's brilliant, thank you. We believe everything is working; would it be possible to publish the latest changes to npm when you're ready, please?
I will merge and publish a new version soon. |
Version 0.3.8 was published just now, with support for user filters. |
As part of our project at Oxford Nanopore Technologies, we need to read fast5 files that are VBZ-compressed. We would like to extend the jsfive project to support VBZ compression.
Any support or guidance would be appreciated, thank you.