#viewsetups >100 #26

Closed
martinschorb opened this issue Jul 1, 2020 · 10 comments

Comments

@martinschorb (Contributor)

Hi,

the HDF5 structure does not support ViewSetup IDs with more than two digits.

Would N5 support it? I can see the setupN directory structure. Is there a limitation to the digits of N?

BTW: Where can I find the elf package? I cannot get pybdv's n5 conversion to work.

@constantinpape (Owner)

> the HDF5 structure does not support ViewSetup IDs with more than two digits.
> Would N5 support it? I can see the setupN directory structure. Is there a limitation to the digits of N?

Yes, the n5 structure supports an arbitrary number of setups. To make it work in pybdv I would need to change this check so it is only triggered for hdf5:
https://github.com/constantinpape/pybdv/blob/master/pybdv/converter.py#L90-L92
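
For illustration, a minimal sketch of what an hdf5-only version of that check could look like (the function and argument names here are made up for the example and do not match the actual converter.py code):

```python
# Illustrative sketch only; names do not match the actual pybdv code.
# Idea: the legacy bdv.hdf5 layout encodes setups as 'sXX', so it can only
# address setup ids 0-99, while the n5 layout ('setupN') has no such limit.
def validate_setup_id(setup_id, is_h5):
    if setup_id < 0:
        raise ValueError("Setup id must be non-negative, got %i" % setup_id)
    if is_h5 and setup_id >= 100:
        raise ValueError("The bdv.hdf5 layout only supports setup ids 0-99; "
                         "use the n5 format for more view setups.")
```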

> BTW: Where can I find the elf package? I cannot get pybdv's n5 conversion to work.

You can find it here; but maybe I shouldn't depend on it in pybdv. I will check if it's easy to replace later.
https://github.com/constantinpape/elf

@martinschorb (Contributor, Author) commented Jul 2, 2020

OK, cool,

then I will just use n5 if there are more than 100 ViewSetups.

elf is a rather ambiguous package name; I did some searching and just could not find the right one... Maybe rename it. And yes, the obvious place is the one I did not check...

@constantinpape (Owner)

I have updated this:

  • elf is fully optional now and you can convert to n5 as long as you have z5py in your env (conda install -c conda-forge z5py)
  • I have disabled the check for number of setup ids for n5.
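
A rough usage sketch of the n5 conversion (this assumes pybdv's make_bdv entry point with a setup_id keyword and that the output format is inferred from the file extension; check the pybdv README for the exact signature):

```python
# Rough sketch: write several view setups to the n5 layout.
# Assumes z5py is installed (conda install -c conda-forge z5py).
import numpy as np
from pybdv import make_bdv

data = np.random.rand(32, 256, 256).astype("float32")

# setup ids of 100 and above only work with the n5 layout
for setup_id in range(100, 110):
    make_bdv(data, "volume.n5", setup_id=setup_id)
```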

@martinschorb (Contributor, Author)

Hi,

this seems to work.

The conversion to n5 with the default chunk size (that is 64, correct?) takes ages. I guess this is because the group share file servers are not optimized for dealing with so many small files...?

Is there any mechanism in n5 (or similar other data format) that would overcome this?

@martinschorb (Contributor, Author)

I just found that even reading the chunks seems very slow from the group shares as compared to h5.
Is there some hybrid format, or would you just increase the chunk size?

@constantinpape (Owner)

> I just found that even reading the chunks seems very slow from the group shares as compared to h5.
> Is there some hybrid format, or would you just increase the chunk size?

Normally h5 and n5 should be more or less the same speed; could you post the h5 and n5 files where you observed this, the exact environment you used, and the access pattern?
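
If it helps, something along these lines could serve as a minimal read benchmark (the dataset keys assume the standard bdv layouts, 't00000/s00/0/cells' for h5 and 'setup0/timepoint0/s0' for n5, and the file names are placeholders):

```python
# Minimal read-timing sketch; adjust file names and dataset keys to your data.
import time
import h5py
import z5py

def time_read(file_constructor, path, key, bb):
    f = file_constructor(path, "r")
    t0 = time.time()
    _ = f[key][bb]  # read one block with the given bounding box
    return time.time() - t0

bb = (slice(0, 1), slice(0, 512), slice(0, 512))  # a single 2d tile
print("h5:", time_read(h5py.File, "bdv_LLP.h5", "t00000/s00/0/cells", bb))
print("n5:", time_read(z5py.File, "bdv_LLP.n5", "setup0/timepoint0/s0", bb))
```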

@martinschorb (Contributor, Author)

can you see /g/emcf/schorb/data/BDV/montages/LLP_001/bdv_LLP ?

Both contain the same data; the n5 version took about 5x as long to create using the current master commit.

When loading in BDV, h5 appears instantaneously while N5 takes >20 s until reaching the stage where bdv-playground considers the data loaded and performs the centering and auto-contrast. This is in a VM, so IO to the group share should be comparable.

@martinschorb (Contributor, Author)

That's with the default chunk size (64, 64, 64). It gets a bit better when setting the chunks to something like (1, 512, 512) instead.

@constantinpape (Owner)

Ok, I had a look at the data. Indeed I also see quite a big difference in the loading speed.
However, this data is 2d, so (1, 64, 64) chunks are tiny! I would definitely go with (1, 512, 512).

In my experience, the overhead of reading individual chunks from the file system is not too large; at some point, though, it does become problematic.

I measured this on the Janelia distributed file system once, and there it wasn't a big problem as long as chunks were around 64^3 in size; we should measure this at EMBL as well to determine a good minimal chunk size.

In any case, for 2d data I would always go with at least (1, 512, 512) chunks.
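
To put numbers on it: an uncompressed (1, 64, 64) uint8 chunk is only 4 KB on disk, while (1, 512, 512) gives 256 KB per chunk and 64x fewer files per plane, so the per-file overhead on a network share matters much less. As a sketch (assuming make_bdv accepts a chunks keyword), the conversion for 2d data could look like:

```python
# Sketch: write a single-plane (2d) volume with 2d-friendly chunks,
# assuming pybdv's make_bdv exposes a 'chunks' argument.
import numpy as np
from pybdv import make_bdv

plane = np.zeros((1, 4096, 4096), dtype="uint8")
make_bdv(plane, "montage.n5", chunks=(1, 512, 512))
```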

@constantinpape (Owner)

I closed this, feel free to reopen if this is still relevant.
