Initial Zarr support #627

normanrz · 2022-03-03T20:19:20Z

Description:

Adds the StorageArray abstraction for WKW and Zarr
Renames block_len to chunk_size and file_len to chunks_per_shard, while keeping backwards-compatible support for the old names
Uses pathlib in more locations instead of string paths. This can be seen as a preparation for using upath later on.
Adds tests for Zarr datasets
Renames wkwResolutions to mags in datasource-properties.json for Zarr datasets. cc @fm3
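
A minimal sketch of how a backwards-compatible rename such as block_len to chunk_size could be handled. The function name `resolve_chunk_args`, the helper `_pick`, the plain-integer types, and the default value of 32 are assumptions made for illustration; they are not taken from this PR.

```python
import warnings
from typing import Optional, Tuple

# Assumed defaults, chosen for illustration only.
DEFAULT_CHUNK_SIZE = 32
DEFAULT_CHUNKS_PER_SHARD = 32


def _pick(
    new_name: str,
    new_value: Optional[int],
    old_name: str,
    old_value: Optional[int],
    default: int,
) -> int:
    """Return the effective value, preferring the new name but accepting the old one."""
    if old_value is not None:
        warnings.warn(
            f"{old_name} is deprecated, please use {new_name} instead.",
            DeprecationWarning,
            stacklevel=3,
        )
        # Passing both names with different values is ambiguous, so reject it.
        if new_value is not None and new_value != old_value:
            raise ValueError(f"Conflicting values for {new_name} and {old_name}.")
        return old_value
    return default if new_value is None else new_value


def resolve_chunk_args(
    chunk_size: Optional[int] = None,
    chunks_per_shard: Optional[int] = None,
    block_len: Optional[int] = None,  # deprecated alias of chunk_size
    file_len: Optional[int] = None,  # deprecated alias of chunks_per_shard
) -> Tuple[int, int]:
    """Hypothetical helper that maps old and new keyword names to one pair of values."""
    return (
        _pick("chunk_size", chunk_size, "block_len", block_len, DEFAULT_CHUNK_SIZE),
        _pick(
            "chunks_per_shard",
            chunks_per_shard,
            "file_len",
            file_len,
            DEFAULT_CHUNKS_PER_SHARD,
        ),
    )
```

With a helper like this, callers that still pass block_len or file_len keep working and get a DeprecationWarning, while new code uses chunk_size and chunks_per_shard.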

Issues:

fixes Add Zarr backend for Dataset #622

Todos:

Make sure to delete unnecessary points or to check all before merging:

Updated Changelog

normanrz · 2022-03-10T21:38:48Z

I reviewed everything except for the (huge) test file. Wouldn't it make sense to generalize the tests so that most of them simply run twice (once for wkw and once for zarr)? I think most of the tests don't deal with the implementation specifics. If some do, they could stay separate. What do you think?

I reworked test_dataset.py and dropped test_zarr_dataset.py, again.
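
For illustration, the "run each test once per format" idea could look roughly like the sketch below. It uses the standard library's unittest with subTest so that it runs without extra dependencies; in a pytest suite, `pytest.mark.parametrize` serves the same purpose. The names `DATA_FORMATS`, `create_array`, and `RoundTripTest` are hypothetical stand-ins and do not reflect the actual contents of test_dataset.py.

```python
import unittest

# The two storage formats discussed in this PR.
DATA_FORMATS = ["wkw", "zarr"]


def create_array(data_format: str) -> dict:
    """Hypothetical stand-in for creating a storage array of the given format."""
    if data_format not in DATA_FORMATS:
        raise ValueError(f"Unknown data format: {data_format}")
    return {"data_format": data_format, "voxels": {}}


class RoundTripTest(unittest.TestCase):
    def test_write_then_read(self) -> None:
        # The same format-agnostic check runs once per data format.
        for data_format in DATA_FORMATS:
            with self.subTest(data_format=data_format):
                array = create_array(data_format)
                array["voxels"][(0, 0, 0)] = 42
                self.assertEqual(array["voxels"][(0, 0, 0)], 42)


def run_suite() -> unittest.TestResult:
    """Run the test case programmatically and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RoundTripTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Tests that do depend on format-specific behavior would stay outside the loop as separate test methods.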

normanrz · 2022-03-10T21:40:40Z

I think the remaining points from your review are:

del mag1._array in test_dataset.py
data_format or array_format consistency
resolutions, magnifications/mags, multiscales

philippotto

Nice 👍 I think more of the tests could be generalized to work on wkw and zarr, but I understand that converting them can be quite a hassle and the returns are probably diminishing. So I don't have a strong opinion on that.

webknossos/tests/test_dataset.py

philippotto · 2022-03-11T10:47:37Z

del mag1._array in test_dataset.py

I replied to the corresponding comment.

data_format or array_format consistency

Is this something you need an opinion on? I don't really have a tendency here 🤷

resolutions, magnifications/mags, multiscales

I'd vote for mags, as it's mainly used in newer code, concise, and a bit clearer than the others. For example, dataset resolution could also mean the physical size of a voxel, which we usually (but not always) refer to as scale. I see that it's not necessarily the most common term in the industry, but there will always be naming differences between different organizations/projects (see output ~ artifacts ~ debris). Being consistent within our own ecosystem is what matters most in my opinion.

jstriebel

Awesome work 🎉
I added a couple of comments, but overall this looks great! On the open questions:

data_format or array_format consistency

Is this something you need an opinion on? I don't really have a tendency here 🤷

Also no strong opinion, but I'm tending a bit towards array_format, since array is more specific than data. (It also fits the …Array class renaming suggestion in the comments below.)

resolutions, magnifications/mags, multiscales

I'd vote for mags, as it's mainly used in newer code, concise, and a bit clearer than the others. For example, dataset resolution could also mean the physical size of a voxel, which we usually (but not always) refer to as scale. I see that it's not necessarily the most common term in the industry, but there will always be naming differences between different organizations/projects (see output ~ artifacts ~ debris). Being consistent within our own ecosystem is what matters most in my opinion.

Agreeing with @philippotto, but also no strong opinion here.

webknossos/script_collection/globalize_floodfill.py

webknossos/webknossos/annotation/annotation.py

webknossos/webknossos/dataset/storage.py

webknossos/webknossos/geometry/vec3_int.py

jstriebel

IMO this is good to go 🎉, just the respective Changelog entries are still needed. Leaving the final go to @philippotto 🏁

philippotto

Excellent 💯 Good to go from my side, too.

normanrz added 30 commits February 25, 2022 21:59

change from os.path to pathlib

a65e9b8

moar

9e2c386

relpath

cb70d21

Merge remote-tracking branch 'origin/master' into more-pathlib

b7be44f

fixes

ffd7ad5

fixes

3505808

adds wkw backend

1ffd585

stuff

6eaa0d0

fixes

c6e1e71

add zarr+numcodecs

6ee905d

typing

059147c

fixes

3c72f4a

fixes

6392f41

fixe

ed5320b

fixes

f5965ee

fixes

61477cb

fixes

3ec4d8e

fxies

a380c37

fixes

061a2fb

fixes

659967b

fixes

cf151a0

fixes

1a13d69

fixes

cf17185

fixes

a2ccfd0

fixes

d6decff

fixes

a62962f

fixes

4a9fef5

fixes

5c81d06

merged

682cc25

poetry.lock

cec6a89

normanrz added 2 commits March 10, 2022 22:00

formatting

3ec59fe

formatting

6d8df68

fixes

b6c434e

philippotto reviewed Mar 11, 2022

normanrz added 2 commits March 11, 2022 13:55

tests

cf16a81

rename resolutions -> mags

61782bc

fm3 mentioned this pull request Mar 14, 2022

Read Zarr Datasets scalableminds/webknossos#6019

Merged

27 tasks

jstriebel reviewed Mar 14, 2022

normanrz added 2 commits March 14, 2022 18:31

merged

6c9255a

pr feedback

eb17222

jstriebel reviewed Mar 15, 2022

normanrz added 2 commits March 15, 2022 12:29

Update Changelog.md

4b607ec

merge

2539a35

This was referenced Mar 15, 2022

Add cloud-storage support to Zarr for Datasets #651

Closed

Enable more compression methods for Zarr datasets #652

Open

Enable use of Zarr datasets in wkcuber #654

Closed

philippotto approved these changes Mar 16, 2022

normanrz added 3 commits March 16, 2022 13:30

use F order

b100bed

format

4b42afb

Merge branch 'master' into more-pathlib

70c939c

normanrz added the automerge label Mar 16, 2022

normanrz added 3 commits March 16, 2022 18:57

trigger ci

0656cb5

trigger ci

c4f2331

lint

3b936af

bulldozer-boy bot merged commit 04938de into master Mar 16, 2022

bulldozer-boy bot deleted the more-pathlib branch March 16, 2022 20:42

normanrz changed the title ~~ZarrStorageArray and refactorings~~ Initial Zarr support Mar 17, 2022
