Blockwise Carving Refactor #1365

Open
wants to merge 27 commits into
base: main

holstgr-kaust
Contributor

Blockwise Carving refactor of ilastik (requires ilastiktools/blockwise_carving and the latest lazyflow/h5-cache from holstgr-kaust). The segmentor is now built in multiple steps, processing the large filter and labels arrays in smaller chunks. Other functionality, such as exporting meshes and loading objects, is also done blockwise. The watershed labels are now cached as part of the project file. With these changes, ilastik can carve a large EM stack (4000x4000x1500) in less than 250 GB of memory.
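
To illustrate what processing in smaller chunks means for a stack of this size, here is a minimal sketch of enumerating block regions over a volume. It is not the code from this PR: `iter_block_rois` and `count_blocks` are hypothetical helper names, and the 256^3 block shape is only an assumed value for the example.

```python
from itertools import product


def iter_block_rois(shape, block_shape):
    """Yield (start, stop) index tuples that tile `shape` with blocks.

    Hypothetical helper for illustration; blocks on the upper border
    are clipped to the volume shape.
    """
    ranges = [range(0, dim, blk) for dim, blk in zip(shape, block_shape)]
    for start in product(*ranges):
        stop = tuple(min(s + b, d) for s, b, d in zip(start, block_shape, shape))
        yield start, stop


def count_blocks(shape, block_shape):
    """Count how many blocks are needed to cover `shape`."""
    return sum(1 for _ in iter_block_rois(shape, block_shape))


if __name__ == "__main__":
    volume_shape = (4000, 4000, 1500)  # EM stack size from the description
    block_shape = (256, 256, 256)      # assumed block shape
    print(count_blocks(volume_shape, block_shape))
```

Each `(start, stop)` pair can be used to slice one chunk of the filter or labels array, so only one block needs to be resident in memory at a time.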

Updated how WatershedSegmentor is created and invoked for preprocessing, and updated its interface (to match the blockwise refactor in ilastiktools)
Added many TODOs for the next refactor, which will pipe the label sub-regions into the WatershedSegmentor calls that now need them.
numNodes and nodeNum were duplicate variables for the same thing. Picked nodeNum (numNodes is the better name, but vigra uses nodeNum, so this keeps things consistent). In the project file, it is still 'numNodes'.
Fix typo in comments
Small cleanup (renamed unused variables)
…r of ilastiktools).

Outstanding issues remain (documented in the code with TODO:). In particular, data corruption
occurs with the chunked / blockwise arrays when the outer blocksize is smaller than the actual array size,
which is the configuration needed to use the blockwise functionality to reduce memory use.
Watershed labels are now cached / saved into the project file.
Fixed issues with saving and displaying data
Removed some commented-out code.
Fixed and cleaned up how ROIs are handled for mst segmentation and LUTs
Cleaned up how the hdf5blocking cache and hdf5Group are created; cache fixing is now handled a bit better (but not perfectly), as it assumes that having watershed_label entries in the hdf5 file means the calculations were completely done.
…le flush after each save, and garbage collection.
…lobal normalize

with halo so block edges align.
Still need to remove some hacks (for calculating global min/max) and test all the filter cases
OpNormalize255 now takes as input the TotalStats for its Input;
Added OpBlockwiseTotalStats operator;
Updated operator network for OpPreprocessing to account for changes.
removed some cruft
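
The idea behind feeding total statistics into the normalization can be sketched as a two-pass scheme: first reduce per-block min/max into global values, then scale every block with those same global values so that neighbouring blocks do not show seams. This is a simplified illustration on plain Python lists, not the actual OpBlockwiseTotalStats or OpNormalize255 implementation; the function names below are hypothetical.

```python
def blockwise_total_stats(blocks):
    """Pass 1: reduce per-block (min, max) into a global (min, max)."""
    total_min, total_max = float("inf"), float("-inf")
    for block in blocks:
        total_min = min(total_min, min(block))
        total_max = max(total_max, max(block))
    return total_min, total_max


def normalize_255(block, total_min, total_max):
    """Pass 2: scale one block to 0..255 using the global statistics.

    Values are truncated to integers. A constant input maps to all zeros.
    """
    span = total_max - total_min
    if span == 0:
        return [0 for _ in block]
    return [int(255 * (v - total_min) / span) for v in block]


if __name__ == "__main__":
    blocks = [[0.0, 1.0], [2.0, 4.0]]  # two toy "blocks" of filter output
    lo, hi = blockwise_total_stats(blocks)
    for block in blocks:
        print(normalize_255(block, lo, hi))
```

Normalizing each block with its own local min/max would instead map the value 2.0 in the second block to 0, which is the kind of block-edge mismatch a global normalization avoids.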
Hdf5Cache is a file-backed cache, which is not as performant as RAM;
Previously, segmentation export would read through each slice,
requiring a re-read of the label data (most of which was not used);
Now we also cache the segmentation results to save recomputing them.
…export;

added some timeLogged wrappers for functions of interest;
_exportMeshes now works (however, it isn't memory efficient);
_update_rendering now uses the opSegmentationCache (above) to directly set the renderMgr.volume (rather than recompute from labels and super voxels via a look-up-table);
doneObjectNamesForPosition now only needs to read a single block to get the single label value at the given position (instead of needing the entire super-voxels array);
opPreprocessing caches reworked to eliminate double caching;
setSeeds now works (however, it requires the entire super-voxels array);
addSeeds was simplified;
removed some commented out code;
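
Reading a single block to look up the label at one position comes down to mapping the position to the block that contains it. A minimal sketch of that index arithmetic follows; `block_roi_for_position` is a hypothetical helper, not a function from ilastik or lazyflow, and the block shape is an assumed value.

```python
def block_roi_for_position(position, block_shape, volume_shape):
    """Return (start, stop, local) for the block containing `position`.

    `start`/`stop` delimit the block in volume coordinates (clipped to
    the volume), and `local` is the position relative to the block start.
    """
    start = tuple((p // b) * b for p, b in zip(position, block_shape))
    stop = tuple(min(s + b, d) for s, b, d in zip(start, block_shape, volume_shape))
    local = tuple(p - s for p, s in zip(position, start))
    return start, stop, local


if __name__ == "__main__":
    start, stop, local = block_roi_for_position(
        position=(300, 10, 1400),
        block_shape=(256, 256, 256),      # assumed block shape
        volume_shape=(4000, 4000, 1500),
    )
    print(start, stop, local)
```

Only the region `start:stop` has to be read from the cache, and the label value is then found at index `local` within that block.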
…for carving, and removing them cleared up a thread-lock issue.
Blocksize is no longer a hard-coded constant but a configurable option (with a reasonable default of 256^3), stored as an attribute of the project file (to ensure consistency).
Mesh export and addSeeds (loading a saved object) now work blockwise.
3D rendering is not updated while a re-init is pending (saves wasted cycles).
We only get the voxel segmentation if there already is a segmentation (otherwise a lot of work is spent on an empty result).
OpFilter now works in the 2D case, and all the cases were tested (there is still a TODO for HESSIAN_BRIGHT, which needs a prior HESSIAN_DARK as input to get the max value; otherwise it leaves small seams along border edges, which may be insignificant in practice).
Added notes to OpSimpleWatershed on behaviour requirements that are not captured, but are accounted for in cache coherency code.
Added cache coherency support to ensure that the watershed labels cache matches the filter and segmentor graphs. Dirty status is now tested more accurately.
Cleaned up WatershedSegmentor logic (only load an existing graph if the cache is valid) and use the 3D segmentor for 2D (because the filter and labels arrays are all in 3D, even for 2D data).
Removed some dead code.
Tested code paths for 2D, 3D, and preprocessing (initial and re-run).