
Plugins

Tim Gion edited this page Oct 19, 2018 · 3 revisions

Overview

The Ingest Client uses "plugins" to support the many different ways users can store and organize their data. The ingest service works by computing "upload tasks" based on the ingest job configuration file, where an upload task specifies a single image tile to upload to the Boss. Upload tasks contain tile indices in x, y, z, and t that specify the target tile to upload. Plugins are responsible for interpreting the tile indices and providing the ingest client with a file handle for the correct piece of data.
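As an illustration, the division of labor between the two plugin types can be sketched as follows. The class and method names here are assumptions for this example, not the exact ingest-client API:

```python
class PathProcessor:
    """Sketch of the Path processor contract: map tile indices to a
    file path. (Names and signatures are illustrative assumptions.)"""

    def setup(self, parameters):
        self.parameters = parameters  # the plugin's custom parameters

    def process(self, x_index, y_index, z_index, t_index=0):
        raise NotImplementedError


class ZSlicePathProcessor(PathProcessor):
    """Toy example: an image stack with one PNG per z-slice under root_dir."""

    def process(self, x_index, y_index, z_index, t_index=0):
        # Only the z index matters for a simple slice-per-file stack.
        return "{}/slice_{}.png".format(self.parameters["root_dir"], z_index)
```

A Tile processor would then open the path returned here and hand the ingest client a file handle containing exactly one tile's worth of data.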

Below are the plugins currently packaged with the ingest client; in general, they have been developed to support MICrONS teams directly.

CATMAID based plugins

CatmaidFileImageStackZoomLevelPathProcessor

Path Processor targeting Tile source type 4 in the CATMAID documentation. Only works with local data sources.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"filetype": "<png|tif|jpg>"

CatmaidFileImageStackZoomLevelTileProcessor

Tile Processor targeting Tile source type 4 in the CATMAID documentation. Should be used with CatmaidFileImageStackZoomLevelPathProcessor.

No custom parameters.

CatmaidDirectoryImageStackPathProcessor

Path Processor targeting directory-based image stacks, Tile source type 5 in the documentation. Only works with local data sources.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"filetype": "<png|tif|jpg>"

CatmaidDirectoryImageStackTileProcessor

Tile Processor targeting directory-based image stacks, Tile source type 5 in the documentation. Should be used with CatmaidDirectoryImageStackPathProcessor.

No custom parameters.

CatmaidFileImageStackPathProcessor

Path Processor targeting File-based image stacks, Tile source type 1 in the documentation. Only works with local data sources.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"filetype": "<png|tif|jpg>"

CatmaidFileImageStackTileProcessor

Tile Processor targeting File-based image stacks, Tile source type 1 in the documentation. Should be used with CatmaidFileImageStackPathProcessor.

No custom parameters.

HDF5 based plugins

Hdf5TimeSeriesPathProcessor

A Path processor for time-series, multi-channel data (e.g. calcium imaging). Assumes the data is stored (t, x, y, channel) in individual HDF5 files, with 1 HDF5 file per z-slice.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"extension": "hdf5|h5",
"base_filename": <string> // The base filename

The base_filename string defines how the z-index value is inserted into the filename. Mark the insertion point with "<...>". To apply an offset, add o:<number>; to zero-pad, add p:<number>.

Example with 3 z-slices (index 0, 1, 2):

No modifiers:

my_base_<> -> my_base_0, my_base_1, my_base_2

Offset or zero padding:

<o:200>_my_base_<p:4> -> 200_my_base_0000, 201_my_base_0001, 202_my_base_0002

Both offset and zero pad. The offset should always go first and the zero pad value second:

my_base_<o:1p:4> -> my_base_0001, my_base_0002, my_base_0003
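A minimal sketch of this placeholder expansion, assuming the o:/p: syntax shown above (the ingest client's actual parser may differ):

```python
import re

def build_filename(base_filename, z_index):
    """Expand each "<...>" placeholder in base_filename with z_index.

    Modifiers inside the placeholder (assumed syntax from the examples):
      o:<number> -> add an offset to z_index
      p:<number> -> zero-pad the result to that width
    """
    def expand(match):
        spec = match.group(1)
        offset_match = re.search(r"o:(\d+)", spec)
        pad_match = re.search(r"p:(\d+)", spec)
        value = z_index + (int(offset_match.group(1)) if offset_match else 0)
        width = int(pad_match.group(1)) if pad_match else 0
        return str(value).zfill(width)

    return re.sub(r"<([^>]*)>", expand, base_filename)
```

For example, `build_filename("my_base_<o:1p:4>", 2)` yields `my_base_0003`, matching the last example above.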

Hdf5TimeSeriesTileProcessor

A Tile processor for time-series, multi-channel data (e.g. calcium imaging). It assumes the data is stored (t, x, y, channel) in individual HDF5 files, with 1 HDF5 file per z-slice; x is the column dimension and y is the row dimension.

Custom Parameters:

"upload_format": "<png|tif>",
"channel_index": integer, //Used to index into the HDF5 file's "channel" dimension
"scale_factor": float, //This value is multiplied by the extracted matrix if re-scaling is desired
"dataset": str, //Name of the HDF5 dataset to load
"filesystem": "<s3|local>", //Select the local or S3 file system using the DynamicFileSystem helper
"bucket": (if s3 filesystem)
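A sketch of the tile extraction this processor performs, assuming the (t, x, y, channel) layout described above; the helper name and exact slicing are illustrative, not the processor's actual code:

```python
import numpy as np

def extract_tile(dataset, t_index, x_start, x_stop, y_start, y_stop,
                 channel_index, scale_factor=1.0):
    """Cut one tile out of a (t, x, y, channel) volume.

    `dataset` can be anything indexable like an h5py Dataset; a NumPy
    array works for testing. Since x is the column dimension and y is
    the row dimension, the tile is transposed into (rows, cols) order.
    """
    tile = np.asarray(dataset[t_index, x_start:x_stop, y_start:y_stop,
                              channel_index], dtype=np.float64)
    return (tile * scale_factor).T  # -> (rows, cols)
```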

Hdf5TimeSeriesLabelTileProcessor

A Tile processor for label data packed in a time-series, multi-channel HDF5 file (e.g. ROIs for calcium imaging). Assumes the data is stored (x, y) in individual HDF5 files, with 1 HDF5 file per z-slice, where x is the column dimension and y is the row dimension. Currently it uploads the data as uint32; updates will be needed for full 64-bit support.

Custom Parameters:

"upload_format": "<png|tif>",
"dataset": str, //Name of the HDF5 dataset to load
"filesystem": "<s3|local>", //Select the local or S3 file system using the DynamicFileSystem helper
"bucket": (if s3 filesystem)

Hdf5SlicePathProcessor

A Path processor for large single slices stored in HDF5 files. Assumes the image data is stored in one dataset and an optional offset is stored in a second dataset.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"extension": "hdf5|h5",
"base_filename": the base filename

The base_filename string defines how the z-index value is inserted into the filename. Mark the insertion point with "<...>". To apply an offset, add o:<number>; to zero-pad, add p:<number>.

Example with 3 z-slices (index 0, 1, 2):

No modifiers:

my_base_<> -> my_base_0, my_base_1, my_base_2

Offset or zero padding:

<o:200>_my_base_<p:4> -> 200_my_base_0000, 201_my_base_0001, 202_my_base_0002

Both offset and zero pad. The offset should always go first and the zero pad value second:

my_base_<o:1p:4> -> my_base_0001, my_base_0002, my_base_0003

Hdf5SliceTileProcessor

A Tile processor for large slices stored in HDF5 files. Assumes the image data is stored in one dataset and an optional offset is stored in a second dataset.

Custom Parameters:

"upload_format": "<png|tif>",
"data_name": str, // Name of the HDF5 dataset where the matrix data is stored
"offset_name": str, // Name of the HDF5 dataset where offset data is stored
"extent_name": str, // Name of the HDF5 dataset where the extent data is stored
"offset_origin_x": int, // Offset from tile index origin in the x (column) dimension
"offset_origin_y": int, // Offset from tile index origin in the y (row) dimension
"filesystem": "<s3|local>", //Select the local or S3 file system using the DynamicFileSystem helper
"bucket": (if s3 filesystem)

Hdf5ChunkPathProcessor

A Path processor for a dataset that has been broken into regularly sized chunks stored in HDF5 files. Assumes the data is stored in a dataset and that the filename encodes the chunk location. Supports an x/y/z offset.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"extension": "hdf5|h5",
"prefix": <string>, // Prefix for the filename
"x_offset": <integer>, // The offset from 0 in the x dim (column)
"y_offset": <integer>, // The offset from 0 in the y dim (row)
"z_offset": <integer>, // The offset from 0 in the z dim
"x_chunk_size": <integer>, // The chunk extent in the x dimension
"y_chunk_size": <integer>, // The chunk extent in the y dimension
"z_chunk_size": <integer>, // The chunk extent in the z dimension
"use_python_convention": <bool> // A flag indicating if ranges use Python slice convention

Assumed filename format: prefix_xstart-xstop_ystart-ystop_zstart-zstop.h5
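A sketch of how a chunk's filename could be derived from its chunk indices under the assumed format above; the default chunk sizes and the inclusive-versus-exclusive stop handling are illustrative assumptions:

```python
def chunk_filename(prefix, x_idx, y_idx, z_idx,
                   chunk_size=(512, 512, 16), offset=(0, 0, 0),
                   extension="h5", use_python_convention=True):
    """Build "prefix_xstart-xstop_ystart-ystop_zstart-zstop.h5" for one chunk."""
    parts = []
    for idx, size, off in zip((x_idx, y_idx, z_idx), chunk_size, offset):
        start = idx * size + off
        stop = start + size
        if not use_python_convention:
            stop -= 1  # inclusive stop when Python slice convention is off
        parts.append("{}-{}".format(start, stop))
    return "{}_{}.{}".format(prefix, "_".join(parts), extension)
```

For example, the first chunk of a 512x512x16 chunking would resolve to `data_0-512_0-512_0-16.h5` under the Python slice convention.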

Hdf5ChunkTileProcessor

A Tile processor for a dataset that has been broken into regularly sized chunks stored in HDF5 files.

Custom Parameters:

"upload_format": "<png|tiff>",
"data_name": str, // Dataset name
"z_chunk_size": <integer>, // The chunk extent in the z dimension
"filesystem": "<s3|local>",
"bucket": (if s3 filesystem)

Hdf5SingleFilePathProcessor

A simple Path processor for a 3D dataset stored in a single HDF5 file.

Custom Parameters:

"filename": "<path_to_file>"


Hdf5SingleFileTileProcessor

A Tile processor for 3D datasets stored in a single HDF5 file. Assumes the image data is stored in one dataset and an optional offset is stored in a second dataset. It indexes into the file based on the tile size of the ingest job.

Custom Parameters:

"upload_format": "<png|tiff>",
"data_name": str, // Dataset name
"datatype": <uint8|uint16|uint32>, // Datatype of the matrix data. Currently uint32 for annotation data
"offset_x": <integer>, // Offset in the x dimension
"offset_y": <integer>, // Offset in the y dimension
"offset_z": <integer>, // Offset in the z dimension
"filesystem": "<s3|local>",
"bucket": (if s3 filesystem)
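A sketch of the tile-size indexing, assuming a (z, y, x) axis order and additive offsets (both are assumptions for illustration, not the processor's actual code):

```python
import numpy as np

def read_tile(volume, x_index, y_index, z_index, tile_x, tile_y,
              offset_x=0, offset_y=0, offset_z=0):
    """Index into a single (z, y, x) volume by the ingest job's tile size.

    `volume` can be anything indexable like an h5py Dataset; a NumPy
    array works for testing.
    """
    z = z_index + offset_z
    y0 = y_index * tile_y + offset_y
    x0 = x_index * tile_x + offset_x
    return np.asarray(volume[z, y0:y0 + tile_y, x0:x0 + tile_x])
```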

Multi-page TIFF based plugins

SingleTimeTiffPathProcessor

A Path processor for a multi-page TIFF that contains all time points for a single z-slice.

Custom Parameters:

For each file add an entry indicating the "z index" for that file, starting from 0 for the first slice.

"z_<index>": "<filename>"

Example with 5 slices:

"z_0": "/data/my_file_0.tiff",
"z_1": "/data/my_file_1.tiff",
"z_2": "/data/my_file_2.tiff",
"z_3": "/data/my_file_3.tiff",
"z_4": "/data/my_file_4.tiff"

SingleTimeTiffTileProcessor

A Tile processor for a file where a multi-page TIFF contains all time points for a single z-slice. Only works with local paths, as the DynamicFilesystem utility is not used.

Custom Parameters:

"datatype": "<uint16>" // Currently only supports uint16 data and does not perform any rescaling

TiffMultiFileHyperStackPathProcessor

A Path processor for a hyperstack stored across multiple multi-page TIFF files, with the time dimension split across files.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"extension": "tiff|tif",
"base_filename": <string> // The base filename
"time_chunk_size": <int>  // The number of time samples in a single file

The base_filename string defines how the z-index value is inserted into the filename. Mark the insertion point with "<...>". To apply an offset, add o:<number>; to zero-pad, add p:<number>.

Example with 3 z-slices (index 0, 1, 2):

No modifiers:

my_base_<> -> my_base_0, my_base_1, my_base_2

Offset or zero padding:

<o:200>_my_base_<p:4> -> 200_my_base_0000, 201_my_base_0001, 202_my_base_0002

Both offset and zero pad. The offset should always go first and the zero pad value second:

my_base_<o:1p:4> -> my_base_0001, my_base_0002, my_base_0003

TiffMultiFileHyperStackTileProcessor

A Tile processor for multi-channel, multi-slice, time-series datasets stored as a hyperstack in a multi-page TIFF. Time is the dimension split across files.

Custom Parameters:

"time_chunk_size": <int>, // Number of time samples in a single file
"num_z_slices": <int>, // Total number of Z slices stored in a single file
"num_channels": <int>, // Total number of channels stored in a single file
"channel_index": <int>, // Index of the channel that is being loaded
"filesystem": "<s3|local>",
"bucket": (if s3 filesystem)
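A sketch of how such a processor might locate a tile, assuming ImageJ-style hyperstack page ordering (channel fastest, then z, then time); the real processor's page ordering may differ:

```python
def hyperstack_location(t_index, z_index, channel_index,
                        time_chunk_size, num_z_slices, num_channels):
    """Map (t, z, channel) to (file number, page within that file).

    The time dimension is split across files in chunks of
    time_chunk_size; within a file, pages are assumed ordered with
    channel varying fastest, then z, then time.
    """
    file_index = t_index // time_chunk_size
    t_in_file = t_index % time_chunk_size
    page = (t_in_file * num_z_slices + z_index) * num_channels + channel_index
    return file_index, page
```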

Image Stack based plugins

ZindexStackPathProcessor

A Path processor for simple image stacks that only increment in Z. Supports the DynamicFilesystem utility.

Custom Parameters:

"root_dir": "<path_to_stack_root>",
"extension": "<png|tif|jpg>",
"filesystem": "<s3|local>",
"bucket": (if s3 filesystem),
"base_filename": <str> // The base filename

The base_filename string defines how the z-index value is inserted into the filename. Mark the insertion point with "<...>". To apply an offset, add o:<number>; to zero-pad, add p:<number>.

Example with 3 z-slices (index 0, 1, 2):

No modifiers:

my_base_<> -> my_base_0, my_base_1, my_base_2

Offset or zero padding:

<o:200>_my_base_<p:4> -> 200_my_base_0000, 201_my_base_0001, 202_my_base_0002

Both offset and zero pad. The offset should always go first and the zero pad value second:

my_base_<o:1p:4> -> my_base_0001, my_base_0002, my_base_0003

ZindexStackTileProcessor

A Tile processor for a single image file identified by z-index.

Custom Parameters:

"extension": "<png|tif|jpg>",
"filesystem": "<s3|local>",
"bucket": (if s3 filesystem)

Intern based plugins

Not recommended for general use

This plugin takes data from one Boss channel and copies it to another. Due to how the ingest service operates, this is not an efficient process and should be used only for debugging purposes.

InternPathProcessor

A no-op Path processor, since all data comes from the Boss.

InternTileProcessor

A Tile processor that converts the tile indices to a single slice cutout from the Boss. This data is then loaded as an image tile.

Custom Parameters:

"x_offset": <integer>, // Offset to apply when querying the Boss (added to the tile index)
"y_offset": <integer>, // Offset to apply when querying the Boss (added to the tile index)
"z_offset": <integer>, // Offset to apply when querying the Boss (added to the tile index)
"x_tile": <integer>, // Size of a tile in the x dimension, set in the ingest job config
"y_tile": <integer>, // Size of a tile in the y dimension, set in the ingest job config
"collection": <string>, // Source collection
"experiment": <string>, // Source experiment
"channel": <string>, // Source channel
"resolution": <integer> // Source resolution
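A sketch of how these parameters could translate a tile index into the ranges for a Boss cutout request; the range arithmetic is an assumption based on the parameter descriptions above:

```python
def cutout_ranges(x_index, y_index, z_index, params):
    """Convert a tile index into [start, stop) ranges for a Boss cutout.

    `params` holds the custom parameters listed above; a single tile
    maps to one z-slice of extent (x_tile, y_tile).
    """
    x_start = x_index * params["x_tile"] + params["x_offset"]
    y_start = y_index * params["y_tile"] + params["y_offset"]
    z = z_index + params["z_offset"]
    return ([x_start, x_start + params["x_tile"]],
            [y_start, y_start + params["y_tile"]],
            [z, z + 1])
```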

CloudVolume Plugins

These plugins support uploading data using Seung Lab's CloudVolume.

CloudVolumePathProcessor

This processor does nothing since CloudVolume manages file lookup internally if the data is stored on the local file system.

CloudVolumeChunkProcessor

This processor provides chunks to the ingest client from CloudVolume.

Custom Parameters:

All parameters are passed directly to CloudVolume.__init__(). See the CloudVolume source file: https://github.com/seung-lab/cloud-volume/blob/master/cloudvolume/cloudvolume.py

Bare minimum:

"cloudpath": <string> // Specifies location of CloudVolume data