Feat/biowriter: Scalability and Extended Writing Ability #29

Merged
merged 6 commits into from
Apr 4, 2022

Conversation

Nicholas-Schaub

This PR adds in a number of changes to how the BioWriter operates. The major changes are:

  1. Channel and timepoint data can now be saved
  2. Implements a scattered tile approach to saving data, permitting scalable image writes

The scattered tile approach writes all IFD header information at the beginning of the file and leaves the remainder of the file for tiles, which are written as they are requested to be saved. Previously, images could only be saved one z-plane at a time, and z-planes could not be saved out of order. This was primarily because each IFD was written immediately before its tile data:
[IFD_Header1 image_tiles1] [IFD_Header2 image_tiles2] ...

Since compression is always on, the size of each image_tiles block could not be known beforehand, so the offsets between IFDs could not be known either. Data is now stored as follows:
IFD_Header1 IFD_Header2 ... image_tiles
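
To make the layout concrete, here is a minimal sketch of the same idea in a toy container. It is not the BioWriter or tifffile implementation and it does not produce a valid TIFF: the `ScatteredTileFile` class, its fixed-size index, and the `(offset, length)` entry format are all hypothetical stand-ins for the IFD headers. The point it illustrates is that once the header region has a known size, compressed tiles of unknown size can be appended in any order and the reserved header slot patched afterwards.

```python
import io
import struct
import zlib

# Hypothetical index entry: (byte offset, compressed length) for one tile.
ENTRY = struct.Struct("<QQ")


class ScatteredTileFile:
    """Toy container: fixed-size index up front, compressed tiles after it."""

    def __init__(self, buffer, n_planes, tiles_per_plane):
        self.buffer = buffer
        self.tiles_per_plane = tiles_per_plane
        # Reserve the whole index now; a zeroed entry means "not written yet".
        buffer.write(b"\x00" * (ENTRY.size * n_planes * tiles_per_plane))
        self.end = buffer.tell()

    def _slot(self, plane, tile):
        # Position of this tile's entry inside the reserved index region.
        return ENTRY.size * (plane * self.tiles_per_plane + tile)

    def write_tile(self, plane, tile, raw):
        data = zlib.compress(raw)
        # Append the compressed bytes at the current end of the file ...
        self.buffer.seek(self.end)
        self.buffer.write(data)
        # ... then patch the reserved slot with where they landed.
        self.buffer.seek(self._slot(plane, tile))
        self.buffer.write(ENTRY.pack(self.end, len(data)))
        self.end += len(data)

    def read_tile(self, plane, tile):
        self.buffer.seek(self._slot(plane, tile))
        offset, length = ENTRY.unpack(self.buffer.read(ENTRY.size))
        if length == 0:
            raise KeyError((plane, tile))
        self.buffer.seek(offset)
        return zlib.decompress(self.buffer.read(length))


# Planes are written out of order; the index still resolves each tile.
f = ScatteredTileFile(io.BytesIO(), n_planes=3, tiles_per_plane=2)
f.write_tile(2, 1, b"last plane first")
f.write_tile(0, 0, b"first plane later")
```

In this sketch the tile written first sits first in the data region regardless of which plane it belongs to, which is why write order no longer has to follow plane order.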

This allows planes and tiles to be written in any order. It also makes it easier to extend the BioWriter to support z-slices, channels, and timepoints for arbitrarily sized data, and it allows nearly fully threaded writing across n dimensions, whereas threaded writes could previously only occur within a single plane. This has led to a significant improvement in write performance for images with many ZCT planes. Writing is now less than 2x slower than reading (ignoring OME XML parsing, see below).
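
As a rough illustration of why order-independent tiles help threading, the sketch below compresses tiles from every plane in a thread pool and appends each one as soon as it is ready. It is an assumption-laden toy rather than the BioWriter code: `write_all`, `read_tile`, the `(z, c, t, tile)` keys, and the in-memory index are hypothetical, and the index stands in for the pre-written IFD headers. Only the append is serialized by a lock; the compression itself runs concurrently.

```python
import io
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor


def write_all(tiles, max_workers=4):
    """Compress tiles in worker threads and append them in completion order.

    `tiles` maps a hypothetical (z, c, t, tile) key to raw bytes. Returns the
    buffer and an index of key -> (offset, compressed length).
    """
    buffer = io.BytesIO()
    index = {}
    lock = threading.Lock()

    def save(key):
        data = zlib.compress(tiles[key])  # runs concurrently, outside the lock
        with lock:  # only the append and the index update are serialized
            offset = buffer.seek(0, io.SEEK_END)
            buffer.write(data)
            index[key] = (offset, len(data))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(save, tiles))  # list() re-raises any worker exception

    return buffer, index


def read_tile(buffer, index, key):
    """Look up a tile by key and decompress it."""
    offset, length = index[key]
    buffer.seek(offset)
    return zlib.decompress(buffer.read(length))
```

With the older interleaved layout, a worker handling plane 2 would have had to wait for plane 1 to finish before its offsets were known; here no such ordering constraint exists.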

Since we are using ome-types to handle the XML, the current bottleneck when writing images is converting the OME model to XML. Estimates suggest this could be an order of magnitude or more slower than actually compressing and saving the data to disk. We opened an issue on ome-types suggesting an option to use lxml to dump the data to a string.

https://github.com/tlambert03/ome-types/tree/main/src/ome_types

@Nicholas-Schaub Nicholas-Schaub merged commit 27272f7 into PolusAI:dev Apr 4, 2022
Nicholas-Schaub added a commit that referenced this pull request Apr 7, 2022
* Modified bumpversion

* Added code formatting, linting

* Fixed documentation, modified bumpversion

* Modified bump2version and readme

* Fixed bumpversion for README

* Removed buggy precommit hooks

* Removed bumpversion sign option

* Changed bumpversion fields

* Bump version: 2.1.9 → 2.2.0-dev0

* Fixed bug in bumpversion dev builds

* Bump version: 2.2.0-dev0 → 2.2.0-dev1

* Upgrade/bioformats (#20)

* Migration from loci_tools.jar to bioformats_package.jar

* Changed setup.py

* Fixed bug associated with initializing a BioWriter without metadata parameter (#21)

* Feat/ci (#22)

* Added code format and lint git actions

* fix: Fixed format gitaction, disabled zarr tests

* Added dev dependencies and test workflow

* Started docker ci build

* Docker build base

* Black formatting

* Updated git actions

* Changed bumpversion config

* Finished Docker configs

* Added PyPi publishing action

* Bump version: 2.2.0-dev1 → 2.2.0-dev2

* Fixed docker and pypi workflows

* Reverted version

* Changed pypi workflow

* Fixed twine username

* Feat/ometypes (#23)

* Replaced OmeXml.py with ome_types

* Resolved uncaught OmeXml references

* Bump version: 2.2.0-dev1 → 2.2.0-dev2

* Bump version: 2.2.0-dev2 → 2.2.0-dev3

* Fix/metadata (#24)

* Added additional metadata reformatting

* Fixed repeated values in metadata replacement dictionary

* Fixed unresolved merge conflict in README

* Fixed additional README conflict

* Fix/rgb (#26)

* Fixed versioning in docs

* Fix czi rgb image bug

* Fix czi rgb image bug

* Special catch for Zeiss CZI rgb images

* Fix/biowriter (#27)

* Fixed metadata issues with BioWriter

* fix: JavaWriter backend exports correct results on any XYZCT data

* Removed debug code

* Fixed error in unittest from missing dependency

* Feat/bioreader (#28)

* Bump version: 2.2.0 → 2.2.1-dev0

* re-added support for CT dimensions, improved performance

* Added pickle support for PythonReader

* Bump version: 2.2.1-dev0 → 2.2.1-dev1

* Feat/biowriter: Scalability and Extended Writing Ability (#29)

* Updated Python BioWriter to use newer version of tifffile

* Prototype of scalable scatter tile format

* Removed debug print statements

* added support for multi-channel/timepoint writing

* Added unittest, fix abstract class error

* Updated reader and biowriter metadata

* Bump version: 2.2.1-dev1 → 2.3.0-dev0

* Bioformats (#30)

* Removed bioformats_package.jar, added optional install of bioformats_jar

* Updated documentation and install options

* fixed README typo

* Update/docs (#31)

* Updated docs and tests. Changed java backend name to bioformats

* Updated examples to use new backend names

* Bump version: 2.3.0-dev0 → 2.3.0
Nicholas-Schaub added a commit that referenced this pull request May 17, 2023
Nicholas-Schaub added a commit that referenced this pull request May 22, 2023
* Dev (#32)

* Updated dependency range for tifffile, changed metadata saving

* Added no verify to bump2version config

* Upgraded ome-types

* Bump version: 2.3.0 → 2.3.1-dev0
Nicholas-Schaub added a commit that referenced this pull request Jul 24, 2023
* Updated dependency range for tifffile, changed metadata saving

* Added no verify to bump2version config

* Upgraded ome-types

* Bump version: 2.3.0 → 2.3.1-dev0

* Dev (#32) (#40)

* Updated Metadata and Dependencies (#41)

* Dev (#32)

* Updated dependency range for tifffile, changed metadata saving

* Added no verify to bump2version config

* Upgraded ome-types

* Bump version: 2.3.0 → 2.3.1-dev0

* Bump version: 2.3.1-dev0 → 2.3.1