# Conversion Tools

https://forum.image.sc/t/converting-whole-slide-images-to-ome-tiff-a-new-workflow/32110/4

<img src="blog-2019-12-converting-whole-slide-images.jpg" style="height:300px" />



## Basics

The two basic commands are `bioformats2raw` and `raw2ometiff`. Together they form a pipeline that scalably converts large images into OME-TIFF. The primary caveat is that the conversion requires **twice** the storage.
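Given the storage caveat, it can help to check free disk space before starting a conversion. The following is a minimal sketch and not part of either tool: the helper names `required_bytes` and `has_room` are hypothetical, and applying the factor of two to the size of the input file is an assumption made for illustration.

```python
import os
import shutil


def required_bytes(input_size: int, factor: float = 2.0) -> int:
    """Return the free space to reserve, as a multiple of the input size.

    The default factor of 2 mirrors the "twice the storage" caveat; applying
    it to the input file size is an assumption of this sketch.
    """
    return int(input_size * factor)


def has_room(input_path: str, work_dir: str, factor: float = 2.0) -> bool:
    """Check whether work_dir has enough free space for the conversion."""
    needed = required_bytes(os.path.getsize(input_path), factor)
    free = shutil.disk_usage(work_dir).free
    return free >= needed
```

If the input is itself heavily compressed, its size on disk may underestimate what the conversion needs, so a larger `factor` may be the safer choice.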

In [1]:
%%time
!bioformats2raw --help

Missing required parameters: <inputPath>, <outputPath>
Usage: <main class> [--debug] [--extra-readers[=<extraReaders>[,
                    <extraReaders>...]]]...
                    [--additional-scale-format-string-args=<additionalScaleForma
                    tStringArgsCsv>] [-c=<compressionType>]
                    [--compression-parameter=<compressionParameter>]
                    [--dimension-order=<dimensionOrder>]
                    [--file_type=<fileType>] [-h=<tileHeight>]
                    [--max_cached_tiles=<maxCachedTiles>]
                    [--max_workers=<maxWorkers>] [--pyramid-name=<pyramidName>]
                    [-r=<py

In [2]:
%%time
!raw2ometiff --help

Missing required parameters: <inputDirectory>, <outputFilePath>
Usage: <main class> [--debug] [--legacy] [--rgb] [--compression=<compression>]
                    [--max_workers=<maxWorkers>] <inputDirectory>
                    <outputFilePath>
      <inputDirectory>   Directory containing pixel data to convert
      <outputFilePath>   Relative path to the output OME-TIFF file
      --compression=<compression>
                         Compression type for output OME-TIFF file
                           (Uncompressed, LZW, JPEG-2000, JPEG-2000 Lossy,
                           JPEG, zlib; default: LZW)
      --debug            Turn on debug logging
      --legacy           Write a Bio-Formats 5.9.x pyramid instead of OME-TIFF
   

## Simple Invocation

Here we use Bio-Formats' ability to generate test data so that we can test quickly within the notebook.

See https://docs.openmicroscopy.org/bio-formats/latest/developers/generating-test-images.html for more information.
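Fake file names encode image parameters as `&key=value` pairs placed before the `.fake` extension. As a small illustration, the hypothetical helper `fake_name` below (not part of Bio-Formats) assembles such names from keyword arguments; `sizeX` and `sizeY` are the only parameters used here.

```python
def fake_name(stem: str, **params) -> str:
    """Build a Bio-Formats fake file name from a stem and key/value pairs."""
    parts = [stem] + [f"{key}={value}" for key, value in params.items()]
    return "&".join(parts) + ".fake"


print(fake_name("a"))
print(fake_name("big", sizeX=10000, sizeY=10000))
```

When such a name is passed on a shell command line it should be quoted, since `&` is a shell metacharacter.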

**First we generate the raw intermediate format:**

In [1]:
%%time
!bioformats2raw a.fake /tmp/output-1

2020-05-28 20:36:16,927 [main] INFO  loci.formats.ImageReader - FakeReader initializing a.fake
2020-05-28 20:36:17,092 [main] INFO  c.g.bioformats2raw.Converter - Using 2 pyramid resolutions
2020-05-28 20:36:17,092 [main] INFO  c.g.bioformats2raw.Converter - Preparing to write pyramid sizeX 512 (tileWidth: 1024) sizeY 512 (tileWidth: 1024) imageCount 1
2020-05-28 20:36:17,427 [main] WARN  c.g.bioformats2raw.Converter - Reducing active tileWidth to 512
2020-05-28 20:36:17,427 [main] WARN  c.g.bioformats2raw.Converter - Reducing active tileHeight to 512
2020-05-28 20:36:17,440 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 0, 0, 0] to /0/0
2020-05-28 20:36:17,452 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - tile read complete 1/1
2020-05-28 20:36:17,452 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1590690977440] time[12] tag[getTile]
2020-05-28 20:36:17,455 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - successfully 

**Then we convert that output into an OME-TIFF:**

In [2]:
%%time
!raw2ometiff /tmp/output-1 /tmp/output-1.ome.tiff

2020-05-28 20:36:22,550 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Creating tiled pyramid file /tmp/output-1.ome.tiff
2020-05-28 20:36:22,614 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Number of resolution levels: 2
2020-05-28 20:36:22,624 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 0
2020-05-28 20:36:22,625 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 1
2020-05-28 20:36:22,627 [main] INFO  org.perf4j.TimingLogger - start[1590690982263] time[364] tag[initialize]
2020-05-28 20:36:22,686 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Converting resolution #0
2020-05-28 20:36:22,688 [main] INFO  org.perf4j.TimingLogger - start[1590690982686] time[2] tag[getInputTileBytes]
2020-05-28 20:36:22,690 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Converting resolution #1
2020-05-28 20:36:22,692 [main] INFO  org.perf4j.TimingLogger - start[1590690982690] time[1] tag[getInputTileBytes]
2020-05-28 20:36:22,697 [poo
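Outside a notebook, the same two steps can be chained from Python. This is a minimal sketch under the assumption that both commands are on the `PATH`; the function names `pipeline_commands` and `convert` are hypothetical, and only the positional arguments shown in the cells above are used.

```python
import subprocess


def pipeline_commands(source, raw_dir, tiff_path):
    """Return the two command lines of the conversion, in order."""
    return [
        ["bioformats2raw", source, raw_dir],
        ["raw2ometiff", raw_dir, tiff_path],
    ]


def convert(source, raw_dir, tiff_path):
    """Run both steps, stopping at the first failure."""
    for command in pipeline_commands(source, raw_dir, tiff_path):
        # check=True raises CalledProcessError on a non-zero exit status
        subprocess.run(command, check=True)


# Show the commands without running them.
for command in pipeline_commands("a.fake", "/tmp/output-1", "/tmp/output-1.ome.tiff"):
    print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain characters such as `&`.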

**The same operation on larger generated data still completes quickly:**

In [None]:
%%time
!bioformats2raw 'big&sizeX=10000&sizeY=10000.fake' /tmp/output-2

**But it of course produces larger output:**

In [6]:
%%time
!raw2ometiff /tmp/output-2 /tmp/output-2.ome.tiff

2020-05-28 09:03:51,378 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Creating tiled pyramid file /tmp/output-2.ome.tiff
2020-05-28 09:03:51,430 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Number of resolution levels: 7
2020-05-28 09:03:51,443 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 0
2020-05-28 09:03:51,444 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 1
2020-05-28 09:03:51,445 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 2
2020-05-28 09:03:51,445 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 3
2020-05-28 09:03:51,445 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 4
2020-05-28 09:03:51,445 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 5
2020-05-28 09:03:51,445 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 6
2020-05-28 09:03:51,447 [main] I

2020-05-28 09:03:51,667 [pool-1-thread-14] INFO  org.perf4j.TimingLogger - start[1590649431643] time[23] tag[writeTile]
2020-05-28 09:03:51,667 [pool-1-thread-12] INFO  org.perf4j.TimingLogger - start[1590649431642] time[24] tag[writeTile]
2020-05-28 09:03:51,668 [pool-1-thread-16] INFO  org.perf4j.TimingLogger - start[1590649431642] time[25] tag[writeTile]
2020-05-28 09:03:51,668 [main] INFO  org.perf4j.TimingLogger - start[1590649431646] time[21] tag[getInputTileBytes]
2020-05-28 09:03:51,668 [pool-1-thread-13] INFO  org.perf4j.TimingLogger - start[1590649431643] time[25] tag[writeTile]
2020-05-28 09:03:51,668 [pool-1-thread-5] INFO  org.perf4j.TimingLogger - start[1590649431642] time[26] tag[writeTile]
2020-05-28 09:03:51,669 [pool-1-thread-7] INFO  org.perf4j.TimingLogger - start[1590649431643] time[25] tag[writeTile]
2020-05-28 09:03:51,669 [main] INFO  org.perf4j.TimingLogger - start[1590649431668] time[1] tag[getInputTileBytes]
2020-05-28 09:03:51,670 [pool-1-thread-1] INFO  org

2020-05-28 09:03:51,816 [pool-1-thread-5] INFO  org.perf4j.TimingLogger - start[1590649431804] time[12] tag[writeTile]
2020-05-28 09:03:51,818 [main] INFO  org.perf4j.TimingLogger - start[1590649431816] time[2] tag[getInputTileBytes]
2020-05-28 09:03:51,818 [pool-1-thread-13] INFO  org.perf4j.TimingLogger - start[1590649431805] time[13] tag[writeTile]
2020-05-28 09:03:51,818 [pool-1-thread-9] INFO  org.perf4j.TimingLogger - start[1590649431806] time[12] tag[writeTile]
2020-05-28 09:03:51,818 [pool-1-thread-10] INFO  org.perf4j.TimingLogger - start[1590649431805] time[12] tag[writeTile]
2020-05-28 09:03:51,818 [pool-1-thread-11] INFO  org.perf4j.TimingLogger - start[1590649431805] time[13] tag[writeTile]
2020-05-28 09:03:51,819 [pool-1-thread-4] INFO  org.perf4j.TimingLogger - start[1590649431807] time[12] tag[writeTile]
2020-05-28 09:03:51,819 [main] INFO  org.perf4j.TimingLogger - start[1590649431818] time[1] tag[getInputTileBytes]
2020-05-28 09:03:51,819 [main] INFO  c.g.p.PyramidFro

2020-05-28 09:03:51,918 [main] INFO  org.perf4j.TimingLogger - start[1590649431447] time[470] tag[convertToPyramid]
CPU times: user 36.5 ms, sys: 17.4 ms, total: 53.9 ms
Wall time: 1.47 s


In [7]:
!ls -ltrah /tmp/output*tiff

-rw-r--r--  1 jamoore  wheel    54K May 28 09:03 /tmp/output-1.ome.tiff
-rw-r--r--  1 jamoore  wheel   138M May 28 09:03 /tmp/output-2.ome.tiff
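The logs above report 2 resolution levels for the 512 x 512 image and 7 for the 10000 x 10000 image. A simple halving rule reproduces both numbers, as sketched below. The cutoff of 256 pixels and the function name `resolution_levels` are assumptions chosen to match these two logs; this is not confirmed to be the rule that `bioformats2raw` applies.

```python
def resolution_levels(size_x: int, size_y: int, min_size: int = 256) -> int:
    """Count pyramid levels by halving until both dimensions fit min_size.

    min_size=256 is an assumption chosen to match the logs, not a value
    taken from the bioformats2raw documentation.
    """
    levels = 1
    while size_x > min_size or size_y > min_size:
        size_x //= 2
        size_y //= 2
        levels += 1
    return levels


print(resolution_levels(512, 512))      # 2
print(resolution_levels(10000, 10000))  # 7
```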


## Compression

Both commands also accept further arguments, such as `--compression`, which you can experiment with:

**Here we leave the OME-TIFF uncompressed to see how much larger it will be:**

In [8]:
%%time
!raw2ometiff /tmp/output-2 /tmp/output-2.ome.tiff --compression=Uncompressed

2020-05-28 09:03:56,536 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Creating tiled pyramid file /tmp/output-2.ome.tiff
2020-05-28 09:03:56,586 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Number of resolution levels: 7
2020-05-28 09:03:56,601 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 0
2020-05-28 09:03:56,602 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 1
2020-05-28 09:03:56,602 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 2
2020-05-28 09:03:56,602 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 3
2020-05-28 09:03:56,602 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 4
2020-05-28 09:03:56,602 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 5
2020-05-28 09:03:56,602 [main] INFO  c.g.p.PyramidFromDirectoryWriter - Adding metadata for resolution: 6
2020-05-28 09:03:56,604 [main] I

2020-05-28 09:03:56,759 [main] INFO  org.perf4j.TimingLogger - start[1590649436757] time[2] tag[getInputTileBytes]
2020-05-28 09:03:56,760 [pool-1-thread-10] INFO  org.perf4j.TimingLogger - start[1590649436760] time[0] tag[writeTile]
2020-05-28 09:03:56,761 [main] INFO  org.perf4j.TimingLogger - start[1590649436760] time[1] tag[getInputTileBytes]
2020-05-28 09:03:56,762 [pool-1-thread-11] INFO  org.perf4j.TimingLogger - start[1590649436762] time[0] tag[writeTile]
2020-05-28 09:03:56,763 [main] INFO  org.perf4j.TimingLogger - start[1590649436761] time[1] tag[getInputTileBytes]
2020-05-28 09:03:56,765 [pool-1-thread-12] INFO  org.perf4j.TimingLogger - start[1590649436763] time[1] tag[writeTile]
2020-05-28 09:03:56,765 [main] INFO  org.perf4j.TimingLogger - start[1590649436763] time[2] tag[getInputTileBytes]
2020-05-28 09:03:56,766 [pool-1-thread-13] INFO  org.perf4j.TimingLogger - start[1590649436765] time[0] tag[writeTile]
2020-05-28 09:03:56,767 [main] INFO  org.perf4j.TimingLo

2020-05-28 09:03:56,809 [pool-1-thread-2] INFO  org.perf4j.TimingLogger - start[1590649436808] time[1] tag[writeTile]
2020-05-28 09:03:56,810 [pool-1-thread-3] INFO  org.perf4j.TimingLogger - start[1590649436809] time[0] tag[writeTile]
2020-05-28 09:03:56,810 [main] INFO  org.perf4j.TimingLogger - start[1590649436809] time[0] tag[getInputTileBytes]
2020-05-28 09:03:56,811 [pool-1-thread-5] INFO  org.perf4j.TimingLogger - start[1590649436810] time[0] tag[writeTile]
2020-05-28 09:03:56,811 [main] INFO  org.perf4j.TimingLogger - start[1590649436810] time[1] tag[getInputTileBytes]
2020-05-28 09:03:56,812 [pool-1-thread-6] INFO  org.perf4j.TimingLogger - start[1590649436811] time[0] tag[writeTile]
2020-05-28 09:03:56,812 [main] INFO  org.perf4j.TimingLogger - start[1590649436811] time[0] tag[getInputTileBytes]
2020-05-28 09:03:56,813 [pool-1-thread-7] INFO  org.perf4j.TimingLogger - start[1590649436812] time[0] tag[writeTile]
2020-05-28 09:03:56,813 [main] INFO  org.perf4j.TimingLog

2020-05-28 09:03:56,866 [main] INFO  org.perf4j.TimingLogger - start[1590649436604] time[262] tag[convertToPyramid]
CPU times: user 28.5 ms, sys: 14.4 ms, total: 42.8 ms
Wall time: 1.23 s


In [9]:
!ls -ltrah /tmp/output*tiff

-rw-r--r--  1 jamoore  wheel    54K May 28 09:03 /tmp/output-1.ome.tiff
-rw-r--r--  1 jamoore  wheel   138M May 28 09:03 /tmp/output-2.ome.tiff
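To compare settings side by side, each compression type can be written to its own file rather than overwriting the same output. The sketch below only builds the command lines; the compression names are copied from the `raw2ometiff --help` output, while the function name `compression_commands` and the output naming scheme are assumptions. Some compression types may not suit every pixel type, so individual commands could fail.

```python
# Compression names as listed in the raw2ometiff --help output.
COMPRESSIONS = ["Uncompressed", "LZW", "JPEG-2000", "JPEG-2000 Lossy", "JPEG", "zlib"]


def compression_commands(raw_dir: str, out_prefix: str):
    """Return one raw2ometiff command per compression type.

    Each command writes to its own file so that results can be compared
    side by side; the naming scheme is an assumption of this sketch.
    """
    commands = []
    for name in COMPRESSIONS:
        suffix = name.lower().replace(" ", "-")
        out_path = f"{out_prefix}-{suffix}.ome.tiff"
        commands.append(["raw2ometiff", raw_dir, out_path, f"--compression={name}"])
    return commands


for command in compression_commands("/tmp/output-2", "/tmp/output-2"):
    print(command)
```

Each list can be passed to `subprocess.run`, which keeps a value such as `JPEG-2000 Lossy` intact without extra shell quoting.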


# OME Zarr format

Another option provided by `bioformats2raw` is `--file_type`. Passing `--file_type=zarr` produces Zarr output rather than N5 as the intermediate format. If we additionally pass the `--dimension-order` argument, the intermediate result can be used directly by the ome-zarr library.

In [1]:
%%time
!bioformats2raw a.fake /tmp/output-3 --file_type=zarr --dimension-order=XYZCT

2020-11-30 09:26:37,833 [main] INFO  c.g.bioformats2raw.Converter - Output will be incompatible with raw2ometiff (pyramidName: data.zarr, scaleFormatString: %d/%d)
2020-11-30 09:26:38,642 [main] INFO  loci.formats.ImageReader - FakeReader initializing a.fake
2020-11-30 09:26:38,903 [main] INFO  c.g.bioformats2raw.Converter - Using 2 pyramid resolutions
2020-11-30 09:26:38,904 [main] INFO  c.g.bioformats2raw.Converter - Preparing to write pyramid sizeX 512 (tileWidth: 1024) sizeY 512 (tileWidth: 1024) imageCount 1
2020-11-30 09:26:39,536 [main] WARN  c.g.bioformats2raw.Converter - Reducing active tileWidth to 512
2020-11-30 09:26:39,536 [main] WARN  c.g.bioformats2raw.Converter - Reducing active tileHeight to 512
2020-11-30 09:26:39,575 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 0, 0, 0] to /0/0
2020-11-30 09:26:39,596 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - tile read complete 1/1
2020-11-30 09:26:39,596 [pool-1-thread-1] INF

In [None]:
import ome_zarr
z = ome_zarr.parse_url("/tmp/output-3/data.zarr/0")
z.is_ome_zarr()

In [None]:
!ome_zarr info /tmp/output-3/data.zarr/0
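The Zarr output can also be inspected with the standard library alone. The sketch below assumes the Zarr v2 directory layout, in which every array directory contains a `.zarray` JSON file with `shape` and `chunks` entries; the function name `list_zarr_arrays` is hypothetical.

```python
import json
import os


def list_zarr_arrays(root: str):
    """Yield (relative_path, shape, chunks) for each Zarr v2 array under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".zarray" in filenames:
            with open(os.path.join(dirpath, ".zarray")) as handle:
                meta = json.load(handle)
            rel = os.path.relpath(dirpath, root)
            yield rel, meta.get("shape"), meta.get("chunks")


# Prints nothing if the directory does not exist.
for path, shape, chunks in sorted(list_zarr_arrays("/tmp/output-3/data.zarr")):
    print(path, shape, chunks)
```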

### License
Copyright (C) 2019-2020 University of Dundee. All Rights Reserved.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details. You should have received a copy of the GNU General
Public License along with this program; if not, write to the
Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.