# Conversion Tools

Background: see the forum thread introducing this workflow at https://forum.image.sc/t/converting-whole-slide-images-to-ome-tiff-a-new-workflow/32110/4

<img src="blog-2019-12-converting-whole-slide-images.jpg" style="height:300px" />



The two basic commands are `bioformats2raw` and `raw2ometiff`. Together they form a pipeline for scalably converting large images into OME-TIFF. The primary caveat is that the conversion requires **twice** the storage, since the intermediate output of `bioformats2raw` is written to disk before `raw2ometiff` produces the final file.
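
The two-stage pipeline described above can be sketched in Python as follows. This is a minimal illustration, not part of either tool: the helper names, the `raw` sub-directory and the `.ome.tiff` output naming are hypothetical, and the positional form `raw2ometiff <input directory> <output file>` is an assumption that should be checked against `raw2ometiff --help`.

```python
import subprocess
from pathlib import Path


def bioformats2raw_cmd(input_path, raw_dir):
    """Build the first-stage command: source image -> intermediate pyramid."""
    return ["bioformats2raw", str(input_path), str(raw_dir)]


def raw2ometiff_cmd(raw_dir, ome_tiff_path):
    """Build the second-stage command: intermediate pyramid -> OME-TIFF.

    Assumption: raw2ometiff takes <input directory> <output file>.
    """
    return ["raw2ometiff", str(raw_dir), str(ome_tiff_path)]


def convert(input_path, work_dir, runner=subprocess.run):
    """Run both stages in order; `runner` can be swapped out for a dry run."""
    work_dir = Path(work_dir)
    raw_dir = work_dir / "raw"  # hypothetical location for the intermediate
    ome_tiff = work_dir / (Path(input_path).stem + ".ome.tiff")
    for cmd in (bioformats2raw_cmd(input_path, raw_dir),
                raw2ometiff_cmd(raw_dir, ome_tiff)):
        # check=True stops the pipeline if a stage exits with an error
        runner(cmd, check=True)
    return ome_tiff
```

Because the intermediate directory and the final file exist side by side until the intermediate is deleted, this layout makes the doubled storage requirement easy to see.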

In [1]:
%%time
!bioformats2raw --help

Missing required parameters: <inputPath>, <outputPath>
Usage: <main class> [--debug] [--version] [--extra-readers[=<extraReaders>[,
                    <extraReaders>...]]]...
                    [--additional-scale-format-string-args=<additionalScaleFormatStringArgsCsv>]
                    [-c=<compressionType>]
                    [--compression-parameter=<compressionParameter>]
                    [--dimension-order=<dimensionOrder>]
                    [--file_type=<fileType>] [-h=<tileHeight>]
                    [--max_cached_tiles=<maxCachedTiles>]
                    [--max_workers=<maxWorkers>]
                    [--memo-directory=<memoDirectory>

## Simple Invocation

Here we use Bio-Formats' ability to generate test data so that we can quickly try out the conversion within the notebook.

See https://docs.openmicroscopy.org/bio-formats/latest/developers/generating-test-images.html for more information.
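
Fake image names carry their parameters in the filename itself, as `key=value` pairs joined by `&` and followed by the `.fake` extension. The helper below is a small sketch for building such names from Python; `fake_image_name` is a hypothetical function, not part of Bio-Formats.

```python
import shlex


def fake_image_name(name, **params):
    """Return a Bio-Formats fake-image filename built from key=value pairs."""
    parts = [name] + [f"{key}={value}" for key, value in params.items()]
    return "&".join(parts) + ".fake"


small = fake_image_name("a")
big = fake_image_name("big", sizeX=10000, sizeY=10000)

# '&' is special in most shells, so quote the name before using it in a
# shell command line (or pass it as a list element to subprocess.run).
quoted_big = shlex.quote(big)
```

Quoting matters because an unquoted `&` would be interpreted by the shell as a request to run the preceding command in the background.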

**First we generate the raw intermediate format:**

In [8]:
%%time
!bioformats2raw a.fake /tmp/output-1

2020-11-30 17:20:39,875 [main] INFO  loci.formats.ImageReader - FakeReader initializing a.fake
2020-11-30 17:20:40,164 [main] INFO  c.g.bioformats2raw.Converter - Using 2 pyramid resolutions
2020-11-30 17:20:40,165 [main] INFO  c.g.bioformats2raw.Converter - Preparing to write pyramid sizeX 512 (tileWidth: 1024) sizeY 512 (tileWidth: 1024) imageCount 1
2020-11-30 17:20:40,567 [main] WARN  c.g.bioformats2raw.Converter - Reducing active tileWidth to 512
2020-11-30 17:20:40,568 [main] WARN  c.g.bioformats2raw.Converter - Reducing active tileHeight to 512
2020-11-30 17:20:40,587 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 0, 0, 0] to /0/0
2020-11-30 17:20:40,611 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - tile read complete 1/1
2020-11-30 17:20:40,612 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1606753240587] time[24] tag[getTile]
2020-11-30 17:20:40,617 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - successfully 

**The same operation on larger generated data still completes quickly:**

In [4]:
%%time
!bioformats2raw 'big&sizeX=10000&sizeY=10000.fake' /tmp/output-2

2020-11-30 17:19:45,987 [main] INFO  loci.formats.ImageReader - FakeReader initializing big&sizeX=10000&sizeY=10000.fake
2020-11-30 17:19:46,243 [main] INFO  c.g.bioformats2raw.Converter - Using 7 pyramid resolutions
2020-11-30 17:19:46,243 [main] INFO  c.g.bioformats2raw.Converter - Preparing to write pyramid sizeX 10000 (tileWidth: 1024) sizeY 10000 (tileWidth: 1024) imageCount 1
2020-11-30 17:19:46,727 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [0, 0, 0, 0, 0] to /0/0
2020-11-30 17:19:46,769 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - tile read complete 1/100
2020-11-30 17:19:46,769 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1606753186727] time[41] tag[getTile]
2020-11-30 17:19:46,778 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - successfully wrote at [0, 0, 0, 0, 0] to /0/0
2020-11-30 17:19:46,779 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=0 yy=0

2020-11-30 17:19:47,046 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - successfully wrote at [2, 1, 0, 0, 0] to /0/0
2020-11-30 17:19:47,046 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=2048 yy=1024 width=1024 height=1024
2020-11-30 17:19:47,046 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [1, 0, 0, 0, 0] to /0/0
2020-11-30 17:19:47,048 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - successfully wrote at [3, 1, 0, 0, 0] to /0/0
2020-11-30 17:19:47,048 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=3072 yy=1024 width=1024 height=1024
2020-11-30 17:19:47,049 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [4, 1, 0, 0, 0] to /0/0
2020-11-30 17:19:47,050 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [5, 1, 0, 0, 0] to /0/0
2020-11-30 17:19:47,0

2020-11-30 17:19:47,146 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - successfully wrote at [7, 2, 0, 0, 0] to /0/0
2020-11-30 17:19:47,146 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=7168 yy=2048 width=1024 height=1024
2020-11-30 17:19:47,148 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [0, 3, 0, 0, 0] to /0/0
2020-11-30 17:19:47,151 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - tile read complete 29/100
2020-11-30 17:19:47,151 [pool-1-thread-3] INFO  org.perf4j.TimingLogger - start[1606753187134] time[16] tag[getTile]
2020-11-30 17:19:47,157 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - successfully wrote at [8, 2, 0, 0, 0] to /0/0
2020-11-30 17:19:47,157 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=8192 yy=2048 width=1024 height=1024
2020-11-30 17:19:47,160 [pool-1-thread-3] INFO  c.g.bioformat

2020-11-30 17:19:47,298 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - successfully wrote at [3, 4, 0, 0, 0] to /0/0
2020-11-30 17:19:47,298 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=3072 yy=4096 width=1024 height=1024
2020-11-30 17:19:47,300 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [6, 4, 0, 0, 0] to /0/0
2020-11-30 17:19:47,306 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - tile read complete 46/100
2020-11-30 17:19:47,306 [pool-1-thread-4] INFO  org.perf4j.TimingLogger - start[1606753187289] time[17] tag[getTile]
2020-11-30 17:19:47,309 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - successfully wrote at [5, 4, 0, 0, 0] to /0/0
2020-11-30 17:19:47,309 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=5120 yy=4096 width=1024 height=1024
2020-11-30 17:19:47,310 [pool-1-thread-2] INFO  c.g.bioformat

2020-11-30 17:19:47,400 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [9, 5, 0, 0, 0] to /0/0
2020-11-30 17:19:47,400 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - successfully wrote at [7, 5, 0, 0, 0] to /0/0
2020-11-30 17:19:47,400 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=7168 yy=5120 width=1024 height=1024
2020-11-30 17:19:47,402 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [0, 6, 0, 0, 0] to /0/0
2020-11-30 17:19:47,411 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - tile read complete 58/100
2020-11-30 17:19:47,411 [pool-1-thread-3] INFO  org.perf4j.TimingLogger - start[1606753187400] time[11] tag[getTile]
2020-11-30 17:19:47,414 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - successfully wrote at [9, 5, 0, 0, 0] to /0/0
2020-11-30 17:19:47,415 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - Successfully proces

2020-11-30 17:19:47,507 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - tile read complete 74/100
2020-11-30 17:19:47,507 [pool-1-thread-3] INFO  org.perf4j.TimingLogger - start[1606753187492] time[15] tag[getTile]
2020-11-30 17:19:47,511 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - tile read complete 75/100
2020-11-30 17:19:47,511 [pool-1-thread-4] INFO  org.perf4j.TimingLogger - start[1606753187498] time[13] tag[getTile]
2020-11-30 17:19:47,511 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - successfully wrote at [3, 7, 0, 0, 0] to /0/0
2020-11-30 17:19:47,511 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=3072 yy=7168 width=1024 height=1024
2020-11-30 17:19:47,513 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [5, 7, 0, 0, 0] to /0/0
2020-11-30 17:19:47,514 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - successfully wrote at [4, 7, 0, 0, 0] to /0/0
2020-11

2020-11-30 17:19:47,658 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - successfully wrote at [3, 9, 0, 0, 0] to /0/0
2020-11-30 17:19:47,658 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=3072 yy=9216 width=1024 height=784
2020-11-30 17:19:47,660 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [6, 9, 0, 0, 0] to /0/0
2020-11-30 17:19:47,661 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - successfully wrote at [4, 9, 0, 0, 0] to /0/0
2020-11-30 17:19:47,661 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=0 plane=0 xx=4096 yy=9216 width=1024 height=784
2020-11-30 17:19:47,663 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [7, 9, 0, 0, 0] to /0/0
2020-11-30 17:19:47,666 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - tile read complete 95/100
2020-11-30 17:19:47,666 [pool-1-thread-4] INFO  o

2020-11-30 17:19:47,871 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [3, 1, 0, 0, 0] to /0/1
2020-11-30 17:19:47,879 [pool-1-thread-3] INFO  org.perf4j.TimingLogger - start[1606753187863] time[16] tag[getTileDownsampled]
2020-11-30 17:19:47,880 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - tile read complete 6/25
2020-11-30 17:19:47,880 [pool-1-thread-3] INFO  org.perf4j.TimingLogger - start[1606753187862] time[17] tag[getTile]
2020-11-30 17:19:47,882 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - successfully wrote at [2, 1, 0, 0, 0] to /0/1
2020-11-30 17:19:47,882 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=1 plane=0 xx=2048 yy=1024 width=1024 height=1024
2020-11-30 17:19:47,884 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [4, 1, 0, 0, 0] to /0/1
2020-11-30 17:19:47,890 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1606753187871] ti

2020-11-30 17:19:48,024 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=1 plane=0 xx=3072 yy=3072 width=1024 height=1024
2020-11-30 17:19:48,026 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1606753188004] time[21] tag[getTileDownsampled]
2020-11-30 17:19:48,026 [pool-1-thread-3] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [1, 4, 0, 0, 0] to /0/1
2020-11-30 17:19:48,026 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - tile read complete 19/25
2020-11-30 17:19:48,026 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1606753188004] time[21] tag[getTile]
2020-11-30 17:19:48,030 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - successfully wrote at [4, 3, 0, 0, 0] to /0/1
2020-11-30 17:19:48,030 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=1 plane=0 xx=4096 yy=3072 width=904 height=1024
2020-11-30 17:19:48,032 [pool-1-thread-1] INFO  c.g.bioforma

2020-11-30 17:19:48,182 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - requesting tile to write at [2, 2, 0, 0, 0] to /0/2
2020-11-30 17:19:48,189 [pool-1-thread-2] INFO  org.perf4j.TimingLogger - start[1606753188175] time[13] tag[getTileDownsampled]
2020-11-30 17:19:48,189 [pool-1-thread-2] INFO  c.g.bioformats2raw.Converter - tile read complete 6/9
2020-11-30 17:19:48,189 [pool-1-thread-2] INFO  org.perf4j.TimingLogger - start[1606753188175] time[13] tag[getTile]
2020-11-30 17:19:48,189 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1606753188182] time[6] tag[getTileDownsampled]
2020-11-30 17:19:48,189 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - tile read complete 7/9
2020-11-30 17:19:48,189 [pool-1-thread-1] INFO  org.perf4j.TimingLogger - start[1606753188182] time[7] tag[getTile]
2020-11-30 17:19:48,191 [pool-1-thread-1] INFO  c.g.bioformats2raw.Converter - successfully wrote at [2, 2, 0, 0, 0] to /0/2
2020-11-30 17:19:48,192 [pool-1-thread-1] INFO  c.g.

2020-11-30 17:19:48,660 [pool-1-thread-4] INFO  org.perf4j.TimingLogger - start[1606753188658] time[1] tag[getTile]
2020-11-30 17:19:48,661 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - successfully wrote at [0, 0, 0, 0, 0] to /0/6
2020-11-30 17:19:48,661 [pool-1-thread-4] INFO  c.g.bioformats2raw.Converter - Successfully processed tile; resolution=6 plane=0 xx=0 yy=0 width=156 height=156
CPU times: user 186 ms, sys: 80.8 ms, total: 267 ms
Wall time: 4.57 s
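
The numbers in the log above (7 pyramid resolutions, 100 tiles at resolution 0, 25 at resolution 1, 9 at resolution 2, and edge tiles of height 784) can be reproduced with a little arithmetic. The sketch below assumes each level halves the previous one and that halving stops once both dimensions are at most 256 pixels; that threshold is inferred from the logged output in this notebook rather than taken from the tool's documentation, and the function names are hypothetical.

```python
import math


def pyramid_levels(size_x, size_y, min_size=256):
    """List (width, height) per resolution, halving until both fit min_size.

    min_size=256 is an assumption inferred from the logs in this notebook.
    """
    levels = [(size_x, size_y)]
    while size_x > min_size or size_y > min_size:
        size_x //= 2
        size_y //= 2
        levels.append((size_x, size_y))
    return levels


def tile_count(size_x, size_y, tile=1024):
    """Number of tiles needed to cover a plane, counting partial edge tiles."""
    return math.ceil(size_x / tile) * math.ceil(size_y / tile)


def edge_tile_size(size, tile=1024):
    """Size of the last tile along one axis (a full tile if it divides evenly)."""
    remainder = size % tile
    return remainder if remainder else tile


levels = pyramid_levels(10000, 10000)
tiles_per_level = [tile_count(w, h) for w, h in levels]
```

For the 10000 x 10000 image this gives seven levels ending at 156 x 156, tile counts beginning 100, 25, 9, and an edge tile of 784 pixels at full resolution, matching the log. The 512 x 512 image from the first example yields two levels under the same rule.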


**The larger input naturally produces larger output:**

In [9]:
!du -sh /tmp/output-*

 28K	/tmp/output-1
1.1M	/tmp/output-2


## OME Zarr format

Another option provided by `bioformats2raw` is `--file_type=zarr`, which writes Zarr rather than N5 as the intermediate format. If we additionally pass the `--dimension-order` argument, the intermediate result can be used directly by the ome-zarr library. As the first log line below notes, output written this way is incompatible with `raw2ometiff`.

In [12]:
%%time
!bioformats2raw i2k2020.gif /tmp/output-3 --file_type=zarr --dimension-order=XYZCT

2020-11-30 17:22:36,836 [main] INFO  c.g.bioformats2raw.Converter - Output will be incompatible with raw2ometiff (pyramidName: data.zarr, scaleFormatString: %d/%d)
2020-11-30 17:22:37,636 [main] INFO  loci.formats.ImageReader - GIFReader initializing i2k2020.gif
2020-11-30 17:22:37,638 [main] INFO  loci.formats.FormatHandler - Verifying GIF format
2020-11-30 17:22:37,638 [main] INFO  loci.formats.FormatHandler - Reading dimensions
2020-11-30 17:22:37,638 [main] INFO  loci.formats.FormatHandler - Reading data blocks
2020-11-30 17:22:37,655 [main] INFO  loci.formats.FormatHandler - Populating metadata
2020-11-30 17:22:38,180 [main] INFO  c.g.bioformats2raw.Converter - Using 1 pyramid resolutions
2020-11-30 17:22:38,181 [main] INFO  c.g.bioformats2raw.Converter - Preparing to write pyramid sizeX 128 (tileWidth: 1024) sizeY 128 (tileWidth: 1024) imageCount 8
2020-11-30 17:22:38,620 [main] WARN  c.g.bioformats2raw.Converter - Reducing active tileWidth to 128
2020-11-30 17:22:38,620 [main] W
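
To check what the conversion wrote without installing any Zarr tooling, the output directory can be inspected with the Python standard library. The sketch below assumes a Zarr v2 directory store, in which every array directory contains a `.zarray` JSON metadata file; `list_zarr_arrays` is a hypothetical helper, and whether a given `bioformats2raw` version writes this layout is an assumption to verify on your own output.

```python
import json
from pathlib import Path


def list_zarr_arrays(root):
    """Map each array path (relative to root) to its shape, chunks and dtype.

    Assumes a Zarr v2 directory store, where each array directory holds a
    `.zarray` JSON metadata file.
    """
    root = Path(root)
    arrays = {}
    for meta_file in sorted(root.rglob(".zarray")):
        meta = json.loads(meta_file.read_text())
        rel = meta_file.parent.relative_to(root).as_posix()
        arrays[rel] = {
            "shape": meta["shape"],
            "chunks": meta["chunks"],
            "dtype": meta["dtype"],
        }
    return arrays
```

Calling `list_zarr_arrays("/tmp/output-3/data.zarr")` would then return one entry per series and resolution, keyed by paths that follow the `%d/%d` scale format string reported in the log.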

## Moving to the cloud

You can then copy the generated output to S3-compatible object storage, here using the MinIO client `mc`:

In [15]:
!mc cp --recursive /tmp/output-3/data.zarr/0/ play/i2k2020/gif.zarr/

...7.0.0.0.0:  4.19 KiB / 4.19 KiB  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓  5.06 KiB/s 0s

You can then view the uploaded image at http://hms-dbmi.github.io/vizarr?source=https://play.minio.io:9000/i2k2020/gif.zarr
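
The viewer link is simply the vizarr address with the public URL of the uploaded store passed as the `source` query parameter. The sketch below derives both from the `mc` target used above. The helper names are hypothetical, and the mapping from the `play` alias to `https://play.minio.io:9000` is taken from the link in this notebook rather than from your local `mc` configuration, so treat it as an assumption.

```python
def object_url(mc_target, endpoints):
    """Translate an mc target like 'play/bucket/key' into an HTTP(S) URL.

    `endpoints` maps an mc alias to its server URL and is supplied by the
    caller; it is not read from the mc configuration.
    """
    alias, _, remainder = mc_target.partition("/")
    return endpoints[alias].rstrip("/") + "/" + remainder.strip("/")


def vizarr_url(source, viewer="http://hms-dbmi.github.io/vizarr"):
    """Build a vizarr link that loads the Zarr store at `source`."""
    return f"{viewer}?source={source}"


# Assumed alias mapping, matching the endpoint in the link above.
endpoints = {"play": "https://play.minio.io:9000"}
store = object_url("play/i2k2020/gif.zarr/", endpoints)
link = vizarr_url(store)
```

The link only works if the browser is able to read the store at that URL.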

### License
Copyright (C) 2019-2020 University of Dundee. All Rights Reserved.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details. You should have received a copy of the GNU General
Public License along with this program; if not, write to the
Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.