# 3. Converting data to OME-NGFF (practical)

***

**ELMI 2021 NGFF Workshop**, 25 June 2021

***


## Summary

* 3.1. Data from IDR
* 3.2. Converting your data to OME-NGFF
* 3.3. Publishing your data with S3
* 3.4. Extras (time-permitting)

***

## 3.1. Data from IDR
We're going to start off by looking at some images you will likely have seen during the OMERO or IDR sessions.

**Our goal is to share these *without* using an OMERO server.**

<table>
    <tr>
        <td>
            <img alt="idr0062 thumbnails" src="images/training-1.png" style="height:150px"/>
        </td>
        <td>
            <img alt="idr0062 thumbnails" src="images/training-2.png" style="height:150px"/>
        </td>
        <td>
            <img alt="idr0023 3D screenshot" src="images/training-3.png" style="height:150px"/>
        </td>
    </tr>
</table>
    
The left two images are from the ilastik plugin guide presented by Petr: https://omero-guides.readthedocs.io/en/latest/ilastik/docs/ilastik_fiji.html

They are available in the "idr0062" project on the workshop server: https://workshop.openmicroscopy.org/webclient/?show=project-1952

The original dataset can be found in IDR study idr0062 by Blin _et al._: https://idr.openmicroscopy.org/webclient/?show=project-801

The image on the right, which is **much** smaller, is from idr0023 by Szymborska _et al._: http://idr.openmicroscopy.org/webclient/?show=project-52


***

## 3.2. Converting your data to OME-NGFF

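The two basic commands are `bioformats2raw` and `raw2ometiff`. Together they provide a scalable pipeline for converting large images into OME-TIFF: `bioformats2raw` writes an intermediate Zarr-based output, which `raw2ometiff` then turns into OME-TIFF. For OME-NGFF, only the first command is needed. The primary caveat of the full pipeline is that it requires **twice** the storage during the conversion.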


### 3.2.1 Conversion tools

https://forum.image.sc/t/converting-whole-slide-images-to-ome-tiff-a-new-workflow/32110/4


<img src="images/conversion.png" style="height:400px" />

In [1]:
!bioformats2raw --version

Version = 0.3.0-rc4
Bio-Formats version = 6.5.1


In [2]:
!bioformats2raw

Missing required parameters: <inputPath>, <outputPath>
Usage: <main class> [-p] [--no-hcs] [--[no-]nested] [--no-root-group]
                    [--overwrite] [--version] [--debug[=<logLevel>]]
                    [--extra-readers[=<extraReaders>[,<extraReaders>...]]]...
                    [--options[=<readerOptions>[,<readerOptions>...]]]... [-s
                    [=<seriesList>[,<seriesList>...]]]...
                    [--additional-scale-format-string-args=<additionalScaleForma
                    tStringArgsCsv>] [-c=<compressionType>]
                    [--dimension-order=<dimensionOrder>]
                    [--downsample-type=

In [3]:
!java --version

openjdk 11.0.9.1-internal 2020-11-04
OpenJDK Runtime Environment (build 11.0.9.1-internal+0-adhoc..src)
OpenJDK 64-Bit Server VM (build 11.0.9.1-internal+0-adhoc..src, mixed mode)


In [4]:
import os
# JVM options for the Java-based tools run in the cells below
os.environ["JAVA_OPTS"] = "--illegal-access=deny"

In [8]:
%%time
!bioformats2raw --overwrite trans_norm.tif trans_norm.ome.zarr

It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
CPU times: user 66 ms, sys: 42.3 ms, total: 108 ms
Wall time: 4.12 s
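
The same conversion can be driven from plain Python instead of a notebook shell escape. The following is a minimal sketch, not part of the workshop material: the helper names `bioformats2raw_command` and `convert` are hypothetical, and the only option used is the `--overwrite` flag listed in the usage text above.

```python
import shutil
import subprocess
from pathlib import Path


def bioformats2raw_command(src, dst, overwrite=False):
    """Build the argument list for a bioformats2raw call (hypothetical helper)."""
    cmd = ["bioformats2raw"]
    if overwrite:
        # --overwrite is listed in the usage text printed by the tool
        cmd.append("--overwrite")
    cmd += [str(Path(src)), str(Path(dst))]
    return cmd


def convert(src, dst, overwrite=False):
    """Run the conversion; raises if the tool is missing or exits non-zero."""
    if shutil.which("bioformats2raw") is None:
        raise RuntimeError("bioformats2raw is not on PATH")
    subprocess.run(bioformats2raw_command(src, dst, overwrite), check=True)


# Only build and show the command here; call convert(...) to actually run it.
cmd = bioformats2raw_command("trans_norm.tif", "trans_norm.ome.zarr", overwrite=True)
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.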


In [9]:
!find trans_norm.ome.zarr -name ".z*"

trans_norm.ome.zarr/.zgroup
trans_norm.ome.zarr/.zattrs
trans_norm.ome.zarr/0/.zgroup
trans_norm.ome.zarr/0/.zattrs
trans_norm.ome.zarr/0/0/.zarray
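
In a Zarr (version 2) hierarchy, `.zgroup` marks a group, `.zattrs` holds its attributes (this is where the OME metadata lives), and `.zarray` describes an array. As a rough Python equivalent of the `find` call, the sketch below lists those files with `pathlib`. The function name `zarr_metadata_files` is made up for illustration, and the directory it scans is an empty stand-in built in a temporary folder, not the real converted image.

```python
import tempfile
from pathlib import Path


def zarr_metadata_files(root):
    """Return the relative paths of all .z* metadata files below root."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob(".z*") if p.is_file()
    )


# Build a stand-in with the same layout as the listing (files contain only "{}")
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "trans_norm.ome.zarr"
    for rel in [".zgroup", ".zattrs", "0/.zgroup", "0/.zattrs", "0/0/.zarray"]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("{}")
    found = zarr_metadata_files(root)

for rel in found:
    print(rel)
```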


In [10]:
!ls -ltrah trans_norm.ome.zarr/0/0/0/0/0/0/0

-rw-r--r-- 1 jmarie jmarie 1.2K Jun 23 15:39 trans_norm.ome.zarr/0/0/0/0/0/0/0
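
The path above has seven components. The first two (`0/0`) are the group and the array whose `.zarray` appears in the previous listing; the remaining five are chunk indices, one per dimension of the 5D array. The sketch below shows how a pixel index maps to such a chunk key. The function `chunk_key` is illustrative only, and the chunk shape of one 30 x 30 plane per chunk is an assumption, since the notebook does not print the contents of `.zarray`.

```python
def chunk_key(pixel_index, chunk_shape, separator="/"):
    """Map a pixel index to the key of the chunk that contains it."""
    if len(pixel_index) != len(chunk_shape):
        raise ValueError("index and chunk shape must have the same length")
    # Integer division gives the chunk number along each dimension
    return separator.join(str(i // c) for i, c in zip(pixel_index, chunk_shape))


# ASSUMPTION: one 30 x 30 plane per chunk (not stated in the source)
chunks = (1, 1, 1, 30, 30)

first = chunk_key((0, 0, 0, 0, 0), chunks)
plane_98 = chunk_key((0, 0, 98, 12, 7), chunks)
print(first)
print(plane_98)
```

With a nested layout the separator is `/`, so each chunk index becomes one directory level.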


In [11]:
!ome_zarr info trans_norm.ome.zarr/0

/home/jmarie/trans_norm.ome.zarr/0 [zgroup]
 - metadata
   - Multiscales
 - data
   - (1, 1, 571, 30, 30)


***

## 3.3. Publishing your data with S3

You can then move the generated output to S3. Note: one of the most frequent mistakes here is a missing or misplaced trailing slash (`/`) on the paths passed to the command.

In [12]:
YOURNAME = input()

josh


In [13]:
!time mc cp --recursive trans_norm.ome.zarr/0/ elmi2021/idr-upload/elmi2021/$YOURNAME/my_trans_norm.ome.zarr/

.../0/98/0/0:  773.27 KiB / 773.27 KiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 18.10 KiB/s 42s

In [14]:
!mc cat elmi2021/idr-upload/elmi2021/$YOURNAME/my_trans_norm.ome.zarr/.zattrs

{
  "multiscales" : [
    {
      "metadata" : {
        "method" : "loci.common.image.SimpleImageScaler",
        "version" : "Bio-Formats 6.5.1"
      },
      "datasets" : [
        {
          "path" : "0"
        }
      ],
      "version" : "0.2"
    }
  ]
}
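
Because `.zattrs` is plain JSON, it can be inspected with nothing more than the standard library. As a small sketch, the snippet below parses the metadata shown above (pasted in as a string) and lists the resolution levels it declares.

```python
import json

# The .zattrs content printed by `mc cat`, pasted in verbatim
zattrs_text = """
{
  "multiscales" : [
    {
      "metadata" : {
        "method" : "loci.common.image.SimpleImageScaler",
        "version" : "Bio-Formats 6.5.1"
      },
      "datasets" : [
        {
          "path" : "0"
        }
      ],
      "version" : "0.2"
    }
  ]
}
"""

attrs = json.loads(zattrs_text)
multiscale = attrs["multiscales"][0]

# Each entry in "datasets" is one resolution level, stored as a sub-array
paths = [dataset["path"] for dataset in multiscale["datasets"]]
version = multiscale["version"]
print("version:", version)
print("resolution levels:", paths)
```

Here there is a single resolution level, `0`, which matches the single `.zarray` found earlier.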

In the cell below, please enter the password used in [What is the "Cloud"?](2_Minio.ipynb).
The password was sent prior to the workshop.

In [15]:
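import getpass
import os

# Quieten s3fs logging and tell fsspec where to look for its config file
os.environ["S3FS_LOGGING_LEVEL"] = "WARN"
os.environ["FSSPEC_CONFIG_DIR"] = "/tmp"
# Credentials: the access key is fixed, the secret is prompted for without echo
os.environ["AWS_ACCESS_KEY_ID"] = "elmi2021"
os.environ["AWS_SECRET_ACCESS_KEY"] = getpass.getpass()
# Point the s3 filesystem at the workshop endpoint
with open("/tmp/conf.json", "w") as o:
    o.write("""
    {"s3":
        {"client_kwargs":
          {"endpoint_url": "https://idr-ftp.openmicroscopy.org"}
        }
    }""")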

!ome_zarr -qqq info s3://idr-upload/elmi2021/josh/my_trans_norm.ome.zarr/

········
s3://idr-upload/elmi2021/josh/my_trans_norm.ome.zarr/ [zgroup]
 - metadata
   - Multiscales
 - data
   - (1, 1, 571, 30, 30)
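
The configuration file written in the cell above has one top-level key per protocol (`s3` here), and the values under it are passed to that filesystem when it is created. As an alternative sketch, the same file can be generated with `json.dumps`, which rules out hand-written JSON mistakes. The helper name `write_fsspec_config` is hypothetical, and a temporary directory stands in for `/tmp`.

```python
import json
import tempfile
from pathlib import Path


def write_fsspec_config(config_dir, endpoint_url):
    """Write a conf.json with the same structure as the cell above."""
    config = {"s3": {"client_kwargs": {"endpoint_url": endpoint_url}}}
    path = Path(config_dir) / "conf.json"
    path.write_text(json.dumps(config, indent=2))
    return path


with tempfile.TemporaryDirectory() as config_dir:
    conf_path = write_fsspec_config(config_dir, "https://idr-ftp.openmicroscopy.org")
    # Read the file back to confirm that it is valid JSON
    loaded = json.loads(conf_path.read_text())

print(loaded["s3"]["client_kwargs"]["endpoint_url"])
```

To use such a file, set `FSSPEC_CONFIG_DIR` to the directory that contains it before starting the tool that reads it.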


In [16]:
from IPython.display import Video
Video("images/idr0023.mp4")

## 3.4. Extras

### 3.4.1 Renaming

Another important difference from a filesystem: although `hello` looks as if it sits in a directory, you should think of the entire string after the bucket name as a single "key". Renaming therefore means moving every object whose key starts with the old prefix.
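
To make the "key" idea concrete, here is a toy model in which a bucket is just a Python dictionary from key strings to bytes. It is only an illustration of the concept, not how `mc` or any S3 client is implemented; the function `rename_prefix` and the sample keys are invented for this sketch.

```python
def rename_prefix(bucket, old_prefix, new_prefix):
    """Move every object whose key starts with old_prefix to new_prefix."""
    # Take a snapshot of the matching keys, since the dict changes while we loop
    for key in [k for k in bucket if k.startswith(old_prefix)]:
        bucket[new_prefix + key[len(old_prefix):]] = bucket.pop(key)
    return bucket


# A toy "bucket": there are no directories, only full key strings
bucket = {
    "elmi2021/josh/my_trans_norm.ome.zarr/.zattrs": b"{}",
    "elmi2021/josh/my_trans_norm.ome.zarr/0/.zarray": b"{}",
    "elmi2021/josh/other.txt": b"hello",
}

rename_prefix(
    bucket,
    "elmi2021/josh/my_trans_norm.ome.zarr/",
    "elmi2021/josh/public_trans_norm.ome.zarr/",
)
renamed = sorted(bucket)
for key in renamed:
    print(key)
```

The trailing slash on the prefix matters in this model: without it, a key such as `my_trans_norm.ome.zarr.bak` would also match.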

In [17]:
!mc mv --recursive elmi2021/idr-upload/elmi2021/$YOURNAME/my_trans_norm.ome.zarr/ elmi2021/idr-upload/elmi2021/$YOURNAME/public_trans_norm.ome.zarr

mc: <ERROR> Unable to get bucket lock configuration of `elmi2021/idr-upload/elmi2021/josh/my_trans_norm.ome.zarr/`. Access Denied.

### 3.4.2 omero-cli-zarr

This example exports the image at https://outreach.openmicroscopy.org/webclient/img_detail/55204/?dataset=6107 directly from OMERO. You will need the password to connect to the OMERO.server.
This is different from the password used previously.
Type the password, press Enter, and then move manually to the next cell.

In [1]:
# Another block to get your workshop password from a previous session
import getpass
workshop_pass = getpass.getpass()

········


In [3]:
!omero login trainer-1@wss://outreach.openmicroscopy.org/omero-ws -w $workshop_pass

Created session for trainer-1@wss://outreach.openmicroscopy.org/omero-ws:443. Idle timeout: 10 min. Current group: Lab1


In [4]:
!rm -rf 55204.zarr
!time omero zarr export Image:55204

Using session for trainer-1@wss://outreach.openmicroscopy.org/omero-ws:443. Idle timeout: 10 min. Current group: Lab1
Exporting to 55204.zarr (0.2)
Finished.

real	1m4.184s
user	0m7.230s
sys	0m3.500s


In [18]:
!find 55204.zarr -name ".z*"

55204.zarr/.zattrs
55204.zarr/.zgroup
55204.zarr/0/.zarray
55204.zarr/1/.zarray
55204.zarr/2/.zarray


### 3.4.3 Other resources

<table>
    <tr>
        <td>
            <a href="https://downloads.openmicroscopy.org/presentations/2020/Dundee/Workshops/NGFF/zarr_diagram/">
<img src="images/resources-1.png" alt="Screenshot of the Zarr diagram from OME2020" style="height:200px"/>
            </a>
        </td>
        <td>
<a href="https://downloads.openmicroscopy.org/presentations/2020/Dundee/Workshops/NGFF/zarr_diagram/">Diagram for how data moves</a>
        </td>
    </tr>
    <tr>
        <td>
      <a href="https://blog.openmicroscopy.org/file-formats/community/2020/11/04/zarr-data/">      
<img src="images/resources-2.png" alt="Screenshot from the blog post on publishing OME-Zarr files" style="height:200px"/>
            </a>
        </td>
        <td>
<a href="https://blog.openmicroscopy.org/file-formats/community/2020/11/04/zarr-data/">Blog post for an easy way to publish OME-Zarr files</a>
        </td>
    </tr>
</table>    

### License (BSD 2-Clause)

Copyright (c) 2021, University of Dundee. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.