## Convert image data to h5 for ilastik

Converting your image data to hdf5 lets ilastik read it block-wise, which is the most efficient way for it to access the data.
In this notebook, we show you how to do it from Python, which is useful if you're already doing some pre-processing there.
There are also alternative ways to convert your data to hdf5:

* When loading files, you can [specify to save them to the project file](https://www.ilastik.org/documentation/basics/dataselection#properties), which will convert to hdf5 in ilastik.
  An additional benefit of this is that your project file becomes fully portable if all images are saved to it.
* You can also load your data into Fiji and use our [Fiji plugin](https://www.ilastik.org/documentation/fiji_export/plugin) to convert your data to hdf5.

In [None]:
import h5py
import numpy
import vigra
# add additional imports to read/pre-process your data

We skip the loading part here, because this will vary for different file formats.
With `tifffile`, for example, you could have something like:

```Python
import tifffile

image = tifffile.imread("/path/to/myimage.tiff")
```

For this example we generate some random data:

In [None]:
# generate 5 dimensional data
data_shape = (64, 512, 384, 10, 4)  # z, y, x, t, c
image = numpy.random.randint(0, 256, data_shape, dtype="uint8")
# specify axistags; this helps ilastik interpret the data correctly
# the order is the same as in the generated shape
# ilastik uses the vigra library internally for handling axistags:
axistags = vigra.defaultAxistags("zyxtc")

In [None]:
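# write out the file
output_filename = "output.h5"
# mode "a" keeps an existing file; create_dataset raises an error
# if a dataset named "mydataset" is already present in it
with h5py.File(output_filename, "a") as f:
    # chunks follow the same axis order as the data: z, y, x, t, c
    ds = f.create_dataset(name="mydataset", data=image, chunks=(64, 64, 64, 1, 1))
    # store the axis order as JSON so ilastik can interpret the dataset
    ds.attrs["axistags"] = axistags.toJSON()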
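
To check that a file was written as intended, you can open it again and inspect the dataset. The following is a minimal sketch, not part of the original workflow: it writes a small, hypothetical test file to a temporary directory (the file name `check.h5` and the reduced shape are assumptions for illustration) and leaves out the `axistags` attribute so that it runs without `vigra`.

```python
import os
import tempfile

import h5py
import numpy

# Hypothetical small test data (z, y, x, t, c), same axis order as above.
shape = (64, 128, 128, 2, 3)
chunks = (64, 64, 64, 1, 1)
image = numpy.random.randint(0, 256, shape, dtype="uint8")

# Write to a temporary location so no existing file is touched.
path = os.path.join(tempfile.mkdtemp(), "check.h5")
with h5py.File(path, "w") as f:
    f.create_dataset(name="mydataset", data=image, chunks=chunks)

# Re-open read-only and inspect what was stored.
with h5py.File(path, "r") as f:
    ds = f["mydataset"]
    stored_shape, stored_chunks, stored_dtype = ds.shape, ds.chunks, ds.dtype
    # A chunk-aligned read like this one touches a single chunk on disk.
    block = ds[0:64, 0:64, 0:64, 0, 0]

print(stored_shape, stored_chunks, stored_dtype, block.shape)
```

For a file that also carries axistags, `ds.attrs["axistags"]` can be read inside the same `with` block to confirm the stored axis order.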

Note on chunks: This is the most important part for getting the performance boost in ilastik.
In hdf5, data can be accessed in blocks (specified via the `chunks` keyword argument in `create_dataset`).
The chunk shape follows the same axis order as the image data.
Since ilastik usually processes timepoints and channels independently, we set the chunk size along those axes to `1`.
For 3D data, `64` is a sensible choice along the spatial axes.
For 2D data, we recommend using `256` along `x` and `y`.
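
The rule of thumb above can be sketched as a small helper. This is only an illustration: `suggest_chunks` is a hypothetical function, not part of ilastik, h5py or vigra, and it assumes the axes are given as a string of single-letter keys such as `"zyxtc"`. It also clips each chunk edge to the data size, since h5py does not accept a chunk larger than the dataset along any axis.

```python
def suggest_chunks(axes, shape):
    """Return a chunk shape following the rule of thumb above (hypothetical helper)."""
    if len(axes) != len(shape):
        raise ValueError("axes and shape must have the same length")
    spatial = [axis for axis in axes if axis in "xyz"]
    # 64 along each spatial axis for 3D data, 256 for 2D data
    edge = 64 if len(spatial) >= 3 else 256
    chunks = []
    for axis, size in zip(axes, shape):
        if axis in "xyz":
            # clip to the data size along this axis
            chunks.append(min(edge, size))
        else:
            # time and channel axes get a chunk size of 1
            chunks.append(1)
    return tuple(chunks)


print(suggest_chunks("zyxtc", (64, 512, 384, 10, 4)))  # (64, 64, 64, 1, 1)
print(suggest_chunks("tyxc", (10, 2048, 2048, 3)))     # (1, 256, 256, 1)
```

The returned tuple can be passed directly as the `chunks` argument of `create_dataset`.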