# Zarr Basics

In [1]:
import zarr
import dask.array as da

In [2]:
# Initializing arrays
z = zarr.empty(shape = (1024,1024,1024), chunks = (128,128,128))
arr = da.ones(shape=(1024,1024,1024), chunks = (128,128,128))

## Writing Data
Data can be written to a zarr array in several ways. The simplest is to assign your data, whether a dask or a numpy array, to the region of the zarr array that you wish to write to.

In [3]:
z[:] = arr
display(z)

<zarr.core.Array (1024, 1024, 1024) float64>

## Parallel writes
When writing from a dask array, you can use the `to_zarr` method, which can be parallelized and distributed across multiple threads, processes, or workers. This is especially useful when the data is being written to a remote location, such as an S3 bucket or Google Cloud Storage. As of now, however, it can only be used when the entire array is written at once (i.e. not with appends or updates).

In [4]:
arr.to_zarr(z)
display(z)

<zarr.core.Array (1024, 1024, 1024) float64>

## Appending Data
Data can be appended to a zarr array with the `append` method. The `axis` argument selects the dimension along which the data is appended (it defaults to 0, the first axis), and the array is resized accordingly. The method returns the shape of the resulting array.

In [5]:
arr = da.ones(shape=(1,1024,1024), chunks = (128,128,128))
z.append(arr, axis=0)  # axis=0 is the default; stated explicitly for clarity

(1025, 1024, 1024)

## Persistent Data
Zarr exposes a number of methods for writing and reading persistent arrays. Data can be written to and retrieved from a directory on the local filesystem, a variety of databases, and cloud storage containers. Operations on these arrays are written directly to the underlying storage system, so there is no separate save step.

### Writing to Directory Store Example

In [6]:
# Open Zarr store
store = zarr.storage.DirectoryStore('./example.zarr')
# Retrieve the root group object from the store.
# A store can hold several arrays in groups, which can be arranged in hierarchies.
# In this case, we simply store our array under the root group.
root = zarr.group(store=store, overwrite=True)
# Create an array in root group of store
arr = root.create('array', shape=(128,128,128), chunks = (32,32,32))
# Fill array with ones
arr[:] = zarr.ones(shape=(128,128,128), chunks = (32,32,32))

### Reading from Directory Store Example

In [7]:
# Open Zarr store
store = zarr.storage.DirectoryStore('./example.zarr')
# Retrieve the root group object
root = zarr.group(store=store)
# Retrieve array object
arr = root['array']
# Display contents of array
display(arr)

<zarr.core.Array '/array' (128, 128, 128) float64>