# Tutorial

Zarr provides classes and functions for working with N-dimensional arrays that
behave like NumPy arrays but whose data is divided into chunks, each of which is
compressed. If you are already familiar with HDF5, then Zarr arrays provide
similar functionality with some additional flexibility.

## Creating an array

Zarr has several functions for creating arrays. For example:

In [None]:
import zarr
z = zarr.zeros((10_000, 10_000), chunks=(1_000, 1_000), dtype='i4')
z

The code above creates a 2-dimensional array of 32-bit integers with 10,000 rows
and 10,000 columns, divided into chunks of 1,000 rows by 1,000 columns each
(10 × 10 = 100 chunks in total).
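
The chunk count follows from dividing the length of each dimension by the chunk
length along that dimension and rounding up. The sketch below reproduces that
arithmetic in plain Python. `chunk_grid` is a hypothetical helper written for
this illustration and is not part of the Zarr API.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each dimension (hypothetical helper, not Zarr API)."""
    # Round up so that a partial chunk at the edge of the array is still counted.
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


shape = (10_000, 10_000)
chunks = (1_000, 1_000)

grid = chunk_grid(shape, chunks)   # chunks per dimension
n_chunks = math.prod(grid)         # total number of chunks
print(grid, n_chunks)
```

If the shape is not an exact multiple of the chunk shape, the rounding step
counts the partially filled chunks at the array edge as well.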

For a complete list of array creation routines see the `zarr.creation` module
documentation.

## Reading and writing data

Zarr arrays support a similar interface to NumPy arrays for reading and writing
data. For example, the entire array can be filled with a scalar value:

In [None]:
z[:] = 42

Regions of the array can also be written to, e.g.:

In [None]:
import numpy as np
z[0, :] = np.arange(10_000)
z[:, 0] = np.arange(10_000)

The contents of the array can be retrieved by indexing or slicing, which loads
the requested region into memory as a NumPy array, e.g.:

In [None]:
z[0, 0]

In [None]:
z[-1, -1]

In [None]:
z[0, :]

In [None]:
z[:, 0]

In [None]:
z[:]
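
Because the data is stored chunk by chunk, reading or writing a region only
involves the chunks that overlap it. The following plain-Python sketch works
out which chunk-grid indices a rectangular selection overlaps, using the chunk
shape from the example above. `chunks_touched` is a hypothetical helper for
illustration only, not a Zarr function, and it assumes non-empty selections
given as `(start, stop)` pairs with an exclusive stop.

```python
from itertools import product


def chunks_touched(selection, chunks):
    """Chunk-grid indices overlapped by a rectangular selection.

    Hypothetical helper, not part of Zarr. `selection` holds one
    (start, stop) pair per dimension (stop exclusive, stop > start);
    `chunks` is the chunk shape.
    """
    per_dim = [
        # First and last chunk index covered along this dimension.
        range(start // c, (stop - 1) // c + 1)
        for (start, stop), c in zip(selection, chunks)
    ]
    return list(product(*per_dim))


chunks = (1_000, 1_000)

# z[0, :] covers row 0 and all 10,000 columns.
row0 = chunks_touched(((0, 1), (0, 10_000)), chunks)
# z[:, 0] covers all 10,000 rows and column 0.
col0 = chunks_touched(((0, 10_000), (0, 1)), chunks)
print(len(row0), len(col0))
```

Under these assumptions, a single row or column of this array spans one chunk
in one dimension and every chunk in the other, while a single element such as
`z[0, 0]` falls within exactly one chunk.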

## Persistent arrays

In the examples above, compressed data for each chunk of the array was stored in
main memory. Zarr arrays can also be stored on a file system, enabling
persistence of data between sessions. For example:

In [None]:
z1 = zarr.open('data/example.zarr', mode='w', shape=(10_000, 10_000), chunks=(1_000, 1_000), dtype='i4')

The array above will store its configuration metadata and all compressed chunk
data in a directory called `data/example.zarr`, relative to the current working
directory. The `zarr.convenience.open` function provides a convenient way to
create a new persistent array or continue working with an existing array. Note
that although the function is called "open", there is no need to close an
array: whenever an array is modified, data are automatically flushed to disk
and files are automatically closed.

Persistent arrays support the same interface for reading and writing data, e.g.:

In [None]:
z1[:] = 42
z1[0, :] = np.arange(10_000)
z1[:, 0] = np.arange(10_000)

Check that the data have been written and can be read again:

In [None]:
z2 = zarr.open('data/example.zarr', mode='r')
np.all(z1[:] == z2[:])

## Resizing and appending

## Compressors

## Filters

Filters transform the data in each chunk before it is passed to the compressor.

## Groups

## Array and group diagnostics

## User attributes

## Advanced indexing

## Indexing with coordinate arrays

## Indexing with a mask array

## Orthogonal indexing

## Indexing fields in structured arrays

## Storage alternatives



### Distributed/cloud storage

### Consolidating metadata

## String arrays

## Object arrays

## Ragged arrays

## Chunk optimizations

### Chunk size and shape



### Chunk memory layout

## Parallel computing and synchronization

## Pickle support

## Datetimes and timedeltas

## Usage tips

### Copying large arrays

### Configuring Blosc