# Reading and writing a local N5 dataset

Create a new N5 dataset on the local filesystem using the file Key-Value Store driver:

In [None]:
import tensorstore as ts
import numpy as np
dataset = ts.open({
    'driver': 'n5',
    'kvstore': {
        'driver': 'file',
        'path': 'tmp/dataset/',
    },
    'metadata': {
        'compression': {
            'type': 'gzip'
        },
        'dataType': 'uint32',
        'dimensions': [1000, 20000],
        'blockSize': [100, 100],
    },
    'create': True,
    'delete_existing': True,
}).result()
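
To get a feel for the storage layout that the metadata above implies, the block grid can be worked out by hand. This is a minimal sketch in plain Python: it only repeats the `dimensions` and `blockSize` values from the cell above and assumes 4 bytes per `uint32` element. The variable names are illustrative and not part of the TensorStore API.

```python
import math

# Values copied from the 'metadata' member of the spec above.
dimensions = [1000, 20000]
block_size = [100, 100]

# Number of blocks along each dimension (a trailing block may be partial).
blocks_per_dim = [math.ceil(d / b) for d, b in zip(dimensions, block_size)]
total_blocks = math.prod(blocks_per_dim)

# Uncompressed size of one full block of uint32 values, in bytes.
bytes_per_block = math.prod(block_size) * 4

print(blocks_per_dim, total_blocks, bytes_per_block)
```

Under these assumptions the dataset is a 10 x 200 grid of blocks, each holding at most 40,000 bytes before gzip compression.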

Asynchronously write to a sub-region:

In [None]:
write_future = dataset[80:82, 99:102].write([[1, 2, 3], [4, 5, 6]])
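
A sub-region does not have to line up with block boundaries. The sketch below, in plain Python, estimates which blocks of a 100 x 100 block grid overlap the region `[80:82, 99:102]` written above. The helper `touched_blocks` is hypothetical and only illustrates the arithmetic; it is not a TensorStore function.

```python
from itertools import product

block_size = [100, 100]
# Half-open [start, stop) interval per dimension, matching dataset[80:82, 99:102].
region = [(80, 82), (99, 102)]


def touched_blocks(region, block_size):
    """Return the grid coordinates of every block that overlaps the region."""
    ranges = [
        range(start // size, (stop - 1) // size + 1)
        for (start, stop), size in zip(region, block_size)
    ]
    return list(product(*ranges))


blocks = touched_blocks(region, block_size)
print(blocks)
```

Column 99 falls in the first block along the second dimension and columns 100 and 101 fall in the next one, so this small write spans two blocks.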

Wait for the write to complete using tensorstore.Future.result:

In [None]:
write_future.result()

In an async function (or with top-level await support), await can also be used, which allows interoperability with asyncio:

In [None]:
await write_future

Subscript assignment can also be used to write synchronously:

In [None]:
dataset[80:82, 99:102] = [[1, 2, 3], [4, 5, 6]]

Read back a larger region that contains the region that was written (positions not written have the fill value of 0):

In [None]:
dataset[0:1000, 0:20000].read().result().shape

# Reading the Janelia FlyEM Hemibrain dataset

This example demonstrates accessing the Janelia FlyEM Hemibrain 1.1 segmentation using the neuroglancer_precomputed driver.

While this dataset is public, the gcs Key-Value Store driver currently requires that you supply Google Cloud credentials.

Open the dataset asynchronously to obtain a tensorstore.Future:

In [None]:
import tensorstore as ts
import numpy as np
dataset_future = ts.open({
    'driver':
        'neuroglancer_precomputed',
    'kvstore':
        'gs://neuroglancer-janelia-flyem-hemibrain/v1.1/segmentation/',
    # Use 100MB in-memory cache.
    'context': {
        'cache_pool': {
            'total_bytes_limit': 100_000_000
        }
    },
    'recheck_cached_data':
        'open',
})
dataset_future

Wait for the open to complete:

In [None]:
dataset = dataset_future.result()
dataset

In an async function, a tensorstore.Future is also compatible with await.

In [None]:
dataset = await dataset_future

In [None]:
dataset.domain

There is only a single channel, so create a 3-d view without the 'channel' dimension:

In [None]:
dataset_3d = dataset[ts.d['channel'][0]]
dataset_3d.domain

Create a view of a 100x100x1 slice from the middle, without performing any I/O:

In [None]:
x = dataset_3d[15000:15100, 15000:15100, 20000]
x

In [None]:
x.domain

Read the slice asynchronously using the tensorstore.TensorStore.read method to obtain a tensorstore.Future:

In [None]:
read_future = x.read()

Wait for the read to complete:

In [None]:
read_future.result()

Conversion to a numpy.ndarray also implicitly performs a synchronous read (which hits the in-memory cache since the same region was just retrieved):

In [None]:
np.array(dataset_3d[15000:15100, 15000:15100, 20000])

In [None]:
store = await ts.open(
    {
        'driver': 'zarr',
        'kvstore': {
            'driver': 'memory'
        }
    },
    create=True,
    shape=[100],
    dtype=ts.uint32,
    fill_value=42)

In [None]:
store.read().result()

In [None]:
store = await ts.open(
    {
        'driver': 'n5',
        'kvstore': {
            'driver': 'memory'
        }
    },
    create=True,
    shape=[100, 200],
    dtype=ts.uint32,
    dimension_units=['5nm', '8nm'])


In [None]:
store + np.array(0)

In [None]:
dataset = ts.open(
    {
        'driver': 'zarr',
        'kvstore': {
            'driver': 'memory'
        }
    },
    dtype=ts.uint32,
    shape=[2, 70, 80],
    create=True).result()
dataset.vindex[:, [5, 6, 8], [2, 5, 6]] = [[1, 2], [3, 4], [5, 6]]
dataset[:, 5:10, 0:6].read().result()

In [None]:
X = store.transaction