# NDArray: multidimensional SChunk

NDArray functions let you perform operations on NDArray arrays, such as setting, copying, or slicing them.
In this section, we will see how to create and manipulate an NDArray array in a simple way.


In [1]:
import numpy as np

import blosc2

## Creating an array
First, we create an array filled with zeros, which is the default value for uninitialized portions of the array.


In [2]:
array = blosc2.zeros((10000, 10000), dtype=np.int32)
print(array.info)

type    : NDArray
shape   : (10000, 10000)
chunks  : (80, 10000)
blocks  : (2, 10000)
dtype   : int32
cratio  : 100000.00
cparams : CParams(codec=<Codec.ZSTD: 5>, codec_meta=0, clevel=1, use_dict=False, typesize=4,
        : nthreads=11, blocksize=80000, splitmode=<SplitMode.ALWAYS_SPLIT: 1>,
        : filters=[<Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>,
        : <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.SHUFFLE: 1>], filters_meta=[0, 0,
        : 0, 0, 0, 0], tuner=<Tuner.STUNE: 0>)
dparams : DParams(nthreads=11)



Note that all the compression and decompression parameters, as well as the chunk and block shapes, are set to their defaults.
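To make the relationship between `shape`, `chunks`, and `blocks` concrete, here is a small sketch that redoes the arithmetic behind the `info` output above in plain Python. The variable names are illustrative only and are not part of the blosc2 API.

```python
import math

# Values copied from the `info` output above (int32 means 4 bytes per item).
shape = (10000, 10000)
chunks = (80, 10000)
blocks = (2, 10000)
itemsize = 4

# Number of chunks needed along each dimension, and blocks per chunk.
chunks_per_dim = [math.ceil(s / c) for s, c in zip(shape, chunks)]
blocks_per_chunk_dim = [math.ceil(c / b) for c, b in zip(chunks, blocks)]

n_chunks = math.prod(chunks_per_dim)
blocks_per_chunk = math.prod(blocks_per_chunk_dim)

# Uncompressed sizes in bytes.
block_nbytes = math.prod(blocks) * itemsize
chunk_nbytes = math.prod(chunks) * itemsize

print("chunks in array  :", n_chunks)
print("blocks per chunk :", blocks_per_chunk)
print("block size (B)   :", block_nbytes)
print("chunk size (B)   :", chunk_nbytes)
```

The block size computed here (80000 bytes) matches the `blocksize=80000` field reported in `cparams`.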

## Reading and writing data
We can read and modify NDArray arrays using NumPy-style indexing. Values are assigned from NumPy arrays, and reads return NumPy arrays.

In [3]:
array[0, :] = np.arange(10000, dtype=array.dtype)
array[:, 0] = np.arange(10000, dtype=array.dtype)

In [4]:
array[0, 0]

array(0, dtype=int32)

In [5]:
array[0, :]

array([   0,    1,    2, ..., 9997, 9998, 9999], dtype=int32)

In [6]:
array[:, 0]

array([   0,    1,    2, ..., 9997, 9998, 9999], dtype=int32)

## Persistent data
As with the SChunk, when we create an NDArray array we can specify where it will be stored, along with all the compression/decompression parameters that an SChunk accepts.
To store an array on disk, we only have to pass a `urlpath` for the new array.


In [7]:
array = blosc2.full(
    (1000, 1000),
    fill_value=b"pepe",
    chunks=(100, 100),
    blocks=(50, 50),
    urlpath="ndarray_tutorial.b2nd",
    mode="w",
)
print(array.info)

type    : NDArray
shape   : (1000, 1000)
chunks  : (100, 100)
blocks  : (50, 50)
dtype   : |S4
cratio  : 1111.11
cparams : CParams(codec=<Codec.ZSTD: 5>, codec_meta=0, clevel=1, use_dict=False, typesize=4,
        : nthreads=11, blocksize=10000, splitmode=<SplitMode.ALWAYS_SPLIT: 1>,
        : filters=[<Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>,
        : <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.SHUFFLE: 1>], filters_meta=[0, 0,
        : 0, 0, 0, 0], tuner=<Tuner.STUNE: 0>)
dparams : DParams(nthreads=11)



This time we also set the chunk and block shapes. You can now open the array with modes `w`, `a` or `r`.

In [8]:
array2 = blosc2.open("ndarray_tutorial.b2nd")
print(array2.info)

type    : NDArray
shape   : (1000, 1000)
chunks  : (100, 100)
blocks  : (50, 50)
dtype   : |S4
cratio  : 1111.11
cparams : CParams(codec=<Codec.ZSTD: 5>, codec_meta=0, clevel=1, use_dict=False, typesize=4,
        : nthreads=1, blocksize=10000, splitmode=<SplitMode.ALWAYS_SPLIT: 1>,
        : filters=[<Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>,
        : <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.SHUFFLE: 1>], filters_meta=[0, 0,
        : 0, 0, 0, 0], tuner=<Tuner.STUNE: 0>)
dparams : DParams(nthreads=1)



## Compression params
When we make a copy of an NDArray array, we can easily change its compression parameters.

In [9]:
b = np.arange(1000000).tobytes()
array1 = blosc2.frombuffer(b, shape=(1000, 1000), dtype=np.int64, chunks=(500, 10), blocks=(50, 10))
print(array1.info)

type    : NDArray
shape   : (1000, 1000)
chunks  : (500, 10)
blocks  : (50, 10)
dtype   : int64
cratio  : 7.45
cparams : CParams(codec=<Codec.ZSTD: 5>, codec_meta=0, clevel=1, use_dict=False, typesize=8,
        : nthreads=11, blocksize=4000, splitmode=<SplitMode.ALWAYS_SPLIT: 1>,
        : filters=[<Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>,
        : <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.SHUFFLE: 1>], filters_meta=[0, 0,
        : 0, 0, 0, 0], tuner=<Tuner.STUNE: 0>)
dparams : DParams(nthreads=11)



In [10]:
cparams = blosc2.CParams(
    codec=blosc2.Codec.ZSTD,
    clevel=9,
    filters=[blosc2.Filter.BITSHUFFLE],
    filters_meta=[0],
)

array2 = array1.copy(chunks=(500, 10), blocks=(50, 10), cparams=cparams)
print(array2.info)

type    : NDArray
shape   : (1000, 1000)
chunks  : (500, 10)
blocks  : (50, 10)
dtype   : int64
cratio  : 7.49
cparams : CParams(codec=<Codec.ZSTD: 5>, codec_meta=0, clevel=1, use_dict=False, typesize=8,
        : nthreads=11, blocksize=4000, splitmode=<SplitMode.ALWAYS_SPLIT: 1>,
        : filters=[<Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>,
        : <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.SHUFFLE: 1>], filters_meta=[0, 0,
        : 0, 0, 0, 0], tuner=<Tuner.STUNE: 0>)
dparams : DParams(nthreads=11)



## Metalayers and variable length metalayers

We have seen that you can pass to the NDArray constructor any compression or decompression parameters that you may pass to an SChunk. You can also pass a metalayer dict. Metalayers are small pieces of metadata that describe the properties of the data stored in a container. As explained in [the SChunk basics](00.schunk-basics.html), there are two kinds. The first kind (`meta`) cannot be deleted, must be added at construction time, and can only be updated with values that have the same size in bytes as the old value. These metalayers are easy to access and edit:

In [11]:
meta = {"dtype": "i8", "coords": [5.14, 23.0]}
array = blosc2.zeros((1000, 1000), dtype=np.int16, chunks=(100, 100), blocks=(50, 50), meta=meta)

You can work with them as if you were working with a dictionary. To access this dictionary, use the `schunk` attribute of the NDArray.

In [12]:
array.schunk.meta

<blosc2.schunk.Meta at 0x10873b140>

In [13]:
array.schunk.meta.keys()

['b2nd', 'dtype', 'coords']

As you can see, Blosc2 internally uses the `b2nd` metalayer to store the shape, ndim, dtype, etc., and retrieves this data when needed.

In [14]:
array.schunk.meta["b2nd"]

[0, 2, [1000, 1000], [100, 100], [50, 50], 0, '<i2']

In [15]:
array.schunk.meta["coords"]

[5.14, 23.0]

To add a metalayer after creation, or to store a variable-length metalayer, use the `vlmeta` accessor of the SChunk. Like `meta`, it works similarly to a dictionary.

In [16]:
print(array.schunk.vlmeta.getall())
array.schunk.vlmeta["info1"] = "This is an example"
array.schunk.vlmeta["info2"] = "of user meta handling"
array.schunk.vlmeta.getall()

{}


{b'info1': 'This is an example', b'info2': 'of user meta handling'}

Unlike `meta` entries, you can update them with a value larger than the original one:

In [17]:
array.schunk.vlmeta["info1"] = "This is a larger example"
array.schunk.vlmeta.getall()

{b'info1': 'This is a larger example', b'info2': 'of user meta handling'}

## Creating an NDArray from a NumPy array

Let's create an NDArray from a NumPy array using the `asarray` constructor:

In [18]:
shape = (100, 100, 100)
dtype = np.float64
nparray = np.linspace(0, 100, np.prod(shape), dtype=dtype).reshape(shape)
b2ndarray = blosc2.asarray(nparray)
print(b2ndarray.info)

type    : NDArray
shape   : (100, 100, 100)
chunks  : (50, 100, 100)
blocks  : (1, 100, 100)
dtype   : float64
cratio  : 13.73
cparams : CParams(codec=<Codec.ZSTD: 5>, codec_meta=0, clevel=1, use_dict=False, typesize=8,
        : nthreads=11, blocksize=80000, splitmode=<SplitMode.ALWAYS_SPLIT: 1>,
        : filters=[<Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>,
        : <Filter.NOFILTER: 0>, <Filter.NOFILTER: 0>, <Filter.SHUFFLE: 1>], filters_meta=[0, 0,
        : 0, 0, 0, 0], tuner=<Tuner.STUNE: 0>)
dparams : DParams(nthreads=11)



## Building an NDArray from a buffer

You can also create an NDArray filled with data from a buffer:

In [19]:
rng = np.random.default_rng()
buffer = bytes(rng.normal(size=np.prod(shape)) * 8)
b2ndarray = blosc2.frombuffer(buffer, shape, dtype=dtype)
print("Compression ratio:", b2ndarray.schunk.cratio)
b2ndarray[:5, :5, :5]

Compression ratio: 1.1229628926738042


array([[[-3.96897423e+00,  6.78822852e-01, -9.73353461e+00,
         -5.25066318e+00, -8.10953899e+00],
        [-1.16942534e+00,  7.90569662e+00, -1.86902681e-02,
          1.10567731e+01, -2.28302854e+01],
        [-4.34054919e+00, -3.06629355e+00, -5.59964532e+00,
          3.65939356e+00, -7.02211724e+00],
        [-8.05212557e+00,  1.20204426e+01, -6.16981764e+00,
          4.12928644e+00, -1.08064571e+01],
        [ 6.55334654e+00,  6.79592013e+00,  7.98966925e+00,
          2.06804711e+00, -1.45707535e+01]],

       [[-3.97094655e+00,  5.26575483e+00, -2.78133428e+00,
          1.92634240e+01,  2.58091087e+00],
        [-1.99099042e+00,  1.68035645e+00,  1.97563075e+00,
         -1.43755674e+00, -7.26761318e+00],
        [-1.40747598e+01,  8.71854370e-01,  1.22266717e+00,
          1.83992836e+01,  1.03014943e+01],
        [-7.81112771e+00,  1.03845046e+00, -1.97135146e-01,
         -3.63030374e+00,  3.68253771e-01],
        [ 2.53781672e+00,  3.51551328e+00,  5.84144983e+00,
  

That's all for now.  There are more examples in the [examples directory of the git repository](https://github.com/Blosc/python-blosc2/tree/main/examples/) for you to explore.  Enjoy!