# Slicing and Views

ironArray is meant to store large arrays, but in practice you usually want to access only single elements or small parts of them.  Here you will learn how to do that with the help of so-called *views*.  A view is just a reference to the part of a larger array that you are interested in.

Let's start by creating an array in memory:

In [17]:
import numpy as np
import iarray as ia

dtshape = ia.DTShape((1000, 10000), np.float64)
arr = ia.arange(dtshape)
arr

<IArray (1000, 10000) np.float64>

Now suppose that we just want to access the first 2x2 square.  We just use the NumPy notation for doing this:

In [7]:
view = arr[0:2, 0:2]
view

<IArray (2, 2) np.float64>

As you can see, we get another `IArray` instance, but a special one: a *view*.  You can always check whether an array is a view or not with the `is_view()` method:

In [19]:
view.is_view()

True

Of course, our initial array is not a view:

In [20]:
arr.is_view()

False
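If you know NumPy, the idea is close to NumPy's own views.  The following sketch uses plain NumPy only (it does not touch the ironArray API, and says nothing about how ironArray implements views internally) to show what "a reference to part of a larger array" means: basic slicing shares memory with the original array, while an explicit copy does not.

```python
import numpy as np

# A small in-memory array standing in for a "large" one.
base = np.arange(20, dtype=np.float64).reshape(4, 5)

# Basic slicing returns a view: no data is copied.
window = base[0:2, 0:2]

# An explicit copy owns its own buffer.
copied = window.copy()

print(np.shares_memory(window, base))   # the view refers to base's buffer
print(np.shares_memory(copied, base))   # the copy does not
```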

You can visualize the data in the view in the same way as for a regular `IArray`, i.e. using `ia.iarray2numpy`.  However, there is a handy shortcut for doing the same thing:

In [21]:
view.data


array([[0.0000e+00, 1.0000e+00],
       [1.0000e+04, 1.0001e+04]])

In general, it is pretty common to use the `.data` accessor when you want to retrieve the actual data out of an `IArray`:

In [13]:
arr[:4, 0].data

array([    0., 10000., 20000., 30000.])

So, retrieving the interesting data out of your IArray is pretty similar to the NumPy convention.

**Note:** IArray objects support most of the NumPy indexing syntax for getting slices, with some exceptions:

1) IArray does not have support for strides (the `step` part of slices).

2) It does not implement advanced indexing tricks.

At any rate, whenever you want to use this functionality, you can always get a NumPy array out of an IArray (or a view of it) and apply your desired indexing there.  Remember that ironArray is meant for handling very large arrays, so there is no shame in getting the interesting slice as a NumPy object and then doing your work on it.
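As a minimal sketch of that workaround, the example below starts from a NumPy array that stands in for what a contiguous slice such as `arr[0:10, 0:20].data` would give you (the stand-in values are made up for illustration) and then applies the step and advanced indexing in NumPy itself.

```python
import numpy as np

# Stand-in for the NumPy array obtained from a contiguous IArray slice,
# e.g. `arr[0:10, 0:20].data` (illustrative values, not real output).
block = np.arange(10 * 20, dtype=np.float64).reshape(10, 20)

# 1) Strides: take every other row.
every_other_row = block[::2, :]

# 2) Advanced indexing: pick rows 0, 3, 7 and then columns 1, 4.
picked = block[[0, 3, 7]][:, [1, 4]]

# 3) Boolean masks: keep rows whose first element is at least 100.
large_rows = block[block[:, 0] >= 100]

print(every_other_row.shape, picked.shape, large_rows.shape)
```

The key point is to make the contiguous slice as small as possible before converting it, so that only the needed part of the large array is decompressed.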

Finally, indexing also applies to arrays that are stored persistently on disk.  ironArray will use the information about the data you want and will read and decompress only the part that is necessary.  And due to the double partitioning and fast compression codecs, this is in general very fast.  See the `Optimization Tips` section on how to fine-tune the slicing of arrays.

## Optimization Tips

ironArray offers two levels of partitioning, the chunk and the block, and that means more flexibility in optimizing I/O when compared to other solutions offering just a single level (Zarr, HDF5...).  In addition, if you don't specify anything, default values for chunkshape and blockshape are chosen for you (based on the CPU's cache sizes).  However, you can still override the defaults and fine-tune I/O manually.
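To make the two levels more concrete, here is a small sketch of how a global element index could be mapped first to a chunk, then to a block inside that chunk, and finally to an offset inside the block.  It assumes regular, nested grids with the chunkshape being a multiple of the blockshape; the `locate` helper and the example shapes are hypothetical and are not taken from the ironArray internals.

```python
def locate(index, chunkshape, blockshape):
    """Map a global index to (chunk, block-in-chunk, offset-in-block).

    Hypothetical helper: assumes regular nested grids on every axis.
    """
    chunk, block, offset = [], [], []
    for i, c, b in zip(index, chunkshape, blockshape):
        in_chunk = i % c             # position of the element inside its chunk
        chunk.append(i // c)         # which chunk along this axis
        block.append(in_chunk // b)  # which block inside that chunk
        offset.append(in_chunk % b)  # position inside the block
    return tuple(chunk), tuple(block), tuple(offset)


where = locate((10, 5000), chunkshape=(512, 512), blockshape=(64, 128))
print(where)  # ((0, 9), (0, 3), (10, 8))
```

With a single level of partitioning only the first component exists, so the smallest unit that can be fetched is a whole chunk.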

Here comes a small tutorial on how to do this fine-tuning.  Let's start with a large array that is on disk; as we don't want to use too much disk space, let's use smooth data and reduce its floating point mantissa to 10 bits (`fp_mantissa_bits=10`) so that it compresses well.  We are going to exercise slicing in two different directions: getting 2 columns and getting 2 rows.  Here is how this performs:

In [23]:
import numpy as np
import iarray as ia

SHAPE = (10000, 10000)
x = np.linspace(1, 10, SHAPE[0])
y = np.linspace(1, 10, SHAPE[1])
xx, yy = np.meshgrid(x, y, sparse=True)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)

dtshape = ia.DTShape(SHAPE, np.float64)
storage = ia.Storage(filename="large-arr.iarr")
with ia.config(fp_mantissa_bits=10, storage=storage) as cfg:
    arr = ia.numpy2iarray(z, cfg=cfg)
print(f"cratio: {arr.cratio:.1f}")
del x, y, xx, yy

%time arr[:, 10:12].data
%time arr[10:12, :].data

cratio: 9.2
CPU times: user 5.8 ms, sys: 10.4 ms, total: 16.2 ms
Wall time: 6.19 ms
CPU times: user 2.89 ms, sys: 6.77 ms, total: 9.66 ms
Wall time: 3.04 ms


array([[0.4465332 , 0.44580078, 0.44506836, ..., 0.00431442, 0.00447464,
        0.00463104],
       [0.44580078, 0.44506836, 0.44433594, ..., 0.00432968, 0.0044899 ,
        0.0046463 ]])

As you can see, the wall time is smaller than the total CPU time, which is an indication of ironArray's parallel reading capabilities.

For comparison, let's see how Zarr and HDF5 behave when doing the same thing:

In [24]:
import zarr
from numcodecs import Blosc
from functools import reduce

lz4 = Blosc(
    cname="lz4",
    clevel=9,
    shuffle=Blosc.SHUFFLE,
    blocksize=reduce(lambda x, y: x * y, arr.blockshape) * np.dtype(arr.dtype).itemsize,
)

store = zarr.DirectoryStore('large-arr.zarr')
z_arr = zarr.empty(SHAPE, dtype=np.float64, store=store, overwrite=True, compressor=lz4)
arr.copyto(z_arr)
print(f"cratio: {z_arr.nbytes / z_arr.nbytes_stored:.1f}")

%time z_arr[:, 10:12]
%time z_arr[10:12, :]

cratio: 10.0
CPU times: user 15.7 ms, sys: 6.04 ms, total: 21.7 ms
Wall time: 10.1 ms
CPU times: user 7.6 ms, sys: 3.13 ms, total: 10.7 ms
Wall time: 5.05 ms


array([[0.4465332 , 0.44580078, 0.44506836, ..., 0.00431442, 0.00447464,
        0.00463104],
       [0.44580078, 0.44506836, 0.44433594, ..., 0.00432968, 0.0044899 ,
        0.0046463 ]])

In [25]:
import h5py
f = h5py.File("large-arr.hdf5", "w")
# We are using LZF codec which is a fast codec distributed with h5py
h5_arr = f.create_dataset("h5_arr", SHAPE, dtype=np.float64, chunks=True, compression="lzf", shuffle=True)
arr.copyto(h5_arr)

data = None
%time data = h5_arr[:, 10:12]
%time data = h5_arr[10:12, :]
f.close()
data

CPU times: user 20.7 ms, sys: 922 µs, total: 21.7 ms
Wall time: 21.7 ms
CPU times: user 9.49 ms, sys: 200 µs, total: 9.69 ms
Wall time: 9.7 ms


array([[0.4465332 , 0.44580078, 0.44506836, ..., 0.00431442, 0.00447464,
        0.00463104],
       [0.44580078, 0.44506836, 0.44433594, ..., 0.00432968, 0.0044899 ,
        0.0046463 ]])

For further optimization, let's have a look at the chunkshape/blockshape that is assigned automatically to your ironArray array:

In [26]:
arr.chunkshape, arr.blockshape

((512, 512), (64, 128))

Now, let's suppose that you prefer to accelerate access in the **row** direction, and that the row strips will be thin in general (but still, you don't want to penalize accesses in the **column** direction too much).  In this case, you might want to select a blockshape that stretches more in the row direction (but without making the block too small):

In [27]:
storage = ia.Storage((512, 512), (32, 512), filename="large-arr-2.iarr")
with ia.config(fp_mantissa_bits=30, storage=storage) as cfg:
    arr = ia.numpy2iarray(z, cfg=cfg)
print(f"cratio: {arr.cratio:.1f}")
%time arr[:, 10:12].data
%time arr[10:12, :].data

cratio: 2.2
CPU times: user 17.3 ms, sys: 20 ms, total: 37.2 ms
Wall time: 7.64 ms
CPU times: user 2.11 ms, sys: 6.3 ms, total: 8.41 ms
Wall time: 3.47 ms


array([[0.44677258, 0.44598785, 0.44520237, ..., 0.00431732, 0.00447622,
        0.00463364],
       [0.44598079, 0.44519602, 0.44441049, ..., 0.00433343, 0.00449218,
        0.00464944]])

OK, so just changing the blockshape to adapt it to our read pattern may or *may not* improve the slicing times.  A better bet is adapting both the blockshape *and* the chunkshape:

In [28]:
storage = ia.Storage((128, 2048), (8, 2048), filename="large-arr-2.iarr")
with ia.config(fp_mantissa_bits=30, storage=storage) as cfg:
    arr = ia.numpy2iarray(z, cfg=cfg)
print(f"cratio: {arr.cratio:.1f}")
%time arr[:, 10:12].data
%time arr[10:12, :].data

cratio: 2.1
CPU times: user 57.3 ms, sys: 78.1 ms, total: 135 ms
Wall time: 22.3 ms
CPU times: user 1.18 ms, sys: 2.6 ms, total: 3.77 ms
Wall time: 2 ms


array([[0.44677258, 0.44598785, 0.44520237, ..., 0.00431732, 0.00447622,
        0.00463364],
       [0.44598079, 0.44519602, 0.44441049, ..., 0.00433343, 0.00449218,
        0.00464944]])

In this case, we see that we have been able to get much better speed in the row axis (at the cost of being slower in the column axis).  Finally, let's increase the chunkshape to the maximum in the row axis:

In [29]:
storage = ia.Storage((128, 10000), (2, 10000), filename="large-arr-3.iarr")
with ia.config(fp_mantissa_bits=30, storage=storage) as cfg:
    arr = ia.numpy2iarray(z, cfg=cfg)
print(f"cratio: {arr.cratio:.1f}")
%time arr[:, 10:12].data
%time arr[10:12, :].data

cratio: 2.0
CPU times: user 486 ms, sys: 454 ms, total: 940 ms
Wall time: 127 ms
CPU times: user 787 µs, sys: 649 µs, total: 1.44 ms
Wall time: 1.39 ms


array([[0.44677258, 0.44598785, 0.44520237, ..., 0.00431732, 0.00447622,
        0.00463364],
       [0.44598079, 0.44519602, 0.44441049, ..., 0.00433343, 0.00449218,
        0.00464944]])

In this case we have been able to reduce the times in the row axis to a bare minimum, but the times for column-wise access went up far too much.  The best partition configuration largely depends on your needs.  In this case it is pretty clear that we prefer the previous configuration.
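A back-of-the-envelope model helps explain these timings.  The sketch below assumes, as a simplification that is not confirmed by ironArray itself, that every block intersected by a slice has to be decompressed in full and that blocks form a regular grid over the array; it ignores chunk-level I/O, padding and parallelism.  The helper names are hypothetical.

```python
from math import prod

ITEMSIZE = 8  # bytes per float64 element


def blocks_touched(ranges, blockshape):
    """Count blocks intersected by half-open (start, stop) ranges, one per axis."""
    return prod((stop - 1) // b - start // b + 1
                for (start, stop), b in zip(ranges, blockshape))


def bytes_to_decompress(ranges, blockshape):
    """Estimate decompressed bytes, assuming whole blocks are always decoded."""
    return blocks_touched(ranges, blockshape) * prod(blockshape) * ITEMSIZE


N = 10000
two_columns = ((0, N), (10, 12))  # arr[:, 10:12]
two_rows = ((10, 12), (0, N))     # arr[10:12, :]

# The default blockshape followed by the three tried in this section.
for blockshape in [(64, 128), (32, 512), (8, 2048), (2, 10000)]:
    cols = bytes_to_decompress(two_columns, blockshape)
    rows = bytes_to_decompress(two_rows, blockshape)
    print(f"{blockshape}: columns {cols / 2**20:7.1f} MiB, rows {rows / 2**20:6.2f} MiB")
```

Under these assumptions, the `(2, 10000)` blockshape needs a single block to serve the two rows, but every block of the array to serve the two columns, which is consistent with the trend in the measured times.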

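It is worth noting that the compression ratio has dropped considerably in these last runs (from 9.2 to about 2).  The new partitionings contribute to that, but note that these runs also use `fp_mantissa_bits=30` instead of 10, and keeping more mantissa bits makes the data harder to compress.  In general, when you change the partition shapes, you should expect changes in compression ratios, sometimes significant ones.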
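The first experiment in this section used `fp_mantissa_bits=10` and the later ones used 30.  As a rough illustration of how the number of kept mantissa bits alone can move the compression ratio, the sketch below emulates the truncation with a bit mask and compresses with `zlib` from the standard library.  Both choices are assumptions made for the example: this is not ironArray's implementation of `fp_mantissa_bits`, nor the codec it uses, so the absolute numbers will differ from the ones above.

```python
import zlib

import numpy as np


def keep_mantissa_bits(a, bits):
    """Zero all but the `bits` most significant mantissa bits of float64 values."""
    drop = 52 - bits  # a float64 has a 52-bit mantissa
    mask = np.uint64((0xFFFFFFFFFFFFFFFF >> drop) << drop)
    return (a.view(np.uint64) & mask).view(np.float64)


def cratio(a):
    """Compression ratio of the raw bytes of `a` under zlib."""
    raw = a.tobytes()
    return len(raw) / len(zlib.compress(raw))


# Same formula as in the tutorial, on a smaller grid to keep it quick.
x = np.linspace(1, 10, 500)
xx, yy = np.meshgrid(x, x, sparse=True)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)

ratios = {bits: cratio(keep_mantissa_bits(z, bits)) for bits in (10, 30, 52)}
for bits, ratio in ratios.items():
    print(f"{bits:2d} mantissa bits -> cratio {ratio:.1f}")
```

When comparing partition shapes, it is therefore safer to keep the mantissa setting fixed, so that differences in compression ratio can be attributed to the partitioning alone.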

Just for comparison purposes, let's see how a similar configuration affects Zarr and HDF5:

In [30]:
store = zarr.DirectoryStore('large-arr-2.zarr')
z_arr = zarr.empty(SHAPE, dtype=np.float64, store=store, overwrite=True, chunks=(128, 2048), compressor=lz4)
arr.copyto(z_arr)
print(f"cratio: {z_arr.nbytes / z_arr.nbytes_stored:.1f}")

%time z_arr[:, 10:12]
%time z_arr[10:12, :]

cratio: 2.0
CPU times: user 87.7 ms, sys: 35.9 ms, total: 124 ms
Wall time: 57.3 ms
CPU times: user 9.1 ms, sys: 2.75 ms, total: 11.8 ms
Wall time: 4.97 ms


array([[0.44677258, 0.44598785, 0.44520237, ..., 0.00431732, 0.00447622,
        0.00463364],
       [0.44598079, 0.44519602, 0.44441049, ..., 0.00433343, 0.00449218,
        0.00464944]])

In [31]:
f = h5py.File("large-arr-2.hdf5", "w")
h5_arr = f.create_dataset("h5_arr", SHAPE, dtype=np.float64, chunks=(128, 2048), compression="lzf", shuffle=True)
arr.copyto(h5_arr)

data = None
%time data = h5_arr[:, 10:12]
%time data = h5_arr[10:12, :]
f.close()
data

CPU times: user 215 ms, sys: 16.4 ms, total: 231 ms
Wall time: 231 ms
CPU times: user 13.6 ms, sys: 1.37 ms, total: 15 ms
Wall time: 15 ms


array([[0.44677258, 0.44598785, 0.44520237, ..., 0.00431732, 0.00447622,
        0.00463364],
       [0.44598079, 0.44519602, 0.44441049, ..., 0.00433343, 0.00449218,
        0.00464944]])

So, we see that ironArray, by providing two levels of partitioning, has more fine-grained flexibility in adapting to different I/O patterns.  Also, the two-level partitioning allows for reading less data from disk (or memory) than similar solutions with just one level of partitioning.


