# Introduction to Dense TileDB Arrays

## About this tutorial

This tutorial is a simple introduction to creating, writing, and reading dense TileDB arrays.

## Resources

* [TileDB Embedded Docs: API Usage](https://docs.tiledb.com/main/solutions/tiledb-embedded/api-usage)
* [TileDB-Py API Docs](https://tiledb-inc-tiledb.readthedocs-hosted.com/projects/tiledb-py/en/stable/python-api.html#)
* [TileDB-Py Examples](https://github.com/TileDB-Inc/TileDB-Py/tree/dev/examples)

In [1]:
import tiledb
import numpy as np

dense_array_uri = "arrays/dense"

In [2]:
import shutil

# clean up any previous runs
try:
    shutil.rmtree(dense_array_uri)
except FileNotFoundError:
    # first run: there is nothing to remove
    pass

## Dense arrays

A dense TileDB array stores data in every cell of the array. Suppose we have some integer data and some floating-point data, each defined on a 4-by-4 matrix.

In [3]:
data_a = np.array(
    (
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]
    )
)
data_b = np.random.random_sample(16).reshape(4, 4)
data_b

array([[0.52037498, 0.0566791 , 0.85427941, 0.97244928],
       [0.01249442, 0.0593756 , 0.39000132, 0.21699454],
       [0.92515826, 0.44584226, 0.50435509, 0.84017679],
       [0.18086935, 0.45255558, 0.74427037, 0.89040362]])
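Because `np.random.random_sample` is unseeded, the values of `data_b` shown above will differ from run to run. If reproducible values are wanted, one option is a seeded generator. This is a minimal sketch; the seed `42` and the name `data_b_seeded` are arbitrary choices that do not appear in the original notebook.

```python
import numpy as np

# Assumption: any fixed seed works; 42 is an arbitrary illustrative choice.
rng = np.random.default_rng(seed=42)
data_b_seeded = rng.random((4, 4))  # float64 values in [0, 1), shape (4, 4)

# A second generator with the same seed reproduces the same matrix.
rng_again = np.random.default_rng(seed=42)
same = np.array_equal(data_b_seeded, rng_again.random((4, 4)))

print(data_b_seeded.shape, data_b_seeded.dtype, same)
```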

To write this data to a dense array, first create the [Schema](https://tiledb-inc-tiledb-py.readthedocs-hosted.com/en/stable/python-api.html#array-schema). It combines a [Domain](https://tiledb-inc-tiledb-py.readthedocs-hosted.com/en/stable/python-api.html#domain), which contains the [Dimensions](https://tiledb-inc-tiledb-py.readthedocs-hosted.com/en/stable/python-api.html#dimension), with the [Attributes](https://tiledb-inc-tiledb-py.readthedocs-hosted.com/en/stable/python-api.html#tiledb.Attr); together they define the shape, size, and data types of the dense array:

In [4]:
rows = tiledb.Dim(name="rows", domain=(1, 4), tile=4, dtype=np.int32)
cols = tiledb.Dim(name="cols", domain=(1, 4), tile=4, dtype=np.int32)
dom = tiledb.Domain(rows, cols)
attr_a = tiledb.Attr(name="a", dtype=np.int32, filters=tiledb.FilterList([tiledb.ZstdFilter(7)]))
attr_b = tiledb.Attr(name="b", dtype=np.float64, filters=tiledb.FilterList([tiledb.ZstdFilter(7)]), fill=np.nan)
schema = tiledb.ArraySchema(domain=dom, attrs=[attr_a, attr_b])
print(schema)

ArraySchema(
  domain=Domain(*[
    Dim(name='rows', domain=(1, 4), tile='4', dtype='int32'),
    Dim(name='cols', domain=(1, 4), tile='4', dtype='int32'),
  ]),
  attrs=[
    Attr(name='a', dtype='int32', var=False, nullable=False, filters=FilterList([ZstdFilter(level=7), ])),
    Attr(name='b', dtype='float64', var=False, nullable=False, filters=FilterList([ZstdFilter(level=7), ])),
  ],
  cell_order='row-major',
  tile_order='row-major',
  capacity=10000,
  sparse=False,
  coords_filters=FilterList([ZstdFilter(level=-1)]),
)



Now, create the (empty) array on disk:

In [5]:
tiledb.Array.create(dense_array_uri, schema)

Let's see what this looks like on disk:

In [6]:
%ls $dense_array_uri

__array_schema.tdb  __lock.tdb  __meta/


What happens when we try to read the values before anything has been written? We [open](https://tiledb-inc-tiledb.readthedocs-hosted.com/projects/tiledb-py/en/stable/python-api.html?highlight=uri#tiledb.open) the array and print the [non-empty domain](https://tiledb-inc-tiledb.readthedocs-hosted.com/projects/tiledb-py/en/stable/python-api.html#tiledb.libtiledb.Array.nonempty_domain) and the data in the array.

In [7]:
with tiledb.open(dense_array_uri, mode="r") as array:
    print(f"Non-empty domain: {array.nonempty_domain()}")
    data = array[:, :]
for name, values in data.items():
    print(f"{name}: \n{values}")

Non-empty domain: None
a: 
[[-2147483648 -2147483648 -2147483648 -2147483648]
 [-2147483648 -2147483648 -2147483648 -2147483648]
 [-2147483648 -2147483648 -2147483648 -2147483648]
 [-2147483648 -2147483648 -2147483648 -2147483648]]
b: 
[[nan nan nan nan]
 [nan nan nan nan]
 [nan nan nan nan]
 [nan nan nan nan]]
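The values printed for an unwritten array are fill values: `b` shows `nan` because the schema sets `fill=np.nan`, and `a` shows what appears to be the smallest `int32`. The NumPy-only sketch below checks that reading and shows one possible way to mask unwritten cells after a read. The arrays here are stand-ins built by hand to mimic the output above, not data read from TileDB, and the names are illustrative.

```python
import numpy as np

# The value printed for attribute "a" matches the minimum int32.
INT32_FILL = np.iinfo(np.int32).min

# Stand-ins for the result of reading the empty array (assumption: same
# shapes and fill values as in the output above).
unwritten_a = np.full((4, 4), INT32_FILL, dtype=np.int32)
unwritten_b = np.full((4, 4), np.nan, dtype=np.float64)

# Mask cells that still hold the fill value so they are ignored downstream.
masked_a = np.ma.masked_equal(unwritten_a, INT32_FILL)
masked_b = np.ma.masked_invalid(unwritten_b)

# count() reports the number of unmasked (that is, written) cells.
print(INT32_FILL, masked_a.count(), masked_b.count())
```

Note that this kind of masking cannot tell an unwritten cell apart from a cell that was deliberately written with the fill value.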


Now we open and write the data to the array:

In [8]:
with tiledb.open(dense_array_uri, mode="w") as array:
    array[:] = {"a": data_a, "b": data_b}


Let's take another look at the array on disk:

In [9]:
%ls $dense_array_uri

__1632770590808_1632770590808_f510a0caad8f4ff1b4f5f0c5ccfe6312_9/    __lock.tdb
__1632770590808_1632770590808_f510a0caad8f4ff1b4f5f0c5ccfe6312_9.ok  __meta/
__array_schema.tdb


Now you can try reading the data again:

In [10]:
with tiledb.open(dense_array_uri, mode="r") as array:
    print(f"Non-empty domain: {array.nonempty_domain()}")
    data = array[:, :]
for name, values in data.items():
    print(f"{name}: \n{values}")

Non-empty domain: ((array(1, dtype=int32), array(4, dtype=int32)), (array(1, dtype=int32), array(4, dtype=int32)))
a: 
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]
b: 
[[0.52037498 0.0566791  0.85427941 0.97244928]
 [0.01249442 0.0593756  0.39000132 0.21699454]
 [0.92515826 0.44584226 0.50435509 0.84017679]
 [0.18086935 0.45255558 0.74427037 0.89040362]]
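The non-empty domain is printed as a tuple of inclusive `(low, high)` pairs, one per dimension, with each bound wrapped in a zero-dimensional NumPy array. As a small sketch, the bounds can be converted to plain integers and to the shape of the populated region. The tuple below is copied by hand from the output above rather than read from the array, and the variable names are illustrative.

```python
import numpy as np

# Hand-copied from the printed output above (not read from TileDB here).
non_empty = (
    (np.array(1, dtype=np.int32), np.array(4, dtype=np.int32)),
    (np.array(1, dtype=np.int32), np.array(4, dtype=np.int32)),
)

# Convert each bound to a plain Python int.
bounds = [(int(lo), int(hi)) for lo, hi in non_empty]

# Both bounds are inclusive, so the extent is high - low + 1.
shape = tuple(hi - lo + 1 for lo, hi in bounds)

print(bounds, shape)
```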


We can also read subsets of an array. The [multi_index](https://tiledb-inc-tiledb.readthedocs-hosted.com/projects/tiledb-py/en/stable/python-api.html#tiledb.libtiledb.Array.multi_index) indexer allows querying with values, slices, or lists of values. Unlike NumPy slices, `multi_index` slices include both endpoints. For instance, we can read the values at rows 1 through 3 for columns 1 and 3 only, and print the results:

In [11]:
with tiledb.open(dense_array_uri, mode="r") as array:
    data = array.multi_index[1:3, [1,3]] # returns a dictionary of values
    for name, values in data.items():
        print(f"{name}: \n{values}")

a: 
[[ 1  3]
 [ 5  7]
 [ 9 11]]
b: 
[[0.52037498 0.85427941]
 [0.01249442 0.39000132]
 [0.92515826 0.50435509]]


Read the values of the array just at rows 1 and 3, and column 1:

We can also write to subsets of an array:

In [12]:
with tiledb.open(dense_array_uri, mode="w") as array:
    array[1, 1] = {"a": np.array([-1]), "b": np.array([0.0])}

Or write to a subset of a single attribute in the array:

In [13]:
with tiledb.open(dense_array_uri, mode="w", attr="b") as array:
    array[4, 4] = -1.0

Write new value for attribute `a` along row=1:

In [14]:
# Write new values for row 4 of the array

Simple key-value pairs can be added as metadata:

In [15]:
with tiledb.open(dense_array_uri, mode="w") as array:
    array.meta["description"] = "a simple intro to dense arrays"
    array.meta["some value"] = 1.0

In [16]:
with tiledb.open(dense_array_uri) as array:
    for key, value in array.meta.items():
        print(f"{key}: {(type(value))} {value}")

description: <class 'str'> a simple intro to dense arrays
some value: <class 'float'> 1.0


Try reading and writing some additional data to TileDB: