In [2]:
import numpy as np

from tables import *

PyTables supports the handling of nested structures (or nested datatypes, as you prefer) in table objects, allowing you to define arbitrarily nested columns.

An example will clarify what this means. Suppose that you want to group pieces of information that are more closely related to each other than to the other pieces in your table. You may want to tie them together so that your table is better structured and so that you can retrieve and work with these groups more easily.

You can create such nested substructures simply by nesting subclasses of IsDescription. Let’s see one example (okay, it’s a bit silly, but it will serve for demonstration purposes):

In [3]:
class Info(IsDescription):
    """A sub-structure of Test"""
    _v_pos = 2   # The position in the whole structure
    name = StringCol(10)
    value = Float64Col(pos=0)


colors = Enum(['red', 'green', 'blue'])


class NestedDescr(IsDescription):
    """A description that has several nested columns"""
    color = EnumCol(colors, 'red', base='uint32')
    info1 = Info()


    class info2(IsDescription):
        _v_pos = 1
        name = StringCol(10)
        value = Float64Col(pos=0)


        class info3(IsDescription):
            x = Float64Col(dflt=1)
            y = UInt8Col(dflt=1)

The root class is NestedDescr, and both info1 and info2 are substructures of it. Note that info1 is actually an instance of the class Info, which was defined prior to NestedDescr. There is also a third substructure, info3, which hangs from the substructure info2. You can set the position of a substructure within its containing object by declaring the special class attribute _v_pos.
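To make the effect of nesting more concrete, here is a minimal sketch that builds a much smaller description and lists the column names PyTables derives from it. The classes `Point` and `Sample`, the table name, and the temporary file path are hypothetical and exist only for this illustration; the sketch assumes that `Table.colnames` and `Table.colpathnames` behave as described in the PyTables reference.

```python
import os
import tempfile

import tables as tb


class Point(tb.IsDescription):
    """Hypothetical sub-structure, used only for this sketch."""
    _v_pos = 1                 # position inside the containing structure
    x = tb.Float64Col(pos=0)
    y = tb.Float64Col(pos=1)


class Sample(tb.IsDescription):
    """Hypothetical root description with one nested column."""
    label = tb.StringCol(8, pos=0)
    point = Point()            # nested column built from an instance


# Write to a throwaway file so the sketch leaves nothing behind in the cwd.
path = os.path.join(tempfile.mkdtemp(), "nested-sketch.h5")
with tb.open_file(path, "w") as h5:
    tbl = h5.create_table(h5.root, "sample", Sample)
    top_level = list(tbl.colnames)      # names at the top level only
    all_paths = list(tbl.colpathnames)  # slash-separated paths of leaf columns

print(top_level)
print(all_paths)
```

The slash-separated paths are the same keys used later when filling rows of a nested table.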

# Nested table creation

In [7]:
# Now that we have defined our nested structure, let’s create a nested table,
# that is, a table whose columns contain other subcolumns.
# Note: the tmp/ directory must already exist; open_file() does not create it.
fileh = open_file("tmp/nested-tut.h5", "w")
table = fileh.create_table(fileh.root, 'table', NestedDescr)

In [9]:
table

/table (Table(0,)) ''
  description := {
  "info2": {
    "value": Float64Col(shape=(), dflt=0.0, pos=0),
    "info3": {
      "x": Float64Col(shape=(), dflt=1.0, pos=0),
      "y": UInt8Col(shape=(), dflt=1, pos=1)},
    "name": StringCol(itemsize=10, shape=(), dflt=b'', pos=2)},
  "info1": {
    "value": Float64Col(shape=(), dflt=0.0, pos=0),
    "name": StringCol(itemsize=10, shape=(), dflt=b'', pos=1)},
  "color": EnumCol(enum=Enum({'red': 0, 'green': 1, 'blue': 2}), dflt='red', base=UInt32Atom(shape=(), dflt=0), shape=(), pos=2)}
  byteorder := 'little'
  chunkshape := (1337,)

In [13]:
row = table.row
for i in range(10):
    row['color'] = colors[['red', 'green', 'blue'][i%3]]
    row['info1/name'] = "name1-%s" % i
    row['info2/name'] = "name2-%s" % i
    row['info2/info3/y'] = i
    # All the rest will be filled with defaults
    row.append()

table.flush()
table.nrows

10
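The same slash-separated keys work when reading rows back, and nested columns can also be reached through the `cols` accessor of the table. The following is a minimal sketch under stated assumptions: the `Inner` and `Outer` descriptions, the table name, and the temporary file are hypothetical stand-ins for the tutorial's table.

```python
import os
import tempfile

import tables as tb


class Inner(tb.IsDescription):
    """Hypothetical nested structure for this sketch."""
    _v_pos = 1
    y = tb.UInt8Col(dflt=1, pos=0)


class Outer(tb.IsDescription):
    """Hypothetical root description."""
    name = tb.StringCol(10, pos=0)
    inner = Inner()


path = os.path.join(tempfile.mkdtemp(), "nested-rows.h5")
with tb.open_file(path, "w") as h5:
    tbl = h5.create_table(h5.root, "t", Outer)

    # Fill five rows, addressing the nested field by its path.
    row = tbl.row
    for i in range(5):
        row["name"] = "n-%d" % i
        row["inner/y"] = i
        row.append()
    tbl.flush()

    nrows = tbl.nrows
    # Read a whole nested column through natural naming on table.cols.
    ys = tbl.cols.inner.y[:]
    # Iterate over rows and filter on a nested field.
    names = [r["name"] for r in tbl if r["inner/y"] >= 3]

print(nrows, ys, names)
```

Appending five rows to a freshly created table yields five rows; running the filling loop again on the same open table would keep adding rows.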

In [21]:
table.read(0, 3)['info1']['name']

array([b'name1-0', b'name1-1', b'name1-2'], dtype='|S10')
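The chained indexing above works because the result of `table.read()` is a NumPy structured array whose dtype mirrors the nesting of the table. The sketch below reproduces that behaviour with NumPy alone; the dtype is built by hand as an assumption and covers only a simplified subset of the tutorial's fields.

```python
import numpy as np

# Hand-built nested dtype standing in for (part of) the table's record type.
info_dtype = np.dtype([("value", "f8"), ("name", "S10")])
rec_dtype = np.dtype([("info1", info_dtype), ("color", "u4")])

recs = np.zeros(3, dtype=rec_dtype)
for i in range(3):
    # Field indexing returns views, so this writes into recs itself.
    recs["info1"]["name"][i] = ("name1-%d" % i).encode("ascii")

# First select the nested structure, then the field inside it.
names = recs["info1"]["name"]
print(names)
```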

In [22]:
table.description

{
  "info2": {
    "value": Float64Col(shape=(), dflt=0.0, pos=0),
    "info3": {
      "x": Float64Col(shape=(), dflt=1.0, pos=0),
      "y": UInt8Col(shape=(), dflt=1, pos=1)},
    "name": StringCol(itemsize=10, shape=(), dflt=b'', pos=2)},
  "info1": {
    "value": Float64Col(shape=(), dflt=0.0, pos=0),
    "name": StringCol(itemsize=10, shape=(), dflt=b'', pos=1)},
  "color": EnumCol(enum=Enum({'red': 0, 'green': 1, 'blue': 2}), dflt='red', base=UInt32Atom(shape=(), dflt=0), shape=(), pos=2)}
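Besides its printed form, the description object exposes the nested layout programmatically. The sketch below assumes the `_v_names`, `_v_nested_names`, and `_v_dtype` attributes of the PyTables `Description` class; the `Point` and `Sample` classes and the temporary file are hypothetical and serve only this illustration.

```python
import os
import tempfile

import tables as tb


class Point(tb.IsDescription):
    """Hypothetical sub-structure for this sketch."""
    _v_pos = 1
    x = tb.Float64Col(pos=0)
    y = tb.Float64Col(pos=1)


class Sample(tb.IsDescription):
    """Hypothetical root description."""
    label = tb.StringCol(8, pos=0)
    point = Point()


path = os.path.join(tempfile.mkdtemp(), "nested-descr.h5")
with tb.open_file(path, "w") as h5:
    tbl = h5.create_table(h5.root, "sample", Sample)
    desc = tbl.description
    names = list(desc._v_names)          # top-level column names
    nested = list(desc._v_nested_names)  # names with nesting preserved
    dtype = desc._v_dtype                # equivalent NumPy nested dtype

print(names)
print(nested)
print(dtype)
```

The NumPy dtype obtained this way is the record type of the structured arrays that reading the table returns.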