In [29]:
import numpy as np

from tables import *

PyTables includes support for handling enumerated types. Those types are defined by providing an exhaustive set or list of possible, named values for a variable of that type. Enumerated variables of the same type are usually compared with one another for equality and sometimes for order, but they are not usually operated upon.

Each enumerated value has an associated name and a concrete value. Names are unique, and so are concrete values. An enumerated variable always holds the concrete value, not its name. Usually, the concrete value is not used directly, and frequently it is entirely irrelevant. For the same reason, an enumerated variable is not usually compared with concrete values outside its enumerated type. For that kind of use, standard variables and constants are more appropriate.

PyTables provides the Enum class (see The Enum class) to support enumerated types. Each instance of Enum is an enumerated type (or enumeration). For example, let us create an enumeration of colors (all these examples can be found in examples/play-with-enums.py):

In [3]:
colorList = ['red', 'green', 'blue', 'white', 'black']
colors = Enum(colorList)

Here we used a simple list giving the names of enumerated values, but we left the choice of concrete values up to the Enum class. Let us see the enumerated pairs to check those values:

In [13]:
print("Colors:", [v for v in colors])

Colors: [('red', 0), ('green', 1), ('blue', 2), ('white', 3), ('black', 4)]


Names have been given automatic integer concrete values. We can iterate over the (name, value) pairs of an enumeration, but we will usually be more interested in accessing single values. We can get the concrete value associated with a name by accessing it as an attribute or as an item (the latter can be useful for names that are not valid Python identifiers):

In [15]:
print("Value of 'red' and 'white':", (colors.red, colors.white))

Value of 'red' and 'white': (0, 3)


In [16]:
print("Value of 'yellow':", colors.yellow)

AttributeError: no enumerated value with that name: 'yellow'

In [17]:
print("Value of 'red' and 'white':", (colors['red'], colors['white']))

Value of 'red' and 'white': (0, 3)


In [18]:
print("Value of 'yellow':", colors['yellow'])

KeyError: "no enumerated value with that name: 'yellow'"

See how accessing a name that is not in the enumeration raises the appropriate exception: AttributeError for attribute access and KeyError for item access. We can also go in the opposite direction and get the name that matches a concrete value by using the `__call__()` method of Enum:

In [19]:
print("Name of value %s:" % colors.red, colors(colors.red))

Name of value 0: red


In [20]:
print("Name of value 1234:", colors(1234))

ValueError: no enumerated value with that concrete value: 1234
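When an exception is not wanted, the lookups above can be guarded instead. The following is a minimal sketch, assuming (per the Enum class reference) that Enum supports the `in` operator for names and that calling the enumeration accepts an optional second argument returned as a fallback for unknown concrete values. The fallback string 'unknown' is our own choice, not something PyTables defines.

```python
from tables import Enum

colors = Enum(['red', 'green', 'blue', 'white', 'black'])

# Membership test on names: no exception for a missing name.
has_red = 'red' in colors
has_yellow = 'yellow' in colors

# Value-to-name lookup with a fallback instead of a ValueError.
# 'unknown' is an arbitrary placeholder chosen for this sketch.
name_of_2 = colors(2, 'unknown')
name_of_1234 = colors(1234, 'unknown')

print(has_red, has_yellow, name_of_2, name_of_1234)
```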

# Enumerated columns
Columns of an enumerated type can be declared by using the EnumCol class (see The Col class and its descendants). To see how this works, let us open a new PyTables file and create a table to collect the simulated results of a probabilistic experiment. In it, we have a bag full of colored balls; we take a ball out and record the time of extraction and the color of the ball:

In [37]:
h5f = open_file('tmp/enum.h5', 'w')
class BallExt(IsDescription):
    ballTime = Time32Col()
    ballColor = EnumCol(colors, 'black', base='uint8')

tbl = h5f.create_table('/', 'extractions', BallExt, title="Random ball extractions")

In [38]:
tbl

/extractions (Table(0,)) 'Random ball extractions'
  description := {
  "ballColor": EnumCol(enum=Enum({'red': 0, 'green': 1, 'blue': 2, 'white': 3, 'black': 4}), dflt='black', base=UInt8Atom(shape=(), dflt=0), shape=(), pos=0),
  "ballTime": Time32Col(shape=(), dflt=0, pos=1)}
  byteorder := 'little'
  chunkshape := (13107,)

We declared the ballColor column to be of the enumerated type colors, with a default value of black. We also stated that we are going to store concrete values as unsigned 8-bit integer values.

In [39]:
# Let us use some random values to fill the table:
import time
import random

now = time.time()
row = tbl.row
for i in range(10):
    row['ballTime'] = now + i
    row['ballColor'] = colors[random.choice(colorList)]  # notice this
    row.append()

In [40]:
row['ballTime'] = now + 42
row['ballColor'] = 1234

ValueError: no enumerated value with that concrete value: 1234

In [41]:
tbl.flush()
for r in tbl:
    ballTime = r['ballTime']
    ballColor = colors(r['ballColor'])  # notice this
    print("Ball extracted on %d is of color %s." % (ballTime, ballColor))

Ball extracted on 1569437048 is of color green.
Ball extracted on 1569437049 is of color blue.
Ball extracted on 1569437050 is of color black.
Ball extracted on 1569437051 is of color blue.
Ball extracted on 1569437052 is of color blue.
Ball extracted on 1569437053 is of color white.
Ball extracted on 1569437054 is of color black.
Ball extracted on 1569437055 is of color blue.
Ball extracted on 1569437056 is of color green.
Ball extracted on 1569437057 is of color blue.


As a last note, you may be wondering how to access the enumeration associated with ballColor once the file is closed and reopened. You can call `tbl.get_enum('ballColor')` (see `Table.get_enum()`) to get the enumeration back.
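A round trip through a closed file can be sketched as follows. This is only an illustration under stated assumptions: the scratch path built with `tempfile` and the file name `enum_reopen.h5` are hypothetical, and the table is left empty because only the column description matters here.

```python
import os
import tempfile

import tables

colors = tables.Enum(['red', 'green', 'blue', 'white', 'black'])


class BallExt(tables.IsDescription):
    ballTime = tables.Time32Col()
    ballColor = tables.EnumCol(colors, 'black', base='uint8')


# Hypothetical scratch location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'enum_reopen.h5')

# Write the (empty) table and close the file.
with tables.open_file(path, 'w') as h5f:
    h5f.create_table('/', 'extractions', BallExt)

# Reopen read-only and recover the enumeration from the column.
with tables.open_file(path, 'r') as h5f:
    restored = h5f.root.extractions.get_enum('ballColor')

print(restored == colors)
```

The recovered object is an ordinary Enum, so it can be used for name-to-value and value-to-name conversions exactly like the original `colors`.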

# Enumerated arrays
EArray and VLArray leaves can also be declared to store enumerated values by means of the EnumAtom class (see The Atom class and its descendants), which works very much like EnumCol for tables. Also, Array leaves can be used to open native HDF5 enumerated arrays.

Let us create a sample EArray containing ranges of working days as bidimensional values:

In [43]:
workingDays = {'Mon': 1, 'Tue': 2, 'Wed': 3, 'Thu': 4, 'Fri': 5}
dayRange = EnumAtom(workingDays, 'Mon', base='uint16')
earr = h5f.create_earray('/', 'days', dayRange, (0, 2), title="Working day ranges")
earr.flavor = 'python'

Nothing surprising, except for a couple of details. First, we use a dictionary instead of a list to explicitly set concrete values in the enumeration. Second, there is no explicit Enum instance created! Instead, the dictionary is passed as the first argument to the constructor of EnumAtom. If the constructor gets a list or a dictionary instead of an enumeration, it automatically builds the enumeration from it.
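To make the automatic conversion concrete, the sketch below builds one atom from the dictionary and another from an explicit Enum, then compares the resulting enumerations through the atom's `enum` attribute. The variable names `atom_from_dict` and `atom_from_enum` are ours, introduced only for illustration.

```python
from tables import Enum, EnumAtom

workingDays = {'Mon': 1, 'Tue': 2, 'Wed': 3, 'Thu': 4, 'Fri': 5}

# Passing the dictionary directly: the atom builds the Enum itself.
atom_from_dict = EnumAtom(workingDays, 'Mon', base='uint16')

# Passing an explicit Enum instance should give an equivalent enumeration.
atom_from_enum = EnumAtom(Enum(workingDays), 'Mon', base='uint16')

same_enum = atom_from_dict.enum == atom_from_enum.enum
friday = atom_from_dict.enum.Fri

print(same_enum, friday)
```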

In [44]:
# Now let us feed some data to the array:
wdays = earr.get_enum()
earr.append([(wdays.Mon, wdays.Fri), (wdays.Wed, wdays.Fri)])
earr.append([(wdays.Mon, 1234)])

Please note that, since we had no explicit Enum instance, we were forced to use get_enum() (see EArray methods) to get it from the array (we could also have used dayRange.enum). Also note that we were able to append an invalid value (1234). Array methods do not check the validity of enumerated values.
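Since the array does not validate what it stores, any check has to happen before appending. One possible approach is sketched below; `valid_rows` is a hypothetical helper written for this example (not part of PyTables), and the scratch file path is likewise an assumption. It relies on the optional default argument of the Enum call to detect unknown concrete values without raising.

```python
import os
import tempfile

import tables

workingDays = {'Mon': 1, 'Tue': 2, 'Wed': 3, 'Thu': 4, 'Fri': 5}

# Hypothetical scratch file for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'enum_checked.h5')


def valid_rows(enum, rows):
    """Keep only rows whose concrete values all belong to `enum`."""
    # enum(value, None) returns None instead of raising for unknown values.
    return [row for row in rows
            if all(enum(value, None) is not None for value in row)]


with tables.open_file(path, 'w') as h5f:
    atom = tables.EnumAtom(workingDays, 'Mon', base='uint16')
    earr = h5f.create_earray('/', 'days', atom, (0, 2))
    wdays = earr.get_enum()

    candidates = [(wdays.Mon, wdays.Fri), (wdays.Mon, 1234)]
    checked = valid_rows(wdays, candidates)  # drops the row containing 1234
    earr.append(checked)
    stored_rows = int(earr.nrows)

print(checked, stored_rows)
```

Filtering silently is only one design choice; raising an error for the rejected rows would be equally reasonable, depending on how strict the application needs to be.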

In [55]:
# Finally, we will print the contents of the array:
for (d1, d2) in earr:
    print("From %s to %s (%d days)." % (wdays(d1), wdays(d2), d2-d1+1))

From Mon to Fri (5 days).
From Wed to Fri (3 days).


ValueError: no enumerated value with that concrete value: 1234

That was an example of operating on concrete values (the day count is computed directly from them). It also showed how the value-to-name conversion failed because the value 1234 did not belong to the enumeration.

Now we will close the file, and this little tutorial on enumerated types is done:

In [56]:
h5f.close()