# Attributes 101

**Source:** *Python and HDF5* by Andrew Collette, O'Reilly 2013.

HDF5 attributes are the main facility for storing *metadata*. In the simplest case, they are mere key/value pairs. In HDF5, however, they can also be full-blown array variables, albeit without some of the conveniences of their dataset cousins (no partial I/O, no chunking or compression, etc.).

In [1]:
import numpy as np, h5py

In [2]:
f = h5py.File('attrsdemo.hdf5','w', libver="latest")

In [3]:
dset = f.create_dataset('dataset',(100,))

The `attrs` property of an HDF5 object is the gateway to its collection of HDF5 attributes.

In [4]:
dset.attrs

<Attributes of HDF5 object at 4382102624>

## Create

HDF5 attributes can be created directly from certain Python objects, or from scratch if more control over their creation is required.

### From Python Object

In [5]:
dset.attrs['title'] = "Dataset from third round of experiments"

In [6]:
dset.attrs['sample_rate'] = 100e6    # 100 MHz digitizer setting

In [7]:
dset.attrs['run_id'] = 144

We can "stuff" entire Python objects into attributes. *This may or may not be a good idea.*

In [8]:
import pickle

In [9]:
pickled_object = pickle.dumps({'key': 42}, protocol=0)

In [10]:
pickled_object

"(dp0\nS'key'\np1\nI42\ns."

In [11]:
dset.attrs['object'] = pickled_object

In [12]:
obj = pickle.loads(dset.attrs['object'])

In [None]:
obj

In [None]:
dset.attrs['object']
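For small, simple structures, a text serialization is sometimes easier to inspect from other tools than a pickle. The following is a minimal sketch that swaps in JSON instead of `pickle`. The attribute name `settings` and the file name are hypothetical, and the file lives only in memory (`driver='core'` with `backing_store=False`) so nothing is written to disk.

```python
import json

import h5py

# In-memory file: nothing is written to disk.
with h5py.File('json_attr_demo.hdf5', 'w', driver='core', backing_store=False) as f:
    dset = f.create_dataset('dataset', (100,))

    # Serialize a plain dict to a JSON string and store it as a string attribute.
    # 'settings' is a hypothetical attribute name used only for this sketch.
    dset.attrs['settings'] = json.dumps({'key': 42})

    # Read the string back and decode it into a Python object again.
    restored = json.loads(dset.attrs['settings'])

print(restored)
```

Unlike a pickle, the JSON string stays readable to non-Python HDF5 tools, at the price of supporting only JSON-compatible types.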

Attributes can be full-blown array variables.

In [None]:
dset.attrs['ones'] = np.ones((100, 100))
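Reading an array attribute back returns the whole array at once, since attributes offer no partial I/O. A minimal sketch, assuming an in-memory file with a hypothetical file name:

```python
import numpy as np
import h5py

# In-memory file using the latest file format, so the large attribute fits.
with h5py.File('array_attr_demo.hdf5', 'w', driver='core',
               backing_store=False, libver='latest') as f:
    dset = f.create_dataset('dataset', (100,))
    dset.attrs['ones'] = np.ones((100, 100))

    # The attribute is read in full as a NumPy array.
    ones = dset.attrs['ones']

    # Any slicing happens in memory, after the complete read.
    first_row = ones[0]

print(ones.shape, ones.dtype)
```

If only parts of an array are needed on a regular basis, a dataset is usually the better home for it.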

#### (Not the) Latest File Format

In the old days, there was a 64 KB size limit on attributes. The file below is opened without `libver="latest"`, and this is what happened back then:

In [None]:
o = h5py.File('old.hdf5','w', driver="core")

In [None]:
o.attrs

In [None]:
o.attrs['ones'] = np.ones((100, 100))

In [None]:
o.close()

### From Scratch

Creating an attribute from scratch gives you more control over its (element) type and shape.

In [None]:
dset.attrs.create('two_byte_int', 190, dtype='i2')

In [None]:
dset.attrs['two_byte_int']
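The same call also works for array-valued attributes. The sketch below, which uses a hypothetical attribute name `small_matrix` and an in-memory file, stores a 2x3 array as unsigned one-byte integers and checks what comes back.

```python
import numpy as np
import h5py

with h5py.File('create_attr_demo.hdf5', 'w', driver='core', backing_store=False) as f:
    dset = f.create_dataset('dataset', (100,))

    # Request an explicit element type ('u1' = unsigned 8-bit integer)
    # instead of letting h5py infer one from the data.
    dset.attrs.create('small_matrix', np.arange(6).reshape(2, 3), dtype='u1')

    stored = dset.attrs['small_matrix']

print(stored.dtype, stored.shape)
```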

## Read

In [None]:
[(name, val) for name, val in dset.attrs.items()]

In [None]:
dset.attrs.get('run_id')

In [None]:
print(dset.attrs.get('missing'))
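The `attrs` object behaves much like a Python dictionary, so the usual mapping idioms apply. A minimal sketch on an in-memory file (the file name and the fallback value `'n/a'` are made up for illustration):

```python
import h5py

with h5py.File('read_attr_demo.hdf5', 'w', driver='core', backing_store=False) as f:
    dset = f.create_dataset('dataset', (100,))
    dset.attrs['title'] = "Dataset from third round of experiments"
    dset.attrs['run_id'] = 144

    names = sorted(dset.attrs.keys())          # attribute names
    count = len(dset.attrs)                    # number of attributes
    has_run_id = 'run_id' in dset.attrs        # membership test
    fallback = dset.attrs.get('missing', 'n/a')  # default for absent names

print(names, count, has_run_id, fallback)
```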

## Update

In [None]:
dset.attrs['run_id']

In [None]:
dset.attrs['run_id'] = 142

In [None]:
dset.attrs['run_id']

In [None]:
dset.attrs.modify('two_byte_int', 40000)

In [None]:
dset.attrs['two_byte_int']

## Delete

Unlike HDF5 objects, attributes are **not** linked, and the underlying space in the file is freed when an attribute is deleted.

In [None]:
del dset.attrs['run_id']

In [None]:
dset.attrs['run_id']

In [None]:
f.close()
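Accessing an attribute after it has been deleted raises a `KeyError`. A minimal sketch of handling that case, and of guarding a deletion with a membership test, again on a hypothetical in-memory file:

```python
import h5py

with h5py.File('delete_attr_demo.hdf5', 'w', driver='core', backing_store=False) as f:
    dset = f.create_dataset('dataset', (100,))
    dset.attrs['run_id'] = 144
    del dset.attrs['run_id']

    # Reading a deleted (or never created) attribute raises KeyError.
    try:
        dset.attrs['run_id']
        raised = False
    except KeyError:
        raised = True

    # Guard the deletion so it is skipped when the attribute is already gone.
    if 'run_id' in dset.attrs:
        del dset.attrs['run_id']

    remaining = len(dset.attrs)

print(raised, remaining)
```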

### Advanced Topic for Discussion

*What if I would like to share an attribute between several objects?*
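One possible starting point for the discussion, offered as a sketch rather than a definitive answer: since attributes are not linked, the shared values can be kept in a dataset of their own, and each object can carry an attribute holding an HDF5 object reference to it. The names `shared_calibration`, `a`, `b`, and `calibration` are hypothetical.

```python
import numpy as np
import h5py

with h5py.File('shared_attr_demo.hdf5', 'w', driver='core', backing_store=False) as f:
    # The shared values live in a dataset, which (unlike an attribute) can be referenced.
    shared = f.create_dataset('shared_calibration', data=np.array([1.0, 2.0, 3.0]))

    a = f.create_dataset('a', (10,))
    b = f.create_dataset('b', (10,))

    # Each object stores an object reference to the shared dataset.
    a.attrs['calibration'] = shared.ref
    b.attrs['calibration'] = shared.ref

    # Dereference through the file to reach the shared values.
    via_a = f[a.attrs['calibration']][...]
    via_b = f[b.attrs['calibration']][...]
    target_name = f[a.attrs['calibration']].name

print(target_name, via_a, via_b)
```

With this layout, updating `shared_calibration` once is visible through both `a` and `b`, at the cost of an extra dereferencing step when reading.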