Taken from http://docs.h5py.org/en/stable/quick.html#appendix-creating-a-file

In [1]:
import h5py
import numpy as np
f = h5py.File("mytestfile.hdf5", "w")

The File object has several useful methods. One of them is create_dataset, which, as the name suggests, creates a dataset of a given shape and dtype:

In [2]:
dset = f.create_dataset("mydataset", (100,), dtype='i')

The File object is a context manager, so the following code works too. The file is closed automatically when the with block ends:

In [3]:
# with h5py.File("mytestfile.hdf5", "w") as f:
#     dset = f.create_dataset("mydataset", (100,), dtype='i')
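
A minimal, self-contained sketch of the context-manager form. The temporary directory and the file name example.hdf5 are assumptions made for this illustration, chosen so that mode "w" does not truncate mytestfile.hdf5:

```python
import os
import tempfile

import h5py

# Work in a throwaway directory (an assumption for this sketch) so that
# mode "w" cannot truncate a file we care about.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.hdf5")

    with h5py.File(path, "w") as f:
        dset = f.create_dataset("mydataset", (100,), dtype="i")
        # Read what we need while the file is still open.
        dset_name = dset.name
        dset_shape = dset.shape

    # Leaving the with block closes the file; a closed File object is falsy.
    file_is_open = bool(f)

print(dset_name, dset_shape, file_is_open)
```

Values such as the name and shape are copied out inside the with block because a Dataset object is no longer usable once its file has been closed.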

“HDF” stands for “Hierarchical Data Format”. Every object in an HDF5 file has a name, and they’re arranged in a POSIX-style hierarchy with /-separators:

In [4]:
dset.name

'/mydataset'

The “folders” in this system are called groups. The File object we created is itself a group, in this case the root group, named /:

In [5]:
f.name

'/'

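Creating a subgroup is accomplished via the aptly named create_group. The file must be writable for this; here we open a file in “append” mode (read/write if it exists, create otherwise). Note that this call names mydataset.hdf5 rather than mytestfile.hdf5, so f now refers to a new, separate file: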

In [6]:
f = h5py.File('mydataset.hdf5', 'a')

In [7]:
grp = f.create_group("subgroup")

Like File, all Group objects have the create_* methods:

In [8]:
dset2 = grp.create_dataset("another_dataset", (50,), dtype='f')
dset2.name

'/subgroup/another_dataset'

By the way, you don’t have to create all the intermediate groups manually. Specifying a full path works just fine:

In [9]:
dset3 = f.create_dataset('subgroup2/dataset_three', (10,), dtype='i')

In [10]:
dset3.name

'/subgroup2/dataset_three'
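
As a rough illustration of the automatic creation of intermediate groups, the sketch below creates a dataset several levels deep and then checks that the levels in between exist as groups. The path a/b/c/data and the temporary file are hypothetical names chosen for this example:

```python
import os
import tempfile

import h5py

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.hdf5")

    with h5py.File(path, "w") as f:
        # Only the full path is given; groups a, a/b and a/b/c
        # are created on the way.
        f.create_dataset("a/b/c/data", (10,), dtype="i")

        intermediate_is_group = isinstance(f["a/b"], h5py.Group)
        parent_name = f["a/b/c/data"].parent.name
        top_level_names = list(f.keys())

print(intermediate_is_group, parent_name, top_level_names)
```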

Groups support most of the Python dictionary-style interface. You retrieve objects in the file using the item-retrieval syntax:

In [11]:
dataset_three = f['subgroup2/dataset_three']

Iterating over a group provides the names of its members:

In [12]:
for name in f:
    print(name)

subgroup
subgroup2


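Membership testing also uses names. Note that "mydataset" is not found below, because f was rebound in In [6] to mydataset.hdf5, which contains only the two subgroups: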

In [13]:
"mydataset" in f

False

In [14]:
"somethingelse" in f

False

In [15]:
"subgroup/another_dataset" in f

True

There are also the familiar keys(), values(), items() and iter() methods, as well as get().
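
The dictionary-style calls mentioned above can be sketched as follows. This is an illustration on a throwaway file (the temporary path is an assumption), not a complete tour of the Group interface:

```python
import os
import tempfile

import h5py

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.hdf5")

    with h5py.File(path, "w") as f:
        grp = f.create_group("subgroup")
        grp.create_dataset("another_dataset", (50,), dtype="f")

        # keys() lists only the direct members of the group.
        names = sorted(f.keys())

        # Membership tests accept full paths as well as direct names.
        has_nested = "subgroup/another_dataset" in f

        # get() returns None (or a supplied default) instead of raising KeyError.
        missing = f.get("somethingelse")

        # items() yields (name, object) pairs.
        kinds = {name: type(obj).__name__ for name, obj in f.items()}

print(names, has_nested, missing, kinds)
```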

Since iterating over a group only yields its directly-attached members, iterating over an entire file is accomplished with the Group methods visit() and visititems(), which take a callable:

In [16]:
def printname(name):
    print(name)

In [17]:
f.visit(printname)

subgroup
subgroup/another_dataset
subgroup2
subgroup2/dataset_three
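
visititems() works the same way, except that the callable receives both the name and the object. A small sketch, assuming a throwaway file laid out like the one above; the helper name record_dataset is made up for this example:

```python
import os
import tempfile

import h5py

dataset_shapes = {}


def record_dataset(name, obj):
    """Record the shape of every Dataset encountered during the walk."""
    if isinstance(obj, h5py.Dataset):
        dataset_shapes[name] = obj.shape
    # Returning None tells visititems() to keep walking.


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.hdf5")

    with h5py.File(path, "w") as f:
        f.create_dataset("subgroup/another_dataset", (50,), dtype="f")
        f.create_dataset("subgroup2/dataset_three", (10,), dtype="i")
        f.visititems(record_dataset)

print(dataset_shapes)
```

If the callable returns anything other than None, the traversal stops and that value is returned, which is handy for "find the first match" searches.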


In [18]:
dset

<HDF5 dataset "mydataset": shape (100,), type "<i4">

In [20]:
dset.name
f.name

'/'

Attributes

One of the best features of HDF5 is that you can store metadata right next to the data it describes. All groups and datasets support attached named bits of data called attributes.

Attributes are accessed through the attrs proxy object, which again implements the dictionary interface:

In [21]:
dset.attrs['temperature'] = 99.5

In [22]:
dset.attrs['temperature']

99.5

In [23]:
'temperature' in dset.attrs

True
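
A short sketch of a few more attribute operations, run against a temporary file. The attribute names units and description, and the default string passed to get(), are made up for this illustration:

```python
import os
import tempfile

import h5py

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.hdf5")

    with h5py.File(path, "w") as f:
        dset = f.create_dataset("mydataset", (100,), dtype="i")

        # Attributes can be attached to datasets...
        dset.attrs["temperature"] = 99.5
        dset.attrs["units"] = "celsius"
        # ...and to groups, including the root group (the File itself).
        f.attrs["description"] = "example file"

        attr_names = sorted(dset.attrs.keys())
        temperature = float(dset.attrs["temperature"])
        # get() with a default avoids a KeyError for absent attributes.
        missing = dset.attrs.get("pressure", "not recorded")
        file_description = f.attrs["description"]

print(attr_names, temperature, missing, file_description)
```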

An HDF5 file is a container for two kinds of objects: datasets, which are array-like collections of data, and groups, which are folder-like containers that hold datasets and other groups. The most fundamental thing to remember when using h5py is:

    Groups work like dictionaries, and datasets work like NumPy arrays

Suppose someone has sent you an HDF5 file, mytestfile.hdf5. (To create this file, read Appendix: Creating a file.) The very first thing you’ll need to do is to open the file for reading:

In [24]:
f2 = h5py.File('mytestfile.hdf5', 'r')

The File object is your starting point. What is stored in this file? Remember that h5py.File acts like a Python dictionary, so we can check its keys:

In [25]:
list(f2.keys())

['mydataset']

The file contains a single dataset, mydataset. Let us retrieve it as a Dataset object:

In [27]:
dset = f2['mydataset']
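
To illustrate that datasets work like NumPy arrays, the sketch below writes some values to a throwaway file, reopens it read-only, and inspects the dataset with array-style attributes and slicing. The temporary path and the np.arange(100) contents are assumptions for this example:

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.hdf5")

    # Create a file with some known contents.
    with h5py.File(path, "w") as f:
        dset = f.create_dataset("mydataset", (100,), dtype="i")
        dset[...] = np.arange(100)

    # Reopen it read-only and treat the dataset like an array.
    with h5py.File(path, "r") as f2:
        dset = f2["mydataset"]
        shape = dset.shape
        dtype = dset.dtype
        first = int(dset[0])
        strided = dset[0:100:10]     # slicing returns a NumPy array
        total = int(dset[:].sum())   # dset[:] reads the whole dataset

print(shape, dtype, first, strided, total)
```

Slicing reads only the requested elements from disk, so the full dataset does not have to fit in memory unless you ask for all of it.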

The file name may be a byte string or a unicode string. Valid modes are:

    r        Read-only, file must exist
    r+       Read/write, file must exist
    w        Create file, truncate if exists
    w- or x  Create file, fail if exists
    a        Read/write if exists, create otherwise (the default before h5py 3.0; later versions default to r)
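
The behaviour of the modes can be checked with a short experiment on a temporary file. This is a sketch under the assumption that failures surface as OSError or one of its subclasses; the exact exception type may differ between h5py versions:

```python
import os
import tempfile

import h5py

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.hdf5")

    # "r" requires the file to exist already.
    try:
        h5py.File(path, "r")
        read_missing_failed = False
    except OSError:
        read_missing_failed = True

    # "x" creates the file the first time...
    with h5py.File(path, "x") as f:
        f.create_dataset("data", (5,), dtype="i")

    # ...and refuses to overwrite it the second time.
    try:
        h5py.File(path, "x")
        exclusive_create_failed = False
    except OSError:
        exclusive_create_failed = True

    # "a" keeps the existing contents and allows additions.
    with h5py.File(path, "a") as f:
        f.create_group("extra")
        names_after_append = sorted(f.keys())

    # "w" truncates: the earlier contents are gone.
    with h5py.File(path, "w") as f:
        names_after_truncate = list(f.keys())

print(read_missing_failed, exclusive_create_failed)
print(names_after_append, names_after_truncate)
```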