# File exploration

To find data quickly or to collect specific objects, the `h5RDMtoolbox` provides a few helper methods. First, import the package:

In [None]:
import h5rdmtoolbox as h5tbx

As always, let's build an HDF5 file from scratch:

In [None]:
with h5tbx.H5File() as h5:
    h5.create_group('grp_1')
    h5.create_group('grp_2', long_name='my other group', attrs=dict(one=2, two='a second attr'))
    h5.create_dataset('ds_1', shape=(2, 4), units='', long_name='dataset 1')
    h5.create_dataset('ds_2', shape=(2, 4), units='', long_name='dataset 2')
    h5.create_dataset('gr_1/ds_1', shape=(2, 4), units='', long_name='dataset 2')

To get all groups at the current level, call `get_groups()`:

In [None]:
with h5tbx.H5File(h5.hdf_filename) as h5:
    print(h5.get_groups())

To get only the groups whose names match a pattern, pass a regular expression string (it is evaluated with Python's `re` package):

In [None]:
with h5tbx.H5File(h5.hdf_filename) as h5:
    print(h5.get_groups('^grp_[0-9]$'))
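The pattern is an ordinary regular expression. As a rough illustration that does not touch the toolbox at all, the sketch below applies the same pattern to a hypothetical list of names using `re.search`. Whether `get_groups` calls `re.search` or `re.match` internally is an assumption here; with the `^` and `$` anchors both behave the same.

```python
import re

# Hypothetical object names, made up for illustration (not read from a file)
names = ['grp_1', 'grp_2', 'grp_10', 'my_grp_1', 'ds_1']

# Same pattern as above: "grp_" followed by exactly one digit, nothing else
pattern = '^grp_[0-9]$'

# Keep only the names that match the pattern
matching = [name for name in names if re.search(pattern, name)]
print(matching)  # ['grp_1', 'grp_2']
```

Note that `grp_10` is rejected because `[0-9]` matches a single digit; use `[0-9]+` to allow multi-digit suffixes.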

The same works for datasets:

In [None]:
with h5tbx.H5File(h5.hdf_filename) as h5:
    print(h5.get_datasets('^ds_[0-9]$'))

You can find datasets or groups by searching for specific attributes:

In [None]:
with h5tbx.H5File(h5.hdf_filename) as h5:
    print(h5.get_by_attribute('long_name', 'dataset 1', recursive=False))
    print(h5.get_by_attribute('long_name', 'dataset 2', recursive=True, h5type='dataset'))
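To make the effect of `recursive` concrete, here is a minimal sketch in plain Python. It models a file as nested dictionaries and is only a toy stand-in: the tree layout, the helper `find_by_attribute` and its `prefix` parameter are hypothetical and say nothing about how the toolbox implements the search.

```python
# Toy stand-in for an HDF5 tree: every node has attributes and children.
tree = {
    'attrs': {},
    'children': {
        'ds_1': {'attrs': {'long_name': 'dataset 1'}, 'children': {}},
        'ds_2': {'attrs': {'long_name': 'dataset 2'}, 'children': {}},
        'grp_1': {
            'attrs': {},
            'children': {
                'ds_1': {'attrs': {'long_name': 'dataset 2'}, 'children': {}},
            },
        },
    },
}


def find_by_attribute(node, name, value, recursive=True, prefix=''):
    """Return the paths of nodes below `node` whose attribute `name` equals `value`."""
    found = []
    for child_name, child in node['children'].items():
        path = f'{prefix}/{child_name}'
        if child['attrs'].get(name) == value:
            found.append(path)
        if recursive:
            # Descend into the child and collect matches from deeper levels
            found.extend(find_by_attribute(child, name, value, True, path))
    return found


shallow = find_by_attribute(tree, 'long_name', 'dataset 2', recursive=False)
deep = find_by_attribute(tree, 'long_name', 'dataset 2', recursive=True)
print(shallow)  # ['/ds_2']
print(deep)     # ['/ds_2', '/grp_1/ds_1']
```

In this model the non-recursive search only inspects the direct children of the starting node, while the recursive search also reports the match nested inside `grp_1`.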

If you specify the object type via `h5type`, or call the type-specific method directly, only objects of that type are returned:

In [None]:
with h5tbx.H5File(h5.hdf_filename) as h5:
    print(h5.get_datasets_by_attribute('long_name', 'dataset 2', recursive=True))