# Writing hepfiles from Dictionaries

In [1]:
# imports
import hepfile as hf

Suppose we have two events from a particle physics experiment. For each event we measured the x and y momentum components of the jets and of the muons, as well as the number of particles emitted. Each event can be represented as a nested dictionary:

In [2]:
event1 = {
    'jet': {
        'px': [1, 2, 3],
        'py': [1, 2, 3],
    },
    'muons': {
        'px': [1, 2, 3],
        'py': [1, 2, 3],
    },
    'nParticles': 3,
}

event2 = {
    'jet': {
        'px': [3, 4, 6, 7],
        'py': [3, 4, 6, 7],
    },
    'muons': {
        'px': [3, 4, 6, 7],
        'py': [3, 4, 6, 7],
    },
    'nParticles': 4,
}

Note that these events do not contain the same number of particles, so they cannot be stored in a typical "homogeneous" data structure such as a rectangular array. This is where hepfile becomes useful!
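
To make the mismatch concrete, here is a minimal standalone sketch in plain Python (no hepfile calls) that compares the number of jet `px` entries in two events shaped like the ones above. The names `events`, `lengths`, and `is_rectangular` are illustrative only.

```python
# Two events shaped like event1 and event2 above
# (only the jet px values are needed for this check).
events = [
    {'jet': {'px': [1, 2, 3]}},
    {'jet': {'px': [3, 4, 6, 7]}},
]

# Number of jet px entries in each event.
lengths = [len(event['jet']['px']) for event in events]

# A rectangular (homogeneous) table needs every row to have the same length.
is_rectangular = len(set(lengths)) == 1

print(lengths)         # [3, 4]
print(is_rectangular)  # False
```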

To easily create a hepfile from these events we can store them in a list:

In [3]:
to_write_to_hepfile = [event1, event2]

Then, to write this list out to a hepfile, we call `dictlike_to_hepfile` from the `dict_tools` module of `hepfile`:

In [4]:
out_filename = 'output_from_dict.hdf5'
data = hf.dict_tools.dictlike_to_hepfile(to_write_to_hepfile, out_filename)
data.show()

Adding group jet
Adding a counter for jet as njet
Adding dataset px to the dictionary under group jet.
Adding dataset py to the dictionary under group jet.
Adding group muons
Adding a counter for muons as nmuons
Adding dataset px to the dictionary under group muons.
Adding dataset py to the dictionary under group muons.
Adding dataset nParticles to the dictionary as a SINGLETON.
Writing the hdf5 file from the awkward array...
{'_SINGLETONS_GROUP_/COUNTER': <class 'int'>, 'jet/njet': <class 'int'>, 'jet/px': <class 'numpy.int64'>, 'jet/py': <class 'numpy.int64'>, 'muons/nmuons': <class 'int'>, 'muons/px': <class 'numpy.int64'>, 'muons/py': <class 'numpy.int64'>, 'nParticles': <class 'numpy.int64'>}
_SINGLETONS_GROUP_/COUNTER       has 2            entries
jet/njet                         has 2            entries
muons/nmuons                     has 2            entries
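
The log above mentions a counter (`njet`, `nmuons`) for each group. As a rough mental model, and not a description of hepfile's actual implementation, variable-length lists can be stored as one flat list of values plus a counter recording how many values belong to each event. The helper name `flatten_with_counter` is hypothetical.

```python
def flatten_with_counter(per_event_values):
    """Flatten a list of variable-length lists into (flat_values, counter)."""
    flat_values = []
    counter = []
    for values in per_event_values:
        flat_values.extend(values)   # append this event's values to the flat list
        counter.append(len(values))  # remember how many values this event had
    return flat_values, counter

# Jet px values of the two events from this tutorial.
jet_px_per_event = [[1, 2, 3], [3, 4, 6, 7]]
jet_px, njet = flatten_with_counter(jet_px_per_event)

print(jet_px)  # [1, 2, 3, 3, 4, 6, 7]
print(njet)    # [3, 4]
```

In this picture the counter has one entry per event, which is consistent with the "has 2 entries" lines in the output above.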

In [5]:
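# rebuild the hepfile data dictionary from the returned awkward array, without writing a file
d = hf.awkward_tools.awkward_to_hepfile(data, write_hepfile=False)
# collect its keys for comparison with the loaded file
ds = set(d.keys())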

Adding group jet
Adding a counter for jet as njet
Adding dataset px to the dictionary under group jet.
Adding dataset py to the dictionary under group jet.
Adding group muons
Adding a counter for muons as nmuons
Adding dataset px to the dictionary under group muons.
Adding dataset py to the dictionary under group muons.
Adding dataset nParticles to the dictionary as a SINGLETON.


In [6]:
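# read the file back; the second return value is not needed here
x, _ = hf.load(out_filename)
# collect the keys of the loaded dictionary
xs = set(x.keys())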

Building the indices...

Built the indices!
Data is read in and input file is closed.


In [7]:
# in input to write but not in output of load
ds - xs

{'_GROUPS_', '_MAP_DATASETS_TO_DATA_TYPES_', '_PROTECTED_NAMES_'}

In [8]:
# in output of load but not in input to write
xs - ds

{'_LIST_OF_DATASETS_',
 '_MAP_DATASETS_TO_INDEX_',
 '_NUMBER_OF_BUCKETS_',
 '_SINGLETONS_GROUP_',
 '_SINGLETONS_GROUP_/COUNTER_INDEX',
 'jet/njet_INDEX',
 'muons/nmuons_INDEX'}
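
The `*_INDEX` keys appear only after loading (the log says "Building the indices..."). One plausible reading, stated here as an assumption and not taken from the hepfile source, is that each index stores the starting offset of every event within the flat dataset: the running sum of the counter, shifted back by one event. A sketch under that assumption follows; `index_from_counter` and `slice_event` are hypothetical names.

```python
from itertools import accumulate

def index_from_counter(counter):
    """Start offset of each event: cumulative sum of the counter minus the counter itself."""
    return [end - n for end, n in zip(accumulate(counter), counter)]

def slice_event(flat_values, counter, index, i):
    """Return the values that belong to event i."""
    return flat_values[index[i]: index[i] + counter[i]]

# Counter and flat values for the jet px data of this tutorial.
njet = [3, 4]
jet_px = [1, 2, 3, 3, 4, 6, 7]

njet_index = index_from_counter(njet)
print(njet_index)                                # [0, 3]
print(slice_event(jet_px, njet, njet_index, 1))  # [3, 4, 6, 7]
```

With offsets precomputed, the values of any single event can be sliced out directly instead of summing the counter on every access.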

In [9]:
x

{'_MAP_DATASETS_TO_COUNTERS_': {'_SINGLETONS_GROUP_': '_SINGLETONS_GROUP_/COUNTER',
  'jet': 'jet/njet',
  'jet/px': 'jet/njet',
  'jet/py': 'jet/njet',
  'muons': 'muons/nmuons',
  'muons/px': 'muons/nmuons',
  'muons/py': 'muons/nmuons',
  'nParticles': '_SINGLETONS_GROUP_/COUNTER'},
 '_MAP_DATASETS_TO_INDEX_': {'_SINGLETONS_GROUP_': '_SINGLETONS_GROUP_/COUNTER_INDEX',
  'jet': 'jet/njet_INDEX',
  'jet/px': 'jet/njet_INDEX',
  'jet/py': 'jet/njet_INDEX',
  'muons': 'muons/nmuons_INDEX',
  'muons/px': 'muons/nmuons_INDEX',
  'muons/py': 'muons/nmuons_INDEX',
  'nParticles': '_SINGLETONS_GROUP_/COUNTER_INDEX'},
 '_LIST_OF_COUNTERS_': ['_SINGLETONS_GROUP_/COUNTER',
  'jet/njet',
  'muons/nmuons'],
 '_LIST_OF_DATASETS_': ['_SINGLETONS_GROUP_',
  '_SINGLETONS_GROUP_/COUNTER',
  'jet',
  'jet/njet',
  'jet/px',
  'jet/py',
  'muons',
  'muons/nmuons',
  'muons/px',
  'muons/py',
  'nParticles'],
 '_NUMBER_OF_BUCKETS_': 2,
 '_SINGLETONS_GROUP_': ['nParticles'],
 '_SINGLETONS_GROUP_/COUNTER'

In [10]:
d

{'_GROUPS_': {'_SINGLETONS_GROUP_': ['COUNTER', 'nParticles'],
  'jet': ['njet', 'px', 'py'],
  'muons': ['nmuons', 'px', 'py']},
 '_MAP_DATASETS_TO_COUNTERS_': {'_SINGLETONS_GROUP_': '_SINGLETONS_GROUP_/COUNTER',
  'jet': 'jet/njet',
  'jet/px': 'jet/njet',
  'jet/py': 'jet/njet',
  'muons': 'muons/nmuons',
  'muons/px': 'muons/nmuons',
  'muons/py': 'muons/nmuons',
  'nParticles': '_SINGLETONS_GROUP_/COUNTER'},
 '_LIST_OF_COUNTERS_': ['_SINGLETONS_GROUP_/COUNTER',
  'jet/njet',
  'muons/nmuons'],
 '_SINGLETONS_GROUP_/COUNTER': [1, 1],
 '_MAP_DATASETS_TO_DATA_TYPES_': {'_SINGLETONS_GROUP_/COUNTER': int,
  'jet/njet': int,
  'jet/px': numpy.int64,
  'jet/py': numpy.int64,
  'muons/nmuons': int,
  'muons/px': numpy.int64,
  'muons/py': numpy.int64,
  'nParticles': numpy.int64},
 '_PROTECTED_NAMES_': ['_PROTECTED_NAMES_',
  '_GROUPS_',
  '_MAP_DATASETS_TO_COUNTERS_',
  '_MAP_DATASETS_TO_DATA_TYPES__LIST_OF_COUNTERS_',
  '_SINGLETONS_GROUP_',
  '_SINGLETONS_GROUP_/COUNTER'],
 'jet/njet': 

In [11]:
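from copy import deepcopy

# copy the loaded dictionary so that x itself is left unchanged
xnew = deepcopy(x)
# restore two of the metadata entries that were in d but not in the loaded data
xnew['_GROUPS_'] = d['_GROUPS_']
xnew['_MAP_DATASETS_TO_DATA_TYPES_'] = d['_MAP_DATASETS_TO_DATA_TYPES_']
# write the patched dictionary to a new file
hf.write_to_file('test2.h5', xnew)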

{'_SINGLETONS_GROUP_/COUNTER': <class 'int'>, 'jet/njet': <class 'int'>, 'jet/px': <class 'numpy.int64'>, 'jet/py': <class 'numpy.int64'>, 'muons/nmuons': <class 'int'>, 'muons/px': <class 'numpy.int64'>, 'muons/py': <class 'numpy.int64'>, 'nParticles': <class 'numpy.int64'>}
_SINGLETONS_GROUP_/COUNTER       has 2            entries
jet/njet                         has 2            entries
muons/nmuons                     has 2            entries
Metadata added


<Closed HDF5 file>
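
The patching pattern in the last cell (copy, then restore selected keys) can be tried in isolation with small stand-in dictionaries. Nothing here calls hepfile; `loaded`, `built`, and `patch_missing_keys` are hypothetical names, and the dictionary contents are abbreviated stand-ins for the real ones.

```python
from copy import deepcopy

def patch_missing_keys(target, source, keys):
    """Return a deep copy of target with the given keys copied over from source."""
    patched = deepcopy(target)
    for key in keys:
        patched[key] = deepcopy(source[key])
    return patched

# Stand-in for the dictionary returned by loading a file (metadata keys missing).
loaded = {'jet/njet': [3, 4]}

# Stand-in for the dictionary built in memory (metadata keys present).
built = {
    'jet/njet': [3, 4],
    '_GROUPS_': {'jet': ['njet', 'px', 'py']},
    '_MAP_DATASETS_TO_DATA_TYPES_': {'jet/njet': int},
}

patched = patch_missing_keys(loaded, built, ['_GROUPS_', '_MAP_DATASETS_TO_DATA_TYPES_'])

print(sorted(patched))       # ['_GROUPS_', '_MAP_DATASETS_TO_DATA_TYPES_', 'jet/njet']
print('_GROUPS_' in loaded)  # False
```

Unlike the cell above, this sketch also deep-copies the restored values, so later changes to `patched` cannot leak back into `built`.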