# Working with Awkward Arrays

Awkward Array is an existing Python library that allows arrays of different lengths to be stored in a single array. This can be very useful for those working with "heterogeneous" data.

In [1]:
# imports
import awkward as ak
import hepfile as hf

## Introduction to Awkward Arrays

This is a general overview; see https://awkward-array.org/doc/main/index.html for more details.

Say we have a nested list, `list1`, whose inner lists have different lengths.

In [2]:
list1 = [[1,2,3],
        [4,5],
        [6]]

Sadly, NumPy doesn't allow for easy manipulations/calculations with such "ragged" arrays. That is where the `awkward` package becomes very useful. We can create an awkward array from `list1` with the following code:

In [3]:
awk = ak.Array(list1)
print(awk)
print(type(awk))

[[1, 2, 3], [4, 5], [6]]
<class 'awkward.highlevel.Array'>


Then, we can do many of the same calculations that we would normally do with NumPy:

In [4]:
# sum along different axes
print(f'Total Sum = {ak.sum(awk)}')
print(f'Sum of columns = {ak.sum(awk, axis=0)}')
print(f'Sum of rows = {ak.sum(awk, axis=1)}')

Total Sum = 21
Sum of columns = [11, 7, 3]
Sum of rows = [6, 9, 6]
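
To make the axis convention concrete, the same three sums can be reproduced by hand in plain Python. This is only an illustrative sketch of what the reductions mean for a ragged list, not how `awkward` computes them internally. Padding the shorter lists with 0 is an assumption that is harmless for sums only.

```python
from itertools import zip_longest

list1 = [[1, 2, 3], [4, 5], [6]]

# axis=1: sum within each inner list (the "rows")
row_sums = [sum(row) for row in list1]

# axis=0: sum the entries that share a position across the inner lists
# (the "columns"); positions missing from shorter lists contribute 0
col_sums = [sum(col) for col in zip_longest(*list1, fillvalue=0)]

# no axis: sum every entry
total = sum(row_sums)

print(f'Total Sum = {total}')
print(f'Sum of columns = {col_sums}')
print(f'Sum of rows = {row_sums}')
```

The result for `axis=0` has as many entries as the longest inner list, while the result for `axis=1` has one entry per inner list.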


## Converting hepfiles to awkward arrays

All of the awkward tools for hepfile are in `hepfile.awkward_tools`.

We have built in an easy method, `hepfile_to_awkward`, to go from the output of the `hepfile.read.load` method to an awkward array.

**Note:** This section of this tutorial assumes you have completed the *writing_hepfiles_from_dicts* tutorial!

In [5]:
infile = 'output_from_dict.hdf5'

# read in the hepfile data
data, _ = hf.load(infile)

# convert it to an awkward array
dataAwk = hf.awkward_tools.hepfile_to_awkward(data)
print()
print('Awkward Array:\n')
dataAwk.show()

Building the indices...

Built the indices!
Data is read in and input file is closed.
['jet' 'muons' 'nParticles'] ['_SINGLETONS_GROUP_', '_SINGLETONS_GROUP_/COUNTER', 'jet', 'jet/njet', 'jet/px', 'jet/py', 'muons', 'muons/nmuons', 'muons/px', 'muons/py', 'nParticles']

Awkward Array:

{jet: [{px: [1, 2, 3], py: [1, 2, 3]}, {px: [3, ..., 7], py: [...]}],
 muons: [{px: [1, 2, 3], py: [1, 2, 3]}, {px: [3, ..., 7], py: ..., ...}],
 nParticles: [3, 4]}


Such a structure may be more intuitive to some and may make some analysis easier.

As a side note, `hepfile.awkward_tools.awkward_to_hepfile` also exists. However, this method is meant to be called from the `hepfile.dict_tools.dictlike_to_hepfile` method. A user could call it directly, but the better practice is to use the `dict_tools` wrapper method.