# 2019-12-20-coffea-demo

This demo of the new Awkward Array was presented on December 20, 2019, before the final 1.0 version was released. Some interfaces may have changed. To run this notebook, make sure you have version 0.1.33 ([GitHub](https://github.com/scikit-hep/awkward-1.0/releases/tag/0.1.33), [pip](https://pypi.org/project/awkward1/0.1.33/)) by installing

```bash
pip install awkward1==0.1.33
```

The basic concepts are presented in the [old Awkward README](https://github.com/scikit-hep/awkward-array/tree/0.12.17#readme) and the motivation for a 1.0 rewrite is presented in the [new Awkward README](https://github.com/scikit-hep/awkward-1.0/tree/0.1.32#readme).

## High-level array class

The biggest user-facing change is that, instead of mixing NumPy arrays and `JaggedArray` objects, the new Awkward has a single `Array` class.

In [1]:
# FIXME: remove this!
import sys
sys.path.insert(0, "/home/jpivarski/irishep/awkward-1.0")

In [2]:
import numpy as np
import awkward1 as ak

array1 = ak.Array([[1.1, 2.2, 3.3], [], [4.4, 5.5]])
array1

<Array [[1.1, 2.2, 3.3], [], [4.4, 5.5]] type='3 * var * float64'>

In [3]:
array2 = ak.Array([{"x": 0, "y": []}, {"x": 1, "y": [1.1]}, {"x": 2, "y": [1.1, 2.2]}])
array2

<Array [{x: 0, y: []}, ... y: [1.1, 2.2]}] type='3 * {"x": int64, "y": var * flo...'>

The same `Array` class applies to all data structures, such as the array of lists in `array1` and the array of records of _x_ and _y_ in `array2`.

No user-level function will be restricted to some data types and not others: the result of an operation may depend on the type, but whether the operation can be used does not. At this time, the only operations implemented are conversions and descriptions.

In [4]:
ak.tolist(array1), ak.tolist(array2)

([[1.1, 2.2, 3.3], [], [4.4, 5.5]],
 [{'x': 0, 'y': []}, {'x': 1, 'y': [1.1]}, {'x': 2, 'y': [1.1, 2.2]}])

In [5]:
ak.tojson(array1), ak.tojson(array2)

('[[1.1,2.2,3.3],[],[4.4,5.5]]',
 '[{"x":0,"y":[]},{"x":1,"y":[1.1]},{"x":2,"y":[1.1,2.2]}]')
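As a quick cross-check that uses only the Python standard library (no Awkward calls), the JSON strings printed above parse back into the same Python objects that `ak.tolist` returned. The variable names here are illustrative, and the values are copied from the two cell outputs.

```python
import json

# Values copied from the In [4] and In [5] outputs above.
list1 = [[1.1, 2.2, 3.3], [], [4.4, 5.5]]
list2 = [{"x": 0, "y": []}, {"x": 1, "y": [1.1]}, {"x": 2, "y": [1.1, 2.2]}]
json1 = '[[1.1,2.2,3.3],[],[4.4,5.5]]'
json2 = '[{"x":0,"y":[]},{"x":1,"y":[1.1]},{"x":2,"y":[1.1,2.2]}]'

# Parsing the JSON text recovers the same nested lists and dicts.
roundtrip1 = json.loads(json1)
roundtrip2 = json.loads(json2)
print(roundtrip1 == list1, roundtrip2 == list2)
```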

In [6]:
ak.typeof(array1), ak.typeof(array2)

(3 * var * float64, 3 * {"x": int64, "y": var * float64})

(The data types are described using the [datashape language](https://datashape.readthedocs.io/en/latest/), though Awkward has some features that are beyond the [current datashape specification](https://github.com/blaze/datashape/issues/237), so we've made some reasonable extensions.)
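To make the type strings easier to read, here is a minimal pure-Python sketch that derives a datashape-like string from nested lists and dicts. It is not how Awkward computes types: `toy_typeof` and `toy_item_type` are hypothetical helpers, the `"unknown"` placeholder for empty data is an assumption of this sketch, and only lists, dicts, booleans, and numbers are handled.

```python
def toy_item_type(values):
    """Describe the common type of a list of items (illustrative only)."""
    if not values:
        return "unknown"  # placeholder chosen for this sketch
    if all(isinstance(v, list) for v in values):
        # Variable-length lists: describe the contents of all lists together.
        flattened = [x for v in values for x in v]
        return "var * " + toy_item_type(flattened)
    if all(isinstance(v, dict) for v in values):
        # Records: take field names from the first record.
        fields = list(values[0])
        parts = ['"%s": %s' % (f, toy_item_type([v[f] for v in values]))
                 for f in fields]
        return "{" + ", ".join(parts) + "}"
    if all(isinstance(v, bool) for v in values):
        return "bool"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "int64"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool)
           for v in values):
        return "float64"
    raise TypeError("mixed or unsupported item types")


def toy_typeof(data):
    """Prefix the item type with the outer length, e.g. '3 * ...'."""
    return "%d * %s" % (len(data), toy_item_type(data))


type1 = toy_typeof([[1.1, 2.2, 3.3], [], [4.4, 5.5]])
type2 = toy_typeof([{"x": 0, "y": []}, {"x": 1, "y": [1.1]},
                    {"x": 2, "y": [1.1, 2.2]}])
print(type1)
print(type2)
```

Reading left to right, each `N *` or `var *` is one level of nesting (fixed length `N` or variable length), and the final token is the type of the innermost items.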

The next major change in interface is that operations on arrays, such as `ak.tolist` and `ak.typeof` above, are free-standing functions rather than class methods. This leaves the array object's own namespace free for domain-specific (e.g. physics) methods, which free-standing functions cannot conflict with. For instance,

   * `ak.cross(array1, array2)` is an array-manipulation function (the cross-join of `array1` and `array2`).
   * `array1.cross(array2)` could be a user-defined method, such as the 3D cross-product, if `array1` and `array2` represent (arrays of) 3D vectors.
   * `array1.somefield` is a shortcut for `array1["somefield"]`.
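The three cases in the list above can be illustrated with a small pure-Python stand-in. This is a sketch of the naming idea only, not the Awkward API: `ToyVectorArray` and `cross_join` are hypothetical names, and records are plain dicts with `x`, `y`, `z` fields.

```python
import itertools


class ToyVectorArray:
    """Hypothetical stand-in for an array of 3D vectors (not Awkward)."""

    def __init__(self, records):
        self._records = list(records)  # each record: {"x": .., "y": .., "z": ..}

    def __getitem__(self, field):
        # array["x"] returns that field from every record.
        return [r[field] for r in self._records]

    def __getattr__(self, field):
        # Called only when normal attribute lookup fails, so real methods
        # such as .cross take precedence; array.x falls back to array["x"].
        if field.startswith("_"):
            raise AttributeError(field)
        try:
            return self[field]
        except KeyError:
            raise AttributeError(field)

    def cross(self, other):
        # Domain-specific method: element-wise 3D cross product.
        return ToyVectorArray(
            {"x": a["y"] * b["z"] - a["z"] * b["y"],
             "y": a["z"] * b["x"] - a["x"] * b["z"],
             "z": a["x"] * b["y"] - a["y"] * b["x"]}
            for a, b in zip(self._records, other._records))


def cross_join(left, right):
    # Free-standing array-manipulation function: all pairs of records.
    return list(itertools.product(left._records, right._records))


a = ToyVectorArray([{"x": 1, "y": 0, "z": 0}, {"x": 0, "y": 1, "z": 0}])
b = ToyVectorArray([{"x": 0, "y": 1, "z": 0}, {"x": 0, "y": 0, "z": 1}])

product = a.cross(b)       # method: physics meaning of "cross"
pairs = cross_join(a, b)   # function: structural meaning of "cross"
print(product.x, product.y, product.z, len(pairs))
```

Because the structural operation lives in a module-level function, the method name `cross` stays available for the physics meaning without any clash.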

## Behavioral mix-ins

The major use of Awkward arrays so far has been to represent arrays or jagged arrays of physics objects with physics methods on the array objects themselves. In Awkward 0.x, this was implemented with Python multiple inheritance, but now it's a "first class citizen," built into Awkward 1.0's type system.