# Numpy basics

As a reminder, objects in Python (including numbers!) are abstract and dynamic:

   * you don't know where they are in memory (pointer address); could be anywhere
   * you don't know how they're represented in terms of bytes
   * data types, member data, function arguments, etc. are checked at the last possible moment

And so they are slow.

In [None]:
import random
data = []
for i in range(1000000):
    data.append(random.gauss(0, 1))

In [None]:
%%timeit
data2 = []
for x in data:
    data2.append(x**2)

But Numpy isn't.

In [None]:
import numpy
data = numpy.random.normal(0, 1, 1000000)

In [None]:
%%timeit
data2 = data**2

**How does it work?**

A Numpy array is everything a list of Python objects is not:

   * the data are known to be contiguous in memory (sequential access is important!)
   * you can directly access and manipulate their bytes
   * the data type of an array is specified once for the whole array
   * *bonus:* most methods benefit from hardware vectorization
   * *bonus:* many methods release Python's interpreter lock while they work, so parallel threads can run at the same time
   * *bonus:* numbers use less memory than objects with pointers to their types
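
Several of these properties can be inspected directly. The cell below is a minimal sketch: the array `example` is a hypothetical illustration (not part of the dataset used later), while the attributes it prints are standard Numpy.

```python
import numpy

# hypothetical example array, for illustration only
example = numpy.arange(10, dtype=numpy.int64)

print(example.dtype)                    # one data type for the whole array
print(example.itemsize)                 # number of bytes per item
print(example.nbytes)                   # total number of bytes in the data buffer
print(example.flags["C_CONTIGUOUS"])    # True if the items are laid out sequentially
print(example.tobytes()[:16])           # a copy of the raw bytes of the first two items
```

A Python list has no equivalent of `dtype` or `itemsize` because each of its items is a separate object with its own type.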

Numpy encourages a different order of operations: instead of processing a table of data one event at a time, you process one operation at a time, each applied to all events. Numpy only helps if you work this way.

In [None]:
px = numpy.random.normal(0, 30, 100000)
py = numpy.random.normal(0, 30, 100000)
pz = numpy.random.normal(0, 300, 100000)

Instead of

In [None]:
%%timeit
p = numpy.empty(100000)
for i in range(len(p)):                                   # for each px[i], py[i], pz[i]
    p[i] = numpy.sqrt(px[i]**2 + py[i]**2 + pz[i]**2)     # compute p[i]

do

In [None]:
%%timeit
p = numpy.sqrt(px**2 + py**2 + pz**2)       # compute all px**2, then all py**2, then all pz**2, then sum all, then sqrt all

Normal math functions are *scalar* (e.g. binary operators like `+` or functions from `import math`). They perform one operation each time they appear in Python.

Numpy math functions are *vectorized.* Given equal-length arrays as input, they return an array of the same length as output, running the whole loop in compiled C and often using the CPU's vector (SIMD) instructions. (Some implementations perform the work in parallel or on a GPU, but not the default one.)

In [None]:
small_array = numpy.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
small_array**2

In [None]:
numpy.sqrt(small_array)

In [None]:
import math
math.sqrt(small_array)     # raises TypeError: functions from math accept only scalars

Numpy arrays are raw bytes; you can do whatever you want with them.

In [None]:
asbytes = small_array.view(numpy.uint8)
print(small_array)
print(asbytes)

In [None]:
asbytes[17] = 123
print(small_array)
print(asbytes)

They may have arbitrarily many dimensions. Changing the dimensions (in a way that keeps the total number of items constant) *does not change the underlying data.*

In [None]:
small_array.reshape(5, 2)
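
One way to check that a reshaped array shares its data with the original is to write through one and read through the other. This is a small sketch with a hypothetical array `flat`, kept separate from `small_array` so that the cells around it are unaffected.

```python
import numpy

# hypothetical array, for illustration only
flat = numpy.arange(10)
table = flat.reshape(5, 2)

print(numpy.shares_memory(flat, table))   # True: both arrays use the same buffer
table[0, 1] = 100                         # write through the reshaped array...
print(flat)                               # ...and the change appears in the flat one
```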

The columns can be named, making it easy to swap array-of-structs with struct-of-arrays. (A `recarray` is literally the same as 

In [None]:
recarray = small_array.view([("one", int), ("two", int)])
print(recarray)
print(recarray.dtype)

In [None]:
recarray["one"]

In [None]:
recarray["two"]

In [None]:
recarray[2]

Numpy arrays follow the same "slicing" rules as Python lists, though slicing becomes more important because it's much faster than iterating.

In [None]:
small_array[4:-2]

But they also have new rules, such as masking by an array of booleans:

In [None]:
mask = numpy.array([True, True, False, False, False, True, False, True, False, False])
small_array[mask]

In [None]:
small_array[mask] = numpy.array([1000, 1001, 1005, 1007])
small_array

And "fancy indexing": using an array of indexes to filter and potentially reorder an array:

In [None]:
indexes = numpy.array([7, 5, 1, 0])
small_array[indexes]

In [None]:
small_array[indexes] = 999
small_array

As in C/C++, you have to be very careful about what returns a view versus what returns a copy:

   * **view:** the new array is a pointer to the same data as the old array; it's faster (does not scale with the size of the array) and changes to the new array affect the old array. There are times when you want that; times when you don't.
   * **copy:** the new array is detached from the old; it's slower to create (sometimes insignificant), and changes to the new array have no effect on the old array.

The `base` attribute of a view is a reference to the array whose data the view shares. For an array that owns its data, such as a copy, `base` is `None`.

In [None]:
view = small_array[4:-2]
copy = small_array[4:-2].copy()

print(view.base)    # the array that the view shares data with
print(copy.base)    # None: the copy owns its own data
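
As a rule of thumb, reading with a basic slice gives a view, while reading with a mask or with fancy indexing gives a copy. (Assigning through `array[mask] = ...` or `array[indexes] = ...`, as in the cells above, still modifies the array in place.) The sketch below uses a hypothetical array `original` to show the difference.

```python
import numpy

# hypothetical array, for illustration only
original = numpy.arange(10)

sliced = original[2:5]                      # basic slicing: a view
fancy = original[numpy.array([2, 3, 4])]    # fancy indexing: a copy
masked = original[original < 3]             # masking: a copy

sliced[0] = -1    # changes original[2]
fancy[1] = -2     # leaves original alone
masked[0] = -3    # leaves original alone
print(original)

print(numpy.shares_memory(original, sliced))   # True
print(numpy.shares_memory(original, fancy))    # False
print(numpy.shares_memory(original, masked))   # False
```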

Let's apply vectorized functions and fancy indexing to a real physics problem. Suppose that you're given an array of `Jet_pt`, an array of `Jet_eta`, and arrays of the indexes at which each event starts and stops:

In [None]:
import uproot

In [None]:
tree = uproot.open("~/NanoAOD-DYJetsToLL.root")["Events"]
pt, eta = tree.arrays(["Jet_pt", "Jet_eta"], outputtype=tuple)
starts, stops = pt.starts, pt.stops
pt = numpy.array(pt)
eta = numpy.array(eta)

In [None]:
print(starts)   # the first event has no jets because starts[0] == stops[0]
print(stops)
print(pt)       # pt[0:5] are for jets in the second event
print(eta)      # eta[0:5] are for jets in the second event, etc.
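
To make the layout concrete, here is a toy version of the same structure. The numbers and the `toy_` names are made up for illustration; the assumption is only that the jets of event `i` occupy `content[starts[i]:stops[i]]`, as the comments above describe. The explicit loop is there to display the layout, not as a recommended way to process it.

```python
import numpy

# hypothetical toy data: 4 events with 0, 3, 1, and 2 jets
toy_pt = numpy.array([50.0, 40.0, 30.0, 25.0, 60.0, 20.0])
toy_starts = numpy.array([0, 0, 3, 4])
toy_stops = numpy.array([0, 3, 4, 6])

# show which slice of the flat array belongs to each event
for i in range(len(toy_starts)):
    print(i, toy_pt[toy_starts[i]:toy_stops[i]])
```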

**Question 1:** How do we find the events with at least one jet?

In [None]:
hasajet = ???                   # want array of booleans: tell me what to type!
hasajet

**Question 2:** How do we compute `pz = pt*sinh(eta)` for the first jet in each event?

In [None]:
indexes = ???                   # want array of integers: tell me what to type!
pz = ???                        # want array of floats: tell me what to type!
pz

Most scientific libraries for Python do the number-crunching in C/C++ and the interface in Python. (One tends to see problems separated into "slow control" and "fast math.")

Numpy is the common language that makes it possible to move data from one number-crunching library to another.

For instance, we can define new vectorized Numpy functions in ROOT.

In [None]:
import ROOT
ROOT.gInterpreter.Declare("""
void computemass(int n, float* pt1, float* pt2, float* eta1, float* eta2, float* phi1, float* phi2, float* out) {
    TLorentzVector one, two;
    for (int i = 0;  i < n;  i++) {
        one.SetPtEtaPhiM(pt1[i], eta1[i], phi1[i], 0.1056583745);    // muon mass
        two.SetPtEtaPhiM(pt2[i], eta2[i], phi2[i], 0.1056583745);
        out[i] = (one + two).M();
    }
}""")

In [None]:
tree = uproot.open("~/NanoAOD-DYJetsToLL.root")["Events"]
pt, eta, phi = tree.arrays(["Muon_pt", "Muon_eta", "Muon_phi"], outputtype=tuple)
starts, stops = pt.starts, pt.stops
pt = numpy.array(pt)
eta = numpy.array(eta)
phi = numpy.array(phi)

**Mini-exercise:** comment each of the lines below. What are they doing?

In [None]:
has2muons = stops - starts >= 2          # ???
firsts = starts[has2muons]               # ???
seconds = starts[has2muons] + 1          # ???
pt1, pt2 = pt[firsts], pt[seconds]       # ???
eta1, eta2 = eta[firsts], eta[seconds]   # ???
phi1, phi2 = phi[firsts], phi[seconds]   # ???

(If ROOT didn't know about Numpy arrays, we could have given it pointers: `pt1.ctypes.data`, `pt2.ctypes.data`, etc.)
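
As a sketch of what those pointers look like, the cell below takes the address of a small hypothetical array `values` and reads and writes its items through a typed pointer from Python's standard `ctypes` module. No C function is called here; the point is only that the pointer and the array refer to the same memory.

```python
import ctypes
import numpy

# hypothetical data, for illustration only
values = numpy.array([1.5, 2.5, 3.5], dtype=numpy.float32)

address = values.ctypes.data                                       # integer address of the first byte
pointer = values.ctypes.data_as(ctypes.POINTER(ctypes.c_float))    # typed float* for a C function

print(address)
print(pointer[0], pointer[1], pointer[2])    # read the items through the pointer
pointer[1] = 10.0                            # write through the pointer...
print(values)                                # ...and the array sees the change
```

The array must stay alive for as long as the pointer is in use, since the pointer does not keep the data from being garbage collected.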

In [None]:
masses = numpy.empty(len(pt1), dtype="float32")
ROOT.computemass(len(pt1), pt1, pt2, eta1, eta2, phi1, phi2, masses)

In [None]:
masses

In [None]:
from histbook import *
from vega import VegaLite

In [None]:
Hist(bin("mass", 200, 0, 200), fill=masses).step(width=800, height=400, yscale="log").to(VegaLite)

We could have computed masses with vectorized Numpy functions, but the point is to show that using Numpy doesn't mean not using ROOT. Anything that gets the job done!
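
For comparison, a Numpy-only version might look like the sketch below. The function `dimuon_mass` and the `toy_` arrays are hypothetical, not part of the original notebook; the calculation converts each muon to Cartesian momentum and energy, then takes the invariant mass of the sum, which is assumed to match what `TLorentzVector` does in the C++ function.

```python
import numpy

MUON_MASS = 0.1056583745   # GeV, same value as in the C++ function

def dimuon_mass(pt1, eta1, phi1, pt2, eta2, phi2, mass=MUON_MASS):
    # hypothetical helper: every line operates on whole arrays at once
    px1, py1, pz1 = pt1*numpy.cos(phi1), pt1*numpy.sin(phi1), pt1*numpy.sinh(eta1)
    px2, py2, pz2 = pt2*numpy.cos(phi2), pt2*numpy.sin(phi2), pt2*numpy.sinh(eta2)
    e1 = numpy.sqrt(px1**2 + py1**2 + pz1**2 + mass**2)
    e2 = numpy.sqrt(px2**2 + py2**2 + pz2**2 + mass**2)
    m2 = (e1 + e2)**2 - (px1 + px2)**2 - (py1 + py2)**2 - (pz1 + pz2)**2
    return numpy.sqrt(numpy.maximum(m2, 0.0))   # clip tiny negative values from rounding

# hypothetical toy inputs: the first pair is back-to-back in the transverse plane
toy_pt1 = numpy.array([45.0, 30.0])
toy_eta1 = numpy.array([0.0, 1.2])
toy_phi1 = numpy.array([0.0, 0.5])
toy_pt2 = numpy.array([45.0, 20.0])
toy_eta2 = numpy.array([0.0, -0.3])
toy_phi2 = numpy.array([numpy.pi, -2.0])

toy_masses = dimuon_mass(toy_pt1, toy_eta1, toy_phi1, toy_pt2, toy_eta2, toy_phi2)
print(toy_masses)
```

With the arrays from the notebook, the call would be `dimuon_mass(pt1, eta1, phi1, pt2, eta2, phi2)`. Those arrays are `float32`, so small differences from the ROOT result are to be expected.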