# NumPy

NumPy is Python's "workhorse" for handling local, multi-dimensional arrays efficiently.

Fundamental data structure: `ndarray`


In [None]:
import numpy as np

In [None]:
a = np.array([1, 2, 3, 4, 5])
type(a)

In [None]:
# element access works largely the same as with Python lists
print(a[1])
print(a[-2])
print(a[1:3])

### Datatype

Rule for `ndarray`: all elements must be of the same type.

This implies that, unlike with Python lists, the _datatype_ is a property of the array and not of the individual element.


In [None]:
a = np.array([1, 2, 3, 4])
print(a)

In [None]:
# inferred datatype
a.dtype

In [None]:
# choose datatype explicitly
a = np.array([1, 2, 3, 4], dtype="float32")
print(a)
print(a.dtype)
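
A consequence of the single-datatype rule is that NumPy converts values to fit the array's datatype. The following is a minimal sketch of that behaviour; the variable names `mixed`, `ints` and `as_float` are only illustrative.

```python
import numpy as np

# Mixing ints and floats: NumPy picks one common dtype for the whole array.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Assigning a float into an integer array drops the fractional part,
# because the dtype of the array does not change.
ints = np.array([1, 2, 3])
ints[0] = 7.9
print(ints[0])  # 7

# astype returns a converted copy and leaves the original array untouched.
as_float = ints.astype("float64")
print(as_float.dtype)  # float64
```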

### Simple operations on entire arrays

Python lists do _not_ support this: `[1, 2] * 2` repeats the list, and `[1, 2] + 1` raises a `TypeError`.


In [None]:
a * 2

In [None]:
a + 1

In [None]:
np.sqrt(a)

In [None]:
a**2
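
To make the contrast with Python lists concrete, here is a small side-by-side sketch. The names `values`, `arr` and `list_add_failed` are placeholders chosen for this example.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

# For a list, * means repetition, not arithmetic.
print(values * 2)  # [1, 2, 3, 1, 2, 3]

# For an array, * is applied to every element.
print(arr * 2)  # [2 4 6]

# Adding a number to a list is an error.
list_add_failed = False
try:
    values + 1
except TypeError:
    list_add_failed = True
print(list_add_failed)  # True

# Arrays of equal shape are combined element by element.
print(arr + arr)  # [2 4 6]
```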

### Support for an arbitrary number of dimensions

NumPy closely follows concepts from linear algebra: vectors, matrices, tensors.


In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)

In [None]:
a.shape

In [None]:
a2 = np.reshape(a, 6)
print(a2)

In [None]:
a2.shape

In [None]:
a3 = np.reshape(a2, (3, 2))
print(a3)
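
Reshaping only rearranges the existing elements, so the total element count has to stay the same. As a brief sketch of what that means in practice (the name `b` is illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.ndim)  # 2 (number of dimensions)
print(a.size)  # 6 (total number of elements)

# -1 asks NumPy to infer that dimension from the total size.
b = a.reshape(3, -1)
print(b.shape)  # (3, 2)

# A shape with a different element count is rejected.
try:
    a.reshape(4, 2)
except ValueError:
    print("cannot reshape 6 elements into shape (4, 2)")
```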

### Remark: NumPy is speedy

NumPy offers C-like performance for single-machine computations: operations on
whole arrays run in compiled loops over typed data and can use SIMD vector
instructions, see [NumPy SIMD
instructions](https://numpy.org/doc/stable/reference/simd/index.html#cpu-simd-optimizations).

Compare three ways to double all elements of a large array of 100,000 numbers:


In [None]:
n_nums = 100000

In [None]:
# Approach 1: explicit loop over a Python list
def native_double(arr):
    value = []
    for x in arr:
        value.append(2 * x)
    return value

%timeit native_double(list(range(n_nums)))


In [None]:
# Approach 2: list comprehension
def comprehension_double(arr):
    return [2 * x for x in arr]

%timeit comprehension_double(list(range(n_nums)))


In [None]:
# Approach 3: NumPy operation on the whole array
def numpy_double(arr):
    return 2 * arr

%timeit numpy_double(np.arange(n_nums))
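
The `%timeit` cells above need IPython, and each one also times the construction of its input (`list(range(...))` or `np.arange(...)`). As a rough sketch, the same comparison can be run in plain Python with the standard `timeit` module, building the inputs beforehand so that only the doubling is measured. The names `py_list`, `np_arr`, `t_list`, `t_numpy` and the repeat count of 20 are choices made for this example. Absolute timings depend on the machine.

```python
import timeit

import numpy as np

n_nums = 100_000
py_list = list(range(n_nums))  # built once, outside the timed code
np_arr = np.arange(n_nums)


def comprehension_double(arr):
    return [2 * x for x in arr]


def numpy_double(arr):
    return 2 * arr


# Both approaches must produce the same values.
assert comprehension_double(py_list) == numpy_double(np_arr).tolist()

repeats = 20
t_list = timeit.timeit(lambda: comprehension_double(py_list), number=repeats)
t_numpy = timeit.timeit(lambda: numpy_double(np_arr), number=repeats)

print(f"list comprehension: {t_list / repeats * 1e3:.3f} ms per call")
print(f"NumPy:              {t_numpy / repeats * 1e3:.3f} ms per call")
```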
