In [1]:
import numpy as np

# NumPy

NumPy's array class is called `ndarray`, also known by the alias `array`.

*`numpy.array` is not the same as the standard library class `array.array`*
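As a rough illustration of that difference, the sketch below builds both kinds of array from the same values and applies `+` to each. The variable names are made up for this example and are not used elsewhere in the notebook.

```python
import array

import numpy as np

# Standard library array: a typed, one-dimensional sequence
std_arr = array.array("i", [1, 2, 3])

# NumPy array built from the same values
np_arr = np.array([1, 2, 3])

# `+` concatenates standard library arrays ...
std_sum = std_arr + std_arr
# ... but adds NumPy arrays element by element
np_sum = np_arr + np_arr

print(std_sum.tolist())  # [1, 2, 3, 1, 2, 3]
print(np_sum.tolist())   # [2, 4, 6]
```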

In [2]:
a = np.arange(15).reshape(3, 5)
print(a)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]


In [3]:
a.shape

(3, 5)

In [5]:
a.ndim

2

In [7]:
a.dtype.name

'int32'

In [8]:
a.itemsize

4

In [9]:
a.size

15

In [10]:
type(a)

numpy.ndarray

In [11]:
b = np.array([6, 7, 8])
print(b)

[6 7 8]


## Array creation

We can create a NumPy array from regular Python sequences. The type of the resulting array is deduced from the type of the elements in the sequence.

In [13]:
a = np.array([2, 3, 4])
print(a)
print(a.dtype)

[2 3 4]
int32
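A small sketch of how this deduction plays out when the sequence mixes element types. The array names are illustrative only; the exact integer width (`int32` or `int64`) depends on the platform, so it is not asserted here.

```python
import numpy as np

ints = np.array([1, 2, 3])          # integer dtype (exact width is platform dependent)
mixed = np.array([1, 2.5, 3])       # a single float makes the whole array float64
cplx = np.array([1, 2.5, 3 + 0j])   # a single complex value makes it complex128

print(ints.dtype, mixed.dtype, cplx.dtype)
```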


`array` transforms sequences of sequences into 2D arrays, sequences of sequences of sequences into 3D arrays, and so on.

In [14]:
b = np.array([(1.8, 3, 8), (1, 2, 7)])
print(b)

[[1.8 3.  8. ]
 [1.  2.  7. ]]


The type of the array can be specified at creation time, using the keyword argument `dtype`

In [16]:
c = np.array([[1, 2], [3, 4]], dtype=complex)

Numpy offers several functions to create arrays with placeholder data. For example:

* `zeros` creates an array full of 0s
* `ones` creates an array full of 1s
* `empty` creates an array whose initial content is uninitialized (arbitrary) and depends on the state of memory

In [17]:
np.zeros((3, 4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [21]:
np.ones((2, 3, 6), dtype=np.int16)

array([[[1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1]]], dtype=int16)

In [23]:
np.empty((3, 5))

array([[1.57991622e-311, 3.40905296e-322, 0.00000000e+000,
        0.00000000e+000, 0.00000000e+000],
       [1.16095484e-028, 3.94648786e+180, 8.95402175e-096,
        1.12958007e+277, 7.28193369e+223],
       [2.59027849e-144, 4.82412328e+228, 1.04718130e-142,
        2.65657260e-312, 2.12203497e-312]])

To create sequences of numbers, NumPy provides the `arange` function, which is analogous to the built-in `range` but returns an array

In [24]:
np.arange(10, 30, 5)

array([10, 15, 20, 25])

In [26]:
np.arange(0, 3, 0.3)

array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7])

When `arange` is used with floating-point arguments, it is hard to predict the number of elements in the result because of finite floating-point precision. It is therefore usually better to use the `linspace` function, which takes the desired number of elements as an argument instead of the step.

In [28]:
np.linspace(0, 2, 9)

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
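To make the size problem concrete, here is a minimal sketch comparing the two functions on the same interval. The interval and step are chosen only for illustration; the length of the `arange` result is printed rather than stated, since it depends on floating-point rounding.

```python
import numpy as np

# With a float step, the element count depends on how (stop - start) / step rounds,
# so the result may or may not contain a value close to `stop`.
by_step = np.arange(1, 1.3, 0.1)
print(len(by_step), by_step)

# linspace takes the number of points directly; the endpoint is included by default.
by_count = np.linspace(1, 1.3, 4)
print(len(by_count), by_count)
```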

In [32]:
np.linspace(0, 2 * np.pi, 100)

array([0.        , 0.06346652, 0.12693304, 0.19039955, 0.25386607,
       0.31733259, 0.38079911, 0.44426563, 0.50773215, 0.57119866,
       0.63466518, 0.6981317 , 0.76159822, 0.82506474, 0.88853126,
       0.95199777, 1.01546429, 1.07893081, 1.14239733, 1.20586385,
       1.26933037, 1.33279688, 1.3962634 , 1.45972992, 1.52319644,
       1.58666296, 1.65012947, 1.71359599, 1.77706251, 1.84052903,
       1.90399555, 1.96746207, 2.03092858, 2.0943951 , 2.15786162,
       2.22132814, 2.28479466, 2.34826118, 2.41172769, 2.47519421,
       2.53866073, 2.60212725, 2.66559377, 2.72906028, 2.7925268 ,
       2.85599332, 2.91945984, 2.98292636, 3.04639288, 3.10985939,
       3.17332591, 3.23679243, 3.30025895, 3.36372547, 3.42719199,
       3.4906585 , 3.55412502, 3.61759154, 3.68105806, 3.74452458,
       3.8079911 , 3.87145761, 3.93492413, 3.99839065, 4.06185717,
       4.12532369, 4.1887902 , 4.25225672, 4.31572324, 4.37918976,
       4.44265628, 4.5061228 , 4.56958931, 4.63305583, 4.69652

## Basic operations

Arithmetic operators on arrays apply **elementwise** and a new array is created as the result.

In [33]:
a = np.array([20, 30, 40, 50])
b = np.arange(4)
print(a)
print(b)

[20 30 40 50]
[0 1 2 3]


In [35]:
c = a - b
print(c)

[20 29 38 47]


In [36]:
c = b**2
print(c)

[0 1 4 9]


In [37]:
c = 10 * np.sin(a)
print(c)

[ 9.12945251 -9.88031624  7.4511316  -2.62374854]


In [39]:
c = a < 35
print(c)

[ True  True False False]


The product operator `*` operates elementwise on NumPy arrays. To compute the **matrix product**, use the `@` operator or the `dot` method

In [40]:
A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])
print(A)
print(B)

[[1 1]
 [0 1]]
[[2 0]
 [3 4]]


In [41]:
c = A * B
print(c)

[[2 0]
 [0 4]]


In [42]:
c = A @ B
print(c)

[[5 4]
 [3 4]]


In [43]:
c = A.dot(B)
print(c)

[[5 4]
 [3 4]]


To perform operations in place, we can use augmented assignment operators such as `+=` and `*=`

In [44]:
A += B
print(A)

[[3 1]
 [3 5]]


In [45]:
A *= 2
print(A)

[[ 6  2]
 [ 6 10]]


When operating on arrays of different types, the type of the resulting array corresponds to the more general (more precise) one. This is known as **upcasting**

In [49]:
a = np.ones(3, dtype=np.int32)
b = np.linspace(0, np.pi, 3)
print(f"a dtype: {a.dtype.name}")
print(f"b dtype: {b.dtype.name}")

a dtype: int32
b dtype: float64


In [51]:
c = a + b
print(f"c data: {c}")
print(f"c dtype: {c.dtype.name}")

c data: [1.         2.57079633 4.14159265]
c dtype: float64
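Upcasting interacts with the in-place operators shown earlier: an in-place operation has to store its result in the existing array, so it cannot change that array's type. The sketch below, with illustrative variable names, shows what happens in each direction.

```python
import numpy as np

ints = np.ones(3, dtype=np.int32)
floats = np.linspace(0, np.pi, 3)

# Out of place: a new array is created and upcast to float64
total = ints + floats

# In place: the float64 result cannot be stored back into an int32 array
try:
    ints += floats
    in_place_failed = False
except TypeError as exc:  # NumPy raises a TypeError subclass here
    in_place_failed = True
    print(f"in-place add failed: {exc}")

# The other direction works, because int32 values fit into float64
floats += ints
print(total.dtype, ints.dtype, floats.dtype)
```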


In [52]:
d = np.exp(c * 1j)
print(f"d data: {d}")
print(f"d dtype: {d.dtype.name}")

d data: [ 0.54030231+0.84147098j -0.84147098+0.54030231j -0.54030231-0.84147098j]
d dtype: complex128


Many unary operations, such as computing the sum of all elements, are implemented as methods of the `ndarray` class

In [53]:
rg = np.random.default_rng(1)  # create instance of default random number generator
a = rg.random((2, 3))
print(f"a data: {a}")

a data: [[0.51182162 0.9504637  0.14415961]
 [0.94864945 0.31183145 0.42332645]]


In [54]:
a.sum()

3.290252281866131

In [55]:
a.min()

0.14415961271963373

In [56]:
a.max()

0.9504636963259353

By default, these operations apply to the array as if it were a flat list of numbers, regardless of its shape. By specifying the `axis` parameter, you can apply an operation along the specified axis of the array.

In [57]:
b = np.arange(12).reshape(3, 4)
print(f"b data: {b}")

b data: [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [58]:
b.sum(axis=0) # sum of each column

array([12, 15, 18, 21])

In [59]:
b.min(axis=1) # min of each row

array([0, 4, 8])

In [60]:
b.cumsum(axis=1) # cumulative sum along each row

array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]])
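One way to read the `axis` parameter is that it names the axis that gets collapsed by the reduction. The sketch below checks the resulting shapes on the same 3x4 array; `keepdims` is an additional NumPy option not used elsewhere in this notebook.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

col_sums = b.sum(axis=0)             # collapses axis 0 (rows): one value per column
row_sums = b.sum(axis=1)             # collapses axis 1 (columns): one value per row
kept = b.sum(axis=1, keepdims=True)  # keeps the reduced axis with length 1

print(col_sums.shape, row_sums.shape, kept.shape)  # (4,) (3,) (3, 1)
```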

## Universal functions

NumPy provides mathematical functions such as `sin`, `cos`, `exp`, etc. In NumPy these are called *universal functions* (`ufunc`). They operate **elementwise** on an array, producing a new array as output.

In [61]:
b = np.arange(3)
print(f"b data: {b}")

b data: [0 1 2]


In [62]:
np.exp(b)

array([1.        , 2.71828183, 7.3890561 ])

In [63]:
np.sqrt(b)

array([0.        , 1.        , 1.41421356])

In [64]:
np.cos(b)

array([ 1.        ,  0.54030231, -0.41614684])
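As a brief sketch of what "universal function" means in practice: these functions are instances of `np.ufunc`, the arithmetic operators are backed by them, and they accept an optional `out` argument. The variable names below are illustrative.

```python
import numpy as np

b = np.arange(3)

print(type(np.exp))  # <class 'numpy.ufunc'>

# The arithmetic operators map onto ufuncs: `b + b` matches np.add(b, b)
same = np.array_equal(b + b, np.add(b, b))

# The optional `out` argument writes into an existing array instead of allocating a new one
result = np.empty(3)
np.sqrt(b, out=result)
print(same, result)
```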

## Indexing, slicing and iterating

**1D arrays** can be indexed, sliced and iterated over like lists and other Python sequences

In [65]:
a = np.arange(10) ** 3
print(f"a: {a}")

a: [  0   1   8  27  64 125 216 343 512 729]


In [66]:
a[2]

8

In [67]:
a[2:5]

array([ 8, 27, 64], dtype=int32)

In [69]:
a[:6:2] = 1000
print(f"a: {a}")

a: [1000    1 1000   27 1000  125  216  343  512  729]


In [70]:
a[::-1]

array([ 729,  512,  343,  216,  125, 1000,   27, 1000,    1, 1000],
      dtype=int32)

In [71]:
for i in a:
    print(i**(1 / 3.))

9.999999999999998
1.0
9.999999999999998
3.0
9.999999999999998
5.0
5.999999999999999
6.999999999999999
7.999999999999999
8.999999999999998


**Multidimensional arrays** can have one index per axis. The indices are given as a comma-separated `tuple`

In [75]:
b = np.fromfunction(lambda x, y: 10 * x + y, (5, 4), dtype=int)
print(f"b: {b}")

b: [[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]


In [76]:
b[2, 3]

23

In [77]:
b[:, 1]

array([ 1, 11, 21, 31, 41])

In [78]:
b[1:3, :]

array([[10, 11, 12, 13],
       [20, 21, 22, 23]])

When fewer indices are provided than the number of axes, the missing indices are considered complete slices `:`, e.g. `b[2]` is equivalent to `b[2, :]`

In [79]:
b[-1]

array([40, 41, 42, 43])

In [80]:
b[-1, :]

array([40, 41, 42, 43])

NumPy also allows the use of `...` (ellipsis), which stands for as many `:` as are needed to cover all remaining axes

In [81]:
c = np.fromfunction(lambda x, y, z: 10 * x + y - z, (5, 4, 3), dtype=int)
print(f"c: {c}")

c: [[[ 0 -1 -2]
  [ 1  0 -1]
  [ 2  1  0]
  [ 3  2  1]]

 [[10  9  8]
  [11 10  9]
  [12 11 10]
  [13 12 11]]

 [[20 19 18]
  [21 20 19]
  [22 21 20]
  [23 22 21]]

 [[30 29 28]
  [31 30 29]
  [32 31 30]
  [33 32 31]]

 [[40 39 38]
  [41 40 39]
  [42 41 40]
  [43 42 41]]]


In [84]:
c[1, ...]

array([[10,  9,  8],
       [11, 10,  9],
       [12, 11, 10],
       [13, 12, 11]])

In [85]:
c[1, :, :]

array([[10,  9,  8],
       [11, 10,  9],
       [12, 11, 10],
       [13, 12, 11]])

`...` can also be used at the beginning or in the middle of an index

In [86]:
c[1, ..., 2]

array([ 8,  9, 10, 11])

In [87]:
c[1, :, 2]

array([ 8,  9, 10, 11])
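The cells above show `...` in the middle of an index. For completeness, here is a short sketch of the leading form on the same 3D array, checked against the equivalent explicit slices.

```python
import numpy as np

c = np.fromfunction(lambda x, y, z: 10 * x + y - z, (5, 4, 3), dtype=int)

# Leading ellipsis: all axes except the last one are taken in full
last_plane = c[..., 2]

# Same selection written with explicit slices
explicit = c[:, :, 2]

print(last_plane.shape)  # (5, 4)
print(np.array_equal(last_plane, explicit))  # True
```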

Iterating over multidimensional arrays is done with respect to the first axis

In [88]:
for row in b:
    print(row)

[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]


In [89]:
for m in c:
    print(m)

[[ 0 -1 -2]
 [ 1  0 -1]
 [ 2  1  0]
 [ 3  2  1]]
[[10  9  8]
 [11 10  9]
 [12 11 10]
 [13 12 11]]
[[20 19 18]
 [21 20 19]
 [22 21 20]
 [23 22 21]]
[[30 29 28]
 [31 30 29]
 [32 31 30]
 [33 32 31]]
[[40 39 38]
 [41 40 39]
 [42 41 40]
 [43 42 41]]


If we want to perform an operation on each element of the array, we can use the `flat` attribute, which is an iterator over all the elements of the array

In [90]:
for elem in b.flat:
    print(elem)

0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43
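Besides plain iteration, `flat` also supports indexing and assignment by flat position. The following is a small sketch on a throwaway array (named `grid` here only for the example).

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

# Iterate over every element regardless of shape
squares = [int(x) ** 2 for x in grid.flat]

# Index by flat (row-major) position: position 4 is row 1, column 1
fifth = grid.flat[4]

# Assigning through `flat` modifies the original array
grid.flat[0] = 100

print(squares, fifth)
print(grid)
```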


## Shape manipulation

### Changing the shape of an array

In [91]:
a = np.floor(10 * rg.random((3, 4)))
print(f"a: {a}")

a: [[8. 4. 5. 0.]
 [7. 5. 3. 7.]
 [3. 4. 1. 4.]]


In [92]:
a.shape 

(3, 4)

In [93]:
a.ravel() # returns the flattened array

array([8., 4., 5., 0., 7., 5., 3., 7., 3., 4., 1., 4.])

In [97]:
a.reshape(6, 2) # returns the array with modified shape

array([[8., 4.],
       [5., 0.],
       [7., 5.],
       [3., 7.],
       [3., 4.],
       [1., 4.]])

In [98]:
a.T # returns transposed array

array([[8., 7., 3.],
       [4., 5., 4.],
       [5., 3., 1.],
       [0., 7., 4.]])

In [96]:
a.T.shape

(4, 3)

If a dimension is given as `-1` in a reshaping operation, its size is calculated automatically from the array size and the other dimensions

In [99]:
a.reshape(3, -1)

array([[8., 4., 5., 0.],
       [7., 5., 3., 7.],
       [3., 4., 1., 4.]])
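A short sketch of how the `-1` placeholder is resolved, using a 12-element array as an illustrative input. It also shows that the inferred size must be a whole number, otherwise NumPy raises a `ValueError`.

```python
import numpy as np

values = np.arange(12)

two_rows = values.reshape(2, -1)    # 12 / 2 -> 6 columns
three_cols = values.reshape(-1, 3)  # 12 / 3 -> 4 rows

print(two_rows.shape, three_cols.shape)  # (2, 6) (4, 3)

# 12 is not divisible by 5, so no valid size exists for the -1 axis
try:
    values.reshape(5, -1)
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print(f"reshape failed: {exc}")
```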

### Stacking together different arrays

Several arrays can be stacked together along different axes

In [104]:
a = np.floor(10 * rg.random((2, 2)))
b = np.floor(10 * rg.random((2, 2)))
print(f"a: {a}")
print(f"b: {b}")

a: [[6. 9.]
 [0. 5.]]
b: [[4. 0.]
 [6. 8.]]


In [105]:
np.vstack((a, b))

array([[6., 9.],
       [0., 5.],
       [4., 0.],
       [6., 8.]])

In [106]:
np.hstack((a, b))

array([[6., 9., 4., 0.],
       [0., 5., 6., 8.]])

The function `column_stack` stacks 1D arrays as columns into a 2D array
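A minimal sketch of `column_stack` on two 1D arrays, compared with `hstack` on the same inputs. The arrays `x` and `y` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Each 1D input becomes one column of the 2D result
columns = np.column_stack((x, y))
print(columns)
# [[1 4]
#  [2 5]
#  [3 6]]

# hstack on 1D inputs simply concatenates them into a longer 1D array
flat = np.hstack((x, y))
print(flat)  # [1 2 3 4 5 6]
```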

## Splitting arrays

* `hsplit` splits an array along the horizontal axis
* `vsplit` splits it along the vertical axis

Arrays can be split either by specifying the number of equally shaped arrays to return, or by specifying the columns/rows after which the division should occur

In [107]:
a = np.floor(10 * rg.random((2, 12)))
print(a)

[[5. 2. 8. 5. 5. 7. 1. 8. 6. 7. 1. 8.]
 [1. 0. 8. 8. 8. 4. 2. 0. 6. 7. 8. 2.]]


In [108]:
np.hsplit(a, 3)

[array([[5., 2., 8., 5.],
        [1., 0., 8., 8.]]),
 array([[5., 7., 1., 8.],
        [8., 4., 2., 0.]]),
 array([[6., 7., 1., 8.],
        [6., 7., 8., 2.]])]

In [109]:
np.hsplit(a, (3, 4))

[array([[5., 2., 8.],
        [1., 0., 8.]]),
 array([[5.],
        [8.]]),
 array([[5., 7., 1., 8., 6., 7., 1., 8.],
        [8., 4., 2., 0., 6., 7., 8., 2.]])]
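The cells above only exercise `hsplit`. As a sketch, `vsplit` works the same way along the vertical axis; the 4x3 input array below is made up for the example.

```python
import numpy as np

rows = np.arange(12).reshape(4, 3)

# Split into 2 equally shaped blocks of rows
top, bottom = np.vsplit(rows, 2)
print(top)
# [[0 1 2]
#  [3 4 5]]

# Split after row 1 and after row 3
parts = np.vsplit(rows, (1, 3))
print([p.shape for p in parts])  # [(1, 3), (2, 3), (1, 3)]
```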

## Copies and views

### No copy at all

Simple assignment makes no copy of objects or their data. After a simple assignment, both names point to the same underlying `ndarray`

In [111]:
a = np.array([[ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11]])
b = a

In [112]:
b is a

True

Python passes mutable objects as **references**, meaning that function calls make no copy of an `ndarray`

In [114]:
def func(x):
    print(id(x))
          
print(id(a))
func(a)

3197821921360
3197821921360
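A practical consequence is that a function can modify the caller's array. The sketch below uses a hypothetical helper, `zero_first_row`, written only for this illustration.

```python
import numpy as np


def zero_first_row(x):
    """Overwrite the first row of x in place (hypothetical helper)."""
    x[0] = 0


data = np.arange(6).reshape(2, 3)
zero_first_row(data)

# The caller's array was changed, because no copy was made for the call
print(data)
# [[0 0 0]
#  [3 4 5]]
```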


### View or shallow copy

Different array objects can share the same data. The `view` method creates a new array object that looks at the same data as the original array

In [115]:
c = a.view()

In [116]:
c is a

False

In [117]:
c.flags.owndata

False

In [125]:
c = c.reshape((2, 6)) # a's shape doesn't change

In [126]:
a.shape

(3, 4)

In [127]:
c[0, 4] = 1234 # a's data changes

In [128]:
a

array([[   0,    1,    2,    3],
       [1234,    5,    6,    7],
       [   8,    9,   10,   11]])

Slicing an array returns a view of it

In [129]:
s = a[:, 1:3]

In [130]:
s[:] = 10

In [131]:
a

array([[   0,   10,   10,    3],
       [1234,   10,   10,    7],
       [   8,   10,   10,   11]])

In [132]:
s = 12

In [133]:
print(s)
print(a)

12
[[   0   10   10    3]
 [1234   10   10    7]
 [   8   10   10   11]]
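The last two cells contrast writing *through* a view (`s[:] = 10`) with rebinding the name `s` to a new object (`s = 12`). The sketch below repeats that on a fresh array and uses the `base` attribute and `np.shares_memory` to check whether a slice is a view.

```python
import numpy as np

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])

s = a[:, 1:3]
print(s.base is a)             # True: s is a view onto a's data
print(np.shares_memory(a, s))  # True

s[:] = 10                      # writes through the view, so a changes
changed = a[:, 1:3].tolist()

s = 12                         # only rebinds the name s; a is untouched
print(a)
```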


### Deep copy

The `copy` method makes a complete copy of the array and its data

In [135]:
d = a.copy()
d is a

False

In [136]:
d.base is a

False

In [137]:
d[0, 0] = 2137

In [138]:
a

array([[   0,   10,   10,    3],
       [1234,   10,   10,    7],
       [   8,   10,   10,   11]])
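As a sketch of how to confirm that a copy is fully independent, the example below checks memory sharing for a copy of a whole array and for a copy of a slice. The array names are illustrative.

```python
import numpy as np

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])

d = a.copy()
print(np.shares_memory(a, d))  # False: d has its own data
print(d.base is None)          # True: d is not a view of anything

# Copying a slice detaches it from the original array as well
corner = a[:2, :2].copy()
corner[0, 0] = -1
print(a[0, 0])                 # 0: the original is unchanged
```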