# Numerical Python: NumPy

In [1]:
# This cell applies the notebook's CSS style
from IPython.display import HTML
css_file = 'style_1.css'
with open(css_file, "r") as f:
    css = f.read()
HTML(css)

For work with purely numerical data, such as data analysis or machine learning, Python on its own lacks some core objects and methods that other languages, such as `MATLAB` and `R`, provide out of the box. Luckily, there are extensive libraries that fill this gap. One of the most basic is `numpy`.

In [1]:
import numpy as np

## 1. Motivation

If we want to do a matrix multiplication, let's say:

$$ \begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix} \cdot \begin{pmatrix} 4 & 3 \\ 1 & 3 \end{pmatrix} $$

The direct and naive approach would be:

In [3]:
A = [[2,1],[4,2]]
B = [[4,3],[1,3]]
A*B

TypeError: can't multiply sequence by non-int of type 'list'

Well, that did not go well. We can try the `@` operator:

In [4]:
A@B

TypeError: unsupported operand type(s) for @: 'list' and 'list'

Nope. Let's try something even simpler: vector addition. We have these two vectors:

$$ a = [1,2]\qquad b = [4,9] \\ a+ b =\, ?$$

In [5]:
a = [1,2]; b = [4,9]
a + b

[1, 2, 4, 9]

Well, that ran without an error, but it did not do what we wanted: `+` concatenated the two lists instead of adding them element by element. The reason is that we are using a `list`. Lists are lists, not vectors, and not matrices. We have to use `numpy` arrays (created with `np.array`) instead.

In [7]:
a = np.array([1,2]); b = np.array([4,9])
print(a + b)
A = np.array([[2,1],
              [4,2]])
B = np.array([[4,3],
              [1,3]])
print(A * B)

[ 5 11]
[[8 3]
 [4 6]]


So, the addition works as expected. The multiplication with `*`, however, is element by element. For the matrix product there are two options: the `@` operator and `np.dot`.

In [10]:
print(A@B)
print(np.dot(A,B))

[[ 9  9]
 [18 18]]
[[ 9  9]
 [18 18]]
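
To make the difference between the two products concrete, the sketch below rebuilds the matrix product from its definition and compares it with `@`. The names `elementwise` and `manual`, and the use of `np.matmul`, are additions for illustration and not part of the original cells.

```python
import numpy as np

A = np.array([[2, 1],
              [4, 2]])
B = np.array([[4, 3],
              [1, 3]])

# Element-wise product: entries are multiplied position by position.
elementwise = A * B

# Matrix product from its definition: C[i, j] = sum over k of A[i, k] * B[k, j].
rows, inner, cols = A.shape[0], A.shape[1], B.shape[1]
manual = np.array([[sum(A[i, k] * B[k, j] for k in range(inner))
                    for j in range(cols)]
                   for i in range(rows)])

# Both checks should print True if @ and np.matmul implement the same product.
print(np.array_equal(A @ B, manual))
print(np.array_equal(np.matmul(A, B), manual))
```

For 2-D arrays such as these, `A @ B`, `np.matmul(A, B)` and `np.dot(A, B)` give the same result.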


Now, these objects have a very important type:

In [11]:
type(A)

numpy.ndarray

## 2. Array attributes

These N-dimensional arrays have a series of useful attributes. Let's create a vector, a matrix and a tensor.

In [23]:
vector = np.array([1,2,3])
matrix = np.array([[1.0,4],[2,3]])
tensor = np.array([  [[2+3j,3],
                      [3,4]],
                     [[4,5],
                      [2,1]]
                  ])
print('vector :\n', vector)
print('\nmatrix :\n', matrix)
print('\ntensor :\n', tensor)
dic_data = {'vector': vector, 'matrix': matrix, 'tensor': tensor}

vector :
 [1 2 3]

matrix :
 [[1. 4.]
 [2. 3.]]

tensor :
 [[[2.+3.j 3.+0.j]
  [3.+0.j 4.+0.j]]

 [[4.+0.j 5.+0.j]
  [2.+0.j 1.+0.j]]]


### 2.1 shape

This is the counterpart of `size` in `MATLAB`. It returns a tuple with the length of the array along each dimension:

In [24]:
for i in dic_data:
    print('Shape of',i, '=', dic_data[i].shape)

Shape of vector = (3,)
Shape of matrix = (2, 2)
Shape of tensor = (2, 2, 2)


### 2.2 ndim

This attribute returns the number of array dimensions:

In [25]:
for i in dic_data:
    print('Dimensions of', i, '=', dic_data[i].ndim)

Dimensions of vector = 1
Dimensions of matrix = 2
Dimensions of tensor = 3
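
The two attributes are closely related, as the short sketch below tries to show. It also uses `size`, an attribute not covered in this section, which holds the total number of elements. The array here is a made-up example.

```python
import numpy as np

example = np.zeros((2, 3, 4))  # hypothetical array, used only for illustration

print(example.shape)  # length of the array along each axis
print(example.ndim)   # number of axes
print(example.size)   # total number of elements

# ndim is the length of the shape tuple; size is the product of its entries.
shape_length = len(example.shape)
shape_product = int(np.prod(example.shape))
print(example.ndim == shape_length, example.size == shape_product)
```

Note that NumPy's `size` counts elements, so it is not the same thing as MATLAB's `size`.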


### 2.3 flags

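The `flags` attribute reports information about the memory layout of the array, such as whether the data is stored contiguously in row-major (`C_CONTIGUOUS`) or column-major (`F_CONTIGUOUS`) order, whether the array owns its memory (`OWNDATA`), and whether it can be written to (`WRITEABLE`):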

In [26]:
for i in dic_data:
    print('\nData in {0} has the following flags\n'.format(i), dic_data[i].flags)


Data in vector has the following flags
   C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

Data in matrix has the following flags
   C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

Data in tensor has the following flags
   C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False
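
As a rough illustration of what some of these flags mean in practice, the sketch below looks at a matrix and its transpose, and then marks the matrix as read-only. The variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1.0, 4.0],
                   [2.0, 3.0]])
transposed = matrix.T  # a view on the same memory, not a copy

# The original is stored row by row; its transpose is column-major instead.
print(matrix.flags['C_CONTIGUOUS'], matrix.flags['F_CONTIGUOUS'])
print(transposed.flags['C_CONTIGUOUS'], transposed.flags['F_CONTIGUOUS'])

# The transpose borrows its data from `matrix`, so it does not own it.
print(transposed.flags['OWNDATA'])

# Clearing WRITEABLE makes the array read-only; assignments then fail.
matrix.flags.writeable = False
try:
    matrix[0, 0] = 99.0
except ValueError as error:
    print('Assignment rejected:', error)
```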


### 2.4 dtype

This describes the data type of the elements in the array. Arrays are homogeneous, meaning that all the elements of an array share the same data type.

In [27]:
for i in dic_data:
    print('Type of data in {0}:'.format(i), dic_data[i].dtype)

Type of data in vector: int32
Type of data in matrix: float64
Type of data in tensor: complex128
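
Because arrays are homogeneous, mixed input is converted to a common type. The sketch below shows this, along with two ways of choosing the type explicitly (`dtype=` and `astype`). The variable names are illustrative. The default integer type depends on the platform and the NumPy version, so it may appear as `int32` or `int64`.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])          # int and float input
with_complex = np.array([1, 2.5, 3j])  # int, float and complex input

print(ints.dtype)          # a platform-dependent integer type
print(mixed.dtype)         # upcast to a floating-point type
print(with_complex.dtype)  # upcast to a complex type

# The type can also be chosen explicitly, at creation or afterwards.
small = np.array([1, 2, 3], dtype=np.float32)
converted = ints.astype(np.float64)  # returns a new array
print(small.dtype, converted.dtype)
```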


### 2.5 itemsize and nbytes

`itemsize` returns the size in bytes of each element of the array, while `nbytes` returns the total number of bytes used by all the elements of the array:

In [28]:
for i in dic_data:
    print('\nBytes of each element in {0}:'.format(i), dic_data[i].itemsize)
    print('Bytes of the array in {0}:'.format(i), dic_data[i].nbytes)


Bytes of each element in vector: 4
Bytes of the array in vector: 12

Bytes of each element in matrix: 8
Bytes of the array in matrix: 32

Bytes of each element in tensor: 16
Bytes of the array in tensor: 128
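
The two values are linked by the number of elements: `nbytes` should equal `size * itemsize`. A minimal check, using explicitly chosen dtypes so the result does not depend on the platform (the `sizes` dictionary is just a helper for this example):

```python
import numpy as np

sizes = {}
for dtype in (np.int32, np.float64, np.complex128):
    arr = np.zeros((2, 3), dtype=dtype)  # 6 elements of the given type
    # Total bytes = number of elements * bytes per element.
    assert arr.nbytes == arr.size * arr.itemsize
    sizes[arr.dtype.name] = (arr.itemsize, arr.nbytes)

for name, (itemsize, nbytes) in sizes.items():
    print(name, itemsize, nbytes)
```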


### 2.6 transposed

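The `T` attribute returns the transposed array. For a 1-D array it has no effect, and for an array with more than two dimensions it reverses the order of the axes: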

In [29]:
for i in dic_data:
    print('Transposed of {0}\n'.format(i), dic_data[i].T)

Transposed of vector
 [1 2 3]
Transposed of matrix
 [[1. 2.]
 [4. 3.]]
Transposed of tensor
 [[[2.+3.j 4.+0.j]
  [3.+0.j 2.+0.j]]

 [[3.+0.j 5.+0.j]
  [4.+0.j 1.+0.j]]]
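
The effect of `T` is easiest to see on the shapes. The sketch below uses arrays with distinct axis lengths (made up for this example) and also shows one possible way to obtain a column vector from a 1-D array, since `T` alone does not do that.

```python
import numpy as np

vector = np.array([1, 2, 3])
tensor = np.zeros((2, 3, 4))

print(vector.T.shape)  # unchanged for a 1-D array
print(tensor.T.shape)  # the order of the axes is reversed

# One way to get a column vector: reshape to (n, 1).
column = vector.reshape(-1, 1)
print(column.shape)

# np.transpose accepts an explicit axis order when reversing is not wanted.
swapped = np.transpose(tensor, axes=(1, 0, 2))
print(swapped.shape)
```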


## 3. Creating numpy arrays

We have already seen one way of creating a numpy array from lists. Now, there are other functions that allow us to create predefined arrays.

### 3.1 numpy.empty

It has the following signature:

```python
numpy.empty(shape, dtype=float, order='C')
```

Here `order` is `'C'` for row-major (C-style) storage and `'F'` for column-major (Fortran-style) storage.

In [56]:
empty_array = np.empty((2,3))
print(empty_array)

[[3.06e-322 0.00e+000 0.00e+000]
 [0.00e+000 0.00e+000 0.00e+000]]


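It does not initialize the entries, so they hold whatever values happened to be in that memory: sometimes zeros, sometimes arbitrary numbers. Every entry should therefore be assigned before it is read.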
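
A typical use of `np.empty` is to allocate an array that is then filled completely. A minimal sketch, with an illustrative array of squares:

```python
import numpy as np

n = 5
squares = np.empty(n)  # contents are arbitrary at this point

# Assign every entry before reading any of them.
for i in range(n):
    squares[i] = i ** 2

print(squares)
```

If any entry could be left unassigned, `np.zeros` is the safer choice.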

### 3.2 numpy.zeros

Returns an array of the specified shape filled with zeros:

In [58]:
zero_array = np.zeros((2,3))
print(zero_array)

[[0. 0. 0.]
 [0. 0. 0.]]


### 3.3 numpy.ones

Returns an array of the specified shape filled with ones. Multiplying it by a constant gives an array filled with that constant:

In [59]:
one_array = np.ones((2,3))
thirty_array = 30*np.ones((2,3))
print(one_array)
print(thirty_array)

[[1. 1. 1.]
 [1. 1. 1.]]
[[30. 30. 30.]
 [30. 30. 30.]]
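
As an alternative to scaling an array of ones, NumPy also provides `np.full`, which is not covered in this section. The sketch below compares the two and shows `np.zeros_like`, which copies the shape and dtype of an existing array. The variable names are illustrative.

```python
import numpy as np

thirty_scaled = 30 * np.ones((2, 3))
thirty_full = np.full((2, 3), 30.0)  # fill value given directly

# Both approaches should produce the same array.
print(np.array_equal(thirty_scaled, thirty_full))

# zeros_like takes shape and dtype from an existing array.
template = np.array([[1, 2], [3, 4]])
zeros_like_template = np.zeros_like(template)
print(zeros_like_template.shape, zeros_like_template.dtype == template.dtype)
```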


### 3.4 numpy.arange

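Returns evenly spaced values within a given range. The start value is included and the stop value is excluded; the optional third argument is the step: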

In [60]:
range_01 = np.arange(4)
range_02 = np.arange(-2,4)
range_03 = np.arange(-2,4,0.5)
print(range_01)
print(range_02)
print(range_03)

[0 1 2 3]
[-2 -1  0  1  2  3]
[-2.  -1.5 -1.  -0.5  0.   0.5  1.   1.5  2.   2.5  3.   3.5]
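
The sketch below checks two properties of `np.arange` on the last example: the stop value is not part of the result, and the number of values is `ceil((stop - start) / step)`. With non-integer steps, floating-point rounding can make that length hard to predict, so `np.linspace` is often the safer choice in that case.

```python
import math
import numpy as np

start, stop, step = -2, 4, 0.5
values = np.arange(start, stop, step)

# The interval is half-open: start is included, stop is not.
print(start in values, stop in values)

# Expected number of values for this start, stop and step.
expected_length = math.ceil((stop - start) / step)
print(len(values) == expected_length)
```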


### 3.5 numpy.linspace

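Returns a given number of evenly spaced values over the specified interval. Both endpoints are included by default, and 50 values are returned if the number is not specified.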

In [70]:
samples = [np.linspace(0, 5),      # 50 values by default
           np.linspace(0, 5, 20)]  # 20 values
for sample in samples:
    print(sample)

[0.         0.10204082 0.20408163 0.30612245 0.40816327 0.51020408
 0.6122449  0.71428571 0.81632653 0.91836735 1.02040816 1.12244898
 1.2244898  1.32653061 1.42857143 1.53061224 1.63265306 1.73469388
 1.83673469 1.93877551 2.04081633 2.14285714 2.24489796 2.34693878
 2.44897959 2.55102041 2.65306122 2.75510204 2.85714286 2.95918367
 3.06122449 3.16326531 3.26530612 3.36734694 3.46938776 3.57142857
 3.67346939 3.7755102  3.87755102 3.97959184 4.08163265 4.18367347
 4.28571429 4.3877551  4.48979592 4.59183673 4.69387755 4.79591837
 4.89795918 5.        ]
[0.         0.26315789 0.52631579 0.78947368 1.05263158 1.31578947
 1.57894737 1.84210526 2.10526316 2.36842105 2.63157895 2.89473684
 3.15789474 3.42105263 3.68421053 3.94736842 4.21052632 4.47368421
 4.73684211 5.        ]
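
Two optional parameters of `np.linspace` are worth a quick look: `endpoint`, which controls whether the stop value is included, and `retstep`, which also returns the spacing. A small sketch with illustrative variable names:

```python
import numpy as np

closed = np.linspace(0, 1, 5)                     # stop value included
half_open = np.linspace(0, 1, 5, endpoint=False)  # stop value excluded
print(closed)
print(half_open)

# retstep=True returns the spacing between consecutive values as well.
values, spacing = np.linspace(0, 5, 20, retstep=True)
print(len(values), spacing)
```

With 20 values on a closed interval of length 5 there are 19 gaps, so the spacing should be `5 / 19`.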

## 4. Linear algebra package

One of the most useful modules in numpy is the `linalg` module. It provides common matrix operations, such as inversion, eigenvalue decomposition and the solution of linear systems.

For example, `np.linalg.inv(M)` returns the inverse of the matrix `M`.

In [2]:
M = np.array([[1,0,0],
              [0,3,0],
              [0,0,6]])
np.linalg.inv(M)

array([[1.        , 0.        , 0.        ],
       [0.        , 0.33333333, 0.        ],
       [0.        , 0.        , 0.16666667]])
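
A quick way to check an inverse is to multiply it by the original matrix and compare the result with the identity. The sketch below does that, and also shows what to expect for a matrix that has no inverse. The `singular` matrix and the `inversion_failed` flag are illustrative additions.

```python
import numpy as np

M = np.array([[1, 0, 0],
              [0, 3, 0],
              [0, 0, 6]])
M_inv = np.linalg.inv(M)

# M times its inverse should be the identity, up to rounding error.
print(np.allclose(M @ M_inv, np.eye(3)))

# A singular matrix (its second row is twice the first) cannot be inverted.
singular = np.array([[2, 1],
                     [4, 2]])
inversion_failed = False
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as error:
    inversion_failed = True
    print('Not invertible:', error)
```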

Other operations include computing eigenvalues and eigenvectors and solving linear systems:

In [4]:
# Eigenvalues and eigenvectors
np.linalg.eig(M)

(array([1., 3., 6.]), array([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]]))
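
The first array holds the eigenvalues and the second holds the eigenvectors as columns. The sketch below verifies the defining relation $M v = \lambda v$ for each pair. The variable names are illustrative.

```python
import numpy as np

M = np.array([[1, 0, 0],
              [0, 3, 0],
              [0, 0, 6]])
eigenvalues, eigenvectors = np.linalg.eig(M)

# Column k of `eigenvectors` belongs to eigenvalues[k].
for k in range(len(eigenvalues)):
    v = eigenvectors[:, k]
    print(np.allclose(M @ v, eigenvalues[k] * v))

# The same check for all pairs at once: M V = V diag(lambda).
# Broadcasting scales column k of `eigenvectors` by eigenvalues[k].
all_pairs_ok = np.allclose(M @ eigenvectors, eigenvectors * eigenvalues)
print(all_pairs_ok)
```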

In [5]:
# Linear systems: solve M x = b for x
b = np.array([2,9,30])
np.linalg.solve(M,b)

array([2., 3., 5.])
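
The solution can be checked by substituting it back into the system. The sketch below does this and compares the result with the one obtained through the explicit inverse. In general, `np.linalg.solve` is usually preferred over forming the inverse, since it tends to be faster and numerically more accurate. The variable names are illustrative.

```python
import numpy as np

M = np.array([[1, 0, 0],
              [0, 3, 0],
              [0, 0, 6]])
b = np.array([2, 9, 30])

x = np.linalg.solve(M, b)

# Substituting x back into the system should reproduce b.
print(np.allclose(M @ x, b))

# Same solution via the explicit inverse, shown only for comparison.
x_via_inverse = np.linalg.inv(M) @ b
print(np.allclose(x, x_via_inverse))
```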