# 6 - NumPy

## Command guide

- Importing NumPy
```
import numpy as np
```

- Creating a NumPy array

```
A = np.array([[1, 2], [3, 4], [5, 6]])
```

- Checking the shape: `A.shape`

- Getting the transpose: `A.transpose()` or `A.T`

- Element-wise operations between vectors and matrices: use `+`, `-`, `*`, `/` and `**`

- Dot product between vectors, matrix-vector product or matrix-matrix product: `np.dot(array1, array2)` or `array1 @ array2`

- Changing the dimensions of an array: `v = v.reshape(1, 3)`

- Indexing and slicing
  - First row of a matrix (returns a rank-1 array): `A[0]` or `A[0, :]`
    - To get a rank-2 array: `A[0].reshape(1, -1)` or `A[[0]]`
  - First and second rows of a matrix, returned as a rank-2 array: `A[[0, 1]]`

- Convenience functions
  - Matrix of zeros: `np.zeros((M, N))`
  - Matrix of ones: `np.ones((M, N))`
  - Matrix with random elements (drawn from a uniform distribution between 0 and 1): `np.random.rand(M, N)`
  - Matrix with random elements (drawn from a normal distribution with mean 0 and variance 1): `np.random.randn(M, N)`

## Tests shown in the video

### Performance gain with NumPy

In [1]:
import random

In [2]:
N = 1000
x_list = [random.random() for i in range(N)]

In [3]:
type(x_list)

list

In [4]:
y_list = [random.random() for i in range(N)]

In [5]:
def prod(x, y):
    N = len(x)
    return [x[i]*y[i] for i in range(N)]

In [6]:
z_list = prod(x_list, y_list)

In [7]:
z_list[:10]

[0.238299467521131,
 0.4535996113523602,
 0.6199383570909979,
 0.4154201051635536,
 0.322343767033582,
 0.058764016239509075,
 0.06221152019491746,
 0.020518904049980042,
 0.0020234015184232646,
 0.26454526196481093]

In [8]:
sum(z_list)

260.4980991866005

In [9]:
%timeit z_list = prod(x_list, y_list)

36.2 µs ± 577 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [10]:
import numpy as np

In [11]:
x_np = np.array(x_list)
y_np = np.array(y_list)

In [12]:
def prod_np(x, y):
    return x * y

In [13]:
z_np = prod_np(x_np, y_np)

In [14]:
z_np[:10]

array([0.23829947, 0.45359961, 0.61993836, 0.41542011, 0.32234377,
       0.05876402, 0.06221152, 0.0205189 , 0.0020234 , 0.26454526])

In [15]:
sum(z_np)

260.4980991866005

In [16]:
%timeit z_np = prod_np(x_np, y_np)

481 ns ± 16.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [17]:
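# Approximate speedup (list time / NumPy time); these figures differ slightly from the timings printed above
36.4e-6/456e-9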

79.82456140350877
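
The comparison above can be reproduced outside a notebook with the standard-library `timeit` module. This is only a minimal sketch: `prod_list` and the `repeats` count are illustrative choices, and the absolute timings (and therefore the speedup) will vary from machine to machine.

```python
import random
import timeit

import numpy as np

N = 1000
x_list = [random.random() for _ in range(N)]
y_list = [random.random() for _ in range(N)]
x_np = np.array(x_list)
y_np = np.array(y_list)


def prod_list(x, y):
    # Element-wise product with a pure-Python loop
    return [a * b for a, b in zip(x, y)]


def prod_np(x, y):
    # Element-wise product delegated to NumPy's compiled loop
    return x * y


repeats = 1000  # illustrative; increase for more stable numbers
t_list = timeit.timeit(lambda: prod_list(x_list, y_list), number=repeats)
t_np = timeit.timeit(lambda: prod_np(x_np, y_np), number=repeats)

print(f"list:  {t_list / repeats * 1e6:.2f} µs per call")
print(f"NumPy: {t_np / repeats * 1e6:.2f} µs per call")
print(f"speedup: {t_list / t_np:.1f}x")
```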

### Arrays

In [18]:
import numpy as np
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [19]:
type(A)

numpy.ndarray

In [20]:
lista = [1, 2.5, "texto"]

In [21]:
lista

[1, 2.5, 'texto']

In [22]:
A.dtype

dtype('int64')

In [23]:
A = np.array([[1.0, 2], [3, 4], [5, 6]])
A

array([[1., 2.],
       [3., 4.],
       [5., 6.]])

In [24]:
A.dtype

dtype('float64')

In [25]:
A.shape

(3, 2)

In [26]:
A.transpose()

array([[1., 3., 5.],
       [2., 4., 6.]])

In [27]:
A.T

array([[1., 3., 5.],
       [2., 4., 6.]])

In [28]:
v = np.array([1, 2, 3])

In [29]:
v

array([1, 2, 3])

In [30]:
v.shape

(3,)

In [31]:
v.T

array([1, 2, 3])

In [32]:
np.dot(v, v.T)

14

In [33]:
v = np.array([[1, 2, 3]])

In [34]:
v.shape

(1, 3)

In [35]:
v = np.array([[1], [2], [3]])

In [36]:
v.shape

(3, 1)

In [37]:
np.dot(v, v.T)

array([[1, 2, 3],
       [2, 4, 6],
       [3, 6, 9]])
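
The two `np.dot(v, v.T)` calls above give different results only because of the shape of `v`. The sketch below puts the cases side by side; the names `v1` and `v2` are used here purely for illustration.

```python
import numpy as np

v1 = np.array([1, 2, 3])   # rank 1, shape (3,)
v2 = v1.reshape(-1, 1)     # rank 2, shape (3, 1): column vector

# Transposing a rank-1 array changes nothing, so this is the inner product
inner = np.dot(v1, v1.T)

# (3, 1) times (1, 3) gives the 3x3 outer product
outer = np.dot(v2, v2.T)

# (1, 3) times (3, 1) gives the inner product as a 1x1 matrix
inner_2d = np.dot(v2.T, v2)

print(inner, outer.shape, inner_2d.shape)
```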

In [38]:
v = v.reshape(1, 3)

In [39]:
v

array([[1, 2, 3]])

In [40]:
# To make sure a vector has a given shape
assert v.shape == (1, 3)

In [41]:
v.shape

(1, 3)

In [42]:
a = np.array([4, 5, 6]).reshape(4, 1)
a

ValueError: cannot reshape array of size 3 into shape (4,1)

In [43]:
a = np.array([4, 5, 6]).reshape(-1, 1)
a

array([[4],
       [5],
       [6]])

In [44]:
a.shape

(3, 1)

In [45]:
A

array([[1., 2.],
       [3., 4.],
       [5., 6.]])

In [46]:
A.reshape(-1, 3)

array([[1., 2., 3.],
       [4., 5., 6.]])
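
In the `reshape` calls above, `-1` asks NumPy to infer that dimension from the total number of elements. A small sketch, using a hypothetical array `m` with 6 elements:

```python
import numpy as np

m = np.arange(1, 7)          # array([1, 2, 3, 4, 5, 6]), 6 elements

col = m.reshape(-1, 1)       # 6 / 1 = 6 rows
two_rows = m.reshape(2, -1)  # 6 / 2 = 3 columns

print(col.shape, two_rows.shape)

# The total size must be preserved, otherwise reshape raises ValueError
try:
    m.reshape(4, -1)         # 6 is not divisible by 4
except ValueError as err:
    print("ValueError:", err)
```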

### Slicing

In [47]:
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

In [48]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [49]:
A[0] # first row

array([1, 2, 3])

In [50]:
A[0].shape # the result is a rank-1 array

(3,)

In [51]:
A[0, :]

array([1, 2, 3])

In [52]:
A[0, :].shape

(3,)

To get a rank-2 array:

- Use reshape

In [53]:
A[0, :].reshape(1, -1)

array([[1, 2, 3]])

In [54]:
A[0, :].reshape(1, -1).shape

(1, 3)

- Use a list of indices (or a slice) to specify rows and columns

In [55]:
A[[0]]

array([[1, 2, 3]])

In [56]:
A[[0]].shape

(1, 3)

In [57]:
A[[0, 1]]

array([[1, 2, 3],
       [4, 5, 6]])

In [58]:
A[:, [0]]

array([[1],
       [4],
       [7]])

In [59]:
A[[0], :]

array([[1, 2, 3]])

In [60]:
A[0:1, :]

array([[1, 2, 3]])

In [61]:
A[0:2, :]

array([[1, 2, 3],
       [4, 5, 6]])
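
The pattern in the cells above is that an integer index removes that dimension, while a list of indices or a slice keeps it. A quick check using `ndim` (the number of dimensions, which these notes call rank):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

for label, result in [
    ("A[0]", A[0]),            # integer index on the row axis
    ("A[[0]]", A[[0]]),        # list of indices
    ("A[0:1]", A[0:1]),        # slice
    ("A[:, 0]", A[:, 0]),      # integer index on the column axis
    ("A[:, [0]]", A[:, [0]]),  # list of indices on the column axis
]:
    print(f"{label:10s} ndim={result.ndim} shape={result.shape}")
```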

### Convenience functions

In [62]:
np.zeros((3, 2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [63]:
np.ones((2, 3))

array([[1., 1., 1.],
       [1., 1., 1.]])

In [64]:
np.random.rand(3, 2) # elements between 0 and 1, uniformly distributed

array([[0.46635824, 0.34286397],
       [0.64836301, 0.00745396],
       [0.69759353, 0.89151124]])

In [65]:
np.random.randn(3, 2) # elements from a normal distribution with mean zero and variance 1

array([[ 1.18431295,  0.87547317],
       [-1.50005818, -2.01904215],
       [-0.22474556, -0.0353872 ]])
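
As a sanity check on the two distributions, the sample mean and variance of a large draw should land close to the theoretical values: mean 0.5 and variance 1/12 for the uniform case, mean 0 and variance 1 for the normal case. The sample size and the seed below are arbitrary choices made for this sketch.

```python
import numpy as np

np.random.seed(0)             # arbitrary seed, only for repeatability

u = np.random.rand(100_000)   # uniform between 0 and 1
g = np.random.randn(100_000)  # normal with mean 0 and variance 1

print(f"uniform: mean={u.mean():.3f} var={u.var():.3f} (theory: 0.500, {1/12:.3f})")
print(f"normal:  mean={g.mean():.3f} var={g.var():.3f} (theory: 0.000, 1.000)")
```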

### Operations

Python's basic arithmetic operators (`+`, `-`, `*`, `/`, `**`) are always applied element by element to NumPy arrays.

In [66]:
10*np.ones((2,2))

array([[10., 10.],
       [10., 10.]])

In [67]:
A = 10 * np.ones((2, 2))
A

array([[10., 10.],
       [10., 10.]])

In [68]:
B = np.array([[2, 2], [5, 5]])
B

array([[2, 2],
       [5, 5]])

In [69]:
A*B

array([[20., 20.],
       [50., 50.]])

In [70]:
A**B

array([[   100.,    100.],
       [100000., 100000.]])

In [71]:
np.dot(A,B) # matrix multiplication

array([[70., 70.],
       [70., 70.]])

In [72]:
A @ B

array([[70., 70.],
       [70., 70.]])

In [73]:
A = np.array([[1, 2], [3, 4]])
A

array([[1, 2],
       [3, 4]])

In [74]:
v = np.array([[5], [6]])
v

array([[5],
       [6]])

In [75]:
A@v

array([[17],
       [39]])

In [76]:
v@A

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 1)

In [77]:
np.dot(A,v)

array([[17],
       [39]])
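
`v@A` fails above because matrix multiplication requires the number of columns of the left operand to equal the number of rows of the right operand: `(n, k) @ (k, m)` gives `(n, m)`. A sketch with the same `A` and `v`; transposing `v` is one way to make the shapes compatible, although it computes a different product from `A@v`.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])  # shape (2, 2)
v = np.array([[5], [6]])        # shape (2, 1)

print((A @ v).shape)  # (2, 2) @ (2, 1) -> (2, 1)

try:
    v @ A             # (2, 1) @ (2, 2): inner sizes 1 and 2 differ
except ValueError:
    print("v @ A: incompatible shapes")

row = v.T @ A         # (1, 2) @ (2, 2) -> (1, 2)
print(row)
```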

### Broadcasting

In [78]:
a = np.array([[1,2,3]]) # row vector
a

array([[1, 2, 3]])

In [79]:
b = np.array([[4,5,6]]).reshape(-1, 1) # column vector
b

array([[4],
       [5],
       [6]])

In [80]:
a+b

array([[5, 6, 7],
       [6, 7, 8],
       [7, 8, 9]])

In [81]:
b+a

array([[5, 6, 7],
       [6, 7, 8],
       [7, 8, 9]])

In [82]:
a

array([[1, 2, 3]])

In [83]:
b

array([[4],
       [5],
       [6]])

In [84]:
b*a

array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18]])

In [85]:
c = np.array([[1,2,3]]) # row vector
c

array([[1, 2, 3]])

In [86]:
d = np.array([[4,5]]).reshape(-1, 1) # column vector
d

array([[4],
       [5]])

In [87]:
c*d

array([[ 4,  8, 12],
       [ 5, 10, 15]])

In [88]:
c@d

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3)
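
The difference between `c*d` and `c@d` comes down to two different shape rules. Element-wise operations broadcast: dimensions are compared from the right, and two sizes are compatible when they are equal or one of them is 1. Matrix multiplication instead needs matching inner dimensions. The sketch below uses `np.broadcast`, which reports the broadcast shape without computing the result.

```python
import numpy as np

c = np.array([[1, 2, 3]])  # shape (1, 3)
d = np.array([[4], [5]])   # shape (2, 1)

# Broadcasting: (1, 3) and (2, 1) are both stretched to (2, 3)
print(np.broadcast(c, d).shape)
print(c * d)

# Matrix product: (1, 3) @ (2, 1) would need 3 == 2, so it fails
try:
    c @ d
except ValueError:
    print("c @ d: inner dimensions 3 and 2 do not match")

# Sizes that are neither equal nor 1 cannot be broadcast either
try:
    np.array([1, 2, 3]) + np.array([1, 2])
except ValueError:
    print("shapes (3,) and (2,) cannot be broadcast")
```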