<p style="font-family: Arial; font-size:3.75em;color:purple; font-style:bold"><br>
Introduction to NumPy
</p><br>

<p style="font-family: Arial; font-size:1.25em;color:#2462C0; font-style:bold"><br>
A package for scientific computing with Python
</p><br>

Numerical Python, or "NumPy", is a fundamental package, and many of the most common packages used in data science (e.g. pandas) are built on top of it. NumPy provides high-performance multidimensional arrays that we can use as vectors or matrices.

The main features of NumPy are:

- ndarrays: n-dimensional arrays whose elements share a single data type, which makes them fast and space-efficient. ndarrays come with many built-in methods that process data quickly without explicit loops (for example, computing the mean).
- Broadcasting: a set of rules that defines how arithmetic behaves implicitly between arrays of different shapes.
- Vectorization: numerical operations are applied to whole ndarrays at once instead of element by element.
- Input/Output: simplifies reading data from and writing data to files.
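
As a rough illustration of the features listed above, the sketch below contrasts an explicit Python loop with the equivalent vectorized expression and a broadcast addition. The names `values` and `offsets` are made up for this example.

```python
import numpy as np

values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# Built-in method: mean of all elements, no explicit loop
overall_mean = values.mean()

# Vectorization: one expression squares every element
squared = values ** 2

# Equivalent explicit loop, shown for comparison only
squared_loop = np.empty_like(values)
for i in range(values.shape[0]):
    for j in range(values.shape[1]):
        squared_loop[i, j] = values[i, j] ** 2

# Broadcasting: the (3,) array is added to each row of the (2, 3) array
offsets = np.array([10.0, 20.0, 30.0])
shifted = values + offsets

print(overall_mean)
print(np.array_equal(squared, squared_loop))
print(shifted)
```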

<b>Recommended resources:</b><br>
<a href="https://docs.scipy.org/doc/numpy/reference/">NumPy Documentation</a><br>




<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Getting started with ndarray<br><br></p>

**ndarrays** are time- and space-efficient multidimensional arrays and form the core of NumPy. Let's start by creating ndarrays with the numpy package.

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

How to create a 1-D array:
</p>

In [None]:
import numpy as np

an_array = np.array([3, 33, 333])  # Create a 1-D array

print(type(an_array))              # The type of an ndarray is: "<class 'numpy.ndarray'>"

In [None]:
# test the shape of the array we just created, it should have just one dimension (1-D)
print(an_array.shape)

In [None]:
# because this is a 1-D array, we need only one index to access each element
print(an_array[0], an_array[1], an_array[2]) 

In [None]:
an_array[0] = 888           # ndarrays are mutable; here we change an element of the array

print(an_array)

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

How to create a 2-D array:</p>

A 2-D **ndarray** is one with two dimensions. Note the [[row], [row]] format below. Two-dimensional arrays are a natural way to represent matrices, which come up often in data science.

In [None]:
another = np.array([[11,12,13],[21,22,23]])   # Create a 2-D array

print(another)  # print the array

print("The shape is 2 rows, 3 columns: ", another.shape)  # rows x columns                   

print("Accessing elements [0,0], [0,1], and [1,0] of the ndarray: ", another[0, 0], ", ",another[0, 1],", ", another[1, 0])

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

There are many ways to create NumPy arrays:
</p>

Here we create several arrays with different shapes and different pre-filled values. NumPy has several built-in functions that help us create multidimensional arrays quickly and easily.

In [None]:
# create a 2x2 array of zeros
ex1 = np.zeros((2,2))      
print(ex1)                              

In [None]:
# create a 2x2 array filled with 9.0
ex2 = np.full((2,2), 9.0)  
print(ex2)

# create a 2x2 array filled with the string 'Heudson'
ex3 = np.full((2,2),'Heudson')  
print(ex3)

In [None]:
# create a 2x2 matrix with the diagonal 1s and the others 0
ex4 = np.eye(2,2)
print(ex4)   

In [None]:
# create an array of ones
ex5 = np.ones((1,2))
print(ex5)    

In [None]:
# notice that the above ndarray (ex5) is actually rank 2, it is a 1x2 array
print(ex5.shape)

# which means we need to use two indexes to access an element
print()
print(ex5[0,1])

In [None]:
# create an array of random floats between 0 and 1
ex6 = np.random.random((2,2))
print(ex6)    

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Array Indexing
<br><br></p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>
Slice indexing:
</p>

As with slice indexing on lists and strings, we can use slice indexing to extract sub-regions of ndarrays.

In [None]:
# 2-D array of shape (3, 4)
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

Use array slicing to get a subarray made of the first two rows and the columns at index 1 and 2 (a 2 x 2 block).

In [None]:
a_slice = an_array[:2, 1:3]
print(a_slice)

When you modify a slice, you are actually modifying the underlying array.

In [None]:
print("Before:", an_array[0, 1])   #inspect the element at 0, 1 
print(an_array)
a_slice[0, 0] = 1000    # a_slice[0, 0] is the same piece of data as an_array[0, 1]
print("After:", an_array[0, 1]) 
print(an_array)
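
If the write-through behavior shown above is not wanted, one option is to take an explicit copy of the slice. The following is a minimal sketch; the names `base`, `view`, and `independent` are illustrative and not part of the original notebook.

```python
import numpy as np

base = np.array([[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]])

view = base[:2, 1:3]                 # basic slicing returns a view on base
independent = base[:2, 1:3].copy()   # an explicit copy owns its own data

independent[0, 0] = -1               # does not touch base
unchanged = base[0, 1]

view[0, 0] = 1000                    # writes through to base
changed = base[0, 1]

print(unchanged, changed)
print(np.shares_memory(base, view), np.shares_memory(base, independent))
```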

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Using integer indexing & slice indexing
</p>

We can combine integer indexing and slice indexing to produce arrays of different shapes.

In [None]:
# Create a Rank 2 array of shape (3, 4)
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

In [None]:
# Using both integer indexing & slicing generates an array of lower rank
row_rank1 = an_array[1, :]    # Rank 1 view 

print(row_rank1, row_rank1.shape)  # notice only a single []

In [None]:
# Slicing alone: generates an array of the same rank as the an_array
row_rank2 = an_array[1:2, :]  # Rank 2 view 

print(row_rank2, row_rank2.shape)   # Notice the [[ ]]

In [None]:
#We can do the same thing for columns of an array:

print()
col_rank1 = an_array[:, 1]
col_rank2 = an_array[:, 1:2]

print(col_rank1, col_rank1.shape)  # Rank 1
print()
print(col_rank2, col_rank2.shape)  # Rank 2

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Array indexing for changing elements:
</p>

Sometimes it is useful to use an array of indices to access or change elements.

In [None]:
# Create a new array
an_array = np.array([[11,12,13], [21,22,23], [31,32,33], [41,42,43]])

print('Original Array:')
print(an_array)

In [None]:
# Create an array of indices
col_indices = np.array([0, 1, 2, 0])
print('\nCol indices picked : ', col_indices)

row_indices = np.arange(4)
print('\nRows indices picked : ', row_indices)

In [None]:
# Examine the pairings of row_indices and col_indices.  These are the elements we'll change next.
for row,col in zip(row_indices,col_indices):
    print(row, ", ",col)

In [None]:
# Select one element from each row
print('Values in the array at those indices: ',an_array[row_indices, col_indices])

In [None]:
# Change one element from each row using the indices selected
an_array[row_indices, col_indices] += 100000

print('\nChanged Array:')
print(an_array)

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>
Boolean Indexing
<br><br></p>
<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>
Array indexing for changing elements:
</p>

In [None]:
# create a 3x2 array
an_array = np.array([[11,12], [21, 22], [31, 32]])
print(an_array)

In [None]:
# create a filter which will be boolean values for whether each element meets this condition
filter1 = (an_array > 15)
print(filter1)
print()

# create a filter which will be boolean values for whether each row meets this condition
filter2 = (an_array > 15).all(axis = 1)
print(filter2)
print()
# create a filter which will be boolean values for whether each column meets this condition
filter3 = (an_array > 15).all(axis = 0)
print(filter3)

Note that filter1 is an ndarray with the same shape as an_array, filled with True wherever the corresponding element of an_array is greater than 15 and False wherever it is less than or equal to 15.

filter2 has the same length as the first dimension of an_array. It is True for each row in which every element is greater than 15 and False for any row that contains at least one element of 15 or less. filter3 applies the same test to each column.

In [None]:
# we can now select just those elements which satisfy filter1
print(an_array[filter1])
print()


In [None]:
# For short, we could have just used the approach below without the need for the separate filter array.

an_array[(an_array >15)]

What is particularly useful is that we can change elements in the array by applying a similar logical filter. Let's add 100 to all values greater than 15.

In [None]:
an_array[an_array > 15] += 100
print(an_array)
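
The same pattern works for any elementwise condition. As a minimal sketch, the example below uses an evenness test and then combines two conditions; the names `data`, `even_mask`, and `in_range` are assumptions made for this illustration.

```python
import numpy as np

data = np.array([[11, 12], [21, 22], [31, 32]])

even_mask = (data % 2 == 0)   # True where the value is even
data[even_mask] += 100        # in-place update of the selected elements

# Conditions combine with & (and) and | (or); the parentheses are required
in_range = data[(data > 15) & (data < 120)]

print(data)
print(in_range)
```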

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Datatypes and operations with arrays
<br><br></p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Datatypes:
</p>

In [None]:
ex1 = np.array([11, 12]) # NumPy infers the data type
print(ex1.dtype)

In [None]:
ex2 = np.array([11.0, 12.0]) # NumPy infers the data type
print(ex2.dtype)

In [None]:
ex3 = np.array([11, 21], dtype=np.int64) # You can also specify the data type explicitly
print(ex3.dtype)

In [None]:
# you can use this to force floats into integers (the fractional part is truncated)
ex4 = np.array([11.1,12.7], dtype=np.int64)
print(ex4.dtype)
print()
print(ex4)
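
For positive values, truncation and floor give the same result, so the difference only shows up with negative numbers. The sketch below compares the two using `astype`; the array `floats` is made up for this example.

```python
import numpy as np

floats = np.array([11.1, 12.7, -11.1, -12.7])

truncated = floats.astype(np.int64)           # drops the fractional part (toward zero)
floored = np.floor(floats).astype(np.int64)   # rounds down (toward negative infinity)

print(truncated)
print(floored)
```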

In [None]:
# you can use this to force integers into floats if you anticipate
# the values may change to floats later
ex5 = np.array([11, 21], dtype=np.float64)
print(ex5.dtype)
print()
print(ex5)

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Arithmetic operations with arrays:

</p>

In [None]:
x = np.array([[111,112],[121,122]], dtype=np.int64)  # np.int was removed in NumPy 1.24
y = np.array([[211.1,212.1],[221.1,222.1]], dtype=np.float64)

print(x)
print()
print(y)

In [None]:
# add
print(x + y)         # The plus sign works
print()
print(np.add(x, y))  # so does the numpy function "add"

In [None]:
# subtract
print(x - y)
print()
print(np.subtract(x, y))

In [None]:
# multiply
print(x * y)
print()
print(np.multiply(x, y))

In [None]:
# divide
print(x / y)
print()
print(np.divide(x, y))

In [None]:
# square root
print(np.sqrt(x))

In [None]:
# exponent (e ** x)
print(np.exp(x))

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Statistical methods, sorting, and <br> <br> set operations:
<br><br>
</p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Basic statistical operations:
</p>

In [None]:
# set up a random 2 x 5 matrix
arr = 10 * np.random.randn(2,5)
print(arr)

In [None]:
# compute the mean for all elements
print(arr.mean())

In [None]:
# compute the means by row
print(arr.mean(axis = 1))

In [None]:
# compute the means by column
print(arr.mean(axis = 0))

In [None]:
# sum all the elements
print(arr.sum())

In [None]:
# compute the median of each row
print(np.median(arr, axis = 1))
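
The `axis` argument names the dimension that gets collapsed. A small sketch with fixed values makes this easier to check by hand than with random data; the `keepdims` usage and the variable names are additions for illustration.

```python
import numpy as np

arr_fixed = np.array([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 9.0]])

row_means = arr_fixed.mean(axis=1)   # collapses the columns: one value per row
col_means = arr_fixed.mean(axis=0)   # collapses the rows: one value per column

# keepdims=True keeps the result 2-D with shape (2, 1), so it broadcasts against arr_fixed
row_means_2d = arr_fixed.mean(axis=1, keepdims=True)
centered = arr_fixed - row_means_2d

print(row_means, col_means)
print(centered)
```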

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Sorting:
</p>


In [None]:
# create a 10 element array of randoms
unsorted = np.random.randn(10)

print(unsorted)

In [None]:
# create copy and sort
sorted_ = np.array(unsorted)
sorted_.sort()

print(sorted_)
print()
print(unsorted)

In [None]:
# inplace sorting
unsorted.sort() 

print(unsorted)

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Finding unique elements:
</p>

In [None]:
array = np.array([1,2,1,4,2,1,4,2])

uniques,index,counts = np.unique(array,return_counts=True,return_index=True)

print(uniques)
print(index)
print(counts)
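
To make the meaning of the extra return values concrete, the sketch below also requests `return_inverse` (not used in the cell above) and checks how each output relates to the input array.

```python
import numpy as np

array = np.array([1, 2, 1, 4, 2, 1, 4, 2])

uniques, index, inverse, counts = np.unique(
    array, return_index=True, return_inverse=True, return_counts=True
)

# index: position of the first occurrence of each unique value
first_seen = array[index]

# inverse: position in `uniques` of each original element
rebuilt = uniques[inverse]

print(uniques, index, counts)
print(np.array_equal(first_seen, uniques), np.array_equal(rebuilt, array))
```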

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Set operations with np.array data type:
</p>

In [None]:
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

In [None]:
s1 = np.array(['desk','chair','bulb'])
s2 = np.array(['lamp','bulb','chair'])
print(s1, s2)

In [None]:
print( np.intersect1d(s1, s2) ) 
venn2(subsets = [set(s1),set(s2)], set_labels = ('s1', 's2'))
plt.show()

In [None]:
print( np.union1d(s1, s2) )

In [None]:
print( np.setdiff1d(s1, s2) )# elements in s1 that are not in s2

In [None]:
print( np.isin(s1, s2) )  # which elements of s1 are also in s2 (np.in1d is deprecated in NumPy 2.0)

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Broadcasting:
<br><br>
</p>

Introduction to broadcasting. <br>
References: <br>
https://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html

In [None]:
start = np.zeros((4,3))
print(start)

In [None]:
# create a rank 1 ndarray with 3 values
add_rows = np.array([1, 0, 2])
print(add_rows)

In [None]:
y = start + add_rows  # add to each row of 'start' using broadcasting
print(y)

In [None]:
# create an ndarray which is 4 x 1 to broadcast across columns
add_cols = np.array([[0,1,2,3]])
add_cols = add_cols.T

print(add_cols)

In [None]:
# add to each column of 'start' using broadcasting
y = start + add_cols 
print(y)

In [None]:
# this will just broadcast in both dimensions
add_scalar = np.array([1])  
print(start+add_scalar)

Examples from the slides:

In [None]:
# create our 3x4 matrix
arrA = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(arrA)

In [None]:
# create our 4-element list (it broadcasts as a 1x4 row)
arrB = [0,1,0,2]
print(arrB)

In [None]:
# add the two together using broadcasting
print(arrA + arrB)
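
The addition above works because shapes are compared from the trailing dimension backwards, and each pair of sizes must either match or be 1. The sketch below shows two compatible cases and one that fails; `row`, `col`, and `mismatch_raised` are names introduced for this illustration.

```python
import numpy as np

arrA = np.arange(1, 13).reshape(3, 4)   # shape (3, 4)
row = np.array([0, 1, 0, 2])            # shape (4,), treated as (1, 4)
col = np.array([[10], [20], [30]])      # shape (3, 1)

by_row = arrA + row   # (3, 4) + (4,)   -> (3, 4)
by_col = arrA + col   # (3, 4) + (3, 1) -> (3, 4)

# A (3,) array does not line up with the trailing dimension of size 4
try:
    arrA + np.array([1, 2, 3])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(by_row)
print(by_col)
print(mismatch_raised)
```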

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Speed test: ndarrays vs lists
<br><br>
</p>

Setting up the parameters for the speed test.
We will time how long it takes to sum the elements of an ndarray versus a list.

In [None]:
from numpy import arange
from timeit import Timer

size    = 1000000
timeits = 1000

In [None]:
# create the ndarray with values 0,1,2...,size-1
nd_array = arange(size)
print( type(nd_array) )
print(nd_array.shape)

In [None]:
# timer expects the operation as a parameter, 
# here we pass nd_array.sum()
timer_numpy = Timer("nd_array.sum()", "from __main__ import nd_array")

print("Time taken by numpy ndarray: %f seconds" % 
      (timer_numpy.timeit(timeits)/timeits))

In [None]:
# create the list with values 0,1,2...,size-1
a_list = list(range(size))
print (type(a_list) )
print(len(a_list))

In [None]:
# timer expects the operation as a parameter, here we pass sum(a_list)
timer_list = Timer("sum(a_list)", "from __main__ import a_list")

print("Time taken by list:  %f seconds" % 
      (timer_list.timeit(timeits)/timeits))

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Reading from or writing to disk:
<br><br>
</p>

<p style="font-family: Arial; font-size:1.3em;color:#2462C0; font-style:bold"><br>

Binary format:</p>

In [None]:
x = np.array([ 23.23, 24.24] )

In [None]:
np.save('an_array', x)

In [None]:
np.load('an_array.npy')

<p style="font-family: Arial; font-size:1.3em;color:#2462C0; font-style:bold"><br>

Text format:</p>

In [None]:
np.savetxt('array.txt', X=x, delimiter=',')

In [None]:
np.loadtxt('array.txt', delimiter=',')
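
A quick way to confirm that both formats round-trip the data is to write to a temporary directory and compare what comes back. This is a sketch only; the use of `tempfile` and the variable names are choices made for this example, not part of the original notebook.

```python
import os
import tempfile

import numpy as np

x = np.array([23.23, 24.24])

with tempfile.TemporaryDirectory() as tmp_dir:
    binary_path = os.path.join(tmp_dir, "an_array.npy")
    text_path = os.path.join(tmp_dir, "array.txt")

    np.save(binary_path, x)                   # binary .npy file
    np.savetxt(text_path, x, delimiter=",")   # plain-text file

    from_binary = np.load(binary_path)
    from_text = np.loadtxt(text_path, delimiter=",")

print(np.array_equal(x, from_binary))
print(np.allclose(x, from_text))
```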

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Additional operations
<br><br></p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Dot product on matrices and inner product on vectors:

</p>

In [None]:
# determine the dot product of two matrices
x2d = np.array([[1,1],[1,1]])
y2d = np.array([[2,2],[2,2]])
print(x2d.dot(y2d))
print()
print(np.dot(x2d, y2d))

In [None]:
# determine the inner product of two vectors
a1d = np.array([9 , 9 ])
b1d = np.array([10, 10])

print(a1d.dot(b1d))
print()
print(np.dot(a1d, b1d))

In [None]:
# dot product of a matrix and a vector
print(x2d.dot(a1d))
print()
print(np.dot(x2d, a1d))
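
For 1-D and 2-D arrays, the `@` operator gives the same result as `np.dot` and is easy to confuse with `*`, which multiplies element by element. A minimal sketch using the same values as the cells above:

```python
import numpy as np

x2d = np.array([[1, 1], [1, 1]])
y2d = np.array([[2, 2], [2, 2]])
a1d = np.array([9, 9])

matrix_product = x2d @ y2d   # same as np.dot(x2d, y2d) for 2-D arrays
matrix_vector = x2d @ a1d    # same as np.dot(x2d, a1d)
elementwise = x2d * y2d      # elementwise product, NOT a matrix product

print(matrix_product)
print(matrix_vector)
print(elementwise)
```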

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Sum:
</p>

In [None]:
# sum elements in the array
ex1 = np.array([[11,12],[21,22]])

print(np.sum(ex1))          # add all members

In [None]:
print(np.sum(ex1, axis=0))  # columnwise sum

In [None]:
print(np.sum(ex1, axis=1))  # rowwise sum

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Element-wise Functions: </p>


For example, let's compare the values of two arrays element by element and take the maximum of each pair.

In [None]:
# random array
x = np.random.randn(8)
x

In [None]:
# another random array
y = np.random.randn(8)
y

In [None]:
# returns element wise maximum between two arrays

np.maximum(x, y)

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Reshaping array:
</p>

In [None]:
# grab values from 0 through 19 in an array
arr = np.arange(20)
print(arr)

In [None]:
# reshape to be a 4 x 5 matrix
arr.reshape(4,5)
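
Note that `reshape` returns a new array object and leaves the shape of `arr` untouched. The sketch below also shows `-1` as a placeholder dimension and, for a contiguous array like this one, that the result is a view sharing data with the original. The names `grid` and `auto` are illustrative.

```python
import numpy as np

arr = np.arange(20)

grid = arr.reshape(4, 5)     # new array object; arr keeps shape (20,)
auto = arr.reshape(-1, 10)   # -1 lets NumPy infer the row count: 20 / 10 = 2

print(arr.shape, grid.shape, auto.shape)

# For a contiguous array such as this one, reshape returns a view
grid[0, 0] = 99
print(arr[0])
```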

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Transpose:

</p>

In [None]:
# transpose
ex1 = np.array([[11,12],[21,22]])

ex1.T

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Indexing using where():</p>

In [None]:
x_1 = np.array([1,2,3,4,5])

y_1 = np.array([11,22,33,44,55])

filter_ = np.array([True, False, True, False, True])

In [None]:
out = np.where(filter_, x_1, y_1)
print(out)

In [None]:
mat = np.random.rand(5,5)
mat

In [None]:
np.where( mat > 0.5, 1000, -1)
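
When called with only a condition, `np.where` returns the indices where the condition holds, one index array per dimension. A small sketch with fixed values (so the output can be checked by hand); `mat_fixed` is a made-up stand-in for the random matrix above.

```python
import numpy as np

mat_fixed = np.array([[0.1, 0.9],
                      [0.7, 0.2]])

rows, cols = np.where(mat_fixed > 0.5)   # one index array per dimension
selected = mat_fixed[rows, cols]         # the values at those positions

print(rows, cols)
print(selected)
```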

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

"any" or "all" conditionals:</p>

In [None]:
arr_bools = np.array([ True, False, True, True, False ])

In [None]:
arr_bools.any()

In [None]:
arr_bools.all()

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>
Generating random numbers:
</p>

In [None]:
Y = np.random.normal(size = (1,5))[0]
print(Y)

In [None]:
Z = np.random.randint(low=2,high=50,size=4)
print(Z)

In [None]:
np.random.permutation(Z) #return a new ordering of elements in Z

In [None]:
np.random.uniform(size=4) #uniform distribution

In [None]:
np.random.normal(size=4) #normal distribution
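
The calls above give different numbers on every run. When reproducible results are needed, one option is NumPy's `Generator` interface with a fixed seed. This is a sketch of that alternative, not the approach used in the cells above; the seed value 42 is arbitrary.

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

# Two generators with the same seed produce the same sequence
draw_a = rng_a.normal(size=4)
draw_b = rng_b.normal(size=4)

# integers() excludes the upper bound by default, like randint
ints = rng_a.integers(low=2, high=50, size=4)

print(np.array_equal(draw_a, draw_b))
print(ints)
```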

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Merging data sets:
</p>

In [None]:
K = np.random.randint(low=2,high=50,size=(2,2))
print(K)

print()
M = np.random.randint(low=2,high=50,size=(2,2))
print(M)

In [None]:
np.vstack((K,M))

In [None]:
np.hstack((K,M))

In [None]:
np.concatenate([K, M], axis = 0)

In [None]:
np.concatenate([K, M.T], axis = 1)