# Testing GPU parallelization with the Numba library

## Multiplying vectors (CPU)

### Declaring the function

In [2]:
import numpy as np

def mulVectorsCPU(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Both inputs must be NumPy arrays (the code relies on .size) of equal length
    if a.size != b.size:
        raise RuntimeError('mulVectorsCPU() expects two vectors (NumPy arrays) of equal length!')

    # Initializing the vector to be returned with zeros (default dtype -> np.float64)
    c = np.zeros(a.size)
    # Multiplying element by element in a plain Python loop
    for i in range(a.size):
        c[i] = a[i] * b[i]
    return c
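
The loop above performs an elementwise product, which NumPy also exposes through the `*` operator on arrays. As a quick sanity check, the sketch below compares the two on a tiny input. The name `mul_vectors_loop` and the sample values are assumptions made for illustration; they are not part of the original notebook.

```python
import numpy as np

def mul_vectors_loop(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in that mirrors the explicit loop of mulVectorsCPU()."""
    if a.size != b.size:
        raise RuntimeError('expects two vectors of equal length!')
    c = np.zeros(a.size)
    for i in range(a.size):
        c[i] = a[i] * b[i]
    return c

# Small illustrative inputs (assumed values, chosen to be exact in float64)
small_a = np.array([1.0, -2.0, 3.5])
small_b = np.array([4.0, 0.5, -2.0])

loop_result = mul_vectors_loop(small_a, small_b)
numpy_result = small_a * small_b  # NumPy's built-in elementwise product

print(loop_result)
print(np.array_equal(loop_result, numpy_result))
```

If the two results agree on small inputs, the loop version can serve as a slow but easy-to-read baseline for the timings that follow.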

### Initializing the test vectors

Initializing two vectors **`a`** and **`b`** with **100,000,000 random floats** each, drawn uniformly from the **interval [-10, +10)**.

> This generation step usually takes about 3m40s to run on a Ryzen 7 7700X @ 5.4 GHz!

In [3]:
N = 100000000
MIN_VAL = -10.0
MAX_VAL = 10.0

# Initializing vectors with zeroes (with default dtype -> np.float64)
a = np.zeros(N)
b = np.zeros(N)

# Filling vectors with random floats
for i in range(N):
    a[i] = np.random.uniform(MIN_VAL, MAX_VAL)
    b[i] = np.random.uniform(MIN_VAL, MAX_VAL)
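
Most of the generation time is probably spent in the Python-level loop, which calls `np.random.uniform` once per element. A possible alternative, sketched below, asks NumPy for the whole array in one call through the `size` argument. The names `N_DEMO`, `a_demo`, `b_demo` and the seed are assumptions for illustration, and the size is reduced so the sketch runs quickly.

```python
import numpy as np

N_DEMO = 1_000_000  # assumption: smaller than N so the sketch runs quickly
MIN_VAL = -10.0
MAX_VAL = 10.0

# Seeded generator so repeated runs produce the same vectors (seed is arbitrary)
rng = np.random.default_rng(seed=42)

# One call per vector: NumPy fills the whole array internally
a_demo = rng.uniform(MIN_VAL, MAX_VAL, size=N_DEMO)
b_demo = rng.uniform(MIN_VAL, MAX_VAL, size=N_DEMO)

print(a_demo.dtype, a_demo.shape)
```

The resulting arrays have the same dtype (`float64`) and value range as the ones built by the loop, so they could be passed to the multiplication functions unchanged.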

### Running and timing the function

Using the code below, we can measure the **execution time** of **`mulVectorsCPU()`**, which runs on the CPU.

Across **ten test runs** (with the same input), we get:

- Mean time: ~14.6612 seconds
- Minimum time: ~11.6958 seconds
- Maximum time: ~22.6586 seconds

In [4]:
from timeit import default_timer as timer

print('Multiplying vectors...')

startT = timer() # Starting our timer
ab = mulVectorsCPU(a, b)
endT = timer() # Ending our timer
execTime = endT - startT

print(f'Vectors multiplied with success!\nExecution time of {execTime} seconds\n\nResults:')

print(f'a[:5] = {a[:5]}; a[-5:] = {a[-5:]}')
print(f'b[:5] = {b[:5]}; b[-5:] = {b[-5:]}')
print(f'a*b[:5] = {ab[:5]}; a*b[-5:] = {ab[-5:]}')

Multiplying vectors...
Vectors multiplied with success!
Execution time of 10.899645179000004 seconds

Results:
a[:5] = [ 8.46820582 -2.07350933  3.98438312 -5.94779188 -9.18597756]; a[-5:] = [ 6.86080059 -3.18931605 -4.61463927 -2.04224803  8.64487294]
b[:5] = [-3.81603613 -4.08957518 -2.5057891   4.99247346  1.82104168]; b[-5:] = [ 6.02995183 -0.61436131 -1.71797819 -2.6246819  -6.91982981]
a*b[:5] = [-32.31497936   8.47977229  -9.98402378 -29.69419314 -16.72804803]; a*b[-5:] = [ 41.37029705   1.9593924    7.9278496    5.36025143 -59.82104943]
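
The cell above times a single call, while the mean, minimum, and maximum quoted earlier come from ten runs. One way such statistics could be collected is sketched below. The helper `time_runs` is hypothetical (it is not in the original notebook), and a cheap stand-in workload replaces `mulVectorsCPU(a, b)` so the sketch runs on its own.

```python
from statistics import mean
from timeit import default_timer as timer

def time_runs(func, *args, runs=10):
    """Hypothetical helper: call func(*args) `runs` times and return each duration in seconds."""
    durations = []
    for _ in range(runs):
        start = timer()
        func(*args)
        durations.append(timer() - start)
    return durations

# Stand-in workload (assumption); in the notebook this would be
# time_runs(mulVectorsCPU, a, b, runs=10)
durations = time_runs(sum, range(100_000), runs=10)

print(f'Mean time: {mean(durations):.4f} seconds')
print(f'Minimum time: {min(durations):.4f} seconds')
print(f'Maximum time: {max(durations):.4f} seconds')
```

Reporting the minimum alongside the mean helps separate the cost of the function itself from noise caused by other processes on the machine.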


## Multiplying vectors (GPU)

### Declaring the function

In [20]:
from numba import vectorize

@vectorize(['float64(float64, float64)'], target='cuda')
def mulVectorsGPU(a:np.float64, b:np.float64):
    return a*b

In [52]:
print('Multiplying vectors...')

startT = timer() # Starting our timer
ab = mulVectorsGPU(a, b)
endT = timer() # Ending our timer
execTime = endT - startT

print(f'Vectors multiplied with success!\nExecution time of {execTime} seconds\n\nResults:')

print(f'a[:5] = {a[:5]}; a[-5:] = {a[-5:]}')
print(f'b[:5] = {b[:5]}; b[-5:] = {b[-5:]}')
print(f'a*b[:5] = {ab[:5]}; a*b[-5:] = {ab[-5:]}')

Multiplying vectors...
Vectors multiplied with success!
Execution time of 0.1863040700000056 seconds

Results:
a[:5] = [ 8.46820582 -2.07350933  3.98438312 -5.94779188 -9.18597756]; a[-5:] = [ 6.86080059 -3.18931605 -4.61463927 -2.04224803  8.64487294]
b[:5] = [-3.81603613 -4.08957518 -2.5057891   4.99247346  1.82104168]; b[-5:] = [ 6.02995183 -0.61436131 -1.71797819 -2.6246819  -6.91982981]
a*b[:5] = [-32.31497936   8.47977229  -9.98402378 -29.69419314 -16.72804803]; a*b[-5:] = [ 41.37029705   1.9593924    7.9278496    5.36025143 -59.82104943]
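
To put the two printed timings side by side, the sketch below computes their ratio. Both values are copied from the outputs above; each comes from a single run, so the result should be read as a rough indication rather than a benchmark. The variable names are assumptions for illustration.

```python
# Execution times copied from the single-run outputs above (seconds)
cpu_time = 10.899645179000004   # mulVectorsCPU(a, b)
gpu_time = 0.1863040700000056   # mulVectorsGPU(a, b)

# Ratio of CPU time to GPU time for this particular pair of runs
speedup = cpu_time / gpu_time

print(f'GPU run was about {speedup:.1f}x faster than the CPU run')
```

For this pair of runs the GPU version comes out roughly 58 times faster. Since `a` and `b` are passed as ordinary NumPy arrays, the GPU figure presumably also includes the time spent copying data to and from the device.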
