## Exercise 1

Create vectorized versions of the log and exp math functions for the 1D array A = [2, 5, 10, 3, 8].

Results should be: 
+ [0.6931472 1.609438  2.3025851 1.0986123 2.0794415]
+ [7.3890562e+00 1.4841316e+02 2.2026465e+04 2.0085537e+01 2.9809580e+03]
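
The expected values above can be reproduced with NumPy's built-in ufuncs, which gives a reference to compare a vectorized implementation against. This is only a sketch for cross-checking: the names `A_ref`, `ref_log` and `ref_exp` are illustrative and not part of the exercise.

```python
import numpy as np

# Same input as in the exercise statement
A_ref = np.array([2, 5, 10, 3, 8], dtype=np.int32)

# NumPy's own ufuncs, cast to float32 to match the precision of the listed results
ref_log = np.log(A_ref).astype(np.float32)
ref_exp = np.exp(A_ref).astype(np.float32)

print(ref_log)
print(ref_exp)
```

A custom vectorized function can then be compared with `np.allclose(result, ref_log)` rather than by reading the printed digits.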

In [None]:
import os
import numpy as np
import math
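# Enable Numba's CUDA simulator so the CUDA exercises also run without a GPU.
# This must be set before numba is imported.
os.environ["NUMBA_ENABLE_CUDASIM"] = "1"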
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
from numba import vectorize, float32, int32

A = np.array([2, 5, 10, 3, 8], dtype='int32')

@vectorize([float32(int32)], target='parallel', fastmath=True)
def vec_log(x):
    y = np.log(x)
    return y

@vectorize([float32(int32)], target='parallel', fastmath=True)
def vec_exp(x):
    y = np.exp(x)
    return y

print('Logarithm of A:', vec_log(A))
print('exponential of A:', vec_exp(A))

## Exercise 2
Compute the value of a Gaussian probability density function at $x$, with $mean = 1$ and $\sigma = 1$, for $size = 100000$ values of $x$ evenly spaced between the lower and upper bounds $-3$ and $3$.
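
For reference, the density being evaluated is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)$, where $\mu$ denotes the mean. A scalar version in plain Python, sketched below, may help to sanity-check a vectorized implementation at a few points. The helper name `gaussian_pdf` is hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Scalar reference for the Gaussian density (illustrative helper)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# The density peaks at x = mean, where it equals 1 / (std * sqrt(2 * pi))
peak = gaussian_pdf(1.0, 1.0, 1.0)
# Value at the lower bound of the interval used in the exercise
edge = gaussian_pdf(-3.0, 1.0, 1.0)
print(peak, edge)
```

With $mean = 1$ the curve is not centered in the interval $(-3, 3)$, so the plot should peak at $x = 1$ and be much lower at $x = -3$ than at $x = 3$.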

In [None]:
from numba import vectorize, float64, float32

N = 100000
mean = 1
std = 1

x = np.linspace(-3, 3, N, dtype='float32')

@vectorize([float64(float32, float32, float32)], target='parallel', fastmath=True) 
def vec_gaussian(x, mean, std):
    return 1/(std*np.sqrt(2*np.pi)) * np.exp((-1/2)*((x-mean)/std)**2)

plt.plot(x,vec_gaussian(x, mean, std))
plt.show()

## Exercise 3

Create a "zero suppression" function. A common operation when working with waveforms is to force all sample values below a certain absolute magnitude to zero, as a way to eliminate low-amplitude noise.
Plot the data before and after applying the zero_suppress function.

$threshold = 15$
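
The rule itself (samples with absolute value strictly below the threshold become zero) can also be written with a NumPy boolean mask. The sketch below is one possible reference to compare a compiled version against, not the intended solution; the helper name `zero_suppress_numpy` and the small `demo` array are made up for illustration.

```python
import numpy as np

def zero_suppress_numpy(samples, threshold):
    """Return a copy in which every sample with |value| < threshold is set to 0."""
    out = samples.copy()
    out[np.abs(out) < threshold] = 0
    return out

# Small hand-made waveform: values exactly at the threshold are kept
demo = np.array([3, -20, 14, 15, -15, 0, 120], dtype=np.int16)
suppressed_demo = zero_suppress_numpy(demo, 15)
print(suppressed_demo.tolist())  # [0, -20, 0, 15, -15, 0, 120]
```

Because the helper works on a copy, the original array is left untouched and both versions remain available for plotting.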

In [None]:
from numba import jit

n = 100000
noise = np.random.normal(size=n) * 3
pulses = np.maximum(np.sin(np.arange(n) / (n / 23)) - 0.3, 0.0)
data = ((pulses * 300) + noise).astype(np.int16)

#put your code here
plt.plot(np.arange(n), data, color='r')
plt.title('Original noisy data')

@jit
def zero_suppression(x, thr):
    for i in range(len(x)):
        if np.abs(x[i]) < thr:
            x[i] = 0
    
suppressed_data = data.copy()
zero_suppression(suppressed_data, 15)

plt.figure()
plt.plot(np.arange(n), suppressed_data)
plt.title('Suppressed noisy data')

## Exercise 4

Calculate the Sigmoid kernel between the matrices X and Y defined below. The Sigmoid kernel is defined as:

$k(x,y) = \tanh(\alpha x^T y + c)$
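
A plain NumPy version of the formula can serve as a reference when checking a compiled implementation. The sketch below applies the formula literally to whole matrices, which amounts to treating the columns of each matrix as the vectors $x$ and $y$; that reading is an assumption, and the names `sigmoid_kernel_numpy`, `X_demo` and `Y_demo` are illustrative.

```python
import numpy as np

def sigmoid_kernel_numpy(X, Y, alpha=1.0, c=0.0):
    """Entry (i, j) is tanh(alpha * <X[:, i], Y[:, j]> + c)."""
    return np.tanh(alpha * (X.T @ Y) + c)

rng = np.random.default_rng(0)   # seeded so the example is repeatable
X_demo = rng.random((3, 3))
Y_demo = rng.random((3, 3))

K = sigmoid_kernel_numpy(X_demo, Y_demo)
print(K.shape)  # (3, 3)
```

Since $\tanh$ maps into $(-1, 1)$, every entry of the result must lie in that interval, which is a quick check on any implementation.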

In [None]:
from numba import jit

X = np.random.rand(3,3)
Y = np.random.rand(3,3)

@jit
def sigmoid_kernel(x, y, alpha, c):
    return np.tanh(alpha * np.dot(x.T,y) + c)

print(sigmoid_kernel(X,Y,1,0))

## Exercise 5

Create a kernel function similar to the ```double_kernel``` seen during the lecture: a new function that takes a 3-dimensional matrix as input and calculates the $cos$ of each element. A CUDA kernel cannot return a value, so the result is written back into the input matrix. The shape of the matrix must be $256 \times 256 \times 256$. The matrix can be randomly generated.
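
Launching a kernel over a 3D array requires enough blocks along each axis to cover every element. The arithmetic can be checked in plain Python before touching the GPU; in the sketch below the block shape `(8, 4, 4)` is just one assumed choice, and all variable names are illustrative.

```python
import math

shape = (256, 256, 256)
threads_per_block = (8, 4, 4)   # 8 * 4 * 4 = 128 threads per block (assumed choice)

# Round up along each axis so that blocks * threads >= array extent
blocks_per_grid = tuple(math.ceil(n / t) for n, t in zip(shape, threads_per_block))

# Number of threads actually launched along each axis
covered = tuple(b * t for b, t in zip(blocks_per_grid, threads_per_block))

print(blocks_per_grid)  # (32, 64, 64)
print(covered)          # (256, 256, 256)
```

When the extents are not exact multiples of the block shape, `covered` exceeds `shape`, which is why the kernel needs a bounds check on the thread indices before accessing the array.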

In [None]:
from numba import cuda

@cuda.jit
def kernel(io_array):
    x, y, z = cuda.grid(3)
    if x < io_array.shape[0] and y < io_array.shape[1] and z < io_array.shape[2]:
        io_array[x,y,z] = math.cos(io_array[x,y,z])


mat = np.random.random((256,256,256))

# Configure the blocks
threadsperblock = (8,4,4)
blockspergrid_x = int(math.ceil(mat.shape[0] / threadsperblock[0]))
blockspergrid_y = int(math.ceil(mat.shape[1] / threadsperblock[1]))
blockspergrid_z = int(math.ceil(mat.shape[2] / threadsperblock[2]))
blockspergrid = (blockspergrid_x, blockspergrid_y, blockspergrid_z)

kernel[blockspergrid, threadsperblock](mat)
print(mat)

## Exercise 6

Create a matrix multiplication kernel function, called ```matmul```, that takes as input two 2D matrices:
+ A of shape $24 \times 12$
+ B of shape $12 \times 22$

and that computes the product and puts the result into a third matrix C of shape $24 \times 22$.

A and B must be randomly generated and only int values are allowed.
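
The product of a $24 \times 12$ matrix and a $12 \times 22$ matrix has shape $24 \times 22$, and each entry is $C_{ij} = \sum_k A_{ik} B_{kj}$. A NumPy reference, sketched below with illustrative names (`A_ref`, `B_ref`, `C_ref`), gives something to compare the kernel's output against.

```python
import numpy as np

rng = np.random.default_rng(0)                 # seeded so the example is repeatable
A_ref = rng.integers(1, 10, size=(24, 12))     # random ints in [1, 9]
B_ref = rng.integers(1, 10, size=(12, 22))

# Reference product computed by NumPy
C_ref = A_ref @ B_ref
print(C_ref.shape)  # (24, 22)

# One entry spelled out as the explicit sum over k, as a kernel thread would compute it
c00 = sum(A_ref[0, k] * B_ref[k, 0] for k in range(A_ref.shape[1]))
```

A kernel result `C` computed from the same inputs can then be compared with `np.array_equal(C, C_ref)`.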


In [None]:
from numba import cuda
import numpy as np
import math

# complete the code
@cuda.jit
def matmul(A, B, C):
    """Perform matrix multiplication of C = A * B
    """
    row, col = cuda.grid(2)
    if row < C.shape[0] and col < C.shape[1]:
        tmp = 0.
        for k in range(A.shape[1]):
            tmp += A[row, k] * B[k, col]
        C[row, col] = tmp

# Initialize the data arrays
A = np.random.randint(1,10,(24,12))
B = np.random.randint(1,10,(12,22))
C = np.zeros((24,22))

# Configure the blocks
threadsperblock = (4,2)
blockspergrid_x = int(math.ceil(C.shape[0] / threadsperblock[0]))
blockspergrid_y = int(math.ceil(C.shape[1] / threadsperblock[1]))
blockspergrid = (blockspergrid_x, blockspergrid_y)

matmul[blockspergrid, threadsperblock](A, B, C)
print(C)