# Numpy

## 1 - Arrays

<p>NumPy arrays are fixed-type, contiguous collections that support flexible indexing and efficient operations. Several data types are available for arrays; the most common are:</p>
<ul>
<li><em>int8, int16, int32, int64</em></li>
<li><em>uint8, uint16, uint32, uint64</em></li>
<li><em>float16, float32, float64</em></li>
<li><em>complex64, complex128</em></li>
<li><em>bool</em></li>
</ul>
<p><strong>Multidimensional arrays</strong></p>
<p>Each array is characterized by a set of&nbsp;<strong>axes&nbsp;</strong>and a&nbsp;<strong>shape</strong>. The axes of an array define its dimensions:</p>
<ul>
<li>a row vector has 1 axis</li>
<li>a 2D matrix has 2 axes</li>
<li>an ND array has N axes</li>
</ul>
<p><img src="./img/n2.png" alt="" width="348" height="100" /></p>
<p>Axes can also be numbered with negative values: the axis with index -1 is always the last one (along the <strong>row</strong>), while the most recently added (outermost) dimension always takes the lowest signed value in both conventions, 0 with positive indexing and -N with negative indexing.</p>
<p>The <strong>shape</strong> of an array is a tuple that specifies the number of elements along each axis. A column vector is a 2D matrix with shape (n, 1).</p>
<p><img src="./img/n3.png" alt="" width="348" height="100" /></p>
<p><img src="./img/n4.png" alt="" width="348" height="100" /></p>
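
As a minimal sketch of the ideas above, the following cell builds a row vector, a column vector and a 3D array and inspects their axes and shapes. The variable names (`row`, `col`, `cube`) are only illustrative.

```python
import numpy as np

row = np.array([1, 2, 3])          # row vector: 1 axis
col = np.array([[1], [2], [3]])    # column vector: a 2D matrix with one column
cube = np.zeros((2, 3, 4))         # 3D array: 3 axes

print(row.ndim, row.shape)     # 1 (3,)
print(col.ndim, col.shape)     # 2 (3, 1)
print(cube.ndim, cube.shape)   # 3 (2, 3, 4)

# Negative axis indices count from the end: axis -1 is the last axis,
# so for a 3D array axis=-1 and axis=2 refer to the same axis.
print(cube.sum(axis=-1).shape)   # (2, 3)
print(cube.sum(axis=2).shape)    # (2, 3)
```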

In [4]:
import numpy as np

matrix = [[1, 2, 3], 
          [4, 5, 6], 
          [7, 8, 9]]

np.array(matrix, dtype=np.uint8)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]], dtype=uint8)

In [2]:
null_matrix = np.zeros((2, 3), dtype=np.int8)
ones_matrix = np.ones((2, 3), dtype=np.int8)    # all ones (not an identity matrix)
full_matrix = np.full((3, 3), 10, dtype=np.int8)

print(null_matrix, end='\n\n')
print(ones_matrix, end='\n\n')
print(full_matrix, end='\n\n')

[[0 0 0]
 [0 0 0]]

[[1 1 1]
 [1 1 1]]

[[10 10 10]
 [10 10 10]
 [10 10 10]]



In [3]:
# 10 evenly spaced samples in [0, 1] (both endpoints included)
linear = np.linspace(0, 1, 10)

# Sequence from 10 to 20 with step 2 (the stop value 21 is excluded)
step_2 = np.arange(10, 21, 2)

# Random (2, 3) matrix from a normal distribution with µ=5, std=2
gauss_matrix = np.random.normal(5, 2, (2, 3))

# Random (3, 4) matrix from a uniform distribution on [0, 1)
uniform_matrix = np.random.random((3, 4))

print(linear, end='\n\n')
print(step_2, end='\n\n')
print(gauss_matrix, end='\n\n')
print(uniform_matrix, end='\n\n')

[0.         0.11111111 0.22222222 0.33333333 0.44444444 0.55555556
 0.66666667 0.77777778 0.88888889 1.        ]

[10 12 14 16 18 20]

[[5.11983799 5.78470741 4.42754161]
 [3.55617633 4.58363865 4.51080059]]

[[0.07694248 0.55163731 0.09155646 0.57208055]
 [0.58878897 0.1634618  0.16378468 0.67434533]
 [0.68032324 0.55294879 0.99821347 0.68774243]]



In [4]:
x = np.array([[2, 3, 4], [8, 6, 7]])
dimensions = x.ndim 
matrix_shape = x.shape
matrix_size = x.size

print(dimensions, matrix_shape, matrix_size)

2 (2, 3) 6


## 2 - Computation on Numpy

<p><strong>Universal Functions (element-wise)</strong></p>
<ul>
<li><strong>Binary</strong> operations with arrays of the <strong>same shape</strong><br />
<ul>
<li>sum &amp; subtraction (+ -)</li>
<li>multiplication &amp; division (* /)</li>
<li>modulus (%)</li>
<li>floor division (//)</li>
<li>exponentiation (**)</li>
<p><img src="./img/n5.png" alt="" width="450" height="100" /></p>
</ul>
</li>
<li><strong>Unary&nbsp;</strong>operations (apply the operation separately to each element of the array)
<ul>
<li>np.abs(x)</li>
<li>np.exp(x), np.log(x), np.log2(x), np.log10(x)</li>
<li>np.sin(x), np.cos(x), np.tan(x), np.arctan(x)</li>
<li>...</li>
<p><img src="./img/n6.png" alt="" width="300" height="100" /></p>
</ul>
</li>
</ul>

In [5]:
x = np.array([[1, 2], [2, 2]])
y = np.array([[3, 4], [5, 6]])

print(f"x * y = \n{x*y}\n")
print(f"exp(x) = \n{np.exp(x)}")

x * y = 
[[ 3  8]
 [10 12]]

exp(x) = 
[[2.71828183 7.3890561 ]
 [7.3890561  7.3890561 ]]
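
The cell above covers multiplication and one unary function. As a short sketch, the remaining binary operators from the list behave the same way, element by element; the input values here are chosen only for illustration.

```python
import numpy as np

x = np.array([[7, 8], [9, 10]])
y = np.array([[2, 3], [4, 5]])

remainder = x % y     # modulus
quotient = x // y     # floor division
power = x ** 2        # exponentiation (the scalar 2 is applied to every element)
ratio = x / y         # true division always yields floats

print(remainder)   # [[1 2]
                   #  [1 0]]
print(quotient)    # [[3 2]
                   #  [2 2]]
print(power)       # [[ 49  64]
                   #  [ 81 100]]
print(ratio.dtype) # float64
```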


<p><strong>Aggregate functions</strong></p>
<ul>
<li>Functions that return a single value from an array
<ul style="list-style-type: square;">
<li>np.min(x), np.max(x) <em>or x.min(), x.max()</em></li>
<li>np.mean(x), np.std(x) <em>or x.mean(), x.std()</em></li>
<li>np.sum(x)&nbsp;<em>or x.sum()</em></li>
<li>np.argmin(x), np.argmax(x)&nbsp;<em>or x.argmin(), x.argmax()</em></li>
</ul>
</li>
<li>Aggregation functions along axes
<ul style="list-style-type: square;">
<li>Specify the operation and the axis</li>
<li>The aggregation dimension is removed from the output</li>
</ul>
</li>
</ul>

In [6]:
x = np.array([[1, 7], [2, 4]])
print(x.argmax(axis=0))
print(x.argmax(axis=1))
print(x.sum(axis=0))
print(x.sum(axis=1))

[1 0]
[1 1]
[ 3 11]
[8 6]
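
For comparison, here is a minimal sketch of the same functions called without an axis, in which case they reduce the whole array to a single value. Note that `argmax` then returns a position in the flattened array; `np.unravel_index` is used here to convert it back to a (row, column) pair.

```python
import numpy as np

x = np.array([[1, 7], [2, 4]])

print(x.min(), x.max())   # 1 7
print(x.sum())            # 14
print(x.mean())           # 3.5

flat_pos = x.argmax()     # index into the flattened array [1, 7, 2, 4]
print(flat_pos)           # 1

# Convert the flat position back to 2D coordinates: row 0, column 1.
row, col = np.unravel_index(flat_pos, x.shape)
print(x[row, col])        # 7
```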


In [7]:
y = np.array([[[1, 2, 3], [4, 5, 6]], 
              [[7, 8, 9], [10, 11, 12]],
              [[13, 14, 15], [16, 17, 18]]])

y.min(axis=-1)

array([[ 1,  4],
       [ 7, 10],
       [13, 16]])

<p><img src="./img/n9.png" alt="" width="500" height="100" /></p>

In [8]:
y.min(axis=1)

array([[ 1,  2,  3],
       [ 7,  8,  9],
       [13, 14, 15]])

<p><img src="./img/n10.png" alt="" width="500" height="100" /></p>

<p><strong>Sorting</strong></p>
<ul>
<li>The sort functions accept the axis along which to sort the array (-1 by default)
<ul style="list-style-type: square;">
<li>np.sort(x) - creates a sorted copy of x</li>
<li>x.sort() - sorts x inplace</li>
</ul>
</li>
<li>It is also possible to return the indices that would sort the array (by default along axis -1)
<ul style="list-style-type: square;">
<li>np.argsort(x)</li>
</ul>
</li>
</ul>

In [9]:
x = np.array([[2, 1, 3], [7, 8, 9]])
np.sort(x)

array([[1, 2, 3],
       [7, 8, 9]])

<p><img src="./img/n11.png" alt="" width="280" height="100" /></p>

In [10]:
x = np.array([[2, 7, 3], [7, 2, 1]])
np.sort(x, axis=0)

array([[2, 2, 1],
       [7, 7, 3]])

<p><img src="./img/n12.png" alt="" width="348" height="100" /></p>

In [11]:
x = np.array([9,1,8,3,7,4,6,5])
np.argsort(x)

array([1, 3, 5, 7, 6, 4, 2, 0])

In [12]:
x = np.array([[2, 1, 3], [7, 8, 9]])
np.argsort(x)

array([[1, 0, 2],
       [0, 1, 2]])

<p><img src="./img/n13.png" alt="" width="490" height="100" /></p>
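
The difference between the sorted copy and the in-place sort, and the link between `argsort` and `sort`, can be sketched as follows (the sample values are arbitrary):

```python
import numpy as np

x = np.array([9, 1, 8, 3])

sorted_copy = np.sort(x)   # new array, x is left untouched
print(x)                   # [9 1 8 3]
print(sorted_copy)         # [1 3 8 9]

order = np.argsort(x)      # indices that would sort x
print(order)               # [1 3 2 0]
print(x[order])            # [1 3 8 9], same values as np.sort(x)

x.sort()                   # sorts x itself and returns None
print(x)                   # [1 3 8 9]
```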

<p><strong>Algebraic operations</strong></p>
<p>np.dot(x, y) returns the matrix product of the arrays passed as parameters (the inner product, for two 1D vectors). The resulting shape depends on the shapes of the arguments, which must be compatible with the operation performed.</p>

In [13]:
x = np.array([1, 2, 3])
y = np.array([0, 2, 1])

np.dot(x,y)

7

<p><img src="./img/n14.png" alt="" width="225" height="100" /></p>

In [14]:
x = np.array([[1, 1], [2, 2]])
y = np.array([2, 3])

np.dot(x,y)

array([ 5, 10])

<p><img src="./img/n15.png" alt="" width="250" height="100" /></p>

In [15]:
x = np.array([[1, 1], [2, 2]])
y = np.array([[2, 2], [1, 1]])

np.dot(x,y)

array([[3, 3],
       [6, 6]])

<p><img src="./img/n16.png" alt="" width="350" height="100" /></p>

### Example

**Sigmoid activation function: $y = sigmoid(x)$**

$$sigmoid(x_i) = \frac{1}{1+exp(-x_i)}$$

<p><img src="./img/sig.png" alt="" width="50%"/></p>

In [23]:
x = np.array([4, -1, 7, 9, 3, -5])

y = 1/ (1 + np.exp(-x))
y

array([0.98201379, 0.26894142, 0.99908895, 0.99987661, 0.95257413,
       0.00669285])

**Softmax activation function:**

$$y_i = \frac{exp(x_i)}{\sum_j{exp(x_j)}}$$

Activation function that normalizes the input vector to a discrete probability distribution (values of the result add to 1).



In [24]:
y_soft = np.exp(x) / np.exp(x).sum()
y_soft

array([5.88673555e-03, 3.96645122e-05, 1.18238244e-01, 8.73669020e-01,
       2.16560899e-03, 7.26480881e-07])
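
A quick check, as a sketch, that the softmax output really behaves like a probability distribution. The shifted variant at the end is an assumption added here and is not part of the original example: subtracting the maximum before exponentiating leaves the result mathematically unchanged and avoids overflow in `np.exp` for large inputs.

```python
import numpy as np

x = np.array([4, -1, 7, 9, 3, -5])

y_soft = np.exp(x) / np.exp(x).sum()
print(y_soft.sum())       # 1.0 up to floating-point rounding
print(y_soft.argmax())    # 3, the position of the largest input (9)

# Assumed variant: shift the inputs by their maximum before exponentiating.
shifted = np.exp(x - x.max())
y_stable = shifted / shifted.sum()
print(np.allclose(y_soft, y_stable))   # True
```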

## 3 - Broadcasting

<p>Broadcasting makes it possible to perform operations between arrays with <strong>different shapes</strong>.</p>

<p><img src="./img/n17.png" alt="" width="500" height="100" /></p>

<ol>
<li>The shape of the array with&nbsp;<strong>fewer dimensions</strong> is&nbsp;<strong>padded&nbsp;</strong>with leading ones</li>
<li>If the size along a dimension is 1 for one of the arrays and &gt;1 for the other, the smaller array is&nbsp;<strong>stretched to match the other array</strong></li>
<li>If there is a dimension where the sizes differ and neither of them is 1, broadcasting&nbsp;<strong>cannot be performed</strong></li>
</ol>
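
The rules above can be sketched as a small pure-Python function that computes the broadcast shape of two arrays. `broadcast_shape` is a hypothetical helper written for illustration, not a NumPy function; the last lines cross-check it against what NumPy actually does.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with leading ones.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for dim_a, dim_b in zip(a, b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)    # sizes agree, or b is stretched (rule 2)
        elif dim_a == 1:
            result.append(dim_b)    # a is stretched (rule 2)
        else:
            # Rule 3: the sizes differ and neither of them is 1.
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((3,), (3, 1)))   # (3, 3)
print(broadcast_shape((2, 3), (3,)))   # (2, 3)

# Cross-check against NumPy itself.
x = np.ones((3,))
y = np.ones((3, 1))
print((x + y).shape)                   # (3, 3)
```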

In [16]:
x = np.array([1, 2, 3])
y = np.array([[11], [12], [13]])
x + y

array([[12, 13, 14],
       [13, 14, 15],
       [14, 15, 16]])

<p><img src="./img/n19.png" alt="" width="500" height="100" /></p>

If the shapes are incompatible, broadcasting can't be executed and Numpy will <strong>raise a ValueError exception</strong>.

In [17]:
x = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([11, 12, 13])
x + y

ValueError: operands could not be broadcast together with shapes (3,2) (3,) 
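
One possible way to make the failing example work, assuming the intent is to add one value of `y` to each row of `x`, is to turn `y` into a column of shape (3, 1) so that the shapes become compatible:

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
y = np.array([11, 12, 13])               # shape (3,), padded to (1, 3)

try:
    x + y                                # (3, 2) vs (1, 3): sizes 2 and 3 clash
except ValueError as err:
    print("ValueError:", err)

# As a column of shape (3, 1), y is stretched along the second axis.
result = x + y.reshape(3, 1)
print(result)
# [[12 13]
#  [15 16]
#  [18 19]]
```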

### Example - Dataset normalization

In [26]:
#input table
n_samples = 100
n_columns = 5
mean = 1
std = 3
X = np.random.normal(mean, std, (n_samples, n_columns))

Apply z-score normalization to each column by subtracting its mean and dividing by its standard deviation.

In [36]:
X.mean(axis=0)

array([1.32082952, 0.92954785, 1.12435215, 0.75414228, 1.088014  ])

In [38]:
X.std(axis=0)

array([3.11140629, 3.14343853, 3.13090558, 2.77163615, 3.21447385])

In [41]:
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

In [33]:
# Column-by-column loop: same result, but slower and more verbose than broadcasting (not recommended)
# X_norm = np.zeros(X.shape)
# for col in range(X.shape[1]):
#     X_norm[:,col] = (X[:,col]-X[:,col].mean())/X[:,col].std()
# X_norm
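
As a sanity check, here is a minimal sketch that verifies the normalized columns have mean 0 and standard deviation 1, and that the broadcast expression and the column-by-column loop produce the same numbers. The seed is chosen arbitrarily so the run is reproducible.

```python
import numpy as np

np.random.seed(0)                          # arbitrary seed, for reproducibility
X = np.random.normal(1, 3, (100, 5))

# X has shape (100, 5); the column statistics have shape (5,) and are
# broadcast across the 100 rows.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

print(X.mean(axis=0).shape)                   # (5,)
print(np.allclose(X_norm.mean(axis=0), 0))    # True
print(np.allclose(X_norm.std(axis=0), 1))     # True

# Explicit loop over the columns, for comparison only.
X_loop = np.zeros(X.shape)
for col in range(X.shape[1]):
    X_loop[:, col] = (X[:, col] - X[:, col].mean()) / X[:, col].std()

print(np.allclose(X_norm, X_loop))            # True
```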

## 4 - Accessing arrays

<ul>
<li>Simple indexing</li>
<li>Slicing</li>
<li>Masking</li>
<li>Fancy indexing</li>
<li>Combined indexing</li>
</ul>

### Simple indexing

In [18]:
x = np.array([[2, 3, 4], [5, 6, 7]])
x[1,2]

7

In [19]:
x[1,2] = 1
x

array([[2, 3, 4],
       [5, 6, 1]])

In [20]:
x[0,-1]

4

### Slicing
<p>Slicing returns a view of the original array: every read and write operation on a view acts in place on the original array.</p>
<p>x[<span style="color: #339966;"> start</span> : <span style="color: #ff6600;">stop</span> : <span style="color: #33cccc;">step</span> , ... ]</p>
<ul>
<li>Creates a view of the elements from&nbsp;<span style="color: #339966;">start</span> (included) to &nbsp;<span style="color: #ff6600;">stop</span> (excluded) with fixed <span style="color: #33cccc;">step</span></li>
<li><span style="color: #000000;">Every update on the view yields an <strong>in-place</strong> update on the original array</span></li>
<li>Each part can be omitted: <strong>omitting start</strong> slices from the beginning of the array, <strong>omitting stop</strong> slices until the end of the array, and <strong>omitting step</strong> takes every element without skipping.</li>
</ul>

In [21]:
x = np.array([[1,2,3],[4,5,6],[7,8,9]])

x[:, 1:]    # or x[0:3, 1:3]  

array([[2, 3],
       [5, 6],
       [8, 9]])

<p>select all rows and the last 2 columns</p>
<p><img src="./img/n22.png" alt="" width="100" height="100" /></p>

In [22]:
x[:2, ::2]    # or x[0:2, 0:3:2]    

array([[1, 3],
       [4, 6]])

<p>select the first two rows and the first and third columns</p>
<p><img src="./img/n23.png" alt="" width="105" height="100" /></p>

In [None]:
view = x[:, 1:]
view

In [None]:
view[:,:] = 0
x    # changes the array in place

To avoid updating the original array, the method .copy() can be used

In [None]:
x = np.array([[1,2,3],[4,5,6],[7,8,9]])
x_cp = x[:, 1:].copy() 
x_cp[:, :] = 0
x
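
Whether two arrays share the same underlying data can also be checked directly. The sketch below uses `np.shares_memory` to contrast a slice (view) with its `.copy()`; the variable names are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

view = x[:, 1:]                  # slice: shares memory with x
independent = x[:, 1:].copy()    # copy: owns its data

print(np.shares_memory(x, view))          # True
print(np.shares_memory(x, independent))   # False

independent[:, :] = 0    # x is unchanged
view[0, 0] = -1          # writes through to x[0, 1]
print(x)
# [[ 1 -1  3]
#  [ 4  5  6]
#  [ 7  8  9]]
```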

### Masking
<p>Masking selects elements of the original array by means of a mask (a boolean array) with the same shape as the original array.</p>
<p>Masking returns a&nbsp;<strong>one-dimensional vector</strong> that is a <strong>copy</strong> of the original array elements selected by the mask (no in-place changes).&nbsp;Masks are usually created with a standard Python comparison operator (&gt;, &gt;=, &lt;, &lt;=, ==, !=).</p>
<p>Moreover, NumPy allows boolean operations between masks with the same shape:</p>
<ul>
<li>&amp; (and)</li>
<li>| &nbsp;(or)</li>
<li>^ (xor)</li>
<li>~ (negation)</li>
</ul>


In [None]:
x = np.array([1.2, 4.1, 1.5, 4.5])
x > 4    

In [None]:
x = np.array([[1.2, 4.1], [1.5, 4.5]])
x <= 1.5

In [None]:
~(x <= 1.5)

In [None]:
x[x <= 1.5]

In [None]:
x[x > 4] = 0
x

Masking does not create views, but copies

In [None]:
x = np.array([1.2, 4.1, 1.5, 4.5])
masked = x[x > 4]
masked[:] = 0
x
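
The boolean operators listed above are not exercised by the cells in this section apart from negation, so here is a short sketch that combines two masks. The thresholds are arbitrary; note that each comparison needs its own parentheses, because `&` and `|` bind more tightly than the comparison operators.

```python
import numpy as np

x = np.array([1.2, 4.1, 1.5, 4.5, 3.0])

between = (x > 1.3) & (x < 4.2)    # and: both conditions hold
outside = (x < 1.3) | (x > 4.2)    # or: at least one condition holds

print(between)       # [False  True  True False  True]
print(x[between])    # [4.1 1.5 3. ]
print(x[outside])    # [1.2 4.5]

# For these values the two masks are complementary.
print(np.array_equal(outside, ~between))   # True
```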

### Fancy indexing
Fancy indexing selects elements by explicitly listing their indices. Like masking, it returns copies (not views) of the original array.

In [None]:
x = np.array([7, 9, 6, 5])
x[[1,3]]

In [None]:
x = np.array([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]])
x[[1, 2]]

<p><img src="./img/n24.png" alt="" width="200" height="100" /></p>

In [40]:
x[[1, 2], [0, 2]] # select indices (1,0) (2,2)

array([3, 8])

<p><img src="./img/n25.png" alt="" width="120" height="100" /></p>

In [34]:
x = np.array([1.2, 4.1, 1.5, 4.5])
x[[1, 3]] = 0    # assignment is allowed
x

array([1.2, 0. , 1.5, 0. ])

In [35]:
x = np.array([1.2, 4.1, 1.5, 4.5])
selection = x[[1,3]]
selection[:] = 0    # assignment does not affect x
x

array([1.2, 4.1, 1.5, 4.5])

### Combined indexing

<p>Numpy allows mixing the different accessing methods introduced so far.</p>

In [36]:
x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
x

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [37]:
x[[True, False, True], 1:] # mask + slicing

array([[1, 2],
       [7, 8]])

<p><img src="./img/n26.png" alt="" width="120" height="100" /></p>

In [26]:
x[[0,2], :2] # fancy + slicing

array([[0, 1],
       [6, 7]])

<p><img src="./img/n27.png" alt="" width="120" height="100" /></p>

In [39]:
x[0, 1:] # simple + slicing

array([1, 2])

<p><img src="./img/n28.png" alt="" width="120" height="100" /></p>

In [38]:
x[[True, False, True], 0] # mask + simple

array([0, 6])

<p><img src="./img/n29.png" alt="" width="120" height="100" /></p>

### Example

In [43]:
# Input table (12 samples with 4 attributes)
X = np.array([[5.1, 3.5, 1.4, 0.2],
              [4.3, 3. , 1.1, 0.1],
              [5. , 3.4, 1.6, 0.4],
              [5.1, 3.4, 1.5, 0.2],
              [6.9, 3.1, 4.9, 1.5],
              [6.7, 3.1, 4.4, 1.4],
              [6. , 2.9, 4.5, 1.5],
              [6.1, 3. , 4.6, 1.4],
              [6.5, 3. , 5.8, 2.2],
              [7.7, 3.8, 6.7, 2.2],
              [7.4, 2.8, 6.1, 1.9],
              [6.8, 3.2, 5.9, 2.3]])

attributes = ['height', 'width', 'intensity', 'weight']

labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
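
The cell above only defines the data. As a sketch of how the indexing methods from this section might be applied to it, the following selects samples by class and attributes by name. The queries are hypothetical exercises chosen for illustration, and the table is repeated so the cell runs on its own.

```python
import numpy as np

X = np.array([[5.1, 3.5, 1.4, 0.2], [4.3, 3.0, 1.1, 0.1],
              [5.0, 3.4, 1.6, 0.4], [5.1, 3.4, 1.5, 0.2],
              [6.9, 3.1, 4.9, 1.5], [6.7, 3.1, 4.4, 1.4],
              [6.0, 2.9, 4.5, 1.5], [6.1, 3.0, 4.6, 1.4],
              [6.5, 3.0, 5.8, 2.2], [7.7, 3.8, 6.7, 2.2],
              [7.4, 2.8, 6.1, 1.9], [6.8, 3.2, 5.9, 2.3]])
attributes = ['height', 'width', 'intensity', 'weight']
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])

# Masking on the rows: all samples of class 1.
class_1 = X[labels == 1]
print(class_1.shape)                # (4, 4)

# Slicing + simple indexing: one column, looked up by attribute name.
width = X[:, attributes.index('width')]
print(width.shape)                  # (12,)

# Mask + slicing: the first two attributes of the class-2 samples.
class_2_size = X[labels == 2, :2]
print(class_2_size.shape)           # (4, 2)

# Aggregation along axis 0: mean of every attribute, class by class.
for c in np.unique(labels):
    print(c, X[labels == c].mean(axis=0))
```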

## 5 - Array transformation

### Concatenation

<p>The concatenate( ) function joins existing arrays along one axis, specified as a parameter (default is 0).&nbsp;The input arrays must have the same shape, except along the concatenation axis.</p>

In [57]:
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[11,12, 13], [14, 15, 16]])

np.concatenate((x, y))

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [11, 12, 13],
       [14, 15, 16]])

<p><img src="./img/n30.png" alt="" width="200" height="100" /></p>

In [54]:
np.concatenate((x,y), axis = 1)

array([[ 1,  2,  3, 11, 12, 13],
       [ 4,  5,  6, 14, 15, 16]])

<p><img src="./img/n31.png" alt="" width="200" height="100" /></p>
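
A short sketch of the shape requirement: the arrays may differ in size along the concatenation axis, but must agree along all the others. The array `z` is introduced here only for illustration.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
z = np.array([[7, 8, 9]])              # shape (1, 3)

# Along axis 0 the row counts may differ; the column counts (3) agree.
stacked = np.concatenate((x, z), axis=0)
print(stacked.shape)                   # (3, 3)

# Along axis 1 the row counts (2 vs 1) would have to agree, so this fails.
try:
    np.concatenate((x, z), axis=1)
except ValueError as err:
    print("ValueError:", err)
```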

<p>The hstack( ) and vstack( ) functions work in a similar way. These functions make most sense for arrays with up to 3 dimensions, for instance pixel data with a height (first axis), width (second axis), and r/g/b channels (third axis).</p>

In [58]:
np.hstack((x,y))

array([[ 1,  2,  3, 11, 12, 13],
       [ 4,  5,  6, 14, 15, 16]])

In [59]:
np.vstack((x,y))

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [11, 12, 13],
       [14, 15, 16]])

<p><img src="./img/n32.png" alt="" width="370" height="100" /></p>

<p>Moreover, vstack also allows stacking 1D vectors along a new axis (which concatenate does not do).</p>

In [56]:
x = np.array([1, 2, 3])
y = np.array([11,12, 13])
np.vstack((x,y))


array([[ 1,  2,  3],
       [11, 12, 13]])

<p><img src="./img/n33.png" alt="" width="200" height="100" /></p>

### Splitting
<p>The split( ) function takes an <strong>array</strong> and the <strong>indices</strong>&nbsp;at which to split it, and&nbsp;returns a list of NumPy arrays. The functions hsplit( ) and vsplit( ) work in a similar way along the horizontal (column) and vertical (row) directions; passing an integer instead of a list of indices splits the array into that many equal-sized parts.</p>

In [112]:
x = np.array([7, 7, 9, 9, 8, 8])
np.split(x, [2,4])    # split before indices 2 and 4

[array([7, 7]), array([9, 9]), array([8, 8])]

<p><img src="./img/n34.png" alt="" width="300" height="100" /></p>

In [113]:
x = np.array([[1, 2, 3, 4, 5, 6], [11, 12, 13, 14, 15, 16], 
              [21, 22, 23, 24, 25, 26], [31, 32, 33, 34, 35, 36]])
x

array([[ 1,  2,  3,  4,  5,  6],
       [11, 12, 13, 14, 15, 16],
       [21, 22, 23, 24, 25, 26],
       [31, 32, 33, 34, 35, 36]])

In [114]:
np.hsplit(x, 2)

[array([[ 1,  2,  3],
        [11, 12, 13],
        [21, 22, 23],
        [31, 32, 33]]),
 array([[ 4,  5,  6],
        [14, 15, 16],
        [24, 25, 26],
        [34, 35, 36]])]

In [118]:
np.hsplit(x, 3)

[array([[ 1,  2],
        [11, 12],
        [21, 22],
        [31, 32]]),
 array([[ 3,  4],
        [13, 14],
        [23, 24],
        [33, 34]]),
 array([[ 5,  6],
        [15, 16],
        [25, 26],
        [35, 36]])]

In [119]:
np.vsplit(x,2)

[array([[ 1,  2,  3,  4,  5,  6],
        [11, 12, 13, 14, 15, 16]]),
 array([[21, 22, 23, 24, 25, 26],
        [31, 32, 33, 34, 35, 36]])]
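
As a sketch of how these functions relate: `hsplit` gives the same result as `split` along axis 1, it also accepts a list of indices, and an integer that does not divide the axis evenly raises a `ValueError`. The input array here is arbitrary.

```python
import numpy as np

x = np.arange(24).reshape(4, 6)    # 4 rows, 6 columns

# hsplit(x, 2) is the same as split(x, 2, axis=1).
left, right = np.split(x, 2, axis=1)
h_left, h_right = np.hsplit(x, 2)
print(np.array_equal(left, h_left), np.array_equal(right, h_right))   # True True

# A list of indices gives parts of different widths: columns 0, 1-3, 4-5.
a, b, c = np.hsplit(x, [1, 4])
print(a.shape, b.shape, c.shape)   # (4, 1) (4, 3) (4, 2)

# 6 columns cannot be divided into 4 equal parts.
try:
    np.hsplit(x, 4)
except ValueError as err:
    print("ValueError:", err)
```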

### Reshaping
<p>The reshape( ) method is used to change the shape of an array. The size of the array on which the method is called must match the size of the target shape.</p>

In [128]:
x = np.arange(6)
x

array([0, 1, 2, 3, 4, 5])

In [129]:
x.reshape((2,3))

array([[0, 1, 2],
       [3, 4, 5]])

<p><img src="./img/n36.png" alt="" width="400" height="100" /></p>
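
A brief sketch of the size constraint: any target shape with the same total number of elements works, one dimension can be given as -1 to let NumPy infer it, and a mismatching size raises a `ValueError`.

```python
import numpy as np

x = np.arange(6)

tall = x.reshape((3, 2))
print(tall.shape)                  # (3, 2)

inferred = x.reshape((2, -1))      # -1: NumPy infers this dimension (6 / 2 = 3)
print(inferred.shape)              # (2, 3)

try:
    x.reshape((4, 2))              # 6 elements cannot fill a (4, 2) array
except ValueError as err:
    print("ValueError:", err)
```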

### Adding new dimensions

In [130]:
x = np.array([[1, 2, 3], [4, 5, 6]])
x

array([[1, 2, 3],
       [4, 5, 6]])

In [132]:
x[np.newaxis, :, :]

array([[[1, 2, 3],
        [4, 5, 6]]])

<p><img src="./img/n37.png" alt="" width="400" height="100" /></p>

In [133]:
x[:, np.newaxis, :]

array([[[1, 2, 3]],

       [[4, 5, 6]]])

In [134]:
x[:, :, np.newaxis]

array([[[1],
        [2],
        [3]],

       [[4],
        [5],
        [6]]])
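
The effect of `np.newaxis` is easiest to see on the shapes. The sketch below prints the shape for each position of the new axis, and then uses a new axis to turn a 1D vector into a column so that broadcasting produces a 2D table (the vector `v` is introduced only for illustration).

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

print(x[np.newaxis, :, :].shape)   # (1, 2, 3)
print(x[:, np.newaxis, :].shape)   # (2, 1, 3)
print(x[:, :, np.newaxis].shape)   # (2, 3, 1)

# Adding an axis of length 1 is equivalent to a reshape.
print(np.array_equal(x[:, :, np.newaxis], x.reshape(2, 3, 1)))   # True

# A column of shape (3, 1) times a vector of shape (3,) broadcasts to (3, 3).
v = np.array([1, 2, 3])
table = v[:, np.newaxis] * v
print(table)
# [[1 2 3]
#  [2 4 6]
#  [3 6 9]]
```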