## Machine Learning - Spring 2025
### Dr. Ilia Tetin

### Seminar 01

The numpy library is a cornerstone of the Python data science ecosystem: it provides efficient numerical operations on multidimensional arrays. Its operations are vectorized, meaning they are applied to whole arrays in compiled code rather than element by element in a Python loop, which is typically much faster than explicit for-loop iteration. numpy covers a wide range of mathematical tasks, from basic arithmetic to linear algebra and statistics, and provides tools for array creation, reshaping, slicing, and indexing. Whether you are working on scientific computing, machine learning, or any other task that needs efficient numerical computation, numpy is an essential library to master.

Docs: https://numpy.org/

In [4]:
import numpy as np

In [5]:
vec = np.array([[1, 2], 
                [3, 4], 
                [5, 6]])
vec

array([[1, 2],
       [3, 4],
       [5, 6]])

In [33]:
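a = np.array([1, 2, 3])
b = a * 2
# Pair the elements up: each (a[i], b[i]) becomes a row
r1 = np.array(list(zip(a, b)))
# Stack the arrays on top of each other: each input array becomes a row
r2 = np.vstack((a, b))

print(r1, r2, sep=' \n')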

[[1 2]
 [2 4]
 [3 6]] 
[[1 2 3]
 [2 4 6]]


In [3]:
print(vec)

[[1 2]
 [3 4]
 [5 6]]


In [4]:
vec.dtype

dtype('int32')

In [5]:
type(vec)

numpy.ndarray

![image.png](attachment:image.png)

Dimensionality

In [36]:
vec

array([[1, 2],
       [3, 4],
       [5, 6]])

In [34]:
vec.shape

(3, 2)

In [35]:
vec.ndim

2

The `axis` parameter of many numpy functions specifies the axis along which the function is applied; that axis is collapsed in the result.

`axis=0`: the function is applied column-wise. For a 2D array, the operation runs down each column and produces one result per column.

`axis=1`: the function is applied row-wise. For a 2D array, the operation runs across each row and produces one result per row.

The same idea extends to higher-dimensional arrays: `axis=0` is the outermost dimension, and larger axis numbers refer to progressively deeper dimensions.
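To see how `axis` behaves beyond two dimensions, here is a minimal sketch on a small 3D array. The array `cube` and its shape are illustrative assumptions, not part of the seminar data.

```python
import numpy as np

# A hypothetical 3D array with shape (2, 3, 4), used only for illustration
cube = np.arange(24).reshape(2, 3, 4)

# Reducing over an axis removes that axis from the shape
print(cube.sum(axis=0).shape)  # (3, 4)
print(cube.sum(axis=1).shape)  # (2, 4)
print(cube.sum(axis=2).shape)  # (2, 3)

# keepdims=True keeps the reduced axis with length 1
print(cube.sum(axis=1, keepdims=True).shape)  # (2, 1, 4)
```

A quick way to remember it: the axis you pass is the one that disappears from the shape.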

In [8]:
np.sum(vec)

21

In [9]:
np.sum(vec, axis=0)

array([ 9, 12])

In [10]:
np.sum(vec, axis=1)

array([ 3,  7, 11])

Transposing is a fundamental operation in linear algebra: it flips a matrix over its main diagonal, so the row and column indices of every element are swapped. For the array `vec` used here, transposing changes the shape from (3, 2) to (2, 3).

In [11]:
vec.T

array([[1, 3, 5],
       [2, 4, 6]])

In [12]:
vec.transpose()

array([[1, 3, 5],
       [2, 4, 6]])

In [37]:
vec

array([[1, 2],
       [3, 4],
       [5, 6]])

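Changing the shape of arrays in numpy is a common operation that rearranges the elements of an array into a new shape without changing its data. The new shape must hold the same number of elements; one dimension may be given as -1, in which case numpy infers it from the others.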

In [38]:
vec.reshape(2, 3)

array([[1, 2, 3],
       [4, 5, 6]])

In [39]:
vec.reshape(-1, 3)

array([[1, 2, 3],
       [4, 5, 6]])

In [40]:
vec.reshape(2, -1)

array([[1, 2, 3],
       [4, 5, 6]])

In [48]:
vec.reshape(-1, 2)

array([[1, 2],
       [3, 4],
       [5, 6]])

In [59]:
vec1 = np.vstack((vec, vec, vec, vec))
vec1.reshape(-1, 12)

array([[1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6],
       [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]])
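The reshape calls above rely on a few behaviours that are easy to check directly. The sketch below rebuilds `vec` locally so it runs on its own; the value 100 is an arbitrary marker chosen for illustration.

```python
import numpy as np

vec = np.array([[1, 2],
                [3, 4],
                [5, 6]])

# -1 asks numpy to infer that dimension from the total number of elements
flat = vec.reshape(-1)
print(flat.shape)  # (6,)

# For a contiguous array like this one, reshape returns a view,
# so writing through the reshaped array also changes the original
flat[0] = 100
print(vec[0, 0])  # 100

# A shape with a different number of elements raises ValueError
try:
    vec.reshape(4, 2)
except ValueError as err:
    print("cannot reshape:", err)
```

If an independent array is needed, calling `.copy()` on the reshaped result avoids the shared memory.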

Indexing

In [17]:
# Accessing the second column of the array
vec[:, 1]

array([2, 4, 6])

In [18]:
# Accessing the third row of the array
vec[2, :]

array([5, 6])

In [19]:
# Accessing the first element of the second row
# (the slice 1:2 returns a 1D array rather than a scalar; use vec[1:2, 0:1] to keep a 2D result)
vec[1:2, 0]

array([3])

In [20]:
# Accessing every other row in the array
vec[::2, :]

array([[1, 2],
       [5, 6]])

In [21]:
vec + 1

array([[2, 3],
       [4, 5],
       [6, 7]])

In [22]:
vec * 2

array([[ 2,  4],
       [ 6,  8],
       [10, 12]])

In [23]:
vec**2

array([[ 1,  4],
       [ 9, 16],
       [25, 36]])

In [24]:
vec + vec**2

array([[ 2,  6],
       [12, 20],
       [30, 42]])

In [25]:
vec * vec**2

array([[  1,   8],
       [ 27,  64],
       [125, 216]])

In [26]:
np.sin(vec)

array([[ 0.84147098,  0.90929743],
       [ 0.14112001, -0.7568025 ],
       [-0.95892427, -0.2794155 ]])

![image.png](attachment:image.png)

In [60]:
vec

array([[1, 2],
       [3, 4],
       [5, 6]])

In [62]:
(vec**2).T

array([[ 1,  9, 25],
       [ 4, 16, 36]])

In [27]:
vec.dot((vec**2).T)

array([[  9,  41,  97],
       [ 19,  91, 219],
       [ 29, 141, 341]])

In [28]:
vec @ (vec**2).T

array([[  9,  41,  97],
       [ 19,  91, 219],
       [ 29, 141, 341]])

Broadcasting: https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html

In [29]:
vec

array([[1, 2],
       [3, 4],
       [5, 6]])

In [30]:
np.arange(3).reshape(3, 1)

array([[0],
       [1],
       [2]])

In [31]:
vec + np.arange(3).reshape(3, 1)

array([[1, 2],
       [4, 5],
       [7, 8]])
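The addition above works because numpy compares shapes from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. A minimal sketch of that rule follows; the arrays `col` and `row` are illustrative names, not part of the seminar code.

```python
import numpy as np

vec = np.array([[1, 2],
                [3, 4],
                [5, 6]])

col = np.arange(3).reshape(3, 1)  # shape (3, 1): stretched across the 2 columns
row = np.array([10, 20])          # shape (2,): stretched across the 3 rows

print((vec + col).shape)  # (3, 2)
print(vec + row)
# [[11 22]
#  [13 24]
#  [15 26]]

# Shapes (3, 2) and (3,) do not match on the trailing dimension (2 vs 3)
try:
    vec + np.arange(3)
except ValueError as err:
    print("not broadcastable:", err)
```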

Boolean arrays:

In [32]:
is_even = vec % 2 == 0
print(is_even)

[[False  True]
 [False  True]
 [False  True]]


In [33]:
np.sum(is_even)

3

Boolean indexing in numpy is a powerful feature that enables you to select elements from an array using another array of boolean values (True or False) of the same shape. 

In [34]:
vec[vec % 2 == 0]

array([2, 4, 6])

Creating an array of zeros is a common task in numpy, useful for initializing arrays to a known value before filling them with data, among other uses.

In [35]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [36]:
np.ones((3, 2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [37]:
np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

Concatenation is the process of combining two or more arrays into a single array. In numpy, there are several functions to concatenate arrays along different axes.

In [38]:
vec

array([[1, 2],
       [3, 4],
       [5, 6]])

In [39]:
np.hstack((vec, np.zeros(vec.shape)))

array([[1., 2., 0., 0.],
       [3., 4., 0., 0.],
       [5., 6., 0., 0.]])

In [40]:
np.vstack((vec, np.zeros(vec.shape)))

array([[1., 2.],
       [3., 4.],
       [5., 6.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])
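For 2D inputs such as these, `np.hstack` and `np.vstack` give the same result as `np.concatenate` with an explicit `axis`. The sketch below checks that equivalence; it rebuilds `vec` locally so it is self-contained.

```python
import numpy as np

vec = np.array([[1, 2],
                [3, 4],
                [5, 6]])
zeros = np.zeros(vec.shape)

# axis=1 joins side by side (like hstack), axis=0 joins top to bottom (like vstack)
side_by_side = np.concatenate((vec, zeros), axis=1)
stacked = np.concatenate((vec, zeros), axis=0)

print(np.array_equal(side_by_side, np.hstack((vec, zeros))))  # True
print(np.array_equal(stacked, np.vstack((vec, zeros))))       # True

# Mixing int and float inputs promotes the result to float
print(side_by_side.dtype)  # float64
```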

Random number generation:

In [41]:
np.random.rand(2, 3)

array([[0.89998638, 0.1307108 , 0.2634704 ],
       [0.79565063, 0.38233248, 0.92461563]])

In [42]:
np.random.seed(2019)
np.random.rand(2, 3)

array([[0.90348221, 0.39308051, 0.62396996],
       [0.6378774 , 0.88049907, 0.29917202]])

In [43]:
np.random.randn(3, 2)

array([[ 0.57376143,  0.28772767],
       [-0.23563426,  0.95349024],
       [-1.6896253 , -0.34494271]])

In [44]:
np.random.normal(2, 1, size=3)

array([2.0169049 , 1.48501648, 2.24450929])

In [45]:
np.random.randint(5, 10, size=3)

array([8, 6, 8])
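The cells above use the legacy `np.random.*` functions with a global seed. As an alternative sketch, the same kinds of draws can be made with a `Generator` object created by `np.random.default_rng`. The seed 2019 merely mirrors the cell above; the numbers produced will differ from the legacy outputs shown.

```python
import numpy as np

# A seeded Generator; its state is local to this object rather than global
rng = np.random.default_rng(2019)

uniform = rng.random((2, 3))                  # uniform on [0, 1), like np.random.rand(2, 3)
std_normal = rng.standard_normal((3, 2))      # like np.random.randn(3, 2)
normal = rng.normal(loc=2, scale=1, size=3)   # mean 2, standard deviation 1
integers = rng.integers(5, 10, size=3)        # integers from 5 to 9; 10 is excluded

# Re-creating the generator with the same seed reproduces the same numbers
again = np.random.default_rng(2019).random((2, 3))
print(np.array_equal(uniform, again))  # True
```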

Efficiency:

In [46]:
n = 300
A = np.random.rand(n, n)
B = np.random.rand(n, n)

In [47]:
%%time
C = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            C[i, j] += A[i, k] * B[k, j]

CPU times: total: 5.5 s
Wall time: 11.6 s


In [48]:
%%time
C = A @ B

CPU times: total: 0 ns
Wall time: 26 ms
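`%%time` only works inside IPython or Jupyter. The sketch below repeats the comparison with the standard-library timer and also checks that both approaches give the same matrix. It assumes a smaller `n` (60 instead of 300) so the pure-Python loop finishes quickly; actual timings depend on the machine, so none are quoted here.

```python
import time
import numpy as np

n = 60  # smaller than the n = 300 above, chosen so the loop runs quickly
rng = np.random.default_rng(0)
A = rng.random((n, n))
B = rng.random((n, n))

# Triple loop: one Python-level multiply-add per (i, j, k)
start = time.perf_counter()
C_loop = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            C_loop[i, j] += A[i, k] * B[k, j]
loop_seconds = time.perf_counter() - start

# Vectorized matrix product
start = time.perf_counter()
C_fast = A @ B
matmul_seconds = time.perf_counter() - start

print(np.allclose(C_loop, C_fast))  # True
print(f"loop: {loop_seconds:.4f} s, matmul: {matmul_seconds:.6f} s")
```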


#### Practice:

1. Reverse a one-dimensional array (make its elements go in the reverse order).
2. Find the maximum odd element in the array.
3. Replace all odd elements in the array with your favorite number.
4. Create an array of the first $n$ odd numbers, written in descending order. For example, if $n=5$, the answer will be `array([9, 7, 5, 3, 1])`. Functions that may be useful in solving: ``.arange()``
5. Find the elements of a given array that are closest to and furthest from a given number. For example, for the array `array([0, 1, 2, 3, 4])` and the number 1.33, the answer is (1, 4). Functions that may be useful in solving: ``.abs(), .argmax(), .argmin()``
6. Compute the antiderivative (integral) of a given polynomial (use your favorite number as the constant). For example, for the coefficient array `array([4, 6, 0, 1])`, corresponding to the polynomial $4x^3 + 6x^2 + 1$, the output is the coefficient array `array([1, 2, 0, 1, -2])`, corresponding to the polynomial $x^4 + 2x^3 + x - 2$. Functions that may be useful in solving: `.append()`
7. Building on task 6, compute the first derivative of a given polynomial at a given point.

In [72]:
def square(x):
    return x ** 2

# Create a NumPy array
arr = np.array([1, 2, 3, 4, 5])

# Vectorize the function
vectorized_square = np.vectorize(square)
result = vectorized_square(arr)
result

array([ 1,  4,  9, 16, 25])

In [74]:
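names = np.array(['Jim', 'Luke', 'Josh', 'Pete'])
# np.vectorize wraps a Python function so it can be called on arrays
# (a convenience wrapper around a loop, not a speed-up)
k = np.vectorize(lambda s: s[0])
k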

<numpy.vectorize at 0x22319670690>

In [75]:
k(names)

array(['J', 'L', 'J', 'P'], dtype='<U1')

In [81]:
# Boolean mask: True where the name starts with 'J'
mask = np.vectorize(lambda s: s[0])(names) == 'J'
print(mask,
      names[mask], sep='\n')

[ True False  True False]
['Jim' 'Josh']
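For string tests like this one, numpy also ships elementwise string helpers, so the lambda is not strictly needed. A minimal sketch using `np.char.startswith` follows; the variable name `starts_with_j` is an illustrative choice.

```python
import numpy as np

names = np.array(['Jim', 'Luke', 'Josh', 'Pete'])

# Elementwise prefix test, returning a boolean array usable as a mask
starts_with_j = np.char.startswith(names, 'J')

print(starts_with_j)         # [ True False  True False]
print(names[starts_with_j])  # ['Jim' 'Josh']
```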


In [95]:
x = np.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [96]:
x[(x%2!=0) & (x%7!=0)]

array([1, 3, 5, 9])

In [97]:
[i for i in range(0, 10) if i%2!=0 and i%7!=0 ]

[1, 3, 5, 9]

In [100]:
[i**2 for i in range(1, 10) if i==5 or i==9]

[25, 81]

In [105]:
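# Boolean-mask assignment: elements equal to 10 become 2. The output reflects the state of `vec`
# when this out-of-order cell ran; for the original [[1, 2], [3, 4], [5, 6]] the mask is all False.
vec[vec == 10] = 2
vec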

array([[2, 2],
       [2, 2],
       [2, 2]])

In [116]:
np.arange(1, 6+1).reshape(2, -1).T

array([[1, 4],
       [2, 5],
       [3, 6]])

## Challenge 9

In [771]:
np.repeat([range(3, 10+1)], 5, axis=0)

array([[ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10]])

In [271]:
np.array([list(range(3,10+1)) for i in range(1,5+1)])

array([[ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10]])

In [269]:
np.array([range(3,10+1)]*5)

array([[ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10],
       [ 3,  4,  5,  6,  7,  8,  9, 10]])

## Challenge 8

In [299]:
[[13], [14]]*5

[[13], [14], [13], [14], [13], [14], [13], [14], [13], [14]]

In [300]:
np.repeat([[13], [14]]*5, 4, axis=1).T

array([[13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14]])

In [301]:
[13, 14]*5

[13, 14, 13, 14, 13, 14, 13, 14, 13, 14]

In [302]:
np.array([[13, 14]*5 for i in range(1,4+1)])

array([[13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14]])

In [310]:
[[13]*2, [14]*2]*3

[[13, 13], [14, 14], [13, 13], [14, 14], [13, 13], [14, 14]]

In [303]:
np.array([[13]*4, [14]*4]*5).T

array([[13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14],
       [13, 14, 13, 14, 13, 14, 13, 14, 13, 14]])

## Challenge 8

In [629]:
# np.triu on a 1D vector broadcasts it to a 7x7 matrix and zeroes everything below the diagonal
matrix = np.triu(np.random.randint(1, 100, 7))
matrix

array([[19, 23, 77, 97, 63, 52, 42],
       [ 0, 23, 77, 97, 63, 52, 42],
       [ 0,  0, 77, 97, 63, 52, 42],
       [ 0,  0,  0, 97, 63, 52, 42],
       [ 0,  0,  0,  0, 63, 52, 42],
       [ 0,  0,  0,  0,  0, 52, 42],
       [ 0,  0,  0,  0,  0,  0, 42]])

In [630]:
matrix.T

array([[19,  0,  0,  0,  0,  0,  0],
       [23, 23,  0,  0,  0,  0,  0],
       [77, 77, 77,  0,  0,  0,  0],
       [97, 97, 97, 97,  0,  0,  0],
       [63, 63, 63, 63, 63,  0,  0],
       [52, 52, 52, 52, 52, 52,  0],
       [42, 42, 42, 42, 42, 42, 42]])

In [631]:
m_add_mt = matrix+ matrix.T
m_add_mt

array([[ 38,  23,  77,  97,  63,  52,  42],
       [ 23,  46,  77,  97,  63,  52,  42],
       [ 77,  77, 154,  97,  63,  52,  42],
       [ 97,  97,  97, 194,  63,  52,  42],
       [ 63,  63,  63,  63, 126,  52,  42],
       [ 52,  52,  52,  52,  52, 104,  42],
       [ 42,  42,  42,  42,  42,  42,  84]])

In [632]:
(matrix + matrix.T).reshape(-1,)

array([ 38,  23,  77,  97,  63,  52,  42,  23,  46,  77,  97,  63,  52,
        42,  77,  77, 154,  97,  63,  52,  42,  97,  97,  97, 194,  63,
        52,  42,  63,  63,  63,  63, 126,  52,  42,  52,  52,  52,  52,
        52, 104,  42,  42,  42,  42,  42,  42,  42,  84])

In [633]:
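# Flatten the sum, then overwrite every (n_cols + 1)-th element: these are
# the diagonal entries, which were counted twice in matrix + matrix.T
n_cols = matrix.shape[1]
m_add_mt_reduced = (matrix + matrix.T).reshape(-1)
m_add_mt_reduced[::n_cols + 1] = np.diag(matrix)
m_add_mt_reduced.reshape(matrix.shape)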

array([[19, 23, 77, 97, 63, 52, 42],
       [23, 23, 77, 97, 63, 52, 42],
       [77, 77, 77, 97, 63, 52, 42],
       [97, 97, 97, 97, 63, 52, 42],
       [63, 63, 63, 63, 63, 52, 42],
       [52, 52, 52, 52, 52, 52, 42],
       [42, 42, 42, 42, 42, 42, 42]])

In [641]:
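# Alternative: halve the doubled diagonal by multiplying elementwise with a
# matrix of ones whose diagonal is 0.5 (this promotes the result to float)
m_add_mt = matrix + matrix.T
dim_matrix = matrix.shape[1]
diag_half_multiplier = np.diagflat([-0.5] * dim_matrix) + np.ones((dim_matrix, dim_matrix))
np.multiply(m_add_mt, diag_half_multiplier)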

array([[19., 23., 77., 97., 63., 52., 42.],
       [23., 23., 77., 97., 63., 52., 42.],
       [77., 77., 77., 97., 63., 52., 42.],
       [97., 97., 97., 97., 63., 52., 42.],
       [63., 63., 63., 63., 63., 52., 42.],
       [52., 52., 52., 52., 52., 52., 42.],
       [42., 42., 42., 42., 42., 42., 42.]])
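Another way to mirror an upper-triangular matrix, shown here only as a sketch, is to add the transpose of its strictly upper part, so the diagonal is never counted twice and the integer dtype is preserved. The 3x3 matrix `upper` is a made-up example, not the random matrix from the cells above.

```python
import numpy as np

# Hypothetical upper-triangular input
upper = np.triu(np.arange(1, 10).reshape(3, 3))
print(upper)
# [[1 2 3]
#  [0 5 6]
#  [0 0 9]]

# k=1 keeps only the elements strictly above the diagonal
symmetric = upper + np.triu(upper, k=1).T
print(symmetric)
# [[1 2 3]
#  [2 5 6]
#  [3 6 9]]
```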

## Challenge 7

In [698]:
a = np.array([6, 2, 0, 3, 0, 0, 5, 0, 0])

In [701]:
index = np.where(a[:len(a)-1]==0)[0][-1] 
index

7

In [702]:
a[index+1]

0

## Challenge 6

In [704]:
matrix = np.array([[1, 4, 4200],
                   [0, 10, 5000], 
                   [1, 2, 1000]])
matrix

array([[   1,    4, 4200],
       [   0,   10, 5000],
       [   1,    2, 1000]])

In [714]:
col_mean = np.mean(matrix, axis=0)  # per-column mean (not displayed)
col_std = np.std(matrix, axis=0)    # per-column standard deviation
col_std

array([4.71404521e-01, 3.39934634e+00, 1.72819752e+03])

In [718]:
np.round(( matrix - np.mean(matrix, axis=0) )/np.std(matrix, axis=0),5)

array([[ 0.70711, -0.39223,  0.46291],
       [-1.41421,  1.37281,  0.92582],
       [ 0.70711, -0.98058, -1.38873]])
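The cell above standardizes each column: it subtracts the column mean and divides by the column standard deviation. A quick sanity check, sketched below, is that every column of the result has mean 0 and standard deviation 1. Note that `np.std` uses the population formula (`ddof=0`) by default, and that a constant column would have zero standard deviation and cause a division by zero; this sketch assumes no such column.

```python
import numpy as np

matrix = np.array([[1, 4, 4200],
                   [0, 10, 5000],
                   [1, 2, 1000]])

col_mean = matrix.mean(axis=0)  # shape (3,), broadcast across the rows
col_std = matrix.std(axis=0)    # population standard deviation (ddof=0)

scaled = (matrix - col_mean) / col_std

print(np.allclose(scaled.mean(axis=0), 0))  # True
print(np.allclose(scaled.std(axis=0), 1))   # True
```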

## Challenge 5

In [719]:
matrix = np.array([[0, 1, 2, 3],
                   [4, 5, 6, 7],
                   [8, 9, 10, 11],
                   [12, 13, 14, 15]])

np.prod(np.diag(matrix))

0

In [726]:
diag_m = np.diag(matrix)
np.prod(diag_m[diag_m != 0])

750

## Challenge 4

In [729]:
block = np.array([[1, 3, 3], [7, 0, 0]])
block

array([[1, 3, 3],
       [7, 0, 0]])

In [733]:
np.vstack((np.hstack((block, block)),np.hstack((block, block))))

array([[1, 3, 3, 1, 3, 3],
       [7, 0, 0, 7, 0, 0],
       [1, 3, 3, 1, 3, 3],
       [7, 0, 0, 7, 0, 0]])

## Challenge 1

In [760]:
weights = np.array([0.3, 0.4, 0.2, 0.1])
marks = np.array([7, 0, 8, 6])
np.multiply(weights,marks).sum()


4.300000000000001

In [762]:
int(np.multiply(weights,marks).sum().round(0))

4
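The weighted sum above is a dot product, so it can also be written with `np.dot` or, since the weights sum to 1, with `np.average`. The sketch below shows both, plus one caveat about rounding: `np.round` rounds exact halves to the nearest even value.

```python
import numpy as np

weights = np.array([0.3, 0.4, 0.2, 0.1])
marks = np.array([7, 0, 8, 6])

weighted_sum = np.dot(weights, marks)
weighted_avg = np.average(marks, weights=weights)  # divides by weights.sum()

print(np.isclose(weighted_sum, 4.3))           # True
print(np.isclose(weighted_sum, weighted_avg))  # True

# Halves go to the nearest even value, so 4.5 rounds down and 5.5 rounds up
print(np.round(4.5), np.round(5.5))  # 4.0 6.0
```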

## Challenge 2

In [766]:
array = np.array([3, 5, 1, 0, -3, 22, 213436])
number = -111

array[::3] = number
array

array([-111,    5,    1, -111,   -3,   22, -111])

## Challenge 3


In [788]:
array1 = np.array([1.5, 0.5, 2, -4.1, -3, 6, -1])
array2 = np.array([1.2, 0.5, 1, -4.0,  3, 0, -1.2])
precision = 0.5

np.where(np.abs(array1 - array2) < precision)[0]


array([0, 1, 3, 6], dtype=int64)

In [786]:
a = np.array([[1, 2, 3, 4, 5, 6],
              [-2, 1, 2, 3, 4, 5]])

np.where(a > 2)

(array([0, 0, 0, 0, 1, 1, 1], dtype=int64),
 array([2, 3, 4, 5, 3, 4, 5], dtype=int64))

## Challenge 1


![image.png](attachment:image.png)

In [804]:
vec1 = np.array([-2, 1,  0, -5, 4, 3, -3])
vec2 = np.array([ 0, 2, -2, 10, 6, 0,  0])



In [807]:
# Dot product divided by the product of the Euclidean norms (cosine similarity)
numerator = np.dot(vec1, vec2)
denominator = np.sqrt(np.sum(vec1**2)) * np.sqrt(np.sum(vec2**2))
denominator

96.0

In [809]:
numerator / denominator

-0.25
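The ratio computed above is the cosine of the angle between the two vectors. The same calculation can be sketched with `np.linalg.norm`; the helper name `cosine_similarity` is a hypothetical choice for this example, and it assumes neither vector is all zeros.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero 1D vectors (illustrative helper)."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

vec1 = np.array([-2, 1,  0, -5, 4, 3, -3])
vec2 = np.array([ 0, 2, -2, 10, 6, 0,  0])

print(cosine_similarity(vec1, vec2))  # -0.25
```

Here the dot product is -24 and the norms are 8 and 12, which gives -24 / 96 = -0.25.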