
In [19]:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
import time

# Matrix Operations

In [None]:
a = np.array([[1, 2, 3], [2, 5, 6], [6, 7, 4]])
print(f"Matrix:\n {a}")

Matrix:
 [[1 2 3]
 [2 5 6]
 [6 7 4]]


In [None]:
b = np.eye(5)
print(f"Eye Matrix:\n {b}")

Eye Matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


In [None]:
c = np.ones((7, 5))
print(f"Ones Matrix:\n {c}")

Ones Matrix:
 [[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


In [None]:
v = np.arange(0, 24, 2)
print(f"1-D Vector:\n {v}")
print(f"2-D Array (single row):\n {v.reshape(-1,12)}")

1-D Vector:
 [ 0  2  4  6  8 10 12 14 16 18 20 22]
2-D Array (single row):
 [[ 0  2  4  6  8 10 12 14 16 18 20 22]]


In [None]:
v.T

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22])

In [None]:
v.reshape(-1,12).T

array([[ 0],
       [ 2],
       [ 4],
       [ 6],
       [ 8],
       [10],
       [12],
       [14],
       [16],
       [18],
       [20],
       [22]])

In [None]:
d = v.reshape((3, 4))
print(f"Matrix:\n {d}")

Matrix:
 [[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]]


You can extract an entire row or column from a matrix with the expressions __`array[i, :]`__ or __`array[:, j]`__, respectively:

In [None]:
print(f"Second row: {d[1, :]}")
print(f"Fourth column: {d[:, 3]}")

Second row: [ 8 10 12 14]
Fourth column: [ 6 14 22]


Another way to access elements is the expression __`array[list1, list2]`__, where __`list1`__ and __`list2`__ are lists of integers of the same length. With this kind of indexing, the two lists are traversed in parallel, and the elements of the matrix at the paired coordinates `(list1[k], list2[k])` are returned. The following example shows how this works:

In [None]:
d[[1, 2], [3, 0]]

array([14, 16])
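
As a small sketch of the difference between paired-list indexing and selecting a sub-block, the example below rebuilds the matrix `d` from above and compares `d[[1, 2], [3, 0]]` with `np.ix_`, which crosses every listed row with every listed column. The names `paired` and `block` are only illustrative.

```python
import numpy as np

# Same 3x4 matrix as `d` above
d = np.arange(0, 24, 2).reshape((3, 4))

# Paired indexing: picks the two elements d[1, 3] and d[2, 0]
paired = d[[1, 2], [3, 0]]

# np.ix_ builds an open mesh: rows 1 and 2 crossed with columns 3 and 0
block = d[np.ix_([1, 2], [3, 0])]

print(paired)  # [14 16]
print(block)   # 2x2 sub-matrix: [[14, 8], [22, 16]]
```

Paired indexing returns one element per coordinate pair, while `np.ix_` returns a full sub-matrix.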

### Determinant



---

__Reminder of the theory.__ For square matrices, there exists the concept of a __determinant__.

Let $A$ be a square matrix. The __determinant__ (or __det__) of the matrix $A \in \mathbb{R}^{n \times n}$ is defined as the number

$$
\det A = \sum_{\alpha_{1}, \alpha_{2}, \dots, \alpha_{n}} (-1)^{N(\alpha_{1}, \alpha_{2}, \dots, \alpha_{n})} \cdot a_{\alpha_{1} 1} \cdots a_{\alpha_{n} n},
$$

where $\alpha_{1}, \alpha_{2}, \dots, \alpha_{n}$ is a permutation of the numbers from $1$ to $n$, and $N(\alpha_{1}, \alpha_{2}, \dots, \alpha_{n})$ is the number of inversions in the permutation. The summation is carried out over all possible permutations of length $n$.

_Do not worry if this definition isn't entirely clear — we won't need it in this exact form later._

For example, for a $2 \times 2$ matrix:

$$
\det \left( \begin{array}{cc} a_{11} & a_{12} \\ a_{21} & a_{22}  \end{array} \right) = a_{11} a_{22} - a_{12} a_{21}
$$

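Calculating a determinant directly from the definition requires on the order of $n!$ operations, because the sum has $n!$ terms. This is impractical even for moderate $n$, so practical methods factorize the matrix instead, for example with an LU decomposition, which takes on the order of $n^3$ operations.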

In `NumPy`, the determinant of a matrix is calculated using the function __`numpy.linalg.det(a)`__, where __`a`__ is the input matrix.

In [None]:
a = np.array([[1, 2, 1], [1, 1, 4], [2, 3, 6]], dtype=np.float32)
det = np.linalg.det(a)
det

np.float32(-1.0)
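
To make the permutation definition above concrete, here is a minimal sketch that evaluates it literally and compares the result with `numpy.linalg.det` on the same matrix. The helper name `det_by_definition` is hypothetical, and the approach is only usable for very small matrices because of the $n!$ growth.

```python
from itertools import permutations

import numpy as np


def det_by_definition(m):
    """Sum over all permutations, exactly as in the definition (O(n * n!))."""
    n = len(m)
    total = 0.0
    for perm in permutations(range(n)):
        # Count inversions: pairs (i, j) with i < j but perm[i] > perm[j]
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = (-1) ** inversions
        # perm[col] plays the role of alpha_col: take row perm[col], column col
        for col, row in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


matrix = [[1, 2, 1], [1, 1, 4], [2, 3, 6]]
by_definition = det_by_definition(matrix)
by_numpy = np.linalg.det(np.array(matrix, dtype=np.float64))

print(by_definition)                        # -1.0
print(np.isclose(by_definition, by_numpy))  # True
```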



Let's look at a geometric property of the determinant. Suppose we have a parallelogram with vertices at the points $(0, 0), (c,d), (a+c, b+d), (a, b)$, listed in order around its boundary (this order is clockwise when $ad - bc > 0$). Then the area of this parallelogram can be calculated as the absolute value of the determinant of the matrix  
$$
\left( \begin{array}{cc} a & c \\ b & d \end{array} \right).
$$  
Similarly, the volume of a parallelepiped can be expressed using the determinant of a $3 \times 3$ matrix.
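
The area property can be checked numerically. The sketch below picks arbitrary example values for $a, b, c, d$ (an assumption made only for illustration) and compares the determinant with the area given by the shoelace formula over the four vertices; `shoelace_area` is a hypothetical helper.

```python
import numpy as np

# Example edge vectors (a, b) and (c, d); values chosen only for illustration
a, b = 3.0, 1.0
c, d = 1.0, 2.0

# Area as |det| of the matrix whose columns are the edge vectors
area_det = abs(np.linalg.det(np.array([[a, c], [b, d]])))

# The same parallelogram, with vertices in the order used in the text
vertices = [(0.0, 0.0), (c, d), (a + c, b + d), (a, b)]


def shoelace_area(points):
    """Polygon area from its vertices listed in boundary order."""
    twice_signed = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_signed += x1 * y2 - x2 * y1
    return abs(twice_signed) / 2


area_shoelace = shoelace_area(vertices)
print(np.isclose(area_det, area_shoelace))  # True, both equal 5
```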

### Matrix Transpose

__Reminder of the theory.__ The transpose $A^{T}$ of a matrix $A$ is the matrix obtained by swapping the rows of $A$ with its columns. Formally, the elements of $A^{T}$ are defined as $a^{T}_{ij} = a_{ji}$, where $a^{T}_{ij}$ is the element of $A^{T}$ located at the intersection of row $i$ and column $j$.

In `NumPy`, the transpose of a matrix is computed with the function __`numpy.transpose()`__ or the __`array.T`__ attribute, where __`array`__ is the given 2D array.

In [None]:
a = np.array([[1, 2], [3, 4]])
b = np.transpose(a)
c = a.T
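
A short sketch of what these two calls return, and of why `v.T` earlier in the notebook left the 1-D array unchanged. It repeats the definitions above so that it runs on its own.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.transpose(a)
c = a.T

print(c)                       # rows and columns swapped: [[1, 3], [2, 4]]
print(np.array_equal(b, c))    # True: both forms give the same result
print(np.shares_memory(a, c))  # True: .T is a view, not a copy

# A 1-D array has a single axis, so transposing it changes nothing;
# reshape to 2-D first to obtain a column vector.
v = np.arange(3)
print(v.T.shape)                 # (3,)
print(v.reshape(1, -1).T.shape)  # (3, 1)
```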

# 1. Deep-ML Tasks


[Calculate Cosine Similarity Between Vectors](https://www.deep-ml.com/problems/76?from=Machine%20Learning)

In [18]:
def cosine_similarity_np(v1, v2):
    # Check if input vectors are numpy arrays
    if not isinstance(v1, np.ndarray) or not isinstance(v2, np.ndarray):
        raise TypeError("Both inputs must be numpy arrays.")

    # Check for empty or all-zero vectors
    if not np.any(v1) or not np.any(v2):
        return -1

    # Check for equal lengths
    if v1.shape != v2.shape:
        return -1

    # Compute cosine similarity
    norm_product = np.linalg.norm(v1) * np.linalg.norm(v2)
    # Check divide-by-zero
    if norm_product == 0:
        return -1
    similarity = np.dot(v1, v2) / norm_product

    return np.round(similarity, 3)

In [17]:

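# Import under an alias: later cells define their own `cosine_similarity`,
# which would otherwise shadow scikit-learn's function at call time.
from sklearn.metrics.pairwise import cosine_similarity as sk_cosine_similarity

def cosine_similarity_sklearn(v1, v2):
    # Check if input vectors are numpy arrays
    if not isinstance(v1, np.ndarray) or not isinstance(v2, np.ndarray):
        raise TypeError("Both inputs must be numpy arrays.")

    # Check for empty or all-zero vectors
    if not np.any(v1) or not np.any(v2):
        return -1.0

    # Check that the shapes match
    if v1.shape != v2.shape:
        return -1.0

    # scikit-learn expects 2-D inputs of shape (n_samples, n_features),
    # so treat each vector as a single sample and read the one entry back.
    sim = sk_cosine_similarity(v1.reshape(1, -1), v2.reshape(1, -1))[0, 0]
    return round(float(sim), 3)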

In [None]:
def cosine_similarity(v1, v2):
    if v1.shape != v2.shape:
        raise ValueError("Arrays must have the same shape")

    if v1.size == 0:
        raise ValueError("Arrays cannot be empty")

    # Flatten arrays in case of 2D
    v1 = v1.ravel()
    v2 = v2.ravel()

    dot_product = np.dot(v1, v2)
    magnitude1 = np.sqrt(np.sum(v1**2))
    magnitude2 = np.sqrt(np.sum(v2**2))

    if magnitude1 == 0 or magnitude2 == 0:
        raise ValueError("Vectors cannot have zero magnitude")

    return round(dot_product / (magnitude1 * magnitude2), 3)

In [21]:
import math

def cosine_similarity(v1, v2):
    if len(v1) != len(v2):
        raise ValueError("Vectors must have the same length")

    if len(v1) == 0:
        raise ValueError("Vectors cannot be empty")

    dot_product = 0
    magnitude1 = 0
    magnitude2 = 0

    for a, b in zip(v1, v2):
        dot_product += a * b
        magnitude1 += a * a
        magnitude2 += b * b

    magnitude1 = math.sqrt(magnitude1)
    magnitude2 = math.sqrt(magnitude2)

    if magnitude1 == 0 or magnitude2 == 0:
        raise ValueError("Vectors cannot have zero magnitude")

    return round(dot_product / (magnitude1 * magnitude2), 3)

In [23]:
size = 10_000_000
v1 = np.random.rand(size)
v2 = np.random.rand(size)


def check_time(func, name):
    # perf_counter is a monotonic, high-resolution clock intended for timing
    start = time.perf_counter()
    result = func(v1, v2)
    elapsed = time.perf_counter() - start
    print(f"{name.ljust(30)}: {elapsed:.4f} sec | similarity: {result}")


print(f"Check cosine similarity functions with {size} elements:\n")
check_time(cosine_similarity, "Iterative Solution Function")
check_time(cosine_similarity_np, "NumPy Function")
check_time(cosine_similarity_sklearn, "Scikit-learn Function")


Check cosine similarity functions with 10000000 elements:

Iterative Solution Function   : 4.9491 sec | similarity: 0.75
NumPy Function                : 0.0538 sec | similarity: 0.75
Scikit-learn Function         : 6.5396 sec | similarity: 0.75

Note that the `Scikit-learn Function` timing was recorded while the name `cosine_similarity` was bound to the pure-Python loop defined in the preceding cell, so it measures that loop rather than scikit-learn's vectorized routine. This is why it is close to the iterative timing, and it should not be read as a scikit-learn benchmark.


### Eigenvalues and Eigenvectors

In [None]:
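def calculate_eigenvalues_np(matrix: list[list[float | int]]) -> list[float] | None:
    # An empty input has no eigenvalues to compute
    if not matrix:
        return None
    # eigvals returns only the eigenvalues and skips the eigenvectors
    return list(np.linalg.eigvals(matrix))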

In [None]:
def calculate_eigenvalues(matrix: list[list[float]]) -> list[float | complex]:
    # Closed-form solution, valid for 2x2 matrices only
    a, b, c, d = matrix[0][0], matrix[0][1], matrix[1][0], matrix[1][1]
    trace = a + d
    determinant = a * d - b * c
    # Discriminant of the characteristic equation lambda^2 - trace * lambda + determinant = 0
    discriminant = trace**2 - 4 * determinant
    # Solve for eigenvalues; a negative discriminant makes ** 0.5 return a complex number
    root = discriminant**0.5
    lambda_1 = (trace + root) / 2
    lambda_2 = (trace - root) / 2
    return [lambda_1, lambda_2]
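
The closed form above comes from the characteristic equation $\lambda^2 - \operatorname{tr}(A)\,\lambda + \det A = 0$. As a hedged check, the sketch below restates it in a hypothetical helper `eigenvalues_2x2` (using `cmath.sqrt` so that complex roots are explicit) and verifies that each root makes $A - \lambda I$ singular.

```python
import cmath

import numpy as np


def eigenvalues_2x2(m):
    """Roots of lambda^2 - trace * lambda + determinant = 0 for a 2x2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    determinant = a * d - b * c
    root = cmath.sqrt(trace**2 - 4 * determinant)
    return [(trace + root) / 2, (trace - root) / 2]


A = [[2, 1], [1, 2]]
lams = eigenvalues_2x2(A)
print(lams)  # [(3+0j), (1+0j)]

# Each eigenvalue should make A - lambda * I singular (determinant close to 0)
residuals = [abs(np.linalg.det(np.array(A) - lam * np.eye(2))) for lam in lams]
print(all(r < 1e-9 for r in residuals))  # True

# A 90-degree rotation has no real eigenvalues: the roots are +1j and -1j
rot = eigenvalues_2x2([[0, -1], [1, 0]])
print(rot)  # [1j, -1j]
```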

In [None]:
matrix = [[2, 1], [1, 2]]
print(calculate_eigenvalues_np(matrix))

[np.float64(3.0), np.float64(1.0)]


In [None]:
def inverse_2x2(matrix: list[list[float]]) -> list[list[float]] | None:
    a, b, c, d = matrix[0][0], matrix[0][1], matrix[1][0], matrix[1][1]
    determinant = a * d - b * c
    # A zero determinant means the matrix is singular and has no inverse
    if determinant == 0:
        return None
    return [[d / determinant, -b / determinant], [-c / determinant, a / determinant]]

In [None]:
print(inverse_2x2(matrix))

[[0.6666666666666666, -0.3333333333333333], [-0.3333333333333333, 0.6666666666666666]]
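
As a sanity check on the result above, this sketch compares the closed-form inverse of the same matrix with `numpy.linalg.inv` and confirms that the product with the original matrix is the identity. It also shows that NumPy raises `LinAlgError` for a singular matrix, the case in which the function above returns `None`. The variable names are illustrative.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
A_inv = np.linalg.inv(A)

# Closed form for 2x2: swap the diagonal, negate the off-diagonal, divide by det = 3
closed_form = np.array([[2.0, -1.0], [-1.0, 2.0]]) / 3.0

print(np.allclose(A_inv, closed_form))    # True
print(np.allclose(A @ A_inv, np.eye(2)))  # True

# A singular matrix (second row is twice the first) has no inverse
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    raised = False
except np.linalg.LinAlgError:
    raised = True
print(raised)  # True
```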
