
**Aim:** The code demonstrates basic matrix operations using PySpark's DenseMatrix. The key objectives are:
1. To define matrices as DenseMatrix objects, which are part of the pyspark.ml.linalg module.
2. To perform common matrix operations:
   - Addition: adding corresponding elements of two matrices.
   - Subtraction: subtracting corresponding elements of two matrices.
   - Multiplication: taking the dot product of each row of the first matrix with each column of the second to obtain the product.
   - Transposition: switching the rows and columns of a matrix.
3. To handle dense matrices within a Spark environment. Note that DenseMatrix is a local matrix stored on a single machine, so it suits small matrices rather than cluster-scale data.

In [None]:
from pyspark.sql import SparkSession
from pyspark.ml.linalg import DenseMatrix

**Step 1: Initialize SparkSession**

Purpose: Creates a Spark session to allow for distributed computations using PySpark.

Effect: The session is the entry point for Spark's DataFrame and distributed operations. The DenseMatrix operations in the steps below are local: they run as ordinary Python in the driver process rather than as distributed Spark jobs.

In [None]:
# Step 1: Initialize SparkSession
spark = SparkSession.builder \
    .appName("Matrix Operations with DenseMatrix") \
    .getOrCreate()

**Step 2: Define Dense Matrices**

Purpose: Defines two 3x3 matrices using the DenseMatrix class.

Details: Both matrices are defined in column-major order. This means that the elements are stored as a single list, with each set of 3 consecutive values representing one column of the matrix.

In [None]:
# Step 2: Define Dense Matrices
matrix_a = DenseMatrix(3, 3, [2, 4, 6, 8, 10, 12, 14, 16, 18])  # 3x3 matrix in column-major order
matrix_b = DenseMatrix(3, 3, [18, 16, 14, 12, 10, 8, 6, 4, 2])  # 3x3 matrix in column-major order
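
To make the column-major layout concrete, the sketch below rebuilds the rows of both matrices from their flat value lists in plain Python, with no Spark required. The helper name column_major_to_rows is hypothetical and only illustrates the index rule; it is not part of PySpark.

```python
def column_major_to_rows(num_rows, num_cols, values):
    """Return the matrix as a list of rows, reading `values` column by column."""
    # In column-major storage, entry (i, j) lives at flat index j * num_rows + i.
    return [[values[j * num_rows + i] for j in range(num_cols)]
            for i in range(num_rows)]

# Same value lists as matrix_a and matrix_b in Step 2.
a_rows = column_major_to_rows(3, 3, [2, 4, 6, 8, 10, 12, 14, 16, 18])
b_rows = column_major_to_rows(3, 3, [18, 16, 14, 12, 10, 8, 6, 4, 2])

for row in a_rows:
    print(row)
# [2, 8, 14]
# [4, 10, 16]
# [6, 12, 18]
```

The first three values (2, 4, 6) therefore run down the first column of matrix_a, not across its first row.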

**Step 3: Matrix Addition**

Purpose: Adds two matrices element-wise.

Process: Uses the zip function to pair corresponding elements from both matrices and adds them.

In [None]:
# Step 3: Matrix Addition
def add_matrices(m1, m2):
    return DenseMatrix(m1.numRows, m1.numCols, [a + b for a, b in zip(m1.values, m2.values)])

result_add = add_matrices(matrix_a, matrix_b)
print("Matrix Addition Result:")
print(result_add.toArray())

Matrix Addition Result:
[[20. 20. 20.]
 [20. 20. 20.]
 [20. 20. 20.]]
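
One caveat with the zip approach: zip stops at the shorter input, so two matrices of different shapes would be combined without any error. It also assumes both value lists use the same storage order. A minimal sketch of a guarded version follows, assuming a hypothetical FlatMatrix stand-in that exposes the same numRows, numCols, and values attributes as DenseMatrix so the example runs without Spark.

```python
from collections import namedtuple
import operator

# Hypothetical stand-in with the same three attributes as DenseMatrix,
# used here only so the sketch runs without a Spark installation.
FlatMatrix = namedtuple("FlatMatrix", ["numRows", "numCols", "values"])

def elementwise(m1, m2, op):
    """Combine two equally shaped matrices value by value; returns the flat list."""
    if (m1.numRows, m1.numCols) != (m2.numRows, m2.numCols):
        raise ValueError(
            f"shape mismatch: {m1.numRows}x{m1.numCols} vs {m2.numRows}x{m2.numCols}"
        )
    # Both value lists are assumed to be in the same (column-major) order.
    return [op(a, b) for a, b in zip(m1.values, m2.values)]

a = FlatMatrix(3, 3, [2, 4, 6, 8, 10, 12, 14, 16, 18])
b = FlatMatrix(3, 3, [18, 16, 14, 12, 10, 8, 6, 4, 2])
print(elementwise(a, b, operator.add))  # [20, 20, 20, 20, 20, 20, 20, 20, 20]
```

Passing operator.sub instead of operator.add gives the element-wise difference used in Step 4.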


**Step 4: Matrix Subtraction**

Purpose: Subtracts the elements of matrix_b from the corresponding elements of matrix_a.

Process: Same as addition, except that each pair of corresponding elements is subtracted instead of added.

In [None]:
# Step 4: Matrix Subtraction
def subtract_matrices(m1, m2):
    return DenseMatrix(m1.numRows, m1.numCols, [a - b for a, b in zip(m1.values, m2.values)])

result_sub = subtract_matrices(matrix_a, matrix_b)
print("\nMatrix Subtraction Result:")
print(result_sub.toArray())


Matrix Subtraction Result:
[[-16.  -4.   8.]
 [-12.   0.  12.]
 [ -8.   4.  16.]]


**Step 5: Matrix Multiplication**

Purpose: Computes the dot product of rows from matrix_a with columns from matrix_b.

Process: Uses nested loops to iterate through the rows and columns of the matrices and calculate the dot product for each element of the resulting matrix.

In [None]:
# Step 5: Matrix Multiplication
def multiply_matrices(m1, m2):
    # The inner dimensions must agree for the product to be defined
    if m1.numCols != m2.numRows:
        raise ValueError("m1.numCols must equal m2.numRows")
    result_values = [0.0] * (m1.numRows * m2.numCols)
    for i in range(m1.numRows):
        for j in range(m2.numCols):
            sum_val = 0.0
            for k in range(m1.numCols):
                sum_val += m1[i, k] * m2[k, j]
            # DenseMatrix reads values in column-major order,
            # so entry (i, j) belongs at index j * numRows + i
            index = j * m1.numRows + i
            result_values[index] = sum_val
    return DenseMatrix(m1.numRows, m2.numCols, result_values)

result_mul = multiply_matrices(matrix_a, matrix_b)
print("\nMatrix Multiplication Result:")
print(result_mul.toArray())


Matrix Multiplication Result:
[[360. 216.  72.]
 [456. 276.  96.]
 [552. 336. 120.]]
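
Because DenseMatrix reads its values column by column, entry (i, j) of a result belongs at flat index j * numRows + i; writing it at i * numCols + j instead would yield the transpose of the product. The plain-Python sketch below spells out that index arithmetic on the flat value lists from Step 2. The function name multiply_column_major is hypothetical, and the sketch is a cross-check of the layout rule rather than a PySpark API.

```python
def multiply_column_major(rows_a, inner, cols_b, a_vals, b_vals):
    """Multiply a (rows_a x inner) matrix by an (inner x cols_b) matrix.

    Both inputs and the returned flat list use column-major order.
    """
    out = [0.0] * (rows_a * cols_b)
    for i in range(rows_a):
        for j in range(cols_b):
            total = 0.0
            for k in range(inner):
                # A[i, k] is at k * rows_a + i; B[k, j] is at j * inner + k.
                total += a_vals[k * rows_a + i] * b_vals[j * inner + k]
            # Column-major position of entry (i, j) in the result.
            out[j * rows_a + i] = total
    return out

a_vals = [2, 4, 6, 8, 10, 12, 14, 16, 18]   # matrix_a, column-major
b_vals = [18, 16, 14, 12, 10, 8, 6, 4, 2]   # matrix_b, column-major
product = multiply_column_major(3, 3, 3, a_vals, b_vals)
print(product[:3])  # first column of A x B: [360.0, 456.0, 552.0]
```

As a hand check, the top-left entry is the first row of matrix_a (2, 8, 14) dotted with the first column of matrix_b (18, 16, 14): 36 + 128 + 196 = 360.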


**Step 6: Matrix Transposition**

Purpose: Transposes matrix_a, switching its rows and columns.

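Process: The isTransposed=True argument tells DenseMatrix to read the same values array in row-major order instead of column-major order. Combined with the swapped dimensions (numCols, numRows), this yields the transpose without copying or reordering any values.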

In [None]:
# Step 6: Matrix Transposition
def transpose_matrix(m):
    return DenseMatrix(m.numCols, m.numRows, m.values, isTransposed=True)

result_transpose_a = transpose_matrix(matrix_a)
print("\nMatrix Transposition of A:")
print(result_transpose_a.toArray())



Matrix Transposition of A:
[[ 2.  4.  6.]
 [ 8. 10. 12.]
 [14. 16. 18.]]
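
The effect of reading one value list in two different orders is easier to see on a non-square matrix. The sketch below is plain Python with a hypothetical read_flat helper that mimics the two layouts; it illustrates the idea and does not call PySpark.

```python
def read_flat(num_rows, num_cols, values, is_transposed=False):
    """Return the rows of a flat-stored matrix (column-major unless is_transposed)."""
    if is_transposed:
        # Row-major: entry (i, j) is at i * num_cols + j.
        return [[values[i * num_cols + j] for j in range(num_cols)]
                for i in range(num_rows)]
    # Column-major: entry (i, j) is at j * num_rows + i.
    return [[values[j * num_rows + i] for j in range(num_cols)]
            for i in range(num_rows)]

values = [2, 4, 6, 8, 10, 12]              # a 2x3 matrix in column-major order
original = read_flat(2, 3, values)         # [[2, 6, 10], [4, 8, 12]]
# Swap the dimensions and read the same list row by row.
transposed = read_flat(3, 2, values, is_transposed=True)
print(transposed)                          # [[2, 4], [6, 8], [10, 12]]
```

The value list is never modified; only the dimensions and the reading order change, which is why this style of transposition is cheap.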


**Result:**

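The code performs matrix addition, subtraction, multiplication, and transposition on PySpark's DenseMatrix. DenseMatrix is a local matrix held in the memory of a single machine, and the helper functions here run as ordinary Python on the driver, so this approach suits small matrices. For matrices too large for one machine, Spark provides distributed types such as BlockMatrix in pyspark.mllib.linalg.distributed.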

Matrix Addition: Results in a matrix with each element as the sum of corresponding elements.

Matrix Subtraction: Produces a matrix where each element is the difference between corresponding elements.

Matrix Multiplication: Uses nested loops so that each element of the result is the dot product of a row of the first matrix with a column of the second.

Matrix Transposition: Swaps the rows and columns, providing a new matrix where the rows of the original become columns and vice versa.