## Problem Statement

You are given two integers ‘N’ and ‘K’ and an ‘N x N’ square matrix ‘ARR’.
Your task is to print the sum of every ‘K x K’ sub-square of ‘ARR’, where ‘K’ is less than or equal to ‘N’.

**Constraints:**\
1 <= ‘T’ <= 10\
1 <= ‘N’ <= 500\
1 <= ‘K’ <= ‘N’\
1 <= ‘ARR[i][j]’ <= 1000

Here ‘T’ is the number of test cases, and ‘ARR[i][j]’ denotes the matrix element at the jth column of the ith row.

**Time Limit:** 1 sec

**Sample Input 1:**\
1\
3 2\
8 1 3\
2 9 3\
0 3 5

**Sample Output 1:**\
20 16\
14 20

**Explanation for sample input 1:**\
There are 4 sub-matrices of size 2x2 in total:\
First, starting at index [0,0]:\
8 1\
2 9\
So the sum of this matrix will be (8+1+2+9) = 20.\
Second, starting at index [0,1]:\
1 3\
9 3\
So the sum of this matrix will be (1+3+9+3) = 16.\
Third, starting at index [1,0]:\
2 9\
0 3\
So the sum of this matrix will be (2+9+0+3) = 14.\
Fourth, starting at index [1,1]:\
9 3\
3 5\
So the sum of this matrix will be (9+3+3+5) = 20.\
So we return a 2D array of size 2x2 with the values calculated above.

**Sample Input 2:**\
1\
2 2\
5 7\
8 1

**Sample Output 2:**\
21

**Explanation for sample input 2:**\
Only one sub-matrix is possible, starting at index [0, 0]; its sum is (5+7+8+1) = 21.
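
The explanations above sum each window element by element. A minimal sketch of that direct approach follows, assuming the input layout implied by the samples (a first line with ‘T’, then for each test case a line with ‘N’ and ‘K’ followed by ‘N’ rows). The names `brute_force_sums` and `solve_from_text` are illustrative and not part of the original statement.

```python
def brute_force_sums(arr, k):
    """Sum every k x k window directly; O((n-k+1)^2 * k^2) time."""
    n = len(arr)
    return [
        [
            sum(arr[r][c] for r in range(i, i + k) for c in range(j, j + k))
            for j in range(n - k + 1)
        ]
        for i in range(n - k + 1)
    ]


def solve_from_text(text):
    """Parse the assumed input layout and return the formatted output."""
    tokens = iter(text.split())
    t = int(next(tokens))  # number of test cases (assumed first token)
    lines = []
    for _ in range(t):
        n, k = int(next(tokens)), int(next(tokens))
        arr = [[int(next(tokens)) for _ in range(n)] for _ in range(n)]
        for row in brute_force_sums(arr, k):
            lines.append(" ".join(map(str, row)))
    return "\n".join(lines)


sample_1 = "1\n3 2\n8 1 3\n2 9 3\n0 3 5\n"
print(solve_from_text(sample_1))
```

This reproduces the sample outputs, but each window re-adds all of its K x K elements, so it becomes slow for large ‘N’ and ‘K’.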

## Algorithm

To solve this problem efficiently, we can use a technique similar to the prefix sum but for 2D matrices. This technique involves creating an auxiliary matrix that will help us calculate the sum of any sub-square in constant time after an initial preprocessing step.

The steps are as follows:
1. Generate a prefix sum matrix (auxiliary matrix) where each element at (i, j) stores the sum of all elements in the sub-matrix from (0, 0) to (i, j).
1. Use this prefix sum matrix to calculate the sum of each sub-square of size 'K x K' efficiently.

**Here is a step-by-step algorithm:**
1. Create an auxiliary matrix 'prefixSum' of size 'N x N'.
1. Calculate the prefix sum for the first row and the first column of 'ARR' and store it in 'prefixSum'.
1. Calculate the prefix sum for the rest of the matrix using the formula:
   `prefixSum[i][j] = ARR[i][j] + prefixSum[i-1][j] + prefixSum[i][j-1] - prefixSum[i-1][j-1]`
1. To calculate the sum of a sub-square of size 'K x K' starting at (i, j), use the following formula:
   `sum = prefixSum[i+K-1][j+K-1] - prefixSum[i+K-1][j-1] - prefixSum[i-1][j+K-1] + prefixSum[i-1][j-1]`
   When i or j is 0, the terms that index row i-1 or column j-1 fall outside the matrix and are treated as 0.
1. Print the sum for all sub-squares of size 'K x K'.
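
As a concrete check of the two formulas, the sketch below builds the prefix sum matrix for Sample Input 1 and evaluates the sub-square starting at (1, 1). The helper name `build_prefix` is illustrative.

```python
def build_prefix(arr):
    """Return p where p[i][j] is the sum of arr[0..i][0..j]."""
    n = len(arr)
    p = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Out-of-range neighbours contribute 0.
            up = p[i - 1][j] if i > 0 else 0
            left = p[i][j - 1] if j > 0 else 0
            diag = p[i - 1][j - 1] if i > 0 and j > 0 else 0
            p[i][j] = arr[i][j] + up + left - diag
    return p


arr = [[8, 1, 3], [2, 9, 3], [0, 3, 5]]
p = build_prefix(arr)
print(p)  # [[8, 9, 12], [10, 20, 26], [10, 23, 34]]

# K = 2, sub-square starting at (i, j) = (1, 1); no boundary case applies here.
i, j, k = 1, 1, 2
total = (p[i + k - 1][j + k - 1] - p[i + k - 1][j - 1]
         - p[i - 1][j + k - 1] + p[i - 1][j - 1])
print(total)  # 34 - 10 - 12 + 8 = 20
```

The result matches the fourth sub-matrix in the explanation for Sample Input 1.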

**Below is a Python sketch of the above algorithm:**

```python
def computePrefixSum(ARR, N):
    # prefixSum[i][j] holds the sum of ARR[0..i][0..j]
    prefixSum = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i == 0 and j == 0:
                prefixSum[i][j] = ARR[i][j]
            elif i == 0:
                prefixSum[i][j] = prefixSum[i][j-1] + ARR[i][j]
            elif j == 0:
                prefixSum[i][j] = prefixSum[i-1][j] + ARR[i][j]
            else:
                prefixSum[i][j] = ARR[i][j] + prefixSum[i-1][j] + prefixSum[i][j-1] - prefixSum[i-1][j-1]
    return prefixSum


def printKxKSubSquareSums(ARR, N, K):
    prefixSum = computePrefixSum(ARR, N)
    # (i, j) is the top-left corner of each K x K sub-square
    for i in range(N - K + 1):
        for j in range(N - K + 1):
            total = prefixSum[i+K-1][j+K-1]
            if i > 0:
                total -= prefixSum[i-1][j+K-1]  # remove the rows above
            if j > 0:
                total -= prefixSum[i+K-1][j-1]  # remove the columns to the left
            if i > 0 and j > 0:
                total += prefixSum[i-1][j-1]    # add back the doubly removed corner
            print(total, end=" ")
        print()  # Move to next line after printing each row
```
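
The boundary checks can also be avoided by padding the prefix matrix with an extra zero row and column. This is a common variant rather than the method described above, shown here only as a sketch; `kxk_sums_padded` is a hypothetical name.

```python
def kxk_sums_padded(arr, k):
    """Return all k x k window sums using an (n+1) x (n+1) padded prefix matrix."""
    n = len(arr)
    # p[i][j] = sum of arr[0..i-1][0..j-1]; row 0 and column 0 stay zero.
    p = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            p[i + 1][j + 1] = arr[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
    # The window with top-left (i, j) covers rows i..i+k-1 and columns j..j+k-1.
    return [
        [p[i + k][j + k] - p[i][j + k] - p[i + k][j] + p[i][j]
         for j in range(n - k + 1)]
        for i in range(n - k + 1)
    ]


print(kxk_sums_padded([[8, 1, 3], [2, 9, 3], [0, 3, 5]], 2))  # [[20, 16], [14, 20]]
```

The padding costs one extra row and column of memory and removes every `if` from the query step.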

Computing the prefix sum takes **O(N^2)** time, and each sub-square sum then takes **O(1)** time. There are (N-K+1)^2 sub-squares, so calculating and printing all of them takes **O((N-K+1)^2)**. Since N-K+1 <= N, the overall time complexity is **O(N^2)**.

The space complexity is **O(N^2)** for storing the prefix sum matrix.

This is asymptotically optimal, because each of the N^2 input elements belongs to at least one sub-square and must be read at least once.
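
To make the gain concrete, the sketch below compares rough work estimates for direct summation and for the prefix-sum approach. The model counts one unit per element touched, which is a simplification, and `operation_counts` is an illustrative helper.

```python
def operation_counts(n, k):
    """Rough work estimates: (direct summation, prefix-sum approach)."""
    windows = (n - k + 1) ** 2
    direct = windows * k * k   # every window re-adds its k*k elements
    prefix = n * n + windows   # one pass to build the table, O(1) per window
    return direct, prefix


direct, prefix = operation_counts(500, 250)
print(direct, prefix)  # 3937562500 313001
```

For the largest ‘N’ allowed by the constraints and ‘K’ = N/2, the direct approach does several thousand times more work under this model.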

## Implementation

```python
def sumOfKxKMatrices(arr: list, k: int):
    # calculate prefix sum
    n = len(arr)
    p = [[0 for _ in range(n)] for _ in range(n)]
    result = []

    # First element is the same as in the original matrix
    p[0][0] = arr[0][0]

    # Initialize first column of prefix sum matrix
    for i in range(1, n):
        p[i][0] = p[i - 1][0] + arr[i][0]

    # Initialize first row of prefix sum matrix
    for i in range(1, n):
        p[0][i] = p[0][i - 1] + arr[0][i]

    # Fill the rest of the prefix sum matrix
    for i in range(1, n):
        for j in range(1, n):
            p[i][j] = p[i - 1][j] + p[i][j - 1] - p[i - 1][j - 1] + arr[i][j]

    # Calculate each k x k sum; (i, j) is the bottom-right corner of the window
    for i in range(k - 1, n):
        row = []  # Reset row for each new row in the result
        for j in range(k - 1, n):
            s = p[i][j]
            if i - k >= 0:
                s -= p[i - k][j]
            if j - k >= 0:
                s -= p[i][j - k]
            if i - k >= 0 and j - k >= 0:
                s += p[i - k][j - k]  # Add back the corner that was subtracted twice
            row.append(s)
        result.append(row)
    return result
```

Calling it on Sample Input 1 returns the values from Sample Output 1:

```python
sumOfKxKMatrices(arr=[[8, 1, 3], [2, 9, 3], [0, 3, 5]], k=2)
```

```
[[20, 16], [14, 20]]
```