# **Ema's Supercomputer**

https://www.hackerrank.com/challenges/two-pluses/problem?isFullScreen=true

## **Problem description**: 

Ema has a grid of good ('G') and bad ('B') cells of size m * n. Find the largest product of the volumes of two non-overlapping plus signs drawn on good cells. A valid plus sign is symmetric, has width 1, and has four arms of equal length that meet at a center cell. Its volume is the number of cells it covers: 4 * arm length + 1.

For notational simplicity, both grid dimensions are written as n in the complexity bounds below.


## Here I present an O(n^2) fast encoding algorithm with an O(n^3) spatial comparison algorithm 

A list of the most notable tools available:
- Dictionary Indexing
- Chebyshev distance spatial sorting
- Key sorting and greedy algorithms
- Early stopping heuristics
- Spatially bounded algorithms that reduce O(n^2) permutations into O(n) linear reads
- Graham scan method of creating convex hulls.

## **Step 1**: O(n^2) runtime in dictionary encoding of plusses

A naive implementation of encoding plusses visits every (x, y) coordinate and grows a plus outward from it, taking O(n^3) to search.

This can be done faster by a factor of n in the following manner:
- Parse each row and, for each cell, record the length of the maximal horizontal run of 'G' cells that contains it and the position of the rightmost cell of that run.
- Parse each column and, for each cell, combine the vertical run with the stored horizontal run to compute the volume of the largest plus centered at (x, y).

Total runtime of search and encoding: O(n^2)
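As a rough illustration of the O(n^2) bound, the sketch below computes the largest arm length at every good cell. It uses four directional run-length tables rather than the row-segment dictionary described above, so it is a different technique that reaches the same result. The function name `largest_plus_arms` is hypothetical and not part of the notebook code.

```python
def largest_plus_arms(grid):
    """Return {(row, col): arm} for the largest plus centered on each 'G' cell.

    Sketch only: uses four run-length tables (up, down, left, right),
    each filled in one pass, so the total work is O(rows * cols).
    """
    rows, cols = len(grid), len(grid[0])
    up = [[0] * cols for _ in range(rows)]
    down = [[0] * cols for _ in range(rows)]
    left = [[0] * cols for _ in range(rows)]
    right = [[0] * cols for _ in range(rows)]

    # Forward pass: consecutive 'G' cells ending at (r, c) going up and left.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 'G':
                up[r][c] = (up[r - 1][c] if r > 0 else 0) + 1
                left[r][c] = (left[r][c - 1] if c > 0 else 0) + 1

    # Backward pass: consecutive 'G' cells starting at (r, c) going down and right.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            if grid[r][c] == 'G':
                down[r][c] = (down[r + 1][c] if r + 1 < rows else 0) + 1
                right[r][c] = (right[r][c + 1] if c + 1 < cols else 0) + 1

    # The arm is limited by the shortest of the four runs (minus the center).
    return {
        (r, c): min(up[r][c], down[r][c], left[r][c], right[r][c]) - 1
        for r in range(rows)
        for c in range(cols)
        if grid[r][c] == 'G'
    }


arms = largest_plus_arms(['BGB', 'GGG', 'BGB'])
center_volume = 4 * arms[(1, 1)] + 1  # volume = 4 * arm + 1
```

The volume of the largest plus at a cell is then `4 * arm + 1`, matching the arm-length convention used in `find_prod`.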

## **Python Class: Plus** -- Data structure for logical simplicity: 
- Keep a dictionary of plusses with coordinate keys
- Keep a second dictionary of the plusses sorted by volume
    - There are at most min(n, m) different plus sizes
    - Heuristic sort by distance from the origin (Chebyshev distance in the design; the code below sorts by row + col, which is the L1 distance) -- gives a higher chance of early stopping -- O(n^2 * log(n)) runtime
- Visit the volumes in descending order so that the early stopping heuristics can cut the search short
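The two distance heuristics mentioned for ordering plus centers can give different orders. The snippet below is a small sketch comparing them, assuming non-negative (row, col) coordinates measured from the origin. The helper names and the sample centers are made up for illustration.

```python
def l1_from_origin(center):
    # Manhattan distance from (0, 0) for non-negative coordinates.
    return center[0] + center[1]


def chebyshev_from_origin(center):
    # Chebyshev distance from (0, 0) for non-negative coordinates.
    return max(center)


# Hypothetical plus centers, all of the same volume.
centers = [(4, 4), (0, 6), (5, 1), (2, 2)]

by_l1 = sorted(centers, key=l1_from_origin)
by_chebyshev = sorted(centers, key=chebyshev_from_origin)
```

Either key places centers near one corner first and centers near the opposite corner last, which is what makes an early disjoint pair likely. Which one stops sooner depends on the grid.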

## **Step 2**: Finding the largest product.

- Total plusses: at most n^2 (one largest plus per cell)
- Total unique volumes: at most n

A naive implementation brute forces the combinations of every two plus signs, taking O(n^4) to search.
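A brute-force search is still useful as a correctness oracle on small grids. The sketch below is one possible version, not the notebook's implementation. It enumerates every valid plus at every arm length, not only the largest one per cell, so it does more work than the O(n^4) figure above. All function names are hypothetical.

```python
from itertools import combinations


def plus_cells(r, c, arm):
    """Set of cells covered by a plus centered at (r, c) with the given arm."""
    cells = {(r, c)}
    for k in range(1, arm + 1):
        cells.update({(r - k, c), (r + k, c), (r, c - k), (r, c + k)})
    return cells


def all_plusses(grid):
    """Every valid plus on the grid, each represented as a set of cells."""
    rows, cols = len(grid), len(grid[0])
    found = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 'G':
                continue
            arm = 0
            while True:
                found.append(plus_cells(r, c, arm))
                k = arm + 1
                inside = r - k >= 0 and r + k < rows and c - k >= 0 and c + k < cols
                # Stop growing once any of the four next cells is missing or bad.
                if not (inside and grid[r - k][c] == grid[r + k][c]
                        == grid[r][c - k] == grid[r][c + k] == 'G'):
                    break
                arm = k
    return found


def brute_force_max_product(grid):
    """Largest product of volumes over all pairs of disjoint plusses."""
    best = 0
    for a, b in combinations(all_plusses(grid), 2):
        if a.isdisjoint(b):
            best = max(best, len(a) * len(b))
    return best
```

Comparing `brute_force_max_product(grid)` with `twoPluses(grid)` on small random grids is a cheap way to catch mistakes in the case analysis.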

## **Bounding Box algorithm**:

- Given an array of equal size plusses, a bounding box can be constructed where plus centers inside the box must intersect at least one other plus of the same size and plus signs that fall outside of the bounding box must necessarily be distinct. This alone massively reduces the compute time of the first pass of the algorithm.
- Taking advantage of the fact that plusses of equal size that intersect off-axis must overlap at two cells, we get a fast solution for computing the largest product of an intersecting pair: 
    - Let x be the distance between two centers on the x axis, y be the distance between two centers on the y axis
    - **Case 1**: x == 0 or y == 0 -- The arms overlap along a shared row or column. The maximum product is reached when the available arm length is divided as evenly as possible between the two plus signs. Constant time solution. Horizontal and vertical scans take O(n^2) time, and the results can be stored in a spatially indexed dictionary. An **R-tree** would work naturally alongside the bounding box algorithm, as it also uses bounding boxes.
    - **Case 2**: x == y -- The centers are directly diagonal to each other. The maximum product results from trimming one plus until there is no intersection. Constant time solution, same as case 1.
    - **Case 3**: None of the above -- Both plusses are trimmed until they slip past each other, which happens at an arm length of max(x, y) - 1, giving a product of ((max(x, y) - 1) * 4 + 1) ^ 2. Linear distance metric. The `find_prod` code below also tries trimming only one plus to an arm length of min(x, y) - 1 and keeps the larger product.
        - **Key observation**: A pre-sort of O(n^2 * log(n)) can be used to create a full 2d convex hull. This is the Graham scan method of constructing a convex hull.
        - **Key observation**: A convex hull should be computed for each x coordinate and each y coordinate, due to the possibility of overlap at one of the convex hull's points
            - At each x and y coordinate, create a convex hull without points (with drop outs) at that coordinate.
        - **Key observation**: Each convex hull can have at most sqrt(n^2) = n distinct points due to the reduced dimensionality of the problem.
        
        - **Key observation**: Creating n convex hulls for each volume is O(n) and not O(n^2)
        - **Key observation**: Creating n^2 convex hulls in total for n distinct volumes is O(n^2) and not O(n^3)

        **Proof**:
        1. **Definition**: A convex hull is a one-dimensional boundary that encloses a set of points on a 2D plane, forming the minimal convex set (Axiomatic).
        2. **Extremal Points Limit**: By definition, the convex hull of a set of points on a plane can have at most n extremal points (Derived from 1).
        3. **Lesser Convex Hulls**: When any point is excluded from the grid, a new convex hull formed has no more than n extremal points (Consequence of 2).
        4. **Distinct Extremal Points**: Removing points that are two units apart results in convex hulls with distinct new extremal points, except at a shared boundary (Geometric property from 1).
        5. **Batch Processing**: Constructing two distinct convex hulls by alternately removing odd and even-indexed points ensures all scenarios are covered with minimal overlap (Efficiency improvement using 3 and 4).
        6. **Coverage**: These two distinct lesser convex hulls collectively represent all potential configurations arising from the individual removal of any point (Comprehensive coverage using 5).
        7. **Information Content**: The total informational content (number of unique extremal points) within these hulls is O(n), reflecting linear complexity (Summation of minimal overlaps from 6).
        8. **Construction Complexity**: Building O(n) of these convex hulls requires linear time, O(n), due to the limited number of points processed (Efficiency derived from iterative approach in 4, 5, and 7).
        9. **Conclusion**: Considering all individual convex hulls constructed by excluding each point, the entire dataset exhibits linear complexity in both construction and informational content (Final conclusion from 8).

        **Conclusion**: This proof demonstrates that handling convex hulls derived by excluding points from a grid has linear complexity, O(n). This property significantly enhances the efficiency of any algorithm requiring geometric boundary analyses in grid-based data structures.

    - **Example**: given a plus at (3, 3) with volume 9, a plus at (4, 6) is equivalent to a plus at (5, 6), because the optimal solution for both is to truncate according to the larger of the x and y differences, which is 3 in both cases. Notable exceptions are plusses that share a row or column, such as (3, 3) and (3, 4), and directly diagonal plusses, such as (3, 3) and (4, 4). These can be handled separately with sweep line passes run horizontally, vertically, and diagonally.
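The three cases above all rest on a constant-time test for whether two plusses share a cell. The following is a minimal sketch of such a test, written from the geometry rather than taken from the notebook. The function name `plusses_overlap` and its argument layout are assumptions.

```python
def plusses_overlap(center_a, arm_a, center_b, arm_b):
    """Return True if two plusses share at least one cell.

    Centers are (row, col) tuples; arms are non-negative arm lengths.
    """
    d_row = abs(center_a[0] - center_b[0])
    d_col = abs(center_a[1] - center_b[1])

    if d_row == 0 and d_col == 0:
        return True  # same center cell

    if d_row == 0 or d_col == 0:
        # Collinear centers: the facing arms meet once their combined
        # length reaches the distance between the centers.
        return arm_a + arm_b >= max(d_row, d_col)

    # Off-axis: a vertical arm of one plus must reach the other's row
    # while the other's horizontal arm reaches back to its column.
    a_vertical_hits_b = arm_a >= d_row and arm_b >= d_col
    a_horizontal_hits_b = arm_a >= d_col and arm_b >= d_row
    return a_vertical_hits_b or a_horizontal_hits_b
```

For example, plusses at (3, 3) with arm 2 and (4, 6) with arm 3 overlap at cell (4, 3), while shortening the second arm to 2 separates them.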

As a result, this bounding box algorithm runs in linear time for a fixed volume, and constructing n convex hulls also takes linear time because the hulls share most of their information. They can be represented by a dictionary with O(n) space.
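For reference, a convex hull over plus centers can be built with a sort followed by a single scan. The sketch below uses Andrew's monotone chain, a close relative of the Graham scan named above, chosen here because it needs only integer arithmetic. It is not the notebook's code, and the function name `convex_hull` is an assumption.

```python
def convex_hull(points):
    """Convex hull of 2D integer points via Andrew's monotone chain.

    Returns the hull vertices in order, starting from the smallest point.
    Collinear points on the boundary are dropped. Runs in O(k log k)
    for k points, dominated by the sort.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Z-component of (a - o) x (b - o); positive means a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


# Hypothetical centers of equal-volume plusses on a 3 x 3 block.
centers = [(r, c) for r in range(3) for c in range(3)]
hull = convex_hull(centers)
```

On the 3 x 3 block only the four corner centers remain on the hull, which illustrates how few extremal points a dense set of centers can have.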


## **Final step**: Permutation between all n distinct volumes

### **Current steps taken so far**:
- n^2 total plusses recorded in O(n^2) time
- O(n^2 * log(n)) sort along one axis for each volume for Graham scan
- n different volumes of plusses
    - n different bounding boxes created in O(n^2) time
    - n different convex hulls of size O(n) created in O(n^2) time globally -- see above bounding box algorithm
- **Total**: O(n^2 * log(n))

### **Permutation**: The spatial comparison algorithm
- Comparing two convex hulls for largest L1 and next largest L2 plus volumes: (n choose 2) = O(n^2) comparisons -- runtime O(n^2 * comparison time)

    - 2 bounding boxes
    - 2 convex hulls
        - n elements per convex hull

    - Shrink the bounding box of L1 to match L2 and take a sorted permutation: Convex hull might go outside bounding box.
        - **Case 1**: Both convex hulls fall inside both bounding boxes
            - Across the entire dataset of different volumes, each volume contains the square root of its plusses encoded as convex hulls, thus the total number of plusses left is only O(n)
            - We only need to do a full permutation of these convex hulls, with drop out dictionaries for row and column aware comparisons ensuring edge cases are covered.
            - In the worst case, the maximum number of plusses remaining inside convex hulls exists when plusses are evenly distributed across volumes. That leaves sqrt(n) items in n bins or O(n^1.5) spatial complexity.
            - **A brute force permutation of the remaining bins yields (n^1.5 choose 2) comparisons, which is O(n^3)**
            - The heuristics in place change this by roughly a quarter, which affects only the constant factor
            - I have not proven a lower bound at this moment. It is possible that an O(n^2) total runtime exists, but O(n^3) is a big improvement over the original O(n^4).
        - **Case 2**: Some convex hull plus centerpoints fall outside of the opposing bounding box
            - At minimum, we have L2 * L2 = max product
            - A final permutation takes at most O(n^2) permutations
            - Search is halted
        - **Case 3**: Some convex hull plus centerpoints fall outside of their own shrunk bounding box
            - At minimum, we have L2 * L2 = max product
            - A final permutation takes at most O(n^2) permutations
            - Search is halted
    
**Final runtime: O(n^3)**
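
The cubic figure for Case 1 can be checked with a short calculation, assuming (as estimated above) that about n^1.5 plusses survive across all volumes:

$$
\binom{n^{1.5}}{2} = \frac{n^{1.5}\,\left(n^{1.5} - 1\right)}{2} = \frac{n^{3} - n^{1.5}}{2} = O\!\left(n^{3}\right)
$$

A constant-factor saving from heuristics leaves this bound unchanged.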

In [12]:
def twoPluses(grid):
    class Plus:
        def __init__(self):
            self.plusses = {}
            self.decr_plus = {}
            self._maxprod = 0
            self.sorted_at_volume = set()
            self.upper_bound = None

        # Populate class attributes when inserting a plus
        def add(self,row,col,volume):
            t = (row, col)
            self.plusses[t] = volume
            if volume not in self.decr_plus:
                self.decr_plus[volume] = []
            self.decr_plus[volume].append(t)

        # For use in exists conditions
        def __contains__(self, row_col_tuple):
            return row_col_tuple in self.plusses
        
        # Retrieve plusses at specific volume
        def plusses_at_volume(self, volume):
            if volume not in self.sorted_at_volume:
                self.decr_plus[volume].sort(key = lambda x: x[0] + x[1])  # L1 norm very likely to have disjoint plusses.
                self.sorted_at_volume.add(volume)
            return self.decr_plus[volume]
        
        # Maxprod attribute custom handled to reduce clutter
        @property
        def maxprod(self):
            return self._maxprod
        @maxprod.setter
        def maxprod(self, value):
            # Only ever raise the running maximum; smaller values are ignored
            if self._maxprod < value:
                self._maxprod = value

        # Main algorithm to compute max product after populating
        def find_max(self):
            vol_desc = sorted(self.decr_plus, reverse=True)
            num_unique_volumes = len(vol_desc)
            for i in range(num_unique_volumes):
                for j in range(i+1):
                    
                    vol_i, vol_j = vol_desc[i], vol_desc[j]
                    
                    # Calculate ideal product without any interference
                    ideal_prod = vol_i * vol_j

                    # Case: Same size plus comparison                    
                    if i == j:
                        plusses = self.plusses_at_volume(vol_desc[i])
                        for index_a in range(len(plusses)):
                            for index_b in range(index_a + 1, len(plusses)):
                                if ideal_prod <= self.maxprod:
                                    break
                                self.find_prod(plusses[index_a], plusses[index_b], vol_i, vol_j, ideal_prod)
                    # Case: Different size plus comparison (two arrays)
                    else:
                        i_plusses = self.plusses_at_volume(vol_desc[i])
                        j_plusses = self.plusses_at_volume(vol_desc[j])
                        for iplus in i_plusses:
                            for jplus in j_plusses:
                                if ideal_prod <= self.maxprod:
                                    break
                                self.find_prod(iplus, jplus, vol_i, vol_j, ideal_prod)

            return self.maxprod

        def find_prod(self, a, b, vol_a, vol_b, ideal_prod):           
            # Calculate arm lengths for both plusses
            arm1 = (vol_a-1)//4
            arm2 = (vol_b-1)//4
            
            # Immediate return if centers coincide
            if a == b: 
                return
            
            # Calculate absolute positional differences
            x = abs(a[0]-b[0])
            y = abs(a[1]-b[1])
            
            # Centers share a row or column (x == 0 or y == 0)
            if not x or not y:
                combined_arms = arm1 + arm2
                max_distance = max(x, y)
                if combined_arms < max_distance:
                    self.maxprod = ideal_prod
                    return
                else:
                    space = max_distance - 1
                    arm_small = min(arm1, arm2, space//2)
                    self.maxprod = (arm_small * 4 + 1) * ((space - arm_small) * 4 + 1)
                    return

            # Largest arm lengths that stay clear of the other center along each axis
            x_space = x - 1
            y_space = y - 1
            
            # Define and check conflict conditions
            axbyconflict = (arm1 >= x and arm2 >= y)
            aybxconflict = (arm1 >= y and arm2 >= x)
            if not axbyconflict and not aybxconflict:  # No conflict
                self.maxprod = ideal_prod
            elif axbyconflict and aybxconflict:  # Symmetric error, two possibilities
                self.maxprod = max((min(x_space, y_space) * 4 + 1) * max(vol_a, vol_b), (max(x_space, y_space) * 4 + 1) ** 2)
            elif axbyconflict:  # Only AXBY conflict, require permutation
                self.maxprod = max(((x_space * 4 + 1) * vol_b), (vol_a * (y_space * 4 + 1)))
            else:  # Only AYBX conflict
                self.maxprod = max(((y_space * 4 + 1) * vol_b), (vol_a * (x_space * 4 + 1)))

        # Print method to visualize plusses
        def print(self,n = 1):
            print(f'printing plusses with volume {n} or larger')
            i = 0
            for key, value in self.plusses.items():
                if value >= n:
                    print(key,value,end=' ')
                    i +=1
                    if i % 8 ==  0: print()
            print(f'count {i}')
            
    # Main function to populate the dictionary of largest plusses at each row and column and their volumes
    def populate(grid):
        def process_segment(row, start, end, d):
            L = end - start + 1
            for col in range(start, end + 1):
                d[(row, col)] = (end, L)  # Populate flat dictionary
                
        def process_plus(col, start, end, d):
            L = end - start + 1
            for row in range(start, end + 1):
                row_right, row_L = d[(row, col)]
                col_right, col_L = end, L
                row_left, col_left = row_right - row_L + 1, col_right - col_L + 1
                volume = min((col - row_left), (row_right - col), (row - col_left), (col_right - row)) * 4 + 1
                dplus.add(row, col, volume)
        def parse_rows(grid):
            for rownum, row in enumerate(grid):
                start_i = None
                for i, val in enumerate(row):
                    if val == 'G':
                        if start_i is None:
                            start_i = i  # Start of a new segment
                    else:
                        if start_i is not None:  # End of a segment
                            process_segment(rownum, start_i, i - 1, d)
                            start_i = None
                if start_i is not None:  # Process any segment extending to the end of the row
                    process_segment(rownum, start_i, len(row) - 1, d)

        def parse_columns(grid):
                num_rows = len(grid)
                num_cols = len(grid[0])
                for col in range(num_cols):
                    start_i = None
                    for i in range(num_rows):
                        if grid[i][col] == 'G':
                            if start_i is None:
                                start_i = i
                        else:
                            if start_i is not None:    # end of a segment
                                process_plus(col, start_i, i - 1, d)
                                start_i = None
                    if start_i is not None:    # process any segment extending to end of col
                        process_plus(col, start_i, num_rows - 1, d)
        dplus = Plus()
        d = {}
        parse_rows(grid)
        parse_columns(grid)

        return dplus
    if not grid or len(grid[0]) == 0:
        return 0

    dplus = populate(grid)
    dplus.print(2)
    return dplus.find_max()


In [13]:
grid = [
    'GGGGGGGGGGGG',
    'BGBGGGBGBGBG',
    'GGGGGGGGGGGG',
    'BGBGGGBGBGBG',
    'GGGGGGGGGGGG',
    'GGGGGGGGGGGG',
    'GGGGGGGGGGGG',
    'GGGGGGGGGGGG',
    'BGBGGGBGBGBG',
    'BGBGGGBGBGBG',
    'BGBGGGBGBGBG',
    'BGBGGGBGBGBG',
    'GGGGGGGGGGGG',
    'GGGGGGGGGGGG'
]

In [14]:
def visualize_easier(grid):
    grid2 = []
    for row in grid:
        temp = row.replace('G','O')
        grid2.append(temp.replace('B','-'))
    return grid2
x = visualize_easier(grid)
for i in x:
    print(i)

OOOOOOOOOOOO
-O-OOO-O-O-O
OOOOOOOOOOOO
-O-OOO-O-O-O
OOOOOOOOOOOO
OOOOOOOOOOOO
OOOOOOOOOOOO
OOOOOOOOOOOO
-O-OOO-O-O-O
-O-OOO-O-O-O
-O-OOO-O-O-O
-O-OOO-O-O-O
OOOOOOOOOOOO
OOOOOOOOOOOO


In [15]:
twoPluses(grid)

printing plusses with volume 2 or larger
(2, 1) 5 (4, 1) 5 (5, 1) 5 (6, 1) 5 (7, 1) 5 (12, 1) 5 (5, 2) 5 (6, 2) 5 
(2, 3) 9 (4, 3) 13 (5, 3) 13 (6, 3) 13 (7, 3) 13 (12, 3) 5 (1, 4) 5 (2, 4) 9 
(3, 4) 5 (4, 4) 17 (5, 4) 17 (6, 4) 17 (7, 4) 17 (8, 4) 5 (9, 4) 5 (10, 4) 5 
(11, 4) 5 (12, 4) 5 (2, 5) 9 (4, 5) 17 (5, 5) 21 (6, 5) 21 (7, 5) 21 (12, 5) 5 
(5, 6) 5 (6, 6) 5 (2, 7) 9 (4, 7) 17 (5, 7) 17 (6, 7) 17 (7, 7) 17 (12, 7) 5 
(5, 8) 5 (6, 8) 5 (2, 9) 9 (4, 9) 9 (5, 9) 9 (6, 9) 9 (7, 9) 9 (12, 9) 5 
(5, 10) 5 (6, 10) 5 count 50


189

In [285]:
grid = [
    'BBBGBGBBB',
    'BBBGBGBBB',
    'BBBGBGBBB',
    'GGGGGGGGG',
    'BBBGBGBBB',
    'BBBGBGBBB',
    'GGGGGGGGG',
    'BBBGBGBBB',
    'BBBGBGBBB',
    'BBBGBGBBB'
]

In [231]:
grid = [
    'GBBBBBBGGGBGGBB',
    'GBBBBBBGGGBGGBB',
    'GBBBBBBGGGBGGBB',
    'GBBBBBBGGGBGGBB',
    'GGGGGGGGGGGGGGG',
    'GGGGGGGGGGGGGGG',
    'GBBBBBBGGGBGGBB',
    'GBBBBBBGGGBGGBB',
    'GGGGGGGGGGGGGGG',
    'GBBBBBBGGGBGGBB',
    'GBBBBBBGGGBGGBB',
    'GGGGGGGGGGGGGGG',
    'GGGGGGGGGGGGGGG',
    'GBBBBBBGGGBGGBB'
]

In [208]:
grid = [
    'GBGBGGB',
    'GBGBGGB',
    'GBGBGGB',
    'GGGGGGG',
    'GGGGGGG',
    'GBGBGGB',
    'GBGBGGB'
]

In [116]:
grid = [
'GGGGGG',
'GBBBGB',
'GGGGGG',
'GGBBGB',
'GGGGGG'
]

In [144]:
grid = [
    'BBGBBBB',
    'BBGBBBB',
    'GGGGGBB',
    'BBGGGGB',
    'BBGBGBB'
]

In [168]:
grid = [
    'BGGGB',
    'GGGGG',
    'BGGGB',
]
