# Closest Pair Problem
### Problem

    Input: a set P = {p1, ..., pn} of n points in the plane R^2
    
        Notation: d(pi, pj) = Euclidean distance
            ex: if pi = (xi, yi) and pj = (xj,yj)
                d(pi, pj) = sqrt((xi-xj)^2 + (yi-yj)^2)
            
    Output: a pair p*,q* in P of distinct points that minimizes 
        d(p,q) over all distinct p,q in P
    
    Assumption: all points have distinct x-coordinates and distinct y-coordinates
    
    Brute-force: takes O(n^2) time
    1-D Version: 
        1) sort points O(nlogn) time
        2) Return the closest pair of adjacent points O(n) time
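
The two baselines above can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the names `dist`, `brute_force_closest_pair`, and `closest_pair_1d` are hypothetical, points are assumed to be `(x, y)` tuples, and both functions assume at least two inputs.

```python
import math
from itertools import combinations

def dist(p, q):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def brute_force_closest_pair(points):
    """Compare every pair of points: O(n^2) distance computations."""
    return min(combinations(points, 2), key=lambda pair: dist(*pair))

def closest_pair_1d(xs):
    """1-D version: sort in O(n log n), then scan adjacent values in O(n)."""
    s = sorted(xs)
    # In sorted order the closest pair is always adjacent.
    return min(zip(s, s[1:]), key=lambda pair: pair[1] - pair[0])
```

The 1-D scan works because sorting puts the two closest values next to each other; the 2-D algorithm below has to work harder because no single sort order has that property.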
        
        
### Approach

    1) Divide into subproblems
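        Let Q = left half of P (split by x-coordinate)
        R = right half of P
        Form Qx, Qy, Rx, Ry: Q and R, each sorted by x and by y
            [takes O(n) time given Px, Py, the sorted copies of P]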
        
    2) Conquer subproblems recursively
        (p1,q1) = ClosestPair(Qx,Qy)
        (p2,q2) = ClosestPair(Rx,Ry)
        delta = min{d(p1,q1),d(p2,q2)}
        (p3,q3) = ClosestSplitPair(Px,Py, delta)
        
        return best of (p1,q1) (p2,q2) (p3,q3)
            Requirements for ClosestSplitPair:
            Runtime: O(n), so that ClosestPair runs in O(nlogn) overall
            Correctness: only needed when the closest pair of P is a split pair
        
    3) Combine solutions into original problem
        (the ClosestSplitPair call and the "return best of" step above)
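
The O(n) divide step can be sketched as below. This is an illustration under assumptions rather than the notes' own code: `split_sorted` is a hypothetical helper, `Px` and `Py` are assumed to be lists of `(x, y)` tuples already sorted by x and by y, and x-coordinates are assumed distinct (as stated in the Problem section).

```python
def split_sorted(Px, Py):
    """Split P into left half Q and right half R in O(n) time.

    Px: points sorted by x-coordinate.
    Py: the same points sorted by y-coordinate.
    Returns (Qx, Qy, Rx, Ry), each still in sorted order.
    """
    mid = len(Px) // 2
    Qx, Rx = Px[:mid], Px[mid:]
    x_bar = Qx[-1][0]  # largest x-coordinate in the left half
    # Filtering Py preserves y-order, so no re-sorting is needed.
    Qy = [p for p in Py if p[0] <= x_bar]
    Ry = [p for p in Py if p[0] > x_bar]
    return Qx, Qy, Rx, Ry
```

Filtering the y-sorted list instead of re-sorting each half is what keeps the work per call linear.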
    
#### ClosestSplitPair(Px,Py,delta)
    Let x´ = largest x-coordinate in the left half of P
    
    
        Visually: [-delta    x´  +delta]
                       |   . |.    |
                       |    .|     |  .
                    .  |     |   . |

               
               
    Let Sy = points of P with x-coordinate inside the strip 
        [x´-delta, x´+delta], sorted by y-coordinate
        (extracted from Py in O(n) time)
        
    Linear scan: initialize best = delta and bestpair = NULL
    
    for i = 1 to |Sy|-1:                       # O(n) iterations
        for j = 1 to min{7, |Sy|-i}:           # at most 7, so O(1) per i
            let p, q = ith and (i+j)th points of Sy
            if d(p,q) < best:
                bestpair = (p,q)
                best = d(p,q)
    return bestpair
    Runtime: O(n) time
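
A minimal Python sketch of the scan above, assuming points are `(x, y)` tuples and `Px`, `Py` hold at least two points sorted by x and by y. The names `dist` and `closest_split_pair` are hypothetical, `x_bar` stands in for x´, and `None` plays the role of NULL.

```python
import math

def dist(p, q):
    """Euclidean distance between two (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def closest_split_pair(Px, Py, delta):
    """Return a pair closer than delta found by the strip scan, else None."""
    x_bar = Px[len(Px) // 2 - 1][0]  # largest x-coordinate in the left half
    # Points in the strip [x_bar - delta, x_bar + delta], still sorted by y.
    Sy = [p for p in Py if x_bar - delta <= p[0] <= x_bar + delta]
    best, best_pair = delta, None
    for i in range(len(Sy) - 1):
        # Compare each point only with the next 7 points in y-order.
        for j in range(i + 1, min(i + 8, len(Sy))):
            d = dist(Sy[i], Sy[j])
            if d < best:
                best, best_pair = d, (Sy[i], Sy[j])
    return best_pair
```

Returning `None` when nothing beats `delta` lets the caller fall back to the better of the two recursive results.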
     
##### Correctness claim
    Claim: Suppose there exist p in Q and q in R (a split pair)
    such that d(p,q) < delta, where
    delta = min(d(p1,q1), d(p2,q2))
    
    Then:
        A) p, q are members of Sy
        B) p, q are at most 7 positions apart in Sy
        
    
    Corollary 1: If the closest pair of P is a split pair then
            ClosestSplitPair will find it
    
    Corollary 2: ClosestPair is correct and runs in O(nlogn) time
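
One way to see the O(nlogn) bound in Corollary 2 is to unroll the recurrence. The derivation below is a sketch under assumptions not stated in the notes: n is a power of 2, Px and Py are sorted once up front, and each call does at most cn work outside its two recursive calls for some constant c.

```latex
\begin{align*}
T(n) &\le 2\,T(n/2) + cn \\
     &\le 4\,T(n/4) + 2cn \\
     &\;\;\vdots \\
     &\le 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{align*}
```

The initial sorting also costs O(nlogn), so it does not change the overall bound.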
    
    Proof A: p=(x1,y1) in Q, q=(x2,y2) in R, d(p,q) < delta
        - since d(p,q) < delta, |x1-x2| < delta and |y1-y2| < delta
        - p in Q => x1 <= x´, and q in R => x2 >= x´
        - so x1 <= x´ <= x2 with x2-x1 < delta
            => x1, x2 in [x´-delta, x´+delta]
            
            
    Proof B: p=(x1,y1) and q=(x2,y2) are at most 7 positions apart 
    in Sy
                                |    
                       _________|_________   
                      | p  |    |    |    |
                      |____|____|____|____|    
                      |    |    |    |    |delta
    min{y1, y2}_______|____|____|____|_q__|_______________
                       -delta   |   +delta
                                |   
                                |    
                                |    
                                |
                                x´
                                
                                
        Lemma1: all points of Sy with y-coordinate between those of 
            p and q inclusive lie in one of these 8 boxes
            (each box has side length delta/2)
            
            Subproof: first recall the y-coordinates of p,q differ by
            less than delta, so these points fit in the two rows of
            boxes. Then by definition of Sy all have x-coordinates
            between x´-delta and x´+delta
            
        Lemma2: At most one point of P in each box
            
            Subproof: by contradiction suppose a,b lie in the same box.
                Then
                    i) a,b are either both in Q or both in R
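                    ii) d(a,b) <= (delta/2)*sqrt(2) < delta
                        (the diagonal of a box with side delta/2)
                    
                    but i and ii contradict the definition of delta as 
                    the smallest distance between two points that are 
                    both in Q or both in R
                    
        Lemma1 + Lemma2: at most 8 points of Sy (including p and q)
            have y-coordinates between those of p and q, so p and q
            are at most 7 positions apart in Sy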
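
Putting the pieces together, the whole algorithm might look like the Python sketch below. It is an illustration under assumptions, not the notes' own implementation: all function names are hypothetical, points are `(x, y)` tuples with distinct x- and y-coordinates, the input has at least two points, and subproblems of size 3 or less fall back to brute force (the notes do not specify a base case).

```python
import math
from itertools import combinations

def dist(p, q):
    """Euclidean distance between two (x, y) tuples."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def brute_force(points):
    """O(n^2) reference: try every pair."""
    return min(combinations(points, 2), key=lambda pair: dist(*pair))

def _closest_pair(Px, Py):
    """Px, Py: the same points sorted by x and by y respectively."""
    n = len(Px)
    if n <= 3:  # assumed base case
        return brute_force(Px)
    # 1) Divide: split by x, keeping both sort orders in O(n).
    mid = n // 2
    Qx, Rx = Px[:mid], Px[mid:]
    x_bar = Qx[-1][0]  # largest x-coordinate in the left half
    Qy = [p for p in Py if p[0] <= x_bar]
    Ry = [p for p in Py if p[0] > x_bar]
    # 2) Conquer: best pair lying entirely in one half.
    best_pair = min(_closest_pair(Qx, Qy), _closest_pair(Rx, Ry),
                    key=lambda pair: dist(*pair))
    best = delta = dist(*best_pair)
    # 3) Combine: scan the strip for a split pair closer than delta.
    Sy = [p for p in Py if abs(p[0] - x_bar) <= delta]
    for i in range(len(Sy) - 1):
        for j in range(i + 1, min(i + 8, len(Sy))):
            d = dist(Sy[i], Sy[j])
            if d < best:
                best, best_pair = d, (Sy[i], Sy[j])
    return best_pair

def closest_pair(points):
    """Sort once by x and once by y, then recurse."""
    Px = sorted(points)
    Py = sorted(points, key=lambda p: p[1])
    return _closest_pair(Px, Py)
```

Comparing the result against `brute_force` on small inputs is a quick sanity check that the 7-neighbor scan is not missing split pairs.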