### Author: Kubam Ivo 
### Purpose: Algorithms For Big Data Project
### Date: 25/3/2021

**Algorithm 1 (eliminate points(m)):** <br>
    **Input:** p1, p2, ..., pn' (in order), where n' is the number of points in the stream.<br>
    **Output:** Skyline points S' <br>
    1. Let x = 24m. <br>
    2. **Pass 1:** For j = 1, 2, ..., x, let p'j be a point picked uniformly at random from the stream. <br>
    3. Let S be the set of such points.<br>
    4. **Pass 2:** for i = 1, ..., n' do <br>
         * for any p'j, if pi dominates p'j then p'j := pi <br>
    5. end for <br>
    6. Let S' = {p'1, p'2, ..., p'x}.<br>
    7. **Pass 3:** Delete from stream all points in S' and all points dominated by any point in S'.<br>
    8. return S'
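
Algorithm 1 relies on the notion of one point dominating another, which is not spelled out above. The sketch below shows one common definition and a brute-force skyline on a tiny example. The names `dominates` and `brute_force_skyline` are illustrative, and the choice that larger coordinates are better is an assumption, not something the notebook states.

```python
def dominates(p, q):
    """True if p dominates q: p >= q in every coordinate and p != q.

    Assumption: larger values are better in every coordinate.
    """
    return p != q and all(a >= b for a, b in zip(p, q))


def brute_force_skyline(points):
    """Return the points that no other point dominates (quadratic time)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


points = [(1, 5), (2, 4), (3, 3), (2, 2), (5, 1), (0, 0)]
print(brute_force_skyline(points))
# prints [(1, 5), (2, 4), (3, 3), (5, 1)]
```

Here (2, 2) is dominated by (3, 3) and (0, 0) is dominated by every other point, so neither is on the skyline. A quadratic check like this is only practical for small inputs, but it is a convenient reference for testing the streaming version.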

In [57]:
# generate points
import random

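def generate_points(n):
    # Return n random 2-D points with integer coordinates in [0, 100].
    # random.seed(123)  # uncomment for reproducible runs
    data = [(random.randint(0, 100), random.randint(0, 100)) for _ in range(n)]
    return data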
    

In [59]:
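# reservoir sampling
m = 3  # number of skyline points
import random
def reservoir_sample(stream, m):
    k = 24*m  # sample size x = 24m from Algorithm 1
    reservoir = []
    for t, item in enumerate(stream):
        if t < k:
            # fill the reservoir with the first k items
            reservoir.append(item)
        else:
            # item t replaces a random slot with probability k/(t+1)
            s = random.randint(0, t)
            if s < k:
                reservoir[s] = item

    return reservoir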

In [62]:
n = 1000 #stream size
stream = generate_points(n)

In [67]:
selected_point =reservoir_sample(stream,m)


In [29]:
def dominate(stream, selected_point):
    # Pass 2: a stream point p replaces every sampled point q that it dominates.
    # Dominance assumes larger is better: p >= q in every coordinate and p != q.
    for p in stream:
        for j, q in enumerate(selected_point):
            if p != q and all(a >= b for a, b in zip(p, q)):
                selected_point[j] = p
    return selected_point
    


In [71]:
skyline_points = dominate(stream,selected_point)


In [73]:
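def remove_point_stream(stream, skyline_points):
    # Pass 3: drop points that are in S' or dominated by a point in S'.
    # "s >= p in every coordinate" covers both cases (larger is assumed better).
    def covered(p):
        return any(all(a >= b for a, b in zip(s, p)) for s in skyline_points)
    # Rebuild the list in place instead of removing items while iterating.
    stream[:] = [p for p in stream if not covered(p)]
    return stream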

    

In [32]:
remove_point_stream(stream,skyline_points)

[]

**Algorithm 2 (Streaming RAND):** <br>
    1. Let n be the number of points in the input stream. Let m' = 1. <br>
    2. while the input stream is not empty do <br>
    3. Let n' be the current number of points in the stream. <br>
    4. Call eliminate points(m' log(n log n)). <br>
    5. If more than n'/2 points are left in the stream, set m' = 2m'. <br>
    6. end while <br>
    **Remark:** If the stream cannot be changed, we do not have to actually delete points from it.
    We only keep the skyline points found so far and consider only the points in the stream that are not dominated by any of them.
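
The loop of Algorithm 2 can be sketched as below. This is a minimal in-memory illustration, not a true streaming implementation: the "stream" is a Python list, Pass 1 uses `random.choices` as a stand-in for reservoir sampling, the logarithm is taken as the natural log (the base is not stated above), and larger coordinates are assumed to be better. The names `dominates`, `eliminate_points` and `streaming_rand` are illustrative.

```python
import math
import random


def dominates(p, q):
    """True if p dominates q (assumption: larger is better in every coordinate)."""
    return p != q and all(a >= b for a, b in zip(p, q))


def eliminate_points(stream, m, rng=random):
    """One round of Algorithm 1 on a non-empty list; returns (S', survivors)."""
    # Pass 1: pick x = 24m points uniformly at random (with replacement).
    sample = rng.choices(stream, k=24 * m)
    # Pass 2: every stream point replaces the sampled points it dominates.
    for p in stream:
        for j, q in enumerate(sample):
            if dominates(p, q):
                sample[j] = p
    found = set(sample)
    # Pass 3: drop points in S' and points dominated by a point in S'.
    survivors = [p for p in stream
                 if p not in found and not any(dominates(s, p) for s in found)]
    return found, survivors


def streaming_rand(points, rng=random):
    """Sketch of Algorithm 2; returns the set of skyline points found."""
    n = len(points)
    stream = list(points)
    skyline = set()
    m_prime = 1
    # log(n log n), clamped so that tiny inputs still get a factor of at least 1.
    factor = math.ceil(math.log(max(2, n * math.log(max(2, n)))))
    while stream:
        n_prime = len(stream)
        found, stream = eliminate_points(stream, m_prime * factor, rng)
        skyline |= found
        # Double the guess m' when the round removed fewer than half the points.
        if len(stream) > n_prime / 2:
            m_prime *= 2
    return skyline


print(sorted(streaming_rand([(1, 5), (2, 4), (3, 3), (2, 2), (5, 1), (0, 0)])))
```

Every round removes at least the points in S', so the loop terminates. To follow the Remark and leave the stream untouched, `eliminate_points` could skip any point dominated by the skyline found so far instead of rebuilding the list.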
        