Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [1]:
NAME = ""
IMMATRICULATION_NUMBER = ""

---

## Exercise 11: Elias-Fano encoding.

The goal of this exercise is to implement Elias-Fano encoding.

## 1. Encode a sorted list of non-decreasing integers

Implement a function `elias_fano` that can be called by
```
(upper, lower) = elias_fano(intList)
```

The function receives a non-decreasing list of integers (`intList`), computes its Elias-Fano encoding, and returns it as a tuple `(upper, lower)` of bit strings. `upper` contains the bucket sizes in unary code (one `1` per element in a bucket, followed by a terminating `0`), and `lower` contains the concatenated lower parts of the binary codes. You can assume fixed sizes `w=9` (`w` is the total number of bits of each binary code) and `l=5` (`l` is the number of bits used for the lower part of each binary code).
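
To illustrate what "bucket sizes as unary code" means, here is a minimal sketch that counts how many values fall into each bucket and writes each count in unary. The names `values`, `bucket_sizes`, and `upper_sketch` are made up for this illustration, and the sketch assumes that buckets are only written up to the highest occupied one, which is the convention the test cases in this notebook use. It is not meant as the reference solution.

```python
from collections import Counter

w, l = 9, 5
values = [9, 15, 54, 72, 77, 78, 97]  # example list, must be non-decreasing

# The bucket of a value is given by its upper w-l bits, i.e. value >> l.
bucket_sizes = Counter(v >> l for v in values)

# Write every bucket from 0 up to the highest occupied bucket in unary:
# one '1' per element, then a '0' that closes the bucket.
highest_bucket = max(bucket_sizes)
upper_sketch = "".join(
    "1" * bucket_sizes[b] + "0" for b in range(highest_bucket + 1)
)

print(dict(bucket_sizes))  # {0: 2, 1: 1, 2: 3, 3: 1}
print(upper_sketch)        # 11010111010
```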

**Hint:** With the code `format(int(i),'0'+str(w)+'b')` you can compute the binary representation (as a formatted string) of the integer `i`, zero-padded to `w` bits.
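
As a small worked example of the hint, the following sketch splits a single value into its upper and lower parts. The variable names (`value`, `bits`, `upper_bits`, `lower_bits`, `bucket`) are chosen for illustration only.

```python
w, l = 9, 5
value = 77

# Zero-padded binary string of length w.
bits = format(int(value), '0' + str(w) + 'b')

# The first w-l characters are the upper part, the last l the lower part.
upper_bits, lower_bits = bits[:-l], bits[-l:]

# The upper part, read as an integer, is the bucket number of the value.
bucket = int(upper_bits, 2)

print(bits, upper_bits, lower_bits, bucket)  # 001001101 0010 01101 2
```

The same split can be obtained with integer arithmetic: `value >> l` gives the bucket and `value & ((1 << l) - 1)` gives the lower part as an integer.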

In [10]:
w=9
l=5

def elias_fano(intList):
    global w,l
    upper_width = w-l
    upper = ""
    upper_n = 0
    lower = ""
    for i in intList:
        # each element contributes one 1 to `upper` (unary bucket size);
        # whenever the upper bits change, the skipped buckets are closed with 0s
        bits = format(i, f"0{w}b")
        lower += bits[-l:]
        upper_bits = bits[:-l]
        upper_n_now = int(upper_bits, 2)
        # compare upper_bits with upper_n
        if upper_n != upper_n_now:
            # add one 0 per bucket boundary crossed (upper_n_now - upper_n)
            upper += "0" * (upper_n_now - upper_n)
            upper_n = upper_n_now
        upper += "1"
    upper += "0"
    # print(f"{upper=}, {lower=}")
    return upper, lower
            
    
print(elias_fano([9,15,54,72,77,78,97]))

myList = [63, 72, 93, 94, 107, 159, 161, 219, 241, 249, 251]
print(elias_fano(myList))

('11010111010', '01001011111011001000011010111000001')
('0101110101010101110', '1111101000111011111001011111110000111011100011100111011')


In [11]:
import base64, random

def decode(upper, lower):
    global w,l
    ret = []
    myCode = b'YnVjLGkgPSAwLDAKZm9yIGIgaW4gdXBwZXI6CiAgICBpZihiPT0nMCcpOiBidWMgKz0gMQogICAgZWxzZTogCiAgICAgICAgcmV0LmFwcGVuZChpbnQoZm9ybWF0KGludChidWMpLCcwJytzdHIody1sKSsnYicpK2xvd2VyW2wqaTpsKmkrbF0sMikpCiAgICAgICAgaSArPSAx';eval(compile(base64.b64decode(myCode).decode('ascii'),'<string>','exec'))
    return ret

myList = [1,1,1,1]
(upper, lower) = elias_fano(myList)
assert (upper, lower)==('11110','00001000010000100001'), "Failed for %s" % myList 

myList = [256]
(upper, lower) = elias_fano(myList)
assert (upper, lower)==('0000000010','00000'), "Failed for %s" % myList 

myList = [1,1,1,1,256]
(upper, lower) = elias_fano(myList)
assert (upper, lower)==('11110000000010','0000100001000010000100000'), "Failed for %s" % myList 

myList = [1,4,9,16,25,36,49,64,81,100,256,511]
(upper, lower) = elias_fano(myList)
assert (upper, lower)==('1111101101101000001000000010','000010010001001100001100100100100010000010001001000000011111'), "Failed for %s" % myList 

def createRandomSortedList(n): 
    feed = [i for i in range(1, 256)]*10
    ret = random.sample(feed, n)
    ret.sort()
    return ret

def testEliasFano(myList):
    (upper, lower) = elias_fano(myList)
    
    assert myList==decode(upper, lower), "test failed for input list %s ( %s , %s)" % (myList, upper, lower)

def testAndCreateEliasFano(n):
    myList = createRandomSortedList(n)
    testEliasFano(myList)


for i in range(10,25): testAndCreateEliasFano(i)

## 2. Accessing the n-th entry

Implement a function `access` that can be called by
```
access(upper, lower, n)
```
that returns the integer at position `n` (counting from 0) of the original list.

First, compute the bucket `b`, i.e., the number of 0-bits that precede the n-th 1-bit of `upper`: `b = select(1, n) - n`, where `select(1, n)` is the position of the n-th 1-bit (counting from 0). Next, read the `l` bits of `lower` that belong to position `n`, i.e., `lower[n*l:(n+1)*l]`. Finally, write `b` as a binary code of `w-l` bits, concatenate it with the lower bits, and convert the result to an integer.

**Hint:** With the code `int(b,2)` you can convert a binary string back into an integer
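
The steps above can be traced by hand for one position. The sketch below uses the encoding of `[9,15,54,72,77,78,97]` printed earlier in this notebook and looks up position `n = 3`; the helper list `one_positions` is an illustrative stand-in for a `select` operation, not a prescribed implementation.

```python
w, l = 9, 5
upper = "11010111010"
lower = "01001011111011001000011010111000001"
n = 3  # position to look up, counting from 0

# Positions of all 1-bits in upper; entry n plays the role of select(1, n).
one_positions = [pos for pos, bit in enumerate(upper) if bit == "1"]

# Number of 0-bits before the n-th 1-bit, which is the bucket number.
b = one_positions[n] - n

upper_bits = format(b, "0" + str(w - l) + "b")  # bucket as w-l bits
lower_bits = lower[n * l:(n + 1) * l]           # the n-th group of l bits
value = int(upper_bits + lower_bits, 2)

print(b, upper_bits, lower_bits, value)  # 2 0010 01000 72
```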

In [12]:
def select(binString, b, n):                             # selects the n-th (n starting at 0) b-bit
    return [i for i,x in enumerate(list(binString)) if x == b][n]

def access(upper, lower, i):              # returns the integer at position i (starting at 0)
    global w,l
    b = select(upper, "1", i) - i # number of 0s before the i-th 1
    # which gives us the upper bits
    # the lower bits are the i-th l bits of lower
    upper_bits = format(b, f"0{w-l}b")
    lower_bits = lower[i*l:(i+1)*l]
    return int(upper_bits+lower_bits, 2)



In [13]:
import random

def createRandomSortedList(n): 
    feed = [i for i in range(1, 256)]*10
    ret = random.sample(feed, n)
    ret.sort()
    return ret

def testEliasFano(myList):
    (upper, lower) = elias_fano(myList)
    
    newList = [access(upper, lower, i) for i in range(len(myList))]
    
    assert myList==newList, "test failed for input list %s (%s)" % (myList, newList)

def testAndCreateEliasFano(n):
    myList = createRandomSortedList(n)
    testEliasFano(myList)
    
myList = [1,1,1,1]
testEliasFano(myList)

myList = [256]
testEliasFano(myList)

myList = [1,1,1,1,256]
testEliasFano(myList)

myList = [1,4,9,16,25,36,49,64,81,100,256,511]
testEliasFano(myList)

for i in range(10,25): testAndCreateEliasFano(i)