## Basic sorting algorithms
----

### Exercise: Selection Sort
Write the function ```SelectionSort(coll)``` that returns a sorted list with the elements in *coll*. 
You have to implement the Selection Sort algorithm.

In [1]:
def SelectionSort(A):
    for i in range(len(A)-1):
        min_pos = i
        for j in range(i+1,len(A)):
            if A[j] < A[min_pos]:
                min_pos = j
        A[i],A[min_pos] = A[min_pos], A[i]
    return A

data = [-2, 45, 0, 11, -9]
SelectionSort(data)

[-9, -2, 0, 11, 45]

In [2]:
def test_sortedness(my_list):
    return my_list == sorted(my_list)

my_list = list(range(10))[::-1]

print(SelectionSort(my_list))

assert test_sortedness( SelectionSort(my_list) ), "Must be increasing!"

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
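
The check above runs on a single reversed list, which the preceding `print` call has already sorted in place. A slightly stronger check, sketched here under the assumption that the sorting function takes a list and returns the sorted list, compares it against `sorted()` on random inputs. The name `check_sort` and its parameters are hypothetical, not part of the exercise.

```python
import random

def check_sort(sort_fn, trials=100, max_len=20):
    """Compare sort_fn against the built-in sorted() on random integer lists."""
    for _ in range(trials):
        data = [random.randint(-50, 50) for _ in range(random.randint(0, max_len))]
        expected = sorted(data)        # reference result, computed from the original
        result = sort_fn(list(data))   # pass a copy: sort_fn may sort in place
        if result != expected:
            return False
    return True

# Sanity check of the harness itself, using the built-in as the function under test.
harness_ok = check_sort(sorted)
print(harness_ok)
```

In the notebook, `check_sort(SelectionSort)` would exercise the implementation above, including empty and one-element lists.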


### Exercise: Insertion Sort
Write the function ```InsertionSort(coll)``` that returns a sorted list with the elements in *coll*. 
You have to implement the Insertion Sort algorithm.

In [3]:
def InsertionSort(A):
    for i in range(1,len(A)):  
        key = A[i]
        j = i-1
        while j >= 0 and A[j] > key:
            A[j+1] = A[j]
            j -= 1
        A[j+1] = key
    return A

data = [9, 5, 1, 4, 3,-12]
InsertionSort(data)

[-12, 1, 3, 4, 5, 9]

In [5]:
## Check correctness

my_list = list(range(10))[::-1]

print(InsertionSort(my_list))

assert test_sortedness( InsertionSort(my_list) ), "Must be increasing!"

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


### Comparators

You have learned that many sorting algorithms are based on comparisons. 
They obtain an ordered sequence by comparing elements. 

It's often very useful to define our own way to compare elements. Any comparator that implies a total order 
is a good one. 

For example, assume you have a list of tuples. Each tuple stores information about a person. 
If you sort this list, the final ordering is the *"lexicographic"* one: first we compare the first components, 
then the second components for tuples with the same first component, and so on.
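
As a small illustration of this default behaviour, the sketch below sorts a few made-up `(name, age)` tuples; the data is hypothetical and only serves to show the lexicographic ordering.

```python
# Hypothetical records: (name, age)
people = [("Bob", 25), ("Alice", 30), ("Alice", 22)]

# Tuples are compared component by component: names first, then ages on ties.
ordered = sorted(people)
print(ordered)   # [('Alice', 22), ('Alice', 30), ('Bob', 25)]
```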
 
However, you may want to impose your own ordering. For example, sort people by name, then by increasing age, and so on. 

This is possible by implementing your own comparator and letting ```.sort()``` and ```sorted()``` use it.

### How? 
You know that comparison-based algorithms sort a sequence by comparing pairs of elements. 
Thus, a comparator is a function that takes two elements, say a and b, and compares them.

The result of a comparison is a value smaller than $0$ if a must precede b in the ordering. 
The result is larger than $0$ if b must precede a. The result is $0$ if the relative order of a and b does not matter.
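
To make the sign convention concrete, here is a minimal sketch of a comparator for the usual increasing order; the name `cmp_ascending` is an illustrative choice.

```python
def cmp_ascending(a, b):
    """Three-way comparison for the usual increasing order."""
    if a < b:
        return -1   # a must precede b
    if a > b:
        return 1    # b must precede a
    return 0        # a and b are equivalent for this ordering

print(cmp_ascending(3, 7), cmp_ascending(7, 3), cmp_ascending(5, 5))   # -1 1 0
```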

For example, we can use the following comparator to sort numbers in reverse order.

In [6]:
def my_cmp(a, b):
    if a > b: return -1   # a comes first when it is larger
    if a < b: return 1
    return 0              # equal elements: order does not matter

In [7]:
# shorter version
def my_cmp(a, b): 
    return b-a # a comes first if it is larger than b

To use our own comparator with ```.sort()``` and ```sorted()```, we have to use the ```functools.cmp_to_key(cmp)``` function. This converts our comparator into a function that can be passed as the argument for the parameter ```key```. 


In [8]:
import functools

print( sorted(list(range(10)), key=functools.cmp_to_key(my_cmp)) )

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
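
The same mechanism covers the person example mentioned earlier. The sketch below assumes hypothetical records stored as `(age, name)` tuples, so that the default lexicographic order (age first) differs from the desired one (name first, then increasing age); the comparator name is illustrative.

```python
import functools

# Hypothetical records: (age, name)
people = [(30, "Alice"), (25, "Bob"), (22, "Alice")]

def cmp_name_then_age(p, q):
    # Compare names first; fall back to ages only when the names are equal.
    if p[1] != q[1]:
        return -1 if p[1] < q[1] else 1
    return p[0] - q[0]

by_default = sorted(people)
by_name = sorted(people, key=functools.cmp_to_key(cmp_name_then_age))

print(by_default)   # [(22, 'Alice'), (25, 'Bob'), (30, 'Alice')]
print(by_name)      # [(22, 'Alice'), (30, 'Alice'), (25, 'Bob')]
```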



------
### Exercise: Strange orderings
Given a list, write and test comparators to obtain the following orderings:
- Even numbers precede odd ones. Even numbers are sorted in non-decreasing order while odd ones are sorted in non-increasing order.
- Strings are sorted in non-increasing order based on their lengths. Strings having the same length are sorted in non-increasing lexicographic order. 

In [9]:
my_list = list(range(10))
my_list2 = ["a", "b", "aba", "cad", "zzzz", "aaaa"]

In [13]:
## Your implementation here!!!      
def my_cmp1(a,b):
    if a%2==0 and b%2==0:
        return a-b
    if a%2==0 and b%2!=0:
        return -1
    if a%2!=0 and b%2==0:
        return 1
    return b-a

def my_cmp2(a,b):
    if len(a) != len(b):
        return len(b) - len(a)
    # same length: non-increasing lexicographic order, so the larger string comes first
    if a < b:
        return 1
    if a > b:
        return -1
    return 0
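
As a cross-check on the two orderings requested by the exercise, the sketch below obtains them with plain key functions instead of comparators. This is a different technique from the one the exercise asks for, and the variable names are illustrative.

```python
numbers = list(range(10))
words = ["a", "b", "aba", "cad", "zzzz", "aaaa"]

# Even numbers first (x % 2 is 0 for even); evens increasing, odds decreasing (-x).
evens_then_odds = sorted(numbers, key=lambda x: (x % 2, x if x % 2 == 0 else -x))

# Longer strings first; ties broken in reverse lexicographic order.
by_length_desc = sorted(words, key=lambda s: (len(s), s), reverse=True)

print(evens_then_odds)   # [0, 2, 4, 6, 8, 9, 7, 5, 3, 1]
print(by_length_desc)    # ['zzzz', 'aaaa', 'cad', 'aba', 'b', 'a']
```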

### Exercise: Insertion Sort with a comparator
Write the function ```InsertionSort(coll, cmp)``` that returns a sorted list with the elements in *coll* using 
```cmp``` as a comparator.

In [14]:
import functools

def InsertionSort(A,cmp):
    for i in range(1,len(A)):
        key = A[i]
        j = i-1
        while j >= 0 and cmp(A[j],key)>0:
            A[j+1] = A[j]
            j -= 1
        A[j+1] = key
    return A


# my_list = [7,-1,12,4,3,1,1,9]
# assert InsertionSort(my_list, my_cmp1) == sorted(my_list, key = functools.cmp_to_key(my_cmp1)), "Nein"

In [15]:
## Test your implementation here by using the comparators from the previous exercise.

def test_sortedness(my_list, cmp):
    # sort a copy, so the reference result is computed from the untouched input
    return InsertionSort(list(my_list), cmp) == sorted(my_list, key = functools.cmp_to_key(cmp))

assert test_sortedness(my_list, my_cmp1), "Must be sorted"
assert test_sortedness(my_list2, my_cmp2), "Must be sorted"

-----

### Exercise: Intersection of two lists
Write a function ```intersection_slow(l1, l2)``` which returns the intersection of the two lists l1 and l2.

Use the trivial algorithm, which runs in $\Theta(|l1|\times|l2|)$ time. 

In [16]:
def intersection_slow(l1,l2):
    x = []
    for i in l1:
        for j in l2:
            if i==j:
                x.append(i)
    return x

In [17]:
## Test your implementation here

l1 = [3, 5, 1, 2]
l2 = [1, 4, 6, 2]

assert set(intersection_slow(l1, l2)) == set([1, 2]), "Urca"

----
### Exercise: Faster intersection of two lists
Write a function ```intersections(l1, l2)``` which returns the intersection of the two lists l1 and l2.

Assume that both l1 and l2 are sorted!

In [19]:
def intersections(l1, l2):
    i = 0
    j = 0
    intersection = []
    
    while i < len(l1) and j < len(l2):
        if l1[i] < l2[j]:
            i+=1
        elif l1[i] > l2[j]:
            j+=1
        else:
            intersection.append(l1[i])
            i+=1
            j+=1
            
    return intersection
            
intersections([1,2,3,4,6,7],[1,4])

[1, 4]

In [21]:
## Test your implementation here

l1 = sorted([3, 5, 1, 2])
l2 = sorted([1, 4, 6, 2])

assert set(intersections(l1, l2)) == set([1, 2]), "Urca"

----
### Exercise: Your own search engine
You are given a collection of texts and you want to build your own search engine. People at Google are already very scared!

Modern search engines are based on a data structure called *Inverted Index*. 

Each document of the collection is assigned an identifier, starting from 0.
An inverted index stores a list, called *inverted list*, for each term of the collection.
The list for a term *t* contains the identifiers of all the documents containing term *t*. The list is sorted.

For example,

````
C = ["dog cat elephant monkey",  "dog lion tiger", "fish dog dog cat cow"]

````

The list for term *cat* is [0, 2]; the list for *elephant* is [0].

Given two terms, an AND query reports all the documents containing both terms. For example, 
for *query("cat", "dog")* the result is [0, 2].
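
The inverted lists and the AND query result quoted above can be reproduced with a short sketch. It assumes whitespace tokenization and uses sets plus a plain set intersection, purely as a cross-check; the names `postings` and `inverted` are illustrative.

```python
from collections import defaultdict

C = ["dog cat elephant monkey", "dog lion tiger", "fish dog dog cat cow"]

# Collect document identifiers per term; a set ignores repeated terms in a document.
postings = defaultdict(set)
for doc_id, text in enumerate(C):
    for term in text.split():
        postings[term].add(doc_id)

# Inverted lists must be sorted, so convert each set to a sorted list.
inverted = {term: sorted(ids) for term, ids in postings.items()}

print(inverted["cat"])        # [0, 2]
print(inverted["elephant"])   # [0]

# AND query as a set intersection of the two inverted lists.
both = sorted(set(inverted["cat"]) & set(inverted["dog"]))
print(both)                   # [0, 2]
```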

Your goal is to implement a simple search engine. Do the following. 

- Given the collection, build a dictionary that maps each term to its inverted list. Observe that 
each document occurs at most once in each list. 
- Implement a function *query* which answers an AND query. 

In [22]:
def build_index(C):
    index = {}

    for idx, el in enumerate(C):
        el = el.split()
        for parola in el:
            if parola not in index:
                index[parola] = [idx]
            else:
                if idx != index[parola][-1]:   # avoid inserting the same index twice
                    index[parola].append(idx)
    
    return index

def query(index, t1, t2):
    return intersections(index[t1],index[t2])

C = ["dog cat elephant monkey",  "dog lion tiger", "fish dog dog cat cow"]
index = build_index(C)
query(index,"cat","dog")

[0, 2]

In [23]:
## Test your implementation here

C = ["dog cat elephant monkey",  "dog lion tiger", "fish dog dog cat cow"]

index = build_index(C)
assert query(index, "cat", "dog") == [0, 2], "Urca"