## Basic sorting algorithms
----

### Exercise: Selection Sort
Write the function ```SelectionSort(coll)``` that returns a sorted list with the elements in *coll*. 
You have to implement the Selection Sort algorithm.

In [None]:
def SelectionSort(coll):
    for i in range(len(coll)-1):
        minimum = i
        # scan the unsorted suffix for the smallest element (no slice copy needed)
        for j in range(i + 1, len(coll)):
            if coll[j] < coll[minimum]:
                minimum = j
        coll[i], coll[minimum] = coll[minimum], coll[i]
    return coll

In [None]:
## Check the correctness of your implementation!

def test_sortedness(my_list):
    return my_list == sorted(my_list)

my_list = [51, 15, 2, 5, 10, 9, 3, 31, 2]

print(SelectionSort(my_list))

assert test_sortedness( SelectionSort(my_list) ), "Must be increasing!"

[2, 2, 3, 5, 9, 10, 15, 31, 51]
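
A single hand-picked list is a fairly weak check. As a rough sketch, a sorting function can also be compared against `sorted()` on many random lists. The helper name `check_sort` and its parameters are assumptions made for this illustration, not part of the exercise. It is demonstrated with the built-in `sorted`, and `SelectionSort` could be passed in the same way.

```python
import random

def check_sort(sort_fn, trials=200, max_len=20):
    """Compare sort_fn against sorted() on random lists (hypothetical helper)."""
    for _ in range(trials):
        data = [random.randint(-50, 50) for _ in range(random.randint(0, max_len))]
        expected = sorted(data)
        # pass a copy, so an in-place sort cannot alter the reference input
        if sort_fn(list(data)) != expected:
            return False
    return True

# Demonstrated with the built-in; a user-defined sort can be passed the same way.
print(check_sort(sorted))
```

Passing a copy matters because an in-place sort would otherwise modify the list that the expected result is computed from.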


### Exercise: Insertion Sort
Write the function ```InsertionSort(coll)``` that returns a sorted list with the elements in *coll*. 
You have to implement the Insertion Sort algorithm.

In [None]:
def InsertionSort(coll):
    for i in range(len(coll)):
        first_el = coll[i]
        j = i-1
        while j >= 0 and coll[j] > first_el:
            coll[j + 1] = coll[j]
            j = j-1
        coll[j+1] = first_el
    return coll

In [None]:
## Check the correctness of your implementation!

my_list = [51, 15, 2, 5, 10, 9, 3, 31, 2]

print(InsertionSort(my_list))

assert test_sortedness( InsertionSort(my_list) ), "Must be increasing!"

[2, 2, 3, 5, 9, 10, 15, 31, 51]


### Comparators

You have learned that many sorting algorithms are based on comparisons. 
They obtain an ordered sequence by comparing elements. 

It's often very useful to define our own way to compare elements. Any comparator that induces a total order 
is a good one. 

For example, assume you have a list of tuples. Each tuple stores information about a person. 
If you sort this list, the final ordering is the *"lexicographic"* one: first we compare the first components, 
then the second components for tuples with the same first component, and so on.
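
As a small illustration of this default behavior, the sketch below sorts a few tuples without any comparator. The records and their layout (name, age, city) are made up for the example.

```python
# Hypothetical records: (name, age, city)
people = [
    ("Bob", 25, "Pisa"),
    ("Alice", 30, "Rome"),
    ("Bob", 20, "Milan"),
    ("Alice", 30, "Florence"),
]

# Tuples are compared component by component, left to right
by_default = sorted(people)

for person in by_default:
    print(person)
# ('Alice', 30, 'Florence')
# ('Alice', 30, 'Rome')
# ('Bob', 20, 'Milan')
# ('Bob', 25, 'Pisa')
```

The two "Alice" records tie on name and age, so the third component decides their order.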
 
However, you may want to impose your own ordering. For example, sort people by name, then by increasing age, and so on. 

This is possible by implementing your own comparator and letting ```.sort()``` and ```sorted()``` use it.

### How? 
You know that comparison-based algorithms sort a sequence by comparing pairs of elements. 
Thus, a comparator is a function that takes two elements, say a and b, and compares them.

The result of a comparison is a value smaller than $0$ if a must precede b in the ordering. 
The result is larger than $0$ if b must precede a. The result is $0$ if the relative order of a and b does not matter.

For example, we can use the following comparator to sort numbers in reverse order.

In [None]:
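def my_cmp(a, b):
    if a > b: return -1  # a is larger, so a comes first
    if a < b: return 1   # b is larger, so b comes first
    return 0             # equal elements: either order is fine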

In [None]:
# shorter version
def my_cmp(a, b): 
    return b-a # a comes first if it is larger than b

To use our own comparator with ```.sort()``` and ```sorted()```, we have to use the ```functools.cmp_to_key(cmp)``` function. It converts our comparator into a function that can be passed as the argument for the parameter ```key```. 


In [None]:
import functools

print( sorted(list(range(10)), key=functools.cmp_to_key(my_cmp)) )

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
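
The same mechanism works for the list of people mentioned earlier. The sketch below is only an illustration: the `(name, age)` records and the comparator name `by_name_then_older_first` are assumptions, and the chosen ordering (name increasing, then age decreasing) is a variation that the default tuple ordering does not give.

```python
import functools

# Hypothetical records: (name, age)
people = [("Bob", 25), ("Alice", 22), ("Bob", 31), ("Alice", 30)]

def by_name_then_older_first(p, q):
    """Negative if p must precede q, positive if q must precede p, 0 on ties."""
    if p[0] != q[0]:
        return -1 if p[0] < q[0] else 1  # names in increasing order
    return q[1] - p[1]                   # same name: larger age first

ordered = sorted(people, key=functools.cmp_to_key(by_name_then_older_first))
print(ordered)
# [('Alice', 30), ('Alice', 22), ('Bob', 31), ('Bob', 25)]
```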



------
### Exercise: Strange orderings
Given a list, write and test comparators to obtain the following orderings:
- Even numbers precede odd ones. Even numbers are sorted in non-decreasing order, while odd ones are sorted in non-increasing order.
- Strings are sorted in non-increasing order based on their lengths. Strings having the same length are sorted in non-increasing lexicographic order. 

In [None]:
my_list = list(range(10))
my_list2 = ["a", "b", "aba", "cad", "zzwz", "zzzz", 'adc', "fv", 'vf', "aaaa"]

In [None]:
######################################### TEST WITH LAMBDA FUNCTIONS #########################################

def Strange_EvenOdd(my_list):
    # Key is (parity, x) for even x and (parity, -x) for odd x: parity 0 puts even numbers
    # first in increasing order, and negating odd numbers sorts them in decreasing order.
    return sorted(my_list, key=lambda x: (x % 2, x) if x % 2 == 0 else (x % 2, -x))

def Strange_Longer(my_list2):
    # Key is (-length, -code of 1st char, -code of 2nd char, ...): negating every component
    # reverses both the length order and the lexicographic order, which also settles
    # ties between strings of equal length such as "zzwz" and "zzzz".
    return sorted(my_list2, key=lambda x: tuple([-len(x)] + [-ord(c) for c in x]))
    
    
print(Strange_EvenOdd(my_list))
print(Strange_Longer(my_list2))

######################################### WORKING SOLUTIONS #########################################

def cmp1(a, b):
    if a == b:
        return 0  # equal elements: their relative order does not matter

    a_odd, b_odd = a % 2, b % 2

    if a_odd and not b_odd:    # a is odd, b is even: b must come first
        return 1
    elif not a_odd and b_odd:  # a is even, b is odd: a must come first
        return -1
    elif a_odd and b_odd:      # both odd: non-increasing order, so the larger one comes first
        return 1 if a < b else -1

    return -1 if a < b else 1  # both even: non-decreasing order, so the smaller one comes first

print(sorted(my_list, key = functools.cmp_to_key(cmp1)))

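def cmp2(a, b):
    if a == b:
        return 0  # equal strings: their relative order does not matter
    if len(a) == len(b):
        return -1 if a > b else 1  # same length: the lexicographically larger string comes first

    return -1 if len(a) > len(b) else 1  # the longest element comes first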

print(sorted(my_list2, key = functools.cmp_to_key(cmp2)))

[0, 2, 4, 6, 8, 9, 7, 5, 3, 1]
['zzzz', 'zzwz', 'aaaa', 'cad', 'adc', 'aba', 'vf', 'fv', 'b', 'a']
[0, 2, 4, 6, 8, 9, 7, 5, 3, 1]
['zzzz', 'zzwz', 'aaaa', 'cad', 'adc', 'aba', 'vf', 'fv', 'b', 'a']


### Exercise: Insertion Sort with a comparator
Write the function ```InsertionSort(coll, cmp)``` that returns a sorted list with the elements in *coll* using 
```cmp``` as a comparator.

In [None]:
def InsertionSort(coll, cmp):
    for i in range(len(coll)):
        first_el = coll[i]
        j = i-1
        # any positive result (not only 1) means coll[j] must come after first_el
        while j >= 0 and cmp(coll[j], first_el) > 0:
            coll[j + 1] = coll[j]
            j = j-1
        coll[j+1] = first_el
    return coll


In [None]:
my_list = [51, 15, 2, 5, 10, 9, 3, 31, 2]

In [None]:
InsertionSort(my_list, cmp1)

[2, 2, 10, 51, 31, 15, 9, 5, 3]

In [None]:
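## Test your implementation here using the comparators from the previous exercise.

def test_sortedness(my_list, cmp):
    # sort a copy: InsertionSort works in place, and the reference result
    # should be computed from the original, unmodified input
    return InsertionSort(list(my_list), cmp) == sorted(my_list, key = functools.cmp_to_key(cmp))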

assert test_sortedness(my_list, cmp1), "Must be sorted"
assert test_sortedness(my_list2, cmp2), "Must be sorted"

-----

### Exercise: Intersection of two lists
Write a function ```intersection_slow(l1, l2)``` which returns the intersection of the two lists l1 and l2.

Use the trivial algorithm that runs in $\Theta(|l1|\times|l2|)$ time. 

In [None]:
def intersection_slow(first_list, second_list):
    return [el_in_first for el_in_first in first_list for el_in_second in second_list if el_in_first == el_in_second]

In [None]:
## Test your implementation here

l1 = [3, 5, 1, 2]
l2 = [1, 4, 6, 2]

assert set(intersection_slow(l1, l2)) == set([1, 2]), "Urca"

----
### Exercise: Faster intersection of two lists
Write a function ```intersection(l1, l2)``` which returns the intersection of the two lists l1 and l2.

Assume that both l1 and l2 are sorted!

In [None]:
def intersection(first_list, second_list):
    resulting_intersection = []
    
    for el_in_first in first_list:
        for el_in_second in second_list:
            if el_in_first < el_in_second:
                break
            if el_in_first == el_in_second:
                resulting_intersection.append(el_in_first)
    return resulting_intersection

In [None]:
## Test your implementation here

l1 = sorted([3, 5, 1, 2])
l2 = sorted([1, 4, 6, 2])

assert set(intersection(l1, l2)) == set([1, 2]), "Urca"
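
The nested loops above restart the scan of the second list for every element of the first, so the worst case can still be quadratic (for example, when every element of l1 is larger than every element of l2). A common alternative, sketched here under the assumption that both lists are sorted in increasing order, advances one index per list in a merge-like fashion. The name `intersection_merge` is chosen only for this illustration.

```python
def intersection_merge(first_list, second_list):
    """Intersect two increasingly sorted lists with one index per list."""
    result = []
    i, j = 0, 0
    while i < len(first_list) and j < len(second_list):
        if first_list[i] < second_list[j]:
            i += 1          # first_list[i] cannot appear later in second_list
        elif first_list[i] > second_list[j]:
            j += 1          # second_list[j] cannot appear later in first_list
        else:
            result.append(first_list[i])
            i += 1
            j += 1
    return result

print(intersection_merge([1, 2, 3, 5], [1, 2, 4, 6]))
# [1, 2]
```

Each iteration advances at least one index, so the loop performs at most $|l1| + |l2|$ iterations.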

----
### Exercise: Your own search engine
You are given a collection of texts and you want to build your own search engine. People at Google are already very scared!

Modern search engines are based on a data structure called an *inverted index*. 

Each document of the collection is assigned an identifier, starting from 0.
An inverted index stores a list, called *inverted list*, for each term of the collection.
The list for a term *t* contains the identifiers of all the documents containing term *t*. The list is sorted.

For example,

````
C = ["dog cat elephant monkey",  "dog lion tiger", "fish dog dog cat cow"]

````

The list of the term *cat* is [0, 2], and the list of *elephant* is [0].

Given two terms, an AND query reports all the documents containing both terms. For example, 
for *query("cat", "dog")* the result is [0, 2].
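
To make the expected behavior concrete, the sketch below writes out by hand the inverted lists of three terms of the example collection and answers an AND query with a set intersection. The names `inverted_lists` and `and_query` are assumptions for this illustration. It shows what the result should look like, not how the exercise has to be solved.

```python
# Inverted lists for three terms of C, written out by hand from the example
inverted_lists = {
    "cat": [0, 2],
    "dog": [0, 1, 2],
    "elephant": [0],
}

def and_query(lists, t1, t2):
    # hypothetical helper: intersect as sets, then sort to restore identifier order
    return sorted(set(lists[t1]) & set(lists[t2]))

print(and_query(inverted_lists, "cat", "dog"))       # [0, 2]
print(and_query(inverted_lists, "elephant", "dog"))  # [0]
```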

Your goal is to implement a simple search engine. Do the following:

- Given the collection, build a dictionary that maps each term to its inverted list. Observe that 
each document occurs at most once in each list. 
- Implement a function *query* which answers an AND query. 

In [None]:
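def build_index(C):
    index = {}

    for idx, el in enumerate(C):
        # split() without arguments also tolerates repeated whitespace
        terms = el.split()

        for term in terms:
            inverted_list = index.setdefault(term, [])
            # documents are visited in increasing order of identifier, so checking
            # the last entry is enough to store each document at most once per list
            if not inverted_list or inverted_list[-1] != idx:
                inverted_list.append(idx)

    return index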

def query(index, *terms):
    
    def intersection_slow(first_list, second_list):
        return [el_in_first for el_in_first in first_list for el_in_second in second_list if el_in_first == el_in_second]
    
    def intersection_multiple(multiple_lists):
        if multiple_lists == []:
            return None
        
        elif len(multiple_lists) == 1:
            return multiple_lists[0]

        else:
            intersected = multiple_lists[0]

            for ls in multiple_lists[1:]:
                intersected = intersection_slow(intersected, ls)

        # set() drops duplicates; sorted() guarantees increasing identifier order
        return sorted(set(intersected))

    to_intersect = []
    
    for term in terms:
        to_intersect.append(index[term])
    
    return intersection_multiple(to_intersect)

In [None]:
## Test your implementation here

C = ["dog cat elephant monkey",  "dog lion tiger", "fish dog dog cat cow"]

index = build_index(C)
assert query(index, "cat", "dog") == [0, 2], "Urca"