**Author**: Ismaele Gorgoglione

## L01 - Basic Sorting Algorithm
----

### Exercise: Selection Sort
Write the function ```SelectionSort(coll)``` that returns a sorted list with the elements in *coll*. 
You have to implement the Selection Sort algorithm.

In [1]:
def SelectionSort(coll):
    for i in range(len(coll)-1):
        min_pos = i
        for j in range(i+1, len(coll)):
            if coll[j] < coll[min_pos]:
                min_pos = j
        coll[i], coll[min_pos] = coll[min_pos], coll[i]
    return coll

In [2]:
## Check correctness

def test_sortedness(my_list):
    return my_list == sorted(my_list)

my_list = list(range(10))[::-1]

print(SelectionSort(my_list))

assert test_sortedness( SelectionSort(my_list) ), "Must be increasing!"

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


___

### Exercise: Insertion Sort
Write the function ```InsertionSort(coll)``` that returns a sorted list with the elements in *coll*. 
You have to implement the Insertion Sort algorithm.

In [3]:
def InsertionSort(coll):
    for j in range(1, len(coll)):
        key = coll[j]
        i = j-1
        while i >= 0 and coll[i] > key:
            coll[i+1] = coll[i]
            i = i-1
        coll[i+1] = key
    return coll

In [4]:
## Check correctness

my_list = list(range(10))[::-1]

print(InsertionSort(my_list))

assert test_sortedness( InsertionSort(my_list) ), "Must be increasing!"

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
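
The checks above use a single reversed list. As a complementary sketch, a sort function can also be compared against ```sorted()``` on many random lists. The helper name ```check_sort``` and its parameters ```trials``` and ```max_len``` are assumptions introduced here for illustration; they are not part of the exercise.

```python
import random

def check_sort(sort_fn, trials=100, max_len=20):
    """Return True if sort_fn agrees with sorted() on random lists."""
    rng = random.Random(0)  # fixed seed so that a failure can be reproduced
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, max_len))]
        expected = sorted(data)
        # Pass a copy: the sorts in this notebook modify their argument in place.
        if sort_fn(list(data)) != expected:
            return False
    return True

# sorted() stands in for the function under test so that this cell runs on its own;
# in the notebook one would pass SelectionSort or InsertionSort instead.
print(check_sort(sorted))
```

Random inputs also cover duplicates, negative numbers, and the empty list, which the reversed ```range(10)``` does not.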


---

To use our own comparator with ```.sort()``` and ```sorted()```, we have to use the ```functools.cmp_to_key(cmp)``` function. It converts our comparator into a function that can be passed as the argument for the parameter ```key```.
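
As a minimal sketch of this conversion, the comparator below (the name ```descending``` is an assumption made for this example) returns a negative number when its first argument should come first, a positive number when the second should, and zero on a tie.

```python
import functools

def descending(a, b):
    # Negative: a goes first; positive: b goes first; zero: tie.
    return b - a

nums = [3, 1, 2]

# sorted() returns a new list and leaves nums untouched.
in_new_list = sorted(nums, key=functools.cmp_to_key(descending))
print(in_new_list)  # [3, 2, 1]

# .sort() reorders nums in place.
nums.sort(key=functools.cmp_to_key(descending))
print(nums)  # [3, 2, 1]
```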


In [5]:
import functools

### Exercise: Strange orderings
Given a list, write and test comparators to obtain the following orderings:
- Even numbers precede odd ones. Even numbers are sorted in non-decreasing order, while odd ones are sorted in non-increasing order.
- Strings are sorted in non-increasing order based on their lengths. Strings having the same length are sorted in non-increasing lexicographic order. 

In [6]:
my_list = list(range(10))
my_list2 = ["a", "b", "aba", "cad", "zzzz", "aaaa"]

In [7]:
## Your implementation here!!!
def my_cmp(a, b):
    if a % 2 == 0 and b % 2 == 0:  # both even
        return a - b  # non-decreasing order
    elif a % 2 == 1 and b % 2 == 1:  # both odd
        return b - a  # non-increasing order
    elif a % 2 == 0:  # a is even, b is odd
        return -1  # a comes before b
    else:  # a is odd, b is even
        return 1  # b comes before a

In [8]:
print( sorted(my_list, key = functools.cmp_to_key(my_cmp)) )

[0, 2, 4, 6, 8, 9, 7, 5, 3, 1]
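
The second ordering (strings) can be handled with the same approach. The following is one possible sketch, not the only solution; the names ```str_cmp``` and ```words``` are assumptions introduced for this example.

```python
import functools

def str_cmp(a, b):
    # Longer strings come first (non-increasing length).
    if len(a) != len(b):
        return len(b) - len(a)
    # Same length: non-increasing lexicographic order.
    if a > b:
        return -1
    if a < b:
        return 1
    return 0

words = ["a", "b", "aba", "cad", "zzzz", "aaaa"]
result = sorted(words, key=functools.cmp_to_key(str_cmp))
print(result)  # ['zzzz', 'aaaa', 'cad', 'aba', 'b', 'a']
```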


---

### Exercise: Insertion Sort with a comparator
Write the function ```InsertionSort(coll, cmp)``` that returns a sorted list with the elements in *coll* using 
```cmp``` as a comparator.

In [9]:
def InsertionSort(coll, cmp):
    for i in range(1, len(coll)):
        j = i
        while j > 0 and cmp(coll[j], coll[j-1]) < 0:
            coll[j], coll[j-1] = coll[j-1], coll[j]
            j -= 1
    return coll

In [10]:
my_list = list(range(10))
def my_cmp(a,b):
    if a > b: return 1
    if a < b: return -1
    return 0  # equal elements compare as a tie

In [11]:
## Implementation test
def test_sortedness(my_list, cmp):
    return InsertionSort(my_list, cmp) == sorted(my_list, key = functools.cmp_to_key(cmp))

print(InsertionSort(my_list,my_cmp))

assert test_sortedness(my_list, my_cmp), "Must be sorted"
assert test_sortedness(my_list2, my_cmp), "Must be sorted"

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


-----

### Exercise: Intersection of two lists
Write a function ```intersection_slow(l1, l2)``` which returns the intersection of the two lists l1 and l2.

Use the trivial algorithm that runs in $\Theta(|l1|\times|l2|)$ time.

In [12]:
def intersection_slow(l1, l2):
    l3 = []
    for item in l1:
        if item in l2:
            l3.append(item)
    return l3

In [13]:
## Implementation test

l1 = [3, 5, 1, 2]
l2 = [1, 4, 6, 2]

#print(intersection_slow(l1,l2))

assert set(intersection_slow(l1, l2)) == set([1, 2]), "Urca"

----
### Exercise: Faster intersection of two lists
Write a function ```intersection(l1, l2)``` which returns the intersection of the two lists l1 and l2.

Assume that both l1 and l2 are sorted!

In [14]:
def intersection(l1,l2):
    l3 = []
    i = 0
    j = 0
    while i < len(l1) and j < len(l2):
        if l1[i] < l2[j]:
            i += 1  # l1[i] is smaller than everything left in l2
        elif l1[i] > l2[j]:
            j += 1  # l2[j] is smaller than everything left in l1
        else:
            l3.append(l1[i])
            i += 1
            j += 1
    return l3

In [15]:
## Implementation test

l1 = sorted([3, 5, 1, 2])
l2 = sorted([1, 4, 6, 2])

assert set(intersection(l1, l2)) == set([1, 2]), "Urca"

----
### Exercise: Search engine
You are given a collection of texts and you want to build your own search engine. People at Google are already very scared!

Modern search engines are based on a data structure called *Inverted Index*. 

Each document of the collection is assigned an identifier, starting from 0.
An inverted index stores a list, called *inverted list*, for each term of the collection.
The list for a term *t* contains the identifiers of all the documents containing term *t*. The list is sorted.

For example,

````
C = ["dog cat elephant monkey",  "dog lion tiger", "fish dog dog cat cow"]

````

The list of term *cat* is [0,2], the list of *elephant* is [0].

Given two terms, an AND query reports all the documents containing both terms. For example, 
for *query("cat", "dog")*, the result is [0, 2].
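
The example lists above can be checked by brute force. The sketch below scans the whole collection for each term, so it is only an illustration of the definitions and not the inverted index the exercise asks for; the helper name ```posting_list``` is an assumption made here.

```python
C = ["dog cat elephant monkey", "dog lion tiger", "fish dog dog cat cow"]

def posting_list(collection, term):
    # Hypothetical helper: keep the identifiers of the documents that contain
    # the term. enumerate() yields identifiers in increasing order, so the
    # resulting list is sorted and holds each document at most once.
    return [doc_id for doc_id, text in enumerate(collection) if term in text.split()]

cat = posting_list(C, "cat")
elephant = posting_list(C, "elephant")
dog = posting_list(C, "dog")

# AND query: identifiers that appear in both lists.
both = [doc_id for doc_id in cat if doc_id in dog]

print(cat, elephant, dog, both)  # [0, 2] [0] [0, 1, 2] [0, 2]
```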

Your goal is to implement a simple search engine. Do the following. 

- Given the collection, build a dictionary that maps each term to its inverted list. Observe that 
each document occurs at most once in each list. 
- Implement a function *query* which answers an AND query. 

In [16]:
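def build_index(C):
    index = {}
    for pos, item in enumerate(C):
        for term in item.split():
            postings = index.setdefault(term, [])
            # Identifiers arrive in increasing order, so a term repeated in
            # the same document can only match the last entry of its list.
            if not postings or postings[-1] != pos:
                postings.append(pos)
    # Every list is already sorted because documents are visited in order.
    return index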

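def query(index, t1, t2):
    # At least one term does not occur in the collection.
    if t1 not in index or t2 not in index:
        return None
    # Look the two lists up directly; sorted() makes the order of the result
    # explicit, since a set intersection does not guarantee one.
    return sorted(set(index[t1]) & set(index[t2]))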

In [17]:
## Implementation test

C = ["dog cat elephant monkey",  "dog lion tiger", "fish dog dog cat cow"]

index = build_index(C)
assert query(index, "cat", "dog") == [0, 2], "Urca"