# Lab 09: Lambdas and comprehensions

## Exercise 0: Lambda basics

Lambda functions are useful when you want to parameterize a behavior. In the following example, the operation applied to *x* and *y* is selected by a randomly chosen index *n*.

In [None]:
import random
r = [
    lambda x,y: x+y, # Sum
    lambda x,y: x-y, # Difference
    lambda x,y: x*y  # Multiplication
]
n = random.randrange(0,3,1)
print(f"Random number is {n}")
r[n](2,3)
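
The list of lambdas above works as a small dispatch table indexed by position. As a minimal sketch of the same idea, the table can also be keyed by name with a dictionary. The names `operations` and `apply_operation` are hypothetical and not part of the lab.

```python
# Hypothetical dispatch table: operation name -> lambda
operations = {
    "sum": lambda x, y: x + y,
    "difference": lambda x, y: x - y,
    "product": lambda x, y: x * y,
}

def apply_operation(name, x, y):
    # Look up the behavior by name and apply it to the two arguments
    return operations[name](x, y)

print(apply_operation("sum", 2, 3))         # 5
print(apply_operation("difference", 2, 3))  # -1
print(apply_operation("product", 2, 3))     # 6
```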

## Exercise 1: Compare strings

Consider the following two sets:

In [None]:
s1 = {"a","b","c"}
s2 = {"b","c","d"}

Define the variable *comparators* as a LIST of lambda functions that compute:
- The union of two sets
- The difference of two sets
- The intersection of two sets

(If you don't remember how this is done, check Slides 5bis.)

Then, define a function ```compare_strings(fun,s1,s2)``` that applies the function *fun*, given as input, to *s1* and *s2* and returns the result.

In [None]:
comparators = [
    lambda x,y: x | y, # Union
    lambda x,y: x - y, # Difference
    lambda x,y: x & y  # Intersection
]

In [None]:
def compare_strings(fun,s1,s2):
    return fun(s1,s2)

In [None]:
compare_strings(comparators[2],s1,s2)
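
To see what each set operation produces on *s1* and *s2*, the following sketch pairs every lambda with a label and collects the results in a dictionary comprehension. The `labelled` list and the `results` dictionary are illustrative names, not part of the exercise.

```python
s1 = {"a", "b", "c"}
s2 = {"b", "c", "d"}

def compare_strings(fun, s1, s2):
    return fun(s1, s2)

# Pair each lambda with a label so the output is self-describing
labelled = [
    ("union", lambda x, y: x | y),
    ("difference", lambda x, y: x - y),
    ("intersection", lambda x, y: x & y),
]

# Dictionary comprehension: label -> result of applying the lambda
results = {name: compare_strings(fun, s1, s2) for name, fun in labelled}
for name, value in results.items():
    print(name, sorted(value))
# union ['a', 'b', 'c', 'd']
# difference ['a']
# intersection ['b', 'c']
```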

## Exercise 2: Sorting grades

You have a list of tuples, *grades*, where each tuple contains a grade received by a student on a course. The format is: (student_id, student_name, course, grade, laude); note that *laude* is a boolean value denoting the "30 cum laude" grade.

Write code that uses the Python built-in function *sorted* (https://docs.python.org/3/library/functions.html#sorted) in combination with lambda functions to sort the list as follows:
- Sort by student id
- Sort by name
- Sort by grades
- Sort by decreasing grades
- Sort by decreasing grades; in case of a tie, give precedence to laudes
- Sort by decreasing grades; in case of a tie, give precedence to laudes; if the tie persists, sort alphabetically by name

Finally:
- Use a list comprehension to return a new list with the names of students who have a grade in 'Database systems'
- (Extra) Considering the last sorting function above, replace the *grades* list with a call to the *filter* function (https://docs.python.org/3/library/functions.html#filter) that keeps only the entries for the 'Database systems' course

In [None]:
grades = [
 (42, 'Alice', 'Programming and computer architectures', 18, False),
 (42, 'Alice', 'Database systems', 18, False),
 (1, 'Bob', 'Fundamentals of accounting', 23, False),
 (1, 'Bob', 'Fundamentals of management and organization', 30, False),
 (2, 'Chuck', 'Fundamentals of finance and banking', 28, False),
 (5, 'Dan', 'Fundamentals of accounting', 24, False),
 (5, 'Dan', 'Operating systems, networks and web', 23, False),
 (4, 'Eve', 'Fundamentals of management and organization', 24, False),
 (4, 'Eve', 'Fundamentals of accounting', 30, False),
 (99, 'Frank', 'Fundamentals of finance and banking', 23, False),
 (1, 'Bob', 'Programming and computer architectures', 30, True),
 (5, 'Dan', 'Database systems', 25, False),
 (99, 'Frank', 'Fundamentals of management and organization', 26, False),
 (99, 'Frank', 'Programming and computer architectures', 25, False),
 (99, 'Frank', 'Database systems', 27, False),
 (2, 'Chuck', 'Operating systems, networks and web', 26, False),
 (4, 'Eve', 'Operating systems, networks and web', 30, True),
 (42, 'Alice', 'Fundamentals of finance and banking', 29, False),
 (2, 'Chuck', 'Fundamentals of management and organization', 30, True),
 (99, 'Frank', 'Operating systems, networks and web', 22, False),
 (2, 'Chuck', 'Database systems', 30, False),
 (5, 'Dan', 'Fundamentals of finance and banking', 30, False),
 (2, 'Chuck', 'Fundamentals of accounting', 28, False),
 (1, 'Bob', 'Operating systems, networks and web', 24, False),
 (1, 'Bob', 'Database systems', 24, False),
 (5, 'Dan', 'Fundamentals of management and organization', 19, False),
 (5, 'Dan', 'Programming and computer architectures', 30, False),
 (2, 'Chuck', 'Programming and computer architectures', 23, False),
 (99, 'Frank', 'Fundamentals of accounting', 18, False),
 (1, 'Bob', 'Fundamentals of finance and banking', 22, False),
 (4, 'Eve', 'Programming and computer architectures', 24, False),
 (4, 'Eve', 'Fundamentals of finance and banking', 20, False),
 (42, 'Alice', 'Fundamentals of accounting', 30, False),
 (42, 'Alice', 'Operating systems, networks and web', 18, False)
]

In [None]:
# Sort by student id
sorted(grades, key = lambda x: x[0])

In [None]:
# Sort by name
sorted(grades, key = lambda x: x[1])

In [None]:
# Sort by grades
sorted(grades, key = lambda x: x[3])

In [None]:
# Sort by decreasing grades
sorted(grades, key = lambda x: -x[3])

In [None]:
# Sort by decreasing grades; in case of a tie, give precedence to laudes
sorted(grades, key = lambda x: (-x[3], not x[4]))
# Alternative:
sorted(grades, key = lambda x: (x[3], x[4]), reverse = True)

In [None]:
# Sort by decreasing grades; in case of a tie, give precedence to laudes; if the tie persists, sort alphabetically by name
sorted(grades, key = lambda x: (-x[3], not x[4], x[1]))
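
The tuple keys used in these solutions rely on two facts: tuples are compared element by element, and `False` sorts before `True`, so `not x[4]` places laudes first. A minimal sketch on a made-up four-row `sample` (not taken from the *grades* list) shows the effect:

```python
# Made-up rows in the same (student_id, student_name, course, grade, laude) format
sample = [
    (1, "Bob", "Course A", 30, False),
    (2, "Chuck", "Course A", 30, True),
    (3, "Alice", "Course A", 30, True),
    (4, "Dan", "Course A", 28, False),
]

# -x[3]: higher grades first; not x[4]: laudes first; x[1]: names from A to Z
ranked = sorted(sample, key=lambda x: (-x[3], not x[4], x[1]))
print([r[1] for r in ranked])  # ['Alice', 'Chuck', 'Bob', 'Dan']
```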

In [None]:
# Use a list comprehension to return a new list with the names of students who have a grade in 'Database systems'
[g[1] for g in grades if g[2] == "Database systems"]

In [None]:
# (Extra) Keep only the 'Database systems' entries, then sort by decreasing grades; in case of a tie, give precedence to laudes; if the tie persists, sort alphabetically by name
sorted(filter(lambda x: x[2] == "Database systems", grades), key = lambda x: (-x[3], not x[4], x[1]))
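
Note that `filter` returns a lazy iterator rather than a list; `sorted` accepts any iterable, so the two can be combined directly. As a small sketch on a made-up `rows` list, a list comprehension with a condition selects the same elements:

```python
# Made-up rows in the same format as the grades list
rows = [
    (42, "Alice", "Database systems", 18, False),
    (1, "Bob", "Fundamentals of accounting", 23, False),
    (5, "Dan", "Database systems", 25, False),
]

# filter() is lazy, so list() is needed to materialize its result
with_filter = list(filter(lambda x: x[2] == "Database systems", rows))
with_comprehension = [x for x in rows if x[2] == "Database systems"]

print(with_filter == with_comprehension)  # True
print([x[1] for x in with_filter])        # ['Alice', 'Dan']
```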

## Exercise 3: List of strings

Write a Python function named *analyze_strings* that takes a list of strings as its only parameter. The function must return a tuple (n1, n2, l), where:
- n1 is the length of the shortest string;
- n2 is the length of the longest string;
- l is a list containing, for each input string, a tuple (c1, c2, s), where c1 and c2 are the first and last characters in the string, respectively, and s is the string reversed.

For example,

```analyze_strings(["One", "Two", "Three"])```
    
must return:

```(3, 5, [('O', 'e', 'enO'), ('T', 'o', 'owT'), ('T', 'e', 'eerhT')])```
    
You can assume that the input list is not empty and that it does not contain empty strings. Use the *list comprehension* syntax as much as possible.

In [None]:
def analyze_strings(p):
    lengths = [len(s) for s in p]
    return min(lengths), max(lengths), [(s[0], s[-1], s[::-1]) for s in p]
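
As an alternative sketch, the intermediate `lengths` list can be avoided by passing generator expressions directly to `min` and `max`. The name `analyze_strings_gen` is hypothetical and is used only to keep this variant separate from the solution above.

```python
def analyze_strings_gen(p):
    # Generator expressions feed min() and max() without building a list
    shortest = min(len(s) for s in p)
    longest = max(len(s) for s in p)
    # One tuple per string: first character, last character, reversed string
    return shortest, longest, [(s[0], s[-1], s[::-1]) for s in p]

print(analyze_strings_gen(["One", "Two", "Three"]))
# (3, 5, [('O', 'e', 'enO'), ('T', 'o', 'owT'), ('T', 'e', 'eerhT')])
```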

## Exercise 4: Word counting

Consider the following code, which counts the frequency of the words in a text read from a file whose path is given as input.

In [None]:
def readFromFile(p):
    try:
        with open(p) as f:
            txt = f.read()
            return txt.split("\n")
    except OSError:
        # File missing or unreadable: return no lines so that callers can still iterate
        return []

def remove_punctuation(t):
    for c in t:
        if not c.isalpha():
            t = t.replace(c," ")
    return t
    
def word_count(path,n):
    txt = readFromFile(path)
    counts = {}  # word -> number of occurrences
    for line in txt:
        line = remove_punctuation(line.lower())
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return {w: c for w, c in counts.items() if c>=n}

The function ```word_count(path,n)``` must become ```word_count(path,fun)```, where *fun* is a lambda function that takes as input a word *w* and its number of occurrences *n* and determines **WHICH** words should be returned:
- Call ```word_count('files/lost.txt', ...)``` so that it returns only words that appear at least 3 times
- Call ```word_count('files/lost.txt', ...)``` so that it returns only words that begin with the letter "a"
- Call ```word_count('files/lost.txt', ...)``` so that it returns only words that are at least 10 characters long
- Call ```word_count('files/lost.txt', ...)``` so that it returns only words that contain the letter "p" and that appear at least twice

Try variations of the same functions on the other files (`files/sheep.txt` and `files/divine_comedy.txt`).

In [None]:
def word_count(path,fun):
    txt = readFromFile(path)
    counts = {}  # word -> number of occurrences
    for line in txt:
        line = remove_punctuation(line.lower())
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return {w: c for w, c in counts.items() if fun(w, c)}

In [None]:
word_count('files/lost.txt', lambda w,n: n>=3)

In [None]:
word_count('files/lost.txt', lambda w,n: w[0]=="a")

In [None]:
word_count('files/lost.txt', lambda w,n: len(w)>=10)

In [None]:
word_count('files/lost.txt', lambda w,n: "p" in w and n>=2)
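
The same predicate-based filtering can be tried without the text files. The sketch below is a hypothetical variant, `word_count_lines`, that takes a list of lines instead of a path and counts words with `collections.Counter`; the sample `lines` are made up for illustration.

```python
from collections import Counter

def remove_punctuation(t):
    # Replace every non-alphabetic character with a space
    return "".join(c if c.isalpha() else " " for c in t)

def word_count_lines(lines, fun):
    # Hypothetical variant of word_count: takes the lines directly, not a file path
    counts = Counter()
    for line in lines:
        counts.update(remove_punctuation(line.lower()).split())
    # fun(w, n) decides which words are kept
    return {w: n for w, n in counts.items() if fun(w, n)}

lines = ["The cat and the dog.", "The dog, the cat, the bird!"]  # made-up sample text
print(word_count_lines(lines, lambda w, n: n >= 2))     # {'the': 5, 'cat': 2, 'dog': 2}
print(word_count_lines(lines, lambda w, n: w[0] == "b"))  # {'bird': 1}
```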

## (Extra) Exercise 5: Histogram of letters

Write a Python function called *create_histogram* that takes a string as input parameter and performs the following operations:
- converts all the characters to uppercase;
- counts the occurrences of each letter in the text and creates a dictionary that associates to each letter its number of occurrences;
  - *Hint: to consider only alphabetic characters, use the ```.isalpha()``` method of strings*
- returns a list of tuples (l, h), where l is a letter and h is a string containing as many asterisks as the number of times the letter appears in the text; the list of tuples must be sorted in alphabetical order of the letters.

For example,

```create_histogram("If we can't live together, we're gonna die alone.")```
    
must return:

    [('A', '***'),
     ('C', '*'),
     ('D', '*'),
     ('E', '********'),
     ('F', '*'),
     ('G', '**'),
     ('H', '*'),
     ('I', '***'),
     ('L', '**'),
     ('N', '****'),
     ('O', '***'),
     ('R', '**'),
     ('T', '***'),
     ('V', '*'),
     ('W', '**')]

In [None]:
def create_histogram(text):
    d = {}
    for c in text.upper():
        if c.isalpha():
            if c in d:
                d[c] += '*'
            else:
                d[c] = '*'
    return [(c, d[c]) for c in sorted(d.keys())]

In [None]:
def create_histogram(text):
    ch_list = [c for c in text.upper() if c.isalpha()]
    d = {}
    for ch in ch_list:
        d[ch] = d.get(ch, 0) + 1
    return [(ch, "*" * d[ch]) for ch in sorted(d.keys())]
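
A third possible version, sketched here under the assumption that using the standard library is allowed in this lab, delegates the counting to `collections.Counter`. The name `create_histogram_counter` is hypothetical.

```python
from collections import Counter

def create_histogram_counter(text):
    # Count only the alphabetic characters of the uppercased text
    counts = Counter(c for c in text.upper() if c.isalpha())
    # Iterating over sorted(counts) yields the letters in alphabetical order
    return [(letter, "*" * counts[letter]) for letter in sorted(counts)]

hist = create_histogram_counter("If we can't live together, we're gonna die alone.")
print(hist[:4])  # [('A', '***'), ('C', '*'), ('D', '*'), ('E', '********')]
```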

## (Extra) Exercise 6: Find common words in two strings

Write a Python function called *find_common_words* that takes two strings as input arguments and returns the sorted list of common words. The function should:
- separate words based on the presence of spaces (" ")
- remove punctuation characters, but keep apostrophes (use the `replace` method of strings)
- make no distinction between uppercase and lowercase letters.

For example,

    text1 = "And so he spoke, and so he spoke, that lord of Castamere, But now the rains weep o'er his hall, with no one there to hear."
    text2 = "No one sang the words, but Catelyn knew \"The Rains of Castamere\" when she heard it."
    find_common_words(text1,text2)
    
must return:

    ['but', 'castamere', 'no', 'of', 'one', 'rains', 'the']
    
Hints:
- Define a function to translate the text into words and apply it over both text1 and text2
- Be efficient in finding the common words

In [None]:
def set_of_words(s):
    for p in ".,!?:;-\"": # This requires explicitly indicating which characters should be removed
        s = s.replace(p, " ")
    return set(s.lower().split()) 
    # Notice that the 'set()' function removes possible duplicates in the list!
    # Also, calling 'split()' without parameters avoids generating the empty string ""

def find_common_words(s1, s2):
    return sorted(set_of_words(s1) & set_of_words(s2))

In [None]:
def set_of_words(s):
    s = s.lower()
    for c in s:
        if not c.isalpha() and c!="'": # This replaces any non-alphabetic character (except apostrophes) with a space
            s = s.replace(c," ")
    
    return set(s.split()) 

def find_common_words(s1, s2):
    return sorted(set_of_words(s1) & set_of_words(s2))

In [None]:
text1 = "And so he spoke, and so he spoke, that lord of Castamere, But now the rains weep o'er his hall, with no one there to hear."
text2 = "No one sang the words, but Catelyn knew \"The Rains of Castamere\" when she heard it."
find_common_words(text1,text2)