# 2. Alphabetical Sort

In this exercise we will focus on sorting characters and strings.
### Part 1:
For the first part we were asked to implement the Counting Sort algorithm and this is how we did it:

In [1]:
def counting_sort(l):

    range_el = max(l) + 1
    occur = [0] * range_el
    final = [0] * (len(l))

    for i in range(len(l)):
        occur[l[i]] += 1

    for i in range(1, len(occur)):
        occur[i] = occur[i - 1] + occur[i]

    for e in l:
        final[occur[e]-1] = e
        occur[e] -= 1

    return final

### N.B.
* range_el is the size of the range of the elements we are trying to sort: it is the biggest value in the list plus one, so that occur has one slot for every value from 0 to max(l).
* in the first loop we are counting the occurrences of every element
* in the second loop we are turning the counts into cumulative counts by adding to each element of the occur vector the one before it, so occur[v] becomes the number of elements less than or equal to v
* in the third loop we are actually creating the ordered list (final): each element e is written at index occur[e]-1, and occur[e] is then decremented
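
To make the three loops concrete, here is a small sketch that runs the same steps on a tiny list and returns the intermediate state of the count array. The function name counting_sort_trace and its return values are made up for illustration; the logic is meant to mirror the steps listed above.

```python
def counting_sort_trace(values):
    """Counting sort that also returns the intermediate count arrays (illustrative helper)."""
    size = max(values) + 1
    counts = [0] * size

    # Loop 1: count the occurrences of every value.
    for v in values:
        counts[v] += 1
    raw_counts = counts[:]

    # Loop 2: cumulative sums, so counts[v] = number of elements <= v.
    for i in range(1, size):
        counts[i] += counts[i - 1]
    cumulative = counts[:]

    # Loop 3: place every element at its final position.
    output = [0] * len(values)
    for v in values:
        output[counts[v] - 1] = v
        counts[v] -= 1

    return raw_counts, cumulative, output


raw, cum, out = counting_sort_trace([3, 1, 0, 3, 2])
print(raw)  # [1, 1, 1, 2]
print(cum)  # [1, 2, 3, 5]
print(out)  # [0, 1, 2, 3, 3]
```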

We used [this video][1] as a reference to understand how counting sort works.


[1]: https://www.youtube.com/watch?v=7zuGmKfUt7s

### Part 2:
Here we wrote an algorithm that uses Counting Sort to sort the letters of the alphabet:

In [4]:
def sort_char(char_list):

    to_sort_int = [(ord(x) - 97) for x in char_list]  # turning the characters into int from 0 to 25
    result = counting_sort(to_sort_int)               # running the counting sort on them 
    result = [chr(x + 97) for x in result]            # turning the int back into char
    return result

Let's see how it works:

In [5]:
to_sort = ['p', 'a', 'n', 'd', 'r', 'e', 'q', 'w', 't', 'y', 'u', 'i', 'o', 's', 'f', 'g', 'h', 'j', 'k', 'l', 'z', 'x',
           'c', 'v', 'b', 'm', 'f']

print(sort_char(to_sort))

['a', 'b', 'c', 'd', 'e', 'f', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']


#### Time Complexity:
Counting sort is well suited to sorting non-negative integers in a known and relatively small range, because it runs in linear time. There are no nested for loops: every loop iterates either over the input (n elements) or over the range (k) of possible values of the input. The complexity is therefore O(n+k), where n is the length of the input list and k is the size of the range of possible values of the elements in the list.
The algorithm that works on characters adds no extra complexity: it turns every char into an int (a loop over n), calls the counting sort, and then turns every int back into a char (another loop over n). For lowercase letters k is at most 26, so the time complexity remains linear in n.
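
As a quick sanity check on the reasoning above, the following sketch compares a counting-based sort of characters against Python's built-in sorted. The names counting_sort_ref and sort_chars_ref are made up for this check, and the helper is a simplified variant that rebuilds the output straight from the counts rather than the cumulative-sum version shown in Part 1. It assumes the input contains only the lowercase letters a to z, so k is fixed at 26.

```python
import random
import string


def counting_sort_ref(values, k):
    """Counting sort for integers in range(k): one pass over n, one pass over k."""
    counts = [0] * k
    for v in values:                      # loop over the n input elements
        counts[v] += 1
    result = []
    for value, count in enumerate(counts):  # loop over the k possible values
        result.extend([value] * count)
    return result


def sort_chars_ref(chars):
    """Sort lowercase letters by mapping them to 0..25 and back."""
    codes = [ord(c) - ord('a') for c in chars]
    return [chr(c + ord('a')) for c in counting_sort_ref(codes, 26)]


random.seed(0)
sample = random.choices(string.ascii_lowercase, k=1000)
print(sort_chars_ref(sample) == sorted(sample))  # True
```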

### Part 3:
In this part we built an algorithm based on counting sort to sort strings alphabetically. We decided to convert the strings into numbers (turning every char into a two-digit number) and sort them with an algorithm based on radix sort, which repeatedly calls counting sort to sort the numbers one digit at a time, starting from the least significant digit and moving up to the most significant one. In this algorithm we used some **auxiliary functions:**

In [6]:
# This function is used to prepare the input for the main function

def prep_input(list_strings):

    m = len(max(list_strings, key=len))      # max length of a string in the list
    out = []
    list_strings = [x.lower() for x in list_strings]  # converting to lowercase

    for word in list_strings:                # for every word we prepare the new numerical string
        s = ""
        for letter in word:
            if letter == " ":                # if the word has spaces in it we represent them with the code '01'
                s += '01'
            else:
                s += str(ord(letter)-86)     # all the other letters are converted into a number (from 11 to 36)
        s = s.ljust(m * 2, '0')              # strings shorter than m are padded with zeros at the end
                                             # so that all of them have the same length (2 digits per char)
        out.append(s)

    out = [int(x) for x in out]              # converting all the strings into int

    return out


# This function is used to convert the numbers back into words after we sorted them

def prettify(ordered_list_int):
    result = []
    for num in ordered_list_int:
        num = str(num)
        stringa = ""                                 # for every number we create a string
        for i in range(1, len(num), 2):              # we have a character every two digits
            code = num[i-1:i+1]                      # the two-digit code of the current character
            if code == "00":                         # 00 gets ignored because it was just padding
                continue
            elif code == "01":                       # 01 means a space inside the string
                stringa += " "
            else:
                stringa += chr(int(code) + 86)       # all the other numbers are turned back into chars
        result.append(stringa)

    return result
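
To see what the encoding produces, here is a minimal sketch that applies the same scheme (two digits per character, '01' for a space, zero padding on the right) to a few words from the test input further below. The helper name encode_word and its width parameter are hypothetical and only serve this illustration; the sketch assumes lowercase letters and spaces only.

```python
def encode_word(word, width):
    """Encode a lowercase word as an int, two digits per character (hypothetical helper)."""
    digits = ""
    for ch in word:
        # a space becomes '01', a letter becomes a code from 11 ('a') to 36 ('z')
        digits += "01" if ch == " " else str(ord(ch) - 86)
    # pad on the right so every word has width * 2 digits
    return int(digits.ljust(width * 2, "0"))


codes = {w: encode_word(w, 8) for w in ["oak", "oak hill", "kiss"]}
print(codes["oak"])       # 2511210000000000
print(codes["oak hill"])  # 2511210118192222
print(codes["kiss"])      # 2119292900000000
```

Because the padding is added on the right, a word that is a prefix of another (oak, oak hill) gets the smaller number, and the numerical order of the codes matches the alphabetical order of the words.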

* Then we have the **main functions** of the algorithm:

In [7]:
# This has the same structure as the counting sort above, but with a small change that
# lets us consider only the current decimal position (expressed by digit) to sort the numbers.
# Moreover this time we are modifying the list in place

def counting_sort_snd(l, digit):

    occur = [0] * 10
    final = [0] * len(l)

    for i in range(len(l)):
        occur[(l[i] // digit) % 10] += 1

    for i in range(1, len(occur)):
        occur[i] = occur[i - 1] + occur[i]

    for e in reversed(l):
        final[occur[(e // digit) % 10] - 1] = e
        occur[(e // digit) % 10] -= 1

    for i in range(len(l)):
        l[i] = final[i]

In [12]:
# This is our version of the radix sort algorithm; it finds the biggest number in the list to sort
# and then calls the counting sort on every decimal position, moving to the next digit every time.

def radix_sort(l):

    range_el = max(l)
    digit = 1

    while range_el // digit > 0:         # integer division: stop once digit is past the biggest number
        counting_sort_snd(l, digit)
        digit *= 10
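
The digit-by-digit passes can be followed on a small list of plain integers. This is only a sketch: radix_passes is a made-up name, it uses Python's built-in sorted (which is stable) as a stand-in for the counting sort pass, and it uses integer division for the stopping condition.

```python
def radix_passes(values):
    """Return a snapshot of the list after each least-significant-digit-first pass."""
    values = list(values)
    snapshots = []
    digit = 1
    while max(values) // digit > 0:
        # stable sort on the current decimal digit only
        values = sorted(values, key=lambda v: (v // digit) % 10)
        snapshots.append(list(values))
        digit *= 10
    return snapshots


for step in radix_passes([170, 45, 75, 90, 2, 24]):
    print(step)
# [170, 90, 2, 24, 45, 75]
# [2, 24, 45, 170, 75, 90]
# [2, 24, 45, 75, 90, 170]
```

Each pass must be stable, otherwise the order established by the earlier (less significant) digits would be lost.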

* and this is the final product:

In [13]:
def my_algorithm(to_sort):

    print("Input: ", to_sort)            # printing input
    prepped_list = prep_input(to_sort)   # preparing input

    radix_sort(prepped_list)             # ordering list
    result = prettify(prepped_list)      # make the result readable

    return result

Let's see it at work:

In [14]:
input_test = ["good", "bad", "building", "oak hill", "zara", "kiss", "kissing", "oak", "crazy", "wow"]

print("Result: ", my_algorithm(input_test))

Input:  ['good', 'bad', 'building', 'oak hill', 'zara', 'kiss', 'kissing', 'oak', 'crazy', 'wow']
Result:  ['bad', 'building', 'crazy', 'good', 'kiss', 'kissing', 'oak', 'oak hill', 'wow', 'zara']


#### Time complexity:
The auxiliary functions both take O(m * n) time, where m is the number of words and n is the maximum length of the words, because they loop over every word of the list and then over every letter of the word (or every pair of digits). Each counting sort pass has the same structure as before and, since a digit can only take 10 values, it is still linear in m. The radix_sort calls the counting sort once for every digit of the biggest number, so its total cost is O(m * j), where m is still the number of elements to sort and j is the number of digits of the biggest number. Every word is padded to 2n digits, so j is at most 2n and the whole algorithm runs in O(m * n), assuming that arithmetic on these numbers takes constant time.
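
The number of counting sort passes can be checked with a short sketch. The helper radix_pass_count is a made-up name, and the number used below is the code of "zara" padded to 8 characters under the two-digits-per-character encoding of Part 3.

```python
def radix_pass_count(largest):
    """Number of per-digit passes a least-significant-digit radix sort needs for `largest`."""
    passes, digit = 0, 1
    while largest // digit > 0:
        passes += 1
        digit *= 10
    return passes


# 'zara' -> 36 11 28 11, padded with zeros to 8 characters (16 digits)
print(radix_pass_count(3611281100000000))  # 16
```

With a maximum word length of 8 this gives 16 passes, that is two passes per character.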