# Reverse list to string

Write a method

```python
def reverse_list_string(data: list[str], tab_by: int) -> str | None:
```

that _returns_ a string with the contents of the `data` list in reverse order, one element per line, with each successive line indented `tab_by` more spaces than the line before it. If the list is empty or `None`, the method should return `None`. For example, if

```python
data = ['Howard', 'Jarvis', 'Morse', 'Loyola']
```

and `tab_by = 2`, the method should _return_ the string

```
Loyola
  Morse
    Jarvis
      Howard
```

The method should not use any imported modules nor should it alter the input array `data` in any way. You may not create any temporary arrays either. You may not use methods `join` nor `reversed`. Any loops you use in the method should traverse the input array `data` from front to end. A reverse loop

```python
    for i in range(len(data)-1, -1, -1): ...
```

is **not** an acceptable solution, nor is the expression `len(data)-1-i` or its equivalent, anywhere in your code.
There should be no magic values in your method. Your method should have one and only one return statement. No inner methods or recursion or negative array indices can be used. The method should have a docstring and should be well narrated with comments.


## Solution & technical comments

There are two main challenges in this problem. The first is to find a way to reverse the contents while traversing them forward. The analogy I like to use is riding the CTA Red Line, southbound, while writing the names of the stations from the bottom of a blank page, up.

```
                                                                Thorndale
                                                Granville       Granville
                                Loyola          Loyola          Loyola  
                Morse           Morse           Morse           Morse  
                Jarvis          Jarvis          Jarvis          Jarvis 
Howard          Howard          Howard          Howard          Howard 
---------------------------------------------------------------------------
(1st            (2nd            (3rd            (4th            (5th
 stop)           stop)           stop)           stop)           stop)
```
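
The station analogy boils down to prepending instead of appending. Here is a minimal sketch of that idea alone, ignoring indentation for now; the names `stops` and `page` are illustrative and do not appear in the assignment.

```python
# Illustrative only: `stops` and `page` are hypothetical names.
stops = ["Howard", "Jarvis", "Morse", "Loyola"]

page = ""  # the blank page
for stop in stops:  # ride southbound: a plain forward traversal
    # write each new station above everything written so far
    page = stop + "\n" + page

print(repr(page))
```

Starting from an empty page leaves a stray newline at the very end of the string. One way to avoid it is to seed the page with the first element and loop from position 1.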

The second challenge is to manage the _weird_ spacing so that the names appear in a staircase fashion. This is done by placing the longest run of spaces before the first station and then, at every subsequent station, chopping `tab_by` characters off that run.
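
A small sketch of the shrinking indentation, assuming four stations and a step of two spaces (both values are chosen only for illustration). It records the width of the padding at each step rather than building the full output.

```python
tab_by = 2    # spaces per step (illustrative value)
stations = 4  # number of lines to indent (illustrative value)

# Widest indentation first: one step for every line above the bottom one.
padding = " " * tab_by * (stations - 1)

widths = ""
for _ in range(stations):
    widths = widths + str(len(padding)) + " "
    # Slicing drops tab_by characters from the front of the padding.
    padding = padding[tab_by:]

print(widths)
```

Slicing an already empty string simply yields another empty string, so the last step is harmless.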

It's always a good idea to protect the method from null (`None`) or empty (`len()==0`) data.


In [None]:
TAB_SIZE = 2


def reverse_list_string(data: list[str], tab_by: int = TAB_SIZE) -> str | None:
    """Reverse a list of strings into a single string with one element per line,
    each line indented tab_by more spaces than the line above it. Return None
    if the list is None or empty.
    """
    # initialize return value
    reverse_string = None
    # check for valid input, otherwise just return None
    if data is not None and len(data) > 0:
        # prepare the newline constant and the widest indentation. The
        # indentation shrinks inside the loop, so it is a variable and
        # not a constant.
        NEWLINE = "\n"
        spacing = " " * tab_by * (len(data) - 1)
        # first element (last in output string) has most spacing
        reverse_string = spacing + data[0]
        # build string in reverse order by traversing input list
        # in a forward manner. Skip the first element since it's
        # already added and start with position 1.
        for i in range(1, len(data)):
            # reduce spacing by one tab using string slicing.
            spacing = spacing[tab_by:]
            # add next element with newline and spacing in front
            reverse_string = spacing + data[i] + NEWLINE + reverse_string
    return reverse_string


# quick test
data = ["Howard", "Jarvis", "Morse", "Loyola"]
print(reverse_list_string(data))
print(reverse_list_string(None))
print(reverse_list_string([]))

# Compare list content

Write a method

```python
def measure_similarity(target: list[str], reference: list[str]) -> float:
```

that returns a value $0 \leq s \leq 1$ defined as

$$
s = \frac{\text{number of elements from }\texttt{target}\text{ that exist in }\texttt{reference}}{\text{number of elements in }\texttt{target}}
$$


## Solution & technical comments

The solution below starts by protecting the method from `None` or `len()==0` input. In this case the method returns 0.0. This may not always be the right thing to do, as a value of 0 suggests a _valid_ input with absolutely no overlap between the two lists. An alternative would be to either [raise an exception](https://docs.python.org/3/tutorial/errors.html) or return an out-of-range sentinel value such as a negative number.

The `if`-statement that protects the method from null/empty data is formatted across multiple lines for readability.
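
For comparison, the exception-raising alternative could look roughly like the sketch below. The name `measure_similarity_strict` and the error message are hypothetical; the sketch rejects the same inputs that the solution treats as invalid, and it relaxes the single-return rule of the earlier exercise.

```python
def measure_similarity_strict(target: list[str], reference: list[str]) -> float:
    """Hypothetical variant: raise instead of returning 0.0 on invalid input."""
    # `not x` is True for both None and an empty list.
    if not target or not reference:
        raise ValueError("target and reference must be non-empty lists")
    matches = 0
    for t in target:
        if t in reference:
            matches += 1
    return matches / len(target)


# The caller now has to decide what an invalid input means.
try:
    measure_similarity_strict([], ["c", "d"])
except ValueError as error:
    print("rejected:", error)
```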


In [None]:
def measure_similarity(target: list[str], reference: list[str]) -> float:
    """Measure the similarity of two lists as the number of elements in target that are also
    in reference divided by the total number of elements in target.
    """
    # initialize return value
    result = 0.0
    # check for valid input
    if (
        target is not None
        and len(target) > 0
        and reference is not None
        and len(reference) > 0
    ):
        # Initialize match counter
        matches = 0
        # Consider every element in target
        for t in target:
            # If it is also in reference, count it as a match. Here I use
            # the membership operator 'in' which is rather Pythonic. In
            # other languages you might use a loop to traverse the reference
            # list and compare each element to t.
            if t in reference:
                matches += 1
        # Compute ratio of matches to total elements in target. We know it
        # is safe to divide by len(target) since we checked for that above.
        result = matches / len(target)
    return result


# quick test
target = ["a", "b", "c", "d"]
reference = ["c", "d", "e", "f"]
print(measure_similarity(target, reference))  # expect 0.5
print(measure_similarity([], reference))  # expect 0.0
print(measure_similarity(target, []))  # expect 0.0
print(measure_similarity([], []))  # expect 0.0
print(measure_similarity(["a", "b"], ["c", "d"]))  # expect 0.0
print(measure_similarity(["a", "b"], ["a", "b"]))  # expect 1.0
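
The comment inside the loop mentions that other languages might need an explicit loop in place of the `in` operator. A rough sketch of that explicit search follows; the helper name `contains` is made up for this illustration and is not part of the assignment.

```python
def contains(items: list[str], wanted: str) -> bool:
    """Hypothetical helper: explicit-loop equivalent of `wanted in items`."""
    found = False
    for item in items:
        # Compare every element to the value we are looking for.
        if item == wanted:
            found = True
    return found


reference = ["c", "d", "e", "f"]
print(contains(reference, "c"), "c" in reference)
print(contains(reference, "a"), "a" in reference)
```

Both columns of the printed output should agree, since the helper does by hand what `in` does for a list.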

# Report list comparison

Write a method

```python
def report_similarity(target: list[str], reference: list[str]) -> str:
```

that returns a string with the similarity of two lists, as computed by the method `measure_similarity` above, formatted as a percentage with two decimal digits. For example, if

```python
target = ["a", "b", "c", "d"]
reference = ["c", "d", "e", "f"]
```

the method should return the string `50.00%`.

For this method you may want to read a bit about [_f-strings_](https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals) and [_format specifiers_](https://docs.python.org/3/library/string.html#formatspec). The Python documents on these two topics are, admittedly, not very engaging. RealPython has a [comprehensive tutorial](https://realpython.com/python-formatted-output/). And there is always [Python Cheat Sheet](https://www.pythoncheatsheet.org/cheatsheet/string-formatting) to the rescue!


## Solution & technical comments

The method _requires_ a string as its output. The specifications suggest that the string should be generated using format specifiers, in this case `.2%`, which multiplies the value by 100, keeps two decimal digits, and appends a percent sign.
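
A quick sketch of how the percent specifier differs from a fixed-point specifier followed by a literal percent sign. The variable names are illustrative.

```python
ratio = 0.5  # illustrative similarity value

as_percent = f"{ratio:.2%}"           # scales by 100 and appends '%'
as_fixed = f"{ratio:.2f}%"            # no scaling, '%' is just a literal
as_scaled = f"{ratio * 100:.2f}%"     # manual scaling, same text as .2%

print(as_percent)
print(as_fixed)
print(as_scaled)
```

The first and third lines print the same text, while the second shows what happens when the scaling is forgotten.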


In [None]:
def report_similarity(target: list[str], reference: list[str]) -> str:
    """Report the similarity of two lists as a percentage string with two decimal places."""
    return f"{measure_similarity(target, reference):.2%}"


# quick test
target = ["a", "b", "c", "d"]
reference = ["c", "d", "e", "f"]
print(report_similarity(target, reference))  # expect "50.00%"
print(report_similarity([], reference))  # expect "0.00%"
print(report_similarity(target, []))  # expect "0.00%"
print(report_similarity([], []))  # expect "0.00%"
print(report_similarity(["a", "b"], ["c", "d"]))  # expect "0.00%"
print(report_similarity(["a", "b"], ["a", "b"]))  # expect "100.00%"

# Count frequencies

Many situations require that we count the frequency of certain items. In data compression, for example, we want frequent letters to take less space. This has led to the letters E, T, A, I, N, and M having the shortest symbols in [Morse Code](https://en.wikipedia.org/wiki/Morse_code).
The arrangement of letters on typewriters and computer keyboards also reflects letter frequency. And let's not get started with how many points we get for words like `OXYPHENBUTAZONE` -- over 1700 points when placed strategically on a _Scrabble_ board, according to the collective wisdom of Reddit.

A simple method to count frequencies in a string that contains only lowercase letters and spaces is to use a dictionary, as shown below.


In [None]:
def simple_frequency_counter(message: str) -> dict[str, int]:
    """Count the frequency of each symbol in the given message using a dictionary."""
    # initialize return value
    frequency = {}
    # check for valid input
    if message is not None and len(message) > 0:
        # Consider every character in the message
        for char in message:
            if char in frequency:
                # Increment existing count
                frequency[char] += 1
            else:
                # First occurrence
                frequency[char] = 1
    return frequency


# quick test
print(simple_frequency_counter("hello world"))

One problem with dictionaries is that they require more memory. An array can do the job with less. So let's try this.

Write a method

```python
def efficient_frequency_counter(message: str) -> list[int]:
```

that returns an array with the frequencies of symbols in the given string. The input string comprises only lowercase letters and the space character. No other symbols (such as punctuation marks, numbers, or uppercase letters) are present. The string is of sufficient length and complexity to [contain every letter in the English alphabet at least once](https://en.wikipedia.org/wiki/Pangram), for example:

```python
pangram = "sphinx of black quartz judge my vow"
```

You may verify the efficiency of your code with the following simple test.

```python
import sys  # to check memory usage
pangram = "sphinx of black quartz judge my vow"
dict_freq = simple_frequency_counter(pangram)  # method provided above
list_freq = efficient_frequency_counter(pangram)  # your method
print(sys.getsizeof(dict_freq), "bytes for dict")
print(sys.getsizeof(list_freq), "bytes for list")
```

A dictionary requires about 832 bytes, while a list (when done right) requires 272 bytes. This may seem a trivial improvement at a time when we pay about $0.05 per gigabyte of memory. And yet being frugal pays off. Memory may not always be abundant, especially when we write code for small devices and controllers. Learning how to be economical with memory is always a good skill.


## Solution & technical comments

The idea here is to map the 26 letters and the space character into a list with as many elements. Lists are, of course, indexed sequentially from 0. Letters are *encoded* as ASCII values, with lowercase `a` being 97. So if we want to store the frequency of letter `a` in the first position of the frequency list, we need to map `a` to 0, `b` to 1, etc. This is achieved by simply subtracting the ASCII code of `a` from the ASCII code of each character we encounter. The expression
```python
ord(char) - OFFSET
``` 
does this mapping for us.
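
A short sketch of the mapping in both directions, assuming `OFFSET` is the code of lowercase `a` as in the solution. The names `index_of_z` and `letter_at_25` are illustrative only.

```python
OFFSET = ord("a")  # code of lowercase 'a'

# Forward: letter -> position in the frequency list.
for char in "abz":
    print(char, ord(char), ord(char) - OFFSET)

# Backward: position -> letter, handy when reading the list afterwards.
index_of_z = ord("z") - OFFSET
letter_at_25 = chr(index_of_z + OFFSET)
print(index_of_z, letter_at_25)
```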

This is a very frugal method to count frequencies of letters and spaces if the string is all lowercase and has no other symbols in it -- no numbers, no punctuation, etc. If the string has both uppercase and lowercase letters, as well as numbers and punctuation, it may be more practical to declare a list
```python
frequency = [0] * 256
```
and use the actual character code of each character as an index to this list. An 8-bit character code has $2^8 = 256$ possible values (standard ASCII itself uses only the first 128), hence the size of the list. Most of the list will be unused, but at least we'll be avoiding the complexity of mapping groups of character codes into sequential positions for a smaller array.
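
The 256-element alternative might look like the sketch below. The name `ascii_frequency_counter` is hypothetical, and skipping characters whose code does not fit in the table is an assumption of this sketch, not something the text above prescribes.

```python
def ascii_frequency_counter(message: str) -> list[int]:
    """Hypothetical variant: count symbols by using their code as the index."""
    TABLE_SIZE = 256  # one slot per 8-bit character code
    frequency = [0] * TABLE_SIZE
    if message is not None:
        for char in message:
            code = ord(char)
            # Assumption: silently ignore characters beyond the table.
            if code < TABLE_SIZE:
                frequency[code] += 1
    return frequency


counts = ascii_frequency_counter("Hello, World!")
print(counts[ord("l")], counts[ord("o")], counts[ord("H")], counts[ord("!")])
```

No offset arithmetic is needed here, at the cost of a list that is mostly zeros.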

In [None]:
def efficient_frequency_counter(message: str) -> list[int]:
    """Count the frequency of each symbol in the given message using a list. The
    method expects only lowercase letters and spaces.
    """
    # Constants
    LETTERS = 26  # a-z, 26 letters
    OFFSET = ord("a")  # ASCII value of 'a'
    SPACE = " "  # space character
    # initialize return value. Here we choose to use the last element of
    # the frequency list for the space character. Alternatively we
    # could have chosen the first element. Either position works fine.
    frequency = [0] * (LETTERS + 1)  # 26 letters + space
    # Check for valid input
    if message is not None and len(message) > 0:
        # Consider every character in the message
        for char in message:
            # Check if it's a space and count it in the last position
            if char == SPACE:
                frequency[LETTERS] += 1
            else:
                # Count letter in position based on ASCII value
                frequency[ord(char) - OFFSET] += 1
    return frequency

In [None]:
import sys  # to check memory usage

pangram = "sphinx of black quartz judge my vow"

dict_freq = simple_frequency_counter(pangram)  # method provided above
list_freq = efficient_frequency_counter(pangram)  # your method
print(sys.getsizeof(dict_freq), "bytes for dict")
print(sys.getsizeof(list_freq), "bytes for list")