# Morse code encoder

Use Python to encode a string into the International Morse Code system. Morse code is a character encoding scheme used in telecommunication that encodes text characters as standardized sequences of two different signal durations called dots and dashes, or dits and dahs. Morse code is named for Samuel F. B. Morse, one of the inventors of the telegraph [1].

[1] https://www.wikiwand.com/en/Morse_code


## Auxiliary files
In the datasets folder there is a tab-delimited text file (`morse_lookup_table.txt`) containing the Morse code equivalent for the most common characters of the English language.

## Skills
- Importing modules
- Import data using Pandas library
- Logical indexing using Pandas library
- Accessing data in Pandas dataframe
- Create lists and append elements
- Manipulation of strings (join strings, change case, multiply strings)
- For loop
- If statement

**Let's do it!** or in Morse code:

    
    `.- -..   .- ... - .-. .-   .--. . .-.   .- ... .--. . .-. .-`


## Goal

For any given string, replace each character (letter, exclamation point, comma, etc.) with its corresponding Morse code. The idea is to return a single string with the input string encoded in Morse code.

## Step 1: Break down the problem into smaller pieces

When we follow a tutorial we typically see nice, logically organized code. However, we hardly ever write code like this on the first try. Instead, we try to break down the problem into smaller components and we test these smaller components. Then, we assemble the code, where new challenges will likely emerge. So, there are several iterations of the polishing process.

Here are a few of the steps that I envisioned before writing the code:

- Search or create a lookup table of English characters and Morse codes.

- Iterate over each character of the input string

- Match a character from the input string to the list of characters in the lookup table

- Retrieve the Morse code for the matched character

- Store this code (at this point I still did not know I was going to use a list, although it sort of makes sense)

- Repeat steps with the following character of the input string
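
The plan above can be sketched in a few lines before touching any real data. This is only a rough outline under assumptions: the three-entry dictionary `tiny_table` is a hypothetical stand-in for the full lookup table, the variable names are made up for illustration, and a list is used for storage with the benefit of hindsight.

```python
# Hypothetical three-entry lookup table (a stand-in for the real one)
tiny_table = {'S': '...', 'O': '---', 'E': '.'}

message = 'SOS'
codes = []                          # place to store the codes

for character in message:           # iterate over each character
    code = tiny_table[character]    # match the character and retrieve its code
    codes.append(code)              # store the code before moving on

print(codes)
```

A dictionary keeps the sketch short; the rest of this tutorial uses a Pandas dataframe for the same matching step.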

<br/>

**Things that I ignored at this stage and that later on became important**

- Spacing between letters and words. By reading the Wikipedia page I found that there are actually rules for spacing characters and words. So I tried to implement a rough variation of them in my code.

- Join all the Morse codes in a list to form a string. I dealt with this problem once the code was working and I needed to focus on how to print the output string.

- Spaces in input string. The lookup table has Morse codes for characters, but it has no way of dealing with spaces. My first script was unable to handle spaces and was crashing even for something like `Hello world`. So, I focused on getting the right answer for just `Hello`. If I can retrieve the correct Morse codes for a simple word, then I'm close. Aim at accomplishing small steps and then proceed. After savoring small victories I feel motivated and engaged to resolve the next step.

## Step 2: Create a Morse code lookup table 

The first step consists of creating a lookup table between common English characters and Morse codes. I obtained the codes from Wikipedia and saved them into a text file. The file is in the `Datasets` folder.

If you copy-paste text from a website or file, make sure to remove any formatting. I also had to disable the "smart dashes" and "smart quotes" options in TextEdit (Mac), so that the text editor keeps "---" as three dashes instead of creating a horizontal line.

I compiled the Morse codes for a total of 52 characters and I saved them in a tab-delimited file.

This is an example of a step that can take some 20 to 30 minutes just to prepare the data.
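
For readers who prefer to build the file programmatically rather than by hand, here is a minimal sketch using Python's built-in `csv` module. It writes to an in-memory buffer so nothing is saved to disk. The three rows are just a sample, not the full table, and the column names `character` and `code` are chosen to match the table loaded in Step 3.

```python
import csv
import io

# A small sample of (character, code) pairs, not the full table
rows = [('A', '.-'), ('B', '-...'), ('C', '-.-.')]

buffer = io.StringIO()                      # in-memory stand-in for a text file
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
writer.writerow(['character', 'code'])      # header row
writer.writerows(rows)                      # one row per character

tab_text = buffer.getvalue()
print(tab_text)
```

To write a real file, the buffer could be replaced with `open(..., 'w', newline='')`; the file name is left to the reader.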

## Step 3: Load lookup table

Since we have a text file with two columns (character and code), Pandas is a natural choice.

The Pandas library also allows for logical indexing, which means that we can use a vector of Booleans to easily retrieve information from specific column cells.

To load the lookup table into Python we need a few pieces of information:

1. URL for the file
2. File delimiter
3. Parser engine to use. We will use the python engine, which is more feature-complete. The default parser will throw an error.

After loading the lookup table we will display the entire dataframe to double-check that everything is loaded as expected.

**Note**: I had no idea that Pandas has different parsers. I first thought that my code was crashing because I had encoded something wrong in the text file and that Python was having a hard time reading my file. Much of this problem arose after I added characters such as apostrophes to my lookup table. So, I first found a thread on Stack Overflow and then I went to the [Pandas official documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)... and there it was: a succinct but nice explanation of parser engines.

In [17]:
import pandas as pd

In [18]:
# Load lookup table
morse_table = pd.read_csv('https://raw.githubusercontent.com/soilwater/pynotes-agriscience/gh-pages/datasets/morse_lookup_table.txt',
                          sep='\t',
                          engine='python')
# Display dataframe
morse_table

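| | character | code |
|---|---|---|
| 0 | A | `.-` |
| 1 | B | `-...` |
| 2 | C | `-.-.` |
| 3 | D | `-..` |
| 4 | E | `.` |
| 5 | F | `..-.` |
| 6 | G | `--.` |
| 7 | H | `....` |
| 8 | I | `..` |
| 9 | J | `.---` |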
9,J,.---


## Step 4: Test the steps

Breaking down the problem into smaller pieces in Step 1 does not necessarily mean that we know how to code these steps. It's important that you understand the difference. If you can break down a complex problem into smaller, simpler problems you will be able to find a solution or workaround.

Below are a few examples of the tests that I tried before attempting to write the Morse encoder script. Note that my tests are based on trivial examples. If the code works for the letter `A`, then it should work for other letters.

### Test 1: Match a single character to an entire Pandas column of characters

In [4]:
morse_table.character == 'A'

0      True
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
11    False
12    False
13    False
14    False
15    False
16    False
17    False
18    False
19    False
20    False
21    False
22    False
23    False
24    False
25    False
26    False
27    False
28    False
29    False
30    False
31    False
32    False
33    False
34    False
35    False
36    False
37    False
38    False
39    False
40    False
41    False
42    False
43    False
44    False
45    False
46    False
47    False
48    False
49    False
50    False
51    False
52    False
53    False
Name: character, dtype: bool

Output is a Boolean vector, where the first value is True and the rest are False. The code successfully identifies the location of the character 'A'.
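
Because the comparison returns one Boolean per row, it is easy to check how many rows matched before using the mask. A small sketch, assuming a made-up three-row table named `demo_table` in place of the real lookup table:

```python
import pandas as pd

# Made-up three-row table standing in for the real lookup table
demo_table = pd.DataFrame({'character': ['A', 'B', 'C'],
                           'code': ['.-', '-...', '-.-.']})

mask = demo_table.character == 'B'   # one Boolean per row
print(mask.tolist())
print(mask.sum())                    # True counts as 1, so this is the number of matches
```

A count of exactly one means the character appears once in the table, which is what the encoder relies on.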

### Test 2: Retrieve the Morse code for the matched character
Here I decided to save the Boolean vector into a variable, so that I can use it to index the Pandas column.

In [5]:
idx = morse_table.character == 'A'   
print(morse_table.code[idx]) # I want the Morse code for the row that returned True
print(type(morse_table.code[idx])) 

0    .-
Name: code, dtype: object
<class 'pandas.core.series.Series'>


Close, but not exactly what I expected. The problem is that the result is a Pandas Series (`pandas.core.series.Series`). I just want the Morse code as a string, so that I can store it or concatenate it.

### Test 2 (second attempt)
After trying a few alternatives and visiting the official documentation, I found that we can access the data inside the Series through its `values` attribute and then pick the first element.

In [6]:
idx = morse_table.character == 'A'
print(morse_table.code[idx].values[0])
print(type(morse_table.code[idx].values[0]))

.-
<class 'str'>


Bingo! But the code above does not work for lower case characters like 'a'. So we need to learn how to do that. Fortunately, this is easy to implement in Python (see next test).
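
As a side note, `.values[0]` is not the only way to pull a single element out of a Series. The sketch below, which uses a made-up three-row table, shows two alternatives that should give the same string: positional indexing with `.iloc[0]`, and `.item()`, which raises an error unless the Series holds exactly one element.

```python
import pandas as pd

# Made-up three-row table standing in for the real lookup table
demo_table = pd.DataFrame({'character': ['A', 'B', 'C'],
                           'code': ['.-', '-...', '-.-.']})

matched = demo_table.code[demo_table.character == 'A']  # Series with one element

via_values = matched.values[0]   # approach used in this tutorial
via_iloc = matched.iloc[0]       # first element by position
via_item = matched.item()        # only valid for a one-element Series

print(via_values, via_iloc, via_item)
```

Which one to use is mostly a matter of taste; the rest of the tutorial sticks with `.values[0]`.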

### Test 3: Change case of character

In [7]:
mystr = 'a'
print(mystr.upper())

A


It works!
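
One detail worth checking is what `upper()` does to characters that have no case, since the lookup table also includes characters such as apostrophes. A quick sketch with arbitrary sample strings:

```python
# Letters are converted; digits, punctuation, and spaces are left untouched
sample = "Let's do it!"
upper_sample = sample.upper()
print(upper_sample)

# Upper-case input is returned unchanged
print('SOS'.upper())
```

This means `upper()` can be applied to every character of the input without first checking whether it is a letter.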

### Test 4: Append characters
In step 1 I realized that after finding the Morse code for a specific character I had to find a way of storing that string before I move on, otherwise I would keep iterating and overwriting my Morse codes.

In [8]:
# Start with an empty list
output_string = []

# Append an example string (I don't even know what character this Morse code represents)
output_string.append('.-')

# Print list to see its current state
print(output_string)

# Append another random Morse code
output_string.append('.---')

# Print list
print(output_string)

['.-']
['.-', '.---']


It's working. The Morse codes here do not matter; the idea is to test the code that will enable us to store the codes in a list.

### Test 5: Join string list items into a single string
If I store all the Morse codes in a list, how do I print them all together at the end? See, I want my code to return a 'translated' string.
Since the previous step worked as expected, I will use it in this test. From the Strings tutorial we learned that we can join list items as follows:

In [9]:
print('&'.join(output_string))

.-&.---


So, if I assign a string with a single space into a new variable, I should be able to merge all the Morse codes and separate them by a single space to make it more readable. In other words, each Morse code representing a character will be separated by a blank space.

In [10]:
separator = " "
print(separator.join(output_string))

.- .---


Then I asked myself: what happens if I add more spaces? Will the resulting string of Morse codes look better or worse? There is only one way to find out. Here is a cool Python trick: multiplying a string repeats it, which is much more transparent and readable than typing the spaces by hand.

In [None]:
separator = " " * 3
print(separator.join(output_string))
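
To see why multiplying a string is convenient, the sketch below compares the multiplied separator with one typed by hand. The list `codes` simply repeats the two sample codes used in the previous test.

```python
codes = ['.-', '.---']

separator = ' ' * 3               # repeat the single space three times
print(len(separator))             # number of characters in the separator
print(separator == '   ')         # same as typing three spaces by hand

joined = separator.join(codes)
print(repr(joined))               # repr() makes the spaces easy to count
```

Changing the spacing later only requires editing the number, not counting blank characters inside a string literal.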

## Step 5: Put the pieces together

Now that we have run several tests and know how to code the different parts of the problem, we are ready to put the puzzle together. It will be the first try, so it's fine if it doesn't work from top to bottom. The goal here is to get at least some steps to work together.


In [11]:
# First attempt to encode strings into Morse code

decoded_string = "Hello"
encoded_string = []
letter_sep = " " * 3

for letter in decoded_string:
    idx = morse_table.character == letter.upper()
    encoded_string.append(morse_table.code[idx].values[0])

print(letter_sep.join(encoded_string))

....   .   .-..   .-..   ---


The code works for a single word, but it crashes when I pass a string with spaces, like "Hello world". This is because spaces are not part of the lookup table, so we need to handle them in the code. An `if` statement is probably the first thing that comes to mind. Let's try it.

In [14]:
# If we find a space between words then we will add a space larger than that between letters.
decoded_string = "Hello world"
encoded_string = []

letter_sep = " " * 3 # Space between letters
word_sep = " " * 5   # Space between words

for letter in decoded_string:
    if letter == " ":
        encoded_string.append(word_sep)
    else:
        idx = morse_table.character == letter.upper()
        encoded_string.append(morse_table.code[idx].values[0])

print(letter_sep.join(encoded_string))

....   .   .-..   .-..   ---           .--   ---   .-.   .-..   -..
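
Spaces are not the only characters that can be missing from the lookup table; any unknown symbol would make `.values[0]` fail because the matched Series is empty. One possible guard, sketched here with a made-up three-row table and an arbitrary decision to skip unknown characters, is to check the mask with `.any()` before indexing:

```python
import pandas as pd

# Made-up table; the real lookup table has many more rows
demo_table = pd.DataFrame({'character': ['H', 'I', 'O'],
                           'code': ['....', '..', '---']})

decoded_string = 'Hi~'     # '~' is deliberately absent from the table
encoded_string = []

for letter in decoded_string:
    idx = demo_table.character == letter.upper()
    if idx.any():                                   # at least one row matched
        encoded_string.append(demo_table.code[idx].values[0])
    else:
        continue                                    # skip characters we cannot encode

print(' '.join(encoded_string))
```

Skipping silently is only one option; raising an error or inserting a placeholder would be equally reasonable, depending on how strict the encoder should be.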


In [22]:
# A complete and improved version of the previous code that handles strings with multiple lines.

decoded_string = """The programmers of tomorrow are the wizards of the future. You are going to look like you have magic powers compared to everybody else. Gabe Newell"""
#decoded_string = "Let's do it!"
encoded_string = []

letter_sep = " " * 1  # Space between letters
word_sep = " " * 1    # Joined with letter_sep on each side, so words end up three spaces apart

for letter in decoded_string:
    
    # Handle spaces between words
    if letter == ' ':
        encoded_string.append(word_sep)
        
    # Handle new line in text with multiple lines. I basically decided to ignore it.
    elif letter == "\n":
        continue
        
    else:
        idx = morse_table.character == letter.upper()
        encoded_string.append(morse_table.code[idx].values[0])

print(letter_sep.join(encoded_string))


- .... .   .--. .-. --- --. .-. .- -- -- . .-. ...   --- ..-.   - --- -- --- .-. .-. --- .--   .- .-. .   - .... .   .-- .. --.. .- .-. -.. ...   --- ..-.   - .... .   ..-. ..- - ..- .-. . .-.-.-   -.-- --- ..-   .- .-. .   --. --- .. -. --.   - ---   .-.. --- --- -.-   .-.. .. -.- .   -.-- --- ..-   .... .- ...- .   -- .- --. .. -.-.   .--. --- .-- . .-. ...   -.-. --- -- .--. .- .-. . -..   - ---   . ...- . .-. -.-- -... --- -.. -.--   . .-.. ... . .-.-.-   --. .- -... .   -. . .-- . .-.. .-..
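
Note that `join` places `letter_sep` on both sides of every `word_sep` item, so the gap between words in the output above is three spaces rather than one. If exact control over both gaps is wanted, one alternative (not the approach used in this tutorial) is to encode word by word. The sketch below assumes a tiny made-up dictionary in place of the lookup table, and borrows the three-unit letter gap and seven-unit word gap of the standard Morse timing rules as space counts.

```python
# Tiny made-up dictionary standing in for the lookup table
tiny_table = {'H': '....', 'I': '..', 'Y': '-.--', 'O': '---', 'U': '..-'}

letter_sep = ' ' * 3   # gap between letters
word_sep = ' ' * 7     # gap between words

message = 'Hi you'
encoded_words = []
for word in message.split():                           # split on any whitespace
    codes = [tiny_table[letter.upper()] for letter in word]
    encoded_words.append(letter_sep.join(codes))       # letters within one word

encoded_message = word_sep.join(encoded_words)         # words within the message
print(encoded_message)
```

Because `split()` with no arguments treats newlines like spaces, this variant also copes with multi-line input without a separate branch.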


## Final comments

We can certainly keep adding features to our code. Other ideas that stem from this project are:
- Convert the script into a function
- Add input validation (e.g. ensure that input is a string)
- Print the actual English string above the Morse code
- Write a script or function that converts Morse code back into English
- Create a game that shows the player a random Morse code and asks them to guess which character it represents
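
As a starting point for the first, second, and fourth ideas, here is a minimal sketch. It is not a finished implementation: the function names `encode_morse` and `decode_morse` are hypothetical, a five-entry dictionary stands in for the lookup table, and letters and words are separated by one and three spaces respectively, mirroring the output of the last example.

```python
# Five-entry stand-in for the lookup table
MORSE = {'S': '...', 'O': '---', 'H': '....', 'E': '.', 'L': '.-..'}
REVERSE = {code: char for char, code in MORSE.items()}   # code -> character

def encode_morse(text):
    """Encode text: one space between letters, three between words."""
    if not isinstance(text, str):                 # input validation
        raise TypeError('text must be a string')
    words = text.upper().split()
    return '   '.join(' '.join(MORSE[ch] for ch in word) for word in words)

def decode_morse(morse):
    """Reverse of encode_morse for text produced by it."""
    words = morse.split('   ')
    return ' '.join(''.join(REVERSE[code] for code in word.split(' ')) for word in words)

encoded = encode_morse('Hello sos')
print(encoded)
print(decode_morse(encoded))
```

The decoder returns upper-case text because the original case is lost during encoding, and both functions raise a `KeyError` for symbols missing from the dictionary.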