# Exercises - Functions and Containers

## Conversions between the binary and the decimal system

We discussed different systems to represent numbers. To represent a natural number $n$ in a base $b$, we have
$$
n = \sum_{i=0}^m q_i b^i
$$
with $0\leq q_i < b$. In our daily life, we use $b=10$ (decimal system), whereas computers internally work with $b=2$ (binary numbers).
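As a concrete instance of this formula (an illustrative example), the binary number 1100 has $m=3$ and digits $q_3=1$, $q_2=1$, $q_1=0$, $q_0=0$:
$$
1100_2 = 1\cdot 2^3 + 1\cdot 2^2 + 0\cdot 2^1 + 0\cdot 2^0 = 8 + 4 = 12_{10}
$$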

Your task is to write Python functions that convert a given non-negative integer between the binary and the decimal system.

An algorithm to perform the decimal $\to$ binary direction is as follows: *start with the integer in question and divide it by 2, keeping track of the quotient and the remainder. Continue dividing the quotient by 2 until you get a quotient of zero. Then just write out the remainders in reverse order.*

Here is an example of such conversion using the integer 12. 
First, let’s divide the number by two specifying quotient and remainder:
$$
\begin{eqnarray}
 12 : 2 & = & 6 + 0 \\
 6 : 2 & = & 3 + 0 \\
 3 : 2 & = & 1 + 1 \\
 1 : 2 & = & 0 + 1 \\
\end{eqnarray}
$$
Now, we simply need to write out the remainders in reverse order: 1100. So, 12 in the decimal system is represented as 1100 in binary.
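The four division steps above can be reproduced one at a time with Python's built-in `divmod`, which returns the quotient and the remainder as a pair. This is only an unrolled illustration for the number 12, not the general function asked for below; the variable names are chosen here for illustration.

```python
# Unrolled division steps for 12; divmod(a, b) returns (a // b, a % b).
q1, r1 = divmod(12, 2)   # quotient 6, remainder 0
q2, r2 = divmod(q1, 2)   # quotient 3, remainder 0
q3, r3 = divmod(q2, 2)   # quotient 1, remainder 1
q4, r4 = divmod(q3, 2)   # quotient 0, remainder 1 -> stop here

# Remainders written in reverse order give the binary digits.
digits = str(r4) + str(r3) + str(r2) + str(r1)
print(digits)  # 1100
```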

Your tasks:
1. Implement a function `dec2binary` that accepts a non-negative integer as argument and returns its binary representation *as a string*. **Please** check the hints below before you start!
2. Verify your function with the following test cases: $0\rightarrow 0$, $2\rightarrow 10$, $15\rightarrow 1111$, $170 \rightarrow 10101010$, $123456789 \rightarrow 111010110111100110100010101$.
3. Implement a function `binary2dec` which takes a binary number (represented as a string) as argument and returns the decimal equivalent as an integer. This can be done using the defining equation above.
4. Verify that your functions fulfil the relations `x == binary2dec(dec2binary(x))` and `b == dec2binary(binary2dec(b))`.
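For cross-checking your results (not as a replacement for your own implementation), Python's built-in `bin` and `int` with base 2 perform the same conversions. The following is a minimal sketch; `reference_dec2binary` and `reference_binary2dec` are hypothetical helper names introduced here.

```python
def reference_dec2binary(n):
    """Binary string of a non-negative integer via the built-in bin()."""
    return bin(n)[2:]  # bin(12) == '0b1100'; strip the '0b' prefix


def reference_binary2dec(b):
    """Integer value of a binary string via the built-in int()."""
    return int(b, 2)


# The test cases listed in task 2.
test_cases = {0: "0", 2: "10", 15: "1111", 170: "10101010",
              123456789: "111010110111100110100010101"}
for x, b in test_cases.items():
    assert reference_dec2binary(x) == b
    assert reference_binary2dec(b) == x
```

Comparing the output of your own `dec2binary` with `reference_dec2binary` over a range of integers is a quick way to spot mistakes.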

**Hints for `dec2binary`:**
1. See for instance [this website](https://blog.angularindepth.com/the-simple-math-behind-decimal-binary-conversion-algorithms-d30c967c9724) if you are interested in learning how and why the algorithm works.
2. Use a `while`-loop to perform the necessary divisions.
3. To obtain the quotient and the remainder of a division with one expression, you can use the function ``divmod``.
4. You can represent the binary number as a string and build it up within your loop - see the next cell!

In [None]:
# 1. remember that you can convert an integer to a string with the 'str'-function!
# 2. strings can be 'concatenated' with the '+'-operator

n1 = 1  # 1 as integer
n2 = 0  # 0 as integer
number = n1 + n2  # simple integer addition
binary_number = str(n1) + str(n2) # the '+'-operator on strings 'concatenates them'
print(number, binary_number)

In [None]:
# your solution here

## Word doubling

This exercise results in a very practical program that you can use when writing your thesis, a research paper or another longer text.

When writing a text, we often make the mistake of repeating a word:

   ```
   When typing longer texts, we often often make the mistake to
   repeat individual words such as here here.

   ```
   
Write a program which reads a text file and marks the positions of such mistakes. You should print the lines and line numbers that contain doubled words. Also consider cases where a word doubling occurs across a line break, i.e. the last word of one line is repeated as the first word of the next.

You can find a short example text for test purposes [here](data/double_words.txt).
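One possible way to keep track of line numbers while reading is `enumerate` with `start=1`. The sketch below uses `io.StringIO` as a stand-in for an opened file so that it runs without the data file; with a real file you would iterate over the object returned by `open(...)` in the same way. The sample text is made up for illustration.

```python
import io

# Stand-in for a file opened with open(...); the content is illustrative.
text = io.StringIO("first line\n\nthird line\n")

numbered = []
for line_number, line in enumerate(text, start=1):
    # rstrip removes the trailing line break (and other trailing whitespace)
    numbered.append((line_number, line.rstrip()))

print(numbered)  # [(1, 'first line'), (2, ''), (3, 'third line')]
```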

**Note:**
An obvious approach to the problem is to read a line, split it at spaces, tabs and line breaks, and perform the double-word test on the result. This, however, is not a complete solution to the problem! Consider again the above example:

   ```
   ... such as here here.
   ```
   
The double word `here` would not be recognised because the second occurrence is directly followed by a full stop. There are similar issues with other punctuation marks such as semicolons, parentheses and so on. See the `re`-based code cell further below for one way to take this into account.
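The problem can be seen directly by splitting the example on whitespace only. This small check is an illustration added here, and the variable names are arbitrary.

```python
line = "... such as here here."

words = line.split()  # split on spaces, tabs and line breaks only
print(words)          # ['...', 'such', 'as', 'here', 'here.']

# The last two entries differ because of the trailing full stop,
# so a plain equality test misses the repetition.
print(words[-2] == words[-1])  # False
```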

**Example:** The [example text](data/double_words.txt) gives the following output with my code:
```
Repetition in line 1. Word "often" at position 6!
Repetition in line 2. Word "here" at position 6!
Repetition of the first word "words" on line 5. It occured at the end of the previous (non-empty) line!
Repetition of the first word "test" on line 9. It occured at the end of the previous (non-empty) line!

```

In [None]:
# example to split a string into words taking into account
# (removing) punctuation.
# For time reasons, we will not treat 'regular expressions' in class
# but you should look them up yourself!

import re # module to handle regular expressions in a Python program

s = "Here some text with double double words words. It also contains punctuation!"

# split s into its words without the punctuation marks; note that
# you might end up with empty strings in the word list!
# (the raw string r'...' keeps Python from interpreting the backslash itself)
words = re.split(r'\W+', s.rstrip())

# we remove empty strings from the word list:
words_clean = [word for word in words if word != "" ]

print(words, words_clean)

In [None]:
# your solution here