# Task

Write a function `cut_suffix` which takes a string and a suffix. The function should return the string
without the given suffix; if the string does not end with the suffix, it is returned unchanged.

```python
cut_suffix("foobar", "bar")
>>> "foo"

cut_suffix("foobar", "boo")
>>> "foobar"
```

In [3]:
def cut_suffix(s: str, suffix: str) -> str:
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

print(cut_suffix("foobar", "bar"))
print(cut_suffix("foobar", "boo"))
print(cut_suffix("hello", ""))
print(cut_suffix("abc", "abc"))

foo
foobar
hello
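
The `if suffix` guard in the cell above matters because `s[:-0]` is the same as `s[:0]`, which is an empty string. As a cross-check, Python 3.9 and later ship `str.removesuffix`, which behaves the same way on these inputs. The sketch below is only an illustration; the helper name `cut_suffix_builtin` is hypothetical and not part of the task.

```python
import sys


def cut_suffix_builtin(s: str, suffix: str) -> str:
    """Hypothetical helper: prefer the built-in where it exists."""
    if sys.version_info >= (3, 9):
        # str.removesuffix returns s unchanged when it does not end with suffix.
        return s.removesuffix(suffix)
    # Fallback for older interpreters; guard against the empty suffix,
    # because s[:-0] would return "" instead of s.
    if suffix and s.endswith(suffix):
        return s[: -len(suffix)]
    return s


print(cut_suffix_builtin("foobar", "bar"))  # foo
print(cut_suffix_builtin("foobar", "boo"))  # foobar
print(cut_suffix_builtin("hello", ""))      # hello
```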



Write a function `boxed` which takes a string and two arguments: a symbol `fill` and a number
`pad`. The result of calling `boxed` should be the string surrounded by `fill` symbols, with `pad` of them
on each side, as shown in the example.

```python
print(boxed("Hello world", fill="*", pad=2))
print(boxed("Fishy", fill="#", pad=1))
```

result:

```md
*****************
** Hello world **
*****************

#########
# Fishy #
#########
```

In [5]:
def boxed(text: str, fill: str = "*", pad: int = 1) -> str:
    # `pad` is the number of `fill` symbols on each side, as in the task example.
    side = fill * pad
    middle = f"{side} {text} {side}"
    border = fill * len(middle)
    return f"{border}\n{middle}\n{border}"

print(boxed("Hello world", fill="*", pad=2))
print()
print(boxed("Fishy", fill="#", pad=1))

*****************
** Hello world **
*****************

#########
# Fishy #
#########



---



**She-bang** – the sequence `#!`, which is used in Unix-like systems to run executable scripts. The **she-bang**
is always written on the first line of the script. It is followed by the path to an interpreter program,
optionally with arguments, for example:

`#! /bin/sh`

`#!/usr/bin/env python -v`

> See more examples of she-bang lines here: http://en.wikipedia.org/wiki/Shebang_(Unix)


Write a function `parse_shebang` which takes a path to an executable script and returns the path to the
interpreter program if the script contains a `she-bang`, and `None` otherwise.
For the scripts in the example above:
```python
parse_shebang("./example1.txt")
>>> "/bin/sh"

parse_shebang("./example2.txt")
>>> "/usr/bin/env python -v"
```

In [None]:
def parse_shebang(path: str):
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        first_line = f.readline()

    if not first_line.startswith("#!"):
        return None

    shebang = first_line[2:].rstrip("\r\n").lstrip()

    return shebang or None

parse_shebang("./example1.txt")
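
The call above only works if `./example1.txt` exists. A self-contained way to try the idea is to write the sample scripts into a temporary directory first. This is a sketch under assumptions: `read_shebang` is a hypothetical stand-in that repeats the logic of `parse_shebang`, and the file names and contents are made up to match the examples in the task.

```python
import os
import tempfile
from typing import Optional


def read_shebang(path: str) -> Optional[str]:
    """Hypothetical stand-in repeating the parse_shebang logic."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        first_line = f.readline()
    if not first_line.startswith("#!"):
        return None
    # Drop the "#!" marker, the line ending and any space after the marker.
    return first_line[2:].rstrip("\r\n").lstrip() or None


def demo() -> dict:
    # Assumed sample files, modelled on the two she-bang lines shown earlier.
    samples = {
        "example1.txt": "#! /bin/sh\necho hi\n",
        "example2.txt": "#!/usr/bin/env python -v\nprint('hi')\n",
        "plain.txt": "no she-bang here\n",
    }
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, content in samples.items():
            path = os.path.join(tmp, name)
            with open(path, "w", encoding="utf-8") as f:
                f.write(content)
            results[name] = read_shebang(path)
    return results


results = demo()
print(results)
```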

# Special Task (Bonus 4%)

A probabilistic language model describes pieces of text in some language in terms of random processes.
One of the simplest language models can be stated as follows. Let’s assume that we know the set of all
words in a language. Let’s generate the words of a sentence from left to right, one by one:

- Randomly take the first two words from the set of all words.
- Generate each i’th word from the two previous words, the (i - 1)’th and the (i - 2)’th.

Let’s try to build a language model based on lyrics of Taylor Swift's songs!

1. Write a function `words` which takes a text file and returns a list of the words in the file:
```python
import io
handle = io.StringIO("""Can we always be this close forever and ever?
And ah, take me out, and take me home forever and ever.""")
words(handle)
>>> ['Can', 'we', 'always', 'be', 'this', 'close', 'forever', 'and', 'ever?\n', 'And', 'ah,', 'take', 'me', 'out,', 'and', 'take', 'me', 'home', 'forever', 'and', 'ever.']
```

**Note that punctuation and newline characters stay unchanged!**

In [9]:
def words(handle):
    result = []
    for line in handle:
        parts = line.split(" ")
        result.extend(parts)
    return result
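
The choice of `split(" ")` over a bare `split()` is what keeps the newline attached to the last word of each line. The short comparison below illustrates the difference, along with one side effect that is easy to miss: consecutive spaces produce empty strings. The variable names are only for illustration.

```python
line = "forever and ever?\n"

# Splitting on a single space keeps the trailing newline on the last word.
kept = line.split(" ")
print(kept)  # ['forever', 'and', 'ever?\n']

# Splitting on any whitespace drops the newline entirely.
dropped = line.split()
print(dropped)  # ['forever', 'and', 'ever?']

# Two spaces in a row yield an empty "word" with split(" ").
doubled = "take  me".split(" ")
print(doubled)  # ['take', '', 'me']
```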

---

2. Write a function `transition_matrix` which takes a list of words and returns a dictionary. For every
pair of consecutive words `(u, v)`, this dictionary contains the list of words `w` which follow `u` and `v` in the
input list of words. For the example above:

```python
language = words(handle)
m = transition_matrix(language)
m[("take", "me")]
>>> ["out,", "home"]

m[("we", "always")]
>>> ["be"]

m[("forever", "and")]
>>> ["ever?\n", "ever."]
```

In [8]:
def transition_matrix(word_list):
    matrix = {}
    for i in range(len(word_list) - 2):
        u = word_list[i]
        v = word_list[i + 1]
        w = word_list[i + 2]
        key = (u, v)
        if key not in matrix:
            matrix[key] = []
        matrix[key].append(w)
    return matrix
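
The same table can be built more compactly with `collections.defaultdict` and `zip`. This is an alternative sketch, not a replacement for the cell above; the name `transition_matrix_dd` and the sample sentence are made up for the illustration.

```python
from collections import defaultdict


def transition_matrix_dd(word_list):
    """Hypothetical variant: same result, built with defaultdict and zip."""
    matrix = defaultdict(list)
    # zip stops at the shortest input, so this walks every consecutive triple.
    for u, v, w in zip(word_list, word_list[1:], word_list[2:]):
        matrix[(u, v)].append(w)
    # Convert back to a plain dict so a missing pair raises KeyError
    # instead of silently creating an empty list.
    return dict(matrix)


sample = "and take me out, and take me home".split(" ")
m = transition_matrix_dd(sample)
print(m[("take", "me")])   # ['out,', 'home']
print(m[("and", "take")])  # ['me', 'me']
```

Repeated followers are kept on purpose: a word that follows a pair twice is twice as likely to be picked by `random.choice`.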


---

3. Write a function `markov_chain` which generates sentences of a given size. The function takes three
parameters:
- a list of words, the result of calling the `words` function,
- a dictionary built with the `transition_matrix` function,
- an integer – the number of words in the sentence to be generated.


As a reminder, here is how to generate random sentences. Generate the words of a sentence from left to right,
one by one:
- Randomly take the first two words from the list of all words.
- Generate each `i`’th word from the previous two words, the `(i - 1)`’th and the `(i - 2)`’th (with the help of `transition_matrix`).
- If this pair does not exist (it is not in the `transition_matrix` dictionary), then the `i`’th word is taken randomly from the set of all
words.

You will need functions `random.randint` and `random.choice`.

In [12]:
import random

def markov_chain(word_list, matrix, length):
    if length <= 0 or not word_list:
        return []

    if length == 1:
        return [random.choice(word_list)]

    first = random.choice(word_list)
    second = random.choice(word_list)
    sentence = [first, second]

    for i in range(2, length):
        key = (sentence[i - 2], sentence[i - 1])
        if key in matrix and matrix[key]:
            next_word = random.choice(matrix[key])
        else:
            next_word = random.choice(word_list)
        sentence.append(next_word)

    return sentence
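
Because the output is random, it is hard to check by eye. One option is to pass a seeded `random.Random` instance so that two runs produce the same sentence. The sketch below assumes a hypothetical variant named `chain` with an extra `rng` parameter; it follows the same generation rules but is not the function from the cell above.

```python
import random


def chain(word_list, matrix, length, rng):
    """Hypothetical variant of markov_chain that draws from a given rng."""
    if length <= 0 or not word_list:
        return []
    # The first (up to) two words are picked at random from all words.
    sentence = [rng.choice(word_list) for _ in range(min(2, length))]
    while len(sentence) < length:
        followers = matrix.get((sentence[-2], sentence[-1]))
        # Fall back to the whole vocabulary when the pair was never seen.
        sentence.append(rng.choice(followers or word_list))
    return sentence


vocab = "and take me out, and take me home".split(" ")
matrix = {}
for u, v, w in zip(vocab, vocab[1:], vocab[2:]):
    matrix.setdefault((u, v), []).append(w)

# The same seed gives the same sequence of random choices.
first = chain(vocab, matrix, 6, random.Random(42))
second = chain(vocab, matrix, 6, random.Random(42))
print(first == second)  # True
```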


---

4. Write a function `taylor_swifter()` which takes a path to a file `taylor_swift.txt` and an
integer – the length of a sentence – and returns a sentence of the specified length in Taylor Swift's
language.

```python
print(taylor_swifter("./taylor_swift.txt", 30))

>>>'well dancing pack I got this music in my head tell me to the garden? In the garden, would you trust me If I told you it was never mine'
```

In [None]:
def taylor_swifter(path: str, length: int) -> str:
    with open(path, "r", encoding="utf-8") as f:
        ws = words(f)

    m = transition_matrix(ws)

    generated = markov_chain(ws, m, length)

    sentence = " ".join(generated).replace("\n", " ")

    return sentence

print(taylor_swifter("./taylor_swift.txt", 30))

I won't stop groovin' It's cool, that's what people say, mm-mm But if I sneak out time of style, we never learned to be in the whole time You know


>The lyrics come from well-known Taylor Swift songs and are used for educational purposes only.