## Functional Programming: Introduction

Functional programming (FP) is a programming paradigm in which programs are built by applying and composing functions, ideally pure functions that operate on immutable data structures.

If you're interested, you can check out [this playlist](https://www.youtube.com/playlist?list=PLP8GkvaIxJP1z5bu4NX_bFrEInBkAgTMr) that covers some of the key concepts and functions of FP.

In [1]:
# from pipe import select, take
from typing import List
from pydantic import validate_call
from functools import reduce

#### Programming Paradigms

Conceptually, programming paradigms are approaches to writing code. There are many flavours, but the most common are:
1. Imperative (including object-oriented)
2. Functional

To see these paradigms in action, let's count the number of words in Hermann Hesse's novel "Siddhartha" in three styles: plain imperative, object-oriented, and functional.

In [2]:
with open(r"../../datasets/week_2/siddartha.txt", "r") as f:
    siddartha_text = f.readlines()
    siddartha_text_list = [line.strip() for line in siddartha_text]

In [3]:
# Imperative programming

count = 0

for line in siddartha_text_list:
    words = line.split(" ")

    for word in words:
        count += 1

In [4]:
count

39691
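
One caveat about this count: `line.split(" ")` splits on every single space, so an empty line still yields one (empty) token, and two consecutive spaces yield an empty string between them. Whether this affects the figure for the novel depends on the file, which is not shown here. The sketch below uses a few made-up sample lines (an assumption, not the actual text) to illustrate the difference from `line.split()`.

```python
# Hypothetical sample lines, not taken from the dataset
sample_lines = ["In the shade of the house", "", "by the  river"]

# split(" ") keeps empty strings for blank lines and repeated spaces
count_with_space = sum(len(line.split(" ")) for line in sample_lines)

# split() without arguments splits on runs of whitespace and drops empty tokens
count_whitespace = sum(len(line.split()) for line in sample_lines)

print(count_with_space, count_whitespace)
```

If an exact word count matters, `line.split()` is probably the safer choice.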

In [5]:
# Object-oriented programming
# Note that you could also define the "count_words" method functionally, as in the
# functional example further below.

class Text:
    def __init__(self, text, number):
        self.text = text
        self.number = number

    def count_words(self):
        count = 0

        for line in self.text:

            words = line.split(" ")

            for word in words:
                count += 1

        return count

In [7]:
sid_tex = Text(siddartha_text_list, 12)
sid_tex.count_words()

39691

### Explanation
The code in the cell below uses a **list comprehension** to count the total number of words in `siddartha_text_list` in a concise, functional style:

1. **List Comprehension**: `[len(line.split(" ")) for line in siddartha_text_list]`
   - Iterates over each `line` in `siddartha_text_list`.
   - Splits each line into words using `line.split(" ")`.
   - Calculates the length of each list of words with `len(...)`.
   - Collects these lengths into a new list.

2. **Summation**: `sum(...)`
   - Sums all the word counts from the list comprehension, giving the total number of words.

### Benefits of List Comprehension
1. **Conciseness**: Achieves the same result as the imperative loop in a single line, reducing code verbosity.
2. **Readability**: Offers a more readable and expressive syntax for transforming and filtering lists.
3. **Efficiency**: Often faster in CPython than an equivalent `for` loop that calls `append` repeatedly, because the comprehension avoids the repeated method lookup and call.
4. **Functional Style**: Encourages a functional programming approach, avoiding explicit mutation of variables.

In [8]:
# Functional programming
sum([len(line.split(" ")) for line in siddartha_text_list])

39691

### Explanation
The code in the cell below uses `reduce` and `map` along with `lambda` functions to count the total number of words in `siddartha_text_list`:

1. **`map`**: 
   - `map(lambda line: len(line.split(" ")), siddartha_text_list)`: 
     - Applies a lambda function to each `line` in `siddartha_text_list`.
     - The lambda function splits each line into words (`line.split(" ")`) and then counts the words using `len(...)`.
     - Produces an iterable of word counts for each line.

2. **`reduce`**:
   - `reduce(lambda count1, count2: count1 + count2, ...)`:
     - Takes the iterable of word counts from `map`.
     - The lambda function here adds two counts (`count1` and `count2`) together.
     - `reduce` applies this function cumulatively to all items in the iterable, effectively summing up the total word count.

### Benefits of Using `reduce` and `map`
1. **Functional Programming Paradigm**: Both `map` and `reduce` are core functional programming tools, promoting a declarative style where the focus is on "what to do" rather than "how to do it."
2. **Avoiding Explicit Loops**: They avoid the need for explicit loops and mutable state, which can make the code more concise and easier to understand.
3. **Composability**: Functions like `map` and `reduce` can be easily composed and reused with different lambda functions or other functions, enhancing code reusability.

In [9]:
# We can also write this programme as a Map/Reduce, but we will go deeper into that next week.
reduce(lambda count1, count2: count1 + count2, map(lambda line: len(line.split(" ")), siddartha_text_list))

39691

In functional programming, we specify transformation steps on data structures, usually sequences. It is not the data structures themselves that perform the transformations (as in object-oriented programming) but the functions that operate on them. In more practical terms, we define *pipelines* of functions. This will be important later on when we work with **PySpark**.
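
The idea of a pipeline of functions can be sketched in plain Python. The helper `compose_pipeline` and the step functions below are hypothetical names introduced for illustration and are not part of any library. The helper chains single-argument functions from left to right using `reduce`.

```python
from functools import reduce

def compose_pipeline(*functions):
    """Return a function that applies `functions` from left to right."""
    def run(value):
        return reduce(lambda acc, fn: fn(acc), functions, value)
    return run

def split_lines(lines):
    # One list of tokens per line
    return [line.split() for line in lines]

def count_per_line(token_lists):
    # One word count per line
    return [len(tokens) for tokens in token_lists]

# The pipeline is itself just a function, built from smaller functions
count_words = compose_pipeline(split_lines, count_per_line, sum)

total = count_words(["a b c", "d e"])
print(total)
```

Each step receives the output of the previous one, so no intermediate variables are needed.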

#### Map/FlatMap

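We will be making frequent use of map/flatmap. *Map* is a ``one-to-one`` transformation: each input element produces exactly one output element. *Flatmap* is a ``one-to-many`` transformation: each input element can produce several output elements, which are flattened into a single sequence.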
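
Python has a built-in `map` but no built-in flatmap. As a minimal sketch, flatmap can be expressed as `map` followed by flattening with `itertools.chain.from_iterable`. The helper name `flat_map` and the sample lines are assumptions made for this illustration.

```python
from itertools import chain

lines = ["to be or", "not to be"]

# map: one output element per input element (one-to-one)
lengths = list(map(len, lines))

def flat_map(function, iterable):
    """Apply `function` to each element and flatten the results by one level."""
    return chain.from_iterable(map(function, iterable))

# flatmap: each line yields several words (one-to-many)
words = list(flat_map(str.split, lines))

print(lengths)
print(words)
```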

#### Anonymous Functions (Lambda Functions)

Lambda functions are functions that are not bound to a name. The term comes from the lambda calculus. In Python, they are defined with the `lambda` keyword. Unlike in some other programming languages, lambda functions in Python are not particularly powerful: they are limited to a single expression and cannot contain statements, so they are mostly used for simple operations.

In [17]:
# A sequence of numbers
array = [1, 2, 3, 4]

# We apply the lambda function to each element
list(map(lambda x: x + 1, array))

[2, 3, 4, 5]

#### Recursion

Recursion is an important building block of functional programming. It is the ability to define a function that calls itself. In most functional programming languages, recursion does the work that "for-loops" would do in imperative code.

In [14]:
# Summing the elements of a list
# In FP terms, we add the "head" of the list to the sum of the "tail", here by using indices.
# The function is called "recursive_sum" so that it does not shadow the built-in "sum".
def recursive_sum(ls):
    if ls:
        print(ls[0])
        print(ls[1:])
        return ls[0] + recursive_sum(ls[1:]) # the function calls itself
    else:
        return 0


In [15]:
recursive_sum([1, 2, 3, 4])

1
[2, 3, 4]
2
[3, 4]
3
[4]
4
[]


10
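
The same head/tail decomposition can also be written with sequence unpacking, and compared with `reduce`, which expresses the same fold without explicit recursion. This is a sketch; the names `total_recursive` and `total_reduce` are introduced here for illustration. Note that CPython limits the recursion depth (see `sys.getrecursionlimit()`), so recursion over very long lists is uncommon in Python.

```python
import sys
from functools import reduce

def total_recursive(numbers):
    """Sum a list by adding its head to the sum of its tail."""
    if not numbers:
        return 0
    head, *tail = numbers
    return head + total_recursive(tail)

def total_reduce(numbers):
    """Sum a list by folding it with an accumulator, starting at 0."""
    return reduce(lambda acc, n: acc + n, numbers, 0)

print(total_recursive([1, 2, 3, 4]))
print(total_reduce([1, 2, 3, 4]))

# Maximum recursion depth of the running interpreter
print(sys.getrecursionlimit())
```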

#### Functions and Types

A function can be thought of as a set of rules mapping inputs to corresponding outputs, like a mathematical function mapping one domain to another. Functions can be *pure* or *impure*. Pure functions operate **only** on their explicit inputs and have no side effects. **Impure** functions have side effects: they read or change state beyond their inputs and return value, for example by modifying global variables or accessing (reading/writing) the filesystem. Realistically, we will often have to use impure functions out of necessity.
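
The distinction can be illustrated with a small sketch. All names below (`with_item`, `add_item_and_log`, `log`) are hypothetical and chosen only for this example.

```python
def with_item(items, item):
    """Pure: returns a new list and leaves the input untouched."""
    return items + [item]

log = []

def add_item_and_log(items, item):
    """Impure: mutates its argument and a global list (side effects)."""
    items.append(item)
    log.append(item)
    return items

original = [1, 2]
extended = with_item(original, 3)
print(original, extended)  # original is unchanged

shared = [1, 2]
add_item_and_log(shared, 3)
print(shared, log)  # both the argument and the global list were modified
```

Calling the pure function twice with the same arguments always gives the same result, whereas each call to the impure one changes what later calls will see.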

#### Functions & Functionalisms

In [16]:
word_list = ["häuser", "bäume", "berlin", "münchen", "donaudampfschifffahrtsgesellschaftskapitän"]

In [19]:
# unidecode transliterates Unicode text to plain ASCII,
# which strips accents and special characters from words
from unidecode import unidecode

In [20]:
# Spoiler: This can also be expressed using map/reduce.
# This is for next week. Here, we use a list comprehension.
" ".join([unidecode(word) for word in word_list])

'hauser baume berlin munchen donaudampfschifffahrtsgesellschaftskapitan'

#### Partial Application of Functions & Currying

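We can define *partial* functions that receive fewer arguments than the corresponding *total* function, because some of the arguments have been fixed in advance. Currying is related: it turns a function of several arguments into a chain of functions that each take a single argument.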

In [22]:
from functools import partial
from toolz.functoolz import curry

In [23]:
# Function to count the occurrence of a specific letter

def count_letter(word, letter):
    return word.count(letter)

# count_letter_e is now a partial function because we fix the argument "letter" to "e".
count_letter_e = partial(count_letter, letter="e")

list(map(count_letter_e, word_list))

[1, 1, 1, 1, 2]
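
The cell above imports `curry` from `toolz` but only demonstrates `partial`. As a library-free sketch of the difference, a curried function can be written by hand with nested functions. The name `curried_count_letter` and the sample words are assumptions made for this illustration.

```python
from functools import partial

def count_letter(word, letter):
    return word.count(letter)

# Partial application: fix one argument, get a function of the remaining ones
count_letter_a = partial(count_letter, letter="a")

# Currying by hand: a chain of one-argument functions
def curried_count_letter(letter):
    def count_in(word):
        return word.count(letter)
    return count_in

words = ["banana", "berlin", "kapitan"]

partial_counts = list(map(count_letter_a, words))
curried_counts = list(map(curried_count_letter("a"), words))

print(partial_counts)
print(curried_counts)
```

Both variants produce a one-argument function that can be handed directly to `map`.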

#### The Pipe Operator

Would it not be useful if we could specify a set of transformations concisely as a pipeline? Assigning a new variable for every pipeline step quickly gets tedious.

In [24]:
nums = [5, 6, 34, 12, 231, 98]

divisible_by_two = filter(lambda num: num % 2 == 0, nums)
squared_by_two = map(lambda num: num ** 2, divisible_by_two)

list(squared_by_two)

[36, 1156, 144, 9604]

For concise chaining of functions, we can use the [Pipe](https://github.com/JulienPalard/Pipe) library, which overloads the `|` operator (Python's bitwise OR, also used for set union) as a pipe. *Note*: Those of you using UNIX-like shells will recognize this operator's symbol. Pipes work best when you have a sequence that needs to be transformed.

In [25]:
# Note: this shadows Python's built-in "filter" and "map" for the rest of the session.
from pipe import filter, map

In [26]:

# The pipeline below is "lazy", which means that it is only evaluated when
# its result is actually consumed.
expr = (
    nums |
    filter(lambda num: num % 2 == 0) |
    map(lambda num: num ** 2)
)

In [27]:
# This will execute the pipe
list(expr)

[36, 1156, 144, 9604]
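
How can `|` chain functions at all? Python lets a class define the meaning of `|` through the special methods `__or__` and `__ror__`. The following is a simplified sketch of that mechanism and not the Pipe library's actual implementation; the class `Step` and the helpers `keep` and `apply` are hypothetical names.

```python
class Step:
    """Wraps a function so that `iterable | Step(...)` calls it."""

    def __init__(self, function):
        self.function = function

    def __ror__(self, iterable):
        # Called for `iterable | self` when the left operand does not handle `|`
        return self.function(iterable)

def keep(predicate):
    # Lazily keep only the elements for which the predicate holds
    return Step(lambda iterable: (x for x in iterable if predicate(x)))

def apply(function):
    # Lazily apply the function to every element
    return Step(lambda iterable: (function(x) for x in iterable))

nums = [5, 6, 34, 12, 231, 98]

# Generators are lazy: nothing is computed until list() consumes the pipeline
expr = nums | keep(lambda n: n % 2 == 0) | apply(lambda n: n ** 2)
result = list(expr)
print(result)
```

Because each step returns a generator, the sketch is lazy in the same sense as the pipeline above.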

#### Exercise

Reverse engineering an encrypted message: You have intercepted an encrypted message and even part of the code that generated it. Using the clues in the code, complete the pipeline to decrypt it.

Some rules are known:

1. The most frequent letter in the English language has been replaced by another one.
2. Some word particles have been attached to the beginnings of words, some to the end.
3. The message was reversed.


Your task is to reverse engineer the message, using the *Pipe* library. That is, beginning from the encrypted message, specify transformation steps. You may use intermediate, functional computations.

In [46]:
!pip3 install pipe


In [31]:
from pipe import reverse, map

In [32]:
with open("../../datasets/week_2/encrypted_message.txt", "r") as message:
    encrypted_message = message.readlines()
    encrypted_message = [line.strip() for line in encrypted_message]

In [33]:
message = list(
    encrypted_message |
    reverse |
    map(lambda word: word.replace("nu", "")) |
    map(lambda word: word.replace("sku", "") if len(word.replace("sku", "")) < 5 else word.replace("asku", "")) |
    map(lambda word: word.replace("k", "e")) |
    map(lambda word: word.replace("tta", ""))
    #map(lambda word: "nu" + word)
)

In [34]:
message

['functional',
 'programming',
 'is',
 'the',
 'superior',
 'programming',
 'paradigm',
 'compared',
 'to',
 'other',
 'programming',
 'paradigms',
 'it',
 'is',
 'easier',
 'to',
 'read',
 'understand',
 'and',
 'debug',
 'object-oriented',
 'programming',
 'has',
 'led',
 'generations',
 'of',
 'developers',
 'to',
 'despair',
 'and',
 'insanity',
 'some',
 'even',
 'choosing',
 'to',
 'open',
 'specialty',
 'coffee',
 'shops',
 'in',
 'a',
 'gentrified',
 'area',
 'of',
 'lisbon',
 'instead']