## Functional programming

__GITHUB__

- [PyFunctional](https://github.com/EntilZha/PyFunctional)
- [RxPY](https://github.com/ReactiveX/RxPY)  -- [DOC](https://rxpy.readthedocs.io/en/latest/)


__SETUP__
```
$ pip install pyfunctional
$ pip install rx
```

__TUTORIAL__
- [Tutorialspoint on RxPY](https://www.tutorialspoint.com/rxpy/rxpy_latest_release_updates.htm)  

__BOOK__
- [Introduction to ReactiveX and RxPY](https://subscription.packtpub.com/book/application_development/9781789138726/1/ch01lvl1sec11/introduction-to-reactivex-and-rxpy)  

__SLIDE__
- [reactive-programming-in-python](https://keitheis.github.io/reactive-programming-in-python/) 
    
__PROJECT__
- [functional-reactive-programming-in-python - scraping GitHub](https://jakubturek.com/functional-reactive-programming-in-python/)  

### PyFunctional

PyFunctional was created by [Pedro Rodriguez](https://www.pedro.ai/). It takes the best ideas from the Spark, Scala, and LINQ APIs to provide an easy way to manipulate data when using Scala is not an option or PySpark is overkill.

PyFunctional has three types of functions:

- __Streams__: read data for use by the collections API.
- __Transformations__: transform data from streams with functions such as `map`, `flat_map`, and `filter`.
- __Actions__: cause a series of lazily evaluated transformations to produce a concrete value. `to_list`, `reduce`, and `to_dict` are examples of actions.
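
The split between lazy transformations and eager actions can be illustrated with plain Python generators. This is a minimal standard-library sketch of the idea, not how PyFunctional is implemented; the `double` helper and the `calls` list are hypothetical names introduced only for this illustration.

```python
calls = []  # records every value the transformation actually touches


def double(x):
    calls.append(x)
    return x * 2


# "Stream": a source of data
stream = range(5)

# "Transformations": generator expressions describe the work but do not run it yet
doubled = (double(x) for x in stream)
big = (x for x in doubled if x > 4)

calls_before_action = len(calls)  # still 0: nothing has been evaluated

# "Action": materializing the result forces the whole pipeline to run
result = list(big)

print(calls_before_action, result)  # 0 [6, 8]
```

Only the final `list(...)` call does any work, which is the same role that actions such as `to_list` play in the description above.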

In [1]:
from functional import seq

#### Example - map/filter/reduce

In [2]:
# Wrap the chain in parentheses to avoid backslash line continuations
total = (
seq(range(10))
    .map(lambda x: x * 2)
    .filter(lambda x: x > 4)
    .reduce(lambda x, y: x + y)
)
# 6 + 8 + 10 + 12 + 14 + 16 + 18 = 84
total

84
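
For comparison, the same computation can be written with Python's built-in `map` and `filter` plus `functools.reduce`. This is only an illustrative standard-library equivalent; the intermediate names `doubled` and `kept` are made up for the sketch.

```python
from functools import reduce

# Each step wraps the previous one, so the code reads inside-out
# compared with the left-to-right method chain
doubled = map(lambda x: x * 2, range(10))
kept = filter(lambda x: x > 4, doubled)
total = reduce(lambda x, y: x + y, kept)

print(total)  # 84
```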

#### Example - word count

In [3]:
words = 'I do not know what I do not know , what do I know here and now ?'

In [4]:
(
seq(words.split(' '))
    .map(lambda word: (word, 1))
    .reduce_by_key(lambda x, y: x + y)
    .order_by(lambda x: -x[1])
)

0,1
I,3
do,3
know,3
not,2
what,2
",",1
here,1
and,1
now,1
?,1


In [5]:
(
seq(words.split(' '))
    .map(lambda word: (word, 1))
    .reduce_by_key(lambda x, y: x + y)
    .group_by(lambda x: x[1])
)

0,1
3,"[('I', 3), ('do', 3), ('know', 3)]"
2,"[('not', 2), ('what', 2)]"
1,"[(',', 1), ('here', 1), ('and', 1), ('now', 1), ('?', 1)]"


In [6]:
(
seq(words.split(' '))
    .map(lambda word: (word, 1))
    .reduce_by_key(lambda x, y: x + y)
    .count_by_key()
)

0,1
I,1
do,1
not,1
know,1
what,1
",",1
here,1
and,1
now,1
?,1


In [7]:
(
seq(words.split(' '))
    .count_by_value()
)

0,1
I,3
do,3
not,2
know,3
what,2
",",1
here,1
and,1
now,1
?,1
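
The word counts above can be cross-checked with `collections.Counter` from the standard library. This sketch is an independent check, not part of PyFunctional; `most_common` plays a role similar in spirit to `order_by(lambda x: -x[1])`.

```python
from collections import Counter

words = 'I do not know what I do not know , what do I know here and now ?'

# Counter tallies how often each token occurs, like count_by_value()
counts = Counter(words.split(' '))

# The three most frequent tokens
top = counts.most_common(3)

print(counts['I'], counts['not'], counts['?'])  # 3 2 1
```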


#### Example - set operators, joins

In [9]:
sent1 = 'I do not know what I do not know ,'
sent2 = 'what do I know here and now ?'

In [10]:
(
seq(sent1.split(" "))
    .union(seq(sent2.split(" ")))
)

['I', 'and', '?', 'do', 'now', 'not', 'here', 'what', 'know', ',']

In [11]:
(
seq(sent1.split(" "))
    .intersection(seq(sent2.split(" ")))
)

['I', 'do', 'know', 'what']

In [12]:
(
seq(sent1.split(" "))
    .difference(seq(sent2.split(" ")))
)

['not', ',']

In [13]:
s1 = (
seq(sent1.split(' '))
    .map(lambda word: (word, 1))
    .reduce_by_key(lambda x, y: x + y)
)
s2 = (
seq(sent2.split(' '))
    .map(lambda word: (word, 1))
    .reduce_by_key(lambda x, y: x + y)
)

In [14]:
s1, s2

([('I', 2), ('do', 2), ('not', 2), ('know', 2), ('what', 1), (',', 1)],
 [('what', 1), ('do', 1), ('I', 1), ('know', 1), ('here', 1), ('and', 1), ('now', 1), ('?', 1)])

In [16]:
s1.join(s2)  # inner join

0,1
I,"(2, 1)"
do,"(2, 1)"
know,"(2, 1)"
what,"(1, 1)"


In [17]:
s1.left_join(s2)

0,1
I,"(2, 1)"
do,"(2, 1)"
not,"(2, None)"
know,"(2, 1)"
what,"(1, 1)"
",","(1, None)"


In [18]:
s1.right_join(s2)

0,1
what,"(1, 1)"
do,"(2, 1)"
I,"(2, 1)"
know,"(2, 1)"
here,"(None, 1)"
and,"(None, 1)"
now,"(None, 1)"
?,"(None, 1)"


In [19]:
s1.outer_join(s2)

0,1
I,"(2, 1)"
do,"(2, 1)"
now,"(None, 1)"
not,"(2, None)"
here,"(None, 1)"
and,"(None, 1)"
what,"(1, 1)"
know,"(2, 1)"
",","(1, None)"
?,"(None, 1)"
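
The four join flavors differ only in which keys they keep. The following sketch reproduces that logic on plain dictionaries; the `join` helper and its `how` parameter are hypothetical and are not PyFunctional's API.

```python
s1 = {'I': 2, 'do': 2, 'not': 2, 'know': 2, 'what': 1, ',': 1}
s2 = {'what': 1, 'do': 1, 'I': 1, 'know': 1, 'here': 1, 'and': 1, 'now': 1, '?': 1}


def join(left, right, how="inner"):
    """Join two dicts on their keys; a missing side is filled with None."""
    if how == "inner":
        keys = [k for k in left if k in right]
    elif how == "left":
        keys = list(left)
    elif how == "right":
        keys = list(right)
    elif how == "outer":
        keys = list(left) + [k for k in right if k not in left]
    else:
        raise ValueError(f"unknown join type: {how}")
    return {k: (left.get(k), right.get(k)) for k in keys}


inner_result = join(s1, s2)
left_result = join(s1, s2, "left")
right_result = join(s1, s2, "right")
outer_result = join(s1, s2, "outer")

print(inner_result)
# {'I': (2, 1), 'do': (2, 1), 'know': (2, 1), 'what': (1, 1)}
```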


In [20]:
s1.len(), s1.size(), s1.last(), s1.all()

(6, 6, [',', 1], True)

In [21]:
(
seq(sent1.split(" "))
    .zip(seq(sent2.split(" ")))
)

0,1
I,what
do,do
not,I
know,know
what,here
I,and
do,now
not,?


In [22]:
s1.find(lambda x: x[0] == "know")

('know', 2)

In [23]:
s1.to_dict()

{'I': 2, 'do': 2, 'not': 2, 'know': 2, 'what': 1, ',': 1}

In [24]:
s1.to_list()

[('I', 2), ('do', 2), ('not', 2), ('know', 2), ('what', 1), (',', 1)]

In [25]:
s1.to_set()

{(',', 1), ('I', 2), ('do', 2), ('know', 2), ('not', 2), ('what', 1)}

In [26]:
s1.to_json('sent1.json')

In [27]:
s1.to_jsonl('sent1-1.json')

In [28]:
s1.to_csv('sent1.csv')

In [29]:
s1.for_each(lambda x: print(x[0].upper()))

I
DO
NOT
KNOW
WHAT
,


#### Example - flat_map()

In [30]:
def int_list(iterable):
    return [int(var) for var in iterable]

nrow,ncol = 1000, 1000
str_lists = [[str(i*ncol+j) for i in range(ncol)] for j in range(nrow)]


In [31]:
%%time
list2 = seq(str_lists).flat_map(int_list)

# " ".join(list2[:10])
list2[:10]

CPU times: user 290 ms, sys: 20.1 ms, total: 310 ms
Wall time: 309 ms


[0, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]

In [32]:
%%time
list1 = []
for str_list in str_lists:
    for var in str_list:
        list1.append(int(var))
        
# " ".join(list1[:10])
list1[:10]

CPU times: user 505 ms, sys: 16.2 ms, total: 521 ms
Wall time: 537 ms


[0, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
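
The flattening step can also be expressed with `itertools.chain.from_iterable`. This is a small standard-library sketch on made-up data (the three short lists are assumptions for illustration), shown only to make clear what "map, then flatten one level" means.

```python
from itertools import chain


def int_list(iterable):
    return [int(var) for var in iterable]


str_lists = [["0", "1"], ["2"], ["3", "4", "5"]]

# Map each inner list to a list of ints, then flatten one level
flat = list(chain.from_iterable(int_list(row) for row in str_lists))

print(flat)  # [0, 1, 2, 3, 4, 5]
```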

#### Example - process records

In [33]:
from functional import seq
from collections import namedtuple

Transaction = namedtuple('Transaction', 'reason amount')
transactions = [
    Transaction('github', 7),
    Transaction('food', 10),
    Transaction('coffee', 5),
    Transaction('digitalocean', 5),
    Transaction('food', 5),
    Transaction('riotgames', 25),
    Transaction('food', 10),
    Transaction('amazon', 200),
    Transaction('paycheck', -1000)
]

In [34]:
# Using the Scala/Spark inspired APIs
food_cost = (
seq(transactions)
    .filter(lambda x: x.reason == 'food')
    .map(lambda x: x.amount)
    .sum()
)
food_cost

25

In [35]:
# Using the LINQ inspired APIs
food_cost = (
seq(transactions)
    .where(lambda x: x.reason == 'food')
    .select(lambda x: x.amount)
    .sum()
)
food_cost

25
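
Both API styles express the same filter-then-project-then-sum pipeline. As a point of reference, here is a plain Python version using a generator expression, on a shortened transaction list chosen for this sketch.

```python
from collections import namedtuple

Transaction = namedtuple('Transaction', 'reason amount')
transactions = [
    Transaction('food', 10),
    Transaction('coffee', 5),
    Transaction('food', 5),
    Transaction('food', 10),
]

# filter/where -> the `if` clause, map/select -> `t.amount`, sum -> sum()
food_cost = sum(t.amount for t in transactions if t.reason == 'food')

print(food_cost)  # 25
```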

### Rx

[ReactiveX](http://reactivex.io), or Rx for short, is an API for programming with observable event streams. [RxPY](https://rxpy.readthedocs.io/en/latest/) is a port of ReactiveX to Python.

Rx is about processing streams of events. With Rx you specify:

*  What you want to process (an Observable)
*  How you want to process it (a composition of operators)
*  What you want to do with the result (an Observer)

The pattern is that you `subscribe` to an `Observable` using an `Observer`:

```python
subscription = Observable.subscribe(observer)
```

#### Why Observables
The Observable model lets you treat streams of asynchronous events with the same kinds of operations that you use for collections of data items such as arrays. It frees you from tangled callbacks, which makes your code more readable and less prone to bugs.

#### What is an Observable
To support receiving events via push, an Observable and an Observer are connected via a subscription. The Observable represents the stream of data, and an Observer can subscribe to it.
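
The push relationship can be sketched in a few lines of plain Python. `ToyObservable` below is a hypothetical, deliberately simplified class written for this illustration; it is not RxPY's `Observable` and omits schedulers, disposal, and operators.

```python
class ToyObservable:
    """A hypothetical, minimal push-based stream (not RxPY's Observable)."""

    def __init__(self, items):
        self._items = items

    def subscribe(self, on_next, on_error=None, on_completed=None):
        # Push each item to the observer, then signal failure or completion once
        try:
            for item in self._items:
                on_next(item)
        except Exception as exc:
            if on_error is None:
                raise
            on_error(exc)
            return
        if on_completed is not None:
            on_completed()


events = []
ToyObservable(range(3)).subscribe(
    on_next=lambda x: events.append(("next", x)),
    on_error=lambda e: events.append(("error", str(e))),
    on_completed=lambda: events.append(("completed",)),
)

print(events)
# [('next', 0), ('next', 1), ('next', 2), ('completed',)]
```

The observer never asks for a value; the source decides when to call `on_next`, and it calls `on_completed` or `on_error` at most once at the end.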

#### Generate a sequence

In [36]:
import rx

xs = rx.from_(range(5))
d = xs.subscribe(print)

0
1
2
3
4


In [37]:
type(d)

rx.disposable.disposable.Disposable

In [38]:
def is_even(x):
    # Used as the on_next callback: prints only the even values
    if x % 2 == 0:
        print(x)

xs = rx.from_(range(10))
xs.subscribe(is_even)

0
2
4
6
8


<rx.disposable.disposable.Disposable at 0x7f843544ecf8>

In [39]:
xs.subscribe(
    on_next = lambda x: print(f"Got: {x}"),
    on_error = lambda e: print(f"Error Occurred: {e}"),
    on_completed = lambda: print("Done!"),
)

Got: 0
Got: 1
Got: 2
Got: 3
Got: 4
Got: 5
Got: 6
Got: 7
Got: 8
Got: 9
Done!


<rx.disposable.disposable.Disposable at 0x7f843545f080>

In [40]:
l = [1, "a", 2, "b"]

In [41]:
xs = rx.from_(l)
d = xs.subscribe(
    on_next = lambda x: print(x, isinstance(x,int)),
    on_error = lambda e: print(f"Error Occurred: {e}"),
    on_completed = lambda: print("\nDone!"),
)

1 True
a False
2 True
b False

Done!


In [42]:
import rx
from rx import operators as ops

source = rx.of("Alpha", "Beta", "Gamma", "Delta", "Epsilon")

composed = source.pipe(
    ops.map(lambda s: (s,len(s))),
    ops.filter(lambda i: i[1] >= 5)
)
composed.subscribe(lambda x: print(f"Received {x}"))

Received ('Alpha', 5)
Received ('Gamma', 5)
Received ('Delta', 5)
Received ('Epsilon', 7)


<rx.disposable.disposable.Disposable at 0x7f843541a3c8>

In [43]:
import time
import concurrent.futures
import rx
from rx import operators as ops

num_stream = list(range(10))

def work_slowly(data):
    time.sleep(int(data/2))
    return (data, data * data)

with concurrent.futures.ProcessPoolExecutor(5) as worker:
    rx.from_(num_stream).pipe(
        ops.flat_map(
            lambda num: worker.submit(work_slowly, num)),
    ).subscribe(print)

(0, 0)
(1, 1)
(2, 4)
(3, 9)
(5, 25)
(4, 16)
(6, 36)
(7, 49)
(8, 64)
(9, 81)
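
In the output above, `(5, 25)` is printed before `(4, 16)`: results arrive in the order the workers finish, not strictly in input order. The standard library's `as_completed` shows the same behavior. This sketch assumes a thread pool and much shorter sleeps than the cell above so that it runs quickly; it does not show how RxPY handles futures internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def work_slowly(data):
    # Larger inputs sleep slightly longer (scaled down from the cell above)
    time.sleep(int(data / 2) * 0.01)
    return (data, data * data)


results = []
with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(work_slowly, n) for n in range(10)]
    # as_completed yields each future when it finishes, not in submission order
    for future in as_completed(futures):
        results.append(future.result())

print(results)
```

The exact order of `results` can vary between runs, but every input appears exactly once.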


#### Merge streams

One way to learn the Rx APIs is to peek into the test cases.

In [44]:
print(rx.__version__)

3.1.0


In [45]:
import rx
from rx import operators as ops
from rx.testing import TestScheduler, ReactiveTest

In [46]:
on_next = ReactiveTest.on_next
on_completed = ReactiveTest.on_completed
on_error = ReactiveTest.on_error
subscribe = ReactiveTest.subscribe
subscribed = ReactiveTest.subscribed
disposed = ReactiveTest.disposed
created = ReactiveTest.created

In [47]:
scheduler = TestScheduler()
msgs1 = [on_next(150, 1), on_next(210, 2), on_next(225, 5), on_next(240, 8), on_completed(245)]
msgs2 = [on_next(150, 1), on_next(215, 3), on_next(230, 6), on_next(245, 9), on_completed(250)]
msgs3 = [on_next(150, 1), on_next(220, 4), on_next(235, 7), on_completed(240)]
o1 = scheduler.create_hot_observable(msgs1)
o2 = scheduler.create_hot_observable(msgs2)
o3 = scheduler.create_hot_observable(msgs3)

def create_ops():
    return rx.merge(o1, o2, o3)

results = scheduler.start(create_ops).messages

In [48]:
for result in results:
    print(result.value.kind, result.time, result.value.value)

N 210.0 2
N 215.0 3
N 220.0 4
N 225.0 5
N 230.0 6
N 235.0 7
N 240.0 8
N 245.0 9
C 250.0 None
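
The merged timeline above can be reproduced with `heapq.merge` on plain `(time, value)` tuples. This sketch assumes the test scheduler subscribes at virtual time 200, which would explain why the events at time 150 never appear; the tuple lists are copied from `msgs1` to `msgs3`.

```python
import heapq

# (virtual time, value) pairs copied from msgs1..msgs3 above
stream1 = [(150, 1), (210, 2), (225, 5), (240, 8)]
stream2 = [(150, 1), (215, 3), (230, 6), (245, 9)]
stream3 = [(150, 1), (220, 4), (235, 7)]
completions = [245, 250, 240]

SUBSCRIBE_AT = 200  # assumption: the scheduler's default subscription time

merged = [
    (t, v)
    for t, v in heapq.merge(stream1, stream2, stream3)
    if t > SUBSCRIBE_AT  # events emitted before subscribing are never seen
]

# The merged stream completes only after every source has completed
completed_at = max(completions)

print([v for _, v in merged], completed_at)
# [2, 3, 4, 5, 6, 7, 8, 9] 250
```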


#### CPU Concurrency

To achieve concurrency, you use two operators: `subscribe_on()` and `observe_on()`. Both need a Scheduler, which provides a thread for each subscription to do its work (see section on Schedulers below). The `ThreadPoolScheduler` is a good choice for creating a pool of reusable worker threads.



In [49]:
import multiprocessing
import random
import time
from threading import current_thread

import rx
from rx.scheduler import ThreadPoolScheduler
from rx import operators as ops


def intense_calculation(value):
    # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation
    time.sleep(random.randint(5, 20) * 0.1)
    return value


# calculate number of CPUs, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count()
pool_scheduler = ThreadPoolScheduler(optimal_thread_count)

# Create Process 1
rx.of("Alpha", "Beta", "Gamma", "Delta", "Epsilon").pipe(
    ops.map(lambda s: intense_calculation(s)), ops.subscribe_on(pool_scheduler)
).subscribe(
    on_next=lambda s: print("PROCESS 1: {0} {1}".format(current_thread().name, s)),
    on_error=lambda e: print(e),
    on_completed=lambda: print("PROCESS 1 done!"),
)

# Create Process 2
rx.range(1, 10).pipe(
    ops.map(lambda s: intense_calculation(s)), ops.subscribe_on(pool_scheduler)
).subscribe(
    on_next=lambda i: print("\tPROCESS 2: {0} {1}".format(current_thread().name, i)),
    on_error=lambda e: print(e),
    on_completed=lambda: print("\tPROCESS 2 done!"),
)

# Create Process 3 (finite here; swap in the commented-out rx.interval(1) for an infinite stream)
# rx.interval(1).pipe(
rx.range(1, 20).pipe(
    ops.map(lambda i: i * 100),
    ops.observe_on(pool_scheduler),
    ops.map(lambda s: intense_calculation(s)),
).subscribe(
    on_next=lambda i: print("\t\tPROCESS 3: {0} {1}".format(current_thread().name, i)),
    on_error=lambda e: print(e),
    on_completed=lambda: print("\t\tPROCESS 3 done!"),
)

# input("Press any key to exit\n")

<rx.disposable.disposable.Disposable at 0x7f8435328eb8>

PROCESS 1: ThreadPoolExecutor-0_0 Alpha
		PROCESS 3: ThreadPoolExecutor-0_2 100
PROCESS 1: ThreadPoolExecutor-0_0 Beta
	PROCESS 2: ThreadPoolExecutor-0_1 1
		PROCESS 3: ThreadPoolExecutor-0_3 200
PROCESS 1: ThreadPoolExecutor-0_0 Gamma
	PROCESS 2: ThreadPoolExecutor-0_1 2
PROCESS 1: ThreadPoolExecutor-0_0 Delta
		PROCESS 3: ThreadPoolExecutor-0_3 300
PROCESS 1: ThreadPoolExecutor-0_0 Epsilon
PROCESS 1 done!
	PROCESS 2: ThreadPoolExecutor-0_1 3
		PROCESS 3: ThreadPoolExecutor-0_3 400
		PROCESS 3: ThreadPoolExecutor-0_3 500
	PROCESS 2: ThreadPoolExecutor-0_1 4
	PROCESS 2: ThreadPoolExecutor-0_1 5
		PROCESS 3: ThreadPoolExecutor-0_3 600
		PROCESS 3: ThreadPoolExecutor-0_3 700
		PROCESS 3: ThreadPoolExecutor-0_3 800
	PROCESS 2: ThreadPoolExecutor-0_1 6
	PROCESS 2: ThreadPoolExecutor-0_1 7
		PROCESS 3: ThreadPoolExecutor-0_3 900
	PROCESS 2: ThreadPoolExecutor-0_1 8
	PROCESS 2: ThreadPoolExecutor-0_1 9
	PROCESS 2 done!
		PROCESS 3: ThreadPoolExecutor-0_3 1000
		PROCESS 3: ThreadPoolExecutor-

The following examples are taken from [RxPy at TutorialsPoint](https://www.tutorialspoint.com/rxpy/rxpy_examples.htm).

In [52]:
import requests
import rx
import json
from rx import operators as ops
def filternames(x):
   # Predicate for ops.filter: keep repositories whose name starts with "aws"
   return x["name"].startswith("aws")

# Fetch the user's repositories and keep those whose names start with "aws"
github_url = "https://api.github.com/users/wgong/repos"
content = requests.get(github_url)
y = json.loads(content.text)
source = rx.from_(y)
case1 = source.pipe(
   ops.filter(lambda c: filternames(c)),
   ops.map(lambda a: a["name"])
)
case1.subscribe(
   on_next = lambda i: print("Got - {0}".format(i)),
   on_error = lambda e: print("Error : {0}".format(e)),
   on_completed = lambda: print("\nJob Done!"),
)

Got - aws-lambda-developer-guide
Got - aws-serverless-workshops

Job Done!


<rx.disposable.disposable.Disposable at 0x7f8434a55b70>

#### Difference between observable and subject

In [53]:
from rx import of, operators as op
import random
test1 = of(1,2,3,4,5)
sub1 = test1.pipe(
   op.map(lambda a : a+random.random())
)
print("From first subscriber")
subscriber1 = sub1.subscribe(lambda i: print("From sub1 {0}".format(i)))
print("From second subscriber")
subscriber2 = sub1.subscribe(lambda i: print("From sub2 {0}".format(i)))

From first subscriber
From sub1 1.8351182635319636
From sub1 2.0938081046555266
From sub1 3.9649042587138466
From sub1 4.491310300435769
From sub1 5.80175770539422
From second subscriber
From sub2 1.7978921810693764
From sub2 2.224069541628566
From sub2 3.0965829803317044
From sub2 4.426677395554019
From sub2 5.632634159736532


In [54]:
from rx import of, operators as op
import random
from rx.subject import Subject
subject_test = Subject()
subject_test.subscribe(
   lambda x: print("From sub1 {0}".format(x))
)
subject_test.subscribe(
   lambda x: print("From sub2 {0}".format(x))
)
test1 = of(1,2,3,4,5)
sub1 = test1.pipe(
   op.map(lambda a : a+random.random())
)
subscriber = sub1.subscribe(subject_test)

From sub1 1.1730852728979415
From sub2 1.1730852728979415
From sub1 2.4501165009972476
From sub2 2.4501165009972476
From sub1 3.2745541946620613
From sub2 3.2745541946620613
From sub1 4.523644819624818
From sub2 4.523644819624818
From sub1 5.234264818789197
From sub2 5.234264818789197


- Cold Observable: every subscription runs the source again, so each subscriber receives its own newly generated values (first example).
- Hot Observable: values are produced once and shared between both subscribers through the `Subject` (second example).
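
The distinction can be mimicked without RxPY. In this standard-library sketch, `make_values` is a hypothetical stand-in for the source: calling it once per subscriber models the cold case, while calling it once and broadcasting each value models the hot case.

```python
import random


def make_values():
    # Each call re-runs the source, like subscribing to a cold observable
    return [n + random.random() for n in (1, 2, 3)]


# Cold: every subscriber triggers its own run and gets independent random draws
cold_sub1 = make_values()
cold_sub2 = make_values()

# Hot: the source runs once and each value is pushed to all subscribers
hot_sub1, hot_sub2 = [], []
subscribers = [hot_sub1.append, hot_sub2.append]
for value in make_values():
    for notify in subscribers:
        notify(value)

print(hot_sub1 == hot_sub2)  # True
```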