Based on **Francesco Pierfederici: Distributed Computing with Python, Chapter 2**

# An asynchronous example

To keep things simple but still interesting, let's write a tool that, given a text file, will 
count the occurrences of a given word.

In [1]:
# Run the line below in a terminal:
# time(grep -io love data/pg2600.txt | wc -l)
#
# This counts the case-insensitive occurrences of "love" in the data/pg2600.txt file
# (grep -o prints each match on its own line, and wc -l counts those lines).
# It also measures the execution time.
# Results:


677

real    0m0.014s <br>
user    0m0.008s <br>
sys     0m0.004s <br>


### We will solve the word counting problem with coroutines 

In [2]:
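def coroutine(fn):
    # Decorator that creates the generator and primes it, so that callers
    # can call send() on it right away.
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)  # advance to the first yield
        return c
    return wrapper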
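
To see why the decorator calls next(c), the short sketch below sends a value to a generator that has not been primed and to one that has. The collector generator is a hypothetical helper used only for illustration; it is not part of the original example.

```python
def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)  # advance to the first yield
        return c
    return wrapper

def collector(items):
    # Hypothetical helper: appends everything it receives to items.
    while True:
        items.append((yield))

# Without priming, send() fails because the generator has not reached a yield yet.
unprimed_error = None
raw = collector([])
try:
    raw.send("hello")
except TypeError as exc:
    unprimed_error = exc

# With the decorator, the generator is already paused at its first yield.
seen = []
primed = coroutine(collector)(seen)
primed.send("hello")

print(type(unprimed_error).__name__, seen)  # TypeError ['hello']
```

A just-started generator only accepts None, which is why every coroutine in this example is wrapped with the decorator.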

In [3]:
#We will work with this file
f=open('../data/pg2600.txt')
print(f)

<_io.TextIOWrapper name='../data/pg2600.txt' mode='r' encoding='UTF-8'>


The next function, cat, **acts as the data source** for the whole program.

It **reads the file line by line and sends each line to its child coroutine via child.send(line).** 

This child will be the **grep()** coroutine.

If we want a case-insensitive match, then cat simply makes each line lowercase before sending it; otherwise, it passes the line on unchanged.


In [4]:
def cat(f, case_insensitive, child):
    if case_insensitive:
        line_processor = lambda l: l.lower()
    else:
        line_processor = lambda l: l

    for line in f:
        child.send(line_processor(line))
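
As a quick check of what cat sends downstream, the sketch below runs it on an in-memory file and collects the lines in a list. The collect coroutine and the sample text are assumptions made for illustration, and cat is restated compactly so that the block runs on its own.

```python
import io

def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper

def cat(f, case_insensitive, child):
    # Same behavior as the cat function above, written compactly.
    for line in f:
        child.send(line.lower() if case_insensitive else line)

@coroutine
def collect(lines):
    # Hypothetical sink: stores every line it receives.
    while True:
        lines.append((yield))

sample = io.StringIO("Love and War\nPeace\n")  # stands in for the real text file
lines = []
cat(sample, case_insensitive=True, child=collect(lines))
print(lines)  # ['love and war\n', 'peace\n']
```

Note that each line keeps its trailing newline, which does not affect the substring count done later.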

### Counting the occurrences of the substring variable in each line

The **grep** function is our first coroutine. It will be the 'child' of the 'cat()' function.

* In it, we enter an infinite loop where **we keep receiving data (text = (yield))**, 

* then **count the occurrences** of the "substring" variable in the "text" variable, which is one line from the text file, 

* and **send that number of occurrences** to the next coroutine (count() in our case) via **child.send(textcount)**. That coroutine adds up these numbers to find how many times the "substring" variable appears in the whole text file.


In [5]:
@coroutine
def grep(substring, case_insensitive, child):
    if case_insensitive:
        substring = substring.lower()
    while True:
        text = (yield)  # Receive the data: one line from the text file.
        textcount = text.count(substring)  # Count the occurrences of substring in this line.
        # Send this line's count to the next coroutine (child), which adds up all these numbers.
        child.send(textcount)
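
The grep coroutine can be tried on its own by sending it a few lines by hand. In the sketch below, the collect coroutine and the sample lines are hypothetical and only serve to make the per-line counts visible.

```python
def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper

@coroutine
def grep(substring, case_insensitive, child):
    if case_insensitive:
        substring = substring.lower()
    while True:
        text = (yield)
        child.send(text.count(substring))

@coroutine
def collect(counts):
    # Hypothetical sink: stores every per-line count it receives.
    while True:
        counts.append((yield))

counts = []
g = grep("love", case_insensitive=True, child=collect(counts))
for line in ["love is love\n", "no match here\n", "glove\n"]:
    g.send(line)
g.close()  # stop the coroutine explicitly

print(counts)  # [2, 0, 1]
```

The last line shows that str.count matches substrings, so "glove" counts as one occurrence of "love", just as grep -o would report it without the -w flag.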

In [6]:
# Adding up all the numbers and printing out the total

# The count coroutine keeps a running total, n, of the numbers it receives
# from grep (n += (yield)).

# It catches the GeneratorExit exception, which is raised inside a coroutine
# when it is closed, to know when to print out substring and n. In our case
# this happens automatically after the end of the file is reached, once the
# coroutines are no longer referenced.

@coroutine
def count(substring):
    n = 0
    try:
        while True:
            n += (yield)
    except GeneratorExit:
        print(substring, n)
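
The closing behavior can also be triggered by hand. The sketch below, which feeds count a few made-up numbers, calls close() explicitly; the output is captured with contextlib.redirect_stdout only so that it can be inspected afterwards.

```python
import contextlib
import io

def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper

@coroutine
def count(substring):
    n = 0
    try:
        while True:
            n += (yield)
    except GeneratorExit:
        print(substring, n)

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    c = count("love")
    for k in (2, 0, 1):  # made-up per-line counts
        c.send(k)
    c.close()  # raises GeneratorExit inside count at the paused yield

output = buffer.getvalue()
print(output, end="")  # love 3
```

Calling close() makes the moment of printing explicit instead of leaving it to the interpreter's cleanup of unreferenced generators.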

This is how we can use these coroutines:

In [7]:
cat(f, case_insensitive=True, child=grep(substring='love', case_insensitive=True, child=count('love')))

love 677
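
For a version that does not depend on the data file, the sketch below wires the same three stages together on a small in-memory text, closes the coroutines explicitly, and cross-checks the total against str.count on the whole text. The sample text and the variable names are assumptions for illustration, and the functions are restated compactly so that the block runs on its own.

```python
import contextlib
import io

def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper

def cat(f, case_insensitive, child):
    for line in f:
        child.send(line.lower() if case_insensitive else line)

@coroutine
def grep(substring, case_insensitive, child):
    if case_insensitive:
        substring = substring.lower()
    while True:
        text = (yield)
        child.send(text.count(substring))

@coroutine
def count(substring):
    n = 0
    try:
        while True:
            n += (yield)
    except GeneratorExit:
        print(substring, n)

text = "Love conquers all.\nShe loved the glove.\nNothing here.\n"  # made-up sample

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    counter = count("love")
    matcher = grep("love", case_insensitive=True, child=counter)
    cat(io.StringIO(text), case_insensitive=True, child=matcher)
    matcher.close()  # closing grep does not close its child,
    counter.close()  # so count is closed separately and prints its total

pipeline_output = buffer.getvalue().strip()
expected = text.lower().count("love")
print(pipeline_output, expected)  # love 3 3
```

Keeping references to the coroutines and closing them in pipeline order makes the teardown deterministic, whereas the one-line call above relies on the temporaries being cleaned up once cat returns.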
