# Description

For this exercise you will parallelize a toy problem to show the pattern of `concurrent.futures` use.  For a task this small, a simple serial approach will be faster, because the overhead of creating threads outweighs any gain.  However, it is easy to imagine reading much larger files, where disk I/O time is significant enough to change this balance.

The code in the Setup generates 1000 files, each of which contains 20 integers, one per line.  You will read each file and multiply together the numbers on lines 3 and 17 (using one-based line numbers).  The return value of your function should be the sum of all these products.
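The per-file step can be sketched serially before any threads are involved. This is only an illustration under stated assumptions: the helper name `product_of_lines` and the throwaway `demo.numbers` file are hypothetical and are not part of the exercise.

```python
from pathlib import Path
from tempfile import TemporaryDirectory

def product_of_lines(path, first=3, second=17):
    """Multiply the integers found on two one-based line numbers of a file."""
    with open(path) as fh:
        nums = [int(line) for line in fh]
    # One-based line k lives at zero-based index k - 1
    return nums[first - 1] * nums[second - 1]

# Demonstration on a temporary file whose line k contains the integer k
with TemporaryDirectory() as tmpdir:
    demo = Path(tmpdir) / "demo.numbers"
    demo.write_text("\n".join(str(k) for k in range(1, 21)) + "\n")
    demo_product = product_of_lines(demo)  # 3 * 17
```

Keeping the one-based to zero-based conversion in one place makes an off-by-one mistake easier to spot.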

For the task, use however many workers you think is most appropriate (pretending the files were much larger and the disk much slower).  The function `sum_of_products()` should return the computed answer, calculated in a multi-threaded manner.
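As a rough sketch of the general pattern, the block below runs a slow function across a pool of threads. The function `simulated_read` is a hypothetical stand-in that sleeps instead of touching the disk, and the worker count of 32 is an illustrative guess rather than a recommendation.

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep

def simulated_read(n):
    """Hypothetical stand-in for a slow file read: wait, then return a value."""
    sleep(0.01)  # pretend this is disk latency
    return n * n

# Threads that mostly wait on I/O can outnumber the CPU cores;
# 32 workers is an assumption made for this sketch only.
with ThreadPoolExecutor(max_workers=32) as ex:
    # ex.map() yields results in the same order as the inputs
    squares = list(ex.map(simulated_read, range(100)))

total = sum(squares)
```

Because the work here is waiting rather than computing, adding workers shortens the wall-clock time until some other limit is reached, which is the situation the exercise asks you to imagine.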

A hint for writing this: later lessons talk about race conditions, but as advice for now, treat putting multiple partial results in a shared list from different threads as unsafe.  Doing so with a `collections.deque`, however, is safe, and it uses the same `.append()` method to add things.
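The hint about `collections.deque` might look like the following in practice. This is a minimal sketch with a hypothetical worker, `record_square`, that stores a toy partial result; it is not the exercise's solution.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

partials = deque()

def record_square(n):
    """Hypothetical worker: compute a partial result and store it."""
    partials.append(n * n)  # deque appends are documented as thread-safe

with ThreadPoolExecutor(max_workers=8) as ex:
    # Consume the iterator so an exception in any worker is re-raised here
    list(ex.map(record_square, range(1000)))

# Arrival order depends on thread scheduling, but the sum does not
deque_total = sum(partials)
```

An alternative that avoids shared state altogether is to have each worker return its partial result and let the main thread do the summing.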

# Setup

In [1]:
from concurrent.futures import ThreadPoolExecutor
from collections import deque

from random import sample, seed, randint
from string import ascii_letters
from time import time

def create_files(random_state=0):
    seed(random_state)
    for _files in range(1000):
        name = "".join(sample(ascii_letters, 5))
        with open(f"tmp-{name}.numbers", 'w') as fh:
            for _lines in range(20):
                print(randint(1, 99), file=fh)
    return time()

created = create_files()
    
def sum_of_products():
    ThreadPoolExecutor  # Use this for something
    return 2481234

# Solution

In [2]:
def sum_of_products():
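    from glob import glob

    def getsum(fname):
        # Close the file promptly instead of leaving the handle open
        with open(fname) as fh:
            nums = [int(n) for n in fh]
        # Lines 3 and 17 (one-based) are indices 2 and 16 (zero-based)
        return nums[2] * nums[16]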

    with ThreadPoolExecutor(max_workers=64) as ex:
        final = sum(ex.map(getsum, glob('tmp-*.numbers')))

    return final

# Test Cases

In [3]:
def test_final():
    assert sum_of_products() == 2483973, "Wrong total computed"
    
test_final()

In [4]:
def test_files_touched():
    import os
    for id_ in "uTHni AgwYn yiQnJ nlrgE wzXTs".split():
        assert os.stat(f"tmp-{id_}.numbers").st_atime > created, \
                "Files not read after creation"

test_files_touched()