# Project Context Manager

Let's build a context manager that is also a lazy iterator, and use it to read two files: `cars.csv` and `personal_info.csv`.

## [Goal 1](#Goal-1)
Do it with a custom class
## [Goal 2](#Goal-2)
Do it with a generator function

### Goal 1

In [17]:
# import modules
import csv
from collections import namedtuple
from itertools import islice

In [2]:
# navigate to data folder
import os

os.chdir('./Project_Context_Manager_data')
f_names = tuple(os.listdir())
f_names

('cars.csv', 'personal_info.csv')

First, let's take a look at what is inside the files:

In [3]:
for file in f_names:
    with open(file) as f:
        print(next(f), end='')
        print(next(f), end='')
    print('*'*40)

Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin
Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US
****************************************
ssn,first_name,last_name,gender,language
100-53-9824,Sebastiano,Tester,Male,Icelandic
****************************************


First, notice that the headers are not all lower case, which is what Python naming conventions recommend for attribute names. The bigger problem is that the two files do not use the same separator (the same `dialect`, to be precise), so we cannot use the same set-up for our `csv.reader`. Instead, we are going to let `csv` guess the separator using the `Sniffer` class. To do this, we need to pass it a sample of the data we are going to read.

In [8]:
with open(f_names[0]) as f:
    dialect = csv.Sniffer().sniff(f.read(1000))
    
print(vars(dialect))

{'__module__': 'csv', '_name': 'sniffed', 'lineterminator': '\r\n', 'quoting': 0, '__doc__': None, 'doublequote': False, 'delimiter': ';', 'quotechar': '"', 'skipinitialspace': False}


It works! The sniffer correctly identified the dialect of the file. Now we are going to write a utility function to use in the reading process.

In [9]:
def get_dialect(f_name):
    with open(f_name) as f:
        return csv.Sniffer().sniff(f.read(1000))
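
As a quick check that sniffing copes with both separators, here is a minimal sketch that runs the `Sniffer` on in-memory samples instead of the files themselves. The two samples are the rows printed earlier; the helper name `sniff_delimiter` and the `delimiters=';,'` hint (which restricts the guess to the two separators we expect) are assumptions made for this illustration.

```python
import csv

# The first two lines of each file, as printed earlier.
CARS_SAMPLE = (
    "Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n"
    "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US\n"
)
PERSONAL_SAMPLE = (
    "ssn,first_name,last_name,gender,language\n"
    "100-53-9824,Sebastiano,Tester,Male,Icelandic\n"
)


def sniff_delimiter(sample):
    """Hypothetical helper: return the delimiter the Sniffer guesses for `sample`."""
    # `delimiters` limits the guess to the characters we consider plausible.
    return csv.Sniffer().sniff(sample, delimiters=';,').delimiter


for sample in (CARS_SAMPLE, PERSONAL_SAMPLE):
    print(repr(sniff_delimiter(sample)))
```

Passing a hint is optional, but with very short samples it keeps the guess from drifting towards a character that merely happens to repeat consistently.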

Now, what is left is to create a class that implements both the iterator and the context manager protocols, and that lazily returns each row of the data as a namedtuple.

In [14]:
class FileParser:
    def __init__(self, f_name):
        self.f_name = f_name
        
    def __enter__(self):
        self._f = open(self.f_name, 'r')
        self._reader = csv.reader(self._f, get_dialect(self.f_name)) # create the reader
        # parse the first line to create the namedtuple arguments
        headers = map(lambda x: x.lower(), next(self._reader)) 
        # create the namedtuple
        self._nt = namedtuple('Data', headers)
        return self # return a context manager -> FileParser
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        self._f.close()
        return False # don't silence the possible exceptions
    
    def __iter__(self):
        return self
    
    def __next__(self):
        # to avoid an exception due to a closed file (if calling next from outside the with block)
        # we are going to raise our own exception as a StopIteration since it is more pertinent to our
        # lazy iterator
        if self._f.closed:
            raise StopIteration # if the file is closed you cannot iterate
        else:
            # next has to return the new row of the file as a namedtuple
            return self._nt(*next(self._reader))       

In [16]:
with FileParser(f_names[0]) as data:
    for row in islice(data, 2):
        print(row)

Data(car='Chevrolet Chevelle Malibu', mpg='18.0', cylinders='8', displacement='307.0', horsepower='130.0', weight='3504.', acceleration='12.0', model='70', origin='US')
Data(car='Buick Skylark 320', mpg='15.0', cylinders='8', displacement='350.0', horsepower='165.0', weight='3693.', acceleration='11.5', model='70', origin='US')
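
The `closed` check in `__next__` is what lets the iterator end quietly once the `with` block is over. The following is a minimal, self-contained sketch of that behaviour. It writes a tiny semicolon-separated file to a temporary directory (the file name and its contents are made up for the illustration) and uses a stripped-down stand-in class, `MiniParser`, with a hard-coded delimiter instead of the sniffed dialect.

```python
import csv
import os
import tempfile
from collections import namedtuple


class MiniParser:
    """Stripped-down stand-in for FileParser (hypothetical, fixed ';' delimiter)."""

    def __init__(self, f_name):
        self.f_name = f_name

    def __enter__(self):
        self._f = open(self.f_name, newline='')
        self._reader = csv.reader(self._f, delimiter=';')
        headers = [h.lower() for h in next(self._reader)]
        self._nt = namedtuple('Data', headers)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self._f.close()
        return False  # let exceptions propagate

    def __iter__(self):
        return self

    def __next__(self):
        if self._f.closed:
            raise StopIteration  # closed file: behave like an exhausted iterator
        return self._nt(*next(self._reader))


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.csv')  # made-up sample file
    with open(path, 'w', newline='') as f:
        f.write('Car;MPG\nChevrolet Chevelle Malibu;18.0\nBuick Skylark 320;15.0\n')

    with MiniParser(path) as data:
        first = next(data)   # read one row inside the block
    leftover = list(data)    # outside the block: no rows, no error

print(first)
print(leftover)
```

Only the first row is read; the second one is never returned because the file is already closed when `list(data)` runs, so `leftover` is an empty list.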


### Goal 2

In [21]:
# import modules
from contextlib import contextmanager

First, let's create the iterator that our generator-based context manager will yield. Given the `csv.reader` and the namedtuple class, the following generator function yields one populated namedtuple per row.

In [22]:
def parsed_data_iter(data_iter, nt):
    for row in data_iter:
        yield nt(*row)

Now, let's write the context manager as a generator function, leveraging the `contextlib.contextmanager` decorator.

In [24]:
@contextmanager
def gen_like_ctxManager(f_name):
    f = open(f_name, 'r')
    # try/finally stands in for __enter__/__exit__: set-up before the yield, clean-up in finally
    try:
        reader = csv.reader(f, get_dialect(f_name))
        header = map(lambda x: x.lower(), next(reader))
        nt = namedtuple('Data', header)
        # now yield the iterator
        yield parsed_data_iter(reader, nt)
        
    finally:
        f.close()    

In [26]:
with gen_like_ctxManager(f_names[1]) as data:
    for row in islice(data, 2):
        print(row)

Data(ssn='100-53-9824', first_name='Sebastiano', last_name='Tester', gender='Male', language='Icelandic')
Data(ssn='101-71-4702', first_name='Cayla', last_name='MacDonagh', gender='Female', language='Lao')
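
To make the mapping between the generator function and the context manager protocol explicit, here is a small sketch with no file involved. Everything before the `yield` runs on entering the `with` block, the yielded value is what `as` binds, and the `finally` clause runs on exit, even when the body raises. The `events` list and the `tracked` function are hypothetical names used only to record the order of the steps.

```python
from contextlib import contextmanager

events = []  # records the order in which things happen


@contextmanager
def tracked(label):
    events.append(f'enter {label}')      # plays the role of __enter__
    try:
        yield label.upper()              # value bound by "as" in the with statement
    finally:
        events.append(f'exit {label}')   # plays the role of __exit__


with tracked('a') as value:
    events.append(f'body {value}')

try:
    with tracked('b'):
        raise RuntimeError('boom')       # the finally clause still runs
except RuntimeError:
    events.append('error propagated')

print(events)
```

This is the same shape as the function above: opening the file and building the reader is the set-up, and `f.close()` in `finally` is the clean-up.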


Finally, we can do some clean-up to make the function more self-contained, since the utility functions we wrote are strictly tied to what happens inside the generator context manager.

In [34]:
@contextmanager
def gen_like_ctxManager_cleaned(f_name):
    f = open(f_name, 'r')
    # try/finally stands in for __enter__/__exit__: set-up before the yield, clean-up in finally
    try:
        dialect = csv.Sniffer().sniff(f.read(1000))
        f.seek(0) # rewind to the beginning of the file, since we read up to 1000 characters to guess the dialect
        reader = csv.reader(f, dialect)
        header = map(lambda x: x.lower(), next(reader))
        nt = namedtuple('Data', header)
        # now yield the iterator
        yield (nt(*row) for row in reader)
        
    finally:
        f.close()    

In [35]:
with gen_like_ctxManager_cleaned(f_names[1]) as data:
    for row in islice(data, 2):
        print(row)

Data(ssn='100-53-9824', first_name='Sebastiano', last_name='Tester', gender='Male', language='Icelandic')
Data(ssn='101-71-4702', first_name='Cayla', last_name='MacDonagh', gender='Female', language='Lao')
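
One difference from the class in Goal 1 is worth keeping in mind: the generator yielded here has no check for a closed file, so it should be consumed inside the `with` block. Here is a minimal sketch of what happens otherwise, assuming a made-up comma-separated sample file written to a temporary directory and a hypothetical `open_rows` function that mirrors the version above (with a `delimiters` hint added because the sample is so short):

```python
import csv
import os
import tempfile
from collections import namedtuple
from contextlib import contextmanager


@contextmanager
def open_rows(f_name):
    """Hypothetical stand-in mirroring the cleaned-up context manager."""
    f = open(f_name, newline='')
    try:
        dialect = csv.Sniffer().sniff(f.read(1000), delimiters=';,')
        f.seek(0)  # rewind after reading the sample
        reader = csv.reader(f, dialect)
        nt = namedtuple('Data', [h.lower() for h in next(reader)])
        yield (nt(*row) for row in reader)
    finally:
        f.close()


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.csv')  # made-up sample file
    with open(path, 'w', newline='') as f:
        f.write('ssn,first_name\n100-53-9824,Sebastiano\n101-71-4702,Cayla\n')

    with open_rows(path) as data:
        first = next(data)        # fine: the file is still open
    try:
        next(data)                # the file was closed when the block ended
        outcome = 'row returned'
    except ValueError:
        outcome = 'ValueError'    # reading from a closed file

print(first, outcome)
```

If rows are needed after the block, one option is to materialise them inside it, for example with `list(islice(data, n))`.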
