# Introduction

Sometimes we need to treat several CSV files as a single CSV source. There are a few ways to go about this task.

In [4]:
# Contents of no_header1.csv
with open('no_header1.csv') as f:
    print(f.read())

501,karen,bash
502,john,tcsh


In [5]:
# Contents of no_header2.csv
with open('no_header2.csv') as f:
    print(f.read())

601,peter,bash
602,paul,tcsh
603,mary,zsh


## Homemade Solution

The first and most obvious way is to write a generator that opens each file in turn and yields its lines. Only one file is open at any given time.

In [3]:
import csv

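def open_multiple_files(*filenames):
    """Yield the lines of each file in turn, one open file at a time."""
    for filename in filenames:
        # The csv documentation recommends newline='' for files given to csv.reader
        with open(filename, newline='') as f:
            yield from f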

reader = csv.reader(open_multiple_files('no_header1.csv', 'no_header2.csv'))
for record in reader:
    print(record)

['501', 'karen', 'bash']
['502', 'john', 'tcsh']
['601', 'peter', 'bash']
['602', 'paul', 'tcsh']
['603', 'mary', 'zsh']


## Using fileinput

We can use the standard library module `fileinput` to chain many files together. It opens the files one at a time as the reader advances.

In [6]:
import csv
import fileinput

reader = csv.reader(fileinput.input(['no_header1.csv', 'no_header2.csv']))
for record in reader:
    print(record)

['501', 'karen', 'bash']
['502', 'john', 'tcsh']
['601', 'peter', 'bash']
['602', 'paul', 'tcsh']
['603', 'mary', 'zsh']
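
`fileinput.input` can also be used as a context manager, which closes the file that is open when the block exits. The following is a minimal sketch rather than a definitive recipe. It recreates the two sample files in a temporary directory (the `samples` and `paths` names are illustrative assumptions) so that it runs on its own:

```python
import csv
import fileinput
import tempfile
from pathlib import Path

# Recreate the two sample files so the sketch is self-contained.
tmpdir = Path(tempfile.mkdtemp())
samples = {
    'no_header1.csv': '501,karen,bash\n502,john,tcsh\n',
    'no_header2.csv': '601,peter,bash\n602,paul,tcsh\n603,mary,zsh\n',
}
paths = []
for name, text in samples.items():
    path = tmpdir / name
    path.write_text(text)
    paths.append(str(path))

# The with-statement closes the currently open file when the block exits.
with fileinput.input(files=paths) as lines:
    records = list(csv.reader(lines))

for record in records:
    print(record)
```

Collecting the records into a list inside the `with` block matters here: `csv.reader` is lazy, so reading from it after the block exits would find the input already closed.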


## Using itertools.chain

This method uses the `itertools.chain` function to chain together a list of file objects. The disadvantage is that it opens all of the files up front, which is a problem when the number of files is large. The files are also never explicitly closed.

In [3]:
import csv
import itertools

filenames = ['no_header1.csv', 'no_header2.csv']
contents = itertools.chain(*[open(f) for f in filenames])
for record in csv.reader(contents):
    print(record)

['501', 'karen', 'bash']
['502', 'john', 'tcsh']
['601', 'peter', 'bash']
['602', 'paul', 'tcsh']
['603', 'mary', 'zsh']
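
If the files should at least be closed deterministically, one option is `contextlib.ExitStack`. This is a sketch under the assumption that opening every file at once is acceptable; it still holds all files open together, but closes each of them when the block exits. The sample files are recreated in a temporary directory (an illustrative setup, not part of the original cells) so the code runs on its own:

```python
import csv
import itertools
import tempfile
from contextlib import ExitStack
from pathlib import Path

# Recreate the two sample files so the sketch is self-contained.
tmpdir = Path(tempfile.mkdtemp())
samples = {
    'no_header1.csv': '501,karen,bash\n502,john,tcsh\n',
    'no_header2.csv': '601,peter,bash\n602,paul,tcsh\n603,mary,zsh\n',
}
paths = []
for name, text in samples.items():
    path = tmpdir / name
    path.write_text(text)
    paths.append(path)

with ExitStack() as stack:
    # Every file registered with the stack is closed when the block exits.
    files = [stack.enter_context(open(p, newline='')) for p in paths]
    records = list(csv.reader(itertools.chain(*files)))

for record in records:
    print(record)
```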


We can also use `itertools.chain.from_iterable` to accomplish the same thing without unpacking the list into separate arguments:

In [6]:
import csv
import itertools

filenames = ['no_header1.csv', 'no_header2.csv']
contents = itertools.chain.from_iterable([open(f) for f in filenames])
for record in csv.reader(contents):
    print(record)

['501', 'karen', 'bash']
['502', 'john', 'tcsh']
['601', 'peter', 'bash']
['602', 'paul', 'tcsh']
['603', 'mary', 'zsh']
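
Because `chain.from_iterable` consumes its argument lazily, it can be given a generator expression instead of a list, so that each file is opened only when the previous one is exhausted. The sketch below assumes a small hypothetical helper, `lines_from`, that also closes each file once its lines have been read. The sample files are recreated in a temporary directory so the code runs on its own:

```python
import csv
import itertools
import tempfile
from pathlib import Path

# Recreate the two sample files so the sketch is self-contained.
tmpdir = Path(tempfile.mkdtemp())
samples = {
    'no_header1.csv': '501,karen,bash\n502,john,tcsh\n',
    'no_header2.csv': '601,peter,bash\n602,paul,tcsh\n603,mary,zsh\n',
}
paths = []
for name, text in samples.items():
    path = tmpdir / name
    path.write_text(text)
    paths.append(path)

def lines_from(path):
    """Hypothetical helper: yield the lines of one file, then close it."""
    with open(path, newline='') as f:
        yield from f

# The generator expression defers opening each file until it is needed.
contents = itertools.chain.from_iterable(lines_from(p) for p in paths)
records = list(csv.reader(contents))

for record in records:
    print(record)
```

This ends up very close to the homemade solution: only one file is open at a time, and `itertools` supplies the chaining.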
