# Manually Working with Delimited Formats

Most tabular data can be loaded from disk with functions like pandas.read_table. Sometimes, however, manual processing is necessary: it’s not uncommon to receive a file with one or more malformed lines that trip up read_table. To illustrate the basic tools, consider the small CSV file ex7.csv used in the cells below.

For any file with a single-character delimiter, you can use Python’s built-in csv module. To use it, pass any open file or file-like object to csv.reader:

In [1]:
import csv

In [2]:
f = open('../../CSV Files/O_Reilly/ch06/ex7.csv')

f

<_io.TextIOWrapper name='../../CSV Files/O_Reilly/ch06/ex7.csv' mode='r' encoding='cp1252'>

In [3]:
read = csv.reader(f)

read

<_csv.reader at 0x161f66db6a0>

Iterating through the reader like a file yields one list of string values per line, with any quote characters removed:

In [4]:
for line in read:
    print(line)

['a', 'b', 'c']
['1', '2', '3']
['1', '2', '3', '4']
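
Because csv.reader accepts any file-like object, the same behavior can be tried without a file on disk. The following is a minimal sketch that assumes made-up sample text in an io.StringIO buffer; the quoting shown is an assumption, not the actual contents of ex7.csv.

```python
import csv
import io

# Hypothetical sample text standing in for a file on disk
sample = io.StringIO('"a","b","c"\n"1","2","3"\n"1","2","3","4"\n')

# Each row comes back as a list of strings with the quotes stripped
rows = list(csv.reader(sample))
for row in rows:
    print(row)
```

Note that every value is returned as a string; converting to numbers is left to the caller.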


From there, it’s up to you to do the wrangling necessary to put the data in the form you need. For example:

In [5]:
lines = list(csv.reader(open('../../CSV Files/O_Reilly/ch06/ex7.csv')))

lines

[['a', 'b', 'c'], ['1', '2', '3'], ['1', '2', '3', '4']]

In [6]:
header = lines[0]
values = lines[1:]

# Can also be written as

header, values = lines[0], lines[1:]

header, values

(['a', 'b', 'c'], [['1', '2', '3'], ['1', '2', '3', '4']])

In [7]:
# zip stops at the shortest row, so the extra '4' in the last line is dropped
data_dict = {h: v for h, v in zip(header, zip(*values))}

data_dict

{'a': ('1', '1'), 'b': ('2', '2'), 'c': ('3', '3')}
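
The dictionary above silently loses the fourth value of the last line, because zip stops at the shortest input. One possible way to surface such malformed lines instead is sketched below. The sample text and the names bad_rows and good_rows are illustrative assumptions, not part of the original example.

```python
import csv
import io

# Hypothetical sample with one row that is wider than the header
sample = io.StringIO("a,b,c\n1,2,3\n1,2,3,4\n")
header, *values = csv.reader(sample)

# Record the 1-based line number of every row whose width differs from the header
bad_rows = [(lineno, row)
            for lineno, row in enumerate(values, start=2)
            if len(row) != len(header)]

# Keep only well-formed rows before transposing them into columns
good_rows = [row for row in values if len(row) == len(header)]
data_dict = {h: v for h, v in zip(header, zip(*good_rows))}

print(bad_rows)
print(data_dict)
```

Whether to drop, truncate, or repair the flagged rows depends on the data; this sketch only drops them.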

CSV files come in many different flavors. Defining a new format with a different delimiter, string quoting convention, or line terminator is done by defining a simple subclass of *csv.Dialect*:

In [8]:
class my_Dialect(csv.Dialect):
    lineterminator = '\n'
    delimiter = ';'
    quotechar = '"'
    quoting = csv.QUOTE_MINIMAL

In [9]:
read = csv.reader(f, dialect=my_Dialect)

read

<_csv.reader at 0x161f66dab60>
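
To see a dialect take effect, it helps to parse text that actually uses the new delimiter. The sketch below assumes a hypothetical class name, SemicolonDialect, and made-up sample text; it shows that a quoted field may contain the delimiter itself.

```python
import csv
import io

class SemicolonDialect(csv.Dialect):  # hypothetical name for illustration
    lineterminator = "\n"
    delimiter = ";"
    quotechar = '"'
    quoting = csv.QUOTE_MINIMAL

# The middle field of the second line contains a semicolon inside quotes
sample = io.StringIO('x;y;z\n1;"2;5";3\n')
rows = list(csv.reader(sample, dialect=SemicolonDialect))
print(rows)
```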

Individual CSV dialect parameters can also be given as keywords to csv.reader without having to define a subclass:

In [68]:
read = csv.reader(f, delimiter="|")

read

<_csv.reader at 0x25aaa7ba080>

![csv dialect options](../../Pictures/csv%20dialect%20options.png)
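
As a small illustration of combining keyword options, the sketch below parses pipe-delimited text in which each delimiter is followed by a space. The sample text is made up; delimiter and skipinitialspace are standard csv formatting parameters.

```python
import csv
import io

# Hypothetical pipe-delimited text with a space after every delimiter
sample = io.StringIO("a| b| c\n1| 2| 3\n")

# skipinitialspace=True drops whitespace that immediately follows a delimiter
rows = list(csv.reader(sample, delimiter="|", skipinitialspace=True))
print(rows)
```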

To write delimited files manually, you can use csv.writer. It accepts an open, writable file object and the same dialect and format options as csv.reader:

In [10]:
# newline='' lets the csv module control line endings itself
with open('../../CSV Files/O_Reilly/ch06/b1ch67.csv', 'w', newline='') as f:
    writer = csv.writer(f, dialect=my_Dialect)
    writer.writerow(('one', 'two', 'three'))
    writer.writerows((('1', '2', '3'),
                      ('4', '5', '6'),
                      ('7', '8', '9')))

In [16]:
import pandas as pd

pd.read_csv('../../CSV Files/O_Reilly/ch06/b1ch67.csv', sep=';')

   one  two  three
0    1    2      3
1    4    5      6
2    7    8      9
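
The write-then-read round trip can also be checked entirely in memory. The following sketch assumes an io.StringIO buffer in place of a file and made-up row values; it shows that csv.writer quotes a field containing the delimiter and that csv.reader recovers the original value.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
writer.writerow(("one", "two", "three"))
# The middle field of the last row contains the delimiter, so it gets quoted
writer.writerows([("1", "2", "3"), ("4", "5; 6", "7")])

text = buffer.getvalue()
print(text)

# Rewind and read the same text back with a matching delimiter
buffer.seek(0)
rows = list(csv.reader(buffer, delimiter=";"))
print(rows)
```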
