## **CSV File Reading and Writing**


The CSV (Comma Separated Values) format is the most common import and export format for spreadsheets and databases.

The csv module implements classes to read and write tabular data in CSV format.

It allows programmers to say, “write this data in the format preferred by Excel,” or “read data from this file which was generated by Excel,” without knowing the precise details of the CSV format used by Excel.

### **Classes, methods, and variables**

In [48]:
import csv

methods = [i for i in dir(csv) if not i.startswith('_') and i != 're']
fmt = '{:20s}'*3

for fn in zip(*[iter(methods)] *3):
    print(fmt.format(*fn))  

Dialect             DictReader          DictWriter          
Error               OrderedDict         QUOTE_ALL           
QUOTE_MINIMAL       QUOTE_NONE          QUOTE_NONNUMERIC    
Sniffer             StringIO            excel               
excel_tab           field_size_limit    get_dialect         
list_dialects       reader              register_dialect    
unix_dialect        unregister_dialect  writer              


In [47]:
import csv

methods = [i for i in dir(csv) if not i.startswith('_') and i != 're']

for i in methods:
    print(i+':')
    print(getattr(csv,i).__doc__)
    print('*'*50)

Dialect:
Describe a CSV dialect.

    This must be subclassed (see csv.excel).  Valid attributes are:
    delimiter, quotechar, escapechar, doublequote, skipinitialspace,
    lineterminator, quoting.

    
**************************************************
DictReader:
None
**************************************************
DictWriter:
None
**************************************************
Error:
None
**************************************************
OrderedDict:
Dictionary that remembers insertion order
**************************************************
QUOTE_ALL:
int(x=0) -> integer
int(x, base=10) -> integer

Convert a number or string to an integer, or return 0 if no arguments
are given.  If x is a number, return x.__int__().  For floating point
numbers, this truncates towards zero.

If x is not a number or if base is given, then x must be a string,
bytes, or bytearray instance representing an integer literal in the
given base.  The literal can be preceded by '+' or '-' and be sur

### **reader(csvfile, dialect='excel', **fmtparams)**

<ol type="1">
    <li> Return a reader object which will iterate over lines in the given csvfile.</li>
<br></br>
<li>csvfile can be any object which supports the iterator protocol and returns a string each time  <br></br>its __next__() method is called — file objects and list objects are both suitable.</li> 
<br></br>
<li>If csvfile is a file object, it should be opened with newline=''. </li>
<br></br>
<li>An optional dialect parameter can be given which is used to define a set of parameters specific to <br></br>a particular CSV dialect. It may be an instance of a subclass of the Dialect class or one of the strings <br></br>returned by the list_dialects() function.</li>
<br></br>
<li>The other optional fmtparams keyword arguments can be given to override individual formatting parameters <br></br>in the current dialect.</li>
<br></br>
<li>Each row read from the csv file is returned as a list of strings. 
No automatic data type conversion is performed <br></br> unless the <b>QUOTE_NONNUMERIC</b>
option is specified (in which case unquoted fields are transformed into floats).</li>
</ol>
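
The notes above can be checked without touching the file system. The following is a minimal sketch, assuming a small made-up list of lines: it passes a plain list of strings as csvfile and compares the default behaviour with QUOTE_NONNUMERIC, where unquoted fields come back as floats.

```python
import csv

# Hypothetical sample data: a list of strings works as csvfile because
# csv.reader only needs an iterable that yields one line at a time.
lines = ['"Name","Score"', '"Jack",23', '"Miller",22.5']

# Default behaviour: every field is returned as a string.
plain_rows = list(csv.reader(lines))

# With QUOTE_NONNUMERIC, unquoted fields are converted to float.
typed_rows = list(csv.reader(lines, quoting=csv.QUOTE_NONNUMERIC))

print(plain_rows)
print(typed_rows)
```

Note that QUOTE_NONNUMERIC raises an error on the reader side if an unquoted field cannot be converted to a float, so it only suits files where every text field is quoted.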

In [20]:
import csv
with open('/home/mana/Work/innovators.csv', 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

['SN|Name|Contribution']
['1|Linus Torvalds|Linux Kernel']
['2|Tim Berners-Lee|World Wide Web']
['3|Guido van Rossum|Python Programming']


#### **CSV files with Custom Delimiters**

By default, a comma is used as a delimiter in a CSV file. However, some CSV files use other delimiters. Two popular ones are | and \t.

#### **Read CSV file Having | Delimiter**

In [21]:
import csv
with open('/home/mana/Work/innovators.csv', 'r') as file:
    reader = csv.reader(file, delimiter = '|')
    for row in reader:
        print(row)

['SN', 'Name', 'Contribution']
['1', 'Linus Torvalds', 'Linux Kernel']
['2', 'Tim Berners-Lee', 'World Wide Web']
['3', 'Guido van Rossum', 'Python Programming']


#### **Read CSV file Having Tab Delimiter**

In [None]:
import csv
with open('/home/mana/Work/innovators.csv', 'r') as file:
    reader = csv.reader(file, delimiter = '\t')
    for row in reader:
        print(row)

#### **CSV files with initial spaces**
Some CSV files have a space character after each delimiter. When we read such a file with the default csv.reader() settings, the spaces appear in the output as well.

To remove these initial spaces, we pass an additional parameter, skipinitialspace=True.

<pre>
SN, Name, City
1, John, Washington
2, Eric, Los Angeles
3, Brad, Texas
</pre>

In [26]:
import csv
with open('/home/mana/Work/peoples.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, skipinitialspace=True)
    for row in reader:
        print(row)

['Name', 'Age', 'Profession']
['Jack', '23', 'Doctor']
['Miller', '22', 'Engineer']
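
The peoples.csv output above contains no leading spaces to begin with, so the effect of skipinitialspace is easier to see on the SN, Name, City sample shown earlier. A minimal sketch, assuming that sample is held in a string rather than a file:

```python
import csv
import io

# The sample from the text, kept in memory instead of on disk.
sample = ("SN, Name, City\n"
          "1, John, Washington\n"
          "2, Eric, Los Angeles\n"
          "3, Brad, Texas\n")

# Without skipinitialspace the space after each comma stays in the field.
with_spaces = list(csv.reader(io.StringIO(sample)))

# With skipinitialspace=True the space right after a delimiter is dropped.
without_spaces = list(csv.reader(io.StringIO(sample), skipinitialspace=True))

print(with_spaces[1])
print(without_spaces[1])
```

Only whitespace immediately following a delimiter is removed, so the space inside "Los Angeles" is preserved.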


In [28]:
import csv

with open('/home/mana/Work/eggs.csv', newline = '') as csvfile:
    csvreader = csv.reader(csvfile)
    for row in csvreader:
        print(row)

['4/5/2015 13:34', 'Apples', '73']
['4/5/2015 3:41', 'Cherries', '85']
['4/6/2015 12:46', 'Pears', '14']
['4/8/2015 8:59', 'Oranges', '52']
['4/10/2015 2:07', 'Apples', '152']
['4/10/2015 18:10', 'Bananas', '23']
['4/10/2015 2:40', 'Strawberries', '98']


#### **Read CSV file with a space delimiter**

In [29]:
import csv

with open('/home/mana/Work/eggs.csv', newline = '') as csvfile:
    csvreader = csv.reader(csvfile, delimiter = ' ')
    for row in csvreader:
        print(row)

['4/5/2015', '13:34,Apples,73']
['4/5/2015', '3:41,Cherries,85']
['4/6/2015', '12:46,Pears,14']
['4/8/2015', '8:59,Oranges,52']
['4/10/2015', '2:07,Apples,152']
['4/10/2015', '18:10,Bananas,23']
['4/10/2015', '2:40,Strawberries,98']


#### **Read CSV files with quotes**

In [32]:
import csv
with open('/home/mana/Work/Quote.csv', 'r') as file:
    reader = csv.reader(file, quoting=csv.QUOTE_ALL, delimiter = ';')
    for row in reader:
        print(row)

['SN', 'Name', 'Quotes']
['1', 'Buddha', 'What we think we become']
['2', 'Mark Twain', 'Never regret anything that made you smile']
['3', 'Oscar Wilde', 'Be yourself everyone else is already taken']


#### **Read CSV files using dialect**

Suppose we have a CSV file (office.csv) with the following content:
<pre>
"ID"| "Name"| "Email"
"A878"| "Alfonso K. Hamby"| "alfonsokhamby@rhyta.com"
"F854"| "Susanne Briard"| "susannebriard@armyspy.com"
"E833"| "Katja Mauer"| "kmauer@jadoop.com"
</pre>

The CSV file has initial spaces, quotes around each entry, and uses a | delimiter.

Instead of passing three individual formatting parameters, let's look at how to use dialects to read this file.

#### **create office.csv**

In [8]:
import csv

data = [["ID", "Name", "Email"],
        ["A878", "Alfonso K. Hamby", "alfonsokhamby@rhyta.com"],
        ["F854", "Susanne Briard", "susannebriard@armyspy.com"],
        ["E833", "Katja Mauer", "kmauer@jadoop.com"]]

with open('/home/mana/Work/office.csv', 'w', newline = '') as officecsv:
    writer = csv.writer(officecsv, delimiter = '|', skipinitialspace = True, 
                        quoting = csv.QUOTE_ALL)
    writer.writerows(data)

#### **Read office.csv using a dialect**

In [10]:
import csv
csv.register_dialect('myDialect',
                     delimiter='|',
                     skipinitialspace=True,
                     quoting=csv.QUOTE_ALL)

with open('/home/mana/Work/office.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, dialect='myDialect')
    for row in reader:
        print(row)

['ID', 'Name', 'Email']
['A878', 'Alfonso K. Hamby', 'alfonsokhamby@rhyta.com']
['F854', 'Susanne Briard', 'susannebriard@armyspy.com']
['E833', 'Katja Mauer', 'kmauer@jadoop.com']


#### **Read passwd file**

In [None]:
import csv

with open('/etc/passwd', newline = '') as passfile:
    csvreader = csv.reader(passfile, delimiter = ':')
    for row in csvreader:
        print(row)

### **writer(csvfile, dialect='excel', **fmtparams)**


<ol type="1">
<li>Return a writer object responsible for converting the user’s data into delimited strings on the given file-like object.</li>
<br></br>
 <li>csvfile can be any object with a write() method.</li>
<br></br>
<li>If csvfile is a file object, it should be opened with newline=''</li>
<br></br>
<li>An optional dialect parameter can be given which is used to define a set of parameters specific to a particular<br></br> CSV dialect.</li>
<br></br>
 <li>To make it as easy as possible to interface with modules which implement the DB API, the value None is written <br></br>as the empty string. While this isn’t a reversible transformation, it makes it easier to dump SQL NULL data values<br></br> to CSV files without preprocessing the data returned from a cursor.fetch* call.</li>
<br></br>
<li>All other non-string data are stringified with str() before being written.</li>
</ol>


In [9]:
import csv

with open('/home/mana/Work/csvtest.csv', 'w', newline = '') as csvfile:
    writer = csv.writer(csvfile, delimiter = ',', quotechar = '|', quoting = csv.QUOTE_MINIMAL)
    writer.writerow(['Manavalan', 25])
    writer.writerow(['Joe', 10])
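
The remarks on None and non-string data in the writer notes can be verified with an in-memory buffer. This is a minimal sketch, assuming io.StringIO as a stand-in for a real file opened with newline=''; the row values are made up.

```python
import csv
import io

buffer = io.StringIO()  # stand-in for a file opened with newline=''
writer = csv.writer(buffer)

# None becomes an empty field; the numbers are converted to text.
writer.writerow(['Manavalan', None, 25, 3.5])

written = buffer.getvalue()
print(repr(written))
```

The default 'excel' dialect ends each row with \r\n, which is why files should be opened with newline='' so that the platform does not translate the line ending a second time.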

#### **Writing Multiple Rows with writerows()**

If we need to write the contents of a 2-dimensional list to a CSV file, here's how we can do it.

In [11]:
import csv
row_list = [["SN", "Name", "Contribution"],
             [1, "Linus Torvalds", "Linux Kernel"],
             [2, "Tim Berners-Lee", "World Wide Web"],
             [3, "Guido van Rossum", "Python Programming"]]
with open('/home/mana/Work/protagonist.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(row_list)

#### **How to convert file etc-passwd to csv file?**

In [7]:
import csv

with open('/etc/passwd', newline = '') as prfile, open('pass.csv', 'w', newline = '') as pwfile:
    csvreader = csv.reader(prfile, delimiter = ':')
    writer = csv.writer(pwfile, delimiter = ',',quoting = csv.QUOTE_MINIMAL)
    
    for x in csvreader:
        writer.writerow(x)    

In [8]:
import csv

with open('/etc/passwd', newline = '') as prfile, open('pass.csv', 'w', newline = '') as pwfile:
    csvreader = csv.reader(prfile, delimiter = ':')
    writer = csv.writer(pwfile, delimiter = ',',quoting = csv.QUOTE_MINIMAL)
    writer.writerows(csvreader)    

### **Delimiters**

#### **CSV Files with Custom Delimiters**

By default, a **comma** is used as a delimiter in a CSV file. 


However, some CSV files use delimiters other than a comma. Two popular ones are **| and \t**.


Suppose we want to use | as a delimiter in the CSV file. We can pass an additional delimiter parameter to the csv.writer() function.





In [12]:
import csv
data_list = [["SN", "Name", "Contribution"],
             [1, "Linus Torvalds", "Linux Kernel"],
             [2, "Tim Berners-Lee", "World Wide Web"],
             [3, "Guido van Rossum", "Python Programming"]]
with open('/home/mana/Work/innovators.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter='|')
    writer.writerows(data_list)

### **Quote**

=> Some CSV files have quotes around each or some of the entries.


**Example:**<br></br>1;"Buddha";"What we think we become"


=> By default, csv.writer() quotes only entries that contain special characters such as the delimiter, so it will not add these quotes.



=> In order to add them, we will have to use another optional parameter called quoting.

Let's take an example of how quoting can be used around the non-numeric values and ; as delimiters.

In [19]:
import csv
row_list = [
    ["SN", "Name", "Quotes"],
    [1, "Buddha", "What we think we become"],
    [2, "Mark Twain", "Never regret anything that made you smile"],
    [3, "Oscar Wilde", "Be yourself everyone else is already taken"]
]


with open('/home/mana/Work/Quote.csv', 'w', newline = '') as csvfile:
    writer = csv.writer(csvfile, quoting = csv.QUOTE_NONNUMERIC, delimiter = ';')
    writer.writerows(row_list)

As you can see, we have passed csv.QUOTE_NONNUMERIC to the quoting parameter. It is a constant defined by the csv module.

**csv.QUOTE_NONNUMERIC** instructs the writer object to add quotes around all non-numeric entries.

____
The quoting parameter accepts the following predefined constants:

**=> csv.QUOTE_ALL** - Instructs the writer object to quote all entries.

**=> csv.QUOTE_NONNUMERIC** - Instructs the writer object to quote all non-numeric entries.

**=> csv.QUOTE_MINIMAL** - Instructs the writer object to quote only those fields which contain special characters (delimiter, quotechar or any characters in lineterminator). It is the default value.

**=> csv.QUOTE_NONE** - Instructs the writer object to never quote entries. If an entry contains the delimiter, the writer escapes it with escapechar, and raises Error if no escapechar is set.
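
To compare the quoting constants side by side, the same row can be written once per constant. The following is a minimal sketch: the helper render() and the sample row are hypothetical names introduced here, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io

# Sample row; note the ; inside the last field, which is also the delimiter.
row = [1, "Buddha", "What we think; we become"]

def render(**fmtparams):
    """Write the row to an in-memory buffer and return the text produced."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=';', **fmtparams).writerow(row)
    return buffer.getvalue()

quote_all = render(quoting=csv.QUOTE_ALL)
quote_nonnumeric = render(quoting=csv.QUOTE_NONNUMERIC)
quote_minimal = render(quoting=csv.QUOTE_MINIMAL)
# QUOTE_NONE needs an escapechar here because the data contains the delimiter.
quote_none = render(quoting=csv.QUOTE_NONE, escapechar='\\')

for name, text in [('QUOTE_ALL', quote_all),
                   ('QUOTE_NONNUMERIC', quote_nonnumeric),
                   ('QUOTE_MINIMAL', quote_minimal),
                   ('QUOTE_NONE', quote_none)]:
    print(name, repr(text))
```

With QUOTE_MINIMAL only the last field is quoted, because it is the only one that contains the delimiter.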


### **quoting character**

#### **CSV files with custom quoting character**

We can also write CSV files with custom quoting characters. For that, we will have to use an optional parameter called quotechar.

In [21]:
import csv
row_list = [
    ["SN", "Name", "Quotes"],
    [1, "Buddha", "What we think we become"],
    [2, "Mark Twain", "Never regret anything that made you smile"],
    [3, "Oscar Wilde", "Be yourself everyone else is already taken"]
]

with open('/home/mana/Work/Quotechar.csv', 'w', newline = '') as csvfile:
    writer = csv.writer(csvfile, quotechar = '*', quoting = csv.QUOTE_NONNUMERIC, delimiter = ';')
    writer.writerows(row_list)

Here, we can see that quotechar='*' parameter instructs the writer object to use * as quote for all non-numeric values.

In [22]:
import csv
row_list = [
    ["SN", "Name", "Quotes"],
    [1, "Buddha", "What we think we become"],
    [2, "Mark Twain", "Never regret anything that made you smile"],
    [3, "Oscar Wilde", "Be yourself everyone else is already taken"]
]

with open('/home/mana/Work/Quotechar.csv', 'w', newline = '') as csvfile:
    writer = csv.writer(csvfile, quotechar = '*', quoting = csv.QUOTE_ALL, delimiter = ';')
    writer.writerows(row_list)

Here, we can see that the quotechar='*' parameter, combined with csv.QUOTE_ALL, instructs the writer object to use * as the quote character for all entries.

### **What is dialect?**
A dialect groups together many specific formatting parameters, such as delimiter, skipinitialspace, quoting, and escapechar, under a single dialect name.

It can then be passed as a parameter to multiple writer or reader instances.

=> **A dialect defines the quoting and delimiter as required.**

=> **The custom dialect requires a name in the form of a string.**

#### **How to create custom dialect?**

Suppose the CSV file should have quotes around each entry and use ; as a delimiter. Instead of passing two individual formatting parameters, let's look at how to use dialects to write this file.

### **register_dialect(name[, dialect[, **fmtparams]])**

=> **Create Dialect**

=> **Register Dialect**

In [24]:
import csv

csv.register_dialect('mydialect', delimiter = ';', quoting = csv.QUOTE_ALL)

We can see that the csv.register_dialect() function is used to define a custom dialect.

The custom dialect requires a name in the form of a string. The other specifications can be given either by passing a subclass of the Dialect class, or by individual formatting parameters as shown in the example.

While creating the writer object, we pass **dialect='mydialect'** to specify that the writer instance must use that particular dialect.

The advantage of using a dialect is that it makes the program more modular. Notice that we can reuse mydialect to write other CSV files without having to re-specify the CSV format.



In [4]:
import csv
row_list = [
    ["ID", "Name", "Email"],
    ["A878", "Alfonso K. Hamby", "alfonsokhamby@rhyta.com"],
    ["F854", "Susanne Briard", "susannebriard@armyspy.com"],
    ["E833", "Katja Mauer", "kmauer@jadoop.com"]
]

csv.register_dialect('mydialect', delimiter = ';', quoting = csv.QUOTE_NONNUMERIC)
with open('/home/mana/Work/mydialect.csv', 'w', newline = '') as csvfile:
    writer = csv.writer(csvfile, dialect = 'mydialect')
    writer.writerows(row_list)
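
The text above mentions that a dialect can also be specified by passing a subclass of the Dialect class. A minimal sketch of that route follows. The class name PipeDialect and the dialect name 'pipe_demo' are hypothetical, chosen only for this illustration, and the data is written to an in-memory buffer.

```python
import csv
import io

class PipeDialect(csv.excel):
    """Hypothetical dialect: inherit the Excel defaults, override two settings."""
    delimiter = '|'
    quoting = csv.QUOTE_ALL

# 'pipe_demo' is an illustrative name, not one used elsewhere in this notebook.
csv.register_dialect('pipe_demo', PipeDialect)

buffer = io.StringIO()
csv.writer(buffer, dialect='pipe_demo').writerow(['A878', 'Alfonso K. Hamby'])
written = buffer.getvalue()

# The same dialect name works for reading the data back.
read_back = list(csv.reader(io.StringIO(written), dialect='pipe_demo'))

csv.unregister_dialect('pipe_demo')  # leave the registry as it was

print(repr(written))
print(read_back)
```

Subclassing csv.excel rather than csv.Dialect directly means only the attributes that differ need to be stated.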

### **get_dialect(name)**

Return the dialect associated with name. <br></br>An Error is raised if name is not a registered dialect name. 

**This function returns an immutable Dialect.**

In [5]:
csv.get_dialect('mydialect')

<_csv.Dialect at 0x7fa147fd61b8>
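
The returned object is more useful when its attributes are inspected. A minimal sketch, using the built-in 'excel' dialect so that it does not depend on earlier cells; 'no_such_dialect' is a deliberately unregistered, hypothetical name.

```python
import csv

excel = csv.get_dialect('excel')  # 'excel' is registered by the csv module itself
settings = {
    'delimiter': excel.delimiter,
    'quotechar': excel.quotechar,
    'lineterminator': excel.lineterminator,
    'quoting': excel.quoting,
}
print(settings)

# Asking for a name that is not registered raises csv.Error.
try:
    csv.get_dialect('no_such_dialect')
    lookup_failed = False
except csv.Error:
    lookup_failed = True
print(lookup_failed)
```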

### **list_dialects()**

**Return the names of all registered dialects.**

In [6]:
csv.list_dialects()

['excel', 'excel-tab', 'unix', 'mydialect']

### **unregister_dialect(name)**

Delete the dialect associated with name from the dialect registry. <br></br> An Error is raised if name is not a registered dialect name.

In [34]:
csv.unregister_dialect('mydialect')

In [7]:
csv.list_dialects()

['excel', 'excel-tab', 'unix']
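
Because notebook cells can be run out of order, the registry is easier to follow when the whole life cycle runs in one cell. A minimal sketch, assuming a throwaway dialect name 'demo_dialect' that is not used elsewhere:

```python
import csv

# 'demo_dialect' is an illustrative name chosen for this sketch.
csv.register_dialect('demo_dialect', delimiter=';', quoting=csv.QUOTE_ALL)
registered = 'demo_dialect' in csv.list_dialects()

csv.unregister_dialect('demo_dialect')
still_registered = 'demo_dialect' in csv.list_dialects()

# Unregistering a name that is no longer in the registry raises csv.Error.
try:
    csv.unregister_dialect('demo_dialect')
    second_call_failed = False
except csv.Error:
    second_call_failed = True

print(registered, still_registered, second_call_failed)
```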

### **field_size_limit([new_limit])**

Returns the current maximum field size allowed by the parser. <br></br>If new_limit is given, this becomes the new limit.

In [37]:
csv.field_size_limit()

131072
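
When a new limit is passed, the call returns the previous limit, which makes it easy to restore afterwards. The following is a minimal sketch with an artificially small limit of 10 characters and a made-up input line, so that the parser rejects the long field.

```python
import csv

# Set a tiny limit for demonstration; the previous limit is returned.
old_limit = csv.field_size_limit(10)
try:
    list(csv.reader(['short,' + 'x' * 50]))  # second field has 50 characters
    too_large = False
except csv.Error:
    too_large = True
finally:
    csv.field_size_limit(old_limit)  # restore the previous limit

print(too_large)
print(csv.field_size_limit() == old_limit)
```

In practice the limit is usually raised rather than lowered, for example when a file contains very long text fields.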

### **DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)**

**The objects of a csv.DictReader() class can be used to read a CSV file as a dictionary.**

<ol type="1">

<li>Create an object that operates like a regular reader but maps the information in each row to<br></br> a dict whose keys are given by the optional fieldnames parameter.</li>
<br></br>
<li>The fieldnames parameter is a sequence. If fieldnames is omitted, the values in the first row <br></br>of file f will be used as the fieldnames. Regardless of how the fieldnames are determined, <br></br>the dictionary preserves their original ordering.</li>
<br></br>
<li>If a row has more fields than fieldnames, the remaining data is put in a list and stored with the fieldname <br></br>specified by restkey (which defaults to None). If a non-blank row has fewer fields than fieldnames,<br></br> the missing values are filled-in with the value of restval (which defaults to None).</li>
 <br></br>
<li>All other optional or keyword arguments are passed to the underlying reader instance.</li>
</ol>
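
The restkey and restval rules above are easiest to see on rows of uneven length. A minimal sketch, assuming made-up in-memory data; the key name 'rest' and the filler value 'n/a' are arbitrary choices for the illustration.

```python
import csv
import io

# Hypothetical data: the second line has extra fields, the third has too few.
data = ("Name,Age\n"
        "Jack,23,Doctor,extra\n"
        "Miller\n")

reader = csv.DictReader(io.StringIO(data), restkey='rest', restval='n/a')
rows = [dict(row) for row in reader]

print(reader.fieldnames)
for row in rows:
    print(row)
```

The surplus fields of the first data row are collected in a list under 'rest', and the missing Age of the second row is filled with 'n/a'.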

In [22]:
import csv

help(csv.DictReader)

Help on class DictReader in module csv:

class DictReader(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __iter__(self)
 |  
 |  __next__(self)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  fieldnames



#### **Create Sample CSV**

&nbsp;

| Name   | Age | Profession |
|--------|-----|------------|
| Jack   | 23  | Doctor     |
| Miller | 22  | Engineer   |



In [20]:
import csv

peoples = [['Name','Age', 'Profession'],
           ['Jack',23,'Doctor'],
           ['Miller',22,'Engineer']]

with open('/home/mana/Work/peoples.csv', 'w', newline = '') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(peoples)

### **How to read the field header only?**

In [27]:
import csv

with open('/home/mana/Work/peoples.csv', newline = '') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)
    
header

['Name', 'Age', 'Profession']

In [32]:
import csv

with open('/home/mana/Work/peoples.csv', newline = '') as csvfile:
    header = csv.DictReader(csvfile).fieldnames    
    
header

['Name', 'Age', 'Profession']

### **How csv.DictReader() can be used?**

In [13]:
import csv

with open('/home/mana/Work/peoples.csv', newline = '') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['Name'], row['Age'])

Jack 23
Miller 22


In [8]:
import csv

with open('/home/mana/Work/peoples.csv', newline = '') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(dict(row))

{'Name': 'Jack', 'Age': '23', 'Profession': 'Doctor'}
{'Name': 'Miller', 'Age': '22', 'Profession': 'Engineer'}


As we can see, the entries of the first row are the dictionary keys, and the entries in the other rows are the dictionary values.

Here, reader is a csv.DictReader() object. The object can be iterated over using a for loop. In Python 3.6 and 3.7, csv.DictReader() returns an OrderedDict for each row, which is why we explicitly used dict() inside the for loop to convert each row to a plain dictionary.

**Note:** Starting from Python 3.8, csv.DictReader() returns a dictionary for each row, and we do not need to use dict() explicitly.


### **DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)**

**The objects of csv.DictWriter() class can be used to write to a CSV file from a Python dictionary.**

The minimal syntax of the csv.DictWriter() class is:

**csv.DictWriter(file, fieldnames)**
* f (or file) - the file object that the CSV data is written to

* fieldnames - a list containing the column headers, which also specifies the order in which data is written to the CSV file

<ol type="1">
<li>Create an object which operates like a regular writer but maps dictionaries onto output rows.</li> 
 <br></br>
<li>The fieldnames parameter is a sequence of keys that identify the order in which values in the<br></br> dictionary passed to the writerow() method are written to file f.</li> 
<br></br>
<li>The optional restval parameter specifies the value to be written if the dictionary is missing a key<br></br> in fieldnames.</li> 
<br></br>
<li>If the dictionary passed to the writerow() method contains a key not found in fieldnames, the optional <br></br>extrasaction parameter indicates what action to take. If it is set to 'raise', the default value, a ValueError<br></br> is raised. If it is set to 'ignore', extra values in the dictionary are ignored. Any other optional or keyword<br></br> arguments are passed to the underlying writer instance.</li>
<br></br>
<li>Note that unlike the DictReader class, the fieldnames parameter of the DictWriter class is not optional.</li>
</ol>
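
The restval and extrasaction parameters described above can be tried out on an in-memory buffer. A minimal sketch, assuming made-up dictionaries; the filler value 'unknown' and the extra key 'City' are arbitrary.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['Name', 'Age'],
                        restval='unknown', extrasaction='ignore')
writer.writeheader()
writer.writerow({'Name': 'Jack'})                                # Age missing: restval is written
writer.writerow({'Name': 'Miller', 'Age': 22, 'City': 'Texas'})  # City is silently ignored
written = buffer.getvalue()
print(repr(written))

# With the default extrasaction='raise', an unknown key raises ValueError.
strict_writer = csv.DictWriter(io.StringIO(), fieldnames=['Name', 'Age'])
try:
    strict_writer.writerow({'Name': 'Miller', 'City': 'Texas'})
    extra_key_rejected = False
except ValueError:
    extra_key_rejected = True
print(extra_key_rejected)
```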

#### **Write CSV header**

In [19]:
import csv

help(csv.DictWriter)

Help on class DictWriter in module csv:

class DictWriter(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  writeheader(self)
 |  
 |  writerow(self, rowdict)
 |  
 |  writerows(self, rowdicts)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



In [14]:
import csv

with open('/home/mana/Work/header.csv', 'w', newline = '') as hfile:
    fieldnames = ['Name', 'Age']
    writer = csv.DictWriter(hfile, fieldnames)
    writer.writeheader()

#### **Write CSV header with rows**

In [16]:
import csv 

with open('/home/mana/Work/info.csv', 'w', newline = '') as infofile:
    fieldnames = ['Name' , 'Age']
    writer = csv.DictWriter(infofile, fieldnames)
    writer.writeheader()
    writer.writerow({'Name': 'Manavalan', 'Age': 30})
    writer.writerow({'Name': 'Anthony', 'Age': 24})

### **Sniffer class**
**The Sniffer class is used to deduce the format of a CSV file.**

#### **The Sniffer class offers two methods:**

**sniff(sample, delimiters=None):** <ol type='1'>
<li>This function analyses a given sample of the CSV text <br></br>and returns a Dialect subclass that contains all the parameters deduced.</li>
<li>An optional delimiters parameter can be passed as a string containing <br></br>possible valid delimiter characters.</li>
</ol>

**has_header(sample):** 

This function returns True or False based on analyzing whether the sample CSV <br></br>has the first row as column headers.
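
Both methods can be tried on a string without opening a file. The following is a minimal sketch, assuming a made-up pipe-delimited sample. The delimiters argument narrows the candidates to | and ;, which tends to make the guess more reliable on small samples.

```python
import csv
import io

# Hypothetical pipe-delimited sample held in memory.
sample = ("SN|Name|Contribution\n"
          "1|Linus Torvalds|Linux Kernel\n"
          "2|Tim Berners-Lee|World Wide Web\n"
          "3|Guido van Rossum|Python Programming\n")

sniffer = csv.Sniffer()

# Restrict the candidate delimiters to '|' and ';'.
dialect = sniffer.sniff(sample, delimiters='|;')
print(repr(dialect.delimiter))

# has_header() is a heuristic, so treat its answer as a hint only.
print(sniffer.has_header(sample))

# The deduced dialect can be passed straight to csv.reader().
rows = list(csv.reader(io.StringIO(sample), dialect))
print(rows[1])
```

Since sniffing is heuristic, sniff() raises csv.Error when it cannot determine a delimiter, so production code would normally wrap the call in a try block.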

In [10]:
import csv

help(csv.Sniffer)

Help on class Sniffer in module csv:

class Sniffer(builtins.object)
 |  "Sniffs" the format of a CSV file (i.e. delimiter, quotechar)
 |  Returns a Dialect object.
 |  
 |  Methods defined here:
 |  
 |  __init__(self)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  has_header(self, sample)
 |  
 |  sniff(self, sample, delimiters=None)
 |      Returns a dialect (or None) corresponding to the sample
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



#### **Using csv.Sniffer() to deduce the dialect of CSV files**

**Sample CSV**
<pre>
"ID"| "Name"| "Email"
A878| "Alfonso K. Hamby"| "alfonsokhamby@rhyta.com"
F854| "Susanne Briard"| "susannebriard@armyspy.com"
E833| "Katja Mauer"| "kmauer@jadoop.com" 
</pre>

In [38]:
import csv

with open('/home/mana/Work/office.csv', 'r') as file:
    sample = file.read(64)
    check_header = csv.Sniffer().has_header(sample)
    print('Check whether csvfile has header or not: ',check_header)
    deduced_dialect = csv.Sniffer().sniff(sample)
    
with open('/home/mana/Work/office.csv', newline = '') as csvfile:
    reader = csv.reader(csvfile, deduced_dialect)
    
    for row in reader:
        print(row)

Check whether csvfile has header or not:  True
['ID', 'Name', 'Email']
['A878', 'Alfonso K. Hamby', 'alfonsokhamby@rhyta.com']
['F854', 'Susanne Briard', 'susannebriard@armyspy.com']
['E833', 'Katja Mauer', 'kmauer@jadoop.com']


As you can see, we read only 64 characters of office.csv and stored it in the sample variable.

This sample was then passed as a parameter to the Sniffer().has_header() function. It deduced <br></br> that the first row must have column headers. Thus, it returned True which was then printed out.

Similarly, sample was also passed to the Sniffer().sniff() function. It returned all the deduced parameters<br></br> as a Dialect subclass which was then stored in the deduced_dialect variable.

Later, we re-opened the CSV file and passed the deduced_dialect variable as a parameter to csv.reader().

It was correctly able to predict delimiter, quoting and skipinitialspace parameters in the office.csv file <br></br>without us explicitly mentioning them.



### **Exercise**

#### **How to write CSV header?**

In [12]:
import csv

with open('/home/mana/Work/header.csv', 'w', newline = '') as file:
    fieldnames = ['Name', 'Age']
    writer = csv.DictWriter(file, fieldnames)
    writer.writeheader()

#### **How to convert a string to a list using the csv module?**

In [17]:
import csv
for row in csv.reader(['one,two,three']):
    print(row)

['one', 'two', 'three']
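
The reason to prefer csv.reader() over str.split(',') shows up once a field contains a quoted comma. A minimal sketch with a made-up line:

```python
import csv

line = 'one,"two, with a comma",three'

# str.split() knows nothing about quoting and cuts the quoted field in two.
naive = line.split(',')

# csv.reader() honours the quotes and keeps the field intact.
parsed = next(csv.reader([line]))

print(naive)
print(parsed)
```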


#### **Error Handling**
Catching and reporting errors:

In [19]:
import csv, sys
filename = '/home/mana/Work/info.csv'
with open(filename, newline='') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            print(row)
    except csv.Error as e:
        sys.exit('file {}, line {}: {}'.format(filename, reader.line_num, e))

['Name', 'Age']
['Manavalan', '30']
['Anthony', '24']
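
The file above is well formed, so the except branch never runs. To see csv.Error being raised, the following minimal sketch feeds the reader a deliberately malformed line and enables strict=True, which makes the parser reject bad input instead of guessing. The sample lines are made up for the illustration.

```python
import csv

# Second line is malformed: a stray character follows the closing quote.
bad_lines = ['Name,Age', 'Jack,"23"x']

reader = csv.reader(bad_lines, strict=True)
rows, problem = [], None
try:
    for row in reader:
        rows.append(row)
except csv.Error as exc:
    # line_num is the number of source lines read so far.
    problem = 'line {}: {}'.format(reader.line_num, exc)

print(rows)
print(problem)
```

Without strict=True the reader is lenient and would accept the malformed line rather than raise.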


### **How to read two text files and convert them into a CSV file with a header?**

In [21]:
!cat /home/mana/Work/name.txt

Manavalan
Ram
Xavier


In [22]:
!cat /home/mana/Work/age.txt

32
18
21


In [25]:
from pathlib import Path
import csv

dir_path = Path('/home/mana/Work')
name = (dir_path/'name.txt').read_text().splitlines()
age = (dir_path/'age.txt').read_text().splitlines()
csv_info = list(zip(name,age))

with open('/home/mana/Work/data.csv', 'w', newline = '') as csvfile:
    fieldnames = ['Name', 'Age']
    wrih = csv.DictWriter(csvfile, fieldnames)
    wrih.writeheader()            # write the header row
    wrir = csv.writer(csvfile)    # plain writer for the (name, age) tuples
    wrir.writerows(csv_info)