![Py4Eng](img/logo.png)

# File format: CSV
## Yoav Ram

## The CSV format
Comma-separated values (CSV) is a very common and useful format for storing tabular data. It is similar to an Excel file, except that it is entirely text based. Let's have a look at an example file, using both Excel and a simple text editor.

We could quite easily write our own functions for dealing with CSV files, for example by splitting each line at the commas. However, Python has a built-in module, `csv`, for exactly this purpose, so why bother?
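
One practical reason to prefer the built-in module is that splitting at commas goes wrong when a field itself contains a comma inside quotes. The sketch below compares the two approaches on a single made-up line; the accession name and location are hypothetical and do not come from the course data.

```python
import csv
import io

# A hypothetical line whose second field contains a comma inside quotes
line = 'Ge-0,"Geneva, Switzerland",7.3\n'

# Naive approach: split at every comma, including the one inside the quotes
naive = line.strip().split(',')
print(len(naive), naive)

# csv.reader treats the quoted text as a single field
parsed = next(csv.reader(io.StringIO(line)))
print(len(parsed), parsed)
```

The naive split yields four pieces with stray quote characters, whereas `csv.reader` returns the three intended fields.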

## Reading CSV files
The simplest way to read a CSV file is to use the module's `reader` function. This function receives a file object (created with `open()`) and returns a reader object.

Once we have created the CSV reader, we can use it to iterate over the lines of the file. Each row is returned as a list of the column values, all of them strings.

In [4]:
import csv

In [8]:
experiments_file = r'..\data\electrolyte_leakage.csv'
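with open(experiments_file, 'r', newline='') as f:    # newline='' lets the csv module handle line endings
    csv_reader = csv.reader(f)
    for row in csv_reader:
        print(row[0])    # first column of each row, including the header row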

accession
101AV/Ge-0
157AV/Ita-0
162AV/Ct-1
163AV/Can-0
166AV/Cvi-0
172AV/Bur-0
180AV/Blh-1
186AV/Col-0
200AV/Gre-0
215AV/Mh-1
224AV/Oy-0
236AV/Shadahra
252AV/Akita
257AV/Sakata
25AV/Jea
266AV/N13
42AV/Bl-1
62AV/St-0
70AV/Kn-0
83AV/Edi-0
8AV/Pyl-1
91AV/Tsu-0
94AV/Mt-0
96AV/An-1
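
Note that the first line printed above, `accession`, is the column header rather than data. A common pattern, sketched here, is to consume the header with `next()` before looping and to convert the remaining fields from strings to numbers. The in-memory text and its column names are made up for illustration and are not the contents of `electrolyte_leakage.csv`.

```python
import csv
import io

# Hypothetical data standing in for a real file; the column names are made up
text = "accession,value1,value2\nA-0,1.5,2.5\nB-1,3.0,4.0\n"

with io.StringIO(text) as f:
    csv_reader = csv.reader(f)
    header = next(csv_reader)                   # consume the header row
    rows = []
    for row in csv_reader:
        name = row[0]
        values = [float(x) for x in row[1:]]    # fields arrive as strings
        rows.append((name, values))

print(header)
print(rows)
```

The same loop works with a real file object from `open()` in place of `io.StringIO`.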


## Writing CSV files
Writing is also rather straightforward. The `csv` module supplies the `csv.writer` function, which returns a writer object with the method `writerow()`. This method receives a list and writes it to the file as a CSV line.

In [9]:
import tempfile

In [14]:
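import os

fd, fname = tempfile.mkstemp(suffix='.csv')    # mktemp() is deprecated; mkstemp() creates the file safely
os.close(fd)                                   # close the low-level handle, we reopen the file by name below
print("Writing to", fname)
with open(fname, 'w', newline='') as fo:    # notice the 'w' instead of 'r'
    csv_writer = csv.writer(fo)
    csv_writer.writerow(['these','are','the','column','headers'])
    csv_writer.writerow(['and','these','are','the','values'])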

Writing to C:\Users\yoavram\AppData\Local\Temp\tmpo2eujh1z.csv


In [15]:
!open $fname
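
Another way to check the result, without leaving Python, is to read the file back with `csv.reader` and compare it to what was written. The following is a minimal, self-contained sketch of such a round trip; the file name `example.csv` and the use of a temporary directory are choices made for this illustration only.

```python
import csv
import os
import tempfile

rows = [['these', 'are', 'the', 'column', 'headers'],
        ['and', 'these', 'are', 'the', 'values']]

# The temporary directory and everything in it are removed at the end of the block
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.csv')

    # Write all rows at once; writerows() calls writerow() for each list
    with open(path, 'w', newline='') as fo:
        csv.writer(fo).writerows(rows)

    # Read the file back into a list of lists
    with open(path, 'r', newline='') as f:
        read_back = list(csv.reader(f))

print(read_back == rows)
```

Because every field here is a string, the rows read back are identical to the rows written; numbers would come back as strings and need converting.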

## Exercise

The `electrolyte_leakage.csv` file contains the results of experiments on different *Arabidopsis* ecotypes (accessions). Each row holds the results for 3 control plants and 3 plants tested under drought stress.

Read the CSV file, calculate the mean result for the control plants and for the test plants of each ecotype, and write the results to a new CSV file, in the following way:
 
| Accession  |  control mean  |  test mean  |
|----------|----------------|---------------|
|101AV/Ge-0 |      7.34      |     3.03  |
|157AV/Ita-0|     16.85      |     2.92  |
  
Use the provided accessory function to calculate means.  
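
If the accessory function is not at hand, a stand-in could look like the sketch below. The name `mean` and its behavior (accepting numeric strings, as `csv.reader` yields them) are assumptions, not the course's actual helper, and the input values are made up.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers or numeric strings."""
    numbers = [float(v) for v in values]    # csv.reader yields strings, so convert first
    return sum(numbers) / len(numbers)

# Made-up values, formatted with two decimals as in the table above
example = mean(['7.0', '8.0', '9.0'])
print('{:.2f}'.format(example))
```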

## Colophon
This notebook was written by [Yoav Ram](http://python.yoavram.com) and is part of the [_Python for Engineers_](https://github.com/yoavram/Py4Eng) course.

The notebook was written using [Python](http://python.org/) 3.6.1.
Dependencies listed in [environment.yml](../environment.yml), full versions in [environment_full.yml](../environment_full.yml).

This work is licensed under a CC BY-NC-SA 4.0 International License.

![Python logo](https://www.python.org/static/community_logos/python-logo.png)