## Text File Input and Output

### Writing Text Files

In Python, a file is accessed through a file object. We can open a file for reading, writing or appending using the built-in *open()* function, which returns such an object. The first parameter is the location of the file and the second parameter is the mode, i.e. the action we want to take:
- "w" = write
- "r" = read
- "a" = append

So to open a new file for writing, we call the *open()* function, supply a path for the file and specify the "w" action to write.  Note: if the file already exists, it will be completely overwritten!

In [7]:
fout = open("data.txt","w")

Since we just specified "data.txt" rather than a complete path, the file will be written to the current working directory, which is normally the directory containing our IPython Notebook.
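To check where a relative path such as "data.txt" will end up, the standard *os* module can report the current working directory. The snippet below is a minimal sketch; the variable name `full_path` is only for illustration, and the printed locations depend on your own system.

```python
import os

# Directory that relative paths such as "data.txt" are resolved against
print(os.getcwd())

# Full path that a relative file name like "data.txt" refers to
full_path = os.path.abspath("data.txt")
print(full_path)
```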

After opening a file for writing, we write data by calling the *write()* function with a string, here built using string formatting. Each call adds more text to the end of the file. Note: newline characters are not automatically added.

In [8]:
for i in range(0,5):
    fout.write( "Current value of i is %d\n" % i )  # note, we add a newline with \n at the end of each line

When we are finished, we need to close the file.

In [9]:
fout.close()

Once a file is closed, we cannot write any more data to it. Trying to do so will give an error message.

In [10]:
fout.write("More data!")

ValueError: I/O operation on closed file.
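Forgetting to call *close()* is a common mistake, so one widely used alternative is Python's `with` statement, which closes the file automatically when the indented block ends. The sketch below also contrasts the "w" and "a" modes. The file name `with_demo.txt` and the temporary directory are assumptions made only for this example, so that no existing file is overwritten.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, so no existing data is touched
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# Mode "w" creates the file (or overwrites it if it already exists).
# The file is closed automatically when the with block ends.
with open(path, "w") as fout:
    for i in range(3):
        fout.write("Current value of i is %d\n" % i)

print(fout.closed)  # True: the file was closed for us

# Mode "a" keeps the existing contents and adds new text at the end
with open(path, "a") as fout:
    fout.write("One more line\n")

# Read everything back to confirm both writes are present
with open(path, "r") as fin:
    contents = fin.read()
print(contents)
```

Whether to use `with` or explicit *open()* and *close()* calls is partly a matter of style; the rest of this section keeps the explicit calls.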

### Reading Text Files

To open an existing file for reading, we use the *open()* function again. Note: if the file does not exist, we will get an error message.

In [11]:
fin = open("data.txt","r")  # action "r" means open file to read

After opening a file for reading, you can use several functions to access the data. The function *read()* gets the full remaining contents of the file as a single string, *readline()* gets the next line of text, and *readlines()* loads all of the remaining lines into a list with one string per line. Each line keeps its trailing newline character.
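As a minimal sketch of how these three calls differ, the example below writes a small three-line file and reads it back in two ways. The file name `read_demo.txt` and the temporary directory are assumptions for illustration only.

```python
import os
import tempfile

# Hypothetical three-line file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
fout = open(path, "w")
fout.write("first\nsecond\nthird\n")
fout.close()

# readline() returns the next line; read() returns whatever is left
fin = open(path, "r")
first_line = fin.readline()
rest = fin.read()
fin.close()
print(repr(first_line))  # 'first\n'
print(repr(rest))        # 'second\nthird\n'

# readlines() returns a list with one string per line
fin = open(path, "r")
all_lines = fin.readlines()
fin.close()
print(all_lines)         # ['first\n', 'second\n', 'third\n']
```

Note that reading moves forward through the file, which is why *read()* here only returns the text after the first line.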

In [12]:
lines = fin.readlines()
for l in lines:
    print( l.strip() )  # note that we usually need to remove the newline characters from the end of strings

Current value of i is 0
Current value of i is 1
Current value of i is 2
Current value of i is 3
Current value of i is 4


In [13]:
lines

['Current value of i is 0\n',
 'Current value of i is 1\n',
 'Current value of i is 2\n',
 'Current value of i is 3\n',
 'Current value of i is 4\n']

Again we close the file when we are finished - this means no more read functions can be called on the file.

In [None]:
fin.close()

### Comma-Separated Files

Frequently, simple datasets are stored as *comma-separated value* (CSV) files. In a CSV file, tabular data is stored as plain text. Each line of the file is a record, and each record consists of one or more fields, separated by commas.

We can manually create a CSV file using the *open()* and *write()* functions.

In [None]:
fout = open("simple.csv","w")
# create the records
for row in range(5):
    # start the record with an identifier
    fout.write("record_%d" % (row+1) )
    # create the fields for each record
    for col in range(4):
        value = (row+1)*(col+1)     # just create some dummy values
        fout.write(",%d" % value )  # notice the comma separator
    # move on to a new line in the file
    fout.write("\n")
# finished, so close the file
fout.close()

We could just read back the entire file:

In [None]:
fin = open("simple.csv","r")
print( fin.read() )
fin.close()

But more often, we will want to parse the data into numeric values, line by line:

In [None]:
fin = open("simple.csv","r")
# process the file line by line
for line in fin.readlines():
    # remove the newline character from the end
    line = line.strip()
    # split the line based on the comma separator
    parts = line.split(",")
    # extract the identifier as the first value in the list
    record_id = parts[0]
    # convert the rest to integers from strings
    values = []
    for s in parts[1:]:
        values.append( int(s) )
    # display the record
    print( record_id, values )
# finished, so close the file
fin.close()
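The same parsing steps can be wrapped in a reusable function. The sketch below is one possible arrangement, not the only one: the helper name `read_simple_csv`, the file name `simple_demo.csv` and the choice of a dictionary as the result are all assumptions made for illustration. It iterates over the file object directly, which yields one line at a time, and uses a list comprehension in place of the inner loop.

```python
import os
import tempfile

def read_simple_csv(path):
    """Hypothetical helper: map each record identifier to a list of integers."""
    records = {}
    fin = open(path, "r")
    for line in fin:  # a file object can be iterated line by line
        parts = line.strip().split(",")
        # first field is the identifier, the rest are converted to integers
        records[parts[0]] = [int(s) for s in parts[1:]]
    fin.close()
    return records

# Recreate a small file with the same layout as simple.csv (two records only)
path = os.path.join(tempfile.mkdtemp(), "simple_demo.csv")
fout = open(path, "w")
for row in range(2):
    fout.write("record_%d" % (row + 1))
    for col in range(4):
        fout.write(",%d" % ((row + 1) * (col + 1)))
    fout.write("\n")
fout.close()

records = read_simple_csv(path)
print(records)
# {'record_1': [1, 2, 3, 4], 'record_2': [2, 4, 6, 8]}
```

This simple approach assumes that no field contains a comma of its own and that every field after the identifier is a valid integer.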

Later in the module we will look at more convenient ways of working with CSV data.