# Reading and Writing Files
When working with files, a Jupyter notebook is not always convenient.  
Commands that read or write a file change the state of that file every time you execute them.  
Rerunning the same cell, e.g. after making corrections, therefore does not start from the same situation, which makes file commands hard to test in a notebook.

This notebook provides pieces of Python code that you can copy to a Python script in an editor.  
You can also use this notebook for experimenting with non-file aspects of Python.

In [1]:
# preliminary definitions; to be replaced at liberty
filename = 'my_file.txt'
work_to_do = True
str_v = 'some string'
int_v = 123
float_v = 45.67 # a float, not a string, so that '%7.3f' % float_v works

## General template for reading files

Suppose you have a filename in a variable `filename` and you want to read data from that file.  
Then the general steps needed are as follows:

In [None]:
the_file = open(filename, 'r') # gives an error when the file does not exist (in the current directory)
for line in the_file:
    # now variable line contains one line from the file, including a trailing '\n'
    # ... obtain data for the line and process that data, e.g.
    print(line[:-1]) # don't include the trailing newline; print supplies a newline of its own
the_file.close() # AFTER the loop, not inside!
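
The same reading loop can also be written with a `with` statement, which closes the file automatically when the block ends. The following is a minimal sketch, not a replacement for the template above: the names `demo_path` and `lines_read` are hypothetical, and the small demo file is created first only so that the sketch can run on its own.

```python
import os
import tempfile

# hypothetical demo file in a temporary directory, so nothing in your own folder is touched
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w') as out_file:
    out_file.write('first line\nsecond line\n')

lines_read = []
with open(demo_path, 'r') as the_file:
    for line in the_file:
        # line includes the trailing '\n'; remove only that newline
        lines_read.append(line.rstrip('\n'))
# at this point the file has been closed, also if the loop had raised an error

print(lines_read)
```

With this form there is no separate `close()` call to forget or to misplace inside the loop.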

If you want to test the code that goes inside the loop, supply a value for `line` explicitly, e.g.:

In [2]:
# a line exactly as it would be produced by "for line in the_file:"
line = 'event-id\tlocation-id\tlocation-long\n'

Then you can test your code on this particular line.  
If you want to test on another line, make a new cell that defines that line and run it.
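
One way to make the loop body easy to test is to put it in a function and call that function on explicit lines. This is only a sketch: `process_line` is a hypothetical name, and its processing (dropping the newline and converting to upper case) is a placeholder for your own code.

```python
def process_line(line):
    """Placeholder processing: drop the last character (the newline) and upper-case the rest."""
    return line[:-1].upper()

# test lines, written exactly as the reading loop would deliver them
test_lines = [
    'event-id\tlocation-id\tlocation-long\n',
    'another line\n',
]
results = [process_line(test_line) for test_line in test_lines]
for result in results:
    print(result)
```

Inside the reading loop, the same function would be called as `process_line(line)`.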

## General template for writing files

Suppose you have a filename in a variable `filename` and you want to write data to that file.  
Then the general steps needed are as follows:

In [None]:
filename = 'my_file.txt' # replace this by another filename
the_file = open(filename, 'w') # or 'a' instead of 'w' for appending to an existing file
while work_to_do:
    # process some data and write to file, e.g.
    the_file.write(str_v) # write a string
    the_file.write(str(int_v)) # write an int but convert it to string first
    the_file.write('%7.3f' % float_v) # write a float but format it to a string first
    the_file.write('\n') # go to a new line in the file
    # decide if there is more data: change work_to_do accordingly
    work_to_do = False # placeholder: stop after one record, so that the template terminates
the_file.close() # AFTER the loop, not inside!

The loop "`while work_to_do:`" can be replaced by any other loop that visits all data.
When reading one file and writing another, this is typically the "`for line in ...:`" loop from the template for reading files.
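
A minimal sketch of that combination follows. The file names, the scratch directory `work_dir`, and the processing step (converting to upper case) are assumptions made for illustration; the input file is created first only so that the sketch can run on its own.

```python
import os
import tempfile

work_dir = tempfile.mkdtemp() # hypothetical scratch directory for this demo
in_name = os.path.join(work_dir, 'input.txt')
out_name = os.path.join(work_dir, 'output.txt')

# create a small input file
in_file = open(in_name, 'w')
in_file.write('alpha\nbeta\n')
in_file.close()

in_file = open(in_name, 'r')
out_file = open(out_name, 'w')
for line in in_file: # the reading loop drives the writing
    out_file.write(line[:-1].upper()) # placeholder processing of the data
    out_file.write('\n') # go to a new line in the output file
in_file.close() # close both files AFTER the loop
out_file.close()

# read the output back to check what was written
check_file = open(out_name, 'r')
result = check_file.read()
check_file.close()
print(result)
```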

## Breaking a line into parts

Once you have a line, e.g. the line defined above, you can break it into parts with the string method `split()`.  
If you don't supply arguments to `split()`, it splits at all "white space".  
For splitting at a specific character like a tab or a comma, specify that character as an argument to `split()`.

In [3]:
print(line.split())
print(line.split('\t'))
print(line.split(','))

['event-id', 'location-id', 'location-long']
['event-id', 'location-id', 'location-long\n']
['event-id\tlocation-id\tlocation-long\n']


Note that `'\n'` is not included in the result of `split()` without arguments (a newline is "white space" as well).  
With an explicit separator, the newline stays attached to the last part, so you probably have to get rid of it first.

In [4]:
print(line[:-1].split('\t'))
copy = line.rstrip()
print(copy.split('\t'))

['event-id', 'location-id', 'location-long']
['event-id', 'location-id', 'location-long']
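
Slicing with `[:-1]` and stripping with `rstrip()` give the same result here, but they can differ in edge cases. The sketch below uses two made-up lines to show this, together with `rstrip('\n')`, which removes only trailing newlines; all variable names are hypothetical.

```python
last_line = 'a\tb\tc' # e.g. the last line of a file that lacks a final newline
empty_tail = 'a\tb\t\n' # a line whose last field is empty

sliced = last_line[:-1].split('\t') # the slice removes the 'c', not a newline
stripped_all = empty_tail.rstrip().split('\t') # rstrip() also removes the trailing tab
newline_only_1 = last_line.rstrip('\n').split('\t')
newline_only_2 = empty_tail.rstrip('\n').split('\t')

print(sliced)
print(stripped_all)
print(newline_only_1)
print(newline_only_2)
```

If trailing empty fields or a missing final newline can occur in your data, `rstrip('\n')` is probably the safer choice of the three.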


Further note that the result of `split()` is a list of strings.  
So, if you need a part as a number, you have to convert it to an int or float explicitly.

In [5]:
my_line = '123\tsome string\t45.67'
data = my_line.split('\t')
int_v = int(data[0])
str_v = data[1]
float_v = float(data[2])
print('__%d__%s__%f__' % (int_v, str_v, float_v))

__123__some string__45.670000__
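
The conversions `int()` and `float()` raise a `ValueError` when a part is not a valid number, for instance on a header line. A minimal sketch of one way to handle that is given below; the helper `parse_record` is hypothetical, and returning `None` for a bad line is just one possible choice.

```python
def parse_record(line):
    """Split a tab-separated line into (int, str, float); return None if that fails."""
    data = line.rstrip('\n').split('\t')
    try:
        return int(data[0]), data[1], float(data[2])
    except (ValueError, IndexError):
        # ValueError: a part is not a number; IndexError: fewer than three parts
        return None

good = parse_record('123\tsome string\t45.67\n')
bad = parse_record('event-id\tlocation-id\tlocation-long\n') # header line, no numbers
print(good)
print(bad)
```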
