Copyright 2008-2015, Enthought, Inc.<br>
Use only permitted under license.  Copying, sharing, redistributing or other unauthorized use strictly prohibited.<br>
http://www.enthought.com

# Files

## Reading Files

Let's say we have a file 'rcs.txt' which contains data in text format like this:

    #freq (MHz)     vv (dB)     hh (dB)
      100          -20.3       -31.2
      200          -22.7       -33.6

We'd like to get the data into a list of lists of floating point numbers in
Python:

    [[100.0, -20.3, -31.2],
     [200.0, -22.7, -33.6]]

We can open the file with the `open` function or, in Python 2 only, the `file` type (`open` is the preferred spelling):

In [1]:
f = open('rcs.txt')

In [2]:
f = file('rcs.txt')

You can read in the contents as a string using the `read()` method of the file object:

In [3]:
text = f.read()

In [4]:
print text

#freq (MHz)     vv (dB)     hh (dB)
  100          -20.3       -31.2
  200          -22.7       -33.6



or get a list of lines of the file:

In [5]:
f = open('rcs.txt')
lines = f.readlines()
print lines

['#freq (MHz)     vv (dB)     hh (dB)\n', '  100          -20.3       -31.2\n', '  200          -22.7       -33.6\n']


and then close the file:

In [6]:
f.close()

Now we can process the data:

In [7]:
result = []
for line in lines[1:]:
    # split the line into fields based on white space
    fields = line.split()
    # convert the text to numbers
    freq = float(fields[0])
    vv = float(fields[1])
    hh = float(fields[2])
    # group and append to results
    row = [freq, vv, hh]  # named `row` to avoid shadowing the built-in `all`
    result.append(row)

In [8]:
print result

[[100.0, -20.3, -31.2], [200.0, -22.7, -33.6]]


Or simply iterate over the file object:

In [9]:
f = open('rcs.txt')
# skip first line
f.readline()
results = []
for line in f:
    row = [float(value) for value in line.split()]
    results.append(row)
f.close()

print results

[[100.0, -20.3, -31.2], [200.0, -22.7, -33.6]]
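The same parsing loop can be wrapped in a small reusable function. The following is a minimal sketch, not part of the original session: the name `read_table` is hypothetical, the sample file is recreated in a temporary directory so the example runs on its own, and `print(...)` is called with a single argument so the code behaves the same on Python 2 and Python 3. It uses the `with` statement, which is covered under Closing Files.

```python
import os
import tempfile

def read_table(path):
    """Return the numeric rows of a whitespace-delimited text file."""
    rows = []
    with open(path) as f:          # the file is closed when the block exits
        for line in f:
            line = line.strip()
            # skip blank lines and '#' comment/header lines
            if not line or line.startswith('#'):
                continue
            rows.append([float(value) for value in line.split()])
    return rows

# Build a throwaway copy of the example data so the sketch runs on its own.
path = os.path.join(tempfile.mkdtemp(), 'rcs.txt')
with open(path, 'w') as f:
    f.write("#freq (MHz)     vv (dB)     hh (dB)\n"
            "  100          -20.3       -31.2\n"
            "  200          -22.7       -33.6\n")

table = read_table(path)
print(table)
```

Skipping every line that starts with `#`, rather than always dropping the first line, is a design choice of this sketch: it also tolerates files with no header or with several comment lines.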


## Writing Files

To write to a file:

In [10]:
f = open('myfile.txt', 'w')
f.write("Hello world!")
f.close()

Checking that we actually wrote to it:

In [11]:
print open('myfile.txt').read()

Hello world!


Writing deletes any previous contents:

In [12]:
f = open('myfile.txt', 'w')
f.write("Another world")
f.close()
print open('myfile.txt', 'r').read()

Another world


There's also append mode:

In [13]:
f = open('myfile.txt', 'a')
f.write("... and more")
f.close()
print open('myfile.txt', 'r').read()

Another world... and more


And read-write mode (`'w+'` truncates the file when it is opened; `'r+'` keeps the existing contents):

In [14]:
f = open('myfile.txt', 'w+')
f.write('Hello world')
f.seek(6) # move to byte offset 6 (the 7th character) in the file
print f.read(5) # read 5 characters
f.close()

world
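To make the difference between the two read-write modes concrete, here is a small sketch. The file name `modes.txt` and the variable names are illustrative only, and the file lives in a temporary directory so nothing in the working directory is touched.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

with open(path, 'w') as f:
    f.write('Hello world')

# 'r+' opens for reading and writing and keeps the existing contents.
with open(path, 'r+') as f:
    f.seek(6)              # byte offset 6 is the 'w' of 'world'
    f.write('there')       # overwrite 5 characters in place
    f.seek(0)              # go back to the start before reading
    kept = f.read()

# 'w+' also allows reading, but truncates the file as soon as it is opened.
with open(path, 'w+') as f:
    truncated = f.read()

print(repr(kept))
print(repr(truncated))
```

In this sketch `kept` ends up as `'Hello there'`, while `truncated` is the empty string because opening with `'w+'` discarded the previous contents.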


## Binary and Text Files

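Universal newline mode translates every line ending (`'\r'`, `'\n'` or `'\r\n'`) to `'\n'` on reading, and `f.newlines` records which endings were actually seen: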

In [15]:
f = open('strange.txt', 'rU')
text = f.read()
print repr(text)
print f.newlines

'this\nfile\nhas many different\nline endings\n'
('\r', '\n', '\r\n')
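The `'U'` flag is specific to Python 2 (it was deprecated in Python 3 and removed in 3.11). As a hedged, portable alternative, the sketch below uses `io.open`, which exists on both Python 2.6+ and Python 3 and performs the same translation by default in text mode. The contents written to `strange.txt` are an assumption chosen to reproduce the output above; the original file is not shown in this document.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'strange.txt')

# Write raw bytes so the three line-ending styles reach the disk untouched.
with open(path, 'wb') as f:
    f.write(b'this\rfile\nhas many different\r\nline endings\n')

# io.open in text mode translates line endings by default (newline=None).
with io.open(path, 'r', encoding='ascii') as f:
    text = f.read()
    seen = f.newlines      # which endings were encountered while reading

print(repr(text))
print(sorted(seen))
```

On Python 2 `io.open` returns `unicode` strings, so the printed representation carries a `u` prefix there.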


Binary mode reads and writes raw bytes, with no newline translation:

In [16]:
import os
f = open('binary.bin', 'wb')
f.write(os.urandom(16))
f.close()

f = open('binary.bin', 'rb')
print repr(f.read())
f.close()

'\x90\x86]\xe9\xdbD\xe5\x93\xba7\xcf\xf5\x9c<=\x97'


## Closing Files

Failure to close files can lead to data loss and bugs:

In [17]:
f = open('newfile.txt', 'w')
f.write("Hello world!")
g = open('newfile.txt', 'r')
print repr(g.read())

''


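What actually happens depends on buffering: written data sits in a buffer until the buffer fills or the file is flushed or closed, so how much a second reader sees varies between systems: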

In [23]:
f = open('newfile2.txt', 'w')
f.write("Hello world!")
for i in range(10): # might need to be higher on some systems
    f.write("Hello world! %s\n" % i)

g = open('newfile2.txt', 'r')
print g.read()

Hello world!Hello world! 0
Hello world! 1
Hello world! 2
Hello world! 3
Hello world! 4
Hello world! 5
Hello world! 6
Hello world! 7
Hello world! 8
Hello world! 9
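If a file has to stay open while something else reads it, calling `flush()` is one way to push buffered data out. The sketch below is illustrative only: it uses a temporary path rather than the files above, and what the first read returns depends on the buffering behaviour described earlier (an empty string is typical for such a short write, but not guaranteed).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'newfile.txt')

f = open(path, 'w')
try:
    f.write("Hello world!")

    # A second handle usually sees nothing yet: the text is still buffered.
    with open(path, 'r') as g:
        before_flush = g.read()

    f.flush()              # push Python's buffer out to the operating system

    # After the flush, another handle on the same machine can read the data.
    with open(path, 'r') as g:
        after_flush = g.read()
finally:
    f.close()

print(repr(before_flush))
print(repr(after_flush))
```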



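Python tries to close files for you when you're done with them. In CPython this happens when the last reference to the file object goes away, here when `write_file` returns: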

In [24]:
def write_file():
    f = open('newfile3.txt', 'w')
    for i in range(10): # might need to be higher on some systems
        f.write("Hello world! %s\n" % i)

write_file()
g = open('newfile3.txt', 'r')
print g.read()

Hello world! 0
Hello world! 1
Hello world! 2
Hello world! 3
Hello world! 4
Hello world! 5
Hello world! 6
Hello world! 7
Hello world! 8
Hello world! 9



But if there is an exception, the file may not get closed, because the traceback can keep the function's local variables (including `f`) alive:

In [20]:
def write_file():
    f = open('newfile4.txt', 'w')
    for i in range(2000): 
        x = 1.0/(i-1000) # might need to use something other than 1000
        f.write("Hello world! %s: %s\n" % (i, x))
    
write_file()

ZeroDivisionError: float division by zero

In [None]:
g = open('newfile4.txt', 'r')
print g.read()

You can prevent this with `try: ... finally: ...`:

In [28]:
f = open('newfile5.txt', 'w')
try:
    for i in range(20): 
        x = 1.0/(i-10) # raises ZeroDivisionError when i == 10
        f.write("Hello world! %s: %s\n" % (i, x))
finally:
    f.close()

ZeroDivisionError: float division by zero

In [29]:
g = open('newfile5.txt', 'r')
print g.read()

Hello world! 0: -0.1
Hello world! 1: -0.111111111111
Hello world! 2: -0.125
Hello world! 3: -0.142857142857
Hello world! 4: -0.166666666667
Hello world! 5: -0.2
Hello world! 6: -0.25
Hello world! 7: -0.333333333333
Hello world! 8: -0.5
Hello world! 9: -1.0



The `with` statement is nicer: it closes the file when the block exits, whether normally or because of an exception:

In [27]:
with open('myfile.txt', 'w') as f:
    f.write('Hello world\n')
    f.write('from a with statement')

So this is safe:

In [25]:
with open('newfile6.txt', 'w') as f:
    for i in range(20): 
        x = 1.0/(i-10) # raises ZeroDivisionError when i == 10
        f.write("Hello world! %s: %s\n" % (i, x))

ZeroDivisionError: float division by zero

In [26]:
g = open('newfile6.txt', 'r')
print g.read()

Hello world! 0: -0.1
Hello world! 1: -0.111111111111
Hello world! 2: -0.125
Hello world! 3: -0.142857142857
Hello world! 4: -0.166666666667
Hello world! 5: -0.2
Hello world! 6: -0.25
Hello world! 7: -0.333333333333
Hello world! 8: -0.5
Hello world! 9: -1.0
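To check directly that the file really was closed, the file object's `closed` attribute can be inspected after the failure. This is a minimal sketch under stated assumptions: it writes to a temporary path, and it catches the `ZeroDivisionError` only so that the script can continue and examine the result.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'newfile6.txt')

try:
    with open(path, 'w') as f:
        for i in range(20):
            x = 1.0 / (i - 10)   # raises ZeroDivisionError when i == 10
            f.write("Hello world! %s: %s\n" % (i, x))
except ZeroDivisionError:
    pass   # swallowed here only so the sketch can carry on and inspect f

closed_after_error = f.closed    # the with block closed the file on the way out

with open(path) as g:
    lines = g.readlines()

print(closed_after_error)
print(len(lines))
```

The ten lines written before the error (for `i` from 0 to 9) are all on disk, because closing the file flushes its buffer.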

