## Tutorial 07: File Input and Output

These notes are adapted from the Python tutorial available at: https://docs.python.org/3/tutorial/.

Here we see how to work with files from within Python. This will
let us start working with some real datasets.

### Reading and Writing Files

`open()` returns a file object, and is most commonly used with two arguments: `open(filename, mode)`.

In [None]:
f = open('workfile', 'w')

The first argument is a string containing the filename. The second argument is another string containing a few characters that describe how the file will be used. `mode` can be `'r'` when the file will only be read, `'w'` for writing only (an existing file with the same name will be erased), or `'a'` for appending, in which case any data written to the file is automatically added to the end. `'r+'` opens the file for both reading and writing. The `mode` argument is optional; `'r'` is assumed if it’s omitted.
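The modes described above can be tried side by side. This is a minimal sketch rather than a definitive recipe: the file name `modes_demo.txt` is made up for illustration, and the file lives in a temporary directory so that nothing on disk is overwritten.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # 'modes_demo.txt' is a hypothetical name used only for this example.
    path = os.path.join(tmp, 'modes_demo.txt')

    with open(path, 'w') as f:      # 'w': create the file, or erase it if it exists
        f.write('first\n')

    with open(path, 'a') as f:      # 'a': every write goes to the end of the file
        f.write('second\n')

    with open(path, 'r+') as f:     # 'r+': read and write; the file must already exist
        before = f.read()           # reading everything leaves the position at the end
        f.write('third\n')          # so this write lands after the existing text

    with open(path) as f:           # mode omitted, so 'r' is assumed
        after = f.read()

print(repr(before))
print(repr(after))
```

Reopening the same path with `'w'` at this point would discard all three lines, which is why `'a'` or `'r+'` is the safer choice when existing data matters.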

Normally, files are opened in text mode: you read and write strings from and to the file, and those strings are encoded in a specific encoding. If `encoding` is not specified, the default is platform dependent (see the documentation for `open()`). Appending `'b'` to the mode opens the file in binary mode, where data is read and written as bytes objects. Use this mode for all files that don’t contain text.
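To make the difference between text mode and binary mode concrete, the sketch below writes one short string and reads it back both ways. The file name and the sample text are assumptions for illustration; passing `encoding='utf-8'` explicitly is one way to avoid depending on the platform default.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # 'encoding_demo.txt' is a hypothetical name used only for this example.
    path = os.path.join(tmp, 'encoding_demo.txt')

    text = 'caf\u00e9'  # four characters; the last one is an accented e

    # Text mode: strings in, strings out, converted using the given encoding.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

    with open(path, 'r', encoding='utf-8') as f:
        as_text = f.read()

    # Binary mode: the same file comes back as a bytes object, undecoded.
    with open(path, 'rb') as f:
        as_bytes = f.read()

print(type(as_text).__name__, len(as_text), 'characters')
print(type(as_bytes).__name__, len(as_bytes), 'bytes')
```

The character count and the byte count differ because UTF-8 stores the accented letter in two bytes.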

In text mode, the default when reading is to convert platform-specific line endings (`\n` on Unix, `\r\n` on Windows) to just `\n`. When writing in text mode, the default is to convert occurrences of `\n` back to platform-specific line endings. This behind-the-scenes modification to file data is fine for text files, but will corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files.
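As a small illustration of why binary mode matters, the sketch below round-trips a made-up byte payload that happens to contain line-ending bytes. The payload and the file name `blob.bin` are assumptions, not data from a real JPEG or EXE file.

```python
import os
import tempfile

# Made-up payload containing the byte pairs for \r\n and \n,
# which text mode could rewrite on some platforms.
payload = bytes([0x00, 0x0D, 0x0A, 0xFF, 0x0A, 0x10])

with tempfile.TemporaryDirectory() as tmp:
    # 'blob.bin' is a hypothetical name used only for this example.
    path = os.path.join(tmp, 'blob.bin')

    with open(path, 'wb') as f:     # 'wb': bytes are written exactly as given
        f.write(payload)

    with open(path, 'rb') as f:     # 'rb': bytes are read exactly as stored
        round_trip = f.read()

print(round_trip == payload)
```

Because no line-ending translation happens in binary mode, the bytes read back match the bytes written on every platform.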

It is good practice to use the `with` keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point. Using with is also much shorter than writing equivalent try-finally blocks:

In [None]:
with open('workfile') as f:
    read_data = f.read()

f.closed
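For comparison, here is roughly what the `with` statement saves you from writing. This is a sketch of the equivalent try-finally form, using a temporary file that is created first so the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    # Create a small file to read back.
    with open(path, 'w') as f:
        f.write('some text\n')

    # Manual equivalent of: with open(path) as f: read_data = f.read()
    f = open(path)
    try:
        read_data = f.read()
    finally:
        f.close()   # runs whether or not f.read() raised an exception

print(f.closed)
```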

If you’re not using the `with` keyword, then you should call `f.close()` to close the file and immediately free up any system resources used by it. If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times.

After a file object is closed, either by a `with` statement or by calling `f.close()`, any attempt to read from or write to it raises a `ValueError`.

In [None]:
f.close()
f.read()
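If you would rather see the failure without a traceback interrupting the notebook, you can catch the exception. A minimal sketch, assuming a throwaway file in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:
        f.write('data\n')

    with open(path) as f:
        pass    # the file is closed as soon as this block ends

    caught = False
    try:
        f.read()            # f is already closed here
    except ValueError as exc:
        caught = True
        print('read failed:', exc)
```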

### Methods of File Objects

`f.write(string)` writes the contents of string to the file, returning the number of characters written.

In [None]:
with open('workfile', 'w') as f:
    f.write('This is the first line.\n')
    f.write('This is the second line.\n')
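The return value of `f.write()` is easy to overlook because the cell above discards it. The sketch below captures it instead, again using a temporary file so nothing is overwritten.

```python
import os
import tempfile

line = 'This is the first line.\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:
        n = f.write(line)   # number of characters written, newline included

print(n, len(line))
```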

Other types of objects need to be converted – either to a string (in text mode) or a bytes object (in binary mode) – before writing them.
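As a sketch of what this conversion looks like in practice, the example below writes a tuple by first turning it into a string, and shows that passing a non-string object directly is rejected. The tuple and the variable names are chosen for illustration.

```python
import os
import tempfile

value = ('the answer', 42)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:
        rejected = False
        try:
            f.write(value)          # not a string, so text mode refuses it
        except TypeError:
            rejected = True
        f.write(str(value) + '\n')  # convert to a string first

    with open(path, 'ab') as f:
        # Binary mode needs bytes, so encode the string before writing.
        f.write(str(value).encode('utf-8'))

    with open(path, 'rb') as f:
        raw = f.read()

print(rejected)
print(raw)
```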

To read a file’s contents, call `f.read(size)`, which reads some quantity of data and returns it as a string (in text mode) or a bytes object (in binary mode). `size` is an optional numeric argument. When `size` is omitted or negative, the entire contents of the file are read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most `size` characters (in text mode) or `size` bytes (in binary mode) are read and returned. If the end of the file has been reached, `f.read()` returns an empty string (`''`).

In [None]:
with open('workfile', 'r') as f:
    print(f.read())
    print(f.read())
    print(f.read())
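Passing a `size` is one way to avoid loading a large file all at once. The sketch below reads a small file four characters at a time and stops when `f.read()` returns the empty string; the chunk size and file contents are arbitrary choices for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:
        f.write('abcdefghij')

    chunks = []
    with open(path) as f:
        while True:
            chunk = f.read(4)   # at most 4 characters per call
            if chunk == '':     # empty string means end of file
                break
            chunks.append(chunk)

print(chunks)
```

The last chunk is shorter than the requested size because only two characters remained.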

`f.readline()` reads a single line from the file; a newline character (`\n`) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if `f.readline()` returns an empty string, the end of the file has been reached, while a blank line is represented by `'\n'`, a string containing only a single newline.
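The three cases described above (a normal line, a blank line, and the end of the file) can be seen in one short sketch. The file contents are made up: two lines separated by a blank line, with no trailing newline.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:
        f.write('first\n\nlast')    # blank line in the middle, no final newline

    with open(path) as f:
        results = [f.readline() for _ in range(4)]

for r in results:
    print(repr(r))
```

The blank line comes back as `'\n'`, the final line has no newline because the file does not end with one, and the fourth call returns `''` to signal the end of the file.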

For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:

In [None]:
with open('workfile', 'r') as f:
    for line in f:
        print(line, end='')

If you want to read all the lines of a file into a list, you can also use `list(f)` or `f.readlines()`.

In [None]:
with open('workfile', 'r') as f:
    x = f.readlines()
    
x
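Both forms give the same list, and each element keeps its trailing newline. The sketch below checks that and shows one common way to strip the newlines afterwards; the list comprehension is a suggestion, not something the tutorial prescribes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:
        f.write('one\ntwo\n')

    with open(path) as f:
        from_readlines = f.readlines()

    with open(path) as f:
        from_list = list(f)

# Remove only the trailing newline from each line.
stripped = [line.rstrip('\n') for line in from_readlines]

print(from_readlines == from_list)
print(stripped)
```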

-------

## Practice