# 14) Files and Directories <a class="tocSkip">

A file is a sequence of bytes, stored in some filesystem, and accessed by a filename. A directory is a collection of files and possibly other directories. The term folder is a synonym for directory.

### File input and output

The simplest kind of persistence is a plain file, sometimes called a flat file. You read from a file into memory and write from memory to a file. As with many languages, Python's file operations are largely modeled on the familiar Unix equivalents. You need to call the open function before you do the following: 
    - Read an existing file
    - Write to a new file
    - Append to an existing file
    - Overwrite an existing file

The open function is:

    fileobj = open(filename, mode)
    
    - fileobj is the file object returned by open()
    - filename is the string name of the file
    - mode is a string indicating the file's type and what you want to do with it

The first letter of mode indicates the operation:
    
    r means read.
    w means write. If a file doesn't exist, it's created. If it does exist, it's overwritten.
    x means write, but only if the file does not already exist.
    a means append. New data is written after the existing contents. If the file doesn't exist, it's created.

The second letter of the mode is the file's type:
    
    t (or nothing) means text.
    b means binary.
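
The sketch below illustrates how the x, a, and w modes treat an existing file. It is only an illustration: the filename demo.txt and the temporary directory are made up for the example, so that nothing outside that directory is touched.

```python
import os
import shutil
import tempfile

# Throwaway directory; 'demo.txt' is a hypothetical name used only here.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')

fout = open(path, 'xt')      # x: create the file, fail if it already exists
fout.write('first')
fout.close()

try:
    open(path, 'xt')         # the file exists now, so x refuses to open it
    x_failed = False
except FileExistsError:
    x_failed = True

fout = open(path, 'at')      # a: write after the existing contents
fout.write(' second')
fout.close()

fin = open(path, 'rt')
after_append = fin.read()    # 'first second'
fin.close()

fout = open(path, 'wt')      # w: discard the old contents and start over
fout.write('third')
fout.close()

fin = open(path, 'rt')
after_overwrite = fin.read() # 'third'
fin.close()

shutil.rmtree(tmpdir)        # remove the throwaway directory
```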

Finally, you need to close the file to ensure that any pending writes complete and that the resources held by the file object are freed. We shall soon see how to use with to automate this. Below is an example that creates a file:

In [13]:
# An example file

fout = open('Files/file.txt', 'wt')
print('This is a new file.', file=fout)
fout.close()

We used the file argument to print. Without it, print writes to standard output, which is your terminal (unless you have told your shell program to redirect output to a file with > or piped it to another program with |). We can do almost the same as above using the write function. The one difference is that print adds a newline after the text by default, while write writes exactly the string it is given:

In [14]:
# Using the write function

fout = open('Files/file.txt', 'wt')
fout.write('This is a new file.')
fout.close()
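
To compare the two approaches, the following sketch writes the same text with print and with write and reads each result back. The temporary directory is an assumption made for the example so that it runs without an existing Files folder.

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()              # throwaway directory for the demo
path = os.path.join(tmpdir, 'file.txt')
text = 'This is a new file.'

# Version 1: print to the file (adds a trailing newline by default)
fout = open(path, 'wt')
print(text, file=fout)
fout.close()

fin = open(path, 'rt')
printed = fin.read()
fin.close()

# Version 2: write to the file (no newline is added)
fout = open(path, 'wt')
count = fout.write(text)                 # write returns the number of characters written
fout.close()

fin = open(path, 'rt')
written = fin.read()
fin.close()

shutil.rmtree(tmpdir)
```

Here printed ends with '\n' while written does not.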

You can call read() with no arguments to gather up the entire file at once. Be careful when doing this with large files; a gigabyte file will consume a gigabyte of memory:

In [15]:
# Reading a file

fin = open('Files/file.txt', 'rt')
line = fin.read()
fin.close()

line

'This is a new file.'
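
One way to limit memory use is to pass read() a maximum number of characters and loop until it returns an empty string. The sketch below assumes a small made-up file (big.txt in a temporary directory) and an arbitrary chunk size of 32. A real program would pick a much larger size.

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()              # throwaway directory for the demo
path = os.path.join(tmpdir, 'big.txt')   # hypothetical file name
content = 'abcdefghij' * 10              # 100 characters stand in for a big file

fout = open(path, 'wt')
fout.write(content)
fout.close()

chunk_size = 32                          # arbitrary size chosen for illustration
chunks = []

fin = open(path, 'rt')
while True:
    chunk = fin.read(chunk_size)         # at most chunk_size characters
    if not chunk:                        # an empty string means end of file
        break
    chunks.append(chunk)
fin.close()

shutil.rmtree(tmpdir)
```

Only one chunk needs to be in memory at a time. The list chunks is kept here just so the result can be inspected.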

If you include a 'b' in the mode string, the file is opened in binary mode. In this case, you read and write bytes instead of a string.
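
As a small illustration of binary mode, the sketch below writes the 256 possible byte values to a file and reads them back. The file name data.bin and the temporary directory are assumptions made for the example.

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()              # throwaway directory for the demo
path = os.path.join(tmpdir, 'data.bin')  # hypothetical file name

data = bytes(range(256))                 # the byte values 0 through 255

fout = open(path, 'wb')
written = fout.write(data)               # write returns the number of bytes written
fout.close()

fin = open(path, 'rb')
bdata = fin.read()                       # a bytes object, not a str
fin.close()

shutil.rmtree(tmpdir)
```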

If you forget to close a file that you have opened, it will be closed by Python after it is no longer referenced. This means that if you open a file within a function and do not close it explicitly, it will be closed automatically when the function ends. But you might have opened the file in a long-running function or in the main section of the program. The file should be closed to force any remaining writes to be completed. Python has context managers to clean up things such as open files. We use the form:

    with expression as variable:
    
After the block of code under the context manager completes, the file is closed automatically.

In [16]:
# Using with to automatically close a file

with open('Files/file.txt', 'at') as fout:
    fout.write('\n')
    fout.write('I want to add a line.')
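
To check that with really closes the file, you can inspect the file object's closed attribute inside and after the block. The sketch below repeats the steps above in a temporary directory (an assumption made so that it runs without an existing Files folder).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'file.txt')

    with open(path, 'wt') as fout:
        fout.write('This is a new file.')
        open_inside = fout.closed        # False: still open inside the block

    closed_after = fout.closed           # True: closed when the block ended

    with open(path, 'at') as fout:
        fout.write('\n')
        fout.write('I want to add a line.')

    with open(path, 'rt') as fin:
        lines = fin.read().split('\n')   # the two lines of the file
```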

As you read and write, Python keeps track of where you are in the file. For a binary file, the tell() function returns your current offset from the beginning of the file, in bytes. The seek() function lets you jump to another byte offset in the file. This means that you do not have to read every byte in a file to read the last one; you can seek() to the last one and read just one byte. These functions are most useful for binary files. You can use them with text files, but there tell() returns an opaque position that is only meant to be passed back to seek(), and unless the file is ASCII (one byte per character), you would have a hard time calculating offsets yourself. The most popular encoding (UTF-8) uses varying numbers of bytes per character.
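
A minimal sketch of tell() and seek() on a binary file follows. The file bytes.bin is made up for the example and holds the byte values 0 through 255, so the last byte is b'\xff'. The final two lines show why offsets are harder to compute for UTF-8 text.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'bytes.bin')   # hypothetical file name

    with open(path, 'wb') as fout:
        fout.write(bytes(range(256)))

    with open(path, 'rb') as fin:
        start = fin.tell()               # 0: we are at the beginning
        fin.seek(255)                    # jump straight to offset 255
        last = fin.read(1)               # read only the final byte
        end = fin.tell()                 # 256: one past the last byte
        fin.seek(-1, os.SEEK_END)        # same position, counted from the end
        last_again = fin.read(1)

# In UTF-8, characters and bytes do not line up one to one.
n_chars = len('café')                    # 4 characters
n_bytes = len('café'.encode('utf-8'))    # 5 bytes, because é takes two
```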

An alternative to reading and writing a file is to memory-map it with the standard mmap module. This makes the contents of a file look like a bytearray in memory.
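
The following is a minimal sketch of memory-mapping, assuming a small made-up file (mapped.bin in a temporary directory). The mode 'r+b' opens the file for both reading and writing in binary mode, and a length of 0 asks mmap to map the whole file.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'mapped.bin')  # hypothetical file name

    with open(path, 'wb') as fout:
        fout.write(b'Hello, world!')

    with open(path, 'r+b') as f:
        # Map the whole file (length 0) into memory
        with mmap.mmap(f.fileno(), 0) as mm:
            first = mm[:5]               # slicing returns bytes
            mm[7:12] = b'WORLD'          # slice assignment must keep the same length
            mm.flush()                   # push the change back to the file

    with open(path, 'rb') as fin:
        result = fin.read()              # the file now reflects the change
```

The mapped object supports indexing and slicing like a bytearray, but it cannot grow or shrink through slice assignment.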