# Chapter 9: Python File I/O

In this chapter we will look at how to open files in Python and how to read data from and write data to them. We will cover some basic examples and some common file operations.

Let's start by opening a file for reading.

In [None]:
file = open('example.txt', 'r')
file_data = file.read()
print(file_data)
file.close()

We have now opened the `example.txt` text file included in this project and printed its contents. The `open` function takes the name of the file and the mode in which we want to open it. In this case the mode is `r`, which stands for read. The most common modes are:
- `r` - Read mode, used when the file is only being read. The file must already exist.
- `w` - Write mode, used to write new content to a file. If a file with the same name already exists, its contents are erased; otherwise a new file is created.
- `a` - Append mode, used to add new data to the end of the file; that is, anything written is automatically appended after the existing content.
- `r+` - Read and write mode, used when a file needs to be both read and written through the same file object.
- `b` - Binary mode, used for non-text files such as images and sound files. It is combined with one of the other modes, for example `rb` or `wb`.

Also, note the `file.close()` call. It tells the system that we are done with the file, flushes any buffered writes to disk, and releases the file for other programs to use. It is good practice to always close a file once you are done with it.
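A common way to make sure a file is always closed is the `with` statement, which closes the file automatically when the block ends, even if an error occurs inside it. The following is a minimal sketch of that pattern. The file name `with_demo.txt` and the temporary directory are assumptions, chosen only so the example does not touch `example.txt`.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the project)
demo_path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')

# The file is closed automatically when the with block ends
with open(demo_path, 'w') as file:
    file.write('Hello from a with block\n')

# No explicit file.close() was needed
print(file.closed)

with open(demo_path, 'r') as file:
    file_data = file.read()

print(file_data)
```

The rest of this chapter keeps calling `close` explicitly so that each step stays visible, but both styles release the file in the same way.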

We can add text to the end of the file using append mode.

In [None]:
def open_and_print_file(file_name):
    file = open(file_name, 'r')
    file_data = file.read()
    print(file_data)
    file.close()

file = open('example.txt', 'a')
file.write('This is a new line')
file.close()

open_and_print_file('example.txt')

As mentioned before, the `w` mode treats the file as a new one: the file is emptied as soon as it is opened, so the old contents are lost.

In [None]:
file = open('example.txt', 'w')
file.write('This is a new file')
file.close()

open_and_print_file('example.txt')

We now see that all the old contents have been lost, and the file contains only the text we just wrote.

If you need to both read from and write to a file, you can use the `r+` mode. It allows both actions on the same file object, which saves some code because we only have to open the file once.

In [None]:
file = open('example.txt', 'r+')
data = file.read()
print(data)
file.write('\nThis is an appended line')
file.close()

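We have now read the file and added a new line using the `r+` mode. This looks similar to combining the `r` and `a` modes, but there is a difference in where the new data ends up. The `a` mode adds new data at the end of the file, whereas the `r+` mode writes at the current position in the file. Here that position is the end of the file only because `read` has already consumed everything; writing before reading would overwrite the start of the file instead. Neither mode inserts a line break for us, so we have to add the newline character `\n` to the beginning of the new text (or to the end of the previous line if you are thinking ahead).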
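To see exactly where each mode places new data, the sketch below writes to a scratch file in append mode and then in `r+` mode without reading first. The file name `modes_demo.txt` and the variable names are assumptions made for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file so that example.txt is left untouched
demo_path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

with open(demo_path, 'w') as file:
    file.write('first line')

# Append mode adds the text at the end, without inserting a line break
with open(demo_path, 'a') as file:
    file.write('APPENDED')

with open(demo_path, 'r') as file:
    after_append = file.read()
print(after_append)

# r+ starts at the beginning of the file, so writing before reading
# overwrites the existing characters instead of appending
with open(demo_path, 'r+') as file:
    file.write('FIRST')

with open(demo_path, 'r') as file:
    after_overwrite = file.read()
print(after_overwrite)
```

The appended text lands directly after `first line` on the same line, and the `r+` write replaces the first five characters of the file rather than adding anything at the end.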

File objects also have several methods for reading their contents. So far we have used `read`, which reads the entire file at once. We can also read the file one line at a time using the `readline` method.

In [None]:
file = open('example.txt', 'r')
line = file.readline()
print(line)
line = file.readline()
print(line)
file.close()

Every call to `readline` advances to the next line. To read the whole file line by line, we can loop over the file object itself, which yields one line per iteration.

In [None]:
file = open('example.txt', 'r')
for line in file:
    print(line)
file.close()
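The same traversal can also be written with explicit `readline` calls, which makes visible what the `for` loop does behind the scenes. This is a minimal sketch; the scratch file `lines_demo.txt` and the list `collected` are assumptions added for illustration.

```python
import os
import tempfile

# Hypothetical scratch file with three short lines
demo_path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(demo_path, 'w') as file:
    file.write('alpha\nbeta\ngamma\n')

collected = []
file = open(demo_path, 'r')
line = file.readline()
while line != '':        # readline returns an empty string at the end of the file
    collected.append(line)
    print(line, end='')  # each line already ends with '\n', so print adds nothing
    line = file.readline()
file.close()
```

Passing `end=''` to `print` avoids the blank line that otherwise appears between lines, because every line read from the file still carries its own newline character.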

This can be useful if you have data on each line that you want to process individually. An example is a CSV file, where each line is a row of data: we can read each line, split it into columns, and process it further. There is also a way to read the entire file into a list of lines using the `readlines` method.

In [None]:
file = open('example.txt', 'r')
lines = file.readlines()
print(lines)
file.close()

We now have a list of lines that we can process. For example, we can use a list comprehension to remove the newline character from each line.

In [None]:
file = open('example.txt', 'r')
lines = file.readlines()
file.close()
lines = [line.strip() for line in lines]
print(lines)

The `strip` method is especially useful here, as it removes all whitespace, including newline characters, from the beginning and end of a string. This makes it handy for cleaning up data before processing it further.
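Combining line-by-line reading with `strip` gives a simple way to handle the CSV case mentioned earlier. The sketch below is only an illustration: the file `people.csv` and its contents are made up, and splitting on commas assumes that no field contains a comma itself.

```python
import os
import tempfile

# Hypothetical CSV file with a header row and two data rows
demo_path = os.path.join(tempfile.mkdtemp(), 'people.csv')
with open(demo_path, 'w') as file:
    file.write('name,age\nAda,36\nAlan,41\n')

rows = []
file = open(demo_path, 'r')
for line in file:
    # Remove the trailing newline, then split the row into its columns
    columns = line.strip().split(',')
    rows.append(columns)
file.close()

print(rows)
```

For real CSV files with quoted fields, the `csv` module from the standard library is the more robust choice.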

Besides text, we can also read and write binary data. This is useful for images, sound files, and other non-text data. We do this by opening the file in binary mode. In this case, we will open an image file.

In [None]:
image = open('image.jpg', 'rb')  # rb stands for read binary
image_data = image.readlines()  # in binary mode the data is split after every newline byte
image.close()

# As the image data is large, we only print the first 10 chunks returned by readlines
for chunk in image_data[:10]:  # slicing avoids an IndexError if there are fewer than 10
    print(chunk)

This is the raw binary data of the image, shown as `bytes` objects. In practice, image data is hard to work with directly because it is encoded in a specific format, such as JPEG or PNG. To load an image we usually rely on external libraries such as `PIL` (installed through the Pillow package) or `OpenCV`. Below is a short demonstration of how to load an image using the `PIL` library.

In [None]:
from PIL import Image

image = Image.open('image.jpg')

image.show()

This will open the image in a separate viewer window. Image processing is a topic of its own that we will not cover in this chapter; we will return to it in a later chapter.
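Returning to plain binary files: the image example only reads binary data, so here is a minimal sketch of writing bytes as well, using the `wb` and `rb` modes. The file name `demo.bin` and the byte values are arbitrary assumptions.

```python
import os
import tempfile

# Hypothetical scratch file for the binary round trip
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.bin')

# A bytes object holding six arbitrary byte values (each between 0 and 255)
payload = bytes([0, 1, 2, 253, 254, 255])

file = open(demo_path, 'wb')  # wb stands for write binary
file.write(payload)
file.close()

file = open(demo_path, 'rb')
loaded = file.read()  # returns a bytes object, not a str
file.close()

print(loaded)
print(list(loaded))  # the same data shown as a list of integers
```

In binary mode, `write` expects a `bytes` object rather than a string, and `read` returns one.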

# Exercises

Play around with some of the file operations we have covered in this chapter. Try to:
1. Open a new file in write mode and write some data to it.
2. Open the file in read mode and read the data you just added.
3. Open the file in append mode and add some new data to it.
4. Open the file in read mode and read the data line by line.
5. Open the file in read mode and read the data into a list of lines.

Show your work in the cell below. You could write a function for each operation to make them easier to test. This will also help you get used to writing functions in Python.