# Files and Exceptions

Now that you’ve mastered the basic skills you need to write organized programs that are easy to use, it’s time to think about making your programs even more relevant and usable. In this chapter you’ll learn to work with files so your programs can quickly analyze lots of data. You’ll learn to handle errors so your programs don’t crash when they encounter unexpected situations. You’ll learn about exceptions, which are special objects Python creates to manage errors that arise while a program is running. You’ll also learn about the json module, which allows you to save user data so it isn’t lost when your program stops running.

Learning to work with files and save data will make your programs easier for people to use. Users will be able to choose what data to enter and when to enter it. People can run your program, do some work, and then close the program and pick up where they left off later. Learning to handle exceptions will help you deal with situations in which files don’t exist and deal with other problems that can cause your programs to crash. This will make your programs more robust when they encounter bad data, whether it comes from innocent mistakes or from malicious attempts to break your programs. With the skills you’ll learn in this chapter, you’ll make your programs more applicable, usable, and stable.


## Reading from a File

An incredible amount of data is available in text files. Text files can contain weather data, traffic data, socioeconomic data, literary works, and more. Reading from a file is particularly useful in data analysis applications, but it’s also applicable to any situation in which you want to analyze or modify information stored in a file. For example, you can write a program that reads in the contents of a text file and rewrites the file with formatting that allows a browser to display it.

When you want to work with the information in a text file, the first step is to read the file into memory. You can read the entire contents of a file, or you can work through the file one line at a time.

### Reading an Entire File

To begin, we need a file with a few lines of text in it. Let’s start with a file that contains pi to 30 decimal places, with 10 decimal places per line:

```
pi_digits.txt

3.1415926535
  8979323846
  2643383279
```

All the files used in this lecture can be found under the **data** folder.

Here’s a program that opens this file, reads it, and prints the contents of the file to the screen:

**Relative and absolute file paths.** `'data/pi_digits.txt'` is a relative file path: Python resolves it starting from the location of this notebook, so the data folder must sit in the same folder as the notebook. For software development, prefer relative paths over absolute paths.

`"C:\Users\zyang\OneDrive\桌面\ee.txt"` is an absolute file path: it spells out the full location of the file on one particular computer. Windows uses a backslash (`\`) in file paths; macOS and Linux use a forward slash (`/`). An absolute path can expose personal information such as a user name, and it differs from one computer to the next.

In [1]:
with open('data/pi_digits.txt') as file_object: 
    contents = file_object.read()
print(contents)

3.1415926535 
  8979323846 
  2643383279



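Python accepts forward slashes (`/`) in file paths on every operating system, including Windows. The backslash (`\`) is an escape character in Python strings, so a Windows path written with backslashes needs an `r` prefix, which makes it a raw string. Doubling each backslash (`\\`) also works. Compare the two cells below.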

In [2]:
with open(r'C:\Users\zyang\OneDrive\桌面\ee.txt') as file_object:
    contents = file_object.read()
print(contents)

eeeeee


In [4]:
with open('C:/Users/zyang/OneDrive/桌面/ee.txt') as file_object:
    contents = file_object.read()
print(contents)

eeeeee


The first line of this program has a lot going on. Let’s start by looking at the open() function. To do any work with a file, even just printing its contents, you first need to open the file to access it. The open() function needs one argument: the path of the file you want to open. When the path is relative, Python resolves it against the current working directory, which is normally the directory that holds the program or notebook you’re running. In this example, Python looks for pi_digits.txt in the data folder next to this notebook. The open() function returns an object representing the file. Here, open('data/pi_digits.txt') returns an object representing pi_digits.txt. Python assigns this object to file_object, which we’ll work with later in the program.

The keyword with closes the file once access to it is no longer needed. Notice how we call open() in this program but not close(). You could open and close the file by calling open() and close(), but if a bug in your program prevents the close() method from being executed, the file may never close. This may seem trivial, but improperly closed files can cause data to be lost or corrupted. And if you call close() too early in your program, you’ll find yourself trying to work with a closed file (a file you can’t access), which leads to more errors. It’s not always easy to know exactly when you should close a file, but with the structure shown here, Python will figure that out for you. All you have to do is open the file and work with it as desired, trusting that Python will close it automatically when the with block finishes execution.
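
As a quick check of this behavior, the sketch below inspects the file object’s closed attribute inside and after a with block. It writes its own small temporary file so it can run anywhere; the file name sample.txt and its contents are placeholders, not part of the data folder.

```python
import os
import tempfile

# Create a small throwaway file so the example runs anywhere.
# (The name and contents are placeholders, not part of the data folder.)
temp_dir = tempfile.mkdtemp()
sample_path = os.path.join(temp_dir, 'sample.txt')
with open(sample_path, 'w') as file_object:
    file_object.write('3.1415926535\n')

with open(sample_path) as file_object:
    contents = file_object.read()
    closed_inside = file_object.closed   # still open inside the block

closed_after = file_object.closed        # closed once the block has ended

# Trying to read from a closed file raises ValueError.
try:
    file_object.read()
    read_failed = False
except ValueError:
    read_failed = True

print(closed_inside, closed_after, read_failed)
```

The variable file_object still exists after the block, but the file it refers to is closed, which is why the second read() fails.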

Once we have a file object representing pi_digits.txt, we use the read() method in the second line of our program to read the entire contents of the file and store it as one long string in contents. When we print the value of contents, we get the entire text file back.

The only difference between this output and the original file is the extra blank line at the end of the output. The blank line appears because the last line of the file ends with a newline character, and print() adds a newline of its own after the text it prints. If you want to remove the extra blank line, you can use rstrip() in the call to print():

In [5]:
with open('data/pi_digits.txt') as file_object:
    contents = file_object.read()
    print(contents.rstrip())

3.1415926535 
  8979323846 
  2643383279
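
One way to see where the extra blank line comes from is to print repr(contents), which shows invisible characters such as newlines. This is a minimal sketch that recreates a three-line file in a temporary folder; it assumes each line ends with a plain newline and omits any trailing spaces the real file may contain.

```python
import os
import tempfile

# Recreate a three-line file so the example is self-contained.
# (Assumption: each line ends with a newline and has no trailing spaces.)
temp_dir = tempfile.mkdtemp()
path = os.path.join(temp_dir, 'pi_digits.txt')
with open(path, 'w') as file_object:
    file_object.write('3.1415926535\n  8979323846\n  2643383279\n')

with open(path) as file_object:
    contents = file_object.read()

print(repr(contents))            # the final '\n' comes from the file itself
print(repr(contents.rstrip()))   # rstrip() removes trailing whitespace
```

The first repr() ends in a newline character and the second does not, so print(contents.rstrip()) produces no extra blank line.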


If you open a file without the with keyword, you need to close it yourself once you’ve finished working with it:

In [6]:
f = open('data/pi_digits.txt')
f.read()

'3.1415926535 \n  8979323846 \n  2643383279\n'

In [7]:
f.close()
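
If you do manage the file yourself, one common pattern is to wrap the work in try/finally so that close() runs even when something goes wrong while reading. The sketch below uses a temporary placeholder file rather than the data folder so that it runs anywhere.

```python
import os
import tempfile

# Placeholder file for the demonstration.
temp_dir = tempfile.mkdtemp()
path = os.path.join(temp_dir, 'pi_digits.txt')
with open(path, 'w') as file_object:
    file_object.write('3.1415926535\n')

f = open(path)
try:
    contents = f.read()
finally:
    f.close()   # runs whether or not read() raised an exception

print(f.closed)
```

This is roughly the cleanup that a with block does for you, which is why the with form is usually preferred.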

### File Paths

When you pass a simple filename like pi_digits.txt to the open() function, Python looks in the current working directory, which is normally the directory where the file that’s currently being executed (that is, your .py program file) is stored.

Sometimes, depending on how you organize your work, the file you want to open won’t be in the same directory as your program file. For example, you might store your program files in a folder called python_work; inside python_work, you might have another folder called text_files to distinguish your program files from the text files they’re manipulating. Even though text_files is in python_work, just passing open() the name of a file in text_files won’t work, because Python will only look in python_work and stop there; it won’t go on and look in text_files. To get Python to open files from a directory other than the one where your program file is stored, you need to provide a file path, which tells Python to look in a specific location on your system.

Because text_files is inside python_work, you could use a relative file path to open a file from text_files. A relative file path tells Python to look for a given location relative to the current working directory, which is normally the directory where the currently running program file is stored. For example, you’d write:

```
with open('text_files/filename.txt') as file_object:
```

This line tells Python to look for the desired .txt file in the folder text_files and assumes that text_files is located inside python_work (which it is).

*Windows systems use a backslash (`\`) instead of a forward slash (`/`) when displaying file paths, but you can still use forward slashes in your code.*

You can also tell Python exactly where the file is on your computer regardless of where the program that’s being executed is stored. This is called an absolute file path. You use an absolute path if a relative path doesn’t work. For instance, if you’ve put text_files in some folder other than python_work—say, a folder called other_files—then just passing open() the path 'text_files/filename.txt' won’t work because Python will only look for that location inside python_work. You’ll need to write out a full path to clarify where you want Python to look.

Absolute paths are usually longer than relative paths, so it’s helpful to assign them to a variable and then pass that variable to open():

```
file_path = '/home/ehmatthes/other_files/text_files/filename.txt'
with open(file_path) as file_object:
```

Using absolute paths, you can read files from any location on your system. For now it’s easiest to store files in the same directory as your program files or in a folder such as text_files within the directory that stores your program files.
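
The difference between the two kinds of path can be checked with the standard pathlib module. In this sketch, text_files/filename.txt is a hypothetical path that does not need to exist, because the code only inspects the path and never opens the file.

```python
from pathlib import Path

# Hypothetical relative path; nothing is opened, so the file need not exist.
relative_path = Path('text_files/filename.txt')
print(relative_path.is_absolute())    # False

# A relative path is interpreted starting from the current working directory.
absolute_path = Path.cwd() / relative_path
print(absolute_path.is_absolute())    # True
print(absolute_path)                  # the full path, different on every machine
```

Either path object can be passed to open() in place of a string.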

_If you try to use backslashes in a file path, you’ll get an error because the backslash is used to escape characters in strings. For example, in the path `"C:\path\to\file.txt"`, the sequence `\t` is interpreted as a tab. If you need to use backslashes, you can escape each one in the path, like this: `"C:\\path\\to\\file.txt"`._
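
To make the escaping concrete, the sketch below writes the same hypothetical Windows path four ways and compares the resulting strings. No file is opened; the path C:\temp\new.txt is only an illustration, chosen because `\t` and `\n` are both escape sequences.

```python
plain = "C:\temp\new.txt"        # \t becomes a tab, \n becomes a newline
escaped = "C:\\temp\\new.txt"    # each backslash written twice
raw = r"C:\temp\new.txt"         # raw string: backslashes kept as typed
forward = "C:/temp/new.txt"      # forward slashes need no escaping

print(repr(plain))               # shows the tab and newline inside the string
print(escaped == raw)            # True
print(forward)
```

The escaped and raw versions are the same string, while the plain version silently contains a tab and a newline instead of backslashes.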

### Reading Line by Line

When you’re reading a file, you’ll often want to examine each line of the file. You might be looking for certain information in the file, or you might want to modify the text in the file in some way. For example, you might want to read through a file of weather data and work with any line that includes the word sunny in the description of that day’s weather. In a news report, you might look for any line with the tag `<headline>` and rewrite that line with a specific kind of formatting.

You can use a for loop on the file object to examine each line from a file one at a time:

In [8]:
filename = 'data/pi_digits.txt'

with open(filename) as file_object:
    for line in file_object:
        print(line)

3.1415926535 

  8979323846 

  2643383279



```
➊ filename = 'pi_digits.txt'

➋ with open(filename) as file_object:
➌     for line in file_object:
          print(line)
```

At ➊ we assign the name of the file we’re reading from to the variable filename. This is a common convention when working with files. Because the variable filename doesn’t represent the actual file—it’s just a string telling Python where to find the file—you can easily swap out 'pi_digits.txt' for the name of another file you want to work with. After we call open(), an object representing the file and its contents is assigned to the variable file_object ➋. We again use the with syntax to let Python open and close the file properly. To examine the file’s contents, we work through each line in the file by looping over the file object ➌.

When we print each line, we find even more blank lines. These blank lines appear because an invisible newline character is at the end of each line in the text file. The print function adds its own newline each time we call it, so we end up with two newline characters at the end of each line: one from the file and one from print(). Using rstrip() on each line in the print() call eliminates these extra blank lines:

In [10]:
filename = 'data/pi_digits.txt'

with open(filename) as file_object:
    for line in file_object:
        print(line.rstrip())  # rstrip() removes the newline (and any other trailing whitespace) at the end of each line

3.1415926535
  8979323846
  2643383279
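
The same loop can pick out only the lines you care about, as in the weather example mentioned at the start of this section. This is a sketch under stated assumptions: the file weather.txt and its three lines are invented for the demonstration and are written to a temporary folder.

```python
import os
import tempfile

# Hypothetical weather data, written to a temporary file for the demo.
temp_dir = tempfile.mkdtemp()
filename = os.path.join(temp_dir, 'weather.txt')
with open(filename, 'w') as file_object:
    file_object.write('Mon: sunny, 21C\nTue: rain, 15C\nWed: sunny, 23C\n')

sunny_lines = []
with open(filename) as file_object:
    for line in file_object:
        # Keep only the lines that mention sunny weather.
        if 'sunny' in line:
            sunny_lines.append(line.rstrip())

for line in sunny_lines:
    print(line)
```

Collecting the matching lines in a list keeps them available after the with block has closed the file.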

