# Files and Exceptions

---

In this lesson, you'll learn to work with files and to handle errors so your programs don't crash when they encounter unexpected situations. *Exceptions* are special objects Python creates to manage errors that arise while a program is running. You'll also learn about the `json` module, which allows you to save user data so it isn't lost when your program stops running.

## Reading from a File

Reading from a file is particularly useful in data analysis applications. When you want to work with the data in a text file, the first step is to read the file into memory. You can read the entire contents of a file, or you can work through the file one line at a time. 

Let's start with a file that contains *pi* to 30 decimal places, with 10 decimal places per line:

*pi_digits.txt*
```
        3.1415926535
          8979323846
          2643383279
```

Here's a program that opens this file, reads it, and prints the contents of the file to the screen:

In [1]:
# reading file, pi_digits.txt
with open('pi_digits.txt') as file:
    contents = file.read()
print(contents)

3.1415926535
  8979323846
  2643383279
  



To do any work with a file, you first need to *open* the file to access it. The `open()` function needs one argument: the name of the file you want to open. Python looks for this file in the current working directory, which is usually the directory you run the program from.

The keyword `with` closes the file once access to it is no longer needed. Improperly closed files can cause data to be lost or corrupted. You can also call the `close()` method to close the file yourself, but if a bug in your program prevents `close()` from being executed, the file may never close.

Once we have a file object representing *pi_digits.txt*, we use the `read()` method to read the entire contents of the file and store it as one long string in `contents`.
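
To see what `with` saves you from writing, here is a minimal sketch that compares it with a manual `close()` call placed in a `finally` clause (`finally` is covered later in this lesson). The sketch creates its own throwaway file in a temporary directory so it runs anywhere; the name `sample.txt` and its contents are made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a small sample file (illustrative name and contents).
    path = os.path.join(tmp_dir, 'sample.txt')
    with open(path, 'w') as file:
        file.write("3.1415926535\n")

    # Manual version: close() sits in a finally clause so it runs
    # even if read() raises an exception.
    file = open(path)
    try:
        manual_contents = file.read()
    finally:
        file.close()
    manual_closed = file.closed

    # with version: the file is closed automatically when the block ends.
    with open(path) as file:
        with_contents = file.read()
    with_closed = file.closed

print(manual_contents == with_contents)  # True
print(manual_closed, with_closed)        # True True
```

Both versions read the same text and leave the file closed; the `with` form simply does it in fewer lines.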

### File Paths - Relative and Absolute

Sometimes, depending on how you organise your work, the file you want to open won't be in the same directory as your program file. In such a case, you'll need to provide a *file path*, which tells Python to look in a specific location on your system. One way is to use a *relative file path*, which tells Python to look in a location relative to the current working directory (usually the directory you run the program from). For example:

In [2]:
with open('sample_files/pi_digits.txt') as file:
    contents = file.read()  # read in contents as one blob of string
print(contents)

3.1415926535
  8979323846
  2643383279
  



**Note**: *Windows systems use a backslash (`\`) instead of a forward slash (`/`) when displaying file paths, but you can still use forward slashes in your code.*

You can also tell Python exactly where the file is on your computer regardless of where the program that's being executed is stored. This is called an *absolute file path*. For instance:

In [3]:
with open('/home/your_username/Documents/Git/python-basics/sample_files/pi_digits.txt') as file:
    contents = file.read()
print(contents)

3.1415926535
  8979323846
  2643383279
  



**Note**: *On macOS and Linux, an absolute file path typically begins with a forward slash (`/`), which represents the root directory. On Windows, an absolute file path starts with a drive letter followed by a colon (`:`) and a backslash (`\`). Because Python uses the backslash to escape characters in strings, write a double backslash for each separator in a Windows path, like this: `"C:\\path\\to\\file.txt"`. Forward slashes are also accepted in Windows paths, and then a single slash is enough: `"C:/path/to/file.txt"`.*
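
The difference between the two kinds of path can be checked from Python itself. The sketch below uses the standard library's `pathlib` module, which this lesson does not otherwise cover. The `sample_files/pi_digits.txt` path is the one from the example above, and the file does not need to exist for these checks to work.

```python
from pathlib import Path

# A relative path: interpreted relative to the current working directory.
relative_path = Path('sample_files') / 'pi_digits.txt'
print(relative_path.is_absolute())   # False

# Path.cwd() is the current working directory; joining it with the
# relative path gives an absolute path to the same location.
absolute_path = Path.cwd() / relative_path
print(absolute_path.is_absolute())   # True

# as_posix() shows the path with forward slashes on every platform.
print(relative_path.as_posix())      # sample_files/pi_digits.txt
```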

### Reading Line by Line

When you're reading a file, you'll often want to examine each line of the file. You might be looking for certain information in the file, or you might want to modify the text in the file in some way. You can use a `for` loop on the file object to examine each line from a file one at a time:

In [4]:
filename = 'pi_digits.txt'

with open(filename) as file:
    for line in file:
        print(line.strip())  # strip whitespaces from both ends of each line

3.1415926535
8979323846
2643383279



In the original file, the second and third lines of digits begin with whitespace. Calling `strip()` on each line as we print it removes the whitespace, including the trailing newline, from both ends. Alternatively, instead of calling `read()`, we can use `readlines()` to store the contents as a list of lines, which remains available after the `with` block ends:

In [5]:
filename = 'pi_digits.txt'

with open(filename) as file:
    lines = file.readlines()  # store file contents as a list of lines
    
for line in lines:
    print(line.strip())

3.1415926535
8979323846
2643383279



**Note**: *By now, you might have observed that when Python reads from a text file, it interprets all text in the file as a string. If you read in a number and want to work with that value in a numerical context, you'll have to convert it to an integer using the `int()` function or to a float using the `float()` function.*
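
As a small illustration of this note, the sketch below joins the three lines of digits into one string and then converts it with `float()`. It recreates *pi_digits.txt* in a temporary directory so it can run on its own; that setup is an assumption of the sketch, not part of the original example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Recreate pi_digits.txt so the sketch is self-contained.
    filename = os.path.join(tmp_dir, 'pi_digits.txt')
    with open(filename, 'w') as file:
        file.write("3.1415926535\n  8979323846\n  2643383279\n")

    with open(filename) as file:
        lines = file.readlines()

# Join the stripped lines into one string of digits.
pi_string = ''.join(line.strip() for line in lines)
print(pi_string)        # 3.141592653589793238462643383279
print(type(pi_string))  # <class 'str'>

# Convert the string to a float before doing arithmetic with it.
pi_value = float(pi_string)
print(pi_value * 2)
```

Note that a float cannot hold all 30 decimal places, so `pi_value` keeps only about 15 to 17 significant digits.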

## Writing to a File

One of the simplest ways to save data is to write it to a file. To write text to a file, you'll need to call `open()` with a second argument telling Python that you want to write to the file. For instance:

In [6]:
filename = 'programming.txt'

# writing to file, "programming.txt"
with open(filename, 'w') as file:
    file.write("I love programming.")

The call to `open()` now has two arguments in this example. The first argument is still the name of the file we want to open. The second argument, `'w'`, tells Python that we want to open the file in *write mode*. You can open the file in:

   + *read mode (`'r'`)*
   + *write mode (`'w'`)*
   + *append mode (`'a'`)*, or
   + *read and write mode (`'r+'`)*
   
If you omit the mode argument, Python opens the file in read-only mode by default. In write mode and append mode, `open()` automatically creates the file if it doesn't already exist. Take extra care when opening a file in write mode (`'w'`): if the file does exist, Python erases its contents before returning the file object.
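
The effect of the different modes can be observed with a short experiment. This is only a sketch: it works in a temporary directory, and the file name `notes.txt` and the helper `read_text()` are hypothetical names introduced for the illustration.

```python
import os
import tempfile

def read_text(path):
    """Return the whole file as a string (the default mode is 'r')."""
    with open(path) as file:
        return file.read()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'notes.txt')   # hypothetical file name

    # 'w' creates the file because it does not exist yet.
    with open(path, 'w') as file:
        file.write("first draft\n")
    after_first_write = read_text(path)

    # 'w' again: the old contents are erased before writing.
    with open(path, 'w') as file:
        file.write("second draft\n")
    after_second_write = read_text(path)

    # 'a' keeps the old contents and adds to the end.
    with open(path, 'a') as file:
        file.write("one more line\n")
    after_append = read_text(path)

print(repr(after_first_write))    # 'first draft\n'
print(repr(after_second_write))   # 'second draft\n'
print(repr(after_append))         # 'second draft\none more line\n'
```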

### Writing Multiple Lines

The `write()` method doesn't add any newlines to the text you write. If you write more than one line without including newline characters, all the lines you've written will be saved on a single line in the file:

In [7]:
filename = 'programming.txt'

with open(filename, 'w') as file_w:
    file_w.write("I love programming.")
    file_w.write("I love creating games.")

# two lines written appear on one line in the file:
with open(filename, 'r') as file_r:
    contents = file_r.read()
    print(contents)

print("")  # newline to separate outputs from the two examples

# rewrite to file using newlines "\n"
with open(filename, 'w') as file_w:
    file_w.write("I love programming.\n")
    file_w.write("I love creating games.\n")

# the file contents are now saved the way we would have wanted this time round
with open(filename, 'r') as file_r:
    contents = file_r.read()
    print(contents)

I love programming.I love creating games.

I love programming.
I love creating games.



### Appending to a File

When you open a file in append mode, any lines you write to the file will be added at the end of the file. If the file doesn't exist yet, Python will create an empty file for you:

In [8]:
with open(filename, 'a') as file_w:
    file_w.write("I also love finding meaning in large datasets.\n")
    file_w.write("I love creating apps that can run in a browser.\n")
    
# reading the file contents
with open(filename, 'r') as file_r:
    contents = file_r.read()
    print(contents)

I love programming.
I love creating games.
I also love finding meaning in large datasets.
I love creating apps that can run in a browser.



## Errors and Exceptions

Python uses special objects called *exceptions* to manage errors that arise during a program's execution. Errors can be classified into three major groups:

+ Syntax errors
+ Runtime errors
+ Logical errors

#### Syntax errors
These are errors that Python finds when it tries to parse your program; it exits with an error message without running anything. Common syntax errors include leaving out a symbol such as a colon, comma, or bracket, misspelling a keyword, and indenting code incorrectly.

#### Runtime errors
If a program is free of syntax errors, the Python interpreter will run it. However, the program may exit unexpectedly (i.e. *crash*) during execution if it encounters a runtime error: a problem that was not detected when the program was parsed and is only revealed when a particular line is executed. These kinds of errors can be caught using exception handling so that your program can continue to run without crashing.

#### Logical errors
Logical errors are the most difficult to fix. They are caused by a mistake in the program's logic: the program runs without crashing or raising an exception, but it produces an incorrect result.
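
As an illustration of a logical error, consider the following sketch. The two functions are made up for this example; the first one is valid Python and runs without any complaint, yet it returns the wrong answer because of operator precedence.

```python
def average_buggy(a, b):
    """Intended to return the mean of a and b, but has a logic error."""
    # Division binds tighter than addition, so this computes a + (b / 2).
    return a + b / 2

def average_fixed(a, b):
    """Return the mean of a and b."""
    return (a + b) / 2

print(average_buggy(4, 6))   # 7.0 (no error is raised, but the result is wrong)
print(average_fixed(4, 6))   # 5.0
```

Python cannot detect this kind of mistake for you; it only shows up when you compare the output with what you expected.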

### The `try` and `except` statements

To handle possible exceptions, we use a `try-except` block:

In [9]:
try:
    age = int(input("Please enter your age: "))
    print("I see that you are %d years old." % age)
except ValueError:
    print("Hey, that wasn't a number!")

Please enter your age: abc
Hey, that wasn't a number!


In the above example, Python will try to process all the statements inside the `try` block. If a `ValueError` occurs at any point as it is executing them, the flow of control will immediately pass to the `except` block, and any remaining statements in the `try` block will be skipped. The `except` block tells Python what to do in case a certain exception arises when it tries to run the code in the `try` block.
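
The same flow of control can be tried out without typing at a prompt. The sketch below wraps the logic in a hypothetical helper, `describe_age()`, that takes the input as a string argument instead of calling `input()`:

```python
def describe_age(text):
    """Return a message for the given input string (a hypothetical helper)."""
    try:
        age = int(text)  # may raise ValueError
        # The next line is skipped if int() raised an exception.
        message = f"I see that you are {age} years old."
    except ValueError:
        message = "Hey, that wasn't a number!"
    return message

print(describe_age("42"))    # I see that you are 42 years old.
print(describe_age("abc"))   # Hey, that wasn't a number!
```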

It is possible for one `except` clause to handle more than one kind of error: we can provide a tuple of exception types instead of a single type:

In [10]:
try:
    dividend = int(input("Please enter the dividend: "))
    divisor = int(input("Please enter the divisor: "))
    print("%d / %d = %f" % (dividend, divisor, dividend/divisor))
except (ValueError, ZeroDivisionError):
    print("Oops, something went wrong!")

Please enter the dividend: 1
Please enter the divisor: 0
Oops, something went wrong!


A `try-except` block can also have multiple `except` clauses. If an exception occurs, Python will check each `except` clause from the top down to see if the exception type matches. If none of the `except` clauses match, the exception will be considered unhandled, and your program will crash:

In [11]:
try:
    dividend = int(input("Please enter the dividend: "))
    divisor = int(input("Please enter the divisor: "))
    print("%d / %d = %f" % (dividend, divisor, dividend/divisor))
except ValueError:
    print("The divisor and dividend have to be numbers!")
except ZeroDivisionError:
    print("The divisor may not be zero!")

Please enter the dividend: 1
Please enter the divisor: a
The divisor and dividend have to be numbers!
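
To see what happens when no `except` clause matches, here is a minimal sketch. The helper `divide_text()` is hypothetical and deliberately handles only `ValueError`, so a `ZeroDivisionError` leaves the function unhandled. An outer `try` block catches it purely so that the sketch can finish running instead of crashing.

```python
def divide_text(dividend_text, divisor_text):
    """Divide two numbers given as strings; only ValueError is handled here."""
    try:
        return int(dividend_text) / int(divisor_text)
    except ValueError:
        return None

print(divide_text("1", "a"))   # None (ValueError matched the except clause)

# ZeroDivisionError matches no clause inside divide_text(), so it
# propagates out of the function to the caller.
try:
    divide_text("1", "0")
except ZeroDivisionError:
    outcome = "ZeroDivisionError was not handled inside divide_text()"
print(outcome)
```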


One common issue when working with files is handling missing files. Let's try to read a file that doesn't exist and handle the resulting `FileNotFoundError` exception:

In [12]:
filename = 'alice.txt'  # non-existent file

try:
    with open(filename, encoding='utf-8') as f:
        contents = f.read()
except FileNotFoundError:
    print(f"Sorry, the file {filename} does not exist.")

### The `try-except-else` block

The `try-except-else` block works like this: Python attempts to run the code in the `try` block. The only code that should go in a `try` block is code that might cause an exception to be raised. Sometimes you'll have additional code that should run only if the `try` block was successful; this code goes in the `else` block:

In [13]:
print("Give me two numbers, and I'll divide them.")
print("Enter 'q' to quit.")

while True:
    first_number = input("\nFirst number: ")
    if first_number == 'q':
        break
    second_number = input("Second number: ")
    if second_number == 'q':
        break
    try:
        answer = int(first_number) / int(second_number)    
    except ZeroDivisionError:
        print("You can't divide by 0")
    except ValueError:
        print("You have to input numbers!")
    else:
        print(answer)

Give me two numbers, and I'll divide them.
Enter 'q' to quit.

First number: 1
Second number: 2
0.5

First number: 1
Second number: 0
You can't divide by 0

First number: a
Second number: b
You have to input numbers!

First number: q


### Failing Silently

Sometimes, you'll want the program to fail silently when an exception occurs and continue on as if nothing happened. To make a program fail silently, you write a `try` block as usual, but you explicitly tell Python to do nothing in the `except` block. Leaving the `except` block empty causes a syntax error, because Python expects at least one indented line of code after the `except` line. Instead, use the `pass` statement, which tells Python to do nothing in the block.

In the following example, the program performs a word count for each text file named in a list. The text files come from Project Gutenberg (https://www.gutenberg.org/). In the run shown here, *moby_dick.txt* is not present, so the program skips it without printing anything:

In [14]:
def count_words(filename):
    """Count the approximate number of words in a file."""
    try:
        with open(filename, encoding='utf-8') as f:
            contents = f.read()
    except FileNotFoundError:
        pass
    else:
        words = contents.split()
        num_words = len(words)
        print(f"The file {filename} has about {num_words} words.")
        
filenames = ['alice.txt','siddhartha.txt','moby_dick.txt','little_women.txt']
for filename in filenames:
    count_words(filename)

The file alice.txt has about 17842 words.
The file siddhartha.txt has about 42166 words.
The file little_women.txt has about 189092 words.


### The `try-except-else-finally` statement

The `finally` clause will be executed at the end of the `try-except` block no matter what – if there is no exception, if an exception is raised and handled, if an exception is raised and not handled, and even if we exit the block using `break`, `continue` or `return`. We can use the `finally` clause for cleanup code that we always want to be executed:

In [15]:
try:
    age = int(input("Please enter your age: "))
except ValueError:
    print("Hey, that wasn't a number!")
else:
    print("I see that you are %d years old." % age)
finally:
    print("It was really nice talking to you.  Goodbye!")

Please enter your age: Hello
Hey, that wasn't a number!
It was really nice talking to you.  Goodbye!
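
The claim that `finally` runs even when the block is left with `return` can be checked with a small sketch. The helper `to_int()` and the `events` list are hypothetical names used only to record which clauses were executed.

```python
events = []

def to_int(text):
    """Convert text to an int, recording which clauses run (illustrative helper)."""
    try:
        number = int(text)
    except ValueError:
        events.append("except")
        return None          # finally still runs before the function returns
    else:
        events.append("else")
        return number
    finally:
        events.append("finally")

print(to_int("7"), events)       # 7 ['else', 'finally']
events.clear()
print(to_int("seven"), events)   # None ['except', 'finally']
```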


## Storing Your Data

You might allow users to store preferences in a game or provide data for a visualisation. Whatever the focus of your program is, you'll store the information users provide in data structures such as lists and dictionaries. When users close the program, you'll usually want to save that information. A simple way to do this is with the `json` module.

The `json` module allows you to dump simple Python data structures into a file in the JSON (JavaScript Object Notation) data format and load the data from that file the next time the program runs. JSON is not specific to Python; it's a useful and portable format, so you can share JSON data with programs written in many other programming languages.
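
To get a feel for what the JSON format looks like, the sketch below converts a dictionary to JSON text and back. It uses `json.dumps()` and `json.loads()`, the string-based counterparts of the file-based functions this lesson uses; the dictionary contents are made up for illustration.

```python
import json

# Illustrative data; the keys and values are made up for this sketch.
preferences = {"username": "colin", "volume": 7, "levels_done": [1, 2, 3]}

# dumps() returns the JSON text as a string instead of writing it to a file.
json_text = json.dumps(preferences)
print(json_text)

# loads() turns JSON text back into Python data structures.
restored = json.loads(json_text)
print(restored == preferences)   # True
print(type(restored))            # <class 'dict'>
```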

### Using `json.dump()` and `json.load()`

Let's look at a program that retrieves a username from a JSON file. If the file doesn't exist, it asks the user for a name instead and saves it in a JSON file:

In [17]:
import json

filename = 'username.json'

try:
    with open(filename) as f:
        username = json.load(f)
except FileNotFoundError:
    username = input("What is your name? ")
    with open(filename, 'w') as f:
        json.dump(username, f)
        print(f"We'll remember you when you come back, {username}!")
else:
    print(f"Welcome back, {username}!")

What is your name? Colin
We'll remember you when you come back, Colin!


### Refactoring

Often, you'll come to a point where your code will work, but you'll recognise that you could improve the code by breaking it up into a series of functions that have specific jobs. This process is called *refactoring* - it makes your code cleaner, easier to understand, and easier to extend. We can refactor the above example by moving the bulk of its logic into one or more functions. 

In [18]:
import json

def get_stored_username():
    """Get stored username if available."""
    filename = 'username.json'
    try:
        with open(filename) as f:
            username = json.load(f)
    except FileNotFoundError:
        return None
    else:
        return username

def get_new_username():
    """Prompt for a new username."""
    username = input("What is your name? ")
    filename = 'username.json'
    with open(filename, 'w') as f:
        json.dump(username, f)
    return username
    
def greet_user():
    """Greet the user by name"""
    username = get_stored_username()
    if username:
        print(f"Welcome back, {username}!")
    else:
        username = get_new_username()
        print(f"We'll remember you when you come back, {username}!")
            
greet_user()    

Welcome back, Colin!


Each function in this version now has a single, clear purpose. This compartmentalisation of work is an essential part of writing clear code that will be easy to maintain and extend.
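
One possible further step, sketched here under assumptions rather than taken from the example above, is to pass the file name in as a parameter and to separate saving from prompting. The function `store_username()` is a hypothetical addition. With this shape, the functions can be exercised against a temporary file without typing at a prompt.

```python
import json
import os
import tempfile

def get_stored_username(filename):
    """Return the stored username, or None if the file does not exist."""
    try:
        with open(filename) as f:
            return json.load(f)
    except FileNotFoundError:
        return None

def store_username(filename, username):
    """Save the username to the given JSON file (hypothetical helper)."""
    with open(filename, 'w') as f:
        json.dump(username, f)

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'username.json')
    before = get_stored_username(filename)   # the file has not been created yet
    store_username(filename, 'Colin')
    after = get_stored_username(filename)

print(before)   # None
print(after)    # Colin
```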

Next up, we'll touch on the last lesson - [how to test your code](https://github.com/colintwh/python-basics/blob/master/testcode.ipynb)