# Files and Exceptions

## File Input and Output

* Currently, our variables are stored in memory
* Variables are reset when we exit our program
* Writing data to an _output file_
* Reading data from an _input file_

## Software that reads and writes files

* Word processors
* Image Editors
* Spreadsheets
* Games
* Web Browsers

## Overview of how to manipulate files

1. Open the file for reading or writing (input vs. output file). Connects the file to the program.
2. Process the file. Either read the data or write data.
3. Close the file. Releases the connection of the file to the program.

## File Types and Access Methods

* Two types of files
    1. Text
    2. Binary (an image, for example)
* File access methods
    1. Sequential (line by line)
    2. Direct (jump directly to any piece of data)
* We'll be accessing text files sequentially

## Opening a file

* We need to open a file with a _file name_
* When we open a file, the program creates a _file object_ 
* We assign the file object to a variable; that variable then represents the file on disk

![open file diagram](images/open_file_diagram.png)

## Opening a file in Python

* Use the `open` function (no need to `import` anything)
* Creates a file object and associates it with a file on disk
* Takes two parameters, the file path/name and the `mode`

## Python File Modes

* String specifying whether the file is being read or written to
* We'll be using 3 values in this course (there are more)
    1. `'r'` means to open the file for reading only.
    2. `'w'` means to open a file for writing. If the file exists, erase it.
    3. `'a'` means to open a file to be written to. Data is written to the end of the file if it exists.
* `'w'` and `'a'` create the file if it doesn't exist
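
The difference between `'w'` and `'a'` can be checked with a small sketch. It assumes a throwaway file in a temporary directory (the name `modes_demo.txt` is made up for this illustration), so it does not touch the `files` folder used elsewhere in these notes.

```python
import os
import tempfile

# Hypothetical scratch file, created in a temporary directory for this demo only.
demo_path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

# 'w' creates the file (it does not exist yet) and writes one line.
demo = open(demo_path, 'w')
demo.write('first\n')
demo.close()

# 'a' keeps the existing contents and adds to the end.
demo = open(demo_path, 'a')
demo.write('second\n')
demo.close()

demo = open(demo_path, 'r')
after_append = demo.read()
demo.close()

# 'w' on an existing file erases it before writing.
demo = open(demo_path, 'w')
demo.write('third\n')
demo.close()

demo = open(demo_path, 'r')
after_write = demo.read()
demo.close()

print(repr(after_append))  # 'first\nsecond\n'
print(repr(after_write))   # 'third\n'
```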

In [6]:
# Example reading a file.
# Assumes the file exists in a folder called "files" in the same directory as this notebook.
# Forward slashes also work on Windows, so this path can stay as written.

hello = open('files/hello.txt', 'r')
print(type(hello))
hello.close()

<class '_io.TextIOWrapper'>


In [4]:
# Example writing a file
goodbye = open('files/goodbye.txt', 'w')
print(type(goodbye))
goodbye.close()

<class '_io.TextIOWrapper'>


## Relative vs. Absolute file paths

* Examples above demonstrate _relative_ file paths. 
* File path is based on the current directory
* An _absolute_ file path is the full path, starting from the root of the file system
    * In Windows, this might mean `C:\Users\George\temp\hello.txt`
    * In macOS, this might mean `/Users/George/temp/hello.txt`
    * In Linux, this might mean `/home/george/temp/hello.txt`
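
To see how a relative path maps to an absolute one, the following sketch uses the standard `os` module, which these notes have not introduced. It only builds path strings, so `files/hello.txt` does not need to exist for it to run.

```python
import os

relative_path = 'files/hello.txt'

# os.getcwd() returns the current directory that relative paths are based on.
current_directory = os.getcwd()

# os.path.abspath() combines the current directory with the relative path.
absolute_path = os.path.abspath(relative_path)

print(current_directory)
print(absolute_path)
print(os.path.isabs(relative_path))  # False
print(os.path.isabs(absolute_path))  # True
```

The printed directories depend on where the notebook is running, so they will differ from machine to machine.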

## Working with Windows paths

* Windows paths use `\`, which is an escape character (recall `\t`, `\"`, etc.)
* Use the `r` prefix to a string to indicate that this is a raw string (no escapes)
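* What's the difference between the two:
    * `print('C:\Users\George\temp\test.txt')`
    * `print(r'C:\Users\George\temp\test.txt')`
* In Python 3, the first line does not run at all: `\U` starts a Unicode escape, so Python reports a `SyntaxError`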
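
A shorter path makes the effect of the `r` prefix easy to see. This sketch uses `C:\temp\test.txt`, a made-up path chosen because `\t` is a valid escape (a tab), and also shows doubled backslashes as an alternative that the notes above do not cover.

```python
# '\t' inside a normal string is a tab character, so this path is mangled.
escaped = 'C:\temp\test.txt'

# The r prefix keeps every backslash as a literal character.
raw = r'C:\temp\test.txt'

# Doubling each backslash produces the same string as the r prefix.
doubled = 'C:\\temp\\test.txt'

print(escaped)
print(raw)
print(raw == doubled)   # True
print('\t' in escaped)  # True: the path contains real tab characters
print('\t' in raw)      # False
```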

## Objects and Methods

* We have an instance of a file object
* This file object has functions attached to it called _methods_
* Methods are called the same way as functions
* We write to a file using the `write()` method

In [9]:
# Example writing to a file
goodbye = open('files/goodbye.txt', 'w')
goodbye.write('Goodbye\n')
goodbye.write('Aloha\n')
goodbye.write('Sayonara\n')
goodbye.close()

## Closing files

* Files should be closed when your program is done writing
* Python writes the data into a _buffer_ in memory
* The `close` method ensures that the _buffer_ is written
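
One way to make sure a file is always closed is Python's `with` statement, which these notes do not otherwise use. The sketch below is an illustration under that assumption; the file name `close_demo.txt` is made up, and the file lives in a temporary directory.

```python
import os
import tempfile

# Hypothetical scratch file for this demo only.
demo_path = os.path.join(tempfile.mkdtemp(), 'close_demo.txt')

# The with block closes the file when the indented code finishes,
# which also writes out anything still sitting in the buffer.
with open(demo_path, 'w') as demo:
    demo.write('Goodbye\n')
    print(demo.closed)  # False: still open inside the block

print(demo.closed)      # True: closed automatically

with open(demo_path, 'r') as demo:
    contents = demo.read()

print(repr(contents))   # 'Goodbye\n'
```

Calling `close()` explicitly, as the other examples do, has the same effect; `with` just makes it harder to forget.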

## Reading a file

* Use the `'r'` mode
* We use the `read` method to get the file contents as a string
* Still should `close` the file after we're done

In [10]:
# Example reading a file
goodbye = open('files/goodbye.txt', 'r')
contents = goodbye.read()
print(contents)
goodbye.close()

Goodbye
Aloha
Sayonara



## Reading files using `readline`

* May not want to put all the data into a single string
* Maybe input is data that is processed line by line
* Maybe input is very very large
* `readline` reads a line from the file
    * A line ends with a `\n`
    * `readline` returns the line with its trailing `\n`

In [11]:
# Example reading using `readline`
goodbye = open('files/goodbye.txt', 'r')
print(goodbye.readline())
print(goodbye.readline())
print(goodbye.readline())
goodbye.close()

Goodbye

Aloha

Sayonara



## File Cursors

* Python internally tracks a read position, or a _cursor_
* This tracks where in the file we are currently reading from
* Each call to `readline` moves the cursor to the next line
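
The cursor can be observed by mixing `readline` and `read` on the same file object. This is a minimal sketch using a made-up scratch file (`cursor_demo.txt`) in a temporary directory.

```python
import os
import tempfile

# Hypothetical scratch file for this demo only.
demo_path = os.path.join(tempfile.mkdtemp(), 'cursor_demo.txt')

demo = open(demo_path, 'w')
demo.write('Goodbye\nAloha\nSayonara\n')
demo.close()

demo = open(demo_path, 'r')
first_line = demo.readline()  # cursor moves past 'Goodbye\n'
rest = demo.read()            # read() starts at the cursor, not at the top
at_end = demo.readline()      # the cursor is at the end, so nothing is left
demo.close()

print(repr(first_line))  # 'Goodbye\n'
print(repr(rest))        # 'Aloha\nSayonara\n'
print(repr(at_end))      # ''
```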

## String concatenation

* We typically write to a file line by line
* This means our strings should have a `\n` at the end
* May need to add it if it is not there
* We can use the `+` operator to do this

In [12]:
goodbye_file = open('files/goodbye.txt', 'a')
user_input = input("What's another way to say goodbye?")
goodbye_file.write(user_input + '\n')
goodbye_file.close()

What's another way to say goodbye? adios


## Stripping a new line from a string

* Printing a line returned by `readline` outputs an extra `\n`
    * The one at the end of the string
    * The default `print` ender
* Strings are objects, too
* `rstrip()` removes specific characters from the end of a string
* Strips the characters from the right side of the string
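
A quick sketch of `rstrip` on a single string, without any file involved. The sample text is made up; the second call shows that leaving out the argument removes all trailing whitespace, not only the newline.

```python
line = 'Aloha  \n'

only_newline = line.rstrip('\n')  # removes just the trailing newline
all_whitespace = line.rstrip()    # removes trailing spaces and the newline

print(repr(only_newline))    # 'Aloha  '
print(repr(all_whitespace))  # 'Aloha'
print(repr(line))            # 'Aloha  \n' (the original string is unchanged)
```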

In [17]:
goodbye_file = open('files/goodbye.txt', 'r')
for i in range(4):
    line = goodbye_file.readline()
    print(line.rstrip('\n'))

goodbye_file.close()

Goodbye
Aloha
Sayonara
adios


## Reading and Writing Numeric Data

* `write()` method takes a string
* Built in `str` function converts a number to a string
* `readline` gives back a string
* Use `int` or `float` to convert the string into a number

In [20]:
number_file = open('files/numbers.txt', 'w')
# This is an error
# number_file.write(1)

number_file.write(str(1) + '\n')
number_file.write(str(2) + '\n')
user_input = int(input("Enter a number"))
number_file.write(str(user_input) + '\n')
number_file.close()

Enter a number 3


In [23]:
number_file = open('files/numbers.txt', 'r')
total = 0
for i in range(3):
    # int() ignores the trailing '\n', so rstrip is not required
    line = number_file.readline()
    total += int(line)

number_file.close()
print(total)

6


In [25]:
# Using a loop to write data
scores = open('files/test_scores.txt', 'w')
num_students = int(input("How many students are in the class?"))
for i in range(num_students):
    score = int(input(f"Enter student {i + 1}'s score"))
    scores.write(str(score) + '\n')

scores.close()

How many students are in the class? 2
Enter student 1's score 10
Enter student 2's score 9


## Reading a file with a loop

* Some examples above
* How do we know when there is no more data?
* `readline` returns `''` when it tries to read beyond the end of the file
* Use a `while` loop to detect this

![Loop read algorithm](images/loop-read-algorithm.png)

In [27]:
# Using a loop to read data
scores = open('files/test_scores.txt', 'r')
total = 0
count = 0

line = scores.readline()
while line != '':
    total = total + int(line)
    count = count + 1
    line = scores.readline()
    
print("The average score is", total / count)
scores.close()

The average score is 9.5


## Using the `for` loop to read lines

* Recall that the Python `for` loop _iterates_ over a list of items
* More precisely, `for` can iterate over any _iterable_ object, not only lists
* For a file, the `for` loop can iterate over lines in a file

In [28]:
# Using a for loop to read data
scores = open('files/test_scores.txt', 'r')
total = 0
count = 0

for line in scores:
    total += int(line)
    count += 1
    
print('The average score is', total / count)
scores.close()

The average score is 9.5


## Exceptions

An _exception_ is an error that occurs while a program is running. If it is not handled, the program halts abruptly.

In [30]:
# Division program
def main():
    num1 = int(input('Enter a number:'))
    # Try inputting 0 here
    num2 = int(input('Enter another number:'))
    result = num1 / num2
    print("Result is", result)
    
main()

Enter a number: 10
Enter another number: 0


ZeroDivisionError: division by zero

In [31]:
# Gracefully avoiding the error
def main():
    num1 = int(input('Enter a number:'))
    # Try inputting 0 here
    num2 = int(input('Enter another number:'))
    if num2 != 0:
        result = num1 / num2
        print("Result is", result)
    else:
        print("Cannot divide by 0")
    
main()

Enter a number: 10
Enter another number: 0


Cannot divide by 0


In [32]:
# Bad input error raises a ValueError exception
def main():
    number = int(input("Enter a number"))
    print(number + 1)
    
main()

Enter a number four


ValueError: invalid literal for int() with base 10: 'four'

## Try/Except

* We can handle this error so our program doesn't crash
* We use try/except
* The try block is the statements where an exception may be raised
* The except block is where we handle the exception
    * If we expect a certain type of exception, we can specify it here

## Try/Except Execution

* If a statement in the try block raises an exception named in an except clause, execution jumps to the statements in that clause
* If no except clause matches the exception, the program halts and a traceback is printed out
* If there is no exception, the try clause finishes and except clauses are skipped
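
These rules can be traced with a version of the division program that has one except clause per exception type. This is a sketch, not part of the original notebook: `divide_text` is a made-up helper that takes its inputs as strings instead of calling `input`, so it runs without typing anything.

```python
def divide_text(first, second):
    """Divide two numbers given as strings; return None if that is not possible."""
    try:
        result = int(first) / int(second)
        print('Result is', result)
        return result
    except ValueError:
        # Raised by int() when the text is not a whole number.
        print('ERROR: value must be a number')
    except ZeroDivisionError:
        # Raised by / when the second number is 0.
        print('Cannot divide by 0')
    return None


ok = divide_text('10', '4')            # no exception: except clauses are skipped
bad_value = divide_text('10', 'four')  # jumps to the ValueError clause
zero = divide_text('10', '0')          # jumps to the ZeroDivisionError clause
```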

In [34]:
def main():
    try:
        number = int(input("Enter a number"))
        print(number + 1)
    except ValueError:
        print("ERROR: value must be a number")
    
main()

Enter a number four


ERROR: value must be a number


In [35]:
# Input validation
def main():
    valid_number = False
    while not valid_number:
        try:
            number = int(input("Enter a number"))
            valid_number = True
        except ValueError:
            print("ERROR: Invalid number. Try again")
            
    print(number + 1)
    
main()

Enter a number four


ERROR: Invalid number. Try again


Enter a number three


ERROR: Invalid number. Try again


Enter a number 2


3
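
The same pattern applies to files: opening a file that does not exist in `'r'` mode raises a `FileNotFoundError`, which can be handled like any other exception. The sketch below is an illustration only; `total_scores` and the file names are made up, and the files live in a temporary directory.

```python
import os
import tempfile


def total_scores(path):
    """Return the sum of the numbers in the file, or None if it cannot be opened."""
    try:
        scores = open(path, 'r')
    except FileNotFoundError:
        # open() raises this when 'r' mode is used on a file that does not exist.
        print('ERROR: could not find', path)
        return None

    total = 0
    for line in scores:
        total += int(line)
    scores.close()
    return total


# Hypothetical scratch folder so the demo does not depend on existing files.
folder = tempfile.mkdtemp()
missing_path = os.path.join(folder, 'no_such_file.txt')
scores_path = os.path.join(folder, 'test_scores.txt')

scores_file = open(scores_path, 'w')
scores_file.write('10\n9\n')
scores_file.close()

missing_total = total_scores(missing_path)
found_total = total_scores(scores_path)
print(missing_total)  # None
print(found_total)    # 19
```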
