# File IO and Exceptions
## 1DV501 - Introduction to Programming


Recording of this lecture:
https://www.youtube.com/watch?v=3sqDNuaDKww

Recordings of lecture last year (2020):
https://youtube.com/playlist?list=PLdXitOaYf2HAgjHNCfnosGw08AqzE2c4o


### Sign up for the first Python Test on Friday, October 8.
- Read information in Moodle.
- **Registration is mandatory. Deadline October 5.**
- Registration is now open in Moodle (Scroll down a bit to find it.)


### For students staying in Sweden
- The test will take place at Campus Växjö or at Campus Kalmar.
- We will not allow any student living in Sweden to take the test remotely ...
    - ...except students on the Physics distance program
- Exact time and place for the test will be presented later on.

### For students staying abroad
- Students staying abroad will be given an opportunity to take the Python Test remotely.
- The test will be monitored using Zoom. You will be asked to set up a webcam (or mobile phone) in such a way that you and your computer are in clear view during the test.

- More instructions related to the distance version of the Python Test will be presented later on in Moodle.

## Today

- Working with Files and Directories
- File input/output (IO)
    - Working with text files
    - Working with data files
- Text Processing (Extra material, Assignment 3 preparation)

**Reading instructions:** Chap. 9.3

## Working with files and directories

When programming, we often need to access files and directories on our computer.
- For example, to read data from a file in a certain directory, or to output processed text to a certain file in a certain directory.
- The Python library `os` (operating system) can be used to query and access the file system on your computer.
- For example, to print all the Python (.py) files in a certain directory.
- The file system depends on your operating system (e.g. Mac or Windows) ⇒ The examples we show today can look slightly different on different computers (operating systems).

### The `os` module



- Note: Windows and *nix systems differ in their path separator: `\` on Windows and `/` on *nix systems.

- Relative path -> the path relative to where your program runs.
- Absolute path -> the actual full path of the file system.
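
The difference between a relative and an absolute path can be sketched with `os.path`. The names `data` and `example.txt` below are made up for illustration (the file does not have to exist), and `os.path.join` is used so that the separator matches the operating system the code runs on.

```python
import os

# Build a relative path using the separator of the current operating system.
# "data" and "example.txt" are hypothetical names used only for illustration.
relative = os.path.join("data", "example.txt")

# Resolve it against the current working directory to get an absolute path.
absolute = os.path.abspath(relative)

print("Relative:", relative, "-> is absolute?", os.path.isabs(relative))
print("Absolute:", absolute, "-> is absolute?", os.path.isabs(absolute))
```

The absolute path printed here depends on where the program is started, which is exactly why relative paths can behave differently on different computers.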

---

- *Tip for advanced users: If you're running Windows, it's possible to install WSL (Windows Subsystem for Linux).*

---

In [None]:
import os
path = os.getcwd()
print(path)

In [None]:
os.chdir('figures')

In [None]:
os.getcwd()

In [None]:
os.chdir('..')

In [None]:
os.getcwd()

In [None]:
os.chdir('/Users/frahaa/dev/courses/1DV501')

- The `os` module gives support for queries related to files and directories
- `os.getcwd()`  name of current working directory
    - The virtual machine's start directory in this execution
    - Not same as folder containing this program code
    - Topmost directory inside Visual Studio Code 
- `os.chdir('figures')` change to child directory
- `os.chdir('..')` change to parent directory


In [None]:
import os                       # Operating system module
os.chdir('/Users/frahaa/dev/courses/1DV501')

path = os.getcwd()              # Get current working directory
print("Current dir:", path)     # ... /1DV501

lst = os.listdir(path)  # List files and directories in path directory

for s in lst:
    print(s)       

os.chdir('figures')             # os.chdir() returns None, so ...
subdir = os.getcwd()            # ... ask for the new working directory
print("\nMoved to dir:", subdir)

lst = os.listdir(subdir) # List files and folders in subdir

for s in lst:
    if s.endswith(".py"):     # Print files ending with ".py"
        print(s)              # e.g. time.py, tax.py, shortname.py, quote.py, ...

- `os.listdir(path)` List content (as strings) of directory `path`
- Directory content -> files and directories 
- Hidden entities (e.g. `.vscode` or `.DS_Store`) have names starting with a `.` 
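
Since the examples above depend on a directory that only exists on the lecturer's computer, here is a minimal sketch of the same idea that should run anywhere. It assumes a throwaway directory created with `tempfile`, and all file names in it are made up.

```python
import os
import tempfile

# Create a temporary directory with some made-up content
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.py", "b.txt", ".hidden", "c.py"]:
        open(os.path.join(tmp, name), "w").close()   # Create an empty file
    os.mkdir(os.path.join(tmp, "figures"))

    names = sorted(os.listdir(tmp))   # listdir gives no guaranteed order

# Plain strings, so we filter using string methods
py_files = [n for n in names if n.endswith(".py")]
visible = [n for n in names if not n.startswith(".")]

print(py_files)   # ['a.py', 'c.py']
print(visible)    # ['a.py', 'b.txt', 'c.py', 'figures']
```

Note that nothing in the list of names tells us that `figures` is a directory.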



## Using `os.scandir()`



`os.scandir(path='.')` returns an iterator of `DirEntry` objects, one for each entry in the directory `path`.

https://docs.python.org/3/library/os.html


In [None]:
import os
os.chdir('/Users/frahaa/dev/courses/1DV501')

entries = os.scandir()

for a in entries:
    print('Class:', type(a))
    print('Name:', a.name)
    print('Ends with \'g\':', a.name.endswith('g'))
    print('Is a file:', a.is_file())
    print('Is a dir:', a.is_dir())
    print()

In [None]:
# Example from PDF-slides
import os
os.chdir('/Users/frahaa/dev/courses/1DV501')

def is_hidden(entry):
    return entry.name.startswith(".")

def print_entries(list_of_entries):
    for entry in list_of_entries:
        if entry.is_file() and not is_hidden(entry):
            print("File: ", entry.name, type(entry) )
        elif entry.is_dir() and not is_hidden(entry):
            print("Dir: ", entry.name, entry.path)

path = os.getcwd()
entries = os.scandir(path)  # Iterator over entries of type DirEntry
print_entries(entries)      
print()

os.chdir('..')                # os.chdir() returns None, so ...
parent = os.getcwd()          # ... ask for the new working directory
entries = os.scandir(parent)  # Iterator over entries of type DirEntry
print_entries(entries)        

## Files and dirs, continued ...

Note "Portable Operating System Interface (POSIX)"

### The `os.listdir(...)` approach 

- `os.listdir(path)` -> all file and directory names in directory `path`
- Problem: all names are given as strings -> hard to know if a name refers to a file or a directory
- Suitable approach when you quickly want to find the content of a given directory

### The `os.scandir(...)` approach 

- `os.scandir(path)` -> all files and directories in `path` as `DirEntry` objects
- Each `DirEntry` object `entry` comes with two attributes:
    - `entry.name` -> short local name of file or directory
    - `entry.path` -> the name joined with the `path` given to `os.scandir()` (a full path if `path` is absolute)
- ... and two methods:
    - `entry.is_file()` -> True if `entry` is a file
    - `entry.is_dir()` -> True if `entry` is a directory

- Suitable approach for more complex problems like:
    - List all python files in a given directory
    - Find all sub-directories (transitively)  of a given directory
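
The first problem in the list above, listing all Python files in a directory, could be sketched as follows. The function name `list_python_files` and the file names are our own choices for illustration, and the demo runs on a temporary directory so it does not depend on any particular computer.

```python
import os
import tempfile

def list_python_files(path="."):
    """Return the names of all visible .py files directly inside path."""
    names = []
    with os.scandir(path) as entries:      # Closes the iterator when done
        for entry in entries:
            hidden = entry.name.startswith(".")
            if entry.is_file() and entry.name.endswith(".py") and not hidden:
                names.append(entry.name)
    return sorted(names)

# Try it on a throwaway directory with made-up content
with tempfile.TemporaryDirectory() as tmp:
    for name in ["time.py", "notes.txt", ".secret.py"]:
        open(os.path.join(tmp, name), "w").close()
    os.mkdir(os.path.join(tmp, "figures.py"))   # A directory, not a file
    found = list_python_files(tmp)

print(found)   # ['time.py']
```

The directory named `figures.py` is skipped thanks to `entry.is_file()`, something that is awkward to achieve with the plain strings from `os.listdir()`.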


In [None]:

def count_dirs(path='.'):
    c_ = 0
    entries = os.scandir(path)
    for entry in entries:
        if entry.is_dir():
            #print(entry.name)
            c_ += 1 + count_dirs(entry.path)
    return c_

path = '/Users/frahaa/dev/courses/1DV501'
print(f"The path {path} contains {count_dirs(path)} subdirectories")

In [None]:
path = '/Users/frahaa/dev'
print(f"Dir {path} contains {count_dirs(path)} subdirectories")

- `count_dirs(path)` is a recursive function that visits all subdirectories
- Visits all  subdirectories transitively -> subdirs to subdirs to subdirs ...
- Difficult to handle without recursion (**a perfect example of when to use recursion**)
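
For comparison, the standard library function `os.walk()` performs the directory traversal for us. The sketch below, with the made-up name `count_dirs_walk`, counts subdirectories the same way on a small temporary directory tree. It is an alternative under the assumption that no symbolic links are involved, since the two approaches may treat links differently.

```python
import os
import tempfile

def count_dirs_walk(path="."):
    """Count all subdirectories below path (transitively) using os.walk."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(path):
        count += len(dirnames)     # Subdirectories found directly in dirpath
    return count

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b", "c"))   # Creates a, a/b and a/b/c
    os.makedirs(os.path.join(tmp, "d"))             # Creates d
    total = count_dirs_walk(tmp)

print(total)   # 4
```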


## Reading text from file

- `file =  open(path,"r")` open file `path` for reading (`r`)
- `file` is here an object representing a connection to a file
- `for line in file` -> read from file line by line




In [8]:

path = '/Users/frahaa/dev/courses/1DV501'
path += '/data/holy_grail_script_scene1.txt'
print("Reading from ",path)

file =  open(path,"r")
line_count = 0
for line in file:
    line_count += 1
    print(line)
file.close()
print("Line count: ",line_count)



Reading from  /Users/frahaa/dev/courses/1DV501/data/holy_grail_script_scene1.txt




    [wind]

    [clop clop clop]

KING ARTHUR:  Whoa there!

    [clop clop clop]

SOLDIER #1:  Halt!  Who goes there?

ARTHUR:  It is I, Arthur, son of Uther Pendragon, from the castle of Camelot.

    King of the Britons, defeator of the Saxons, sovereign of all England!

SOLDIER #1:  Pull the other one!

ARTHUR:  I am,... and this is my trusty servant Patsy.  We have ridden the 

    length and breadth of the land in search of knights who will join me in my 

    court at Camelot.  I must speak with your lord and master.

SOLDIER #1:  What?  Ridden on a horse?

ARTHUR:  Yes!

SOLDIER #1:  You're using coconuts!

ARTHUR:  What?

SOLDIER #1:  You've got two empty halves of coconut and you're bangin' 'em 

    together.

ARTHUR:  So?  We have ridden since the snows of winter covered this land, 

    through the kingdom of Mercea, through--

SOLDIER #1:  Where'd you get the coconuts?

ARTHUR:  We found 

 
---


- Ugly printout since each `line` includes a trailing `"\n"` (and `print` adds another one), and the file ends with empty lines.



In [9]:
#path = ...

filelist=[]
file =  open(path,"r")
for x in file:
    filelist.append(x.replace('\n',''))   # x is already a string; drop the line break
file.close()

In [10]:
filelist[3]

'    [clop clop clop]'

In [11]:
file =  open(path,"r")

type(file)

_io.TextIOWrapper

# Input/output modules

https://docs.python.org/3/library/io.html

The io module provides Python’s main facilities for dealing with various types of I/O. There are three main types of I/O: text I/O, binary I/O and raw I/O. These are generic categories, and various backing stores can be used for each of them. A concrete object belonging to any of these categories is called a file object. Other common terms are stream and file-like object.

In [18]:
#file = 'test'
file =  open(path,"r")

for x in file:
    print(x)

file.close()





    [wind]

    [clop clop clop]

KING ARTHUR:  Whoa there!

    [clop clop clop]

SOLDIER #1:  Halt!  Who goes there?

ARTHUR:  It is I, Arthur, son of Uther Pendragon, from the castle of Camelot.

    King of the Britons, defeator of the Saxons, sovereign of all England!

SOLDIER #1:  Pull the other one!

ARTHUR:  I am,... and this is my trusty servant Patsy.  We have ridden the 

    length and breadth of the land in search of knights who will join me in my 

    court at Camelot.  I must speak with your lord and master.

SOLDIER #1:  What?  Ridden on a horse?

ARTHUR:  Yes!

SOLDIER #1:  You're using coconuts!

ARTHUR:  What?

SOLDIER #1:  You've got two empty halves of coconut and you're bangin' 'em 

    together.

ARTHUR:  So?  We have ridden since the snows of winter covered this land, 

    through the kingdom of Mercea, through--

SOLDIER #1:  Where'd you get the coconuts?

ARTHUR:  We found them.

SOLDIER #1:  Found them?  In Mercea?  The coconut's tropical!

ARTHUR:  Wh

- The ugly printout is fixed by using `print(line.strip())` -> removes the trailing `"\n"` (and any other leading or trailing whitespace)
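
A small sketch of what stripping does to one line of the script. Using `rstrip("\n")` is our suggestion for the case where the indentation should be kept. It is not part of the original example.

```python
line = "    [clop clop clop]\n"     # One line as it is read from the file

print(repr(line.strip()))           # '[clop clop clop]'      (indentation lost)
print(repr(line.rstrip("\n")))      # '    [clop clop clop]'  (indentation kept)
```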



In [19]:

path = '/Users/frahaa/dev/courses/1DV501'
path += "/data/holy_grail_script_scene1.txt"

file =  open(path,"r")
full_text = ""
for line in file:
    full_text += line
file.close()
print(full_text)



    [wind]
    [clop clop clop]
KING ARTHUR:  Whoa there!
    [clop clop clop]
SOLDIER #1:  Halt!  Who goes there?
ARTHUR:  It is I, Arthur, son of Uther Pendragon, from the castle of Camelot.
    King of the Britons, defeator of the Saxons, sovereign of all England!
SOLDIER #1:  Pull the other one!
ARTHUR:  I am,... and this is my trusty servant Patsy.  We have ridden the 
    length and breadth of the land in search of knights who will join me in my 
    court at Camelot.  I must speak with your lord and master.
SOLDIER #1:  What?  Ridden on a horse?
ARTHUR:  Yes!
SOLDIER #1:  You're using coconuts!
ARTHUR:  What?
SOLDIER #1:  You've got two empty halves of coconut and you're bangin' 'em 
    together.
ARTHUR:  So?  We have ridden since the snows of winter covered this land, 
    through the kingdom of Mercea, through--
SOLDIER #1:  Where'd you get the coconuts?
ARTHUR:  We found them.
SOLDIER #1:  Found them?  In Mercea?  The coconut's tropical!
ARTHUR:  What do you mean?
SOLDIER 

---

- We first store entire text in a string (including linebreaks)

- Reading text is easy, just remember:
    - We read the text line by line,
    - Each line also includes a final `"\n"`, and
    - Empty lines are also included.

---


- It is important to close the file connections (`file.close()`) once reading/writing is done.
- A non-closed connection might cause problems later on when you try to access a file.


## Writing text to a file



In [None]:

path = '/Users/frahaa/dev/courses/1DV501/output.txt'
full_text = 'It is I, Arthur, son of Uther Pendragon, from the castle of Camelot. King of the Britons, defeator of the Saxons, sovereign of all England!'

file = open(path,"w")
file.write(full_text)
file.close()


In [None]:
file = open(path,'r')
for line in file:
    print(line.strip())
file.close()

- Write entire text to file.

- Result: Text in file has same formatting as `full_text`.



In [None]:
lines = ["do\n","re\n","mi\n","fa\n","so\n","la\n"]

file = open(path,"w")
file.writelines(lines)
file.close()


In [None]:
file = open(path,'r')
for line in file:
    print(line.strip())
file.close()

- We write text line by line to file.
- Result: do,re,mi,fa,so,la as six separate lines.

- Writing text is also easy, just remember to handle the line breaks.
---

### Recommendations
- Always look at the content of the file you are about to read to understand how it is organized
- Always open the output file when writing to a file to inspect the result

## Reading and Writing text - Summary

We use `open(...)` to make a file connection
-  `open(path,"r")` -> open file for reading. The program will crash if the file doesn't exist (or is read protected)
-  `open(path,"w")` -> open file for writing. The file will be created if it doesn't exist, or replaced if it does exist.
-  `open(path,"a")` -> open file for appending -> add new text at the end of a file. The file will be created if it doesn't exist, or appended if it does exist.
-  Default is `"r"` -> `open(path)` means open file for reading

---

- `file` in `file = open(...)` is a file object. File object usage:

-  `for line in file:` -> read one line at the time
-  `full_text = file.read()` -> read entire file content
-  `file.write(full_text)` -> write entire text 
-  `file.writelines(lines)` where `lines` is a list of strings -> write line by line (but not adding any linebreaks) 
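
The summary above can be tried out with the following sketch. It assumes a temporary directory and the made-up file name `demo.txt`, so that no existing file is touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")     # Made-up file name

    file = open(path, "w")                   # Create (or replace) the file
    file.write("do\n")
    file.close()

    file = open(path, "a")                   # Append at the end
    file.writelines(["re\n", "mi\n"])        # We supply the line breaks ourselves
    file.close()

    file = open(path)                        # Default mode is "r"
    appended = file.read()                   # Entire content as one string
    file.close()

    file = open(path, "w")                   # "w" again -> old content is replaced
    file.write("fa\n")
    file.close()

    file = open(path)
    replaced = file.read()
    file.close()

print(repr(appended))   # 'do\nre\nmi\n'
print(repr(replaced))   # 'fa\n'
```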


## Safe file handling with `with-as`



In [None]:
path = '/Users/frahaa/dev/courses/1DV501/output3.txt'

#with open(path, "r") as file:
#   for line in file:
#       print( line.strip() )     
        

# Safe file writing 

with open(path, "a") as file:
   file.write("First line to add\n")
   file.write("Last line to add\n")

- `with` and `as` are two Python keywords
- The `with-as` statement includes file closing and was introduced to make sure that an open file is always closed (no matter what happens)
- **Although a bit cryptic, it is the recommended approach to open a file.**
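
To see that `with-as` really closes the file, we can inspect the attribute `file.closed` afterwards. The sketch below assumes a temporary directory, and it uses a `try`-`except` block (presented later in this lecture) to provoke an error inside the `with` block.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output3.txt")

    with open(path, "w") as file:
        file.write("First line to add\n")
    closed_after_block = file.closed       # True: closed when the block ends

    try:
        with open(path, "r") as file:
            raise RuntimeError("Something went wrong while reading")
    except RuntimeError:
        closed_after_error = file.closed   # True: closed despite the error

print(closed_after_block, closed_after_error)   # True True
```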


# Runtime Errors



In [None]:
def div(a,b): 
   return a/b

def m(a,b): 
   return div(a,b)

# Program starts
print( m(5,0) )

print('do fancy stuff')

## Error message interpretation:

1. From a call `print( m(5,0) )` in line 8
2. via a call `div(a,b)` at line 5
3. we had a `ZeroDivisionError` in line 2.

Hence, the error message not only points out where the error occurred, it also describes the execution trace leading up to the error.


## A first look at error handling



In [None]:
a = [0,1,2]

In [None]:
a[10]

In [None]:
# Returns a given element in a list
def get_element_at(lst,index):
    if 0 <= index < len(lst):
        return lst[index]
    else: 
        return -99  # What else am I supposed to do?

# Program starts
a = list( range(10) )
n = get_element_at(a,15) # Index out of range

In [None]:
get_element_at(a,10)

- The function `get_element_at(lst,index)` returns -99 when used with an index out of range. Is this really the best way to handle a detected error? Or should we let the program crash?

- In general, how do we handle errors due to an incorrect use of a function?
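
One way to see the problem with a special return value like -99: the caller cannot tell an error from a valid element with the same value. A minimal sketch, where the list `values` is made up for illustration:

```python
# Returns a given element in a list, or -99 if the index is out of range
def get_element_at(lst, index):
    if 0 <= index < len(lst):
        return lst[index]
    else:
        return -99

values = [-99, 5]                      # -99 is a perfectly valid element here
valid = get_element_at(values, 0)      # Correct use, returns the element -99
faulty = get_element_at(values, 7)     # Index out of range, also returns -99

print(valid, faulty)                   # -99 -99
```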


## Exceptions - A first example



In [None]:
x = 5 * y

In [None]:
try:
  x = 5*y            # y is not defined
  print(" x =", x)
except NameError:
  print("An exception occurred")

- `try` and `except`  are two Python keywords used for exception handling
- Errors occurring in the `try` block can be handled in the `except` block
- Using this approach we can avoid ugly traceback printouts.
- We can also decide to take some action (e.g. try again) when an error occurs.  


## Another exception example 



In [None]:
def div(a,b):
    return a/b  # Error if b = 0

def m(a,b): return div(a,b)

# Program starts
try: 
    x, y = 1e100,0
    result = m(x,y)    # Named result to avoid overwriting the function div
    print(f"{x} divided by {y} is {result}")
except ZeroDivisionError:
    print(f'{m(x,y+1e-90)}') # Retry with a tiny non-zero divisor instead


- Errors occurring due to code or calls executed in the `try` block 
- ... can be handled in the enclosing `except` block
- The execution jumps directly from the error (in function `div(a,b)`) to the `except` block -> `print(f"{x} ...")` is not executed.

## One more exception example

In [None]:
repeat = True
while repeat:
    x = int( input("Enter integer x: "))
    y = int( input("Enter integer y: "))
    try: 
        result = x/y       # Error if y = 0
        print(f"{x} divided by {y} is {result}")
        repeat = False
    except ZeroDivisionError:
        print("Dividing by zero, try again ...\n")

- The program will keep asking the user for input as long as y is zero. 
- Each time zero is entered for y, the error message `Dividing by zero, try again ...` will be displayed.
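
Note that `int(...)` itself raises a `ValueError` if the user enters something that is not an integer. A possible extension, sketched below, moves the conversions into the `try` block and handles both error types. To make the sketch runnable without a keyboard, the hypothetical helper `next_answer` stands in for `input` and returns simulated answers.

```python
# Simulated user answers (stand-in for typing at the keyboard)
simulated_input = iter(["abc", "7", "0", "7", "2"])

def next_answer(prompt):
    # Hypothetical replacement for input(prompt)
    return next(simulated_input)

attempts = 0
repeat = True
while repeat:
    attempts += 1
    try:
        x = int(next_answer("Enter integer x: "))   # ValueError if not an integer
        y = int(next_answer("Enter integer y: "))
        result = x / y                              # ZeroDivisionError if y = 0
        print(f"{x} divided by {y} is {result}")
        repeat = False
    except ValueError:
        print("Not an integer, try again ...")
    except ZeroDivisionError:
        print("Dividing by zero, try again ...")
```

With the simulated answers above, the first attempt fails on `"abc"`, the second on division by zero, and the third succeeds with 7 divided by 2.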

## Raising exceptions



In [None]:

# A function with error handling
def get_element_at(lst,p):
    if 0 <= p < len(lst):
        return lst[p]   # Return the element, not the whole list
    else: 
        err_msg = f"Do it again, do it right [0,{len(lst)-1}]"
        raise IndexError(err_msg)  # We raise an exception

# Program starts
try:
    a = list( range(10) )
    n = get_element_at(a,15) # Index out of range
except IndexError as e:
    print("An error has occurred!")
    print(type(e)," ==> ",e)

In [None]:
get_element_at(a,15)

## Raising exceptions - A tedious example

In [None]:
def input_odd_int():
    s = input("Enter an odd integer: ")
    try:
        n = int(s)
    except ValueError: 
        raise ValueError("The input must be an integer!")
    if n%2 == 0:
        raise ValueError("The integer must be odd!")
    return n
    
# Program starts
try:
    n = input_odd_int()
    print("A valid input is:", n)
except ValueError as e:
    print(type(e),"==>",e)

In [None]:
input_odd_int()

- A function that raises a `ValueError` if input is a non-integer or an even number.
- Do **not** use this approach in the assignments unless explicitly asked to.

## Exceptions: Basics


- Python handles all errors and abnormal conditions using *exceptions*.
- An exception is an object that encapsulates information about an error.
- Error -> program *raises* (or *throws*) an exception (e.g., `raise IndexError(err_msg)`)
    - execution halts immediately
    - call stack is unwound until an appropriate enclosing exception handler is found (e.g., `except IndexError as e:`).
- No enclosing exception handler
    - The virtual machine catches exception, abruptly terminates program, and prints a stack trace
---

### Advantages


- Uniform handling of all abnormal conditions

**Separation of responsibilities:**

- The programmer identifies problems and raises exceptions.
- The client (or user) determines how to handle the problem
    - (ignore and continue, recover, try again, exit, ...).


- Here we talk about a **programmer** responsible for the development of a software component, and a **user** or **client** (most likely also a programmer) that uses the component. 

## Background

- The programmer can't know how a user wants to deal with an error.
- Different users and situations require different types of error handling. 


**An Unspoken Contract**

- The programmer is responsible for identifying errors and notifying the user by raising an exception.
- The user/client decides how to handle the exception.


**Example:** The function `get_element_at(lst,index)`

- The programmer finds the faulty index (outside the range)
    - `raise IndexError("Index out of range: " + str(index))`


- The function user can (if he/she likes) catch and handle the error


In [None]:
try:
    a = list( range(10) )
    n = get_element_at(a,15) # Index out of range
except IndexError as e:
    print("An error has occurred!")
    print(type(e)," ==> ",e)
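
The contract can be illustrated with two different clients of the same function. The client functions `last_if_missing` and `collect` are hypothetical and only show that each caller is free to choose its own way of handling the same `IndexError`.

```python
def get_element_at(lst, index):
    if 0 <= index < len(lst):
        return lst[index]
    raise IndexError("Index out of range: " + str(index))

data = list(range(10))

# Client 1: recovers by falling back to the last element
def last_if_missing(lst, index):
    try:
        return get_element_at(lst, index)
    except IndexError:
        return lst[-1]

# Client 2: remembers the problem and continues with the next index
def collect(lst, indices):
    found, problems = [], []
    for i in indices:
        try:
            found.append(get_element_at(lst, i))
        except IndexError as e:
            problems.append(str(e))
    return found, problems

print(last_if_missing(data, 15))      # 9
print(collect(data, [2, 15, 4]))      # ([2, 4], ['Index out of range: 15'])
```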


## Handling multiple types of exceptions

We might have several `except` blocks. Each one handling a specific type of error.


```python

try:
    ...
except IndexError as e:
    "Do something with e"
except ValueError as e:
    "Do something with e"
except Exception as e:
    "Do something with e"
finally: # Always executed
    "Save what is possible. Close database connections, networks and so on"
    

```

- Repeated `except` -> the first suitable is used.
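- `Exception` is the base class of almost all built-in exceptions (a few, such as `KeyboardInterrupt` and `SystemExit`, are not included) -> handles nearly everything
- The `finally` block is always executed, whether or not an exception occurred. It is mainly used for cleanup, for example to save what can be saved before the program crashes.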
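
A runnable sketch of the pattern above. The function `classify` is made up for illustration: it runs a given action and records which block handled it. The actions are written as small `lambda` functions, a compact way to define a function in a single expression.

```python
def classify(action):
    log = []
    try:
        action()
        log.append("no error")
    except IndexError:
        log.append("IndexError")
    except ValueError:
        log.append("ValueError")
    except Exception as e:                 # Everything not handled above
        log.append("other: " + type(e).__name__)
    finally:                               # Always executed
        log.append("finally")
    return log

print(classify(lambda: [1, 2, 3][10]))   # ['IndexError', 'finally']
print(classify(lambda: int("abc")))      # ['ValueError', 'finally']
print(classify(lambda: 1 / 0))           # ['other: ZeroDivisionError', 'finally']
print(classify(lambda: None))            # ['no error', 'finally']
```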



## Built-in Errors to choose from

There are a number of built-in errors to choose from:

- IndexError is raised when trying to access an item at an invalid index.
- ImportError is raised when an `import` statement fails, for example when a module or a name in a module can not be found.
- TypeError is raised when an operation or function is applied to an object of an inappropriate type.
- NameError is raised when a name (e.g. a variable or a function) could not be found.
- ZeroDivisionError is raised when the second operand in a division (the divisor) is zero.
- ValueError is raised when a function's argument has the right type but an inappropriate value.
- Exception handles almost all types of exceptions -> catches nearly everything
- ... and many more.
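
To get a feeling for these error types, the sketch below provokes each of them with a deliberately faulty operation. The module name `no_such_module_1dv501` and the variable `undefined_name` are made up and assumed not to exist.

```python
import importlib

# One deliberately faulty operation per error type
triggers = {
    "IndexError": lambda: [1, 2, 3][5],
    "ImportError": lambda: importlib.import_module("no_such_module_1dv501"),
    "TypeError": lambda: "abc" + 5,
    "NameError": lambda: undefined_name,
    "ZeroDivisionError": lambda: 1 / 0,
    "ValueError": lambda: int("abc"),
}

raised = {}
for expected, trigger in triggers.items():
    try:
        trigger()
    except Exception as e:
        raised[expected] = type(e)     # Remember which class was raised
        print(f"{expected:18} {type(e).__name__}: {e}")
```

The failing import is reported as a `ModuleNotFoundError`, which is a more specific kind of `ImportError`.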

---

Hence, when you want to raise an exception, select a suitable error type, and put together a suitable error message. Then just `raise ErrorType(err_msg)`.

Always position `except Exception` last if you are catching multiple error types.
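
The reason for this rule can be seen in the following sketch with two made-up functions. Python accepts both orderings, but in `wrong_order` the general handler comes first and the specific handler is never reached.

```python
def wrong_order():
    try:
        return [1, 2, 3][10]
    except Exception:                  # Matches IndexError too ...
        return "handled by Exception"
    except IndexError:                 # ... so this block is never reached
        return "handled by IndexError"

def right_order():
    try:
        return [1, 2, 3][10]
    except IndexError:                 # Specific error type first
        return "handled by IndexError"
    except Exception:                  # General fallback last
        return "handled by Exception"

print(wrong_order())   # handled by Exception
print(right_order())   # handled by IndexError
```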


## Handling IO Errors



In [None]:
import os

# Safe file reading handling IOErrors

path = os.getcwd()
path += "/temp/holly_gräil.txt" 
try:
    with open(path, "r") as file:
        for line in file:
            print( line.strip() )
except IOError as e:
    print(type(e),"==>",e)
    print("No such file: ",path)

- File IO often results in errors. For example, reading from a non-existing (or read protected) file.
- Thus, enclosing all critical file IO operations with a try-except block is a very common programming pattern.
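
In Python 3, `IOError` is another name for `OSError`, and a missing file raises the more specific `FileNotFoundError`. The sketch below uses this to treat a missing file separately from other IO problems. The function `read_lines` and the file name are made up for illustration.

```python
import os
import tempfile

def read_lines(path):
    """Return the lines of a text file, or an empty list if it cannot be read."""
    try:
        with open(path, "r") as file:
            return [line.rstrip("\n") for line in file]
    except FileNotFoundError:
        print("No such file:", path)
    except OSError as e:               # Any other IO problem, e.g. read protection
        print(type(e), "==>", e)
    return []

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "no_such_file.txt")   # Never created
    lines = read_lines(missing)

print(lines)                 # []
print(IOError is OSError)    # True
```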

## Exceptions summary

- By enclosing error prone code with a try-except block we can catch and handle errors.
- By raising exceptions we can inform a user of an error
- Basic idea: The programmer is responsible for identifying errors and to notify the user by raising an exception. The user/client decides how to handle the exception.

### Are exceptions important?

- Exception handling is very important in larger (commercial) systems since we don't want our customers to experience an ugly stack trace due to an unhandled exception. A commercial system should never crash.
- It is less important for smaller projects where the users and the programmers are often the same group of people. After a crash we simply try to fix the problem and run the program again.
- Python is very liberal when it comes to exception handling. The programmer decides when and if to handle a potential exception. Other languages (like Java) are much stricter: certain operations (like file IO) must include exception handling.
