# MSDM5051 Tutorial 11 - More Utilities in Python II

## Contents:

1. Exception Handling
2. Modules & Packages

---
# 1. Exception Handling

Exceptions are objects specialized for storing information about the occurrence of an error, and they provide a standard interface for users and other programs to deal with it. In Python, the words "error" and "exception" are used almost interchangeably, and they are dealt with in exactly the same way.

## 1.1. Raising exceptions

Exceptions are not only created when there are mistakes in the code; we can also produce them manually. In Python, we use the keyword `raise` to create our own error, e.g.

In [1]:
def my_error_function():
    
    print("This line can be printed normally")              # anything before the raise keyword can be run normally
    
    raise Exception("My exception message", "hello", 123)   # creating an Exception object containing the error message 
                                                            # in fact any arguments can be passed here; they are all printed together
    
    print("This line cannot be run")                        # anything after the raise keyword cannot be run, including the return
    
    return "This line can never be returned"

In [2]:
my_error_function()

This line can be printed normally


Exception: ('My exception message', 'hello', 123)

Whenever an exception is raised, the currently running function terminates immediately and, instead of returning a value, passes the exception object up to its caller. If no handler is found anywhere along the call chain, the interpreter (or our IDE) prints the exception message together with the "traceback", which shows the line on which the exception was raised.
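As a rough illustration of how an uncaught exception travels up the call chain, consider the sketch below. The names `inner`, `middle` and `steps` are made up for this example, and the `try-except` syntax used to stop the exception at the end is introduced in the next section.

```python
steps = []                           # records which lines actually ran

def inner():
    steps.append("inner starts")
    raise Exception("something went wrong")
    steps.append("inner ends")       # never reached

def middle():
    steps.append("middle starts")
    inner()                          # the exception passes straight through here
    steps.append("middle ends")      # never reached either

try:
    middle()
except Exception as ex:
    steps.append("caught: " + str(ex))

print(steps)
# ['middle starts', 'inner starts', 'caught: something went wrong']
```

Neither function returns anything: both are abandoned at the point where the exception appears, and control jumps to the first matching handler.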

## 1.2. Catching exceptions

A function that can only raise exceptions and terminate is pretty useless. Often we would like to take further steps after encountering the exception. In programming jargon, this is called "catching an exception" (the error can suddenly appear from nowhere, so we have to "catch" it). In Python, we use the `try-except` clauses to define what to do after finding the error during a run of a function.

- `try` - For running the tasks we want.

- `except` - When an exception is detected in the `try` clause, the `except` clause will "catch" it. Tasks running in `try` will terminate, and instructions in `except` will be run instead. The exception message will not be printed out.

- `else` - If no exception is raised in the `try` clause, instructions in `else` will run after the `try` clause finishes. If an exception is raised, whether or not it matches an `except` clause, `else` is skipped.

- `finally` - Instructions inside will always run at the end, no matter if any exception is raised or not.

In [3]:
# In practice, it just makes more sense to separate the exception handling from the original function
# By wrapping these instructions into a decorator, we can reuse it on any function we want

def my_exception_handler(func):
    
    def f(*args, **kwargs): 
        
        # Revise decorators if you have forgotten them!
        ################################################                       
        try: 
            result = func(*args, **kwargs)                        # run func() like normal
        
        except Exception as ex:
            print(ex.args)                                        # arguments passed to the Exception are stored in its .args attribute
            return "return this line when something went wrong"   # you can also return anything else
        
        else:
            print("Everything runs smoothly")                     
            return result                                         # return whatever the function should return
            
        finally:
            print("This line always runs at the end, before the function's result return")
        ################################################
            
    return f

In [4]:
@my_exception_handler
def input_must_be_even(num):
    
    if num%2:
        raise Exception(str(num)+ " is not even", "hello", 123)      # this exception will be caught by the except clause
        
    print(num, " is an even number")
    
    return num/2
    

In [5]:
input_must_be_even(2)

2  is an even number
Everything runs smoothly
This line always runs at the end, before the function's result return


1.0

In [6]:
input_must_be_even(3)

('3 is not even', 'hello', 123)
This line always runs at the end, before the function's result return


'return this line when something went wrong'

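The `with` statement is a variation of the `try-finally` clause: it guarantees that the cleanup step of a resource (e.g. closing a file) runs whether or not an exception is raised inside the block.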
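A minimal sketch of this correspondence, using a temporary file so that it can be run anywhere. The two versions are roughly equivalent for file objects; the variable names are chosen only for this example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Version 1: explicit try-finally
f = open(path, "w")
try:
    f.write("hello")
finally:
    f.close()                  # always runs, even if write() raises

# Version 2: the same idea with `with`
with open(path, "a") as g:
    g.write(" world")          # g is closed automatically when the block exits

print(f.closed, g.closed)      # True True

with open(path) as h:
    content = h.read()
print(content)                 # hello world
```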



## 1.3. Built-in exception sub-classes

Python has a lot of built-in exception classes, which cover most of the common errors in normal use. Anyone with Python experience should have seen them at least once, for example:

- `SyntaxError` - E.g. Missing brackets, missing a `:` in the `if`/`for`, etc.

- `IndexError` - E.g. Indexing a list with an index that exceeds the size of the list.

- `TypeError` - E.g. The function input's data type is different from what is expected.

- `RuntimeError` - Raised when an error does not fall into any of the other categories.

- ...

You can find the full list of built-in exceptions [here](https://docs.python.org/3/library/exceptions.html). In fact, most built-in exceptions are derived from the `Exception` base class (The "Exception()" we have been using to construct the exception message in the above examples). We can design a handler that can deal with different types of exceptions differently by stacking multiple `except` clauses:

In [7]:
def my_exception_handler_2(func):
    
    def f(*args, **kwargs): 
        
        ################################################                       
        try: 
            return func(*args, **kwargs)
        
        except ZeroDivisionError:                                # Instructions when catching ZeroDivisionError
            print("You can't divide by 0")
        except (TypeError, ValueError):                          # Instructions when catching either TypeError or ValueError
            print("Only positive numbers, OK?")
        except Exception:                                        # Instructions when catching any other kinds of exception that are
            print("Somethings wrong but I have no idea why")     # derived from the Exception base class    

        ################################################
            
    return f


In [8]:
@my_exception_handler_2
def positive_num_division(a,b):
    
    if a < 0 or b < 0:
        raise ValueError("Both numbers should be larger than 0")
        
    print(a//b)

In [9]:
positive_num_division(12,4)        # Okay
positive_num_division(-1,4)        # ValueError
positive_num_division(9,0)         # ZeroDivisionError
positive_num_division([1,2,3],6)   # TypeError

3
Only positive numbers, OK?
You can't divide by 0
Only positive numbers, OK?


Note that there are a few built-in exceptions that are not derived from the `Exception` base class (they derive directly from `BaseException`) because they serve special purposes. This is by design, so that they are not accidentally caught when we use the general `Exception` class in an `except` clause.

- `KeyboardInterrupt` - Raised when the user manually terminates the program with a special key press (typically Ctrl+C).

- `SystemExit` - Raised by `sys.exit()` when the program requests to terminate.

Normally, when these two exceptions are caught, they should be raised again after any further processing by the exception handling function, or else the program will never truly terminate.
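The class hierarchy and the catch-then-re-raise pattern can be sketched as follows. The helper `run_and_log` and the list `log` are hypothetical names used only for this illustration.

```python
import sys

print(issubclass(ZeroDivisionError, Exception))     # True
print(issubclass(KeyboardInterrupt, Exception))     # False
print(issubclass(SystemExit, Exception))            # False
print(issubclass(SystemExit, BaseException))        # True

log = []

def run_and_log(task):
    try:
        task()
    except Exception as ex:                         # ordinary errors stop here
        log.append("ordinary error: " + type(ex).__name__)
    except (KeyboardInterrupt, SystemExit) as ex:   # special exceptions
        log.append("shutting down: " + type(ex).__name__)
        raise                                       # re-raise so the program can really stop

run_and_log(lambda: 1 / 0)

try:
    run_and_log(lambda: sys.exit(0))
except SystemExit:
    # caught here only so that this demonstration keeps running
    log.append("SystemExit reached the caller")

print(log)
```

A bare `raise` inside an `except` clause re-raises the exception that is currently being handled, with its original traceback.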


## 1.4 `assert`

The `assert` statement is a simple debugging tool, mainly used to ensure that certain conditions or assumptions hold while the program runs (although most people prefer using `print()`). When the condition is not met, an exception of the specific class `AssertionError` is raised.

In [10]:
a = 1
b = 1

assert a+b == 2, "my error msg 1"        # the condition "a+b == 2" is true. Program continues
print(a+b)

assert a+b == 3, "my error msg 2"        # the condition "a+b == 3" is false. Print the error message and terminate
print(a+b)

2


AssertionError: my error msg 2
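Since `AssertionError` is an ordinary exception class, it can be caught like any other. The sketch below uses a made-up function `average` to show an assertion guarding an internal assumption.

```python
def average(values):
    # internal sanity check: callers are expected to pass a non-empty list
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)

try:
    average([])
except AssertionError as ex:
    message = str(ex)

print(message)                # values must not be empty
print(average([1, 2, 3]))     # 2.0
```

Keep in mind that `assert` statements are skipped entirely when Python is started with the `-O` flag, so they are suited to checking the programmer's own assumptions rather than validating user input; for the latter, raising an explicit exception such as `ValueError` is the safer choice.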

## 1.5. Custom Exceptions

We can create our own type of exception by inheriting from the `Exception` base class or any of its sub-classes. Custom exceptions are truly helpful when we are developing frameworks, libraries or APIs that are intended to be used by other developers. With custom exceptions, we can give client programmers more meaningful information about what went wrong.

In [11]:
# Custom exceptions are still classes, so you can always add your own attributes and methods
class isListError(ValueError):
    def __init__(self, a, b):
        super().__init__(a,b)
        self.input1 = a
        self.input2 = b
        self.msg = "This function is for integer input only. Here is your result but please don't do this again."
        
    def divide_one_by_one(self):
        return [self.input1[i]//self.input2[i] for i in range(len(self.input2))]

In [12]:
def my_exception_handler_3(func):
    
    def f(*args, **kwargs): 
        
        ################################################                       
        try: 
            return func(*args, **kwargs)
        
        except isListError as ex:                             # because isListError is inherited from ValueError, 
            print(ex.msg)                                     # it should be checked before the general ValueError check
            print(ex.divide_one_by_one())
             
        except ZeroDivisionError:
            print("You can't divide by 0")
        except (TypeError, ValueError):
            print("Only positive numbers, OK?")
        except Exception:
            print("Somethings wrong but I have no idea why")

        ################################################
            
    return f


In [13]:
@my_exception_handler_3
def positive_num_division(a,b):
    
    if isinstance(a, list) and isinstance(b, list):    # len() would raise TypeError on plain integers
        raise isListError(a,b)
        
    if a < 0 or b < 0:
        raise ValueError("Both numbers should be larger than 0")
        
    print(a//b)

In [14]:
positive_num_division([4,5,6],[1,2,3])

This function is for integer input only. Here is your result but please don't do this again.
[4, 2, 2]


---
# 2. Modules & Packages

Up to this point, you should have at least some experience with popular Python libraries, e.g. `numpy`, `matplotlib`, etc. But what exactly constitutes a library? The answer is that a library is essentially a bundle of `.py` files containing the definitions and statements it uses.


## 2.1. All `.py` are modules

In Python jargon, the `.py` files that make up a library are called **modules**. But technically speaking, every `.py` file we write can be used as a module, even a very simple one. For demonstration, I have created a file `tutorial_11.py` in the same folder as this Jupyter notebook. The file contains the following text:

```python
code = 5051

def get_course():
    print("My course is　MSDM5051")
    
class Student:
    
    def __init__(self, name, grade):
        self.name = name
        self.grade = grade
        
    def study(self):
        print(self.name, " is studying hard. He gets a ", self.grade, " in this course.")
```

Using the `import` keyword, we can access the variables, functions and classes defined in the file:

In [15]:
import tutorial_11                         # module's name must be the same as the .py file.  

print(tutorial_11.code)                    # print the "code" variable defined in the tutorial_11 module

tutorial_11.get_course()                   # run the "get_course()" function defined in the tutorial_11 module

tom = tutorial_11.Student("Tom" , "A+")    # create a "Student" object using the class definition in the tutorial_11 module
tom.study()

5051
My course is　MSDM5051
Tom  is studying hard. He gets a  A+  in this course.


That's it! The file `tutorial_11.py` is now being used as a module. With modules, we can "modularize" our program by putting utility functions into separate files and import them only when needed, making the main program tidy and manageable. 

**Note:** By convention, module names should be short and contain only lowercase letters. Underscores `_` can be used if they improve readability.

## 2.2 Importing a module

We already know that we can use the keyword `import` to import the `.py` module we need. In fact, there are multiple ways to import a module:

- **Import solely the module name** - Equivalent to importing all definitions in the module. In this case we need to prefix each name we want to use with the module's name.

In [None]:
# This is for clearing any definitions in previous cells in jupyter notebook
%reset 

# you need to prefix `tutorial_11.` in front of every variable/function/class defined in the module
import tutorial_11

print(tutorial_11.code)
tutorial_11.get_course()
tutorial_11.Student("Tom" , "A+")

# Also possible to replace the prefix by a shorter name
import tutorial_11 as t11

print(t11.code)
t11.get_course()
t11.Student("Tom", "A+")

- **Import individual objects** - Import the definitions one by one. This is useful when the module contains too many definitions but we only need some specific ones in our task. In this case we don't need to prefix the variable with the module name.

In [None]:
# This is for clearing any definitions in previous cells in jupyter notebook
%reset 

from tutorial_11 import code, get_course        # `code` and `get_course` can be directly called
from tutorial_11 import Student as Stud         # Student object can be created with the shorter name `Stud`

print(code)
get_course()
Stud("Tom", "A+")

- **Import a specific list of objects defined by the module** - If the module file defines a variable `__all__` containing a list of names, importing with `*` will import only the definitions in that list. Otherwise, it will import all definitions whose names do not start with an underscore, as individual objects like in the example above.
  

In [None]:
# This is for clearing any definitions in previous cells in jupyter notebook
%reset 

# If you have this line at the start of tutorial_11.py
# __all__ = ["code", "Student"]

# Then importing with * will only import the definitions of `code` and `Student`
from tutorial_11 import *

print(code)               # code is imported --> ok
Student("Tom", "A+")      # Student is imported --> ok
get_course()              # get_course() is NOT imported --> raise error 
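To try the effect of `__all__` without editing `tutorial_11.py`, the sketch below writes a small throwaway module to a temporary folder and star-imports it into a separate dictionary. The module name `demo_all_module` and its contents are invented for this illustration.

```python
import sys
import tempfile
from pathlib import Path

source = '''
__all__ = ["code", "Student"]

code = 5051

def get_course():
    return "MSDM5051"

class Student:
    pass
'''

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "demo_all_module.py").write_text(source)
    sys.path.insert(0, folder)                 # make the folder importable
    try:
        namespace = {}                         # a fresh namespace to import into
        exec("from demo_all_module import *", namespace)
    finally:
        sys.path.remove(folder)
        sys.modules.pop("demo_all_module", None)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)                                # ['Student', 'code']
```

`get_course` is defined in the module but is missing from the result because it is not listed in `__all__`.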

## 2.3. Deleting a module

Although rarely needed, we can remove an imported module from the current namespace with the `del` statement. Interestingly, this removes access to the imported definitions, but any objects already created from those definitions are not deleted. For example,

In [16]:
del tutorial_11

In [17]:
tutorial_11.code                    # definition of a variable from the module is removed

NameError: name 'tutorial_11' is not defined

In [18]:
tutorial_11.get_course()            # definition of function from the module is removed

NameError: name 'tutorial_11' is not defined

In [19]:
tutorial_11.Student("Amy", "B+")    # cannot use the definition from the module to create new object

NameError: name 'tutorial_11' is not defined

In [20]:
print(tom)                          # however any objects created before removing the module can still be accessed
tom.study()                         # even the methods can still be accessed

<tutorial_11.Student object at 0x000001FAA9469190>
Tom  is studying hard. He gets a  A+  in this course.


## 2.4. Module namespace

To understand why deleting a module leads to such results, we first need to understand the concept of "namespace". A namespace is a container which stores the unique names of all variables that can be accessed by the program. By analogy, we can compare the namespace with the index pages of a book:

> When you are reading a book and come to some terminologies that you have forgotten about, go to the index pages to search for the page number where that terminology was first mentioned, and finally go back to that page to revisit that content.

The namespace of a program functions in a similar way: 

> When the program comes to a variable `a`, it first needs to load the value or definition of `a` into the computation units before any calculation can proceed. So it first goes to the namespace of the code to search for the memory location where `a`'s definition is saved, and then visits that memory location to retrieve the definition of `a`.

Since Python is a heavily object-oriented language, a namespace in Python is realized as a dictionary object in which names are mapped to objects rather than to raw memory addresses (this is one of the reasons why Python is slow). To see the namespace of a program, i.e. the variables we have defined so far, we can use `dir()` or `globals()`:

- `dir()` - Return only the names of the defined variables, as a list.
- `globals()` - Return the names together with the objects they refer to, as a dictionary (it is very long).
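A quick way to compare the two, assuming the code is run at the top level of a script or notebook cell (the variable `course_code` is made up for this example):

```python
course_code = 5051

names = dir()              # a list containing names only
namespace = globals()      # a dictionary mapping each name to its object

print(type(names).__name__, type(namespace).__name__)    # list dict
print("course_code" in names)                            # True
print(namespace["course_code"])                          # 5051
```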

In [106]:
# This is for clearing any definitions in previous cells in jupyter notebook
%reset 

Once deleted, variables cannot be recovered. Proceed (y/[n])?  y


In [1]:
import tutorial_11    # import a module by its name

a = 2                 # create some definitions of variables and functions

def hello():
    my_num = 123
    print(my_num)

In [2]:
# Remember we can use dir() to get a list of available attributes and methods from an object?
# Well in python, a piece of program is also regarded as an object!
print(dir())

['In', 'Out', '_', '__', '___', '__builtin__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__session__', '__spec__', '_dh', '_i', '_i1', '_i2', '_ih', '_ii', '_iii', '_oh', 'a', 'exit', 'get_ipython', 'hello', 'open', 'quit', 'tutorial_11']


On the list we can see all the definitions we made, plus some extra ones:

- `tutorial_11`, `a` and `hello` are the module, variable and function definitions created by us.
- `__builtins__` refers to the module of built-in names (e.g. `print`, `len`), which is loaded by default for every Python program.
- The others are created by the Jupyter notebook for its own bookkeeping, e.g. the input/output history. We can ignore them for now.

Note that from the list, we cannot see those definitions in the module. To see them, we need to call `dir()` on the module:

In [3]:
print(dir(tutorial_11))

['Student', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'code', 'get_course']


When importing solely by module name, we can see that definitions from a module are stored in a different place from those defined in the main program - in jargon, a module has its own namespace, separate from the namespace of the main program. **This is also the reason why you need to prefix the module's name** to use a definition - you first tell the interpreter to go to the namespace of `tutorial_11`, and then search for the `code` variable in it. 

On the contrary, importing definitions as individual objects does not require the prefix because the definitions are added to the namespace of the main program directly. The module is still loaded with its own namespace, but the name `tutorial_11` is not added to the namespace of the main program.

In [107]:
%reset

Once deleted, variables cannot be recovered. Proceed (y/[n])?  y


In [108]:
from tutorial_11 import code, get_course
from tutorial_11 import Student as Stud 

a = 2

def hello():
    my_num = 123
    print(my_num)

In [109]:
print(dir())

['In', 'Out', 'Stud', '__builtins__', '_dh', '_i', '_i108', '_i109', '_ih', '_ii', '_iii', '_oh', 'a', 'code', 'exit', 'get_course', 'get_ipython', 'hello', 'open', 'quit']


This also explains why deleting a module does not affect any object we created using its definitions: `del` only removes the name `tutorial_11` from the namespace of the main program. The object `tom` is a separate entry in that namespace, and it keeps its own reference to the `Student` class, so it remains usable.
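The same behaviour can be reproduced with a standard-library module, which makes the sketch runnable without `tutorial_11.py`. Here `fractions` merely stands in for the tutorial module.

```python
import sys
import fractions

half = fractions.Fraction(1, 2)       # create an object from a class in the module
del fractions                         # remove the module's name from our namespace

print("fractions" in globals())       # False: the name is gone
print("fractions" in sys.modules)     # True: the module itself is still loaded
print(half + half)                    # 1: the existing object still works
print(type(half).__module__)          # fractions: the object still refers to its class
```

In other words, `del` unbinds a name; it does not unload the module or destroy objects that are still referenced elsewhere.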

## 2.5. `if __name__ == "__main__"`

With the concept of namespace, understanding what this line does is easy. `__name__` is a variable created by the Python interpreter for each module when it is loaded. Its value depends on how the code is run:

- If the module is the one being run directly (the entry point of the program), its value is the string `"__main__"`.
- If the module is being imported by another module, its value is the module's name.

In [4]:
print(__name__)

__main__


In [5]:
print(tutorial_11.__name__)

tutorial_11


Therefore you will see a lot of people writing their `.py` programs like the following:

```python
class some_class1():
    pass    # ...

class some_class2():
    pass    # ...


if __name__ == "__main__":
    pass    # execute some actions
``` 

In this way the file can be used both as an imported module and as a standalone program. When the file is imported as a module, `__name__` is the module's name, so only the class and function definitions are executed and the `if` block is skipped. When the file is run directly as the main program, `__name__ == "__main__"` evaluates to `True`, so the actions inside the `if` clause are executed. This is convenient for adding debugging code or for using the module as an executable.
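A slightly more concrete sketch of this layout is shown below, written as the contents of a hypothetical file `greet.py`. The function names are illustrative only.

```python
# contents of a hypothetical file greet.py

def greet(name):
    """Reusable definition: available to anyone who imports this file."""
    return "Hello, " + name

def main():
    """Actions that should only happen when the file is run directly."""
    print(greet("MSDM5051"))

if __name__ == "__main__":
    main()
```

Running `python greet.py` would print the greeting, whereas `import greet` from another file would only define `greet` and `main` without printing anything.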

## 2.6. Package

Finally we come to packages - a package is basically a bundle of modules intended to be distributed to other users as a single project. It is not necessary to understand in detail how to make one if you are only making small single-use projects. If you are interested, you may visit https://packaging.python.org/en/latest/tutorials/packaging-projects/ to read more. 