<h1><center> PPOLS564: Foundations of Data Science </center></h1>
<h3><center> Lecture 9 <br><br><font color='grey'> Exception Handling and Classes </font></center></h3>

# Exception Handling

Python comes with many different errors ([See here for a list of all errors](https://www.programiz.com/python-programming/exceptions)). Often we want our functions to break when they encounter an error; however, at other times, we want to be able to control what our code does when particular errors are encountered. This is where exception handling comes into play. 

Consider the following function, which coerces some input `x` into a float value:

In [1]:
def change(x):
    new_x = float(x)
    return new_x
change("22")

22.0

When I try the same code on a string that doesn't look like a number, I get a `ValueError`. That is, `float()` accepts strings, but it doesn't know how to convert this particular one. I gave it the right type but the wrong value, and it yelled at me. 

In [28]:
change("bus")

ValueError: could not convert string to float: 'bus'

### `try-except`

We can control what Python does when particular errors occur using the `try` and `except` keywords, which behave as follows:

```python
try:
    <try this code chunk>
except ParticularError:
    <if you get this ParticularError, do this>
```

In [29]:
def change(x):
    try:
        new_x = float(x)
    except ValueError: 
        print("I can't do anything with this input. Try a different value")
        new_x = None
    return new_x

change("bus")

I can't do anything with this input. Try a different value


As noted, there are [many types of errors](https://www.programiz.com/python-programming/exceptions), meaning there are many ways in which our code can break. 

For example, let's feed a list to our change function.

In [8]:
change([1,2,3])

TypeError: float() argument must be a string or a number, not 'list'

I get a `TypeError` because I fed the `float()` constructor the wrong type. We can handle this by either (a) chaining multiple `except` clauses, one per error type, or (b) listing multiple error types in a single `except` clause.

In [9]:
def change(x):
    try:
        new_x = float(x)
    except ValueError: 
        print("I can't do anything with this input. Try a different value")
        new_x = None
    except TypeError: 
        print("I can't do anything with this input. Try a different value")
        new_x = None        
    return new_x
change([1,2,3])

I can't do anything with this input. Try a different value


In [10]:
# OR
def change(x):
    try:
        new_x = float(x)
    except (ValueError,TypeError): 
        print("I can't do anything with this input. Try a different value")
        new_x = None
    return new_x
change([1,2,3])

I can't do anything with this input. Try a different value


We can capture the error itself as an object using `as`, which lets us inspect or print its message.

In [19]:
def change(x):
    try:
        new_x = float(x)
    except (ValueError,TypeError) as e: 
        print(f'Got this error: {e}')
        new_x = None
    return new_x

change([1,2,3])
change("this")

Got this error: float() argument must be a string or a number, not 'list'
Got this error: could not convert string to float: 'this'
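
The object bound by `as` is an ordinary exception instance, so beyond printing it we can inspect it. The following is a minimal sketch; the helper name `describe_error` is hypothetical and introduced only for illustration.

```python
def describe_error(x):
    """Try to convert x to float; on failure return the error's class name and message."""
    try:
        return float(x)
    except (ValueError, TypeError) as e:
        # type(e).__name__ is the exception's class name; str(e) is its message
        return type(e).__name__, str(e)

print(describe_error("3.5"))
print(describe_error("this"))
print(describe_error([1, 2, 3])[0])
```

Returning the class name alongside the message makes it easy to see which of the listed error types was actually triggered.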


### `raise`-ing errors

Errors can be re-raised after being captured using the `raise` keyword.

In [21]:
def change(x):
    try:
        new_x = float(x)
    except (ValueError,TypeError) as e: 
        print(f'Got this error: {e}')
        raise
    return new_x

change([1,2,3])

Got this error: float() argument must be a string or a number, not 'list'


TypeError: float() argument must be a string or a number, not 'list'

We can even have our code raise specific error types when need be.

In [23]:
def error_prone(x):
    if x == 1:
        raise ValueError()
    elif x == 2:
        raise IndentationError()
    else:
        print(x)

error_prone(1)    

ValueError: 

In [24]:
error_prone(2)

IndentationError: None (<string>)
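
Raised errors are more useful when they carry a message explaining what went wrong. As a small sketch (the function `check_positive` is a hypothetical example, not part of the lecture code), we can pass a string to the error type when raising it and read it back when the error is captured.

```python
def check_positive(x):
    """Return x unchanged if it is positive; otherwise raise a ValueError with a message."""
    if x <= 0:
        raise ValueError(f"expected a positive number, got {x}")
    return x

message = None
try:
    check_positive(-3)
except ValueError as e:
    # The string passed to ValueError(...) becomes the error's message
    message = str(e)
    print(f"Got this error: {message}")
```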

### Common errors

- **`IndexError`**: Raised when the index of a sequence is out of range.

In [25]:
x = [1,2,3,4]
x[5]

IndexError: list index out of range

- **`ValueError`**: Raised when a function gets an argument of the correct type but an improper value.

In [27]:
int("car")

ValueError: invalid literal for int() with base 10: 'car'

- **`KeyError`**: Raised when a key is not found in a dictionary.

In [31]:
y = {'A':4,'B':6,'C':7}
y['D']

KeyError: 'D'

- **`TypeError`**: Raised when a function or operation is applied to an object of incorrect type.

> **Note**: Avoid protecting against TypeErrors. Doing so goes against the dynamic typing framework in Python. If your function can't work with a specific data type, it's better that Python throws an error. 

In [32]:
int([1,2,3])

TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'

### The Errors that refuse to be captured

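Note that there are **some errors that _cannot_ (or should not) be captured** as they are due to programming errors (i.e. we structured our code wrong). `SyntaxError` and `IndentationError` are raised while Python is still parsing the code, before any `try` block in that same code has a chance to run, so they cannot be captured there. `NameError` is raised at runtime and technically can be captured, but doing so only hides a bug that should be fixed. 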

Examples of these:

- `NameError`: Raised when a variable is not found in local or global scope.

In [11]:
d

NameError: name 'd' is not defined

- `SyntaxError`: Raised by parser when syntax error is encountered.

In [12]:
def x()

SyntaxError: invalid syntax (<ipython-input-12-5d5dd44867a2>, line 1)

- `IndentationError`: Raised when there is incorrect indentation.

In [15]:
def x():
pass

IndentationError: expected an indented block (<ipython-input-15-008ad155395e>, line 2)
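
One nuance is when each of these errors surfaces. The sketch below assumes we hand Python a broken piece of source code as a string via the built-in `compile()`; the names `broken_source`, `name_error_message`, and `caught_syntax_error` are illustrative only. A `NameError` occurs while the code is running, whereas a `SyntaxError` occurs while code is being parsed, so it can only be intercepted when the broken code is parsed separately from the code doing the intercepting.

```python
name_error_message = None
caught_syntax_error = False

# NameError is raised at runtime, when the unknown name is looked up
try:
    undefined_variable
except NameError as e:
    name_error_message = str(e)
    print(f"Caught at runtime: {e}")

# A SyntaxError written directly in this file would stop Python before any line runs.
# Compiling the broken code from a string defers the parsing to runtime.
broken_source = "def x()"
try:
    compile(broken_source, "<example>", "exec")
except SyntaxError as e:
    caught_syntax_error = True
    print(f"Caught while compiling a string: {type(e).__name__}")
```

In everyday code, the right response to all three errors is still to fix the code rather than to capture the error.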

### `try-finally`

```python
try:
    <try this>
finally:
    <even if the try block fails, still run this.>
```

In [45]:
def change2(x):
    try:
        new_x = float(x)
    except ValueError:
        new_x = 1        
    finally:
        print("great")
    print(new_x)
    
change2(3)
change2("bus")

great
3.0
great
1
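
`finally` also works without an `except` clause. In that case the error is not handled, but the `finally` block still runs before the error propagates. A minimal sketch, assuming a hypothetical function `risky_divide` and a list `log` that stands in for a cleanup step such as closing a file:

```python
log = []

def risky_divide(a, b):
    """Divide a by b, recording a cleanup step whether or not the division fails."""
    try:
        return a / b
    finally:
        # Runs on success and on failure, before the function exits
        log.append(f"finished {a}/{b}")

print(risky_divide(6, 3))

try:
    risky_divide(1, 0)
except ZeroDivisionError:
    print("division failed, but the finally block still ran")

print(log)
```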


---

# Classes 

A class defines the structure and behavior of an object at the time the object is created. An object's class controls how it is initialized and which attributes are available through it. Classes can make complex problems tractable, but they can also make simple solutions overly complex, so we need to strike a balance when using them. 

We define a class using the `class` keyword, a built-in statement that binds the class-level code to the class name. The convention in Python is to name classes in "CamelCase" (e.g. `DataWrangler`).

In [1]:
class DataWrangler:
    pass 

We create an **instance** of the class by calling its **constructor** (which we just created).

In [3]:
DW = DataWrangler()
type(DW) # It is of type 'DataWrangler'

__main__.DataWrangler

#### What is going on here?

Recall that when we write a function using the `def` keyword, we are binding the code contained in that function's code chunk to the specified name.
```python
def this_name(x):
    return x**2
```
binds the code `x**2` to the name `this_name`.


Likewise, we can do this with larger, more complex chunks of code using classes. 

```python
class MyClass:

    def func_1(self):
        pass

    def func_2(self):
        pass

    def func_n(self):
        pass
```

**Classes** offer a way of bundling whole systems of code into an object. In essence, they let us create our own object types with their own methods (internal functions) and attributes (data stored on the object). Put simply, a class is a logical grouping of data and functions.

In [52]:
class DataWrangler:
    
    def __init__(self):
        pass  # nothing to set up yet

    def say_hello(self):
        print('Hello!')

In [53]:
DW = DataWrangler()
DW.say_hello()

Hello!


A `class` is a "blueprint" for what we'd like our object to look like, but we don't "create" the object when Python reads in the class code. Rather, we do so when we create an **instance** of the class, i.e. call our class constructor `DataWrangler()` and assign the result to some name, `DW`.

For a nice post on classes, see [Jeff Knupp's post on the topic](https://jeffknupp.com/blog/2014/06/18/improve-your-python-python-classes-and-object-oriented-programming/).

## `class` features

- **`self`** &rarr; the _instance_ of the class.
- **`__init__`** &rarr; allows us to bind data to the instance when initializing the object.
- **instance method** &rarr; a function defined within a class that takes `self` as its first argument, so it runs on a specific instance.
- **method** &rarr; a function defined within a class that does not take `self` as an argument, so it can be called directly on the class without first creating an instance.

#### `self`

When we initialize our class object, we create an instance of it. `self` offers us a way of referencing that instance.

In [60]:
class DataWrangler:

    def say_hello(self,word):
        print(word)
        
DW = DataWrangler()
DW.say_hello(word="Cat")

Cat


This is equivalent to calling the function on the class and passing the instance in explicitly as `self`:

In [63]:
DataWrangler.say_hello(DW,word="Cat")

Cat


#### `__init__`

`__init__` lets us store data on the instance when it is first created and share that information with the other methods in the class. That is, when we first create the object, we can store initial information (here, `word`) on `self`, and every other method can then read it. The `__init__` method is known as the "initializer".

In [75]:
class DataWrangler:

    def __init__(self,word=''):
        self.word = word
    
    def say_hello(self):
        print(self.word)
        
DW = DataWrangler(word="Cat")
DW.say_hello()

Cat


We can access the data (variables) we assign to the instances and overwrite them just as we would any other object.

In [76]:
DW.word

'Cat'

In [77]:
DW.word = "Dog"
DW.say_hello()

Dog


#### instance method

An instance method is a function that requires an instance of the class in order to run. Put differently, it takes `self` as its first argument. This gives the function access to all the data and values contained within a specific instance, which can be a convenient and powerful way to pass around information.

The `say_hello()` function is an example of such a method.
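
Because each instance carries its own data, the same instance method can give different results depending on which instance it is called on. A short sketch, using a trimmed variant of the class in which `say_hello()` returns the word rather than printing it (an assumption made here so the two results are easy to compare):

```python
class DataWrangler:

    def __init__(self, word=''):
        self.word = word

    def say_hello(self):
        # self refers to whichever instance the method was called on
        return self.word

cat = DataWrangler(word="Cat")
dog = DataWrangler(word="Dog")
print(cat.say_hello())
print(dog.say_hello())
```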

#### method

A method, in the sense used here, is a function that does not require an instance to run. For example, see the function `add()` in the class that follows: we call it directly on the class, without first creating an object. 

In [72]:
class DataWrangler:

    def __init__(self,word=''):
        self.word = word
    
    def say_hello(self):
        print(self.word)
        
    def add(x,y):
        return x + y
    
DataWrangler.add(2,3)

5
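
One consequence worth seeing: a function written without `self` works when called on the class, but not when called on an instance, because Python passes the instance in as an extra first argument. The sketch below uses a stripped-down version of the class, and the flag `failed` is illustrative only.

```python
class DataWrangler:

    def add(x, y):
        return x + y

print(DataWrangler.add(2, 3))   # called on the class: works

DW = DataWrangler()
failed = False
try:
    DW.add(2, 3)                # called on an instance: DW is passed in as well
except TypeError as e:
    failed = True
    print(f"Got this error: {e}")
```

The `TypeError` is captured here only to display the message; in practice this error signals that the function should either take `self` or be called on the class.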

#### class objects 

In [90]:
class DataWrangler:
    
    new_word = "!"

    def __init__(self,word=''):
        self.word = word
    
    def say_hello(self):
        print(self.word + DataWrangler.new_word)
        
    def add(x,y):
        return x + y
    
DW = DataWrangler(word="hello")
DW.say_hello()

hello!


In [89]:
class DataWrangler:
    
    container = []

    def __init__(self,word=''):
        self.word = word
    
    def say_hello(self):
        print(self.word)
        
    def load1(self,x):
        DataWrangler.container.append(x)
    
    def load2(self,x):
        DataWrangler.container.append(x)
    
DW = DataWrangler(word="hello")
DW.container
DW.load1(1)
DW.container
DW.load2(2)
DW.container

[1, 2]
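
Objects defined at the class level, like `container`, belong to the class rather than to any single instance, so every instance sees the same object. A minimal sketch, assuming a simplified class with a single hypothetical `load()` method:

```python
class DataWrangler:

    container = []          # defined on the class: one list shared by all instances

    def __init__(self, word=''):
        self.word = word    # defined on the instance: one value per instance

    def load(self, x):
        DataWrangler.container.append(x)

first = DataWrangler(word="hello")
second = DataWrangler(word="goodbye")
first.load(1)
second.load(2)

print(first.container)
print(second.container)
print(first.container is second.container)
```

Both instances report the same contents because there is only one list. Data that should differ between instances belongs in `__init__`, attached to `self`.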

#### attributes

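We can define how our class responds to Python's built-in operations (such as `==` or iteration in a `for` loop) by implementing special "dunder" (double underscore) methods.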

In [None]:
def __eq__(self, other):
    # Called whenever `self == other` is evaluated
    print("__eq__ called: %r == %r ?" % (self, other))
    return self.value == other.value

In [101]:
class DataWrangler:

    def __init__(self,word=''):
        self.word = word
    
    def say_hello(self):
        print(self.word)
        
    def __eq__(self,other):
        return self.word == other

DW = DataWrangler(word="hello")
print(DW == 1)
print(DW == "hello")

False
True


In [119]:
class DataWrangler:

    def __init__(self,word=''):
        self.word = word
    
    def say_hello(self):
        print(self.word)
        
    def __iter__(self):
        return iter(self.word)

DW = DataWrangler(word="hello")
for i in DW:
    print(i)

h
e
l
l
o
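
The same pattern applies to other built-ins. As a further sketch (these two methods are not part of the class built in this lecture), `__len__` controls what `len()` returns and `__repr__` controls how the object is displayed:

```python
class DataWrangler:

    def __init__(self, word=''):
        self.word = word

    def __len__(self):
        # Called by len(DW)
        return len(self.word)

    def __repr__(self):
        # Called by repr(DW) and when the object is echoed in a notebook
        return f"DataWrangler(word={self.word!r})"

DW = DataWrangler(word="hello")
print(len(DW))
print(repr(DW))
```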


### Things to keep in mind

- classes should have docstrings, just as functions do, explaining their functionality.
- classes are central to object-oriented programming.
- all types in Python have a class (we've talked about this extensively)
- there are other (more advanced) class features that we won't spend any time on here. Specifically, static methods, attributes, decorators, and class inheritance. 

# Class Example

In [126]:
import csv

class DataWrangler:
    '''
    Class that wrangles data
    '''

    def __init__(self,data_path=''):
        '''
        Read in data given some provided file path
        '''
        pass
            
    def columns(self):
        '''
        Print off all available columns
        '''
        pass
    
    def select(self,variable):
        '''
        Select a variable
        '''
        pass
    
    def display(self):
        '''
        Display the data frame
        '''
        pass

Let's fill in each function piece by piece.

In [151]:
import csv

class DataWrangler:

    def __init__(self,data_path=''):
        try:
            with open(data_path) as file:
                dat = list(csv.reader(file))
                # convert data types
                for row in dat:
                    for ind,val in enumerate(row):
                        if '.' in val:
                            row[ind] = float(val)
                        elif val.isdigit():
                            row[ind] = int(val)
                        else:
                            val
                self.data=dat
        
        except FileNotFoundError:
            print('File does not exist.')
        
                
    def columns(self):
        '''
        Print off all available columns
        '''
        pass
    
    def select(self,variable):
        '''
        Select a variable
        '''
        pass
    
    def display(self):
        '''
        Display the data frame
        '''
        pass 
    
    
DW = DataWrangler(data_path='example_data.csv')
DW.data

[['var1', 'var2', 'var3', 'var4'],
 [1, 2, 3, 4],
 [3, 4, 6, 7],
 [7, 62, 3, 8],
 [0.77, 6, 37, 100]]
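
The type conversion above relies on checking for a `'.'` and on `isdigit()`, which works for this file but would break on a value such as `'a.b'` and would leave `'-3'` as a string. One possible alternative, tying back to exception handling, is to simply try each conversion and fall back when it fails. This is a sketch only; the helper `convert_value` is hypothetical and not part of the class as written.

```python
def convert_value(val):
    """Return val as an int or float when possible, otherwise unchanged."""
    for caster in (int, float):
        try:
            return caster(val)
        except ValueError:
            # This conversion does not apply; try the next one
            continue
    return val

row = ['7', '0.77', '-3', 'var1', 'a.b']
print([convert_value(v) for v in row])
```

Trying `int` before `float` keeps whole numbers as integers, matching the behavior of the original loop.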

In [152]:
import csv

class DataWrangler:

    def __init__(self,data_path=''):
        try:
            with open(data_path) as file:
                dat = list(csv.reader(file))
                # convert data types
                for row in dat:
                    for ind,val in enumerate(row):
                        if '.' in val:
                            row[ind] = float(val)
                        elif val.isdigit():
                            row[ind] = int(val)
                        else:
                            val
                self.data=dat
        
        except FileNotFoundError:
            print('File does not exist.')
                
    def columns(self):
        '''
        Print off all available columns
        '''
        return self.data[0]
    
    
    def select(self,variable):
        '''
        Select a variable
        '''
        pass
    
    
    def display(self):
        '''
        Display the data frame
        '''
        pass 
    
    
DW = DataWrangler(data_path='example_data.csv')
DW.columns()

['var1', 'var2', 'var3', 'var4']

In [153]:
import csv

class DataWrangler:

    def __init__(self,data_path=''):
        try:
            with open(data_path) as file:
                dat = list(csv.reader(file))
                # convert data types
                for row in dat:
                    for ind,val in enumerate(row):
                        if '.' in val:
                            row[ind] = float(val)
                        elif val.isdigit():
                            row[ind] = int(val)
                        else:
                            val
                self.data=dat
        
        except FileNotFoundError:
            print('File does not exist.')
            
                
    def columns(self):
        '''
        Print off all available columns
        '''
        return self.data[0]
    
    
    def select(self,variable):
        '''
        Select a variable
        '''
        
        columns = self.columns()
        output = []
        if variable in columns:
            position = columns.index(variable)
            for row in self.data[1:]:
                output.append(row[position])
        else:
            print(f'{variable} is not in the data. Please choose another variable')
        return output

    
    def display(self):
        '''
        Display the data frame
        '''
        pass 
    
    
DW = DataWrangler(data_path='example_data.csv')
DW.select(variable="var3")

[3, 6, 3, 37]

In [154]:
import csv

class DataWrangler:

    def __init__(self,data_path=''):
        try:
            with open(data_path) as file:
                dat = list(csv.reader(file))
                # convert data types
                for row in dat:
                    for ind,val in enumerate(row):
                        if '.' in val:
                            row[ind] = float(val)
                        elif val.isdigit():
                            row[ind] = int(val)
                        else:
                            val
                self.data=dat
        
        except FileNotFoundError:
            print('File does not exist.')
        
                
    def columns(self):
        '''
        Print off all available columns
        '''
        return self.data[0]
    
    
    def select(self,variable):
        '''
        Select a variable
        '''
        
        columns = self.columns()
        output = []
        if variable in columns:
            position = columns.index(variable)
            for row in self.data[1:]:
                output.append(row[position])
        else:
            print(f'{variable} is not in the data. Please choose another variable')
        return output

    
    def display(self):
        '''
        Display the data frame
        '''
        print(self.data)
    
    
DW = DataWrangler(data_path='example_data.csv')
DW.display()

[['var1', 'var2', 'var3', 'var4'], [1, 2, 3, 4], [3, 4, 6, 7], [7, 62, 3, 8], [0.77, 6, 37, 100]]


We now have a customized object that takes the path to a .csv file as an argument and allows us to manipulate the data in a contained setting.

In [157]:
DW = DataWrangler(data_path='example_data.csv')
print(DW.columns())
print(DW.select(variable='var4'))

['var1', 'var2', 'var3', 'var4']
[4, 7, 8, 100]
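
To run the cells above, a file named `example_data.csv` needs to exist in the working directory. The original file is not included here, so the following sketch recreates one whose contents are inferred from the output displayed above.

```python
import csv

# Contents inferred from the displayed output; the original file is an assumption
rows = [
    ['var1', 'var2', 'var3', 'var4'],
    [1, 2, 3, 4],
    [3, 4, 6, 7],
    [7, 62, 3, 8],
    [0.77, 6, 37, 100],
]

# newline='' is recommended by the csv module when opening files for reading or writing
with open('example_data.csv', 'w', newline='') as file:
    csv.writer(file).writerows(rows)

# Read the file back: csv.reader returns every value as a string
with open('example_data.csv', newline='') as file:
    raw = list(csv.reader(file))

print(raw[0])
print(raw[-1])
```

Reading the file back shows why the class converts data types: every value arrives as a string until it is converted.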
