# Custom Classes and Functions

*Author: Evan Carey*

*Copyright 2019, BH Analytics, LLC*

## Overview

In this module, we will be working through two different ways of customizing your Python code: implementing custom functions, and implementing custom classes. 

Functions:

* Review of functions
* Variable scope
* Exception handling
* Variable length arguments
* Docstrings
  
Classes:  

* Procedural versus object oriented programming
* Attributes and methods
* The `__init__` method
* `self` and methods
* Class-wide variables versus instance attributes
* Modifying class-wide variables

## Libraries

In [1]:
import sys
import os
import textwrap

In [2]:
## Show the output of every expression in a cell, not just the last one
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [3]:
## Get Version information
print(textwrap.fill(sys.version),'\n')

3.6.7 | packaged by conda-forge | (default, Feb 26 2019, 03:50:56)
[GCC 7.3.0] 



## Check your working directory

Subsequent sessions may require you to identify and update your working directory so paths correctly point at the downloaded data files. You can check your working directory like so:

In [4]:
# Working Directory
print("My working directory:\n" + os.getcwd())
# Set Working Directory (if needed)
os.chdir(r"/home/ra/host/BH_Analytics/Discover/DataEngineering/")
#print("My new working directory:\n" + os.getcwd())

My working directory:
/home/ra/host/BH_Analytics/Discover/DataEngineering/notebooks


## Advanced functions

### Basic function review

Recall that we should always look for ways to avoid duplicating our code. As a basic math example, suppose we need to find the largest valid value among some integers, where valid means below a given upper limit. We might write some code like the following:

In [5]:
## Enter data (establish references...)
a1 = 6
a2 = 10
a3 = 99
upper_limit=12

In [6]:
## Calculate largest number below the upper limit
all_values = [a1,a2,a3]
print(all_values)
all_values_valid = [x for x in all_values if x < upper_limit]
print(all_values_valid)
max_value = max(all_values_valid)
print(max_value)

[6, 10, 99]
[6, 10]
10


But what if we wanted to repeat this task multiple times? We would write a function that takes `a1`, `a2`, `a3`, and `upper_limit` as inputs. We can also give `upper_limit` a default value:

In [7]:
## define function
def max_valid(a1,a2,a3,upper_limit=12):
    max_result=max([x for x in [a1,a2,a3] if x < upper_limit])
    print('Success! Max is',max_result)
    return(max_result)

## call function 
max_valid(6,34,14)
max_valid(6,9,14,
          upper_limit=10)


Success! Max is 6


6

Success! Max is 9


9

### Variable scoping in Python

What if there is a naming conflict in the references established in the function versus the global environment? How does Python decide which reference to use? Also, what if we want to use a global variable within the function?

In [8]:
max_result = 'Something else'
print('Global variable is:',max_result)
## run function which assigns and prints max result
max_valid(6,9,14)
## print max result from global environment
print('Global variable is still:',max_result)

Global variable is: Something else
Success! Max is 9


9

Global variable is still: Something else


This is called variable scoping. The references established inside the function are local to the function. They are not affected by the references assigned outside of the function when they conflict. 

But what if we wanted to use a global variable inside the function? Good programming practice says you should always pass variables in as arguments if you want to use them, and out as returns if you want them outside the function. 

There is a way to create global variables from within a function: the `global` statement. However, this is rarely a good idea (in my opinion). Instead, use `return` to pass out of the function whatever you need outside of it!
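
The scoping rules above can be sketched in a few lines. The three helper functions below are made up for illustration: the first reads a global reference, the second shadows it with a local one, and the third follows the recommended pattern of passing everything in as an argument.

```python
upper_limit = 12  # global reference

def below_limit_global(values):
    # upper_limit is never assigned inside this function, so Python
    # looks it up in the enclosing (here: global) scope
    return [v for v in values if v < upper_limit]

def below_limit_shadowed(values):
    upper_limit = 8  # assignment creates a new local reference
    return [v for v in values if v < upper_limit]

def below_limit_explicit(values, upper_limit):
    # preferred: everything the function needs arrives as an argument
    return [v for v in values if v < upper_limit]

data = [6, 10, 99]
print(below_limit_global(data))
print(below_limit_shadowed(data))
print(below_limit_explicit(data, 7))
print(upper_limit)  # the global reference is unchanged
```

Reading a global works without any declaration; only rebinding it from inside a function requires the `global` statement.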

In [9]:
%who

InteractiveShell	 a1	 a2	 a3	 all_values	 all_values_valid	 max_result	 max_valid	 max_value	 
os	 sys	 textwrap	 upper_limit	 


In [10]:
del max_result

In [11]:
## delete variable in global environment
#del max_result
## create global variable within the function by declaring global
def max_valid2(a1,a2,a3,upper_limit=12):
    global max_result
    foo = 34
    max_result=max([x for x in [a1,a2,a3] if x < upper_limit])
    print('Success! Max is',max_result)
    return(max_result)
## Run function
max_valid2(1,5,10)
## print the now established global variable
print(max_result)
## delete
#del max_result

Success! Max is 10


10

10


In [12]:
maxres = max_valid2(3, 4, 5, 10)
maxres

Success! Max is 5


5

### Exception handling

Can we break our function? What if we pass in things other than integers? Note that the code below breaks our function and raises an error. How do we deal with this?

In [13]:
## Run code with non-integer argument
max_valid(23, 'Ra', 10)

TypeError: '<' not supported between instances of 'str' and 'int'

There are a few options. We could code a general exception catch, so any error will cause the function to return a message:

In [15]:
def max_valid3(a1,a2,a3,upper_limit=12):
    try:
        max_result=max([x for x in [a1,a2,a3] if x < upper_limit])
    except Exception:  # any ordinary error, but not KeyboardInterrupt or SystemExit
        print('Failure! Looks like you entered a non-integer?')
    else:
        print('Success! Max is',max_result)
        return(max_result)
## Run code with non-integer argument
max_valid3(23,'Evan',10)
max_valid3(23,'12',10)
max_valid3(23,12,10)

Failure! Looks like you entered a non-integer?
Failure! Looks like you entered a non-integer?
Success! Max is 10


10

We could also code for a specific error type. In this case, it is a `TypeError`:

In [16]:
def max_valid4(a1,a2,a3,upper_limit=12):
    try:
        max_result=max([x for x in [a1,a2,a3] if x < upper_limit])
    except TypeError:
        print('Failure! Looks like you entered a non-integer?')
    else:
        print('Success! Max is',max_result)
        return(max_result)
## Run code with non-integer argument
max_valid4(23,'Evan',10)
max_valid4(23,10,20)

Failure! Looks like you entered a non-integer?
Success! Max is 10


10

And finally, we can catch the error and print it:

In [17]:
def max_valid5(a1,a2,a3,upper_limit=12):
    try:
        max_result=max([x for x in [a1,a2,a3] if x < upper_limit])
    except TypeError as e:
        print('Failure! Looks like you entered a non-integer?\n',e)
    else:
        print('Success! Max is',max_result)
        return(max_result)
## Run code with non-integer argument
max_valid5(23,'Evan',10)
max_valid5(23,10,20)

Failure! Looks like you entered a non-integer?
 '<' not supported between instances of 'str' and 'int'
Success! Max is 10


10
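
One case the functions above do not cover: if none of the inputs is below the upper limit, `max()` receives an empty list and raises a `ValueError` rather than a `TypeError`. A minimal sketch of handling both error types separately follows; the name `max_valid_safe` and the choice to return `None` on failure are assumptions made for this illustration.

```python
def max_valid_safe(a1, a2, a3, upper_limit=12):
    try:
        max_result = max([x for x in [a1, a2, a3] if x < upper_limit])
    except TypeError as e:
        # raised when a value cannot be compared with upper_limit
        print('Failure! Could not compare the inputs:', e)
        return None
    except ValueError as e:
        # raised by max() when no value passed the filter
        print('Failure! No value is below the upper limit:', e)
        return None
    else:
        print('Success! Max is', max_result)
        return max_result

max_valid_safe(6, 9, 14)
max_valid_safe(20, 30, 40)
max_valid_safe(23, 'Evan', 10)
```

Listing the `except` clauses separately lets each failure produce its own message instead of one generic guess.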

### Docstrings

How do we know what the function is supposed to do? Where should we document this?

It turns out there is a specific way to document a function so that the documentation is accessible from the function itself! If we place a string literal as the first statement in the function body, it becomes the function's `docstring`. It is then accessible through the function's `__doc__` attribute. We often want a multiline string.

In [18]:
def max_valid6(a1,a2,a3,upper_limit=12):
    '''This is a small function that finds the max value of 3 numbers, subject to an upper limit. 
    The default upper limit is 12.'''
    try:
        max_result=max([x for x in [a1,a2,a3] if x < upper_limit])
    except TypeError as e:
        print('Failure! Looks like you entered a non-integer?\n',e)
    else:
        print('Success! Max is',max_result)
        return(max_result)
## Print doc string
print(max_valid6.__doc__)

This is a small function that finds the max value of 3 numbers, subject to an upper limit. 
    The default upper limit is 12.
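
Notice that the raw `__doc__` string keeps the indentation of the source code. The standard library function `inspect.getdoc()` returns the same docstring with that indentation cleaned up. A minimal sketch, where `max_valid_documented` is a made-up name for this illustration:

```python
import inspect

def max_valid_documented(a1, a2, a3, upper_limit=12):
    """Return the largest of three numbers that is below upper_limit.

    The default upper limit is 12.
    """
    return max([x for x in [a1, a2, a3] if x < upper_limit])

raw_doc = max_valid_documented.__doc__              # indentation preserved
clean_doc = inspect.getdoc(max_valid_documented)    # indentation removed
print(clean_doc)
```

A common convention is a one-line summary, a blank line, and then any further detail.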


### Variable length arguments

Let's add one final complexity. What if we wanted to take the maximum value of an unknown number of integers? It could be 3, or it could be 5.

It turns out there is a way to allow variable length arguments into functions. We simply preface a parameter name with a star, and all remaining *positional* arguments are collected into a tuple:

In [19]:
def max_valid7(*aX,upper_limit=12):
    '''This is a small function that finds the max value of numbers, subject to an upper limit. 
The default upper limit is 12.'''
    try:
        max_result=max([x for x in aX if x < upper_limit])
    except TypeError as e:
        print('Failure! Looks like you entered a non-integer?\n',e)
    else:
        print('Success! Max is',max_result)
        return(max_result)
## Call function with default
max_valid7(4,7,0,19,20,14,123,)
## Call function with different upper limit
max_valid7(4,7,0,19,upper_limit=20)

Success! Max is 7


7

Success! Max is 19


19
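
A parameter listed after the starred one, like `upper_limit` above, can only be set by keyword; a trailing positional value is simply collected with the rest. A small sketch of the difference, using a hypothetical helper `count_below` that is not part of the original material:

```python
def count_below(*values, upper_limit=12):
    """Count how many of the values are below upper_limit."""
    return len([v for v in values if v < upper_limit])

# 20 is passed positionally, so it is treated as one more value
positional = count_below(4, 7, 15, 20)
# here 20 is passed by keyword, so it replaces the default limit
by_keyword = count_below(4, 7, 15, upper_limit=20)
print(positional, by_keyword)
```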

If we preface a parameter name with a double star, all remaining keyword (named) arguments are collected into a dictionary.

In [20]:
def test_func(x,*nums,**dicts):
    print('X is',x)
    print('Tuple of positional args:', nums)
    print('Dict of keyword args',dicts)
test_func(3,4,5,6,id='01',age=35)

X is 3
Tuple of positional args: (4, 5, 6)
Dict of keyword args {'id': '01', 'age': 35}
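
The same star syntax also works in the other direction: at the call site, `*` unpacks a sequence into positional arguments and `**` unpacks a dictionary into keyword arguments. A minimal sketch, with `describe` as a made-up stand-in for `test_func` that returns its inputs instead of printing them:

```python
def describe(x, *nums, **options):
    # return the pieces so we can inspect how the arguments were split
    return x, nums, options

values = [3, 4, 5, 6]
settings = {'id': '01', 'age': 35}

# equivalent to describe(3, 4, 5, 6, id='01', age=35)
result = describe(*values, **settings)
print(result)
```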


## Custom Classes

Another way to customize Python is to create our own classes. We have seen a number of different classes in Python so far in this material, but what if you want to create your own, with your own rules? 


This is actually a far deeper concept than just creating a class. Most of the code written so far has been 'procedural style' code. We write statements that manipulate data/objects in order. That is fine for many use cases, especially in data management or data science! However, we can also think of a different approach called object oriented programming. Object oriented programming means combining the data objects and associated functionality into a single object. The general form of this object is then called a class!

Two things to note:

* Any method defined in a class takes the instance as its first argument, named *self* by convention. It refers to the specific object the method is called on.
* When the class is instantiated, a special method (attached function) is run once right away. It is the `__init__()` method, and it lets us store the input values on the new object.

In [21]:
## Create a new class called positiveInt
class positiveInt:
    
    # Attribute (class variable) shared by all instances of this class
    isnumber=True
    
    # function run upon instantiation of the class
    def __init__(self,int1):
        self.int1 = int1
        print('Class created successfully')
    
    # Attached function for adding 10 to the object
    def add_ten(self):
        print(self.int1 + 10)

In [22]:
## Instantiate the class
x = positiveInt(5)
## check the things attached to the class
x.isnumber
x.int1
x.add_ten()

Class created successfully


True

5

15
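
To make the role of `self` concrete: calling a method on an instance is equivalent to calling the function through the class and passing the instance explicitly. A small sketch with a hypothetical `Greeter` class (not from the original material):

```python
class Greeter:
    def __init__(self, name):
        # self is the new instance being initialized
        self.name = name

    def greet(self):
        return 'Hello, ' + self.name

g = Greeter('Ana')
via_instance = g.greet()         # Python passes g as self automatically
via_class = Greeter.greet(g)     # the same call with self made explicit
print(via_instance, '|', via_class)
```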


We might think of making a class for a special sort of object, so we can organize functions and attributes around the namespace of that object. 

Let's make a basic employee class to explore these concepts further. Just like functions, the first string literal following the class statement is the docstring. 

In [23]:
## Create employee class
class Employee:
    'Employee Class'
    ## add some attributes that apply to all employees
    costs_money = 'yes'
    ## add counter to assign employee id's later
    employee_counter = 0
    # function run upon instantiation of the class
    def __init__(self,empl_name):
        self.empl_name = empl_name
        Employee.employee_counter += 1
        self.employee_id = 100 + Employee.employee_counter
        print('Employee number:',self.employee_id ,'created')
    ## add a function
    def name_print(self):
        print('Employee name is:',self.empl_name,
             ', Employee number is:',self.employee_id)
    def assign_salary(self,salary):
        self.salary = salary    

In [24]:
## Instantiate the class
x1 = Employee('Bob')

## Call things attached to the class
x1.__doc__
x1.empl_name
x1.employee_id
x1.name_print()

## Assign Salary
x1.assign_salary(50)
x1.salary

Employee number: 101 created


'Employee Class'

'Bob'

101

Employee name is: Bob , Employee number is: 101


50

Notice we added a bit of complexity there. We initialized a class-wide variable called `employee_counter`, which we use to assign each employee an ID automatically upon creation. Now if we create a few more instances, their IDs will keep going up.

In [25]:
## add another employee
x2 = Employee('Ana')
x2.name_print()
x2.assign_salary(120)

Employee number: 102 created
Employee name is: Ana , Employee number is: 102
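
It matters where a class-wide variable is modified. Assigning through the class changes the value seen by every instance, while assigning through one instance creates a separate attribute on that instance only. A minimal sketch, using a stripped-down hypothetical `Worker` class rather than the `Employee` class above:

```python
class Worker:
    costs_money = 'yes'  # class-wide variable

    def __init__(self, name):
        self.name = name  # instance attribute

w1 = Worker('Bob')
w2 = Worker('Ana')

# modify through the class: every instance sees the new value
Worker.costs_money = 'always'
# assign through one instance: creates an attribute on w1 only
w1.costs_money = 'no'

print(w1.costs_money, w2.costs_money, Worker.costs_money)
```

This is why `__init__` in the `Employee` class writes `Employee.employee_counter += 1` rather than `self.employee_counter += 1`: the latter would only create a per-instance copy.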


## Procedural example

Many people write procedural style code using Python. However, an object oriented approach lets you organize your functions as well as reuse your code. In the prior example, we created a class for a kind of object (employees). Sometimes we create a class that represents an action, like a calculation. Let's work through a simple example of calculating a total restaurant bill to show the difference between procedural code and object oriented code. We will start with procedural style:

In [26]:
#### Procedural style of coding: 
## Define main function to calculate total bill
def calculate_total(pre_tax,tax_rate=.1,tip_rate=.2):
    tax = calculate_tax(pre_tax,tax_rate)
    tip = calculate_tip(pre_tax,tip_rate)
    total_bill = pre_tax + tax + tip
    print_total(total_bill)
    return(total_bill)
## define sub functions
def calculate_tax(pre_tax,tax_rate):
    tax = tax_rate*pre_tax
    return(tax)
def calculate_tip(pre_tax,tip_rate):
    tip = tip_rate * pre_tax
    return(tip)
def print_total(total_bill):
    print('total bill was:',total_bill)

## Call function
calculate_total(100)


total bill was: 130.0


130.0

### Object oriented approach: classes

There is nothing wrong with procedural coding! However, you may want to organize your code into objects instead. Let's explore an object oriented flavor of this calculation.

We can actually create a new class for calculating a restaurant bill, then organize all the needed functions into the class.

In [27]:
class restauraunt_bill_calculator:
    """This class calculates the restaurant bill using tip, tax, and pre-tax amount."""
    ## add counter to assign calculation IDs
    calculation_counter = 0
    # function run upon instantiation of the class
    def __init__(self,pre_tax,tax_rate=.1,tip_rate=.2):
        self.pre_tax = pre_tax
        self.tax_rate = tax_rate
        self.tip_rate = tip_rate
        self.tax = tax_rate * pre_tax
        self.tip = pre_tax*tip_rate
        restauraunt_bill_calculator.calculation_counter += 1
        self.calculationID = 100 + restauraunt_bill_calculator.calculation_counter
        
    def calculate_total(self):
        self.total = self.pre_tax + self.tax + self.tip
        print('Total bill is:',self.total)
        return(self.total)


In [28]:
## Test basic class instance
x = restauraunt_bill_calculator(pre_tax=100)
x.calculate_total()

Total bill is: 130.0


130.0

In [29]:
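if __name__ == "__main__":
    # Runs when this code is executed directly (for example, as a script from
    # the command line), but not when it is imported as a module
    total_bill = restauraunt_bill_calculator(pre_tax=100).calculate_total()

Total bill is: 130.0
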

## More about special methods

There are some other helpful special methods, or dunder (`__`) methods, to be aware of in Python.

Special methods are intended to be called by the Python interpreter, not directly by your code. You **can** define the `__len__` or the `__getitem__` method in your custom class, but you would probably do so only to take advantage of core language features like slicing and iterating.
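
As a rough sketch of what that looks like, the hypothetical `Order` class below (not part of the original material) defines `__len__` and `__getitem__` by delegating to an internal list. That is enough for `len()`, indexing, slicing, and iteration to work on its instances.

```python
class Order:
    """A made-up container of item prices, for illustration only."""

    def __init__(self, prices):
        self._prices = list(prices)

    def __len__(self):
        # called by the built-in len()
        return len(self._prices)

    def __getitem__(self, index):
        # called for order[i] and order[i:j]; also enables iteration
        return self._prices[index]

order = Order([12.5, 8.0, 3.25])
print(len(order))
print(order[0])
print(order[1:])
print(sum(order))  # iteration falls back to __getitem__
```

Note that we never call `order.__len__()` ourselves; the interpreter does so when we write `len(order)`.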

## The `__repr__` method

The `__repr__` method returns a string representation of the object for inspection. For instance, if we ask for our restaurant bill calculator instance back, we get some not-so-nice output.

We can improve that by defining the `__repr__` method.

In [30]:
x

<__main__.restauraunt_bill_calculator at 0x7fa1800e9e48>

In [31]:
class restauraunt_bill_calculator:
    """This class calculates the restaurant bill using tip, tax, and pre-tax amount."""
    ## add counter to assign calculation IDs
    calculation_counter = 0
    # function run upon instantiation of the class
    
    def __repr__(self):
        return 'restaurant_bill_calculator: {0}'.format(self.bill)
    
    def __init__(self,pre_tax,tax_rate=.1,tip_rate=.2):
        self.pre_tax = pre_tax
        self.tax_rate = tax_rate
        self.tip_rate = tip_rate
        self.tax = tax_rate * pre_tax
        self.tip = pre_tax*tip_rate
        restauraunt_bill_calculator.calculation_counter += 1
        self.calculationID = 100 + restauraunt_bill_calculator.calculation_counter
        self.bill = self.pre_tax
        
    def calculate_total(self):
        self.total = self.pre_tax + self.tax + self.tip
        print('Total bill is:',self.total)
        self.bill = self.total
        return(self.total)

In [32]:
x = restauraunt_bill_calculator(pre_tax=100)
x

restaurant_bill_calculator: 100

In [33]:
x.calculate_total()
x

Total bill is: 130.0


130.0

restaurant_bill_calculator: 130.0

## The `__str__` method

The `__str__` method is called by the built-in `str()` function and by `print()`. It should return a string suitable for end users.

If you only implement one of these, implement `__repr__`, because the default `__str__` falls back to it.

In [34]:
print(x)

restaurant_bill_calculator: 130.0


In [35]:
class restauraunt_bill_calculator:
    """This class calculates the restaurant bill using tip, tax, and pre-tax amount."""
    ## add counter to assign calculation IDs
    calculation_counter = 0
    # function run upon instantiation of the class
    
    def __repr__(self):
        return 'restaurant_bill_calculator: {0}'.format(self.bill)
    
    def __str__(self):
        return 'the restaurant bill currently stands at {0}'.format(self.bill)
    
    def __init__(self,pre_tax,tax_rate=.1,tip_rate=.2):
        self.pre_tax = pre_tax
        self.tax_rate = tax_rate
        self.tip_rate = tip_rate
        self.tax = tax_rate * pre_tax
        self.tip = pre_tax*tip_rate
        restauraunt_bill_calculator.calculation_counter += 1
        self.calculationID = 100 + restauraunt_bill_calculator.calculation_counter
        self.bill = self.pre_tax
        
    def calculate_total(self):
        self.total = self.pre_tax + self.tax + self.tip
        print('Total bill is:',self.total)
        self.bill = self.total
        return(self.total)

In [36]:
x2 = restauraunt_bill_calculator(pre_tax=50)
x2
print(x2)

restaurant_bill_calculator: 50

the restaurant bill currently stands at 50
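
The fallback between the two methods can be seen in a smaller sketch. The `Bill` and `FriendlyBill` classes here are made up for illustration: the first defines only `__repr__`, the second adds `__str__` on top.

```python
class Bill:
    def __init__(self, amount):
        self.amount = amount

    def __repr__(self):
        return 'Bill(amount={!r})'.format(self.amount)

class FriendlyBill(Bill):
    def __str__(self):
        return 'Bill of {:.2f}'.format(self.amount)

b = Bill(50)
f = FriendlyBill(50)

print(str(b))    # no __str__ defined, so __repr__ is used
print(str(f))    # __str__ is used
print(repr(f))   # repr() always uses __repr__
print([f])       # containers show their items with __repr__
```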


## Conclusion


Here we looked at implementing custom functions, as well as custom classes. This gave us a more thorough understanding of how Object-Oriented Programming (OOP) principles apply to the Python language.

In particular, we looked at the details of variable scope, exception handling and the importance of docstrings. We also looked at defining a class, along with its attributes and methods, and hence were able to contrast the procedural and OOP paradigms. We examined the boilerplate code needed to create a class, including the `__init__` method, the `self` argument, and other special methods.

This OOP approach leads to clean syntax for complex data structures and is at the very heart of the Python language.