<h1><center>Object Oriented Programming</center></h1>

<h2>Jupyter Notebook</h2>

Jupyter notebooks are handy because they combine code with text, so we can build presentations or technical documents in a single notebook. The default Jupyter notebook can be difficult to work with, particularly for a large document with many sections and subsections. Code also becomes hard to read when it includes many functions, loops, etc.

If you use Jupyter notebook, please install the nbextensions package (jupyter_contrib_nbextensions). <a href="https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/install.html">Check here for installation instructions</a>.

After you install the extension, a new option appears, as shown in Figure 1(a). Click on it and select the extensions you want in your notebook, as I have done in Figure 1(b). Once selected, the extensions are ready to use.


<img src="notebook.png" width="1000">

<h2>Introduction</h2>

<b>Object oriented programming</b> (OOP) is a programming paradigm that structures code so that properties (data) and behavior (functions) are bundled together into individual objects. Although Python does not force an object oriented style the way Java does, we can still structure our programs in this format.

We will explore OOP through an example. We use a dataset from Kaggle about user churn, i.e. users leaving a service (Telco). The dataset contains different information for each user. To understand object oriented programming, we will create a class named User and a class named MachineAlgo.

<img src="fig1.png" width="500">


<center>Class and Objects (image from <a href="https://www.wikitechy.com/tutorials/python/python-object-class">here</a>)</center>

<h2>Data</h2>

In [9]:
import pandas as pd 
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import display, HTML

In [188]:
data = pd.read_csv('churn.csv')
print('shape',data.shape)
print(data.head())

shape (7043, 21)
   customerID  gender  SeniorCitizen  Partner  Dependents  tenure  \
0  7590-VHVEG       0              0        1           0       1   
1  5575-GNVDE       1              0        0           0      34   
2  3668-QPYBK       1              0        0           0       2   
3  7795-CFOCW       1              0        0           0      45   
4  9237-HQITU       0              0        0           0       2   

   PhoneService  MultipleLines  InternetService  OnlineSecurity  ...  \
0             0              0                1               0  ...   
1             1              0                1               1  ...   
2             1              0                1               1  ...   
3             0              0                1               1  ...   
4             1              0                2               0  ...   

   DeviceProtection  TechSupport  StreamingTV  StreamingMovies  Contract  \
0                 0            0            0              

<h2>Loops</h2>

Loops are used for iteration, i.e. going over a sequence of values one at a time. For example, let's compute the sum of MonthlyCharges for the 6th to 10th users. Since indexing starts from 0, the 6th user is at index 5. Different languages offer different types of loops; here we go through only the "for" loop.

In [190]:
'''
u is the loop counter
range(x,y): the values that u takes (u goes from x to y-1); in this example, from 5 to 9
'''

sum_charges = 0
for u in range(5,10):
    monthlyCharge = data.iloc[u]['MonthlyCharges']
    print(u, monthlyCharge)
    sum_charges = sum_charges + monthlyCharge       # short: sum_charges += monthlyCharge

print('sum of monthly charges:',sum_charges)

5 99.65
6 89.1
7 29.75
8 104.8
9 56.15
sum of monthly charges: 379.45


In [198]:
'''
Stopping (break), passing (pass) and continuing (continue) inside a loop
break: if some condition is satisfied, we can stop the loop (break stops the innermost loop it is in)
EXAMPLE USE: checking whether "V" occurs in "DAVIS": loop -> "D","A","V" (break here: V is found, so the remaining letters need not be checked), which saves computation time
'''

for i in range(5,10):
    print(i, ', mod 7 of i (remainder when i is divided by 7):',i%7)
    if i%7 == 0:                        # if mod of i (remainder when i is divided by 7) is 0, stop the loop
        print('broken here')
        break                           # it breaks the loop here
        
    print('randomly printing:', i*100)  # not reached in the iteration where the loop was broken (i = 7)

5 , mod 7 of i (remainder when i is divided by 7): 5
randomly printing: 500
6 , mod 7 of i (remainder when i is divided by 7): 6
randomly printing: 600
7 , mod 7 of i (remainder when i is divided by 7): 0
broken here


In [200]:
'''
pass does nothing: it is a placeholder statement, so the rest of the loop body still runs
EXAMPLE: a branch that is not written yet, or an error that we deliberately ignore so that the code moves on without stopping (error handling)
'''

for i in range(5,10):
    print(i, ', mod 7 of i (remainder when i is divided by 7):',i%7)
    if i%7 == 0:                        # if mod of i (remainder when i is divided by 7) is 0, do nothing special
        pass                            # do nothing (even if condition is satisfied): just a filler
        print('passed here')
        
    print('randomly printing:', i*100)  # it will be printed as the loop was never broken

5 , mod 7 of i (remainder when i is divided by 7): 5
randomly printing: 500
6 , mod 7 of i (remainder when i is divided by 7): 6
randomly printing: 600
7 , mod 7 of i (remainder when i is divided by 7): 0
passed here
randomly printing: 700
8 , mod 7 of i (remainder when i is divided by 7): 1
randomly printing: 800
9 , mod 7 of i (remainder when i is divided by 7): 2
randomly printing: 900


In [201]:
'''
continue does not break the loop: it skips the rest of the current iteration and moves on to the next value
EXAMPLE: if, while going through users, I observe that the userid is missing, then ignore that user and move on
'''
for i in range(5,10):
    print(i, ', mod 7 of i (remainder when i is divided by 7):',i%7)
    if i%7 == 0:                        # if mod of i (remainder when i is divided by 7) is 0, skip the rest of this iteration
        print('continued here')
        continue                        # does not break the loop; jumps to the next value of i
        
    print('randomly printing:', i*100)  # it will not be printed for the value i that was "continued", i.e. 7

5 , mod 7 of i (remainder when i is divided by 7): 5
randomly printing: 500
6 , mod 7 of i (remainder when i is divided by 7): 6
randomly printing: 600
7 , mod 7 of i (remainder when i is divided by 7): 0
continued here
8 , mod 7 of i (remainder when i is divided by 7): 1
randomly printing: 800
9 , mod 7 of i (remainder when i is divided by 7): 2
randomly printing: 900
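A common place for `continue` is error handling inside a loop. The sketch below uses a hypothetical list of raw values in which one entry is blank; the list and the variable names are assumptions made for illustration, not taken from the dataset.

```python
# Hypothetical raw values: the blank entry cannot be converted to a number.
raw_charges = ["29.85", " ", "108.15"]

total = 0.0
skipped = 0
for raw in raw_charges:
    try:
        charge = float(raw)        # raises ValueError for the blank entry
    except ValueError:
        skipped += 1
        continue                   # skip this entry and move on to the next one
    total += charge

print(round(total, 2), skipped)    # 138.0 1
```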


In [None]:
# Another way of running loops
# both methods give the same result; use whichever suits the need

a = ['a','b','c','d']

#method 1: as discussed above
for i in range(len(a)):
    print(a[i])
    
# method 2: iterate over the list directly
for i in a:
    print(i)
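When both the position and the value are needed, Python's built-in `enumerate` combines the two methods. A minimal sketch (the `pairs` list is only there to make the result easy to inspect):

```python
a = ['a', 'b', 'c', 'd']

pairs = []
for index, letter in enumerate(a):   # yields (0, 'a'), (1, 'b'), ...
    print(index, letter)
    pairs.append((index, letter))
```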

<h2>Functions</h2>

Functions make code modular. For example, if we have to write the same loop again and again, it is easier to create a function once and call it wherever needed: the code becomes shorter and easier to follow. In ML, for instance, we do not write the whole code for linear regression; we just call a linear regression function already written in Python.

In [205]:
'''
Suppose we have to find the sum of the monthly charges of the 6th to 10th users and the sum of the tenure of the 120th to 130th users
this approach requires us to write two loops
this is bad practice when the loops or the code are big (it becomes difficult to read)
'''

sum_charges = 0
sum_tenure  = 0

for u in range(5,10):
    sum_charges += data.iloc[u]['MonthlyCharges']
    
for u in range(119,130):
    sum_tenure += data.iloc[u]['tenure']
    
print(sum_charges,sum_tenure)

379.45 307


In [207]:
'''
instead, we can create a function that would give us the sum
def find_sum(start,end,data,column_name) : start, end, data, column_name are the inputs to the function
return (value) : returns the output of the function (note that the variable value cannot be accessed from outside)
'''

def find_sum(start,end,data,column_name):
    value = 0
    for u in range(start,end):
        value += data.iloc[u][column_name]
    return(value)

sum_charges = find_sum(5,10,data,'MonthlyCharges')
sum_tenure  = find_sum(119,130,data,'tenure')

print(sum_charges, sum_tenure)

# value is defined inside the function, outside the function, the code does not know what value is
# the following line would give an error
print(value)

379.45 307


NameError: name 'value' is not defined
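The same idea can be tried without the dataset. The sketch below is a hypothetical variant, `find_sum_of`, that works on a plain list; the five numbers are the MonthlyCharges values printed by the first loop example (index 5 to 9).

```python
# MonthlyCharges of the users at index 5..9, copied from the loop output earlier
charges = [99.65, 89.1, 29.75, 104.8, 56.15]

def find_sum_of(values, start=0, end=None):
    """Return the sum of values[start:end]; `total` exists only inside the function."""
    total = 0
    for v in values[start:end]:
        total += v
    return total

result = find_sum_of(charges)        # the returned value is stored under a new name
print(round(result, 2))              # 379.45
print(round(find_sum_of(charges, 1, 3), 2))
```

The default arguments (`start=0`, `end=None`) let the caller omit the range when the whole list should be summed.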

<h2>Definition of Class and Object</h2>

1. <b>Class</b> is an idea of how something should be defined (like a template or a blueprint). A class gives structure to the individual agents in the environment. In this case, we define a class "User": it defines what data each user holds and which operations can be performed on it.

2. <b>Object</b> is an instance of a class. In this example, each individual user is an object, and the class defines the structure of that object. As another example, Dog is a class, and a dog named Max is an instance of the Dog class with attributes "name: Max, gender: male, age: 3". Dog has a function Bark. Similarly, Cat is a class, and a cat named Luna is an instance of the Cat class with attributes "name: Luna, gender: female, age: 3". Cat has a function Meow.

PS: "Max" is the most common name for male dogs and "Luna" is the most common name for female cats (see <a href="https://www.rover.com/blog/dog-names/#top-list">here</a>).
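The Dog and Cat example can be written down directly. This is a minimal sketch; the method bodies and return strings are made up for illustration.

```python
class Dog:
    def __init__(self, name, gender, age):
        self.name = name
        self.gender = gender
        self.age = age

    def bark(self):
        return f"{self.name} says Woof"


class Cat:
    def __init__(self, name, gender, age):
        self.name = name
        self.gender = gender
        self.age = age

    def meow(self):
        return f"{self.name} says Meow"


max_the_dog = Dog("Max", "male", 3)        # an object (instance) of class Dog
luna_the_cat = Cat("Luna", "female", 3)    # an object (instance) of class Cat

print(max_the_dog.bark())
print(luna_the_cat.meow())
```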

<h2>Building User Class</h2>

In [93]:
# building a Class that will define users
class User():
    '''
    __init__ is called the constructor method
    it initializes the object
    the user has many attributes, and each one is set from the corresponding argument
    '''
    def __init__(self,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn):
        self.customerID      = customerID
        self.gender          = gender
        self.SeniorCitizen   = SeniorCitizen
        self.Partner         = Partner
        self.Dependents      = Dependents
        self.tenure          = tenure
        self.PhoneService    = PhoneService
        self.MultipleLines   = MultipleLines
        self.InternetService = InternetService
        self.OnlineSecurity  = OnlineSecurity
        self.OnlineBackup    = OnlineBackup
        self.DeviceProtection= DeviceProtection
        self.TechSupport     = TechSupport
        self.StreamingTV     = StreamingTV
        self.StreamingMovies = StreamingMovies
        self.Contract        = Contract
        self.PaperlessBilling= PaperlessBilling
        self.PaymentMethod   = PaymentMethod
        self.MonthlyCharges  = MonthlyCharges
        self.TotalCharges    = TotalCharges
        self.Churn           = Churn
        
    '''
    we also define a function specifically for this class (an arbitrary example)
    it checks whether the user uses both streaming services
    self refers to the instance itself
    '''
    def check(self):
        var = ''
        if self.StreamingTV == 1 and self.StreamingMovies == 1:
            print("printing within the function: both")
            var = "both"
        else:
            print("printing within the function: one or none")
            var = "non or both"            # label returned for the "one or none" case
            
        return(var)

In [94]:
# creating one instance of User class

'''
initializing the first user
the values we pass when creating the user (the User(...) call below) go to the __init__ function in the class
'''
customerID      = data.iloc[0,0]
gender          = data.iloc[0,1]
SeniorCitizen   = data.iloc[0,2]
Partner         = data.iloc[0,3]
Dependents      = data.iloc[0,4]
tenure          = data.iloc[0,5]
PhoneService    = data.iloc[0,6]
MultipleLines   = data.iloc[0,7]
InternetService = data.iloc[0,8]
OnlineSecurity  = data.iloc[0,9]
OnlineBackup    = data.iloc[0,10]
DeviceProtection= data.iloc[0,11]
TechSupport     = data.iloc[0,12]
StreamingTV     = data.iloc[0,13]
StreamingMovies = data.iloc[0,14]
Contract        = data.iloc[0,15]
PaperlessBilling= data.iloc[0,16]
PaymentMethod   = data.iloc[0,17]
MonthlyCharges  = data.iloc[0,18]
TotalCharges    = data.iloc[0,19]
Churn           = data.iloc[0,20]

# no need to pass self here (self is the conventional name for the instance, not a keyword; Python supplies it automatically)
user = User(customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn)    

# accessing the attribute values
print("gender",user.gender, "monthly charges",user.MonthlyCharges)

gender 0 monthly charges 29.85


In [95]:
# creating a population of the users

'''
there are different structures we can use to store the user population
we show two ways of storing the users: first a simple list, second a dictionary
'''
users_array     = []
users_dictinary = {}

number_of_users = data.shape[0]
number_of_attributes = data.shape[1]

attributes      = data.columns

for u in range(number_of_users):
    customerID      = data.iloc[u,0]
    gender          = data.iloc[u,1]
    SeniorCitizen   = data.iloc[u,2]
    Partner         = data.iloc[u,3]
    Dependents      = data.iloc[u,4]
    tenure          = data.iloc[u,5]
    PhoneService    = data.iloc[u,6]
    MultipleLines   = data.iloc[u,7]
    InternetService = data.iloc[u,8]
    OnlineSecurity  = data.iloc[u,9]
    OnlineBackup    = data.iloc[u,10]
    DeviceProtection= data.iloc[u,11]
    TechSupport     = data.iloc[u,12]
    StreamingTV     = data.iloc[u,13]
    StreamingMovies = data.iloc[u,14]
    Contract        = data.iloc[u,15]
    PaperlessBilling= data.iloc[u,16]
    PaymentMethod   = data.iloc[u,17]
    MonthlyCharges  = data.iloc[u,18]
    TotalCharges    = data.iloc[u,19]
    Churn           = data.iloc[u,20]
    
    #creating one user at a time
    user = User(customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn)    

    # once all the attribute values are added to the object, we can add it to the population
    
    # first adding it to the array
    users_array.append(user)
    
    # adding it to a dictionary, here we can use user id also as an identifier for that user
    users_dictinary[user.customerID] = user
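Passing 21 positional arguments is easy to get wrong. One alternative, sketched here under assumptions, is to unpack a dictionary whose keys match the constructor's parameter names. `MiniUser` is a hypothetical, reduced version of User with three attributes, and the row values are those of the first user shown earlier.

```python
class MiniUser:
    """Hypothetical, reduced version of User (three attributes only)."""
    def __init__(self, customerID, tenure, MonthlyCharges):
        self.customerID = customerID
        self.tenure = tenure
        self.MonthlyCharges = MonthlyCharges


# one row as a dictionary: keys must match the parameter names of __init__
row = {"customerID": "7590-VHVEG", "tenure": 1, "MonthlyCharges": 29.85}

mini_user = MiniUser(**row)          # same as MiniUser(customerID=..., tenure=..., ...)
print(mini_user.__dict__)
```

With pandas, a row can be turned into such a dictionary with `data.iloc[u].to_dict()`, provided the column names and the parameter names agree.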

In [96]:
# accessing the user information from both population structures
len(users_array),users_array[0].__dict__

(7043,
 {'customerID': '7590-VHVEG',
  'gender': 0,
  'SeniorCitizen': 0,
  'Partner': 1,
  'Dependents': 0,
  'tenure': 1,
  'PhoneService': 0,
  'MultipleLines': 0,
  'InternetService': 1,
  'OnlineSecurity': 0,
  'OnlineBackup': 1,
  'DeviceProtection': 0,
  'TechSupport': 0,
  'StreamingTV': 0,
  'StreamingMovies': 0,
  'Contract': 1,
  'PaperlessBilling': 1,
  'PaymentMethod': 2,
  'MonthlyCharges': 29.85,
  'TotalCharges': '29.85',
  'Churn': 0})

In [97]:
len(users_dictinary),users_dictinary['7590-VHVEG'].__dict__

(7043,
 {'customerID': '7590-VHVEG',
  'gender': 0,
  'SeniorCitizen': 0,
  'Partner': 1,
  'Dependents': 0,
  'tenure': 1,
  'PhoneService': 0,
  'MultipleLines': 0,
  'InternetService': 1,
  'OnlineSecurity': 0,
  'OnlineBackup': 1,
  'DeviceProtection': 0,
  'TechSupport': 0,
  'StreamingTV': 0,
  'StreamingMovies': 0,
  'Contract': 1,
  'PaperlessBilling': 1,
  'PaymentMethod': 2,
  'MonthlyCharges': 29.85,
  'TotalCharges': '29.85',
  'Churn': 0})

In [106]:
'''
checking the function inside the class
'''
print("TV",user.StreamingTV, "Movie",user.StreamingMovies)
check = user.check()
print(check)

TV 0 Movie 0
printing within the function: one or none
non or both


<h2>Another method of constructing classes</h2>

In [82]:
# another method of defining a class with no constructor function (__init__)
class User():
    def check(self):
        var = ''
        if self.StreamingTV == 1 and self.StreamingMovies == 1:
            print("both")
            var = "both"
        else:
            print("one or none")
            var = "non or both"
            
        return(var)

In [81]:
# initializing using a different method
# there is no constructor function but we are assigning values outside the initialization
users_array      = []
users_dictionary = {}

number_of_users = data.shape[0]
number_of_attributes = data.shape[1]

attributes      = data.columns

for u in range(number_of_users):
    user = User()                         # creating the object without passing any constructor arguments
    
    user.customerID      = data.iloc[u,0]
    user.gender          = data.iloc[u,1]
    user.SeniorCitizen   = data.iloc[u,2]
    user.Partner         = data.iloc[u,3]
    user.Dependents      = data.iloc[u,4]
    user.tenure          = data.iloc[u,5]
    user.PhoneService    = data.iloc[u,6]
    user.MultipleLines   = data.iloc[u,7]
    user.InternetService = data.iloc[u,8]
    user.OnlineSecurity  = data.iloc[u,9]
    user.OnlineBackup    = data.iloc[u,10]
    user.DeviceProtection= data.iloc[u,11]
    user.TechSupport     = data.iloc[u,12]
    user.StreamingTV     = data.iloc[u,13]
    user.StreamingMovies = data.iloc[u,14]
    user.Contract        = data.iloc[u,15]
    user.PaperlessBilling= data.iloc[u,16]
    user.PaymentMethod   = data.iloc[u,17]
    user.MonthlyCharges  = data.iloc[u,18]
    user.TotalCharges    = data.iloc[u,19]
    user.Churn           = data.iloc[u,20]
    
    users_array.append(user)
    users_dictionary[user.customerID] = user
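When attributes are assigned outside the constructor, a loop with the built-in `setattr` can replace the long list of assignments. The sketch below uses a hypothetical class `BareUser` and a hand-written row; it also shows the main risk of this style, namely that a method fails if an attribute was never set.

```python
class BareUser:
    """Hypothetical class without __init__, mirroring the cell above."""
    def check(self):
        return self.StreamingTV == 1 and self.StreamingMovies == 1


row = {"customerID": "7590-VHVEG", "StreamingTV": 0, "StreamingMovies": 0}

bare_user = BareUser()
for name, value in row.items():
    setattr(bare_user, name, value)        # same as bare_user.<name> = value

print(bare_user.__dict__)
print(bare_user.check())                   # False

incomplete = BareUser()                    # no attributes assigned
try:
    incomplete.check()
except AttributeError as err:
    print("missing attribute:", err)
```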

<h2>Inheritance</h2>

A class can inherit attributes and methods (behavior) from another class, called the super class (or parent class). The class that inherits is called the child class or subclass.

In [108]:
'''
let's create a subclass named User_subclass
putting User inside the parentheses of the class definition passes all its attributes and functions on to the subclass (User_subclass)
'''
class User_subclass(User):
    
    def round_charges(self):
        # a function that prints the integer part of the total charges
        print(int(float(self.TotalCharges)))

In [111]:
'''
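instead of creating a new User, let's create an instance of User_subclass
we can call both functions on this new instance: check (inherited from User) and round_charges (defined in the subclass)
note: User here must be the version with __init__ from "Building User Class"; if the constructor-free User was defined afterwards, re-run that cell first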
'''

# creating an instance of User_subclass
user = User_subclass(customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn)    

# accessing the attribute values
print("gender",user.gender, "monthly charges",user.TotalCharges)

# functions
check = user.check()
print("from outside the function",check)

print('integer value from the round_charges function')
user.round_charges()

gender 0 monthly charges 29.85
printing within the function: one or none
from outside the function non or both
integer value from the round_charges function
29


In [114]:
'''
Suppose the subclass, after inheriting from the super class, needs its own constructor
all the parameters still need to be passed (note the extra parameter, country)
instead of repeating every self.variable = value line, super().__init__(...) reuses the parent constructor, and only the new variable is set here
'''
class User_subclass(User):
    def __init__(self,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn,country):
        super().__init__(customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn)
        self.country = country

country = 'USA'
user    = User_subclass(customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn,country)
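The parameter list of the subclass can be shortened further by forwarding arguments with `*args` and `**kwargs`. The sketch below uses hypothetical classes `BaseUser` and `CountryUser` with only two inherited attributes; it also shows a subclass overriding a parent method while still reusing it.

```python
class BaseUser:
    """Hypothetical stand-in for User with two attributes."""
    def __init__(self, customerID, tenure):
        self.customerID = customerID
        self.tenure = tenure

    def describe(self):
        return f"{self.customerID}: tenure {self.tenure}"


class CountryUser(BaseUser):
    def __init__(self, *args, country, **kwargs):
        super().__init__(*args, **kwargs)    # forward everything else to BaseUser
        self.country = country               # only the new attribute is set here

    def describe(self):                      # override that reuses the parent version
        return super().describe() + f", country {self.country}"


cu = CountryUser("7590-VHVEG", 1, country="USA")
print(cu.describe())
print(isinstance(cu, BaseUser))              # True: a CountryUser is also a BaseUser
```

Because `country` is keyword-only in this sketch, it must be passed by name, which keeps the call readable when the parent takes many positional arguments.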


<h2>Encapsulation</h2>

Encapsulation refers to bundling the code with the data it operates on. It can also be used to restrict access to variables and methods so that data is not modified accidentally. There are two types of methods: public and private. The method check in the class User is a public method and can be called from outside the class. We will now create a private function for the class, which cannot be called from outside by its plain name.

Names of private variables and private methods (functions) start with a double underscore __. Python does not enforce privacy strictly: it renames such members internally (name mangling), as the last example in this section shows.

<h3>private methods</h3>

In [208]:
class User():
    def check(self):
        var = ''
        if self.StreamingTV == 1 and self.StreamingMovies == 1:
            print("both")
            var = "both"
        else:
            print("one or none")
            var = "non or both"        
        return(var)
    
    # this is how a private function is defined
    def __private_function(self):
        print('private function')
        
u = User()
u.StreamingTV     = 1
u.StreamingMovies = 0
u.check()
u.__private_function()               # this line gives an error

one or none


AttributeError: 'User' object has no attribute '__private_function'

<h3>private variables</h3>

Just like private methods, we can create private variables which cannot be accessed from outside by their plain name. In the previous cases, anyone could change a value by writing "user.MonthlyCharges = 0". To avoid such mistakes, we can use private variables so that values do not change unless intended.

In [177]:
# ---------------------- creating private variables and a private function ----------------------
class User():
    __country   = ''
    __continent = ''
    
    def __init__(self,country,continent):             
        self.__country   = country
        self.__continent = continent
        self.__private_function()  # private function inside the __init__ function works (can be placed in any function)
        
    def check(self):
        var = ''
        if self.StreamingTV == 1 and self.StreamingMovies == 1:
            print("both")
            var = "both"
        else:
            print("one or none")
            var = "non or both"
        self.__private_function()
        return(var)
    
    # this is how a private function is defined
    def __private_function(self):
        print('private function')  # can be called when placed within any other function
        
    def change_country(self,country,continent):
        self.__country   = country
        self.__continent = continent
        
u = User('USA','North America')
u.StreamingTV     = 1
u.StreamingMovies = 0
u.check()
print(u.__dict__)

# we can change the values only by calling the function that is meant to change them

u.change_country('UK','Europe')
print(u.__dict__)

# we cannot read u.__country from outside (it is a private variable); the following line throws an error
#print(u.__country)

# however, we can still access and change the values through their mangled names
# check u.__dict__ to see how the values are stored
# it is unlikely that a coder would change a value this way by mistake
u._User__country   = 'India'
u._User__continent = 'Asia'
print(u.__dict__)


private function
one or none
private function
{'_User__country': 'USA', '_User__continent': 'North America', 'StreamingTV': 1, 'StreamingMovies': 0}
{'_User__country': 'UK', '_User__continent': 'Europe', 'StreamingTV': 1, 'StreamingMovies': 0}
{'_User__country': 'India', '_User__continent': 'Asia', 'StreamingTV': 1, 'StreamingMovies': 0}
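Another common way to protect a value, different from the double-underscore approach above, is a read-only property. The sketch below uses a hypothetical class `Customer`; the single leading underscore in `_country` is only a naming convention that signals "internal".

```python
class Customer:
    """Hypothetical class: country can be read, but not reassigned directly."""
    def __init__(self, country):
        self._country = country

    @property
    def country(self):                 # read access: c.country
        return self._country

    def change_country(self, country): # the only intended way to change it
        self._country = country


c = Customer("USA")
print(c.country)

try:
    c.country = "UK"                   # no setter defined, so this fails
except AttributeError:
    print("cannot assign directly")

c.change_country("UK")
print(c.country)
```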


<h2>Abstraction</h2>

Just as a class is a template/blueprint for an object, an abstract class is a blueprint for other classes built on top of it. An abstract class declares methods, possibly without implementing them, and we cannot create an object of an abstract class directly. However, a subclass that implements all the abstract methods can be instantiated, and its objects can also use the ordinary (non-abstract) methods defined in the abstract class.

In [184]:
'''
the abc module provides a predefined way of creating abstract classes
if the abstract class declares several abstract methods and a subclass implements only some of them, an object of that subclass can't be created
'''
from abc import ABC,abstractmethod

class Abstract(ABC): # because this class extends (inherits from) ABC, it can declare abstract methods
    
    @abstractmethod  # to create an abstract method, we need to put @abstractmethod before it
    def method1(self):
        pass
    
    @abstractmethod
    def method2(self):
        pass
    
    def method3(self):   # ordinary method: subclasses inherit it as it is
        pass
           
# we create a new class that inherits from the abstract class and implements part of it
class User1(Abstract):
    
    def method1(self):
        print('method 1: current class')
    

# cannot be created (gives an error) because not all the abstract methods of the abstract class have been implemented
u = User1()

TypeError: Can't instantiate abstract class User1 with abstract methods method2

In [185]:
# User2 implements all the abstract methods of the abstract class, hence we can create an instance of User2
class User2(Abstract):
    
    def method1(self):
        print('method 1: current class')
    
    def method2(self):
        print('method 2: current class')

u = User2()     # no error as all abstract methods are implemented

In [186]:
# we can also create a subclass of User1 that adds the missing abstract method, and then create an object
# it inherits method1 from User1
class User3(User1):
    
    def method2(self):
        print('method 2: current class')

u = User3()    # no error: with the inherited method1, all abstract methods are implemented
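As a sketch of how an abstract class can act as a shared template, the example below defines a hypothetical `Model` interface with two abstract methods and one ordinary method that every subclass inherits. The class names and the toy data are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical template: every model must provide fit and predict."""

    @abstractmethod
    def fit(self, X, y):
        ...

    @abstractmethod
    def predict(self, X):
        ...

    def accuracy(self, X, y):            # ordinary method shared by all subclasses
        predictions = self.predict(X)
        correct = sum(1 for p, t in zip(predictions, y) if p == t)
        return correct / len(y)


class MajorityClass(Model):
    """Always predicts the most frequent label seen during fit."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


X_toy = [[0], [1], [2], [3]]
y_toy = [0, 0, 0, 1]
model = MajorityClass().fit(X_toy, y_toy)
print(model.accuracy(X_toy, y_toy))      # 0.75
```

Anyone adding a new model only has to write `fit` and `predict`; `accuracy` comes for free from the template.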

We can also define a constructor (the __init__ method) in the abstract class.


<b>Where is it used?</b> It can be helpful for creating a common Application Programming Interface (API), which third-party companies use to work with the main company. For example, some games on Facebook (e.g. the helicopter game) are owned by a different company. Facebook provides an API that tells that company how to write its code so that the game integrates easily with Facebook (and meets other requirements, such as ethical considerations). Facebook can also provide a host of built-in functionality in the abstract class that is useful for the game developer.

It is also used when we are working in a big team. To maintain the code structure, a single template is provided so that every member of the team (working on their own part of the project) follows the same guidelines.

<h1>OOP Example in Machine Learning</h1>

Similar to the previous section, we will build classes and objects for machine learning models. This can be useful when you work in a team and are responsible for one of the ML methods.

<h2>Data preparation</h2>

In [118]:
'''
Some of the columns have continuous data (or integer data)
Some of the columns have binary data
Three columns have categorical data stored as integer codes (with more than two possible values): we convert these to one-hot vectors first
'''
print('original shape :', data.shape)
# converting to one hot vector
columns_to_change = ['InternetService','Contract','PaymentMethod']
for c in columns_to_change:
    oneHot = pd.get_dummies(data[c],prefix=c)  # converting to one hot
    data   = data.drop(c,axis = 1)             # drop the converted column
    data   = pd.concat([data,oneHot],axis=1)   # Join the encoded df

print('new shape :',data.shape)
print('new shape of the data is higher than original as it adds new column: check the data with data.head() command')

# normalize the data (it is a good habit to normalize the data in machine learning; not required for decision trees)
columns_to_normalize = ['tenure','MonthlyCharges']
for c in columns_to_normalize:
    data[c] = (data[c]-np.mean(data[c]))/np.std(data[c])
    
# let's drop total charges as it is roughly equal to monthly charges x tenure (good habit to drop correlated variables)
del data['TotalCharges']

original shape : (7043, 21)
new shape : (7043, 28)
new shape of the data is higher than original as it adds new column: check the data with data.head() command
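The normalization step is a z-score: subtract the mean and divide by the standard deviation. A minimal sketch without numpy, using only the first five tenure values from the data preview (the notebook itself uses the whole column):

```python
from statistics import mean, pstdev

tenure = [1, 34, 2, 45, 2]        # first five tenure values shown in the data preview

mu = mean(tenure)
sigma = pstdev(tenure)            # population std, matching the default of np.std
z = [(t - mu) / sigma for t in tenure]

print([round(v, 3) for v in z])
print(round(mean(z), 6), round(pstdev(z), 6))   # mean close to 0, std close to 1
```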


In [152]:
# building the class for machine learning algorithms
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

class MachineAlgo():
    def __init__(self,X,y):
        self.X = X
        self.y = y
        self.score_logistic = None
        self.score_xgboost  = None
        
    def logistic_regression(self,parameter):
        model  = LogisticRegression(**parameter).fit(self.X, self.y)
        y_pred = model.predict(self.X)
        predictions = [round(value) for value in y_pred]    # predict() already returns class labels (0/1); round() leaves them unchanged
        accuracy    = accuracy_score(self.y, predictions)
        self.score_logistic  = accuracy
        print(accuracy)
        
        
    def xgboost(self,parameter):
        model   = XGBClassifier(**parameter)
        model   = model.fit(self.X, self.y)
        y_pred  = model.predict(self.X)
        predictions = [round(value) for value in y_pred]    # predict() already returns class labels (0/1); round() leaves them unchanged
        accuracy    = accuracy_score(self.y, predictions)
        self.score_xgboost   = accuracy
        print(accuracy)

In [129]:
# preparing data for machine learning (we skip the train/test split here, so the accuracy is measured on the training data)
# removing columns that should not be features, e.g. customerID and the target Churn

X = data.copy()
del X['customerID']
del X['Churn']

y = data['Churn']

<h2>ML Models</h2>

When writing code for any machine learning model, keep the documentation for that algorithm open (it is really helpful).

1. logistic regression: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
2. XGBoost model      : https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn

Go through the different parameters and select the values that give the best model performance (this is called hyper-parameter tuning).

In [155]:
# building the object for machine learning
ML = MachineAlgo(X,y)

# checking if the data is entered in the model or not
print('X shape:',ML.X.shape,'y shape:',ML.y.shape)

# building the logistic regression model
parameter = {'penalty':'l2', 'solver':'lbfgs', 'max_iter':100, 'verbose':0, 'n_jobs':-1}
ML.logistic_regression(parameter)

# building the xgboost model
parameter = {'max_depth':5, 'n_estimators':100, 'eta':1, 'objective':'binary:logistic','min_child_weight':1}
ML.xgboost(parameter)

print()
print('printing accuracy ------ ')
print(ML.score_logistic)
print(ML.score_xgboost)

X shape: (7043, 25) y shape: (7043,)
0.8037767996592361
0.8401249467556439

printing accuracy ------ 
0.8037767996592361
0.8401249467556439
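The accuracies above are computed on the same rows the models were trained on, so they are likely to be optimistic. A usual remedy is to hold out part of the data for testing. The helper below, `train_test_indices`, is a hypothetical pure-Python sketch of such a split; scikit-learn offers `train_test_split` in `sklearn.model_selection` for the same purpose.

```python
import random


def train_test_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle the row positions and split them into train and test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)          # seeded, so the split is reproducible
    n_test = int(round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_indices(7043)    # 7043 rows, as in the dataset
print(len(train_idx), len(test_idx))
```

With pandas, the two parts could then be selected with `X.iloc[train_idx]` and `X.iloc[test_idx]` (and likewise for `y`), fitting on the first and scoring on the second.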


<h3>What is __main__ in python (Extra: Not important now)?</h3>

You may have noticed that many Python files on GitHub contain a check of the form if __name__ == '__main__', often together with a main function. A main method is required in Java, but it is not strictly necessary in Python. Then why is it used?

In [209]:
# print __name__ to see what Python sets it to when we run code directly
print(__name__)            # displays __main__ (Python sets __name__ to '__main__' for code that is run directly)

__main__


In [None]:
'''
- it matters when we import a python file
- suppose we import a file that has lots of code and functions
- when we import the file, all the top-level code in it is run (right when the import happens)
- to avoid this, we use a main function behind a __name__ check so that the code is not run unnecessarily
- to run this example, create two files, keep them in the same folder and then run file 2
'''
# file 1
print('abc')
print('efg')

# file 2
import file1
print('second file',__name__)

# when file 2 is run, it prints abc and efg (from file1) and then 'second file __main__' (for file 2)
# if we introduce a main function with the __name__ check, that code will not run when file 2 imports file 1

# file1
def main():
    print('abc')
    print('efg')

if __name__=='__main__':
    main()
    
# file2
import file1
print('second file',__name__)    # running file 2 does not print abc and efg, because importing file1 does not call its main function

In [None]:
# another example of how it works
# file 1

print('xyz')                       # this always runs as it is outside the __name__ check
if __name__ == '__main__':         # true only when the file is run standalone
    print('directly run')
else:
    print('indirectly run')

# if file 1 is run on its own, it prints 'xyz' and 'directly run'

# file 2
import file1

# if file 2 is run, it prints 'xyz' and 'indirectly run'
# thus, if we want code to run on import, put it outside the __name__ check
# otherwise, if you want it to run only when the file is run directly, put it inside the check
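Put together, a typical single-file layout looks like the sketch below. The function name `main` is a convention rather than a requirement, and the returned message is only there so that the result can be inspected.

```python
def main():
    """Code that should run only when the file is executed directly."""
    message = "directly run"
    print(message)
    return message


if __name__ == "__main__":     # false when this file is imported from another file
    main()
```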

<h1><center>Recursion</center></h1>

Recursion is an important concept and is often asked about in interviews. As its name suggests, it means defining something in terms of itself. Examples include the Fibonacci series, or Inception (a dream within a dream). Recursion is an advanced topic. Used well, it can make code shorter and more elegant, although a recursive solution is not automatically the most efficient one.

NOTE: <b>The most important thing in recursion is to think about its stopping criterion. It is very easy to get trapped in infinite recursion from which we never come out</b> (again, the movie Inception).

<h3>Example 1: Fibonacci Series</h3>

The Fibonacci series appears in many places in nature; examples include pineapples, shells, sunflowers, pinecones etc. It follows a simple equation: f(n) = f(n-1) + f(n-2) where f(0) = 0 and f(1) = 1. 

The sequence runs: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34,...

In [226]:
# function that returns the nth Fibonacci number

def fibonacchi(n):
    if n <= 1:
        return(n)                               # stopping criterion: returns directly without recursing further
    else:
        return(fibonacchi(n-1)+fibonacchi(n-2)) # recursion (the argument must move towards the stopping criterion; here it decreases by 1 and 2)
    
fibonacchi(6)

8
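The plain recursive version recomputes the same values many times, so it slows down quickly as n grows. One standard remedy, sketched here with the standard library's `functools.lru_cache`, is to remember results that were already computed. The function is named `fib` to keep it separate from the version above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)           # cache every result that has been computed
def fib(n):
    if n <= 1:
        return n                   # stopping criterion
    return fib(n - 1) + fib(n - 2)


print(fib(6))     # 8
print(fib(50))    # 12586269025
```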

<h3>Example 2: Factorial</h3>

In [225]:
def factorial(n):
    if n <= 1:
        return(1)                              # stopping criterion (also covers 0! = 1)
    else:
        return(n*factorial(n-1))               # moves towards the stopping criterion (by subtracting 1)
        
factorial(5)

120
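Every recursive call adds a frame to the call stack, and Python limits the recursion depth (the default limit is 1000), so very large inputs are usually handled with a loop instead. The sketch below is a hypothetical iterative counterpart, `factorial_iterative`, checked against `math.factorial` from the standard library.

```python
import math


def factorial_iterative(n):
    """Loop-based factorial: no recursion, so no recursion-depth limit."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


print(factorial_iterative(5))                                  # 120
print(factorial_iterative(2000) == math.factorial(2000))       # True
```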

The image below explains how the recursion proceeds in both examples.

<img src='Recursion.png'></img>