---
title: "Object-Oriented Programming"
date: "`r Sys.Date()`"
output:
  workflowr::wflow_html:
    toc: true
---
```{r setup, include=FALSE}
library(tidyverse)
library(reticulate)
knitr::opts_chunk$set(echo = TRUE)
```
## Very brief history
From "Clean Architecture" by Robert C. Martin.
In 1966 Ole Johan Dahl and Kristen Nygaard discovered Object-Oriented Programming (OOP). They noticed that the function call stack frame (static) in ALGOL could be moved to a heap (dynamic), thereby allowing local variables declared by a function to exist long after the function returned. The function became a constructor for a class, the local variables became instance variables, and the nested functions became methods. This led to the discovery of polymorphism through the disciplined use of function pointers.
## Introduction
From [Introduction to OOP systems in Advanced R](https://adv-r.hadley.nz/oo.html).
The main reason to use OOP is **polymorphism**. Polymorphism means that a developer can consider a function's interface separately from its implementation, making it possible to use the same function form for different types of input. This is closely related to the idea of **encapsulation**: the user does not need to worry about details of an object because they are encapsulated behind a standard interface.
To be concrete, polymorphism is what allows `summary()` to produce different outputs for numeric and factor variables.
```{r summary_height}
class(women$height)
summary(women$height)
```
```{r summary_diet}
class(ChickWeight$Diet)
summary(ChickWeight$Diet)
```
You could imagine `summary()` containing a series of if-else statements, but that would mean only the original author could add new implementations, because supporting a new type would require editing the body of `summary()` itself. An OOP system makes it possible for any developer to extend the interface with implementations for new types of input.
To be more precise, OO systems call the type of an object its **class**, and an implementation for a specific class is called a **method**. Roughly speaking, a class defines what an object _is_ and methods describe what that object can _do_. The class defines the **fields**, the data possessed by every instance of that class. Classes are organised in a hierarchy so that if a method does not exist for one class, its parent's method is used, and the child is said to **inherit** behaviour. For example, in R, an ordered factor inherits from a regular factor, and a generalised linear model inherits from a linear model. The process of finding the correct method given a class is called **method dispatch**.
There are two main paradigms of OOP which differ in how methods and classes are related. These paradigms can be considered encapsulated and functional:
* In **encapsulated** OOP, methods belong to objects or classes, and method calls typically look like `object.method(arg1, arg2)`. This is called encapsulated because the object encapsulates both data (with fields) and behaviour (with methods), and is the paradigm found in most popular languages, like Python.
* In **functional** OOP, methods belong to **generic** functions, and method calls look like ordinary function calls: `generic(object, arg2, arg3)`. This is called functional because from the outside it looks like a regular function call, and internally the components are also functions.
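The two call styles above can be sketched side by side in Python. This is only an illustration: `Circle` and `describe` are made-up names, and `functools.singledispatch` from the standard library stands in for a generic function in the functional style.

```python
from functools import singledispatch


class Circle:
    """Encapsulated style: the method lives on the class."""

    def __init__(self, radius):
        self.radius = radius

    def describe(self):
        return f"circle of radius {self.radius}"


@singledispatch
def describe(obj):
    """Functional style: a generic function that dispatches on type(obj)."""
    return f"object of type {type(obj).__name__}"


@describe.register
def _(obj: int):
    # Method of the generic for integers.
    return f"integer {obj}"


@describe.register
def _(obj: list):
    # Method of the generic for lists.
    return f"list of length {len(obj)}"


c = Circle(2)
print(c.describe())      # encapsulated call: object.method()
print(describe(3))       # functional call: generic(object)
print(describe([1, 2]))  # same generic, different method
print(describe(c))       # no method registered, so the default is used
```

Note that a new method can be registered on `describe` from anywhere, without editing the original function, which is the extensibility the functional style is after.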
### Encapsulated OOP
Encapsulated OOP is a programming paradigm that organises a program into objects, which are data structures consisting of **attributes** and **methods**. Objects are instantiated from a class; you can think of a class as a blueprint or a factory. Each instance generated from a class has access to the class's attributes and methods.
Classes can be extended by subclasses, and it is this inheritance hierarchy that makes code object-oriented. Classes support code reuse in ways that other components cannot and this is the main purpose of OOP. With classes, we code by customising existing code, instead of either changing existing code in place or starting from scratch. Once you get used to programming by software customisation, writing a new program becomes a task of mixing together existing superclasses that already implement the behaviour required by your program. In many application domains, collections of superclasses are known as frameworks; they implement common programming tasks as classes that are ready to be used in your programs. With frameworks, you often simply code a subclass that is specific to your purposes and inherit everything else from the class tree.
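As a rough sketch of coding by customisation, the example below uses a hypothetical `Report` class to play the role of a framework superclass, and a subclass `SalesReport` that overrides only the piece that differs. Both names are invented for illustration.

```python
class Report:
    """Hypothetical framework class: owns the overall workflow."""

    def header(self):
        return "Report"

    def body(self):
        return "(no content)"

    def render(self):
        # The workflow is fixed here; subclasses customise the pieces.
        return f"{self.header()}\n{self.body()}"


class SalesReport(Report):
    """Our subclass overrides only the part that differs."""

    def body(self):
        return "Sales were up."


sales_report = SalesReport()
print(sales_report.render())  # header() and render() are inherited
```

Nothing in `Report` was edited; the new behaviour comes entirely from the subclass.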
## Programming paradigms
Most of the code I write ends up in scripts that perform a certain task. The
code is interpreted starting from the first line until it reaches the last
line. I believe this type of programming style is known as procedural
programming, where the execution is like following a recipe from start to
finish until a desired state is reached. Then there's functional programming,
a style built around pure functions: functions whose output depends only on
their inputs and that have no side effects. OOP organises a program into
objects, which are data structures consisting of **attributes** and **methods**,
and these objects interact with each other to solve a problem.
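To make the contrast concrete, here is one small task (summing the squares of a list) written in each of the three styles. This is only a sketch; the names `square` and `SquareSummer` are invented for the example.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Procedural: a sequence of steps that updates state as it goes.
total_procedural = 0
for n in numbers:
    total_procedural += n * n


# Functional: pure functions are combined and no variable is mutated.
def square(n):
    return n * n


total_functional = reduce(lambda acc, n: acc + square(n), numbers, 0)


# Object-oriented: data and behaviour are bundled together in an object.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)


total_oop = SquareSummer(numbers).total()

print(total_procedural, total_functional, total_oop)
```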
## OOP in Python
```{r include=FALSE, message=FALSE}
if (!dir.exists(miniconda_path())){
  reticulate::install_miniconda()
}
use_python(paste0(miniconda_path(), '/bin/python3'))
```
I will use Python to illustrate some OOP concepts. Python's main object-oriented
programming tool is the class, which is used to create objects that support
inheritance. A
[class](https://docs.python.org/3/tutorial/classes.html) is like a blueprint or
definition for creating an object. Python classes provide a means of bundling
data and functionality together. Creating a new class creates a new type of
object, allowing new instances of that type to be made.
The simplest form of a class definition:
```
class ClassName:
    <statement-1>
    ...
    <statement-N>
```
```{python}
class MyClass:
    x = 2

my_obj = MyClass()
```
When `MyClass` was called, a new object with its own namespace was
instantiated; `my_obj` is an instance of `MyClass`. Each object generated
from a class has access to the class's attributes and methods.
Class objects support two kinds of operations: attribute references
and instantiation. Attribute references use the standard syntax used for all
attribute references in Python: `obj.name`. Valid attribute names are all the
names that were in the class's namespace when the class object was created. The
only operations understood by instance objects are attribute references. There
are two kinds of valid attribute names: data attributes and methods. A method is
a function that belongs to an object.
```{python}
print(my_obj.x)
```
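The idea that each instance has its own namespace, separate from the class's, can be sketched with the built-in `vars()` function. `Stepper` is a made-up class used only for this illustration.

```python
class Stepper:
    step = 1  # class attribute, stored in the class's namespace


a = Stepper()
b = Stepper()

print(vars(a))  # the instance namespace starts out empty
print(a.step)   # lookup falls back to the class attribute

a.step = 10     # assignment creates an entry in a's own namespace
print(vars(a))
print(b.step)        # b still sees the class attribute
print(Stepper.step)  # the class attribute itself is unchanged
```

Reading an attribute searches the instance first and then the class, whereas assigning through the instance always writes to the instance's namespace.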
Let's create a class with a method.
```{python}
class MyClass2:
    """A simple class with a method"""
    i = 1984

    def __init__(self, name):
        self.name = name

    def f(self):
        print(self)
        return 'Big Brother is watching you'

x = MyClass2('Winston')
```
With the class definition above, `MyClass2.i` and `MyClass2.f` are valid
attribute references, returning an integer and a function object, respectively.
When a class defines the special method named `__init__()`, class instantiation
automatically invokes `__init__()` for the newly created class instance. **This
means that the `__init__()` function is always executed when the class is
instantiated.** Use the `__init__()` function to assign values or to run operations
that are necessary when the object is being created. This function is typically
called the constructor.
Another important note regarding methods is that **the instance object is
automatically passed as the first argument**. The following calls are equivalent;
`self` is the instance object, which we assigned to `x`.
```{python}
x.f()
MyClass2.f(x)
```
Use the `obj.name` syntax to add a new data attribute not defined in the class.
Use the `dir()` function to list the names of all attributes, including methods, of an object.
```{python}
x.room_no = 101
print("\n".join(dir(x)), "\n")
```
However, objects need to be part of an inheritance hierarchy for the code to
qualify as being truly object-oriented. The syntax for a derived class
definition is:
```
class DerivedClassName(BaseClassName):
    <statement-1>
    ...
    <statement-N>
```
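A minimal sketch of a derived class, using the hypothetical classes `Employee` and `Manager`: the subclass extends `__init__()` through `super()`, overrides one method, and inherits the rest.

```python
class Employee:
    """Hypothetical base class."""

    def __init__(self, name):
        self.name = name

    def role(self):
        return "employee"

    def describe(self):
        return f"{self.name} ({self.role()})"


class Manager(Employee):
    """Derived class: extends __init__() and overrides role()."""

    def __init__(self, name, reports):
        super().__init__(name)  # reuse the base class initialiser
        self.reports = reports

    def role(self):
        return f"manager of {len(self.reports)}"


boss = Manager("Ada", ["Grace", "Alan"])
print(boss.describe())             # describe() is inherited from Employee
print(isinstance(boss, Employee))  # a Manager is also an Employee
```

Although `describe()` is defined only in `Employee`, it calls `self.role()`, so the overridden version in `Manager` is the one that runs.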
The syntax for multiple inheritance:
```
class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    ...
    <statement-N>
```
In the simplest cases, the search for attributes proceeds depth-first,
left-to-right, without searching the same class twice when there is an overlap
in the hierarchy. In practice it is slightly more complex: the method resolution
order changes dynamically to support cooperative calls to `super()`.
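The method resolution order can be inspected through the `__mro__` attribute of a class. The sketch below uses a made-up diamond hierarchy (`Base`, `Left`, `Right`, `Child`) in which each class cooperatively calls `super()`.

```python
class Base:
    def who(self):
        return ["Base"]


class Left(Base):
    def who(self):
        return ["Left"] + super().who()


class Right(Base):
    def who(self):
        return ["Right"] + super().who()


class Child(Left, Right):
    def who(self):
        return ["Child"] + super().who()


# The order in which classes are searched for an attribute.
print([cls.__name__ for cls in Child.__mro__])

# Each super() call moves to the next class in the instance's MRO.
print(Child().who())
print(Left().who())
```

Inside `Left.who()`, `super()` resolves to `Right` when the instance is a `Child` but to `Base` when the instance is a plain `Left`, which is why `Base` is visited only once in the diamond.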
TBC.