# Object-Oriented Programming (OOP)

We've met many data types and data structures, and we learned how to use them properly. However, when we deal with complex data, it is very convenient to be able to define new data structures that are tailored to our needs. The ability to do that, and the functionality that comes with it, is generally called Object-Oriented Programming, or OOP for short. In this course we will not discuss the philosophy behind OOP, but will focus on the technical details.

To make things more tangible we will apply the new concepts and examples, when applicable, to an imaginary supermarket and its data.

## Introduction

Let's reconsider the list object. Generally speaking, before we could do anything with it we had to **initialize** it. We saw several ways of doing that, but they all resulted in a new **instance** of the **class** called "list". For example, the following line of code **instantiates** the **object** _names_, which then becomes an **instance** of the **class** _list_.

In [1]:
names = ['Amit', 'Itamar', 'Reut']

We are familiar with the built-in function _type()_, and now we can also get acquainted with the built-in function _isinstance()_.

In [2]:
print(type(names))
print(isinstance(names, list))
print(isinstance(names, float))

<class 'list'>
True
False


As an instance of the class _list_, the object _names_ has (inherently) some **attributes** and **methods** (collectively called **properties**), which have been defined (by the authors of Python) for the class _list_. Attributes are accessed and methods are called with the '.' (dot) character.

Attributes are inherent characteristics, which any instance of the class _list_ has, e.g. its length (which Python exposes through the built-in function _len()_ rather than through the dot notation). Similarly, methods are inherent functions, supported by any instance of the class _list_, e.g. _append()_ and _pop()_. Since we created _names_ as a list, which is a predefined class (a built-in type in this case), the attributes and methods of the class _list_ are part of what _names_ is, and this is why they are available to us without the need to define them ourselves.
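
The dot notation described above can be tried directly on _names_. The following is a minimal sketch; the extra name 'Noa' is a made-up value used only for illustration.

```python
names = ['Amit', 'Itamar', 'Reut']

# Methods are called on the instance with the dot notation.
names.append('Noa')       # 'Noa' is a hypothetical extra item
last = names.pop()        # removes and returns the last item
print(last)               # Noa
print(names)              # ['Amit', 'Itamar', 'Reut']

# The length of a list is obtained with the built-in len().
print(len(names))         # 3

# Every instance keeps a reference to the class it was created from.
print(names.__class__)    # <class 'list'>
```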

## Instantiation

The first step in writing a class is usually a method called **\_\__init_\_\_()**, which serves as the **constructor** of the class. The constructor runs whenever a new instance of the class is created, and it is where the instance gets the initial values for its **attributes**.

As a running example we will build a class called _Table_. Let's think for a moment about the attributes that **any** table should have. A table is a collection of records with pre-defined fields. However, when the table has just been created, the fields are known, but no record is present yet. Let's see the \_\__init_\_\_() method of our _Table_ class and then discuss the implementation details.

In [3]:
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.records = []

We see that the definition itself is done by the keyword **class** followed by the name of the class. Then, with proper indentation, the method \_\__init_\_\_() is defined. It should be noted that a method is defined exactly like a function, and it is only a convention to use this term (method) when referring to functions defined inside a class.

That said, there is a very important difference - the use of the argument _**self**_. _self_ is not a reserved word but a universally followed naming convention that emphasizes the OOP concept: it refers to the instance on which the method is called, so every method of the class "carries" the information of the instance. All instance methods have _self_ as their first input argument, and we will see the importance of that immediately.

The other input (in this case _fields_) is used to initialize the attributes of the new instance. In the example above, _fields_ becomes an attribute with the same name, and another attribute called _records_ is initialized with an empty list.

Having the supermarket idea in our heads, let's see how to use our new class for relevant tables of data. The next lines create (instantiate) new objects of the _Table_ type.

In [4]:
customers = Table(['name', 'address', 'age'])
products = Table(['name', 'category', 'units', 'unit_price'])

The objects are created, but we still know very little about them...

In [5]:
print(customers)
print(customers.fields)
print(customers.records)

<__main__.Table object at 0x000001DBEB377CF8>
['name', 'address', 'age']
[]


In [6]:
print(products)
print(products.fields)
print(products.records)

<__main__.Table object at 0x000001DBEB377D68>
['name', 'category', 'units', 'unit_price']
[]
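
Since \_\__init_\_\_() runs separately for every new instance, each instance gets its own attributes. The following is a minimal sketch of that idea, using the first version of the class; appending directly to _records_ is done here only for illustration.

```python
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.records = []

customers = Table(['name', 'address', 'age'])
products = Table(['name', 'category', 'units', 'unit_price'])

# Each call to Table(...) ran __init__ with a different self,
# so each instance received its own records list.
customers.records.append(['Russell Crowe', 'Dizengoff 4', 51])

print(customers.records)                      # [['Russell Crowe', 'Dizengoff 4', 51]]
print(products.records)                       # []
print(customers.records is products.records)  # False
print(isinstance(customers, Table))           # True
```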


Very often there are attributes that are useful and do not require specific input from the user. Such attributes in the case of the _Table_ class are the number of fields (_n_\__fields_), which can be evaluated from _fields_, and the number of records in the table (_n_\__records_), which is initially zero. Adding that to the constructor yields the following final version of the method \_\__init_\_\_().

In [7]:
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.n_fields = len(fields)
        self.records = []
        self.n_records = 0

In [8]:
customers = Table(['Name', 'Address', 'Age'])

This is how we start, but we still can't do anything useful with the objects we've made. We will now see how to write methods to add functionality to the class.

## Methods

The first method any _Table_ should have is obviously one that adds a record to it. We remember that methods are simply functions that are "aware" of the (general) instance of which they are part, and that this "awareness" is implemented by "carrying" _self_ as the first input argument. Let's write our first method, _add_\__record_(), and add it to the class implementation.

In [9]:
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.n_fields = len(fields)
        self.records = []
        self.n_records = 0
    
    def add_record(self, rec):
        self.records.append(rec)
        self.n_records += 1

The method _add_\__record_() expects a single input argument - _rec_. Since the method "knows" the details of its caller (represented by _self_), it can append _rec_ to the **existing** list _self.records_. The attribute _n_\__records_ is not changed automatically with the change of _records_, so we have to update it explicitly whenever _add_\__record()_ is called.
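
One way to see what "carrying" _self_ means is to call the method through the class and pass the instance explicitly. The following is a minimal sketch; the second form is shown only to illustrate the mechanism and is not how methods are normally called.

```python
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.n_fields = len(fields)
        self.records = []
        self.n_records = 0

    def add_record(self, rec):
        self.records.append(rec)
        self.n_records += 1

customers = Table(['name', 'address', 'age'])

# The usual call: Python passes customers as self automatically.
customers.add_record(['Russell Crowe', 'Dizengoff 4', 51])

# The same kind of call written explicitly through the class.
Table.add_record(customers, ['Nicolas Cage', 'Basel 7', 52])

print(customers.n_records)  # 2
```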

It should be noted that the method didn't return anything (well, except _None_), but only altered the calling object. We should be familiar with this behavior from other mutable types like lists and dictionaries.

Let's test our new method.

In [10]:
# Initialization
customers = Table(['name', 'address', 'age'])
print("Current records:", customers.records)
print("Current number of records:", customers.n_records)

Current records: []
Current number of records: 0


In [11]:
# Calling add_record()
customers.add_record(['Russell Crowe', 'Dizengoff 4', 51])
print("Current records:", customers.records)
print("Current number of records:", customers.n_records)

Current records: [['Russell Crowe', 'Dizengoff 4', 51]]
Current number of records: 1


In [12]:
# Calling add_record() again
customers.add_record(['Nicolas Cage', 'Basel 7', 52])
print("Current records:", customers.records)
print("Current number of records:", customers.n_records)

Current records: [['Russell Crowe', 'Dizengoff 4', 51], ['Nicolas Cage', 'Basel 7', 52]]
Current number of records: 2


Let's add some more methods to our _Table_ class to facilitate the work with tables. The methods we are going to add are:

* _remove_\__record(self, rec)_ - Removes from _self_ the specified record
* _get_\__column(self, field)_ - Returns a list with the values of the column associated with _field_. This method does **NOT** change _self_.
* _get_\__records(self, field, value)_ - Returns a _**Table**_ instance like _self_, but only with the records that have the value _value_ at the field _field_. This method does **NOT** change _self_.
* _get_\__fields(self, fields)_ - Returns a _**Table**_ instance like _self_, but only with the fields listed in _fields_. This method does **NOT** change _self_.

In [13]:
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.n_fields = len(fields)
        self.records = []
        self.n_records = 0
    
    def add_record(self, rec):
        self.records.append(rec)
        self.n_records += 1
        
    def remove_record(self, rec):
        self.records.remove(rec)
        self.n_records -= 1
        
    def get_column(self, field):
        ind = self.fields.index(field)
        return [rec[ind] for rec in self.records]        
                
    def get_records(self, field, value):
        ind = self.fields.index(field)
        ret = Table(self.fields)
        for rec in self.records:
            if rec[ind] == value:
                ret.add_record(rec)
        return ret
            
    def get_fields(self, fields):
        columns = [self.get_column(field) for field in fields]
        records = [list(rec) for rec in zip(*columns)]
        ret = Table(fields)
        for rec in records:
            ret.add_record(rec)
        return ret
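
Before moving on, here is a minimal sketch of _remove_\__record()_ in action, using a reduced copy of the class with only the methods it needs. It also shows what happens when the record is not in the table: _list.remove()_ raises a _ValueError_ before _n_\__records_ is decremented, so the counter stays consistent.

```python
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.n_fields = len(fields)
        self.records = []
        self.n_records = 0

    def add_record(self, rec):
        self.records.append(rec)
        self.n_records += 1

    def remove_record(self, rec):
        self.records.remove(rec)
        self.n_records -= 1

customers = Table(['name', 'address', 'age'])
customers.add_record(['Russell Crowe', 'Dizengoff 4', 51])
customers.add_record(['Nicolas Cage', 'Basel 7', 52])

# Records are compared by value, so an equal list is enough to find the record.
customers.remove_record(['Russell Crowe', 'Dizengoff 4', 51])
print(customers.records)    # [['Nicolas Cage', 'Basel 7', 52]]
print(customers.n_records)  # 1

# Removing a record that is not in the table raises ValueError.
try:
    customers.remove_record(['Diane Keaton', 'Basel 9', 52])
except ValueError:
    print("No such record")
print(customers.n_records)  # 1
```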

In [14]:
customers = Table(['name', 'address', 'age'])
customers.add_record(['Russell Crowe', 'Dizengoff 4', 51])
customers.add_record(['Nicolas Cage', 'Basel 7', 52])
customers.add_record(['Diane Keaton', 'Basel 9', 52])
print("Current records: {}\nCurrent number of records: {}".format(customers.records, customers.n_records))

Current records: [['Russell Crowe', 'Dizengoff 4', 51], ['Nicolas Cage', 'Basel 7', 52], ['Diane Keaton', 'Basel 9', 52]]
Current number of records: 3


In [15]:
addresses = customers.get_column('address')
print(addresses)

['Dizengoff 4', 'Basel 7', 'Basel 9']


In [16]:
aged_52 = customers.get_records('age', 52)
print(aged_52)
print(aged_52.records)

<__main__.Table object at 0x000001DBEB3BA160>
[['Nicolas Cage', 'Basel 7', 52], ['Diane Keaton', 'Basel 9', 52]]
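
A subtle point worth knowing: _get_\__records()_ does not change _self_, but the new table holds the very same record lists as the original, not copies of them. The following is a minimal sketch of this, using a reduced copy of the class; copying each record with _list(rec)_ is one possible way to get an independent table, not something the class above does.

```python
class Table:
    def __init__(self, fields):
        self.fields = fields
        self.n_fields = len(fields)
        self.records = []
        self.n_records = 0

    def add_record(self, rec):
        self.records.append(rec)
        self.n_records += 1

    def get_records(self, field, value):
        ind = self.fields.index(field)
        ret = Table(self.fields)
        for rec in self.records:
            if rec[ind] == value:
                ret.add_record(rec)
        return ret

customers = Table(['name', 'address', 'age'])
customers.add_record(['Russell Crowe', 'Dizengoff 4', 51])
customers.add_record(['Nicolas Cage', 'Basel 7', 52])
customers.add_record(['Diane Keaton', 'Basel 9', 52])

aged_52 = customers.get_records('age', 52)

# The filtered table shares its record objects with the original table.
print(aged_52.records[0] is customers.records[1])  # True

# Copying each record on the way in gives an independent table.
copied = Table(customers.fields)
for rec in aged_52.records:
    copied.add_record(list(rec))
copied.records[0][2] = 53
print(customers.records[1][2])  # 52
```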


In [17]:
names_and_ages = customers.get_fields(['name', 'age'])
print(names_and_ages)
print(names_and_ages.records)

<__main__.Table object at 0x000001DBEB3BA1D0>
[['Russell Crowe', 51], ['Nicolas Cage', 52], ['Diane Keaton', 52]]


In [18]:
cage_aged_52 = customers.get_records('age', 52).get_records('name', 'Nicolas Cage')
print(cage_aged_52.records)

[['Nicolas Cage', 'Basel 7', 52]]


## Examples

One of the great features of classes is that they can be used for many applications. In the supermarket example we can use the same class for constructing tables for customers and products. For that we will use the files "customers.txt" and "products.txt".

> **NOTE:** This chapter should run smoothly on a stand-alone machine. However, when running on Google Colab, we need to transfer the files from our local machine to Google Colab's remote server. Files can be uploaded to Google Colab with the following syntax:
>
> `from google.colab import files`
>
> `uploaded = files.upload()`
>
> and downloaded with the following syntax:
>
> `files.download(filename)`

In [19]:
import sys

if 'google.colab' in sys.modules:
    from google.colab import files
    uploaded = files.upload()

### Customers

In [20]:
customers = Table(['name', 'street', 'house', 'appartment', 'floor'])
with open('customers.txt') as f:
    for line in f:
        customer = line.split(',')
        
        name = ' '.join(customer[0].split()[1:])
        street = customer[1].split()[1]
        house = int(customer[2].split()[1])
        appartment = int(customer[3].split()[1])
        floor = int(customer[4].split()[1])
        
        customers.add_record([name, street, house, appartment, floor])
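
The content of "customers.txt" is not shown here, so the following sketch only illustrates how the parsing steps above behave on a single line. The line itself is hypothetical: it assumes comma-separated parts, each starting with a one-word label, which is consistent with the indexing used above but may differ from the real file.

```python
# Hypothetical line; the real layout of customers.txt may differ.
line = "name: Russell Crowe, street: Dizengoff, house: 4, appartment: 12, floor: 3\n"

customer = line.split(',')

# customer[0] is 'name: Russell Crowe'; split() gives ['name:', 'Russell', 'Crowe'],
# and everything after the label is joined back into the full name.
name = ' '.join(customer[0].split()[1:])

# For the other parts only the token right after the label is taken.
street = customer[1].split()[1]
house = int(customer[2].split()[1])
appartment = int(customer[3].split()[1])
floor = int(customer[4].split()[1])  # split() also discards the trailing newline

print([name, street, house, appartment, floor])
# ['Russell Crowe', 'Dizengoff', 4, 12, 3]
```

Under this assumed layout, a street name made of two words would be cut to its first word, because only one token is taken after the label.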

**Q1:** On which streets do our customers live?

In [21]:
print(set(customers.get_column('street')))

{'Hertzel', 'Allenby', 'Dizengoff', 'Weizmann', 'Basel'}


**Q2:** How many customers do we have on each street?

In [22]:
for street in set(customers.get_column('street')):
    print("There are {} customers on {} street.".format(customers.get_records('street', street).n_records, street))

There are 28 customers on Hertzel street.
There are 15 customers on Allenby street.
There are 23 customers on Dizengoff street.
There are 26 customers on Weizmann street.
There are 19 customers on Basel street.
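
The loop above calls _get_\__records()_ once per street, and each call scans all the records. As an alternative, the column can be counted in a single pass with _collections.Counter_ from the standard library. The following is a minimal sketch; the short list of streets is a made-up sample standing in for `customers.get_column('street')`.

```python
from collections import Counter

# Hypothetical sample standing in for customers.get_column('street').
streets = ['Basel', 'Allenby', 'Basel', 'Dizengoff', 'Basel', 'Allenby']

# Counter maps every distinct value to the number of times it appears.
counts = Counter(streets)

# most_common() yields (value, count) pairs, largest count first.
for street, n in counts.most_common():
    print("There are {} customers on {} street.".format(n, street))
```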


**Q3:** Who are the customers that live on the 3rd floor or higher?

In [23]:
max_floor = max(customers.get_column('floor'))
print([customers.get_records('floor', floor).get_column('name') for floor in range(3, max_floor+1)])

[['Reese Witherspoon', 'Charlton Heston', 'Burt Lancaster', 'Adrien Brody', 'Eddie Redmayne', 'Simone Signoret', 'Natalie Portman', 'Marlee Matlin', 'David Niven', 'Grace Kelly', 'Anne Bancroft', 'Art Carney', 'Jeff Bridges', 'Patricia Neal', 'Diane Keaton', 'Anna Magnani', 'F. Murray Abraham', 'Halle Berry', 'Robert Duvall', 'Jack Nicholson', 'Katharine Hepburn', 'Emma Thompson', 'Audrey Hepburn', 'Denzel Washington', 'Lee Marvin', 'William Holden', 'Jamie Foxx', 'Geraldine Page', 'Susan Hayward', 'Al Pacino', 'Sally Field', 'Jennifer Lawrence', 'Kate Winslet', 'John Wayne', 'Ellen Burstyn', 'Gwyneth Paltrow', 'Peter Finch', 'Nicole Kidman'], ['Julie Christie', 'Matthew McConaughey', 'Elizabeth Taylor', 'Frances McDormand', 'Paul Newman', 'Rex Harrison', 'Cate Blanchett', 'Julianne Moore', 'Sophia Loren', 'Roberto Benigni', 'Ben Kingsley', 'Tom Hanks']]
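
The answer above is a list of lists, one inner list per floor. If a single flat list is preferred, a condition inside a list comprehension is one possible approach. The following is a minimal sketch on a made-up sample with the same layout as the table; with the real data, `fields` and `records` would be `customers.fields` and `customers.records`.

```python
# Hypothetical sample in the same layout as the customers table.
fields = ['name', 'street', 'house', 'appartment', 'floor']
records = [
    ['Diane Keaton', 'Basel', 9, 4, 3],
    ['Tom Hanks', 'Allenby', 2, 7, 4],
    ['Russell Crowe', 'Dizengoff', 4, 1, 1],
]

name_ind = fields.index('name')
floor_ind = fields.index('floor')

# Keep the name of every record whose floor is 3 or higher.
high_floor_names = [rec[name_ind] for rec in records if rec[floor_ind] >= 3]
print(high_floor_names)  # ['Diane Keaton', 'Tom Hanks']
```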
