# 1.0 Object-Oriented Programming - Part I

## 1.1 Introduction

So far, we've learned a system of programming known as **Procedural Programming**. In its simplest definition, procedural programming involves writing code in a number of sequential steps — and sometimes we combine these steps into commands called **functions**.

In this chapter, we're going to learn about a new system: **object-oriented programming (OOP)**. Rather than being designed around sequential steps, code is instead organized around **objects**. For now, you can think of objects as being closely related to variables. We'll learn a more formal definition later in this chapter.

>When working with data, it's much more common to write in a style that is closer to procedural programming than to OOP, but it's still very important to understand how OOP works, because **Python is an object-oriented language**.

This means almost everything in Python is actually an object; when you're working with Python, you are creating and manipulating objects. As you continue to learn to work with data in Python, you'll encounter objects everywhere:

  - **NumPy** and **pandas** — the two libraries essential to working with data in Python — both define a number of their own object types.
  - **Matplotlib** — which you use to create data visualizations — uses object types to define the charts you create.
  - **Scikit-learn** — which you use to create machine learning models — uses object types to represent the models you train and make predictions with.

While it's much less common for data scientists and data analysts to define new types of objects, we'll be using objects all the time. Understanding how objects work allows us to better understand what is happening behind the scenes as we work with data.

In OOP, objects have types, but instead of "type" we use the word **class**. So far, we've been using the word "type" to describe different variables:

- String type
- List type
- Dictionary type

Technically, the correct name for each of these is:

- String class
- List class
- Dictionary class

In everyday English, the word **class** refers to a group of similar things. In OOP, we use the word similarly — a **class** is a type of object.

When talking about programming, the words **'type'** and **'class'** are often used interchangeably, but **'class'** is used when talking more formally about objects. Throughout this chapter, we'll be using **'class'** instead of **'type'** as we learn about OOP.

Before now, we used the [type() function](https://docs.python.org/3.7/library/functions.html#type) to return the types of variables. Let's do a quick exploration exercise to refamiliarize ourselves with the output of the **type()** function.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


In the cell below, we have defined one variable of each of three types we've worked with so far, plus a new kind of object. Follow each of the instructions below, in order.

1. Use the print() function to display the type of the list l.
2. Use the print() function to display the type of the string s.
3. Use the print() function to display the type of the dictionary d.
4. Use the print() function to display the type of my_set.


In [1]:
l = [1, 2, 3]
s = "string"
d = {"a": 1, "b": 2}
my_set = {2, 3, 5}

# PUT YOUR CODE HERE

print(type(l))
print(type(s))
print(type(d))
print(type(my_set))

<class 'list'>
<class 'str'>
<class 'dict'>
<class 'set'>


## 1.2 Sets

In the exercise in the previous section, we saw a new type of object, which we named *my_set*. This type of object is called a **set**.

We created a **set** by separating its values with commas and enclosing them in curly braces. Note that sets are different from dictionaries. In dictionaries, we have key-value pairs between the curly braces; in sets, we just have the elements themselves.

We can think of sets as unordered collections of objects without repetition.

- Unordered because it doesn't matter in which order the elements of a set are arranged upon creation, nor how they are displayed when we print a set:

In [2]:
order_1 = {0, 2, 3, 1}
order_2 = {1, 3, 2, 0}

print(order_1, order_2, sep="\n")
print("order_1 is equal to order_2: {ans}".format(ans=order_1==order_2))

{0, 1, 2, 3}
{0, 1, 2, 3}
order_1 is equal to order_2: True


- Without repetition because it can't have more than one of each element:


In [3]:
repetition_1 = {2, 3, 5, 2, 5}
repetition_2 = {2, 2, 3, 3, 3, 5, 5, 5, 5, 5}

print(repetition_1, repetition_2, sep="\n")
print("repetition_1 is equal to repetition_2: {ans}".format(ans=repetition_1==repetition_2))

{2, 3, 5}
{2, 3, 5}
repetition_1 is equal to repetition_2: True


**Sets** in Python model the concept of sets in mathematics. As such, we can perform binary set operations like union (grouping together in one set the elements of two sets) and intersection (grouping together only the elements that belong to both sets), among other operations like set difference and Cartesian product.

Don't worry if any of these seem a bit alien to you; we will explain these concepts as needed later on.

For now, let's focus on the first two operations. We can represent the intersection of the sets *order_1* and *repetition_1* with the following Venn diagram:

<center>
<img width="400" src="https://drive.google.com/uc?export=view&id=1kwWZbvhB6kP3SQOzeHT20goXpajbCcg8"/></center>

To obtain the intersection of order_1 with repetition_1 in Python, we can use the [set.intersection()](https://docs.python.org/3.7/library/stdtypes.html#frozenset.intersection) method like this:


In [4]:
o_intersect_r = order_1.intersection(repetition_1)
print(o_intersect_r)

{2, 3}


To find the union of these sets, we can use the [set.union()](https://docs.python.org/3.7/library/stdtypes.html#frozenset.union) method in a similar manner:



In [5]:
o_union_r = order_1.union(repetition_1)
print(o_union_r)

{0, 1, 2, 3, 5}


Here's a Venn diagram of the union:

<center>
<img width="400" src="https://drive.google.com/uc?export=view&id=1JaZsBpmrm3wbhMKBGeN-f5gmFH2KnKwn"/></center>
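As a small supplementary sketch, Python sets also support operator forms of these methods: `&` for intersection, `|` for union, and `-` for the set difference mentioned earlier. The variable names `common`, `combined`, and `only_in_order_1` are ours, chosen just for this illustration.

```python
order_1 = {0, 2, 3, 1}
repetition_1 = {2, 3, 5, 2, 5}

# & is the operator form of set.intersection()
common = order_1 & repetition_1

# | is the operator form of set.union()
combined = order_1 | repetition_1

# - is the operator form of set.difference():
# the elements of order_1 that are not in repetition_1
only_in_order_1 = order_1 - repetition_1

print(common == order_1.intersection(repetition_1))  # True
print(combined == order_1.union(repetition_1))       # True
print(only_in_order_1 == {0, 1})                     # True
```

The method forms and the operator forms produce equal sets here, so which one to use is mostly a matter of readability.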

We can also create sets from lists and other native Python objects, by using the **set** class:

In [6]:
fib_6 = set([1, 1, 2, 3, 5, 8])
print(fib_6)

{1, 2, 3, 5, 8}


You'll learn more about classes in this chapter; for now, it's enough to know the syntax for creating a set. The example above presents us with another use case for sets: duplicate removal.

Another use case is testing membership. It works just like with lists:

In [7]:
print(0 in fib_6)
print(1 in fib_6)

False
True


We can also create an empty set and add elements to it:

In [8]:
even = set()
print(even)
even.add(0)
print(even)

set()
{0}
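Two details are easy to miss here, so the following minimal sketch may help: empty curly braces create a dictionary rather than a set, and adding an element that is already present leaves the set unchanged. The names `empty_braces` and `empty_set` are hypothetical, used only for this illustration.

```python
empty_braces = {}      # empty curly braces create a dictionary
empty_set = set()      # set() is how we create an empty set

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>

even = set()
even.add(0)
even.add(2)
even.add(0)  # 0 is already in the set, so nothing changes

print(len(even))  # 2
```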


Let's practice with sets!

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

In the cell below we created two lists.

1. Create a set whose elements are those of *tri_num_sequence*. Assign it to *trinum_5*.
2. Create a set whose elements are the positive odd numbers smaller than 20. Assign it to *odd_20*.
3. Create a set whose elements are the odd numbers in *trinum_5*. Assign it to *odd_trinum*.
4. Print *odd_trinum*.

In [9]:
tri_num_sequence = [1, 3, 6, 10, 15, 10, 6, 3, 1]
odd_numbers = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

In [10]:
# PUT YOUR CODE HERE

trinum_5 = set(tri_num_sequence)

odd_20 = set(odd_numbers)

odd_trinum = trinum_5.intersection(odd_20)

print(odd_trinum)

{1, 3, 15}


## 1.3 Class and Objects

When we used the **type()** function in the first section, it returned values labeled "class":

```python
<class 'list'>
<class 'str'>
<class 'dict'>
<class 'set'>
```

This helps us understand that "type" and "class" are used interchangeably. Because of this, it becomes clear that we've been using classes for some time already:

- Python lists are objects of the **list** class.
- Python strings are objects of the **str** class.
- Python dictionaries are objects of the **dict** class.
- Python sets are objects of the **set** class.


In the rest of this chapter, we're going to learn how classes work by creating one of our own. We're going to create a simple class called **NewList** and recreate some of the basic functionality of the Python **list** class.

As we mentioned earlier, it's less common for data scientists and data analysts to define new types of objects, but understanding how objects work behind the scenes will be extremely valuable to you as you continue to extend your Python knowledge and work with objects more.

Before we start, let's learn some more about objects and classes. Earlier, we said that an object is similar to a variable. As with "class" and "type", the terms "object" and "variable" are often used interchangeably. We'll learn in more detail the subtle difference between objects and variables later, but for now, let's look at the relationship between objects and classes.

- An **object** is an entity that stores data.
- An object's **class** defines specific properties objects of that **class** will have.

One way to understand the difference between a class and an object in Python is by comparing them to real-world objects. We'll compare Python string objects to **Tesla electric cars**.


There are hundreds of thousands of Tesla cars around the world. Each car is similar in that it is a Tesla (not a Ford or a Toyota), but at the same time, it is not necessarily identical to other Teslas. We would say that each car is an object that belongs to the **Tesla** class.

Tesla has a blueprint — or plan — for making their cars. The blueprint defines what the car is, what it does, and how — everything that makes the car unique. That said, the blueprint isn't a car, it's just all the information needed to create the car. Similarly, in Python, we have code blueprints for classes. These blueprints are **class definitions**.

## 1.4 Defining a Class

We define a class in a very similar way to how we define a function:

<center>
<img width="600" src="https://drive.google.com/uc?export=view&id=1AFoor12o92_0JnoXJuCuvNOUWdXUBNAU"/></center>

Just like a function, we use parentheses and a colon after the class name (():) when we define a class. Similarly, the body of our class is indented like a function's body is.

The rules for naming classes are the same as they are for naming functions and variables:

1. We must use only letters, numbers, or underscores.
  - We cannot use apostrophes, hyphens, whitespace characters, etc.
2. Class names can't start with a number.

That said, there is a convention used for variables and functions in Python called **Snake Case**, where all lowercase letters are used with underscores between: like_this. With classes, the convention is to use **Pascal Case**, where no underscores are used between words, and the first letter of each word is capitalized: **LikeThis**.

If we try to run either of the code examples in the diagram at the top of this section, we will get a **SyntaxError**, because Python doesn't let us define classes or functions with an empty body.

We can use the [pass statement](https://docs.python.org/3/reference/simple_stmts.html#pass) to avoid this error. The pass statement doesn't do anything, but it lets us define an empty code block. Let's take a look at what it looks like:

In [11]:
class MyClass():
    pass

The **pass** statement is useful if you're building something complex and you want to create a placeholder for a function that you will build out later without causing your code to error. We'll use the **pass** statement in this section to define an empty class without causing an error.
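As a minimal sketch of the placeholder idea, both a function and a class can use **pass** as their entire body. The names `clean_data` and `ReportBuilder` are hypothetical, chosen only to show the Snake Case and Pascal Case conventions side by side.

```python
# Snake Case for a function name
def clean_data(rows):
    # Placeholder: the real logic will be written later
    pass

# Pascal Case for a class name
class ReportBuilder():
    # Placeholder: methods will be added later
    pass

# A function whose body is only `pass` runs without error and returns None
result = clean_data([1, 2, 3])
print(result)  # None
```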

Let's use what we've learned to create the very first version of our **NewList** class!


**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

1. Define our new class called NewList().
  - Use the **pass** keyword in the body of our class to avoid a **SyntaxError**.

In [12]:
# PUT YOUR CODE HERE
class NewList():
  pass

## 1.5 Instantiating a Class

Earlier, we compared objects to Tesla cars to help us understand the distinction between a class and an object. Let's extend that analogy to help us understand more about classes.

In OOP, we use the word **instance** to describe each individual object of a class. Let's look at an example:

<center>
<img width="500" src="https://drive.google.com/uc?export=view&id=1cs2KF-JuSxfhFVi2X8QJUsMItuq30Vng"/></center>

The same can be said of Python strings. We might create two Python strings, and they can hold different values, but they work the same way:


<center>
<img width="500" src="https://drive.google.com/uc?export=view&id=1xywNZa1H3UXTWpA6yDOjmLdNRcIpRP0w"/></center>

Once we have defined our class, we can create an object of that class, which is known as **instantiation**. If you create an object of a particular class, the technical phrase for what you did is to "**instantiate** an object of that class." Let's learn how to instantiate an object of our new class:

In [13]:
my_class_instance = MyClass()

That single line of our code actually did two things:

- Instantiated an object of the class **MyClass**.
- Assigned that instance to the variable named **my_class_instance**.
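To illustrate that each instantiation creates a separate object, here is a small sketch. It uses the `is` operator, which checks whether two names refer to the very same object; the variable names are ours, not part of the chapter's running example.

```python
class MyClass():
    pass

first_instance = MyClass()
second_instance = MyClass()

# Both objects are instances of the same class...
print(type(first_instance) == type(second_instance))  # True

# ...but they are two separate objects
print(first_instance is second_instance)  # False
```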

To illustrate this more clearly, let's look at an example using Python's built-in integer class. Previously, we used the syntax **int()** to convert numeric values stored as strings to integers. Let's look at a simple example of this in code and break down the syntax into parts, which we'll read right-to-left:

<center>
<img width="500" src="https://drive.google.com/uc?export=view&id=1ZTEGlFDRQBLxr_PYdgNB4bqq_NH8Nl9a"/></center>


The syntax to the right of the assignment operator (=) instantiates the object, and the assignment operator and variable name create the variable. This helps us understand some of the subtle differences between an object and a variable.

Keep in mind that in casual usage, "object" and "variable" are commonly used interchangeably. The distinction is usually only important if you're talking about OOP concepts like classes.
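One place where the distinction does show up is when two variables refer to the same object. The sketch below is only an illustration, with hypothetical variable names: a single list object is instantiated once and then given a second variable name.

```python
first_name = [1, 2, 3]     # instantiates one list object and names it
second_name = first_name   # creates a second variable for the same object

# Changing the object through one variable is visible through the other
second_name.append(4)

print(first_name)   # [1, 2, 3, 4]
print(second_name)  # [1, 2, 3, 4]
```

There are two variables here but only one object, which is why the change appears under both names.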

In a moment, we'll redefine our **NewList** class and instantiate it for the first time. Before we do, we'll need to learn one more thing. Usually, we'd define a class without anything in the parentheses, like this:

In [14]:
class MyClass():
    pass

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


1. Define a new class called **NewList**:
  - Use the **pass** keyword so our empty class does not raise a **SyntaxError**.
2. Create an instance of the **NewList** class. Assign it to the variable name **newlist_1**.
3. Print the type of the **newlist_1** variable.

In [15]:
# PUT YOUR CODE HERE

class NewList():
  pass

newlist_1 = NewList()

print(type(newlist_1))

<class '__main__.NewList'>


## 1.6 Creating Methods

Congratulations — you've created and instantiated your first class! Right now, your class isn't very exciting, because it doesn't do anything. In order to make our class do something, we need to define some **methods**. Methods allow objects to perform actions.

Relating back to our Tesla metaphor, an object of the Tesla "class" can do things like "unlock" and "accelerate". Similarly, Python strings have methods that can replace substrings, convert the case of the object, and more:

<center>
<img width="500" src="https://drive.google.com/uc?export=view&id=1M0M8MF9uvlYJozNLACRZWp1aPGM6_om4"/></center>


You can think of methods like special functions that belong to a particular class. This is why we call the replace method **str.replace()** — because the method belongs to the **str** class.

While a function can be used with any object, each class has its own set of methods. Let's look at an example using some of Python's built-in classes:

In [16]:
my_string = "hello"   # an object of the str class
my_list = [1, 2, 3]   # an object of the list class

The list object has the **list.append()** method:

In [17]:
my_list.append(4)
print(my_list)

[1, 2, 3, 4]


The string object has the **str.replace()** method:



In [18]:
my_string = my_string.replace("h","H")
print(my_string)

Hello


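We can't use a method from one class with an object of another class. For example, strings have no **append()** method, so calling **my_string.append()** raises an **AttributeError**.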
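As a minimal illustration of this restriction, the sketch below tries to call the list method **append()** on a string. The `try`/`except` block is used only so the example keeps running after the error; the variable `message` is ours.

```python
my_string = "hello"   # an object of the str class
my_list = [1, 2, 3]   # an object of the list class

try:
    # append() belongs to the list class, not the str class
    my_string.append("!")
except AttributeError as error:
    message = str(error)
    print(message)  # 'str' object has no attribute 'append'
```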

The syntax for creating a method is almost identical to when we create a function, except it is indented within our class definition. This is how we would define a simple method:

In [19]:
class MyClass():
    def greet():
        return "hello"

Let's create a simple method for our **NewList** class.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


1. Define a new class called **NewList()**.
2. Inside the class, define a method called **first_method()**.
3. Inside the method, return the string "This is my first method".
4. Create an instance of the **NewList** class. Assign it to the variable name **newlist**.

In [20]:
# PUT YOUR CODE HERE

class NewList():
  def first_method(self):
    return "This is my first method."

newlist=NewList()
newlist.first_method()

'This is my first method.'

## 1.7 Understanding Self

In the previous section, we defined a class with a simple method. Suppose we had defined that method without any parameters, and then created an instance of the class:

In [21]:
class NewList():
    def first_method():
        print("hello")

instance = NewList()

Let's look at what happens when we call (run) that method:


In [22]:
instance.first_method()

TypeError: first_method() takes 0 positional arguments but 1 was given


This error is a bit confusing. It says that one argument was given to **first_method()**, but when we called the method we didn't provide any arguments. It seems like there is a "phantom" argument being inserted somewhere. To understand what's happening, let's look at what happens behind the scenes when we call a method. We'll start by looking at our instance object containing a single method:

<center>
<img width="400" src="https://drive.google.com/uc?export=view&id=1vciwU_BT1fy2Bu1YVk6zJf3u47jyKOW8"/></center>

When we call the **first_method()** method belonging to the instance object, Python interprets that syntax and adds in an argument representing the instance we're calling the method on:

<center>
<img width="400" src="https://drive.google.com/uc?export=view&id=1uIjP6l-ROuqq8L5bPyHVtMdL2viseraH"/></center>

We can verify that this is the case by checking it with Python's built-in str type. We'll use **str.title()** to convert a string to title case.

In [23]:
# create a str object
s = "MY STRING"

# call `str.title()` directly
# instead of `s.title()`
result = str.title(s)
print(result)

My String


The extra argument that Python has added, which is the instance itself, is what is causing our error. You might be wondering whether we can prove that the extra argument is the object itself. Let's see if we can:

We'll start by:

- Defining a **MyClass** class with a **print_self** method that takes one argument, and then prints that argument.
- Instantiating an object of that class, and assigning it to **mc**.

In [24]:
class MyClass():
    def print_self(self):
        print(self)

mc = MyClass()

Next, let's print the **mc** object directly and then call **print_self()**, so we can compare what the object looks like when it's printed in each case:



In [25]:
print(mc)
mc.print_self()

<__main__.MyClass object at 0x7f62b210e410>
<__main__.MyClass object at 0x7f62b210e410>


The same output was displayed both when we printed the object using the syntax **print(mc)** and when we printed the object inside the method using **print_self()** — which proves that this "phantom" argument is the object itself!

Technically, we can give this first argument — which is passed to every method — any parameter name we like. However, the convention is to call the parameter **self**. This is an important convention, as without it class definitions can get confusing.
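To see that the name really is only a convention, here is a small sketch in which the first parameter is deliberately given the unconventional, hypothetical name `this`. It is shown for illustration only and is not a recommended style.

```python
class MyClass():
    # `this` is an unconventional name for the first parameter;
    # Python still passes the instance in as the first argument
    def return_self(this):
        return this

mc = MyClass()

# The method returns the very object it was called on
print(mc.return_self() is mc)  # True
```

The code works, but anyone reading it will expect **self**, which is why the convention is worth following.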

Let's modify the class we created in the previous section by adding **self** as an argument to our method. Then, let's call the method to make sure it runs without error.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

1. Modify the **first_method()** method by adding self as an argument.
2. Create an instance of the **NewList** class. Assign it to the variable name **newlist**.
3. Call **newlist.first_method()**. Assign the result to the variable **result**.

In [26]:
class NewList():
  def first_method(self):
    return "This is my first method"

# PUT YOUR CODE HERE

newlist = NewList()
result = newlist.first_method()

print(result)

This is my first method


## 1.8 Creating a Method That Accepts an Argument

In the previous section, we learned that:

- Methods have a "phantom" argument that gets passed to them when they are called.
- The "phantom" argument is actually the object itself.
- We need to include that in our method definition.
- The convention is to call the "phantom" argument **self**.

The method we worked with in the previous two sections didn't accept any arguments except the **self** argument. As with functions, methods are often called with one or more arguments so that the method can use or modify those arguments.

Let's create a method that accepts a string argument and then returns that string. The first argument will always be the object itself, so we'll specify **self** as the first argument, and the string as our second argument:

In [27]:
class MyClass():
    def return_string(self, string):
        return string

Let's instantiate an object and call our method. Notice how when we call it, we leave out the **self** argument, just like we did in the previous section:

In [28]:
mc = MyClass()
result = mc.return_string("Hey there!")
print(result)

Hey there!


Let's practice what we've learned by creating a simple method for our **NewList** class, which accepts a list and then returns it.


**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

1. Define a new class called NewList().
  - Use **NewList()** when defining the class, so we can perform answer checking on your class.
2. Inside the class, define a method called **return_list()**.
  - The method should accept a single argument **input_list** when called.
  - Inside the method, return **input_list**.
3. Create an instance of the **NewList** class, and assign it to the variable name **newlist**.
4. Call the **newlist.return_list()** method with the argument [1, 2, 3]. Assign the result to the variable **result**.

In [29]:
# PUT YOUR CODE HERE

class NewList():
  def return_list(self, input_list):
    return input_list
  
newlist = NewList()

result = newlist.return_list([1, 2, 3])
print(result)

[1, 2, 3]

## 1.9 Attributes and the Init Method

The example we used in the previous section — a method that takes input and returns output without interacting with the object — isn't often used.

After all, we could do the same thing with a function without the hassle of defining a class and method. We used this example so you could practice creating a simple class with what you've learned so far.

The power of objects is in their ability to store data, and data is stored inside objects using **attributes**.

Relating back to our Tesla metaphor, an object of the Tesla "class" has attributes like its color, battery, and motor. Similarly, Python strings have attributes, namely the data stored inside the string:

<center>
<img width="600" src="https://drive.google.com/uc?export=view&id=1rlr_m8s8FnnG9wxEtUIx7R_gTqVLOevd"/></center>

You can think of attributes like special variables that belong to a particular class. Attributes let us store specific values about each instance of our class.

When we instantiate an object, most of the time we specify the data that we want to store inside that object. Let's look at an example of instantiating an int object:

In [30]:
my_int = int("3")

When we used **int()**, we provided the argument "3", which was converted and stored inside the object. We define what is done with any arguments provided at instantiation using the **init method**.

The init method — also called a **constructor** — is a special method that runs when an instance is created so we can perform any tasks to set up the instance.

The init method has a special name that starts and ends with two underscores: **\_\_init\_\_()**. Let's look at an example:

In [31]:
class MyClass():
    def __init__(self, string):
        print(string)

mc = MyClass("Hola!")

Hola!


Let's walk through how it works:

- We defined the **\_\_init\_\_()** method inside our class as accepting two arguments: **self** and **string**.
- Inside the **\_\_init\_\_()** method, we called the **print()** function on the string argument.
- When we instantiated **mc**, our **MyClass** object, we passed "Hola!" as an argument. The init method ran immediately, displaying the text "Hola!"

It's unusual to use **print()** inside an init method, but it helps us understand that the method has access to any arguments passed when we instantiate an object.

The init method's most common usage is to store data as an attribute:



In [32]:
class MyClass():
    def __init__(self, string):
        self.my_attribute = string

mc = MyClass("Hola!")

When we instantiate our new object, Python calls the init method, passing in the object:

<center>
<img width="400" src="https://drive.google.com/uc?export=view&id=12AHm1s1-UeEOXNH6AEM8Q6t5DP2gTYzW"/></center>

Our code didn't result in any output, but now we have stored "Hola!" in the attribute **my_attribute** inside our object. Like methods, attributes are accessed using dot notation, but attributes don't have parentheses like methods do. Let's use dot notation to access the attribute:

In [33]:
print(mc.my_attribute)

Hola!


The table below summarizes some of the differences between **attributes** and **methods**:

|           	|      Purpose     	| Similar to 	|  Example Syntax  	|
|:---------:	|:----------------:	|:----------:	|:----------------:	|
| Attribute 	| Stores data      	| Variable   	| object.attribute 	|
|   Method  	| Performs actions 	| Function   	| object.method()  	|


Let's take a moment to summarize what we've learned so far:

- The power of objects is in their ability to store data.
- Data is stored as attributes inside objects.
- We access attributes using dot notation.
- To give attributes values when we instantiate objects, we pass them as arguments to a special method called **\_\_init\_\_()**, which runs when we instantiate an object.
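The points above can be sketched with two instances of the same class. The variable names `english` and `spanish` are hypothetical; the sketch simply shows that each instance stores its own value for the same attribute.

```python
class MyClass():
    def __init__(self, string):
        # Store the argument on this particular instance
        self.my_attribute = string

english = MyClass("Hello!")
spanish = MyClass("Hola!")

print(english.my_attribute)  # Hello!
print(spanish.my_attribute)  # Hola!

# Reassigning the attribute on one instance doesn't affect the other
english.my_attribute = "Hi!"
print(english.my_attribute)  # Hi!
print(spanish.my_attribute)  # Hola!
```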

We now have what we need to create a working version of our **NewList** class! This first version will:

- Accept an argument when you instantiate a **NewList** object.
- Use the init method to store that argument in an attribute: **NewList.data**.

Let's get started!

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


1. Define a new class called **NewList()**.
2. Create an init method which accepts a single argument, **initial_state**.
3. Inside the init method, assign **initial_state** to an attribute called **data**.
4. Instantiate an object of your **NewList** class, providing the list [1, 2, 3, 4, 5] as an argument. Assign the object to the variable name **my_list**.
5. Use the **print()** function to display the data attribute of **my_list**.

In [34]:
# PUT YOUR CODE HERE

class NewList():
  def __init__(self, initial_state) -> None:
      self.data = initial_state

my_list=NewList([1,2,3,4,5])

print(my_list.data)

[1, 2, 3, 4, 5]


## 1.10 Creating an Append Method

It's time to create a method to transform the data stored in our **NewList** objects. We'll be recreating the functionality of the **list.append()** method from the built-in Python list class.

Let's start by looking at an example of **list.append()** in action:

In [35]:
my_list = [1, 2, 3, 4]
my_list.append(5)
print(my_list)

[1, 2, 3, 4, 5]


The method:

- Accepts one argument.
- Changes the underlying value of the object, so the list contains one extra value, which is the argument it was passed.
- Doesn't return any value.
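The last point is easy to verify. In this short sketch, we capture whatever **list.append()** returns in a variable we've called `result`:

```python
my_list = [1, 2, 3, 4]

# append() changes the list in place and returns None
result = my_list.append(5)

print(result)   # None
print(my_list)  # [1, 2, 3, 4, 5]
```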


In order to create this method, we need a way to add one extra item to a list. One straightforward way is to add brackets around the new item, making it a list with a single item, then use the + operator to join those two lists:

In [36]:
my_list = [1, 2, 3]
new_item = 4
new_item_list = [new_item]
my_list = my_list + new_item_list
print(my_list)

[1, 2, 3, 4]


We now have everything we need to create the **NewList.append()** method. Remember that to access our attribute from within the method, we need to use **self.data**, just like we did with the **\_\_init\_\_()** method.


**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

1. To the **NewList** definition, add a new method **NewList.append()**.
  - The method will require a single argument to be passed to it when called.
  - The method will modify the **NewList.data** attribute by appending the argument to the list.
  - The method should not return any value.
2. Create an object of the **NewList** class, initializing the data in the object with [1, 2, 3, 4, 5]. Assign the object to the variable name **my_list**.
  - Use the **print()** function to display the **my_list.data** attribute.
3. Use the **NewList.append()** method to append the integer 6 to the **my_list** object.
  - Use the **print()** function to display the **my_list.data** attribute.


In [37]:
# The NewList definition from the previous
# screen is copied here for your convenience

# class NewList():
#     """
#     A Python list with some extras!
#     """
#     def __init__(self, initial_state):
#         self.data = initial_state

# PUT YOUR CODE HERE

class NewList():
  def __init__(self, initial_state):
    self.data = initial_state

  def append(self, append_list):
    self.data = self.data + [append_list]

my_list = NewList([1,2,3,4,5])

print(my_list.data)

my_list.append(6)

print(my_list.data)

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6]


In [38]:
class NewList():
    """
    A Python list with some extras!
    """
    def __init__(self, initial_state):
        self.data = initial_state
    
    def append(self, new_item):
        """
        Append `new_item` to the NewList
        """
        self.data = self.data + [new_item]

my_list = NewList([1, 2, 3, 4, 5])
print(my_list.data)
my_list.append(6)
print(my_list.data)

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6]
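One consequence of building the method with the + operator is worth noting. Because + creates a new list rather than changing the existing one, the list we originally passed in is left untouched. The sketch below assumes the same **NewList** definition as above; the variable names `original` and `nl` are ours.

```python
class NewList():
    def __init__(self, initial_state):
        self.data = initial_state

    def append(self, new_item):
        # + builds a brand new list, which is then stored in self.data
        self.data = self.data + [new_item]

original = [1, 2, 3]
nl = NewList(original)
nl.append(4)

print(nl.data)   # [1, 2, 3, 4]
print(original)  # [1, 2, 3]
```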


## 1.11 Creating and Updating an Attribute


Let's summarize the work we've done so far:

- We've created a **NewList** class which stores a list at the point of instantiation using the init constructor.
- We stored that list inside an attribute **NewList.data**.
- We've created a method — **NewList.append()** — which mimics the behavior of **list.append()**.

Right now, each behavior we've created for our **NewList** class is also something a regular Python list does. Now we're going to create some new functionality: a new attribute.

When we want to find the length of a list, we use the **len()** function. What if we created a new attribute, **NewList.length**, which stores the length of our list at all times? We can achieve this by adding some code to the init method:

In [39]:
class NewList():
    """
    A Python list with some extras!
    """
    def __init__(self, initial_state):
        self.data = initial_state

        # we added code below this comment
        length = 0
        for item in self.data:
            length += 1
        self.length = length
        # we added code above this comment

    def append(self, new_item):
        """
        Append `new_item` to the NewList
        """
        self.data = self.data + [new_item]

Let's look at what happens when we use the **NewList.length** attribute as defined above:



In [40]:
my_list = NewList([1, 2, 3])
print(my_list.length)

my_list.append(4)
print(my_list.length)

3
3


Because the code we added that defined **NewList.length** was added only in the **init method**, if the list is made longer using the **append()** method, our **NewList.length** attribute is no longer accurate.

To address this, we need to run the code that calculates the length after any operation which modifies the data, which, in our case, is just the **append()** method.

Rather than writing the code out twice, we can add a helper method, which calculates the length, and just call that method in the appropriate places.

Here's a quick example of a helper method in action:

In [41]:
class MyBankBalance():
    """
    An object that tracks a bank
    account balance
    """

    def __init__(self, initial_balance):
        self.balance = initial_balance
        self.calc_string()

    def calc_string(self):
        """
        A helper method to update self.string
        """
        string_balance = "${:,.2f}".format(self.balance)
        self.string = string_balance

    def add_value(self, value):
        """
        Add value to the bank balance
        """
        self.balance += value
        self.calc_string()

mbb = MyBankBalance(3.50)
print(mbb.string)

$3.50


In this example, we created a helper method **MyBankBalance.calc_string()**, which calculates a string representation of our object's bank balance stored in the attribute **MyBankBalance.string**. We called that helper method from the init method so it updates based on the initial value.

We also called the helper method from the **MyBankBalance.add_value()** method, so the value updates whenever the balance is increased:

In [42]:
mbb.add_value(17.01)
print(mbb.string)

$20.51


In [43]:
mbb.add_value(5000)
print(mbb.string)

$5,020.51


You may have noticed we defined our helper method after our init method. We mentioned earlier that the order in which you define methods within a class doesn't matter, but there is a convention to order methods as follows:

1. Init method
2. Other methods

Now it's time for you to add the **NewList.length** attribute.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


1. Add a helper method called **calc_length()** to our **NewList** class, which calculates the length of the list stored in the object and stores it in a new **NewList.length** attribute.
2. Call the helper method at the end of both the **\_\_init\_\_()** and **append()** methods.
3. Test that the new attribute works as expected:
  - Create a new **NewList** object containing the list [1, 1, 2, 3, 5] and assign it to the variable name **fibonacci**.
    - Use the **print()** function to display the length attribute of the **fibonacci** object.
  - Append the value 8 to **fibonacci**
    - Use the **print()** function to display the updated length attribute of the **fibonacci** object.

In [44]:
# PUT YOUR CODE HERE

class NewList():
    def __init__(self, initial_state):
        self.data = initial_state
        self.calc_length()

    def calc_length(self):
        length = 0
        for item in self.data:
            length += 1
        self.length = length

    def append(self, new_item):
        self.data = self.data + [new_item]
        self.calc_length()

fibonacci = NewList([1,1,2,3,5])

print(fibonacci.length)

fibonacci.append(8)

print(fibonacci.length)

5
6


# 2.0 Object-Oriented Programming - Part II

## 2.1 Solving Problems with Code

We've worked with variables, loops, lists, and other basic building blocks of programming. We know how to use them, but we need to begin identifying when they're appropriate.

Computer programming is an engineering discipline. A successful engineer must be able to think through complex problems and choose an optimal solution. This involves careful planning, some trial and error, and above all else, experience. It's important to practice programming so you can build an intuition for the tools and approaches that fit a situation best.

## 2.2 Defining Custom Classes

Let's take a look at how to use custom classes. We'll use them to explore data on NBA players from the 2013-2014 season. The statistics are in a CSV file with a header and some rows of data. It looks like this:

|     player    	| pos 	| age 	|          team         	|
|:-------------:	|:---:	|:---:	|:---------------------:	|
| Quincy Acy    	| SF  	| 23  	| TOT                   	|
| Steven Adams  	| C   	| 20  	| Oklahoma City Thunder 	|
| Jeff Adrien   	| PF  	| 27  	| TOT                   	|
| Arron Afflalo 	| SG  	| 28  	| Orlando Magic         	|

We need an easy way to represent both the players and the teams. Let's focus on how we can use custom classes to compare the average ages of the players on each team.

You can see in the starter code that we've defined a **Player** class and set up the default **\_\_init\_\_** method to accept a data row as an argument. We made a deliberate choice to split up the logic of players and teams so our code is easy to read and maintain. We also made the convenient choice to initialize our **Player** instances using a data row. That's because all of the information is present in a row, and it will make it easier to create **Player** objects from the data set later on.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

1. Create a **Team** class, initialize it with a team name, and store that team name in the instance property **team_name**.
2. Create an instance of the **Team** class with the team name "San Antonio Spurs", and assign it to **spurs**.

In [45]:
import pandas as pd

#Upload nba_players_2013
url = 'https://raw.githubusercontent.com/terrematte/deeplearning/main/week_01/Datasets/nba_players_2013.csv'

nba = pd.read_csv(url)
nba.head()

Unnamed: 0,player,pos,age,team,g,gs,mp,fg,fga,fg.,...,drb,trb,ast,stl,blk,tov,pf,pts,season,season_end
0,Quincy Acy,SF,23,TOT,63,0,847,66,141,0.468,...,144,216,28,23,26,30,122,171,2013-2014,2013
1,Steven Adams,C,20,Oklahoma City Thunder,81,20,1197,93,185,0.503,...,190,332,43,40,57,71,203,265,2013-2014,2013
2,Jeff Adrien,PF,27,TOT,53,12,961,143,275,0.52,...,204,306,38,24,36,39,108,362,2013-2014,2013
3,Arron Afflalo,SG,28,Orlando Magic,73,73,2552,464,1011,0.459,...,230,262,248,35,3,146,136,1330,2013-2014,2013
4,Alexis Ajinca,C,25,New Orleans Pelicans,56,30,951,136,249,0.546,...,183,277,40,23,46,63,187,328,2013-2014,2013


In [46]:
class Player():
    # The special __init__ function runs whenever a class is instantiated
    # The init function can take arguments, but self is always the first one
    # Self is just a reference to the instance of the class
    # It is automatically passed in when you instantiate an instance of the class
    def __init__(self, data_row):
        self.player_name = data_row.iloc[0]
        self.position = data_row.iloc[1]
        self.age = data_row.iloc[2]
        self.team = data_row.iloc[3]

# Initialize a player using the first row of our data set
first_player = Player(nba.iloc[0])

# Implement the Team class

## 2.3 More Interesting Instance Properties

Now that we have a **Team** class with a team name, we can also store a team roster within each **Team** instance.

We'll represent a roster as a list of Player instances. We can write code inside the **\_\_init\_\_** method to run some initialization logic.

We've loaded our data set of NBA players into the **nba** variable.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

Modify the **\_\_init\_\_** method of the **Team** class to loop through our data set and add a player to the roster every time the row's team name matches the instance's **team_name**.

- You can add an item to a list using **.append(item)**.

Store the "San Antonio Spurs" team in **spurs**.

In [47]:
class Player():
    # The special __init__ function runs whenever a class is instantiated
    # The init function can take arguments, but self is always the first one
    # Self is just a reference to the instance of the class
    # It is automatically passed in when you instantiate an instance of the class
    def __init__(self, data_row):
        self.player_name = data_row.iloc[0]
        self.position = data_row.iloc[1]
        self.age = int(data_row.iloc[2])
        self.team = data_row.iloc[3]

# Initialize a player using the first row of our data set
first_player = Player(nba.iloc[0])

class Team():
    def __init__(self, team_name):
        self.team_name = team_name
        # Team roster initially empty
        self.roster = []
        # Find the players for the roster in the data set

## 2.4 Instance Methods

The **Player** and **Team** classes we've defined serve as blueprints that we can use to create instances of these classes. **Classes** and the **instances** of those classes, which are collectively known as objects, are fundamental to object-oriented programming.

We can define some of our own methods on a class. For example, if we want to compute the average age of the players on a team, we would write a method for the **Team** class that does this. However, because this number can be different for each team, we want to make sure the method acts individually on specific instances of the **Team** class. We call these methods **instance methods**.

For method declarations, the first argument to the method is always **self**, even though we don't explicitly pass in **self** when we call the method. **self** is a reference to the current object we're working with. It's useful when we want to access properties of that object within the method we're defining.
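
To make the role of **self** concrete, here is a minimal sketch using a small hypothetical **Greeter** class (it is not part of the NBA example). It shows that calling a method on an instance is equivalent to calling the function through the class and passing the instance by hand as the first argument.

```python
class Greeter:
    """Hypothetical class used only to illustrate `self`."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        # `self` refers to whichever instance the method was called on
        return "Hello, " + self.name


alice = Greeter("Alice")
bob = Greeter("Bob")

# Python passes the instance as `self` automatically...
implicit = alice.greet()
# ...which is the same as passing it explicitly through the class
explicit = Greeter.greet(alice)

print(implicit)
print(explicit)
print(bob.greet())
```

Because each call receives its own instance as **self**, the same method produces a different result for **alice** and **bob**.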

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

Write an **average_age()** method for the **Team** class that computes the average age of the players on a **Team** instance.

- We've provided a method, **num_players**, that returns the total number of players on a **Team** instance.

Store the result of calling **average_age()** on the "San Antonio Spurs" team in **spurs_avg_age**.

In [48]:
class Player():
    # The special __init__ function runs whenever a class is instantiated
    # The init function can take arguments, but self is always the first one
    # Self is just a reference to the instance of the class 
    # It's automatically passed in when you instantiate an instance of the class
    def __init__(self, data_row):
        self.player_name = data_row.iloc[0]
        self.position = data_row.iloc[1]
        self.age = int(data_row.iloc[2])
        self.team = data_row.iloc[3]

class Team():
    def __init__(self, team_name):
        self.team_name = team_name
        # Team roster initially empty
        self.roster = []
        # Find the players for the roster in the data set
        for index, row in nba.iterrows(): 
            if row.iloc[3] == self.team_name:
                self.roster.append(Player(row))
    def num_players(self):
        count = 0
        for player in self.roster:
            count += 1
        return count
    # Implement the average_age() instance method
    
    def average_age(self):
        age = 0
        for player in self.roster:
            age += player.age

        avg_age = age / self.num_players()
        return avg_age

    
spurs = Team("San Antonio Spurs")
spurs_num_players = spurs.num_players()
spurs_avg_age = spurs.average_age()

print(spurs_num_players)
print(spurs_avg_age)

14
28.428571428571427


## 2.5 Class Methods

In a strictly **object-oriented** language, everything (yes, everything) is an object. Integers are objects, and so are Booleans. Python follows this model closely: numbers, strings, functions, and even modules and classes are objects. Not every piece of behavior belongs to a particular instance, though. For example, **math.floor** is a function that lives in the **math** module, and we call it through the module's name without creating anything first. Classes can offer the same convenience through **class methods**. Class methods act on an entire class, rather than a particular instance of one. We often use them as utility functions.

Notice in the starter code that we've rewritten our **average_age()** method to use the **math** module, along with a list comprehension. This is somewhat advanced Python code, but you've seen all of it before. The **math.fsum** function takes an iterable (i.e., a list or list-like) argument and sums its values to produce a floating point result.
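
As a small illustration of what **math.fsum** offers over the built-in **sum()**, the sketch below adds ten copies of 0.1 both ways. The variable names are illustrative. Whether **sum()** shows a rounding error here depends on the Python version, so only the **math.fsum** result should be relied on.

```python
import math

values = [0.1] * 10

# Depending on the Python version, the built-in sum() may or may not
# accumulate a small rounding error here
builtin_total = sum(values)

# math.fsum() keeps track of the low-order bits that would otherwise be
# lost and returns the correctly rounded total
precise_total = math.fsum(values)

print(builtin_total)
print(precise_total)
```

For whole-number ages the two functions agree, apart from **math.fsum** returning a float.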

Notice also that we've begun writing a class method for you. The **@classmethod** line that appears above it tells the Python interpreter that the method is a class method. You'll need to follow this pattern whenever you declare class methods.
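
The following sketch shows the **@classmethod** pattern on a small hypothetical **Temperature** class, unrelated to the NBA data. By convention the first parameter of a class method is named **cls**, and it receives the class itself rather than an instance.

```python
class Temperature:
    """Hypothetical class used only to illustrate @classmethod."""

    def __init__(self, celsius):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # `cls` is the class itself, so cls(...) builds a new instance
        return cls((fahrenheit - 32) * 5 / 9)


# Called on the class; no instance has to exist beforehand
boiling = Temperature.from_fahrenheit(212)
print(boiling.celsius)
```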

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

- Modify the **older_team** method to return the team with the greatest average age.
- Store the result of calling **older_team** on the "New York Knicks" team and the "Miami Heat" team in **old_team**.
- Read through all of the code we've written so far for our **Team** class. It's full of advanced Python concepts that will be very useful to you.

In [49]:
import math

class Player():
    # The special __init__ function runs whenever a class is instantiated
    # The init function can take arguments, but self is always the first one
    # Self is just a reference to the instance of the class
    # It's automatically passed in when you instantiate an instance of the class
    def __init__(self, data_row):
        self.player_name = data_row.iloc[0]
        self.position = data_row.iloc[1]
        self.age = int(data_row.iloc[2])
        self.team = data_row.iloc[3]

class Team():
    def __init__(self, team_name):
        self.team_name = team_name
        self.roster = []
        for index, row in nba.iterrows(): 
            if row.iloc[3] == self.team_name:
                self.roster.append(Player(row))
    
    def num_players(self):
        count = 0
        for player in self.roster:
            count += 1
        return count
   
    def average_age(self):
        return math.fsum([player.age for player in self.roster]) / self.num_players()
    
    @classmethod
    def older_team(cls, team_1, team_2):
        # By convention, the first parameter of a class method is named `cls`
        team1 = cls(team_1)
        team2 = cls(team_2)
        if team1.average_age() >= team2.average_age():
            return team1.team_name
        return team2.team_name

# Class methods are called on the class itself; no instance is needed
old_team = Team.older_team("New York Knicks", "Miami Heat")

print(old_team)

Miami Heat


## 2.6 Understanding Inheritance

In object-oriented programming, the concept of **inheritance** enables us to organize classes in a tree-like hierarchy, where the **parent** class has some traits that it passes on to its descendants. When we define a class, we specify a **parent** class from which it inherits. Inheriting from a class means that the behavior of the parent also exists in the child, but that the child can still define its own additional behavior.

Consider a **Player** class with generic information about NBA players. This would be very useful because players have a lot of things in common. However, we may also want to add specific behavior for different positions. We can define classes like **Center**, **Forward**, or **PointGuard**, each with behavior that's specific to that position. These classes would each specify **Player** as their parent class. They would all be siblings -- each would inherit the same behaviors from the **Player** class, while also having special behaviors of their own.
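
A minimal sketch of such a hierarchy is shown below. To avoid clashing with the notebook's **Player** class, it uses a simplified stand-in named **BasicPlayer**; the method names and the guard's name are made up for illustration.

```python
class BasicPlayer:
    """Simplified, hypothetical stand-in for the Player class."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return self.name + " plays basketball"


class Center(BasicPlayer):
    # Inherits __init__() and describe(), and adds its own behavior
    def block_shot(self):
        return self.name + " blocks the shot"


class PointGuard(BasicPlayer):
    # A sibling of Center: same parent, different extra behavior
    def run_offense(self):
        return self.name + " runs the offense"


center = Center("Steven Adams")
guard = PointGuard("Example Guard")

print(center.describe())              # inherited from BasicPlayer
print(center.block_shot())            # specific to Center
print(guard.run_offense())            # specific to PointGuard
print(hasattr(guard, "block_shot"))   # siblings don't share extras
```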

In Python 3, every class is a subclass of a generic **object** class. While this happens automatically when we don't specify an ancestor, it's sometimes good practice to be explicit. For the remainder of this chapter, we'll specify when a class has **object** as its parent while we code. This is a good programming practice -- if we get into the habit of specifying a class's ancestry, we won't forget to specify a parent when it's something other than object. It's simply a way to form good habits.
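
The sketch below, using two hypothetical empty classes, shows that the implicit and explicit spellings produce the same parent.

```python
class Implicit:
    pass


class Explicit(object):
    pass


# Both classes end up with `object` as their only parent
print(Implicit.__bases__)
print(Explicit.__bases__)
print(issubclass(Implicit, object))
```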

## 2.7 Overloading Inherited Behavior

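When a class inherits from a parent class, it acquires all of the behavior of that parent class. There are times when we don't want all of that behavior, though, or want to modify it slightly for our custom class. We use a technique called **overloading** to accomplish this. (Strictly speaking, redefining an inherited method is usually called **overriding**; when the method implements an operator such as < or +, the term **operator overloading** is common. This chapter uses "overloading" for both.)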

Overloading inherited behavior involves assigning new behavior to our custom class. To accomplish this, we just redefine the method on our new class.
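
As a minimal sketch of redefining an inherited method, consider the two hypothetical classes below. **Dog** replaces **speak()** but keeps the inherited **describe()**, which automatically picks up the new behavior.

```python
class Animal:
    """Hypothetical parent class."""

    def speak(self):
        return "..."

    def describe(self):
        return "I say " + self.speak()


class Dog(Animal):
    # Redefining speak() replaces the inherited version for Dog instances
    def speak(self):
        return "Woof"


generic = Animal()
dog = Dog()

print(generic.describe())
print(dog.describe())
```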

We'll be altering our **Player** class to support comparisons that use these operators:

- \>
- <
- ==
- !=
- \>=
- <=

Methods for these operators already exist in the **object** class by default, although the default ordering methods (<, >, <=, >=) can't actually compare two arbitrary objects, and Python raises a **TypeError** if no class supplies a real implementation. We've used these operators to compare integers, floating point numbers (decimals), and strings. The operators work because classes like **str** have implemented them specifically. It's a bit difficult to understand why the **object** class would need to have these methods, however. The best way to wrap your head around this is through an example.

Let's consider the addition operator (+). When Python evaluates **a + b**, it looks for a special method named **\_\_add\_\_** on the class of **a**. The **sum()** function is written in terms of this operator, so it doesn't need to know how to add integers or floating points specifically.

Instead, the integer and float classes each define their own addition method, and the **sum()** function adds the values together properly. This architecture is very powerful, because even though **sum()** only had to be defined once, we can call it on a multitude of classes and it will result in proper behavior. The comparison operators work the same way, except that **object** supplies default versions that subclasses replace. This is an example of the power of inheritance and overloading.
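
To see this in action, here is a minimal sketch with a hypothetical **Minutes** class that supplies its own addition method. Because **sum()** starts from the integer 0 by default, the sketch passes an explicit **Minutes(0)** start value so that every addition happens between two **Minutes** objects.

```python
class Minutes:
    """Hypothetical class that knows how to add itself to another Minutes."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Called when Python evaluates `self + other`
        return Minutes(self.value + other.value)


playing_time = [Minutes(30), Minutes(25), Minutes(12)]

# sum() only uses `+`, so it works with any class that defines __add__()
total = sum(playing_time, Minutes(0))
print(total.value)
```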

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


- Read the implementation of the **\_\_lt\_\_** (less than, or <) method of our **Player** class. In this exercise, we'll use comparisons to compare players by age.
- Implement **\_\_gt\_\_** (greater than, or \>), **\_\_le\_\_** (<=), **\_\_ge\_\_** (\>=), **\_\_eq\_\_** (==), and **\_\_ne\_\_** (!=).
- Store the result of evaluating **carmelo != kobe** in **result**.

In [50]:
class Player(object):
    # The special __init__ function runs whenever a class is instantiated
    # The init function can take arguments, but self is always the first one
    # Self is just a reference to the instance of the class
    # It is automatically passed in when you instantiate an instance of the class
    def __init__(self, data_row):
        self.player_name = data_row.iloc[0]
        self.position = data_row.iloc[1]
        self.age = int(data_row.iloc[2])
        self.team = data_row.iloc[3]
    def __lt__(self, other):
        return self.age < other.age
    # Implement the rest of the comparison operators here
    def __gt__(self, other):
        return self.age > other.age

    def __le__(self, other):
        return self.age <= other.age

    def __ge__(self, other):
        return self.age >= other.age

    def __eq__(self, other):
        return self.age == other.age

    def __ne__(self, other):
        return self.age != other.age

carmelo = Player(nba.iloc[17])
kobe = Player(nba.iloc[68])

# The operators call the special methods defined above
result = carmelo != kobe
ge = carmelo >= kobe
le = carmelo <= kobe
print(result)
print(ge)
print(le)

True
False
True


## 2.8 Comparing Average Ages

We've seen that we can overload operators for custom classes. In the previous section, we were able to compare NBA players by age using several comparison operators (>, <, ==, etc). The ability to overload behavior is extremely powerful because many built-in Python functions use these simple operators. If we implement them on a custom class, we can use functions like **min** and **max** on instances of our **Player** class. **min** takes a list (or any other iterable) of values and returns the minimum value. **max** does the same and returns the maximum value.
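
As a quick illustration, the sketch below uses a simplified, hypothetical **SimplePlayer** class (so it doesn't depend on the data set) with the names and ages of three players from the table in section 2.2.

```python
class SimplePlayer:
    """Hypothetical, simplified player that is compared by age."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __lt__(self, other):
        return self.age < other.age

    def __gt__(self, other):
        return self.age > other.age


players = [
    SimplePlayer("Quincy Acy", 23),
    SimplePlayer("Steven Adams", 20),
    SimplePlayer("Arron Afflalo", 28),
]

# min() and max() rely on the comparison methods defined above
youngest = min(players)
oldest = max(players)

print(youngest.name)
print(oldest.name)
```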

Our original goal was to compare NBA teams based on average ages. We saw how we could overload methods in our **Player** class, and now it's time to do the same for our **Team** class.



**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


- Overload the same six methods we wrote for the **Player** class on the **Team** class, this time comparing average ages for teams.
- The methods are **\_\_lt\_\_** (<), **\_\_gt\_\_** (>), **\_\_le\_\_** (<=), **\_\_ge\_\_** (>=), **\_\_eq\_\_** (==), and **\_\_ne\_\_** (!=).
  - Each should take a **self** parameter and an **other** parameter.
  - **self** and **other** are two instances of the **Team** class, whose average ages we want to compare.
- Compare the "Utah Jazz" and "Detroit Pistons". Store the older team in **older_team**.
- Now that we've implemented comparison operators, we can take advantage of the **max** function to get our maximum value.

In [51]:
import math

class Team(object):
    def __init__(self, team_name):
        self.team_name = team_name
        self.roster = []
        for index, row in nba.iterrows():
            if row.iloc[3] == self.team_name:
                self.roster.append(Player(row))

    def num_players(self):
        count = 0
        for player in self.roster:
            count += 1
        return count
        
    def average_age(self):
        return math.fsum([player.age for player in self.roster]) / self.num_players()
    # Define operators here

    def __lt__(self, other):
        return self.average_age() < other.average_age()
    # Implement the rest of the comparison operators here
    def __gt__(self, other):
        return self.average_age() > other.average_age()

    def __le__(self, other):
        return self.average_age() <= other.average_age()

    def __ge__(self, other):
        return self.average_age() >= other.average_age()

    def __eq__(self, other):
        return self.average_age() == other.average_age()

    def __ne__(self, other):
        return self.average_age() != other.average_age()

team1 = Team("Utah Jazz")
team2 = Team("Detroit Pistons")

# max() relies on the comparison methods defined above
older_team = max(team1, team2)

print(older_team.team_name)

Utah Jazz


## 2.9 Oldest NBA Team

A lot of interesting information is readily available to us now that we've implemented the comparison operations. That's because Python uses these comparisons to implement many utility functions. Now we're able to use those functions to analyze data in a new setting. By overloading methods, we've given ourselves access to powerful functions without having to implement tedious logic.


**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

- Alter the list comprehension we've indicated below so that the **teams** variable contains a list of all the teams in **team_names**.
- Use **max** to store the oldest team in **oldest_team**.
- Use **min** to store the youngest team in **youngest_team**.
- Use **sorted** to store a list of teams (ordered from youngest to oldest) in **sorted_teams**.



In [52]:
import math

class Team(object):
    def __init__(self, team_name):
        self.team_name = team_name
        self.roster = []
        for index, row in nba.iterrows():
            if row.iloc[3] == self.team_name:
                self.roster.append(Player(row))
    def num_players(self):
        count = 0
        for player in self.roster:
            count += 1
        return count
        
    def average_age(self):
        return math.fsum([player.age for player in self.roster]) / self.num_players()
    def __lt__(self, other):
        return self.average_age() < other.average_age()
    def __gt__(self, other):
        return self.average_age() > other.average_age()
    def __le__(self, other):
        return self.average_age() <= other.average_age()
    def __ge__(self, other):
        return self.average_age() >= other.average_age()
    def __eq__(self, other):
        return self.average_age() == other.average_age()
    def __ne__(self, other):
        return self.average_age() != other.average_age()

team_names = ["Boston Celtics", "Brooklyn Nets", "New York Knicks", "Philadelphia 76ers", "Toronto Raptors", 
         "Chicago Bulls", "Cleveland Cavaliers", "Detroit Pistons", "Indiana Pacers", "Milwaukee Bucks",
         "Atlanta Hawks", "Charlotte Hornets", "Miami Heat", "Orlando Magic", "Washington Wizards",
         "Dallas Mavericks", "Houston Rockets", "Memphis Grizzlies", "New Orleans Pelicans", "San Antonio Spurs",
         "Denver Nuggets", "Minnesota Timberwolves", "Oklahoma City Thunder", "Portland Trail Blazers", "Utah Jazz",
         "Golden State Warriors", "Los Angeles Clippers", "Los Angeles Lakers", "Phoenix Suns", "Sacramento Kings"]

# Alter this list comprehension
teams = list(["Change this expression" for name in team_names])

## 2.10 Discussion

To solve our problem, we chose an implementation that cleanly separated the idea of a **Player** vs. a **Team**. By doing so, we wrote organized and sensible code that wasn't too difficult to keep track of.

By implementing comparison operators, we were able to identify the oldest and youngest teams in a very efficient manner. We could even rank NBA teams by age with a single line of code. This is the power of object-oriented programming, and it highlights the importance of choosing our implementation wisely.
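
As an aside, the standard library can generate most of the comparison methods for us. The sketch below is one possible shortcut, not the approach used in the exercises: it applies **functools.total_ordering** to a hypothetical **Squad** class that stores a precomputed average age. The squad names and ages are made up for illustration.

```python
from functools import total_ordering


@total_ordering
class Squad:
    """Hypothetical stand-in for Team with a precomputed average age."""

    def __init__(self, name, average_age):
        self.name = name
        self.average_age = average_age

    # Only __eq__ and one ordering method are written by hand;
    # total_ordering derives the remaining comparison methods
    def __eq__(self, other):
        return self.average_age == other.average_age

    def __lt__(self, other):
        return self.average_age < other.average_age


squads = [Squad("Squad A", 27.5), Squad("Squad B", 25.0), Squad("Squad C", 29.1)]

# Ranking from youngest to oldest takes a single line
ranked = sorted(squads)
print([squad.name for squad in ranked])

# __ge__ was never written by hand, but it works
print(squads[0] >= squads[1])
```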

# 3.0 Functional Programming

## 3.1 Introduction

In this chapter, we will describe a new paradigm of programming called **functional programming**. We will compare it with **object-oriented programming** (classes, objects, and state), and show how Python gives you the ability to switch between the two. 

Let's run through an example of how we have been writing our programs so far.

Suppose we wanted to create a line counter class that takes in a file, reads each line, and then counts the number of lines. The class could look something like the following:

In [53]:
class LineCounter:
    def __init__(self, filename):
        self.file = open(filename, 'r')
        self.lines = []

    def read(self):
        self.lines = [line for line in self.file]

    def count(self):
        return len(self.lines)

While not the best implementation, it does provide an insight into object-oriented design. Within the class, there are the familiar concepts of methods and properties. The properties set and retrieve the state of the object, and the methods manipulate that state.

For both these concepts to work, the object's state must change over time. This change of state is evident in the lines property after calling the read() method. As an example, here's how we would use this class:

In [54]:
# example_log.txt contains 10,000 lines.
lc = LineCounter('example_log.txt')
print(lc.lines)

[]


In [55]:
print(lc.count())

0


In [56]:
# The lc object must read the file to
# set the lines property.
lc.read()
# The `lc.lines` property has been changed.
# This is called changing the state of the lc
# object.
print(lc.lines)

['200.155.108.44 - - [30/Nov/2017:11:59:54 +0000] "PUT /categories/categories/categories HTTP/1.1" 401 963 "http://www.yates.com/list/tags/category/" "Mozilla/5.0 (Windows CE) AppleWebKit/5332 (KHTML, like Gecko) Chrome/13.0.864.0 Safari/5332"\n', '36.139.255.202 - - [30/Nov/2017:11:59:54 +0000] "PUT /search HTTP/1.1" 404 171 "https://www.butler.org/main/tag/category/home.php" "Mozilla/5.0 (Macintosh; PPC Mac OS X 10_5_0) AppleWebKit/5332 (KHTML, like Gecko) Chrome/15.0.813.0 Safari/5332"\n', '50.112.115.219 - - [30/Nov/2017:11:59:54 +0000] "POST /main/blog HTTP/1.1" 404 743 "http://deleon-bender.com/categories/category.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_5 rv:2.0; apn-IN) AppleWebKit/531.48.1 (KHTML, like Gecko) Version/4.0 Safari/531.48.1"\n', '204.132.56.4 - - [30/Nov/2017:11:59:54 +0000] "POST /list HTTP/1.1" 404 761 "http://smith.com/category.htm" "Opera/9.39.(Windows 98; Win 9x 4.90; mn-MN) Presto/2.9.163 Version/12.00"\n', '233.154.7.24 - - [30/Nov/2017:11:59:54 +

In [57]:
print(lc.count())

10000


The ever-changing state of an object is both its blessing and its curse. To understand why a changing state can be seen as a negative, we have to introduce an alternative. The alternative is to build the line counter as a series of independent functions.

In the exercise, we'll be using an example log file containing log lines from a web server. Think of it as a file containing a list of devices that have accessed a website.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


- Write two functions:
  - **read()**: takes in a filename, reads the file, then returns a list of lines from the file.
  - **count()**: takes in a list, and returns its length.
- Call **read()** on the **example_log.txt** file, and assign the return value to an **example_lines** variable.
- Call **count()** on **example_lines**, and assign the return value to the **lines_count** variable.

In [58]:
# PUT YOUR CODE HERE
def read(filename):
    # Returns the lines of the file as a list
    with open(filename, 'r') as f:
        return [line for line in f]

def count(lines):
    # Returns the number of items in the list
    return len(lines)


example_lines = read('example_log.txt')
lines_count = count(example_lines)

## 3.2 Understanding Pure Functions

In the previous example, we were able to count the lines using only functions. When we only use functions, we call it a functional approach to programming, and this style is, unsurprisingly, called [functional programming](https://docs.python.org/3.6/howto/functional.html). In functional programming, functions are **stateless** and rely only on their given inputs to produce an output.

Functions that meet these criteria are called **pure functions**. Here's an example to highlight the difference between pure and impure functions:

In [59]:
# Create a global variable `A`.
A = 5

def impure_sum(b):
    # Adds two numbers, but uses the
    # global `A` variable.
    return b + A

def pure_sum(a, b):
    # Adds two numbers, using
    # ONLY the local function inputs.
    return a + b

print(impure_sum(6))

11


In [60]:
print(pure_sum(4, 6))

10


The benefit of using pure functions over impure (non-pure) functions is the reduction of side effects. A side effect occurs when a function's operation changes something outside its own scope. For example, side effects occur when we change the state of an object, perform any I/O operation, or even call **print()**.

In [61]:
def read_and_print(filename):
    with open(filename) as f:
        # Side effect: reads from a file,
        # which lives outside the function.
        data = [line for line in f]
    for line in data:
        # Side effect: writes to standard
        # output, also outside the function.
        print(line)

Programmers reduce side effects in their code to make it easier to follow, test, and debug. The more side effects a codebase has, the harder it is to step through a program and understand its sequence of execution.
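
Changing the state of an object that the caller passed in is one of the side effects mentioned above. The sketch below contrasts two hypothetical functions: one mutates its argument, and the other leaves it untouched and returns a new list.

```python
def impure_append(items, value):
    # Side effect: mutates the list that the caller passed in
    items.append(value)
    return items


def pure_append(items, value):
    # No side effect: builds and returns a new list
    return items + [value]


original = [1, 2, 3]

result_pure = pure_append(original, 4)
print(original)   # the caller's list is unchanged

result_impure = impure_append(original, 4)
print(original)   # the caller's list now has an extra item
```

With the pure version, the caller never has to wonder whether a list was modified somewhere else in the program.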

While it's tempting to try to eliminate all side effects, they're often used to make programming easier. If we were to ban all side effects, then you wouldn't be able to read in a file, call print, or even assign a variable within a function. Advocates for functional programming understand this tradeoff, and try to eliminate side effects where possible without sacrificing development time.
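
One common way to strike this balance is to keep the logic in pure functions and confine I/O to a thin outer function. The sketch below is one possible arrangement with hypothetical function names. It assumes the HTTP status code is the 9th space-separated field of each log line, which holds for the sample lines shown earlier; the last sample line is made up to include a successful request.

```python
def extract_status(line):
    # Pure: the result depends only on the input string
    return int(line.split(' ')[8])


def count_errors(lines):
    # Pure: counts lines whose status code is 400 or higher
    return sum(1 for line in lines if extract_status(line) >= 400)


def report(filename):
    # Impure shell: file I/O and print() are confined to this function
    with open(filename) as f:
        lines = [line for line in f]
    print(count_errors(lines))


# The pure functions can be checked without touching any file
sample = [
    '200.155.108.44 - - [30/Nov/2017:11:59:54 +0000] "PUT /categories/categories/categories HTTP/1.1" 401 963',
    '36.139.255.202 - - [30/Nov/2017:11:59:54 +0000] "PUT /search HTTP/1.1" 404 171',
    '10.0.0.1 - - [30/Nov/2017:11:59:54 +0000] "GET /app HTTP/1.1" 200 526',  # made-up line
]
print(count_errors(sample))
```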


## 3.3 The Lambda Expression

Instead of the def syntax for function declaration, we can use a [lambda expression](https://docs.python.org/3.5/tutorial/controlflow.html#lambda-expressions) to write Python functions. The lambda syntax closely follows the **def** syntax, but it's not a 1-to-1 mapping. Here's an example of building a function that adds two numbers:

In [62]:
# Using `def`.
def old_add(a, b):
    return a + b

# Using `lambda`.
new_add = lambda a, b: a + b

print(old_add(10, 5) == new_add(10, 5))

True


The **lambda** expression takes in a comma-separated sequence of inputs (like **def**). Then, immediately following the colon, it gives a single expression whose value is returned without an explicit return statement. Finally, when we assign the **lambda** expression to a variable, it acts exactly like a Python function and can be called using the function call syntax: **new_add()**.
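
The short sketch below, with illustrative names, shows that a **lambda** produces an ordinary function object. The main visible difference is the function's name, and a **lambda** can also be called immediately without ever being assigned.

```python
square = lambda x: x * x

def square_def(x):
    return x * x

# Both are ordinary function objects of the same type
print(type(square) == type(square_def))

# A function created with `def` knows its name; a lambda does not
print(square_def.__name__)
print(square.__name__)

# A lambda can be called right away without being assigned
print((lambda a, b: a + b)(10, 5))
```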

A **lambda** expression that isn't assigned to a variable name is called an **anonymous function**. Anonymous functions are extremely helpful, especially as an input to another function. For example, the [sorted()](https://docs.python.org/3/howto/sorting.html#key-functions) function takes an optional **key** argument (a function) that is called on each item to produce the value used for sorting.

In [63]:
unsorted = [('b', 6), ('a', 10), ('d', 0), ('c', 4)]

# Sort on the second tuple value (the integer).
print(sorted(unsorted, key=lambda x: x[1]))

[('d', 0), ('c', 4), ('b', 6), ('a', 10)]


**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

- Call **sorted** on the **lines** variable, passing a **key** function that:
  - Splits the line on single spaces ' '.
  - Returns the 6th element of the split line.
- Assign the **sorted** return value to a **sorted_lines** variable.
- Print the **sorted_lines** variable.

In [64]:
def read(filename):
    with open(filename, 'r') as f:
        return [line for line in f]
    
lines = read('example_log.txt')
# PUT YOUR CODE HERE
sorted_lines = sorted(lines, key=lambda x: x.split(' ')[5])
print(sorted_lines)

['233.154.7.24 - - [30/Nov/2017:11:59:54 +0000] "GET /app HTTP/1.1" 404 526 "http://www.cherry.com/main.htm" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/5360 (KHTML, like Gecko) Chrome/13.0.839.0 Safari/5360"\n', '97.218.117.229 - - [30/Nov/2017:11:59:55 +0000] "GET /blog/tags/tag HTTP/1.1" 401 980 "http://herrera-ayala.com/list/wp-content/register.htm" "Opera/9.25.(X11; Linux x86_64; ml-IN) Presto/2.9.163 Version/10.00"\n', '124.66.196.14 - - [30/Nov/2017:11:59:55 +0000] "GET /wp-content/main HTTP/1.1" 401 514 "https://anderson.com/about.html" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/5321 (KHTML, like Gecko) Chrome/14.0.888.0 Safari/5321"\n', '183.186.55.245 - - [30/Nov/2017:11:59:55 +0000] "GET /tags/tag/wp-content HTTP/1.1" 401 866 "https://www.edwards-santos.org/posts/tags/homepage.html" "Mozilla/5.0 (Windows NT 5.0; uk-UA; rv:1.9.2.20) Gecko/2010-01-29 13:54:00 Firefox/3.8"\n', '70.163.195.102 - - [30/Nov/2017:11:59:55 +0000] "GET /app HTTP/1.1" 401 984 "https://turner.com/homep

## 3.4 The Map Function

The ability to pass functions as arguments is not unique to Python, and it is not a recent development either: languages such as Lisp have supported it for decades. A language that lets us treat functions like any other value (assign them to variables, pass them as arguments, return them from other functions) is said to have **first-class functions**. Any language that contains first-class functions can be written in a functional style.

A set of important functions that accept other functions as arguments (often called **higher-order functions**) is commonly used within the functional paradigm. These functions take in a Python [iterable](https://docs.python.org/3/glossary.html#term-iterable) and, like **sorted**, apply a function to each of its elements. Over the next few sections, we will examine each of these functions, but they all follow the general form of **function_name(function_to_apply, iterable_of_elements)**.

The first function we'll work with is the **map()** function. The **map()** function takes in a function and an iterable (e.g., a list), and creates a new iterable object, a special **map** object. The new object yields the result of applying the function to every element.

```python
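# Simplified sketch of what the built-in map does.
# `simple_map` is an illustrative name, not a real built-in.
def simple_map(func, seq):
    # Lazily yield the result of applying `func`
    # to every element. The real built-in returns
    # a `map` object, which is also lazy.
    for x in seq:
        yield func(x)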
```

Here's how we could use **map()** to add 10 or 20 to every element in a list:



In [65]:
values = [1, 2, 3, 4, 5]

# Note: We convert the returned map object to
# a list data structure.
add_10 = list(map(lambda x: x + 10, values))
add_20 = list(map(lambda x: x + 20, values))

print(add_10)

[11, 12, 13, 14, 15]


In [66]:
print(add_20)

[21, 22, 23, 24, 25]


Note that we convert the return value of **map()** to a list. The returned **map** object is difficult to work with if you expect it to behave like a list. First, printing it does not show its items, and second, you can only iterate over it once.
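A short sketch of both points, using made-up values: printing a **map** object shows the object rather than its items, and a second pass over it yields nothing.

```python
values = [1, 2, 3]
doubled = map(lambda x: x * 2, values)

# Printing shows the object itself, not its items.
print(doubled)

first_pass = list(doubled)
second_pass = list(doubled)

print(first_pass)   # [2, 4, 6]
print(second_pass)  # [] (the map object is already exhausted)
```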

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


- Map each line in the **lines** variable to its corresponding IP address:
  - Split the line on empty spaces ' '.
  - Return the first element on the split line.
- Cast the mapped object to a list, and assign it to the **ip_addresses** variable.
- Print the **ip_addresses** variable.

In [67]:
lines = read('example_log.txt')
ip_addresses = list(map(lambda x: x.split(' ')[0], lines))

print(ip_addresses)

['200.155.108.44', '36.139.255.202', '50.112.115.219', '204.132.56.4', '233.154.7.24', '241.220.141.78', '191.198.138.97', '172.40.187.145', '225.119.46.80', '97.218.117.229', '4.31.18.29', '124.66.196.14', '103.40.29.163', '215.73.240.165', '236.187.70.48', '183.186.55.245', '74.191.205.248', '70.163.195.102', '94.75.8.56', '246.104.173.21', '216.22.182.174', '182.155.179.87', '127.234.203.89', '228.32.87.90', '68.239.93.169', '250.222.65.128', '139.187.167.17', '160.90.141.49', '218.251.100.198', '197.84.86.14', '26.226.73.67', '33.131.176.95', '113.49.82.235', '248.217.222.140', '206.152.183.187', '50.225.166.157', '103.208.34.36', '236.176.115.9', '5.237.70.145', '180.196.148.112', '164.136.42.138', '4.186.143.85', '56.154.68.234', '82.156.42.167', '224.130.32.21', '216.223.205.192', '108.138.6.235', '54.163.144.75', '75.249.126.226', '26.13.166.162', '101.2.241.170', '193.234.34.65', '28.245.152.27', '25.182.230.255', '59.1.202.120', '118.150.73.152', '254.147.29.31', '7.205.198.1

## 3.5 The Filter Function

The second function we'll work with is the **filter()** function. The **filter()** function takes in a function that returns a boolean (or any truthy or falsy) value and an iterable, and creates a new iterable object (this time, a special **filter** object). The new **filter** object yields only the elements for which the function returned a true value.

```python
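# Simplified sketch of what the built-in filter does.
# `simple_filter` is an illustrative name, not a real built-in.
def simple_filter(evaluate, seq):
    # Lazily yield only the elements for which
    # `evaluate` returns a truthy value. The real
    # built-in returns a `filter` object instead.
    for x in seq:
        if evaluate(x):
            yield x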
```

Here's how we could filter odd or even values from a list:




In [89]:
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Note: We convert the returned filter object to
# a list data structure.
even = list(filter(lambda x: x % 2 == 0, values))
odd = list(filter(lambda x: x % 2 == 1, values))

print(even)

[2, 4, 6, 8, 10]


In [69]:
print(odd)

[1, 3, 5, 7, 9]
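The built-in **filter()** does not require the function to return exactly **True** or **False**; it keeps every element for which the result is truthy. The sketch below uses a made-up list of words to show this, along with the built-in shortcut of passing **None** as the function.

```python
words = ['data', '', 'science', '', 'python']

# len() returns an integer, and 0 counts as false.
non_empty = list(filter(lambda w: len(w), words))

# Passing None as the function keeps every truthy element.
also_non_empty = list(filter(None, words))

print(non_empty)       # ['data', 'science', 'python']
print(also_non_empty)  # ['data', 'science', 'python']
```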


**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


- Filter the **ip_addresses** list down to the IP addresses whose first number is less than or equal to 20.
- Cast the filtered object to a list, and assign it to the **filtered_ips** variable.
- Print the **filtered_ips** variable.

In [70]:
lines = read('example_log.txt')
ip_addresses = list(map(lambda x: x.split()[0], lines))

In [90]:
# PUT YOUR CODE HERE

filtered_ips = list(filter(lambda x: int(x.split('.')[0]) <= 20, ip_addresses))

print(filtered_ips)

['4.31.18.29', '5.237.70.145', '4.186.143.85', '7.205.198.134', '2.98.108.99', '20.123.163.219', '17.192.186.123', '19.137.101.141', '5.175.199.96', '20.245.211.32', '19.219.60.226', '3.208.33.106', '16.153.33.150', '19.190.115.218', '0.153.103.45', '12.194.245.108', '2.96.57.90', '4.223.194.188', '6.0.14.128', '12.115.103.178', '5.181.86.241', '3.29.197.164', '1.240.103.127', '9.49.15.69', '9.168.30.192', '1.149.44.202', '6.71.195.192', '2.189.10.201', '7.145.173.45', '0.254.152.197', '5.40.110.55', '5.38.234.254', '17.92.13.79', '12.48.79.71', '11.36.42.176', '6.72.235.92', '4.68.239.140', '1.30.105.246', '0.177.146.178', '3.172.107.70', '2.176.114.240', '20.189.165.83', '5.8.146.209', '14.0.215.165', '9.238.11.123', '6.84.193.66', '5.40.104.181', '18.154.65.30', '2.220.43.118', '8.59.183.202', '10.182.249.118', '16.204.153.175', '11.229.61.134', '16.209.56.250', '5.183.124.144', '0.49.184.87', '18.62.245.157', '10.56.168.95', '0.115.200.130', '18.165.209.126', '6.59.80.184', '1.132.

## 3.6 The Reduce Function - optional

The last function we'll look at is the **reduce()** function from the [functools](https://docs.python.org/3/library/functools.html) module. The **reduce()** function takes in a function and an iterable object such as a list. It then reduces the iterable to a single value by successively applying the given function. It first applies the function to the first two elements. Then it applies the function to that result and the next element, and so on, until a single value remains.

Here's an example of how we can use **reduce()** to sum all elements in a list.

In [72]:
from functools import reduce

values = [1, 2, 3, 4]

summed = reduce(lambda a, b: a + b, values)
print(summed)

10


<center><img width="400" src="https://drive.google.com/uc?export=view&id=1IaMgnHFeTItMAFrbcrqXT_YpgCllOFZC"/></center>
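One way to see the intermediate steps is `itertools.accumulate`, which applies the same two-argument function but yields every running result instead of only the final one. This is a small sketch for inspection, not part of the original exercises.

```python
from functools import reduce
from itertools import accumulate

values = [1, 2, 3, 4]
add = lambda a, b: a + b

# accumulate yields every intermediate result;
# reduce returns only the last one.
steps = list(accumulate(values, add))
print(steps)                # [1, 3, 6, 10]
print(reduce(add, values))  # 10
```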

Note that the lambda expression does not have to use its second argument. For example, you can write a function that always returns the first value of an iterable:

In [73]:
values = [1, 2, 3, 4, 5]

# By convention, we add `_` as a placeholder for an input
# we do not use.
first_value = reduce(lambda a, _: a, values)
print(first_value)

1


Another important aspect of **reduce** is that the lambda function does not necessarily need to return a value of the same type as its inputs. Imagine that we have a list of words and that we want to use **reduce()** to add all word lengths together. For example, for the list ["I", "love", "data", "science"], the answer would be 1 + 4 + 4 + 7 = 16.

The first solution that might come to mind is to use the lambda function **lambda a, b: len(a) + len(b)**, which adds the lengths of two strings. The problem, however, is that after adding the length of "I" to the length of "love", **reduce()** will apply the lambda function to that result, which is 5, and "data", as shown below:

<center><img width="600" src="https://drive.google.com/uc?export=view&id=1oZGZhgFGSNhccxscjYcUhHumO-WvCGXM"/></center>

This will result in a **TypeError** because an integer such as 5 has no length. To overcome this, we need to change the lambda function to account for two cases:

1. Both **a** and **b** are strings.
2. **a** is already an integer and **b** is a string.

The following code shows this:


In [74]:
total_len = reduce(lambda a, b: len(a) + len(b) if isinstance(a, str) else a + len(b), ["I", "love", "data", "science"])

In [75]:
total_len

16

The following diagram shows how this is executed:

<center><img width="500" src="https://drive.google.com/uc?export=view&id=13nCE-5ZPAEHHhr7Xoi45rCm_8uKM2ZVz"/></center>
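As an alternative sketch, **reduce()** also accepts an optional third argument, a starting value for the accumulator. With a starting value of 0, the first argument of the lambda is always an integer, so the two-case check is no longer needed. The parameter names `total` and `word` are chosen here for illustration.

```python
from functools import reduce

words = ["I", "love", "data", "science"]

# The third argument (0) is the starting value of the accumulator,
# so `total` is always an integer and `word` is always a string.
total_len = reduce(lambda total, word: total + len(word), words, 0)
print(total_len)  # 16

# With a starting value, an empty list is handled too.
empty_len = reduce(lambda total, word: total + len(word), [], 0)
print(empty_len)  # 0
```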

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>


To solve this exercise, you will need to use both special cases described above: using only one of the two arguments and handling two cases in the lambda function.

- Using **reduce**, count the total number of elements in **lines** and in **filtered_ips**.
- Find the ratio of the **filtered_ips** count to the **lines** count, and assign the value to **ratio**.
- Print the **ratio** variable.



In [76]:
from functools import reduce

lines = read('example_log.txt')
ip_addresses = list(map(lambda x: x.split()[0], lines))
filtered_ips = list(filter(lambda x: int(x.split('.')[0]) <= 20, ip_addresses))

In [77]:
# PUT YOUR CODE HERE <GUIDED EXERCISE>

count_all = reduce(lambda x, _: 2 if isinstance(x, str) else x + 1, lines)
count_filtered = reduce(lambda x, _: 2 if isinstance(x, str) else x + 1, filtered_ips)
ratio = count_filtered / count_all

print(ratio)

0.0808


## 3.7 Rewriting with List Comprehension - optional

Because we eventually convert the results to lists, we can rewrite the **map()** and **filter()** calls using list comprehensions instead. This is generally considered the more Pythonic way of writing them, as it takes advantage of Python's dedicated syntax for building lists. Here's how you could translate the previous examples of **map()** and **filter()** to list comprehensions:

In [78]:
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Map.
add_10 = [x + 10 for x in values]
print(add_10)

[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]


In [79]:
# Filter.
even = [x for x in values if x % 2 == 0]
print(even)

[2, 4, 6, 8, 10]


From the examples, you can see that we don't need the lambda expressions. If you are looking to add **map()** or **filter()** behavior to your own code, this is usually the recommended way. However, in the next section, we'll provide a case for still using the **map()** and **filter()** functions.
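A list comprehension can also combine both steps at once. The sketch below filters the even values and adds 10 to each, first with the built-in functions and then with a single comprehension (the variable names are illustrative).

```python
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Filter, then map, with the built-in functions.
with_functions = list(
    map(lambda x: x + 10, filter(lambda x: x % 2 == 0, values))
)

# The same result as a single list comprehension.
with_comprehension = [x + 10 for x in values if x % 2 == 0]

print(with_functions)      # [12, 14, 16, 18, 20]
print(with_comprehension)  # [12, 14, 16, 18, 20]
```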

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

- Using list comprehension:
  - Rewrite the **ip_addresses** mapping.
  - Rewrite the **filtered_ips** filter.
- Keeping everything else, print the **ratio** variable.

In [80]:
# GUIDED EXERCISE

lines = read('example_log.txt')
ip_addresses = [line.split()[0] for line in lines]
filtered_ips = [
    ip
    for ip in ip_addresses if int(ip.split('.')[0]) <= 20
]
count_all = reduce(lambda x, _: 2 if isinstance(x, str) else x + 1, lines)
count_filtered = reduce(lambda x, _: 2 if isinstance(x, str) else x + 1, filtered_ips)
ratio = count_filtered / count_all
print(ratio)

0.0808


## 3.8 Writing Function Partials - optional

Sometimes we want to use the behavior of a function, but decrease the number of arguments it takes. The purpose is to "save" one of the inputs, and create a new function that always uses the saved input. Suppose we wanted to write a function that would always add 2 to any number:

In [81]:
def add_two(b):
    return 2 + b 

print(add_two(4))

6


The **add_two** function is similar to the general function f(a, b) = a + b, only it defaults one of the arguments (a = 2). In Python, we can use the **partial** function from the **functools** module to set these argument defaults. **partial** takes in a function and "freezes" any number of positional arguments (filled in starting from the first parameter) or keyword arguments, then returns a new callable with those inputs preset.

In [82]:
from functools import partial

def add(a, b):
    return a + b


add_two = partial(add, 2)
add_ten = partial(add, 10)

print(add_two(4))

6


In [83]:
print(add_ten(4))

14


Partials can take in any function, including built-in ones such as **map**!

In [84]:
# A partial that grabs IP addresses using
# the `map` function from the exercises.
extract_ips = partial(
    map,
    lambda x: x.split(' ')[0]
)
lines = read('example_log.txt')
ip_addresses = list(extract_ips(lines))

In [85]:
ip_addresses[:4]

['200.155.108.44', '36.139.255.202', '50.112.115.219', '204.132.56.4']
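Keyword arguments can be frozen in the same way. As a small sketch, the example below presets the `base` keyword of the built-in `int()`; the name `parse_binary` is made up for this illustration.

```python
from functools import partial

# Freeze the `base` keyword argument of the built-in int().
parse_binary = partial(int, base=2)

print(parse_binary('1010'))  # 10
print(parse_binary('111'))   # 7
```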

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

- Using a partial, create a **count** function that takes in a list and runs the **reduce** implementation of list counting.
- Replace the **reduce** calls with **count** for:
  - **count_all**
  - **count_filtered**
- Keeping everything else, print the **ratio** variable.

In [86]:
# GUIDED EXERCISE

from functools import partial

count = partial(
    reduce,
    lambda x, _: 2 if isinstance(x, str) else x + 1
)

lines = read('example_log.txt')
ip_addresses = [line.split()[0] for line in lines]
filtered_ips = [
    ip
    for ip in ip_addresses if int(ip.split('.')[0]) <= 20
]
count_all = count(lines)
count_filtered = count(filtered_ips)
ratio = count_filtered / count_all
print(ratio)

0.0808


## 3.9 Using Functional Composition - optional

Let's examine how we used each of the **map**, **filter**, and **reduce** functions. In the exercises, we first mapped a list of log lines to their IP addresses, filtered those IP addresses for IPs that start with an integer less than or equal to 20, then counted the results. Notice that the output of each function was passed as the input to the one immediately following it.

Viewing our exercises this way, it's as if we've created a chain of function calls starting from map, and ending with reduce. In mathematics, this chaining of function calls is called [function composition](https://en.wikipedia.org/wiki/Function_composition). Given functions f, g, and h, function composition passes the output of each function as the input of the next: h(g(f(x))).

This is exactly the same concept we used in our exercises:

```python
reduce(filter(map(...)))
```
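Spelled out on a tiny input, the nested call might look like the sketch below. The sample log lines are shortened, hypothetical stand-ins for the real file, and the counting step uses a starting value of 0, which is a choice made for this sketch rather than the approach used in the exercises.

```python
from functools import reduce

# Hypothetical, shortened log lines for illustration only.
sample_lines = [
    '4.31.18.29 - - "GET /app HTTP/1.1"',
    '200.155.108.44 - - "GET /blog HTTP/1.1"',
    '17.92.13.79 - - "GET /main HTTP/1.1"',
]

sample_count = reduce(
    # Count the elements: ignore each item and add 1.
    lambda total, _: total + 1,
    filter(
        # Keep IPs whose first number is <= 20.
        lambda ip: int(ip.split('.')[0]) <= 20,
        # Extract the IP address from each line.
        map(lambda line: line.split()[0], sample_lines),
    ),
    0,  # starting value for the count
)
print(sample_count)  # 2
```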

Using a **compose** function that takes in a sequence of **single-argument** functions, we can create a composed single-argument function similar to the example above. Here's a composed function that works with int types instead of iterable types:

In [87]:

def add_two(x):
    return x + 2

def multiply_by_four(x):
    return x * 4

def subtract_seven(x):
    return x - 7

def compose(*functions):
    def inner(arg):
        for f in functions:
            arg = f(arg)
        return arg
    return inner

composed = compose(
    add_two,  # + 2
    multiply_by_four,  # * 4
    subtract_seven  # - 7
)

# (((10 + 2) * 4) - 7) = 41
answer = composed(10)
print(answer)

41
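Since composing is itself a matter of folding a sequence of functions into one result, the same idea can be sketched with **reduce()**. The name `compose_with_reduce` is hypothetical and the version below is only one possible way to write it.

```python
from functools import reduce


def compose_with_reduce(*functions):
    # Start from `arg` and feed the running result
    # into each function in turn, left to right.
    return lambda arg: reduce(lambda acc, f: f(acc), functions, arg)


composed_again = compose_with_reduce(
    lambda x: x + 2,  # + 2
    lambda x: x * 4,  # * 4
    lambda x: x - 7,  # - 7
)

# (((10 + 2) * 4) - 7) = 41
print(composed_again(10))
```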


By using partials to restrict **map**, **filter**, and **reduce** so that each requires only a single input (an iterable), we can rewrite our previous implementation as a composed function.

**Exercise**

<img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"/>

- Using **compose**, combine the **map**, **filter**, and **reduce** functions (with partial) to create a composable function that takes in a list, and returns a filtered count.
- Assign the result of the composed function to the variable **counted**.

In [88]:
# GUIDED EXERCISE

lines = read('example_log.txt')
ip_addresses = list(map(lambda x: x.split()[0], lines))
filtered_ips = list(filter(lambda x: int(x.split('.')[0]) <= 20, ip_addresses))

extract_ips = partial(
    map,
    lambda x: x.split()[0]
)
filter_ips = partial(
    filter,
    lambda x: int(x.split('.')[0]) <= 20
)
count = partial(
    reduce,
    lambda x, _: 2 if isinstance(x, str) else x + 1
)

composed = compose(
    extract_ips,
    filter_ips,
    count
)
counted = composed(lines)