# Introduction to Object-Oriented Programming

In the previous notebooks, we introduced the paradigm of **functional programming**. In summary, functional programming is a style in which every function takes an input and produces an output. We can stack these functions into a pipeline and pass variables, lists and other data structures through them:
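As a rough illustration of such a pipeline, here is a minimal sketch. The function names and the sample titles are made up for this example and do not come from the earlier notebooks:

```python
def strip_whitespace(titles):
    # Return a new list with surrounding whitespace removed from each title
    return [t.strip() for t in titles]

def to_title_case(titles):
    # Return a new list with each word capitalized
    return [t.title() for t in titles]

# Hypothetical input data
raw_titles = ["  halo ", "thinking out loud  "]

# The output of one function becomes the input of the next
clean_titles = to_title_case(strip_whitespace(raw_titles))
print(clean_titles)  # ['Halo', 'Thinking Out Loud']
```

Each function leaves its input untouched and returns a new list, which is what lets us chain them together.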

<left><img width="600" src="https://drive.google.com/uc?export=view&id=11Yr6ySfPdRfxJVSqe3-__kHkIhxOw52O"></left>

In functional programming, we start with a value and transform it by passing it through a number of functions. An alternative to functional programming is **object-oriented programming**, a paradigm for modeling complex behavior in Python.

Let's look at real-life objects. Real-life objects are tangible things that we can feel, touch and smell. The computer you're using to read this lesson is something you can feel and touch. Smell? Perhaps not as much. However, you use your computer for a specific purpose: to learn data science, to finish work, to connect with friends, etc.

Objects in Python aren't too different. Although you can't touch or feel objects in Python, they are models of certain things. A **class** is essentially a blueprint for creating an object. For example, let's say we have three top music artists sitting in front of us: Beyonce, Ed Sheeran and Kanye West. Each person is a distinct object, but all three have attributes and behaviors associated with one class: music artists.

# Creating a Class and an Instance

We have three music artists sitting in front of us: **Beyonce**, **Ed Sheeran** and **Kanye West**. All three belong to the **class** of **MusicArtist**. To create a class in Python, we use the **class** keyword followed by the name of the class. Let's create a class called MusicArtist:

```python
class MusicArtist():
    pass
```

For now, we'll leave this class empty by using the **pass** keyword. The **pass** keyword is a placeholder that lets us define a class or function without any content. If we didn't include the **pass** keyword, the interpreter would raise an error:

In [1]:
class MusicArtist():

SyntaxError: unexpected EOF while parsing (<ipython-input-1-8e45a498f126>, line 1)

The class name, by convention, is in [PascalCase](https://www.python.org/dev/peps/pep-0008/#class-names) where the first letter of every word is capitalized. Examples:

- CapWords
- TheTitle

Let's return to our original code:

In [2]:
class MusicArtist():
    pass

We've successfully created a **MusicArtist** class. Using this class, we can create **instances**. An instance is a specific object created from a class, with its own set of characteristics and behaviors. If we had Beyonce, Ed Sheeran and Kanye West sitting in front of us, each would be an **instance** of the **MusicArtist** class. Each instance is distinct, but all three are still part of the **MusicArtist** class.

To create an instance of our **MusicArtist** class, we'll write out the name of the **class** followed by parentheses. Then, we'll store this object under a name of our choice:

In [10]:
class MusicArtist():
    pass

beyonce = MusicArtist()

<left><img width="400" src="https://drive.google.com/uc?export=view&id=13RE5_SwbcmukP3LqumvYAEjbZnJtP5mX"></left>

We'll dive deeper into what the behavior & data mean in a later screen. For now, understand that we use classes to create objects. We've officially created our first music artist! Now, let's create an **ed_sheeran** object:

In [11]:
ed_sheeran = MusicArtist()

<left><img width="400" src="https://drive.google.com/uc?export=view&id=1d1J3j0RpPRxOJp3Qs6-PBSOTk2ZlJOYA"></left>

Now, if we were to print out beyonce and ed_sheeran, the interpreter would return:

In [12]:
print(beyonce, ed_sheeran)

<__main__.MusicArtist object at 0x103e34550> <__main__.MusicArtist object at 0x103e34630>


The **0x number** tells us the memory location of our object. Whenever we define a variable like **a = 1**, Python stores the object at a location in memory. Think of this like a unique home address. Whenever we want to use the value, Python looks it up at that address.
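To check this kind of identity ourselves, one option is the built-in **id()** function together with the **is** operator. The snippet below is a small sketch. In CPython (the standard interpreter) **id()** happens to be the object's memory address, but other interpreters may behave differently:

```python
class MusicArtist():
    pass

beyonce = MusicArtist()
ed_sheeran = MusicArtist()

# id() returns an integer that is unique to each object while it is alive
print(id(beyonce) == id(ed_sheeran))  # False

# "is" checks whether two names refer to the very same object
print(beyonce is ed_sheeran)          # False

# Both objects are still instances of the same class
print(isinstance(beyonce, MusicArtist), isinstance(ed_sheeran, MusicArtist))  # True True
```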

Notice that even though **beyonce** and **ed_sheeran** were both created with **MusicArtist()**, they are stored in different locations. These are two distinct objects. Now, let's add **kanye_west** to our objects!

**Exercise**

<left><img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"></left>

- Create a class called **MusicArtist()**.
- Create multiple instances of **MusicArtist()** and assign to the following variables:
    - beyonce
    - ed_sheeran
    - kanye_west

In [1]:
# put your code here

# Defining Variables Within a Class

We've created objects representing Ed Sheeran, Beyonce and Kanye West. However, these objects are empty since we haven't defined anything in our class.

Whenever we look at three music artists like Ed Sheeran, Beyonce and Kanye West, there are immediate differences between each artist. They perform at different dates throughout the year. They produce different types of music. They produce different albums. They're all music artists, but they differ in characteristics.

How do we make sure these differences are reflected in each instance? We can use **instance variables** to define the **data** for each object. Instance variables are similar to regular variables. However, each one belongs to a specific instance and is accessed through that instance. To better understand this, let's say we wanted to create an instance variable that told us the album name of one of our music artists. To create an instance variable, we'd write the following:

```python
object.name_of_new_variable = "Content of Variable"
```
Let's create an instance variable called album for the beyonce object:

```python
beyonce.album = "Dangerously in Love"
```

The album would be reflected in our chart like this:

<left><img width="400" src="https://drive.google.com/uc?export=view&id=10sH58W0OVeZ2xBAbdUwGx7jCetIdGKSB"></left>

Let's look at the entire class and object:

```python
class MusicArtist():
    pass

beyonce = MusicArtist()

beyonce.album = "Dangerously in Love"
```

To access this variable, we'd just write out:



In [2]:
class MusicArtist():
    pass

beyonce = MusicArtist()

beyonce.album = "Dangerously in Love"
print(beyonce.album)

Dangerously in Love


Because album is an **instance variable**, album only exists in the **beyonce** object. If we were to try and print out album from **ed_sheeran**, we'd get an error:

In [3]:
ed_sheeran = MusicArtist()
print(ed_sheeran.album)

AttributeError: 'MusicArtist' object has no attribute 'album'
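If we want to check which instance variables an object currently has without triggering an error, the built-in functions **vars()** and **hasattr()** can help. A minimal sketch, reusing the same empty class:

```python
class MusicArtist():
    pass

beyonce = MusicArtist()
ed_sheeran = MusicArtist()
beyonce.album = "Dangerously in Love"

# vars() shows the instance variables stored on each object
print(vars(beyonce))                 # {'album': 'Dangerously in Love'}
print(vars(ed_sheeran))              # {}

# hasattr() tells us whether an attribute exists before we try to use it
print(hasattr(ed_sheeran, "album"))  # False
```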

Now, let's add instance variables to our **MusicArtist()** objects!

**Exercise**

<left><img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"></left>


- Create the following **instance variables** for **ed_sheeran** and **beyonce** objects:
    - name
    - genre
    - song
- For **ed_sheeran**, define these variables as:
    - name: "Ed Sheeran"
    - genre: "Pop"
    - song: "Thinking out Loud"
- For **beyonce**, define these variables as:
    - name: "Beyonce"
    - genre: "R&B"
    - song: "Halo"
- Print all these variables.

In [None]:
# put your code here

# Defining Attributes Using The __init__() Method

In the previous example, we've manually created instance variables for **beyonce**:

```python
class MusicArtist():
    pass

beyonce = MusicArtist()
ed_sheeran = MusicArtist()

beyonce.album = "Dangerously in Love"
```

Now, let's add an album instance variable for **ed_sheeran**:

```python
class MusicArtist():
    pass

beyonce = MusicArtist()
ed_sheeran = MusicArtist()

beyonce.album = "Dangerously in Love"
ed_sheeran.album = "Divide"
```

<left><img width="400" src="https://drive.google.com/uc?export=view&id=1xBtMmW0ssSQb6_6q3xrUtsVh0NFVjbhw"></left>

We need to manually create each instance variable for each object. If we had 10 different objects with 2 instance variables each, we'd need to write 20 separate assignments. There's an easier way.

So far, we haven't defined anything underneath our class definition:

```python
class MusicArtist():
    pass
```

Since we want both of our objects to have the **album** variable, we could create this instance variable within our class. To do this, we'll use the \_\_init\_\_() method. Functions defined within a **class** are called **methods**. The \_\_init\_\_() method is a special method that automatically gets called whenever we create a **new object** from the class. If we want to automatically create instance variables when creating an object, we can create them within the \_\_init\_\_() method.

Let's start, by defining our \_\_init\_\_() method:

```python
class MusicArtist():
    def __init__():
        pass
```

To add an instance variable, we'll need to add parameters to our \_\_init\_\_() method. The first step will be to add the name of our variable:

```python
def __init__(album):
```

Now, if we tried to pass in a value for **album** and return it like this, we'd get an error:

In [4]:
class MusicArtist():
    def __init__(album):
        return album

x = MusicArtist("hello")

TypeError: __init__() takes 1 positional argument but 2 were given

The error message says we gave two arguments even though we only wrote **"hello"**. In Python, methods within a class automatically receive the object itself as their first argument. As a result, we'll need to add a **self** parameter:

In [22]:
class MusicArtist():
    def __init__(self, album):
        self.album = album

When we create a new object, the \_\_init\_\_ method gets invoked automatically. The **self** parameter receives the object that Python passes into \_\_init\_\_. Let's look at an example:



In [23]:
class MusicArtist():
    def __init__(self, album):
        self.album = album

beyonce = MusicArtist("Dangerously in Love")

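Under the hood, this is roughly what's happening:

```python
beyonce = MusicArtist.__new__(MusicArtist)  # Python first creates an empty object
MusicArtist.__init__(beyonce, "Dangerously in Love")  # then passes it to __init__ as self
```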

We can see how they connect here:

<left><img width="600" src="https://drive.google.com/uc?export=view&id=1ckNTtCLC1WtuMbVZffBSji2pw9SJRjWL"></left>

**beyonce** is actually passed in as an argument to \_\_init\_\_. This is why our error message earlier said we gave two arguments. Whenever we call a method on an object, the object itself has to be passed in as the first argument. Luckily, the Python creators designed the language to do this automatically. Note that **self** is not a reserved keyword; it is the conventional name for the parameter that references the object.

<left><img width="500" src="https://drive.google.com/uc?export=view&id=1XKv33fEThAuwbwozaEAXhcTaY7neVd5N"></left>
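One way to convince ourselves that **self** really is the object being created is to record its identity inside \_\_init\_\_ and compare it afterwards. This is only a sketch for illustration: the **id_seen_in_init** attribute is a made-up name and not something we'd normally store.

```python
class MusicArtist():
    def __init__(self, album):
        # self is the new object that Python passes in automatically
        self.album = album
        # Hypothetical attribute, used only to demonstrate the idea
        self.id_seen_in_init = id(self)

beyonce = MusicArtist("Dangerously in Love")

# The object seen inside __init__ is the same object we now call beyonce
print(beyonce.id_seen_in_init == id(beyonce))  # True
```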

Now, let's add some attributes to our music artists!


**Exercise**

<left><img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"></left>

- Within class **MusicArtist()**, define the \_\_init\_\_() method.
- Within the \_\_init\_\_() method, define:
    - **name**
    - **genre**
    - **song**
- Create an **ed_sheeran** object, define these variables as:
    - **name**: "Ed Sheeran"
    - **genre**: "Pop"
    - **song**: "Thinking out Loud"
- Create a **beyonce** object, define these variables as:
    - **name**: "Beyonce"
    - **genre**: "R&B"
    - **song**: "Halo"

In [7]:
# put your code here

# Accessing Instance Variables

Here's what we've created in the previous example:

```python
class MusicArtist():
    def __init__(self, album):
        self.album = album

beyonce = MusicArtist("Dangerously in Love")
```

Now that we've created an instance variable, let's access this instance variable. To access any instance variable, we use the dot notation:

```python
beyonce.album
```

This would print:

```python
"Dangerously in Love"
```

Since **beyonce.album** is a string, we can manipulate this variable like a string. For example, we could concatenate the string:

```python
print(beyonce.album + " is awesome.")
```

Or we could make the whole string lowercase:

```python
beyonce.album.lower()
```

This would print:

```python
"dangerously in love"
```

String methods return a new string instead of changing the original. To make sure **beyonce.album** is **"dangerously in love"** whenever we access it, we'll also need to assign the result back, like **beyonce.album = beyonce.album.lower()**.
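The difference between calling a string method and assigning its result can be sketched as follows, using the same class as before:

```python
class MusicArtist():
    def __init__(self, album):
        self.album = album

beyonce = MusicArtist("Dangerously in Love")

# Calling lower() returns a new string; the instance variable is unchanged
lowered = beyonce.album.lower()
print(lowered)        # dangerously in love
print(beyonce.album)  # Dangerously in Love

# Assigning the result back updates the instance variable
beyonce.album = beyonce.album.lower()
print(beyonce.album)  # dangerously in love
```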

An instance variable behaves like a regular variable. Let's start manipulating our instance variables!

**Exercise**

<left><img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"></left>

- Use **str.format()** to print out **"Ed Sheeran is singing Thinking out Loud tonight"**.
- Use **str.format()** to print out **"Beyonce is singing Halo tonight"**.

In [None]:
# put your code here

# Adding Methods to a Class

In the previous screen, we printed out the song each artist is going to sing. If we had 10 artists, we could write our code this way:

```python
print("{name} is singing {song} tonight".format(name = ed_sheeran.name, song = ed_sheeran.song))
print("{name} is singing {song} tonight".format(name = beyonce.name, song = beyonce.song))
print("{name} is singing {song} tonight".format(name = kanye_west.name, song = kanye_west.song))
print("{name} is singing {song} tonight".format(name = beatles.name, song = beatles.song))
print("{name} is singing {song} tonight".format(name = bob_marley.name, song = bob_marley.song))
print("{name} is singing {song} tonight".format(name = michael_jackson.name, song = michael_jackson.song))
print("{name} is singing {song} tonight".format(name = jay_z.name, song = jay_z.song))
print("{name} is singing {song} tonight".format(name = avicii.name, song = avicii.song))
print("{name} is singing {song} tonight".format(name = justin_timberlake.name, song = justin_timberlake.song))
print("{name} is singing {song} tonight".format(name = bob_dylan.name, song = bob_dylan.song))
```

While we could manually hardcode each of these out, an easier way would be to use a method. We used methods earlier when we defined the \_\_init\_\_ method. As a first example, let's give our music artists the ability to dance.

To do this, we can create a method within our **class**. Creating a method follows the exact same steps as creating a function:

```python
class MusicArtist():
    def __init__(self, album):
        self.album = album

    def dance():
        pass

beyonce = MusicArtist("Dangerously in Love")
```

Whenever we create a new method, we should define which variables we'd like to use within our method. In our example, let's say we wanted our method to print out **"Beyonce is dancing."** In this case, we'll need to add a **name** instance variable. Then, we'll call this instance variable in our method:




In [8]:
class MusicArtist():
    def __init__(self, album, name):
        self.album = album
        self.name = name

    def dance(name):
        return name + "is dancing." 

beyonce = MusicArtist("Dangerously in Love", "Beyonce")

beyonce.dance()

TypeError: unsupported operand type(s) for +: 'MusicArtist' and 'str'

Remember, methods within a **class** automatically receive the object as their first argument. As a result, our current **dance()** method is attempting to concatenate the object with a string. Whenever we create a new method, we'll need to add **self** as the first parameter, as a placeholder for the object that's automatically passed:

In [9]:
class MusicArtist():
    def __init__(self, album, name):
        self.album = album
        self.name = name

    def dance(self):
        return self.name + " is dancing." 

beyonce = MusicArtist("Dangerously in Love", "Beyonce")

beyonce.dance()

'Beyonce is dancing.'

<left><img width="400" src="https://drive.google.com/uc?export=view&id=14E3K5kBsRuddv3Yt8WKmFt4EFZPMxNKB"></left>
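To make the automatic passing of **self** more concrete, the sketch below calls **dance()** in two ways: on the instance, and on the class with the instance passed in explicitly. The variable names **via_instance** and **via_class** are just for illustration:

```python
class MusicArtist():
    def __init__(self, album, name):
        self.album = album
        self.name = name

    def dance(self):
        return self.name + " is dancing."

beyonce = MusicArtist("Dangerously in Love", "Beyonce")

# Calling the method on the instance...
via_instance = beyonce.dance()
# ...is equivalent to calling it on the class and passing the instance ourselves
via_class = MusicArtist.dance(beyonce)

print(via_instance == via_class)  # True
```

In everyday code we'd use the first form; the second one only shows what Python does for us.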

**Exercise**

<left><img width="100" src="https://drive.google.com/uc?export=view&id=1E8tR7B9YYUXsU_rddJAyq0FrM0MSelxZ"></left>

- Write a method called **sing()** that creates a string matching the following pattern and displays it using **print()**:
    - "**{name} will be singing {song} tonight**"
    - **{name}** should map to the name attribute
    - **{song}** should map to the song attribute
- Call the **sing()** method on both the **ed_sheeran** and **beyonce** objects.

In [36]:
# put your code here

# Creating The SimpleFrame Class

In this Python introduction lesson, here's what we've accomplished:

- We wrote **clean**, **modular** code to figure out **Spotify's** most popular song in 2017.
- We've used **list comprehensions** and **lambda** functions to figure out the dominant artist of 2017.
- We've created our own version of **Mad Libs** using Ed Sheeran lyrics.
- We've used **datetime objects** to figure out the dominant artists for each month.
- We've used **object-oriented programming** to create our own **MusicArtist class**.

In the previous two notebooks, we wrote functions to analyze our music data. In the previous sections, we introduced **Object-Oriented Programming**. How do we know which paradigm to use? Functional? Procedural? Object-oriented? Each paradigm has its pros and cons:


<left><img width="600" src="https://drive.google.com/uc?export=view&id=1c8D3mU-kfbllAMCgx5rnLD-PdVPqm1mU"></left>

Loading data from a file, manipulating it, and computing summary statistics (like a column average or maximum) is common in data science projects. While we can use the core Python data structures (lists, dictionaries, and tuples) and the core Python operations (like arithmetic operators) to accomplish these tasks, there are some benefits to using a specific module for data analysis.
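For comparison, here is a rough sketch of what such a task looks like with only core Python and the standard library. The tiny in-memory CSV below is made up for this example; it stands in for a real file:

```python
import csv
import io
from statistics import mean

# Hypothetical in-memory CSV standing in for a real file
raw = io.StringIO("Track Name,Streams\nSong A,100\nSong B,300\n")

rows = list(csv.reader(raw))
header, body = rows[0], rows[1:]

# Find the column by name, then convert its values from strings to integers
streams_index = header.index("Streams")
streams = [int(row[streams_index]) for row in body]

print(mean(streams))
print(max(streams))  # 300
```

Even for two rows we have to handle the header, find the column position and convert types ourselves, which is the kind of repetition a dedicated class can hide.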

In 2008, software engineer Wes McKinney encountered limitations for data analysis in Python and sought to build a high-performance, flexible tool to perform analysis on financial data. He found the process of importing, merging and analyzing data within Python to be cumbersome. As a result, he decided to build the pandas library. [pandas](https://pandas.pydata.org/pandas-docs/stable/) has since become one of the most common libraries for data analysis.



As Wes McKinney says:

>“Scientists unnecessarily dealing with the drudgery of simple data manipulation tasks makes me feel terrible.”

Core to pandas is the **DataFrame** class, which stores data read in from a file and contains hundreds of methods for working with the data. You can preview the breadth of methods at the pandas API documentation page.

In the next lessons, we'll dive into how to use the [pandas library](https://pandas.pydata.org/pandas-docs/stable/api.html#dataframe) to perform common data analysis tasks easily. In these guided cells, we'll apply what we learned in this course to replicate a small part of the pandas library. We'll create a SimpleFrame class that supports some common tasks like:

- Reading in data from a CSV file
- Computing the maximum, minimum, or average value of a column
- Adding new columns

After building up this class, we'll use it to answer the following questions using simple function calls:

- Which song had the highest number of plays in one day?
- Which song had the lowest number of plays in one day?

In [10]:
import csv
from statistics import mean, stdev, median, mode

class SimpleFrame():
    def __init__(self, filename):
        self.filename = filename
    
    def read_data(self):
        '''
        Reads and opens the data
        '''
        # The with statement closes the file once we're done reading it
        with open(self.filename, "r") as f:
            self.data = list(csv.reader(f))
        self.columns = self.data[0]
    
    def head(self):
        '''
        Displays the first five rows
        '''
        return self.data[:5]
        
    
    def shape(self):
        # Number of rows (including the header row) and number of columns
        num_rows = len(self.data)
        num_cols = len(self.data[0])
        return [num_rows, num_cols]
    
    def new_column(self, column_name):
        for pos, d in enumerate(self.data):
            if pos == 0:
                d.append(column_name)
            else:
                d.append('NA')
    
    def apply(self, column_name, new_value):
        for pos, col in enumerate(self.data[0]):
            if col == column_name:
                column_index = pos
        
        for data in self.data[1:]:
            data[column_index] = new_value
    
    def subset(self, column_name, row_value):
        for pos, col in enumerate(self.data[0]):
            if col == column_name:
                column_index = pos
        
        print(column_index)
        subset_data = []
        for data in self.data[1:]:
            if row_value in data:
                subset_data.append(data[column_index])
        return subset_data

    
    def summary_stats(self, column_name):
        for pos, col in enumerate(self.data[0]):
            if col == column_name:
                column_index = pos

        # csv.reader returns strings, so convert the column to numbers first
        num_data = [float(data[column_index]) for data in self.data[1:]]
        # mean, stdev and median were imported directly from statistics
        m = mean(num_data)
        std = stdev(num_data)
        med = median(num_data)
        
        print("Mean is {mean}".format(mean= m))
        print("Standard Deviation is {std}".format(std= std))
        print("Median is {median}".format(median= med))
        
            
    def minimum(self, column):
        for pos, col in enumerate(self.data[0]):
            if col == column:
                column_index = pos

        ## Find min value
        col_data = []
        for row in self.data[1:]:
            col_data.append([row[1],row[2],row[column_index]])
        
        return min(col_data, key= lambda x: x[2])
    
    def maximum(self, column):
        for pos, col in enumerate(self.data[0]):
            if col == column:
                column_index = pos
        ## Find max value
        col_data = []
        for row in self.data[1:]:
            col_data.append([row[1],row[2],row[column_index]])
        return max(col_data, key= lambda x: x[2])
  

In [12]:
s = SimpleFrame("music_data.csv")
s.read_data()
print(s.shape())

[37101, 6]


In [13]:
s.columns
s.new_column('hello')
s.head()

[['', 'Track Name', 'Artist', 'Streams', 'Date', 'Region', 'hello'],
 ['0',
  'Reggaetón Lento (Bailemos)',
  'CNCO',
  '19272',
  '2017-01-01',
  'ec',
  'NA'],
 ['1', 'Chantaje', 'Shakira', '19270', '2017-01-01', 'ec', 'NA'],
 ['2',
  'Otra Vez (feat. J Balvin)',
  'Zion & Lennox',
  '15761',
  '2017-01-01',
  'ec',
  'NA'],
 ['3', "Vente Pa' Ca", 'Ricky Martin', '14954', '2017-01-01', 'ec', 'NA']]

In [15]:
len(s.subset("Artist","Shakira"))

2


666

In [37]:
print(s.maximum("Streams"))
print(s.minimum("Streams"))

2
['Reggaetón Lento (Bailemos)', 'CNCO', '9998']
['Ay Mi Dios', 'IAmChino', '10000']
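Notice that the reported maximum is '9998' while the reported minimum is '10000'. This happens because **csv.reader** returns every value as a string, and strings are compared character by character rather than numerically. The sketch below reproduces the effect with the two rows from the output above and shows one possible fix, converting the stream count with **int()** inside the key function:

```python
# Two rows taken from the output above; the stream counts are still strings
rows = [
    ["Reggaetón Lento (Bailemos)", "CNCO", "9998"],
    ["Ay Mi Dios", "IAmChino", "10000"],
]

# String comparison: "9998" sorts after "10000" because "9" > "1"
print(max(rows, key=lambda x: x[2]))

# Numeric comparison: convert the stream count to an integer first
print(max(rows, key=lambda x: int(x[2])))
```

Applying the same conversion in the **minimum()** and **maximum()** methods would make them compare stream counts as numbers.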
