# Lab - Object Oriented Programming

In [1]:
import pandas as pd
import numpy as np

# Challenge 2

In order to understand the benefits of simple object-oriented programming, we have to build up our classes from the beginning. 

You'll use the following dataframe generator to test some things. Try to understand what the following function does.

In [2]:
chars = ['a', 'b', 'c','d', 'e', 'f', ' ', 'á','é','ó']

def create_weird_dataframe(size=10):
    def create_weird_colnames(size=size):
        probs = [.2,.2,.15,.1,.1,.1,.05,.05,.025,.025]

        return [''.join(
            [(char.upper() if np.random.random() < 0.2 else char) 
                     for char in np.random.choice(chars,size=12, p=probs)]) for i in range(size)]
    
    data = np.random.random(size=(size,size))
    colnames = create_weird_colnames(size)
    return pd.DataFrame(data=data, columns=colnames)

Test the results of running that function below. Run it several times

In [3]:
df = create_weird_dataframe()
df.head()

Unnamed: 0,ddbcbb Óbdec,bcdcbe BaaaA,abaaafeaaa,dáéAafácabaa,cCeaóaebaDbé,aadac ddfbcb,dcfAaeaFec d,dceáéfaabaéf,áÁcbea aedá,c béCádbaabc
0,0.016226,0.105391,0.116765,0.667572,0.577595,0.904384,0.822368,0.806613,0.337573,0.442554
1,0.449491,0.237465,0.563181,0.506778,0.421964,0.931779,0.551095,0.349666,0.157103,0.403551
2,0.128462,0.20017,0.128086,0.725989,0.042616,0.457246,0.688638,0.894189,0.601321,0.099328
3,0.159506,0.349599,0.60556,0.930295,0.118903,0.700307,0.65221,0.560123,0.311208,0.922066
4,0.821243,0.901651,0.596592,0.889394,0.563118,0.524178,0.856793,0.06243,0.368661,0.433195


## Correcting the column names

We'll create a function that rename the weird column names. The idea is to, later, extend that idea to our own brand new dataframe class.

### let's start simple: get the column names of the dataframe.

Store it in a variable called `col_names`


In [4]:
col_names = df.columns
col_names

Index(['ddbcbb Óbdec', 'bcdcbe BaaaA', '  abaaafeaaa', 'dáéAafácabaa',
       'cCeaóaebaDbé', 'aadac ddfbcb', 'dcfAaeaFec d', 'dceáéfaabaéf',
       'áÁcbea aedá ', 'c béCádbaabc'],
      dtype='object')

### Let's iterate through this columns and transform them into lower-case column names

Create a list comprehension to do that if possible. Store it in a variable called `lower_colnames`

In [5]:
lower_colnames = [x.lower() for x in col_names] 
lower_colnames

['ddbcbb óbdec',
 'bcdcbe baaaa',
 '  abaaafeaaa',
 'dáéaafácabaa',
 'cceaóaebadbé',
 'aadac ddfbcb',
 'dcfaaeafec d',
 'dceáéfaabaéf',
 'áácbea aedá ',
 'c bécádbaabc']

### Let's remove the spaces of these column names!

Replace each column name space ` ` for an underline `_`. Again, try to use a list comprehension to do that. 
For this first task use `.replace(' ','')` method to do that.

In [6]:
[y.replace(" ", "_") for y in lower_colnames]

['ddbcbb_óbdec',
 'bcdcbe_baaaa',
 '__abaaafeaaa',
 'dáéaafácabaa',
 'cceaóaebadbé',
 'aadac_ddfbcb',
 'dcfaaeafec_d',
 'dceáéfaabaéf',
 'áácbea_aedá_',
 'c_bécádbaabc']

### Create a function that groups the results obtained above and return the lower case underlined names as a list

Name the function `normalize_cols`. This function should receive a dataframe, get the column names of a it and return the treated list of column names.

In [7]:
def normalize_cols(df):
    df = create_weird_dataframe()
    col_names = df.columns
    lower_colnames = [x.lower() for x in col_names]
    no_spaces = [y.replace(" ", "_") for y in lower_colnames]
    return no_spaces

### Test your results

Use the following line of code to test your results. Run it several times to see some behaviors.

In [8]:
normalize_cols(create_weird_dataframe())

['aaófcfbóbccd',
 'bbádaaaedecb',
 'cfcffbbbeafe',
 'dbdaccedác_e',
 'feaad_babfc_',
 'cdbcaaaáfáaé',
 'faebeaafbbbc',
 'cbeaóaafeaaa',
 'cbabááfbb_b_',
 'cdadeécbffáó']

### hmmm, we've made a mistake!

We've commited several mistakes by doing this. Have observed any bugs associated with our results?

In order for us to see some problems in our results, we have to look for edge cases. 

For example: 

**Problem #1:** what if there are 2 or more following spaces? We want it to replace the spaces by several underlines or condense them into one?

**Problem #2:** what if there are spaces at the beginning? Should we substitute them by underline or drop them?

Let's correct each problem. Starting by problem 2.

## Correcting our function

Instead of substituting the spaces at first place, let's remove the trailing and leading spaces!

Recreate the `normalize_cols` with the solution to `Problem 2`.

*Hint: Copy and paste the last `normalize_cols` function to change it.*

In [9]:
def normalize_cols(df):
    df = create_weird_dataframe()
    col_names = df.columns
    lower_colnames = [x.lower() for x in col_names]
    lead_trail = [k.strip() for k in lower_colnames]
    no_spaces = [y.replace(" ", "_") for y in lead_trail]
    return no_spaces

### Test your results again.

At least, for now, you should not have any trailing nor leading underlines.

In [10]:
normalize_cols(create_weird_dataframe())

['dccéáebcébdd',
 'bdddbaacbbed',
 'cóbcabfaccé',
 'aaadaafcaóaf',
 'daébebfbbdfc',
 'ebfadcfóbadc',
 'abdffbcf_cbá',
 'cefcf_cbaéba',
 'aaeábfcdaeef',
 'bf__dacbáfóa']

### Correcting problem 1

To correct problem 1, instead of using `.replace()` string method, we want to use a regular expression. Use the module `re` to substitute the pattern of `1 or more spaces` by 1 underline `_`.

Test your solution on the variable below:

In [11]:
import re 


text = 'these spaces      should all be one underline'
re.sub(" +", "_", text)

'these_spaces_should_all_be_one_underline'

### Now correct your `normalize_cols` function

*Hint: Copy and paste the last `normalize_cols` function to change it.*

In [12]:
def normalize_cols(df):
    no_spaces = []
    df = create_weird_dataframe()
    col_names = df.columns
    lower_colnames = [x.lower() for x in col_names]
    lead_trail = [k.strip() for k in lower_colnames]
    for y in lower_colnames:
        c = re.sub(" +", "_", y)
        no_spaces.append(c)
    return no_spaces

### Again, test your results.

Now, sometimes some column names should have smaller sizes (because you are removing consecutive spaces)

In [13]:
normalize_cols(create_weird_dataframe())

['óedeededfefc',
 'cbf_édbebdad',
 'abbéfbcbabfa',
 'ead_ddcefcac',
 'fdefaóbddafb',
 'acb_cbéaáeba',
 '_eaacebcbácb',
 'áacbfábaaébf',
 'cfaabbacébba',
 'dcbcaacfbdcc']

## Last step: remove accents

The last step consists in removing accents from the strings.

Import the package `unidecode` to use its module also called `unidecode` to remove accents. Test on the word below.

In [14]:
import unidecode
text = 'aéóúaorowó'

In [15]:
unaccented_string = unidecode.unidecode(text)
unaccented_string

'aeouaorowo'

### Now remove the accents for each column name in your `normalized_cols` function.

*Hint: Copy and paste the last `normalize_cols` function to change it.*

In [16]:
def normalize_cols(df):
    no_spaces = []
    df = create_weird_dataframe()
    col_names = df.columns
    lower_colnames = [x.lower() for x in col_names]
    lead_trail = [k.strip() for k in lower_colnames]
    for y in lower_colnames:
        u = unidecode.unidecode(y)
        c = re.sub(" +", "_", u)
        no_spaces.append(c)
    return no_spaces

### Test your results

In [17]:
normalize_cols(create_weird_dataframe())

['da_acaofdcaa',
 'ac_baeeabbec',
 'aeeabaeebeeb',
 '_dcfac_beaae',
 'baaaaca_aade',
 'adafcceccdaa',
 'aaaceafdaaeb',
 'b_acbcdbbebc',
 'eaadaaebbfee',
 'cfaaobaaffaa']

## Good job. 

Right now you have a function that receives a dataframe and returns its columns names with a good formatting.

# Creating our own dataframe.

In [18]:
from pandas import DataFrame

A dataframe is just a simple class. It contains its own attributes and methods. 

When you create a pd.DataFrame() you are just instantiating the DataFrame class as an object that you can store in a variable. From this point onwards, you have access to all DataFrame class attributes (`.columns` for example) and methods (`.isna()` for example). We've been using those since always! 

If we wish, we could create our own class inheriting everything from a DataFrame class.

In [19]:
class myDataFrame(DataFrame):
    pass

Instead of just creating myDataFrame, put your function inside your new inherited class, that is, transform `normalize_cols` into a method of your own DataFrame.

Remember you'll have to give self as the first argument of the `normalize_cols`. So you could replace everything you once called `dataframe` inside your `normalize_cols` by `self`. 

At the end, return the list of the correct names.

In [20]:
class myDataFrame(DataFrame):
    
    def normalize_cols(DataFrame):
        no_spaces = []
        df = create_weird_dataframe()
        col_names = df.columns
        lower_colnames = [x.lower() for x in col_names]
        lead_trail = [k.strip() for k in lower_colnames]
        for y in lower_colnames:
            u = unidecode.unidecode(y)
            c = re.sub(" +", "_", u)
            no_spaces.append(c)
        return no_spaces

Test your results.

In [21]:
df = myDataFrame(create_weird_dataframe())
df.normalize_cols()

['ddooddefacba',
 'ababccaddccf',
 'fcfboff_ebfe',
 'afaceafacfaa',
 'bafcadeaaadc',
 'ceaafeaaacee',
 'aabeaa_fdbec',
 'aec_ddacbdd_',
 'edcacaeccedb',
 '_acobcbdabab']

## Understanding even more the `self` argument

Instead of returning a list containing the correct columns, you should now assign the correct columns to the `self.columns` - this will effectively replace the values of your object by the correct columns.


Now change your method to return the dataframe itself. That is, return the `self` argument this time and see the results! 

```python
class myDataFrame(DataFrame):
    def normalize_cos(self):
        ...
        return self
```

In [57]:
class myDataFrame(DataFrame):
        
    def normalize_cols(self):
        no_spaces = []
        df = create_weird_dataframe()
        col_names = df.columns
        lower_colnames = [x.lower() for x in col_names]
        lead_trail = [k.strip() for k in lower_colnames]
        for y in lower_colnames:
            u = unidecode.unidecode(y)
            c = re.sub(" +", "_", u)
            no_spaces.append(c)
        col_names = [x for x in no_spaces]
        df.columns = col_names
        return df
    

In [58]:
df = myDataFrame(create_weird_dataframe())
df.normalize_cols()

Unnamed: 0,be_eceebafaa,a_babeacbeaf,eadedeeaaafa,fafaadacdaaa,badobedcoeae,fffaaaacec_a,ba_oaeebaacc,effaofdaadac,b_bbabcccaca,aacbdaeea_ce
0,0.82777,0.110819,0.82671,0.006433,0.591657,0.863813,0.219001,0.083723,0.560232,0.928782
1,0.455884,0.323964,0.282224,0.985413,0.267172,0.666007,0.020701,0.157035,0.388611,0.295958
2,0.401238,0.038388,0.942277,0.151542,0.670713,0.673015,0.363183,0.375101,0.248377,0.28492
3,0.128311,0.575432,0.543415,0.926718,0.413166,0.885301,0.340005,0.001445,0.654852,0.499442
4,0.966379,0.68292,0.193075,0.350574,0.528797,0.081795,0.909741,0.389013,0.505196,0.996507
5,0.183901,0.955627,0.914884,0.710046,0.608778,0.426318,0.851444,0.408447,0.549823,0.975516
6,0.612468,0.963309,0.080879,0.24681,0.922057,0.211607,0.980954,0.455313,0.802151,0.86535
7,0.27596,0.427017,0.902355,0.108855,0.567312,0.187116,0.693541,0.228582,0.189356,0.39259
8,0.882931,0.636678,0.630999,0.478697,0.280483,0.797274,0.055765,0.092009,0.599489,0.656945
9,0.91383,0.52433,0.31958,0.000611,0.749023,0.60104,0.658026,0.57141,0.960975,0.023884


# Challenge 1

## Creating a class

First of all, let's create a simple class. Name this class `Car`. ([PEP8](https://www.python.org/dev/peps/pep-0008/#class-names) suggests using CamelCase for class names, i.e., using the first letter of each name as upper-case.)

That should be as simple as possible. Use the class syntax to create it and its content should be only the 
```python 
pass
```
statement.


The `pass` statement is used just as a placeholder. This will be a class that doesn't do anything (yet).

In [24]:
# your code here

In [None]:
class Car:
    pass

In [None]:
my_car = Car()

## Let's think of which attributes should a car have

Think of attributes that are intrinsic of a car. Think of 5 attributes that all cars have and their possible values. Write down these 5 attributes for later use.

In [None]:
# write the attributes name you've chosen as a comment here.


We will create the `__init(self,)__` special method. This is the first thing that is run when you instantiate a new object (by calling `Car()` for example).

So each object that you are creating will instantly do whatever operation you perfom inside `__init(self,)__`. If you create new attributes over there, it will be accessible as soon as you create it. If you, instead, run some internal methods, it will perform as soon as the variable is created.

Let's check that.

### Create a `__init__(self)` special method inside your `Car` class and then perform a `for loop`  inside of it. 


To see the what happens when you initialize your class when a `__init__(self)` method exists, define this function and plug the following piece of code inside of it.

```python
from tqdm.auto import tqdm
import time

for i in tqdm(range(10), desc='__init__ is running, yay'):
    time.sleep(.1)
```

In [None]:
# your code here

In [None]:
class Car:
    def __init__(self):
        from tqdm.auto import tqdm
        import time
        
        for i in tqdm(range(10), desc='__init__ is running, yay'):
            time.sleep(.1)

### Afterwards, instantiate your `Car` class and see this beauty.

In [None]:
my_car = Car()

## Understanding the self argument

Now, below the `for loop` you've created, let's create the attributes of the `Car` class. Remember the attributes you wrote down earlier? Let's put them as arguments of the `__init__(self,)` function.

Remember, the first argument of the `__init__(self,)` function should always be the `self` keyword. 

The `self` argument represents the object itself. That is a way for you to have access to the objects own attribute. 


### First, let's start creating one single attribute of this car.

Let's say you have chosen `name` as a car attribute (what? can't a car have a name?). 

If you want your class to receive a specific car name as an argument, you have to put this variable as the argument of the `__init__` function. So, to add `name`, the results of your special function definition would be:

```python
def __init__(self, name):
    pass
```

Now, when you instantiate your Car class, the syntax would be similar to calling a function (which, by now, you should now that it is what you are effectively doing - you are calling the __init__ method), so what the syntax would be:

*Hint: If you don't specify an argument, the python interpreter will complain that your class requires one argument (try that - if you don't try it now, it is not a problem, you'll try in future, even when you don't want to).*


In [None]:
# your code here

### Now let's store that new argument

By now, you are only receiving the name of the car as an argument, but you are not doing anything specifically with that variable called `name`.

Let's store that in the object. That's the first use of the `self` keyword.

To store the variable in a way that the user can access via a `car.SOMETHING`, you have to specify that the object itself is receiving the attribute `name` (for example)

Then, **create a variable called `name` that receives the argument `name`** (keep in mind that the name of the variable need not necessarily be the same, you could assing the argument `name` to an attribute called `chimpanze` for example).

Also **create the other 5 attributes that you previously had in mind**


In [None]:
# your code here

### Access the attribute

You should now be able to access the object's attribute once you instantiate it as `my_car.name`

You can try to write `my_car.<TAB>` to check what attributes or methods your object contains.

## Understanding special methods

Special methods are the ones that start with double underlines (usually called `dunder`), for example the `__init__` method, the `__doc__` method or `__repr__` method (called as `dunder init`, `dunder doc`, `dunder repr`).

The `__repr__` method is responsible to show how your class will be displayed on screen when you display it.
Let's create a `__repr__(self)` function on our `Car` class that returns the following string below (copy the string below):

```python
    car = f'''
                  ______--------___
                 /|             / |
      o___________|_\__________/__|
     ]|___     |  |=   ||  =|___  |"
     //   \\    |  |____||_///   \\|"
    |  X  |\--------------/|  X  |\"
     \___/                  \___/
    '''
```

Your class should now have two special methods, `__init__` and `__repr__`

In [None]:
class Car:
    
    def __init__(self, car_name):
        self.car_name = car_name
    
    def __repr__(self):
        
        car = f'''
                      ______--------___
                     /|             / |
          o___________|_\__________/__|
         ]|___     |  |=   ||  =|___  |"
         //   \\    |  |____||_///   \\|"
        |  X  |\--------------/|  X  |\"
         \___/                  \___/
        '''
        
        return car

### Now instantiate your Car class again

In [None]:
my_car = Car('Jeguinho')

### And check what happens when you print your object on screen

In [None]:
print(my_car)

### Now create a simple method to receive and return the `self` variable

Create a simple method inside your `class Car` and return `self` the self argument. Name this method `get_itself`.

In [None]:
class Car:
    
    def __init__(self, car_name):
        self.car_name = car_name
    
    def __repr__(self):
        
        car = f'''
                      ______--------___
                     /|             / |
          o___________|_\__________/__|
         ]|___     |  |=   ||  =|___  |"
         //   \\    |  |____||_///   \\|"
        |  X  |\--------------/|  X  |\"
         \___/                  \___/
        '''
        
        return car
    
    def get_itself(self):
        return self

#### Now instantiate the Car class and call `get_itself()`

In [None]:
my_car = Car('andre')

In [None]:
my_car.get_itself()

This happens because you are print this specific object. 

# Bonus 1

### Now let's parametrize this drawing.

Change your class to receive the drawing you want to output as a parameter. Modify your __repr__ method to use that parameter instead of the fixed drawing we used upwards.

In [None]:
class Car:
    
    def __init__(self, car_name, car):
        self.car_name = car_name
        self.car = car
        
    def __repr__(self):
        
        car = self.car
        
        return car
    
    def get_itself(self):
        return self

In [None]:
car = f'''
              ______--------___
             /|             / |
  o___________|_\__________/__|
 ]|___     |  |=   ||  =|___  |"
 //   \\    |  |____||_///   \\|"
|  X  |\--------------/|  X  |\"
 \___/                  \___/
'''

my_car = Car(car_name = 'A', car=car)
my_car

In [None]:
car = '''
                   _
 _________________| \_
|   ___    |  ,|   ___`-.
|  /   \   |___/  /   \  `-.
|_| (O) |________| (O) |____|
   \___/          \___/
'''

my_car = Car(car_name = 'B', car=car)
my_car

# Bonus 2

## Create a specialized version of a car - an Uber

You'll now create a specific version of a car. It contains the same attributes and functions of the class of cars, but it is specifically a Uber.

### Create a class called `Uber` that inherits from a `Car`

In [None]:
# your code here

### Extending the `Car` class. 

When you create a new class based on another and create new attributes and methods for it, you are extending it. 

#### Let's create 2 new attributes that only `Uber cars` have. 

Create the `category` of the Uber (`Black`, `Platinun`, etc) and one more attribute of your choice.

#### Let's create a method for this new `Uber` class that calculates the price of the run given the distance in km and time spent (in minutes) in the run. 

Suppose each km costs `R$ 1,00` and 1 minute costs `R$ 0,50` for `Uber` Black and `R$ 1,20` and 1 minute costs `R$ 0,60` for `Uber`  Platinum.  The final price is the max between the two.

```python
def get_price(km, time):
    ...
    return final_price
```

Then calculate the price of your `Uber` from:

1. A `Uber Black` going from Ironhack to Guarulhos Airport (`1h:20min, 30.5km`)
1. A `Uber Platinum` going from Ironhack to Guarulhos Airport (`1h:20min, 30.5km`)

In [None]:
black = Uber(..., category='Black')
black.get_price()

In [None]:
platinum = Uber(..., category='Platinum')
platinum.get_price()