# Object-Oriented Programming (V2.1: Mod 3, Section 21)


- online-ds-pt-100719
- 03/04/20

## Announcements/Questions?


- Please fill out this survey (focuses only on study groups)
    - https://forms.gle/Kw7YJGCKqC4q6XYz9 
    
- Message me on Slack if you scheduled a review and did not receive a confirmation email. (Your appointment is fine; there was a bug with the scheduler.)

- Still interested in recording of lab walk through 

## Things to Discuss

- **Dictionaries are your friend!**
    - Constructing Dictionaries `{k:v}` vs `dict(k=v)`
    - Iterating With Dictionaries
    - The `kwargs` and the  `**` operator
    
- **Unpacking/packing with `*` and `**`**

- **Inspecting classes:**
    - `help(obj)` vs `dir(obj)`
    
- **Deeper dive into Classes/Objects**
    - special methods/attributes (methods: `__repr__()`, `__str__()`, `__call__()`; attributes: `__version__`, `__name__`)
    - Methods vs Bound Methods vs Static Methods

In [1]:
!pip install fsds_100719
from fsds_100719.imports import *

fsds_1007219  v0.7.16 loaded.  Read the docs: https://fsds.readthedocs.io/en/latest/ 


Handle,Package,Description
dp,IPython.display,Display modules with helpful display and clearing commands.
fs,fsds_100719,Custom data science bootcamp student package
mpl,matplotlib,Matplotlib's base OOP module with formatting artists
plt,matplotlib.pyplot,Matplotlib's matlab-like plotting module
np,numpy,scientific computing with Python
pd,pandas,High performance data structures and tools
sns,seaborn,High-level data visualization library based on matplotlib


[i] Pandas .iplot() method activated.


# What does it mean to be 'Object-Oriented'?

> ### _"Everything is an object."_
- some Python sensei


In [22]:
prove_it = max
prove_it([0,11,13])

13

In [23]:
prove_it

<function max>

# Dictionaries & Dictionary Methods

- Iterating through a dict:
    - `dict.items()`
    - `dict.keys()`
    - `dict.values()`
    - `**dict` vs `*dict`

- Retrieving Value:
    - `dict.get(k)` vs `dict[k]`

- Removing / Extracting Entries
    - `dict.pop(k)` vs `del dict[k]`
    - `dict.clear()`
    
- Merging Dictionaries:
    - `d1.update(d2)`
        - for every (k,v) in d2:
            - if k is NOT in d1, insert (k,v) into d1
            - if k IS in d1, update the value of k in d1
    - Use `**` operator:
        - `combined_d = {**d1,**d2}`
    
- Updating Dictionaries
    - `d1.update(key1=new_value1,new_key2=new_value2)`

- Setting Dictionary Values
    - `dict[k] = 5`
    - `dict.setdefault(k,5)`
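
The methods listed above can be tried on a pair of small dictionaries. This is a minimal sketch; the dictionaries `d1` and `d2` and their contents are made up for illustration.

```python
# Two ways to construct a dictionary
d1 = {'name': 'James', 'color': 'purple'}
d2 = dict(color='green', city='Baltimore')   # keys must be valid identifiers

# Iterating: .items() yields (key, value) pairs
pairs = [(k, v) for k, v in d1.items()]

# Retrieving: d[k] raises KeyError if k is missing,
# d.get(k) returns None (or a default you supply)
missing = d1.get('city')
fallback = d1.get('city', 'unknown')

# Merging without modifying the originals: later dicts win on shared keys
combined = {**d1, **d2}

# Merging in place: update() overwrites shared keys and inserts new ones
d1_copy = dict(d1)
d1_copy.update(d2)

# setdefault only writes when the key is absent
d1_copy.setdefault('color', 'red')   # key exists, value stays 'green'
d1_copy.setdefault('age', 30)        # key missing, inserts 30

# pop returns the removed value; del removes without returning anything
removed = d1_copy.pop('age')
del d1_copy['city']

print(combined)
print(d1_copy)
```

Note that `{**d1, **d2}` builds a new dictionary, while `update()` changes the dictionary it is called on.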


# OOP VOCABULARY


- "Object" is an instance of a template class that currently exists in memory
- "Calling" a function: 
    - When we use `( )` with a function we are calling it.

- **Function:** Code that manipulates data in a useful way.

- Parameters: the named variables that a function is defined to accept
- Argument: the actual variable/value passed in for a parameter
- Positional Argument:
    - The first arguments required
    - their identity is determined by their order
- Keyword/default Arguments:
    - arguments that have a defined default value
    - must come after positional arguments
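
To make the vocabulary concrete, here is a short sketch of positional arguments, keyword/default arguments, and packing/unpacking with `*` and `**`. The functions `introduce` and `collect` are hypothetical names used only for this example.

```python
def introduce(name, fav_color, location='unknown'):
    """name and fav_color are positional parameters; location has a default."""
    return f"{name} likes {fav_color} and lives in {location}"

# Arguments matched by position
by_position = introduce('James', 'purple')

# Arguments matched by keyword (order no longer matters)
by_keyword = introduce(fav_color='purple', name='James', location='Baltimore')

# Unpacking: * spreads a sequence into positional arguments,
# ** spreads a dictionary into keyword arguments
args = ['James', 'purple']
kwargs = {'location': 'Baltimore'}
unpacked = introduce(*args, **kwargs)

def collect(*args, **kwargs):
    """Packing: extra positionals arrive as a tuple, extra keywords as a dict."""
    return args, kwargs

packed = collect(1, 2, label='demo')

print(by_position)
print(unpacked)
print(packed)
```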

<br><br>
- **Class:** Template/blueprint.
- Instance: An object built from the class blueprint
- Attribute: A variable stored inside an object.
- Method: A function stored inside an object.
    - Objects always pass themselves into their methods, so we use `self` to account for this.
- Private Attributes/Methods: they start with `_` and are meant to be hidden from the user. This is a naming convention; Python does not enforce it. They can be read or updated using getter and setter methods.
- Getters/Setters:
    - Methods for retrieving or changing private attributes

- Object: 

- "dunders" = double underscores __ 
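
As a rough illustration of private attributes with getters and setters, consider the sketch below. The `Account` class and its `_balance` attribute are hypothetical and not part of the lesson's code.

```python
class Account:
    def __init__(self, balance=0):
        # Leading underscore: private by convention only
        self._balance = balance

    def get_balance(self):
        """Getter: return the private value."""
        return self._balance

    def set_balance(self, amount):
        """Setter: validate before changing the private value."""
        if amount < 0:
            raise ValueError("Balance cannot be negative")
        self._balance = amount

acct = Account(100)
acct.set_balance(250)
print(acct.get_balance())
```

The setter is where validation lives. Nothing stops a user from writing `acct._balance = -5` directly; the underscore only signals that they should not.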

# Defining and Initializing Classes




```python
## Bare minimum to define a class.
class Person:
    pass
```

- Use `class NewClassName():` like you use `def function_name():` for functions.
    - the `()` are optional for classes. (used to inherit other classes, more on that later)
- Convention for naming classes = `UpperCamelCase`
- Convention for naming functions = `snake_case`


### Initialization 


- We create an instance by assigning `instance = ClassName()`.
-  This uses the template `ClassName` to create an instance of the class (which we named `instance`).
- To initialize an instance, we `call` the class using `()`, which runs its `__init__()` method (a default one is inherited if the class does not define its own).


#### Know thy `self`
- Because methods are designed to operate on the object they are attached to (`obj.method()`), Python automatically passes every method a reference to the instance it was called on, which we call `self`.
- We have to include `self` as the first parameter of every method we make.
- Otherwise, Python will treat the first parameter we defined as the instance itself, and any arguments we pass will be shifted by one. This will cause an *existential crisis* and, typically, a `TypeError`.
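
A small sketch of what happens with `self` behind the scenes. The `Greeter` and `Broken` classes are hypothetical examples, not part of the lesson's code.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello from {self.name}"

g = Greeter('James')

# These two calls are equivalent: Python passes g as self in the first one
via_instance = g.greet()
via_class = Greeter.greet(g)

class Broken:
    def greet():          # self was forgotten
        return "Hello"

try:
    Broken().greet()      # Python still passes the instance in
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

`g.greet` is called a bound method because the instance is already attached to it; calling `Greeter.greet` directly requires passing the instance yourself.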



In [12]:
class Person:
    species = 'human'
    alive = True

    def __init__(self,name,fav_color,location=None):
        self.name = name
        self.location = location
        self.fav_color = fav_color


    def who_am_i(self):
        print(f"My name is {self.name}")
        print(f"I live in {self.location}") 
        print(f"My favorite color is {self.fav_color}")

In [25]:
me = Person('James','purple')#,'Baltimore, MD')
me

<__main__.Person at 0x1c24f95a20>

In [26]:
me.who_am_i()

My name is James
I live in None
My favorite color is purple


## Inheritance


- Define a Class based on another class by passing the class to inherit from as a parameter:

```python
class FlatironStudent(Person):
    pass # hopefully! lol
```

### What did you inherit?    
- To view all of the attributes and methods of a class, **use the help() command**
    -  Note: There is often ***information in `help()` that you may not be able to find ANYWHERE else*** and does not show up in documentation.


In [27]:
class FlatironStudent(Person):
    pass # hopefully! lol
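
A subclass usually does more than `pass`. The sketch below shows one common pattern: extending the parent's `__init__` with `super()`. A trimmed copy of `Person` is repeated so the block runs on its own, and the `cohort` attribute is a hypothetical addition for illustration.

```python
class Person:
    species = 'human'

    def __init__(self, name, fav_color, location=None):
        self.name = name
        self.fav_color = fav_color
        self.location = location

class FlatironStudent(Person):
    def __init__(self, name, fav_color, cohort, location=None):
        # Reuse the parent initializer, then add the new attribute
        super().__init__(name, fav_color, location=location)
        self.cohort = cohort   # hypothetical attribute for illustration

student = FlatironStudent('James', 'purple', cohort='online-ds-pt-100719')
print(student.species, student.name, student.cohort)
print(isinstance(student, Person))
```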

### Peeking Under the Hood: `help` and `dir`

In [30]:
student = FlatironStudent('James','purple')
help(student)

Help on FlatironStudent in module __main__ object:

class FlatironStudent(Person)
 |  Method resolution order:
 |      FlatironStudent
 |      Person
 |      builtins.object
 |  
 |  Methods inherited from Person:
 |  
 |  __init__(self, name, fav_color, location=None)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  who_am_i(self)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from Person:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from Person:
 |  
 |  alive = True
 |  
 |  species = 'human'



In [31]:
dir(student)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'alive',
 'fav_color',
 'location',
 'name',
 'species',
 'who_am_i']
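
`dir()` returns every name, dunders included, which can make the interesting entries hard to spot. One possible way to filter that list is sketched below; the `Person` class here is a trimmed stand-in so the block runs on its own.

```python
class Person:
    species = 'human'

    def __init__(self, name):
        self.name = name

    def who_am_i(self):
        return f"My name is {self.name}"

me = Person('James')

# Keep only the names that do not start with an underscore
public_names = [n for n in dir(me) if not n.startswith('_')]
print(public_names)

# vars() shows instance attributes only (class attributes like species are excluded)
print(vars(me))

# Split the public names into methods and plain attributes
methods = [n for n in public_names if callable(getattr(me, n))]
attributes = [n for n in public_names if not callable(getattr(me, n))]
print(methods, attributes)
```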

# Special Class Methods

## Using special methods to control the output of a class

### `__repr__()` controls what is displayed when the object is the final expression of a cell (or when `display` is used)

In [35]:
class FlatironStudent(Person):
    # def __init__(self):
    def __repr__(self):
        msg = []
        msg.append(f'Name = {self.name}')
        msg.append(f'Species = {self.species}')
        msg.append(f'Fav Color = {self.fav_color}')
        return '\n'.join(msg)
student=FlatironStudent('james','purple')
display(student)
''

Name = james
Species = human
Fav Color = purple

''

### `__str__()` controls what's displayed when an object is printed

In [37]:
class FlatironStudent(Person):
    # def __init__(self):
    def __repr__(self):
        msg = []
        msg.append(f'Name = {self.name}')
        msg.append(f'Species = {self.species}')
        msg.append(f'Fav Color = {self.fav_color}')
        return '\n'.join(msg)
    def __str__(self):
        # __str__ should return the text rather than print it
        msg = []
        msg.append(f"My name is {self.name}")
        msg.append(f"I live in {self.location}")
        msg.append(f"My favorite color is {self.fav_color}")
        return '\n'.join(msg)

In [39]:
student=FlatironStudent('james','purple')
print(student)
# student

My name is james
I live in None
My favorite color is purple



Name = james
Species = human
Fav Color = purple
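
To compare the two methods side by side, here is a compact sketch in which both return a string rather than printing. The `Point` class is a hypothetical example.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Developer-facing; used by the notebook/REPL display and by repr()
        return f"Point(x={self.x}, y={self.y})"

    def __str__(self):
        # User-facing; used by print() and str()
        return f"({self.x}, {self.y})"

p = Point(1, 2)
print(repr(p))
print(p)
print([p])                 # containers use __repr__ for their elements
print(f"{p!r} vs {p}")     # !r forces __repr__ inside an f-string
```

If a class defines only `__repr__`, `print()` falls back to it, so `__repr__` is usually the one to write first.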

In [40]:
df = fs.datasets.load_mod1_proj()

In [42]:
import tzlocal
tzlocal.get_localzone()

<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>

In [46]:
import datetime as dt
dt.datetime.now()
print(dt.datetime.now())

2020-03-04 18:34:38.576037


In [96]:
fs.quick_refs.ts_date_str_formatting()

CODE,MEANING,EXAMPLE
,,
%Y,Year with century as a decimal number.,2001
%y,Year without century as a zero-padded decimal number.,01
%m,Month as a zero-padded decimal number.,02
%B,Month as locale’s full name.,February
%b,Month as locale’s abbreviated name.,Feb
%d,Day of the month as a zero-padded decimal number.,03
%A,Weekday as locale’s full name.,Saturday
%a,Weekday as locale’s abbreviated name.,Sat
%H,Hour (24-hour clock) as a zero-padded decimal number.,16

CODE,MEANING,EXAMPLE
,,
%#m,Month as a decimal number. (Windows),2
%-m,Month as a decimal number. (Mac/Linux),2
%#x,Long date,"Saturday, February 03, 2001"
%#c,Long date and time,"Saturday, February 03, 2001 16:05:06"


In [178]:
class Timer:
    
    def __init__(self,fmt='%m/%d/%Y - %I:%M:%S %p',start=True,label=''):
        import tzlocal
        import datetime as dt
        
        self._tz = tzlocal.get_localzone()
        self._created_at =dt.datetime.now(self._tz)
        self._fmt = fmt
        if start:
            self.start(label=label)
        
    def _get_time(self):
        import datetime as dt
        return dt.datetime.now(self._tz)
        
    def start(self,label=''):
        self._start = self._get_time()
        self._start_label = label
        
        print(f'[i] Timer started at {self._start.strftime(self._fmt)}')
        if len(label)>0:
            print(f'\t- Process running: {label}')
#         print()
        
    def stop(self,label=''):
        self._stop = self._get_time()
        elapsed = self._stop - self._start
        print(f'[i] Timer stopped at {self._stop.strftime(self._fmt)}')

        print(f"\t- The process {label} took {elapsed}.")
        
    def __call__(self):
        print(self._get_time())
        
    


In [179]:
timer = Timer()

[i] Timer started at 03/04/2020 - 07:08:29 PM


In [180]:
# dir(timer)
timer.start('Testing this thing')

[i] Timer started at 03/04/2020 - 07:08:31 PM
	- Process running: Testing this thing


In [181]:
timer.stop()#@'Testing this other thing')

[i] Timer stopped at 03/04/2020 - 07:08:31 PM
	- The process  took 0:00:00.437140.


In [182]:
timer()

2020-03-04 19:08:34.857247-05:00
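
The `Timer` above defines `__call__`, which is what lets `timer()` work. A stripped-down sketch of the same idea, using a hypothetical `CallCounter` class:

```python
class CallCounter:
    """Hypothetical example: an object that counts how often it is called."""

    def __init__(self):
        self.count = 0

    def __call__(self, step=1):
        # Runs whenever the instance is used like a function
        self.count += step
        return self.count

counter = CallCounter()
counter()            # same as counter.__call__()
counter(step=5)
print(counter.count)
print(callable(counter))
```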


## Method Chaining

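- In Python, every time you close the parentheses on a function or method call, the call evaluates to whatever that function or method returns.
- Because that result is itself an object, you can call one of its methods right away and continue chaining additional methods one after another.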
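
A minimal sketch of chaining using built-in string methods; the example string is made up.

```python
raw = "  Age At Release  "

# Each method returns a new string, so the next method is called on that result
cleaned = raw.strip().lower().replace(' ', '_')
print(cleaned)

# The same chain written one step at a time
step1 = raw.strip()
step2 = step1.lower()
step3 = step2.replace(' ', '_')

# Chaining breaks when a method returns None, e.g. list.sort() sorts in place
# (pandas methods called with inplace=True also return None)
numbers = [3, 1, 2]
result = numbers.sort()
print(result)
```

A chain only works as long as every link returns an object; once a method returns `None`, the next call in the chain raises an `AttributeError`.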

In [4]:
df = fs.datasets.load_mod1_proj()
df.head()

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,10/13/2014,221900.0,3,1.0,1180,5650,1.0,,0.0,...,7,1180,0.0,1955,0.0,98178,47.5112,-122.257,1340,5650
1,6414100192,12/9/2014,538000.0,3,2.25,2570,7242,2.0,0.0,0.0,...,7,2170,400.0,1951,1991.0,98125,47.721,-122.319,1690,7639
2,5631500400,2/25/2015,180000.0,2,1.0,770,10000,1.0,0.0,0.0,...,6,770,0.0,1933,,98028,47.7379,-122.233,2720,8062
3,2487200875,12/9/2014,604000.0,4,3.0,1960,5000,1.0,0.0,0.0,...,7,1050,910.0,1965,0.0,98136,47.5208,-122.393,1360,5000
4,1954400510,2/18/2015,510000.0,3,2.0,1680,8080,1.0,0.0,0.0,...,8,1680,0.0,1987,0.0,98074,47.6168,-122.045,1800,7503


In [183]:

df= fs.datasets.load_iowa_prisoners()
df.head()

Unnamed: 0,Fiscal Year Released,Recidivism Reporting Year,Race - Ethnicity,Age At Release,Convicting Offense Classification,Convicting Offense Type,Convicting Offense Subtype,Release Type,Main Supervising District,Recidivism - Return to Prison,Days to Recidivism,New Conviction Offense Classification,New Conviction Offense Type,New Conviction Offense Sub Type,Part of Target Population,Recidivism Type,Sex
0,2010,2013,Black - Non-Hispanic,25-34,C Felony,Violent,Robbery,Parole,7JD,Yes,433.0,C Felony,Drug,Trafficking,Yes,New,Male
1,2010,2013,White - Non-Hispanic,25-34,D Felony,Property,Theft,Discharged – End of Sentence,,Yes,453.0,,,,No,Tech,Male
2,2010,2013,White - Non-Hispanic,35-44,B Felony,Drug,Trafficking,Parole,5JD,Yes,832.0,,,,Yes,Tech,Male
3,2010,2013,White - Non-Hispanic,25-34,B Felony,Other,Other Criminal,Parole,6JD,No,,,,,Yes,No Recidivism,Male
4,2010,2013,Black - Non-Hispanic,35-44,D Felony,Violent,Assault,Discharged – End of Sentence,,Yes,116.0,,,,No,Tech,Male


In [184]:
df.isna().sum()
df.fillna('MISSING',inplace=True)

In [185]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
drop_cols= [col for col in df.columns if 'New' in col]
drop_cols.append('Days to Recidivism')

In [186]:
df.drop(columns=drop_cols,inplace=True)


In [187]:
cat_cols = df.select_dtypes('object').columns
cat_cols


Index(['Race - Ethnicity', 'Age At Release ',
       'Convicting Offense Classification', 'Convicting Offense Type',
       'Convicting Offense Subtype', 'Release Type',
       'Main Supervising District', 'Recidivism - Return to Prison',
       'Part of Target Population', 'Recidivism Type', 'Sex'],
      dtype='object')

In [188]:
df.dtypes

Fiscal Year Released                  int64
Recidivism Reporting Year             int64
Race - Ethnicity                     object
Age At Release                       object
Convicting Offense Classification    object
Convicting Offense Type              object
Convicting Offense Subtype           object
Release Type                         object
Main Supervising District            object
Recidivism - Return to Prison        object
Part of Target Population            object
Recidivism Type                      object
Sex                                  object
dtype: object

In [189]:
encoders_dict= {}
for col in cat_cols:
    le = LabelEncoder()
    print(col)
    df[col] = le.fit_transform(df[col])
    encoders_dict[col] = le

Race - Ethnicity
Age At Release 
Convicting Offense Classification
Convicting Offense Type
Convicting Offense Subtype
Release Type
Main Supervising District
Recidivism - Return to Prison
Part of Target Population
Recidivism Type
Sex


In [190]:
df.head()

Unnamed: 0,Fiscal Year Released,Recidivism Reporting Year,Race - Ethnicity,Age At Release,Convicting Offense Classification,Convicting Offense Type,Convicting Offense Subtype,Release Type,Main Supervising District,Recidivism - Return to Prison,Part of Target Population,Recidivism Type,Sex
0,2010,2013,6,0,3,4,16,4,6,1,1,0,2
1,2010,2013,11,0,4,2,21,1,10,1,0,2,2
2,2010,2013,11,1,2,0,23,4,4,1,1,2,2
3,2010,2013,11,0,2,1,11,4,5,0,1,1,2
4,2010,2013,6,1,4,4,3,1,10,1,0,2,2


In [194]:
encoders_dict['Sex'].inverse_transform(df['Sex'])

array(['Male', 'Male', 'Male', ..., 'Female', 'Male', 'Male'],
      dtype=object)
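
To see why keeping each fitted encoder in a dictionary matters, here is a pure-Python sketch of the idea behind label encoding. It is not scikit-learn's implementation; the helper functions are hypothetical, and the sample values are a handful taken from the columns above. Like `LabelEncoder`, it assigns integer codes to the sorted distinct values.

```python
def fit_label_map(values):
    """Map each distinct value to an integer code, in sorted order."""
    classes = sorted(set(values))
    return {label: code for code, label in enumerate(classes)}

def transform(values, label_map):
    return [label_map[v] for v in values]

def inverse_transform(codes, label_map):
    reverse = {code: label for label, code in label_map.items()}
    return [reverse[c] for c in codes]

columns = {
    'Sex': ['Male', 'Female', 'MISSING', 'Male'],
    'Recidivism Type': ['New', 'Tech', 'Tech', 'No Recidivism'],
}

# One mapping per column, stored by column name so it can be reversed later
maps = {}
encoded = {}
for col, values in columns.items():
    maps[col] = fit_label_map(values)
    encoded[col] = transform(values, maps[col])

print(encoded)
print(inverse_transform(encoded['Sex'], maps['Sex']))
```

Each column gets its own mapping, so the same integer can mean different things in different columns. Without the stored mapping there is no way to recover the original labels.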

In [173]:
target='Recidivism - Return to Prison'
y = df[target].copy()
X = df.drop(target,axis=1).copy()

In [174]:
X_train, X_test,y_train,y_test = train_test_split(X,y)


In [176]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier 
from sklearn.metrics import accuracy_score
tree = RandomForestClassifier()#DecisionTreeClassifier( )

timer =Timer(start=True,label='Training Decision Tree Classifier')

tree.fit(X_train, y_train)

y_hat_test = tree.predict(X_test)
acc = accuracy_score(y_test,y_hat_test)
timer.stop(f'- Training complete. Accuracy = {acc}')

[i] Timer started at 03/04/2020 - 07:05:59 PM
	- Process running: Training Decision Tree Classifier
[i] Timer stopped at 03/04/2020 - 07:06:00 PM
	- The process - Training complete. Accuracy = 1.0 took 0:00:00.821586.


In [177]:
tree

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=100,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)