<h1>Tutorial: Fancy Tools for Exploring Data Science with Python</h1>


# 2. Python & Object-Oriented Programming(OOP)

## 2.3 Object in Python - Basic Types

### Class and Instance(1): Object in Python

**Classification of Python objects - by source**

* **internal (built-in) object** - objects that ship with Python itself, including the basic data types (numbers, strings, lists, etc.) and built-in functions (such as `print()` and `help()`).
* **external object** - objects provided by third-party packages published by the Python community, such as the `ndarray` data type in the NumPy package or the dense layer type in the TensorFlow package.
* **user-defined object** - objects that the user defines on purpose with the `class` and `def` keywords. <u>The attributes and methods of a user-defined object can be built from both internal and external objects.</u>

**Classification of Python objects - by type**

* **data type** - objects used to store and handle data. They can be further divided into univariate data types (number, boolean, etc.) and multivariate data types (list, tuple, array, dictionary, etc.).

    Multivariate data types come with many methods. In particular, <u>the methods of the list type</u> are very useful when implementing data science methodologies.

* **function type** - objects that implement functionality by expressing a relationship between input and output objects. Built-in functions and user-defined functions created with `def` are both function type objects. Function type objects matter because they are used for **implementing the methods of a class**.

**Useful internal functions to check properties of object**

* **`type()`** - returns the type of the target object.
* **`dir()`** - returns a list of the names of the attributes and methods of the object.
* **`help()`** - prints the documentation of the object, including how to use its methods.

In [0]:
# this is option to print multiple results in single code cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [0]:
# example of type() function with list data type
type([1,2,3,4,5])

In [0]:
# example of dir() function with list data type
dir([1,2,3,4,5])

In [0]:
# example of help() function with list data type
# you can see details in usage of methods included in list data type
help([1,2,3,4,5])
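
Because `type()` and `dir()` return objects rather than only displaying them, their results can be stored and filtered. The following is a minimal sketch; the variable names `sample`, `sample_type`, and `public_names` are illustrative and not part of the original tutorial.

```python
sample = [1, 2, 3, 4, 5]

# type() returns the class of the object, so it can be stored and compared
sample_type = type(sample)
print(sample_type is list)          # True

# dir() returns a list of names; keep only those without a leading underscore
public_names = [name for name in dir(sample) if not name.startswith("_")]
print(public_names)

# isinstance() is the usual way to test the type of an object
print(isinstance(sample, list))     # True
```

Filtering out the underscore-prefixed names leaves the methods that are meant for everyday use, such as `append` and `extend`.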

### Class and Instance(2): Variable and Assignment

In Python, programs usually work with variables rather than using objects directly.

* A Python variable is similar to a 'pointer' in C. In other words, a variable stores the address of (a reference to) the real object in memory, not the object itself.
* `=` (the assignment operator) assigns an object to a variable as below:
> variable_name = target_object
    
    Variable names can be chosen freely, but it is recommended to follow the naming conventions.

* A variable can be re-assigned by functions, keywords, or further assignment operators. In that case the variable name is bound to a new object with its own address, and the name no longer refers to the previous object.

In [1]:
# example to understand relationship among class, instance, and variable
# 'sum' is itself an object; its type is a built-in function
sum

<function sum>

In [2]:
# the object returned by the call 'sum([1,2,3])' is an 'instance' (here, of type int)
sum([1,2,3])

6

In [3]:
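# 'x' and 'y' are 'variables' that are assigned equal 'instances'
# here 'x' and 'y' share the same address because CPython caches small integers;
# this is an implementation detail and does not hold for arbitrary objects
# id() function returns the identity (in CPython, the memory address) of the assigned instance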
x = sum([1,2,3])
y = sum([1,2,3])
print(id(x))
print(id(y))
print(id(sum([1,2,3])))

10968960
10968960
10968960
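
The identical addresses in the cell above come from CPython reusing small integer objects, so they say little about how variables behave in general. A mutable object such as a list shows the reference behavior more clearly. The sketch below uses illustrative variable names (`a`, `b`, `c`, `old_id`) that are not part of the original tutorial.

```python
# two names bound to the same list object
a = [1, 2, 3]
b = a
print(a is b)            # True: both names refer to one object

# a change made through one name is visible through the other
b.append(4)
print(a)                 # [1, 2, 3, 4]

# re-assignment binds the name to a new object; the old list is unchanged
old_id = id(b)
b = b + [5]
print(id(b) == old_id)   # False
print(a)                 # [1, 2, 3, 4]

# equal values do not imply the same object
c = [1, 2, 3, 4]
print(a == c, a is c)    # True False
```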


### Attribute: Data Types of Python

Python provides various data types to contain and handle data. These data types are used both to assign instances from internal functions or classes, and to define attributes in user-defined class/function. This is why title of the section starts with 'attribute'.

Thus, we are going to study selected internal data types, divided into **simple(univariate)** and **compound(multivariate)** ones, and following major attributes/methods.

**Simple data type**

Simple data types in Python are almost the same as in other languages, but the boolean and null (`None`) types are written differently.

* **number** - can be roughly classified into integer, float, and complex number.
    > integer variable : `x = 1`<br>
    > floating number variable : `x = 1.0`<br>
    > complex number variable : `x = 1 + 2j` (*imaginary part can be expressed with `j` or `J` characters.*)
* **string** - stores character values surrounded by `' '` or `" "`.
    > string variable : `x = "this is string."`
* **boolean** - has values of `True` or `False`.
    > boolean variable : `x = True` (*note that the first character must be capitalized*)
* **`None` (null type)** - expresses the absence of a value with the single value `None` (*a characteristic feature of Python*)
    > `None` variable : `x = None`

**Compound data type**
    
Python has various compound data types like other languages. Handling compound types is a core part of data science with Python. In particular, the list and dictionary types are very useful in data science.

* **list** - ordered and mutable. That is, values in a list have a numerical index and can be changed by the user.
    > list variable : `x = [1,2,3]` (*defined by square brackets*)
* **tuple** - ordered and immutable. Thus values in a tuple cannot be changed once they are defined.
    > tuple variable : `x = (1,2,3)` or `x = 1,2,3` (*defined by round brackets or just commas*)
* **dictionary** - stores mappings of key-value pairs. Keys are unique, so each key maps to exactly one value.
    > dictionary variable : `x = {'a' : 1, 'b' : 2, 'c' : 3}` (*defined by key:value pairs surrounded by curly brackets*)
* **set** - unordered and mutable. Its elements are unique (duplicates are dropped), but elements can be added or removed.
    > set variable : `x = {1,2,3}` (*defined by curly brackets*)



**Attributes and methods of data type**

Data types are also objects in Python, so they have their own attributes and methods. Using the methods of the compound data types well is important.

* list type - `append()` (adds one element to the end), `extend()` (adds every element of another iterable), etc.
* dictionary type - `keys()` (returns a view of the keys), `values()` (returns a view of the values), etc.
* set type - `union()` (returns the union as a new set), `intersection()` (returns the intersection as a new set), etc.
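
The properties of the compound types and the methods listed above can be tried out as follows. This is a minimal sketch; the variable names (`numbers`, `point`, `scores`, `left`, `right`) are illustrative only.

```python
# list: ordered and mutable
numbers = [1, 2, 3]
numbers.append(4)          # add one element
numbers.extend([5, 6])     # add each element of another iterable
print(numbers)             # [1, 2, 3, 4, 5, 6]

# tuple: ordered and immutable, so item assignment fails
point = (1, 2, 3)
try:
    point[0] = 10
except TypeError as err:
    print("tuple is immutable:", err)

# dictionary: keys are unique, so assigning to an existing key overwrites it
scores = {'a': 1, 'b': 2, 'c': 3}
scores['a'] = 100
print(list(scores.keys()))      # ['a', 'b', 'c']
print(list(scores.values()))    # [100, 2, 3]

# set: duplicates are dropped
left = {1, 2, 2, 3}
right = {3, 4}
print(sorted(left.union(right)))          # [1, 2, 3, 4]
print(sorted(left.intersection(right)))   # [3]
```

`sorted()` is used when printing the sets because a set has no defined order.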

### Method(1): Operator, Control Flow, and Exception

Functions are the heart of computer programming. Methods, which represent the functionality of an object, are likewise core elements of object-oriented programming.

Methods define the relationship between input and output, and give functionality to an object. In particular, <u>operators, control flow, and exception control</u> are the main components of functions and methods. In this section, let's briefly check these three in Python.

**Operators**

* **arithmetic operator**

    Arithmetic operators are quite intuitive, and some of them (such as `+` and `*`) can also be applied to other types of objects such as strings and lists.
    > `a + b` (addition)<br>
    > `a - b` (subtraction)<br>
    > `a * b` (multiplication)<br>
    > `a / b` (true division)<br>
    > `a // b` (floor division)<br>
    > `a % b` (modulus)<br>
    > `a ** b` (exponentiation)<br>
    > `-a` (negation)
    
* **bitwise operator**

    Bitwise operators perform operations on the binary representation of integers. Note that these symbols are not logical or power operators: `&` and `|` are different from the boolean `and` and `or`, and `^` is XOR, not exponentiation (which is `**`).
    > `a & b` (bitwise AND)<br>
    > `a | b` (bitwise OR)<br>
    > `a ^ b` (bitwise XOR)<br>
    > `a << b` (bit shift left)<br>
    > `a >> b` (bit shift right)<br>
    > `~a` (bitwise NOT)
    
* **assignment operator**
    
    Assignment operators of this kind (augmented assignment) perform an arithmetic/bitwise operation and assign the result to the variable at the same time. They are written by placing `=` right after the arithmetic/bitwise operator.
    > example) These two statements have the same effect when `a` is a number.<br>
    > `a += 2`<br>
    > `a = a + 2`

* **comparison operator**

    Comparison operators compare the **values** of two objects and return a boolean (`True` or `False`) according to the result.
    > `a == b` (check that `a` is equal to `b`)<br>
    > `a != b` (check that `a` is not equal to `b`)<br>
    > `a < b` (check that `a` is less than `b`)<br>
    > `a > b` (check that `a` is greater than `b`)<br>
    > `a <= b` (check that `a` is less than or equal to `b`)<br>
    > `a >= b` (check that `a` is greater than or equal to `b`)<br>

* **boolean operator**
    
    Boolean operators are logical operators that combine **boolean values** and return a new boolean value.
    > `a and b` (check that `a` and `b` are both `True`)<br>
    > `a or b` (check that at least one of `a` or `b` is `True`)<br>
    > `not a` (check that `a` is `False`)

* **identity & membership operator**

    Identity operators compare two **objects** and return a boolean value. (*Note that the difference between identity operators and comparison operators is that the former compare objects and the latter compare values*)
    > `a is b` (check that `a` and `b` are the same object)<br>
    > `a is not b` (check that `a` and `b` are not the same object)
    
    Membership operators check whether a value exists within a compound data type, and return a boolean value.
    > `a in b` (check that value `a` is in object `b`)<br>
    > `a not in b` (check that value `a` is not in object `b`)

* **precedence and associativity of operators**

    Like other languages, Python defines precedence and associativity for operators, so multiple operators can be used in a single statement. Precedence is the order of priority among operators: comparison operators are evaluated before `not`, which is evaluated before `and`, which is evaluated before `or`. (*More on precedence and associativity [here](https://www.programiz.com/python-programming/precedence-associativity)*)
    > example) `a > b or c < d and e > f`

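The operators described above can be checked with a few concrete values. The following is a minimal sketch; the values and variable names (`a`, `b`, `total`, `x`, `y`) are chosen only for illustration.

```python
a, b = 7, 2

# arithmetic
print(a / b)     # 3.5  (true division returns a float)
print(a // b)    # 3    (floor division)
print(a % b)     # 1    (modulus)
print(a ** b)    # 49   (exponentiation)

# + and * also work on sequences
print("ab" + "cd")    # abcd
print([0] * 3)        # [0, 0, 0]

# bitwise: 7 is 0b111, 2 is 0b010
print(a & b, a | b, a ^ b)   # 2 7 5
print(a << 1, a >> 1, ~a)    # 14 3 -8

# assignment combined with an arithmetic operation
total = 10
total += 2
print(total)     # 12

# comparison (values) versus identity (objects)
x = [1, 2]
y = [1, 2]
print(x == y)    # True
print(x is y)    # False

# membership
print(2 in x, 5 not in x)    # True True

# precedence: comparisons first, then `and`, then `or`
print(1 < 2 or 3 < 4 and 5 > 6)       # True
print((1 < 2 or 3 < 4) and 5 > 6)     # False
```

The last two lines differ only in the parentheses, which shows that `and` binds more tightly than `or`.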

**Control flow**

Control flow means the implementation of *conditional execution* and *loops* within a program. Control flow syntax in Python is practically the same as in other languages, but reads more intuitively.

* **Conditional statement** - `if`, `elif`, `else`
    
    A conditional statement takes a boolean value as its input, and executes the included code block when the boolean value is `True`. Inclusion in a code block is expressed with indentation, as we saw before.

    The commands for conditional statements are `if`, `elif`, and `else`. To write a conditional statement, first write a condition that returns a boolean value right after `if` or `elif`. Then put a colon (`:`) to declare the start of the code block, and write the code block to execute using indentation.

    `elif` replaces the `else if` of other programming languages.

    The `else` block is executed only if all preceding conditions are `False`. `else` can also be attached to a loop, where its block runs only when the loop is not ended by `break`.

* `for ... in ...` **loop**

    The so-called `for` loop is the standard way to implement loops in most languages. In Python, `for` loops are used more often than `while` loops.

    A `for` loop is written by linking a variable and an 'iterable' object with the `in` command. An iterable is an object that can be looped over in Python, for example `range` and `list` objects.

* `while ...` **loop**

    A `while` loop executes the contained code block until the given boolean statement becomes `False`. An 'infinite loop' can therefore occur with `while`, so we have to control it using loop control commands.

* **loop control** - `break`, `continue`, `else`

    Loop control commands give finer control over execution, such as intentionally interrupting a loop or skipping part of it. Loops become much more expressive when these commands are combined with conditional commands like `if`.

    `break` immediately stops the loop. The loop no longer runs.

    `continue` immediately starts the next iteration without executing the rest of the code block. The loop itself keeps running.

    `else` can be used with loops. The code block defined with `else` is executed when the loop finishes without being interrupted by `break`.

In [0]:
# example of writing conditional statements in Python
# conditional statements need boolean statement
# indentation and colon are used to express inclusion of code blocks

x = 10

if x > 20:
    print("x is greater than 20")
    print("x is a large number")
elif x > 10:
    print("x is greater than 10")
    print("x is not a large number")
else:
    print("x is less than or equal to 10")
    print("x is a small number")

In [0]:
# example of finding prime numbers with loop and loop control
# (trial division by the primes found so far, in the spirit of the 'sieve of Eratosthenes')
# the list method 'append' and the 'range' function are used

# define empty list instance with `[]` and assign it into variable name 'L'
L = []
nmax = 30

# notice the indentation position of the 'else' statement!
for n in range(2,nmax):
    for factor in L:
        if n % factor == 0:
            break
    else:
        L.append(n)

print(L)
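
The cells above cover `if` and `for ... else`. As a complement, the following minimal sketch shows `while` together with `continue` and `break`; the variable names `odds` and `n` are illustrative only.

```python
# collect the first three odd numbers, skipping the even ones
odds = []
n = 0
while True:              # would loop forever without `break`
    n += 1
    if n % 2 == 0:
        continue         # skip the rest of this iteration
    odds.append(n)
    if len(odds) == 3:
        break            # leave the loop entirely

print(odds)              # [1, 3, 5]

# `else` on a loop runs only when the loop was not ended by `break`
for value in odds:
    if value > 10:
        print("found a value greater than 10")
        break
else:
    print("no value greater than 10")
```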

**Error and exception**

One of the conditions of a well-written program is the predictability of errors. Pointing out the type and position of errors is very important for debugging, and Python is particularly strong at this.

We call the handling of such errors exception control. The exception control techniques in Python are broad, so we will briefly check only what is needed.

* **Types of error** - syntax, runtime, semantic

    A syntax error (`SyntaxError`) is the most frequently seen type of error, but is the easiest to resolve. It occurs when a statement violates Python's grammar, for example through wrong indentation or unbalanced brackets.

    A runtime error appears while the program is running, for example when incorrect inputs are entered, even though there are no syntax errors. Referencing a variable name that is not defined (`NameError`), operating on an invalid data type (`TypeError`), or dividing by zero (`ZeroDivisionError`) are cases of runtime errors. (*Note that Python also has a specific built-in exception named `RuntimeError`; the exceptions listed here are separate types.*)

    A semantic error means that the program runs without raising any exception but returns something that is not expected. Python has no exception type for it, and this is the hardest case to debug.

* **Exception control**  - `try`, `except`, `else`, `finally`

    The `try` block contains code that may fail. When an error occurs inside it, execution of the block is interrupted and moves on to the following `except` statement. If no error occurs, the `except` block is skipped.

    The `except` command defines the code block to execute when an error occurs in the `try` block. The types of error to handle can be designated in the `except` statement. If the error does not match the designated types, the program is interrupted with an error message.

    The `else` command can also be used in exception control, as in loops. It runs the included code block when no error occurs in the `try` block.

    The `finally` statement defines a code block that has to be executed regardless of whether an error occurs or not. It is typically used for cleanup, such as closing files.

* **Raising error in intention** - `raise`

    The `raise` statement is used to raise an error, for example when unintended inputs are entered into a user-defined function or class. A message can be attached to the raised error to help debugging.

* **Automated error test** - `assert`

    The `assert` command tests whether a boolean statement holds. If the statement given to `assert` is `False`, an `AssertionError` occurs.

In [0]:
# the following shows the flow of the exception control process

try:
    print("try something here")
except:
    print("this happens only if it fails")
else:
    print("this happens only if it succeeds")
finally:
    print("this happens no matter what")

In [0]:
# example of exception control with `raise` statement

try:
    print("try something here")
    raise RuntimeError
    print("this line is never reached")  # execution jumps to 'except' after 'raise'
except:
    print("this happens only if it fails")
else:
    print("this happens only if it succeeds")
finally:
    print("this happens no matter what")

In [0]:
# example of automated test using `assert` statement
# the first assertion passes silently; the second raises AssertionError with its message

assert 1>0, "this is true"
assert 1<0, "this is false"
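
The cells above use a bare `except`, which catches every error. In practice it is often clearer to designate the error types, as described earlier. The following is a minimal sketch; `safe_divide` and `check_positive` are hypothetical helper functions written only for this illustration.

```python
def safe_divide(a, b):
    """Hypothetical helper: divide a by b, handling common runtime errors."""
    try:
        result = a / b
    except ZeroDivisionError:
        print("cannot divide by zero")
        return None
    except TypeError as err:
        print("invalid operand types:", err)
        return None
    else:
        return result
    finally:
        # runs in every case, even when a `return` was reached above
        print("division attempted")


def check_positive(value):
    """Hypothetical helper: raise an error with a message for invalid input."""
    if value <= 0:
        raise ValueError(f"expected a positive number, got {value}")
    return value


print(safe_divide(6, 3))      # 2.0
print(safe_divide(1, 0))      # None
print(safe_divide("a", 2))    # None

try:
    check_positive(-1)
except ValueError as err:
    print("caught:", err)
```

Each `except` branch handles one type of error, so an unexpected error type would still stop the program with its usual message.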

### Method(2): `def` statement and user-defined function

A user-defined function is a function built from the functionalities above (*operators, conditions, loops, and exception control*) to define relationships among data types and objects.

In Python, a user-defined function can be defined with the `def` statement. Alternatively, you can define an anonymous function for temporary use with a `lambda` expression.

`def` **statement**

The way of defining user-defined function is below. Note that colon(`:`) must be used to define included functionality. Designation of arguments is not mandatory.
> `def function_name(argument):`
>
>     code block (to define functionality)

* **Argument**

    Arguments are the values that users have to enter to use a function. It is not necessary to define arguments, but empty round brackets (`()`) still have to be written in that case. Otherwise, a `SyntaxError` occurs.

    The default value of an argument can be defined in advance. Arguments with a default value have to be placed after arguments without a default value; otherwise a `SyntaxError` occurs as well.

    ※ *Strictly speaking, the names defined in a function are called 'parameters', and 'arguments' are the values passed to those parameters. However, we will use 'argument' for both in this tutorial for convenience.*

* **Returning result** - `return`

    The `return` statement outputs the result of the execution done within a function. A `return` statement is not always needed: a function without one returns `None`.

    If you want to return multiple values, use a compound type (`tuple` is widely used in this case). Note that a function may contain several `return` statements, but only the first one reached is executed, because `return` ends the function call immediately.

In [0]:
# example of user-defined function implementing 'Fibonacci numbers'
# arguments `a` and `b`(with default) are placed after argument `n`
# `len` function(return length of compound data type) is used

def fibonacci(n, a=0, b=1):
    L = []
    while len(L) < n:
        a, b = b, a + b
        L.append(a)
    return L

fibonacci(10)
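
Default values, multiple return values, and functions without `return` can be sketched as follows. The functions `describe`, `greet`, and `sign` are hypothetical examples written for this illustration; strictly speaking, a function body may contain several `return` statements, and whichever is reached first ends the call.

```python
def describe(values, precision=2):
    """Hypothetical helper: return the minimum, maximum, and mean of a list."""
    mean = round(sum(values) / len(values), precision)
    return min(values), max(values), mean    # packed into one tuple


def greet(name):
    """A function without `return` implicitly returns None."""
    print("hello,", name)


def sign(x):
    """The first `return` that is reached ends the call."""
    if x > 0:
        return "positive"
    if x < 0:
        return "negative"
    return "zero"


low, high, mean = describe([1, 2, 3, 4])     # tuple unpacking
print(low, high, mean)                       # 1 4 2.5
print(describe([1, 2, 4], precision=1))      # (1, 4, 2.3)
print(greet("Python") is None)               # True
print(sign(-3), sign(0))                     # negative zero
```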

`lambda` **statement**

A `lambda` expression defines an anonymous function: a function object that is created without being bound to a name. It is useful for short, one-off functionality that does not deserve a `def` statement.

> `lambda` arguments : relational expression(as return)

※ *Lambda expressions are one way to implement 'functional programming', a programming paradigm like OOP*

In [0]:
# assigning simple addition function with `lambda`
# using lambda expression in this way is not recommended (just for example)

my_add = lambda a, b : a + b
my_add(1,2)
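
A more typical use of `lambda` is to pass a short function directly to another function, without assigning it to a name. The sketch below uses the built-in `sorted()` and `map()`; the data in `records` is made up for illustration.

```python
records = [("carol", 31), ("alice", 25), ("bob", 28)]

# sort by the second element of each tuple
by_age = sorted(records, key=lambda record: record[1])
print(by_age)     # [('alice', 25), ('bob', 28), ('carol', 31)]

# apply a small transformation to every element
squares = list(map(lambda x: x ** 2, [1, 2, 3]))
print(squares)    # [1, 4, 9]
```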

## Reference (Python & object-oriented programming(OOP))

* [Python official documentation (Python version 3.6.9)](https://docs.python.org/3.6/index.html) - the official Python documentation, including the tutorial (*also available in Korean*).
* ['Jump to Python' WikiDocs (Korean)](https://wikidocs.net/book/1) - well-known Python material in Korean, freely accessible online via WikiDocs.
* [A Whirlwind Tour of Python](https://jakevdp.github.io/WhirlwindTourOfPython/) - introductory text on Python, written by the author of 'Python Data Science Handbook' (who also appeared in the opening video of the Colab introduction).
* Many open lectures are also available online (on [Coursera](https://www.coursera.org/specializations/python), [Edwith](https://www.edwith.org/sogang_python), [OpenTutorials](https://opentutorials.org/course/1750), etc.)