# Python/R Basics


## Python Basics
 




### Define Functions

Note that type hints do not enforce types or trigger any type conversion at runtime (unlike Cython or statically typed languages).

In [None]:
def myfunc(a:float, *args, **kwargs) -> str:
    return str(a)

In [None]:
# A NumPy array is not a float, so this should not work, but it DOES: hints are not checked at runtime
import numpy as np
x = np.array([1,1])
myfunc(x)

'[1 1]'
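
Type hints are only stored as metadata on the function object. The sketch below shows where they live and one way to enforce them by hand; the helper `strict_float_to_str` is a hypothetical name introduced only for illustration.

```python
def myfunc(a: float, *args, **kwargs) -> str:
    return str(a)

# Hints are stored in __annotations__ and are not checked at call time.
print(myfunc.__annotations__)

def strict_float_to_str(a: float) -> str:
    # Hypothetical helper: enforce the hint manually.
    if not isinstance(a, (int, float)):
        raise TypeError(f"expected a number, got {type(a).__name__}")
    return str(float(a))

print(myfunc([1, 1]))          # accepted despite the float hint
print(strict_float_to_str(2))  # '2.0'
```

A static checker such as mypy can flag the first call before the code runs, but the interpreter itself will not.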

`*args` collects any extra positional arguments. Inside the function, `args` is a tuple, so you can iterate over it just like a list.

In [None]:
def my_sum(*args):
    result = 0
    for x in args:
        result += x
    return result
my_sum(1,2,3) # Works
my_sum(2,3,5,6) # Works

On the other hand, `**kwargs` collects keyword arguments. Inside the function, `kwargs` is an ordinary Python dictionary.

In [None]:
def my_concat(**kwargs):
    result = ""

    for k, v in kwargs.items():
        result += str(v)  # str() so that non-string values can be concatenated too
    return result
my_concat(x="a",y="b") # works
my_concat(fff = 1, bsr=2) # works because of str(v)
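
To make the two mechanisms concrete, here is a minimal sketch (the function `describe` is a hypothetical name) that reports what `args` and `kwargs` actually are, and shows the matching `*` and `**` unpacking at the call site.

```python
def describe(*args, **kwargs):
    # args is a tuple of positional arguments, kwargs a dict of keyword arguments
    return type(args).__name__, type(kwargs).__name__, args, kwargs

print(describe(1, 2, x="a"))

numbers = [1, 2, 3]
options = {"x": "a", "y": "b"}
# * unpacks a sequence into positional arguments, ** unpacks a dict into keyword arguments
print(describe(*numbers, **options))
```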

### Exception Handling

---
The most common way to signal an error is to raise an exception (also called throwing). 

In [None]:
def raise_exception(x):
    raise Exception("I am an EXCEPTION!!!") # Something bad has happened

def catcher(x):
    try:
        raise_exception(x) # Run the code. If everything is fine, it returns normally.
    except (TypeError, NameError):  # Handle the specific errors we know how to deal with.
        print("I am ok with this!")
    except Exception: # Unknown exceptions are re-raised so that someone else can do the job.
        raise
    finally: # This will always execute, whether or not an exception occurred.
        print("Let us swallow everything when exception occurs!")
    
    

In [None]:
catcher(1)

There are quite a few problems with this approach.



*   A single unhandled exception stops the program.
*   This is fine while we are testing our code. However, in a production system, you do not want a night-time call to restart the system.
*   Once a function starts throwing an exception, everyone who calls that function has to modify their code by adding `try-except` blocks.
*   Many exceptions are passed all the way to the top and only handled there. However, the top-level function does not know the details of each function, so it is extremely hard to devise a complete plan. 



An alternative is to use logging. There are many logging options and we will not delve into the details. The idiom is to log what went wrong and then specify the fallback behavior. 

The advantage is that the program keeps running, and by adjusting the log level you can control how much gets reported. However, **someone still has to handle the exceptions!**

In [None]:
import logging
logging.info("This is some useful information.")  # hidden by default: the root logger's level is WARNING
logging.warning("This is some warning!")
logging.error("Something went wrong!")
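
The "log what goes wrong and specify the behavior" idiom can be sketched as follows. The function `safe_int` and the logger name `"demo"` are hypothetical and chosen only for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO)  # lower the threshold so INFO messages are shown
logger = logging.getLogger("demo")       # "demo" is an arbitrary logger name

def safe_int(text, default=0):
    """Convert text to int; on failure, log a warning and return a default."""
    try:
        return int(text)
    except ValueError:
        logger.warning("Could not parse %r, using default %r", text, default)
        return default

print(safe_int("42"))
print(safe_int("hahaha"))
```

The program keeps running after the bad input, but the caller now silently receives `default`, which is exactly why someone still has to decide what the right fallback is.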


A final, very popular option is to use a monad. Monads are quite complex to explain, so let us look at an example. 

In [None]:
class Failure():
    def __init__(self, value, failed=False):
        self.value = value
        self.failed = failed
    def get(self):
        return self.value
    def is_failed(self):
        return self.failed
    def __str__(self):
        return ' '.join([str(self.value), str(self.failed)])
    def __or__(self, f):
        if self.failed:
            return self
        try:
            x = f(self.get())
            return Failure(x)
        except Exception:  # avoid a bare except, which would also catch KeyboardInterrupt
            return Failure(None, True)

In [None]:
# This will work.
from operator import neg
x = '1'
y = Failure(x) | int | neg | str
print(y)

-1 False


In [None]:
# This will not
from operator import neg
x = 'hahaha'
y = Failure(x) | int | neg | str
print(y)

None True


A beautiful collection of functional programming primitives can be found [here](https://github.com/jasondelaat/pymonad.git). The package is published on PyPI as `PyMonad`, so `pip install PyMonad` should install it.


### Python Class

In [None]:
class MyClass(object):
    def __init__(self, x):
        self.x = x
    def __del__(self): # WARNING: Perhaps a very bad idea!
        print("I am gone")

In [None]:
my_class = MyClass(1)

In [None]:
del my_class

In [None]:
my_class

In [None]:
my_class_a = MyClass(1)
my_class_b = my_class_a
my_class_c = MyClass(1)

In [None]:
my_class_b.x= 2
print(my_class_a.x) # Both names refer to the same instance, which is why the value changes.

In [None]:
my_class_b == my_class_a

In [None]:
my_class_a = MyClass(1)
my_class_c = MyClass(1)
my_class_a == my_class_c

In [None]:
from copy import deepcopy
my_class_a = MyClass(1)
my_class_b = deepcopy(my_class_a)
my_class_b == my_class_a

In [None]:
my_class_b.x= 2
print(my_class_a.x)
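
The comparisons above return `False` for distinct instances because a class that does not define `__eq__` compares by identity. A minimal sketch of the difference, using two hypothetical classes `Plain` and `WithEq`:

```python
from copy import deepcopy

class Plain:
    def __init__(self, x):
        self.x = x

class WithEq(Plain):
    def __eq__(self, other):
        # Compare by value instead of by identity
        return isinstance(other, WithEq) and self.x == other.x

a = Plain(1)
alias = a                      # a second name for the same object
print(alias is a, alias == a)  # True True
print(deepcopy(a) == a)        # False: a distinct object, and the default == is identity
print(WithEq(1) == WithEq(1))  # True: value comparison via __eq__
```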

### The Ghost Bus Incident

---
It is usually a terrible idea to use a mutable object as a default argument. The following snippets illustrate the point. 

In [None]:
class GhostBus:
    def __init__(self, passengers=[]):
        self.passengers = passengers
    
    def pick(self, name):
        self.passengers.append(name)
        
    def drop(self, name):
        self.passengers.remove(name)

In [None]:
# Run this several times
ghost_bus = GhostBus()
ghost_bus.pick('A Ghost')
ghost_bus.passengers

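What goes wrong here? `self.passengers` is a reference to `passengers`, and `passengers` is a reference to the default `[]`. That default list is created only once, when the function is defined, and is then shared by every call. When you mutate `self.passengers`, you are mutating that shared list as well. So please use `None` as the default instead. 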
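
A minimal sketch of the `None` idiom follows. The class name `Bus` is hypothetical, and copying the caller's list with `list(...)` is an extra design choice made here, not something the text above requires.

```python
class Bus:
    def __init__(self, passengers=None):
        # A fresh list per instance; copy the caller's list so it is not aliased either
        self.passengers = [] if passengers is None else list(passengers)

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)

first = Bus()
first.pick("A Ghost")
second = Bus()
print(first.passengers)   # ['A Ghost']
print(second.passengers)  # []
```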

### Common Data Structures: List

---
A Python list is a little like a C++ `vector`, except that it can hold objects of any type. It is ordered. 

In [None]:
a = []
# a = list()
b= [1,a,'2']

In [None]:
b[0]

In [None]:
b[:1]

In [None]:
b[1:]

In [None]:
b[2:3]

In [None]:
b[-1]

In [None]:
b[:-2]

In [None]:
b.append(5)
b

In [None]:
b.extend([1,2])
b

In [None]:
b.insert(1,'haha')
b

In [None]:
del b[0]
b

In [None]:
b.remove(1)

In [None]:
matrix  = [[1,2],[3,4],[5,6],[7,8]]
matrix

In [None]:
transpose = [[row[i] for row in matrix] for i in range(2)]

To understand what happens, note that we have used the list comprehension syntax. In short,

```
x = [i*2 for i in range(10)]
```

is the same as

```
x = list()
for i in range(10):
    x.append(i*2)
```
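
Applied to the matrix transposition, the same expansion looks like the sketch below. The `zip(*matrix)` variant is an additional, commonly used alternative and is not part of the original cells.

```python
matrix = [[1, 2], [3, 4], [5, 6], [7, 8]]

# Nested comprehension: for each column index i, collect row[i] from every row
transpose = [[row[i] for row in matrix] for i in range(2)]

# The same computation written with explicit loops
transpose_loop = []
for i in range(2):
    column = []
    for row in matrix:
        column.append(row[i])
    transpose_loop.append(column)

# zip(*matrix) groups the i-th elements of every row together
transpose_zip = [list(col) for col in zip(*matrix)]

print(transpose)  # [[1, 3, 5, 7], [2, 4, 6, 8]]
```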

### Common Data Structures: Set

---
A set is essentially a hash set, which means it is unordered. The 'equivalent' in C++ is `unordered_set`. It also contains no duplicate elements. 

In [None]:
a = {1,2,3}

In [None]:
my_set = {1, 3}
print(my_set)
my_set.add(2)
print(my_set)
my_set.update([2, 3, 4])
print(my_set)
my_set.update([4, 5], {1, 6, 8})
print(my_set)

{1, 3}
{1, 2, 3}
{1, 2, 3, 4}
{1, 2, 3, 4, 5, 6, 8}


In [None]:
my_set.add(1)
my_set

{1, 2, 3, 4, 5, 6, 8}

In [None]:
my_set.remove(1)
my_set

{2, 3, 4, 5, 6, 8}

In [None]:
set_a = {1,2,3}
set_b = {3,4,5}

Here are some set operations: union (`|`), difference (`-`), intersection, and symmetric difference (`^`). 

In [None]:
print(set_a|set_b)
print(set_a - set_b)
print(set_b - set_a)
print(set_a.union(set_b))
print(set_a.intersection(set_b))
print(set_a^set_b)

{1, 2, 3, 4, 5}
{1, 2}
{4, 5}
{1, 2, 3, 4, 5}
{3}
{1, 2, 4, 5}


### Common Data Structures: Dict

---
A dict is basically a hash map. Its 'equivalent' in C++ is `unordered_map`. Since Python 3.7, a dict preserves insertion order; in older versions it does not, so to avoid pain there, use `OrderedDict` if you need order. 

In [None]:
a = dict()
a = {'x':'1', 'y':'2'}

In [None]:
print(a['x'])
print(a['not_here'])  # raises KeyError because the key does not exist
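
Indexing a missing key raises `KeyError`. A short sketch of the usual ways to look up a key that may be absent:

```python
a = {'x': '1', 'y': '2'}

print(a.get('not_here'))             # None instead of a KeyError
print(a.get('not_here', 'missing'))  # explicit default value
print('not_here' in a)               # membership test on the keys

try:
    a['not_here']
except KeyError as err:
    print("missing key:", err)
```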

In [None]:
a['new_element'] = 'haha'
print(a)

In [None]:
print(a.keys())
print(a.values())

In [None]:
del a['new_element']

In [None]:
a

In [None]:
keys = ['a','b','c']
values = [1,2,3]
dict_from_zip = dict(zip(keys, values))
print(dict_from_zip)

In [None]:
def my_concat(**kwargs):
    result = ""
    
    for k, v in kwargs.items():
        result += v
    return result
my_concat(x="a",y="b")

In [None]:
my_concat(**a)

In [None]:
# You can also use dict comprehension to shorten your code. 
odd_squares = {x: x*x for x in range(11) if x % 2 == 1}
print(odd_squares)

### Common Data Structure: NamedTuple

In [None]:
from collections import namedtuple

In [None]:
employee = namedtuple('Employee', ['age','place', 'education'])

In [None]:
tom = employee(age=10, place='beijing', education='none')

In [None]:
print(tom)
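
A named tuple behaves like a regular tuple with named fields. The sketch below shows field access, unpacking, and immutability; `_replace` and `_asdict` are part of the standard `namedtuple` API.

```python
from collections import namedtuple

Employee = namedtuple('Employee', ['age', 'place', 'education'])
tom = Employee(age=10, place='beijing', education='none')

print(tom.age, tom[1])            # access by name or by position
age, place, education = tom       # unpacks like a regular tuple

older_tom = tom._replace(age=11)  # returns a new tuple; tom itself is unchanged
print(older_tom)
print(tom._asdict())

try:
    tom.age = 12                  # named tuples are immutable
except AttributeError as err:
    print("cannot assign:", err)
```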

### Common Data Structure: dataclass

---

A data class is a great way to pass many parameters to a function. It helps with documentation and with validation (such as range checks), so people cannot just stuff anything into it. 

In [None]:
from dataclasses import dataclass, field
from typing import Optional

In [None]:
@dataclass
class MyDataClass:
    name: str = field(
        default='tom',
        metadata={'help': "Name of the person"})

    age: Optional[int] = field(
        default=None,
        metadata={'help': "Age of the person. Optional."})

    vip: int = field(
        default=100,
        metadata={'help': "Some very important field."})

    def __post_init__(self):  # This function helps you handle illegal arguments.
        if self.vip <= 0:
            raise ValueError("That important thing has to be larger than 0")

    @property
    def age_type(self):
        if self.age is None:  # age is optional, so do not compare None with an int
            return 'Age unknown'
        if self.age >= 100:
            return 'You are old'
        else:
            return 'You are still young'

In [None]:
my_data_class = MyDataClass(name='jerry', age = 20)
print(my_data_class)

In [None]:
print(my_data_class.age)
print(my_data_class.age_type)
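
To see the validation and the field metadata in action, here is a smaller, self-contained sketch. The class `Settings` is a hypothetical stand-in for `MyDataClass`.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Settings:  # hypothetical example class
    name: str = field(default='tom', metadata={'help': "Name of the person"})
    vip: int = 100

    def __post_init__(self):
        # Runs right after the generated __init__
        if self.vip <= 0:
            raise ValueError("vip must be larger than 0")

print(Settings(name='jerry'))             # Settings(name='jerry', vip=100)
print(fields(Settings)[0].metadata['help'])  # the help text attached to `name`

try:
    Settings(vip=-1)
except ValueError as err:
    print("rejected:", err)
```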

A word about docs. 

In general, using [Sphinx](https://www.sphinx-doc.org/en/master/) to generate documentation is a pretty good idea. Therefore, functions should come with docstrings. For public APIs, the docstring should include at least 

1.   Functionality
2.   Argument type and explanation.
3.   Return type.
4.   (Optional) A use case. 

Note that if a function modifies any of its input parameters, this **MUST** be highlighted in the doc. 
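
As an illustration, a docstring covering these points might look like the sketch below. The function `append_total` is hypothetical, and the `:param:`/`:return:` field lists are the reStructuredText style that Sphinx understands; other docstring styles work too.

```python
def append_total(values, total_label="total"):
    """Append the sum of ``values`` to the list and return the sum.

    **Modifies its input**: ``values`` is changed in place.

    :param values: list of numbers; a ``(label, sum)`` tuple is appended to it.
    :type values: list
    :param total_label: label stored next to the sum.
    :type total_label: str
    :return: the sum of the original numbers.
    :rtype: int or float

    Example::

        >>> data = [1, 2]
        >>> append_total(data)
        3
        >>> data
        [1, 2, ('total', 3)]
    """
    total = sum(values)
    values.append((total_label, total))
    return total
```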


## R 

---
Before we venture into more advanced stuff, let us very briefly introduce what R does, along with magic functions. To use R, you first have to activate the functionality. 

In [None]:
%load_ext rpy2.ipython



To use R, we can use `%%R` cell magic. 

In [None]:
%%R # This means 
install.packages('caret')

R[write to console]: The downloaded source packages are in
	‘/tmp/RtmpDDbuDm/downloaded_packages’



In [None]:
%%R 
library('caret')

In [None]:
%%R
a  <- 1
2 -> b
c = 1
a == c

In [None]:
%%R
for (i in 1:100){
    print(i)
}

In [None]:
%%R
myfunc <- function(a){
    a = a+1
    return(a+1)
}

In [None]:
%%R
myfunc(a) # R copies on modification, so the caller's a is left unchanged

In [None]:
%%R
a

In [None]:
%%R
data(mtcars) # This is a built-in R dataset

In [None]:
%%R
summary(mtcars)

In [None]:
%%R
mtcars$mpg

## Magic Functions in Python Objects


In [None]:
class Vector:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

Let us see if we can print it out in a nice way. 

In [None]:
class Vector:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Vector(%r,%r)' % (self.x, self.y)
    def __str__(self):                              
        return 'Vector(%r,%r)' % (self.x, self.y)

In [None]:
v = Vector(1,2)
print(str(v))
print(v)

How about some arithmetic?

In [None]:
class Vector:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Vector(%r,%r)' % (self.x, self.y)
    
    def __add__(self, other):
        x = self.x + other.x
        y = self.y + other.y
        return Vector(x, y)
    
    def __sub__(self, other):
        x = self.x - other.x
        y = self.y - other.y
        return Vector(x, y)
    
    def __mul__(self, scalar):
        return Vector(self.x * scalar, self.y * scalar)

In [None]:
v1 = Vector(0,0)
v2 = Vector(1,2)

v1+v2

How about comparison?

In [None]:
from math import hypot

class Vector:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Vector(%r,%r)' % (self.x, self.y)
    
    def __add__(self, other):
        x = self.x + other.x
        y = self.y + other.y
        return Vector(x, y)
    
    def __sub__(self, other):
        x = self.x - other.x
        y = self.y - other.y
        return Vector(x, y)
    
    def __mul__(self, scalar):
        return Vector(self.x * scalar, self.y * scalar)
    
    def __abs__(self):
        return hypot(self.x, self.y)
    
    def __bool__(self):
        return bool(abs(self))
    
    def __eq__(self, other):
        return self.x == other.x and self.y == other.y
    
    def __lt__(self, other):
        return abs(self) < abs(other)
    
    def __gt__(self, other):
        return abs(self) > abs(other)

In [None]:
v1 = Vector(1,1)
v2 = Vector(1,1)
v3 = Vector(1,2)

print(v1 == v2)
print(v1 == v3)

print(v3 > v1)
print(v1 < v3)
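
Once comparison is defined, built-ins such as `sorted` can use it. The sketch below repeats a trimmed-down `Vector` so that it runs on its own; only `__lt__`, `__abs__`, and `__bool__` are needed for this demonstration.

```python
from math import hypot

class Vector:
    def __init__(self, x=0, y=0):
        self.x, self.y = x, y

    def __repr__(self):
        return 'Vector(%r,%r)' % (self.x, self.y)

    def __abs__(self):
        return hypot(self.x, self.y)

    def __bool__(self):
        return bool(abs(self))

    def __lt__(self, other):
        return abs(self) < abs(other)

vectors = [Vector(3, 4), Vector(0, 0), Vector(1, 1)]
print(sorted(vectors))            # sorted() only needs __lt__; orders by length
print([v for v in vectors if v])  # __bool__ filters out the zero vector
```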

## Basic Functional Programming in Python

### Common Higher-Order Functions

In [None]:
my_input = [1,2,3,4,5,6,6]
result = map(lambda x: x+1, my_input)
print(result) # map is lazy
print(list(result))

In [None]:
from functools import reduce
result = reduce(lambda x, y: x+y, filter(lambda x: x > 3, map(lambda x: x+1, my_input)))

In [None]:
print(result)
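
For comparison, the same map/filter/reduce pipeline can be written with a generator expression. This is a sketch of an equivalent formulation, not a replacement for the functional style shown above.

```python
from functools import reduce

my_input = [1, 2, 3, 4, 5, 6, 6]

# Functional style: add 1, keep values greater than 3, then sum
functional = reduce(lambda x, y: x + y,
                    filter(lambda x: x > 3, map(lambda x: x + 1, my_input)))

# The same pipeline as a generator expression fed into sum()
pythonic = sum(x + 1 for x in my_input if x + 1 > 3)

print(functional, pythonic)  # 29 29
```

One difference to keep in mind: `reduce` without an initial value raises `TypeError` on an empty input, whereas `sum` returns 0.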

### Decorators

In [None]:
def my_decorator(func):
    def my_decorator_impl(x):
        result = x if x > 0 else 0
        return func(result)
    return my_decorator_impl

@my_decorator
def myfunc(x):
    return np.sqrt(x)

In [None]:
myfunc(-1)

In [None]:
from functools import partial
def decor_impl(fun, argument):
    def impl(x):
        result = x if x > argument else argument
        return fun(result)
    return impl

decor = partial(decor_impl, argument = 2)

@decor
def myfunc(x):
    return np.sqrt(x)

In [None]:
myfunc(-1)

In [None]:
def para(dec):
    def layer(*args, **kwargs):
        def repl(f):
            return dec(f, *args, **kwargs)
        return repl
    return layer

@para
def decor(f, n):
    def impl(x):
        result = x if x > n else n
        return f(result)
    return impl

@decor(0)
def myfunc(x):
    return np.sqrt(x)

In [None]:
myfunc(-1)
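
The parameterized decorator above can also be written directly as a decorator factory. The sketch below uses `functools.wraps` so the decorated function keeps its own name and docstring; the names `clamp_below` and `safe_sqrt` are hypothetical, and `math.sqrt` stands in for `np.sqrt` to keep the example free of dependencies.

```python
import math
from functools import wraps

def clamp_below(minimum):
    """Decorator factory: replace inputs smaller than `minimum` with `minimum`."""
    def decorator(func):
        @wraps(func)  # keep the wrapped function's __name__ and __doc__
        def wrapper(x):
            return func(max(x, minimum))
        return wrapper
    return decorator

@clamp_below(0)
def safe_sqrt(x):
    """Square root that treats negative inputs as 0."""
    return math.sqrt(x)

print(safe_sqrt(-1))       # 0.0
print(safe_sqrt(9))        # 3.0
print(safe_sqrt.__name__)  # safe_sqrt
```

Writing `@clamp_below(0)` is equivalent to `safe_sqrt = clamp_below(0)(safe_sqrt)`: the factory is called first, and the decorator it returns is then applied to the function.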