# Object-Oriented Programming (OOP) 

## Define a Class

In [1]:
class Client:
    name = "default"
    phone = "(123)456-7890"
    email = "foo@bar.com"
    purchases = 0


## Creating an Instance (Object) of a Class


In [2]:
firstClient = Client()  # Create a new object

In [3]:
# Set attributes on this instance (they override the class-level defaults)
firstClient.name = "Lily"    
firstClient.email = "python@cuhk.edu.hk"

In [4]:
print(firstClient.name)
print(firstClient.phone)
print(firstClient.email)
print(firstClient.purchases)

Lily
(123)456-7890
python@cuhk.edu.hk
0


In [5]:
try:
    print(firstClient.middleName)  # Error: no such field defined
except Exception as ex:
    print("!!!Error: ", ex)

!!!Error:  'Client' object has no attribute 'middleName'
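
The cells above assign `name` and `email` through `firstClient` only. A minimal sketch of what that does, using a stripped-down `Client` (the second object `secondClient` is a hypothetical name added for illustration): assigning through an instance creates an instance attribute that shadows the class-level default, while other instances keep seeing the default.

```python
class Client:
    name = "default"       # class-level default, shared by all instances
    purchases = 0

firstClient = Client()
secondClient = Client()    # hypothetical second object, for comparison

firstClient.name = "Lily"  # creates an instance attribute on firstClient only

print(firstClient.name)    # Lily
print(secondClient.name)   # default
print(Client.name)         # default

# vars() lists only the attributes stored on the instance itself
print(vars(firstClient))   # {'name': 'Lily'}
print(vars(secondClient))  # {}
```

This is why `firstClient.phone` still printed the default value: nothing was assigned on the instance, so the lookup fell back to the class.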


## Example: HK Stocks

In [6]:
class HKStocks:
    def __init__(self, stock_code="default", name="NA", price=0, market_cap=0):
        self.profile = {'stock_code': stock_code, 'name': name, 'price': price, 'market_cap': market_cap}

In [7]:
s0 = HKStocks()
print(s0.profile)

{'stock_code': 'default', 'name': 'NA', 'price': 0, 'market_cap': 0}


In [8]:
# object 1 of Class-HKStocks
s1 = HKStocks('09988.HK','BABA-SW',89.55,1897.12)
print(s1.profile)
print(s1.profile['stock_code'])
print(s1.profile['name'])
print(s1.profile['price'])
print(s1.profile['market_cap'])

{'stock_code': '09988.HK', 'name': 'BABA-SW', 'price': 89.55, 'market_cap': 1897.12}
09988.HK
BABA-SW
89.55
1897.12


In [9]:
# object 2 of Class-HKStocks
s2 = HKStocks('03690.HK','MEITUAN-W',125.00,780.26)
print(s2.profile)
print(s2.profile['stock_code'])
print(s2.profile['name'])
print(s2.profile['price'])
print(s2.profile['market_cap'])

{'stock_code': '03690.HK', 'name': 'MEITUAN-W', 'price': 125.0, 'market_cap': 780.26}
03690.HK
MEITUAN-W
125.0
780.26


In [10]:
# object 3 of Class-HKStocks
s3 = HKStocks('00700.HK','TENCENT',325.00,3111.63)
print(s3.profile)
print(s3.profile['stock_code'])
print(s3.profile['name'])
print(s3.profile['price'])
print(s3.profile['market_cap'])

{'stock_code': '00700.HK', 'name': 'TENCENT', 'price': 325.0, 'market_cap': 3111.63}
00700.HK
TENCENT
325.0
3111.63
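
Since every `HKStocks` object stores its data in the same `profile` layout, several objects can be handled together. The following is a sketch only: the list name `stocks` and the variable `largest` are introduced here for illustration, and the class is repeated so the block runs on its own.

```python
class HKStocks:
    def __init__(self, stock_code="default", name="NA", price=0, market_cap=0):
        self.profile = {'stock_code': stock_code, 'name': name,
                        'price': price, 'market_cap': market_cap}

# the three objects from the cells above, collected in a list
stocks = [
    HKStocks('09988.HK', 'BABA-SW', 89.55, 1897.12),
    HKStocks('03690.HK', 'MEITUAN-W', 125.00, 780.26),
    HKStocks('00700.HK', 'TENCENT', 325.00, 3111.63),
]

# the same code works for every object because they share one structure
for s in stocks:
    print(s.profile['stock_code'], s.profile['name'], s.profile['price'])

# pick the object with the largest market_cap value
largest = max(stocks, key=lambda s: s.profile['market_cap'])
print(largest.profile['name'])   # TENCENT
```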


## Define Methods in a Class


In [11]:
class Person:
    name = "I have no name :("
    
    def sayName(self):
        print("My name is...", self.name)
        

In [12]:
aPerson = Person()
aPerson.sayName()

My name is... I have no name :(


In [13]:
aPerson.name = "HappyLily :D"
aPerson.sayName()

My name is... HappyLily :D


## Accessing Attributes & Methods 

In [14]:
lisa = Person()
lisa.name = "Lisa, Nice to meet you."
lisa.sayName()


My name is... Lisa, Nice to meet you.


## Constructor: A Special Method

In [15]:
class Person:
    name = ""
    
    def __init__(self):
        self.name = "No name"


bPerson = Person()
bPerson.name

'No name'

In [16]:
class Person:
    name = ""
    
    def __init__(self, aName):
        self.name = aName

cPerson = Person("Lynn")
cPerson.name

'Lynn'

In [17]:
class Person:
    def __init__(self, aName="No name"):
        self.name = aName
        
dPerson = Person()
print(dPerson.name)

ePerson = Person("Rose")
print(ePerson.name)

No name
Rose


## Example: Class Person with Birthday 

In [18]:
class Person:
    name = "No Name"
    age = 0

    def __init__(self,newName,newAge):
        self.name = newName
        self.age = newAge

    def haveBirthday(self):
        print("Happy Birthday!")
        self.mature()

    def mature(self):
        self.age = self.age + 1


In [19]:
aPerson = Person("Cartman",8)
print("%s is %d." %(aPerson.name,aPerson.age))

Cartman is 8.


In [20]:
aPerson.haveBirthday()
print("%s is %d." %(aPerson.name,aPerson.age))

Happy Birthday!
Cartman is 9.


## Example: Class Client with Purchases

In [21]:
class Client:
    name = "No Name"
    purchase = 0

    def __init__(self,newName,newPurchase):
        self.name = newName
        self.purchase = newPurchase

    def makePurchase(self, addPurchase):
        print("Making new purchase: " + str(addPurchase))
        self.purchase = self.purchase + addPurchase


In [22]:
aClient = Client("Lily",100)
print("%s's purchase position: %d." %(aClient.name,aClient.purchase))

Lily's purchase position: 100.


In [23]:
aClient.makePurchase(150)
print("%s's purchase position: %d." %(aClient.name,aClient.purchase))

Making new purchase: 150
Lily's purchase position: 250.


# NumPy


## Why NumPy

NumPy (short for Numerical Python) provides a data structure called `ndarray`, which is like Python's built-in `list` type but more efficient in storage and data operations. The difference grows with the size of the data. Each element of a `list` is itself a full Python object, and with a large number of elements this adds overhead in both memory and speed. For example, operating on every element of a `list` requires an explicit loop (or comprehension), which is slow compared with NumPy's compiled routines.

NumPy's array data structure stores data in a simpler form and manipulates it more directly in memory without having to deal with high-level Python objects. It is less flexible in that all elements of an array must share the same data type. However, this limitation also allows numerical operations such as matrix manipulations and simulations to be completed much faster. For this reason, NumPy arrays form the core of nearly the entire ecosystem of data science tools in Python. Other popular data science libraries such as pandas and scikit-learn are built on top of NumPy.
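
To make the efficiency claim concrete, here is a rough timing sketch using the standard-library `timeit` module. The array length `n`, the repeat count, and the helper names `add_two_list` and `add_two_array` are arbitrary choices for illustration, and the measured times will differ from machine to machine.

```python
import timeit
import numpy as np

n = 100_000
numbers_list = list(range(n))
numbers_array = np.arange(n)

def add_two_list():
    # one Python-level operation per element
    return [v + 2 for v in numbers_list]

def add_two_array():
    # a single vectorized operation over the whole array
    return numbers_array + 2

list_time = timeit.timeit(add_two_list, number=20)
array_time = timeit.timeit(add_two_array, number=20)
print(f"list:    {list_time:.4f} s")
print(f"ndarray: {array_time:.4f} s")

# both approaches produce the same values
same_result = add_two_list() == add_two_array().tolist()
print(same_result)
```

On typical hardware the `ndarray` version is expected to be noticeably faster, but treat the numbers as indicative rather than a benchmark.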


## Installing NumPy

If you installed Python through Anaconda, the recommended method, you already have NumPy. Google Colab also comes with the NumPy module installed. You just have to import the module into your Jupyter notebook to begin using it.

For more information regarding NumPy installation, visit - https://scipy.org/install.html.


## Importing NumPy

- Packages can be imported into a Python environment with the `import` statement followed by the module name.
- You may also give the module an alias, so you do not have to type the full module name every time you want to access something attached to it.

In [24]:
# import - import directive
# numpy - the package being imported
# as np - give alias "np" to the module numpy

import numpy as np

# now you can access the functions attached to numpy using "np."

## ndarray

- NumPy's main object is the homogeneous multidimensional array: a table of elements that all have the same data type (for example, all integers, all floats, or all strings), most often numbers.
- `ndarray` stands for N-dimensional array. A NumPy array is a grid of values, all of the same type.
- Like `list`, `ndarray` can also be indexed and sliced.
- NumPy methods and functions can be accessed using dot notation after importing the module itself.

In [25]:
np.array([1,2,3])

array([1, 2, 3])

In [26]:
a=np.array([[1,4],[2,3],[1,4]])
b=np.array([[1,2],[1,2]])
a.dot(b)

array([[ 5, 10],
       [ 5, 10],
       [ 5, 10]])

### NumPy vs List operations
`list` and `ndarray` both contain data. So what is the difference?

- The primary difference is the way the data is stored in an `ndarray`, which makes it perform better. NumPy arrays take up less space and are faster than a `list`.
- A `list` can contain objects of different types. The elements of an `ndarray`, however, are homogeneous.
- A `list` is flexible, allowing items to be added or removed. An `ndarray` has a fixed size; functions such as `np.append` return a new array rather than resizing the existing one in place.
- In addition, NumPy and SciPy provide optimized functions for scientific operations such as linear algebra and more.

In [27]:
alist = ['Book', 4, 14.99]

print(type(alist))

print(type(alist[0]))
print(type(alist[1]))
print(type(alist[2]))

<class 'list'>
<class 'str'>
<class 'int'>
<class 'float'>


In [28]:
A = np.array(alist)
A

array(['Book', '4', '14.99'], dtype='<U32')

In [29]:
print(type(A))

print(type(A[0]))
print(type(A[1]))
print(type(A[2]))

<class 'numpy.ndarray'>
<class 'numpy.str_'>
<class 'numpy.str_'>
<class 'numpy.str_'>


- There are additional advantages to using an `ndarray` over a `list`.

For example, to add the constant 2 to each element of a numeric `list`, we write -

In [30]:
numlist = range(1,4)

plus_two_list = []   # will hold each element plus 2
for i in numlist:
    print(i)
    plus_two_list.append(i+2)

print(plus_two_list)

1
2
3
[3, 4, 5]


Do the same with `ndarray`

In [31]:
A = np.arange(1,4)
A_plus_two = np.add(A,2)   # adds 2 to every element, no loop needed

print(A)
A_plus_two

[1 2 3]


array([3, 4, 5])

### Getting array information

We can get additional information about an array from attributes such as -
- `ndim`, the number of dimensions of the array,
- `shape`, the length of each dimension,
- `size`, the total number of elements, and
- `dtype`, the data type of the elements.

In [32]:
A.ndim

1

In [33]:
A.shape

(3,)

In [34]:
A.size

3

In [35]:
A.dtype

dtype('int32')

### Applying operators
- Arithmetic operations are even easier with NumPy.

For example, to add the constant 2 to an `ndarray`, we can simply write -

In [36]:
A + 2 

array([3, 4, 5])

- This is possible due to the broadcasting feature of `ndarray`, which makes NumPy a powerful scientific package.

Let's look at more examples.

In [37]:
x = np.arange(0,6)
y = np.arange(5,-1,-1)

print(x)
print(y)

[0 1 2 3 4 5]
[5 4 3 2 1 0]


In [38]:
print(x/3)
print(x+y)

[0.         0.33333333 0.66666667 1.         1.33333333 1.66666667]
[5 5 5 5 5 5]


- We can use conditional operators similarly. 

In [39]:
print(x >= 3)
print(x < y)

[False False False  True  True  True]
[ True  True  True False False False]
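
The boolean arrays produced by comparisons are useful beyond printing. As a small sketch (the name `mask` is introduced here for illustration), a boolean array can be used to select, count, or replace elements:

```python
import numpy as np

x = np.arange(0, 6)
mask = x >= 3                 # boolean array, one entry per element of x

selected = x[mask]            # keep only the elements where mask is True
count = mask.sum()            # True counts as 1, False as 0
replaced = np.where(mask, x, 0)  # keep x where mask is True, otherwise use 0

print(selected.tolist())      # [3, 4, 5]
print(count)                  # 3
print(replaced.tolist())      # [0, 0, 0, 3, 4, 5]
```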


That covers operations on arrays of the same length. But what if they are not of the same length or shape?

- We can create a new array using the `np.zeros` function, which creates an array of the specified shape filled with zeros.

In [40]:
z = np.zeros([3,2]) # 3x2, meaning 3 rows and 2 columns
z

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [41]:
try:
    x+z
except Exception as ex:
    print("!!!Error: ", ex)

!!!Error:  operands could not be broadcast together with shapes (6,) (3,2) 


### Modifying arrays

- We get an error above because the shapes of these two arrays do not allow the operation.
- One fix is to change the shape of one of the arrays so that the operation can be performed.
- We can do this using the `reshape` method (also available as the function `np.reshape`).

In [42]:
w = x.reshape(3,2)
w

array([[0, 1],
       [2, 3],
       [4, 5]])

In [43]:
w+z

array([[0., 1.],
       [2., 3.],
       [4., 5.]])

That worked! 

- Of course, all values in `z` were zeros, so the numbers are the same, but notice that the integers have changed into floats.
- This is because the data in `z` is stored as floats, and the float type takes precedence over the integer type when the two are combined.
- To convert a float array to an integer array, the `astype` method can be used.

In [44]:
sum_wz = w+z
sum_wz.astype("int64")

array([[0, 1],
       [2, 3],
       [4, 5]], dtype=int64)
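
One detail worth knowing about `astype`: converting floats to integers drops the fractional part instead of rounding. The sketch below uses a made-up array `f` to show the difference; `np.rint` is one way to round to the nearest integer before converting.

```python
import numpy as np

f = np.array([1.7, -1.7, 2.5, 0.2])    # illustrative values only

as_int = f.astype("int64")             # drops the fractional part (truncates toward zero)
rounded = np.rint(f).astype("int64")   # round to the nearest integer first, then convert

print(as_int.tolist())    # [1, -1, 2, 0]
print(rounded.tolist())   # [2, -2, 2, 0]
print(f.dtype)            # float64: astype returned a new array, f is unchanged
```

Note that `np.rint` rounds halves to the nearest even integer, which is why 2.5 becomes 2.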

### Initializing arrays

So far we have seen examples with `np.arange` and `np.zeros`. But these are not the only options. There are many other functions for creating arrays, each suited to a particular situation.

Let's see a few more:

In [45]:
np.ones((10))                   # Return array filled with 1

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [46]:
np.linspace(1,5,10)             # Return evenly spaced numbers over a specified interval.

array([1.        , 1.44444444, 1.88888889, 2.33333333, 2.77777778,
       3.22222222, 3.66666667, 4.11111111, 4.55555556, 5.        ])

In [47]:
np.random.randint(5, size=10)   # Return array of random integers

array([0, 0, 2, 0, 1, 2, 1, 1, 3, 3])

In [48]:
np.identity(4)                  # Return identity matrix

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

- Refer to NumPy's official documentation to learn about these methods in detail. 

### Basic statistical methods

There are also numerous methods provided to perform basic statistical operations so that we do not have to write the code for these from scratch.

In [49]:
B = np.array([(2,3,4),
              (12,14,15),
              (9,10,6)])

B

array([[ 2,  3,  4],
       [12, 14, 15],
       [ 9, 10,  6]])

In [50]:
# array info
print(B.ndim)
print(B.shape)
print(B.size)
print(B.dtype)

2
(3, 3)
9
int32


In [51]:
print(B.max())
print(B.min())
print(B.sum())

15
2
75


- `B.sum()` gives us the sum of all elements in an array, but what if we want the sum of the elements in each row or column?
- The `axis` argument of `np.sum()` handles this. It lets us choose which dimension to sum the elements over.

In [52]:
print(np.sum(B, axis=0)) # sum down the rows: one total per column
print(np.sum(B, axis=1)) # sum across the columns: one total per row

[23 27 25]
[ 9 41 25]
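
A way to remember the `axis` argument is that it names the dimension that gets collapsed. The sketch below repeats `B` so it runs on its own; the names `col_sums`, `row_sums` and `row_sums_2d` are introduced for illustration.

```python
import numpy as np

B = np.array([(2, 3, 4),
              (12, 14, 15),
              (9, 10, 6)])

col_sums = B.sum(axis=0)   # collapse axis 0 (the rows): one value per column
row_sums = B.sum(axis=1)   # collapse axis 1 (the columns): one value per row

print(col_sums.tolist())   # [23, 27, 25]
print(row_sums.tolist())   # [9, 41, 25]

# keepdims=True keeps the collapsed axis with length 1
row_sums_2d = B.sum(axis=1, keepdims=True)
print(row_sums_2d.shape)   # (3, 1)
```

The same `axis` argument is accepted by other reductions such as `max`, `min` and `mean`.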


### Indexing, Slicing and Stepping

- To make things easier, NumPy follows the same convention to index, slice and step as Python lists. 

In [53]:
alpha_list = ['a','b','c','d','e','f']
A = np.array(alpha_list)
A[0]

'a'

In [54]:
A[1:4]

array(['b', 'c', 'd'], dtype='<U1')

In [55]:
A[::2]

array(['a', 'c', 'e'], dtype='<U1')

In [56]:
A[::-1]

array(['f', 'e', 'd', 'c', 'b', 'a'], dtype='<U1')

What about arrays with more than 1 dimension? 

- We can just separate each dimension with a comma and follow the same convention.

In [57]:
B

array([[ 2,  3,  4],
       [12, 14, 15],
       [ 9, 10,  6]])

In [58]:
B[2,1]

10

In [59]:
B[:2,1]

array([ 3, 14])

In [60]:
B[::-1,::-1]

array([[ 6, 10,  9],
       [15, 14, 12],
       [ 4,  3,  2]])

In [61]:
C =np.array([ [[1,2,3,4],
               [4,5,6,7],
               [8,9,10,11]],
             
             [[12,13,14,15], 
              [16,17,18,19],
              [20,21,22,23]]
            ])
C

array([[[ 1,  2,  3,  4],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [62]:
C[1,1,2]

18

In [63]:
C[:,1:2,3:]

array([[[ 7]],

       [[19]]])

In [64]:
C[::-1, ::-1]

array([[[20, 21, 22, 23],
        [16, 17, 18, 19],
        [12, 13, 14, 15]],

       [[ 8,  9, 10, 11],
        [ 4,  5,  6,  7],
        [ 1,  2,  3,  4]]])

In [65]:
C[::-1, ::-1, ::-1]

array([[[23, 22, 21, 20],
        [19, 18, 17, 16],
        [15, 14, 13, 12]],

       [[11, 10,  9,  8],
        [ 7,  6,  5,  4],
        [ 4,  3,  2,  1]]])

## Copies vs Views

- A slicing operation creates a __view__ of the original array, which is a way of accessing array data. 
- The original array is not copied in memory. 
- Therefore, any change made to a sliced view also changes the original array.

In [66]:
B

array([[ 2,  3,  4],
       [12, 14, 15],
       [ 9, 10,  6]])

In [67]:
c = B[:2,1]
c

array([ 3, 14])

In [68]:
c[0] = 400 
B

array([[  2, 400,   4],
       [ 12,  14,  15],
       [  9,  10,   6]])

- If we do not want to change the original array when modifying a slice of it, the `copy` method should be used.

In [69]:
B = np.array([(2,3,4),
              (12,14,15),
              (9,10,6)])
c = B[:2,1].copy()
c[0] = 400
B

array([[ 2,  3,  4],
       [12, 14, 15],
       [ 9, 10,  6]])
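
If it is unclear whether a variable is a view or a copy, NumPy can tell us. The sketch below uses `np.shares_memory` and the `base` attribute; the variable names `view` and `copy` are chosen here for illustration.

```python
import numpy as np

B = np.array([(2, 3, 4),
              (12, 14, 15),
              (9, 10, 6)])

view = B[:2, 1]          # basic slicing returns a view
copy = B[:2, 1].copy()   # explicit copy with its own memory

print(np.shares_memory(B, view))  # True: the view uses B's data
print(np.shares_memory(B, copy))  # False: the copy has separate data
print(view.base is B)             # True: a view remembers the array it came from
print(copy.base is None)          # True: a copy owns its data
```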

## Broadcasting

- An arithmetic operation between two arrays is performed element-wise.
    - Say _i_ is used to index both array A and array B.
    - When adding A and B, the _i_-th element of A is added to the _i_-th element of B.


- However, when two arrays are of different sizes, not every element of one array has a matching element in the other to add to.


- When two arrays have different numbers of dimensions, they might hold the same number of elements but arranged along different dimensions.


- In these cases, NumPy uses a technique called __broadcasting__ to perform the arithmetic operation. So, it helps to know these rules:



__Rule 1:__ If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading side.

In [70]:
array_4x3 = np.random.randint(4,size=(4,3))
array_4x3

array([[2, 0, 1],
       [2, 2, 0],
       [1, 0, 1],
       [1, 0, 3]])

In [71]:
array_3 = np.array([1,2,3])
array_3 

array([1, 2, 3])

In [72]:
print('ndim of array_4x3 is', array_4x3.ndim)
print('ndim of array_3 is  ', array_3.ndim)

ndim of array_4x3 is 2
ndim of array_3 is   1


In [73]:
print('shape of array_4x3 is', array_4x3.shape)
print('shape of array_3 is  ', array_3.shape)

shape of array_4x3 is (4, 3)
shape of array_3 is   (3,)


These two arrays have different numbers of dimensions. However, if we align their shapes such that their trailing dimensions are aligned, i.e.
- `4 x 3`
- `    3`

we see that they have the same size on the aligned dimension. Thus, the shape of the array with fewer dimensions can be padded with a leading 1 to match the number of dimensions of the other array, and the operation can be performed.

In [74]:
print(array_4x3+array_3)
print("-----------------")
print(array_4x3*array_3)

[[3 2 4]
 [3 4 3]
 [2 2 4]
 [2 2 6]]
-----------------
[[2 0 3]
 [2 4 0]
 [1 0 3]
 [1 0 9]]
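
What NumPy does implicitly here can be written out by hand. The sketch below fixes the values of `array_4x3` to the ones printed above (the original cell draws them at random), and the names `padded` and `stretched` are introduced for illustration. `np.broadcast_to` shows the array that the smaller operand behaves like during the operation.

```python
import numpy as np

array_4x3 = np.array([[2, 0, 1],
                      [2, 2, 0],
                      [1, 0, 1],
                      [1, 0, 3]])
array_3 = np.array([1, 2, 3])

padded = array_3.reshape(1, 3)               # Rule 1: shape (3,) becomes (1, 3)
stretched = np.broadcast_to(padded, (4, 3))  # the length-1 axis is repeated 4 times

print(padded.shape)      # (1, 3)
print(stretched.shape)   # (4, 3)

# the implicit and the explicit version give the same result
same = np.array_equal(array_4x3 + array_3, array_4x3 + stretched)
print(same)              # True
```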


- In the following example, the sizes of the trailing dimensions do not match (3 vs 2), and therefore the arrays are not broadcast-compatible.

In [75]:
ar_5x4x3 = np.random.randint(10,size=(5,4,3))
ar_4x2 = np.random.randint(5,size=(4,2))
print('shape of ar_5x4x3 is', ar_5x4x3.shape)
print('shape of ar_4x2 is     ',ar_4x2.shape)

shape of ar_5x4x3 is (5, 4, 3)
shape of ar_4x2 is      (4, 2)


In [76]:
try:
    print(ar_5x4x3+ar_4x2)
except Exception as ex:
    print("!!!Error: ", ex)
print("--------------")
try:
    print(ar_5x4x3*ar_4x2)
except Exception as ex:
    print("!!!Error: ", ex)

!!!Error:  operands could not be broadcast together with shapes (5,4,3) (4,2) 
--------------
!!!Error:  operands could not be broadcast together with shapes (5,4,3) (4,2) 


__Rule 2:__ If the shapes of the two arrays do not match in some dimension, the array whose size is 1 in that dimension is stretched to match the other shape. If neither size is 1 (as in the case above), the arrays are not broadcast-compatible and NumPy raises an error.

In [77]:
ar_4 = np.random.randint(10,size=(4,1))
print('ndim of ar_4 is', ar_4.ndim)
print('shape of ar_4 is',ar_4.shape)
ar_4

ndim of ar_4 is 2
shape of ar_4 is (4, 1)


array([[3],
       [7],
       [1],
       [3]])

In [78]:
ar_4+array_3

array([[ 4,  5,  6],
       [ 8,  9, 10],
       [ 2,  3,  4],
       [ 4,  5,  6]])

Here is another example of arithmetic operation between two arrays of differing dimensions with some dimensions having a shape of 1.

In [79]:
P = np.random.randint(10,size=(5,1,4,1))
Q = np.random.randint(10,size=(3,1,2))
PQsum = P+Q

print('shape of P is    ', P.shape)
print('shape of Q is       ', Q.shape)
print('shape of PQsum is', PQsum.shape)

shape of P is     (5, 1, 4, 1)
shape of Q is        (3, 1, 2)
shape of PQsum is (5, 3, 4, 2)
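
The two rules can be turned into a few lines of Python that predict the shape of a result. This is a sketch for understanding, not NumPy's actual implementation: `broadcast_shape` is a hypothetical helper written for this example, and it is compared against NumPy's own `np.broadcast_shapes` (available in NumPy 1.20 and later).

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the two broadcasting rules to a pair of shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with ones on its leading side
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(a, b):
        # Rule 2: sizes must be equal, or one of them must be 1
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcast-compatible")
    return tuple(result)

print(broadcast_shape((5, 1, 4, 1), (3, 1, 2)))      # (5, 3, 4, 2)
print(np.broadcast_shapes((5, 1, 4, 1), (3, 1, 2)))  # (5, 3, 4, 2)
```

Calling `broadcast_shape((5, 4, 3), (4, 2))` raises a `ValueError`, matching the error NumPy reported for `ar_5x4x3 + ar_4x2`.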


To read more about broadcasting visit - https://www.pythonlikeyoumeanit.com/Module3_IntroducingNumpy/Broadcasting.html#A-Simple-Application-of-Array-Broadcasting

## Iteration
Broadcasting is useful but has limitations. While it is usually faster than a loop, that is not always the case. At other times, loops are the only option because the task cannot be expressed as a broadcast operation.

Time series data provide a good working example. Often, when predicting a stock price for the next day, we have to use the stock prices from previous days. In such cases we have to index elements of the same array and perform some operation on them. If we want to create a simple model that predicts today's stock price as the average of the prices over the last 5 days, we can write a for loop.

Below is an example of how to calculate stock price predictions as a 5-point window average based on 100 stock prices.

In [80]:
# create some stock price data 
stockprice = np.random.ranf(100)

# metadata 
N = stockprice.shape[0]
k = 5

# initialize array to store stock predictions
stockprediction = np.zeros(N-k)

# fill in stock prediction array with predictions:
# stockprediction[i] is the mean of days i .. i+k-1, i.e. the prediction for day i+k
for i in range(N-k):
    stockprediction[i] = stockprice[i:i+k].mean()
    
print(stockprediction)

[0.59760478 0.71633572 0.61484024 0.53700892 0.37696342 0.29579634
 0.22048961 0.2717744  0.33585744 0.34020351 0.41442515 0.36268028
 0.37834509 0.48395457 0.59491633 0.52965817 0.48758251 0.33907302
 0.26419842 0.19204747 0.24703801 0.41965425 0.4904984  0.55114045
 0.59880953 0.62112194 0.60112288 0.6253171  0.49037219 0.47428755
 0.46506971 0.37486285 0.36011672 0.43210663 0.39456034 0.3426192
 0.34714739 0.28408429 0.22679366 0.22815377 0.19118897 0.31591748
 0.37581923 0.46734687 0.48890936 0.66578469 0.48133926 0.42396446
 0.36918817 0.30157251 0.1614322  0.19437779 0.2911169  0.38672772
 0.49791795 0.49016025 0.62529867 0.54656635 0.41282858 0.39566538
 0.43260679 0.39904327 0.40289055 0.52402087 0.53864648 0.64480408
 0.58915796 0.70602025 0.62418276 0.70182642 0.63931412 0.60323506
 0.59159483 0.60945809 0.58348991 0.47352707 0.45583267 0.37415457
 0.43008141 0.38085541 0.44996963 0.42253585 0.54799428 0.4705846
 0.51622031 0.61494844 0.7762078  0.58475354 0.68120731 0.637722