In [1]:
#
import sys
sys.path.append("../data")
from mylib import  *
import io

<h3>Basic rules about OOP in Python</h3>

<h3> OOP in general</h3>

<b>1. Abstraction</b>

Abstraction hides the internal functionality of an application from the user. The user could be either the end client or other developers.

We can find abstraction in our daily lives. For example, you know how to use your phone, but you probably don’t know exactly what’s happening inside it each time you open an app.

Another example is Python itself. You know how to use it to build functional software, and you can do it even if you don’t understand Python’s inner workings.

Applying the same to code allows you to collect all the objects in a problem and abstract standard functionality into classes.

<b>2. Inheritance</b>

Inheritance allows us to define multiple subclasses from an already defined class.

Its primary purpose is to follow the DRY (don't repeat yourself) principle. You can reuse a lot of code by implementing all the shared components in superclasses.

You can think of it as the real-life concept of genetic inheritance. Children (subclass) are the result of inheritance between two parents (superclasses). They inherit all the physical characteristics (attributes) and some common behaviors (methods).

<b>3. Polymorphism</b>

Polymorphism lets subclasses slightly modify the methods and attributes previously defined in the superclass.

The literal meaning is “many forms.” That’s because we build methods with the same name but different functionality.

Going back to the previous idea, children are also a perfect example of polymorphism. They can inherit a defined behavior such as `get_hungry()` but perform it in a slightly different way, for instance, getting hungry every 4 hours instead of every 6.
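The inheritance and polymorphism ideas above can be sketched in a few lines. This is only an illustration: the class names `Parent`, `Child`, and `Baby`, and the attribute `hours_between_meals`, are made up for this example.

```python
class Parent:
    """Superclass: defines the shared attribute and behavior."""
    hours_between_meals = 6

    def __init__(self, name):
        self.name = name

    def get_hungry(self):
        return f"{self.name} gets hungry every {self.hours_between_meals} hours"


class Child(Parent):
    """Subclass: inherits __init__ and get_hungry, changes only the interval."""
    hours_between_meals = 4


class Baby(Parent):
    """Subclass that overrides the method itself (same name, different behavior)."""
    def get_hungry(self):
        return f"{self.name} cries when hungry"


# The same call works on every object; each class supplies its own form.
for person in (Parent("Li"), Child("Ming"), Baby("Bao")):
    print(person.get_hungry())
```

`Child` reuses all of the parent's code (inheritance), while `Baby` keeps the method name but changes what it does (polymorphism).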

<b>4. Encapsulation</b>

Encapsulation is the process in which we protect the internal integrity of data in a class.

Although Python has no `private` keyword, you can apply encapsulation through name mangling (a leading double underscore). Methods known as getters and setters give controlled access to protected attributes.

Let’s imagine a Human class that has a protected attribute named `_height`. You can modify this attribute only within certain constraints (it’s nearly impossible to be taller than 3 meters).
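One possible way to write the `Human` example is with Python's built-in `property` decorator, which plays the role of a getter and a setter. This is a minimal sketch; the constant `MAX_HEIGHT_M` and the exact validation rule are assumptions made for illustration.

```python
class Human:
    MAX_HEIGHT_M = 3.0  # assumed upper bound, taken from the text above

    def __init__(self, height):
        self.height = height  # goes through the setter below

    @property
    def height(self):
        """Getter: read access to the protected attribute."""
        return self._height

    @height.setter
    def height(self, value):
        """Setter: reject values outside the allowed range."""
        if not 0 < value <= self.MAX_HEIGHT_M:
            raise ValueError(f"height must be in (0, {self.MAX_HEIGHT_M}] meters")
        self._height = value


h = Human(1.75)
h.height = 1.80          # accepted
try:
    h.height = 3.5       # rejected by the setter
except ValueError as err:
    print("rejected:", err)
print(h.height)
```

The rejected assignment leaves the previous value untouched, so the object never holds an invalid height.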


<h3>OOP in Python</h3>


1. <b>self</b> must be the first parameter in a member function definition. When a member function is called with dot notation, the object before the dot is passed as <b>self</b>. 

2. Every instance variable must be prefixed with 'self.'. All of them are public unless the variable name starts with __, in which case Python mangles the name.

3. Do not access a class field directly; use <b>getter</b> and <b>setter</b> functions instead. The main purpose of getters and setters in object-oriented programs is to ensure data encapsulation. 

### Class Definition Syntax

A class definition looks like this:
<pre>
class ClassName:
    statement-1
    .
    .
    .
    statement-N
</pre>

Class definitions, like function definitions (def statements) must be executed before they have any effect.

In practice, the statements inside a class definition will usually be function definitions, but other statements are allowed, and sometimes useful.

$f_{self}(x)$

In [2]:
class MyFirstClass():
    m = 0
    n = []
    s = (1,2,3)
    def ftn(self):
        m=100
        return 'XMU'
    j = 1000

In [3]:
a =  MyFirstClass()
print(a.n)
print(a.m)
a.m = 10000
print(a.m)
a.n.extend([1,4,5,7])
a.n
a.ftn()

a.s = (4,5)

[]
0
10000


In [4]:
b =  MyFirstClass()
print(b.n)
print(b.m)

print(b.s)

[1, 4, 5, 7]
0
(1, 2, 3)


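`n` is a mutable object (a list). In Python, class variables (such as `m` and `n` in this example) are shared by all instances. If the class variable is of an immutable type (such as an integer or a string), assigning to it through an instance, as in `a.m = 10000`, creates a new instance attribute that takes effect only for that instance. If it is of a mutable type (such as a list), all instances share a reference to the same object, so an in-place change like `a.n.extend(...)` is visible through every instance.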
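The difference between assigning through an instance and mutating a shared object in place can be checked directly. The class `Counter` below is a made-up stand-in for `MyFirstClass`, kept small for illustration.

```python
class Counter:
    total = 0      # immutable class variable
    items = []     # mutable class variable


a, b = Counter(), Counter()

a.total = 5            # assignment: creates an instance attribute on a only
a.items.append("x")    # in-place mutation: changes the one shared list

print(a.total, b.total, Counter.total)          # a differs; b and the class do not
print(b.items)                                  # b sees the item appended through a
print("total" in vars(a), "total" in vars(b))   # only a has its own 'total'
```

`vars(obj)` returns the instance's own attribute dictionary, which makes it easy to see which names live on the instance and which are looked up on the class.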

In [5]:
a.i=100

In [6]:
b = MyFirstClass()

In [17]:
b.i  # an attribute added to instance a is not available on another instance

AttributeError: 'MyFirstClass' object has no attribute 'i'

In [7]:
a.j

1000

In [8]:
a.n = 1
b.n

[1, 4, 5, 7]

Here *MyFirstClass.j* (or *MyFirstClass.m*) and *MyFirstClass.ftn* are valid attribute references, returning an integer and a function object, respectively.

Class instantiation uses function notation. Just pretend that the class object is a parameterless function that returns a new instance of the class. For example (assuming the above class):

In [9]:
#creates a new instance of the class and assigns this object to the variable 'a'
a = MyFirstClass() 
print(a.n,a.j)
print(a.ftn())
a.m

[1, 4, 5, 7] 1000
XMU


0

In [10]:
a.i # an error

AttributeError: 'MyFirstClass' object has no attribute 'i'

In [11]:
a.i = 100
print(a.i)

100


The instantiation operation (“calling” a class object) creates an empty object. Many classes like to create objects with instances customized to a specific initial state. Therefore a class may define a special method named `__init__()`, like this:

In [12]:
def __init__(self):
    self.int = 10

When a class defines an `__init__()` method, class instantiation automatically invokes `__init__()` for the newly created class instance. The function defined above sits outside any class, so it is not invoked automatically; it only takes effect when it is called explicitly:

In [13]:
a.int

AttributeError: 'MyFirstClass' object has no attribute 'int'

In [14]:
__init__(a)

In [15]:
a.int

10
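For comparison, when `__init__` is defined inside the class body, instantiation calls it automatically. A minimal sketch, using a hypothetical class named `Sample`:

```python
class Sample:
    def __init__(self):
        # runs automatically each time Sample() is called
        self.int = 10


s = Sample()
print(s.int)   # the attribute exists without any explicit __init__ call
```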

The `__init__()` method may have arguments for greater flexibility. In that case, arguments given to the class instantiation operator are passed on to `__init__()`. For example,

In [16]:
x = Complex(1.0,3.6)
print(x.r,x.i)

NameError: name 'Complex' is not defined

In [17]:
class Complex:
    def __init__(anyName, realpart, imagpart):
        anyName.r = realpart
        anyName.i = imagpart
    def ftn(anyName,a):
        print(a)
y = Complex(100.0,30.6)

In [18]:
(y.ftn(1000))

1000


In [19]:
y.ftn(10)

10


#### Method Objects

A method is a function that “belongs to” an object.

In [None]:
class MyFirstClass():
    n = 100
    def ftn(self):
        return 'XMU'
    j = 1000

a = MyFirstClass()

Valid method names of an instance object depend on its class. By definition, all attributes of a class that are function objects define corresponding methods of its instances. So in our example, *MyFirstClass.ftn* is a valid method reference, since *MyFirstClass.ftn* is a function, but *a.n* is not, since *MyFirstClass.n* is not. But *a.ftn* is not the same thing as *MyFirstClass.ftn* — it is a method object, not a function object.
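The distinction between the function object on the class and the method object on the instance can be inspected directly. The sketch below repeats a trimmed copy of `MyFirstClass` so that it runs on its own.

```python
class MyFirstClass:
    n = 100

    def ftn(self):
        return 'XMU'


a = MyFirstClass()

print(type(MyFirstClass.ftn).__name__)      # a plain function
print(type(a.ftn).__name__)                 # a bound method
print(a.ftn.__self__ is a)                  # the method remembers its instance
print(a.ftn.__func__ is MyFirstClass.ftn)   # and wraps the original function
print(a.ftn() == MyFirstClass.ftn(a))       # both calls are equivalent
```

Calling `a.ftn()` is therefore shorthand for `MyFirstClass.ftn(a)`: the instance is supplied as the first argument, `self`.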

#### Class and Instance Variables

Generally speaking, instance variables are for data unique to each instance and class variables are for attributes and methods shared by all instances of the class:

In [20]:
class University:
    s = 'students'         # class variable shared by all instances
    l = 'lib'
    room = "classroom"
    def __init__(self,name,n):
        
        self.name = name
        self.n = n    # instance variable unique to each instance

XMU = University('Xiamen university',5000)
THU = University('Tsinghua University',3500)
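print(XMU.s,XMU.name)                  # s is shared by all universities; name is unique to XMU
print(THU.s,THU.name)                  # s is shared by all universities; name is unique to THU
print(XMU.n,XMU.room)                  # n is unique to XMU; room is shared
print(THU.n,THU.room)                  # n is unique to THU; room is shared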

students Xiamen university
students Tsinghua University
5000 classroom
3500 classroom


Shared data can have surprising effects when it involves mutable objects such as lists and dictionaries. For example, the `department` list in the following code should not be used as a class variable, because a single list would be shared by all `University` instances:

In [26]:
class University:
    department = []         # class variable shared by all instances
    m = 10
    def __init__(self, n):
        self.n = n    # instance variable unique to each instance
    def addDepart(self,newDepartment):
        self.department.append(newDepartment)
    j = 100
XMU = University(5000)
THU = University(3500)
XMU.addDepart('Department of Statistics and Data Science')
THU.addDepart('Center for Statistical Science')
XMU.department

['Department of Statistics and Data Science', 'Center for Statistical Science']

In [28]:
XMU.m=10000
XMU.m

10000

Correct design of the class should use an instance variable instead:

In [32]:
class University:
    def __init__(self,n):
        self.n = n    
        self.department = []
    def addDepart(self,newDepartment):
        self.department.append(newDepartment)
XMU = University(5000)
THU = University(3500)
XMU.addDepart('Department of Statistics and Data Science')
THU.addDepart('Center for Statistical Science')
print(XMU.department)
print(THU.department)

['Department of Statistics and Data Science']
['Center for Statistical Science']


In [None]:
# because each instance's list has already been initialized in __init__

#### Inheritance

Benefits of inheritance are:
  - It represents real-world relationships well.
  - It provides code reusability. We don’t have to write the same code again and again. Also, it allows us to add more features to a class without modifying it.
  - It is transitive in nature, which means that if class B inherits from another class A, then all the subclasses of B would automatically inherit from class A.

Inheritance syntax:
<pre>
class BaseClass():
    statements
class DerivedClass(BaseClass):
    statements
</pre>

In [1]:
# parent class
class UniverLibrary():
    def __init__(self,bookName,binary):
        self.name = bookName
        self.borrowed = binary
    def isBorrowed(self):
        return "Is " + self.name + " borrowed? " + self.borrowed +"."
    def test(self):
        return "This is a test for Parent class"

myLib = UniverLibrary('The Old Man and Sea',"Yes")

In [2]:
myLib.isBorrowed()    

'Is The Old Man and Sea borrowed? Yes.'

In [3]:
class SchoolLibrary(UniverLibrary):  # pass the parent class to the child class
    def __init__(self,bookName,binary,x):
        UniverLibrary.__init__(self,bookName,binary) # call the parent class initializer
        self.x = x
    def test(self):
        return "This is a test for Child class"

In [4]:
sLib = SchoolLibrary('Deep Learning','No',100)
sLib.isBorrowed()

'Is Deep Learning borrowed? No.'

In [5]:
sLib.test()

'This is a test for Child class'
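The explicit call `UniverLibrary.__init__(self, ...)` can also be written with the built-in `super()`, which looks up the parent class automatically. The sketch below is a trimmed variant of the classes above; the extended `test` method is an addition made only to show that `super()` works for ordinary methods too.

```python
class UniverLibrary:
    def __init__(self, bookName, binary):
        self.name = bookName
        self.borrowed = binary

    def test(self):
        return "This is a test for Parent class"


class SchoolLibrary(UniverLibrary):
    def __init__(self, bookName, binary, x):
        super().__init__(bookName, binary)   # same effect as UniverLibrary.__init__(self, ...)
        self.x = x

    def test(self):
        # extend the parent's result instead of replacing it
        return super().test() + " (called from Child class)"


sLib = SchoolLibrary('Deep Learning', 'No', 100)
print(sLib.name, sLib.borrowed, sLib.x)
print(sLib.test())
```

With `super()`, the parent's name appears only in the class header, so renaming the parent requires a change in one place.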

Multiple inheritance: When a child class inherits from more than one parent class, it is called multiple inheritance. 

In [12]:
class Base1():
    def __init__(self):
        self.str1 = "Python one"
        print("-----Base1-------")
 
 
class Base2():
    def __init__(self):
        self.str2 = "Python two"
        print("Base2")
 
ob1 = Base1()
ob2 = Base2()

-----Base1-------
Base2


In [11]:
class Derived(Base1,Base2):
    def __init__(self):
        Base1.__init__(self)
        Base2.__init__(self)
        print("Derived")
    def printStrs(self):
        print(self.str1, self.str2)
ob = Derived()
print('-----------------------')
ob.printStrs()

-----Base1-------
Base2
Derived
-----------------------
Python one Python two
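When two parents define a method with the same name, Python decides which one to use by following the method resolution order (MRO), which is available as the `__mro__` attribute of the class. A small sketch, with a made-up `hello` method added to both bases:

```python
class Base1:
    def hello(self):
        return "hello from Base1"


class Base2:
    def hello(self):
        return "hello from Base2"


class Derived(Base1, Base2):
    pass


# Parents are searched left to right, in the order given in the class header.
print([cls.__name__ for cls in Derived.__mro__])
print(Derived().hello())
```

Because `Base1` is listed first, its `hello` is the one that `Derived` inherits.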


Multilevel inheritance: When we have a child and grandchild relationship, that is, a class inherits from a class that is itself a child of another class.

In [13]:
class Base(object): # in Python 3, equivalent to class Base:
    
    def __init__(self, name):
        self.name = name
 
    # To get name
    def getName(self):
        return self.name
 
 
# Inherited class
class Child(Base):
 
    def __init__(self, name, age):
        Base.__init__(self, name)
        self.age = age
 
    # To get age
    def getAge(self):
        return self.age
    
class GrandChild(Child):  # passing only the nearest parent is enough

    def __init__(self, name, age, address):
        Child.__init__(self, name, age)
        self.address = address
 
    # To get address
    def getAddress(self):
        return self.address
 
g = GrandChild("Xiao ming", 23, "Xiamen")
print(g.getName(), g.getAge(), g.getAddress())

Xiao ming 23 Xiamen


**Private members of the parent class**

We don’t always want the instance variables of the parent class to be directly available in the child class. We can make an instance variable of the parent class private by adding two underscores before its name. Python then mangles the name (for example, `__y` in class `Base` becomes `_Base__y`), so the plain name is not accessible from outside the class or from a child class. For example,

In [5]:
class Base():
    def __init__(self,a,b):
        self.x = a
        self.__y = b**2 # y is private instance variable
        
baseClass = Base(10,5) # instantiate
print(baseClass.x)
print(baseClass.y) # error: y cannot be accessed outside the class

10


AttributeError: 'Base' object has no attribute 'y'

In [19]:
baseClass._Base__y

25

In [21]:
class Child(Base):
    def __init__(self,a,b,c):
        self.z = c
        Base.__init__(self,a,b)
 
childClass = Child(10,5,30)
 
print('value of z:',childClass.z)
# print('value of y:',childClass.y)  # would raise AttributeError, as 'y' is a private instance variable
print(childClass._Base__y) # the mangled name from the parent class is required

value of z: 30
25


quiz:
<pre>
class Pa():
    def __init__(self,n):
        self.n = n
    def ftn(self,m):
        return(self.n + m)
    def ftn1(self,j):
        return(self.n+j)
class Child(Pa):
    def __init__(self,n,k):
        Pa.__init__(self,k)
        self.n = n
    def ftn1(self,l):
        print(l)
s = Child(100,200)
s.n
s.ftn(10)
s.ftn1(20)
</pre>

In [69]:
class Pa():
    def __init__(self,n):
        self.n = n
    def ftn(self,m):
        return(self.n + m)
    def ftn1(self,j):
        return(self.n+j)

class Child(Pa):
    def __init__(self,n,k):
        Pa.__init__(self,k)
        self.n = n
    def ftn1(self,l):
        return l
    
class Grand(Child):
    def __init__(self,m,n,k):
        Child.__init__(self,n,k)
        self.n = n # n is assigned here, so it does not depend on the nested parent initializers
    def ftn1(self,l):
        return l  
    
s = Grand(1,1000,200)
print(s.n)
print(s.ftn(10))
print(s.ftn1(20))

1000
1010
20


## More examples

In [8]:
### some support function ###
def list_to_string(data):    
    return '  '.join([str(x) for x in data]) # join the string forms of the items, separated by two spaces

def Seq(n):
    return [i for i in range(n)] 

def ravel(obj):
    return obj if type(obj)==list else obj.data # return a list as-is; otherwise return the object's .data attribute


In [28]:
list_to_string([0,1,2,3])
Seq(6)
#ravel()

[0, 1, 2, 3, 4, 5]

In [31]:
class Array(object):
    def __init__(self):
        print("Array object")
        
a = Array()
a

Array object


<__main__.Array at 0x11b8abd10>

In [32]:
import numpy as np
np.array([i for i in range(5)])

array([0, 1, 2, 3, 4])

In [10]:
def ravel(obj):
    return obj if type(obj)==list else obj.data

class Array:
    def __init__(self, lst):
        self.data = ravel(lst)
    
    def Add(self, aryb):
        return Array([x+y for x,y in zip(self.data, ravel(aryb))])
    
    def __repr__(self):
        return "---Array----\n"+list_to_string(self.data)+"\n------------\n\n"
    
a = Array([1,2,3,4,5])
b = Array([10,20,30,40,50,60])

#print(a.Add(b).data)
c = a.Add(b)
print(c)
print(c.Add(a))
#a = Array([1,2,3,4,5])
#b = [10,20,30,40,50]
#a.Add(b)
d = Array("1,2,3")

---Array----
11  22  33  44  55
------------


---Array----
12  24  36  48  60
------------




AttributeError: 'str' object has no attribute 'data'

In [39]:
print(a)
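# In Python, __repr__ is a special method that defines the "official" string representation of an object.
# print(a) calls str(a); because Array does not define __str__, Python falls back to __repr__ and prints the string it returns.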

d = Array(a)
d.data

---Array----
1  2  3  4  5
------------




[1, 2, 3, 4, 5]

In [98]:
a

---Array----
1  2  3  4  5
------------


Note: `repr()` is a Python built-in function; when called, it invokes the object's `__repr__()` method. It is useful for debugging or for inspecting an object. It takes one object as its argument and returns a string representation of it, ideally one that could be used to recreate the object. See <a href="https://www.pythontutorial.net/python-oop/python-__repr__/">this for the usage of `__repr__`</a>.
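The relationship between `repr()`, `str()`, and `print()` can be seen in a short sketch. The classes `Temperature` and `OnlyRepr` are hypothetical and serve only to illustrate the lookup rules.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # unambiguous form, aimed at developers
        return f"Temperature({self.celsius!r})"

    def __str__(self):
        # readable form, used by print() and str()
        return f"{self.celsius} C"


class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"


t = Temperature(21.5)
print(repr(t))      # uses __repr__
print(t)            # uses __str__
print([t])          # containers show their elements with __repr__
print(OnlyRepr())   # no __str__ defined, so print falls back to __repr__
```

This fallback is why the `Array` class in this notebook only needs `__repr__` for both `print(a)` and the bare `a` at the end of a cell.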

In [105]:
class Array:
    def __init__(self, lst):
        self.data=ravel(lst)
    
    def Sum(self):
        return sum(self.data)
    
    def Add(self, aryb):
        #assert type(aryb)==Array, 'Invalid type'
        aryb=ravel(aryb)
        return Array([x+y for x,y in zip(self.data, aryb)])

    def Len(self):
        return len(self.data)

    def __repr__(self):
        return "-------\n"+list_to_string(self.data)+"\n-------\n\n"
    
a=Array([1,2,3,4,5])
print(a.Sum())
b=Array([10,11,12,13,14])
print(a.Add(b))  # Add returns a new Array instance; printing it calls __repr__
print(a.Add([2,2,2]))
print(a.Len())
print(a.Sum())

15
-------
11  13  15  17  19
-------


-------
3  4  5
-------


5
15


In [116]:
a = np.array([i for i in range(9)])
a[3]


3

In [11]:
class Array:
    def __init__(self, lst):
        self.data=ravel(lst)
    
    def Sum(self):
        return sum(self.data)
    
    def Add(self, aryb):
        #assert type(aryb)==Array, 'Invalid type'
        aryb=ravel(aryb)
        return Array([x+y for x,y in zip(self.data, aryb)])

    def Len(self):
        return len(self.data)

    def __repr__(self):
        return "-------\n"+list_to_string(self.data)+"\n-------\n\n"
    
    def __getitem__(self, sl):  # slice, int
        return Array(self.data[sl])            
            
    def subset(self, sl):
        return Array(self.data[sl])            
    
a=Array([1,2,3,4,5])
print(a.Sum())
b=Array([10,11,12,13,14])
print(a.Add(b))
print(a.Add([2,2,2]))
print(a.Len())
print(a.Sum())
print(a)
print(a.subset(slice(1,3)))
a[1:3], a[:], a[2:] 
a[4] # error: an int index returns a non-list value, which ravel cannot handle
a[[1,2,4]]

15
-------
11  13  15  17  19
-------


-------
3  4  5
-------


5
15
-------
1  2  3  4  5
-------


-------
2  3
-------




AttributeError: 'int' object has no attribute 'data'

Note: `__getitem__` takes the index (or key) written inside the square brackets as its argument and returns the item associated with it. 
See <a href="https://www.zhihu.com/tardis/zm/art/27661382?source_id=1005">this for the usage of `__getitem__`</a>. 
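To see exactly what Python passes to `__getitem__` for each kind of subscript, the method can simply return its argument. The class `ShowKey` is a throwaway helper written for this illustration.

```python
class ShowKey:
    def __getitem__(self, key):
        # return the key itself so we can inspect what Python passed in
        return key


k = ShowKey()
print(k[3])          # an int
print(k[1:3])        # a slice object: slice(1, 3, None)
print(k[[1, 2, 4]])  # a list
print(k[2:, 1])      # a tuple: (slice(2, None, None), 1)
```

A comma inside the brackets produces a tuple, which is what makes two-dimensional selections such as `mx[rows, cols]` possible.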

In [126]:
class Array:
    def __init__(self, lst):
        self.data=ravel(lst)
    
    def Sum(self):
        return sum(self.data)
    
    def Add(self, aryb):
        #assert type(aryb)==Array, 'Invalid type'
        aryb=ravel(aryb)
        return Array([x+y for x,y in zip(self.data, aryb)])

    def Len(self):
        return len(self.data)

    def __repr__(self):
        return "-------\n"+list_to_string(self.data)+"\n-------\n\n"
    
    def __getitem__(self, sl):  # slice, int, or list
        print(type(sl))
        if type(sl)==slice:
            return Array(self.data[sl])            
        elif type(sl)==list:
            return Array([self.data[i] for i in sl])
        elif type(sl)==int:  
            return Array([self.data[sl]]) # wrap the item in a list, since Array must be given a list
        else: 
            raise TypeError("selection by "+str(sl)+" is not supported")
            

a=Array([1,2,3,4,5])
print(a.Sum())
b=Array([10,11,12,13,14])
print(a.Add(b))
print(a.Add([2,2,2]))
print(a.Len())
print(a.Sum())
print(a[1:])
print(a[[3,2]])
print(a[3])
#print(a['3':4])  # invalid


15
-------
11  13  15  17  19
-------


-------
3  4  5
-------


5
15
<class 'slice'>
-------
2  3  4  5
-------


<class 'list'>
-------
4  3
-------


<class 'int'>
-------
4
-------




In [113]:
#Note:
#The raise keyword raises an error and stops the control flow of the program. 
b = 3
if b % 2 == 1:
    raise Exception("The number shouldn't be an odd integer")

Exception: The number shouldn't be an odd integer
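An exception raised with `raise` stops the program only if nothing handles it. A caller can catch it with `try`/`except` and carry on. A minimal sketch, where the function name `check_even` and the choice of `ValueError` are assumptions made for this example:

```python
def check_even(b):
    if b % 2 == 1:
        raise ValueError("The number shouldn't be an odd integer")
    return b


for value in (4, 3):
    try:
        print("accepted:", check_even(value))
    except ValueError as err:
        print("rejected:", err)

print("the program continues after the handled error")
```

Raising a specific exception type such as `ValueError` or `TypeError`, rather than the generic `Exception`, lets callers catch only the errors they know how to handle.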

In [45]:
import pandas as pd
p = pd.Series([1,3,4],index=[2,5,6])
p.index

Index([2, 5, 6], dtype='int64')

In [47]:
class Series(Array):
    def __init__(self, lst, Index=None):
        self.data=ravel(lst)
            
        if Index is None:
            self.index=[i for i in range(len(lst))] 
        else:
            self.index=ravel(Index)
        
    def Index(self):
        return self.index
    
    def __repr__(self):
        s='-----\n'
        for idx, x in zip(self.index, self.data):
            s+= str(idx)+'  '+str(x)+'\n'
        return s+'-----\n'
    
    # add  name to series
a = [1,2,3,7,4,5]
b = [4,6,8,1,2,5]
s1 = Series(a,b)
print(s1)
print(s1.Len())
print(s1.Sum())
print(s1[:4])
print(s1[[2,3,1]])


-----
4  1
6  2
8  3
1  7
2  4
5  5
-----

6
22
-------
1  2  3  7
-------




TypeError: list indices must be integers or slices, not list

In [49]:
### homework ## a <- matrix(1:10,ncol=5)
class Matrix(Array):
    def __init__(self, lst, nrows, ncols):
        self.data=ravel(lst)
        assert nrows*ncols==len(self.data), "Unmatched cells and the dimensions"
        self.nrows=nrows
        self.ncols=ncols
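    # assert condition, "error message"
    # condition: an expression; if it evaluates to False, the assertion fails.
    # error message (optional): when the assertion fails, an AssertionError is raised with this message.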
    
    #def Cell(self, i,j):
     #   return self.data[i*self.ncols+j]
    
    def __repr__(self):
        s=''
        for i in range(self.nrows):
            s+='|' 
            s+= list_to_string(self.data[i*self.ncols:(i*self.ncols+self.ncols)])
            s+='|\n'
        return s

    # def Reshape    # def To_Array
    # def MatMultiply
    # def I
    
mx=Matrix([1,2,3,4,5,6,7,8], 2,4)
print(mx)

mx=Matrix(Array([1,2,3,4,5,6,7,8]), 4,2)
print(mx)

print(mx.Sum())  #inherited from the Array
print(mx.Len())  #inherited from the Array
print(mx[:6])

|1  2  3  4|
|5  6  7  8|

|1  2|
|3  4|
|5  6|
|7  8|

36
8
-------
1  2  3  4  5  6
-------




In [50]:
def getIndex(sl, N=-1):
    sltype=type(sl)
    #print(sltype, sl)
    if sltype == slice:
        if sl.stop is not None: # the stop position of the slice
            return range(sl.stop)[sl]
        else:
            assert N>0, 'Unknown selection range'
            return range(N)[sl]       
    elif sltype == list:
        return sl
    elif sltype == int:  
        return [int(sl)]
    else: 
        raise TypeError("selection by "+str(sl)+" is not supported")

print(getIndex(slice(2, 6, None)))
print(getIndex(slice(2, None, None),10))
print(getIndex([3,2,1]))
print(getIndex(2))

range(2, 6)
range(2, 10)
[3, 2, 1]
[2]
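As an alternative to handling `sl.stop` by hand, slice objects have a built-in `indices(length)` method that resolves `None` and negative values against a given sequence length. The helper `slice_to_range` below is a hypothetical variant, not a replacement for the `getIndex` function used in the rest of this notebook.

```python
def slice_to_range(sl, N):
    """Convert a slice into the range of positions it selects in a sequence of length N."""
    start, stop, step = sl.indices(N)
    return range(start, stop, step)


print(list(slice_to_range(slice(2, 6), 10)))           # [2, 3, 4, 5]
print(list(slice_to_range(slice(2, None), 10)))        # [2, 3, 4, 5, 6, 7, 8, 9]
print(list(slice_to_range(slice(None, None, -1), 4)))  # [3, 2, 1, 0]
print(list(slice_to_range(slice(-3, None), 10)))       # [7, 8, 9]
```

Unlike `range(sl.stop)[sl]`, this form always needs the length `N`, but in exchange it clips out-of-range bounds and supports negative indices.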


In [133]:
### modify the __getitem__ with getIndex function
class Array:
    def __init__(self, lst):
        self.data=ravel(lst)
    
    def Sum(self):
        return sum(self.data)
    
    def Add(self, aryb):
        #assert type(aryb)==Array, 'Invalid type'
        aryb=ravel(aryb)
        return Array([x+y for x,y in zip(self.data, aryb)])
    
    def Len(self):
        return len(self.data)

    def __repr__(self):
        return "-------\n"+list_to_string(self.data)+"\n-------\n\n"
    
    def __getitem__(self, sl):  # slice, int, or list
        idx=getIndex(sl, len(self.data))
        return Array([self.data[i] for i in idx])


a=Array([1,2,3,4,5])
print(a.Sum())
b=Array([10,11,12,13,14])
print(a.Add(b))
print(a.Add([2,2,2]))
print(a.Len())
print(a.Sum())
print(a[1:]) # slicing calls __getitem__
print(a[[3,2]])



15
-------
11  13  15  17  19
-------


-------
3  4  5
-------


5
15
-------
2  3  4  5
-------


-------
4  3
-------




In [134]:
### now we are ready to add matrix slicing
class Matrix(Array):
    def __init__(self, lst, nrows, ncols):
        self.data=ravel(lst)
        assert nrows*ncols==len(self.data), "Unmatched cells and the dimensions"
        self.nrows=nrows
        self.ncols=ncols
        
    def Cell(self, i,j):
        return self.data[i*self.ncols+j]
    
    def __repr__(self):
        s=''
        for i in range(self.nrows):
            s+='|' 
            s+= list_to_string(self.data[i*self.ncols: i*self.ncols+self.ncols])
            s+='|\n'
        return s

    # def Reshape    # def To_Array
    # def MatMultiply
    # def I
    
    def __getitem__(self, sls):  # slice, int, or list
        ridx=getIndex(sls[0], self.nrows) # use getIndex (defined earlier) to convert the row selector sls[0] and the column selector sls[1] into integer indices
        cidx=getIndex(sls[1], self.ncols)
        data=[]
        for i in ridx:
            for j in cidx:
                data.append(self.data[i*self.ncols+j]) # append the cells one by one to build the new Matrix
        return Matrix(data, len(ridx), len(cidx))
            
mx=Matrix([1,2,3,4,5,6,7,8,9,8,7,6], 3,4)
print(mx)

mx=Matrix(Array([1,2,3,4,5,6,7,8,9,8,7,6]), 4,3)
print(mx)

print(mx.Sum())  #inherited from the Array
print(mx.Len())  #inherited from the Array
print(mx[[0,2],[0,2]])

|1  2  3  4|
|5  6  7  8|
|9  8  7  6|

|1  2  3|
|4  5  6|
|7  8  9|
|8  7  6|

66
12
|1  3|
|7  9|



In [140]:
import io
class DataTable(Series):
    def __init__(self, SeriesList, rowIndex=None, colIndex=None):
        self.data=[]
        # need to check the size of each series
        len0 = -1
        for S in SeriesList:
            s=ravel(S)
            if len0<0:
                len0 = len(s)
                if rowIndex==None:
                    self.rowIndex=Seq(len0)
                else:
                    self.rowIndex=ravel(rowIndex)
            assert len(s)==len0, "Column in data table must be the same length"
            if type(S)==Series: assert self.rowIndex==S.index, "Column must have same index"
            self.data.append(s)

        if colIndex==None:
            self.colIndex=['Col'+str(i) for i in range(len(SeriesList))]  # generate default column names
        else:
            self.colIndex=ravel(colIndex)
        
        self.nrows=len(self.rowIndex)
        self.ncols=len(self.colIndex)

    @property # https://www.geeksforgeeks.org/python-property-function/
    def Shape(self):
        return (self.nrows, self.ncols)
    
    def __repr__(self):
        s= '   '+' '.join(self.colIndex)
        s+='\n---------------------------\n'
        for i in range(self.nrows):
            s+=str(self.rowIndex[i])+'   '
            for j in range(self.ncols):
                s+='   '+str(self.data[j][i])
            s+='\n'
        s+='---------------------------\n'
        return s
     
print(b)  
a = Array([1,2,3,4,5])
dt=DataTable([a,b, a.Add(b), a.Add(a)])
print(dt)

-------
10  11  12  13  14
-------


   Col0 Col1 Col2 Col3
---------------------------
0      1   10   11   2
1      2   11   13   4
2      3   12   15   6
3      4   13   17   8
4      5   14   19   10
---------------------------



Note: the `assert` statement tests whether a condition holds and raises an `AssertionError` if it does not. It is widely used to catch problems while debugging Python code.
https://pythongeeks.org/assertion-in-python/
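A short sketch of `assert` in action, reusing the dimension check from the `Matrix` class. The function name `make_matrix_shape` is made up for this example.

```python
def make_matrix_shape(data, nrows, ncols):
    # the same kind of check the Matrix class performs in __init__
    assert nrows * ncols == len(data), "Unmatched cells and the dimensions"
    return (nrows, ncols)


print(make_matrix_shape([1, 2, 3, 4, 5, 6], 2, 3))   # condition holds

try:
    make_matrix_shape([1, 2, 3, 4, 5], 2, 3)         # 2*3 != 5, so the assertion fails
except AssertionError as err:
    print("AssertionError:", err)
```

Assertions are skipped when Python runs with the `-O` option, so they suit internal sanity checks during development; input that must always be validated is better handled with an explicit `if` and `raise`.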

In [141]:
#what will happen if the __repr__ is not defined

In [142]:
dt=DataTable([a,b, a.Add(b)], colIndex=['x','y','z'])
print(dt)

   x y z
---------------------------
0      1   10   11
1      2   11   13
2      3   12   15
3      4   13   17
4      5   14   19
---------------------------



In [144]:
dt=DataTable([[1,2,3],[4,5,6], [22,33,44]], colIndex=['x','y','z'], rowIndex=[100,200,300])
print(dt)
#print(dt.Sum())   ## will be an error

   x y z
---------------------------
100      1   4   22
200      2   5   33
300      3   6   44
---------------------------



In [None]:
## try to mimic the iloc[] function
class iLoc():
    def __init__(self, obj):
        self.obj=obj
        
    def __getitem__(self, slc):
        return self.obj.subset(slc)

class DataTable(Series):
    def __init__(self, SeriesList, rowIndex=None, colIndex=None):
        self.iLoc = iLoc(self)
        self.data = []
        # need to check the size of each series
        len0 = -1
        for S in SeriesList:
            s = ravel(S)
            if len0 < 0:
                len0 = len(s)
                if rowIndex == None:
                    self.rowIndex = Seq(len0)
                else:
                    self.rowIndex = ravel(rowIndex)
            assert len(s) == len0, "Column in data table must be the same length"
            if type(S) == Series: assert self.rowIndex == S.index, "Column must have same index"
            self.data.append(s)
            

        if colIndex == None:
            self.colIndex=['Col'+str(i) for i in range(len(SeriesList))] 
        else:
            self.colIndex=ravel(colIndex)
        
        self.nrows=len(self.rowIndex)
        self.ncols=len(self.colIndex)
        
    @property
    def Shape(self):
        return (self.nrows, self.ncols)
    
    def __repr__(self):
        s= '       '+'   '.join(self.colIndex)
        s+='\n---------------------------\n'
        for i in range(self.nrows):
            s+=str(self.rowIndex[i])+'   '
            for j in range(self.ncols):
                s+='   '+str(self.data[j][i])
            s+='\n'
        s+='---------------------------\n'
        return s
     
    def subset(self, slc):  ## callable from other object
        ridx=getIndex(slc[0], self.nrows)
        cidx=getIndex(slc[1], self.ncols)
        return DataTable([[self.data[j][i] for i in ridx] for j in cidx], 
                         rowIndex=[self.rowIndex[i] for i in ridx], 
                         colIndex=[self.colIndex[j] for j in cidx])

    def __getitem__(self, slc):
        return self.subset(slc)
    

                
dt=DataTable([a, b, a.Add(b), a.Add(a), a.Add(b).Add(b)], colIndex=['A','B','C','D','E'])  # one name per column; with only four names the fifth column would never be displayed
print(dt)

print(dt[2:,2:])
print(dt.iLoc[:,1:])
print(dt.iLoc[:2,1:])
print(dt.iLoc[1:,2:])