<small><small><i>
**Python Lectures**: including RegEx in Python, Regular Expression Pattern and Applications, and Classes.
</i></small></small>

# RegEx in Python

## Regular Expression
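**Preface:**<br>
* The earliest character encoding in wide use was ASCII (American Standard Code for Information Interchange). It covers the 10 digits, the 26 uppercase and 26 lowercase English letters, and a set of punctuation and control characters. ASCII is a 7-bit code with 128 characters; it is normally stored one character per byte, and 8-bit extensions allow at most 256 characters.
* As computing spread, every writing system needed an encoding. Common ones include UTF-8, GB2312, GBK, and CP936.
* Different encodings may write different bytes to a file for the same character.
* UTF-8 is the most widely used international encoding. It is variable-length: ASCII characters take 1 byte (so UTF-8 is ASCII-compatible), most Chinese characters take 3 bytes, and other characters take between 2 and 4 bytes. It can encode every Unicode character.
* GB2312 is a Chinese national standard that uses 1 byte for English characters and 2 bytes for Chinese characters.
* GBK is an extension of GB2312.
* CP936 is Microsoft's code page based on GBK.
* GB2312, GBK, and CP936 all use 2 bytes per Chinese character, whereas UTF-8 typically uses 3.
* Unicode is the common character set that makes conversion between these encodings possible.
* On Chinese-language Windows systems the console typically uses GBK (CP936). The encoding of a Python source file is declared with a `coding` comment on the first or second line (Python 3 assumes UTF-8 when no declaration is present), for example: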
~~~ python
#coding=utf-8
#coding:GBK
#-*-coding:utf-8 -*-
~~~
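
The byte counts mentioned above can be checked directly by encoding the same text with different codecs. The following is a minimal sketch that relies only on the built-in `str.encode` and `bytes.decode`; the sample strings are arbitrary illustrations.

```python
# Compare how many bytes each codec needs for the same characters.
samples = ["A", "中", "中文abc"]
for text in samples:
    utf8 = text.encode("utf-8")
    gbk = text.encode("gbk")
    print(repr(text), "utf-8:", len(utf8), "bytes;", "gbk:", len(gbk), "bytes")

# Bytes must be decoded with the same codec that produced them,
# which is why the encoding of a file has to be known when reading it.
round_trip = "中文".encode("gbk").decode("gbk")
print(round_trip)
```

An English letter takes one byte in both codecs, while a Chinese character takes three bytes in UTF-8 and two in GBK.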


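* Regular expressions are a powerful tool for string processing.
* A regular expression uses a predefined pattern to match a class of strings that share common features. It is mainly used to search, replace, and otherwise process text quickly and accurately.
* In Python, the `re` module provides regular expression support.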

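|Metacharacter|Description|
|:------:|------------------------------|
|.|Matches any single character except a newline|
|*|Matches zero or more occurrences of the preceding character or subpattern|
|+|Matches one or more occurrences of the preceding character or subpattern|
|-|Denotes a range inside []|
|&#124;|Alternation: matches the expression before or after the &#124;|
|^|Matches the start of the string (or of each line in multiline mode)|
|&#36;|Matches the end of the string (or of each line in multiline mode)|
|?|Matches zero or one occurrence of the preceding character or subpattern. When it immediately follows another quantifier<br>(*, +, ?, {n}, {n,}, {n,m}), it makes that quantifier non-greedy.<br>A non-greedy quantifier matches as few characters as possible, whereas the default<br>greedy quantifier matches as many as possible.<br>For example, in the string "oooo", "o+?" matches a single "o" at a time, while "o+" matches all of the "o"s|
| \ |Escapes the character that follows it|
|\num|Backreference, where num is a positive integer. For example, "(.)\1" matches two consecutive identical characters|
|\f|Matches a form feed|
|\n|Matches a newline|
|\r|Matches a carriage return|
|\b|Matches a word boundary (the start or end of a word)|
|\B|The opposite of \b: matches a position that is not a word boundary|
|\d|Matches any digit; equivalent to [0-9] for ASCII text|
|\D|The opposite of \d; equivalent to [^0-9] for ASCII text|
|\s|Matches any whitespace character, including space, tab, and form feed; equivalent to [ \f\n\r\t\v] for ASCII text|
|\S|The opposite of \s|
|\w|Matches any letter, digit, or underscore; equivalent to [a-zA-Z0-9_] for ASCII text|
|\W|The opposite of \w; equivalent to [^A-Za-z0-9_] for ASCII text|
|()|Groups the enclosed expression so that it is treated as a unit|
|{}|Matches the preceding item the number of times given inside {}|
|[]|Matches any one of the characters inside []|
|[^xyz]|Negated set: matches any character except x, y, or z|
|[a-z]|Character range: matches any character in the given range|
|[^a-z]|Negated range: matches any character that is not a lowercase English letter|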
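
A few of the table entries are easier to grasp when run. The sketch below uses only `re.findall` and `re.search` from the standard library; the input strings are made up for illustration.

```python
import re

# Greedy vs. non-greedy quantifiers on the same input.
greedy = re.findall(r"o+", "oooo")      # one long match
lazy = re.findall(r"o+?", "oooo")       # four single-character matches

# \1 refers back to whatever group 1 captured: two identical characters in a row.
doubled = re.search(r"(.)\1", "hello").group()

# \b anchors at word boundaries, so the "cat" inside "concat" is skipped.
words = re.findall(r"\bcat\b", "cat concat cat.")

print(greedy)
print(lazy)
print(doubled)
print(words)
```

Raw strings (`r"..."`) are used so that backslashes reach the regular expression engine unchanged.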

Online regular expression testers and tutorials
* [Online tool: http://tool.lu/regex/](http://tool.lu/regex/) 
* [A 30-minute introduction to regular expressions](http://www.cnblogs.com/hustskyking/archive/2013/06/04/RegExp.html) 

## For example

In [2]:
import re
#Check if the string starts with "The" and ends with "Spain":
txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)
if (x):
  print("YES! We have a match!")
else:
  print("No match")


YES! We have a match!


In [3]:
exampleString = '''
Jessica is 15 years old, and Daniel is 27 years old.
Edward is 97 years old, and his grandfather, Oscar, is 102. 
'''
ages = re.findall(r'\d{1,3}',exampleString)
names = re.findall(r'[A-Z][a-z]*',exampleString)

print(ages)
print(names)

['15', '27', '97', '102']
['Jessica', 'Daniel', 'Edward', 'Oscar']


## "re" Module

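Main functions of the `re` module
* compile(pattern[, flags]): create a compiled pattern object
* search(pattern, string[, flags]): scan the string for the first location where the pattern matches
* match(pattern, string[, flags]): match the pattern at the beginning of the string
* findall(pattern, string[, flags]): return all non-overlapping matches of the pattern in the string as a list
* split(pattern, string[, maxsplit=0]): split the string at each match of the pattern
* sub(pat, repl, string[, count=0]): replace matches of pat in the string with repl
* escape(string): escape all regular expression metacharacters in the string

The flags argument can be any combination (joined with `|`) of re.I (ignore case), re.L (locale-dependent matching, bytes patterns only), re.M (multiline mode), re.S (make `.` match newlines as well), re.U (Unicode matching, the default for str patterns in Python 3), and re.X (ignore whitespace in the pattern and allow # comments).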
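
As a small illustration of combining flags and of `re.escape`, here is a sketch with a made-up three-line string; only documented `re` functions and flags are used.

```python
import re

text = "first line\nSecond LINE\nthird line"

# re.I ignores case; re.M lets ^ and $ match at the start and end of every line.
pattern = re.compile(r"^(\w+) line$", re.I | re.M)
first_words = pattern.findall(text)
print(first_words)

# Without re.S the dot stops at line breaks; with re.S it crosses them.
without_s = re.search(r"first.*third", text)
with_s = re.search(r"first.*third", text, re.S)
print(without_s is None, with_s is not None)

# re.escape turns metacharacters into literals, so "." no longer means "any character".
literal = re.escape("3.1")
versions = re.findall(literal, "3.1 and 3x1")
print(versions)
```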

### The findall() Function
The findall() function returns a list containing all matches.

In [4]:
txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)

['ai', 'ai']


In [5]:
x = re.findall("Portugal", txt)
print(x)

[]


### The search() Function
The search() function searches the string for a match, and returns a Match object if there is a match.

If there is more than one match, only the first occurrence of the match will be returned:

In [6]:
txt = "The rain in Spain"
x = re.search(r"\s", txt)
x
#print("The first white-space character is located in position:", x.start())

<re.Match object; span=(3, 4), match=' '>

### The split() Function
The split() function returns a list where the string has been split at each match:

In [3]:
txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)

['The', 'rain', 'in', 'Spain']


In [7]:
txt = "The rain in Spain"
x = re.split(r"\s", txt, maxsplit=1)
print(x)

['The', 'rain in Spain']


### The sub() Function
The sub() function replaces the matches with the text of your choice:

In [8]:
txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)

The9rain9in9Spain


In [9]:
txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt, count=2)
print(x)

The9rain9in Spain


## Match Object
A Match Object is an object containing information about the search and the result.

In [10]:
txt = "The rain in Spain"
x = re.search("ai", txt)
print(x) #this will print an object

<re.Match object; span=(5, 7), match='ai'>


In [6]:
#import re
re.search("H(a|ae|ä)ndel", 'Händel')

<re.Match object; span=(0, 6), match='Händel'>

The Match object has attributes and methods for retrieving information about the search and its result:

* .span() returns a tuple containing the start and end positions of the match.
* .string returns the string passed into the function.
* .group() returns the part of the string where there was a match.

In [11]:
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())

(12, 17)


In [12]:
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)

The rain in Spain


In [13]:
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())

Spain


## Regular Expression Pattern

In [9]:
re.findall('o+','ooooo')
#re.findall('o+?','ooooo')

['ooooo']

In [15]:
txt='one industry, two industries.'
re.findall('industr(?:y|ies)',txt)

['industry', 'industries']

In [16]:
txt='Windows2000 WindowsXP Windows10 Windows3.1'
re.findall('Windows(?=2000|10|XP)',txt)

['Windows', 'Windows', 'Windows']

In [17]:
txt='Windows2000 WindowsXP Windows10 Windows3.1'
re.findall('Windows(?!2000|10|XP)',txt)

['Windows']

In [18]:
txt='2000Windows XPWindows 10Windows 3.1Windows'
re.findall('(?:(?<=2000)|(?<=10)|(?<=XP))Windows',txt)

['Windows', 'Windows', 'Windows']

In [19]:
txt='2000Windows XPWindows 10Windows 3.1Windows'
re.findall('(?<!2000)(?<!10)(?<!XP)Windows',txt)

['Windows']

## Regular Expression Applications
### Writing a Tokenizer
A tokenizer or scanner analyzes a string to categorize groups of characters. This is a useful first step in writing a compiler or interpreter.

The text categories are specified with regular expressions. The technique is to combine those into a single master regular expression and to loop over successive matches:

In [20]:
import collections
import re

Token = collections.namedtuple('Token', ['type', 'value', 'line', 'column'])

def tokenize(code):
    keywords = {'IF', 'THEN', 'ENDIF', 'FOR', 'NEXT', 'GOSUB', 'RETURN'}
    token_specification = [
        ('NUMBER',   r'\d+(\.\d*)?'),  # Integer or decimal number
        ('ASSIGN',   r':='),           # Assignment operator
        ('END',      r';'),            # Statement terminator
        ('ID',       r'[A-Za-z]+'),    # Identifiers
        ('OP',       r'[+\-*/]'),      # Arithmetic operators
        ('NEWLINE',  r'\n'),           # Line endings
        ('SKIP',     r'[ \t]+'),       # Skip over spaces and tabs
        ('MISMATCH', r'.'),            # Any other character
    ]
    tok_regex = '|'.join('(?P<%s>%s)' % pair for pair in token_specification)
    line_num = 1
    line_start = 0
    for mo in re.finditer(tok_regex, code):
        kind = mo.lastgroup
        value = mo.group()
        column = mo.start() - line_start
        if kind == 'NUMBER':
            value = float(value) if '.' in value else int(value)
        elif kind == 'ID' and value in keywords:
            kind = value
        elif kind == 'NEWLINE':
            line_start = mo.end()
            line_num += 1
            continue
        elif kind == 'SKIP':
            continue
        elif kind == 'MISMATCH':
            raise RuntimeError(f'{value!r} unexpected on line {line_num}')
        yield Token(kind, value, line_num, column)

statements = '''
    IF quantity THEN
        total := total + price * quantity;
        tax := price * 0.05;
    ENDIF;
'''

for token in tokenize(statements):
    print(token)

Token(type='IF', value='IF', line=2, column=4)
Token(type='ID', value='quantity', line=2, column=7)
Token(type='THEN', value='THEN', line=2, column=16)
Token(type='ID', value='total', line=3, column=8)
Token(type='ASSIGN', value=':=', line=3, column=14)
Token(type='ID', value='total', line=3, column=17)
Token(type='OP', value='+', line=3, column=23)
Token(type='ID', value='price', line=3, column=25)
Token(type='OP', value='*', line=3, column=31)
Token(type='ID', value='quantity', line=3, column=33)
Token(type='END', value=';', line=3, column=41)
Token(type='ID', value='tax', line=4, column=8)
Token(type='ASSIGN', value=':=', line=4, column=12)
Token(type='ID', value='price', line=4, column=15)
Token(type='OP', value='*', line=4, column=21)
Token(type='NUMBER', value=0.05, line=4, column=23)
Token(type='END', value=';', line=4, column=27)
Token(type='ENDIF', value='ENDIF', line=5, column=4)
Token(type='END', value=';', line=5, column=9)


### Obtain information from html

In [21]:
html = u"""
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>文章的标题</title>
</head>
<body>
    <div id="app" class="container">
        <h1>h1文字</h1>

        <label for="input">Input</label>
        <textarea id="input" rows="10" class="form-control">
</body>
</html>
"""

In [22]:
pattern = '<title>(.*?)</title>'
print(re.findall(pattern,html,flags=re.S)[0])

文章的标题


In [3]:
h=u""" abc"""
len(h)

4

# Classes

Variables, lists, dictionaries, and so on are all objects in Python. Rather than starting with object-oriented programming theory, this tutorial explains the concepts as they come up.

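* Object-oriented programming (OOP) was proposed mainly for the design of large software systems. It makes software design more flexible, supports code reuse and design reuse well, and improves readability and extensibility. A basic principle of OOP is that a program is composed of units, or objects, each of which can act as a subprogram; this greatly reduces the difficulty of software development and makes programming feel like assembling building blocks. A key idea is to encapsulate data together with the operations on that data into an interdependent, indivisible whole: the object. Classifying and abstracting objects of the same kind yields their common features, which form a class. The heart of OOP is how to define and organize these classes and the relationships among them sensibly.
* Python fully adopts the object-oriented approach. It is a genuinely object-oriented, high-level dynamic language that supports the basic features of OOP, such as encapsulation, inheritance, polymorphism, and overriding of base-class methods. Unlike in some other object-oriented languages, the notion of an object in Python is very broad: everything in Python is an object. For example, built-in data types such as strings, lists, dictionaries, and tuples are used with the same syntax as classes. When creating a class, object properties expressed as variables are called data members or member attributes, and object behaviors expressed as functions are called member functions or member methods; together they are the members of the class.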

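## Class Definition Syntax

Python defines a class with the `class` keyword, followed by a space, the class name, and a colon; the body of the class follows on new, indented lines. Class names conventionally start with an uppercase letter. You may follow your own habits, but it is generally recommended to follow the convention and keep the style consistent throughout the design and implementation of a system, which matters especially for teamwork.

A class is declared as follows:
```python
class class_name:
    methods (functions)
```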


In [12]:
class FirstClass:
    "This is an empty class"
    pass

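**pass** in Python means "do nothing". It acts as an empty statement and can be used in class and function definitions or in branches as a placeholder, for example when the implementation has not been decided yet or room is being reserved for a later upgrade. The string in the class body is the docstring of the class, accessible via `help(FirstClass)`.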

Above, a class named `FirstClass` is declared. To obtain an object with all the characteristics of `FirstClass`, call the class and assign the result to a name, here `egclass`. In Python jargon this is called creating an instance: `egclass` is an instance of `FirstClass`. Once an instance exists, its data members and methods are accessed as `object_name.member`.

In [13]:
egclass = FirstClass()

In [14]:
type(egclass)

__main__.FirstClass

In [15]:
type(FirstClass)

type

Objects (instances of a class) can hold data. A variable in an object is also called a field or an attribute. To access a field, use the notation `object.field`. For example:

In [16]:
obj1 = FirstClass()
obj2 = FirstClass()
obj1.x = 5
obj2.x = 6
x = 7
print("x in object 1 =",obj1.x,"x in object 2=",obj2.x,"global x =",x)

x in object 1 = 5 x in object 2= 6 global x = 7


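Now let us add some functionality to the class. A function inside a class is called a method of that class.

In Python, functions and methods are not quite the same thing. A **method** is a function bound to a particular instance: when a method is called through an object, the object itself is passed as the first argument. An **ordinary function** has no such behavior.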

In [28]:
class Counter:
    def reset(self,init=0):
        self.count = init
    def getCount(self):
        self.count += 1
        return self.count
counter = Counter()
counter.reset(0)
print("one =",counter.getCount(),"two =",counter.getCount(),"three =",counter.getCount())

one = 1 two = 2 three = 3


Note that the `reset()` and `getCount()` methods are called with one argument fewer than they are declared with. The `self` argument is set by Python to the calling object. Here `counter.reset(0)` is equivalent to `Counter.reset(counter,0)`.

Every instance method must have at least one parameter, and the instance is always passed as the first one, conventionally named `self`. Inside an instance method, instance attributes are accessed with the `self.` prefix. When a method is called through an object, as in `counter.reset(0)`, this argument is not passed explicitly; when it is called through the class name, the object must be passed explicitly, as in `Counter.reset(counter,0)`.
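
The two call styles can be compared side by side. This is a minimal sketch that redefines a stripped-down `Counter` so that the block runs on its own; the values 5 and 7 are arbitrary.

```python
class Counter:
    def reset(self, init=0):
        self.count = init

counter = Counter()

# Calling through the instance: Python supplies `counter` as self.
counter.reset(5)
via_instance = counter.count

# Calling through the class: self must be passed explicitly.
Counter.reset(counter, 7)
via_class = counter.count

# Looked up on the instance, reset is a bound method that remembers its object;
# looked up on the class, it is a plain function.
bound_kind = type(counter.reset).__name__
plain_kind = type(Counter.reset).__name__
print(via_instance, via_class, bound_kind, plain_kind)
print(counter.reset.__self__ is counter)
```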

Using **self** as the name of the first parameter of a method is simply a common convention. Python allows any name to be used.
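
To make the point concrete, the sketch below names the first parameter `this` instead of `self`. The `Point` class is a hypothetical example, and the unusual name is used only to show that Python does not care; ordinary code should stick to `self`.

```python
class Point:
    # `this` plays exactly the role that `self` normally plays.
    def __init__(this, x, y):
        this.x = x
        this.y = y

    def norm1(this):
        # Sum of absolute coordinates (the 1-norm).
        return abs(this.x) + abs(this.y)

p = Point(3, -4)
print(p.norm1())
```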

Note that it would be better if we could initialise `Counter` objects immediately with a default value of `count` rather than having to call `reset()`. Initial values are set in a constructor method, which is declared in Python with the special name `__init__`:

In [29]:
class FirstClass:
    def __init__(self,name,symbol):
        self.name = name      
        self.symbol = symbol

Now that the class defines an \_\_init\_\_ method, we can create an instance of FirstClass, which now takes two arguments.

In [30]:
eg1 = FirstClass('one',1)
eg2 = FirstClass('two',2)

In [31]:
print(eg1.name, eg1.symbol)
print(eg2.name, eg2.symbol)

one 1
two 2


The **dir( )** function is very handy for looking into what a class contains and which methods it offers. **dir(a)** lists the members of object `a`.

In [32]:
print("Contents of Counter class:",dir(Counter) )
print("Contents of counter object:", dir(counter))

Contents of Counter class: ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'getCount', 'reset']
Contents of counter object: ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'count', 'getCount', 'reset']


**dir( )** of an instance also shows its instance attributes, so the object has the additional `count` attribute. Note that Python defines several default methods for actions like comparison (`__le__` is the $\le$ operator). These and other special methods can be defined for a class to give specific meanings to how objects of that class are compared, added, multiplied, and so on.
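
As a sketch of what defining such special methods looks like, the hypothetical `Money` class below (not part of the lecture's own code) gives meaning to `==`, `<=`, and `+` for its instances.

```python
class Money:
    """Hypothetical example class holding an amount in cents."""
    def __init__(self, cents):
        self.cents = cents

    def __eq__(self, other):       # a == b
        return self.cents == other.cents

    def __le__(self, other):       # a <= b
        return self.cents <= other.cents

    def __add__(self, other):      # a + b returns a new object
        return Money(self.cents + other.cents)

    def __repr__(self):            # readable display in the interpreter
        return f"Money({self.cents})"

a, b = Money(150), Money(275)
total = a + b
print(a <= b, a == b, total)
```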

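Classes have two kinds of attributes:

* **Class attribute**: defined in the class body outside any method; it is shared by all instances.
* **Instance attribute**: defined inside a method (usually `__init__`) through `self`; it belongs to one instance, and each instance has its own copy.

In the main program (or anywhere outside the class), an **instance attribute** belongs to an instance and can only be accessed through an object name, whereas a **class attribute** belongs to the class and can be accessed through either the class name or an object name.

A method can call other methods of its class and can access both class attributes and instance attributes. A notable feature of Python is that members can be added to classes and objects dynamically. This differs from many other object-oriented languages and reflects Python's dynamic nature.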


In [33]:
class FirstClass:
    test = 'test'
    def __init__(self,n,s):
        self.name = n
        self.symbol = s

Here `test` is a class attribute and `name` is an instance attribute.

In [34]:
eg3 = FirstClass('Three',3)

In [35]:
print(eg3.test,eg3.name,eg3.symbol)

test Three 3


In [None]:
class Car:
    price = 100000  # class attribute
    def __init__(self, c):
        self.color = c # instance attribute

In [None]:
car1 = Car("Red")
car2 = Car("Blue")
print(car1.color, Car.price)

In [None]:
Car.price = 110000 # modify a class attribute
Car.name = 'QQ' # add a class attribute
car1.color = "Yellow" # modify an instance attribute
print(car2.color, Car.price, Car.name)
print(car1.color, Car.price, Car.name)

In [None]:
print(car2.color, car2.price, car2.name)
print(car1.color, car1.price, car1.name)

In [None]:
id(car1.color),id(car2.color)

In [None]:
id(car1.price),id(car2.price)

In [None]:
def setSpeed(self, s):
    self.speed = s

In [None]:
import types
car1.setSpeed = types.MethodType(setSpeed, car1)    # dynamically add a method bound to the object car1

In [None]:
car1.setSpeed(50)                              # call the dynamically added method
print(car1.speed)

In [None]:
car2.setSpeed(50)  # AttributeError: the method was added to car1 only

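**Private and Public Members**

* Python does not provide a strict access-protection mechanism for private members. When defining class attributes, a name that begins with two underscores `__` (and does not end with two underscores) denotes a private attribute; otherwise the attribute is public. Private attributes cannot be accessed directly from outside the class; they are accessed through the object's public methods or through a special syntax that Python supports. That special syntax is useful for testing and debugging, and the same rules apply to methods.
* Private attributes exist for data encapsulation and hiding and are normally used only inside the class's own methods. Although Python lets you reach private members from outside, doing so is not recommended. Public attributes can be used freely, both inside the class and from external code.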

In [None]:
class A:
    def __init__(self, value1 = 0, value2 = 0):
        self._value1 = value1
        self.__value2 = value2
    
    def setValue(self, value1, value2):
        self._value1 = value1
        self.__value2 = value2

    def show(self):
        print(self._value1)
        print(self.__value2)

In [None]:
a = A(1,2)
a._value1

In [None]:
a.__value2 # AttributeError: the name is mangled outside the class

In [None]:
a._A__value2 # access the private data member from outside the class

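* In Python, names that begin with underscores have special meanings, especially in class definitions. Underscore prefixes and suffixes mark special members of a class:
    * `_xxx`: a protected member by convention. Such names are not imported by `from module import *`, and only the class and its subclasses are expected to access them; Python does not enforce this.
    * `__xxx__`: special members defined by the system.
    * `__xxx`: a private member. Only the class itself can refer to it by this name; subclasses cannot. From outside it can still be reached as `object_name._ClassName__xxx`, so Python has no strictly private members.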
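
The effect of these naming rules on a subclass can be seen in a short sketch. `Base` and `Child` are hypothetical classes introduced only for this illustration.

```python
class Base:
    def __init__(self):
        self._hint = "protected by convention"
        self.__secret = "mangled"      # actually stored as _Base__secret

class Child(Base):
    def peek(self):
        # Inside Child, self.__secret is rewritten to self._Child__secret,
        # which does not exist, so the lookup fails.
        try:
            return self.__secret
        except AttributeError:
            return "not reachable as __secret"

c = Child()
print(c._hint)              # single underscore: accessible, but a signal to keep out
print(c.peek())
print(c._Base__secret)      # the mangled name still works from anywhere
stored_names = list(vars(c))
print(stored_names)
```

The list of stored names shows that the double-underscore attribute exists only under its mangled name.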

In [None]:
class Fruit:
    def __init__(self):
        self.__color = 'Red'
        self.price = 1
        
    def getColor(self):
        return self.__color
    
    def setColor(self, c):
        self.__color = c

In [None]:
apple = Fruit()
apple.price # show the value of a public data member

In [None]:
apple.price = 2 # modify the value of a public data member
apple.price

In [None]:
print(apple.price, apple._Fruit__color) # show the value of a private data member

In [None]:
apple._Fruit__color = "Blue" # modify the value of a private data member
print(apple.price, apple._Fruit__color)

In [None]:
apple.setColor('black')

In [None]:
apple.getColor()

In [None]:
print(apple.__color) # AttributeError: from outside, the attribute is only reachable as _Fruit__color

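## Methods
* Methods defined in a class fall roughly into four categories: public methods, private methods, static methods, and class methods.
* Public and private methods belong to objects. The name of a private method begins with two underscores `__`. Every object has its own public and private methods, and both kinds can access members of the class and of the object. A public method is called directly through the object name; a private method cannot be, and can only be called via `self` inside another method of the object, or from outside through the special syntax that Python supports.
* When a public instance method is called through the class name, an object must be passed explicitly as its `self` argument to specify whose data members are accessed.
* Static methods and class methods can be called through either the class name or an object name. They cannot directly access members that belong to an object, only members that belong to the class. By convention the first parameter of a class method is named `cls`, although another name may be used, and no value is passed for it when the class method is called. 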

In [None]:
class Root:
    __total = 0
    
    def __init__(self, v):
        self.__value = v
        Root.__total += 1
    
    def show(self):
        print('self.__value:',self.__value)
        print('Root.__total:',Root.__total)
        
    @classmethod
    def classShowTotal(cls): # class method
        print(cls.__total)
        
    @staticmethod
    def staticShowTotal(): # static method
        print(Root.__total)

In [None]:
r = Root(3)
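r.classShowTotal() # call the class method through an object

In [None]:
r.staticShowTotal() # call the static method through an object

In [None]:
r.show()

In [None]:
rr = Root(5)

In [None]:
Root.classShowTotal() # call the class method through the class name

In [None]:
Root.staticShowTotal() # call the static method through the class name

In [None]:
Root.show() # TypeError: an instance method called through the class name needs an object

In [None]:
Root.show(r) # works: the instance is passed explicitly, so instance members are accessible

In [None]:
r.show()

In [None]:
Root.show(rr) # pass the object explicitly as self when calling through the class name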

In [None]:
rr.show()

In [None]:
rlist = [Root(i) for i in range(10)]

In [None]:
rlist[8].show()

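## Properties
In Python 3.x, properties are fully implemented and support a more complete protection mechanism.

* As the code below shows, if a property is read-only, its value cannot be modified, the property cannot be deleted, and no new instance member with the same name can be added to the object.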

In [None]:
class Test:
    def __init__(self, value):
        self.__value = value
    
    @property
    def value(self): # read-only: cannot be modified or deleted
        print('get value')
        return self.__value

In [None]:
t = Test(3)

In [None]:
t.value

In [None]:
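t.value = 5 # AttributeError: a read-only property cannot be assigned

In [None]:
t.v=5 # add a new member dynamically
t.v

In [None]:
del t.v # delete the member dynamically

In [None]:
del t.value # AttributeError: the property cannot be deleted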

In [None]:
t.value

* The code below makes the property readable and writable, but not deletable.

In [None]:
class Test:
    def __init__(self, value):
        self.__value = value
    
    def __get_value(self):
        print('__get_value')
        return self.__value
    def __set_value(self, v):
        print('__set_value')
        self.__value = v
    value = property(__get_value, __set_value)

    def show(self):
        print(self.__value)

In [None]:
t = Test(3)
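t.value # reading the property is allowed

In [None]:
t.value = 5 # modifying the property is allowed
t.value

In [None]:
t.show() # the underlying private variable has been updated as well

In [None]:
del t.value # AttributeError: the property has no deleter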

* A property can also be made readable, writable, and deletable.

In [None]:
class Test:
    def __init__(self, value):
        self.__value = value
    
    def __get_value(self):
        #print('__get_value')
        return self.__value
    def __set_value(self, v):
        #print('__set_value')
        self.__value = v
    def __del_value(self):
        #print('__del_value')
        del self.__value
    
    value = property(__get_value, __set_value, __del_value)

    def show(self):
        print(self.__value)

In [None]:
t = Test(3)
t.show()

In [None]:
t.value

In [None]:
t.value = 5
t.show()

In [None]:
t.value

In [None]:
del t.value

In [None]:
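t.value # AttributeError: the underlying private data member has been deleted

In [None]:
t.show() # AttributeError for the same reason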

In [None]:
t.value =1 # assigning through the property recreates the private data member
t.show()

In [None]:
t.value

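## Common Special Methods
* Python classes have many special methods; the most familiar are the constructor and the destructor. The constructor is `__init__()`. It is generally used to set initial values for data members or to perform other necessary initialization, and it is called automatically when an object is created. Giving its parameters default values achieves an effect similar to constructor overloading in other languages. If a class does not define a constructor, Python supplies a default one. The destructor is `__del__()`. It is generally used to release resources held by the object and is called automatically when Python deletes the object and reclaims its memory. If a class does not define a destructor, Python supplies default cleanup.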
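
The sketch below illustrates default constructor arguments standing in for overloading, together with a destructor. `Connection` is a hypothetical class that only records messages in a list; no real resource is opened, and the host names and port numbers are placeholders.

```python
class Connection:
    """Hypothetical resource wrapper, for illustration only."""
    log = []    # class attribute shared by all instances

    def __init__(self, host="localhost", port=8080):
        # Default values let one __init__ cover several call styles.
        self.host = host
        self.port = port
        Connection.log.append(f"open {host}:{port}")

    def __del__(self):
        # Runs when the object is reclaimed; the exact timing depends on the interpreter.
        Connection.log.append(f"close {self.host}:{self.port}")

a = Connection()                 # both defaults
b = Connection("example.org")    # default port
c = Connection(port=9000)        # default host
print(b.host, b.port, c.host, c.port)

del a   # in CPython the last reference is gone, so __del__ usually runs right away
print(Connection.log)
```

Because the moment at which `__del__` runs is implementation-dependent, cleanup that must happen at a definite point is usually better expressed with a `with` statement or an explicit `close()` method.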

|Method|Description|
|--------|---------|
|`__init__()`|Constructor, called when an object is created|
|`__del__()`|Destructor, called when an object is released|
|`__add__()`|+|
|`__sub__()`|-|
|`__mul__()`|&#42;|
|`__truediv__()` (`__div__()` in Python 2)|/|
|`__floordiv__()`|Floor division (//)|
|`__mod__()`|%|
|`__pow__()`|**|
|`__cmp__()`|Comparison (Python 2 only; Python 3 uses the rich comparison methods listed below)|
|`__repr__()`|Printable representation, conversion|
|`__setitem__()`|Assignment by index|
|`__getitem__()`|Retrieval by index|
|`__len__()`|Length|
|`__call__()`|Function call|
|`__contains__()`|Membership test (in)|
|`__eq__()`, `__ne__()`, `__lt__()`, `__le__()`, `__gt__()`, `__ge__()`|==, !=, <, <=, >, >=|
|`__str__()`|Conversion to string|
|`__lshift__()`|<<|
|`__and__()`|&|
|`__iadd__()`|+=|
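## Selected Case Study

The file MyArray.py defines an array class that overrides a number of special methods to support arithmetic between arrays and between an array and a number, as well as the dot product, comparison, membership tests, and element access.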

In [None]:
# %load MyArray.py
# Filename: MyArray.py
# --------------------
# Function description: Array and its operating
# --------------------
# Author: Dong Fuguo
# QQ: 306467355
# Email: dongfuguo2005@126.com
#--------------------
# Date: 2014-11-18, Updated on 2015-12-20
# --------------------


class MyArray:
    '''All the elements in this array must be numbers'''

    @staticmethod
    def __IsNumber(n):
        if isinstance(n, (int, float, complex)):
            return True
        return False

    def __init__(self, *args):
        if not args:
            self.__value = []
        else:
            for arg in args:
                if not self.__IsNumber(arg):
                    print('All elements must be numbers')
                    return
            self.__value = list(args)

    def __add__(self, n):   # add the number n to every element, or add two arrays element-wise; returns a new array
        if self.__IsNumber(n):
            b = MyArray()
            for v in self.__value:
                b.__value.append(v + n)
            return b
        elif isinstance(n, MyArray):
            if len(n.__value)==len(self.__value):
                c = MyArray()
                for i, j in zip(self.__value, n.__value):
                    c.__value.append(i+j)
                return c
            else:
                print('Length not equal')
        else:
            print('Not supported')

    def __sub__(self, n):   # subtract the number n from every element; returns a new array
        if not self.__IsNumber(n):
            print('- operating with ', type(n), ' and number type is not supported.')
            return
        b = MyArray()
        for v in self.__value:
            b.__value.append(v - n)
        return b

    def __mul__(self, n):     # multiply every element by the number n; returns a new array
        if not self.__IsNumber(n):
            print('* operating with ', type(n), ' and number type is not supported.')
            return
        b = MyArray()
        for v in self.__value:
            b.__value.append(v * n)
        return b

    def __truediv__(self, n):    # divide every element by the number n; returns a new array
        if not self.__IsNumber(n):
            print(r'/ operating with ', type(n), ' and number type is not supported.')
            return
        b = MyArray()
        for v in self.__value:
            b.__value.append(v / n)
        return b

    def __floordiv__(self, n):  # floor-divide every element by the integer n; returns a new array
        if not isinstance(n, int):
            print(n, ' is not an integer')
            return
        b = MyArray()
        for v in self.__value:
            b.__value.append(v  //  n)
        return b

    def __mod__(self, n):      # take every element modulo the number n; returns a new array
        if not self.__IsNumber(n):
            print(r'% operating with ', type(n), ' and number type is not supported.')
            return
        b = MyArray()
        for v in self.__value:
            b.__value.append(v % n)
        return b

    def __pow__(self, n):   # raise every element to the power n; returns a new array
        if not self.__IsNumber(n):
            print('** operating with ', type(n), ' and number type is not supported.')
            return
        b = MyArray()
        for v in self.__value:
            b.__value.append(v ** n)
        return b

    def __len__(self):        
        return len(self.__value)

    # for: x
    # called when the object is evaluated directly, e.g. as the last expression in a cell
    def __repr__(self):
        # equivalent to the Python 2 backtick expression `self.__value`
        return 'MyArray:' + repr(self.__value)

    #for: print(x)
    def __str__(self):
        return str(self.__value)

    def append(self, v):    # append an element
        if not self.__IsNumber(v):
            print('Only number can be appended.')
            return
        self.__value.append(v)

    def __getitem__(self, index):  # get the element at the given position
        if isinstance(index, int) and 0 <= index < len(self.__value):
            return self.__value[index]
        else:
            print('Index out of range.')

    def __setitem__(self, index, v):  # set the element at the given position
        if not self.__IsNumber(v):
            print(v, ' is not a number')
        elif (not isinstance(index, int)) or index<0 or index>=len(self.__value):
            print('Index type error or out of range')
        else:
            self.__value[index] = v

    # membership test; supports the keyword 'in'
    def __contains__(self, v):
        if v in self.__value:
            return True
        return False

    # dot product of two arrays of equal length
    def dot(self, v):
        if not isinstance(v, MyArray):
            print(v, ' must be an instance of MyArray.')
            return
        if len(v) != len(self.__value):
            print('The size must be equal.')
            return
        b = MyArray()
        for m, n in zip(v.__value, self.__value):
            b.__value.append(m * n)
        return sum(b.__value)

    #equal to
    def __eq__(self, v):
        if not isinstance(v, MyArray):
            print(v, ' must be an instance of MyArray.')
            return False
        if self.__value == v.__value:
            return True
        return False

    #less than
    def __lt__(self, v):
        if not isinstance(v, MyArray):
            print(v, ' must be an instance of MyArray.')
            return False
        if self.__value < v.__value:
            return True
        return False

if __name__ == '__main__':
    print('Please use me as a module.')


In [18]:
import importlib
import MyArray
importlib.reload(MyArray)  # reload the module after editing MyArray.py

In [19]:
from MyArray import MyArray
a = MyArray(1, 2, 3, 4)
b = MyArray(6, 5, 4, 3, 2, 1)

In [20]:
a.dot(b)

The size must be equal.


In [21]:
import MyArray
a = MyArray.MyArray(1, 2, 3, 4, 5, 6)
b = MyArray.MyArray(6, 5, 4, 3, 2, 1)

In [22]:
print(repr(a))

MyArray:[1, 2, 3, 4, 5, 6]


In [23]:
a,b

(MyArray:[1, 2, 3, 4, 5, 6], MyArray:[6, 5, 4, 3, 2, 1])

In [24]:
len(a)

6

In [None]:
a + 5

In [None]:
a / 2

In [None]:
len(a)

In [None]:
a * 3

In [None]:
a ** 2

In [None]:
a.dot(b)

In [None]:
a < b

In [None]:
a > b

In [None]:
a == a

In [None]:
3 in a

In [None]:
a[0] = 8
a

## Inheritance

There might be cases where a new class should have all the characteristics of an already defined class. The new class can then "inherit" from the existing class and add its own methods. This is called inheritance.

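**Inheritance mechanism**

* Inheritance exists for code reuse and design reuse and is one of the key features of object-oriented programming. When designing a new class, inheriting from an existing, well-designed class and building on it can greatly reduce the development effort. In an inheritance relationship, the existing class is called the parent class or base class, and the new class is called the child class or derived class. A derived class inherits the public members of its parent, but it cannot directly access the parent's private members. To call a base-class method from a derived class, use the built-in function `super()` or the form `BaseClassName.method_name()`.
* Python supports multiple inheritance. If several parent classes define a method with the same name and the child class calls it without naming a parent, the interpreter searches the parent classes from left to right.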
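
The left-to-right search order can be observed with a small diamond-shaped hierarchy. The classes `Base`, `Left`, `Right`, and `Both` are hypothetical names chosen for this sketch; `__mro__` is the standard attribute that lists the search order Python actually uses.

```python
class Base:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"Base({self.name})"

class Left(Base):
    def describe(self):
        return "Left -> " + super().describe()

class Right(Base):
    def describe(self):
        return "Right -> " + super().describe()

class Both(Left, Right):      # Left is listed first, so it is searched first
    pass

obj = Both("x")
description = obj.describe()
search_order = [cls.__name__ for cls in Both.__mro__]
print(description)
print(search_order)
```

Note that `super()` inside `Left` continues with the next class in the search order of the instance, which here is `Right` rather than `Base`.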

Consider class SoftwareEngineer which has a method salary.

In [36]:
class SoftwareEngineer:
    def __init__(self,name,age):
        self.name = name
        self.age = age
    def salary(self, value):
        self.money = value
        print(self.name,"earns",self.money)

In [37]:
a = SoftwareEngineer('Kartik',26)

In [38]:
a.salary(40000)

Kartik earns 40000


In [39]:
[ name for name in dir(SoftwareEngineer) if not name.startswith("_")]

['salary']

Now consider another class Artist which tells us about the amount of money an artist earns and his artform.

In [40]:
class Artist:
    def __init__(self,name,age):
        self.name = name
        self.age = age
    def money(self,value):
        self.money = value
        print(self.name,"earns",self.money)
    def artform(self, job):
        self.job = job
        print(self.name,"is a", self.job)

In [41]:
b = Artist('Nitin',20)

In [42]:
b.money(50000)
b.artform('Musician')

Nitin earns 50000
Nitin is a Musician


In [43]:
[ name for name in dir(b) if not name.startswith("_")]

['age', 'artform', 'job', 'money', 'name']

The `money` method and the `salary` method do the same thing. So we can generalize the method to `salary` and let the `Artist` class inherit from the `SoftwareEngineer` class. The `Artist` class now becomes:

In [44]:
class Artist(SoftwareEngineer):
    def artform(self, job):
        self.job = job
        print(self.name,"is a", self.job)

In [45]:
c = Artist('Nishanth',21)

In [46]:
dir(Artist)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'artform',
 'salary']

In [47]:
c.salary(60000)
c.artform('Dancer')

Nishanth earns 60000
Nishanth is a Dancer


Suppose an inherited method is not suitable for the new class. It can be overridden by defining a method with the same name inside the new class.

In [48]:
class Artist(SoftwareEngineer):
    def artform(self, job):
        self.job = job
        print(self.name,"is a", self.job)
    def salary(self, value):
        self.money = value
        print(self.name,"earns",self.money)
        print("I am overriding the SoftwareEngineer class's salary method")

In [49]:
c = Artist('Nishanth',21)

In [50]:
c.salary(60000)
c.artform('Dancer')

Nishanth earns 60000
I am overriding the SoftwareEngineer class's salary method
Nishanth is a Dancer


If the number of input arguments varies from instance to instance, an asterisk (`*args`) can be used as shown.

In [51]:
class NotSure:
    def __init__(self, *args):
        self.data = ' '.join(list(args)) 

In [52]:
yz = NotSure('I', 'Do' , 'Not', 'Know', 'What', 'To','Type')

In [53]:
yz.data

'I Do Not Know What To Type'

### Example: design a Person class, derive a Teacher class from it, and create objects of both classes.

In [None]:
class Person:
    def __init__(self, name = '', age = 20, sex = 'man'):
        self.setName(name)
        self.setAge(age)
        self.setSex(sex)

    def setName(self, name):
        if not isinstance(name, str):
            print('name must be string.')
            return
        self.__name = name
        
    def setAge(self, age):
        if not isinstance(age, int):
            print('age must be integer.')
            return
        self.__age = age
        
    def setSex(self, sex):
        if sex != 'man' and sex != 'woman':
            print('sex must be "man" or "woman"')
            return
        self.__sex = sex
        
    def show(self):
        print('Name:', self.__name)
        print('Age:', self.__age)
        print('Sex:', self.__sex)

In [None]:
class Teacher(Person):
    def __init__(self, name='', age = 30, sex = 'man', department = 'Computer'):
        super().__init__(name, age, sex)
        #super(Teacher, self).__init__(name, age, sex)
        ## or, use another method like below:
        #Person.__init__(self, name, age, sex)
        self.setDepartment(department)
    
    def setDepartment(self, department):
        if not isinstance(department, str):
            print('department must be a string.')
            return
        self.__department = department
        
    def show(self):
        super().show()
        print('Department:', self.__department)

In [None]:
zhangsan = Person('Zhang San', 19, 'man')
zhangsan.show()

In [None]:
zhangsan.setSex('test')
zhangsan.show()

In [None]:
lisi = Teacher('Li Si',32, 'man', 'Math')
lisi.show()

In [None]:
lisi.setAge(40)
lisi.show()

## Introspection
We have already seen the `dir()` function for working out what is in a class. Python has many facilities that make introspection easy (that is, working out what is in a Python object or module). Some useful functions are **hasattr**, **getattr**, and **setattr**:

In [54]:
ns = NotSure('test')
if hasattr(ns,'data'): # check if ns.data exists
    setattr(ns,'copy', # set ns.copy
            getattr(ns,'data')) # get ns.data
print('ns.copy =',ns.copy)

ns.copy = test
