**Hard**

# 29. Use plain attributes instead of get and set methods

We often end up writing setter and getter methods:

In [2]:
class OldResistor:
    def __init__(self, ohms):
        self._ohms = ohms
        
    def get_ohms(self):
        return self._ohms
    
    def set_ohms(self, ohms):
        self._ohms = ohms

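But this hardly looks like Python, and it is clumsy to use. Such methods do, however, make it easier later on to encapsulate behavior, validate usage, and restrict the range of allowed values.

In Python, plain attributes can do the same job and are more convenient to use. But we still want validation, range limits, and similar features, so what can we do?

**A suitable approach is to use the `@property` decorator together with its setter.**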

In [10]:
class VoltageResistance:
    """ Records a voltage and, whenever the voltage is set, updates the current to match """
    def __init__(self, ohms):
        self.ohms = ohms
        self._voltage = 0
        self.current = 0
    
    @property
    def voltage(self):
        return self._voltage
    
    @voltage.setter
    def voltage(self, voltage):
        self._voltage = voltage
        self.current = self._voltage / self.ohms  # Ohm's law

In [5]:
r2 = VoltageResistance(1e3)
print("Before: %5r amps" % r2.current)
r2.voltage = 10
print("After: %5r amps" % r2.current)

Before:     0 amps
After:  0.01 amps


A property setter can also validate the value (and, in the same way, the type) being assigned:

In [11]:
class BoundedResistance(VoltageResistance):
    """ A resistor that rejects non-positive resistance values """
    def __init__(self, ohms):
        super().__init__(ohms)
    
    @property
    def ohms(self):
        return self._ohms
    
    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError("%f ohms must be > 0" % ohms)
        self._ohms = ohms

In [12]:
r3 = BoundedResistance(1e3)
r3.ohms = 0

ValueError: 0.000000 ohms must be > 0

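Passing an invalid value to the constructor raises an error as well. This is because the parent class's `__init__` contains the statement `self.ohms = ohms`, which invokes the setter; if it assigned to `self._ohms` instead, the check would be bypassed: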

In [13]:
BoundedResistance(-5)

ValueError: -5.000000 ohms must be > 0
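
The remark about `self.ohms` versus `self._ohms` can be checked with a small sketch. The names `PositiveOhms`, `PublicInit`, and `PrivateInit` are hypothetical and exist only for this illustration; they are not part of the classes above.

```python
class PositiveOhms:
    """Hypothetical mixin: a property that rejects non-positive ohms."""

    @property
    def ohms(self):
        return self._ohms

    @ohms.setter
    def ohms(self, ohms):
        if ohms <= 0:
            raise ValueError("%f ohms must be > 0" % ohms)
        self._ohms = ohms


class PublicInit(PositiveOhms):
    def __init__(self, ohms):
        self.ohms = ohms      # goes through the setter, so it is validated


class PrivateInit(PositiveOhms):
    def __init__(self, ohms):
        self._ohms = ohms     # writes the backing field directly, no validation


try:
    PublicInit(-5)
    public_init_rejected = False
except ValueError:
    public_init_rejected = True

unchecked = PrivateInit(-5)   # no error is raised here
print(public_init_rejected, unchecked.ohms)   # True -5
```

So whether construction is validated depends entirely on which name the constructor assigns to.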

Another use is to keep an attribute of the parent class from being modified:

In [14]:
class FixedResistance(VoltageResistance):
    """ A resistor whose ohms cannot be changed after construction """
    def __init__(self, ohms):
        super().__init__(ohms)
    
    @property
    def ohms(self):
        return self._ohms
    
    @ohms.setter
    def ohms(self, ohms):
        if hasattr(self, "_ohms"):
            raise AttributeError("Can't set attribute")
        self._ohms = ohms

In [15]:
r4 = FixedResistance(1e3)
r4.ohms = 2e3

AttributeError: Can't set attribute

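**The biggest drawback of `@property` is that its getter and setter cannot be shared between similar but unrelated classes: the `@ohms.setter` decorator is tied to the property named `ohms`, so reuse is possible only through subclassing.**

Also, **when implementing a getter, do not change the values of other attributes.**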
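
A minimal sketch of why a getter should stay free of side effects. `SurprisingResistor` is a hypothetical counter-example written for this note, not one of the classes above.

```python
class SurprisingResistor:
    """Hypothetical counter-example: the getter mutates other state."""

    def __init__(self, ohms):
        self._ohms = ohms
        self.voltage = 0
        self.current = 0

    @property
    def ohms(self):
        # Side effect inside a getter: merely reading ohms rewrites voltage.
        self.voltage = self._ohms * self.current
        return self._ohms


r = SurprisingResistor(1e3)
r.voltage = 10
before = r.voltage
r.ohms                 # looks like a harmless read
after = r.voltage
print(before, after)   # 10 0.0
```

A caller who only read `r.ohms` has no reason to expect `voltage` to change, which is what makes this kind of bug hard to track down.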

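# 30. Consider @property instead of refactoring attributes

Our example is the leaky bucket algorithm, which shapes network traffic into a data stream with a steady rate.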

In [17]:
from datetime import datetime, timedelta

class Bucket:
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.quota = 0
    
    def __repr__(self):
        return "Bucket(quota=%d)" % self.quota

In [20]:
def fill(bucket, amount):
    """ Fill the bucket: if the period has elapsed, reset the bucket first; then add amount to the quota """
    now = datetime.now()
    if now - bucket.reset_time > bucket.period_delta:
        bucket.quota = 0
        bucket.reset_time = now
    bucket.quota += amount

In [21]:
def deduct(bucket, amount):
    """ Deduct amount: return False if the period has elapsed or amount exceeds the remaining quota; otherwise deduct it and return True """
    now = datetime.now()
    if now - bucket.reset_time > bucket.period_delta:
        return False
    if bucket.quota - amount < 0:
        return False
    bucket.quota -= amount
    return True

In [25]:
bucket = Bucket(60)                      # create a leaky bucket with a 60-second period
fill(bucket, 100)                        # add 100 units of quota
print(bucket)

if deduct(bucket, 99):                   # take 99 units
    print("Had 99 quota")
else:
    print("Not enough for 99 quota")
print(bucket)

if deduct(bucket, 3):                    # refused when the remaining quota is too small
    print("Had 3 quota")
else:
    print("Not enough for 3 quota")
print(bucket)

Bucket(quota=100)
Had 99 quota
Bucket(quota=1)
Not enough for 3 quota
Bucket(quota=1)


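With this implementation, however, when a deduction is refused we cannot tell whether the quota ran out or whether the bucket never had any quota in the first place.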

In [28]:
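class Bucket:
    """
    Instead of quota itself, store max_quota and quota_consumed, and compute quota from them on the fly
    
    max_quota is the quota issued at the start of the period; quota_consumed is the total consumed within the period
    """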
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0
        
    def __repr__(self):
        return ("Bucket(max_quota=%d, quota_consumed=%d)" %
               (self.max_quota, self.quota_consumed))
    
    @property
    def quota(self):
        return self.max_quota - self.quota_consumed
    
    @quota.setter
    def quota(self, amount):
        delta = self.max_quota - amount
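        if amount == 0:
            # Setting quota to 0 is treated as a reset for a new period
            self.quota_consumed = 0
            self.max_quota = 0
        elif delta < 0:
            # amount exceeds max_quota, so quota is being filled for a new
            # period; nothing may have been consumed yet
            assert self.quota_consumed == 0
            self.max_quota = amount
        else:
            # Quota is being consumed during the period. delta already equals
            # the total consumed (max_quota - amount), so assign it rather than add
            assert self.max_quota >= self.quota_consumed
            self.quota_consumed = delta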

In [29]:
bucket = Bucket(60)                      # create a leaky bucket with a 60-second period
fill(bucket, 100)                        # add 100 units of quota
print(bucket)

if deduct(bucket, 99):                   # take 99 units
    print("Had 99 quota")
else:
    print("Not enough for 99 quota")
print(bucket)

if deduct(bucket, 3):                    # refused when the remaining quota is too small
    print("Had 3 quota")
else:
    print("Not enough for 3 quota")
print(bucket)

Bucket(max_quota=100, quota_consumed=0)
Had 99 quota
Bucket(max_quota=100, quota_consumed=99)
Not enough for 3 quota
Bucket(max_quota=100, quota_consumed=99)


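**`@property` is well suited to improving a data model step by step, but if you find yourself leaning on it too heavily, consider refactoring the class outright and updating the code that calls it.**

# 31. Use descriptors for reusable `@property` methods

`@property` has one obvious drawback: it is hard to reuse. If we need several attributes with the same logic, we have to write out a separate `@property` method for each one.

A better way to get this behavior is **Python's descriptor protocol**. Python translates certain operations into special method calls: just as `len(foo)` is translated to `foo.__len__()`, **attribute access is translated into calls to `__get__` and `__set__`**.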

In [30]:
class Grade:
    def __get__(*args, **kwargs):
        pass
    
    def __set__(*args, **kwargs):
        pass
     
class Exam:
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

In [32]:
exam = Exam()
exam.writing_grade = 40   # <==> Exam.__dict__["writing_grade"].__set__(exam, 40)
exam.writing_grade # <==> Exam.__dict__["writing_grade"].__get__(exam, Exam)

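This translation happens because of the `__getattribute__` method of `object`. **If the Exam instance has no `writing_grade` attribute, Python falls back to the Exam class and looks for a class attribute of the same name (it must be a class attribute; if it were an instance attribute, assignment would simply replace the `Grade` instance). If that class attribute implements `__get__` and `__set__`, Python treats the object as following the descriptor protocol.**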
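
The translation described above can be observed directly. This is a sketch with hypothetical names (`LoggingGrade`, `DemoExam`) that simply records each call the descriptor receives:

```python
class LoggingGrade:
    """Hypothetical descriptor that records how it was called."""

    def __init__(self):
        self.calls = []
        self._value = 0

    def __get__(self, instance, instance_type):
        self.calls.append(("get", instance, instance_type))
        return self._value

    def __set__(self, instance, value):
        self.calls.append(("set", instance, value))
        self._value = value


class DemoExam:
    writing_grade = LoggingGrade()


exam = DemoExam()
# Going through the class __dict__ returns the descriptor object itself
# without triggering __get__.
descriptor = DemoExam.__dict__["writing_grade"]

exam.writing_grade = 40          # becomes descriptor.__set__(exam, 40)
value = exam.writing_grade       # becomes descriptor.__get__(exam, DemoExam)

print(descriptor.calls[0] == ("set", exam, 40))        # True
print(descriptor.calls[1] == ("get", exam, DemoExam))  # True
print(value, exam.__dict__)                            # 40 {}
```

The empty instance `__dict__` at the end shows that the assignment never stored anything on the instance; the descriptor handled it.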

In [33]:
class Grade:
    def __init__(self):
        self._value = 0
        
    def __get__(self, instance, instance_type):
        return self._value
    
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError("Grade must be between 0 and 100")
        self._value = value

In [34]:
class Exam:
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

In [35]:
first_exam = Exam()
first_exam.writing_grade = 82
first_exam.science_grade = 99
print("Writing", first_exam.writing_grade)
print("Science", first_exam.science_grade)

Writing 82
Science 99


But as soon as there is more than one Exam instance, the result is wrong:

In [36]:
second_exam = Exam()
second_exam.writing_grade = 75
print("Second", second_exam.writing_grade, "is right")
print("First", first_exam.writing_grade, "is wrong")

Second 75 is right
First 75 is wrong


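**This happens because the `Grade` instance is a class attribute, so all `Exam` instances share it.** One fix is to keep a dictionary inside `Grade` that records the value for each instance separately.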

In [37]:
class Grade:
    """ Descriptor """
    def __init__(self):
        self._values = {}
    
    def __get__(self, instance, instance_type):  # instance is the instance of the class that holds this Grade as a class attribute
        if instance is None:
            return self
        return self._values.get(instance, 0)
    
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError("Grade must be between 0 and 100")
        self._values[instance] = value

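But this implementation has a problem: the `_values` dictionary holds a reference to every instance, so those instances are never reclaimed by the garbage collector, which **leaks memory**.

We can replace the regular dictionary with a `WeakKeyDictionary`. It holds only weak references to its keys: once the dictionary's reference is the last one left in the whole program, the instance is automatically removed from the dictionary's keys.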

In [40]:
from weakref import WeakKeyDictionary

class Grade:
    def __init__(self):
        self._values = WeakKeyDictionary()
    
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)
    
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError("Grade must be between 0 and 100")
        self._values[instance] = value


class Exam:
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

In [42]:
first_exam = Exam()
first_exam.writing_grade = 82
first_exam.science_grade = 99
print("Writing", first_exam.writing_grade)
print("Science", first_exam.science_grade)
second_exam = Exam()
second_exam.writing_grade = 75
print("Second", second_exam.writing_grade, "is right")
print("First", first_exam.writing_grade, "is right")

Writing 82
Science 99
Second 75 is right
First 82 is right
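
The difference between the two dictionaries can be checked with a short sketch. `ExamRecord` and `still_alive_after_del` are hypothetical names used only here, and the explicit `gc.collect()` is an extra precaution for interpreters that do not free objects immediately.

```python
import gc
import weakref
from weakref import WeakKeyDictionary


class ExamRecord:
    pass


def still_alive_after_del(mapping):
    """Store an object as a key, drop our own reference, report if it survives."""
    exam = ExamRecord()
    probe = weakref.ref(exam)    # lets us observe the object without owning it
    mapping[exam] = 82
    del exam                     # the mapping now holds the only reference
    gc.collect()
    return probe() is not None


leaks_with_dict = still_alive_after_del({})
leaks_with_weak = still_alive_after_del(WeakKeyDictionary())
print(leaks_with_dict, leaks_with_weak)   # True False
```

The plain dictionary keeps the object alive after every other reference is gone, while the `WeakKeyDictionary` lets it be reclaimed.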


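# 32. Use `__getattr__`, `__getattribute__`, and `__setattr__` for lazy attributes

If the requested attribute cannot be found in the instance's `__dict__` (or anywhere else in the normal lookup), `__getattr__` is called: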

In [43]:
class LazyDB:
    def __init__(self):
        self.exists = 5
        
    def __getattr__(self, name):
        value = "Value for %s" % name
        setattr(self, name, value)  # add it to __dict__
        return value

In [44]:
data = LazyDB()
print("Before:", data.__dict__)
print("foo:   ", data.foo)
print("After: ", data.__dict__)

Before: {'exists': 5}
foo:    Value for foo
After:  {'exists': 5, 'foo': 'Value for foo'}


In [45]:
class LoggingLazyDB(LazyDB):
    def __getattr__(self, name):
        print("Called __getattr__(%s)" % name)
        return super().__getattr__(name)

data = LoggingLazyDB()
print("exists:", data.exists)
print("foo:   ", data.foo)
print("foo:   ", data.foo)

exists: 5
Called __getattr__(foo)
foo:    Value for foo
foo:    Value for foo


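As we can see, `__getattr__` is not called when the attribute already exists. If we want a hook that runs on every access, whether or not the attribute exists, we need `__getattribute__` instead.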

In [46]:
class ValidatingDB:
    def __init__(self):
        self.exists = 5
        
    def __getattribute__(self, name):
        print("Called __getattribute__(%s)" % name)
        try:
            return super().__getattribute__(name)  # this is object.__getattribute__, i.e. the normal attribute lookup; it raises AttributeError if the attribute is missing
        except AttributeError:
            value = "Value for %s" % name
            setattr(self, name, value)
            return value

In [47]:
data = ValidatingDB()
print("exists:", data.exists)
print("foo:   ", data.foo)
print("foo:   ", data.foo)

Called __getattribute__(exists)
exists: 5
Called __getattribute__(foo)
foo:    Value for foo
Called __getattribute__(foo)
foo:    Value for foo


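The `setattr` function used above actually calls the `__setattr__` method. It is not the only trigger: a plain assignment such as `self.attr = 5` calls this method too.

**When using these methods, be careful not to fall into the trap of infinite recursion.**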

In [1]:
class BrokenDictionaryDB:
    def __init__(self, data):
        self._data = data
        
    def __getattribute__(self, name):
        print("Called __getattribute__(%s)" % name)
        return self._data[name]

In [2]:
# data = BrokenDictionaryDB({"foo": 3})
# data.foo

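**Running the commented-out lines above fails with a `RecursionError`, because `self._data` itself calls `__getattribute__` again, which leads to infinite recursion. The solution is to call `super().__getattribute__("attrname")`: being the method of `object`, it is the standard attribute lookup and simply fetches the value from the instance's attribute dictionary.**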

In [3]:
class DictionaryDB:
    def __init__(self, data):
        self._data = data
        
    def __getattribute__(self, name):
        data_dict = super().__getattribute__("_data")
        return data_dict[name]

In [4]:
data = DictionaryDB({"foo": 3})
data.foo

3

**Likewise, to modify an attribute inside `__setattr__`, do it through `super().__setattr__`.**
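
A minimal sketch of a `__setattr__` override that delegates to `super().__setattr__`. The class `LoggingSavingDB` and its `assignments` list are hypothetical names for this illustration; the sketch also shows that both assignment syntax and the `setattr` builtin reach the hook.

```python
class LoggingSavingDB:
    """Hypothetical class that records every attribute assignment."""

    def __init__(self):
        # Bypass our own __setattr__ so the log itself is not logged.
        super().__setattr__("assignments", [])
        self.exists = 5                      # plain assignment: intercepted

    def __setattr__(self, name, value):
        self.assignments.append((name, value))
        # Delegate to object.__setattr__; assigning through self here
        # would call this method again and recurse forever.
        super().__setattr__(name, value)


data = LoggingSavingDB()
data.foo = 7                 # assignment syntax
setattr(data, "bar", 9)      # builtin setattr
print(data.assignments)      # [('exists', 5), ('foo', 7), ('bar', 9)]
```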

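# 33. Validate subclasses with metaclasses

The simplest job for a metaclass is to check that classes are defined consistently: that certain methods are overridden, or that class attributes satisfy some strict relationship.

A metaclass is defined by inheriting from `type`. For any class that uses the metaclass, Python passes the contents of its `class` statement to the metaclass's `__new__` method. We can inspect that information as follows: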

In [5]:
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        print((meta, name, bases, class_dict))
        return type.__new__(meta, name, bases, class_dict)

In [6]:
class MyClass(metaclass=Meta):
    stuff = 123
    
    def foo(self):
        pass

(<class '__main__.Meta'>, 'MyClass', (), {'__module__': '__main__', '__qualname__': 'MyClass', 'stuff': 123, 'foo': <function MyClass.foo at 0x00000214803501E0>})


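**So we can put the validation logic in `Meta.__new__`.** For example, suppose we want classes that represent polygons. We define a special validating metaclass and make the base class of the polygon hierarchy use it. Note that the logic in the metaclass should apply to subclasses of that base class, not to the base class itself (which is why `bases` has to be checked).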

In [8]:
class ValidatePolygon(type):
    def __new__(meta, name, bases, class_dict):
        if bases != (object,) and bases != ():
            if class_dict["sides"] < 3:
                raise ValueError("Polygons need 3+ sides")
        return type.__new__(meta, name, bases, class_dict)

In [9]:
class Polygon(metaclass=ValidatePolygon):
    sides = None
    
    @classmethod
    def interior_angles(cls):
        return (cls.sides - 2) * 180

In [10]:
class Triangle(Polygon):
    sides = 3

In [11]:
print("Before class")
class Line(Polygon):
    print("Before sides")
    sides = 1
    print("After sides")
    print("After class")

Before class
Before sides
After sides
After class


ValueError: Polygons need 3+ sides

**As the output shows, validation runs only after the whole body of the `class` statement has executed.**
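
As an aside, on Python 3.6 and later the same check can be written without a metaclass by using the `__init_subclass__` hook. This is a different technique from the one in these notes, sketched here only for comparison; the class names mirror the ones above.

```python
class Polygon:
    sides = None

    def __init_subclass__(cls, **kwargs):
        # Called once for every subclass, after its class body has run.
        # It is not called for Polygon itself, so no bases check is needed.
        super().__init_subclass__(**kwargs)
        if cls.sides < 3:
            raise ValueError("Polygons need 3+ sides")

    @classmethod
    def interior_angles(cls):
        return (cls.sides - 2) * 180


class Triangle(Polygon):
    sides = 3


try:
    class Line(Polygon):
        sides = 1
    line_rejected = False
except ValueError:
    line_rejected = True

print(Triangle.interior_angles(), line_rejected)   # 180 True
```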

# 34. Register subclasses with metaclasses

In [12]:
import json

class Serializable:
    def __init__(self, *args):
        self.args = args
        
    def serialize(self):
        return json.dumps({"args": self.args})

In [13]:
class Point2D(Serializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x = x
        self.y = y
    def __repr__(self):
        return "Point2D(%d, %d)" % (self.x, self.y)

In [14]:
point = Point2D(5, 3)
print("Object:    ", point)
print("Serialized:", point.serialize())  # serialize the Point2D object

Object:     Point2D(5, 3)
Serialized: {"args": [5, 3]}


In [16]:
class Deserializable(Serializable):
    """ Deserialization: a subclass can use this class method to rebuild an instance of itself from json_data """
    @classmethod
    def deserialize(cls, json_data):
        params = json.loads(json_data)
        return cls(*params["args"])

In [17]:
class BetterPoint2D(Deserializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x = x
        self.y = y
    def __repr__(self):
        return "Point2D(%d, %d)" % (self.x, self.y)

In [18]:
point = BetterPoint2D(5, 3)
print("Before:", point)
data = point.serialize()
print("Serialized:", data)
after = BetterPoint2D.deserialize(data)
print("After:", after)

Before: Point2D(5, 3)
Serialized: {"args": [5, 3]}
After: Point2D(5, 3)


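**The drawback of this approach is that we must already know the JSON data holds a `BetterPoint2D` before we can call the `BetterPoint2D` class method to deserialize it. Ideally the solution would be more general and would work without knowing the subclass.**

One workable approach is to record the class name in the JSON data: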

In [24]:
class BetterSerializable:
    def __init__(self, *args):
        self.args = args
        
    def serialize(self):
        return json.dumps({
            "class": self.__class__.__name__,
            "args": self.args
        })

**Then we keep a dictionary that maps class names to their constructors, and look the constructor up at deserialization time.**

In [20]:
registry = {}

def register_class(target_class):
    registry[target_class.__name__] = target_class
    
def deserialize(data):
    params = json.loads(data)
    name = params["class"]
    target_class = registry[name]
    return target_class(*params["args"])

In [25]:
class EvenBetterPoint2D(BetterSerializable):
    def __init__(self, x, y):
        super().__init__(x, y)
        self.x = x
        self.y = y
        
    def __repr__(self):
        return "%s(%d, %d)" % (self.__class__.__name__, self.x, self.y)

In [26]:
register_class(EvenBetterPoint2D)

In [27]:
point = EvenBetterPoint2D(5, 3)
print("Before:", point)
data = point.serialize()
print("Serialized:", data)
after = deserialize(data)
print("After:", after)

Before: EvenBetterPoint2D(5, 3)
Serialized: {"class": "EvenBetterPoint2D", "args": [5, 3]}
After: EvenBetterPoint2D(5, 3)


The whole process above works, but there is a catch: **developers can easily forget to call `register_class`**.
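
To see what forgetting looks like, here is a self-contained sketch that repeats the registry code from above in condensed form. `Point3D` is a hypothetical class that is deliberately left unregistered.

```python
import json

registry = {}


def register_class(target_class):
    registry[target_class.__name__] = target_class


def deserialize(data):
    params = json.loads(data)
    return registry[params["class"]](*params["args"])


class BetterSerializable:
    def __init__(self, *args):
        self.args = args

    def serialize(self):
        return json.dumps({
            "class": self.__class__.__name__,
            "args": self.args,
        })


class Point3D(BetterSerializable):
    def __init__(self, x, y, z):
        super().__init__(x, y, z)
        self.x, self.y, self.z = x, y, z


# register_class(Point3D) is left out on purpose.
data = Point3D(5, 9, -4).serialize()
try:
    deserialize(data)
    failed_with = None
except KeyError as exc:
    failed_with = exc.args[0]   # the class name missing from the registry

print(failed_with)   # Point3D
```

Serialization still succeeds, so the mistake only surfaces later, when the data is read back.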

**So we want `register_class` to be called automatically whenever a new serializable subclass is defined. A metaclass can do this.**

In [29]:
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        cls = type.__new__(meta, name, bases, class_dict)  # type.__new__ returns the class itself, unlike __new__ on an ordinary class, which returns an instance
        register_class(cls)
        return cls
    
class RegisteredSerializable(BetterSerializable, metaclass=Meta):
    pass

In [30]:
class Vector3D(RegisteredSerializable):
    def __init__(self, x, y, z):
        super().__init__(x, y, z)
        self.x, self.y, self.z = x, y, z

In [31]:
v3 = Vector3D(10, -7, 3)
print("Before:", v3)
data = v3.serialize()
print("Serialized:", data)
print("After:", deserialize(data))

Before: <__main__.Vector3D object at 0x000002148000B550>
Serialized: {"class": "Vector3D", "args": [10, -7, 3]}
After: <__main__.Vector3D object at 0x000002148000B668>


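# 35. Annotate class attributes with metaclasses

**An even more useful feature of metaclasses is that they let you modify or annotate a class's attributes right after the class is defined but before it is ever used.**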

In [33]:
class Field:
    def __init__(self, name):
        self.name = name
        self.internal_name = "_" + self.name
    
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, "")
    
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)

In [34]:
class Customer:
    # Class attributes
    first_name = Field("first_name")  # a small annoyance: first_name has to be written twice
    last_name = Field("last_name")
    prefix = Field("prefix")
    suffix = Field("suffix")

In [35]:
foo = Customer()
print("Before:", repr(foo.first_name), foo.__dict__)
foo.first_name = "Euclid"
print("After:", repr(foo.first_name), foo.__dict__)

Before: '' {}
After: 'Euclid' {'_first_name': 'Euclid'}


The fix is to use a metaclass:

In [36]:
class Meta(type):
    def __new__(meta, name, bases, class_dict):
        for key, value in class_dict.items():
            if isinstance(value, Field):
                value.name = key
                value.internal_name = "_" + key
        cls = type.__new__(meta, name, bases, class_dict)
        return cls

In [37]:
class DatabaseRow(metaclass=Meta):
    pass

In [39]:
class Field:
    def __init__(self):
        self.name = None
        self.internal_name = None
    
    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, "")
    
    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)

In [40]:
class BetterCustomer(DatabaseRow):
    first_name = Field()
    last_name = Field()
    prefix = Field()
    suffix = Field()

In [41]:
foo = BetterCustomer()
print("Before:", repr(foo.first_name), foo.__dict__)
foo.first_name = "Euler"
print("After:", repr(foo.first_name), foo.__dict__)

Before: '' {}
After: 'Euler' {'_first_name': 'Euler'}
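
On Python 3.6 and later, a descriptor can also learn its own name through the `__set_name__` hook, with no metaclass involved. This is an alternative to the technique in these notes, sketched for comparison; `PlainCustomer` is a hypothetical class name.

```python
class Field:
    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created.
        self.name = name
        self.internal_name = "_" + name

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, "")

    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)


class PlainCustomer:             # no metaclass and no special base class
    first_name = Field()
    last_name = Field()


foo = PlainCustomer()
before = (foo.first_name, dict(foo.__dict__))
foo.first_name = "Euler"
after = (foo.first_name, dict(foo.__dict__))
print(before, after)   # ('', {}) ('Euler', {'_first_name': 'Euler'})
```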
