## Custom attributes and descriptors
### object internals

#### object attributes and the access to attributes
* use dir() to show the list of an object's attribute names
  + this includes user-defined instance attributes, such as x and y in the following code example
  + this also includes many built-in attributes
    + \_\_dict\_\_ is a regular Python dictionary containing key-value pairs of the object's user-defined attributes
    + we can modify this dictionary to change, add, or even delete attributes of the object
    + we can test whether an attribute exists using the dictionary's in operator
  + although \_\_dict\_\_ allows us to manipulate object attributes, we should prefer
    + hasattr, getattr, setattr, and delattr to check existence, get values, set values, and delete attributes

#### code example to show dir()

In [16]:
class Vector:
    """A two-dimensional vector."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"{type(self).__name__}({self.x}, {self.y})"

In [17]:
    
v = Vector(5, 3)
print(v)
print(dir(v))

Vector(5, 3)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'x', 'y']


#### code example to show \_\_dict\_\_

In [8]:
# about __dict__
print(v.__dict__)
print(type(v.__dict__))

{'x': 5, 'y': 3}
<class 'dict'>


In [9]:
# modify attribute values 
print(v.__dict__['x'])
v.__dict__['x'] = 17
print(v.__dict__['x'])

# remove attribute
del v.__dict__['x']

# confirm x attribute is removed
print(v.x)

5
17


AttributeError: 'Vector' object has no attribute 'x'

In [12]:
# check whether an attribute exists
print('x' in v.__dict__)
print('y' in v.__dict__)

False
True


In [13]:
# insert attribute to the object
# let's re-insert x attribute
v.__dict__['x'] = 13
v

Vector(13, 3)

In [15]:
# we can add new attributes, but they will not be
# shown by the __repr__ method
v.__dict__['z'] = 42
v

Vector(13, 3)

#### code example to show attribute check, get, set and delete attributes

In [18]:
# attribute existence check
hasattr(v, 'a')

False

In [20]:
# get attribute value
getattr(v, 'x')

5

In [21]:
getattr(v, 'a')

AttributeError: 'Vector' object has no attribute 'a'

In [24]:
# set attribute value
setattr(v, 'y', 9)
print(v.__dict__['y'])

# set new attribute value
setattr(v, 'b', 10)
print(v.b)

9
10


In [25]:
# delete attribute
delattr(v, "b")
v.b

AttributeError: 'Vector' object has no attribute 'b'
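
getattr also accepts an optional third argument, a default that is returned instead of raising AttributeError. The sketch below uses a hypothetical Point class (not part of the notebook) as a stand-in for Vector.

```python
class Point:
    """Hypothetical stand-in for the Vector class above."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


pt = Point(5, 3)

# With a default, a missing attribute no longer raises AttributeError.
missing = getattr(pt, "a", None)
present = getattr(pt, "x", None)
print(missing, present)  # None 5

# hasattr reports the same information as a boolean.
print(hasattr(pt, "a"), hasattr(pt, "x"))  # False True
```

The default form is handy when a missing attribute is an expected case rather than an error.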

#### Dynamic attributes
* attributes with fixed names are sometimes inconvenient or confusing
* we can accommodate dynamic attribute names by accepting keyword arguments in the constructor
  * we then use the dictionary's update method to copy every keyword argument into \_\_dict\_\_, so each one becomes an attribute of the same name

In [27]:
class Vector:
    """An n-dimensional vector."""

    def __init__(self, **components):
        self.__dict__.update(components)

    def __repr__(self):
        return "{}({})".format(
            type(self).__name__,
            ", ".join(
                "{k}={v}".format(
                    k=k,
                    v=v,
                ) for k, v in self.__dict__.items()
            )    
        )


In [28]:
v = Vector(p=3, q=7)
v

Vector(p=3, q=7)

In [29]:
dir(v)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'p',
 'q']

In [30]:
print(v.p)
print(v.q)

3
7


#### How to make attributes immutable?
* for static attributes, we usually store the value under a name prefixed with \_, and then provide a property with only a getter to prevent modification
* for dynamic attributes, we don't know the names in advance, so we cannot declare property getters, which must be named at class definition time rather than at object instantiation time. Instead, we use \_\_getattr\_\_
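
For comparison with the dynamic case developed next, the static approach from the first bullet can be sketched as follows. Circle and radius are hypothetical names chosen for illustration.

```python
class Circle:
    """Hypothetical class with a fixed, read-only attribute."""

    def __init__(self, radius):
        self._radius = radius  # stored under a _-prefixed name

    @property
    def radius(self):
        # Getter only: no setter is defined, so assignment fails.
        return self._radius


c = Circle(2.5)
print(c.radius)  # 2.5

try:
    c.radius = 10
    assignment_failed = False
except AttributeError as exc:
    assignment_failed = True
    print(f"AttributeError: {exc}")
```

The exact error message differs between Python versions, so only the exception type should be relied on.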

In [31]:
class Vector:
    """An n-dimensional vector."""

    def __init__(self, **components):
        # store attributes as private prefixed with _ 
        private_components = {f"_{k}": v for k, v in components.items()}
        self.__dict__.update(private_components)

    def __repr__(self):
        return "{}({})".format(
            type(self).__name__,
            ", ".join(
                "{k}={v}".format(
                    k=k[1:],     # remove the _ prefix from attribute name
                    v=v,
                ) for k, v in self.__dict__.items()
            )
        )

In [32]:
v = Vector(p=3, q=7)
v

Vector(p=3, q=7)

In [33]:
# show that the attributes are private prefixed with _
dir(v)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_p',
 '_q']

#### intercepting attributes
* we need to intercept attribute access before an error is raised when an attribute is requested without its prefix
  + when an attribute is accessed without the \_ prefix, no attribute of that name exists and the normal attribute lookup fails
  + this failure invokes \_\_getattr\_\_
* difference between \_\_getattr\_\_ and \_\_getattribute\_\_
  + \_\_getattr\_\_: invoked only after a normal attribute lookup fails (prefer this method if you can)
  + \_\_getattribute\_\_: invoked for every attribute lookup (use this method only if you must)
* we use \_\_getattr\_\_ to intercept failures in our case
  + we know that we added a prefix to each input name, so in \_\_getattr\_\_ we add that prefix to the requested attribute name, and then use the regular getattr to retrieve the private attribute value
  + one problem is that we can still access the prefixed attributes directly and set their values
  + another problem of this design is that if we query an attribute that doesn't exist, getattr fails and invokes \_\_getattr\_\_ again, which calls getattr again, and so on; the recursion never terminates on its own and finally raises RecursionError
    + an apparent solution is to use hasattr to check whether the corresponding private attribute exists before calling getattr, but this doesn't work since hasattr also calls getattr internally
    + the solution is to query \_\_dict\_\_ directly, using the in operator to check whether the private attribute exists
* finally, instead of calling getattr, which goes through the full lookup machinery, we can fetch the attribute from \_\_dict\_\_ with a plain dictionary lookup inside a try/except block
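
The recursion problem described above can be reproduced with a deliberately naive version. NaiveVector is a hypothetical name; it is a sketch of the flawed first design, not the final class.

```python
class NaiveVector:
    """Flawed sketch: __getattr__ delegates to getattr()."""

    def __init__(self, **components):
        self.__dict__.update({f"_{k}": v for k, v in components.items()})

    def __getattr__(self, name):
        # If "_name" is missing too, getattr() re-enters __getattr__
        # with "_name", then "__name", and so on without end.
        return getattr(self, f"_{name}")


v = NaiveVector(p=3)
print(v.p)  # 3

try:
    v.a
    hit_limit = False
except RecursionError:
    hit_limit = True

print(hit_limit)  # True
```

Looking the name up in \_\_dict\_\_ directly avoids re-entering \_\_getattr\_\_, which is why the final version takes that route.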
  

In [12]:
class Vector:
    """An n-dimensional vector."""

    def __init__(self, **components):
        # store attributes as private prefixed with _ 
        private_components = {f"_{k}": v for k, v in components.items()}
        self.__dict__.update(private_components)

    def __getattr__(self, name):
        private_name = f"_{name}"
        try:
            return self.__dict__[private_name]
        except KeyError:
            raise AttributeError(f"{self!r} object has no attribute {name!r}")
    
    def __setattr__(self, name, value):
        raise AttributeError(f"can't set attribute {name!r}")
    
    def __repr__(self):
        return "{}({})".format(
            type(self).__name__,
            ", ".join(
                "{k}={v}".format(
                    k=k[1:],     # remove the _ prefix from attribute name
                    v=v,
                ) for k, v in self.__dict__.items()
            )
        )

In [13]:
v = Vector(p=3, q=7)
print(v.p)
print(v.q)

3
7


In [14]:
v.p = 7

AttributeError: can't set attribute 'p'

In [15]:
v.a

AttributeError: Vector(p=3, q=7) object has no attribute 'a'

In [10]:
v.p

3

#### preventing attribute deletion
* we prevent users from deleting attributes directly by overriding the \_\_delattr\_\_ method to raise AttributeError

In [22]:
# add __delattr__ method
class Vector:
    """An n-dimensional vector."""

    def __init__(self, **components):
        # store attributes as private prefixed with _ 
        private_components = {f"_{k}": v for k, v in components.items()}
        self.__dict__.update(private_components)

    def __getattr__(self, name):
        private_name = f"_{name}"
        try:
            return self.__dict__[private_name]
        except KeyError:
            raise AttributeError(f"{self!r} object has no attribute {name!r}")
    
    def __setattr__(self, name, value):
        raise AttributeError(f"can't set attribute {name!r}")
    
    def __delattr__(self, name):
        raise AttributeError(f"can't delete attribute {name!r}")
    
    def __repr__(self):
        return "{}({})".format(
            type(self).__name__,
            ", ".join(
                "{k}={v}".format(
                    k=k[1:],     # remove the _ prefix from attribute name
                    v=v,
                ) for k, v in self.__dict__.items()
            )
        )

In [28]:
v = Vector(p=3, q=7)
print(v.p)
print(v.q)

del v._q

3
7


AttributeError: can't delete attribute '_q'

In [29]:
v

Vector(p=3, q=7)

In [26]:
delattr(v, "_p")

AttributeError: can't delete attribute '_p'
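
These guards only cover the attribute protocol. As a hedged illustration, the condensed copy of Vector below (error messages shortened for brevity) shows that writing to \_\_dict\_\_ directly still bypasses both \_\_setattr\_\_ and \_\_delattr\_\_, so the immutability is a convention rather than a guarantee.

```python
class Vector:
    """Condensed copy of the immutable Vector above."""

    def __init__(self, **components):
        self.__dict__.update({f"_{k}": v for k, v in components.items()})

    def __getattr__(self, name):
        try:
            return self.__dict__[f"_{name}"]
        except KeyError:
            raise AttributeError(f"no attribute {name!r}") from None

    def __setattr__(self, name, value):
        raise AttributeError(f"can't set attribute {name!r}")

    def __delattr__(self, name):
        raise AttributeError(f"can't delete attribute {name!r}")


v = Vector(p=3, q=7)

# The overridden hooks are not consulted for direct dictionary access.
v.__dict__["_p"] = 99
del v.__dict__["_q"]

print(v.p)              # 99
print(hasattr(v, "q"))  # False
```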

#### customize attribute storage
* in the following code example, we create a subclass of Vector, ColoredVector, with an extra attribute, \_color
* \_color is stored as a list holding the red, green, and blue values
* by overriding the \_\_getattr\_\_() method, we can query red, green, and blue as if they were stored directly in the attribute dictionary
* similarly, by overriding the \_\_setattr\_\_() method, we can set red, green, and blue as if they were stored directly in the attribute dictionary
* in these two overridden methods, we first check whether the attribute name is in the COLOR_INDEXES tuple; if so, we use its index to locate the value in the \_color list stored in \_\_dict\_\_, otherwise we forward the request to the \_\_getattr\_\_ or \_\_setattr\_\_ of the superclass
* because every other name is forwarded to the superclass, users who try to modify the attributes inherited from the Vector class still get AttributeError, so those attributes remain immutable
* \_\_repr\_\_ previously read the private attributes directly, which would show \_color rather than the red, green, and blue values used in the constructor call. To fix this, we added another private method, \_args, which translates the private attributes into a dictionary that \_\_repr\_\_ can use. This dictionary is different from \_\_dict\_\_.
  + in the dictionary returned by \_args, red, green, and blue are added as entries, and the color entry is removed
  + \_\_repr\_\_ then queries this dictionary and generates a representation consistent with the constructor of ColoredVector
  + because \_\_repr\_\_ calls self.\_args(), polymorphism selects the \_args method of the instance's own class, which returns the appropriate dictionary
* this awkward implementation shows the advantage of composition over inheritance when the base class was not designed for extension (it assumes every constructor argument is stored as a private attribute with a \_ prefix)
  + we should use composition rather than inheritance in this case

In [32]:
# get the index of an element from the element value
test_tuple = ("red", "green", "blue")
print(test_tuple.index("red"))

test_list = ["red", "green", "blue"]
print(test_list.index("red"))

0
0


In [34]:
class Vector:
    """An n-dimensional vector."""

    def __init__(self, **components):
        private_components = {f"_{k}": v for k, v in components.items()}
        self.__dict__.update(private_components)

    def __getattr__(self, name):
        private_name = f"_{name}"
        try:
            return self.__dict__[private_name]
        except KeyError:
            raise AttributeError(f"{self!r} object has no attribute {name!r}")

    def __setattr__(self, name, value):
        raise AttributeError(f"Can't set attribute {name!r}")

    def __delattr__(self, name):
        raise AttributeError(f"Can't delete attribute {name!r}")

    def __repr__(self):
        return "{}({})".format(
            type(self).__name__,
            ", ".join(
                "{k}={v}".format(
                    k=k,
                    v=v,
                ) for k, v in self._args().items()
            )
        )

    def _args(self):
        return {k[1:]: v for k, v in self.__dict__.items()}


class ColoredVector(Vector):

    COLOR_INDEXES = ("red", "green", "blue")

    def __init__(self, red, green, blue, **components):
        super().__init__(**components)
        self.__dict__["_color"] = [red, green, blue]

    def __getattr__(self, name):
        try:
            channel = ColoredVector.COLOR_INDEXES.index(name)
        except ValueError:
            return super().__getattr__(name)
        else:
            return self.__dict__["_color"][channel]

    def __setattr__(self, name, value):
        try:
            channel = ColoredVector.COLOR_INDEXES.index(name)
        except ValueError:
            super().__setattr__(name, value)
        else:
            self.__dict__["_color"][channel] = value

    def _args(self):
        args = {
            "red": self.red,
            "green": self.green,
            "blue": self.blue,
        }
        args.update(super()._args())
        del args["color"]
        return args


In [35]:
cv = ColoredVector(red=32, green=244, blue=18, p=9, q=14)
cv

ColoredVector(red=32, green=244, blue=18, p=9, q=14)
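
The composition alternative recommended above could look roughly like this. It is only a sketch: the Color class and the two-argument ColoredVector constructor are hypothetical and change the constructor signature used in the notebook.

```python
class Vector:
    """Condensed immutable vector, as in the section above."""

    def __init__(self, **components):
        self.__dict__.update({f"_{k}": v for k, v in components.items()})

    def __getattr__(self, name):
        try:
            return self.__dict__[f"_{name}"]
        except KeyError:
            raise AttributeError(f"no attribute {name!r}") from None

    def __setattr__(self, name, value):
        raise AttributeError(f"can't set attribute {name!r}")

    def __repr__(self):
        args = ", ".join(f"{k[1:]}={v}" for k, v in self.__dict__.items())
        return f"{type(self).__name__}({args})"


class Color:
    """Hypothetical mutable RGB color."""

    def __init__(self, red, green, blue):
        self.red = red
        self.green = green
        self.blue = blue

    def __repr__(self):
        return f"Color(red={self.red}, green={self.green}, blue={self.blue})"


class ColoredVector:
    """Hypothetical composite: holds a Vector and a Color, inherits neither."""

    def __init__(self, vector, color):
        self.vector = vector
        self.color = color

    def __repr__(self):
        return f"ColoredVector(vector={self.vector!r}, color={self.color!r})"


cv = ColoredVector(Vector(p=9, q=14), Color(32, 244, 18))
cv.color.red = 55  # the color stays mutable
print(cv)
```

Each part keeps its own storage rules, so no \_\_getattr\_\_, \_\_setattr\_\_, or \_args overrides are needed in the composite.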

#### Using vars() to access \_\_dict\_\_
* called without an argument, vars() returns a dictionary corresponding to the current local symbol table, just like locals()
* called with an object as the argument, vars(obj) returns the \_\_dict\_\_ attribute of that object, which can be a module, class, instance, or any other object with a \_\_dict\_\_ attribute
* for example, vars(self)["\_color"] refers to the same value as self.\_\_dict\_\_["\_color"]
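
A short check of both forms of vars(), using a hypothetical Point class and helper function that are not part of the notebook:

```python
class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# With an argument, vars() hands back the very same dict object as __dict__.
same_object = vars(p) is p.__dict__
print(same_object)  # True

# Writing through vars() therefore creates a real attribute.
vars(p)["z"] = 3
print(p.z)  # 3


def local_names():
    a = 1
    b = 2
    # Without an argument, vars() behaves like locals().
    return sorted(vars())


print(local_names())  # ['a', 'b']
```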

#### Intercepting all attribute access with \_\_getattribute\_\_
* \_\_getattribute\_\_() is invoked for every attribute lookup, so it can be used to intercept all get requests
* we must be very careful when using this method
* never access attributes with the self dot operator inside it, since every dot lookup on self calls \_\_getattribute\_\_ again and recurses without end
  + access the attributes using super().\_\_setattr\_\_ or super().\_\_getattribute\_\_ instead
* in the following code, LoggingProxy accepts an object and stores it in the attribute "target"
  + when an attribute is requested, it retrieves the object from the "target" attribute using super().\_\_getattribute\_\_,
  + it then calls getattr on the target, with the attribute name, as normal
* line 14 of the code calls super().\_\_getattribute\_\_("\_\_class\_\_"), which relies on polymorphism to get the LoggingProxy class name from the self object
* this applies to all the methods defined in the LoggingProxy class due to polymorphism; see the super() demo code in this section
* the limitation of this implementation is that attributes cannot be set through LoggingProxy, which will be implemented in the next step

In [36]:
class LoggingProxy:
    """Intercept and log all attribute access to an object."""

    def __init__(self, target):
        super().__setattr__("target", target)

    def __getattribute__(self, name):
        target = super().__getattribute__("target")
        try:
            value = getattr(target, name)
        except AttributeError as e:
            raise AttributeError(
                "{} could not forward request {} to {}".format(
                    super().__getattribute__("__class__").__name__,
                    name,
                    target
                )
            ) from e
        print(f"Retrieved attribute {name} == {value!r} from {target!r}")
        return value


In [37]:
cv = ColoredVector(red=23, green=44, blue=328, p=9, q=14)
cw = LoggingProxy(cv)
cw.p

Retrieved attribute p == 9 from ColoredVector(red=23, green=44, blue=328, p=9, q=14)


9

In [38]:
cw.red

Retrieved attribute red == 23 from ColoredVector(red=23, green=44, blue=328, p=9, q=14)


23

In [39]:
cw.pink

AttributeError: LoggingProxy could not forward request pink to ColoredVector(red=23, green=44, blue=328, p=9, q=14)

#### super() demo
* this demo shows that a method reached through super() still operates on self, so it sees the class and attributes of the instance that invoked it
* get\_dict() is defined in BaseClass, but when it is called on a SubClass instance, self is that instance. As a result, the subclass instance's \_\_dict\_\_ is returned

In [48]:
class BaseClass:
    def __init__(self, a, b):
        self._a = a
        self._b = b
        
    def get_dict(self):
        return self.__dict__


class SubClass(BaseClass):
    def __init__(self, a, b, c):
        super().__init__(a, b)
        self._c = c
    
    def get_class(self):
        print(super().__getattribute__("__class__").__name__)


subclass = SubClass('a', 'b', 'c')
subclass.get_class()

SubClass


In [49]:
subclass.get_dict()

{'_a': 'a', '_b': 'b', '_c': 'c'}

#### Set attribute
* the above implementation cannot be used to set attributes on the target object, as shown in the following code
  + the problem is that the values are set directly as attributes of cw (the LoggingProxy object), so the attributes of the cv object are never changed, and the getter keeps retrieving the previously stored values
  + the solution is to implement the \_\_setattr\_\_ method
* the setter is very similar to the getter: we retrieve the target object and use setattr to set the value of the given attribute on it. If the target rejects the assignment with AttributeError, we raise a new AttributeError from the proxy
  + we can set the red, green, and blue attributes through the proxy, but we cannot set p and q, since they are immutable and raise errors

In [52]:
cv = ColoredVector(red=23, green=44, blue=328, p=9, q=14)
cw = LoggingProxy(cv)
cw.p = 13
cw.p

Retrieved attribute p == 9 from ColoredVector(red=23, green=44, blue=328, p=9, q=14)


9

In [55]:
class LoggingProxy:
    """Intercept and log all attribute access to an object."""

    def __init__(self, target):
        super().__setattr__("target", target)

    def __getattribute__(self, name):
        target = super().__getattribute__("target")
        try:
            value = getattr(target, name)
        except AttributeError as e:
            raise AttributeError(
                "{} could not forward request {} to {}".format(
                    super().__getattribute__("__class__").__name__,
                    name,
                    target
                )
            ) from e
        print(f"Retrieved attribute {name} == {value!r} from {target!r}")
        return value
    
    def __setattr__(self, name, value):
        target = super().__getattribute__("target")
        try:
            setattr(target, name, value)
        except AttributeError as e:
            raise AttributeError(
                "{} could not forward request {} to {}".format(
                    super().__getattribute__("__class__").__name__,
                    name,
                    target
                )
            ) from e
        print(f"Set attribute {name} == {value!r} on {target!r}")


In [57]:
cv = ColoredVector(red=23, green=44, blue=328, p=9, q=14)
cw = LoggingProxy(cv)
cw.red = 55
cw.red

Set attribute red == 55 on ColoredVector(red=55, green=44, blue=328, p=9, q=14)
Retrieved attribute red == 55 from ColoredVector(red=55, green=44, blue=328, p=9, q=14)


55
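
The rejected case mentioned above (setting p or q) is not shown in the notebook. A compact sketch of it follows, using a hypothetical Frozen target and a condensed Proxy without logging in place of ColoredVector and LoggingProxy.

```python
class Frozen:
    """Hypothetical target whose attributes cannot be reassigned."""

    def __init__(self, **values):
        self.__dict__.update(values)

    def __setattr__(self, name, value):
        raise AttributeError(f"can't set attribute {name!r}")


class Proxy:
    """Condensed forwarding proxy, without the logging."""

    def __init__(self, target):
        super().__setattr__("target", target)

    def __getattribute__(self, name):
        return getattr(super().__getattribute__("target"), name)

    def __setattr__(self, name, value):
        target = super().__getattribute__("target")
        try:
            setattr(target, name, value)
        except AttributeError as e:
            raise AttributeError(f"could not forward set of {name!r}") from e


proxy = Proxy(Frozen(p=9))

try:
    proxy.p = 13
    rejected = False
except AttributeError as exc:
    rejected = True
    print(f"{exc} (caused by: {exc.__cause__})")

print(proxy.p)  # 9
```

The `from e` clause keeps the target's original error available as `__cause__`, which helps when diagnosing why the forwarded assignment failed.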

#### built-in protocols bypass attribute lookup
* \_\_getattribute\_\_ only intercepts attribute lookup through the dot operator, and does not intercept built-in protocols
  + to demonstrate that, we can invoke cw.\_\_repr\_\_() and repr(cw)
    + the first invocation uses the dot operator, and is processed by \_\_getattribute\_\_
    + the second goes through the built-in function, and we get the default repr of LoggingProxy inherited from the object base class
* \_\_getattribute\_\_ can only intercept special method calls when the special method is retrieved explicitly (such as \_\_repr\_\_() rather than repr())
* built-in functions such as len(), iter(), and repr() bypass \_\_getattribute\_\_ for performance reasons
  + it is up to us to implement the special methods called by built-in functions, as demonstrated by the following code, where we implement \_\_repr\_\_ to call the \_\_repr\_\_ attribute of the target

In [58]:
cv = ColoredVector(red=23, green=44, blue=328, p=9, q=14)
cw = LoggingProxy(cv)
print(cw.__repr__())
print(repr(cw))

Retrieved attribute __repr__ == <bound method Vector.__repr__ of ColoredVector(red=23, green=44, blue=328, p=9, q=14)> from ColoredVector(red=23, green=44, blue=328, p=9, q=14)
ColoredVector(red=23, green=44, blue=328, p=9, q=14)
<__main__.LoggingProxy object at 0x000001741F8B4490>


In [65]:
class LoggingProxy:
    """Intercept and log all attribute access to an object."""

    def __init__(self, target):
        super().__setattr__("target", target)

    def __getattribute__(self, name):
        target = super().__getattribute__("target")
        try:
            value = getattr(target, name)
        except AttributeError as e:
            raise AttributeError(
                "{} could not forward request {} to {}".format(
                    super().__getattribute__("__class__").__name__,
                    name,
                    target
                )
            ) from e
        print(f"Retrieved attribute {name} == {value!r} from {target!r}")
        return value
    
    def __setattr__(self, name, value):
        target = super().__getattribute__("target")
        try:
            setattr(target, name, value)
        except AttributeError as e:
            raise AttributeError(
                "{} could not forward request {} to {}".format(
                    super().__getattribute__("__class__").__name__,
                    name,
                    target
                )
            ) from e
        print(f"Set attribute {name} == {value!r} on {target!r}")
        
    def __repr__(self):
        target = super().__getattribute__("target")
        repr_callable = getattr(target, "__repr__")
        return repr_callable()


In [66]:
cv = ColoredVector(red=23, green=44, blue=328, p=9, q=14)
cw = LoggingProxy(cv)
print(cw.__repr__())
print(repr(cw))

Retrieved attribute __repr__ == <bound method Vector.__repr__ of ColoredVector(red=23, green=44, blue=328, p=9, q=14)> from ColoredVector(red=23, green=44, blue=328, p=9, q=14)
ColoredVector(red=23, green=44, blue=328, p=9, q=14)
ColoredVector(red=23, green=44, blue=328, p=9, q=14)
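
The same bypass applies to other protocols. A minimal sketch with len(), using a hypothetical Proxy class wrapped around a list:

```python
class Proxy:
    """Hypothetical proxy that forwards every dotted lookup to its target."""

    def __init__(self, target):
        super().__setattr__("target", target)

    def __getattribute__(self, name):
        target = super().__getattribute__("target")
        return getattr(target, name)


p = Proxy([1, 2, 3])

# Explicit retrieval goes through __getattribute__ and reaches the list.
explicit_length = p.__len__()
print(explicit_length)  # 3

# len() looks for __len__ on type(p), skipping __getattribute__ entirely.
try:
    len(p)
    len_failed = False
except TypeError as exc:
    len_failed = True
    print(f"TypeError: {exc}")
```

To support len(p), the proxy class itself would need a \_\_len\_\_ method that forwards to the target, in the same way \_\_repr\_\_ is forwarded above.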


### Class internals

#### Class Attribute Lookup
* when we look at \_\_dict\_\_ of an instance, we only see its data attributes, not its methods
* yet we can get methods with getattr(obj, "\_\_repr\_\_"); why?
  + methods are attributes of another object: the class object associated with our instance
  + v.\_\_class\_\_.\_\_dict\_\_ contains all the methods defined in the class
    + these callables accept our class instance (here in the code, v) as an argument when executed
    + we can also use built-in functions to reach \_\_class\_\_.\_\_dict\_\_, as in vars(type(v))["\_\_repr\_\_"]
* \_\_dict\_\_ of a class object is not a normal dictionary, but a mappingproxy
  + mappingproxy is a special mapping type used internally in Python, which does not support item assignment
    + a TypeError is raised if modification is attempted
    + to set a class attribute, use setattr

In [80]:
v = Vector(x=3, y=7)
print(v.__dict__)
getattr(v, "__repr__")

{'_x': 3, '_y': 7}


<bound method Vector.__repr__ of Vector(x=3, y=7)>

In [81]:
# you can't assign items to class object's __dict__
v.__class__.__dict__['new_class_attribute'] = 5

TypeError: 'mappingproxy' object does not support item assignment

In [83]:
# execute the callable __repr__ with instance v as input
# and returns the repr representation string of v
r1 = v.__class__.__dict__["__repr__"](v)
print(r1)

# using built-in functions to retrieve repr
r2 = vars(type(v))["__repr__"](v)
print(r2)

# assign new class attribute by setattr
setattr(v.__class__, "new_class_attribute", 5)
print(v.__class__.__dict__["new_class_attribute"])

Vector(x=3, y=7)
Vector(x=3, y=7)
5
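
To make the relationship between the class dictionary and bound methods concrete, here is a small sketch with a hypothetical Greeter class. It shows that the class dictionary holds a plain function and that binding it to an instance produces the method obtained through the dot operator.

```python
class Greeter:
    """Hypothetical class used only for this illustration."""

    def greet(self):
        return "hello"


g = Greeter()

# The class dictionary is a mappingproxy holding a plain function.
print(type(vars(Greeter)).__name__)  # mappingproxy
func = vars(Greeter)["greet"]
print(type(func).__name__)           # function

# Binding the function to an instance yields a bound method.
bound = func.__get__(g, Greeter)
print(bound())                       # hello

# The dot operator performs the same binding for us.
print(g.greet.__func__ is func)      # True
print("greet" in vars(g))            # False
```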


#### A simplified algorithm that Python uses to find an attribute
* the lookup is implemented by \_\_getattribute\_\_, which Python uses to get all attributes
  + first, get the class of the object
  + if the attribute is in the instance dictionary, return it
  + if the attribute is in the class dictionary, or in that of any base class in the MRO, return it
  + if it is not found there, check whether the class defines \_\_getattr\_\_; if so, call \_\_getattr\_\_(obj, name), which returns a value or raises an error of its own
  + finally, if all the previous steps fail, raise AttributeError

In [None]:
class object:
    """
    This code is illustrative. The real object class and its methods are implemented in C.
    """

    def __getattribute__(obj, name):
        cls = type(obj)
        if name in vars(obj):
            return vars(obj)[name]
        if hasattr(cls, name):
            return getattr(cls, name)
        if hasattr(cls, "__getattr__"):
            return cls.__getattr__(obj, name)
        raise AttributeError(f"{cls.__name__} object has no attribute {name}")
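
The order described above can be observed directly. The sketch below uses hypothetical Config and WithProperty classes. Its last part shows one thing the simplified algorithm leaves out: a data descriptor on the class, such as a property, takes precedence over the instance dictionary.

```python
class Config:
    mode = "class-level"

    def __getattr__(self, name):
        # Reached only when the normal lookup has failed.
        return f"<fallback for {name}>"


c = Config()
from_class = c.mode                    # found in the class dictionary
c.__dict__["mode"] = "instance-level"
from_instance = c.mode                 # the instance dictionary now wins
from_fallback = c.missing              # found nowhere, so __getattr__ runs
print(from_class, from_instance, from_fallback)


class WithProperty:
    @property
    def mode(self):
        return "from property"


w = WithProperty()
w.__dict__["mode"] = "from instance dict"
from_property = w.mode                 # the property still wins
print(from_property)
```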

### Optimizing memory usage with slots

#### Slots Trade-offs
* slots can be used to reduce the memory used by instances, with a trade-off against flexibility
  + regular Python instance objects store attributes in a regular dictionary
    + we can measure the size of the dictionary using the getsizeof() function in the sys module
      + an empty dictionary is 64 bytes in the Python build used here, so large numbers of Python objects can be heavy
      + when new attributes are added to the object, the dictionary grows and the total size increases
* we can reduce memory usage by using slots
  + first, we declare a class attribute called \_\_slots\_\_ as a list of strings, each of which is the fixed name of an attribute we want all of our objects to contain
  + next, we initialize the attribute values in the \_\_init\_\_ method
  + the trade-off is that we can no longer dynamically add new attributes to instance objects, since the instance objects no longer have \_\_dict\_\_
* slots are usually not required
  + use them only when measurements indicate that they help to reduce memory usage
  + slots may interact with other Python features and diagnostic tools in surprising ways
   

In [84]:
import sys
d = {}
sys.getsizeof(d)

64

In [89]:
# a python Resistor object without using __slots__
import sys
class Resistor:

#     __slots__ = ["resistance_ohms", "tolerance_percent", "power_watts"]

    def __init__(self, resistance_ohms, tolerance_percent, power_watts):
        self.resistance_ohms = resistance_ohms
        self.tolerance_percent = tolerance_percent
        self.power_watts = power_watts


In [90]:
r10 = Resistor(10, 5, 25)
sys.getsizeof(r10) + sys.getsizeof(r10.__dict__)

152

In [91]:
# as new attributes are added, the total size of the object grows
r10.cost_dollars = 0.02
r10.tcr_ohms_per_kelvin = -0.0005
r10.inductance_henrys = 2e-9
sys.getsizeof(r10) + sys.getsizeof(r10.__dict__)

192

In [92]:
# example code using __slots__ in class definition
import sys
class Resistor:

    __slots__ = ["resistance_ohms", "tolerance_percent", "power_watts"]

    def __init__(self, resistance_ohms, tolerance_percent, power_watts):
        self.resistance_ohms = resistance_ohms
        self.tolerance_percent = tolerance_percent
        self.power_watts = power_watts


In [93]:
r10 = Resistor(10, 5, 25)
sys.getsizeof(r10) 

56
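
The flexibility trade-off can be checked with the same slotted Resistor class. In this sketch the instance has no \_\_dict\_\_, and assigning an undeclared attribute raises AttributeError.

```python
class Resistor:

    __slots__ = ["resistance_ohms", "tolerance_percent", "power_watts"]

    def __init__(self, resistance_ohms, tolerance_percent, power_watts):
        self.resistance_ohms = resistance_ohms
        self.tolerance_percent = tolerance_percent
        self.power_watts = power_watts


r10 = Resistor(10, 5, 25)

# Slotted instances carry no per-instance dictionary.
has_dict = hasattr(r10, "__dict__")
print(has_dict)  # False

# Names outside __slots__ cannot be added dynamically.
try:
    r10.cost_dollars = 0.02
    add_failed = False
except AttributeError as exc:
    add_failed = True
    print(f"AttributeError: {exc}")

# Declared slots can still be reassigned as usual.
r10.power_watts = 50
print(r10.power_watts)  # 50
```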

### Descriptors

#### Reviewing properties
* properties are based on the descriptor protocol
* in the following code example, we create a Planet class and use property decorators to validate every assignment, making sure all the attributes have valid values when they are set
  + this is called self-encapsulation, where all attributes are accessed through properties, even from within the class itself
  + this gives us validation on construction of objects for free
* the trade-offs of defining all these properties are
  + the amount of code explodes
  + code is duplicated; for example, the check for the positive-value constraint is repeated in every numeric setter
  + descriptors will ultimately provide a way out of this

In [95]:
class Planet:
    def __init__(
        self,
        name,
        radius_metres,
        mass_kilograms,
        orbital_period_seconds,
        surface_temperature_kelvin,
    ):
        self.name = name
        self.radius_metres = radius_metres
        self.mass_kilograms = mass_kilograms
        self.orbital_period_seconds = orbital_period_seconds
        self.surface_temperature_kelvin = surface_temperature_kelvin

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not value:
            raise ValueError("Cannot set empty name")
        self._name = value

    @property
    def radius_metres(self):
        return self._radius_metres

    @radius_metres.setter
    def radius_metres(self, value):
        if value <= 0:
            raise ValueError(f"radius_metres value {value} is not positive.")
        self._radius_metres = value

    @property
    def mass_kilograms(self):
        return self._mass_kilograms

    @mass_kilograms.setter
    def mass_kilograms(self, value):
        if value <= 0:
            raise ValueError(f"mass_kilograms value {value} is not positive.")
        self._mass_kilograms = value

    @property
    def orbital_period_seconds(self):
        return self._orbital_period_seconds

    @orbital_period_seconds.setter
    def orbital_period_seconds(self, value):
        if value <= 0:
            raise ValueError(f"orbital_period_seconds value {value} is not positive.")
        self._orbital_period_seconds = value

    @property
    def surface_temperature_kelvin(self):
        return self._surface_temperature_kelvin

    @surface_temperature_kelvin.setter
    def surface_temperature_kelvin(self, value):
        if value <= 0:
            raise ValueError(f"surface_temperature_kelvin value {value} is not positive.")
        self._surface_temperature_kelvin = value


In [97]:
pluto = Planet(
    name='Pluto',
    radius_metres=1184e3,
    mass_kilograms=1.305e22,
    orbital_period_seconds=7816012992,
    surface_temperature_kelvin=55
)

In [98]:
pluto.surface_temperature_kelvin = -1

ValueError: surface_temperature_kelvin value -1 is not positive.

#### Unravelling the property decorators
* in the following code example, we explicitly define the property object for each numeric property rather than using the property decorator
  + for each property, the getter and setter are defined as private methods and passed to the property constructor as the fget and fset arguments
  + these property objects are equivalent to those produced by the @property decorator: property is just a callable that returns a property object, which is in fact a descriptor that can be bound to a class attribute
  + after creating these property objects in the class body, we bind them to class attributes. Accessing these attributes on an instance invokes the corresponding getter and setter methods
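
As a side check of the descriptor claim, separate from the full Planet listing that follows, the sketch below uses a reduced, single-attribute Planet (a simplification made for illustration). It retrieves the property object from the class dictionary and calls its \_\_get\_\_ and \_\_set\_\_ methods by hand.

```python
class Planet:
    """Reduced, single-attribute version for illustration only."""

    def __init__(self, radius_metres):
        self.radius_metres = radius_metres

    def _get_radius_metres(self):
        return self._radius_metres

    def _set_radius_metres(self, value):
        if value <= 0:
            raise ValueError(f"radius_metres value {value} is not positive.")
        self._radius_metres = value

    radius_metres = property(fget=_get_radius_metres, fset=_set_radius_metres)


pluto = Planet(radius_metres=1184e3)

# The property object lives in the class dictionary, not on the instance.
prop = vars(Planet)["radius_metres"]
print(type(prop).__name__)                     # property
print(prop.fget is Planet._get_radius_metres)  # True

# pluto.radius_metres is equivalent to calling the descriptor's __get__.
print(prop.__get__(pluto, Planet))             # 1184000.0

# pluto.radius_metres = ... is equivalent to calling its __set__.
prop.__set__(pluto, 1188e3)
print(pluto.radius_metres)                     # 1188000.0
```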

In [107]:
class Planet:
    def __init__(
        self,
        name,
        radius_metres,
        mass_kilograms,
        orbital_period_seconds,
        surface_temperature_kelvin,
    ):
        self.name = name
        self.radius_metres = radius_metres
        self.mass_kilograms = mass_kilograms
        self.orbital_period_seconds = orbital_period_seconds
        self.surface_temperature_kelvin = surface_temperature_kelvin

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not value:
            raise ValueError("Cannot set empty name")
        self._name = value

    def _get_radius_metres(self):
        return self._radius_metres

    def _set_radius_metres(self, value):
        if value <= 0:
            raise ValueError(f"radius_metres value {value} is not positive.")
        self._radius_metres = value

    radius_metres = property(
        fget=_get_radius_metres,
        fset=_set_radius_metres,
    )

    def _get_mass_kilograms(self):
        return self._mass_kilograms

    def _set_mass_kilograms(self, value):
        if value <= 0:
            raise ValueError(f"mass_kilograms value {value} is not positive.")
        self._mass_kilograms = value

    mass_kilograms = property(
        fget=_get_mass_kilograms,
        fset=_set_mass_kilograms,
    )

    def _get_orbital_period_seconds(self):
        return self._orbital_period_seconds

    def _set_orbital_period_seconds(self, value):
        if value <= 0:
            raise ValueError(f"orbital_period_seconds value {value} is not positive.")
        self._orbital_period_seconds = value

    orbital_period_seconds = property(
        fget=_get_orbital_period_seconds,
        fset=_set_orbital_period_seconds,
    )

    def _get_surface_temperature_kelvin(self):
        return self._surface_temperature_kelvin

    def _set_surface_temperature_kelvin(self, value):
        if value <= 0:
            raise ValueError(f"surface_temperature_kelvin value {value} is not positive.")
        self._surface_temperature_kelvin = value

    surface_temperature_kelvin = property(
        fget=_get_surface_temperature_kelvin,
        fset=_set_surface_temperature_kelvin,
    )


In [108]:
pluto = Planet(
    name='Pluto',
    radius_metres=1184e3,
    mass_kilograms=1.305e22,
    orbital_period_seconds=7816012992,
    surface_temperature_kelvin=55
)

In [109]:
pluto.radius_metres = -13

ValueError: radius_metres value -13 is not positive.
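To check the claim that the decorator form and the explicit property(...) form produce the same kind of object, the following minimal sketch compares the two. The classes Decorated and Explicit are hypothetical stand-ins and are not part of the Planet example.

```python
class Decorated:
    """Hypothetical class using the decorator syntax."""

    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value


class Explicit:
    """Hypothetical class building the same property by hand."""

    def __init__(self, value):
        self._value = value

    def _get_value(self):
        return self._value

    def _set_value(self, new_value):
        self._value = new_value

    value = property(fget=_get_value, fset=_set_value)


for cls in (Decorated, Explicit):
    # Read the raw class attribute from the class dictionary.
    prop = cls.__dict__["value"]
    # Both are property objects, and both expose the descriptor methods.
    print(cls.__name__, type(prop).__name__,
          hasattr(prop, "__get__"), hasattr(prop, "__set__"))
```

Both loop iterations should report a property object that has \_\_get\_\_ and \_\_set\_\_, which is what makes it a descriptor.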

#### Implementing a descriptor
* a descriptor wraps three functions for getting, setting and deleting an attribute
* these three operations are backed by the methods of the descriptor protocol, namely:
  + get operation backed by \_\_get\_\_ to retrieve a value
  + set operation backed by \_\_set\_\_ to set a value
  + delete operation backed by \_\_delete\_\_ to delete a value
* in the following code example, we implement a descriptor class, Positive, which defines all three methods
  + in addition, it implements \_\_init\_\_() to configure new instances of the descriptor
* How to use the descriptor class?
  + in the file that defines the Planet class, import the Positive class
  + for each attribute that must satisfy the validity rule defined by Positive, bind the attribute to a descriptor by assigning a new Positive instance to it in the class body
  + then, in the \_\_init\_\_ method of Planet, assign values to each of these attributes, which are now backed by Positive instances
    + what happens when these attributes are assigned values in \_\_init\_\_():
      + when self.radius_metres = radius_metres is executed, it invokes the Positive.\_\_set\_\_ method on the corresponding descriptor instance
* what happens after a Planet object is created (for example, the pluto instance in the example code)?
  + when we evaluate pluto.radius_metres, we are equivalently calling
  ```python
     Positive.__get__(self, instance, owner)
  ```
   Here the three arguments of Positive.\_\_get\_\_ represent the following:
    + self: the descriptor instance referred to by the class attribute, here radius_metres
    + instance: pluto, the instance through which the descriptor was retrieved
    + owner: the class to which the descriptor is bound, in this case Planet
    + to summarize, we are calling 
    
    ```python
    Positive.__get__(Planet.__dict__['radius_metres'], pluto, Planet)
    
    ```
      + we have to reach into the class dictionary to retrieve the descriptor object without triggering the descriptor mechanism
      + note that the descriptor instance is an attribute of the Planet class object, not of the pluto instance
  + when we execute pluto.radius_metres = value, we are equivalently calling
  
  ```python
  Positive.__set__(Planet.__dict__['radius_metres'], pluto, value)
  ```
  this validates the value and stores it as pluto's radius_metres; the value is kept inside the descriptor, keyed by the pluto instance, rather than in pluto.\_\_dict\_\_
* a Positive instance does not know the name of the class attribute it is bound to. The reference chain runs from 
  + pluto to the Planet class attribute, which refers to a Positive descriptor instance, and then to the Positive.\_\_get\_\_ method. Nothing in this chain passes the attribute name along, so \_\_get\_\_() has no way to learn it

In [116]:
from weakref import WeakKeyDictionary


class Positive:
    """A data-descriptor for positive numeric values."""

    def __init__(self):
        self._instance_data = WeakKeyDictionary()

    def __get__(self, instance, owner):
        return self._instance_data[instance]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"Value {value} is not positive")
        self._instance_data[instance] = value

    def __delete__(self, instance):
        raise AttributeError("Cannot delete attribute")


In [118]:
class Planet:
    def __init__(
        self,
        name,
        radius_metres,
        mass_kilograms,
        orbital_period_seconds,
        surface_temperature_kelvin,
    ):
        self.name = name
        
        # this is equivalent to
        # Positive.__set__(Planet.__dict__['radius_metres'], self, radius_metres)
        self.radius_metres = radius_metres
        self.mass_kilograms = mass_kilograms
        self.orbital_period_seconds = orbital_period_seconds
        self.surface_temperature_kelvin = surface_temperature_kelvin

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not value:
            raise ValueError("Cannot set empty name")
        self._name = value

    radius_metres = Positive()
    mass_kilograms = Positive()
    orbital_period_seconds = Positive()
    surface_temperature_kelvin = Positive()


In [119]:
pluto = Planet(
    name='Pluto',
    radius_metres=1184e3,
    mass_kilograms=1.305e22,
    orbital_period_seconds=7816012992,
    surface_temperature_kelvin=55
)

In [121]:
pluto.radius_metres

1184000.0

In [122]:
Positive.__get__(Planet.__dict__['radius_metres'], pluto, Planet)

1184000.0

In [120]:
pluto.radius_metres = -13

ValueError: Value -13 is not positive

In [124]:
pluto.__dict__

{'_name': 'Pluto'}

In [126]:
Positive.__get__(Planet.__dict__['radius_metres'], pluto, Planet)

1184000.0
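The cells above demonstrate the \_\_get\_\_ equivalence only. As a minimal sketch of all three protocol methods, the hypothetical Traced descriptor below records which method ran for each kind of access. Traced, Sample and the \_traced\_value key are illustrative names, and storing the value in the instance dictionary is an assumption made to keep the sketch short (Positive uses a WeakKeyDictionary instead).

```python
class Traced:
    """Hypothetical data descriptor that logs every protocol call."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner):
        self.calls.append("__get__")
        return instance.__dict__["_traced_value"]

    def __set__(self, instance, value):
        self.calls.append("__set__")
        instance.__dict__["_traced_value"] = value

    def __delete__(self, instance):
        self.calls.append("__delete__")
        del instance.__dict__["_traced_value"]


class Sample:
    x = Traced()


s = Sample()
s.x = 10      # routed to Traced.__set__
s.x           # routed to Traced.__get__
del s.x       # routed to Traced.__delete__

# Fetch the descriptor from the class dictionary to avoid triggering __get__.
descriptor = Sample.__dict__["x"]
print(descriptor.calls)  # ['__set__', '__get__', '__delete__']

# The same operations written as explicit calls on the descriptor class.
Traced.__set__(descriptor, s, 20)
print(Traced.__get__(descriptor, s, Sample))  # 20
```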

#### Storing instance data
* in addition to the \_\_class\_\_ attribute, each descriptor instance has an instance attribute called \_instance\_data
  + \_instance\_data is an instance of a special collection type from the Python standard library called WeakKeyDictionary
  + WeakKeyDictionary is similar to a regular dictionary except that it holds only weak references to its keys: once a key object is referred to by nothing but the dictionary, the object can be garbage collected and its entry is discarded
  + the WeakKeyDictionary owned by each descriptor instance associates Planet instances with the values of the quantity represented by that descriptor, although the descriptor itself doesn't know which quantity it represents
  + in the code of the Positive class, each instance creates its own WeakKeyDictionary, called \_instance\_data. Since we have 4 properties, each one has a separate Positive instance with its own WeakKeyDictionary.
  + Each WeakKeyDictionary stores one value per Planet instance, with the instance as the key and the corresponding property value as the value. As a result, a single dictionary stores the radius_metres values for all Planet instances, another dictionary stores the mass_kilograms values for all Planet instances, and so on.
* the instance attribute values are therefore stored completely outside the instances, in a way that lets us reliably retrieve them in \_\_get\_\_; this is why pluto.\_\_dict\_\_ in the earlier cell contains only \_name

In [127]:
def inner_planets():

    mercury = Planet("Mercury",
                     radius_metres=2439.7e3,
                     mass_kilograms=3.3022e23,
                     orbital_period_seconds=7.60052e6,
                     surface_temperature_kelvin=340)

    venus = Planet("Venus",
                   radius_metres=6051.8e3,
                   mass_kilograms=4.8676e24,
                   orbital_period_seconds=1.94142e7,
                   surface_temperature_kelvin=737)

    earth = Planet("Earth",
                   radius_metres=6371.0e3,
                   mass_kilograms=5.972e24,
                   orbital_period_seconds=3.15581e7,
                   surface_temperature_kelvin=288)

    mars = Planet("Mars",
                  radius_metres=3389.5e3,
                  mass_kilograms=6.4185e23,
                  orbital_period_seconds=5.93543e7,
                  surface_temperature_kelvin=210)

    return mercury, venus, earth, mars

In [135]:
from pprint import pprint
# instantiate 4 Planets whose numeric attributes are managed by descriptors
mercury, venus, earth, mars = inner_planets()

# show that the instance attribute can be retrieved
print("radius metres for venus is " + str(venus.radius_metres))

# inspect the WeakKeyDictionary owned by the radius_metres attribute of the Planet class
# Planet.__dict__["radius_metres"] is a Positive instance, which has an _instance_data attribute
# this _instance_data is the WeakKeyDictionary storing all the values for radius_metres
pprint(dict(Planet.__dict__["radius_metres"]._instance_data))

radius metres for venus is 6051800.0
{<__main__.Planet object at 0x000001741FC145E0>: 3389500.0,
 <__main__.Planet object at 0x000001741FC146A0>: 3389500.0,
 <__main__.Planet object at 0x000001741FC152D0>: 6051800.0,
 <__main__.Planet object at 0x000001741FC161D0>: 6051800.0,
 <__main__.Planet object at 0x000001741FC17A90>: 6371000.0,
 <__main__.Planet object at 0x000001741FC17BE0>: 6371000.0,
 <__main__.Planet object at 0x000001741FD58C10>: 2439700.0,
 <__main__.Planet object at 0x000001741FD58FA0>: 1184000.0,
 <__main__.Planet object at 0x000001741FD598D0>: 2439700.0}


In [138]:
del mercury
pprint(dict(Planet.__dict__["radius_metres"]._instance_data))

{<__main__.Planet object at 0x000001741FC145E0>: 3389500.0,
 <__main__.Planet object at 0x000001741FC146A0>: 3389500.0,
 <__main__.Planet object at 0x000001741FC152D0>: 6051800.0,
 <__main__.Planet object at 0x000001741FC161D0>: 6051800.0,
 <__main__.Planet object at 0x000001741FC17A90>: 6371000.0,
 <__main__.Planet object at 0x000001741FC17BE0>: 6371000.0,
 <__main__.Planet object at 0x000001741FD58C10>: 2439700.0,
 <__main__.Planet object at 0x000001741FD58FA0>: 1184000.0,
 <__main__.Planet object at 0x000001741FD598D0>: 2439700.0}
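In an interactive session, other references can keep a key object alive, so the effect of weak keys is easier to see in isolation. The following minimal sketch uses a hypothetical Body class in place of Planet and a standalone WeakKeyDictionary named registry. The gc.collect() call is included as a precaution for interpreters that do not reclaim objects immediately.

```python
import gc
from weakref import WeakKeyDictionary


class Body:
    """Hypothetical stand-in for Planet; plain instances can be weakly referenced."""

    def __init__(self, name):
        self.name = name


registry = WeakKeyDictionary()

earth = Body("Earth")
mars = Body("Mars")
registry[earth] = 6371.0e3
registry[mars] = 3389.5e3
print(len(registry))  # 2

# Drop the only strong reference to the Mars object.
del mars
gc.collect()

# The dictionary held only a weak reference, so the entry is gone.
print(len(registry))                     # 1
print([body.name for body in registry])  # ['Earth']
```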


#### Accessing Descriptors via class
* we can access the Positive instance bound to an attribute through the \_\_dict\_\_ of the class object, but we cannot access it directly as an attribute of the class object (see the code in the following cells)
  + this is because access through the dot operator triggers the descriptor method \_\_get\_\_(self, None, owner): since we go through Planet itself, there is no instance to pass, so instance is None, and the lookup fails because WeakKeyDictionary cannot create a weak reference to None
  + to solve this problem we can modify the Positive class to return the Positive instance itself when the instance argument is None, as shown in the example code

In [140]:
# access Positive class instance from __dict__ of Planet class
Planet.__dict__["radius_metres"]

<__main__.Positive at 0x1741fd584c0>

In [142]:
# direct access from Planet class doesn't work
Planet.radius_metres

TypeError: cannot create weak reference to 'NoneType' object

In [145]:
from weakref import WeakKeyDictionary


class Positive:
    """A data-descriptor for positive numeric values."""

    def __init__(self):
        self._instance_data = WeakKeyDictionary()

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._instance_data[instance]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"Value {value} is not positive")
        self._instance_data[instance] = value

    def __delete__(self, instance):
        raise AttributeError("Cannot delete attribute")

class Planet:
    def __init__(
        self,
        name,
        radius_metres,
        mass_kilograms,
        orbital_period_seconds,
        surface_temperature_kelvin,
    ):
        self.name = name
        
        # this is equivalent to
        # Positive.__set__(Planet.__dict__['radius_metres'], self, radius_metres)
        self.radius_metres = radius_metres
        self.mass_kilograms = mass_kilograms
        self.orbital_period_seconds = orbital_period_seconds
        self.surface_temperature_kelvin = surface_temperature_kelvin

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not value:
            raise ValueError("Cannot set empty name")
        self._name = value

    radius_metres = Positive()
    mass_kilograms = Positive()
    orbital_period_seconds = Positive()
    surface_temperature_kelvin = Positive()
        
def inner_planets():

    mercury = Planet("Mercury",
                     radius_metres=2439.7e3,
                     mass_kilograms=3.3022e23,
                     orbital_period_seconds=7.60052e6,
                     surface_temperature_kelvin=340)

    venus = Planet("Venus",
                   radius_metres=6051.8e3,
                   mass_kilograms=4.8676e24,
                   orbital_period_seconds=1.94142e7,
                   surface_temperature_kelvin=737)

    earth = Planet("Earth",
                   radius_metres=6371.0e3,
                   mass_kilograms=5.972e24,
                   orbital_period_seconds=3.15581e7,
                   surface_temperature_kelvin=288)

    mars = Planet("Mars",
                  radius_metres=3389.5e3,
                  mass_kilograms=6.4185e23,
                  orbital_period_seconds=5.93543e7,
                  surface_temperature_kelvin=210)

    return mercury, venus, earth, mars

mercury, venus, earth, mars = inner_planets()

In [146]:
Planet.radius_metres

<__main__.Positive at 0x1741f63e230>
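Returning the descriptor itself when instance is None matches how the built-in property behaves when accessed through the class. The minimal sketch below puts a hypothetical Constant descriptor next to a property for comparison. Constant and Circle are illustrative names only.

```python
class Constant:
    """Hypothetical non-data descriptor that always yields a fixed value."""

    def __init__(self, value):
        self._value = value

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed through the class: hand back the descriptor itself.
            return self
        return self._value


class Circle:
    sides = Constant(0)

    @property
    def kind(self):
        return "circle"


print(type(Circle.sides).__name__)  # Constant
print(type(Circle.kind).__name__)   # property
print(Circle().sides)               # 0
print(Circle().kind)                # circle
```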

#### Setting descriptor names
* descriptor objects don't know the names of the class attributes to which they are bound
* before Python 3.6, this could be solved with a metaclass on the descriptor-owning class
  + this used to be one of the main use cases for metaclasses in Python
* since Python 3.6, descriptors can implement a special optional method called \_\_set\_name\_\_
  + \_\_set\_name\_\_(self, owner, name) is invoked when the owning class is created, to let the descriptor instance know its name
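For comparison, a metaclass-based approach to naming can be sketched as follows. This is only an illustration under stated assumptions: LegacyPositive, NamingMeta and Moon are hypothetical names, and the metaclass simply walks the class namespace and tells each descriptor the attribute name it was assigned to.

```python
from weakref import WeakKeyDictionary


class LegacyPositive:
    """Hypothetical descriptor without __set_name__; a metaclass names it."""

    def __init__(self):
        self._name = None
        self._instance_data = WeakKeyDictionary()

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._instance_data[instance]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self._name} {value} is not positive")
        self._instance_data[instance] = value


class NamingMeta(type):
    """Hypothetical metaclass that passes attribute names to descriptors."""

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        for attr_name, attr in namespace.items():
            if isinstance(attr, LegacyPositive):
                attr._name = attr_name
        return cls


class Moon(metaclass=NamingMeta):
    radius_metres = LegacyPositive()

    def __init__(self, radius_metres):
        self.radius_metres = radius_metres


print(Moon.radius_metres._name)  # radius_metres
try:
    Moon(-5)
except ValueError as error:
    print(error)  # radius_metres -5 is not positive
```

The drawback of this approach is that every class using the descriptor has to opt in to the metaclass, which \_\_set\_name\_\_ avoids.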

In [150]:
from weakref import WeakKeyDictionary


class Positive:
    """A data-descriptor for positive numeric values."""

    def __init__(self):
        self._instance_data = WeakKeyDictionary()
        
    def __set_name__(self, owner, name):
        self._name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._instance_data[instance]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self._name} {value} is not positive")
        self._instance_data[instance] = value

    def __delete__(self, instance):
        raise AttributeError(f"Cannot delete attribute {self._name}")

class Planet:
    def __init__(
        self,
        name,
        radius_metres,
        mass_kilograms,
        orbital_period_seconds,
        surface_temperature_kelvin,
    ):
        self.name = name
        
        # this is equivalent to
        # Positive.__set__(Planet.__dict__['radius_metres'], self, radius_metres)
        self.radius_metres = radius_metres
        self.mass_kilograms = mass_kilograms
        self.orbital_period_seconds = orbital_period_seconds
        self.surface_temperature_kelvin = surface_temperature_kelvin

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not value:
            raise ValueError("Cannot set empty name")
        self._name = value

    radius_metres = Positive()
    mass_kilograms = Positive()
    orbital_period_seconds = Positive()
    surface_temperature_kelvin = Positive()

In [151]:
mercury = Planet("Mercury",
                     radius_metres=2439.7e3,
                     mass_kilograms=3.3022e23,
                     orbital_period_seconds=7.60052e6,
                     surface_temperature_kelvin=340)

In [152]:
mercury.radius_metres = -1

ValueError: radius_metres -1 is not positive
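To make the timing of \_\_set\_name\_\_ visible, the following minimal sketch records the owner and name that each descriptor receives while the class body is being turned into a class. Named and Widget are hypothetical names used only for illustration.

```python
class Named:
    """Hypothetical descriptor that remembers where it was bound."""

    def __set_name__(self, owner, name):
        # Called once per descriptor when the owning class is created.
        self.owner_name = owner.__name__
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return f"{self.owner_name}.{self.name}"


class Widget:
    width = Named()
    height = Named()


# No Widget instance exists yet, but the names are already set.
print(Widget.width.name)   # width
print(Widget.height.name)  # height
print(Widget().height)     # Widget.height
```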

#### Data Descriptors and Non-data Descriptors
* a non-data descriptor implements only \_\_get\_\_; it cannot intercept assignment or deletion
* a data descriptor also implements \_\_set\_\_ or \_\_delete\_\_ or both
* the attribute lookup machinery gives data and non-data descriptors different precedence with respect to instance attributes
  + data descriptors take precedence over instance attributes in \_\_dict\_\_, which in turn take precedence over non-data descriptors
* the order is described in the following cell:
  + first get the class of the object
  + look the name up as a class attribute and store the result in cls_attribute; if it is not found, cls_attribute is marked as undefined
  + then try to retrieve \_\_get\_\_ from the type of the class attribute; if this succeeds, we know the attribute is some sort of descriptor, otherwise descriptor_get is marked as undefined
  + if descriptor\_get is not undefined, check whether the type of cls_attribute has a \_\_set\_\_ or \_\_delete\_\_ attribute; if so, this is a data descriptor, and the value is returned by calling descriptor\_get
  + otherwise, follow the order of instance attribute, non-data descriptor, class attribute, \_\_getattr\_\_ fallback and AttributeError
* for a code example of data and non-data descriptors, see the cells after the lookup pseudocode

In [None]:
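# pseudocode: a simplified model of the default lookup; `undefined` stands for
# a unique sentinel meaning "not found" and is not a real Python name
class object: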

    def __getattribute__(obj, name):
        cls = type(obj)
        cls_attribute = getattr(cls, name, undefined)
        descriptor_get = getattr(type(cls_attribute), "__get__", undefined)
        if descriptor_get is not undefined:
            if (hasattr(type(cls_attribute), "__set__")
                or hasattr(type(cls_attribute), "__delete__")):
                return descriptor_get(cls_attribute, obj, cls)  # data descriptor
        if name in vars(obj):
            return vars(obj)[name]                              # instance attribute
        if descriptor_get is not undefined:
            return descriptor_get(cls_attribute, obj, cls)      # non-data descriptor
        if cls_attribute is not undefined:                       
            return cls_attribute                                # class attribute
        if hasattr(cls, "__getattr__"):
            return cls.__getattr__(obj, name)                   # __getattr__ fallback
        raise AttributeError(f"{cls.__name__} object has no attribute {name}")
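The lookup order can also be written as an ordinary function and compared against the built-in getattr. The sketch below is a simplified model, not the interpreter's actual implementation: lookup, \_find\_in\_mro, \_MISSING and Demo are hypothetical names, and the class attribute is found by searching the dictionaries along the MRO so that no descriptor is triggered during the search.

```python
_MISSING = object()  # sentinel meaning "not found"


def _find_in_mro(cls, name):
    """Return the raw class attribute without triggering descriptors."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return _MISSING


def lookup(obj, name):
    """Simplified model of the default instance attribute lookup."""
    cls = type(obj)
    cls_attribute = _find_in_mro(cls, name)
    attr_type = type(cls_attribute)
    descriptor_get = getattr(attr_type, "__get__", _MISSING)
    if descriptor_get is not _MISSING and (
        hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")
    ):
        return descriptor_get(cls_attribute, obj, cls)  # data descriptor
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name]                      # instance attribute
    if descriptor_get is not _MISSING:
        return descriptor_get(cls_attribute, obj, cls)  # non-data descriptor
    if cls_attribute is not _MISSING:
        return cls_attribute                            # class attribute
    if hasattr(cls, "__getattr__"):
        return cls.__getattr__(obj, name)               # __getattr__ fallback
    raise AttributeError(f"{cls.__name__!r} object has no attribute {name!r}")


class Demo:
    label = "class attribute"

    @property
    def locked(self):          # property is a data descriptor
        return "data descriptor"

    def method(self):          # plain functions are non-data descriptors
        return "non-data descriptor"


d = Demo()
d.__dict__["locked"] = "instance"
d.__dict__["method"] = "instance"

print(lookup(d, "locked"))  # data descriptor
print(lookup(d, "method"))  # instance
print(lookup(d, "label"))   # class attribute
```

For these three names, lookup should agree with getattr(d, name).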


In [153]:
# examples of data and non-data descriptors
class DataDescriptor:

    def __get__(self, instance, owner):
        print(f"{type(self).__name__}.__get__({self!r}, {instance!r}, {owner!r})")

    def __set__(self, instance, value):
        # __set__ receives the value being assigned; it has no owner argument
        print(f"{type(self).__name__}.__set__({self!r}, {instance!r}, {value!r})")


class NonDataDescriptor:

    def __get__(self, instance, owner):
        print(f"{type(self).__name__}.__get__({self!r}, {instance!r}, {owner!r})")


class Owner:

    a = DataDescriptor()
    b = NonDataDescriptor()

In [155]:
# create an Owner object
obj = Owner()

# access its attribute a, which triggers the descriptor mechanism and prints the message from __get__
obj.a

DataDescriptor.__get__(<__main__.DataDescriptor object at 0x000001741F8956F0>, <__main__.Owner object at 0x000001741F938DC0>, <class '__main__.Owner'>)


In [156]:
# create an instance attribute that is also named a
obj.__dict__['a'] = "instance"
# when we access attribute a, the data descriptor
# takes precedence over the instance attribute, so we still get
# the message from the data descriptor
obj.a

DataDescriptor.__get__(<__main__.DataDescriptor object at 0x000001741F8956F0>, <__main__.Owner object at 0x000001741F938DC0>, <class '__main__.Owner'>)


In [157]:
# now do the same with attribute b, which is a non-data descriptor
# while no instance attribute named b exists, access goes through __get__
obj.b

NonDataDescriptor.__get__(<__main__.NonDataDescriptor object at 0x000001741F895840>, <__main__.Owner object at 0x000001741F938DC0>, <class '__main__.Owner'>)


In [158]:
# once an instance attribute named b exists, it takes precedence
# over the non-data descriptor
obj.__dict__['b'] = 'instance'
obj.b

'instance'
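One practical use of this precedence rule is a lazily computed attribute: a non-data descriptor computes the value on first access and stores it in the instance dictionary, after which the instance attribute shadows the descriptor. The sketch below illustrates the idea under stated assumptions. LazyAttribute and Circle are hypothetical names, and the computations counter exists only to show how often the function runs.

```python
import math


class LazyAttribute:
    """Hypothetical non-data descriptor that caches its result on the instance."""

    def __init__(self, func):
        self._func = func
        self._name = func.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self._func(instance)
        # Store under the same name: later lookups find the instance
        # attribute first and never reach this descriptor again.
        instance.__dict__[self._name] = value
        return value


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.computations = 0

    @LazyAttribute
    def area(self):
        self.computations += 1
        return math.pi * self.radius ** 2


c = Circle(2)
first = c.area    # computed by the descriptor
second = c.area   # served from c.__dict__
print(c.computations)        # 1
print("area" in c.__dict__)  # True
```

This only works because LazyAttribute defines neither \_\_set\_\_ nor \_\_delete\_\_; a data descriptor would keep taking precedence over the cached instance attribute.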