# Python data model
https://docs.python.org/3/reference/datamodel.html

First, understand the difference between `class/object/instance` and `method/function`.

https://www.codecademy.com/en/forum_questions/558cd3fc76b8fe06280002ce

In [14]:
# A is a class
class A:
    def __init__(self,a=0):
        self.a = a
    def f(self):
        pass
# Objects are Python's abstraction for data. Every object has an identity
# (see id()), a type and a value.
# Calling A() creates a new instance of the class; an assignment binds a name to that object.

In [15]:
c = d = A()  # one instance bound to two names

In [16]:
id(c)==id(d)

True

In [17]:
a, b = A(), A()  # two separate instances
id(a) == id(b)

False

In [18]:
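# a.f and b.f are temporary bound-method objects; the first is freed before the
# second is created, so CPython may reuse the same address and report equal ids.
id(a.f) == id(b.f)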

True
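The function/method distinction can be inspected directly. The following is a minimal sketch; the variable names `x`, `y`, `m1` and `m2` are made up for illustration.

```python
class A:
    def __init__(self, a=0):
        self.a = a

    def f(self):
        return self.a


x = A(1)
y = A(2)

# Accessed on the class, f is a plain function.
print(type(A.f).__name__)      # function
# Accessed on an instance, f is wrapped in a new bound method object.
print(type(x.f).__name__)      # method
# A bound method remembers its instance and the underlying function.
print(x.f.__self__ is x)       # True
print(x.f.__func__ is A.f)     # True
# Two bound methods that are alive at the same time are equal but distinct objects.
m1, m2 = x.f, x.f
print(m1 == m2, m1 is m2)      # True False
```

Keeping both bound methods alive (`m1`, `m2`) is what makes the identity comparison meaningful.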

## Objects, values and types


```sh
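1 Everything is an object
    Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected.
    Some objects contain references to other objects; these are called containers: list, tuple, dict...
2 An object's value is either mutable or immutable
    mutable: list, dict, set, bytearray, instances of most user-defined classes
    immutable: numbers, str, bytes, tuple, frozenset
    An object's type and identity (id) never change after creation.
    Immutability is not absolute: a tuple cannot be rebound to other items, but a mutable object it references can still change.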
```
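A small sketch of the "immutable container, mutable content" point from the notes above. The names `t`, `s` and `u` are illustrative only.

```python
t = (1, [2, 3])

# Rebinding a slot of the tuple is not allowed.
try:
    t[0] = 9
except TypeError as exc:
    print("TypeError:", exc)

# The list referenced by the tuple is still mutable; the tuple keeps its identity.
before = id(t)
t[1].append(4)
print(t)                  # (1, [2, 3, 4])
print(id(t) == before)    # True

# Strings are immutable: methods return a new object and leave the original alone.
s = "abc"
u = s.upper()
print(s, u)               # abc ABC
```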

In [1]:
a = [1,2,3]
b = {1:a}

In [2]:
b

{1: [1, 2, 3]}

In [3]:
a.append(4)

In [4]:
b

{1: [1, 2, 3, 4]}

In [5]:
a=(1,2,3)

In [6]:
id(a)

140421351725096

In [7]:
b=(1,2,3)

In [8]:
id(b)

140421343085336

In [9]:
c=a

In [10]:
id(c)

140421351725096

In [11]:
ord('a')

97

In [12]:
chr(97)

'a'

In [13]:
ord('我')

25105

In [14]:
chr(25102)

'戎'

## The standard type hierarchy

```sh
1 None
2 NotImplemented
3 ... (Ellipsis): the ellipsis literal is also an object with its own type
4 numbers.Integral (int bool)
5 numbers.Real (float)
6 numbers.Complex (complex)
7 Sequences a[i] a[i:j]
    Immutable sequences (Strings, Tuples, Bytes)
    Mutable sequences (Lists, Byte Arrays)
8 Set types (set frozenset)
9 Mappings (Dictionaries)
10 Callable types
    User-defined functions
        def xxx(p1,*args,**kwargs):
            pass
    Instance methods:
        def xxx(self,*args,**kwargs):
            pass
    Generator functions
        yield xxx
    Coroutine functions
        async def xxx
    Asynchronous generator functions
        async def func():
            ...
            yield
    Built-in functions: A built-in function object is a wrapper around a C function
        math.sin()
    Built-in methods:
        alist.append()
    Class Instances:
        Instances of a class that defines __call__() can be called directly
11 Modules
    import blinker
    type(blinker) is module; a regular package directory must contain an __init__.py
12 Custom classes
    class Test:
        pass
13 I/O objects (also known as file objects)
    fd = open('xx.txt')
14 Internal types
    These are rarely seen in everyday code.
    Code objects(区别于function objects)
    Frame objects: represent execution frames.
    Traceback objects
    Slice objects
    Static method objects
        A static method object is a wrapper around any other object, usually a user-defined method object. When a static method object is retrieved from a class or a class instance, the object actually returned is the wrapped object, which is not subject to any further transformation. Before Python 3.10 static method objects were not themselves callable; since 3.10 they are.

```
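The callable kinds listed above can be told apart with the standard `inspect` module. This is a sketch; the functions `plain`, `gen`, `coro`, `agen` and the class `Adder` are invented for the example.

```python
import inspect
import math


def plain(p1, *args, **kwargs):
    return p1


def gen():
    yield 1


async def coro():
    return 1


async def agen():
    yield 1


class Adder:
    """Instances are callable because the class defines __call__."""

    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return x + self.n


add2 = Adder(2)

print(inspect.isfunction(plain))            # True
print(inspect.isgeneratorfunction(gen))     # True
print(inspect.iscoroutinefunction(coro))    # True
print(inspect.isasyncgenfunction(agen))     # True
print(inspect.isbuiltin(math.sin))          # True  (built-in function)
print(inspect.isbuiltin([].append))         # True  (built-in method)
print(inspect.ismethod(add2.__call__))      # True  (instance method)
print(callable(add2), add2(40))             # True 42
```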

Further notes on staticmethod:
https://stackoverflow.com/questions/41921255/staticmethod-object-is-not-callable-switch-case


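This syntax is also valid: annotations on parameters and on the return value can be arbitrary expressions, and by default they are evaluated when the function is defined. It is rarely used this way.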


In [1]:
def f(a:1):
    print(a)

In [2]:
def f(a:1) -> print('after definition'):
    print(a)

after definition


In [3]:
f(1)

1
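Annotations are more commonly used as type hints. A minimal sketch, assuming a hypothetical function `scale`, showing where they are stored and that they are not enforced at runtime:

```python
def scale(value: float, factor: "any number" = 2) -> float:
    return value * factor


# Annotations are stored on the function object.
print(scale.__annotations__)
# They are not checked at call time: a str is accepted despite the float hint.
print(scale("ab"))    # abab
```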


## Special method names

### Basic customization

#### \_\_new\_\_ and \_\_init\_\_

https://www.cnblogs.com/slhs/p/7717045.html

```sh
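__new__ is called before __init__. It is a static method that receives the class (cls) and creates the instance.
__init__ is an instance method that initializes the newly created instance, e.g. by adding attributes.
__new__() to create it, and __init__() to customize it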
```

In [271]:
class A:
    def __new__(cls,a):
        print('1st __new__')
        return super().__new__(cls)
    def __init__(self,a):
        print('then call __init__')
        self.a = a
    def get_a(self):
        return self.a

In [272]:
a1 = A(2)

1st __new__
then call __init__


```
The arguments passed to the class call A(...) are passed to both __new__ and __init__
```

In [273]:
a1.get_a()

2
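One place where `__new__` is genuinely needed is subclassing an immutable type, whose value cannot be changed in `__init__`. A sketch with a made-up class `UpperStr`:

```python
class UpperStr(str):
    """A str subclass whose value is upper-cased at creation time."""

    def __new__(cls, value):
        # str is immutable, so the value must be fixed here, not in __init__.
        return super().__new__(cls, value.upper())

    def __init__(self, value):
        # Receives the same arguments as __new__; the string value is already set.
        self.original = value


s = UpperStr("hello")
print(s)                    # HELLO
print(s.original)           # hello
print(isinstance(s, str))   # True
```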

#### \_\_del\_\_

In [126]:
class A:
    pass
del A()

SyntaxError: can't delete function call (<ipython-input-126-0bdd8944ecc1>, line 3)

In [174]:
class A:
    def __del__(self):
        print('call __del__\n123')
a = A()
del a

call __del__
123
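Note that `del name` only removes a name; `__del__` runs when the object is actually destroyed. The sketch below assumes CPython, where reference counting destroys an object as soon as its last reference disappears (other implementations may delay this). `Resource` and `log` are illustrative names.

```python
log = []


class Resource:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        log.append("finalized " + self.name)


r = Resource("r1")
alias = r
del r                           # removes one name; the object is still referenced
after_first_del = list(log)
del alias                       # last reference gone, __del__ runs (CPython)
after_second_del = list(log)

print(after_first_del)          # []
print(after_second_del)         # ['finalized r1']
```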


#### \_\_str\_\_ and \_\_repr\_\_

In [472]:
class A:
    def __new__(cls,a):
        print('1st __new__')
        return super().__new__(cls)
    def __init__(self,a):
        print('then call __init__')
        self.a = a
    def get_a(self):
        return self.a
    def __str__(self):
        return ('this is str blalala..')
    def __repr__(self):
        return ('class A')

In practice, `__repr__` is mostly used for debugging.

In [473]:
repr(A(1))

1st __new__
then call __init__


'class A'

In [474]:
str(A(1))

1st __new__
then call __init__


'this is str blalala..'
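A short sketch of how the two methods interact, using hypothetical classes `Point` and `LabeledPoint`: `str()` falls back to `__repr__` when `__str__` is not defined, and containers always show the `repr` of their items.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, aimed at developers.
        return f"Point({self.x!r}, {self.y!r})"


class LabeledPoint(Point):
    def __str__(self):
        # Friendly form, aimed at end users.
        return f"({self.x}, {self.y})"


p = Point(1, 2)
q = LabeledPoint(3, 4)
print(str(p))          # Point(1, 2)   str falls back to __repr__
print(str(q))          # (3, 4)
print([q])             # [Point(3, 4)] containers use repr of their items
print(f"{q} {q!r}")    # (3, 4) Point(3, 4)
```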

#### rich comparison methods

```sh
object.__lt__(self, other)
object.__le__(self, other)
object.__eq__(self, other)
object.__ne__(self, other)
object.__gt__(self, other)
object.__ge__(self, other)
```
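Writing all six methods by hand is rarely necessary. A sketch using `functools.total_ordering`, which derives the remaining comparisons from `__eq__` plus one ordering method; the `Version` class is invented for the example.

```python
from functools import total_ordering


@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented   # let Python try the other operand
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Defined explicitly because overriding __eq__ removes the default hash.
        return hash(self._key())


print(Version(1, 2) < Version(1, 10))     # True
print(Version(2, 0) >= Version(1, 10))    # True  (derived by total_ordering)
print(Version(1, 2) == Version(1, 2))     # True
print(Version(1, 2) == "1.2")             # False
```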

#### \_\_hash\_\_ and \_\_eq\_\_

https://blog.csdn.net/lnotime/article/details/81194962

```sh
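0 Objects that implement __hash__ can be stored in a set and used as dict keys.
1 __hash__ and __eq__ should be defined together: objects that compare equal must have the same hash. __eq__ itself does not compare hash values; the hash only selects the bucket, then __eq__ confirms the match.
2 Mutable containers (like list or dict) should not implement __hash__, because the hash of a key must not change while it is stored.
3 User-defined classes get __eq__ and __hash__ by default: instances compare equal only to themselves (identity), and the hash is derived from id(). Equal hashes do not imply equal objects.
4 Do not override __eq__ casually: a class that overrides __eq__() without defining __hash__() has its __hash__ implicitly set to None.
5 Hash values of str and bytes objects are salted with a per-process random value, so they are not consistent across separate Python processes.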
```

In [477]:
a=(1,2)
b=(2,1)
a1=(1,2)
d=(1,3)
e = [1,2]

In [478]:
# list is unhashable: list.__hash__ is None, so calling it fails
e.__hash__()

TypeError: 'NoneType' object is not callable

In [479]:
# tuples are hashable, so they can be used as dictionary keys
d = {a:1 ,b:2,c:3}

In [481]:
d[a]

1

In [482]:
# tuples with the same elements in a different order are different values, and their hashes differ here
a.__hash__() == b.__hash__()

False

In [483]:
a.__hash__() == a1.__hash__()

True

In [484]:
a.__eq__(b)

False

In [485]:
a.__eq__(a1)

True
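The rules above, sketched for user-defined classes. `Point`, `EqOnly` and `Plain` are hypothetical classes: the first defines `__eq__` and `__hash__` over the same fields, the second defines only `__eq__` and becomes unhashable, the third keeps the identity-based defaults.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same fields that __eq__ compares.
        return hash((self.x, self.y))


class EqOnly:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.x == other.x


class Plain:
    pass


p1, p2 = Point(1, 2), Point(1, 2)
print(p1 == p2, hash(p1) == hash(p2))   # True True
print(len({p1, p2}))                    # 1

print(EqOnly.__hash__ is None)          # True
try:
    {EqOnly(1)}
except TypeError as exc:
    print("TypeError:", exc)

a, b = Plain(), Plain()
print(a == b, a == a)                   # False True  (default __eq__ is identity)
```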

#### \_\_class\_\_

Accessing `__class__` on an instance returns the class of that instance, through which other class attributes can be reached.


In [327]:
a = [1,2,3,4]

```py
def __deepcopy__(self, memo):
    r = self.__class__()
    for k, v in self.items():
        r[copy.deepcopy(k)] = copy.deepcopy(v)
    r.core = self.core
    return r
```

### Customizing attribute access

#### \_\_getattr\_\_ and \_\_getattribute\_\_

In [328]:
a = {"1":1, "2":2 ,"3":3}

With a normal dictionary, you use it like this:

In [329]:
a["1"]

1

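Can this be more convenient? People are lazy... for example, something like `a.1`?

In `a.x`, `x` is an attribute of `a`; serving such lookups is what `__getattr__` is for (attribute names must be valid identifiers, so a literal `a.1` is a syntax error).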

In [330]:
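class AttributeDict(dict):
    def __getattr__(self,name):
        try:
            value = self[name]
        except KeyError:
            # attribute protocols (hasattr, copy, pickle) expect AttributeError
            raise AttributeError(name) from None
        if isinstance(value,dict):
            value = AttributeDict(value)
        return value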

In [331]:
a = AttributeDict({"name":"Bob", "year":1986})
print(a.name)
print(a.year)

Bob
1986


```sh
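1 Default attribute lookup order: data descriptors on the class --> instance attributes (instance.__dict__) --> class attributes (class.__dict__) --> base-class attributes --> __getattr__() --> AttributeError
2 What is the difference between __getattr__ and __getattribute__?
    The former is called only when the attribute cannot be found through normal lookup (instance, class and base classes);
    the latter is called unconditionally, every attribute lookup goes through it first. If you override it, delegate to object.__getattribute__(self, name) to avoid infinite recursion; __getattr__ is then called only if it raises AttributeError.
3 The default __setattr__ stores the value in the instance dictionary, roughly self.__dict__['name'] = value (unless the class defines a data descriptor for that name).
4 __getitem__ gives an object dict-like obj['name'] access.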

```
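The lookup order from the notes above can be observed with a small sketch. `Base` and `Demo` are invented classes, and the `dyn_` prefix is an arbitrary convention chosen for the example.

```python
class Base:
    inherited = "from Base"


class Demo(Base):
    shared = "from class"

    def __init__(self):
        self.own = "from instance"

    def __getattr__(self, name):
        # Called only after normal lookup has failed.
        if name.startswith("dyn_"):
            return "computed " + name
        raise AttributeError(name)


d = Demo()
print(d.own)                    # from instance
print(d.shared)                 # from class
print(d.inherited)              # from Base
print(d.dyn_x)                  # computed dyn_x

d.shared = "shadowed"           # stored in d.__dict__; the class attribute is untouched
print(d.shared, Demo.shared)    # shadowed from class
print(hasattr(d, "missing"))    # False
```

Raising `AttributeError` from `__getattr__` for unknown names keeps `hasattr` and `getattr(obj, name, default)` working as expected.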

#### \_\_setitem\_\_ and \_\_getitem\_\_

In [632]:
class AttributeDict(dict):
    def __setitem__(self,name,value):
        print('__setitem__,')
        return super().__setitem__(name,value)
    
    def __getitem__(self,name):
        print('__getitem__, find in dict data-structure')
        return super().__getitem__(name)
    
    def __setattr__(self,name,value):
        print('__setattr__ , put name:value in instance.__dict__')
        return super().__setattr__(name,value)
        
    def __getattr__(self,name):
        print('__getattr__ if finally not find anything, come for me')
        try:
            # goto self.__getitem__
            value = self[name]
        except KeyError:
            print('None exsited key')
            return None
        if isinstance(value,dict):
            value = AttributeDict(value)
        return value
    def __getattribute__(self,name):
        print('__getarrtribute__ I handle all attribute at  very first')
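        # Delegate to object.__getattribute__ explicitly (using self.xxx here would recurse).
        # __getattr__ is invoked afterwards only if this lookup raises AttributeError.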
        return object.__getattribute__(self,name)

In [633]:
b={"1":1,"2":2}

In [634]:
b.1

SyntaxError: invalid syntax (<ipython-input-634-f14a8f337f4a>, line 1)

The key/value pairs of an AttributeDict are reachable as attributes because the `__getattr__` implementation above indirectly calls `__getitem__`.

In [635]:
a = AttributeDict({"name":"Bob", "year":1986})
print(a.name)

__getarrtribute__ I handle all attribute at  very first
__getattr__ if finally not find anything, come for me
__getitem__, find in dict data-structure
Bob


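For an attribute that does not exist and is not a key in the dict either, the lookup finally finds nothing (this `__getattr__` returns `None`).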

In [636]:
a.what

__getarrtribute__ I handle all attribute at  very first
__getattr__ if finally not find anything, come for me
__getitem__, find in dict data-structure
None exsited key


Adding the attribute `what` to `instance.__dict__` calls `__setattr__`:

In [637]:
a.what = 666

__setattr__ , put name:value in instance.__dict__


In [638]:
a.__dict__

__getarrtribute__ I handle all attribute at  very first


{'what': 666}

`a['what']` only looks in the dict data, not in the attributes, so it is not found:

In [639]:
a['what']

__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getitem__, find in dict data-structure


KeyError: 'what'

Put `'what': 123` into the dict:

In [640]:
a['what'] = 123

__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__setitem__,


Normal dictionary access:

In [641]:
a['what']

__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getitem__, find in dict data-structure


123

Attribute access finds the value in `a.__dict__`:

In [643]:
a.__dict__

__getarrtribute__ I handle all attribute at  very first


{'what': 666}

In [642]:
a.what

__getarrtribute__ I handle all attribute at  very first


666

In [646]:
a['what_again'] = 999

__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__getarrtribute__ I handle all attribute at  very first
__setitem__,


Normal attribute lookup fails, but the value of what_again is finally found through `__getitem__`:

In [647]:
a.what_again

__getarrtribute__ I handle all attribute at  very first
__getattr__ if finally not find anything, come for me
__getitem__, find in dict data-structure


999

In [648]:
a.__dict__

__getarrtribute__ I handle all attribute at  very first


{'what': 666}

The dict's items do not contain the attribute value `what: 666` (the key `'what'` there maps to 123):

In [650]:
a.items() 

__getarrtribute__ I handle all attribute at  very first


dict_items([('name', 'Bob'), ('year', 1986), ('what', 123), ('what_again', 999)])

By default, `str` and `repr` of a dict show its items:

In [653]:
str(a)

"{'name': 'Bob', 'year': 1986, 'what': 123, 'what_again': 999}"

In [654]:
repr(a)

"{'name': 'Bob', 'year': 1986, 'what': 123, 'what_again': 999}"

#### Implementing Descriptors \_\_get\_\_ \_\_set\_\_


https://www.cnblogs.com/astropeak/p/9032271.html

```sh

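Purpose of a descriptor: turn attribute access into a function call; that function controls the attribute's value (its return value) and can run custom logic before returning it.

A descriptor is placed as a class attribute and controls what instances of that class get back, and what extra work happens, when they access that attribute.

A class that implements any of the methods below follows the descriptor protocol and is a descriptor class; its instances are descriptors. When a descriptor appears in the
__dict__ of some class, it acts as an attribute of that class and its instances, and can be read or assigned directly.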


object.__get__(self, instance, owner)
object.__set__(self, instance, value)
object.__delete__(self, instance)
object.__set_name__(self, owner, name)

```
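A minimal sketch of the protocol above: a data descriptor that validates assignments. The names `Positive`, `Order` and the `_quantity` storage attribute are assumptions made for the example, not part of any library.

```python
class Positive:
    """Data descriptor that only accepts positive numbers."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created.
        self.storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self              # accessed on the class itself
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage, value)


class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity     # goes through Positive.__set__


o = Order(3)
print(o.quantity)        # 3
print(o.__dict__)        # {'_quantity': 3}
try:
    o.quantity = -1
except ValueError as exc:
    print("ValueError:", exc)
print(o.quantity)        # 3
```

The value is stored on the instance under a different name so that each `Order` keeps its own quantity while the single `Positive` object on the class handles access.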



### Serialization \_\_getstate\_\_ \_\_setstate\_\_

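This concerns serialization of Python objects, e.g. with pickle.

When an instance of a class is pickled, Python pickles only the value returned by that instance's `__getstate__()` method. Similarly, on unpickling, Python passes the unpickled value as the argument to the instance's `__setstate__()` method. Inside `__setstate__()`, the file object can be rebuilt from the pickled name and position information.

`__getstate__` and `__setstate__` control how an object is serialized; not every object can be serialized, for example a file descriptor.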

In [144]:
# illustrate __setstate__ and __getstate__  methods
# used in pickling.

class TextReader:
    "Print and number lines in a text file."
    def __init__(self,file):
        self.file = file
        self.fh = open(file,'r')
        self.lineno = 0

    def readline(self):
        self.lineno = self.lineno + 1
        line = self.fh.readline()
        if not line:
            return None
        return "%d: %s" % (self.lineno,line[:-1])

    # return data representation for pickled object
    def __getstate__(self):
        odict = self.__dict__.copy()    # get attribute dictionary
        del odict['fh']          # remove filehandle entry
        return odict

    # restore object state from data representation generated 
    # by __getstate__
    def __setstate__(self, state):
        fh = open(state['file'])  # reopen file
        count = state['lineno']   # read from file...
        while count:              # until line count is restored
            fh.readline()
            count = count - 1
        state['fh'] = fh          # create filehandle entry
        self.__dict__ = state     # make state our attribute dictionary
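
The same pattern can be tried without a file on disk. The sketch below uses a hypothetical `Counter` class that holds a `threading.Lock` (which cannot be pickled) and runs a full pickle round trip; it assumes the class is defined at module level so pickle can find it by name.

```python
import pickle
import threading


class Counter:
    def __init__(self, start=0):
        self.value = start
        self.lock = threading.Lock()   # locks cannot be pickled

    def increment(self):
        with self.lock:
            self.value += 1

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["lock"]              # drop the unpicklable entry
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.lock = threading.Lock()   # recreate it on load


c = Counter(5)
c.increment()
clone = pickle.loads(pickle.dumps(c))
print(clone.value)                     # 6
clone.increment()
print(clone.value, c.value)            # 7 6
```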

### Copy \_\_deepcopy\_\_ and \_\_copy\_\_

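This relates to Python's memory model and is somewhat similar to shallow versus deep copies in C++.

A shallow copy creates a new container but fills it with references to the same elements; only a deep copy recursively copies the elements too. Both are provided by the `copy` module.

Four levels: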
```sh
b = a                   plain assignment: same object, same id, just another name
c = a[:]                slicing: shallow copy
d = copy.copy(a)        shallow copy
e = copy.deepcopy(a)    deep copy

```

In [145]:
import copy

In [146]:
a =[1,2,3,4,[1,2,3]]
b= a
c = a[:]
d = copy.copy(a)
e = copy.deepcopy(a)

In [147]:
id(a)

140421341777224

In [148]:
id(b)

140421341777224

In [149]:
id(c)

140421341781832

In [150]:
id(d)

140421341942664

In [151]:
id(e)

140421342366536

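`copy.copy` and slicing are shallow copies; `deepcopy` is a deep copy into completely separate objects.
```
b = a only binds another name; both names refer to the same object.
A shallow copy holds a reference to the same nested list [1,2,3], so a change to that list is visible through the shallow copies too.
The top-level list itself is new in a shallow copy, so appending to a does not affect c or d.
```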

In [152]:
a[4].append(4);
print(a)
print(b)
print(c)
print(d)
print(e)

[1, 2, 3, 4, [1, 2, 3, 4]]
[1, 2, 3, 4, [1, 2, 3, 4]]
[1, 2, 3, 4, [1, 2, 3, 4]]
[1, 2, 3, 4, [1, 2, 3, 4]]
[1, 2, 3, 4, [1, 2, 3]]


In [153]:
a.append(5)
print(a)
print(b)
print(c)
print(d)
print(e)

[1, 2, 3, 4, [1, 2, 3, 4], 5]
[1, 2, 3, 4, [1, 2, 3, 4], 5]
[1, 2, 3, 4, [1, 2, 3, 4]]
[1, 2, 3, 4, [1, 2, 3, 4]]
[1, 2, 3, 4, [1, 2, 3]]


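Back to `__deepcopy__`: a class that implements this method controls how `copy.deepcopy` copies its instances. Built-in containers such as tuples and dicts are already handled by the `copy` module.

If you subclass dict and add attributes for which the default behaviour is not what you want (for example, an attribute that should be shared rather than copied), override this method.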

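```py
def __deepcopy__(self, memo):
    # Build an empty instance of the same (sub)class.
    r = self.__class__()
    # Pass memo along so shared and cyclic references are handled.
    for k, v in self.items():
        r[copy.deepcopy(k, memo)] = copy.deepcopy(v, memo)
    # core is deliberately shared, not copied.
    r.core = self.core
    return r
```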
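
A runnable version of this idea, as a sketch: `Registry` is a hypothetical dict subclass whose `core` attribute is assumed to be something that should stay shared between copies. Registering the clone in `memo` before copying the items lets self-references resolve to the clone.

```python
import copy


class Registry(dict):
    """dict subclass with an extra attribute (`core`) that stays shared."""

    def __init__(self, *args, core=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.core = core

    def __deepcopy__(self, memo):
        clone = self.__class__()
        memo[id(self)] = clone                 # register early to handle cycles
        for k, v in self.items():
            clone[copy.deepcopy(k, memo)] = copy.deepcopy(v, memo)
        clone.core = self.core                 # shared on purpose, not copied
        return clone


core = object()
r = Registry({"xs": [1, 2]}, core=core)
r["self"] = r                                  # a reference cycle

r2 = copy.deepcopy(r)
r["xs"].append(3)
print(r2["xs"])             # [1, 2]
print(r2.core is core)      # True
print(r2["self"] is r2)     # True
```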