# Introduction

## 1. The threading.local object

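In multithreaded Python programs, a thread sometimes needs variables of its own that other threads cannot see. An elegant way to get them is the `threading.local` class.

An attribute set on an instance of this class from within a thread is visible only to that thread.

For example: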

In [36]:
import threading

local = threading.local()

def run_on_thread():
    x = getattr(local, "x", None)
    print(x)
    local.x = "sub-thread"
    
local.x = "main-thread"
print(local.x)
t = threading.Thread(target=run_on_thread)
t.start()
t.join()


main-thread
None


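As the example shows, the attribute `x` that the main thread stored on `local` is not visible in the child thread. Likewise, assigning `x` in the child thread does not affect the main thread's value. Although `local` is a single global object, it behaves as if every thread had its own private copy of it.

How is `threading.local` implemented? The pure-Python version in the standard library (`_threading_local.py`, which `threading` falls back to when the C implementation is unavailable) is quite short. It consists of two classes: `_localimpl` and `local`.

`_localimpl` holds a dictionary whose keys are the `id()` of each `Thread` object and whose values are pairs of a weak reference to the thread and a dict created for that thread. So when a thread reads or writes an attribute of the `local` object, it is really reading or writing the dict that `_localimpl` created for that thread. The per-thread data is presumably not stored directly in `thread.__dict__` in order to avoid polluting the thread's own `__dict__`.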

In [None]:
from threading import current_thread
from weakref import ref

class _localimpl:
    """A class managing thread-local dicts"""
    __slots__ = 'key', 'dicts', 'localargs', 'locallock', '__weakref__'

    def __init__(self):
        # The key used in the Thread objects' attribute dicts.
        # We keep it a string for speed but make it unlikely to clash with
        # a "real" attribute.
        self.key = '_threading_local._localimpl.' + str(id(self))
        # { id(Thread) -> (ref(Thread), thread-local dict) }
        self.dicts = {}

    def get_dict(self):
        """Return the dict for the current thread. Raises KeyError if none
        defined."""
        thread = current_thread()
        return self.dicts[id(thread)][1]

    def create_dict(self):
        """Create a new dict for the current thread, and return it."""
        localdict = {}
        key = self.key
        thread = current_thread()
        idt = id(thread)
        def local_deleted(_, key=key):
            # When the localimpl is deleted, remove the thread attribute.
            thread = wrthread()
            if thread is not None:
                del thread.__dict__[key]
        def thread_deleted(_, idt=idt):
            # When the thread is deleted, remove the local dict.
            # Note that this is suboptimal if the thread object gets
            # caught in a reference loop. We would like to be called
            # as soon as the OS-level thread ends instead.
            local = wrlocal()
            if local is not None:
                dct = local.dicts.pop(idt)
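        # Register a callback: when the local is destroyed, the weak reference
        # to it stored on the thread is removed.
        wrlocal = ref(self, local_deleted)
        # Register a callback: when the thread is destroyed, the dict kept for
        # that thread in the local is removed.
        wrthread = ref(thread, thread_deleted)
        # wrlocal must be kept alive somewhere, otherwise its callback never fires.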
        thread.__dict__[key] = wrlocal
        self.dicts[idt] = wrthread, localdict
        return localdict
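
The bookkeeping above can be illustrated with a much smaller sketch. `PerThreadDicts` below is a hypothetical class written for this note, not part of the standard library. It assumes the same layout (`id(Thread)` mapped to a weak reference plus a dict) and shows how the weak-reference callback discards a thread's entry once the `Thread` object is gone.

```python
import gc
import threading
from weakref import ref


class PerThreadDicts:
    """Hypothetical, simplified stand-in for the registry kept by _localimpl."""

    def __init__(self):
        # { id(Thread) -> (weak reference to the Thread, that thread's dict) }
        self.dicts = {}

    def get_dict(self):
        """Return the current thread's dict, creating it on first use."""
        thread = threading.current_thread()
        idt = id(thread)
        if idt not in self.dicts:
            wrself = ref(self)  # weak, so the callback does not keep self alive

            def thread_deleted(_, idt=idt):
                # Runs when the Thread object is garbage-collected.
                registry = wrself()
                if registry is not None:
                    registry.dicts.pop(idt, None)

            self.dicts[idt] = (ref(thread, thread_deleted), {})
        return self.dicts[idt][1]


registry = PerThreadDicts()
registry.get_dict()["x"] = "main-thread"

seen_in_worker = []


def worker():
    d = registry.get_dict()
    seen_in_worker.append(d.get("x"))  # the main thread's "x" is not visible here
    d["x"] = "sub-thread"


t = threading.Thread(target=worker)
t.start()
t.join()
entries_while_thread_object_alive = len(registry.dicts)

del t         # drop the last reference to the Thread object
gc.collect()  # in case the Thread object is caught in a reference cycle
entries_after_thread_object_freed = len(registry.dicts)
```

Note that the entry disappears when the `Thread` object is collected, not when the OS-level thread exits, which is the limitation the comment in `thread_deleted` points out.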

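The second class, `local`, acts as a proxy. When a thread accesses an attribute of a `local` instance, `_patch` first looks up that thread's dict in `_localimpl` and then installs it as the instance's `__dict__`. The swap is done under a lock (an `RLock`) that stays held for the duration of the attribute access, because several threads replacing the shared instance's `__dict__` at the same time could otherwise produce unpredictable results.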

In [None]:
from contextlib import contextmanager
from threading import RLock

@contextmanager
def _patch(self):
    impl = object.__getattribute__(self, '_local__impl')
    try:
        dct = impl.get_dict()
    except KeyError:
        dct = impl.create_dict()
        args, kw = impl.localargs
        self.__init__(*args, **kw)
    # Hold the lock so that several threads cannot swap the local's __dict__
    # at the same time, which could cause unpredictable errors.
    with impl.locallock:
        object.__setattr__(self, '__dict__', dct)
        yield


class local:
    __slots__ = '_local__impl', '__dict__'

    def __new__(cls, *args, **kw):
        if (args or kw) and (cls.__init__ is object.__init__):
            raise TypeError("Initialization arguments are not supported")
        self = object.__new__(cls)
        impl = _localimpl()
        impl.localargs = (args, kw)
        impl.locallock = RLock()
        object.__setattr__(self, '_local__impl', impl)
        # We need to create the thread dict in anticipation of
        # __init__ being called, to make sure we don't call it
        # again ourselves.
        impl.create_dict()
        return self

    def __getattribute__(self, name):
        with _patch(self):
            return object.__getattribute__(self, name)

    def __setattr__(self, name, value):
        if name == '__dict__':
            raise AttributeError(
                "%r object attribute '__dict__' is read-only"
                % self.__class__.__name__)
        with _patch(self):
            return object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if name == '__dict__':
            raise AttributeError(
                "%r object attribute '__dict__' is read-only"
                % self.__class__.__name__)
        with _patch(self):
            return object.__delattr__(self, name)
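
One visible consequence of the `except KeyError` branch in `_patch` is that a subclass's `__init__` runs again, with the original arguments, the first time each new thread touches the object. The sketch below uses the public `threading.local`; the `Settings` class and its `retries` attribute are made up for illustration.

```python
import threading


class Settings(threading.local):
    """Hypothetical subclass; __init__ runs once per thread that uses the instance."""

    def __init__(self, retries):
        self.retries = retries


settings = Settings(3)
settings.retries += 1  # only changes the main thread's copy

observed = {}


def worker():
    # First access from this thread: __init__(3) runs again for this thread.
    observed["worker"] = settings.retries


t = threading.Thread(target=worker)
t.start()
t.join()
observed["main"] = settings.retries
```

The worker sees the freshly initialised value while the main thread keeps its modified one, which is why `__new__` stores the constructor arguments in `impl.localargs`.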


## 2. The Local implementation in Werkzeug

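Werkzeug is a WSGI utility library that ships its own `Local` object. Its author gives the following reasons for implementing one:
1. Werkzeug relies on thread-local-style storage to cope with concurrency, but Python's built-in thread-locals only work for thread-based concurrency. Python has other concurrency models as well, such as coroutines (greenlets), so a `Local` object that also supports them is needed.
2. WSGI does not guarantee that every request is handled by a fresh thread; threads may be reused (for example, through a thread pool). If Werkzeug used Python's built-in thread-locals, a "dirty" thread still holding data from an earlier request could end up handling a new one. The author therefore wants to be able to clear the contents of the `Local` object explicitly.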
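
The second point is easy to reproduce. The sketch below assumes a pool with a single worker thread (so reuse is guaranteed) and a made-up `handle` function standing in for a request handler. The value left behind by the first call is still there when the second call runs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

request_ctx = threading.local()


def handle(user):
    """Hypothetical request handler: returns whatever the previous request left behind."""
    leftover = getattr(request_ctx, "user", None)
    request_ctx.user = user  # never cleaned up
    return leftover


# One worker thread, so both tasks run on the same thread.
with ThreadPoolExecutor(max_workers=1) as pool:
    first = pool.submit(handle, "alice").result()
    second = pool.submit(handle, "bob").result()

print(first, second)
```

The first request finds nothing, while the second one finds the first request's user still attached to the reused thread.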

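Werkzeug's `Local` is very simple: it maintains a single dictionary (`__storage__`) shared by all threads and greenlets. Its keys are thread/greenlet identifiers (the return value of `get_ident`), and its values are dicts holding each thread's or greenlet's own variables.

Here is the code: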

In [None]:
# Since each thread has its own greenlet we can just use those as identifiers
# for the context.  If greenlets are not available we fall back to the
# current thread ident.
try:
    from greenlet import getcurrent as get_ident
except ImportError:  # noqa
    try:
        from thread import get_ident  # noqa
    except ImportError:  # noqa
        try:
            from _thread import get_ident  # noqa
        except ImportError:  # noqa
            from dummy_thread import get_ident  # noqa

class Local(object):
    __slots__ = ('__storage__', '__ident_func__')

    def __init__(self):
        object.__setattr__(self, '__storage__', {})
        object.__setattr__(self, '__ident_func__', get_ident)

    def __iter__(self):
        return iter(self.__storage__.items())

    def __call__(self, proxy):
        """Create a proxy for a name."""
        return LocalProxy(self, proxy)

    def __release_local__(self):
        self.__storage__.pop(self.__ident_func__(), None)

    def __getattr__(self, name):
        try:
            return self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        ident = self.__ident_func__()
        storage = self.__storage__
        try:
            storage[ident][name] = value
        except KeyError:
            storage[ident] = {name: value}

    def __delattr__(self, name):
        try:
            del self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)
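
To see why an explicit release hook is useful, here is a stripped-down sketch of the same idea. `MiniLocal` and `handle` are hypothetical names, `threading.get_ident` stands in for the greenlet-aware `get_ident` above, and `release()` plays the role of `__release_local__`: calling it at the end of each request leaves a reused thread clean.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MiniLocal:
    """Hypothetical, trimmed-down version of the Local class above (threads only)."""

    def __init__(self):
        # Bypass our own __setattr__ so the storage dict lands on the instance.
        object.__setattr__(self, "_storage", {})

    def __getattr__(self, name):
        # Only called when normal lookup fails, so self._storage is safe here.
        try:
            return self._storage[threading.get_ident()][name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self._storage.setdefault(threading.get_ident(), {})[name] = value

    def release(self):
        """Drop everything stored for the current thread."""
        self._storage.pop(threading.get_ident(), None)


ctx = MiniLocal()


def handle(user):
    leftover = getattr(ctx, "user", None)  # data from an earlier request, if any
    ctx.user = user
    try:
        return leftover
    finally:
        ctx.release()  # clean up before the thread is reused


# One worker thread, so both requests run on the same thread.
with ThreadPoolExecutor(max_workers=1) as pool:
    leftovers = [pool.submit(handle, u).result() for u in ("alice", "bob")]
```

Because each request releases its own entry, the second request on the reused thread starts with empty storage.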

## Related resources
* [Why use `__slots__`?](https://stackoverflow.com/questions/472000/usage-of-slots)
* [What is greenlet?](https://greenlet.readthedocs.io/en/latest/#indices-and-tables)
* [Understanding ThreadLocal variables in Python in depth](https://juejin.im/entry/58217e100ce46300589e02c7)

In [27]:
from greenlet import greenlet

def test1():
    print(12)
    gr2.switch()
    print(34)
    gr2.switch()

def test2():
    print(56)
    gr1.switch()
    print(78)

gr1 = greenlet(test1)
gr2 = greenlet(test2)
gr1.switch()

12
56
34
78


In [29]:
import greenlet
def test1(x, y):
    print(id(greenlet.getcurrent()), id(greenlet.getcurrent().parent))  # ids of gr1 and its parent (the main greenlet)
    z = gr2.switch(x+y)
    print('back z', z)

def test2(u):
    print(id(greenlet.getcurrent()), id(greenlet.getcurrent().parent))  # ids of gr2 and its parent (the main greenlet)
    return 'hehe'

gr1 = greenlet.greenlet(test1)
gr2 = greenlet.greenlet(test2)
print(id(greenlet.getcurrent()), id(gr1), id(gr2))    # ids of the main greenlet, gr1 and gr2
print(gr1.switch("hello", " world"), 'back to main')    # hehe back to main

1593933394296 1593933396272 1593933396728
1593933396272 1593933394296
1593933396728 1593933394296
hehe back to main
