In [1]:
import time
import random
import threading
from threading import Thread

In Python 3, `threading` is the module used to create and use threads. There's also a low-level module, `_thread`, but using it directly is not recommended. I mention it only as a warning: **don't use `_thread`!**

The most important class in the `threading` module is `Thread` (doh!).

Very simplified, this is how a thread is instantiated:

```python
class Thread:
    def __init__(self, target, name=None, args=(), kwargs=None):
        pass
```
(There's also a `group` argument, which should always be `None`; it's reserved for future use.)

In this case, `target` is the function that will be executed in that particular thread.

Once a thread has been _created_ (instantiated), we need to `start()` it so that it begins executing.

#### Basic example of a thread

In [2]:
def simple_worker():
    print('hello', flush=True)
    time.sleep(2)
    print('world', flush=True)

In [3]:
t1 = Thread(target=simple_worker)

In [4]:
t1.start()

hello


world


In [5]:
t1.join()

#### Running multiple threads in parallel

In [6]:
def simple_worker():
    time.sleep(random.random() * 5)
    value = random.randint(0, 99)
    print(f'My value: {value}')

In [7]:
threads = [Thread(target=simple_worker) for _ in range(5)]

In [8]:
[t.start() for t in threads]

[None, None, None, None, None]

My value: 65
My value: 70
My value: 83
My value: 18
My value: 49


In [9]:
[t.join() for t in threads]

[None, None, None, None, None]
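The list comprehensions above work, but they build throwaway lists of `None`. A minimal sketch of the same start-then-join pattern with plain `for` loops follows; the worker name `napping_worker` and the 0.2 second sleep are assumptions chosen only to keep the example quick. It also times the run to show that the sleeps overlap instead of adding up.

```python
import threading
import time

def napping_worker():
    # Hypothetical worker: it only sleeps, standing in for a blocking task.
    time.sleep(0.2)

threads = [threading.Thread(target=napping_worker) for _ in range(5)]

started_at = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()  # blocks until that particular thread has finished
elapsed = time.perf_counter() - started_at

# Five sleeps of 0.2s run concurrently, so the total stays well below 1s.
print(f'All threads finished in {elapsed:.2f}s')
```

Starting every thread before joining any of them matters: joining inside the first loop would run the workers one after another.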

#### Thread States

A thread can be in one of several states. When a thread has just been created, its state is `"ready"`, and it's not yet alive:

In [10]:
def simple_worker():

    print('Thread running...')
    time.sleep(5)
    print('Thread finished...')

In [17]:
t = Thread(target=simple_worker)

In [12]:
t.is_alive()

False

In [18]:
t.start()

Thread running...


In [19]:
t.is_alive()

True

Thread finished...


A thread that has finished can't be started again, as shown in the following example:

In [20]:
t.start()

RuntimeError: threads can only be started once
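If the same function needs to run again, the usual approach is to build a new `Thread` object around the same target. The sketch below assumes a do-nothing worker called `short_task` (a hypothetical name) and catches the error instead of letting it propagate.

```python
import threading

def short_task():
    # Hypothetical worker that returns immediately.
    pass

t = threading.Thread(target=short_task)
t.start()
t.join()

error_message = None
try:
    t.start()  # a second start() on the same object is rejected
except RuntimeError as exc:
    error_message = str(exc)
    print(f'Could not restart: {error_message}')

# A fresh Thread object can run the same target again.
t_again = threading.Thread(target=short_task)
t_again.start()
t_again.join()
```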

**Important:** It's not possible(\*) to manage a thread's state manually, for example by stopping it from the outside. A thread always has to run through its natural cycle.

(\*) You might find hacks on the internet for stopping threads, but **it's a bad practice**. We'll discuss this in more detail later.
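One common alternative to forcing a thread to stop is to ask it politely: the worker checks a shared flag between small units of work and returns on its own. This is only a sketch of that idea using `threading.Event`; the names `stop_requested` and `polite_worker`, and the timings, are assumptions for illustration.

```python
import threading
import time

stop_requested = threading.Event()  # hypothetical flag shared with the worker
iterations = []

def polite_worker():
    # Do a small unit of work, then check whether we've been asked to stop.
    while not stop_requested.is_set():
        iterations.append(time.monotonic())
        time.sleep(0.01)

t = threading.Thread(target=polite_worker)
t.start()

time.sleep(0.05)       # let the worker run for a moment
stop_requested.set()   # ask the thread to finish; nothing is forced
t.join(timeout=1)

print(f'Worker alive: {t.is_alive()}, iterations done: {len(iterations)}')
```

The thread still runs its natural cycle here: `run()` simply returns once the flag is set.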

#### Thread Identity

The `Thread` class has two attributes that let us identify each thread: the human-readable `name`, which we can set when we construct the thread, and the machine-oriented `ident`.

In [25]:
def simple_worker():
    print('Thread running...')
    time.sleep(5)
    print('Thread exiting...')

In [26]:
t = Thread(target=simple_worker)

In [27]:
t.name

'Thread-13 (simple_worker)'

`ident` will be `None` until we start the thread.

In [31]:
t.ident is None

True

In [32]:
t.start()

Thread running...

In [None]:
t.name

In [None]:
t.ident

We can create a thread and assign a custom name to it:

In [33]:
t = Thread(target=simple_worker, name='PyCon 2020 Tutorial!')

In [34]:
t.start()

Thread running...


In [35]:
t.name

Thread exiting...


'PyCon 2020 Tutorial!'

In [36]:
t.ident

140679950878272
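To see the change in `ident` in one place, here is a small sketch that captures the value before and after `start()`. The worker `quick_worker` and the name `'demo-thread'` are made up for this example.

```python
import threading

def quick_worker():
    # Hypothetical worker that returns immediately.
    pass

t = threading.Thread(target=quick_worker, name='demo-thread')

ident_before = t.ident   # None: the thread hasn't been started yet
t.start()
ident_after = t.ident    # an integer once the thread has been started
t.join()

print(f'name={t.name!r}, before={ident_before}, after={ident_after}')
```

The actual integer differs between runs and platforms, and identifiers may be reused after a thread exits, so it's best treated as an opaque value.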

#### A thread knows itself

It's also possible to know the identity of the thread from within the thread itself. This might be counterintuitive, as we don't have a reference to the created object inside the function, but the module function `threading.current_thread()` provides access to it.

In [43]:
def simple_worker():
    sleep_secs = random.randint(1, 5)
    myself = threading.current_thread()
    ident = threading.get_ident()
    print(f"I am thread {myself.name} (ID {ident}), and I'm sleeping for {sleep_secs}.")
    time.sleep(sleep_secs)
    print(f'Thread {myself.name} exiting...')

In [44]:
t1 = Thread(target=simple_worker, name='Bubbles')
t2 = Thread(target=simple_worker, name='Blossom')
t3 = Thread(target=simple_worker, name='Buttercup')

In [45]:
t1.start()

I am thread Bubbles (ID 140679925700160), and I'm sleeping for 4.


In [46]:
t2.start()

I am thread Blossom (ID 140679950878272), and I'm sleeping for 4.

In [47]:
t3.start()




I am thread Buttercup (ID 140679934092864), and I'm sleeping for 5.


In [48]:
print('Waiting...')

Waiting...
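As a quick check that `threading.current_thread()` really hands back the same object we created, the following sketch stores what the worker sees and compares it afterwards. The dictionary `seen` and the thread name `'Mojo'` are assumptions used only for this illustration.

```python
import threading

seen = {}  # hypothetical place for the worker to leave what it observed

def report():
    # Inside the thread, current_thread() returns the Thread object running us.
    seen['worker'] = threading.current_thread()
    seen['ident'] = threading.get_ident()

t = threading.Thread(target=report, name='Mojo')
t.start()
t.join()

print(seen['worker'] is t)        # same object, not a copy
print(seen['ident'] == t.ident)   # get_ident() matches the ident attribute
print(threading.current_thread().name)  # the thread running this cell
```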


#### Passing parameters to threads

Passing parameters is simple with the `Thread` constructor; just use the `args` argument, which takes a tuple of positional arguments:

In [49]:
def simple_worker(time_to_sleep):
    myself = threading.current_thread()
    ident = threading.get_ident()
    print(f"I am thread {myself.name} (ID {ident}), and I'm sleeping for {time_to_sleep}.")
    time.sleep(time_to_sleep)
    print(f'Thread {myself.name} exiting...')

In [50]:
t1 = Thread(target=simple_worker, name='Bubbles', args=(3, ))
t2 = Thread(target=simple_worker, name='Blossom', args=(1.5, ))
t3 = Thread(target=simple_worker, name='Buttercup', args=(2, ))

In [51]:
t1.start()

I am thread Bubbles (ID 140679942485568), and I'm sleeping for 3.


In [52]:
t2.start()

Thread Bubbles exiting...


I am thread Blossom (ID 140679925700160), and I'm sleeping for 1.5.

In [53]:
t3.start()


Thread Blossom exiting...


I am thread Buttercup (ID 140679950878272), and I'm sleeping for 2.
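Keyword arguments can be passed in a similar way through the constructor's `kwargs` parameter, which takes a dictionary. In this sketch the function `greeter` and the list `greetings` are hypothetical names chosen for the example.

```python
import threading

greetings = []

def greeter(greeting, name='world', punctuation='!'):
    # Hypothetical worker that records a formatted message.
    greetings.append(f'{greeting}, {name}{punctuation}')

t = threading.Thread(
    target=greeter,
    args=('Hello',),                                  # positional arguments
    kwargs={'name': 'Bubbles', 'punctuation': '?'},   # keyword arguments
)
t.start()
t.join()

print(greetings)  # ['Hello, Bubbles?']
```

Note the trailing comma in `('Hello',)`: without it, the parentheses would not create a tuple.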


#### Subclassing `Thread`

So far, we've created threads by passing a `target` function to be executed. There's an alternative, more object-oriented way to do it: subclassing `Thread` and overriding its `run()` method:

In [63]:
class MyThread(Thread):
    def __init__(self, time_to_sleep, name=None):
        super().__init__(name=name)
        self.time_to_sleep = time_to_sleep
        
    def run(self):
        ident = threading.get_ident()
        print(f"I am thread {self.name} (ID {ident}), and I'm sleeping for {self.time_to_sleep} secs.")
        time.sleep(self.time_to_sleep)
        print(f'Thread {self.name} exiting...')

In [64]:
t = MyThread(2)

In [65]:
t.start()

I am thread Thread-15 (ID 140679934092864), and I'm sleeping for 2 secs.


Thread Thread-15 exiting...
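One possible advantage of a subclass is that it gives the thread a place to keep its own state, for example a result to read after `join()`. The class `SquareThread` below is a made-up example of that idea, not part of the `threading` module.

```python
import threading

class SquareThread(threading.Thread):
    """Hypothetical thread that squares a number and stores the result."""

    def __init__(self, number, name=None):
        super().__init__(name=name)
        self.number = number
        self.result = None  # filled in by run()

    def run(self):
        # run() is what start() executes in the new thread.
        self.result = self.number ** 2

t = SquareThread(7, name='squarer')
t.start()
t.join()           # wait for run() to finish before reading the result

print(t.result)    # 49
```

We still call `start()`, not `run()`: calling `run()` directly would execute the code in the current thread.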


## Shared Data

As we'll see, threads can access shared data within the process they live in. Example:

In [66]:
TIME_TO_SLEEP = 2

In [67]:
def simple_worker():
    myself = threading.current_thread()
    print(f"I am thread {myself.name}, and I'm sleeping for {TIME_TO_SLEEP}.")
    time.sleep(TIME_TO_SLEEP)
    print(f'Thread {myself.name} exiting...')

In [68]:
t1 = Thread(target=simple_worker, name='Bubbles')
t2 = Thread(target=simple_worker, name='Blossom')
t3 = Thread(target=simple_worker, name='Buttercup')

In [69]:
t1.start()

I am thread Bubbles, and I'm sleeping for 2.


Thread Bubbles exiting...


In [70]:
t2.start()

I am thread Blossom, and I'm sleeping for 2.


Thread Blossom exiting...


In [None]:
t3.start()

I am thread Buttercup, and I'm sleeping for 2.


Thread Buttercup exiting...


How is this possible?

Remember, all threads live **within the same process**, and the variable `TIME_TO_SLEEP` is stored in the process. So all the threads created can access that variable.

<img src="thread_shared_data.png" width=900 />
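To make the sharing visible, the sketch below has each thread record the `id()` of the global object it reads. The names `SHARED_GREETING`, `seen_by` and `reader` are assumptions for this example; each thread writes only to its own key, so no two threads update the same entry.

```python
import threading

SHARED_GREETING = 'hello'   # hypothetical module-level value
seen_by = {}                # thread name -> (value, identity of the object)

def reader():
    name = threading.current_thread().name
    # Every thread reads the very same object from the process's memory.
    seen_by[name] = (SHARED_GREETING, id(SHARED_GREETING))

threads = [threading.Thread(target=reader, name=f'reader-{i}') for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

distinct_ids = {obj_id for _, obj_id in seen_by.values()}
print(len(seen_by), len(distinct_ids))  # 3 1
```

Three threads, one object: nothing was copied when the threads were created.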

Can we start 900 threads all at once? Sadly, that's rarely a good idea. Each thread consumes resources, and too many threads are usually a problem.

So, what can we do when we need to process many concurrent jobs? We'll create a limited number of workers and use a producer-consumer model. But first, we need to talk about shared data, race conditions and synchronization...
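As a rough preview of that idea, here is a minimal sketch in which a fixed number of worker threads pull jobs from a `queue.Queue`. The pool size, the `None` sentinel convention and the "double the number" job are all assumptions made for illustration, not the final design.

```python
import queue
import threading

NUM_WORKERS = 3          # hypothetical pool size
jobs = queue.Queue()     # the producer puts work here
results = queue.Queue()  # the workers put finished work here

def worker():
    while True:
        job = jobs.get()
        if job is None:        # sentinel: no more work for this worker
            break
        results.put(job * 2)   # stand-in for real processing

workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

for n in range(20):            # 20 jobs, but only 3 threads
    jobs.put(n)
for _ in workers:              # one sentinel per worker
    jobs.put(None)
for w in workers:
    w.join()

doubled = sorted(results.get() for _ in range(20))
print(doubled[:5])  # [0, 2, 4, 6, 8]
```

`queue.Queue` handles its own locking, which is why this sketch can avoid the synchronization topics still to come.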

## Summary:

* `threading` module ✅
* `_thread`  module ❌

A thread's life cycle is Instantiated > Started > Running > Finished.
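The cycle can be observed with `is_alive()` at each stage. In this sketch the worker blocks on a `threading.Event` (named `release` here, an assumption) so that it's reliably still running when we check.

```python
import threading

release = threading.Event()  # hypothetical signal that lets the worker finish

def waiting_worker():
    # Stay in the "running" stage until the main code releases us.
    release.wait(timeout=5)

t = threading.Thread(target=waiting_worker)
states = [t.is_alive()]       # instantiated, not started

t.start()
states.append(t.is_alive())   # started and running

release.set()
t.join()
states.append(t.is_alive())   # finished

print(states)  # [False, True, False]
```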