In [1]:
import time
import random
import threading
from threading import Thread

In Python 3, `threading` is the module used to create and manage threads. There's also a low-level module, `_thread`, but using it directly is not recommended. I'm mentioning it only as a warning: **don't use `_thread`!**

The most important class in the `threading` module is: `Thread` (doh!).

Very simplified, this is how a thread is instantiated:

```python
class Thread:
    def __init__(self, target, name=None, args=(), kwargs=None):
        pass
```
(There's also a `group` argument, which should always be `None` because it's reserved for future use.)

In this case, `target` is the function that will be executed in that particular thread.

Once a thread has been _created_ (instantiated), we need to call `start()` on it so that it begins executing.

#### Basic example of a thread

In [3]:
def simple_worker():
    print('hello', flush=True)
    time.sleep(10)
    print('world', flush=True)

In [11]:
t1 = Thread(target=simple_worker)

In [12]:
t1.start()

hello
world


In [13]:
3+4

7

In [5]:
t1.join()  # block the calling thread until t1 has finished
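
To make the effect of `join()` concrete, here is a minimal sketch that measures how long the calling thread is blocked. The worker name `short_worker` and the 0.2-second sleep are illustrative assumptions, chosen only to keep the run short:

```python
import time
from threading import Thread

def short_worker():
    # Simulate a small amount of work.
    time.sleep(0.2)

t = Thread(target=short_worker)
started_at = time.monotonic()
t.start()

# join() blocks the *calling* thread until `t` has finished.
t.join()
elapsed = time.monotonic() - started_at

print(f'Waited about {elapsed:.2f}s; still alive? {t.is_alive()}')
```

`join()` also accepts an optional `timeout` in seconds. It returns once the timeout expires even if the thread is still running, so check `is_alive()` afterwards to find out which case happened.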

#### Running multiple threads in parallel

In [23]:
def simple_worker():
    time.sleep(random.random() * 20)
    value = random.randint(0, 99)
    print(f'My value: {value}')

In [27]:
threads = [Thread(target=simple_worker) for _ in range(5)]

In [28]:
[t.start() for t in threads]

[None, None, None, None, None]

My value: 5
My value: 76
My value: 49
My value: 29
My value: 57


In [26]:
[t.join() for t in threads]

My value: 80
My value: 89
My value: 75
My value: 90
My value: 44


[None, None, None, None, None]
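
One way to check that the threads really overlap is to time the whole batch. The sketch below assumes a fixed, short sleep (`SLEEP_SECS`, an illustrative value) instead of the random one used above, so the comparison is easy to read:

```python
import time
from threading import Thread

SLEEP_SECS = 0.3  # illustrative value, much shorter than in the cells above

def sleepy_worker():
    time.sleep(SLEEP_SECS)

threads = [Thread(target=sleepy_worker) for _ in range(5)]

started_at = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - started_at

# Five sequential sleeps would take about 1.5s; the threads overlap instead.
print(f'5 threads finished in {elapsed:.2f}s')
```

Plain `for` loops are used here because `start()` and `join()` return `None`, so there is nothing useful to collect in a list.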

#### Thread States

A thread moves through several states. A thread that has just been created has not started running yet (we can think of it as `"ready"`), so `is_alive()` returns `False`:

In [40]:
def simple_worker():
    print('Thread running...')
    time.sleep(5)
    print('Thread finished...')

In [41]:
t = Thread(target=simple_worker)

In [42]:
t.is_alive()

False

In [43]:
t.start()

Thread running...


In [44]:
t.is_alive()

True

Thread finished...


A thread that has finished can't be started again, as shown in the following example:

In [45]:
t.start()

RuntimeError: threads can only be started once
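
If the same work has to run again, the usual approach is to build a fresh `Thread` object. A minimal sketch, assuming a trivial worker named `quick_worker` (a hypothetical name used only for illustration):

```python
from threading import Thread

def quick_worker():
    pass  # finishes immediately

t = Thread(target=quick_worker)
t.start()
t.join()

try:
    t.start()  # a Thread object can be started only once
    restarted = True
except RuntimeError as exc:
    restarted = False
    print(f'Could not restart: {exc}')

# To run the same function again, build a new Thread object.
t_again = Thread(target=quick_worker)
t_again.start()
t_again.join()
```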

**Important:** It's not possible(\*) to manage thread states manually, for example, stopping a thread. A thread always has to run its natural cycle.

(\*) You might find hacks on the internet for stopping threads, but **it's bad practice**. We'll discuss this in more detail later.
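
What a thread *can* do is decide for itself when to finish. The sketch below is one cooperative approach, not a way to force a thread to stop: the worker periodically checks a `threading.Event` and returns on its own when the flag is set. The names `stop_requested` and `polite_worker` are assumptions made for this example:

```python
import threading
import time
from threading import Thread

stop_requested = threading.Event()  # hypothetical name for the shared flag
iterations = []

def polite_worker():
    # Keep working until someone asks us to stop.
    while not stop_requested.is_set():
        iterations.append(1)
        time.sleep(0.01)

t = Thread(target=polite_worker)
t.start()
time.sleep(0.1)

stop_requested.set()  # ask the worker to stop; it exits on its next check
t.join()
print(f'Worker stopped after {len(iterations)} iterations')
```

The thread still runs its natural cycle here: `polite_worker` simply reaches the end of its function once it notices the flag.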

#### Thread Identity

The `Thread` class has two attributes that let us identify each thread: the human-readable `name`, which we can set when we construct the thread, and the machine-oriented `ident`.

In [46]:
def simple_worker():
    print('Thread running...')
    time.sleep(5)
    print('Thread exiting...')

In [47]:
t = Thread(target=simple_worker)

In [48]:
t.name

'Thread-33'

`ident` will be `None` until we start the thread.

In [49]:
t.ident is None

True

In [50]:
t.start()

Thread running...


In [None]:
t.name

In [51]:
t.ident

10836

Thread exiting...


We can create a thread and assign a custom name to it:

In [52]:
t = Thread(target=simple_worker, name='MISY3871 Lecture 5!')

In [53]:
t.start()

Thread running...


In [54]:
t.name

'MISY3871 Lecture 5!'

Thread exiting...


In [55]:
t.ident

10792

#### A thread knows itself

It's also possible to know the identity of the thread from within the thread itself. This might seem counterintuitive, since the worker function doesn't hold a reference to the created object, but the module function `threading.current_thread()` provides access to it.

In [56]:
def simple_worker():
    sleep_secs = random.randint(1, 5)
    myself = threading.current_thread()
    ident = threading.get_ident()
    print(f"I am thread {myself.name} (ID {ident}), and I'm sleeping for {sleep_secs}.")
    time.sleep(sleep_secs)
    print(f'Thread {myself.name} exiting...')

In [57]:
t1 = Thread(target=simple_worker, name='Bubbles')
t2 = Thread(target=simple_worker, name='Blossom')
t3 = Thread(target=simple_worker, name='Buttercup')

In [58]:
t1.start()

I am thread Bubbles (ID 7652), and I'm sleeping for 4.


In [59]:
t2.start()

I am thread Blossom (ID 776), and I'm sleeping for 3.


In [60]:
t3.start()

I am thread Buttercup (ID 10544), and I'm sleeping for 5.
Thread Blossom exiting...
Thread Bubbles exiting...
Thread Buttercup exiting...


In [None]:
print('Waiting...')
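
As a quick check, the object returned by `threading.current_thread()` inside the worker is the very same `Thread` object we created outside it. This sketch stores what the worker sees in a dictionary named `seen` (an illustrative name) and compares it afterwards:

```python
import threading
from threading import Thread

seen = {}  # filled in by the worker; the name is illustrative

def identity_worker():
    # Inside the thread, current_thread() returns the Thread object running this code.
    seen['thread'] = threading.current_thread()
    seen['ident'] = threading.get_ident()

t = Thread(target=identity_worker, name='Worker-A')
t.start()
t.join()

print(seen['thread'] is t)       # same object as the one we created
print(seen['ident'] == t.ident)  # same identifier as the attribute on `t`
```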

#### Passing parameters to threads

Passing parameters to a thread is simple. Just use the constructor's `args` argument:

In [61]:
def simple_worker(time_to_sleep):
    myself = threading.current_thread()
    ident = threading.get_ident()
    print(f"I am thread {myself.name} (ID {ident}), and I'm sleeping for {time_to_sleep}.")
    time.sleep(time_to_sleep)
    print(f'Thread {myself.name} exiting...')

In [62]:
t1 = Thread(target=simple_worker, name='Bubbles', args=(3, ))
t2 = Thread(target=simple_worker, name='Blossom', args=(1.5, ))
t3 = Thread(target=simple_worker, name='Buttercup', args=(2, ))

In [63]:
t1.start()

I am thread Bubbles (ID 10848), and I'm sleeping for 3.


In [64]:
t2.start()

I am thread Blossom (ID 10024), and I'm sleeping for 1.5.


In [65]:
t3.start()

I am thread Buttercup (ID 7740), and I'm sleeping for 2.
Thread Blossom exiting...
Thread Bubbles exiting...
Thread Buttercup exiting...
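
Keyword arguments can be passed in the same way through `kwargs`. The following sketch assumes a worker with an extra `greeting` parameter (hypothetical, added only for illustration) and collects the messages in a list so they can be inspected afterwards:

```python
import threading
import time
from threading import Thread

messages = []

def greeting_worker(time_to_sleep, greeting='Hello'):
    myself = threading.current_thread()
    time.sleep(time_to_sleep)
    message = f'{greeting} from {myself.name} after {time_to_sleep}s'
    messages.append(message)
    print(message)

# Positional arguments go in `args` (a tuple), keyword arguments in `kwargs` (a dict).
t1 = Thread(target=greeting_worker, name='Bubbles', args=(0.1,))
t2 = Thread(target=greeting_worker, name='Blossom',
            args=(0.2,), kwargs={'greeting': 'Bye'})

for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()
```

Note the trailing comma in `args=(0.1,)`: a one-element tuple needs it, because `(0.1)` is just the number `0.1`.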


#### Subclassing `Thread`

So far, we've created threads by passing a `target` function to be executed. There's a more object-oriented alternative: subclassing `Thread` and overriding its `run()` method:

In [68]:
class MyThread(Thread):
    def __init__(self, time_to_sleep, name=None):
        super().__init__(name=name)
        self.time_to_sleep = time_to_sleep
        
    def run(self):
        ident = threading.get_ident()
        print(f"I am thread {self.name} (ID {ident}), and I'm sleeping for {self.time_to_sleep} secs.")
        time.sleep(self.time_to_sleep)
        print(f'Thread {self.name} exiting...')

In [69]:
t = MyThread(2, name="Thread1")

In [70]:
t.start()

I am thread Thread1 (ID 10716), and I'm sleeping for 2 secs.
Thread Thread1 exiting...
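
A subclass is also a convenient place to keep a result computed by the thread. The sketch below assumes a hypothetical `SquareThread` class that stores its output in a `result` attribute, which the caller reads after `join()`:

```python
import time
from threading import Thread

class SquareThread(Thread):
    """Hypothetical example: computes number ** 2 in a separate thread."""

    def __init__(self, number, name=None):
        super().__init__(name=name)  # initialise the base class first
        self.number = number
        self.result = None  # filled in by run()

    def run(self):
        # run() is what start() executes in the new thread.
        time.sleep(0.1)
        self.result = self.number ** 2

t = SquareThread(7, name='Squarer')
t.start()
t.join()  # wait until run() has finished before reading the result
print(t.result)
```

Always call `start()`, not `run()`. Calling `run()` directly executes the method in the current thread, and no new thread is created.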


## Shared Data

As we'll see, threads can access shared data within the process they live in. Example:

In [71]:
TIME_TO_SLEEP = 2

In [72]:
def simple_worker():
    myself = threading.current_thread()
    print(f"I am thread {myself.name}, and I'm sleeping for {TIME_TO_SLEEP}.")
    time.sleep(TIME_TO_SLEEP)
    print(f'Thread {myself.name} exiting...')

In [73]:
t1 = Thread(target=simple_worker, name='Bubbles')
t2 = Thread(target=simple_worker, name='Blossom')
t3 = Thread(target=simple_worker, name='Buttercup')

In [74]:
t1.start()

I am thread Bubbles, and I'm sleeping for 2.


In [75]:
t2.start()

I am thread Blossom, and I'm sleeping for 2.


In [76]:
t3.start()

I am thread Buttercup, and I'm sleeping for 2.
Thread Bubbles exiting...
Thread Blossom exiting...
Thread Buttercup exiting...


How is this possible?

Remember, all threads live **within the same process**, and the variable `TIME_TO_SLEEP` lives in that process's memory. So every thread we create can access that variable.

<img src="img/thread_shared_data.png" width=900 />
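
Threads can write to shared data as well as read it. In this minimal sketch each thread writes only to its own slot of a shared list, which deliberately sidesteps the race conditions that come up when several threads modify the same value. The names `results` and `slot_worker` are illustrative:

```python
import threading
from threading import Thread

results = [None, None, None]  # shared by every thread in the process

def slot_worker(slot):
    # Each thread writes only to its own slot, so the writes don't collide.
    results[slot] = f'filled by {threading.current_thread().name}'

names = ['Bubbles', 'Blossom', 'Buttercup']
threads = [Thread(target=slot_worker, name=name, args=(slot,))
           for slot, name in enumerate(names)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```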

#### How many threads can we start?

Let's say we need to get prices for 10 exchanges and 3 symbols, over a total of 30 days. That's a lot of requests:

In [None]:
10 * 3 * 30

Can we start 900 threads all at once? Sadly, that's not a good idea. Each thread consumes resources, and too many threads usually become a problem.

So, what can we do when we need to process too many concurrent jobs? We'll create a fixed number of workers and use a producer-consumer model. But first, we need to talk about shared data, race conditions and synchronization...
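
As a rough preview of that idea, here is a sketch under stated assumptions: a small, fixed number of worker threads pull the 900 jobs from a `queue.Queue` (which is safe to share between threads). `NUM_WORKERS`, `fetch_price` and the `None` sentinel convention are all hypothetical choices made for this example, and `fetch_price` only stands in for a real request:

```python
import queue
from threading import Thread

NUM_WORKERS = 8  # illustrative; tune for the real workload
jobs = queue.Queue()
results = queue.Queue()

def fetch_price(exchange, symbol, day):
    # Placeholder for a real network request (hypothetical helper).
    return (exchange, symbol, day)

def pool_worker():
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            break
        results.put(fetch_price(*job))

# Producer side: enqueue the 10 * 3 * 30 = 900 jobs.
for exchange in range(10):
    for symbol in range(3):
        for day in range(30):
            jobs.put((exchange, symbol, day))
for _ in range(NUM_WORKERS):
    jobs.put(None)  # one sentinel per worker

workers = [Thread(target=pool_worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()
for w in workers:
    w.join()

completed = results.qsize()
print(f'{completed} jobs handled by {NUM_WORKERS} threads')
```

Only 8 threads exist at any time, yet all 900 jobs get processed, because each worker keeps taking the next job until it receives its sentinel.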

## Summary:

* `threading` module ✅
* `_thread`  module ❌

A thread's life cycle is Instantiated > Started > Running > Finished.