## Introduction to Concurrency in Python

## Overview
Concurrency is a technique that lets multiple tasks make progress during overlapping periods of time, making efficient use of system resources and improving application performance. In Python, concurrency can be achieved with processes and threads. This lesson introduces these concepts, explains how they differ, and shows how to implement them in Python.

### Learning Objectives
By the end of this lesson, you should be able to:
1. Understand the concepts of concurrency, processes, and threads.
2. Differentiate between processes and threads.
3. Implement concurrency using the `multiprocessing` and `threading` modules in Python.
4. Understand common use cases and challenges associated with concurrency.



### Concurrency Concepts
Concurrency involves multiple tasks making progress at the same time. These tasks can be managed using:
- **Processes**: Independent units of execution with their own memory space.
- **Threads**: Units of execution within a process that share the same memory space.
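
The shared-memory point for threads can be seen in a small sketch. The names `shared_items` and `add_item` are made up for illustration. Each worker thread appends to a list that the main thread created, and the main thread sees every change afterwards.

```python
import threading

# A plain list created by the main thread; worker threads see the same object.
shared_items = []

def add_item(value):
    # Runs in a worker thread but mutates the list created above.
    shared_items.append(value)

threads = [threading.Thread(target=add_item, args=(n,)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All three appends are visible here because threads share one memory space.
print(sorted(shared_items))
```

A child process would instead work on its own copy of the list, so the parent's list would stay unchanged unless the data is sent back explicitly.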

## The threading Module
### Creating threads

### Python's `threading` Module
The `threading` module provides a higher-level interface for thread management than the low-level `_thread` module it builds on. It treats each thread as an object, which makes threads easier to organize and lets you apply object-oriented principles.

#### Characteristics
- High-level interface.
- Threads are managed as objects.
- Additional functionality such as synchronization primitives (locks, events, conditions, semaphores).
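
As one illustration of these synchronization primitives, here is a minimal sketch using `threading.Event`. The names `ready`, `log`, and `waiter` are hypothetical. One thread blocks on the event until another thread sets it.

```python
import threading

ready = threading.Event()   # starts in the "not set" state
log = []

def waiter():
    log.append("waiting")
    ready.wait()            # blocks until some thread calls ready.set()
    log.append("released")

t = threading.Thread(target=waiter)
t.start()

ready.set()                 # signal the waiting thread to continue
t.join()

print(log)
```

If `set()` happens to run before the worker reaches `wait()`, the call to `wait()` simply returns immediately, so the sketch does not depend on which thread gets there first.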

In [45]:
import time

def print_numbers():
    for i in range(1, 3):
        print(i)
        time.sleep(1)

start_time = time.time()

print_numbers()
print_numbers()

end_time = time.time()

print(f"Execution time: {end_time - start_time} seconds")

1
2
1
2
Execution time: 4.0116400718688965 seconds


### Steps to Implement Multithreading

1. **Import the `threading` module**:
   ```python
   import threading
   ```

2. **Define a function for the thread to execute**:
   ```python
   def worker():
       # Perform some task
       pass
   ```

3. **Create a `Thread` object and pass the function as the target**:
   ```python
   thread = threading.Thread(target=worker)
   ```

4. **Start the thread using the `start()` method**:
   ```python
   thread.start()
   ```

5. **Optionally, use `join()` to wait for the thread to complete**:
   ```python
   thread.join()
   ```

In [44]:
import threading
import time

def print_numbers():
    for i in range(1, 3):
        print(i)
        time.sleep(1)

start_time = time.time()
# Create a thread
thread = threading.Thread(target=print_numbers)

# Start the thread
thread.start()

# Wait for the thread to complete
thread.join()

end_time = time.time()

print(f"Execution time: {end_time - start_time} seconds")

print("Thread finished execution")


1
2
Execution time: 2.00860595703125 seconds
Thread finished execution


### What we did in the example
 - We define a function `print_numbers` that prints the numbers 1 and 2 with a 1-second pause after each.
 - We create a thread targeting this function.
 - We start the thread using `thread.start()`.
 - We wait for the thread to complete using `thread.join()`.

### Create multiple threads
We can create multiple threads to perform concurrent tasks.

In [48]:
import threading
import time


def print_numbers():
    for i in range(1, 3):
        print(i)
        time.sleep(1)

start_time = time.time()
# Create the first thread
thread1 = threading.Thread(target=print_numbers)
thread1.start()

# Create the second thread
thread2 = threading.Thread(target=print_numbers)
thread2.start()

# Wait for both threads to complete
thread1.join()
thread2.join()
end_time = time.time()

print("Both threads finished execution")
print(f"Execution time: {end_time - start_time} seconds")


1
1
22

Both threads finished execution
Execution time: 2.013622999191284 seconds


### Adding Multiple Threads Using a Loop
    
 - We create a list called threads to store the thread objects.
 - We use a loop to create multiple threads, each targeting the print_numbers function.
 - Each thread is started using thread.start(), and the thread object is appended to the threads list.
 - After starting all threads, we use another loop to join each thread, ensuring the main program waits for all threads to complete.

In [None]:
import threading
import time

def print_numbers():
    for i in range(1, 6):
        print(i)
        time.sleep(1)

# Create and start multiple threads using a loop
threads = []
for _ in range(3):  # Adjust the number of threads as needed
    thread = threading.Thread(target=print_numbers)
    threads.append(thread)
    thread.start()

# Wait for all threads to complete
for thread in threads:
    thread.join()

print("All threads finished execution")


### Adding Arguments to the Function

Next, we'll modify the `print_numbers` function to accept an argument and pass a different argument to each thread.

 - The print_numbers function is modified to accept an argument, name.
 - We create a list of names to be passed as arguments to the threads.
 - In the loop, we create each thread with threading.Thread(target=print_numbers, args=(name,)), where args is a tuple containing the argument for the function.
 - Each thread is started and appended to the threads list.
 - Finally, we join each thread to ensure the main program waits for all threads to complete.

In [1]:
import threading
import time

def print_numbers(name):
    for i in range(1, 6):
        print(f"{name}: {i}")
        time.sleep(1)

# Create and start multiple threads with arguments
threads = []
names = ["Thread-1", "Thread-2", "Thread-3"]  # Example arguments

for name in names:
    thread = threading.Thread(target=print_numbers, args=(name,))
    threads.append(thread)
    thread.start()

# Wait for all threads to complete
for thread in threads:
    thread.join()

print("All threads finished execution")


Thread-1: 1Thread-2: 1

Thread-3: 1
Thread-2: 2Thread-1: 2
Thread-3: 2

Thread-2: 3Thread-3: 3

Thread-1: 3
Thread-1: 4Thread-3: 4

Thread-2: 4
Thread-3: 5Thread-2: 5

Thread-1: 5
All threads finished execution
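
Besides `args`, the `Thread` constructor also accepts a `kwargs` dictionary for keyword arguments. The sketch below is illustrative only: `greet`, `results`, and the `punctuation` parameter are invented for this example.

```python
import threading

results = []

def greet(name, punctuation="."):
    # Store the message instead of printing so it can be inspected afterwards.
    results.append(f"Hello, {name}{punctuation}")

# Positional arguments go in the args tuple, keyword arguments in kwargs.
t1 = threading.Thread(target=greet, args=("Thread-1",))
t2 = threading.Thread(target=greet, args=("Thread-2",), kwargs={"punctuation": "!"})

for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()

# Sorted because the order in which the threads append is not guaranteed.
print(sorted(results))
```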


### Synchronizing Threads
To prevent race conditions, you can use thread synchronization primitives like Locks.

In [None]:
import threading
import time

lock = threading.Lock()

def print_numbers():
    with lock:
        for i in range(1, 6):
            print(f"Numbers: {i}")
            time.sleep(1)

def print_letters():
    with lock:
        for letter in 'ABCDE':
            print(f"Letters: {letter}")
            time.sleep(1.5)

# Create threads
thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_letters)

# Start threads
thread1.start()
thread2.start()

# Wait for threads to complete
thread1.join()
thread2.join()

print("Both threads finished execution")


### Explanation

 - Define a Global Lock:
    We define a global lock object using `threading.Lock()`. This lock synchronizes access to the critical section of the code.

 - Using the Lock:
    In the example above, `print_numbers` and `print_letters` each wrap their entire loop in a `with lock:` block, so only one thread at a time can execute that code. The thread that acquires the lock first prints all of its output before the other thread's output begins, which means the two loops effectively run one after the other.

 - Create and Start Threads:
    We create and start the threads as before.

 - Wait for Threads to Complete:
    We wait for all threads to complete using `thread.join()`.

Holding a lock keeps the print statements from different threads from being garbled together, which makes the output easier to read. How much code the lock covers matters: the variant below holds the lock only around each `print` call and passes a different `name` to each thread, so every line is printed intact while the threads still take turns line by line and sleep concurrently. Locks are particularly useful when threads access shared resources or perform operations that must not be interleaved.

In [None]:
import threading
import time

# Define a global lock
lock = threading.Lock()

def print_numbers(name):
    for i in range(1, 6):
        with lock:
            print(f"{name}: {i}")
        time.sleep(1)

# Create and start multiple threads with arguments
threads = []
names = ["Thread-1", "Thread-2", "Thread-3"]  # Example arguments

for name in names:
    thread = threading.Thread(target=print_numbers, args=(name,))
    threads.append(thread)
    thread.start()

# Wait for all threads to complete
for thread in threads:
    thread.join()

print("All threads finished execution")
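
The lock examples above protect `print` calls. The same pattern applies to shared data, which is where race conditions usually cause real damage. The following is a minimal sketch with a hypothetical shared `counter`: four threads each increment it 10,000 times, and the lock makes every read-modify-write step atomic with respect to the other threads.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may read, add to, and write back the counter.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

Without the lock, two threads could read the same old value and one update could be lost, so the final total would not be guaranteed.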


### Web Scraping

Here's a more complex example using multithreading to scrape multiple web pages concurrently.

In [None]:
import threading
import requests
from bs4 import BeautifulSoup

urls = [
    'https://example.com/page1',
    'https://example.com/page2',
    'https://example.com/page3'
]

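def fetch_url(url):
    try:
        # A timeout keeps a slow server from blocking this thread indefinitely
        response = requests.get(url, timeout=10)
    except requests.RequestException as exc:
        print(f'Failed to retrieve {url}: {exc}')
        return
    if response.status_code == 200:
        soup = BeautifulSoup(response.text, 'html.parser')
        # soup.title is None when the page has no <title> tag
        title = soup.title.string if soup.title else 'No title found'
        print(f'Title of {url}: {title}')
    else:
        print(f'Failed to retrieve {url} (status {response.status_code})')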

# Create a thread for each URL
threads = []
for url in urls:
    thread = threading.Thread(target=fetch_url, args=(url,))
    threads.append(thread)
    thread.start()

# Wait for all threads to complete
for thread in threads:
    thread.join()

print("All URLs fetched")
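
The threads above print what they find. A thread's target function cannot hand a return value back to the caller directly, so a common variation stores results in a shared structure instead. The sketch below assumes a stand-in `fake_fetch` function (hypothetical, it sleeps rather than touching the network) and a lock-protected dictionary named `results`.

```python
import threading
import time

def fake_fetch(url):
    # Hypothetical stand-in for a network request; no real HTTP call is made.
    time.sleep(0.1)
    return f"Title for {url}"

results = {}
results_lock = threading.Lock()

def fetch_and_store(url):
    title = fake_fetch(url)
    # Guard the shared dictionary while writing to it.
    with results_lock:
        results[url] = title

urls = [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3",
]

threads = [threading.Thread(target=fetch_and_store, args=(u,)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()

for url in urls:
    print(url, "->", results[url])
```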

### concurrent.futures

The concurrent.futures module was introduced in Python 3.2. It provides a high-level interface for asynchronously executing callables (functions or methods) using threads or processes. This module abstracts away the details of thread and process management, making it easier to write concurrent code without having to deal with low-level threading or multiprocessing APIs directly.
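
As a small sketch of that interface, `submit()` returns a `Future` whose `result()` method gives back the callable's return value, and `map()` applies a callable to each item and yields the results in input order. The `square` function here is only an illustration.

```python
import concurrent.futures

def square(n):
    return n * n

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # submit() schedules one call and returns a Future immediately.
    future = executor.submit(square, 4)
    single = future.result()          # blocks until the call has finished

    # map() schedules one call per item and yields results in input order.
    many = list(executor.map(square, [1, 2, 3]))

print(single)  # 16
print(many)    # [1, 4, 9]
```

`ProcessPoolExecutor` exposes the same `submit()` and `map()` methods, which is what lets the module cover both threads and processes.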

### Thread Pools

A thread pool is a collection of pre-initialized threads that are ready to perform tasks. Instead of creating threads dynamically whenever a task needs to be executed, a thread pool creates a fixed number of threads upfront and keeps them alive throughout the lifetime of the application. These threads continuously wait for tasks to be assigned to them, execute the tasks, and then return to the pool to be reused for subsequent tasks.

In [None]:
import concurrent.futures
import time

# Function to print numbers
def print_numbers(name):
    for i in range(1, 6):
        print(f"{name}: {i}")
        time.sleep(1)

# Create a thread pool with 2 threads
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # Submit tasks to the thread pool
    executor.submit(print_numbers, "Thread-1")
    executor.submit(print_numbers, "Thread-2")

# The 'with' statement ensures that the thread pool is properly cleaned up after use
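
To see the reuse described above, the following sketch records which thread runs each task. The function name `which_thread` and the task count are arbitrary choices for illustration. Six tasks go to a pool capped at two workers, so at most two distinct thread names should appear.

```python
import concurrent.futures
import threading
import time

def which_thread(task_id):
    time.sleep(0.1)  # simulate a little work
    return threading.current_thread().name

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    names = list(executor.map(which_thread, range(6)))

print(len(names))       # one entry per task
print(len(set(names)))  # number of distinct worker threads, never more than 2
```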


### Creating threads with concurrent.futures

In [33]:
import concurrent.futures
import time

def print_numbers():
    for i in range(1, 6):
        print(i)
        time.sleep(1)

# Create a ThreadPoolExecutor with one thread
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    # Submit the print_numbers function to the thread pool
    future = executor.submit(print_numbers)

    # Wait for the task to complete
    future.result()

print("Thread finished execution")


1
2
3
4
5
Thread finished execution


### Creating multiple threads with concurrent.futures

In [None]:
import concurrent.futures
import time

# Define the print_numbers function
def print_numbers():
    for i in range(1, 3):
        print(i)
        time.sleep(1)

start_time = time.time()
# Create a ThreadPoolExecutor with 2 threads
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # Submit print_numbers function twice to the executor
    future1 = executor.submit(print_numbers)
    future2 = executor.submit(print_numbers)

    # Wait for both tasks to finish; result() re-raises any exception from the task
    future1.result()
    future2.result()
    
end_time = time.time()
print("Both threads finished execution")
print(f"Execution time: {end_time - start_time} seconds")


### Creating multiple threads using loops with concurrent.futures

In [None]:
import concurrent.futures
import time

# Define the print_numbers function
def print_numbers():
    for i in range(1, 3):
        print(i)
        time.sleep(1)

start_time = time.time()

# Create a ThreadPoolExecutor with 2 threads
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # Submit print_numbers function twice to the executor using a loop
    futures = [executor.submit(print_numbers) for _ in range(2)]

    # Wait for all tasks to finish; result() re-raises any exception from the task
    for future in futures:
        future.result()

end_time = time.time()

print("Both threads finished execution")
print(f"Execution time: {end_time - start_time} seconds")


### Creating multiple threads with parameters using concurrent.futures

In [None]:
import concurrent.futures
import time

# Define the print_numbers function with a variable seconds parameter
def print_numbers(seconds):
    for i in range(seconds + 1):
        print(i)
        time.sleep(1)

start_time = time.time()

# Create a ThreadPoolExecutor with 2 threads
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # Submit print_numbers function twice to the executor using a loop
    futures = [executor.submit(print_numbers, 3) for _ in range(2)]

    # Wait for all tasks to finish; result() re-raises any exception from the task
    for future in futures:
        future.result()

end_time = time.time()

print("Both threads finished execution")
print(f"Execution time: {end_time - start_time} seconds")


### Creating multiple threads with different parameter values using concurrent.futures

In [None]:
import concurrent.futures
import time

# Define the print_numbers function with a variable seconds parameter
def print_numbers(seconds):
    for i in range(seconds + 1):
        print(i)
        time.sleep(1)

# List of values for seconds
seconds_list = [3, 5]

start_time = time.time()

# Create a ThreadPoolExecutor with 2 threads
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    # Submit print_numbers function with different seconds values to the executor
    futures = [executor.submit(print_numbers, seconds) for seconds in seconds_list]

    # Wait for all tasks to finish; result() re-raises any exception from the task
    for future in futures:
        future.result()

end_time = time.time()

print("Both threads finished execution")
print(f"Execution time: {end_time - start_time} seconds")
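
When tasks take different amounts of time, as with `seconds_list` above, it can be useful to handle each result as soon as it is ready rather than in submission order. A minimal sketch using `concurrent.futures.as_completed` follows. The `wait_and_return` function and the short delays are assumptions chosen to keep the example fast.

```python
import concurrent.futures
import time

def wait_and_return(seconds):
    time.sleep(seconds)
    return seconds

delays = [0.3, 0.1, 0.2]

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(wait_and_return, d) for d in delays]

    # as_completed yields each future as it finishes, not in submission order.
    finished = [f.result() for f in concurrent.futures.as_completed(futures)]

# Shorter delays typically appear first, regardless of submission order.
print(finished)
```

Iterating over `futures` directly, as in the examples above, would instead block on the first submitted task even if later ones had already finished.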
